
The time in which we live is often referred to as the Information Age. Commuting flow prediction is a crucial issue for transport optimization and urban planning. However, the two existing types of solutions have inherent flaws. One is traditional models, such as the gravity model and the radiation model. These models rely on fixed and simple mathematical formulas derived from physics and ignore rich geographic semantics, which makes it difficult for them to model complex human mobility patterns. The other is machine learning models, most of which simply leverage the features of the origin and destination (OD), ignoring the topological nature of the interaction network and the spatial correlation brought by nearby areas. In this paper, we propose a ‘preprocessing-encoder-decoder’ hybrid learning model, which can make full use of geographic semantic information and spatial neighborhood effects, thereby significantly improving the prediction performance. Specifically, in the preprocessing step, we divide the study area into grids and then incorporate features such as location, population, and land use types. In the second step, the encoder uses a convolutional neural network (CNN) to fuse neighborhood features, constructs a spatial interaction network with the grids as nodes and the flows as edges, and then uses a graph convolutional network (GCN) to extract the embeddings of the nodes. In the last step, the decoder, a random forest regressor is trained to predict the commuting flow based on the learned embedding vectors. An empirical study on a commuter dataset in Beijing shows that our proposed model is approximately 20% better than XGBoost (state-of-the-art), demonstrating its effectiveness.
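
To make the encoder idea concrete, the following is a minimal sketch, not the model proposed here, of how grid cells might become nodes of a spatial interaction network and how one graph-convolution step could mix node features. It assumes the widely used symmetric-normalized propagation rule ReLU(D^(-1/2) (A + I) D^(-1/2) H W), treats flows as undirected edge weights, and uses hypothetical helper names (`grid_index`, `normalized_adjacency`, `gcn_layer`) and toy values for the features and weights.

```python
import math

def grid_index(x, y, x_min, y_min, cell_size, n_cols):
    """Map a coordinate to the id of the square grid cell containing it."""
    col = int((x - x_min) // cell_size)
    row = int((y - y_min) // cell_size)
    return row * n_cols + col

def normalized_adjacency(n_nodes, flows):
    """Build D^-1/2 (A + I) D^-1/2 from (origin, destination, flow) triples."""
    a = [[0.0] * n_nodes for _ in range(n_nodes)]
    for o, d, f in flows:
        a[o][d] += f
        a[d][o] += f  # assumption: the flow network is treated as undirected
    for i in range(n_nodes):
        a[i][i] += 1.0  # self-loop so each node keeps its own features
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n_nodes)]
            for i in range(n_nodes)]

def matmul(p, q):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def gcn_layer(a_hat, features, weights):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in z]

# Toy example: three grid cells, two OD flows, two features per cell.
flows = [(0, 1, 10.0), (1, 2, 5.0)]
a_hat = normalized_adjacency(3, flows)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # illustrative values only
weights = [[0.5, -0.2], [0.1, 0.3]]              # illustrative, not learned
embeddings = gcn_layer(a_hat, features, weights)
```

In a full pipeline the weights would be learned, and the resulting node embeddings for each origin-destination pair would be passed to a regressor such as a random forest; that part is omitted from this sketch.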
