1 Introduction and Related Work

Thanks to the popularity of GPS-embedded devices, much more trajectory data has been generated, by analyzing which municipal authorities may identify and optimize the traffic congestion in advance. However, predicting an accurate travel time is still very challenging as the travel time is affected by many dynamic factors, such as dynamic departure time, dynamic traffic conditions and dynamic driver behavior. All these ‘dynamics’ make it intractable to predict future pattern of traffic with statistic model [6].

With rapid evolution of deep learning, some studies adopt embedding technologies to solve the challenge of dynamics [3, 7]. They transform departure time, drivers, weather and locations into low-dimensional learnable real vectors, and construct a deep neural network to predict the travel time. Nevertheless, most of them don’t extract travel speed features adequately from sparse trajectory data because trajectory data isn’t necessarily generated on all road segments at every momentFootnote 1, which results in poor performance.

Meanwhile, tensor/matrix decomposition algorithms have been adopted to solve the data sparsity [5, 8]. However, these decomposition methods often take several minutes to restore the travel time/speed on a road, which is almost infeasible in reality. Even worse, tensor/matrix decomposition algorithms can only estimate the previous travel time/speed of a road because there’s no future data in the tensor/matrix. Consequently, it cannot be directly applied to the problem of travel time predictionFootnote 2.

With the aim of solving the aforementioned challenges, we propose a novel deep learning framework named Tensor-CNN-LSTM (TCL) for travel time prediction, which can extract travel speed features effectively from historical sparse trajectory data and predict the travel time of a given path with better accuracy.

2 Model Architecture

In this section, we introduce the framework of our proposed TCL model, as is shown in Fig. 1. TCL is comprised of three major components: non-negative tensor decomposition, long-short-term speed CNN and LSTM prediction model.

Fig. 1.
figure 1

The framework of TCL model.

Non-negative Tensor Decomposition. In the module of non-negative tensor decomposition, we partition an hour into M time slots, and construct three homogenous matrices \( A_{h}, A_{m}, A_{r} \in \mathbb {R}^{N^2\times M} \), where \( A_{r}(i,j)=a \) denotes the i-th grid with travel speed a in time slot j, and \( A_{h} \) is constructed based on historical trajectories over a long period of time (e.g. a week). \( A_{m} \) is a mixed matrix to combine \( A_{r} \) and \( A_{h} \). After constructing these matrices, we concatenate them together to construct a 3D non-negative tensor \( \mathcal {A} \in \mathbb {R}^{N^{2}\times M\times 3} \). We fill the missing value in \( \mathcal {A} \) by using a fast non-negative CP decomposition algorithm [2].

Long-short-term Speed CNN. In the module of long-short-term speed CNN, we extract the long/short-term speed features from a given path, based on \( \mathcal {A^{*}} \), where the long-term speed features are the travel speed values of the target grid in the past 7 days, and the short-term speed features are the speed distributions in that grid and relevant grids in the previous hourFootnote 3. Afterwards, we construct a CNN to obtain the whole speed features, as is shown in Fig. 2.

LSTM Prediction Model. The LSTM prediction model consists of two parts: feature extraction layer and prediction layer, as is shown in Fig. 3. The former extracts useful features from the path, such as augment features (the driver ID, the departure time, the day of the week, the travel distance and holiday indicator) and location features (the latitude, the longitude and the grid ID). The prediction layer predicts travel time of the path, which consists of a 2-layer LSTM and a multi-layer perceptron (MLP). The loss function of our model is Mean Absolute Percentage Error(MAPE), and the optimizer is Adam.

Fig. 2.
figure 2

Architecture of CNN model.

Fig. 3.
figure 3

Architecture of LSTM model.

3 Experiments

Datasets: We evaluate the performance of our model on two real-world trajectory datasets, namely Beijing and Shanghai. The Beijing dataset contains 3,384,847 trajectories of 10,039 drivers from Oct. 1\(^{st}\) to Oct. 31\(^{st}\) in 2013. The Shanghai dataset contains 9,727,798 trajectories of 13,622 drivers from Apr. 1\(^{st}\) to Apr. 30\(^{th}\) in 2015. For each dataset, we split the trajectories generated in the last 7 days as the test set and the rest as the training setFootnote 4.

Results: We select TEMP [4], XGBoost, DeepTTE [3] as baseline methods for comparison with our model. For DeepTTE and TCL, we train these models for 50 epochs and repeat each experiment 3 times. The mean and the standard deviation are calculated, and the results are shown in Table 1.

Table 1. Performance comparison of travel time prediction.

As we can see, TEMP only considers the information of starting points and destinations, resulting in the worst performance. XGBoost performs better than TEMP on both datasets because the feature selection of XGBoost is consistent with our model. However, XGBoost can not handle sequence data, so fixing the length of a path will lose some information. DeepTTE consider various factors which may affect the travel time, the performance of DeepTTE are much better than aforementioned methods. Our proposed TCL captures the travel speed accurately, which is the most important factor affecting travel time. TCL scores 12.40% and 13.08% (on two datasets respectively) in MAPE, and also outperforms other models in other metrics.

4 Conclusion

In this paper, we propose a novel deep learning framework, namely TCL, to predict the travel time of a given path. Specifically, TCL can extract travel speed effectively from historical sparse trajectory data and predict travel time with better accuracy. TCL achieves satisfying performance on two real-world datasets, which means that we have conquered the challenges of dynamics and sparsity in the trajectory data.