1 Introduction

Travel-time prediction is a critical component of Intelligent Transportation Systems (ITS) [1]. It plays an important role in the implementation of Advanced Traveler Information Systems (ATIS) and Advanced Traffic Management Systems (ATMS) [2]. Travel-time information can serve as input or auxiliary data for dynamic navigation, congestion control, accident detection, and similar applications, so travel-time prediction methods merit study. Predicting future travel-time is difficult because travel-time varies across time periods with the weather, road conditions, drivers' habits, and other factors. Accurate prediction therefore requires either traffic models that capture these fluctuations or data-driven models that learn traffic patterns from data.

In recent years, a variety of travel-time prediction methods have been proposed. These methods use different technologies, and each has its own advantages and disadvantages. The contributions of this work are as follows: (a) we classify travel-time prediction methods as model-based or data-driven and briefly describe each class; (b) we compare model-based and data-driven methods in terms of datasets, prediction range, and accuracy; (c) we discuss several solutions to overcome the shortcomings of existing methods and highlight future research challenges.

2 Problem Statement

Travel-time is generally defined as the time needed to reach a destination or to traverse a link. Travel-time prediction refers to the prediction of current or future travel-time. There are two ways to predict travel-time: direct and indirect. Direct methods use parametric or non-parametric models to fit the functional relationship in travel-time data and predict near-future travel-time directly [3]. Indirect methods first predict the time-space speed field from historical data such as flow, density, occupancy, or average speed, and then calculate travel-time from the predicted speeds [3].
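The indirect route can be illustrated with a minimal Python sketch. It assumes that predicted speeds are already available for each road segment and that a vehicle experiences those speeds unchanged while traversing the route, which is a simplification; the function name and the sample values are hypothetical.

```python
def travel_time_from_speeds(lengths_km, speeds_kmh):
    """Return route travel-time in minutes from per-segment lengths and predicted speeds."""
    if len(lengths_km) != len(speeds_kmh):
        raise ValueError("lengths and speeds must have the same number of segments")
    total_hours = 0.0
    for length, speed in zip(lengths_km, speeds_kmh):
        if speed <= 0:
            raise ValueError("predicted speeds must be positive")
        total_hours += length / speed  # time on this segment, in hours
    return total_hours * 60.0


# Hypothetical route with three segments and their predicted speeds.
route_minutes = travel_time_from_speeds([1.0, 2.0, 0.5], [60.0, 40.0, 30.0])
```

A direct method would instead model the travel-time series itself, without passing through speed.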

The problem generally consists of three components: data collection, data processing, and travel-time prediction. Traffic data is collected by loop detectors, radar monitors, Global Positioning System (GPS) devices, and similar sources. After pre-processing steps such as missing-data imputation and data aggregation, the data is stored in a historical database. Prediction algorithms then combine historical and real-time data to predict travel-time in the near future.
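The two pre-processing steps named here can be sketched as follows. This is only one possible choice, assuming that gaps are filled by linear interpolation and that aggregation means averaging over fixed windows; the function names and the sample detector readings are hypothetical.

```python
def fill_missing(values):
    """Linearly interpolate None entries; gaps at the edges copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observations to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def aggregate(values, window):
    """Average consecutive non-overlapping windows (the last one may be shorter)."""
    chunks = [values[i:i + window] for i in range(0, len(values), window)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


# Hypothetical speed readings (km/h) with missing samples marked as None.
raw = [60.0, None, 50.0, 40.0, None, None, 10.0]
clean = fill_missing(raw)
averaged = aggregate(clean, 2)
```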

3 Classification of Travel-Time Prediction Methods

Various travel-time prediction methods have been proposed in the past decades. We categorize them as model-based or data-driven methods (see Fig. 1).

Fig. 1. Classification of travel-time prediction methods

Model-based methods predict future travel-time by building traffic models from traffic parameters such as density, flow, and speed, and use these models to estimate how traffic conditions evolve over time. This paper describes two common traffic models used for travel-time prediction, namely queuing theory [4,5,6] and the Cell Transmission Model [7,8,9,10].

The data-driven methods predict travel-time by mining potential patterns. We classify data-driven methods into two categories: parametric and non-parametric models. Common parametric models include Linear Regression [11,12,13], Autoregressive Integrated Moving Average [14,15,16,17] and Kalman Filtering [17,18,19,20,21]. Non-parametric models of travel-time prediction include Neural Networks (Back Propagation Neural Network [22, 23], State-Space Neural Network [24, 25], Recurrent Neural Network [26, 27], and Long Short-Term Memory Network [28, 29]), Support Vector Regression [30,31,32], Nearest Neighbors [33,34,35], and Ensemble Learning methods [36,37,38,39].

4 Review of Travel-Time Prediction Methods

4.1 Model-Based Methods

These methods build models from traffic data such as flow, speed, and density. They can describe either the collective behavior of many vehicles or the individual behavior of a single vehicle. Table 1 lists the description and performance of these methods.

Table 1. Description and performance of model-based methods

4.1.1 Queuing Theory

The queuing theory model generally utilizes historical data to analyze the length of the waiting queue, number of vehicles waiting in the queue and waiting time to obtain statistical patterns, and then predicts travel-time.
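To make the queue-based reasoning concrete, the following Python sketch uses a deterministic point queue. This is a textbook simplification chosen for illustration and is not the sandglass or time-delay model of [4]; the function name, the fixed capacity, and the sample arrival counts are assumptions.

```python
def point_queue_travel_times(arrivals, capacity, free_flow_time):
    """Estimate travel-time per interval as free-flow time plus queueing delay.

    arrivals: vehicles arriving at the bottleneck in each interval
    capacity: vehicles the bottleneck can discharge per interval
    free_flow_time: travel-time without queueing, in intervals
    """
    queue = 0.0
    times = []
    for arriving in arrivals:
        # The queue grows when demand exceeds capacity and never drops below zero.
        queue = max(0.0, queue + arriving - capacity)
        # A vehicle joining the back of the queue waits queue / capacity intervals.
        times.append(free_flow_time + queue / capacity)
    return times


# Hypothetical demand pattern with a short peak above capacity.
queue_times = point_queue_travel_times([10, 30, 30, 10, 0], 20, 2.0)
```

In this example the estimated travel-time rises while demand exceeds capacity and returns to the free-flow value once the queue has discharged.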

Takaba et al. [4] employed a sandglass model and a time-delay model to predict travel-time using data from Mejiro Street, Tokyo. The error rate (ER) was about 11–24%. They found that the performance of the sandglass model was more stable than the time-delay model. Akiva et al. [5] proposed a framework called DynaMIT to predict travel-time. However, it is not suitable for long-term forecasting. Skabardonis et al. [6] used a time-space model to predict travel-time on the main roads. They conducted experiments on Washington and Lincoln Avenue. The ER was less than 5%.

4.1.2 Cell Transmission Model

The Cell Transmission Model (CTM) can describe the formation, propagation, and dissipation of queues, as well as the backward propagation of congestion waves. In CTM, a road is divided into fixed-length cells, and vehicles move from each cell to the adjacent downstream cell.
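A single update step of the basic CTM can be sketched in Python as follows. The sketch follows the standard formulation in which the flow between two cells is limited by the vehicles available upstream, the maximum flow, and the remaining space downstream; it is not the specific variant used in [7,8,9,10], and the parameter names and values are hypothetical.

```python
def ctm_step(n, inflow, q_max, n_max, delta):
    """Advance cell occupancies by one time step.

    n: vehicles currently in each cell, ordered upstream to downstream
    inflow: vehicles wanting to enter the first cell in this step
    q_max: maximum vehicles that can cross a cell boundary per step
    n_max: maximum vehicles a cell can hold (jam occupancy)
    delta: ratio of backward wave speed to free-flow speed
    """
    cells = len(n)
    flows = [0.0] * (cells + 1)  # flows[i] enters cell i; flows[cells] leaves the road
    flows[0] = min(inflow, q_max, delta * (n_max - n[0]))
    for i in range(1, cells):
        # Sending capacity of the upstream cell vs. receiving capacity of cell i.
        flows[i] = min(n[i - 1], q_max, delta * (n_max - n[i]))
    flows[cells] = min(n[-1], q_max)  # the road exit is assumed unobstructed
    return [n[i] + flows[i] - flows[i + 1] for i in range(cells)]


# Hypothetical three-cell road whose middle cell is jammed.
occupancy = ctm_step([5.0, 20.0, 0.0], inflow=4.0, q_max=6.0, n_max=20.0, delta=1.0)
```

Because the middle cell is full, no vehicles leave the first cell in this step, which is how the model reproduces the upstream spread of a queue.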

Juri et al. [7] combined statistical forecasting techniques with CTM simulation to forecast short-term travel-time online. The advantage of the framework is that it is flexible and can take advantage of online data. Wan et al. [8] utilized Link-Node CTM to provide a probability distribution of travel-time. Xiong et al. [9] proposed a three-stage highway travel-time prediction framework. Seybold [10] proposed an improved CTM (CTM-v) model, and carried out experiments using data from E4 highway. The mean percentage error (MPE) of the proposed model was reduced by 16%. We find that the least squares (LS) and total least squares (TLS) methods can optimize parameters of CTM, thus improving the accuracy of CTM.

4.2 Data-Driven Methods

The idea of data-driven methods is to fit a mapping function between variables to approximate the real situation with a large quantity of historical data.

4.2.1 Parametric Methods

Parametric methods generally assume that the data follow a certain distribution and train models of a pre-defined form. Table 2 shows the description and performance of these parametric methods.

Table 2. Description and performance of parametric methods

Linear Regression.

The Linear Regression (LR) model assumes that travel-time is a linear function of traffic variables.
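As a minimal illustration of this assumption, the Python sketch below fits a line with a single predictor by ordinary least squares. The choice of predictor (the currently measured travel-time) and the sample values are hypothetical, and the studies reviewed below use richer formulations.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical pairs: travel-time measured now vs. travel-time observed later (minutes).
current = [10.0, 12.0, 14.0, 16.0]
future = [11.0, 14.0, 17.0, 20.0]
intercept, slope = fit_line(current, future)
predicted = intercept + slope * 18.0
```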

Kwon et al. [11] employed an LR model to predict highway travel-time with data from I-880S in California. The mean absolute percentage error (MAPE) was lower than 23%. Zhang et al. [12] used an LR model with time-varying coefficients to predict travel-time. The ER on I-880 data increased from 5% to 24%. The ER on I-405 data was about 8–14%. Sun et al. [13] exploited a multi-variable local LR model to predict speed using data from the US-290 highway. The mean relative error (MRE) was 11.38%. The results showed that the local LR model performed better than k-nearest neighbors and kernel smoothing methods.
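Several of the error measures quoted in this review have standard definitions, which can be written out in Python as below. Individual studies may normalize their errors differently, so the sketch only states the common forms; the sample values are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical observed and predicted travel-times (minutes).
observed = [10.0, 20.0, 40.0]
forecast = [12.0, 18.0, 40.0]
scores = (mae(observed, forecast), mape(observed, forecast), rmse(observed, forecast))
```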

Autoregressive Integrated Moving Average.

Autoregressive Integrated Moving Average (ARIMA) models convert a non-stationary time series into a stationary one by differencing, and then fit a regression on the lagged values of the series and lagged random errors.
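The differencing and autoregression steps can be illustrated with a deliberately reduced Python sketch. It assumes an ARIMA(1,1,0) structure without a constant term, so the moving-average part is omitted, and it estimates the single coefficient by least squares; the function name and the sample series are hypothetical.

```python
def arima_110_forecast(series):
    """One-step forecast: difference once, fit AR(1) on the differences, undo the differencing."""
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    if len(diffs) < 2:
        raise ValueError("need at least three observations")
    # Least-squares estimate of phi in d[t] = phi * d[t-1] + error.
    numerator = sum(prev * nxt for prev, nxt in zip(diffs, diffs[1:]))
    denominator = sum(prev * prev for prev in diffs[:-1])
    phi = numerator / denominator if denominator else 0.0
    # Predicted next difference, added back to the last observed level.
    return series[-1] + phi * diffs[-1]


# Hypothetical travel-time series (minutes) whose increments shrink by half each step.
next_value = arima_110_forecast([10.0, 12.0, 13.0, 13.5, 13.75])
```

In practice a statistics library would select the model orders and estimate all terms jointly.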

Oda et al. [14] experimented with ARIMA using vehicle sensor data collected on a 7 km highway. The ER was less than 13.9%. Zhicharevich et al. [15] applied the KARIMA model, which combines a Kohonen network with ARIMA, to predict short-term travel-time. Xia et al. [16] combined a Seasonal ARIMA (SARIMA) with an adaptive Kalman Filter (KF). They used detector data from the I-80 highway and reported a MAPE of 5.34%. The model can continuously adjust its forecasts as real-time data becomes available. Sun et al. [17] forecasted the travel-time of origin-destination pairs by combining SARIMA with KF. The results showed that the mean absolute error (MAE) and MAPE of the combined model were both less than 7%, which was better than SARIMA or KF alone.

Kalman Filtering.

Kalman Filtering (KF) uses a state-space model of a linear stochastic system, which consists of a state equation and an observation equation. It optimally estimates the state of the system from input and output observation data.
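The predict and update cycle can be sketched in Python for the simplest possible case. The sketch assumes a scalar state (the travel-time itself) that follows a random walk and is observed directly with noise; the noise variances, initial values, and observations are hypothetical.

```python
def kalman_filter(observations, process_var, obs_var, initial_state, initial_var):
    """Scalar Kalman filter with state x[k] = x[k-1] + w and observation z[k] = x[k] + v."""
    state, variance = initial_state, initial_var
    estimates = []
    for z in observations:
        # Predict: the random-walk state keeps its value, uncertainty grows.
        variance = variance + process_var
        # Update: blend prediction and observation according to their uncertainties.
        gain = variance / (variance + obs_var)
        state = state + gain * (z - state)
        variance = (1.0 - gain) * variance
        estimates.append(state)
    return estimates


# Hypothetical travel-time observations (minutes).
filtered = kalman_filter([12.0, 11.0, 13.0], process_var=1.0, obs_var=1.0,
                         initial_state=10.0, initial_var=1.0)
```

A larger observation variance makes the filter trust new measurements less, which smooths the estimates.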

Chen et al. [18] conducted experiments using simulation data from I-80 in New Jersey and obtained a relative root square error (RRSE) of less than 2.8%. Ji et al. [19] established KF equations for dynamic travel-time prediction. The MRE of the model was 1.6%. Ojeda et al. [20] proposed an adaptive KF for online travel-time prediction. In simulation experiments the ER was less than 9%. Liu et al. [21] combined simple exponential smoothing (SES) with KF. The experiment showed that the mean absolute relative error (MARE) of the combined model (ESES) was 3.1%, which was better than KF or SES alone. We think that KF methods can optimize smoothing factors over time, thus improving the performance of SES when traffic conditions change suddenly.

4.2.2 Non-parametric Methods

Non-parametric methods make no assumptions about the distribution of the data. They learn from data and train models directly or indirectly. Table 3 shows the description and performance of non-parametric models in selected studies.

Table 3. Description and performance of non-parametric methods

Neural Networks.

For travel-time prediction, Neural Networks (NNs) are generally trained with travel-time or speed data as input.

Back Propagation Neural Network.

Park et al. [22] established a BPNN model and found the MAPE was 7.4–18%. Wisitpongphan et al. [23] designed a BPNN model with three hidden layers to predict travel-time. The mean squared error (MSE) of the proposed model was less than 3%.

State-Space Neural Network.

Lint et al. [24] proposed a framework to process missing data. The MRE of the model was 1.6%. Li et al. [25] exploited a Bayesian SSNN (BSSNN) with terminal conditions. Compared with the SSNN model, the training time of BSSNN was reduced by 90 min, and the MAPE decreased by 0.17%. We conclude that using control factors to limit confidence intervals can shorten the training time of a neural network, accelerate convergence, and enhance stability.

Recurrent Neural Network.

Yun et al. [26] conducted an experiment and found that the MAPE of RNN was 12% lower than that of BPNN. We think the reason is that RNN has a short-term memory and therefore handles time series data better than BPNN. Ickes et al. [27] used a Genetic Algorithm (GA) to improve the performance of a Time-Delayed Recurrent Network (TDRN). The experiment showed that the MPE of the model was less than 15%.

Long Short-Term Memory Network.

Duan et al. [28] used travel-time data to verify the performance of LSTM. The MRE of LSTM was 0.17–0.77. Liu et al. [29] proposed an LSTM-DNN model using travel-time data from the I-80 highway and found a MAPE of less than 7.3%. We believe that the model can mine both short-term and long-term correlation patterns in travel-time data. However, training such models takes a long time.

Support Vector Regression.

The basic idea of Support Vector Regression (SVR) is to map the training data from the low-dimensional input space to a high-dimensional feature space by means of a kernel function. In that feature space, SVR fits a regression function that is as flat as possible while tolerating deviations from the training targets that fall within a margin of width ε.
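Two standard ingredients of SVR, a kernel function and the ε-insensitive loss, can be written out in Python as below. The sketch only evaluates these quantities and does not solve the SVR optimization problem; the RBF kernel is one common choice assumed here, and the parameter values and sample data are hypothetical.

```python
import math


def rbf_kernel(u, v, gamma):
    """Radial basis function kernel: similarity of two feature vectors in the implicit feature space."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Mean loss that ignores errors smaller than epsilon and grows linearly beyond it."""
    losses = [max(0.0, abs(a - p) - epsilon) for a, p in zip(actual, predicted)]
    return sum(losses) / len(losses)


# Hypothetical travel-times (minutes): only the second prediction falls outside the margin.
svr_loss = epsilon_insensitive_loss([10.0, 20.0, 30.0], [10.5, 23.0, 30.0], epsilon=1.0)
self_similarity = rbf_kernel((1.0, 2.0), (1.0, 2.0), gamma=0.5)
```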

Wu et al. [30] used speed data to predict travel-time using SVR. The MRE of SVR was less than 4.5% and the RMSE was less than 7.4%. Castro-Neto et al. [31] proposed an online SVR (OL-SVR) model using PeMS data. The result showed that the MAPE was less than 9% in off-peak hours, while the MAPE was less than 23.4% in peak hours. Gao et al. [32] exploited Immune Genetic Algorithms (IGA) to optimize SVR parameters. The experiment reported the MAPE of the model was 12.4%.

Nearest Neighbors.

The Nearest Neighbors algorithm is also known as k-nearest neighbors (k-NN). In k-NN classification, a sample is assigned to the class to which most of its nearest neighbors in the feature space belong. For travel-time prediction, k-NN regression instead uses the historical travel-times of the most similar past samples to predict the travel-time of the current sample.
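A minimal k-NN regression sketch in Python is given below. It assumes Euclidean distance and an unweighted average of the neighbors' travel-times; the feature definition (the two most recent travel-times) and the historical records are hypothetical.

```python
def knn_predict(history, query, k):
    """Predict travel-time as the mean over the k historical samples closest to the query.

    history: list of (feature_vector, travel_time) pairs
    query: feature vector of the current situation
    """
    def squared_distance(features):
        return sum((a - b) ** 2 for a, b in zip(features, query))

    nearest = sorted(history, key=lambda item: squared_distance(item[0]))[:k]
    return sum(travel_time for _, travel_time in nearest) / len(nearest)


# Hypothetical records: (two most recent travel-times) -> travel-time that followed (minutes).
records = [
    ((10.0, 11.0), 12.0),
    ((10.0, 12.0), 13.0),
    ((20.0, 22.0), 25.0),
    ((21.0, 23.0), 26.0),
]
knn_estimate = knn_predict(records, (10.0, 11.5), k=2)
```

Because every query is compared with the entire history, the cost of a prediction grows with the size of the database.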

Lim et al. [33] combined a point-detection system with an interval-detection system to predict travel-time. The MAPE of the proposed model was 4.3%–14.8%. Wang et al. [34] proposed an improved 1-NN model and showed that the MAPE was less than 8.6%, and the MPE was less than 16.2%. Tak et al. [35] proposed a multi-layer k-NN (Mk-NN) travel-time prediction framework for cloud systems. The framework conducted data classification, global matching, and local matching. The result showed that Mk-NN was 8 times faster than k-NN, and the MAPE and RMSE were less than 3.5%. We believe that the multi-layer matching process reduces searching space and computational complexity, making it a promising method.

Ensemble Learning.

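The main idea of Ensemble Learning (EL) is to predict travel-time by combining the outputs of multiple base learners, for example by voting in classification or by averaging or summing predictions in regression.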
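As one illustration of how base learners can be combined, the Python sketch below implements a very small gradient boosting regressor for squared error. It assumes a single input feature and uses one-split regression stumps as base learners, each fitted to the residuals of the current ensemble; the function names, hyperparameters, and sample data are hypothetical, and the sketch does not reproduce the models of the studies reviewed below.

```python
def fit_stump(x, residuals):
    """Find the threshold split of x that best fits the residuals with two constants."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def gradient_boost(x, y, rounds, learning_rate):
    """Return (base prediction, list of stumps); x needs at least two distinct values."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        # Each new stump targets what the current ensemble still gets wrong.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [p + learning_rate * (left_mean if xi <= threshold else right_mean)
                       for xi, p in zip(x, predictions)]
    return base, stumps


def boosted_predict(model, xi, learning_rate):
    base, stumps = model
    return base + learning_rate * sum(lm if xi <= t else rm for t, lm, rm in stumps)


# Hypothetical data: departure hour -> travel-time (minutes).
hours = [6.0, 7.0, 8.0, 9.0, 10.0]
minutes = [10.0, 14.0, 20.0, 15.0, 11.0]
model = gradient_boost(hours, minutes, rounds=20, learning_rate=0.5)
fitted = [boosted_predict(model, h, 0.5) for h in hours]
```

A Random Forest would instead train its trees independently on resampled data and average their outputs.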

Zhang et al. [36] built a Gradient Boosting (GB) regression method using travel-time data from the I-95 highway. The MAPE was 8.7%–18.4% during peak periods, and 2.3%–14.8% during off-peak periods. Yu et al. [37] combined Random Forest (RF) with k-NN (RFNN). The MAPE of RFNN was less than 14.3%. Gupta et al. [38] employed RF and GB models to predict the travel-time of taxis in Porto. The MRE of RF was 17%–29% and the MRE of GB was 24%–29%. Hamner et al. [39] applied a context-dependent RF method to predict travel-time. The RMSE of the model was less than 7.5%. We conclude that GB regression methods perform better than RF regression methods. This is because GB models pay more attention to samples with larger prediction errors, while samples in RF are randomly selected. However, RF requires less training time than GB because the trees in an RF can be trained in parallel.

5 Open Issues

We classify travel-time prediction methods as model-based or data-driven. The two classes differ in their applicable scenarios, advantages, and disadvantages.

Most model-based methods are suitable for short-distance, short-term prediction on highways and urban roads. These methods have well-defined traffic models and a mature theoretical foundation. However, they transfer poorly to other roads and conditions.

Data-driven methods can be used for short-term and long-term prediction on highways. Only a few studies apply them to urban roads. Most data-driven methods are suitable for non-linear, high-dimensional data. However, most of them have numerous parameters and lack interpretability. Only a few methods, such as k-NN, SSNN, and EL methods, are partly interpretable.

We discuss some solutions to overcome shortcomings of existing methods, and highlight significant research challenges in the future as follows.

(1) Data processing: Existing data-processing algorithms generally assume that noise follows a known distribution, whereas real noise is difficult to describe. New algorithms are therefore worth studying. In addition, excessive data increases the computational cost of models such as k-NN. Clustering methods can be used to select high-quality data.

(2) Combining spatial information: Travel-time on a target road can be affected by vehicles from upstream and downstream roads. Correlation metrics between roads may help to improve the accuracy of prediction methods. In addition, data mining algorithms can be used to analyze traffic data and detect changes in traffic conditions.

(3) Hybrid methods: Hybrid algorithms can achieve better performance. SSNN can capture spatial information but has a short memory, so combining SSNN with LSTM is a promising direction. Furthermore, Mk-NN can be applied to select training samples for GB. High-quality samples may improve the accuracy of GB.

(4) Deep learning algorithms: Deep learning methods have been applied in many fields in recent years. A Deep Belief Network (DBN), which consists of several Restricted Boltzmann Machines (RBMs), can learn latent patterns and trends from data. Travel-time prediction with DBN models is therefore worth studying.

6 Conclusion

This paper reviews the travel-time prediction methods proposed in the past decades. These methods are classified as model-based or data-driven, and are compared in terms of datasets, prediction range, and accuracy. Finally, some solutions are proposed to overcome the shortcomings of existing methods. Although many methods exist for predicting travel-time, many problems remain to be solved.