A Short-term Traffic Flow Forecasting Method Based on the Hybrid PSO-SVR

Hu, Wenbin; Yan, Liping; Liu, Kaizeng; Wang, Huan

doi:10.1007/s11063-015-9409-6

Published: 08 February 2015

Volume 43, pages 155–172, (2016)
Neural Processing Letters

Abstract

Accurate short-term traffic flow forecasting is important for real-time traffic control, but the complex nonlinear pattern of the data makes high precision difficult to achieve. The support vector regression (SVR) model has been widely used to solve nonlinear regression and time series prediction problems. To obtain higher precision with less learning time, this paper presents a hybrid PSO-SVR forecasting method, which uses particle swarm optimization (PSO) to search for the optimal SVR parameters. To find a PSO variant better suited to SVR parameter search, this paper proposes three strategies for handling particles that fly out of the search space. A comparison of the three strategies shows that one of them lets PSO reach the optimal parameters more quickly; the PSO using this strategy is called fast PSO. Furthermore, to address the loss of prediction accuracy caused by noise in the original data, this paper proposes a hybrid PSO-SVR method with historical momentum, based on the similarity of historical short-term flow data. Extensive comparison experiments indicate that the proposed model produces more accurate forecasts than other state-of-the-art algorithms, and that the method with historical momentum still produces accurate forecasts when the data contain noise.

1 Introduction

Short-term traffic forecasting has been one of the most important problems in real-time traffic control for three decades. The quantities to be forecast include journey time, vehicle speed, traffic flow density and so on. The accuracy of traffic flow forecasts directly influences the effectiveness of traffic guidance, planning and control.

Various approaches have been applied to short-term traffic flow forecasting. They can be classified as follows: (1) time series analysis methods, including ARIMA [1], SARIMA [2], Kalman filtering models [3, 4] and so on; (2) machine learning methods, including K nearest neighbor (KNN) [5–10], kernel estimator [11], artificial neural network (ANN) [12] and so on. Among the ANN approaches, several improved variants have been proposed, such as back-propagation neural network (BPNN) [13], radial basis function neural network [14], wavelet network [15], fuzzy neural network [16], object-oriented neural network [17] and so on. However, these methods are either relatively complicated to model or not accurate enough in forecasting [18–22]. This paper therefore proposes a hybrid method that is less complicated to model and can yield accurate real-time flow forecasts.

Unlike ANN, the support vector regression (SVR) model can obtain the globally optimal solution and, by applying a kernel function, can map a nonlinear regression problem into a linear one [23]. Particle swarm optimization (PSO) is a population-based stochastic approach for solving continuous and discrete optimization problems; compared with the genetic algorithm (GA) and other heuristic algorithms, it is easy to implement and has few parameters to tune. Building on an analysis of SVR and PSO, this paper proposes a hybrid short-term traffic flow forecasting method based on PSO and SVR. In addition, to handle traffic data that contain noise, this paper proposes a PSO-SVR forecasting method with historical momentum, based on the historical similarity of traffic flow data. The contributions of this paper can be summarized as follows.

(1) It proposes a short-term traffic flow forecasting method based on hybrid PSO-SVR, which uses PSO to optimize the parameters of SVR.

(2) It proposes three strategies for handling particles that fly out of the search space, in order to find an approach that improves the search speed of PSO.

(3) To handle original traffic data that contain noise, it proposes a PSO-SVR forecasting method with historical momentum, which is based on the similarity of historical data.

The rest of this paper is organized as follows. Section 2 introduces related work on the classical methods for short-term traffic flow forecasting. Section 3 details the short-term traffic flow forecasting method based on hybrid PSO-SVR. Section 4 details the PSO-SVR method with historical momentum. Section 5 analyzes the results of extensive comparison experiments. Finally, Sect. 6 presents conclusions and future research.

2 Related Work

The related work on the short-term traffic flow forecasting can be presented as follows.

(1) The Kalman filtering model and the system identification model treat traffic flow prediction as a multidimensional model that involves the relationship between time and location. However, methods of this kind have difficulty dealing with observation noise in the traffic flow data, and their prediction accuracy is limited. Danech-Pajouh and Aron [24] adopted a hierarchical statistical method, which used a mathematical clustering technique to classify the traffic flow data and built a tuned linear regression model for each category. The modified method improved the prediction accuracy, but it placed high demands on the quality of the underlying data [25, 26].

(2) The ARIMA model developed by Box [20] is one of the most popular methods in traffic flow forecasting. Kamarianakis [27] successfully applied an ARIMA model that considers space and time factors to forecast traffic flow data. However, ARIMA tends to concentrate on the average values of the past data series, which makes it unable to capture rapid variations in the traffic flow. In addition, Williams [2] applied the seasonal ARIMA (SARIMA) model to traffic flow forecasting, which accounted for the periodic difference between peak and non-peak traffic data and obtained good forecasting results. However, detecting the outliers and estimating the parameters of the SARIMA model took much time.

(3) ANN imitates the information processing of the human neurological system, and several kinds of ANN have been widely used in traffic flow forecasting. Yin [28] applied a fuzzy-neural model (FNM) to predict the traffic flow of an urban street network, which experiments verified to be more accurate than ANN. Vlahogianni [29] adopted a genetic algorithm (GA) and a multilayer structure optimization strategy to determine the appropriate neural network structure. Their experiments indicated that a simple, static neural network with genetically optimized step size, momentum and number of hidden units was suitable for modeling univariate and multivariate traffic data. The ANN model can fit arbitrary functions, especially nonlinear ones, but its disadvantages are that the learned function is difficult to interpret and that the global optimum of a non-convex problem is difficult to find.

(4) Support Vector Machine (SVM) can effectively overcome these shortcomings of ANN. It trains not only by empirical risk minimization but also by structural risk minimization, which minimizes an upper bound on the error. In theory SVM obtains the global optimum, whereas ANN training may only reach a local optimum. In addition, by applying a kernel function, SVM can map a nonlinear problem in the low-dimensional input space to a linear problem in a high-dimensional feature space. Hong [30, 33] applied SVR to short-term traffic flow forecasting and used the simulated annealing algorithm and the genetic algorithm to optimize the selection of the SVR parameters. However, that work did not consider the case in which the traffic data contain noise.

In conclusion, short-term traffic flow forecasting is characterized by nonlinearity, complexity and real-time requirements. However, the existing methods fall short in constructing adaptive models, ensuring high accuracy and providing a real-time solution. In this paper, we combine PSO with SVR and propose a hybrid PSO-SVR short-term traffic flow forecasting method, which provides accurate, real-time prediction of the short-term traffic flow and reduces the influence of noise in the traffic flow data.

3 A Hybrid PSO-SVR Method for Short-term Traffic Flow Forecasting

3.1 SVR

Support Vector Machines are a family of algorithms that can be used to solve classification and regression problems. They were invented by Vladimir Vapnik and his colleagues and were first introduced at the Computational Learning Theory (COLT) conference in 1992. Their basic model is a hyperplane with the maximum margin in feature space.

Given the training dataset $\left\{ {\left( {x_1 ,y_1 } \right) ,\ldots ,\left( {x_n ,y_n } \right) } \right\} $, the learning target of SVR is to find a function that represents the relationship between $x$ and $y$, so that when a new $x$ is given, the function returns the corresponding forecast value. This function is shown in Eq. (1):

$$\begin{aligned} f\left( x \right) =w\phi \left( x \right) +b \end{aligned}$$

(1)

where $w$ and $b$ are the parameters that SVR learns; together they define a linear hyperplane that fits the training dataset. $\phi \left( x \right) $ is a nonlinear mapping of $x$, which maps $x$ to a new space when the relationship between $x$ and $y$ is nonlinear. In the new space the relationship between $\phi \left( x \right) $ and $y$ is linear.

The goal of SVR is to minimize the empirical risk, which is defined in Eq. (2), where $L_{\epsilon }$ is the $\epsilon $-insensitive loss function proposed by Vapnik. $L_{\epsilon }$ is defined in Eq. (3).

$$\begin{aligned} R_{emp} =\frac{1}{n}\sum \nolimits _{i=1}^n L_{\epsilon } (y_i ,f(x_i )) \end{aligned}$$

(2)

$$\begin{aligned} L_{\epsilon } \left( {y,f\left( x \right) } \right) = {\begin{cases} 0, & \mathrm{if}\ \left| {y-f(x)} \right| \le \epsilon \\ \left| {y-f(x)} \right| -\epsilon , & \mathrm{otherwise} \end{cases}} \end{aligned}$$

(3)
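As an illustration of Eqs. (2) and (3), the following minimal sketch computes the $\epsilon $-insensitive loss and the resulting empirical risk. The function names and the sample flow values are hypothetical and chosen only for illustration; they do not come from the experiments in this paper.

```python
def eps_insensitive_loss(y, f_x, eps):
    """Eq. (3): zero inside the eps-tube, linear in the residual outside it."""
    residual = abs(y - f_x)
    return 0.0 if residual <= eps else residual - eps


def empirical_risk(y_true, y_pred, eps):
    """Eq. (2): mean eps-insensitive loss over the n samples."""
    losses = [eps_insensitive_loss(y, f, eps) for y, f in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Hypothetical flow values (vehicles per interval), for illustration only.
observed = [100.0, 120.0, 90.0]
predicted = [103.0, 110.0, 90.5]

# Only the second sample leaves the tube of half-width 5: |120 - 110| - 5 = 5.
risk = empirical_risk(observed, predicted, eps=5.0)
print(risk)
```

Residuals smaller than $\epsilon $ contribute nothing, which is why small fluctuations around the fitted function do not affect the risk.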

SVR performs linear regression in the feature space to lower the empirical risk under the $\epsilon $-insensitive loss and, at the same time, tries to reduce the complexity of the model by minimizing $\Vert w\Vert ^{2}$. This is expressed in Eq. (4), where $\xi _i $, $\xi _i^*$ $(i=1,\ldots ,n)$ are non-negative slack variables representing the deviation between the fitted value $f(x_i)$ on the training dataset and the actual value $y_i$.

$$\begin{aligned} \begin{array}{ll} \min \limits _{w,b,\xi ,\xi ^{*}} & \frac{1}{2} \Vert w\Vert ^{2}+C \sum \limits _{i=1}^n (\xi _i +\xi _i^*)\\ \mathrm{s.t.} & w\phi \left( {x_i } \right) +b-y_i \le \epsilon +\xi _i ,\\ & y_i -w\phi \left( {x_i } \right) -b\le \epsilon +\xi _i^*,\\ & \xi _i ,\xi _i^*\ge 0,\quad i=1,\ldots ,n. \end{array} \end{aligned}$$

(4)

This optimization problem can be transformed into its dual problem, whose solution is given by Eq. (5), where $a_i^*$ and $a_i $ are the Lagrange multipliers obtained by solving the dual problem and $K\left( {x_i ,x_j } \right) $ is the kernel function, which equals the inner product of $\phi \left( {x_i } \right) $ and $\phi \left( {x_j } \right) $. Any function that satisfies Mercer’s condition [31] can be used as the kernel function.

$$\begin{aligned}&f( x )=\sum \nolimits _{i=1}^n \left( {a_i^*-a_i } \right) K\left( {x_i ,x} \right) +b\nonumber \\&{s.t.}\quad \quad 0\le a_i^*\le C,\quad 0\le a_i \le C \end{aligned}$$

(5)

The most frequently used kernel functions are polynomial kernel function, sigmoid kernel function and the radial basis kernel function. This paper uses the radial basis kernel function, which is defined as Eq. (6):

$$\begin{aligned} K\left( {x,z} \right) =\exp \left( {-\frac{\Vert x-z\Vert ^{2}}{2\gamma ^{2}}} \right) \end{aligned}$$

(6)

where $\gamma $ is a parameter that must be set manually, as must $\epsilon $ and $C$ in Eq. (4). All three have a strong influence on the forecasting accuracy of SVR.
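The kernel and the dual-form predictor of Eq. (5) can be sketched as follows. This is a minimal illustration that uses the standard Gaussian form of the radial basis kernel, in which the exponent is negative so that the kernel decays with distance. The names `rbf_kernel`, `svr_predict` and `coef`, and all numeric values, are assumptions made for the example; the multipliers would in practice come from solving the dual problem, which this sketch does not do.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel: exp(-||x - z||^2 / (2 * gamma^2))."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2.0 * gamma ** 2))


def svr_predict(x, train_x, coef, b, gamma):
    """Eq. (5): f(x) = sum_i (a_i^* - a_i) K(x_i, x) + b.

    coef[i] holds the difference a_i^* - a_i for training point train_x[i].
    """
    return sum(c * rbf_kernel(xi, x, gamma) for c, xi in zip(coef, train_x)) + b


# Made-up training points and multiplier differences, for illustration only.
train_x = [[0.0], [1.0]]
coef = [1.0, -1.0]
prediction = svr_predict([0.0], train_x, coef, b=0.5, gamma=1.0)
print(prediction)
```

A small $\gamma $ makes the kernel fall off quickly, so each training point influences only nearby inputs; a large $\gamma $ smooths the predictor.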

The hybrid PSO-SVR method proposed in this paper uses SVR to forecast the short-term traffic flow and uses PSO to optimize the selection of the SVR parameters. The forecasting process is shown in Fig. 1, where the “Pre-Processing Unit” preprocesses the real-time data from the sensor, or the historical data, into the format that SVR needs.

3.2 Performance Evaluation Index

In this paper, $\textit{RMSE}$ (root mean squared error) and $r^{2}$ (R-squared) are used to evaluate the forecasting performance of the model; they are defined in Eqs. (7) and (8) respectively. Here $n$ is the number of test samples, $x_i$ $(i=1,\ldots ,n)$ is the $i$-th test instance, $f\left( {x_i } \right) $ is its forecast value, and $y_i$ is its true value. A smaller $\textit{RMSE}$ and a larger $r^{2}$ both indicate higher forecasting accuracy; $r^{2}$ never exceeds 1.

$$\begin{aligned} \textit{RMSE}=\sqrt{\frac{1}{n} \sum \nolimits _{i=1}^n \left( {f( {x_i } )-y_i } \right) ^{2}} \end{aligned}$$

(7)

$$\begin{aligned} r^{2}=\frac{\left( {n \sum \nolimits _{i=1}^n f( {x_i } )y_i -\sum \nolimits _{i=1}^n f(x_i ) \sum \nolimits _{i=1}^n y_i } \right) ^{2}}{\left( {n \sum \nolimits _{i=1}^n f( {x_i } )^{2}-( \sum \nolimits _{i=1}^n f(x_i ))^{2}} \right) \times \left( {n \sum \nolimits _{i=1}^n y_i^2 -\left( { \sum \nolimits _{i=1}^n y_i} \right) ^{2}} \right) } \end{aligned}$$

(8)
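Both indices are straightforward to compute. The sketch below is a minimal standard-library version, assuming plain Python lists of forecasts and true values. RMSE is taken as the square root of the mean squared error, as its name implies, and $r^{2}$ follows Eq. (8) term by term. The function names are illustrative.

```python
import math


def rmse(y_pred, y_true):
    """Root mean squared error between forecasts and true values."""
    n = len(y_true)
    return math.sqrt(sum((f - y) ** 2 for f, y in zip(y_pred, y_true)) / n)


def r_squared(y_pred, y_true):
    """Eq. (8): squared correlation between forecasts and true values."""
    n = len(y_true)
    s_f = sum(y_pred)
    s_y = sum(y_true)
    s_fy = sum(f * y for f, y in zip(y_pred, y_true))
    s_ff = sum(f * f for f in y_pred)
    s_yy = sum(y * y for y in y_true)
    numerator = (n * s_fy - s_f * s_y) ** 2
    denominator = (n * s_ff - s_f ** 2) * (n * s_yy - s_y ** 2)
    return numerator / denominator  # undefined when either series is constant


# Forecasts that are exactly proportional to the true values give r^2 = 1,
# even though the RMSE is far from zero.
print(r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(rmse([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The example shows why the two indices are reported together: $r^{2}$ measures how well the forecasts track the shape of the series, while $\textit{RMSE}$ measures the size of the errors.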

3.3 Hybrid PSO-SVR Algorithm

In order to get the optimal parameters of SVR, this paper uses PSO to optimize the selection process of the parameters of SVR, and proposes the PSO-SVR model. In this model, SVR is trained by the method of S-fold cross validation, and $\textit{RMSE}$ is selected to evaluate the performance of SVR. The smaller $\textit{RMSE}$ is, the better the SVR is.
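S-fold cross validation with $\textit{RMSE}$ as the score can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the experiments: the text does not specify how the folds are formed, so the sketch assigns every $s$-th sample to the same fold, and `fit` is a hypothetical callable that would, in PSO-SVR, train an SVR with a fixed $(\gamma ,C,\epsilon )$ and return its prediction function. A trivial mean predictor stands in for the SVR here.

```python
import math


def s_fold_rmse(xs, ys, s, fit):
    """Average validation RMSE over s folds (requires len(xs) >= s)."""
    n = len(xs)
    scores = []
    for k in range(s):
        val_idx = set(range(k, n, s))  # assumed split: every s-th sample
        train_x = [xs[i] for i in range(n) if i not in val_idx]
        train_y = [ys[i] for i in range(n) if i not in val_idx]
        predict = fit(train_x, train_y)  # train on the other s - 1 folds
        sq_err = [(predict(xs[i]) - ys[i]) ** 2 for i in val_idx]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(scores) / s


def fit_mean(train_x, train_y):
    """Stand-in learner (not an SVR): always predicts the training mean."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean


score = s_fold_rmse([0, 1, 2, 3], [1.0, 2.0, 1.0, 2.0], s=2, fit=fit_mean)
print(score)
```

The returned score is what a particle's fitness would be for one candidate parameter combination.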

In PSO, let $\Phi $ be the search space of the particles, which is the value range of the vector $\left( {\gamma ,C,\epsilon } \right) $, and let $n$ be the swarm size. The ranges of $\gamma $, $C$ and $\epsilon $ are set to [0, 1000], [1, 10000] and [0, 50] respectively. Each particle has its own position $\vec {x}_{i}$ and velocity $\vec {v}_{i}$, where $\vec {x}_{i}$ corresponds to $(\gamma ,C,\epsilon )$. The quality of a particle is given by its fitness function, which is the $\textit{RMSE}$ of the SVR trained with the parameters at this particle's position. At each iteration, the particles update their positions and velocities according to their own historical optimal position $\vec {b}_{i}$ and the global historical optimal position $\vec {g}$, both of which are updated after each iteration. The updating formulas for the velocity and position of each particle are shown in Eqs. (9) and (10). The iteration terminates either after the maximum number of iterations or when $\textit{RMSE}$ falls below a preset value. The position $\vec {g}$ is the output and represents the desired parameter combination. The variables discussed above are detailed in Table 1.

$$\begin{aligned} \vec {v}_{i}^{t+1}&= w\vec {v}_{i}^{t}+{\varphi }_{1}\vec {U}_{1}^{t}(\vec {b}_{i}-\vec {x}_{i}^{t}) +{\varphi }_{2}\vec {U}_{2}^{t}(\vec {g} -\vec {x}_{i}^{t}) \end{aligned}$$

(9)

$$\begin{aligned} \vec {x}_{i}^{t+1}&= \vec {x}_{i}^{t}+\vec {v}_{i}^{t+1} \end{aligned}$$

(10)
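The update rules of Eqs. (9) and (10) can be sketched as the loop below. Several parts are assumptions made only for illustration: the values of $w$, $\varphi _1$, $\varphi _2$, the swarm size and the iteration count are arbitrary defaults; particles that leave the search space are simply clamped to the boundary, which is a common choice and not necessarily one of the strategies proposed in this paper; and `toy_fitness` is a stand-in for the cross-validated $\textit{RMSE}$ of an SVR.

```python
import random


def pso(fitness, bounds, n_particles=20, n_iter=50,
        w=0.7, phi1=1.5, phi2=1.5, seed=0):
    """Minimize `fitness` over the box `bounds` using Eqs. (9) and (10)."""
    rng = random.Random(seed)
    dim = len(bounds)
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    best_x = [x[:] for x in xs]              # personal best positions b_i
    best_f = [fitness(x) for x in xs]
    g_idx = min(range(n_particles), key=best_f.__getitem__)
    g_x, g_f = best_x[g_idx][:], best_f[g_idx]   # global best position g

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                u1, u2 = rng.random(), rng.random()
                # Eq. (9): inertia + pull to personal best + pull to global best
                vs[i][d] = (w * vs[i][d]
                            + phi1 * u1 * (best_x[i][d] - xs[i][d])
                            + phi2 * u2 * (g_x[d] - xs[i][d]))
                # Eq. (10): move the particle
                xs[i][d] += vs[i][d]
                # Assumed boundary handling: clamp to the search space
                lo, hi = bounds[d]
                xs[i][d] = min(max(xs[i][d], lo), hi)
            f = fitness(xs[i])
            if f < best_f[i]:
                best_x[i], best_f[i] = xs[i][:], f
                if f < g_f:
                    g_x, g_f = xs[i][:], f
    return g_x, g_f


def toy_fitness(p):
    """Stand-in for the SVR RMSE: a bowl with its minimum at (10, 100, 1)."""
    gamma, c, eps = p
    return (((gamma - 10.0) / 1000.0) ** 2
            + ((c - 100.0) / 10000.0) ** 2
            + ((eps - 1.0) / 50.0) ** 2)


# Search ranges for (gamma, C, epsilon) as given in the text.
bounds = [(0.0, 1000.0), (1.0, 10000.0), (0.0, 50.0)]
best_params, best_score = pso(toy_fitness, bounds)
print(best_params, best_score)
```

Replacing `toy_fitness` with a function that trains an SVR at the given $(\gamma ,C,\epsilon )$ and returns its cross-validated $\textit{RMSE}$ would turn this loop into a parameter search of the kind described above; the early-stopping condition on a preset $\textit{RMSE}$ is omitted for brevity.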

Table 1 Description of PSO variables in PSO-SVR
