Abstract
Accurate forecasting of inter-urban traffic flow is one of the most important issues in research on road traffic congestion. Inter-urban traffic flow follows a complex nonlinear pattern and, particularly during daily peak periods, shows a cyclic (seasonal) trend. In recent years, the support vector regression (SVR) model has been widely used to solve nonlinear regression and time series problems. However, the application of SVR models to time series with a cyclic (seasonal) trend has not been widely explored. This investigation presents a traffic flow forecasting model that combines the seasonal support vector regression model with a chaotic immune algorithm (SSVRCIA) to forecast inter-urban traffic flow. A numerical example using traffic flow values from northern Taiwan is used to elucidate the forecasting performance of the proposed SSVRCIA model. The forecasting results indicate that the proposed model yields more accurate forecasts than the seasonal autoregressive integrated moving average, back-propagation neural network, and seasonal Holt–Winters models. Therefore, the SSVRCIA model is a promising alternative for forecasting traffic flow.
1 Introduction
The effective capacity of inter-urban motorway networks is an essential component of traffic control and information systems, particularly during daily peak periods. Since even slightly inaccurate capacity predictions lead to congestion with huge social costs in terms of travel time, fuel costs, and environmental pollution, accurate forecasting of traffic flow during peak periods has attracted considerable interest in the literature.
A wide variety of approaches have been applied to forecast the traffic flow of inter-urban motorway networks. These approaches can be classified according to the type of data, forecast horizon, and potential end-use [1], and include Kalman state space filtering models [2–5] and system identification models [6]. However, traffic flow data take the form of spatial time series and are collected at specific locations at constant intervals of time. The above-mentioned studies and their empirical results indicate that the problem of forecasting inter-urban motorway traffic flow is multi-dimensional, involving relationships among measurements made at different times and geographical sites. In addition, these methods have difficulty coping with observation noise and missing values during modeling. Therefore, Danech-Pajouh and Aron [7] employed a layered statistical approach with a mathematical clustering technique to group the traffic flow data and a separately tuned linear regression model for each cluster. Their experimental results revealed that the proposed model is superior to the autoregressive integrated moving average (ARIMA) model. To meet multi-dimensional pattern recognition requirements, such as intervals of time, geographical sites, and the relationships between dependent and independent variables, non-parametric regression models [8–10] have also been employed successfully to forecast motorway traffic flow.
Furthermore, ARIMA models, initially developed by Box and Jenkins [11], are among the most popular alternatives in traffic flow forecasting [10, 12–15]. For example, Kamarianakis and Prastacos [13] successfully employed the ARIMA model with space and time factors to forecast space–time stationary traffic flow. However, ARIMA models naturally tend to concentrate on the mean values of the past series data and therefore seem unable to capture the rapid variation underlying traffic flow [16]. As an extension of the ARIMA model, Williams [17] applied the seasonal ARIMA (SARIMA) model to traffic flow forecasting. The proposed model accounted for peak/non-peak flow periods by seasonal differencing, and the reported results showed that it significantly outperformed the heuristic forecast generation method in terms of forecasting accuracy. However, detecting the required outliers and estimating the parameters of a SARIMA model is quite time-consuming. These findings encourage the author to employ the SARIMA model as the benchmark model in this study.
As mentioned above, the process underlying inter-urban traffic flow is too complicated to be captured by a single linear statistical algorithm. Artificial neural network (ANN) models, which can approximate functions of any degree of complexity without prior knowledge of the problem, have therefore received much attention and have been considered as alternatives for traffic flow forecasting [14, 18–23]. ANN emulates the processing of the human neurological system to determine the related numbers of vehicles and temporal characteristics from historical traffic flow patterns, especially for nonlinear and dynamic evolutions. Therefore, ANN is widely applied in traffic flow forecasting. Recently, Yin et al. [24] developed a fuzzy-neural model (FNM) to predict traffic flow in an urban street network. The FNM contains two modules: a gate network (GN) and an expert network (EN). The GN classifies the input data using a fuzzy approach, and the EN identifies the input–output relationship by neural network approaches. The empirical results showed that the FNM provides more accurate forecasting results than the back-propagation neural network (BPNN) model. Vlahogianni et al. [22], building on a proper representation of traffic flow data with temporal and spatial characteristics, employed a genetic algorithm-based, multilayered, structural optimization strategy to determine the appropriate neural network structure. Their results show that the capabilities of a simple static neural network, with genetically optimized step size, momentum, and number of hidden units, are very satisfactory when modeling both univariate and multivariate traffic data.
Even though ANN-based forecasting models can approximate any function, particularly nonlinear functions, they have two limitations: the operation of the so-called black box is difficult to explain (for example, how to determine a suitable network structure), and the minimization of network training errors is a non-convex problem, so the global optimum is hard to find.
Support vector machines (SVMs) were originally developed to solve pattern recognition and classification problems. With the introduction of Vapnik’s ε-insensitive loss function, SVMs have been extended to solve nonlinear regression estimation problems, the so-called support vector regression (SVR), and have been successfully applied to forecasting problems in many fields, such as financial time series (stock index and exchange rate) forecasting [25–29], engineering and software (production values and reliability) forecasting [30, 31], atmospheric science forecasting [32–35], and electric load forecasting [36–40]. The practical results indicate that poor forecasting accuracy results from a lack of knowledge about how to select the three parameters (σ, C, and ε) of an SVR model, and structured methods for determining these three free parameters are lacking. Recently, several nature-inspired evolutionary algorithms have been applied to solve optimization problems, and the immune algorithm (IA) is one of them. The IA, proposed by Mori et al. [41] and used in this study, is based on the learning mechanism of natural immune systems. Similar to GA, SA, and PSO, IA is also a population-based evolutionary algorithm; therefore, it provides a set of solutions for exploration and exploitation of the search space to obtain an optimal or near-optimal solution [42]. In addition, the diversity of the employed population determines the search result: the desired solution or premature convergence (being trapped in a local minimum). As a special mechanism to avoid being trapped in a local minimum, the ergodicity of chaotic sequences has been used as an optimization technique hybridized with evolutionary algorithms. In this investigation, the chaotic immune algorithm (CIA) is used to determine the values of the three parameters in an SVR model.
On the other hand, as mentioned above, traffic flow data not only involve a complicated nonlinear data pattern but also reveal a cyclic (seasonal) trend during daily peak periods (morning/evening commute peak time). However, the application of SVR models to time series with a cyclic (seasonal) trend has not been widely explored. Therefore, this paper also applies the seasonal adjustment method [43, 44] to deal with the seasonal trend time series problem. The proposed SSVRCIA model is then applied to forecast inter-urban motorway traffic flow in Panchiao city of Taipei County, Taiwan. The rest of this paper is organized as follows. Section 2 presents the SVR model and the models used for comparing forecasting performance. Section 3 introduces the proposed SSVRCIA forecasting model. Section 4 illustrates a numerical example that reveals the forecasting performance of the proposed models. Conclusions are finally made in Sect. 5.
2 Forecasting methodology
In this investigation, three benchmark models, the seasonal ARIMA (SARIMA), seasonal Holt–Winters (SHW), and back-propagation neural network (BPNN) models, are compared with the SSVRCIA model in terms of traffic flow forecasting performance.
2.1 Seasonal autoregressive integrated moving average (SARIMA) model
Proposed by Box and Jenkins [11], the seasonal ARIMA process has been one of the most popular approaches in time series forecasting, particularly for series with a strong seasonal component. The SARIMA process is often referred to as the \( {\text{SARIMA}}(p,d,q) \times (P,D,Q)_{S} \) model. Similar to the ARIMA model, the forecasting values are assumed to be a linear combination of past values and past errors. A time series \( \left\{ {X_{t} } \right\} \) is a SARIMA process with seasonal period length S if d and D are nonnegative integers and if the differenced series \( W_{t} = (1 - B)^{d} (1 - B^{S} )^{D} X_{t} \) is a stationary autoregressive moving average process. In symbolic terms, the model can be written as
\[ \phi_{p} (B)\,\Phi_{P} (B^{S} )\,W_{t} = \theta_{q} (B)\,\Theta_{Q} (B^{S} )\,\varepsilon_{t} ,\quad t = 1,2, \ldots ,N \qquad (1) \]
where N is the number of observations up to time t; B is the backshift operator defined by \( B^{a} W_{t} = W_{t - a} \); \( \phi_{p} (B) = 1 - \phi_{1} B - \cdots - \phi_{p} B^{p} \) is called a regular (non-seasonal) autoregressive operator of order p; \( \Phi_{P} (B^{S} ) = 1 - \Phi_{1} B^{S} - \cdots - \Phi_{P} B^{PS} \) is a seasonal autoregressive operator of order P; \( \theta_{q} (B) = 1 - \theta_{1} B - \cdots - \theta_{q} B^{q} \) is a regular moving average operator of order q; \( \Theta_{Q} (B^{S} ) = 1 - \Theta_{1} B^{S} - \cdots - \Theta_{Q} B^{QS} \) is a seasonal moving average operator of order Q; and \( \varepsilon_{t} \) is identically and independently distributed as normal random variables with mean zero, variance \( \sigma^{2} \), and \( {\text{cov}}(\varepsilon_{t} ,\varepsilon_{t - k} ) = 0 \), \( \forall k \ne 0 \).
In the definition above, the parameters p and q represent the autoregressive and moving average order, respectively; and the parameters P and Q represent the autoregressive and moving average order at the model’s seasonal period length, S, respectively. The parameters d and D represent the order of ordinary and seasonal differencing, respectively.
Basically, when fitting a SARIMA model to data, the first task is to estimate values of d and D, the orders of differencing needed to make the series stationary and to remove most of the seasonality. The values of p, P, q, and Q then need to be estimated by the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the differenced series. Other model parameters may be estimated by suitable iterative procedures.
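The differencing step that produces \( W_{t} = (1 - B)^{d} (1 - B^{S} )^{D} X_{t} \) can be illustrated with a minimal Python sketch. The function names and the toy series below are hypothetical and are not taken from the study; the sketch only shows how regular and seasonal differencing remove a trend and a fixed seasonal pattern.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag) once: w_t = x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sarima_difference(series, d=0, D=0, S=1):
    """Return W_t = (1 - B)^d (1 - B^S)^D X_t as a plain list."""
    out = list(series)
    for _ in range(D):          # seasonal differencing, D times
        out = difference(out, lag=S)
    for _ in range(d):          # regular differencing, d times
        out = difference(out, lag=1)
    return out


# Hypothetical series: a linear trend plus a repeating pattern of period 5.
pattern = [0, 3, 5, 3, 0]
x = [2 * t + pattern[t % 5] for t in range(20)]

# One seasonal difference with S = 5 removes the pattern and leaves a constant.
w = sarima_difference(x, d=0, D=1, S=5)
print(w)
```

Each differencing pass shortens the series by its lag, so the differenced series here has 15 values for 20 observations.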
2.2 Seasonal Holt–Winters (SHW) model
To consider the seasonal effect, the second employed model is the seasonal Holt–Winters’ linear exponential smoothing (SHW) approach, which is extended from the Holt–Winters model [45, 46]. The Holt–Winters method can be extended to accommodate additive seasonality, if the magnitude of the seasonal effects does not change with the series, or multiplicative seasonality, if the amplitude of the seasonal pattern changes over time. The forecast for the SHW model is as follows:
\[ s_{t} = \alpha \frac{a_{t}}{I_{t - L}} + (1 - \alpha )(s_{t - 1} + b_{t - 1} ) \qquad (2) \]
\[ b_{t} = \beta (s_{t} - s_{t - 1} ) + (1 - \beta )b_{t - 1} \qquad (3) \]
\[ I_{t} = \gamma \frac{a_{t}}{s_{t}} + (1 - \gamma )I_{t - L} \qquad (4) \]
\[ f_{t + i} = (s_{t} + ib_{t} )I_{t - L + i} \qquad (5) \]
where \( a_{t} \) is the actual value at time t; \( s_{t} \) is the smoothed estimate at time t; \( b_{t} \) is the trend value at time t; \( f_{t + i} \) is the forecast i periods ahead; α is the level smoothing coefficient; and β is the trend smoothing coefficient. L is the length of seasonality; I is the seasonal adjustment factor; and γ is the seasonal adjustment coefficient.
Equation (2) smooths the actual value recursively by weighting the current level with α, and then adjusts \( s_{t} \) directly for the trend of the previous period, \( b_{t - 1} \), by adding it to the last smoothed value, \( s_{t - 1} \). This helps to eliminate the lag and brings \( s_{t} \) to the approximate base of the current data value. In addition, the first term of (2) is divided by the seasonal number \( I_{t - L} \) in order to de-seasonalize \( a_{t} \) (eliminate seasonal fluctuations from \( a_{t} \)). Equation (3) updates the trend, which is expressed as the difference between the last two smoothed values. It weights the most recent change, \( s_{t} - s_{t - 1} \), with β and adds it to the previous estimate of the trend multiplied by (1 − β). Equation (4) is comparable to a seasonal index, found as the ratio of the current value of the series, \( a_{t} \), to the smoothed value of the series, \( s_{t} \). If \( a_{t} \) is larger than \( s_{t} \), the ratio is greater than 1; otherwise, it is less than 1. In order to smooth the randomness of \( a_{t} \), (4) weights the newly computed seasonal factor with γ and the most recent seasonal number corresponding to the same season with (1 − γ). Equation (5) is used to forecast ahead. The trend, \( b_{t} \), is multiplied by the number of periods ahead to be forecast, i, and added to the base value, \( s_{t} \); the sum of \( s_{t} \) and \( ib_{t} \) is then multiplied by the seasonal number \( I_{t - L + i} \). The forecast error (\( e_{t} \)) is defined as the actual value minus the forecast (fitted) value for time period t, that is:
\[ e_{t} = a_{t} - f_{t} \qquad (6) \]
The forecast error is assumed to be an independent random variable with zero mean and constant variance. Values of smoothing coefficients, α and β, and seasonal adjustment coefficient, γ, are determined to minimize the forecasting error.
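The recursions described for the SHW model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Minitab procedure used in the study: the initialization (level set to the mean of the first season, zero initial trend, first-season ratios as initial seasonal factors) is an assumption, since the text does not specify one, and the function name is hypothetical.

```python
def seasonal_holt_winters(a, L, alpha, beta, gamma, horizon=1):
    """Multiplicative seasonal Holt-Winters forecast for 1..horizon steps ahead."""
    if len(a) < 2 * L or not 1 <= horizon <= L:
        raise ValueError("need at least two seasons of data and 1 <= horizon <= L")

    # Assumed initialization (not given in the text).
    s = sum(a[:L]) / L              # initial level: mean of the first season
    b = 0.0                         # initial trend
    I = [v / s for v in a[:L]]      # initial seasonal factors

    for t in range(L, len(a)):
        s_prev = s
        # Level: de-seasonalized observation blended with previous level + trend.
        s = alpha * a[t] / I[t - L] + (1 - alpha) * (s_prev + b)
        # Trend: smoothed difference of the last two levels.
        b = beta * (s - s_prev) + (1 - beta) * b
        # Seasonal factor: smoothed ratio of observation to level.
        I.append(gamma * a[t] / s + (1 - gamma) * I[t - L])

    last = len(a) - 1
    # Forecast i steps ahead: (level + i * trend) * matching seasonal factor.
    return [(s + i * b) * I[last - L + i] for i in range(1, horizon + 1)]


# Hypothetical purely seasonal series with period 5 and no trend.
series = [10, 20, 30, 20, 10] * 4
forecast = seasonal_holt_winters(series, L=5, alpha=0.3, beta=0.1, gamma=0.2, horizon=5)
print([round(v, 3) for v in forecast])
```

On a series that repeats exactly, the level and seasonal factors stay constant, so the forecast reproduces the seasonal pattern; this makes the toy series a convenient sanity check.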
2.3 Back-propagation neural networks (BPNN) model
The multi-layer back-propagation neural network (BPNN) is one of the most widely used neural network models. Consider the simplest BPNN architecture with three layers: an input layer (x), an output layer (o), and a hidden layer (h). The computational procedure of this network is described below:
\[ o_{i} = f\left( {\sum\nolimits_{j} {g_{ij} x_{ij} } } \right) \qquad (7) \]
where \( o_{i} \) denotes the output of node i, f(·) represents the activation function, \( g_{ij} \) is the connection weight between node i and node j in the lower layer, which can be replaced with \( v_{ji} \) and \( w_{kj} \), and \( x_{ij} \) denotes the input signal from node j in the lower layer.
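The BPNN algorithm attempts to improve neural network performance by reducing the total error, adjusting the weights along the gradient. The BPNN algorithm minimizes the sum of squared errors, which can be calculated by:
\[ E = \frac{1}{2}\sum\limits_{p = 1}^{P} {\sum\limits_{j = 1}^{K} {(d_{pj} - o_{pj} )^{2} } } \qquad (8) \]
where E denotes the squared error, K represents the number of output layer neurons, P is the number of training data patterns, \( d_{pj} \) denotes the actual output, and \( o_{pj} \) represents the network output. The BPNN algorithm is expressed as follows. Let \( \Delta v_{ji} \) denote the weight change for any hidden layer neuron and \( \Delta w_{kj} \) the weight change for any output layer neuron,
\[ \Delta v_{ji} = - \eta \frac{\partial E}{\partial v_{ji} } \qquad (9) \]
\[ \Delta w_{kj} = - \eta \frac{\partial E}{\partial w_{kj} } \qquad (10) \]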
where η represents the learning rate parameter, specified at the start of the training cycle, which determines the training speed and stability of the network. Notably, the Jth node is the bias neuron without weight. The signal (\( s_{j} \)) to each hidden layer neuron and the signal (\( u_{k} \)) to each neuron in the output layer are expressed as \( s_{j} = \sum\nolimits_{i = 1}^{I} {v_{ji} x_{i} } \) and \( u_{k} = \sum\nolimits_{j = 1}^{J - 1} {w_{kj} y_{j} } \), respectively, where \( y_{j} \) denotes the output of hidden neuron j.
The error signal terms for the jth hidden neuron, \( \delta_{yj} \), and for the kth output neuron, \( \delta_{ok} \), are defined as \( \delta_{yj} = - {\frac{\partial E}{{\partial s_{j} }}} \) and \( \delta_{ok} = - {\frac{\partial E}{{\partial u_{k} }}} \), respectively.
Applying the chain rule, the gradients of the cost function with respect to weights \( v_{ji} \) and \( w_{kj} \) are \( {\frac{\partial E}{{\partial v_{ji} }}} = {\frac{\partial E}{{\partial s_{j} }}}\,{\frac{{\partial s_{j} }}{{\partial v_{ji} }}} \) and \( {\frac{\partial E}{{\partial w_{kj} }}} = {\frac{\partial E}{{\partial u_{k} }}}\,{\frac{{\partial u_{k} }}{{\partial w_{kj} }}} \), respectively. The gradients of \( s_{j} \) and \( u_{k} \) with respect to weights \( v_{ji} \) and \( w_{kj} \) are \( {\frac{{\partial s_{j} }}{{\partial v_{ji} }}} = x_{i} \) and \( {\frac{{\partial u_{k} }}{{\partial w_{kj} }}} = y_{j} \), respectively. Combining the above equations gives \( {\frac{\partial E}{{\partial v_{ji} }}} = - \delta_{yj} x_{i} \) and \( {\frac{\partial E}{{\partial w_{kj} }}} = - \delta_{ok} y_{j} \). Finally, the weight changes from (9) and (10) can be written as \( \Delta v_{ji} = - \eta {\frac{\partial E}{{\partial v_{ji} }}} = \eta \delta_{yj} x_{i} \) and \( \Delta w_{kj} = - \eta {\frac{\partial E}{{\partial w_{kj} }}} = \eta \delta_{ok} y_{j} \), respectively. The weights, \( v_{ji} \) and \( w_{kj} \), are changed as (11) and (12),
\[ v_{ji} \leftarrow v_{ji} + \Delta v_{ji} = v_{ji} + \eta \delta_{yj} x_{i} \qquad (11) \]
\[ w_{kj} \leftarrow w_{kj} + \Delta w_{kj} = w_{kj} + \eta \delta_{ok} y_{j} \qquad (12) \]
The most common activation functions are the squashing sigmoid function, such as the logistic and tangent hyperbolic functions.
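The weight updates derived above can be illustrated by a minimal Python sketch of one back-propagation step. The sketch assumes a logistic activation in both layers, a half sum-of-squares error for a single training pattern, and no bias neuron; the function names, initial weights, and training pattern are hypothetical and do not reproduce the Matlab setup used in the study.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, v, w):
    """v[j][i]: input-to-hidden weights, w[k][j]: hidden-to-output weights."""
    y = [logistic(sum(v_ji * x_i for v_ji, x_i in zip(v_j, x))) for v_j in v]
    o = [logistic(sum(w_kj * y_j for w_kj, y_j in zip(w_k, y))) for w_k in w]
    return y, o


def train_step(x, d, v, w, eta=0.5):
    """One gradient step; returns the error measured before the update."""
    y, o = forward(x, v, w)
    # Error signals: delta = -dE/d(signal), using f'(z) = f(z) * (1 - f(z)).
    delta_o = [(d_k - o_k) * o_k * (1 - o_k) for d_k, o_k in zip(d, o)]
    delta_y = [
        y_j * (1 - y_j) * sum(delta_o[k] * w[k][j] for k in range(len(w)))
        for j, y_j in enumerate(y)
    ]
    # Weight changes: eta * delta_ok * y_j and eta * delta_yj * x_i.
    for k in range(len(w)):
        for j in range(len(y)):
            w[k][j] += eta * delta_o[k] * y[j]
    for j in range(len(v)):
        for i in range(len(x)):
            v[j][i] += eta * delta_y[j] * x[i]
    return 0.5 * sum((d_k - o_k) ** 2 for d_k, o_k in zip(d, o))


# Hypothetical 2-2-1 network and a single training pattern.
v = [[0.1, -0.2], [0.3, 0.4]]
w = [[0.2, -0.1]]
x, d = [1.0, 0.5], [0.9]
errors = [train_step(x, d, v, w) for _ in range(50)]
print(errors[0], errors[-1])
```

The hidden-layer error signals are computed before the output weights are modified, so both updates use the gradient evaluated at the same point.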
2.4 Support vector regression (SVR) model
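The basic ideas of SVMs for the case of regression are introduced here. A nonlinear mapping \( \varphi ( \cdot ):\Re^{n} \to \Re^{{n_{h} }} \) is defined to map the input data (training data set) \( \left\{ {({\mathbf{x}}_{i} ,y_{i} )} \right\}_{i = 1}^{N} \) into a so-called high dimensional feature space (which may have infinite dimensions), \( \Re^{{n_{h} }} \). Then, in the high dimensional feature space, there theoretically exists a linear function, f, to formulate the nonlinear relationship between input data and output data. Such a linear function, namely the SVR function, is as (13),
\[ f({\mathbf{x}}) = {\mathbf{w}}^{T} \varphi ({\mathbf{x}}) + b \qquad (13) \]
where f(x) denotes the forecasting values; the coefficients w (\( {\mathbf{w}} \in \Re^{{n_{h} }} \)) and b (\( b \in \Re \)) are adjustable. As mentioned above, the SVM method aims at minimizing the empirical risk by employing the ε-insensitive loss function to find an optimum hyperplane in the high dimensional feature space that maximizes the distance separating the training data into two subsets. Thus, SVR focuses on finding the optimum hyperplane and minimizing the training error, measured by the ε-insensitive loss function, on the training data. Then, the SVR minimizes the overall errors,
\[ \mathop {\min }\limits_{{{\mathbf{w}},b,\xi ,\xi^{*} }} \;R({\mathbf{w}},\xi ,\xi^{*} ) = \frac{1}{2}{\mathbf{w}}^{T} {\mathbf{w}} + C\sum\limits_{i = 1}^{N} {(\xi_{i} + \xi_{i}^{*} )} \qquad (14) \]
with the constraints
\[ y_{i} - {\mathbf{w}}^{T} \varphi ({\mathbf{x}}_{i} ) - b \le \varepsilon + \xi_{i}^{*} ,\quad - y_{i} + {\mathbf{w}}^{T} \varphi ({\mathbf{x}}_{i} ) + b \le \varepsilon + \xi_{i} ,\quad \xi_{i} ,\xi_{i}^{*} \ge 0,\quad i = 1,2, \ldots ,N \]
After the quadratic optimization problem with inequality constraints is solved, the parameter vector w in (13) is obtained,
\[ {\mathbf{w}} = \sum\limits_{i = 1}^{N} {(\beta_{i}^{*} - \beta_{i} )\varphi ({\mathbf{x}}_{i} )} \qquad (15) \]
where \( \beta_{i}^{*} \) and \( \beta_{i} \) are the Lagrangian multipliers, obtained by solving a quadratic program. Finally, the SVR regression function is obtained as (16) in the dual space,
\[ f({\mathbf{x}}) = \sum\limits_{i = 1}^{N} {(\beta_{i}^{*} - \beta_{i} )K({\mathbf{x}}_{i} ,{\mathbf{x}})} + b \qquad (16) \]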
where K(x i , x j ) is called the kernel function, and the value of the Kernel equals the inner product of two vectors, x i and x j , in the feature space φ(x i ) and φ(x j ), respectively; that is, \( K({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} ) = \varphi ({\mathbf{x}}_{i} ) \circ \varphi ({\mathbf{x}}_{j} ) \). Any function that meets Mercer’s condition [47] can be used as the Kernel function.
There are several types of kernel function. The most used kernel functions are the Gaussian RBF with a width of σ, \( K({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} ) = \exp \left( { - 0.5\left\| {{\mathbf{x}}_{i} - {\mathbf{x}}_{j} } \right\|^{2} /\sigma^{2} } \right) \), and the polynomial kernel with an order of d and constants \( a_{1} \) and \( a_{2} \), \( K({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} ) = (a_{1} {\mathbf{x}}_{i} {\mathbf{x}}_{j} + a_{2} )^{d} \). To date, it remains hard to determine the appropriate type of kernel function for specific data patterns [48, 49]. However, the Gaussian RBF kernel is not only easier to implement but also capable of nonlinearly mapping the training data into an infinite dimensional space; thus, it is suitable for dealing with nonlinear relationship problems. Therefore, the Gaussian RBF kernel function is specified in this study.
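As a minimal sketch of how the Gaussian RBF kernel enters the dual-space regression function, the following Python code evaluates the kernel and a prediction of the form "sum of coefficient times kernel, plus b". The coefficients stand for the differences of the Lagrangian multipliers; the support vectors, coefficients, and bias used here are hypothetical values chosen for illustration, not the result of solving the quadratic program.

```python
import math


def rbf_kernel(xi, xj, sigma):
    """Gaussian RBF kernel: exp(-0.5 * ||xi - xj||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-0.5 * sq_dist / sigma ** 2)


def svr_predict(x, support_vectors, coefficients, b, sigma):
    """Dual-form prediction: sum_i coef_i * K(x_i, x) + b."""
    return sum(
        coef * rbf_kernel(sv, x, sigma)
        for sv, coef in zip(support_vectors, coefficients)
    ) + b


# Hypothetical support vectors and coefficients (beta*_i - beta_i).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coefficients = [2.0, -1.0]
prediction = svr_predict([0.0, 0.0], support_vectors, coefficients, b=1.0, sigma=1.0)
print(prediction)
```

A small σ makes the kernel decay quickly with distance, so each support vector influences only nearby inputs; a large σ gives a smoother regression function.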
3 Chaotic immune algorithm (CIA) in selecting parameters and seasonal adjustment
3.1 CIA in selecting parameters
The selection of the three parameters, σ, ε, and C, of an SVR model influences the accuracy of forecasting. However, structured methods for selecting these parameters efficiently are lacking. Recently, Hong [38] applied the immune algorithm (IA) to determine the parameters of an SVR model and found that the proposed model is superior to other competitive forecasting models (ANN and regression models). However, given the operation procedure of the IA, if the diversity of the initial population cannot be maintained under selective pressure, i.e., the initial individuals are not necessarily fully diversified in the search space, then the IA searches only a narrow region and the solution may be far from the global optimum (premature convergence). To overcome this shortcoming, an improved design or procedure is needed that allows the IA to explore the solution space effectively and efficiently. One feasible approach is based on chaos, owing to its easy implementation and special ability to avoid being trapped in a local optimum [50]. Chaotic sequences can be a good alternative for diversifying the initial definition domain in stochastic optimization procedures, because they are highly sensitive to small changes in the parameter settings or the initial values of the model. Owing to the ergodicity of chaotic sequences, such small changes lead to very different search behaviors; thus, chaotic sequences can be used to enrich the search behavior and to avoid being trapped in a local optimum [51]. Chaotic sequences have been applied to many optimization problems [52–56]. Coelho and Mariani [57] recently applied a chaotic artificial immune network (chaotic opt-aiNET) to solve the economic dispatch problem (EDP); the approach is based on Zaslavsky’s map, whose spread-spectrum characteristic and large Lyapunov exponent help it escape from local optima and converge to a stable equilibrium.
Therefore, it is reasonable to expect that applying chaotic sequences to diversify the initial definition domain in the IA’s initialization procedure (CIA) is a feasible approach to optimizing the parameter selection in an SVR model.
In the design of the CIA, many principal factors, such as identification of the affinity, selection of antibodies, and crossover and mutation of the antibody population, are similar to those of the IA. The procedure of the CIA in this study is detailed as follows, and the flowchart is shown in Fig. 1.
Step 1
Initialization of antibody population
The values of the three parameters in an SVR model in the ith iteration can be represented as \( X_{k}^{(i)} ,k = C,\sigma ,\varepsilon \). Set i = 0, and employ (17) to map the three parameters from the intervals (Min k , Max k ) into the chaotic variables \( x_{k}^{(i)} \) located in the interval (0, 1).
\[ x_{k}^{(i)} = \frac{{X_{k}^{(i)} - {\text{Min}}_{k} }}{{{\text{Max}}_{k} - {\text{Min}}_{k} }} \qquad (17) \]
Then, employ the chaotic sequence, defined as (18), with μ = 4 to compute the next iteration chaotic variable, \( x_{k}^{(i + 1)} \).
\[ x_{k}^{(i + 1)} = \mu x_{k}^{(i)} (1 - x_{k}^{(i)} ) \qquad (18) \]
where \( x_{k}^{(i)} \) is the value of the chaotic variable at the ith iteration and μ is the so-called bifurcation parameter of the system, \( \mu \in [0,4] \).
Then, transform \( x_{k}^{(i + 1)} \) to obtain the three parameters for the next iteration, \( X_{k}^{(i + 1)} \), by (19).
\[ X_{k}^{(i + 1)} = {\text{Min}}_{k} + x_{k}^{(i + 1)} ({\text{Max}}_{k} - {\text{Min}}_{k} ) \qquad (19) \]
After this transformation, the three parameters, C, σ, and ε, constitute the initial antibody population and are then represented by a binary-code string. For example, assume that an antibody contains 12 binary codes to represent the three SVR parameters, so that each parameter is expressed by four binary codes. Assume the set-boundaries for parameters σ, C, and ε are 2, 10, and 0.5, respectively. Then the antibody with binary code “1 0 0 1 0 1 0 1 0 0 1 1” implies that the real values of the three parameters σ, C, and ε are 1.125, 3.125, and 0.09375, respectively (for instance, 1001 in binary is 9, and 9/16 × 2 = 1.125). The number of initial antibodies is the same as the size of the memory cell, which is set to ten in this study.
Step 2
Identification of the affinity and the similarity
A higher affinity value implies that an antibody has a higher activation with an antigen. To continue keeping the diversity of the antibodies stored in the memory cells, the antibodies with lower similarity have higher probability of being included in the memory cell. Therefore, an antibody with a higher affinity value and a lower similarity value has a good likelihood of entering the memory cells. The affinity between the antibody and antigen is defined as (20).
where d k denotes the SVR forecasting errors obtained by the antibody k.
The similarity between antibodies is expressed as (21).
where T ij denotes the difference between the two SVR forecasting errors obtained by the antibodies inside (already stored in) and outside (about to enter) the memory cell.
Step 3
Selection of antibodies in the memory cell
Antibodies with higher values of Ag k are considered to be potential candidates for entering the memory cell. However, the potential antibody candidates with Ab ij values exceeding a certain threshold are not qualified to enter the memory cell. In this investigation, the threshold value is set to 0.9.
Step 4
Crossover of antibody population
New antibodies are created via crossover and mutation operations. To perform crossover operation, strings representing antibodies are paired randomly. Moreover, the proposed scheme adopts the single-point-crossover principle. Segments of paired strings (antibodies) between two determined break-points are swapped. In this investigation, the probability of crossover (p c) is set as 0.5. Finally, the three crossover parameters are decoded into a decimal format.
Step 5
Annealing chaotic mutation of antibody population
For the ith iteration (generation), crossover antibody population (\( \hat{X}_{k}^{(i)} ,k = C,\sigma ,\varepsilon \)) of current solution space (Min k , Max k ) are mapped to chaotic variable interval [0, 1] to form the crossover chaotic variable space \( \hat{x}_{k}^{(i)} ,k = C,\sigma ,\varepsilon \), as (22),
where q max is the maximum evolutional generation of the population. Then, the ith chaotic variable \( x_{k}^{(i)} \) is summed up to \( \hat{x}_{k}^{(i)} \), and the chaotic mutation variable are also mapped to interval [0, 1] as in (23),
where δ is the annealing operation. Finally, the chaotic mutation variable obtained in interval [0, 1] is mapped to the solution interval (Min k , Max k ) by definite probability of mutation (p m ), thus completing a mutative operation.
Step 6
Stopping criteria
If the number of generations equals a given scale, then the best antibody is a solution, otherwise return to Step 2.
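The initialization in Step 1 can be sketched in Python as follows. Two points are assumptions rather than statements from the text: the mapping between a parameter interval and (0, 1) is taken to be linear, and the binary decoding rule (integer value divided by 2 to the power of the number of bits, times the set-boundary) is inferred from the worked example, whose three values it reproduces. The chaotic sequence is the logistic map with μ = 4. All function names are hypothetical.

```python
def to_chaotic(X, lo, hi):
    """Map a parameter value from (lo, hi) to (0, 1); linear mapping assumed."""
    return (X - lo) / (hi - lo)


def logistic_map(x, mu=4.0):
    """One iteration of the logistic map with bifurcation parameter mu."""
    return mu * x * (1.0 - x)


def from_chaotic(x, lo, hi):
    """Map a chaotic variable in (0, 1) back to the parameter interval."""
    return lo + x * (hi - lo)


def decode_antibody(bits, bounds, bits_per_param=4):
    """Decode a binary antibody into real parameter values.

    Decoding rule inferred from the worked example in Step 1.
    """
    values = []
    for n, bound in enumerate(bounds):
        chunk = bits[n * bits_per_param:(n + 1) * bits_per_param]
        integer = int("".join(str(bit) for bit in chunk), 2)
        values.append(integer / 2 ** bits_per_param * bound)
    return values


# Worked example from Step 1: set-boundaries 2, 10, and 0.5 for sigma, C, epsilon.
antibody = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1]
print(decode_antibody(antibody, bounds=[2, 10, 0.5]))  # -> [1.125, 3.125, 0.09375]

# Generate ten chaotic candidates for C in a hypothetical interval (0, 10).
x = to_chaotic(3.0, 0.0, 10.0)
candidates = []
for _ in range(10):
    x = logistic_map(x)
    candidates.append(from_chaotic(x, 0.0, 10.0))
print(candidates)
```

With μ = 4 the logistic map keeps every iterate inside [0, 1], so the decoded candidates always fall within the parameter interval.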
The CIA is used to seek a better combination of the three parameters in SVR. The value of the normalized root mean square error (NRMSE) is used as the criterion (the smallest value of NRMSE) of forecasting errors to determine the suitable parameters used in SVR model in this investigation, which is given by (25).
where n is the number of forecasting periods; a i is the actual traffic flow value at period i; and f i is the forecasting traffic flow value at period i.
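A minimal Python sketch of an NRMSE computation is given below. NRMSE has several normalization conventions; this sketch assumes normalization by the sum of squared actual values, which may differ from the exact form of (25). The function name and the sample values are hypothetical.

```python
import math


def nrmse(actual, forecast):
    """Normalized RMSE, assuming normalization by the sum of squared actuals."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    squared_error = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    squared_actual = sum(a ** 2 for a in actual)
    return math.sqrt(squared_error / squared_actual)


# Hypothetical hourly flows and forecasts.
actual = [1800.0, 2100.0, 2400.0, 2000.0, 1700.0]
forecast = [1750.0, 2150.0, 2300.0, 2050.0, 1720.0]
print(nrmse(actual, forecast))
```

Under this normalization the measure is scale-free, so errors on morning and evening series with different flow levels can be compared directly.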
3.2 Seasonal adjustment
As mentioned above, traffic flow data reveal a cyclic (seasonal) trend during daily peak periods; any model that aims at highly accurate forecasting performance therefore needs to estimate this seasonal component. There are several approaches to estimating the seasonal index of a data series [44, 58, 59], including product-model and non-product-model types. Based on the type of the data series, this investigation employs Deo and Hurvich’s [58] approach to compute the seasonal index, as shown in (26),
where t = j, l + j, 2 l + j,…, (m − 1)l + j only for the same peak time point in each period. Then, the seasonal index (SI) for each peak time point j is computed as (27),
Eventually, the forecasting value of the SSVRCIA is obtained by (28),
where k = j, l + j, 2 l + j,…, (m − 1)l + j implies the peak time point in another period (for forecasting period).
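The seasonal adjustment can be sketched in Python as follows, under an explicit assumption: the seasonal component is taken to be multiplicative, so the ratio of actual to fitted value is averaged over all cycles for each peak time point j, and each new forecast is multiplied by the index of its time point. This is one plausible reading of a product-type model and may differ in detail from (26) to (28). Function names and data are hypothetical.

```python
def seasonal_indexes(actual, fitted, l):
    """Average the ratio actual/fitted for each of the l time points in a cycle."""
    ratios = [[] for _ in range(l)]
    for t, (a_t, f_t) in enumerate(zip(actual, fitted)):
        ratios[t % l].append(a_t / f_t)
    return [sum(r) / len(r) for r in ratios]


def adjust_forecasts(forecasts, indexes, start=0):
    """Multiply each raw forecast by the seasonal index of its time point.

    `start` is the position within the cycle of the first forecast.
    """
    l = len(indexes)
    return [f * indexes[(start + k) % l] for k, f in enumerate(forecasts)]


# Hypothetical data: two cycles of five peak hours, with a flat unadjusted fit.
actual = [10.0, 20.0, 30.0, 20.0, 10.0] * 2
fitted = [20.0] * 10
si = seasonal_indexes(actual, fitted, l=5)
print(si)
print(adjust_forecasts([20.0] * 5, si))
```

An index above 1 means the unadjusted model systematically under-forecasts that hour, and an index below 1 means it over-forecasts.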
4 A numerical example and experimental results
The traffic flow data sets originated from three civil motorway detector sites. The civil motorway is the busiest inter-urban motorway network in Panchiao city, the capital of Taipei County, Taiwan. The major site was located at the center of Panchiao City, where the flow intersects an urban local street system, and it provided one-way traffic volume for each hour on weekdays. Therefore, one-way flow data for peak traffic are employed in this investigation, covering the morning peak period (from 6:00 to 10:00) and the evening peak period (from 16:00 to 20:00). The data were collected from February 2005 to March 2005. During the observation period, the numbers of traffic flow data available for the morning and evening peak periods are 45 and 90 h, respectively. For convenience, the traffic flow data are converted to equivalent of passengers (EOP), and both peak periods show the seasonality of traffic data. In addition, the traffic flow data are divided into three parts: training data, validation data, and testing data. For the morning peak period, the training, validation, and testing data sets cover 30, 10, and 10 h, respectively. For the evening peak period, the experimental data are arranged as training data (60 h), validation data (15 h), and testing data (15 h).
4.1 Parameter determination of different comparative forecasting models
The parameter selection of forecasting models is important for obtaining good forecasting performance. For the SARIMA model, the parameters are determined by taking the first-order regular difference and first seasonal difference to remove non-stationary and seasonality characteristics. Using statistical packages, and requiring that the residuals show no autocorrelation and are approximately white noise, the most suitable models for the morning and evening peak periods are \( {\text{SARIMA}}(1,0,1) \times (0,1,1)_{5} \) without a constant term and \( {\text{SARIMA}}(1,0,1) \times (1,1,1)_{5} \) with a constant term, respectively. The equations used for the SARIMA models are presented as (29) and (30), respectively.
For the seasonal Holt–Winters method, using Minitab 14 statistical software, the appropriate parameters (L, α, β, and γ) are determined to be 5, 0.15, 0.72, and 0.73, respectively, for the morning peak period and 5, 0.48, 0.04, and 0.13, respectively, for the evening peak period. For the BPNN model, Matlab 6.5 computing software is employed to implement the forecasting procedure. The number of nodes in the hidden layer is used as a validation parameter of the BPNN model. The most suitable number of hidden nodes of the BPNN model is three.
4.2 SSVRCIA traffic forecasting model
Before conducting the seasonal adjustment for the SSVRCIA model, it is necessary to run the CIA to determine suitable values of the three parameters in an SVR model. The parameters of the CIA in the proposed models for both traffic peak periods are set experimentally, as shown in Table 1. For the SVRCIA modeling procedure, a rolling-based forecasting procedure was conducted in the training stage, and a 1-h-ahead forecasting policy was adopted in the validation and testing stages. Several types of data-rolling are then considered to forecast traffic flow in the next hour. Different numbers of traffic flow values in a time series were fed into the SVRCIA model to forecast the traffic flow in the next validation period. Whenever the training error improves, the three parameters, σ, C, and ε, of the SVRCIA model adjusted by the CIA are employed to calculate the validation error. The adjusted parameters with the minimum validation error are then selected as the most appropriate parameters. Table 2 indicates that the SVRCIA models perform best when 15 and 35 input data are used for the morning and evening traffic forecasts, respectively.
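The rolling-based, one-step-ahead setup can be illustrated with a short Python sketch that turns a time series into input windows and next-hour targets. The function name and the toy series are hypothetical; the window length corresponds to the number of input data fed into the model (for example, 15 or 35 in Table 2).

```python
def make_windows(series, n_inputs):
    """Build (inputs, target) pairs for 1-step-ahead forecasting.

    Each input holds the n_inputs most recent values; the target is the next value.
    """
    inputs = [series[t - n_inputs:t] for t in range(n_inputs, len(series))]
    targets = [series[t] for t in range(n_inputs, len(series))]
    return inputs, targets


# Hypothetical hourly flow values.
flows = [0, 1, 2, 3, 4, 5]
X, y = make_windows(flows, n_inputs=3)
for window, target in zip(X, y):
    print(window, "->", target)
```

A longer window gives the model more history per sample but leaves fewer training pairs, which is one reason the number of input data is treated as a quantity to be selected on the validation set.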
Now the seasonal term is considered. For the morning peak period, there are five peak time points in each cycle, from 6:00 to 10:00. The seasonal index for each peak time point is calculated from the 40 forecasting values of the SVRCIA model in the training stage (30 forecasting values) and the validation stage (10 forecasting values), as shown in Table 3. Similarly, for the evening peak period, there are also five peak time points in each cycle, from 16:00 to 20:00. The seasonal indexes for these peak time points are also shown in Table 3.
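The seasonal adjustment can be sketched in Python as follows. The sketch assumes that the seasonal index of a peak time point is the arithmetic mean, over all cycles, of the ratio of the actual value to the SVRCIA forecast at that point, and that the adjusted forecast is the SVRCIA forecast multiplied by the matching index. The exact definition used in this study is given with the model description and may differ; the function names are hypothetical.

```python
def seasonal_indexes(actual, forecast, cycle_len):
    """Average actual/forecast ratio for each position within the cycle."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    ratios = [[] for _ in range(cycle_len)]
    for t, (a, f) in enumerate(zip(actual, forecast)):
        ratios[t % cycle_len].append(a / f)
    return [sum(r) / len(r) for r in ratios]


def apply_seasonal_indexes(forecast, indexes, start=0):
    """Scale each forecast by the index of its position in the cycle.

    start is the cycle position of forecast[0] (0 = first peak time point).
    """
    return [f * indexes[(start + t) % len(indexes)]
            for t, f in enumerate(forecast)]
```

With five peak time points per cycle and 40 forecasting values, each index in this sketch would be averaged over eight cycles.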
The well-trained models, SARIMA, BPNN, SHW, SVRCIA, and SSVRCIA, are applied to forecast the traffic flow during the morning and evening peak periods. Tables 4 and 5 show the actual values and the forecast values obtained by the various models for the morning peak and the evening peak, respectively. The NRMSE values for each peak hour are calculated to compare the proposed model fairly with the alternative models. The proposed SSVRCIA model has smaller NRMSE values than the SARIMA, BPNN, SHW, and SVRCIA models, indicating that it better captures the traffic flow patterns on an hourly average basis. Clearly, the seasonal adjustment employed here is effective for forecasting problems with such cyclic peak data.
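For reference, an NRMSE computation can be sketched in Python as below. The sketch assumes the form in which the squared forecast errors are normalized by the sum of squared actual values; other normalizations (for example by the mean or the range of the actual values) are also in use, so the definition given with the model description takes precedence.

```python
import math


def nrmse(actual, forecast):
    """Normalized RMSE, assuming sqrt(sum((a - f)^2) / sum(a^2))."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    squared_actual = sum(a ** 2 for a in actual)
    return math.sqrt(squared_error / squared_actual)
```

A smaller value indicates forecasts that are closer to the actual traffic flow; a perfect forecast gives zero.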
5 Conclusions
Accurate traffic forecasting is crucial for inter-urban traffic control systems, particularly for avoiding congestion and for increasing the efficiency of limited traffic resources during peak periods. The historical traffic data of Panchiao City in northern Taiwan show a seasonal fluctuation trend, which occurs in many inter-urban traffic systems. Consequently, over-prediction or under-prediction of traffic flow influences the transportation capability of an inter-urban system. This study introduces a forecasting technique, SSVRCIA, and investigates its feasibility for forecasting inter-urban motorway traffic. The experimental results indicate that the SSVRCIA model has better forecasting performance than the SARIMA, BPNN, SHW, and SVRCIA models. The superior performance of the SSVRCIA model is due to the generalization ability of the SVR model, the proper selection of the SVR parameters by the CIA, and the effective seasonal adjustment. In addition, the SVR method employs a quadratic programming technique, which rests on the assumptions of a convex set and the existence of a global optimum. In theory, it should therefore approach the global optimum if superior search algorithms are employed. In contrast, the SARIMA and SHW models employ parametric techniques based on specific assumptions, such as a linear relationship between the current value of the underlying variable and its previous values and error terms, and these assumptions do not fully match real-world problems.
This investigation is the first to apply SVR with the CIA and seasonal adjustment to forecasting inter-urban motorway traffic flow. Many forecasting methodologies have been proposed to deal with the seasonality of traffic flow. However, most models are time-consuming in verifying the suitable time-phase divisions, particularly when the sample size is large. In this investigation, the SSVRCIA model provides a convenient and valid alternative for traffic flow forecasting. The SSVRCIA model directly uses historical observations from traffic control systems and then determines suitable parameters by efficient optimization algorithms. The next step would be to include other factors and meteorological control variables during peak periods in the traffic forecasting model, such as driving speed limits, important social events, the percentage of heavy vehicles, bottleneck service level, and waiting time at signalized intersections. In addition, even though the proposed SSVRCIA model is already a hybrid forecasting model, other advanced optimization algorithms for parameter selection could be applied to the SVR model to satisfy the requirements of real-time traffic control systems. The goal of the author is to show that a combination of novel techniques is as good as pure techniques.
References
Dougherty MS (1996) Investigation of network performance prediction literature review (Technical Note 394). Institute for Transport Studies, University of Leeds
Okutani I, Stephanedes YJ (1984) Dynamic prediction of traffic volume through Kalman filtering theory. Transp Res B 18(1):1–11
Stathopoulos A, Karlaftis MG (2003) A multivariate state space approach for urban traffic flow modeling and prediction. Transp Res C 11(2):121–135
Van Arem B, Van der Vlist M, de Ruiter J, Muste M, Smulders S (1994) Travel time estimation in the GERDIEN project. In: Proceedings of the 2nd DRIVE II workshop on short-term traffic forecasting, Delft
Whittaker J, Garside S, Lindveld K (1994) Tracking and predicting a network traffic process. In: Proceedings of the 2nd DRIVE II workshop on short-term traffic forecasting, Delft
Vythoulkas PC (1993) Alternative approaches to short term forecasting for use in driver information systems. In: Daganzo CF (ed) Transportation and traffic theory. Elsevier, Amsterdam
Danech-Pajouh M, Aron M (1991) ATHENA: a method for short-term inter-urban motorway traffic forecasting. Recherche Transports Sécurité 6:11–16
Eubank RL (1988) Spline smoothing and nonparametric regression. Marcel Dekker, New York
Smith BL, Demetsky MJ (1997) Traffic flow forecasting: comparison of modeling approaches. J Transp Eng 123(4):261–266
Smith BL, Williams BM, Oswald RK (2002) Comparison of parametric and nonparametric models for traffic flow forecasting. Transp Res C 10(4):303–321
Box GEP, Jenkins GM (1976) Time series analysis: forecasting and control. Holden-Day, San Francisco
Hamed MM, Al-Masaeid HR, Bani-Said ZM (1995) Short-term prediction of traffic volume in urban arterials. ASCE J Transp Eng 121(3):249–254
Kamarianakis Y, Prastacos P (2005) Space-time modeling of traffic flow. Comput Geosci 31(2):119–133
Kirby HR, Watson SM, Dougherty MS (1997) Should we use neural networks or statistical models for short-term motorway traffic forecasting? Int J Forecast 13(1):43–50
Williams BM (2001) Multivariate vehicular traffic flow prediction: an evaluation of ARIMAX modeling. Transp Res Rec 1776:194–200
Angeline PJ, Saunders GM, Pollack JB (1994) An evolutionary algorithm that constructs recurrent neural networks. IEEE Trans Neural Netw 5(1):54–65
Williams BM (1999) Modeling and forecasting vehicular traffic flow as a seasonal stochastic time series process. Doctoral dissertation, Department of Civil Engineering, University of Virginia, Charlottesville
Chen H, Grant-Muller S, Mussone L, Montgomery F (2001) A study of hybrid neural network approaches and the effects of missing data on traffic forecasting. Neural Comput Appl 10:277–286
Dougherty MS, Cobbett MR (1997) Short-term inter-urban traffic forecasts using neural networks. Int J Forecast 13(1):21–31
Florio L, Mussone L (1996) Neural network models for classification and forecasting of freeway traffic flow stability. Control Eng Pract 4(2):153–164
Ledoux C (1997) An urban traffic flow model integrating neural networks. Transp Res C 5(5):287–300
Vlahogianni EI, Karlaftis MG, Golias JC (2005) Optimized and meta-optimized neural networks for short-term traffic flow prediction: a genetic approach. Transp Res C 13(3):211–234
Zhang G, Patuwo BE, Hu MY (1998) Forecasting with artificial neural networks: the state of the art. Int J Forecast 14:35–62
Yin H, Wong SC, Xu J, Wong CK (2002) Urban traffic flow prediction using a fuzzy-neural approach. Transp Res C 10(2):85–98
Tay FEH, Cao LJ (2001) Application of support vector machines in financial time series forecasting. Omega 29(4):309–317
Huang W, Nakamori Y, Wang SY (2005) Forecasting stock market movement direction with support vector machine. Comput Oper Res 32(10):2513–2522
Hung WM, Hong WC (2009) Application of SVR with improved ant colony optimization algorithms in exchange rate forecasting. Control Cybern 38(3):863–891
Pai PF, Lin CS (2005) A hybrid ARIMA and support vector machines model in stock price forecasting. Omega 33(6):497–505
Pai PF, Lin CS, Hong WC, Chen CT (2006) A hybrid support vector machine regression for exchange rate prediction. Int J Inf Manag Sci 17(2):19–32
Pai PF, Hong WC (2006) Software reliability forecasting by support vector machines with simulated annealing algorithms. J Syst Softw 79(6):747–755
Hong WC, Pai PF (2006) Predicting engine reliability by support vector machines. Int J Adv Manuf Technol 28(1–2):154–161
Hong WC (2008) Rainfall forecasting by technological machine learning models. Appl Math Comput 200(1):41–57
Hong WC, Pai PF (2007) Potential assessment of the support vector regression technique in rainfall forecasting. Water Resour Manag 21(2):495–513
Wang W, Xu Z, Lu JW (2003) Three improved neural network models for air quality forecasting. Eng Comput 20(2):192–210
Mohandes MA, Halawani TO, Rehman S, Hussain AA (2004) Support vector machines for wind speed prediction. Renew Energy 29(6):939–947
Hong WC (2009) Hybrid evolutionary algorithms in a SVR-based electric load forecasting model. Int J Electr Power Energy Syst 31(7–8):409–417
Hong WC (2009) Chaotic particle swarm optimization algorithm in a support vector regression electric load forecasting model. Energy Convers Manag 50(1):105–117
Hong WC (2009) Electric load forecasting by support vector model. Appl Math Model 33(5):2444–2454
Pai PF, Hong WC (2005) Support vector machines with simulated annealing algorithms in electricity load forecasting. Energy Convers Manag 46(17):2669–2688
Pai PF, Hong WC (2005) Forecasting regional electric load based on recurrent support vector machines with genetic algorithms. Electr Power Syst Res 74(3):417–425
Mori K, Tsukiyama M, Fukuda T (1993) Immune algorithm with searching diversity and its application to resource allocation problem. Trans Inst Electr Eng Jpn 113-C(10):872–878
Prakash A, Khilwani N, Tiwari MK, Cohen Y (2008) Modified immune algorithm for job selection and operation allocation problem in flexible manufacturing system. Adv Eng Softw 39(3):219–232
Xiao Z, Ye SJ, Zhong B, Sun CX (2009) BP neural network with rough set for short term load forecasting. Expert Syst Appl 36(1):273–279
Wang J, Zhu W, Zhang W, Sun D (2009) A trend fixed on firstly and seasonal adjustment model combined with the ε-SVR for short-term forecasting of electricity demand. Energy Policy 37(11):4901–4909
Holt CC (1957) Forecasting seasonals and trends by exponentially weighted moving averages. Carnegie Institute of Technology, Pittsburgh
Winters PR (1960) Forecasting sales by exponentially weighted moving averages. Manag Sci 6:324–342
Vapnik V (1995) The nature of statistical learning theory. Springer, New York
Amari S, Wu S (1999) Improving support vector machine classifiers by modifying kernel functions. Neural Netw 12(6):783–789
Kecman V (2001) Learning and soft computing: support vector machines, neural networks and fuzzy logic models. The MIT Press, Massachusetts
Wang L, Zheng DZ, Lin QS (2001) Survey on chaotic optimization methods. Comput Technol Autom 20(1):1–5
Pan H, Wang L, Liu B (2008) Chaotic annealing with hypothesis test for function optimization in noisy environments. Chaos Solitons Fractals 35(5):888–894
Zuo XQ, Fan YS (2006) A chaos search immune algorithm with its application to neuro-fuzzy controller design. Chaos Solitons Fractals 30(1):94–109
Liu B, Wang L, Jin YH, Tang F, Huang DX (2005) Improved particle swarm optimization combined with chaos. Chaos Solitons Fractals 25(5):1261–1271
Yang D, Li G, Cheng G (2007) On the efficiency of chaos optimization algorithms for global optimization. Chaos Solitons Fractals 34(4):1366–1375
Li L, Yang Y, Peng H, Wang X (2006) Parameters identification of chaotic systems via chaotic ant swarm. Chaos Solitons Fractals 28(5):1204–1211
Tavazoei MS, Haeri M (2007) Comparison of different one-dimensional maps as chaotic search pattern in chaos optimization algorithms. Appl Math Comput 187(2):1076–1085
Coelho LdS, Mariani VC (2009) Chaotic artificial immune approach applied to economic dispatch of electric energy using thermal units. Chaos Solitons Fractals 40(5):2376–2383
Deo R, Hurvich C (2006) Forecasting realized volatility using a long-memory stochastic volatility model: estimation, prediction and seasonal adjustment. J Econom 131:29–58
Azadeh A, Ghaderi SF (2008) Annual electricity consumption forecasting by neural network in high energy consuming industrial sectors. Energy Convers Manag 49:2272–2278
Acknowledgments
This research was conducted with the support of National Science Council, Taiwan (NSC 99-2410-H-161-001, NSC 98-2410-H-161-001, and NSC 98-2811-H-161-001).
Cite this article
Hong, WC. Application of seasonal SVR with chaotic immune algorithm in traffic flow forecasting. Neural Comput & Applic 21, 583–593 (2012). https://doi.org/10.1007/s00521-010-0456-7
DOI: https://doi.org/10.1007/s00521-010-0456-7