
1 Introduction

Time series data are vital to many fields, such as economics, medicine, education, the social sciences, epidemiology, weather forecasting, and the physical sciences, where the goal is to derive meaningful insights at different points in time. Conventional statistical methods have several limitations when applied to time series data, so specialized methods, collectively known as time series analysis, are generally required. The simplest and most popular of these is the linear least squares method, which fits a trend line to the time series. It has several advantages:

  • It is simple to understand and to use for deriving predictions.

  • It is applicable to almost all applications.

  • It coincides with the maximum likelihood solution when the Gauss–Markov conditions hold and the errors are normally distributed.

However, it suffers from several critical limitations:

  • It is sensitive to outliers.

  • The data need to be normally distributed for good results.

  • It tends to overfit the data.

Quantile regression, realized through a support vector machine approach, is an appealing way to address the limitations of least squares regression, and it has clear advantages over least squares regression. Table 1 compares least squares and quantile regression.

Table 1. Comparison of least squares and quantile regression

A support vector machine combined with quantile regression can produce excellent outcomes for time series analysis. The support vector machine can solve nonlinear regression estimation problems, which makes it a strong candidate for time series data analysis. Another significant feature of the SVM is that learning reduces to a linearly constrained quadratic optimization problem. Thus, unlike traditional stochastic or neural network methods, the solution obtained by the SVM is always unique and globally optimal.

Section 2 of the paper reviews related work in this field. Section 3 describes the QRSVM model in detail. Section 4 discusses the experiments and results. Finally, Section 5 presents the conclusions of the research work.

2 Related Work

Statistical methods have predominantly been used for time series data analysis. The Autoregressive Integrated Moving Average (ARIMA) model is the most prevalent and commonly used among them [6]. However, models of this sort rest on the assumption that the time series is linear and that the data follow a normal distribution. Hamzacebi [1] proposed a variant of the ARIMA model called Seasonal ARIMA (SARIMA) in 2008. The model produced good results for seasonal time series data, but it still required the associated time series to be linear. The limitations of linear models can be overcome by non-linear stochastic models [5, 19]; however, implementing such models is very complex.

Neural network based time series models have grown in recent years and attracted increasing attention [8, 9]. The striking feature of ANNs is their inherent capability for non-linear modeling, without any presupposition about the statistical distribution followed by the observations. Another notable property of ANN-based models is that they are self-adaptive in nature [28]. A variety of ANN models exist in the literature. The Multi-Layer Perceptron (MLP) is the most popular and basic ANN-based model [2, 4, 13, 22]. MLPs contain multiple layers of computational elements, connected in a feed-forward manner [18]. MLPs use a variety of learning techniques, the most prominent being back-propagation [16, 20, 29], in which the output values are compared with the correct response to compute the value of some predefined error function. The error is then fed back through the network. Using this information, the algorithm adjusts the weight of each connection so as to reduce the value of the error function by some small amount. A general strategy for non-linear optimization called gradient descent [21, 23] is applied to adjust the weights. The Time Lagged Neural Network (TLNN) is another feed-forward variant [15, 26]. In a TLNN, the input nodes are the time series values at specific lags. In addition, there is a constant input term, which may conveniently be taken as 1 and is connected to every neuron in the hidden and output layers. The introduction of this constant input unit avoids the need to introduce a separate bias term. In 2007, Pang et al. [17] introduced a neural network based model and successfully applied it to rainfall simulation. In 2008, Li et al. [14] presented a hybrid model based on an AR* model and a Generalized Regression Neural Network (GRNN), which gave respectable results on time series data. Chen and Chang [3] in 2009 proposed an Evolutionary Artificial Neural Network (EANN) model to automatically build the architecture and connection weights of the neural network. Khashei and Bijari [11] in 2010 introduced a new hybrid ANN model that uses an ARIMA model to obtain predictions more accurate than those of neural network models alone. Wu and Shahidehpour [27] proposed a fusion model based on an Adaptive Wavelet Neural Network (AWNN) and time series models such as ARMAX and GARCH to predict day-by-day electricity values in the market. In [7], researchers proposed a regression neural network model, a fusion of diverse machine learning algorithms, to forecast a wide range of time series. ANN-based algorithms are powerful for time series data analysis, but they have several constraints: an appropriate network structure is attained only through trial and error, the behaviour of the network can be hard to explain, they usually require more data to train the model properly, and they are computationally complex and expensive. The Support Vector Machine [12, 24, 25] is a robust machine learning technique for pattern generation and classification.

The proposed model uses an SVM because it is designed not only for good classification but also for improved generalization over the training data. The solution obtained by an SVM is always unique, as it relies on linearly constrained quadratic optimization. The model uses a fusion of SVM and quantile regression [10]. Quantile regression allows relationships between variables to be understood beyond the average of the data, making it valuable for understanding outcomes that are non-normally distributed and that have nonlinear relationships with the predictor variables.

3 Quantile Regression Support Vector Machine (QRSVM) Model

The least squares regression model is represented by Eq. (1).

$$ Y = a + b\;X +\upvarepsilon $$
(1)

where Y is the dependent variable whose value is to be predicted,

a is the Y-intercept,

b is the slope of the line, and

\(\upvarepsilon \) represents the error term, which is assumed to be independently, identically, and normally distributed with mean zero and unknown variance σ2.
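For illustration, a fit of Eq. (1) can be obtained in R with the built-in lm() function. This is a minimal sketch: the simulated data frame and its column names (temperature, rainfall) are placeholders and not the study's actual data.

```r
set.seed(1)
# Simulated stand-in for a weather series; the real study uses Anand district data.
weather <- data.frame(temperature = runif(100, 20, 40))
weather$rainfall <- 5 + 2 * weather$temperature + rnorm(100, sd = 10)

# Ordinary least squares fit of Eq. (1): Y = a + b X + eps
fit_ls <- lm(rainfall ~ temperature, data = weather)
coef(fit_ls)   # a single intercept a and slope b (conditional mean only)
```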

The least squares regression model attempts to describe the conditional distribution using only its average. It also assumes that the error term is the same across all values of X, so that the conditional variable (Y/X) has a constant variance σ2. When this assumption fails, the LSR model must be modified to accommodate both the conditional mean and the conditional scale. The new equation based on conditional scale is:

$$Y=a+b \;X+{e}^{r}\upvarepsilon $$
(2)

where r is an unknown parameter, so that

$$ {\text{Var}}\,{\text{(Y/X)}} =\upsigma ^{2} e^{r} $$
(3)

Even in this formulation, the conditional scale of the dependent variable Y does not vary with the independent variable X. To capture how the covariates affect the full distribution of the dependent variable, the concept of quantile regression is required:

$$Y={a}^{(p)}+ {b}^{(p)}X+ {\upvarepsilon}^{(p)}$$
(4)

where p is a probability ranging between 0 and 1.

We specify the pth conditional quantile of Y given X as

$${Q}^{\left(p\right) }\left(\frac{Y}{X}\right)= {a}^{(p)}+ {b}^{(p)}X$$
(5)
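As a minimal sketch of Eqs. (4)–(5), the rq() function from the quantreg package fits one regression line per chosen quantile. The simulated data frame and column names below are again placeholders, not the study's data.

```r
library(quantreg)   # rq() fits the linear quantile regression of Eq. (5)

set.seed(1)
weather <- data.frame(temperature = runif(100, 20, 40))
weather$rainfall <- 5 + 2 * weather$temperature + rnorm(100, sd = 10)

# One fitted line per quantile p (called tau in quantreg): here the 10th,
# 50th, and 90th percentiles.
fit_qr <- rq(rainfall ~ temperature, tau = c(0.1, 0.5, 0.9), data = weather)
coef(fit_qr)        # a^(p) and b^(p) for each chosen p
```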

Least squares regression has only one conditional mean, whereas quantile regression provides numerous conditional quantiles. In nonlinear quantile regression, the quantile of the dependent variable Y for a given independent attribute X is assumed to be nonlinearly related to the input vector Xi ∈ Rd through a nonlinear mapping function ϕ(·). The nonlinear version of the quantile function is represented as:

$$ {Q}_{\theta}\left(\frac{Y}{X}\right)= {W}_{\theta}\,\upphi \left(X\right) $$
(6)

where θ ∈ (0, 1) and

\({W}_{\theta }\) is the weight vector of the θth regression quantile.

Quantile regression is based on an asymmetric absolute-deviation (pinball) loss, and embedding it in the SVM framework plays a vital role in controlling model complexity. The quantile regression SVM optimization problem is represented as:

$$ \text{Minimize}\;\; \frac{1}{2}\left\| {W}_{\theta} \right\|^{2} + C\sum\limits_{i = 1}^{n} \rho_{\theta}\left({Y}_{i} - {W}_{\theta}\,\upphi ({X}_{i})\right) $$
(7)

for any θ ∈ (0, 1), where ρθ(·) is the pinball (check) loss, which weights positive residuals by θ and negative residuals by 1 − θ.

Equation (7) is referred to as the QRSVM model.
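The paper does not prescribe a particular implementation of Eq. (7). One possible realisation, sketched below under that assumption, uses the kernel quantile regression routine kqr() from the kernlab R package, in which the kernel supplies the nonlinear mapping ϕ(·); the simulated inputs are placeholders.

```r
library(kernlab)    # kqr(): kernel quantile regression, one realisation of Eq. (7)

set.seed(1)
x <- matrix(runif(100, 20, 40), ncol = 1)                  # temperature (simulated)
y <- 5 + 2 * x[, 1] + rnorm(100, sd = 2 + 0.3 * x[, 1])    # heteroscedastic rainfall

# The RBF kernel plays the role of phi(X); C is the regularisation constant of Eq. (7).
fit_q90 <- kqr(x, y, tau = 0.9, C = 10, kernel = "rbfdot")
head(predict(fit_q90, x))   # estimated 90th conditional quantile at the training inputs
```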

4 Experiments and Results

Experiments for the proposed study are carried out on weather data from Anand district of Gujarat state, India. Sample data are shown in Table 2.

Table 2. Sample weather data of Anand district

The experimental simulation is carried out using the R programming language (Figs. 1 and 2).
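A minimal sketch of this exploratory step is given below; the file name anand_weather.csv and the column names temperature and rainfall are hypothetical placeholders for the actual data set.

```r
# Hypothetical file and column names; the actual Anand district data set is
# not reproduced here.
weather <- read.csv("anand_weather.csv")

plot(weather$temperature, weather$rainfall,
     xlab = "Temperature", ylab = "Rainfall")   # scatter plot as in Fig. 1
boxplot(weather$rainfall, ylab = "Rainfall")    # box plot as in Fig. 2
```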

Fig. 1. Scatter plot of temperature vs. rainfall

Fig. 2. Box plot of rainfall data

The box plot of the rainfall data indicates that several values are outliers. For accurate prediction of time series data, these outlier values also play an important role.

For the least squares regression model, the residual values and coefficient statistics are shown in Table 3 and Table 4, respectively. Each has a single value based on central tendency.

Table 3. Residual values of least square regression
Table 4. Coefficient values of least square regression

The quantile regression SVM model generates coefficients for each quantile value. For the same data, the coefficients of the QRSVM model are shown in Table 5 (Figs. 3 and 4).

Table 5. Coefficient values of QRSVM model
Fig. 3. Scatter plot of quantile vs. intercept value in QRSVM

Fig. 4. Scatter plot of quantile vs. X in QRSVM

From the results we can see that least squares regression summarizes the intercept and X coefficients with a single central-tendency value, whereas the QRSVM model produces multiple values across the percentiles, so the insights can be understood and explored along multiple dimensions. Furthermore, if outliers exist in the data, the central-tendency value may be compromised, whereas this situation does not affect the QRSVM model.
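The sketch below illustrates this behaviour. A linear quantile fit with quantreg stands in here for the full kernel QRSVM model; it simply shows how the intercept and the coefficient of X vary with the quantile, as in Figs. 3 and 4, using the same hypothetical data frame as above.

```r
library(quantreg)

# 'weather' is the hypothetical data frame loaded in the sketch earlier in this section.
taus   <- seq(0.05, 0.95, by = 0.05)
fit_qr <- rq(rainfall ~ temperature, tau = taus, data = weather)
cf     <- coef(fit_qr)          # one intercept and one slope per quantile

# LSR yields a single intercept/slope; the quantile fits trace curves over tau,
# analogous to Figs. 3 and 4.
plot(taus, cf["(Intercept)", ], xlab = "Quantile", ylab = "Intercept")
plot(taus, cf["temperature", ], xlab = "Quantile", ylab = "Coefficient of X")
```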

5 Conclusions

In this work, it is concluded that the least squares regression model has several limitations: it attempts to describe the conditional distribution using only its average, and it assumes that the error term is the same across all values of X, so that the conditional variable (Y/X) has a constant variance σ2. To capture how the covariates affect the full distribution of the dependent variable, the concept of quantile regression is required. Based on experiments with weather time series data, the paper concludes that the QRSVM model produces multiple intercept and X coefficient values, one per quantile, which makes them easier to understand and interpret. The paper also concludes that the results of the LSR model may be compromised by outliers, whereas the QRSVM model is not affected in this way.