1 Introduction

Data analytics has been a major area of interest in industry over the past decade. The amount of data is expected to grow roughly 10-fold in the coming years, reaching a storage volume of approximately 50 zettabytes [1]. With advancements in the Internet of Things (IoT), many devices send data to a central node or server for processing, and these data are dumped at regular time intervals. A time series is a collection of observations sampled according to time. Common examples of time series include hourly readings of air temperature, monitoring of a person’s heart rate, and the daily closing price of a particular company’s stock.

Time series data are analysed in order to identify patterns such as trends or seasonal variations. Understanding the time series mechanism therefore helps in developing a mathematical model that explains the data in such a way that control, monitoring and prediction can be done with ease. Structural models such as regression models can also be used to model time series data, but they are not necessarily associated with time; for example, a stock price that varies over time can be modelled as a function of inflation or the unemployment rate using a structural model such as linear regression. In a time series model there is no notion of dependent or independent variables; there is only a single model variable that varies according to different periodic trends.

The significance of time series modelling is that the same data can be modelled with different periodic trends, which is a tedious task with structural models. Moreover, some factors may be unobservable and hence excluded from a regression analysis. In time series analysis, each observation is somewhat dependent upon, and influenced by, one or more previous observations; the error terms are likewise correlated from one observation to the next. These influences are captured using autocorrelation, which is used either to model the trend itself or to model the underlying mechanism. Four common time series models are used: the Autoregressive (AR) model, the Moving Average (MA) model, the Autoregressive Moving Average (ARMA) model and the Autoregressive Integrated Moving Average (ARIMA) model. This paper highlights the methodology of each of these models and describes their behaviour.

In Sect. 2, a brief review of work related to time series modelling is given. Section 3 focuses on the methodologies used for analysing time series data. Section 4 highlights observations pertaining to time series modelling. Finally, in Sect. 5, conclusions and future work are discussed.

2 Related Work

In contrast to time series models, a large body of literature exists on neural network approaches [2,3,4,5,6]. However, these methods fail to identify abnormal patterns that can occur in data. Models based on Support Vector Machines (SVM) and stochastic approaches have also been utilized for modelling time series data [7, 8], as have Genetic Algorithms combined with Neural Networks [9,10,11]. These models were unable to provide effective analysis due to the complex nature, noisiness and high dimensionality of the data. Effective time series models were therefore needed.

Time series models help to interpret hidden patterns in the data and aid analysis by fitting a model for forecasting [12, 18]. This paper not only describes the time series models but also showcases the way in which these models are analysed. Understanding the many concepts relevant to time series models would otherwise take a researcher a considerable amount of time; this paper provides in-depth knowledge of each of these concepts.

3 Time Series Modelling and Forecasting

Regression without lags fails to account for relationships through time and overestimates the relationship between the dependent and independent variables. To overcome this, time series analysis is needed. The following sub-sections explain stationary and non-stationary data, the white noise process, and the AR, MA, ARMA and ARIMA models.

3.1 Stationarity

One of the important properties of a time series process is the stationarity of the data. A random or stochastic process is said to be stationary when its joint distribution does not change over time. Time series data are stochastic or probabilistic in nature because there is no exact formula for prediction. Usually, however, time series data points are weakly stationary, i.e. they have a constant mean µ, a constant variance σ2 and an autocovariance that depends only on the lag, e.g. Auto-covariance (Yt, Yt−1) = Auto-covariance (Yt−2, Yt−3) at regular periodic intervals.
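In practice, stationarity is often assessed with a unit-root test such as the Augmented Dickey-Fuller (ADF) test. The following is a minimal sketch, assuming Python with numpy and statsmodels available (our choice of tooling, not prescribed by the paper):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)

# A weakly stationary series: white noise around a constant mean.
stationary = rng.normal(loc=5.0, scale=1.0, size=500)

# A non-stationary series: a random walk (cumulative sum of shocks).
random_walk = np.cumsum(rng.normal(size=500))

for name, series in [("stationary", stationary), ("random walk", random_walk)]:
    adf_stat, p_value, *_ = adfuller(series)
    # A small p-value rejects the unit-root null, i.e. the series looks stationary.
    print(f"{name}: ADF statistic = {adf_stat:.2f}, p-value = {p_value:.3f}")
```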

3.2 White Noise

A white noise process is a random process whose mean and variance are constant at any time and whose autocovariance is zero at every nonzero lag. This implies that each observation is uncorrelated with every other observation in the sequence (Fig. 1).

Fig. 1. White noise process [18].

The Ljung-Box statistic [13], which tests the “overall” randomness based on a number of lags, is a standard approach for determining whether the data points exhibit the white noise property. Hence, if white noise is present in the data, there is no point in performing time series modelling or estimation on such data.
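As a hedged illustration of this check, the sketch below generates pure white noise and applies the Ljung-Box test via statsmodels (an assumed library choice); a large p-value indicates no significant autocorrelation, i.e. nothing for a time series model to exploit:

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(1)
noise = rng.normal(size=500)  # pure white noise by construction

# Jointly test the first 10 lags; recent statsmodels returns a DataFrame
# with columns lb_stat and lb_pvalue.
result = acorr_ljungbox(noise, lags=[10])
print(result)  # a large lb_pvalue is consistent with white noise
```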

3.3 Moving Average Model (MA)

The time series value at time t is denoted \( y_{t} \). In the MA model, the current value \( y_{t} \) is a linear combination of the current and past white noise terms \( u_{t} \). The model can be represented as follows:

$$ y_{t} = u_{t} + \theta_{1} u_{t - 1} + \theta_{2} u_{t - 2} + \cdots + \theta_{q} u_{t - q} $$
(1)

Where,

θ1, θ2, …, θq are parameters estimated using the maximum likelihood function. The lag operator L shifts a series back by one period.

For example,

$$ u_{t - 1} = Lu_{t} \,and\,u_{t - 2} = Lu_{t - 1} $$

Therefore,

$$ u_{t - 2} = L^{2} u_{t} $$

The lag operator gives a more compact representation of the MA series, which makes estimation and interpretation easier. The model can hence be written as:

$$ y_{t} = u_{t} + \mathop \sum \limits_{i = 1}^{q} \theta_{i} L^{i} u_{t} $$
(2)
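To make the MA mechanism concrete, the following sketch (our illustration, assuming statsmodels) simulates an MA(2) process and re-estimates θ1 and θ2 by maximum likelihood, as Eq. (1) describes:

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(2)

# arma_generate_sample expects full lag polynomials, including the zero lag.
ar = [1]            # no AR part
ma = [1, 0.6, 0.3]  # MA(2) with theta_1 = 0.6 and theta_2 = 0.3
y = arma_generate_sample(ar, ma, nsample=1000)

# An MA(2) model is ARIMA(0, 0, 2); parameters are found by maximum likelihood.
res = ARIMA(y, order=(0, 0, 2)).fit()
print(res.params)  # the theta estimates should be close to 0.6 and 0.3
```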

3.4 Auto Regressive Model (AR)

The AR model expresses the current value as a linear combination of its own past values and an error term:

$$ y_{t} = \emptyset_{1} y_{t - 1} + \emptyset_{2} y_{t - 2 } + \cdots + \emptyset_{p} y_{t - p} + \mu_{t } $$
(3)

Where,

\( \mu_{t} \) is an error term given as white noise and p is the lag order.

In terms of Lag operator AR model can be represented as:

$$ y_{t} = \sum\nolimits_{i = 1}^{p} \emptyset_{i} L^{i} y_{t} + \mu_{t} $$
(4)

The stationarity condition matters a great deal in AR estimation: in a non-stationary series, shocks have a non-declining effect, which is undesirable in time series modelling.
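A short sketch of AR estimation, again assuming statsmodels (the library and coefficient values are our illustration): simulate a stationary AR(2) series and recover ∅1 and ∅2 with AutoReg:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(3)

# Simulate a stationary AR(2): y_t = 0.5 y_{t-1} + 0.2 y_{t-2} + u_t.
n = 1000
y = np.zeros(n)
u = rng.normal(size=n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.2 * y[t - 2] + u[t]

# Fit an AR(2) model; the lag order p must be chosen beforehand.
res = AutoReg(y, lags=2).fit()
print(res.params)  # intercept followed by the phi_1 and phi_2 estimates
```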

3.5 Auto Regressive Moving Average Model (ARMA)

ARMA (p, q) is the process that combines the AR series and the MA series. Mathematically, it can be represented as:

$$ y_{t} = \emptyset_{1} y_{t - 1} + \cdots + \emptyset_{p} y_{t - p} + \theta_{1} u_{t - 1} + \theta_{2} u_{t - 2} + \cdots + \theta_{q} u_{t - q} + u_{t} $$
(5)

The ARMA (p, q) process is identified by plotting the correlograms of the Auto Correlation Function (ACF) and the Partial Auto Correlation Function (PACF) against the lags; both show a declining curve as the lag increases, as shown in Fig. 2 below:

Fig. 2. The ACF and PACF correlogram for the ARMA model.

As shown in Fig. 2, the ACF and PACF values fall to the negative axis when the corresponding coefficient or residual term is negative.
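Correlograms like those in Fig. 2 can be reproduced for any series with standard plotting helpers; a minimal sketch, assuming matplotlib and statsmodels are available:

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima_process import arma_generate_sample

# Simulate an ARMA(1, 1) series; any one-dimensional series works here.
np.random.seed(4)
y = arma_generate_sample(ar=[1, -0.7], ma=[1, 0.4], nsample=500)

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=20, ax=ax1)   # ACF correlogram
plot_pacf(y, lags=20, ax=ax2)  # PACF correlogram
plt.tight_layout()
plt.show()
```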

Box Jenkins Approach for ARMA Model

Box and Jenkins [14] suggested that differencing a non-stationary series one or more times can achieve stationarity. Doing so leads to an ARIMA model, with the “I” standing for “Integrated”. The approach consists of three main stages:

  1. Identification,

  2. Estimation, and

  3. Diagnostic checking.

In the first stage, the orders of the model are determined by plotting the ACF and PACF correlograms to identify the lag values. In the second stage, the parameters ∅ and θ are estimated using Ordinary Least Squares [15] and the maximum likelihood function [16]. The third stage checks whether the model fits well; in particular, the residual diagnostics check whether the residuals retain any correlation.
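The three stages map onto a short workflow; the sketch below is our illustration with statsmodels, not the authors’ code, with the orders p = 1 and q = 1 assumed to have been read off the correlograms:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.stats.diagnostic import acorr_ljungbox

np.random.seed(5)
y = arma_generate_sample(ar=[1, -0.7], ma=[1, 0.4], nsample=1000)

# Stage 1 (identification): inspect the ACF/PACF correlograms (see Fig. 2)
# and pick tentative orders, here p = 1 and q = 1.

# Stage 2 (estimation): fit the ARMA(1, 1) model by maximum likelihood.
res = ARIMA(y, order=(1, 0, 1)).fit()
print(res.params)

# Stage 3 (diagnostic checking): the residuals should look like white noise.
print(acorr_ljungbox(res.resid, lags=[10]))  # a large p-value suggests an adequate fit
```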

Information Criteria for ARMA (p, q) Model Selection

The information criteria for the ARMA (p, q) model help to determine the order of the time series based on the Residual Sum of Squares (RSS) [17]. For given values of p and q, the model with the minimum RSS is the most suitable. However, the RSS always decreases as more lags are added, so a penalty term is considered when selecting the model:

$$ min\left( {f\left( {RSS} \right) + penalty\,term} \right) $$
(6)

Thus adding extra terms also adds a penalty while finding p and q. The standard methods used for computing the information criteria are:

  • Akaike’s Information Criteria (AIC).

  • Schwarz’s Bayesian Information Criteria (SBIC).

  • Hannan-Quinn Criteria (HQIC).

Where,

$$ {\text{AIC}} = \ln \left( {\hat{\sigma }^{2} } \right) + \frac{2k}{T}, $$
$$ {\text{SBIC}} = \ln \left( {\hat{\sigma }^{2} } \right) + \frac{k}{T} \ln \left( T \right), $$
$$ {\text{HQIC}} = \ln \left( {\hat{\sigma }^{2} } \right) + \frac{2k}{T} \ln \left( {\ln \left( T \right)} \right) $$

Where,

\( \hat{\sigma }^{2} = \frac{RSS}{T} \), \( k = p + q + 1 \), and \( T = Sample \;size \).

The model that gives the smallest value of the AIC is the best model.
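In practice the criteria are compared across candidate orders; a minimal grid-search sketch (our construction, assuming statsmodels) selects the (p, q) pair with the smallest AIC:

```python
import itertools
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(6)
y = arma_generate_sample(ar=[1, -0.7], ma=[1, 0.4], nsample=500)

best = None
for p, q in itertools.product(range(3), range(3)):
    res = ARIMA(y, order=(p, 0, q)).fit()
    # res.aic, res.bic (SBIC) and res.hqic are all reported by statsmodels.
    if best is None or res.aic < best[0]:
        best = (res.aic, p, q)

print(f"Smallest AIC = {best[0]:.2f} at (p, q) = ({best[1]}, {best[2]})")
```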

3.6 ARIMA Model

The AR, MA and ARMA models cannot handle non-stationarity, i.e. a series that has a trend. In a stationary series, the process reverts around its mean value. To make a non-stationary series stationary, we difference it, i.e. take the difference of the series with its own lag. Differencing yields a new series with a constant mean and a constant variance, as in the following example.

As shown in Table 1, \( \nabla y = y_{t} - y_{t - 1} \) is the first order difference and \( \nabla^{2} y \) is the second order difference, which gives a constant mean and a zero or constant variance. Differencing thus makes the ARMA model integrated, which gives the name ARIMA. It is represented as ARIMA (p, q, r) (Table 1).

Table 1. First order differencing and second order differencing.

Where,

p = number of AR terms, q = order of differencing and r = number of MA terms.

The AR, MA and ARMA models can also be represented as ARIMA models: ARIMA(1, 0, 0) gives the AR(1) model, ARIMA(0, 0, 1) gives the MA(1) model, and ARIMA(1, 0, 1) gives the ARMA(1, 1) model, with the differencing order equal to 0 in each case.

The main question that arises is how to determine the order of differencing. It can be answered by observing the following characteristics (a short differencing sketch follows this list):

  1. Positive auto-correlation at higher lags calls for a higher order of differencing.

  2. Zero or negative auto-correlation at the first lag itself means that no differencing is needed.

  3. The optimal order of differencing is the one whose predictions give the lowest RMSE (root mean square error) or MAE (mean absolute error).
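Differencing itself is a one-line operation in most toolkits; the small sketch below, using pandas as an assumed tool, mirrors the first and second order differences of Table 1 on a series with a linear trend:

```python
import pandas as pd

# A series with a linear trend, hence non-stationary in mean.
y = pd.Series([2, 4, 6, 8, 10, 12])

d1 = y.diff()         # first order difference:  y_t - y_{t-1}
d2 = y.diff().diff()  # second order difference

print(d1.tolist())  # [nan, 2.0, 2.0, 2.0, 2.0, 2.0] -> constant mean
print(d2.tolist())  # [nan, nan, 0.0, 0.0, 0.0, 0.0] -> zero variance
```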

4 Observations

In time series modelling, predictions are made based on the characteristics of periodic patterns. Different trends lead to different periodic patterns, which can be classified as hourly, daily, weekly or monthly data. The goal is to find the appropriate statistical relationship of the given series with its own past values. We build the model on training data and test it on a new (holdout) dataset to check whether the predicted values match the actual values. Accuracy is thus assessed by computing the forecast errors, which are given as:

$$ {\text{Forecast}}\;{\text{error}} = {\text{Actual}}\;{\text{value}} - {\text{Predicted value}} . $$

Accuracy can be computed using the mean square error (MSE). The model that gives the lowest MSE on the holdout sample is the best model for prediction.
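A hedged sketch of this holdout evaluation, assuming statsmodels and numpy: fit on a training window, forecast the holdout horizon, and compute the MSE of the forecast errors defined above:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(7)
y = arma_generate_sample(ar=[1, -0.7], ma=[1, 0.4], nsample=500)

# Hold out the last 50 observations for testing.
train, test = y[:-50], y[-50:]

res = ARIMA(train, order=(1, 0, 1)).fit()
predicted = res.forecast(steps=50)

forecast_error = test - predicted  # actual value - predicted value
mse = np.mean(forecast_error ** 2)
print(f"Holdout MSE = {mse:.4f}")
```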

The behaviour of the white noise, AR, MA, ARMA and ARIMA models on time series data is summarized briefly in Table 2 below:

Table 2. Behaviour of time series models

5 Conclusion

While doing time series analysis, the data should not be a white noise process, i.e. purely random in nature. Only stationary or non-stationary data lead to the identification of a proper periodic pattern, which can then be used to determine an appropriate forecasting model. The accuracy of forecasting is measured using the root mean square error or the mean absolute error.

As part of future work, the major focus will be on understanding multivariate time series analysis. Forecasting over a white noise process could also be attempted using unsupervised algorithms such as neural networks, but these are very complex in nature; hence, further research will emphasize semi-supervised algorithms such as Support Vector Regression.