
1 Introduction

The exponential smoothing method is an approach used to produce forecasts for a univariate time series. The method is believed to have been around since the 1950s. Its classical forms include simple exponential smoothing [1], the linear method [2], and Holt–Winters' additive and multiplicative methods [3]. Following the idea of the nonlinear state-space framework characterized by a single source of errors, introduced by [4], the exponential smoothing method can now be expressed via a state-space representation. Hyndman et al. [5] thoroughly explain the state-space approach to exponential smoothing. With a total of 30 possible models, this approach is very useful for producing forecasts for any univariate time series. Its disadvantage, however, is that it does not allow regressors to be integrated into the model. To mitigate this problem, Osman and King [6] introduced an extended version in which regressors can be included. Taking as an example a basic model with one regressor, \(z_{1,t}\), the method with a time-varying regressor parameter can be expressed as follows:

$$y_{t} = \ell_{t-1} + b_{1,t-1}\Delta_{z_{1,t}} + \varepsilon_{t}$$
(18.1a)
$$\ell_{t} = \ell_{t-1} + b_{1,t-1}\Delta_{z_{1,t}} + \alpha\varepsilon_{t}$$
(18.1b)
$$b_{1,t} = \begin{cases} b_{1,t-1} + \dfrac{\beta_{1}\left(\varepsilon_{1,t-1}^{+} + \varepsilon_{t}\right)}{\Delta_{z_{1,t}}^{*}}, & \text{if } \left|\Delta_{z_{1,t}}^{*}\right| \ge L_{b_{1}} \\ b_{1,t-1}, & \text{if } \left|\Delta_{z_{1,t}}^{*}\right| < L_{b_{1}} \end{cases}$$
(18.1c)
$$\varepsilon_{1,t}^{+} = \begin{cases} 0, & \text{if } \left|\Delta_{z_{1,t}}^{*}\right| \ge L_{b_{1}} \\ \varepsilon_{1,t-1}^{+} + \varepsilon_{t}, & \text{if } \left|\Delta_{z_{1,t}}^{*}\right| < L_{b_{1}} \end{cases}$$
(18.1d)

where

$$\Delta_{z_{1,t}}^{*} = z_{1,t} - z_{1,t-1}^{*} \quad\text{and}\quad z_{1,t}^{*} = \begin{cases} z_{1,t}, & \text{if } \left|\Delta_{z_{1,t}}^{*}\right| \ge L_{b_{1}} \\ z_{1,t-1}^{*}, & \text{if } \left|\Delta_{z_{1,t}}^{*}\right| < L_{b_{1}}. \end{cases}$$
(18.2)

In Eq. (18.1), \(\ell_{t}\) denotes the level of the series, \(b_{1,t}\) represents the regressor parameter, \(\varepsilon_{t}\) denotes the error, while \(\alpha\) and \(\beta_{1}\) are the smoothing parameters. \(\varepsilon_{1,t}^{+}\) and \(z_{1,t}^{*}\) represent the dummy error and dummy regressor, respectively, while \(L_{b_{1}}\) is the lower boundary for the "switching procedure." A detailed explanation of this model is given in Osman and King [6].
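To make the recursions concrete, the following is a minimal R sketch of how Eqs. (18.1) and (18.2) could be evaluated for a given pair of smoothing constants. The function and argument names, the default boundary value, and the initialization of the dummy regressor are illustrative assumptions rather than part of the original specification.

```r
# Minimal sketch: evaluate Eqs. (18.1a)-(18.1d) and (18.2) for given
# smoothing constants and return the one-step-ahead errors.
# All names, the default boundary L.b1, and the initialisation of the
# dummy regressor are illustrative assumptions, not from the paper.
ets_reg_filter <- function(y, z1, alpha, beta1,
                           l0 = 500, b10 = 0.5, L.b1 = 1e-6) {
  n     <- length(y)
  eps   <- numeric(n)  # one-step errors eps_t
  l     <- l0          # level l_{t-1}
  b1    <- b10         # regressor parameter b_{1,t-1}
  eps.p <- 0           # dummy error eps^+_{1,t-1}
  z.s   <- z1[1]       # dummy regressor z*_{1,t-1} (assumed start)
  for (t in 2:n) {
    dz  <- z1[t] - z1[t - 1]                # Delta_{z_{1,t}}
    dzs <- z1[t] - z.s                      # Delta*_{z_{1,t}}, Eq. (18.2)
    eps[t] <- y[t] - l - b1 * dz            # rearranged Eq. (18.1a)
    l <- l + b1 * dz + alpha * eps[t]       # Eq. (18.1b)
    if (abs(dzs) >= L.b1) {                 # "switching procedure"
      b1    <- b1 + beta1 * (eps.p + eps[t]) / dzs  # Eq. (18.1c)
      eps.p <- 0                            # Eq. (18.1d)
      z.s   <- z1[t]                        # Eq. (18.2)
    } else {
      eps.p <- eps.p + eps[t]               # accumulate dummy errors
    }
  }
  eps
}
```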

Given that this is a newly developed method, no statistical software is currently available to estimate its parameters. The objective of this study is to determine how best to use the "optim" function in the R statistical software to estimate the parameters of the new models.

2 Research Methodology

There are two main issues involved in estimating the parameters of Eq. (18.1). First, what constraints need to be imposed on the smoothing constants, \(\alpha\) and \(\beta_{1}\)? Second, what starting values (for the two smoothing constants) should be used in the optimization routine? To address the first issue, two approaches were considered, each providing a different parameter space for the two smoothing constants. The first approach is to impose the classical boundary of the smoothing constants in exponential smoothing, that is, between 0 and 1 for \(\alpha\) and between 0 and \(\alpha\) for \(\beta_{1}\), as explained by [6] and Osman and King [3]. The second approach is to restrict the eigenvalues of the characteristic equation of the model to lie within the unit circle, as explained by Osman and King [7]. This second approach, known as the forecastability concept, gives a wider parameter space for the smoothing constants than the first approach; a sketch of both checks is given below. With regard to the second issue, different sets of starting points for the smoothing constants were considered, namely 0.01, 0.1, or 0.5 for each smoothing constant.
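For illustration, the two restriction schemes can be expressed as simple admissibility checks in R. The helper names are hypothetical; in particular, the construction of the discount matrix needed for the forecastability check follows Osman and King [7] and is not reproduced here.

```r
# Classical boundary: 0 < alpha < 1 and 0 < beta1 <= alpha (whether the
# upper bound on beta1 is strict is an implementation detail).
valid_classical <- function(alpha, beta1) {
  alpha > 0 && alpha < 1 && beta1 > 0 && beta1 <= alpha
}

# Forecastability (sketch only): all eigenvalues of the model's discount
# matrix D must lie strictly inside the unit circle; building D for this
# model follows Osman and King [7] and is not shown here.
valid_forecastable <- function(D) {
  all(abs(eigen(D, only.values = TRUE)$values) < 1)
}
```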

The analysis, which involves only the one model described by Eq. (18.1), started with producing two R optimization codes for parameter estimation using the two different sets of restrictions explained above. The next step was to produce six simulated series with different predetermined parameter values, as listed in Table 18.1. For all simulations, the initial level and growth term were set to \(\ell_{0} = 500\) and \(b_{1,0} = 0.5\). Considering a situation where the "switching procedure" is not required, all simulated series were generated from the following equations, where the error term \(\varepsilon_{t}\) is a generated normally distributed series; a sketch of such a generator is given after Eq. (18.3).

Table 18.1 Predetermined parameter values for six simulated series
$$y_{t} = \ell_{t-1} + b_{1,t-1}\Delta_{z_{1,t}} + \varepsilon_{t}$$
(18.3a)
$$\ell_{t} = \ell_{t-1} + b_{1,t-1}\Delta_{z_{1,t}} + \alpha\varepsilon_{t}$$
(18.3b)
$$b_{1,t} = b_{1,t-1} + \frac{\beta_{1}\varepsilon_{t}}{\Delta_{z_{1,t}}}$$
(18.3c)
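Under the stated initial values \(\ell_{0} = 500\) and \(b_{1,0} = 0.5\), a generator for Eq. (18.3) could be sketched in R as follows. The random-walk regressor and the unit error variance are assumptions, since the paper specifies only that \(\varepsilon_{t}\) is normally distributed.

```r
# Simulate one series from Eqs. (18.3a)-(18.3c).  The regressor process
# (a random walk) and the error variance are assumptions; the paper only
# states that eps_t is a normally distributed generated series.  The
# regressor increments are assumed to stay away from zero, consistent
# with the paper's no-switching setup.
simulate_series <- function(n, alpha, beta1, l0 = 500, b10 = 0.5) {
  z1  <- cumsum(rnorm(n + 1))   # assumed regressor path
  eps <- rnorm(n)               # normally distributed errors
  y   <- numeric(n)
  l   <- l0
  b1  <- b10
  for (t in 1:n) {
    dz   <- z1[t + 1] - z1[t]              # Delta_{z_{1,t}}
    y[t] <- l + b1 * dz + eps[t]           # Eq. (18.3a)
    l    <- l + b1 * dz + alpha * eps[t]   # Eq. (18.3b)
    b1   <- b1 + beta1 * eps[t] / dz       # Eq. (18.3c)
  }
  list(y = y, z1 = z1[-1])
}
```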

The third stage of the analysis applied the optimization R codes to all simulated series. A total of eight sets of starting points for the smoothing constants were used, six of which are simply combinations of a single starting point for both \(\alpha\) and \(\beta_{1}\). For the other two sets, either two or three different starting values were used for each smoothing constant, and all possible combinations of these starting points were then considered in the optimization routine. Table 18.2 lists all starting points used in the estimation process. Note that, even though multiple starting points were used, only the one giving the smallest value of the objective function was examined.

Table 18.2 Starting points used in optimization routine

The estimation was performed iteratively by minimizing the objective function \(n\log\left(\sum_{t=1}^{n}\varepsilon_{t}^{2}\right)\) in the optimization code. The optimization was conducted using the "optim" function with the Nelder–Mead algorithm. The final step of the analysis was to evaluate the estimation results by comparing the predetermined values with the estimated values of all parameters.
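A hedged sketch of how this could be wired together, reusing the illustrative helpers above: each starting point from the grid is passed to "optim" with the Nelder–Mead method, and only the run with the smallest objective value is kept. Returning a large penalty for inadmissible parameter pairs is one common way of handling box-type constraints under Nelder–Mead and is an implementation choice, not necessarily the one used in the paper.

```r
# Objective from Sect. 2: n * log(sum of squared one-step errors).
# Inadmissible points simply receive a large penalty and drop out of
# contention when the best run is selected.
objective <- function(par, y, z1) {
  if (!valid_classical(par[1], par[2])) return(1e10)
  eps <- ets_reg_filter(y, z1, alpha = par[1], beta1 = par[2])
  length(y) * log(sum(eps^2))
}

# All pairs of the starting values 0.01, 0.1, 0.5 (cf. Table 18.2).
starts <- expand.grid(alpha = c(0.01, 0.1, 0.5),
                      beta1 = c(0.01, 0.1, 0.5))

set.seed(1)                                             # reproducible illustration
sim  <- simulate_series(200, alpha = 0.3, beta1 = 0.1)  # illustrative truth
fits <- apply(starts, 1, function(s)
  optim(s, objective, y = sim$y, z1 = sim$z1, method = "Nelder-Mead"))

# As in the paper, keep only the run with the smallest objective value.
best <- fits[[which.min(sapply(fits, `[[`, "value"))]]
best$par
```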

3 Estimation Results

The estimation results are given in Tables 18.3, 18.4, 18.5, 18.6, 18.7, and 18.8. In most cases, small starting points (\(\alpha = 0.01, \,\beta_{1} = 0.01\) and \(\alpha = 0.1, \,\beta_{1} = 0.01\)) produced the smallest value of the objective function. This finding, however, does not hold for simulation 5, nor for simulations 3 and 4 when the parameter-space restrictions were imposed according to the forecastability concept. Another finding is that, when the actual (predetermined) parameter values are small, the use of large starting points led the optimization routine to fail to produce estimates close to the actual values, as evident in the results for simulations 1 and 2. As mentioned earlier, estimation based on the forecastability concept provides a wider space for the smoothing constants. Looking at the results for simulation 5, the estimated values of \(\alpha\) obtained under the forecastability concept are much higher than the predetermined value. In this particular case, estimation based on the classical boundary method produced a more accurate estimate of \(\alpha\).

Table 18.3 Estimation results for simulated series 1
Table 18.4 Estimation results for simulated series 2
Table 18.5 Estimation results for simulated series 3
Table 18.6 Estimation results for simulated series 4
Table 18.7 Estimation results for simulated series 5
Table 18.8 Estimation results for simulated series 6

4 Conclusion

As discussed in the previous section, the use of small starting values for the smoothing constants in the optimization routine is advised over large starting values. However, given the better performance of large starting points in some cases, it is recommended to use multiple starting points that also include small values. It can also be concluded that imposing restrictions on the parameter space via the forecastability concept does not necessarily produce better estimates than using the classical boundary concept. Since it provides a wider parameter space for the smoothing constants, it also allows a greater chance of estimation error.

The classical boundary concept for restricting the parameter space of the smoothing constants is in fact sufficient if we want to update the level, \(\ell_{t}\), and the regressor parameter, \(b_{1,t}\), based on the weighted-average methodology, the concept used in the classical forms of the exponential smoothing methods. This is because restricting \(\alpha\) to lie between 0 and 1 and \(\beta_{1}\) to lie between 0 and \(\alpha\) in the state-space representation is equivalent to restricting both \(\alpha\) and \(\beta_{1}\) to lie between 0 and 1 in the classical form of the exponential smoothing methods.