Introduction

Forecasting is arguably among the most important functions of a business, because it is the starting point for planning and budgeting. Traditionally, budgeting is regarded as one of the most important financial and accounting functions (Ekholm and Wallin 2000; Tanlu 2007). In recent years, rapidly changing market conditions have made financial planning tools increasingly important for both managers and practitioners. However, the use of traditional budgets as a management control tool has been criticized (Hope and Fraser 2003). In the relevant literature, a number of new tools have been proposed as replacements for traditional budgets, such as rolling forecasts and beyond budgeting (Bergstrand 2009; Bogsnes 2009). The purpose of a rolling forecast is to use the most recent data in order to make organizations more flexible and adaptable, and thus able to cope with changing environments (Lorain 2010). A growing number of companies are adopting rolling forecasts as part of the Beyond Budgeting model (Bogsnes 2009); the main reason is to become more adaptive and hence to better support company planning and control processes (Hope and Fraser 2003).

Management typically operates under conditions of uncertainty or risk, and one of the fundamental objectives of forecasting is to reduce the risk and uncertainty of financial decisions. A variety of forecasting techniques are available, from which the analyst can choose the most appropriate one. However, in real business life the number of time series to be forecast is enormous and the forecasts have to be updated frequently, which makes building a model by hand for each series almost impossible. Therefore, automatic forecasts of large numbers of univariate time series are often needed in business (Leonard 2002).

When multiple forecasts are available for a target variable, forecast combination methods provide a simple and effective way to improve on the forecasting performance of the individual models. Further, they provide a simple procedure for handling misspecified and unstable forecasters, small sample sizes, and structural breaks in the data (Huang and Lee 2010; Mandel and Sani 2016). Forecast combination methods usually outperform the best individual forecaster. For example, forecast combination has been applied, mostly with success, to forecasting interest rates (Guidolin and Timmermann 2009), equity premium prediction (Rapach et al. 2010), realized volatility (Patton and Sheppard 2009), and stock market return prediction (Nikolopoulos and Papakyriazis 2004). However, most of the existing combination models focus on environments that ignore the complexity of real-world data. Several studies propose combination models capable of adapting to various environments and system instabilities (Aiolfi and Timmermann 2006; Nikolopoulos and Papakyriazis 2002; Smith and Wallis 2009; Tian and Anderson 2014).

In this paper, we address the nontrivial problem of forecasting monthly sales and cost of goods sold (COGS) for a manufacturing company. We apply a combination scheme of automatic forecasts based on a state-space representation, in which the combination weights are estimated online by means of the Kalman filter.

Combining Automatic Forecasts

An automatic forecasting system can be used to fit various models automatically (e.g., exponential smoothing models, ARIMA, and dynamic regression models). Automatic forecasts may be used when no experienced forecaster is available, when the number of forecasts to be generated is large, when the frequency of forecast updates is high, or when the true model is not known or is difficult to identify (Leonard 2002). Combining a number of automatic forecasting models may produce superior forecasts, especially out-of-sample. The forecast package (Hyndman 2016) for the R system for statistical computing implements various automatic forecasting models. In the current work, two general automatic forecasting models are utilized: an exponential smoothing state-space model (ETS) and an autoregressive integrated moving average (ARIMA) model (Hyndman and Khandakar 2008; Hyndman et al. 2002). In particular, the ETS model offers 15 methods, such as simple exponential smoothing (N, N), Holt’s linear method (A, N), etc. (Hyndman 2016; Hyndman and Khandakar 2008; Hyndman et al. 2002).
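As an illustration of the simplest of these methods, the following is a minimal Python sketch of simple exponential smoothing (N, N), not the implementation in the forecast package. The function name, the fixed smoothing parameter `alpha`, the choice of the first observation as the initial level, and the toy data are all assumptions made for this example; an automatic ETS procedure would estimate the parameters from the data rather than fix them.

```python
def simple_exponential_smoothing(series, alpha, initial_level=None):
    """Simple exponential smoothing (no trend, no seasonality).

    Returns the one-step-ahead forecast made for each observation and
    the forecast for the period after the last observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # Assumption: start the level at the first observation unless told otherwise.
    level = series[0] if initial_level is None else initial_level
    one_step_forecasts = []
    for y in series:
        one_step_forecasts.append(level)           # forecast made before observing y
        level = alpha * y + (1.0 - alpha) * level  # update the level with y
    return one_step_forecasts, level


# Toy data, invented for illustration only.
toy_series = [10.0, 12.0, 11.0, 13.0]
fitted, next_forecast = simple_exponential_smoothing(toy_series, alpha=0.5)
print(fitted)         # [10.0, 10.0, 11.0, 11.0]
print(next_forecast)  # 12.0
```

Richer ETS variants add trend and seasonal components to the same kind of recursion.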

The automatic ARIMA procedure identifies a seasonal ARIMA model of the form ARIMA(p, d, q)(P, D, Q)m. The first three components (p, d, q) are the AR order, the degree of differencing, and the MA order. The other three components (P, D, Q) specify the seasonal part of the ARIMA model, and m is the number of periods per season. To select (p, d, q, P, D, Q), the automatic function in R uses a variation of the Hyndman and Khandakar algorithm (Hyndman and Khandakar 2008), which combines unit root tests, minimization of the AICc, and maximum likelihood estimation (MLE) to obtain an ARIMA model.
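To make the roles of d, D, and m concrete, the sketch below applies ordinary and seasonal differencing to a series in plain Python. It covers only the differencing step, not order selection or estimation; the helper names and the toy series are assumptions introduced for this example.

```python
def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag] for t >= lag."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def apply_differencing(series, d=0, D=0, m=1):
    """Apply D seasonal differences of period m, then d ordinary differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=m)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Toy series with period m = 4 and a steady rise of 2 per season (invented data).
toy_series = [1, 2, 3, 4, 3, 4, 5, 6, 5, 6, 7, 8]
seasonally_differenced = apply_differencing(toy_series, D=1, m=4)
print(seasonally_differenced)                    # [2, 2, 2, 2, 2, 2, 2, 2]
print(apply_differencing(toy_series, d=1, D=1, m=4))  # [0, 0, 0, 0, 0, 0, 0]
```

Each seasonal difference shortens the series by m observations, so a monthly series of 36 observations with D = 1 and m = 12 leaves 24 values for estimating the remaining ARMA terms.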

For the combination of the forecasts, we propose a state-space representation in which a dynamic linear model combines the automatic forecasts in real time. A state-space model is composed of an unobservable state sequence \( \theta_{0} ,\theta_{1} ,\theta_{2} , \ldots ,\theta_{t} , \ldots \), which forms a Markov chain, and a sequence of observable variables \( Y_{1} ,Y_{2} , \ldots ,Y_{t} , \ldots \), which are conditionally independent given the states. A very important class of state-space models is the dynamic linear model, which is specified by three equations. Equation (1) is a normal prior distribution for the p-dimensional state vector at time t = 0, Eq. (2) is called the observation equation, and Eq. (3) the state equation or system equation.

$$ \theta_{0} \sim {\mathcal{N}}_{p} \left( {m_{0} ,C_{0} } \right) \tag{1} $$

$$ Y_{t} = F_{t} \theta_{t} + \upsilon_{t} \quad \quad \upsilon_{t} \sim {\mathcal{N}}_{m} \left( {0,V_{t} } \right) \tag{2} $$

$$ \theta_{t} = G_{t} \theta_{t - 1} + w_{t} \quad \quad w_{t} \sim {\mathcal{N}}_{p} \left( {0,W_{t} } \right) \tag{3} $$

where \( G_{t} \) and \( F_{t} \) are known matrices (of order \( p \times p \) and \( m \times p \), respectively) and \( (\upsilon_{t} )_{t \ge 1} \) and \( (w_{t} )_{t \ge 1} \) are two independent sequences of independent Gaussian random vectors with mean zero and known variance matrices \( (V_{t} )_{t \ge 1} \) and \( (W_{t} )_{t \ge 1} \), respectively. Furthermore, it is assumed that \( \theta_{0} \) is independent of \( (\upsilon_{t} ) \) and \( (w_{t} ) \) (Petris et al. 2009).

In this form, one can model nonlinear relationships between x and y, structural changes in the process under study, and the omission of some variables. For the optimality properties of the Kalman filter algorithm, the interested reader is referred to Kalman (1960), Gelb (1974), Hamilton (1994), and Nikolopoulos and Papakyriazis (2002).
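For reference, the standard Kalman filter recursions for the model in Eqs. (1) to (3) can be sketched as follows. The symbols \( a_{t} \), \( R_{t} \), \( f_{t} \), \( Q_{t} \), \( m_{t} \), and \( C_{t} \) are not defined in the text above; they are introduced here, following the usual dynamic linear model notation, as assumptions of this sketch. Starting from the filtering distribution \( \theta_{t - 1} \mid y_{1:t - 1} \sim {\mathcal{N}}_{p} \left( {m_{t - 1} ,C_{t - 1} } \right) \), one step of the filter computes

$$ \begin{aligned} a_{t} &= G_{t} m_{t - 1} , & R_{t} &= G_{t} C_{t - 1} G_{t}^{\prime} + W_{t} \\ f_{t} &= F_{t} a_{t} , & Q_{t} &= F_{t} R_{t} F_{t}^{\prime} + V_{t} \\ m_{t} &= a_{t} + R_{t} F_{t}^{\prime} Q_{t}^{ - 1} \left( {Y_{t} - f_{t} } \right), & C_{t} &= R_{t} - R_{t} F_{t}^{\prime} Q_{t}^{ - 1} F_{t} R_{t} \end{aligned} $$

Here \( a_{t} \) and \( R_{t} \) are the mean and variance of the one-step-ahead state prediction, \( f_{t} \) and \( Q_{t} \) are the mean and variance of the one-step-ahead forecast of \( Y_{t} \), and \( m_{t} \) and \( C_{t} \) are the updated (filtered) mean and variance of the state.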

In our work, the dynamic linear regression model that is used to combine the automatic forecasts is described by

$$ Y_{t} = x_{t}^{\prime} \theta_{t} + \upsilon_{t} \quad \quad \upsilon_{t} \sim {\mathcal{N}}\left( {0,\sigma_{t}^{2} } \right) \tag{4} $$

$$ \theta_{t} = G_{t} \theta_{t - 1} + w_{t} \quad \quad w_{t} \sim {\mathcal{N}}_{p} \left( {0,W_{t} } \right) \tag{5} $$

where \( x_{t}^{\prime} = \left[ {x_{1,t} , \ldots ,x_{p,t} } \right] \) are the values of the p explanatory variables at time \( t \); in our application these are the individual automatic forecasts, so \( \theta_{t} \) holds the combination weights. Setting \( G_{t} \) to the identity matrix and \( W_{t} \) to a diagonal matrix corresponds to modeling the regression coefficients as independent random walks (Petris et al. 2009).
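The following Python sketch shows how the Kalman filter can update the combination weights online for the model in Eqs. (4) and (5) with \( G_{t} \) equal to the identity. It is an illustration under stated assumptions, not the code used for the results reported here: the function name, the constant and known variances `obs_var` and `state_var`, the equal-weight starting point, the diffuse prior variance `c0`, and the toy data are all hypothetical.

```python
def kalman_combine(y, forecasts, obs_var, state_var, m0=None, c0=1e3):
    """Online forecast combination with random-walk weights.

    y         : actual values, one per period
    forecasts : one row [x_1t, ..., x_pt] of individual forecasts per period
    obs_var   : observation variance (assumed known and constant)
    state_var : variance of each weight's random-walk step (W = state_var * I)
    Returns the combined one-step-ahead forecasts and the filtered weights.
    """
    p = len(forecasts[0])
    m = [1.0 / p] * p if m0 is None else list(m0)  # assumed prior mean: equal weights
    C = [[c0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    combined, weights = [], []
    for y_t, x in zip(y, forecasts):
        # Prediction step: with G = I the mean is unchanged and R = C + W.
        R = [[C[i][j] + (state_var if i == j else 0.0) for j in range(p)]
             for i in range(p)]
        f = sum(x[i] * m[i] for i in range(p))            # combined forecast x' m
        Rx = [sum(R[i][j] * x[j] for j in range(p)) for i in range(p)]
        Q = sum(x[i] * Rx[i] for i in range(p)) + obs_var  # forecast variance
        K = [Rx[i] / Q for i in range(p)]                  # Kalman gain
        # Update step: correct the weights with the forecast error y_t - f.
        m = [m[i] + K[i] * (y_t - f) for i in range(p)]
        C = [[R[i][j] - K[i] * Rx[j] for j in range(p)] for i in range(p)]
        combined.append(f)
        weights.append(list(m))
    return combined, weights


# Toy data, invented for illustration: the first model is always exactly right.
actual = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
model_forecasts = [[10.0, 12.0], [12.0, 11.0], [11.0, 14.0],
                   [13.0, 10.0], [12.0, 15.0], [14.0, 11.0]]
combined, weights = kalman_combine(actual, model_forecasts,
                                   obs_var=0.01, state_var=1e-4)
print(combined[0])   # 11.0, the equal-weight average used before any data is seen
print(weights[-1])   # close to [1.0, 0.0]
```

Because the observation is scalar, the gain needs only a division by \( Q \) rather than a matrix inversion. A larger `state_var` lets the weights adapt faster at the cost of noisier estimates.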

Application

In this section, we implement the automatic models presented in Sect. 2 and then combine their forecasts with a simple average model, an unrestricted linear regression model, and our proposed dynamic linear model. In our experiments, we use data from a Greek manufacturing company in the chemical sector. In particular, we use monthly sales from January 2008 to December 2010, that is, 36 observations, for in-sample model estimation. The monthly sales of 2011 are used for out-of-sample forecasting, forecast combination, and forecast evaluation. The estimated automatic models for the sales are ARIMA(0, 0, 0)(1, 1, 0)[12] with drift and ETS(M, N, M), while the respective models for the COGS are ARIMA(1, 1, 0) and ETS(M, A, N). The triplet (E, T, S) refers to the three components: error, trend, and seasonality. For example, the model ETS(M, A, M) has multiplicative error, additive trend, and multiplicative seasonality, while the model ETS(M, N, M) has multiplicative error, no trend, and multiplicative seasonality. More information on the model descriptions and on measures of forecast accuracy can be found in Hyndman (2016) and Hyndman and Koehler (2006), respectively. The actual and forecasted values are shown in Figs. 1 and 2.
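The simplest of the benchmark combinations, and an RMSE comparison of the kind used in the forecast evaluation, can be sketched in Python as follows. The function names, the toy numbers, and the definition of percentage improvement as the relative reduction in RMSE are assumptions of this example; they are not the company data or necessarily the exact formula behind the reported results.

```python
import math


def simple_average(forecasts):
    """Combine forecasts with equal weights; one row of forecasts per period."""
    return [sum(row) / len(row) for row in forecasts]


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def pct_improvement(rmse_model, rmse_benchmark):
    """Relative RMSE reduction of a model over a benchmark, in percent."""
    return 100.0 * (rmse_benchmark - rmse_model) / rmse_benchmark


# Toy out-of-sample data, invented for illustration only.
actual = [100.0, 110.0, 120.0]
forecast_a = [96.0, 114.0, 116.0]
forecast_b = [108.0, 102.0, 128.0]

averaged = simple_average(list(zip(forecast_a, forecast_b)))
print(averaged)                                   # [102.0, 108.0, 122.0]
print(rmse(actual, forecast_a))                   # 4.0
print(rmse(actual, forecast_b))                   # 8.0
print(rmse(actual, averaged))                     # 2.0
print(pct_improvement(rmse(actual, averaged),
                      rmse(actual, forecast_a)))  # 50.0
```

In this toy case the errors of the two forecasts partly offset each other, which is why the equal-weight average beats both of them.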

Fig. 1 Actual and forecasted monthly sales

Fig. 2 Actual and forecasted COGS

Conclusions

A dynamic linear model was applied to the combination of monthly sales and COGS forecasts. Combining the forecasts greatly reduced the model selection risk, and the out-of-sample performance of the proposed combination model was better than that of every other single or combined forecasting model applied in this work. Notably, in terms of RMSE, the improvement over the other forecasting models ranges from 1 to 55 percentage points (see Tables 1 and 2). We expect the benefits of dynamic combination to be greater when more automatic forecasts are combined; this experiment is left for future work.

Table 1 Out of sample accuracy measures for monthly sales forecasts
Table 2 Out of sample accuracy measures for monthly COGS forecasts