Abstract
We propose a method for forecasting intermittent demand with a generalized state-space model using time series data. Specifically, we employ a mixture of a point mass at zero and a Poisson distribution. To show the superiority of our method to the Croston, log-Croston, and DECOMP models, we conducted a comparison analysis using actual data for a grocery store. The results show that our method outperforms the other models in highly intermittent demand cases.
1 Introduction
Accurately forecasting intermittent demand is important for manufacturers, transport businesses, and retailers [3] because consumer preferences have diversified and highly diversified products are consequently produced in small lots. There are many models for forecasting intermittent demand. Croston’s model [2] is one of the most popular and has many variants, including log-Croston and modified Croston. However, Croston’s model has an inconsistency in its assumptions, as pointed out by Shenstone and Hyndman [9]. Further, Croston’s model generally requires a round-up approximation of the inter-arrival time to estimate the parameters from discrete time-series data.
We employ non-Gaussian nonlinear state-space models to forecast intermittent demand. Specifically, we employ a mixture of a point mass at zero and a Poisson distribution because the occurrence of an intermittent phenomenon generally implies low average demand. As in DECOMP [5, 6], time series are broken down into trend, seasonal, auto-regression, and external terms in our model. Because the parameters are assumed to be non-stationary, the number of parameters exceeds the number of data items, and we cannot obtain them via ordinary maximum likelihood estimators. We therefore adopt a Bayesian framework similar to that of DECOMP. For filtering, we employ a particle filter [7] instead of the Kalman filter used in DECOMP, because the system and observation noises are non-Gaussian and the models are nonlinear. To show the superiority of our method to other typical intermittent demand forecasting methods, we conduct a comparison analysis using actual data for a grocery store.
2 Model
2.1 Mixture Distribution and Components
Let the observation of a time series for discrete product demand be \(y_n~(n=1,2,\ldots , N)\). We assume that demand for a product at an arbitrary time step n follows a mixture distribution that respects the non-negativity of product demand. Unlike Croston’s model, this requires no approximating operations. The mixture is composed of a point mass at 0 with weight \(w_n\) and a Poisson distribution with parameter \(\lambda _n\) and weight \(1-w_n\):
\[ P(y_n \mid w_n, \lambda _n) = w_n\,\delta _0(y_n) + (1-w_n)\,\frac{\lambda _n^{y_n} e^{-\lambda _n}}{y_n!}, \]
where \(\delta _0(y_n)\) equals 1 if \(y_n=0\) and 0 otherwise.
From the expectation property of the Poisson distribution, the expected value of the mixture distribution becomes \((1-w_n)\lambda _n\).
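The mixture and its expected value can be checked numerically. The following is a minimal sketch in Python; the function names `zip_pmf` and `zip_mean` and the values of `w` and `lam` are illustrative assumptions, not part of the original model code.

```python
import math

def zip_pmf(y, w, lam):
    """P(Y = y) for a point mass at zero (weight w) mixed with Poisson(lam) (weight 1 - w)."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    point_mass = 1.0 if y == 0 else 0.0
    return w * point_mass + (1.0 - w) * poisson

def zip_mean(w, lam):
    """Expected value of the mixture, (1 - w) * lam."""
    return (1.0 - w) * lam

# Illustrative (assumed) parameter values.
w, lam = 0.6, 2.5

# Truncating the support at 60 leaves a negligible tail for lam = 2.5.
total = sum(zip_pmf(y, w, lam) for y in range(60))
mean = sum(y * zip_pmf(y, w, lam) for y in range(60))

print(total, mean, zip_mean(w, lam))
```

The probabilities sum to one, and the numerically computed mean agrees with \((1-w_n)\lambda _n\) up to floating-point error.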
Now assume that parameter \(\lambda _n\) has trend component \(t_n\), seasonal component \(s_n\), steady component \(d_n\), and external component \(e_n\):
Specifically, the fluctuations in each component are as follows:
Here, \(\varDelta _q^m\) denotes the difference operator with cycle q and degree m in the trend term. k and l are the degrees of differences in the weight and the seasonal component, respectively. I is the auto-regression order and J is the number of external variables. \(a_i\) is the ith auto-regression coefficient. \(v_{0,n}\), \(v_{1,n}\), \(v_{2,n}\), \(v_{3,n}\), and \(v_{e,j,n}\) are the noise terms for the components, and they follow Gaussian distributions:
where \(N(0,\sigma ^2)\) is a Gaussian distribution with mean 0 and variance \(\sigma ^2\). The seasonal component can in principle contain multiple components simultaneously. However, introducing multiple seasonal components often leads to errors in practice, so we employ a single seasonal component. The external component captures explanatory variables such as price and promotion. In addition, we use the external component to account for the holiday effect via dummy variables.
2.2 State-Space Expression
It is meaningless to estimate the time-varying parameter \(\lambda _n\) via ordinary maximum likelihood estimators. In the simplest setting, the number of unknown variables \(\lambda _n\) equals the number of data items \(y_n\). Furthermore, we cannot estimate the parameters in our settings via ordinary maximum likelihood estimators because the number of unknown variables \(t_n\), \(s_n\), \(d_n\), and \(e_n\) exceeds the number of data items \(y_n\).
We introduce the state-space expression to resolve this problem. Let the model be expressed as a state-space model. When \(k=2\), \(l=2\), \(m=1\), \(q=7\), \(I=2\), and \(J=1\), we can write the state vector as
Therefore, the system and observation models are described as
where \({\varvec{v}}_n\) is an independent and identically distributed noise term vector corresponding to \(v_{0,n}\), \(v_{1,n}\), \(v_{2,n}\), \(v_{3,n}\), and \(v_{e,j,n}\). The observation model is not linear; therefore, we cannot employ a Kalman filter and must use a particle filter instead.
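To illustrate how a particle filter handles a zero/Poisson mixture observation, the following is a deliberately simplified sketch, not the model of this paper. It assumes a scalar state \(x_n=\log \lambda _n\) following a Gaussian random walk, a constant weight `w`, and plain multinomial resampling; the names `zip_logpmf`, `particle_filter`, `tau`, and the toy series are all hypothetical.

```python
import numpy as np
from math import lgamma

def zip_logpmf(y, w, lam):
    """Log pmf of the zero/Poisson mixture, vectorised over the particle array lam."""
    log_pois = -lam + y * np.log(lam) - lgamma(y + 1)
    if y == 0:
        # log(w + (1 - w) * Poisson(0 | lam)), computed stably
        return np.logaddexp(np.log(w), np.log1p(-w) + log_pois)
    return np.log1p(-w) + log_pois

def particle_filter(y, w=0.5, tau=0.1, n_particles=2000, seed=0):
    """Bootstrap particle filter for an ASSUMED toy model:
    x_n = x_{n-1} + v_n, v_n ~ N(0, tau^2), lambda_n = exp(x_n), w constant."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n_particles)        # assumed initial distribution
    loglik = 0.0
    filtered_mean = []
    for obs in y:
        x = x + rng.normal(0.0, tau, n_particles)     # propagate through system model
        logw = zip_logpmf(obs, w, np.exp(x))           # weight by observation model
        m = logw.max()
        wts = np.exp(logw - m)
        loglik += m + np.log(wts.mean())               # log of average likelihood
        wts /= wts.sum()
        filtered_mean.append((1.0 - w) * np.sum(wts * np.exp(x)))
        x = x[rng.choice(n_particles, size=n_particles, p=wts)]  # multinomial resampling
    return loglik, np.array(filtered_mean)

# Toy intermittent series (made up for illustration only).
demand = [0, 0, 3, 0, 1, 0, 0, 2, 0, 0]
loglik, means = particle_filter(demand)
print(loglik, means.round(3))
```

The returned log-likelihood is a Monte Carlo estimate, so it changes with the random seed. A full implementation would replace the scalar random walk with the trend, seasonal, auto-regression, and external components described above.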
2.3 Parameter Estimation
In the above setting, the elements in \({\varvec{F}}_n\) (with the exception of \(a_i\)) are given; however, we need to estimate the other (hyper) parameters, \(\tau _i\), \(\tau _{e,j}\), and \(a_i\). Let \(R(y_n|{\varvec{x}}_n)\) be the likelihood at arbitrary time n; then, the likelihood with all data \((y_1,\ldots ,y_N)\) is given by
We estimate the parameters by maximizing Eq. (15).
The likelihood calculated via a particle filter contains Monte Carlo error that varies from trial to trial, so we cannot employ gradient methods such as the Newton method. We therefore employ a grid search algorithm to maximize Eq. (15). Within the particle filter, we use residual resampling [8] and sequential importance sampling [4] to update the particles.
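Residual resampling can be sketched as follows. This is a generic illustration of the technique rather than the code used in this study; the function name `residual_resample` and the example weights are assumptions. Each particle first receives \(\lfloor N w_i \rfloor\) deterministic copies, and the remaining slots are filled by a multinomial draw on the residual weights.

```python
import numpy as np

def residual_resample(weights, rng):
    """Return resampled particle indices using residual resampling."""
    weights = np.asarray(weights, dtype=float)
    n = weights.size
    weights = weights / weights.sum()              # normalise defensively
    counts = np.floor(n * weights).astype(int)     # deterministic copies
    n_rest = n - counts.sum()                      # slots still to fill
    if n_rest > 0:
        residual = n * weights - counts            # fractional parts
        residual /= residual.sum()
        counts += rng.multinomial(n_rest, residual)
    return np.repeat(np.arange(n), counts)

rng = np.random.default_rng(0)
idx = residual_resample([0.5, 0.25, 0.125, 0.125], rng)
print(idx)  # particles 0, 0, 1 are fixed; the last slot is 2 or 3 at random
```

Compared with plain multinomial resampling, only the fractional part of each expected count is random, which reduces the variance introduced by the resampling step.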
3 Comparison Analysis
3.1 Data and Models for Comparison
To show the superiority of our method, we compare it with typical intermittent demand forecasting methods and other relevant methods, namely Croston, log-Croston [10], and DECOMP. The estimation methods used in the Croston and log-Croston methods are those shown in Syntetos and Boylan [11]. The smoothing parameter in the Croston model is set as \(\alpha =0.5\). The data analyzed here comprise fifty days of daily retail data for four SKU-level products in a Japanese grocery store. Further details of the data are shown in Table 1. Products #1 and #4 have relatively more intermittent demand than #2 and #3. Owing to differences in the estimation schemes, we compare forecast accuracy among these models by the root mean square (RMS) error.
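For reference, the classic form of Croston's method and the RMS error can be sketched as below. This is the textbook version with one common initialization (first non-zero demand and the interval leading up to it), which is an assumption here; it is not necessarily the exact estimation scheme of Syntetos and Boylan [11] used in the comparison. The names `croston` and `rms_error` and the toy series are hypothetical.

```python
import math

def croston(y, alpha=0.5):
    """Classic Croston forecasts; forecasts[t] is the forecast made after observing y[t]."""
    z = p = None     # smoothed demand size and smoothed inter-demand interval
    q = 1            # periods elapsed since the last non-zero demand
    forecasts = []
    for obs in y:
        if obs > 0:
            if z is None:
                # Assumed initialization: first demand size and first interval.
                z, p = float(obs), float(q)
            else:
                z = alpha * obs + (1 - alpha) * z
                p = alpha * q + (1 - alpha) * p
            q = 1
        else:
            q += 1
        # Demand rate per period; zero until the first demand is seen.
        forecasts.append(0.0 if z is None else z / p)
    return forecasts

def rms_error(actual, predicted):
    """Root mean square of the forecast errors."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, predicted)) / len(actual))

# Toy series (made up for illustration only).
demand = [0, 2, 0, 0, 4, 0]
fc = croston(demand, alpha=0.5)
# Compare each one-step-ahead forecast with the next observation.
print(fc, rms_error(demand[1:], fc[:-1]))
```

Note that the estimates are updated only in periods with non-zero demand, so the forecast stays constant during runs of zeros.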
To shorten the calculation time, the number of particles in each time step is fixed at 10,000 in this paper. The degrees and cycles in our model and DECOMP are set to \(k=2\), \(l=2\), \(m=1\), \(q=7\), \(I=2\), and \(J=1\) (the external variable is the daily price of the target product, which is not used in DECOMP). In the grid search, each hyperparameter of the error term has five nodes (\(v_i=0.003125, 0.00625,\ldots , 0.05\)) and each auto-regression coefficient has 10 nodes (\(a_i=-1.0,-0.8,\ldots ,1.0\)).
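The grid described above can be sketched as follows. The doubling rule for the noise hyperparameters is inferred from the listed endpoints, and a step of 0.2 from -1.0 to 1.0, read literally, gives 11 values; the sketch uses that literal reading as an assumption. The helper `grid_search` and the toy `score` function are hypothetical stand-ins; in practice the score would be the particle-filter log-likelihood.

```python
import itertools
import numpy as np

# Five nodes obtained by doubling: 0.003125, 0.00625, 0.0125, 0.025, 0.05
tau_grid = 0.003125 * 2.0 ** np.arange(5)

# Step 0.2 from -1.0 to 1.0 (11 values under a literal reading)
a_grid = np.round(np.linspace(-1.0, 1.0, 11), 1)

def grid_search(score, *grids):
    """Return the grid point that maximises score(*point) by exhaustive evaluation."""
    return max(itertools.product(*grids), key=lambda point: score(*point))

def score(tau, a):
    """Toy objective standing in for a log-likelihood (assumed, for illustration)."""
    return -((tau - 0.0125) ** 2) - (a - 0.4) ** 2

best_tau, best_a = grid_search(score, tau_grid, a_grid)
print(best_tau, best_a)
```

Because every node is evaluated independently, the search needs no derivatives and is unaffected by the roughness of a Monte Carlo likelihood surface, at the cost of a number of evaluations that grows multiplicatively with each added parameter.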
3.2 Results
Table 2 shows the results for each data set. The overall RMS for our model is lower than those of the other three models. For #1, the RMS for our model is the lowest; thus, our method is superior to the other three models. For #4, our method is superior to Croston and log-Croston, but inferior to DECOMP. However, DECOMP predicts negative demand, which can never occur (five-step-ahead forecast). Therefore, we conclude that our method is superior to the other models in highly intermittent demand situations.
In contrast to the highly intermittent demand situation, we cannot show substantial superiority of our model to the other three models for #2 and #3. A plausible explanation is that the degree of non-Gaussianity in the data drives these differences. If the data are close to Gaussian, Croston and DECOMP are suitable. On the other hand, log-Croston and our model are suitable if the data have low demand (namely, are far from Gaussian).
4 Conclusions
This paper proposed a method to forecast intermittent demand with non-Gaussian nonlinear state-space models using a particle filter. To show the superiority of our method to other typical intermittent demand forecasting methods, we conducted a comparison analysis using actual data for a grocery store. The results of this comparison analysis show the superiority of our method to the Croston, log-Croston, and DECOMP models in highly intermittent demand cases. In the future, we intend to shorten the calculation time; the MCMC filter [1] is a promising method for this purpose.
References
1. Andrieu, C., Doucet, A., Holenstein, R.: Particle Markov chain Monte Carlo methods. J. Roy. Stat. Soc. B 72(3), 269–342 (2010)
2. Croston, J.D.: Forecasting and stock control for intermittent demands. Oper. Res. Q. 23(3), 289–303 (1972)
3. Doucet, A.: On sequential simulation-based methods for Bayesian filtering. Technical Report CUED/F-INFENG/TR. 310, Cambridge University Department of Engineering (1998)
4. Fildes, R., Nikolopoulos, K., Crone, S.F., Syntetos, A.A.: Forecasting and operational research: a review. J. Oper. Res. Soc. 59, 1150–1172 (2008)
5. Kitagawa, G., Gersch, W.: A smoothness priors-state space approach to the modeling of time series with trend and seasonality. J. Am. Stat. Assoc. 79(386), 378–389 (1984)
6. Kitagawa, G.: Decomposition of a nonstationary time series: an introduction to the program DECOMP. Proc. Inst. Stat. Math. 34(2), 255–271 (1986) (in Japanese)
7. Kitagawa, G.: Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. J. Comput. Graph. Stat. 5(1), 1–25 (1996)
8. Liu, J.S., Chen, R.: Sequential Monte Carlo methods for dynamic systems. J. Am. Stat. Assoc. 93(443), 1032–1044 (1998)
9. Shenstone, L., Hyndman, R.J.: Stochastic models underlying Croston’s method for intermittent demand forecasting. J. Forecast. 24, 389–402 (2005)
10. Syntetos, A.A., Boylan, J.E.: On the bias of intermittent demand estimates. Int. J. Prod. Econ. 71, 457–466 (2001)
11. Syntetos, A.A., Boylan, J.E.: Intermittent demand: estimation and statistical properties. In: Altay, N., Litteral, L.A. (eds.) Service Parts Management, pp. 1–30. Springer, London (2010)
Acknowledgments
We acknowledge the support of JSPS Grant Number 23730415.
© 2016 Springer International Publishing Switzerland
Takahashi, K., Fujita, M., Maruyama, K., Aizono, T., Ara, K. (2016). Forecasting Intermittent Demand with Generalized State-Space Model. In: Lübbecke, M., Koster, A., Letmathe, P., Madlener, R., Peis, B., Walther, G. (eds) Operations Research Proceedings 2014. Operations Research Proceedings. Springer, Cham. https://doi.org/10.1007/978-3-319-28697-6_82
DOI: https://doi.org/10.1007/978-3-319-28697-6_82
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28695-2
Online ISBN: 978-3-319-28697-6
eBook Packages: Business and Management (R0)