Forecasting Sugarcane Yield of Tamilnadu Using ARIMA Models

Suresh, K. K.; Krishna Priya, S. R.

doi:10.1007/s12355-011-0071-7

Forecasting Sugarcane Yield of Tamilnadu Using ARIMA Models

Research Article
Published: 26 May 2011

Volume 13, pages 23–26, (2011)
Cite this article

Download PDF

Access provided by CONRICYT-eBooks

Sugar Tech Aims and scope Submit manuscript

Forecasting Sugarcane Yield of Tamilnadu Using ARIMA Models

Download PDF

K. K. Suresh¹ &
S. R. Krishna Priya¹

1436 Accesses
47 Citations
Explore all metrics

Abstract

This paper attempts forecasting the sugarcane area, production and productivity of Tamilnadu through fitting of univariate Auto Regressive Integrated Moving Average (ARIMA) models. The data on sugarcane area, production and productivity collected from 1950–2007 has been used for present study. ARIMA (1, 1, 1) model is found suitable for sugarcane area and productivity. ARIMA (2, 1, 2) is found appropriate for modeling sugarcane production. The performances of models are validated by comparing with actual values. Using the models developed, forecast values for sugarcane area, production and productivity are developed for subsequent years.

Predicting Forecast of Sugarcane Production in Pakistan

Article 07 December 2022

Forecasting the Price of Potato Using Time Series ARIMA Model

Comparative Analysis of Sugarcane Production for South East Asia Region

Article 22 December 2023

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Sugarcane, a traditional crop of India plays an important role in agricultural and industrial economy of the country. It is cultivated in most of the states and though it covers an insignificant share of about 2 and 20% of gross cropped area of our country and world respectively, its share in our country’s economic growth has become significant. Among the sugarcane growing states in our country, Tamilnadu ranks first in per hectare productivity of sugarcane with 113.9/ha of cane yield. Sugarcane is a versatile crop. Because of its diversified uses in different industries, this crop is considered as “Karpagavirucham” and in modern terminology as “wonder cane” (Mohan et al. 2007).

Sugarcane occupies a significant position among the commercial crops in India. A proper forecast of production of such important commercial crops is very important in an economic system. There is close association between crop productions with prices. An unexpected decrease in production reduces marketable surplus and income of the farmers and leads to price rise. A glut in production can lead to a slump in prices and has adverse effect on farmers’ incomes. Impact on price of an essential commodity has a significant role in determining the inflation rate, wages, salaries and various policies in an economy. In case of commercial crops like sugarcane, production level affects raw material cost of user industries and their competitive advantages in the market. In our present study sugarcane area, production and productivity of Tamilnadu have been forecasted using Auto Regressive Integrated Moving Average (ARIMA) models.

Previously attempts were made to forecast sugarcane production and productivity using ARIMA models (Bajpai and Venugopalan 1996; Yaseen et al. 2005). ARIMA models have been used for modeling and forecasting of fish catches (Venugopalan and Srinath 1998; Tsitsika et al. 2007). The forecasting efficiency of ARIMA models were compared with neural network models (Hanson et al. 1999). ARIMA models have been developed to forecast the cultivable area, production and productivity of various crops of Tamilnadu (Balanagammal et al. 2000). Wheat production in Pakistan and Canada were forecasted using ARIMA models (Saeed et al. 2000; Boken 2000). ARIMA models were used to obtain seasonal forecast of paddy in Tamilnadu and food grains in India (Balasubramanian and Dhanavanthan 2002). ARIMA models were compared with structural time series models (Ravichandran and Prajneshu 2001; Prajneshu et al. 2002). ARMA models were used in forecasting of milk, fat and protein yields of Italian Simmental cows (Maccioitta et al. 2000, 2002). Univariate forecasting of state level agricultural production was done using ARIMA models (Indira and Datta 2003). ARIMA models were compared with nonparametric regression approach for forecasting oilseed production in India (Chandran and Prajneshu 2005). Forecasting of irrigated crops like Potato, Mustard and Wheat were forecasted using ARIMA models (Sahu 2006). Milk production in India was forecasted using time-series modeling techniques (Pal et al. 2007). The objective of our present study using ARIMA models to forecast sugarcane area, production and productivity of Tamilnadu.

Materials and Methods

The Data on sugarcane area (000’ ha), production (000’ tonnes) and productivity (tonnes/ha) for a period of 57 years from (1950–1951) to (2007–2008) has been collected from various volumes of ‘Cooperative Sugar’ (CSJ 1980, 2007) and ‘Indian Sugar’ (ISJ 1985, 2009) journals.

The data for a period of 55 years (1950–2006) was used in model building. The remaining 2 years data (2007–2008) was used for validation of the model.

Description of the Model

In general, an ARIMA model is characterized by the notation ARIMA (p, d, q) where p, d, q denote orders of auto-regression, integration (differencing) and moving average respectively. In ARIMA, time series is a liner function of past actual values and random shocks. A stationary ARIMA (p, q) process is defined by the equation

$$ Y_{t} = \Upphi_{0} + \Upphi_{1} Y_{t - 1} + \Upphi_{2} Y_{t - 2} + \ldots \ldots \ldots + \Upphi_{p} Y_{t - p} + \varepsilon_{t} \omega_{1} \varepsilon_{t - 1} - \omega {}_{2}\varepsilon_{t - 2} \ldots \ldots \omega_{q} \varepsilon_{t - q} $$

(1)

where, Y_t is the response (dependant) variable at time t. Y_t−1, Y_t−2…Y_t−p is the response (dependant) variable at time lags t−1, t−2,…,t−p respectively; these Y’s are independent variables. Φ₁, Φ_2…Φ_p is the coefficients to be estimated. ɛ_t is the error term at time t that represents the effects of variables not explained by the model; the assumptions about the error term are the same as those for the standard regression model. ɛ_t−1, ɛ_t−2…ɛ_t−q is the error term that represents the effect of variables not explained by the model. The assumptions about the error term are the same as those for the standard regression model. ω₁, ω₂…ω_q is the coefficients to be estimated.

ARIMA Model Building

Identification

The foremost step in the process of modeling is to check for the stationarity of the series, as the estimation procedures are available only for the stationary series. There are two kinds of, viz., stationarity in ‘mean’ and stationarity in ‘variance’. Visual examination of graph of the data and structure of autocorrelation, and partial correlation coefficients helps to check the presence of stationarity. Another way of checking for stationarity is to fit a first order autoregressive model for the raw data and test whether the coefficient ‘Φ₁’ is less than one. If the model is found to be non-stationary, stationarity is achieved by differencing the series.

If ‘X_t’ denotes the original series, the non-seasonal difference of first order is

$$ Y_{t} = \, X_{t} { - }X_{t - 1} $$

(2)

The next step in the identification process is to find the initial values for the orders of non-seasonal parameters, p and q. They are obtained by looking for significant autocorrelation and partial autocorrelation coefficients. There are no strict rules in choosing the initial values. Though sample autocorrelation coefficients are poor estimates of population autocorrelation coefficients, still they are used as initial values while the final models are achieved after going through the stages repeatedly.

Estimation

At the identification stage, one or more models are tentatively chosen that seem to provide statistically adequate representations of the available data. Then precise estimates of parameters of the model are obtained by least squares. Standard computer packages like SAS, SPSS etc. are available for finding the estimates of relevant parameters using iterative procedures.

Diagnostics

Different models are obtained for various combinations of Auto Regressive and Moving Average individually and collectively. The best model is selected based on the following diagnostics:

a)
Low Akaike Information Criteria (AIC)
b)
Insignificance of auto correlations for residuals (Q-tests)
c)
Significance of the parameters

(a) Low AIC: AIC is given by AIC = (−2log L + 2m) where m = p + q and L is the likelihood function. Since −2 log L is approximately equal to {n(1 + log 2π) + n log σ²} where σ² is the model MSE, AIC is written as AIC = {n(1 + log 2π) + n log σ² + 2 m} and because first term in this equation is a constant, it is omitted while comparing between models. As an alternative to AIC, sometimes SBC is also used which is given by SBC = log σ² + (m log n)/n.

(b) Insignificance of auto correlations for residuals (Q-tests): After tentative model is fitted to the data, it is important to perform diagnostic checks to test the adequacy of the model and, to suggest potential improvements. One way to accomplish this is through the analysis of residuals. It has been found that it is effective to measure the overall adequacy of the chosen model by examining a quantity Q known as Box-Pierce statistic (a function of autocorrelations of residuals) whose approximate distribution is chi-square and is computed as follows:

$$ Q = n\sum {r^{2} \left( j \right)} $$

(3)

where summation extends form 1 to k with k as the maximum lag considered, n is the number of observations in the series, r (j) is the estimated autocorrelation at lag j: k is a positive integer and is usually around 20. Q follows Chi-square with (k − m₁) degrees of freedom where m₁ is the number of parameters estimated in the model. A modified Q statistic is the Ljung-box statistic which is given by

$$ Q = n(n + 2)\sum r^{2} (j)/(n - j) $$

(4)

The Q statistic is compared to critical values from chi-square distribution. If model is correctly specified, residuals should be uncorrelated and Q should be small (the probability value should be large). A significant value indicates that the chosen model does not fit well.

Results and Discussion

The stationary check of time series revealed that the time series data on sugarcane area, production and productivity was not stationary. It was made stationary by using the first order differencing technique. For different values of p and q (0, 1 or 2), various ARIMA models were fitted and appropriate model was chosen corresponding to minimum value of the selection criterion i.e. Akaike Information Criteria (AIC) and Schwarz-Bayesian Information Criteria (SBC). In this way, ARIMA (1, 1, 1) model was found to be appropriate for sugarcane area and productivity. ARIMA (2, 1, 2) was suitable for sugarcane production. The estimates of parameters along with their standard errors have been presented in Tables 1, 2 and 3 for sugarcane area, production and productivity. After model fitting, next step is diagnostic checking of the fitted model. ACF and PACF were plotted for residuals of the fitted model. For the present study ACF and PACF were lying within the limits for sugarcane area, production and productivity which shows that ARIMA model fitted well.

Table 1 Final estimates of parameters for sugarcane area

Full size table

Table 2 Final estimates of parameters for sugarcane production

Full size table

Table 3 Final estimates of parameters for sugarcane productivity

Full size table

The fitted models were validated by comparing the actual values with predicted values. The observed and predicted values for sugarcane area, production and productivity along with percentage of deviation has been presented in Tables 3, 4 and 5.

Table 4 Performance of ARIMA (1, 1, 1) model for sugarcane area

Full size table

Table 5 Performance of ARIMA (2, 1, 2) model for sugarcane production

Full size table

The results of Table 4 indicate that the predicted values of sugarcane area are slightly higher than the actual values. From Table 6 it could be seen that the predicted values are much closer to the observed values for sugarcane productivity.

Table 6 Performance of ARIMA (1, 1, 1) model for sugarcane productivity

Full size table

Conclusions

The drought of 2009 has brought home the critical need for a short-term forecasting model for agriculture sector at sub-national level, since good and bad agricultural years are not synchronous across states.

The ARIMA models developed could successfully used for forecasting sugarcane area, production and productivity of Tamilnadu for subsequent years. The forecast values for the year (2010–2012) for sugarcane area, production and productivity are presented in Table 7.

Table 7 Forecast values for the future

Full size table

References

Bajpai, P.K., and R. Venugopalan. 1996. Forecasting sugarcane production by time series modeling. Indian Journal of Sugarcane Technology 11(1): 61–65.
Google Scholar
Balanagammal, D., C.R. Ranganathan, and K. Sundaresan. 2000. Forecasting of agricultural scenario in Tamilnadu—A time series analysis. Journal of Indian Society of Agricultural Statistics 53(3): 273–286.
Google Scholar
Balasubramanian, P., and P. Dhanavanthan. 2002. Seasonal modeling and forecasting of crop production. Statistics and Applications 4(2): 107–118.
Google Scholar
Boken, V.K. 2000. Forecasting spring wheat yield using time series analysis: A case study for the Canadian prairies. Agricultural Journal 92(6): 1047–1053.
Google Scholar
Chandran, K.P., and Prajneshu. 2005. Nonparametric regression with jump points methodology for describing country’s oilseed yield data. Journal of Indian Society of Agricultural Statistics 59(2): 126–130.
Google Scholar
CSJ. Cooperative Sugar Journal—a monthly journal (various volumes from 1980–2007) (Published by National Federation of Cooperative Sugar Factories Ltd. New Delhi).
Hanson, J.V., J.B. Macdonald, and R.D. Nelson. 1999. Time series prediction with genetic-algorithm designed neural networks: An empirical comparison with modern statistical models. Computational Intelligence 15(3): 171–184.
Article Google Scholar
Indira, R., and A. Datta. 2003. Univariate forecasting of state-level agricultural production. Economic and Political Weekly 38: 1800–1803.
Google Scholar
ISJ. Indian Sugar Journal (various volumes from 1985–2009) (Published by Indian Sugar Mills Association, New Delhi).
Maccioitta, N.P.P., A. Cappio-Borlino, and G. Pulina. 2000. Time series autoregressive integrated moving average modeling of test-day milk yields of dairy ewes. Journal of Dairy Science 83: 1094–1103.
Article Google Scholar
Maccioitta, N.P.P., D. Vicario, G. Pulina, and A. Cappio-Borlino. 2002. Test day and lactation yield predictions in Italian simmental cows by ARMA methods. Journal of Dairy Science 85: 3107–3114.
Article Google Scholar
Mohan, S., K. Rajendran, D. Sivam, and B. Saliha. 2007. Sugar—The wonder cane. Co-operative Sugar 38(10): 21–24.
Google Scholar
Pal, S., V. Ramasubramanian, and S.C. Mehta. 2007. Statistical models for forecasting milk production in India. Journal of Indian Society of Agricultural Statistics 61(2): 80–83.
Google Scholar
Prajneshu, S. Ravichandran, and S. Wadhwa. 2002. Structural time series models for describing cyclical fluctuations. Journal of Indian Society of Agricultural Statistics 55: 70–78.
Google Scholar
Ravichandran, S., and Prajneshu. 2001. State space modeling versus ARIMA time-series modeling. Journal of Indian Society of Agricultural Statistics 54(1): 43–51.
Google Scholar
Saeed, N., A. Saeed, M. Zakria, and T.M. Bajwa. 2000. Forecasting of wheat production in Pakistan using ARIMA models. International Journal of Agricultural Biology 2(4): 352–353.
Google Scholar
Sahu, P.K. 2006. Forecasting yield behavior of potato, mustard, rice, and wheat under irrigation. Journal of Vegetable Science 12(1): 81–99.
Article Google Scholar
Tsitsika, E.V., C.D. Maravelias, and J. Haralabous. 2007. Modeling and forecasting pelagic fish production using univariate and multivariate ARIMA models. Fisheries Science 73: 979–988.
Article CAS Google Scholar
Venugopalan, R., and M. Srinath. 1998. Modeling and forecasting fish catches: Comparision of regression, univariate and multivariate time series methods. Indian Journal of Fisheries 45(3): 227–237.
Google Scholar
Yaseen, M., M. Zakria, Islam-ud-din-Shahzad, M. Imran Khan, and M. Aslam Javed. 2005. Modeling and forecasting the sugarcane yield of Pakistan. International Journal of Agricultural Biology 7(2): 180–183.
Google Scholar

Download references

Acknowledgments

The authors would like to acknowledge Sugarcane Breeding Institute, Coimbatore, for their support rendered in conducting this study and making possible to bring out this article.

Author information

Authors and Affiliations

Department of Statistics, Bharathiar University, Coimbatore, India
K. K. Suresh & S. R. Krishna Priya

Authors

K. K. Suresh
View author publications
You can also search for this author in PubMed Google Scholar
S. R. Krishna Priya
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to S. R. Krishna Priya.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Suresh, K.K., Krishna Priya, S.R. Forecasting Sugarcane Yield of Tamilnadu Using ARIMA Models. Sugar Tech 13, 23–26 (2011). https://doi.org/10.1007/s12355-011-0071-7

Download citation

Received: 06 May 2010
Accepted: 22 November 2010
Published: 26 May 2011
Issue Date: March 2011
DOI: https://doi.org/10.1007/s12355-011-0071-7

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Forecasting Sugarcane Yield of Tamilnadu Using ARIMA Models

Abstract

Similar content being viewed by others

Predicting Forecast of Sugarcane Production in Pakistan

Forecasting the Price of Potato Using Time Series ARIMA Model

Comparative Analysis of Sugarcane Production for South East Asia Region

Introduction

Materials and Methods

Description of the Model

ARIMA Model Building

Identification

Estimation

Diagnostics

Results and Discussion

Conclusions

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Forecasting Sugarcane Yield of Tamilnadu Using ARIMA Models

Abstract

Similar content being viewed by others

Predicting Forecast of Sugarcane Production in Pakistan

Forecasting the Price of Potato Using Time Series ARIMA Model

Comparative Analysis of Sugarcane Production for South East Asia Region

Introduction

Materials and Methods

Description of the Model

ARIMA Model Building

Identification

Estimation

Diagnostics

Results and Discussion

Conclusions

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation