Abstract
Uncertainty in time series can appear in many ways, and it can be analyzed under different theories. An important problem arises when a time series is incomplete, since the analyst must impute the missing observations before any further analysis.
This chapter focuses on designing an imputation method for multiple missing observations in time series using a genetic algorithm (GA) that replaces the missing observations in the original series. The flexibility of a GA is used to find an adequate solution to a multicriteria objective, defined as the error between some key properties of the original series and those of the imputed one. A comparative study between a classical estimation method and our proposal is presented through an example.
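The idea in the abstract can be illustrated with a minimal sketch; it is not the chapter's actual algorithm. Several choices here are assumptions made only for illustration: the "key properties" are taken to be the mean, variance, and lag-1 autocorrelation; the targets are estimated from the observed values alone (ignoring the gaps); the multicriteria objective is collapsed into a single sum of squared errors; and the function names, operators, and parameters (`impute_ga`, `pop_size`, `mut_sd`, and so on) are hypothetical.

```python
import random
import statistics


def autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    m = statistics.fmean(x)
    den = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return num / den if den else 0.0


def properties(x):
    # Assumed "key properties": mean, variance, lag-1 autocorrelation.
    return (statistics.fmean(x), statistics.pvariance(x), autocorr(x, 1))


def fill(series, missing_idx, candidate):
    """Return a copy of series with the candidate values placed in the gaps."""
    filled = list(series)
    for i, v in zip(missing_idx, candidate):
        filled[i] = v
    return filled


def impute_ga(series, pop_size=40, generations=150, mut_sd=0.3, seed=0):
    """Impute None entries with a simple elitist GA (illustrative only)."""
    rng = random.Random(seed)
    missing_idx = [i for i, v in enumerate(series) if v is None]
    observed = [v for v in series if v is not None]
    # Assumption: target properties come from the observed values only.
    target = properties(observed)
    lo, hi = min(observed), max(observed)

    def score(candidate):
        # Assumption: the criteria are scalarized as a sum of squared errors.
        props = properties(fill(series, missing_idx, candidate))
        return sum((a - b) ** 2 for a, b in zip(props, target))

    # Each individual holds one real value per missing position.
    pop = [[rng.uniform(lo, hi) for _ in missing_idx] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation of every gene.
            children.append(
                [rng.choice(pair) + rng.gauss(0, mut_sd) for pair in zip(a, b)]
            )
        pop = elite + children

    best = min(pop, key=score)
    return fill(series, missing_idx, best), score(best)


if __name__ == "__main__":
    data = [1.0, 1.4, None, 2.1, 1.8, None, None, 1.2, 0.9, 1.1, 1.6, None, 2.2, 1.9]
    imputed, err = impute_ga(data, seed=1)
    print(imputed, err)
```

Because the better half of each generation is carried over unchanged, the best error never increases from one generation to the next. A fuller treatment would keep the criteria separate rather than summing them, and would compare the result against a classical estimator, as the chapter does.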
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Figueroa-García, J.C., Kalenatic, D., López, C.A. (2013). Incomplete Time Series: Imputation through Genetic Algorithms. In: Pedrycz, W., Chen, SM. (eds) Time Series Analysis, Modeling and Applications. Intelligent Systems Reference Library, vol 47. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33439-9_2
DOI: https://doi.org/10.1007/978-3-642-33439-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33438-2
Online ISBN: 978-3-642-33439-9
eBook Packages: Engineering, Engineering (R0)