Abstract
This paper presents the first generalization bounds for time series prediction with a non-stationary mixing stochastic process. We prove Rademacher complexity learning bounds both for average-path generalization with non-stationary β-mixing processes and for path-dependent generalization with non-stationary ϕ-mixing processes. Our guarantees are expressed in terms of β- or ϕ-mixing coefficients and a natural measure of discrepancy between the training and target distributions. They admit as special cases previous Rademacher complexity bounds for non-i.i.d. stationary distributions, for independent but not identically distributed random variables, and for the i.i.d. case. We show that, using a new sub-sample selection technique we introduce, our bounds can be tightened under the natural assumption of convergent stochastic processes. We also prove that fast learning rates can be achieved by extending existing local Rademacher complexity analysis to the non-i.i.d. setting.
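For context, the following is a generic sketch (in notation of my own choosing, not the paper's own statements) of the standard quantities named in the abstract: the β- and ϕ-mixing coefficients that measure dependence along the process, and the classical i.i.d. Rademacher complexity bound that the paper's guarantees recover as a special case.

```latex
% Dependence coefficients for a sequence (Z_t), sketched in standard form.
% Let \sigma_{-\infty}^{t} and \sigma_{t+a}^{\infty} denote the sigma-algebras
% generated by the past up to time t and the future from time t+a on.
%
% beta-mixing (absolute regularity): total-variation distance between the
% joint law of past and future and the product of their marginals.
\beta(a) \;=\; \sup_{t}\,
  \bigl\| P_{(-\infty,t] \times [t+a,\infty)}
        - P_{(-\infty,t]} \otimes P_{[t+a,\infty)} \bigr\|_{\mathrm{TV}}
%
% phi-mixing: worst-case deviation of a conditional from an unconditional
% probability (phi-mixing implies beta-mixing).
\varphi(a) \;=\; \sup_{t}\,
  \sup_{\substack{B \in \sigma_{-\infty}^{t},\, P(B) > 0 \\
                  A \in \sigma_{t+a}^{\infty}}}
  \bigl| P(A \mid B) - P(A) \bigr|
%
% The classical i.i.d. Rademacher complexity bound (the special case the
% abstract mentions): with probability at least 1 - \delta over an i.i.d.
% sample S of size m, uniformly over the hypothesis set H,
R(h) \;\le\; \widehat{R}_{S}(h) \;+\; 2\,\mathfrak{R}_{m}(H)
  \;+\; \sqrt{\frac{\log(1/\delta)}{2m}},
% where R(h) is the expected loss, \widehat{R}_S(h) the empirical loss,
% and \mathfrak{R}_m(H) the Rademacher complexity of H.
```

In the non-stationary setting treated by the paper, such bounds additionally involve the mixing coefficients above and a discrepancy term between training and target distributions; the exact statements are in the full text.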
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this paper
Kuznetsov, V., Mohri, M. (2014). Generalization Bounds for Time Series Prediction with Non-stationary Processes. In: Auer, P., Clark, A., Zeugmann, T., Zilles, S. (eds) Algorithmic Learning Theory. ALT 2014. Lecture Notes in Computer Science(), vol 8776. Springer, Cham. https://doi.org/10.1007/978-3-319-11662-4_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11661-7
Online ISBN: 978-3-319-11662-4