Abstract
During the last fifteen years, Akaike's entropy-based Information Criterion (AIC) has had a fundamental impact in statistical model evaluation problems. This paper studies the general theory of the AIC procedure and provides its analytical extensions in two ways without violating Akaike's main principles. These extensions make AIC asymptotically consistent and penalize overparameterization more stringently to pick only the simplest of the “true” models. These selection criteria are called CAIC and CAICF. Asymptotic properties of AIC and its extensions are investigated, and empirical performances of these criteria are studied in choosing the correct degree of a polynomial model in two different Monte Carlo experiments under different conditions.
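The abstract names the criteria without stating them, so the following is only a minimal sketch of the kind of Monte Carlo degree-selection experiment it describes. It assumes the commonly quoted forms AIC = -2 log L + 2k and CAIC = -2 log L + k(log n + 1), with Gaussian errors and k counting the polynomial coefficients plus the error variance. CAICF is left out because it additionally requires the estimated Fisher information matrix. The function names, sample size, noise level, and true coefficients are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np


def gaussian_neg2_loglik(y, X):
    """Return -2 * maximized Gaussian log-likelihood of a least-squares fit."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n  # maximum likelihood estimate of the error variance
    return n * (np.log(2 * np.pi * sigma2) + 1)


def criteria(y, x, degree):
    """AIC and CAIC for a polynomial of the given degree (assumed formulas)."""
    n = len(y)
    X = np.vander(x, degree + 1, increasing=True)
    k = degree + 2  # degree + 1 coefficients, plus the error variance
    m2ll = gaussian_neg2_loglik(y, X)
    return {"AIC": m2ll + 2 * k, "CAIC": m2ll + k * (np.log(n) + 1)}


def select_degree(y, x, max_degree, name):
    """Degree in 0..max_degree that minimizes the named criterion."""
    scores = [criteria(y, x, d)[name] for d in range(max_degree + 1)]
    return int(np.argmin(scores))


def monte_carlo(true_coefs, n, sigma, reps, max_degree, seed):
    """Fraction of replications in which each criterion picks the true degree."""
    rng = np.random.default_rng(seed)
    coefs = np.asarray(true_coefs, dtype=float)
    true_degree = len(coefs) - 1
    hits = {"AIC": 0, "CAIC": 0}
    for _ in range(reps):
        x = rng.uniform(-1.0, 1.0, n)
        mean = np.vander(x, len(coefs), increasing=True) @ coefs
        y = mean + rng.normal(0.0, sigma, n)
        for name in hits:
            hits[name] += select_degree(y, x, max_degree, name) == true_degree
    return {name: count / reps for name, count in hits.items()}


if __name__ == "__main__":
    # Hypothetical settings: true model is quadratic, candidates are degrees 0..6.
    print(monte_carlo([1.0, 2.0, -3.0], n=50, sigma=0.5,
                      reps=200, max_degree=6, seed=0))
```

Because the CAIC penalty per parameter, log n + 1, exceeds the AIC penalty of 2 whenever n > e, CAIC never selects a higher degree than AIC on the same data set, which is the more stringent penalization of overparameterization that the abstract refers to.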
Additional information
The author extends his deep appreciation to many people. These include Hirotugu Akaike, Donald E. Ramirez, Marvin Rosenblum, and S. James Taylor for reading and commenting on some parts of this manuscript through various stages of its development. I especially wish to thank Yoshio Takane, Jim Ramsay, and Stanley L. Sclove for critically reading the paper and making many helpful suggestions. I also wish to thank Julie Riddleberger for her excellent typing of this manuscript.
This research was partially supported by NIH Biomedical Research Support Grant (BRSG) No. 5-24867 at the University of Virginia.
Cite this article
Bozdogan, H. Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions. Psychometrika 52, 345–370 (1987). https://doi.org/10.1007/BF02294361