Abstract
The data analyst confronted with a large number of variables, from which a parsimonious subset must be selected as independent variables in a multiple regression, faces several technical issues. What criterion should be used to judge the adequacy of the selection? What procedure should be used to select the subset of independent variables? And how should one check for, guard against, or correct for possible multicollinearity in the chosen set of independent variables?
“Stepwise regression can lead to confusing results...”
Daniel and Wood ([1980], p. 85)
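The questions the abstract raises can be made concrete with a small sketch. Below is a minimal, illustrative implementation of greedy forward stepwise selection using a partial F-statistic as the entry criterion, together with a variance-inflation-factor (VIF) check for multicollinearity. The F-to-enter threshold of 4.0, the function names, and the synthetic data are all assumptions for illustration, not the chapter's own example or notation.

```python
# Hedged sketch: forward stepwise selection by partial F-statistic,
# plus a VIF diagnostic for multicollinearity. Illustrative only.
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X (with intercept)."""
    Xa = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    r = y - Xa @ beta
    return float(r @ r)

def forward_select(X, y, f_to_enter=4.0):
    """Greedily add the candidate with the largest partial F until
    no remaining candidate's F exceeds f_to_enter."""
    n, p = X.shape
    chosen, remaining = [], list(range(p))
    while remaining:
        rss_old = (rss(X[:, chosen], y) if chosen
                   else float((y - y.mean()) @ (y - y.mean())))
        best_f, best_j = -np.inf, None
        for j in remaining:
            rss_new = rss(X[:, chosen + [j]], y)
            df_resid = n - len(chosen) - 2  # intercept + new variable
            f = (rss_old - rss_new) / (rss_new / df_resid)
            if f > best_f:
                best_f, best_j = f, j
        if best_f < f_to_enter:
            break
        chosen.append(best_j)
        remaining.remove(best_j)
    return chosen

def vif(X, j):
    """Variance inflation factor of column j: 1 / (1 - R^2) from
    regressing X[:, j] on the remaining columns."""
    others = [k for k in range(X.shape[1]) if k != j]
    xj = X[:, j]
    tss = float((xj - xj.mean()) @ (xj - xj.mean()))
    r2 = 1.0 - rss(X[:, others], xj) / tss
    return 1.0 / (1.0 - r2)

# Synthetic data: x2 is nearly collinear with x1; x4 is irrelevant.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
x4 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3, x4])
y = 2.0 * x1 + 1.0 * x3 + rng.normal(size=n)

chosen = forward_select(X, y)
print("selected columns:", chosen)
print("VIF of x1 given the others:", round(vif(X, 0), 1))
```

Note how the near-duplicate predictor inflates the VIF enormously even though forward selection itself typically admits only one of the collinear pair; this is exactly why the abstract treats the selection procedure and the multicollinearity check as separate questions.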
References
Abt, K. 1967. On the identification of the significant independent variables in linear models. Metrika 12: 2–15.
Akaike, H. 1972. Information theory and an extension of the maximum likelihood principle. Proceedings, Second International Symposium on Information Theory, 267–81.
Allen, D. M. 1971. The prediction sum of squares as a criterion for selecting prediction variables. Technical Report 23. Department of Statistics. University of Kentucky.
Allen, D. M. 1974. The relationship between variable selection and data augmentation and a method for prediction. Technometrics 16 (February): 125–27.
Amemiya, T. 1976. Selection of regressors. Technical Report 225. Institute for Mathematical Studies in the Social Sciences. Stanford University.
Beaton, A. E. 1964. The use of special matrix operators in statistical calculus. Research Bulletin RB-64-51 (October). Princeton: Educational Testing Service.
Bendel, R. B. and Afifi, A. A. 1977. Comparison of stopping rules in forward “Stepwise” regression. Journal of the American Statistical Association 72 (March): 46–53.
Daniel, C. and Wood, F. S. 1980. Fitting Equations to Data. New York: Wiley.
Draper, N. R. and Smith, H. 1966. Applied Regression Analysis. New York: Wiley.
Dunnett, C. W. and Sobel, M. 1954. A bivariate generalization of Student’s distribution, with tables for certain special cases. Biometrika 41 (April): 153–69.
Efroymson, M. A. 1962. Multiple regression analysis. In Mathematical Methods for Digital Computers, ed. A. Ralston, and H. S. Wilf. New York: Wiley.
Fisher, R. A. 1934. Statistical Methods for Research Workers. New York: Hafner.
Goldstein, M. and Smith, A. F. M. 1974. Ridge type estimators for regression analysis. Journal of the Royal Statistical Society, Series B 36 (December): 284–91.
Goodnight, J. H. 1979. A tutorial on the SWEEP operator. American Statistician 33 (August): 149–58.
Gorman, J. W. and Toman, R. J. 1966. Selection of variables for fitting equations to data. Technometrics 8 (February): 27–51.
Hocking, R. R. 1972. Criteria for selection of a subset regression: Which one should be used? Technometrics 14 (November): 967–70.
Hocking, R. R. 1976. The analysis and selection of variables in linear regression. Biometrics 32 (March): 1–49.
Hoerl, A. E. and Kennard, R. W. 1970. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12 (February): 55–67.
Krishnaiah, P. R. and Armitage, J. V. 1965. Probability Integrals of the Multivariate F Distribution, with Tables and Applications, ARL 65-236. Ohio: Wright-Patterson AFB.
Kullback, S. and Leibler, R. A. 1951. On information and sufficiency. Annals of Mathematical Statistics 22 (March): 79–86.
Madansky, A. 1976. Foundations of Econometrics. Amsterdam: North-Holland.
Malinvaud, E. 1966. Statistical Methods of Econometrics. Chicago: Rand-McNally.
Mallows, C. L. 1967. Choosing a subset regression. Bell Telephone Laboratories, unpublished report.
Pope, P. T. 1969. On the stepwise construction of a prediction equation. Tech. Report 37. THEMIS, Statistics Department, Southern Methodist University, Dallas, Texas.
Pope, P. T. and Webster, J. T. 1972. The use of an F-statistic in stepwise regression procedures. Technometrics 14 (May): 327–40.
Roberts, H. V. and Ling, R. F. 1982. Conversational Statistics with IDA. New York: Scientific Press and McGraw-Hill.
Theil, H. 1961. Economic Forecasts and Policy. Amsterdam: North-Holland.
Thompson, M. L. 1978a. Selection of variables in multiple regression: Part I. A review and evaluation. International Statistical Review 46 (April): 1–19.
Thompson, M. L. 1978b. Selection of variables in multiple regression: Part II. Chosen procedures, computations, and examples. International Statistical Review 46 (April): 129–46.
Vinod, H. D. 1976. Application of new ridge regression methods to a study of Bell system scale economies. Journal of the American Statistical Association 71 (December): 929–33.
Vinod, H. D. and Ullah, A. 1981. Recent Advances in Regression Analysis. New York: Marcel Dekker.
Copyright information
© 1988 Springer-Verlag New York Inc.
Cite this chapter
Madansky, A. (1988). Independent Variable Selection in Multiple Regression. In: Prescriptions for Working Statisticians. Springer Texts in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-3794-5_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-8354-6
Online ISBN: 978-1-4612-3794-5