Abstract
This paper proposes a method based on the Minimum Message Length (MML) principle for discovering polynomial models of up to second order. The method is compared with a number of other model selection criteria on their ability to discover a model automatically from generated data. Of particular interest is the ability of each method to discover (1) second-order independent variables, (2) independent variables with weak causal relationships to the target variable when the sample size is small, and (3) independent variables that have weak links to the target variable but strong links from other variables that are not themselves directly linked to the target variable. A common non-backtracking search strategy has been developed and is used with all of the model selection criteria.
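To make the setting concrete, the following is a minimal sketch of a non-backtracking search over candidate terms up to second order. It is not the paper's algorithm: the abstract does not specify the search procedure or the MML message-length formula, so the sketch assumes a greedy forward search (terms are added one at a time and never removed) and substitutes the Schwarz criterion (BIC) as a stand-in scoring function. All function names and the generated data are hypothetical.

```python
import itertools
import math
import random


def second_order_terms(n_vars):
    """Candidate terms as tuples of variable indices:
    (i,) is x_i, (i, i) is x_i^2, and (i, j) with i < j is x_i * x_j."""
    linear = [(i,) for i in range(n_vars)]
    quadratic = list(itertools.combinations_with_replacement(range(n_vars), 2))
    return linear + quadratic


def design_row(x, terms):
    """Intercept followed by the value of each term for one sample."""
    return [1.0] + [math.prod(x[i] for i in t) for t in terms]


def least_squares_rss(X, y, terms):
    """Fit by solving the normal equations; return the residual sum of squares.
    Assumes the selected terms are not exactly collinear."""
    rows = [design_row(x, terms) for x in X]
    k = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[a][k] / M[a][a] for a in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def bic(X, y, terms):
    """Stand-in criterion (assumption): n*ln(RSS/n) + k*ln(n),
    where k counts the fitted coefficients including the intercept."""
    n = len(y)
    rss = max(least_squares_rss(X, y, terms), 1e-12)
    return n * math.log(rss / n) + (len(terms) + 1) * math.log(n)


def forward_search(X, y, n_vars, criterion=bic):
    """Greedy non-backtracking search: add the best-scoring term each round,
    never remove a term, and stop when no addition lowers the criterion."""
    candidates = second_order_terms(n_vars)
    chosen = []
    best = criterion(X, y, chosen)
    while candidates:
        score, term = min((criterion(X, y, chosen + [t]), t) for t in candidates)
        if score >= best:
            break
        chosen.append(term)
        candidates.remove(term)
        best = score
    return chosen, best


def demo_data(n=200, seed=0):
    """Hypothetical generated data: y = 2*x0 + 3*x1*x2 + small Gaussian noise."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    y = [2 * x[0] + 3 * x[1] * x[2] + rng.gauss(0, 0.1) for x in X]
    return X, y
```

Calling `forward_search(*demo_data(), n_vars=3)` returns the selected terms and their score. Because `criterion` is a parameter, the same search can be reused with any other selection criterion, which mirrors the abstract's point that one search strategy is shared across all criteria.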
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rumantir, G.W. (2000). Minimum Message Length Criterion for Second-Order Polynomial Model Discovery. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_7
DOI: https://doi.org/10.1007/3-540-45571-X_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive