Abstract
The concept and the mathematical properties of entropy play an important role in statistics, cybernetics, and the information sciences. Indeed, many algorithms and statistical data processing tools, with a wide range of targets and scopes, have been designed based on entropy. This paper describes two estimators inspired by the concept of entropy that cope robustly with multicollinearity, in one case, and with outliers, in the other. The Generalized Maximum Entropy (GME) estimator optimizes Shannon's entropy function subject to consistency and normalization constraints. In regression applications, GME makes it possible, for example, to estimate model coefficients in the presence of multicollinearity. The Least Entropy-Like (LEL) estimator is a novel prediction error model coefficient identification algorithm that minimizes a nonlinear cost function of the fitting residuals. Because the minimized cost function shares the mathematical properties of entropy, the resulting estimate of the model coefficients corresponds to a positively skewed distribution of the residuals. The estimator is therefore more robust to outliers than standard approaches such as ordinary least squares (OLS). Both the GME and the LEL estimation methods are applied to a common case study to illustrate their respective properties.
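The LEL idea described above can be sketched in a few lines. The sketch below is a minimal, illustrative reading of the entropy-like cost (in the spirit of Indiveri's LEL estimator), not the authors' exact implementation: the squared residuals are normalized into a discrete distribution p_i = r_i² / Σ_j r_j², and the (normalized) Shannon entropy of the p_i is minimized, which pushes most of the residual "mass" onto a few points (the outliers) while fitting the remaining data closely. All variable names, the synthetic data, and the choice of optimizer are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def lel_cost(beta, X, y, eps=1e-12):
    """Entropy-like cost of the residuals of the linear model y ~ X @ beta.

    Squared residuals are normalized to a discrete distribution; the cost is
    its Shannon entropy scaled to [0, 1]. It is low when a few residuals
    (ideally the outliers) carry most of the mass.
    """
    r = y - X @ beta
    s = r ** 2
    p = s / (s.sum() + eps)              # relative squared residuals
    return -(p * np.log(p + eps)).sum() / np.log(len(y))

# Synthetic regression data with gross outliers (illustrative only).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(0.0, 10.0, 50)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(0.0, 0.1, 50)
y[:5] += 20.0                            # inject five gross outliers

# OLS is pulled toward the outliers; use it as the starting point.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# The cost is non-convex, so a local search from the OLS estimate is used here.
res = minimize(lel_cost, beta_ols, args=(X, y), method="Nelder-Mead")
beta_lel = res.x
```

On this synthetic data the LEL estimate typically lands much closer to `beta_true` than the OLS estimate, because the entropy-like cost is reduced when the five injected outliers absorb nearly all of the residual mass. In practice the non-convexity of the cost means the starting point matters; combining LEL with a consensus scheme such as RANSAC is one way to address this.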
Keywords
- Ordinary Least Squares
- Maximum Entropy Principle
- Regression Matrix
- Absolute Residual
- Generalized Maximum Entropy
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this chapter
Ciavolino, E., Indiveri, G. (2013). Entropy-Based Estimators in the Presence of Multicollinearity and Outliers. In: Ventre, A., Maturo, A., Hošková-Mayerová, Š., Kacprzyk, J. (eds) Multicriteria and Multiagent Decision Making with Applications to Economics and Social Sciences. Studies in Fuzziness and Soft Computing, vol 305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35635-3_5
Print ISBN: 978-3-642-35634-6
Online ISBN: 978-3-642-35635-3
eBook Packages: Engineering (R0)