Abstract
We propose the notion of multivariate predictability as a measure of goodness-of-fit in data reduction techniques that are useful for visualizing and screening data. For quantitative variables this leads to the usual sums-of-squares and variance-accounted-for criteria. For categorical variables we show how to predict the category levels of all variables associated with every point (case). The proportion of predictions that agree with the true categories gives the measure of fit. The ideas are very general; as an illustration we use nonlinear principal components analysis (NLPCA) in association with ordered categorical variables. A detailed example using data from the International Social Survey Program (ISSP) will be given in Blasius and Gower (Quality and Quantity, 39, to appear). It will be shown that the predictability criterion suggests that the fits are rather better than is indicated by “percentage of variance accounted for”.
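The contrast between the two criteria can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes fixed integer scores as category quantifications (NLPCA would estimate these), uses a rank-r least-squares approximation via the singular value decomposition, and predicts each category as the one whose quantification lies nearest the fitted value. The names `fit_and_predict` and `predictability`, and the simulated data, are hypothetical.

```python
import numpy as np

def fit_and_predict(codes, quantifications, r):
    """Rank-r fit of quantified categorical data, then nearest-category prediction.

    codes: (n, p) integer array of category codes, starting at 0.
    quantifications: list of p arrays; quantifications[j][k] is the numeric
        score assigned to category k of variable j (assumed given here).
    r: number of dimensions retained.
    """
    codes = np.asarray(codes)
    n, p = codes.shape
    # Replace every category code by its numeric quantification.
    X = np.column_stack(
        [np.asarray(quantifications[j], dtype=float)[codes[:, j]] for j in range(p)]
    )
    means = X.mean(axis=0)
    Xc = X - means
    # Best rank-r least-squares approximation of the centred matrix.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    X_hat = (U[:, :r] * s[:r]) @ Vt[:r] + means
    total = (s ** 2).sum()
    vaf = float((s[:r] ** 2).sum() / total) if total > 0 else 1.0
    # Predict, per variable, the category whose score is closest to the fit.
    pred = np.empty_like(codes)
    for j in range(p):
        q = np.asarray(quantifications[j], dtype=float)
        pred[:, j] = np.abs(X_hat[:, [j]] - q).argmin(axis=1)
    return pred, vaf

def predictability(codes, pred):
    """Proportion of predicted categories that agree with the true ones."""
    return float((np.asarray(codes) == np.asarray(pred)).mean())

# Simulated (made-up) data: four ordered variables driven by one latent trait.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1)) + 0.5 * rng.normal(size=(200, 4))
codes = np.digitize(latent, bins=[-1.0, 0.0, 1.0])  # categories coded 0..3
quantifications = [np.arange(1, 5)] * 4              # assumed scores 1..4

pred, vaf = fit_and_predict(codes, quantifications, r=2)
print("variance accounted for:", round(vaf, 3))
print("proportion correctly predicted:", round(predictability(codes, pred), 3))
```

Because a prediction counts as correct whenever the fitted value falls nearest the true category, small residuals cost nothing under this criterion, whereas every residual reduces the variance accounted for.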
References
Blasius, J. & Gower, J. C. (to appear). Multivariate prediction with nonlinear principal components analysis: Application. Quality and Quantity 39
Borg, I. & Groenen, P. (1997). Modern Multidimensional Scaling. New York: Springer
Borg, I. & Shye, S. (1995). Facet Theory. Form and Content. Newbury Park, CA: Sage
Chambers, J. M. (1998). Programming with Data: A Guide to the S Language. New York: Springer
Eckart, C. & Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika 1: 211–218
Gabriel, K. R. (1971). The biplot-graphic display of matrices with applications to principal components analysis. Biometrika 58: 453–467
Gabriel, K. R. (1981). Biplot display of multivariate matrices for inspection of data and diagnosis. In: V. Barnett (ed.), Interpreting Multivariate Data, Chichester: Wiley, pp. 147–174
Genstat 5 Committee (1993). Genstat 5 Release 3 Reference Manual. Oxford: Numerical Algorithms Group
Gifi, A. (1990). Nonlinear Multivariate Analysis. Chichester: Wiley
Gower, J. C. (1966). Some distance properties of latent-root and vector methods used in multivariate analysis. Biometrika 53: 325–338
Gower, J. C. (1993). The construction of neighbour-regions in two dimensions for prediction with multi-level categorical variables. In: O. Opitz, B. Lausen & R. Klar (eds.), Information and Classification: Concepts–Methods–Applications Proceedings 16th Annual Conference of the Gesellschaft für Klassifikation, Dortmund, April 1992, Berlin: Springer, pp. 174–189
Gower, J. C. (2002). Categories and quantities. In: S. Nishisato, Y. Baba, H. Bozdogan & K. Kamefuji (eds.), Measurement and Multivariate Analysis, Tokyo: Springer, pp. 1–12
Gower, J. C. & Hand, D. J. (1996). Biplots. London: Chapman & Hall
Gower, J. C. & Harding, S. (1998). Prediction regions for categorical variables. In: J. Blasius & M. Greenacre (eds.), Visualization of Categorical Data, San Diego: Academic Press, pp. 405–423
Greenacre, M. J. (1993). Biplots in correspondence analysis. Journal of Applied Statistics 20: 251–269
Guttman, L. (1965). A faceted definition of intelligence. Scripta Hierosolymitana 14: 166–181
Heiser, W. J. & Meulman, J. J. (1994). Homogeneity analysis: exploring the distribution of variables and their nonlinear relationships. In: M. J. Greenacre & J. Blasius (eds.), Correspondence Analysis in the Social Sciences. Recent Developments and Applications, London: Academic Press, pp. 179–209
Meulman, J. J. & Heiser, W. J. (1999). SPSS Categories 10.0. Chicago: SPSS Inc.
Payne, R. W., Lane, P. W., Baird, D. B., Gilmour, A. R., Harding, S. A., Morgan, G. W., Murray, D. A., Thompson, R., Todd, A. D., Tunicliffe-Wilson, G., Webster, R. & Welham, S. J. (1998). Genstat 5 Release 4.1 Reference Manual Supplement. Oxford: Numerical Algorithms Group
SPSS. (1999). See Meulman and Heiser (1999)
Additional information
This article was written while John Gower was a visiting professor at the ZA-Eurolab, at the Zentralarchiv für Empirische Sozialforschung, University of Cologne, Germany. The ZA is a Large Scale Facility funded by the Training and Mobility of Researchers program of the European Union.
Cite this article
GOWER, J.C., BLASIUS, J. Multivariate Prediction with Nonlinear Principal Components Analysis: Theory. Qual Quant 39, 359–372 (2005). https://doi.org/10.1007/s11135-005-3005-1
DOI: https://doi.org/10.1007/s11135-005-3005-1