In the environmental sciences, one often encounters large datasets with many variables. For instance, one may have a dataset of monthly sea surface temperature (SST) anomalies (“anomalies” are the departures from the mean) collected at l = 1,000 grid locations over several decades, i.e. the data are of the form x = [x_1, …, x_l], where each variable x_i (i = 1, …, l) has n samples. The samples may be collected at times t_k (k = 1, …, n), so each x_i is a time series containing n observations. Since the SSTs at neighboring grid points are correlated, and a dataset with 1,000 variables is quite unwieldy, one looks for ways to condense the large dataset to only a few principal variables. The most common approach is principal component analysis (PCA), also known as empirical orthogonal function (EOF) analysis (Jolliffe 2002).
In this chapter, we examine the use of multilayer perceptron (MLP) neural network (NN) models for nonlinear PCA (NLPCA) in Section 8.2, the overfitting problem associated with NLPCA in Section 8.3, and the extension of NLPCA to closed curve solutions in Section 8.4. MATLAB codes for NLPCA are downloadable from http://www.ocgy.ubc.ca/projects/clim.pred/download.html. The discrete approach by self-organizing maps is presented in Section 8.5, and the generalization of NLPCA to complex variables in Section 8.6.
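The condensation step described above can be illustrated with a minimal sketch of linear PCA computed by singular value decomposition. This is not the chapter's code: the function name `pca_svd`, the synthetic "SST-like" data, and the sizes n = 240 and l = 1,000 are all assumptions chosen only for illustration.

```python
import numpy as np

def pca_svd(x, n_modes):
    """Linear PCA (EOF analysis) of x with shape (n samples, l variables).

    Returns the leading spatial patterns (EOFs), their time series (PCs),
    and the fraction of total variance explained by each retained mode.
    """
    # Anomalies: remove the time mean of each variable (column).
    anomalies = x - x.mean(axis=0)
    # Economy-size SVD: anomalies = u @ diag(s) @ vt.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                 # shape (n_modes, l), orthonormal rows
    pcs = anomalies @ eofs.T            # shape (n, n_modes)
    var_frac = s[:n_modes] ** 2 / np.sum(s ** 2)
    return eofs, pcs, var_frac

# Hypothetical synthetic data: two oscillating spatial patterns plus noise.
rng = np.random.default_rng(0)
n, l = 240, 1000                        # 20 years of monthly data, 1,000 grid points
t = np.arange(n)
pattern1 = np.sin(np.linspace(0.0, np.pi, l))
pattern2 = np.cos(np.linspace(0.0, 2.0 * np.pi, l))
signal = (3.0 * np.outer(np.sin(2.0 * np.pi * t / 48.0), pattern1)
          + 1.5 * np.outer(np.cos(2.0 * np.pi * t / 30.0), pattern2))
x = signal + 0.5 * rng.standard_normal((n, l))

eofs, pcs, var_frac = pca_svd(x, n_modes=2)
print("variance fraction of the two leading modes:", var_frac)
```

Here the 1,000 correlated time series are summarized by two principal component time series, `pcs`, together with their spatial patterns, `eofs`. NLPCA, the subject of the chapter, replaces this linear mapping with a nonlinear one.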
Keywords
- Nonlinear Principal Component Analysis
- Linear Principal Component Analysis
- Mean Absolute Error Norm
- Principal Component Analysis Mode
- Information Criterion
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Baldwin, M., Gray, L., Dunkerton, T., Hamilton, K., Haynes, P., Randel, W., Holton, J., Alexander, M., Hirota, I., Horinouchi, T., Jones, D., Kinnersley, J., Marquardt, C., Sato, K., & Takahashi, M. (2001). The quasi-biennial oscillation. Reviews of Geophysics, 39, 179–229
Bishop, C. M. (1995). Neural networks for pattern recognition (482 pp.). Oxford: Oxford University Press
Cavazos, T. (1999). Large-scale circulation anomalies conducive to extreme precipitation events and derivation of daily rainfall in northeastern Mexico and southeastern Texas. Journal of Climate, 12, 1506–1523
Cavazos, T. (2000). Using self-organizing maps to investigate extreme climate events: An application to wintertime precipitation in the Balkans. Journal of Climate, 13, 1718–1732
Cavazos, T., Comrie, A. C., & Liverman, D. M. (2002). Intraseasonal variability associated with wet monsoons in southeast Arizona. Journal of Climate, 15, 2477–2490
Cherkassky, V., & Mulier, F. (1998). Learning from data (441 pp.). New York: Wiley
Christiansen, B. (2005). The shortcomings of nonlinear principal component analysis in identifying circulation regimes. Journal of Climate, 18, 4814–4823
Christiansen, B. (2007). Reply to Monahan and Fyfe's comment on “The shortcomings of nonlinear principal component analysis in identifying circulation regimes”. Journal of Climate, 20, 378–379. DOI: 10.1175/JCLI4006.1
Clarke, T. (1990). Generalization of neural network to the complex plane. Proceedings of International Joint Conference on Neural Networks, 2, 435–440
Del Frate, F., & Schiavon, G. (1999). Nonlinear principal component analysis for the radiometric inversion of atmospheric profiles by using neural networks. IEEE Transactions on Geoscience and Remote Sensing, 37, 2335–2342
Diaz, H. F., & Markgraf, V. (Eds.). (2000). El Niño and the southern oscillation: Multiscale variability and global and regional impacts (496 pp.). Cambridge: Cambridge University Press
Georgiou, G., & Koutsougeras, C. (1992). Complex domain backpropagation. IEEE Transactions on Circuits and Systems II, 39, 330–334
Hamilton, K. (1998). Dynamics of the tropical middle atmosphere: A tutorial review. Atmosphere-Ocean, 36, 319–354
Hamilton, K., & Hsieh, W. W. (2002). Representation of the QBO in the tropical stratospheric wind by nonlinear principal component analysis. Journal of Geophysical Research, 107. DOI: 10.1029/2001JD001250
Hardman-Mountford, N. J., Richardson, A. J., Boyer, D. C., Kreiner, A., & Boyer, H. J. (2003). Relating sardine recruitment in the Northern Benguela to satellite-derived sea surface height using a neural network pattern recognition approach. Progress in Oceanography, 59, 241–255
Hastie, T., & Stuetzle, W. (1989). Principal curves. Journal of the American Statistical Association, 84, 502–516
Hastie, T., Tibshirani, R., & Friedman, J. (2001). Elements of statistical learning: Data mining, inference and prediction (552 pp.). New York: Springer
Hirose, A. (1992). Continuous complex-valued backpropagation learning. Electronics Letters, 28, 1854–1855
Hoerling, M. P., Kumar, A., & Zhong, M. (1997). El Niño, La Niña and the nonlinearity of their teleconnections. Journal of Climate, 10, 1769–1786
Holton, J. R., & Tan, H.-C. (1980). The influence of the equatorial quasi-biennial oscillation on the global circulation at 50 mb. Journal of the Atmospheric Sciences, 37, 2200–2208
Hsieh, W. W. (2001). Nonlinear principal component analysis by neural networks. Tellus, 53A, 599–615
Hsieh, W. W. (2004). Nonlinear multivariate and time series analysis by neural network methods. Reviews of Geophysics, 42, RG1003. DOI: 10.1029/2002RG000112
Hsieh, W. W. (2007). Nonlinear principal component analysis of noisy data. Neural Networks, 20, 434–443. DOI: 10.1016/j.neunet.2007.04.018
Hsieh, W. W., & Wu, A. (2002). Nonlinear multichannel singular spectrum analysis of the tropical Pacific climate variability using a neural network approach. Journal of Geophysical Research, 107. DOI: 10.1029/2001JC000957
Jolliffe, I. T. (2002). Principal component analysis (502 pp.). Berlin: Springer
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187–200
Kim, T., & Adali, T. (2002). Fully complex multi-layer perceptron network for nonlinear signal processing. Journal of VLSI Signal Processing, 32, 29–43
Kirby, M. J., & Miranda, R. (1996). Circular nodes in neural networks. Neural Computation, 8, 390–402
Kohonen, T. (1982). Self-organizing formation of topologically correct feature maps. Biological Cybernetics, 43, 59–69
Kohonen, T. (2001). Self-organizing maps (3rd ed., 501 pp.). Berlin: Springer
Kramer, M. A. (1991). Nonlinear principal component analysis using autoassociative neural networks. AIChE Journal, 37, 233–243
Liu, Y., Weisberg, R. H., & Mooers, C. N. K. (2006). Performance evaluation of the self-organizing map for feature extraction. Journal of Geophysical Research, 111. DOI: 10.1029/2005JC003117
Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20, 130–141
Monahan, A. H. (2000). Nonlinear principal component analysis by neural networks: Theory and application to the Lorenz system. Journal of Climate, 13, 821–835
Monahan, A. H. (2001). Nonlinear principal component analysis: Tropical Indo-Pacific sea surface temperature and sea level pressure. Journal of Climate, 14, 219–233
Monahan, A. H., & Fyfe, J. C. (2007). Comment on “The shortcomings of nonlinear principal component analysis in identifying circulation regimes”. Journal of Climate, 20, 375–377. DOI: 10.1175/JCLI4002.1
Monahan, A. H., Fyfe, J. C., & Flato, G. M. (2000). A regime view of northern hemisphere atmospheric variability and change under global warming. Geophysical Research Letters, 27, 1139–1142
Monahan, A. H., Pandolfo, L., & Fyfe, J. C. (2001). The preferred structure of variability of the northern hemisphere atmospheric circulation. Geophysical Research Letters, 28, 1019–1022
Newbigging, S. C., Mysak, L. A., & Hsieh, W. W. (2003). Improvements to the non-linear principal component analysis method, with applications to ENSO and QBO. Atmosphere-Ocean, 41, 290–298
Nitta, T. (1997). An extension of the back-propagation algorithm to complex numbers. Neural Networks, 10, 1391–1415
Oja, E. (1982). A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15, 267–273
Philander, S. G. (1990). El Niño, La Niña, and the southern oscillation (293 pp.). San Diego, CA: Academic
Preisendorfer, R. W. (1988). Principal component analysis in meteorology and oceanography (425 pp.). Amsterdam: Elsevier
Rattan, S. S. P., & Hsieh, W. W. (2004). Nonlinear complex principal component analysis of the tropical Pacific interannual wind variability. Geophysical Research Letters, 31(21), L21201. DOI: 10.1029/2004GL020446
Rattan, S. S. P., & Hsieh, W. W. (2005). Complex-valued neural networks for nonlinear complex principal component analysis. Neural Networks, 18, 61–69. DOI: 10.1016/j.neunet.2004.08.002
Rattan, S. S. P., Ruessink, B. G., & Hsieh, W. W. (2005). Nonlinear complex principal component analysis of nearshore bathymetry. Nonlinear Processes in Geophysics, 12, 661–670
Richardson, A. J., Risien, C., & Shillington, F. A. (2003). Using self-organizing maps to identify patterns in satellite imagery. Progress in Oceanography, 59, 223–239
Richman, M. B. (1986). Rotation of principal components. Journal of Climatology, 6, 293–335
Rojas, R. (1996). Neural networks – A systematic introduction (502 pp.). Berlin: Springer
Ruessink, B. G., van Enckevort, I. M. J., & Kuriyama, Y. (2004). Non-linear principal component analysis of nearshore bathymetry. Marine Geology, 203, 185–197
Saff, E. B., & Snider, A. D. (2003). Fundamentals of complex analysis with applications to engineering and science (528 pp.). Englewood Cliffs, NJ: Prentice-Hall
Sanger, T. D. (1989). Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Networks, 2, 459–473
Schölkopf, B., Smola, A., & Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10, 1299–1319
Tang, Y., & Hsieh, W. W. (2003). Nonlinear modes of decadal and interannual variability of the subsurface thermal structure in the Pacific Ocean. Journal of Geophysical Research, 108. DOI: 10.1029/2001JC001236
Villmann, T., Merenyi, E., & Hammer, B. (2003). Neural maps in remote sensing image analysis. Neural Networks, 16, 389–403
von Storch, H., & Zwiers, F. W. (1999). Statistical analysis in climate research (484 pp.). Cambridge: Cambridge University Press
Webb, A. R. (1999). A loss function approach to model selection in nonlinear principal components. Neural Networks, 12, 339–345
Yacoub, M., Badran, F., & Thiria, S. (2001). A topological hierarchical clustering: Application to ocean color classification. Artificial Neural Networks-ICANN 2001, Proceedings. Lecture Notes in Computer Science, 492–499
Copyright information
© 2009 Springer Science+Business Media B.V
About this chapter
Cite this chapter
Hsieh, W.W. (2009). Nonlinear Principal Component Analysis. In: Haupt, S.E., Pasini, A., Marzban, C. (eds) Artificial Intelligence Methods in the Environmental Sciences. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-9119-3_8
DOI: https://doi.org/10.1007/978-1-4020-9119-3_8
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-9117-9
Online ISBN: 978-1-4020-9119-3
eBook Packages: Earth and Environmental Science (R0)