Abstract
Sensors and actuators embedded in physical objects and linked through wired and wireless networks, known as the “Internet of Things”, are churning out huge volumes of data (McKinsey Quarterly report, 2010). This phenomenon has led to the archiving of mammoth amounts of data in areas ranging from scientific simulations in the physical sciences and bioinformatics to social media and a plethora of others. It is predicted that over 30 billion devices with 200 billion intermittent connections will be connected by 2020. The creation and archival of these massive amounts of data have spawned a multitude of industries. Data management and upstream analytics are aided by data compression and dimensionality reduction. This review paper focuses on some foundational methods of dimensionality reduction, examines some of the main algorithms in extensive detail, and points the reader to emerging next-generation methods that seek to identify structure within high-dimensional data not captured by second-order statistics.
© 2013 Springer International Publishing Switzerland
Lakshminarayan, C.K. (2013). High Dimensional Big Data and Pattern Analysis: A Tutorial. In: Bhatnagar, V., Srinivasa, S. (eds) Big Data Analytics. BDA 2013. Lecture Notes in Computer Science, vol 8302. Springer, Cham. https://doi.org/10.1007/978-3-319-03689-2_5
DOI: https://doi.org/10.1007/978-3-319-03689-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03688-5
Online ISBN: 978-3-319-03689-2
eBook Packages: Computer Science (R0)