Abstract
Cluster analysis is an exploratory tool for compressing data into a smaller number of groups or representative points. These representatives aim to summarize the underlying data structure well enough that the analyst can work with them instead of the complete data set. Because of this data compression property, cluster analysis remains an essential part of the marketing analyst’s toolbox in today’s data-rich business environment. This chapter gives an overview of the various approaches and methods for cluster analysis and links them with the most relevant marketing research contexts. We also provide pointers to the specific packages and functions for performing cluster analysis in the R ecosystem for statistical computing. A substantial part of this chapter illustrates the application of different clustering procedures to a reference data set of shopping basket data. For each technique considered, we briefly outline the general approach, walk through the R code required to perform the analysis, and offer some interpretation of the results.
Copyright information
© 2019 Springer Nature Switzerland AG
About this entry
Cite this entry
Reutterer, T., Dan, D. (2019). Cluster Analysis in Marketing Research. In: Homburg, C., Klarmann, M., Vomberg, A. (eds) Handbook of Market Research. Springer, Cham. https://doi.org/10.1007/978-3-319-05542-8_11-1
DOI: https://doi.org/10.1007/978-3-319-05542-8_11-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05542-8
Online ISBN: 978-3-319-05542-8
eBook Packages: Springer Reference Business and Management; Reference Module Humanities and Social Sciences; Reference Module Business, Economics and Social Sciences
Chapter history
- Latest: Cluster Analysis in Marketing Research. Published: 26 March 2020. DOI: https://doi.org/10.1007/978-3-319-05542-8_11-2
- Original: Cluster Analysis in Marketing Research. Published: 29 March 2019. DOI: https://doi.org/10.1007/978-3-319-05542-8_11-1