
1 Introduction

Market segmentation is an important concept in both marketing theory and practice, and over the last few decades it has become a central issue in marketing theory. Researchers have proposed many different techniques, yet proper market segmentation remains an open problem among market researchers. The concept was first proposed by the American marketing researcher Smith [45] and was then developed further by many others. Market segmentation is the process of dividing a large market into groups, or clusters, of customers with similar behavior, based on available market information [18, 23, 31, 45, 46, 51]. The groups or subsets may be demographic, economic, or choice based.

This paper surveys the development of data mining techniques in market segmentation. Data mining is the process of discovering patterns in data, and market researchers frequently use it to understand behavior patterns in marketing data. Data mining has become an emerging field of research owing to the explosion of data.

The aim of this paper is to present a comprehensive review of the literature published in academic journals on the application of data mining techniques in market segmentation. The classification framework follows previous literature [42, 43]. The paper is organized as follows: first, the research methodology of the study is presented; second, the data mining techniques used for market segmentation are listed and arranged into several groups; third, articles about data mining in market segmentation are analyzed, and the results of the classification are shown in tables; and finally, the conclusions, limitations of the study, and suggestions are discussed.

2 Research Methodology

Research in market segmentation has become multidisciplinary, and researchers across different fields have published relevant material in various journals. The following online journal databases were searched to build a comprehensive bibliography of academic literature on market segmentation.

  1. Science Direct

  2. ABI/INFORM Database

  3. Emerald Full Text

  4. IEEE Transaction

  5.

  6.

  7. Google Scholar

  8. Wiley Online Library

These databases were searched with the keywords “market segmentation,” “target marketing,” or “data mining and market segmentation,” which initially produced around 750 articles. Each full-text article was reviewed carefully, and articles that were not related to data mining techniques in market segmentation, or whose main focus was not the improvement of market segmentation, were eliminated. The following criteria were applied for selection, reducing the number to 103 articles related to data mining techniques and the development of methodologies in market segmentation: only articles published in the above-mentioned academic journals were included, and doctoral theses, master's theses, conference papers, and unpublished work were excluded, as in previous literature [43].

Each article was carefully reviewed and separately classified into one of several categories of data mining techniques. This search tried to cover all data mining techniques used so far, so this paper can serve as a comprehensive base for data mining research in market segmentation and give a broad view of the available techniques.

3 Classification Method

The reviewed research papers were classified into thirteen categories, each of which consists of several single or hybrid data mining techniques. For each category, the data mining techniques are presented in a table with the corresponding reference papers.

3.1 Classification Framework for Data Mining Techniques

This paper surveys and classifies different market segmentation techniques into thirteen broad categories as follows:

  1. Neural network

  2. Evolutionary algorithm

  3. Fuzzy theory

  4. RFM analysis

  5. Hierarchical clustering

  6. K-means

  7. Bagged clustering

  8. Kernel methods

  9. Multidimensional scaling

  10. Taguchi method

  11. Model-based clustering

  12. Rough set

  13.


Researchers began using K-means and hierarchical cluster analysis for market segmentation in the 1960s and 1970s [44], and many researchers still use K-means clustering successfully today [8, 33]. However, these traditional clustering techniques have drawbacks. For example, the K-means algorithm cannot handle noise and outliers [5, 6, 47]. It also does not determine the number of clusters, which must be supplied in advance, or the statistical validity of the clusters formed [37], and it can converge to a local minimum [15, 35]. To address this issue, researchers combined K-means with a genetic algorithm to reach the global minimum [3, 36, 40].
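
As an illustrative aside, the sensitivity of K-means to its starting point can be reproduced on a toy example. The sketch below is a minimal pure-Python version of Lloyd's K-means iteration; the four two-dimensional points, the function names (sq_dist, kmeans, best_of), and the two initializations are hypothetical and are not taken from any of the reviewed articles. One initialization converges to the better partition, while the other stays at a poorer local minimum. Keeping the best of several starts, as done at the end, is only a simple stand-in for the broader search performed by the genetic-algorithm hybrids in [3, 36, 40]; it is not an implementation of those hybrids.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's iteration from the given start; returns (centroids, labels, sse)."""
    centroids = [tuple(float(x) for x in c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the coordinate-wise mean of the cluster.
                new_centroids.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # fixed point reached: assignments can no longer change
        centroids = new_centroids
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


def best_of(points, inits):
    """Run K-means from every start in `inits` and keep the lowest-SSE result."""
    return min((kmeans(points, init) for init in inits), key=lambda r: r[2])


# Hypothetical toy data: two tight customer groups that are far apart on x.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]

good_init = [(0, 0.5), (10, 0.5)]   # one centroid near each real group
bad_init = [(5, 0), (5, 1)]         # centroids split the data the wrong way

_, good_labels, good_sse = kmeans(points, good_init)
_, bad_labels, bad_sse = kmeans(points, bad_init)
best = best_of(points, [bad_init, good_init])

print(good_labels, good_sse)  # [0, 0, 1, 1] 1.0
print(bad_labels, bad_sse)    # [0, 1, 0, 1] 100.0
print(best[2])                # 1.0
```

Both runs stop at a fixed point of the iteration, but the second one has a within-cluster sum of squared errors one hundred times larger, which is the local-minimum behavior described above.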

To classify complex consumer patterns in the market, researchers also introduced other approaches such as evolutionary algorithms, kernel methods, rough sets, and the Taguchi method. These approaches are now quite popular among market researchers and can segment markets better than traditional ones [24]. Other important market segmentation techniques, such as multidimensional scaling, random forest, RFM analysis, and bagged clustering, are also found in the academic literature [10, 48].

3.2 Classification Process

Each of the selected research papers was reviewed carefully and classified into one of the thirteen categories according to the proposed classification framework. The papers were also analyzed by year of publication and by journal. The classification process follows previous literature [42, 43] and consists of the following steps:

  • Online academic literature search

  • First-phase classification by the researcher

  • Second-phase classification of initial classification result

  • Final verification and classification of result.

4 Classification of the Articles

The 103 articles were distributed according to the proposed classification model, as shown in Table 1. Articles are categorized into thirteen broad methodologies, and each methodology is further divided into major data mining techniques. Most of the data mining techniques used in market segmentation are included here.

Table 1 Distribution of articles according to the proposed classification model

4.1 Distribution by Journals

The distribution of articles by journal is shown in Table 2. Articles on the application of data mining techniques in market segmentation are distributed across 45 journals. The top three journals are Expert Systems with Applications, Journal of Marketing Research, and Management Science.

Table 2 Distribution of articles by journal

These three journals account for more than 45 % of the articles reviewed. Other important journals include Decision Support Systems, European Journal of Operational Research, and Tourism Management.

4.2 Distribution of Articles by Year of Publication

The distribution of articles by year of publication is shown in Fig. 1. The 12 years from 2001 to 2012 were considered. The number of publications increased significantly over this period, and more research work can be expected in the future.

Fig. 1 Distribution of articles by year of publication

4.3 Distribution of Articles According to the Proposed Classification Model

The distribution of articles by major data mining technique is shown in Table 3. Twenty-two major data mining techniques were found in the literature. The top six are neural network, evolutionary algorithm, fuzzy theory, RFM analysis, hierarchical clustering, and K-means.

Table 3 Distribution of articles by data mining techniques


More than 75 % of the articles use these six techniques. Researchers also used hybrid techniques, both to obtain better results and to overcome the shortcomings of individual techniques. In the last few years, many advanced techniques have been proposed for market segmentation.

5 Discussions, Limitations, and Suggestions

5.1 Discussions

To our knowledge, this is the first review paper on data mining techniques in market segmentation. It covers the data mining techniques used in market segmentation so far, based on a search of eight online journal databases. After careful analysis of the research papers on market segmentation, we found that researchers have adopted a variety of data mining techniques over the last few decades. The development of methodologies in marketing research can be divided into two phases. In the first phase, early researchers working on market segmentation generally used K-means and hierarchical clustering. After 2000, market researchers turned to advanced data mining techniques that can handle complex consumer behavior patterns. Hybrid algorithms appear in the literature as a way to improve the performance of existing ones. Examples include multilayer perceptron with K-means, genetic algorithm with K-means, particle swarm optimization with K-means, and self-organizing map with K-means. Neural network-based market segmentation is quite popular among researchers.

5.2 Limitations

This paper has some limitations. It presents a broad literature review of data mining techniques in market segmentation and tries to incorporate all available techniques without restricting the time period. However, it surveys only articles retrieved with the keywords “market segmentation,” “target marketing,” or “data mining and market segmentation.” To trace how data mining techniques for market segmentation developed, articles were categorized based on their keyword index, abstract, and methodology section; articles without a keyword index could not be retrieved. The search was limited to 8 online databases, and other academic journals may describe further important data mining techniques for market segmentation. We believe that many companies practice advanced segmentation techniques, but these could not be included here owing to limited resources. Non-English publications were not considered, although they may contain important techniques for market segmentation. Another major limitation is that the quality of market segmentation depends not only on good data mining techniques but also on the selection of proper segmentation variables [51]. Researchers normally use general variables because they are easy to use [25]. A review of segmentation variables is outside the scope of this paper.

5.3 Suggestions

  1. Research on the application of data mining techniques in market segmentation is an emerging field, as shown in Fig. 1 and Table 1, and the number of publications on this topic is expected to increase gradually in the future.

  2. Neural network-based market segmentation is the most used data mining technique. Neural networks are used for classification, clustering, and prediction. Many researchers preferred the self-organizing map neural network model for visualization and for determining the number of clusters.

  3. The result of market segmentation can be improved by combining good data mining techniques with a proper selection of segmentation variables.

  4. K-means and hierarchical clustering are still preferred data mining techniques for market segmentation.

  5. Kernel-based market segmentation is a promising technique for robust market segmentation and has performed better than traditional techniques.

  6. Many hybrid data mining techniques are used to increase performance. For example, researchers combined K-means with a genetic algorithm for better performance.

  7. Market segmentation is gradually becoming more complex, and researchers are working on more advanced data mining techniques that can handle outliers, noise, and big data-related problems.

  8. Many data mining techniques have been used over the last few decades, but there is still scope for improvement. Future researchers can improve the available algorithms for better performance in market segmentation. Examples of such techniques include kernel-based methods, probabilistic fuzzy c-means, random forest, and evolutionary algorithms.

6 Conclusion

Market segmentation is one of the primary and most critical parts of market research. This work identified 103 articles published in 8 online databases. Our aim is to provide a brief summary of the data mining techniques that are most popular and have been used successfully so far. However, the work cannot claim to be exhaustive. We believe that it provides reasonable insight into market segmentation and gives a clear picture to those who are interested in working and doing research in this area.