1 Introduction

Most companies use marketing campaigns to recruit new customers or retain existing ones. Customer segmentation is an important technique for targeting the right customers in such campaigns, and it is a significant and useful task in marketing research. Since most conventional clustering methods, such as K-means clustering and self-organizing map (SOM) clustering, still have drawbacks in data clustering, this study explores a new method to reach a better clustering result (Zhang and Fang 2013).

In 1995, researchers developed a population-based algorithm called Particle Swarm Optimization (PSO) (Kennedy and Eberhart 1995). The algorithm simulates bird flocking or fish schooling to achieve a self-evolving system (Chen and Ye 2004), using a vector space concept to search for an optimum solution to a fitness function. This study develops an RFM (Recency, Frequency, and Monetary) model to calculate customer lifetime value, and the PSO algorithm then clusters the values of automobile industry customers into four homogeneous groups. To improve the effectiveness of PSO, this study also explores multi-stage hybrids of the K-means, SOM, and PSO algorithms. We compare the experimental results against those of the hybrid and other algorithms as well. Finally, campaign strategies are proposed for the most profitable and valuable customers.

2 Method review of customer segmentation

Targeting the right customers is a crucial task in implementing a successful marketing promotion. Marketing personnel segment customers using mathematical models such as neural networks and genetic algorithms (Vellido et al. 1999; Chan 2008). They design an efficient campaign plan by assigning values to customers to find the most profitable customers for a given product (Chan 2008). In this study, we classify existing customer segmentation methods as single or hybrid methodologies. Most single-methodology studies use artificial intelligence methods, e.g. neural networks (Vellido et al. 1999; Chan 2005), GA (Tsai and Chiu 2004; Jonker et al. 2004; Chan 2008) and fuzzy sets (Hu and Sheu 2003; Jiao and Zhang 2005), to segment customers.

Besides the single methodology, other researchers chose hybrid methodologies to search for an optimum solution to segmentation problems. In such applications, studies sometimes combine more than one method to solve the customer segmentation problem for individual applications (Kim and Street 2004; Kim et al. 2005; Kuo et al. 2006; Shin and Sohn 2004).

Since customer segmentation is a current issue for marketing researchers, numerous papers on this topic appear in the literature each year. Table 1 summarizes customer segmentation research. Before 2000, studies used simple artificial intelligence methods to segment customers. After 2004, many studies adopted hybrid methodologies that take advantage of multiple methods. For example, Kuo et al. (2006) proposed a novel two-stage method that uses a Self-Organizing Feature Map (SOM) neural network to determine the number of clusters and the starting points, and then a genetic K-means algorithm to find the final solution. Each paper developed a unique methodology and exploited that method's advantages to provide solutions to segmentation problems (Zhang et al. 2015).

Table 1 Artificial customer segmentation research review

Conventionally, K-means and SOM are two of the most popular algorithms used to cluster customer data. K-means has the problem of stopping at locally optimal solutions rather than producing a result that represents a global solution (Zhang and Fang 2013).

The major disadvantage of K-means is that its objective function is non-convex and may contain local minima (Chen and Ye 2004); the result can therefore be a local minimum instead of the global minimum. The major problem with SOM is difficulty in clustering ambiguous data. In some situations, SOM may cluster only a small number of data points inside a particular group, and the clustered data then loses its context within the larger dataset that represents a customer segment. From the marketing viewpoint, finding a better segmentation algorithm is a very important issue. In this study, we propose a hybrid evolutionary computational algorithm based on Particle Swarm Optimization (PSO) for clustering analysis, instead of the conventional artificial intelligence algorithms.

3 Framework for marketing campaign

The relationship between marketing campaigns, customer segmentation and clustering algorithms is described in Fig. 1. Marketing campaigns use a specific, defined series of activities to deliver a new or changed service, or new marketing channels and methods. This study proposes a three-layer framework for marketing campaigns. Marketing campaigns should target the right customers with the right strategies. To find target customers, companies have to segment customers into homogeneous groups, so layer 1 relies on the method of layer 2. Since customer segmentation in turn depends on an effective clustering algorithm, layer 2 applies the technology of layer 3.

Fig. 1 Three-layer framework for marketing campaign

4 Customer data clustering for marketing campaign

To segment customer behavior for a marketing campaign, we follow multiple steps (see Fig. 2). The customer service department initially collects data. The data set includes two major parts: demographic data such as name, gender and address, and transaction data such as the purchased product, transaction amount and date. The information manager then builds a database, such as MySQL, to organize the collected data. This study models three customer transactional data dimensions, namely recency, frequency and monetary (RFM), to classify customer behavior (Yao et al. 2000). An RFM model is then computed for each customer. The segmentation algorithm developed in this study segments the RFM model into several homogeneous groups. The data is then analyzed to find hidden knowledge such as customer preferences. Based on the marketing concept (Berger 2006), this study used profitability and customer loyalty as the major dimensions to divide customers into four categories: butterflies, true friends, strangers and barnacles (Table 2). Marketing personnel then identify a customer group and maximize customer lifetime values by executing a promotion plan to increase customer value. If the segmentation is successful, marketing activities such as promotions can target potentially profitable and valuable customers.

Fig. 2 The framework of marketing campaign

Table 2 Customer categories

5 Customer segmentation

The fourth marketing campaign step, as shown in Fig. 2, is the customer segmentation step. Customer segmentation divides the customer database into groups that are similar in specific behaviors relevant to marketing (Rouse 2015). The customer segmentation process includes selecting a target data set, data encoding, segmentation, behavior modeling and marketing knowledge development (see Fig. 3). Because the data set contains more than 100,000 customers from an automobile retailer, it is too large to process directly. The initial significant task is therefore to extract a smaller target data set from the raw data; the selection criterion is to choose the more popular car models. We compute the RFM data model from the target data in a standard format that is recognizable by PSO. PSO is used in this study to segment customers into homogeneous groups, each with its own characteristics. A customer behavioral model is then developed based on the RFM model. Finally, customer behavior knowledge, such as preferences, is obtained.

Fig. 3 The processes of customer segmentation

6 RFM model for customer lifetime value

The third marketing campaign step is to generate the RFM model. RFM (Recency, Frequency and Monetary) is one of the best-known methods for discovering customer behavior patterns in a massive database. Many academic researchers have used this method to analyze and present customer purchasing behavior characteristics (Chan 2005, 2008; Hsieh 2004). The RFM model uses three customer transactional data dimensions, Recency, Frequency and Monetary, to segment customer behavior (Chan 2005, 2008; Yao et al. 2000). The first dimension, "Recency", indicates the length of time since the customer's most recent transaction. The second dimension, "Frequency", indicates how frequently a customer purchases products during a particular period. The third dimension, "Monetary", measures the amount of money that customers spend during a period (Chan 2008; Jonker et al. 2004). The RFM model assumes that future patterns of consumer trading will be similar to past and current patterns.

The RFM model can represent customer behavior, but each variable in the RFM model falls within a different range. Since each value must be normalized into a uniform form to fit the PSO input format, this study normalizes the Recency, Frequency and Monetary values using Eq. (1) and then encodes each into a four-bit binary string.

$$X' = \frac{X - x_{\min}}{x_{\max} - x_{\min}} \times 100\,\%$$
(1)

\(x_{\min}\): the minimum value of the data set; \(x_{\max}\): the maximum value of the data set.

If a value is between 80 and 100 %, the code is set to 5. Following this notion, if the value is between 60 and 80 %, the code is set to 4 (see Table 3). This proposed encoding scheme transforms points of the parameter space into a binary string representation (Jang et al. 1997). Each coordinate value is encoded as a gene composed of four binary bits. Thus, we represent customer behavior using an RFM model encoded as a three-dimensional binary code, i.e. three genes of four binary bits each (Chan 2008). To assess each customer segmentation generation proposed by the PSO algorithm, this investigation develops a lifetime total value (LTV) model as the PSO adaptive (fitness) value.

Table 3 Transformation between input data and binary code
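
To make Eq. (1) and the encoding of Table 3 concrete, the following Python sketch normalizes one RFM dimension to a percentage and maps it to a four-bit gene. It is a minimal illustration only: the 20 % bands below 60 % and all sample values are assumptions, since the text only states the codes for the 60–80 % and 80–100 % ranges.

```python
def normalize(x, x_min, x_max):
    """Min-max normalization of Eq. (1), expressed as a percentage."""
    return (x - x_min) / (x_max - x_min) * 100.0

def encode_4bit(percent):
    """Map a normalized percentage to a four-bit gene (Table 3).
    Assumed banding: 80-100 % -> 5, 60-80 % -> 4, ..., 0-20 % -> 1."""
    level = min(int(percent // 20) + 1, 5)   # levels 1..5
    return format(level, '04b')              # e.g. 5 -> '0101'

# Hypothetical customer and data-set extremes for the three RFM dimensions.
recency, frequency, monetary = 30, 12, 85000
extremes = {'R': (5, 365), 'F': (1, 20), 'M': (10000, 120000)}

gene = ''.join(
    encode_4bit(normalize(v, lo, hi))
    for v, (lo, hi) in zip((recency, frequency, monetary), extremes.values())
)
print(gene)   # three concatenated four-bit genes, e.g. '000100110100'
```

Concatenating the three genes yields the twelve-bit string that represents one customer in the PSO input format.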

After segmentation, the value of each group should be identified for value analysis. This study uses lifetime values to analyze the value of segmented customers. Normally, customer lifetime values must represent both past profit contributions and future expected profitability (Hwang et al. 2004). We use the sum of the present and future values as an index to determine the segmented customer LTV. In analyzing customer value, this study uses the lifetime value to judge profitability and customer loyalty and thereby distinguish short-term from long-term value.

This study adopts the LTV model from Chan (2008).

$$\text{LTV} = \text{Present value} + \text{Future value}$$
(2)
$$\text{Present value} = \sum_{t_i = 0}^{N_i} \pi_p(t_i)\,(1 + d)^{N_i - t_i}$$
(3)

where \(t_i\) is the service period index of customer i; \(N_i\) is the total service period of customer i; d is the interest rate; and \(\pi_p(t_i)\) is the past profit contribution of customer i at period \(t_i\).

Previous LTV models could not be linked to campaign activities; the main problem of most existing research is that the correlation between customer values and campaign activities is not considered in the LTV model. To cope with this difficulty, this study develops a model for calculating customer value based on a series of campaign projects, as follows (see Fig. 4):

$$\text{Potential value} = \sum_{j=1}^{N} \text{Prob}_{ij} \times \text{Profit}_{ij} \times M_{ij}$$
(4)

where \(\text{Prob}_{ij}\) is the probability of using marketing strategy j for segmented customer i; \(M_{ij}\) is the contribution factor of increasing monetary value by using strategy j for segmented customer i; and N is the expected years of customer loyalty.

$$\text{Customer loyalty}_i = \sum_{j=1}^{M} L_{ij} \times \left(1 - \text{Churn rate}\right)$$
(5)

where \(L_{ij}\) is the contribution factor of increasing loyalty by adopting strategy j for customer i, and M is the total number of marketing strategies.

Fig. 4 Customer lifetime value computation

The new model adds two contribution-factor variables. The first is the probability of customer i adopting strategy j; the other is the contribution value when customer i uses strategy j. These variables link campaign programs and customer values.
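
As a worked illustration of Eqs. (2)–(5), the sketch below computes the present value, potential (future) value, LTV and loyalty for one segmented customer. All figures and variable names, and the treatment of the potential value of Eq. (4) as the future-value term of Eq. (2), are illustrative assumptions; the study does not publish the underlying profit series or strategy factors.

```python
def present_value(past_profits, d):
    """Eq. (3): compound each period's past profit forward to period N_i."""
    N = len(past_profits) - 1                       # periods t_i = 0 .. N_i
    return sum(p * (1 + d) ** (N - t) for t, p in enumerate(past_profits))

def potential_value(prob, profit, m_factor):
    """Eq. (4): expected future profit over the N campaign strategies."""
    return sum(pr * pf * m for pr, pf, m in zip(prob, profit, m_factor))

def customer_loyalty(l_factor, churn_rate):
    """Eq. (5): loyalty contribution over the M marketing strategies."""
    return sum(l * (1 - churn_rate) for l in l_factor)

# Hypothetical customer with three past service periods and two strategies.
pv = present_value(past_profits=[1200, 900, 1500], d=0.03)
fv = potential_value(prob=[0.4, 0.25], profit=[2000, 3500], m_factor=[1.1, 1.3])
ltv = pv + fv                                       # Eq. (2): LTV = present + future
loyalty = customer_loyalty(l_factor=[0.6, 0.8], churn_rate=0.1)
print(round(ltv, 2), round(loyalty, 2))
```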

7 Particle swarm optimization

This study proposes an evolutionary optimization algorithm instead of traditional artificial intelligence algorithms to segment the customer data set. Since Kennedy and Eberhart recognized the bird flocking and fish schooling phenomena within social interactions and developed particle swarm optimization (PSO) in 1995, the PSO algorithm has become a popular evolutionary optimization algorithm (Kao and Zahara 2008). PSO is a swarm-intelligence-based algorithm that looks for a better solution to an optimization problem in a search space with regard to a set of objectives (Kennedy and Eberhart 1995; Shi and Eberhart 1998). The major PSO concept is that the members of a group follow the lead set by the best of the group (Kao and Zahara 2008). The potential solutions, called particles, search through the problem space by following the current optimum particles (swarmintelligence.org 2015). At each time step, PSO changes the velocity of (accelerates) each particle toward its best locations (see Eq. 6), where the acceleration is weighted by random numbers between 0 and 1 applied toward the Pbest and Gbest values. Pbest is the best solution each particle has achieved so far. The velocity and best value of each particle are updated, and the positions are moved (see Eq. 7), until the best particle (Gbest) is found.

The velocity and position of each particle are changed according to Eqs. 6 and 7.

$$V^{new} = V^{old} + c_{1} * rand() * (P_{best} - X^{old}) + c_{2} * rand() * (G_{best} - X^{old})$$
(6)
$$X^{new} = X^{old} + V^{new}$$
(7)

V: particle velocity

X: particle position

rand(): a random number between 0 and 1

\(c_1\) and \(c_2\) are learning factors; usually \(c_1 = c_2 = 2\) (Urade and Patel 2011).

\(P_{best}\): each particle keeps track of the coordinates in the problem space associated with the best solution (fitness) it has achieved so far. This value is called pbest (Urade and Patel 2011).

\(G_{best}\): when a particle takes the entire population as its topological neighbors, the best value found is a global best and is called gbest (Urade and Patel 2011).
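
A minimal sketch of the velocity and position updates of Eqs. (6) and (7), assuming c1 = c2 = 2 as noted above. Fitness evaluation and the updating of pbest and gbest between iterations are omitted, and the particle values are illustrative.

```python
import random

C1 = C2 = 2.0   # learning factors, c1 = c2 = 2 (Urade and Patel 2011)

def pso_step(positions, velocities, pbest, gbest):
    """One PSO iteration: Eq. (6) velocity update, then Eq. (7) position update."""
    for i, (x, v) in enumerate(zip(positions, velocities)):
        new_v = [
            v[d]
            + C1 * random.random() * (pbest[i][d] - x[d])
            + C2 * random.random() * (gbest[d] - x[d])
            for d in range(len(x))
        ]
        velocities[i] = new_v
        positions[i] = [x[d] + new_v[d] for d in range(len(x))]
    return positions, velocities

# Three particles in a two-dimensional search space (illustrative values).
pos = [[0.2, 0.8], [0.5, 0.1], [0.9, 0.4]]
vel = [[0.0, 0.0] for _ in pos]
pbest = [p[:] for p in pos]      # best position found so far by each particle
gbest = pos[0]                   # best position found by the whole swarm
pos, vel = pso_step(pos, vel, pbest, gbest)
```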

Since PSO has been applied successfully in many research areas (Zhang et al. 2015), this study chooses PSO as the customer segmentation algorithm. In Fig. 5 the starting point is coding and transformation, which transforms the RFM values into four-bit binary values. PSO then analyzes the data to generate a sample space and set the parameters. Because the customer segmentation goal is to search for the maximum customer lifetime total value (LTV), we chose the customer LTV as the adaptive (fitness) value for the PSO algorithm.

Fig. 5 Using particle swarm optimization for customer segmentation

8 Evaluation

Segmentation criteria must be determined to compare the PSO customer segmentation performance with existing algorithms. We used the DB (Davies and Bouldin 1979) index value as the data clustering performance criterion. The DB index uses the ratio of the differences inside a group to the differences between groups as the segmentation criterion. First, we compute the similarity between groups using the standard deviations of the group members and the distance between the centers of group i and group j (see Eq. 8). Second, the average of the similarity functions is chosen as the effectiveness index, and the DB index value is calculated (see Eq. 9). A small DB value indicates smaller differences between group members, so the groups are more homogeneous.

$$R_{i} = \max_{j \ne i} \left\{ \frac{S_{i} + S_{j}}{M_{ij}} \right\}, \quad i = 1, 2, \ldots, K;\; j = 1, 2, \ldots, K;\; i \ne j$$
(8)
$$DB = \frac{1}{K}\sum_{i = 1}^{K} R_{i}$$
(9)
  • DB: the segmentation effect index

  • K: group number

  • \(R_i\): the similarity function for group i

  • \(S_i\): the standard deviation of the members of group i

  • \(M_{ij}\): the distance between the centers of group i and group j
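
The following sketch computes the DB index of Eqs. (8) and (9) for an illustrative two-cluster segmentation, using Euclidean distances between centers. The within-group scatter \(S_i\) is implemented here as the mean member-to-center distance; a standard-deviation-based definition, as described above, could be substituted.

```python
import numpy as np

def db_index(groups):
    """Davies-Bouldin index, Eqs. (8)-(9).
    groups: list of 2-D arrays, one array of member vectors per cluster."""
    centers = [g.mean(axis=0) for g in groups]
    # S_i: within-group scatter, here the mean member-to-center distance
    scatter = [np.mean(np.linalg.norm(g - c, axis=1)) for g, c in zip(groups, centers)]
    K = len(groups)
    R = [
        max((scatter[i] + scatter[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(K) if j != i)                 # Eq. (8)
        for i in range(K)
    ]
    return sum(R) / K                                    # Eq. (9)

# Hypothetical normalized RFM vectors already split into two clusters.
g1 = np.array([[0.1, 0.2, 0.1], [0.2, 0.1, 0.2]])
g2 = np.array([[0.8, 0.9, 0.7], [0.9, 0.8, 0.8]])
print(round(db_index([g1, g2]), 4))
```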

9 Case study

A good segmentation technology is significant in assisting an automobile retailer in conducting a good campaign to recruit and retain customers. This section follows the steps in Fig. 2 to illustrate the case study. In the first step of the marketing campaign, this study used 1000 historical target customer records provided by an automobile retailer. The data set includes two major parts: demographic data such as name, gender and address, and transaction data such as the purchased product, transaction amount and date. The data set is stored in a MySQL database. Based on the RFM concept described in Sect. 6, this study developed a program to transform the data set into the RFM model.

The proposed encoding scheme transforms the points of the parameter space into a binary string representation. For instance, a point (5, 2, 3) in the three-dimensional parameter space can be represented as the concatenated binary string 0101 0010 0011.

The fourth step of the marketing campaign is customer segmentation. The first step within customer segmentation is to find the best segment size. Based on the marketing concept (Berger 2006), this study used profitability and customer loyalty as the major dimensions to divide customers into the four categories of Table 2: butterflies, true friends, strangers and barnacles. If a customer's loyalty period is longer than the average value, that customer belongs to the long-term customer group; if the profit from a customer is larger than the average value, that customer belongs to the high-profitability customer group. Short-term customers with high profitability are butterflies, while short-term customers with low profitability are strangers. Long-term customers with high profitability are true friends, while long-term customers with low profitability are barnacles. This study summarizes the four customer categories in Table 2.
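
A minimal sketch of this categorization rule: loyalty and profit are compared with the data-set averages and the customer is placed into one of the four categories of Table 2. The thresholds and sample values are illustrative.

```python
def categorize(loyalty, profit, avg_loyalty, avg_profit):
    """Place a customer into one of the four categories of Table 2."""
    long_term = loyalty > avg_loyalty       # longer-than-average loyalty period
    high_profit = profit > avg_profit       # higher-than-average profit
    if high_profit:
        return "true friends" if long_term else "butterflies"
    return "barnacles" if long_term else "strangers"

# Illustrative values: a long-term, low-profitability customer.
print(categorize(loyalty=4.2, profit=1800, avg_loyalty=3.0, avg_profit=2500))
# -> barnacles
```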

The second step is to find the best-fitting number of particles. Because the customer segmentation goal is to search for the maximum customer lifetime total value rather than profitability and customer loyalty separately, we chose the customer lifetime total value as the adaptive value for the PSO algorithm. The results in Table 4 show that when the number of particles approaches 20, the adaptive value, or fitness function, converges to a stable value of 1.543. When the number of particles exceeds 20, the execution time increases rapidly while the adaptive value increases only slowly, meaning that the PSO computation has reached the maximum customer total value. Therefore, we chose 20 particles to execute the value segmentation.

Table 4 Execution time and the adaptive value

To compare the performance of the PSO algorithm with that of other algorithms, we used the segmentation similarity function (Ri) and the effect index (DB). In Table 5 the PSO DB value is 0.8845, which is smaller than the K-means and SOM values. Applying the DB value as the criterion for evaluating algorithm performance, as described in Sect. 8 above, PSO, with the lowest DB index value, is the better algorithm for clustering customers.

Table 5 DB value for K-means, SOM, and PSO algorithms

To improve the effectiveness of PSO, this study explores multi-stage hybrids of the K-means, SOM, and PSO algorithms. The hybrid algorithms are named by the order in which the component algorithms are processed and include S-KMeans, KMeans-PSO, S-PSO and S-KMeans-PSO. For example, in the S-KMeans hybrid algorithm, SOM is executed first and then K-means. The results (Table 6) show that the hybrid of SOM, K-means and PSO (S-KMeans-PSO) achieves the best performance with the lowest DB value (0.8413), but takes longer to execute at 74.422 s.

Table 6 DB value and execution time for hybrid K-means, SOM, and PSO algorithms
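
The chaining pattern of the multi-stage hybrids can be sketched as follows: each stage hands its cluster centers to the next stage as a starting point. The som_stage and pso_stage functions below are hypothetical placeholders rather than the implementations used in the experiments; scikit-learn's KMeans is used only to show how an earlier stage's centers can seed a later one.

```python
import numpy as np
from sklearn.cluster import KMeans

def som_stage(data, k):
    """Placeholder for the SOM stage: returns k initial cluster centers.
    A real SOM implementation would be substituted here."""
    rng = np.random.default_rng(0)
    return data[rng.choice(len(data), size=k, replace=False)]

def pso_stage(data, centers):
    """Placeholder for the PSO stage: would refine the centers by
    maximizing the LTV-based fitness (e.g. with pso_step above)."""
    return centers

def s_kmeans_pso(data, k=4):
    """S-KMeans-PSO: SOM seeds K-means, whose centers then seed PSO."""
    centers = som_stage(data, k)
    km = KMeans(n_clusters=k, init=centers, n_init=1).fit(data)
    return pso_stage(data, km.cluster_centers_)

centers = s_kmeans_pso(np.random.rand(100, 3))   # 100 normalized RFM vectors
```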

Figure 6 shows the customer segmentation distribution. The units of the three axes (prices, services and insurance) originally differ, so this study normalized each scale to a number between 0 and 1. Each axis represents the value obtained by calculating the customer lifetime total value (LTV) with Eq. 2 for one of the three key customer purchase items (prices, services and insurance) drawn from the data set.

Fig. 6 Customer segmentation result

In the value analysis step, this study segmented the customers into four groups with the PSO algorithm. The customer characteristics are summarized as follows:

  • Cluster 1

    This cluster requires little in the way of offerings and has lower value. As shown in Fig. 6, nearly 40 % of customers are scattered in the low-value area called "Strangers", with low profitability and low loyalty. Many new customers who purchased small cars such as the March and TIIDA are identified as strangers. Many of the customers in this group are young or new drivers.

  • Cluster 2

    This cluster represents a good fit between the seller's offerings and the customer's needs, with high profit potential. These customers love to try brand new cars and care about prices and services. Sometimes they try to find a cheaper dealer to fix their cars to save money. Nearly 20 % of these customers are classified in the "Butterflies" category with high profitability and low loyalty.

  • Cluster 3

    Customers in this cluster are the most valuable segment. They offer a good offerings/needs fit and high profit potential. Normally, the customers in this category have high loyalty and value. They love the Nissan brand. Nearly 27 % of these customers belong to the “True Friends” category with high profitability and high loyalty.

  • Cluster 4

    This cluster represents a poor offerings/needs fit and offers low profit potential, but these customers have high loyalty. Customers who drive low mileage and need only a few services are identified in this cluster. Nearly 13 % of these customers are scattered in the "Barnacles" area with low profitability and high loyalty.

The final step is to provide campaign activities for each group. Based on the above analysis and the 80/20 rule, whereby 80 % of profits come from 20 % of clients, clusters 2 and 3 are the high-profit, valuable customers. This study therefore proposes campaign activities for those two clusters.

  • Cluster 2

    This cluster likes to buy brand new cars. The dealer could extend the warranty for new cars; in fact, the dealer has now extended the warranty from 3 to 5 years to retain these high-profit customers longer.

  • Cluster 3

    This cluster is accustomed to buying higher-priced cars. This study proposes providing more discounts on car sales and maintenance to attract these customers and retain these valuable customers longer.

10 Conclusion

Many companies have difficulties targeting the right customers with the right campaign activities, and matching the two is crucial: poor customer segmentation can result in targeting the wrong customers. Most conventional clustering methods, such as K-means clustering and self-organizing map (SOM) clustering, still have drawbacks in data clustering (Chen and Ye 2004). Since the intelligent swarm algorithm has the advantage of using fewer parameters to reach a globally optimal solution (Bai 2010), this study proposes the PSO algorithm to cluster customers. To find profitable and valuable customers, we used an RFM model to compute their lifetime values, and based on these lifetime values we applied the PSO algorithm to cluster customers. We compared the experimental results of PSO, K-means, and SOM; clustering with the PSO algorithm produced the smallest segmentation effect index of the three algorithms tested. We also compared the performance of the PSO customer segmentation algorithm against that of the hybrid algorithms, and the study finds that the hybrid S-KMeans-PSO algorithm achieves the best performance. This experiment demonstrates that S-KMeans-PSO is effective as a customer clustering application. Finally, two campaign activities are proposed and applied to high-profit and valuable customers to increase customer value.

This study still has a few limitations. First, using PSO requires a large amount of data to identify the customer values. Second, the PSO algorithm could be combined with other algorithms to find a global solution (Pang et al. 2004; Shi and Eberhart 2001); in the near future we will develop a PSO-based hybrid algorithm to find optimal global solutions. Third, this study used the DB (Davies and Bouldin) index to evaluate the performance of the algorithms; other indexes could be applied in future research.