Predicting customer churn from valuable B2B customers in the logistics industry: a case study

Chen, Kuanchin; Hu, Ya-Han; Hsieh, Yi-Cheng

doi:10.1007/s10257-014-0264-1

Original Article
Published: 02 October 2014

Information Systems and e-Business Management, volume 13, pages 475–494 (2015)

Abstract

This study examines how the length, recency, frequency, monetary, and profit (LRFMP) customer value model can be used to predict customer churn at a logistics company. This context has useful business implications compared with mainstream customer churn studies, in which individual customers (rather than business customers) are the main focus. Our results show that the five LRFMP variables had varying effects on customer churn. Specifically, the length, recency, and monetary variables had a significant effect on churn, while the frequency variable became a top predictor only when the variability of the first three variables was limited. The profit variable never became a significant predictor. Certain other behavioral variables (such as time between transactions) also had an effect on churn. The resulting set of predictors of churn expands the original LRFMP and RFM models with additional insights. Managerial implications are provided.

1 Introduction

Logistics is the flow of raw materials or other goods to end customers (Waters 2003; Yildiz et al. 2010). Delivering items to the correct place at an appropriate time at a reasonable cost is an essential success factor in the logistics industry (Liu and Lyons 2011). Concern arises when any stage of this process causes customer dissatisfaction, which potentially leads to loss of business. Logistics companies currently engage in activities that transcend the traditional goods delivery model, such as service delivery, rendering, and integration (Esper et al. 2003; Renko and Ficko 2010). For example, UPS has expanded its business, serving as a solution provider that offers service integration in e-commerce, accounts receivable, insurance, and financing. These activities relate to UPS’s core business, and such expansion into other areas develops customer dependency and loyalty, which eventually translate into sustainable business.

A long-term relationship with customers is crucial in the logistics industry because numerous aspects of service encounters (such as cost, speed of delivery, and courtesy) can easily be imitated by competitors. Customer churn (or customer attrition) is therefore a key indicator for gauging success in the logistics industry. Van den Poel and Lariviere (2004) showed that the cost of a high churn rate (at 25 % compared with 7 %) for a UK retail bank was nearly 220,000 euros after 25 years. Although this figure varies among industries, a highly competitive market, such as the logistics industry, is unlikely to have a low customer churn rate. Therefore, researchers recommend a defensive marketing strategy (Fornell and Wernerfelt 1987; Ahn et al. 2006) that prevents customers from switching service providers. Customer churn prediction is widely used to support retention and loyalty efforts (De Bock and Van den Poel 2012; Glady et al. 2009; Huang et al. 2012; Hung et al. 2006; Kisioglu and Topcu 2011; Li et al. 2011b; Neslin et al. 2006).

The purpose of customer churn management is to minimize losses caused by customer churn and to retain high-value customers, thereby maximizing profit. The 80/20 rule suggests that 80 % of revenue is provided by 20 % of customers (Xu and Walton 2005). Focusing on retaining high-value customers is a reasonable strategy. Consequently, studying the profile of lost customers and predicting customer churn is crucial for survival in highly competitive industries. Customer churn prediction has been applied in numerous fields, particularly in the telecommunication and financial industries, in which target audiences are primarily individuals (Huang et al. 2012; Verbeke et al. 2012; Kisioglu and Topcu 2011; Nie et al. 2011; Tsai and Lu 2009). Most high-value customers in the logistics industry are businesses that are bound by regulations, corporate policy, and accumulated business practice. Because of this difference, switching cost is considerably lower for individual customers than for business customers.

The contribution of this study is twofold. First, we extended customer value analysis, specifically the length, recency, frequency, monetary, and profit (LRFMP) model, to business customers in the logistics industry (Li et al. 2011a; Verbeke et al. 2012). The results indicated that this model is generalizable to the business context of the logistics industry. Second, we investigated the effects of all five LRFMP variables concurrently. If some LRFMP variables overshadow others, further insight can be gained into how the remaining variables become predictors when the variability of the primary variables is limited or controlled.

2 Related work

2.1 Customer value analysis

Customer value analysis involves identifying patterns and group associations in a large volume of customer data. By conducting customer value analysis, businesses can identify valued customers who have contributed the most to business revenue over a considerable period of time (Cheng and Chen 2009). Among techniques for assessing customer value, the recency, frequency, and monetary (RFM) model has received considerable attention in recent literature (Chang et al. 2010; Chen et al. 2005; Cheng and Chen 2009; Hosseini et al. 2010; Liang 2010; Liu and Shih 2005; Yeh et al. 2009). The RFM technique is used widely in customer behavior analysis to study customer value and market segmentation.

Hughes (1994, 2005) studied a large body of historical transaction data and determined that the following three key behavioral indicators are relevant to customer value analysis.

1. Recency refers to the period of time between the previous purchase date and a set date (determined before analysis begins). The likelihood of repurchase is high when the duration between these two dates is short.
2. Frequency refers to the total number of purchases within a particular period of time. It is used to measure the interaction frequency between a customer and the business. A high level of interaction indicates customer loyalty.
3. Monetary refers to the total dollar amount of a customer’s purchases within a particular period of time. It can be used to measure the contribution of a customer to revenue. The greater the amount spent on purchases is, the more the customer contributes to revenue.
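
The three indicators can be derived from a raw transaction log with a single pass over the records. The sketch below is only an illustration of the definitions above: the transactions, the reference date, and the function name `rfm` are hypothetical and are not taken from the case company's data.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("A", date(2012, 7, 1), 120.0),
    ("A", date(2012, 8, 15), 80.0),
    ("B", date(2012, 3, 10), 500.0),
]
# The "set date" that is fixed before the analysis begins (assumed value).
reference_date = date(2012, 8, 31)


def rfm(transactions, reference_date):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    summary = {}
    for customer, day, amount in transactions:
        last, count, total = summary.get(customer, (day, 0, 0.0))
        # Track the latest purchase date, the purchase count, and total spend.
        summary[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((reference_date - last).days, count, total)
        for customer, (last, count, total) in summary.items()
    }


rfm_table = rfm(transactions, reference_date)
for customer, (recency, frequency, monetary) in sorted(rfm_table.items()):
    print(customer, recency, frequency, monetary)
```

A smaller recency value means a more recent purchase, which is why recency is later ranked in the opposite direction from frequency and monetary.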

The importance of these three indicators varies among industries. Identifying the relative weight of these indicators for the specific domain of business is useful. Studies have recently begun expanding the RFM model by including additional variables. For example, Wei et al. (2012) suggested that the longevity of a relationship with a customer affects customer loyalty. They suggested an LRFM model including the longevity of a relationship, a CRFM model in which the RFM model is applied to various product categories, and a CLVRFM model based on traditional RFM analysis.

Chang and Tsai (2011) added product category groups to develop the GRFM model. Yeh et al. (2009) developed a RFMTC model by including the first purchase time (T) and customer churn probability (C). These extensions to the traditional RFM model have generated mixed results.

2.2 Customer churn

To retain customers, companies engage in activities to satisfy customers and reduce customer defection. Customer retention is the ultimate goal of customer relationship management (Payne and Frow 2005; Reinartz et al. 2004). In increasingly competitive business environments, acquiring new customers is increasingly expensive and difficult. Retaining existing customers is a popular strategy that is comparatively less costly than attracting new customers (Reinartz and Kumar 2003). Several studies have shown that acquiring a new customer is usually five to six times more expensive than retaining a customer (Athanassopoulos 2000; Slater and Narver 2000).

The scenario in which customers cease transacting with a company is called customer churn or customer attrition (Neslin et al. 2006; Yu et al. 2011). Customers who end a relationship with a company and develop a new relationship with a different company are called “churners” (Kisioglu and Topcu 2011).

Customer churn causes revenue loss and other negative effects on corporate operations (Saradhi and Palshikar 2011). Therefore, establishing an accurate customer churn prediction model for identifying key factors that cause churn is crucial. Several key recent studies on customer churn are summarized in Table 1.

Table 1 Recent studies on customer churn prediction

According to Table 1, most previous studies on customer churn prediction have focused on the banking, retail, and telecommunication industries. According to a review of relevant literature, no study has investigated customer churn prediction in the logistics industry. In addition, variables that affect churn vary substantially among industries. For example, average minutes of usage, age, and place of residence, which are used in the telecommunication industry, are not relevant to logistics (Kisioglu and Topcu 2011). Similarly, debt ratio, home ownership, and credit risk, which are applied in the banking industry, are not relevant to logistics (Burez and Van den Poel 2008). One method for gaining insight into customer churn in an industry on which little empirical guidance has been provided in the literature is to apply a common set of classification techniques derived from previous churn studies that used a set of variables relevant to the industry. The immediate benefit of using common techniques is that it enables researchers to provide new evidence regarding the generalizability of existing methods.

Table 1 shows a common set of techniques used in recent churn studies. This set of techniques includes logistic regression (LGR), decision tree analysis (C4.5), artificial neural network (multilayer perceptron, MLP), and support vector machine (SVM). These four techniques were applied in this study.

3 Methods

3.1 Data collection and preprocessing

The company investigated in this study was founded in 1938 and is one of the largest logistics companies in Taiwan. It has approximately 2,500 transportation vehicles and over 100,000 business customers. In 2010, its total revenue was over TWD 8 billion (roughly USD 271 million). The original dataset comprised data on 106,747 business customers who engaged in over 210 million transactions between March, 2010 and August, 2012.

Before customer value analysis and customer churn prediction were conducted, a series of data preprocessing tasks, including merging customer, shipping, and delivery tables; removing records with missing values; deleting duplicate records; and aggregating records for each business customer, was performed. Moreover, recently acquired customers were excluded from the analysis because they had not been with the company long enough to be considered retained customers. These recent customers were defined as those for whom the transaction length (i.e., the number of days between the first and the final transaction) was shorter than 30 days. After recently acquired customers had been excluded, a total of 69,170 business customers remained in the final data set.
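
The exclusion rule for recently acquired customers can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code; the customer identifiers, the dates, and the names `MIN_LENGTH_DAYS` and `exclude_recent_customers` are hypothetical.

```python
from datetime import date

# Customers whose transaction length is shorter than this are excluded.
MIN_LENGTH_DAYS = 30


def transaction_length(first, last):
    """Number of days between the first and the final transaction."""
    return (last - first).days


def exclude_recent_customers(spans):
    """Keep customers whose transaction length is at least MIN_LENGTH_DAYS."""
    return {
        customer: (first, last)
        for customer, (first, last) in spans.items()
        if transaction_length(first, last) >= MIN_LENGTH_DAYS
    }


# Hypothetical (first transaction, final transaction) dates per customer.
spans = {
    "new_customer": (date(2012, 8, 1), date(2012, 8, 20)),   # 19 days
    "boundary": (date(2010, 3, 1), date(2010, 3, 31)),       # exactly 30 days
    "long_term": (date(2010, 3, 1), date(2012, 8, 31)),
}
kept = exclude_recent_customers(spans)
print(sorted(kept))
```

Because the text defines recent customers as those with a length shorter than 30 days, a customer with exactly 30 days is retained in this sketch.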

Before a customer churn prediction model was developed, the groups of active and lost customers were defined. In the case company, lost business customers were defined as those who engaged in no transactions in the past month. A customer service representative usually contacts lost customers to determine why they have not engaged in transactions. The case company classifies lost customers into four categories: those who changed location, those who went bankrupt, those who switched to a competitor, and those in debt. After consulting with experienced account managers, we considered only customers who switched to a competitor. The case company cannot address the attrition of customers in the other three categories. Among the 69,170 business customers, the numbers of active and lost customers were 67,849 and 1,321, respectively.

After consulting with the case company, we determined that a total of 18 variables (see Table 2) in three general categories (customer profile, customer transaction behavior, and quality of delivery service) were relevant to customer churn. The descriptive statistics for both active and lost customers are shown in Table 3.

Table 2 Definition of variables

Table 3 Descriptive statistics of the complete customer data set

3.2 Experimental design

3.2.1 Customer value analysis

Figure 1 shows the research process, which can be divided into two main steps: customer value analysis and customer churn prediction. First, the LRFMP variables, which are measured on different scales (see Table 2), were standardized before further processing. The standardization procedure follows Hughes’ (1994) equal depth approach (called the customer quintile method by Miglautsch 2000). Specifically, the customers were sorted in ascending order according to the variable CsnRcn (recency) and in descending order according to the other four variables, so that the most valuable customers on each variable came first. For each of the LRFMP variables, the customers were then partitioned into five equal-sized groups (quintiles). These quintiles were assigned numbers from 5 (highest customer value) to 1 (lowest customer value). This approach to standardizing scales has been the primary approach applied in other LRFMP studies (Cheng and Chen 2009; McCarty and Hastak 2007; Coussement et al. 2014).
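
The equal depth (quintile) scoring described above can be sketched in a few lines. The recency values below are hypothetical, and the way ties are broken (by original order, through a stable sort) is an assumption, because the text does not say how ties were handled.

```python
def quintile_scores(values, higher_is_better=True):
    """Equal-depth scores from 5 (best fifth) to 1 (worst fifth), one per value."""
    n = len(values)
    # Rank customers from best to worst on this variable.
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_better)
    scores = [0] * n
    for rank, i in enumerate(order):
        # The first fifth of the ranking gets 5, the next fifth 4, and so on.
        scores[i] = 5 - (rank * 5) // n
    return scores


# Hypothetical days since the most recent transaction for ten customers.
recency_days = [3, 40, 12, 90, 7, 25, 60, 1, 18, 33]
# Recency is sorted in ascending order: fewer days means higher value.
recency_scores = quintile_scores(recency_days, higher_is_better=False)
print(recency_scores)
```

For the other four variables the same function would be called with `higher_is_better=True`, which mirrors the descending sort in the text.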

To determine the weighting of each LRFMP variable, this study employed the analytic hierarchy process (AHP). Five senior sales managers at the case company were invited to evaluate the relative importance of the LRFMP variables. An example of the AHP pair-comparison matrix for the LRFMP model is shown in Table 4. According to the AHP assessment, the relative weights of length, recency, frequency, monetary, and profit were 0.077, 0.225, 0.43, 0.241, and 0.026, respectively, indicating that frequency carries the most weight, followed by monetary, recency, length, and profit.
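
To show how AHP turns a pairwise comparison matrix into weights, the sketch below uses the row geometric-mean method, a common approximation of the principal-eigenvector weights. The managers' actual judgments are not reproduced here, so the matrix is an assumption: it is built directly from the reported weights, which makes it perfectly consistent. The method the authors used to derive the weights may differ.

```python
import math

# Weights reported in the text, in LRFMP order.
REPORTED = {"length": 0.077, "recency": 0.225, "frequency": 0.43,
            "monetary": 0.241, "profit": 0.026}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Assumed, perfectly consistent matrix: entry (i, j) is w_i / w_j.
values = list(REPORTED.values())
matrix = [[wi / wj for wj in values] for wi in values]

weights = ahp_weights(matrix)
ranking = sorted(REPORTED, key=REPORTED.get, reverse=True)
print([round(w, 3) for w in weights])
print(ranking)
```

The recovered weights equal the reported ones after normalization (the reported values sum to 0.999 because of rounding), and sorting them gives the order frequency, monetary, recency, length, profit.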

Table 4 An example of AHP pair-comparison matrix for the LRFMP model

After the standardized LRFMP score for each customer was collected and the weight was determined for each of the LRFMP variables by using the AHP, a composite score for each customer was calculated as follows:

Assume that the standardized LRFMP scores of customer u_i (u_i ∈ U) are NL_i, NR_i, NF_i, NM_i, and NP_i, respectively. The LRFMP composite score of customer u_i, denoted as Score_{u_i} and written as LRFMP(u_i) in Eq. (1), is defined in the following equation:

$$LRFMP(u_{i} ) = 0.077 \times NL_{i} + 0.225 \times NR_{i} + 0.43 \times NF_{i} + 0.241 \times NM_{i} + 0.026 \times NP_{i} .$$

(1)

The coefficients of the variables in this equation were determined using the aforementioned AHP study. The composite scores were subjected to a median split to study patterns among the customers. The top 50 % of customers were labeled “valuable customers,” whereas the remaining customers were labeled “less valuable customers.” Median split is a common approach for studying patterns, behavioral difference, intention, and satisfaction in churn-related studies. Examples include word of mouth in customer lifetime value (Lee et al. 2006), and customer loyalty (Zhang et al. 2010).
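
Equation (1) and the median split can be sketched together. The four customers and their quintile scores are hypothetical, and treating only scores strictly above the median as "valuable" is an assumption about how ties at the median are handled.

```python
from statistics import median

# AHP weights from Eq. (1).
WEIGHTS = {"L": 0.077, "R": 0.225, "F": 0.43, "M": 0.241, "P": 0.026}


def composite_score(quintiles):
    """Weighted sum of the standardized (1 to 5) LRFMP scores, as in Eq. (1)."""
    return sum(WEIGHTS[name] * quintiles[name] for name in WEIGHTS)


# Hypothetical standardized scores for four customers.
customers = {
    "u1": {"L": 5, "R": 5, "F": 5, "M": 5, "P": 5},
    "u2": {"L": 1, "R": 1, "F": 1, "M": 1, "P": 1},
    "u3": {"L": 3, "R": 4, "F": 2, "M": 5, "P": 1},
    "u4": {"L": 2, "R": 2, "F": 4, "M": 3, "P": 3},
}

scores = {name: composite_score(q) for name, q in customers.items()}
cutoff = median(scores.values())
# Assumption: customers strictly above the median are labeled valuable.
valuable = {name for name, score in scores.items() if score > cutoff}
print(scores)
print(sorted(valuable))
```

Because the weights sum to 0.999, the composite score ranges from 0.999 (all quintile scores equal to 1) to 4.995 (all equal to 5).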

3.2.2 Customer churn prediction

Because valuable customers contribute more to the profitability of the firm than do less valuable customers, we concentrated on these customers and their churn rate. We used Weka 3.7.7, a widely used open-source data mining software (www.cs.waikato.ac.nz/ml/weka), to study the performance of classification techniques, namely J48 (C4.5 in Weka), MultilayerPerceptron (MLP in Weka), SMO (SVM in Weka), and SimpleLogistic (LGR in Weka) (Linoff and Berry 2011; Tan et al. 2006).

The churn rate of valuable customers was approximately 1 %. Such a low customer churn rate can cause a class imbalance problem, wherein the majority class influences the prediction more than the minority class because of unequal representation. Previous studies have suggested that the sample size can be adjusted to improve the prediction performance of supervised learning (Tan et al. 2006). A resampling procedure was conducted to ensure that the lost/active ratio remained balanced. Specifically, we generated the dataset by undersampling the majority instances and retaining the complete set of minority instances so that the sample sizes of the two groups were approximately equal. However, useful instances in the majority class can be lost if the resampling procedure is applied only once. Therefore, a random resampling technique was applied 30 times to generate multiple datasets; for each generated dataset, tenfold cross-validation was applied to evaluate sample quality.
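
The repeated undersampling step can be sketched as follows. The records, the random seed, and the function names are hypothetical; the sketch only builds the 30 balanced datasets and leaves out the tenfold cross-validation that the study then ran on each of them.

```python
import random


def undersample(majority, minority, rng):
    """Draw as many majority records as there are minority records; keep all minority."""
    sampled_majority = rng.sample(majority, len(minority))
    return sampled_majority + list(minority)


def balanced_datasets(majority, minority, repeats=30, seed=0):
    """Repeat the undersampling so that different majority records get used."""
    rng = random.Random(seed)
    return [undersample(majority, minority, rng) for _ in range(repeats)]


# Hypothetical records: (label, customer index).
active = [("active", i) for i in range(1000)]
lost = [("lost", i) for i in range(10)]

datasets = balanced_datasets(active, lost)
print(len(datasets), len(datasets[0]))
```

Repeating the draw matters because each single draw discards most active customers; across 30 draws a much larger share of them contributes to at least one model.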

3.3 Parameter settings

According to the industry-wide data mining process model, the Cross Industry Standard Process for Data Mining, parameter calibration for the modeling phase of data mining involves testing the model by using a range of parameter values to optimize the model. Details on the parameter settings are shown in Table 5. Regarding C4.5, a decision tree stops growing if the number of instances in a node does not satisfy a user-specified threshold (i.e., 15, 20, or 25). Regarding MLP, we chose a single-hidden-layer network topology with a sigmoid function. Four other crucial parameters were adjusted in the experiments, including the number of nodes in the hidden layer, learning rate, momentum factor, and maximal number of epochs. In SVM, both the PolyKernel and RBFKernel were selected as the kernel function in all tests.

Table 5 Parameter tuning in different classification techniques

3.4 Performance measure

The performance of the classification models can be evaluated using several widely accepted indicators (accuracy, precision, recall, and F1) based on the confusion matrix. The confusion matrix in Table 6 can be used to calculate these four metrics as follows:

$$Accuracy = \frac{TP + TN}{TP + FP + FN + TN}$$

(2)

$$Precision = \frac{TP}{TP + FP}$$

(3)

$$Recall = \frac{TP}{TP + FN}.$$

(4)

$$F1 = \frac{2 \times Precision \cdot Recall}{Precision + Recall}.$$

(5)

Table 6 Confusion matrix
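
Equations (2) to (5) translate directly into code. The confusion matrix counts below are hypothetical and are not results from this study; the function assumes that no denominator is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, with churners treated as the positive class.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On a balanced, undersampled dataset, accuracy and F1 tend to be close to each other, as they are in this made-up example.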

4 Results and discussion

4.1 Results of customer value analysis

The purpose of customer value analysis is to identify valuable customers that potentially contribute to the profitability of the company. As previously discussed, valuable customers are the top 50 % of customers (a total of 34,675) based on the composite scores (Score_{u_i}) calculated using the LRFMP variables. The churn rate for this group was approximately 1.1 %. The customers in this 1.1 % were lost customers, whereas the customers who remained with the company were active customers. Although a low churn rate in this group is a positive sign for a healthy future, it caused the problem of unequal representation of the two groups in data analysis. A resampling procedure was conducted to ensure that the sizes of the two groups were approximately equal. Table 7 shows descriptive statistics on LRFMP variables for the two customer groups. Table 8 shows how the two groups differed in other related variables.

Table 7 Descriptive statistics of the complete customer data set in customer value analysis

Table 8 Descriptive statistics of the valuable customer data set

4.2 Results of customer churn prediction

The data sets produced in the previous section were used to measure the performance of four classifiers, namely C4.5, MLP, SVM, and LGR. The results and accuracy measures are presented in Table 9. Accuracy, precision, recall, and the F1 measure were used to guide the direction of model tuning. To facilitate explanation, we report only the average values based on 30 generated datasets.

Table 9 Experimental results for all classifiers

For each classification algorithm, we report only the optimal result (bold-faced numbers in Table 9) among various parameter settings. The highest accuracy rates for C4.5, MLP, SVM, and LGR were 93.1, 90.9, 88.4, and 87.6 %, respectively. The highest F1 measures of C4.5, MLP, SVM, and LGR were 93.3, 91.1, 87.2, and 86.8 %, respectively. A one-tailed paired t test was conducted to compare the prediction accuracy of the classifiers. The results indicated that C4.5 significantly outperformed the other three algorithms (C4.5 vs. MLP: t = 22.4057, p < 0.001; C4.5 vs. SVM: t = 12.4619, p < 0.001; C4.5 vs. LGR: t = 34.2875, p < 0.001). In addition, SVM and MLP significantly outperformed LGR (MLP vs. LGR: t = 18.7484, p < 0.001; SVM vs. LGR: t = 2.144, p < 0.05). Similar results were obtained regarding the F1 measure. C4.5 significantly outperformed the other three algorithms (C4.5 vs. MLP: t = 18.7992, p < 0.001; C4.5 vs. SVM: t = 62.5325, p < 0.001; C4.5 vs. LGR: t = 36.2557, p < 0.001). Furthermore, SVM and MLP significantly outperformed LGR (MLP vs. LGR: t = 18.6723, p < 0.001; SVM vs. LGR: t = 3.063, p < 0.01). These results indicated that C4.5 was the optimal algorithm according to all of the reported performance measures. LGR performed the least favorably. In addition, we observed that the sensitivity for parameter variation was low, indicating that all techniques used in the analysis are reliable classification models.
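
The paired t statistic used to compare two classifiers over the same resampled datasets can be computed as sketched below. The accuracy values are hypothetical (five datasets instead of the study's 30), and the sketch stops at the t statistic: obtaining the one-tailed p value requires the t distribution with n − 1 degrees of freedom, for example from a statistics library.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """t statistic for paired samples: mean difference over its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical accuracies of two classifiers on the same five datasets.
acc_tree = [0.93, 0.94, 0.92, 0.93, 0.95]
acc_logistic = [0.91, 0.90, 0.91, 0.90, 0.92]

t_value = paired_t_statistic(acc_tree, acc_logistic)
degrees_of_freedom = len(acc_tree) - 1
print(round(t_value, 3), degrees_of_freedom)
```

Pairing is appropriate here because both classifiers are evaluated on the same generated datasets, so dataset-to-dataset variation cancels out in the differences.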

A customer churn prediction model can be used as an early warning tool for businesses, and extracting critical factors related to customer churn can provide additional useful knowledge that supports decision making. We conducted an additional experiment to identify the top three variables that exerted the most substantial effect on whether a customer is lost or active. The information gain ratio (or gain ratio) was used to determine the variables most relevant to the model. The results indicated that CsnRcn (days since the most recent transaction) was the most influential variable, followed by CsnLng (longevity of the relationship) and CsnMnt (average revenue gained from the customer). In other words, customers for whom the number of days since the previous transaction (CsnRcn) is higher, those who have maintained a shorter relationship with the company (CsnLng), and those who have provided lower average revenue to the case company (CsnMnt) have a higher probability of being a churner. This result was expected because most business customers have a constant need for the delivery of products or services. Unless a replacement delivery method is implemented (e.g., mail replaced by e-mail or electronic communications), the need for delivery does not fluctuate drastically in a short period of time. Therefore, a high recency value (CsnRcn) or a short relationship (CsnLng) is a possible indication of customer churn. This finding is consistent with those of previous studies, such as Chen et al. (2008) and Li et al. (2011a). The association of the longevity of the relationship (CsnLng) and average revenue provided to the case company (CsnMnt) with the target variable in the aforementioned experiments is consistent with the results reported by Buckinx and Van den Poel (2005). The consistency of these customer churn variables with those used in existing studies has numerous implications because the customers studied were business users of delivery services. Such a combination of context and user profiles has not received sufficient attention in relevant literature.
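
The gain ratio used to rank variables can be illustrated for a categorical variable. The labels and feature values below are hypothetical toy data, and the sketch assumes the variable has already been discretized; continuous variables such as CsnRcn would first need to be split into intervals, and the software used in the study may handle that step differently.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())


def gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


labels = ["lost", "lost", "active", "active"]
recency_band = ["high", "high", "low", "low"]   # separates the classes perfectly
region = ["north", "south", "north", "south"]   # carries no information

print(gain_ratio(recency_band, labels), gain_ratio(region, labels))
```

Dividing by the split information penalizes variables with many distinct values, which would otherwise look informative simply because they fragment the data.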

4.3 Profile analysis

This section presents the results of profile analysis, the goal of which is to determine patterns between lost and active customers based on the similar magnitudes of the top three variables (CsnRcn, CsnLng, and CsnMnt) reported in the previous section. The data were first sorted according to CsnRcn (ascending order), CsnLng (descending order), and then CsnMnt (descending order). Because of this sorting procedure, the variability of records among these three variables in each quartile was limited. The top 50 % of this sorted data consisted of only 5 churners and 17,332 active customers, representing a churn rate of 0.03 % [5/(5 + 17,332)]; however, the bottom 50 % consisted of 404 churners and 16,934 active customers, representing a churn rate of 2.33 %. As these three variables were the top three predictors, churners were located mainly in the bottom portion of the sorted data set. Quartile 3 contained only 6 churners and 8,663 active customers. Therefore, most churners were located in Quartile 4, which contained 398 churners and 8,271 active customers, representing a churn rate of 4.59 %. Based on these rounded rates, the churn rate in Quartile 4 is roughly 153 times that of the top 50 % of the sorted data set (4.59/0.03).
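
The churn rates quoted above follow directly from the reported counts, as the short check below shows. Only the counts come from the text; the function and variable names are illustrative.

```python
def churn_rate(churners, active):
    """Churn rate in percent: churners / (churners + active) * 100."""
    return 100.0 * churners / (churners + active)


top_half = churn_rate(5, 17332)
bottom_half = churn_rate(404, 16934)
quartile_4 = churn_rate(398, 8271)

print(round(top_half, 2), round(bottom_half, 2), round(quartile_4, 2))

# Ratio of the Quartile 4 rate to the top-half rate, from the rounded
# percentages and from the unrounded ones.
ratio_rounded = round(quartile_4, 2) / round(top_half, 2)
ratio_unrounded = quartile_4 / top_half
print(round(ratio_rounded), round(ratio_unrounded))
```

The factor of 153 corresponds to dividing the rounded percentages (4.59/0.03); with unrounded rates the ratio is somewhat higher, about 159, which does not change the conclusion that churners are concentrated in Quartile 4.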

The next step involved explaining the top predictors of churn in the fourth quartile, wherein the variability of CsnRcn, CsnLng, and CsnMnt was limited because of the aforementioned sorting procedure. According to the gain ratio and chi-square statistic, the top predictors of churn in Quartile 4 were CsnRcn (recency), CsnLng (length), CsnStpItvMax, CsnFrq (frequency), and CsnStpItvAvg (as shown in Table 10). CsnMnt (monetary) did not qualify as a top predictor, whereas CsnFrq (frequency) did, indicating that the effect of frequency may have been overshadowed until the variability of other variables was limited or controlled in the fourth quartile. In all analyses, profit was never influential. This insight improved our understanding of the relationship between the LRFMP variables and churn.

Table 10 Variable contribution to churn when variability of recency, length and monetary is limited

Regarding other profile variables, both customer region (CsrRgn) and distance to the nearest branch of the case company (CsrNrRng) exerted little effect on whether the customer was lost to competition. One possible reason is that the rural–urban disparity in Taiwan is low and the case company provides a free package pickup service for business customers. The statistics in Table 10 indicate that none of the quality variables were among the top predictors. Further analysis revealed that the case company provides high-quality delivery service according to several metrics. For example, the delivery failure rate was 0.026 %, the redelivery rate was 0.014 %, the rate of deliveries completed in 1 day was 94.1 %, and the average number of days for delivery was 1.06. These numbers indicate that the company provided excellent delivery service. In other words, delivery quality was not likely a key cause of customer churn.

Table 11 lists several crucial decision rules generated using C4.5. We report only some of the crucial rules for customer churn prediction because of space limitations. The first number at the end of each rule represents the number of customers satisfying the antecedent part of the rule, whereas the second number represents the number of customers incorrectly classified according to the rule. Decision tree induction can be used to characterize a group of churn customers, and the results can be applied in customer relationship management.

Table 11 The decision rules extracted by the C4.5

5 Conclusion

This study examined the variables that contribute to the customer churn of valuable customers at a logistics company. Valuable customers were identified based on their composite scores, which were calculated using the LRFMP variables. Because the overall ratio of lost customers to active customers was low, a resampling procedure was adopted to balance the representation of the two customer groups. Various prediction models were then used to compare the prediction performance. The effects of the LRFMP variables, as well as other profile variables, on customer churn were examined.

The decision tree model outperformed the other models (MLP, SVM, and LGR) in predicting customer churn. The top three most influential predictors for customer churn were recency (CsnRcn), length (CsnLng), and monetary (CsnMnt), whereas the other two LRFMP variables (i.e., frequency and profit) were not significant predictors; these results are not consistent with those of several other customer churn studies. This points to an issue that relates to whether the five LRFMP variables equally influence churn.

This study addressed how and when these other LRFMP variables become useful or influential. We limited the variability of the top three churn variables (i.e., recency, length, and monetary) by sorting the records based on the three variables and focusing on the quartile that contained the highest number of lost customers. The results indicated that recency (CsnRcn), length (CsnLng), the maximal time interval between two adjacent transactions (CsnStpItvMax), frequency (CsnFrq), and the average time interval between two adjacent transactions (CsnStpItvAvg) became predictors after the variability of the top three variables was limited. This result indicates that the five variables in the LRFMP model do not affect churn equally. Even in the quartile containing the customers that are most likely to churn (high recency, short length, and short duration between transactions), many customers still decided to remain with the company. Our identification of these “secondary” predictors (frequency, length, and time between adjacent transactions) provides insight into the intention to churn.

The primary contributions of this study are described as follows. First, our sample of business customers in the logistics industry represents a population that has not been previously explored. Many previous studies on churn have focused on individual users who usually have a lower switching cost than do business customers (Miguéis et al. 2012; Huang et al. 2012). Although insight into the logistics industry and churn of business customers is limited, studying the effect of LRFMP variables in this context provides evidence regarding the generalizability of the LRFMP and traditional RFM models. Second, variables related to churn vary greatly among industries. As indicated in Sect. 2, numerous variables used in other industries are either not relevant or not applicable (e.g., average minutes of usage, place of residence, debt ratio, home ownership, and credit risk). This study clarifies how churn can be predicted more accurately in the logistics industry.

Third, not all of the five LRFMP variables exerted an equal effect on churn. Previous studies (Wei et al. 2012) have laid the foundation for these five variables, whereas this study expanded on this foundation by showing that recency, longevity, and monetary are the most influential predictors among the five variables. Frequency was not considered a crucial predictor until the variability of the aforementioned three variables was limited. These results revealed that recency, length, and frequency are the LRFMP variables most related to churn for customers who are the most likely to leave the company. The profit variable of the LRFMP model never became a significant predictor in our analyses.

Several directions can be taken in the future. First, the data were provided by a single case company. Data from numerous companies can be collected and compared to enhance the generalizability of the LRFMP model further. Second, the LRFMP model is a popular customer value analysis model, but it is not the only model. As shown in the current study, not all five of the LRFMP variables exerted a notable influence on churn. Future studies can consider multiple customer value analysis models. Third, as mentioned in the Introduction, logistics companies have begun including other services, such as banking and accounts receivable, in their core business. Such an expansion increases customer “lock-in,” a common business practice used to retain customers. Future studies can report the effect of customer lock-in on customer churn.

References

Ahn JH, Han SP, Lee YS (2006) Customer churn analysis: churn determinants and mediation effects of partial defection in the Korean mobile telecommunications service industry. Telecommun Policy 30:552–568
Athanassopoulos AD (2000) Customer satisfaction cues to support market segmentation and explain switching behavior. J Bus Res 47(3):191–207
Buckinx W, Van den Poel D (2005) Customer base analysis: partial defection of behaviourally loyal clients in a non-contractual FMCG retail setting. Eur J Oper Res 164(1):252–268
Burez J, Van den Poel D (2008) Separating financial from commercial customer churn: a modeling step towards resolving the conflict between the sales and credit department. Expert Syst Appl 35:497–514
Chang HC, Tsai HP (2011) Group RFM analysis as a novel framework to discover better customer consumption behavior. Expert Syst Appl 38(12):14499–14513
Chang EC, Huang SC, Wu HH (2010) Using K-means method and spectral clustering technique in an outfitter’s value analysis. Qual Quant 44(4):807–815
Chen MC, Chiu AL, Chang HH (2005) Mining changes in customer behavior in retail marketing. Expert Syst Appl 28(4):773–781
Chen Y, Fu C, Zhu H (2008) A data mining approach to customer segment based on customer value. In: International conference on fuzzy systems and knowledge discovery, pp 513–517
Cheng CH, Chen YS (2009) Classifying the segmentation of customer value via RFM model and RS theory. Expert Syst Appl 36(3, Part 1):4176–4184
Coussement K, Van den Bossche FAM, De Bock KW (2014) Data accuracy’s impact on segmentation performance: benchmarking RFM analysis, logistic regression, and decision trees. J Bus Res 67(1):2751–2758
De Bock KW, Van den Poel D (2012) Reconciling performance and interpretability in customer churn prediction using ensemble learning based on generalized additive models. Expert Syst Appl 39(8):6816–6826
Esper TL, Jensen TD, Turnipseed FL, Burton S (2003) The last mile: an examination of effects of online retail delivery strategies on consumers. J Bus Logist 24(2):177–203
Fornell C, Wernerfelt B (1987) Defensive marketing strategy by customer complaint management: a theoretical analysis. J Mark Res 24(4):337–346
Glady N, Baesens B, Croux C (2009) Modeling churn using customer lifetime value. Eur J Oper Res 197(1):402–411
Hosseini SMS, Maleki A, Gholamian MR (2010) Cluster analysis using data mining approach to develop CRM methodology to assess the customer loyalty. Expert Syst Appl 37(7):5259–5264
Huang BQ, Kechadi TM, Buckley B, Kiernan G, Keogh E, Rashid T (2010) A new feature set with new window techniques for customer churn prediction in land-line telecommunications. Expert Syst Appl 37(5):3657–3665
Huang B, Kechadi MT, Buckley B (2012) Customer churn prediction in telecommunications. Expert Syst Appl 39(1):1414–1425
Hughes AM (1994) Strategic database marketing. Probus Publishing Company, Chicago
Hughes AM (2005) Strategic database marketing. McGraw-Hill, New York City
Hung SY, Yen DC, Wang HY (2006) Applying data mining to telecom churn management. Expert Syst Appl 31(3):515–524
Hwang H, Jung T, Suh E (2004) An LTV model and customer segmentation based on customer value: a case study on the wireless telecommunication industry. Expert Syst Appl 26(2):181–188
Kisioglu P, Topcu YI (2011) Applying Bayesian belief network approach to customer churn analysis: a case study on the telecom industry of Turkey. Expert Syst Appl 38(6):7151–7157
Kumar D, Ravi V (2008) Predicting credit card customer churn in banks using data mining. Int J Data Anal Tech Strateg 1(1):4–28
Lee J, Lee J, Feick L (2006) Incorporating word-of-mouth effects in estimating customer lifetime value. Database Mark Cust Strateg Manag 14(1):29–39
Li DC, Dai WL, Tseng WT (2011a) A two-stage clustering method to analyze customer characteristics to build discriminative customer management: a case of textile manufacturing business. Expert Syst Appl 38(6):7186–7191
Li Y, Deng Z, Qian Q, Xu R (2011b) Churn forecast based on two-step classification in security industry. Intell Inf Manag 3(4):160–165
Liang YH (2010) Integration of data mining technologies to analyze customer value for the automotive maintenance industry. Expert Syst Appl 37(12):7489–7496
Linoff GS, Berry MJ (2011) Data mining techniques: for marketing, sales, and customer relationship management. Wiley, Hoboken
Liu CL, Lyons AC (2011) An analysis of third-party logistics performance and service provision. Transp Res Part E Logist Transp Rev 47(4):547–570
Liu DR, Shih YY (2005) Hybrid approaches to product recommendation based on customer lifetime value and purchase preferences. J Syst Softw 77(2):181–191
McCarty JA, Hastak M (2007) Segmentation approaches in data-mining: a comparison of RFM, CHAID, and logistic regression. J Bus Res 60(6):656–662
Miglautsch JR (2000) Thoughts on RFM scoring. J Database Mark 8(1):67–72
Miguéis VL, Van den Poel D, Camanho AS, e Cunha JF (2012) Modeling partial customer churn: on the value of first product-category purchase sequences. Expert Syst Appl 39(12):11250–11256
Neslin S, Gupta S, Kamakura W, Lu J, Mason C (2006) Defection detection: improving predictive accuracy of customer churn models. J Mark Res 43(2):204–211
Nie G, Rowe W, Zhang L, Tian Y, Shi Y (2011) Credit card churn forecasting by logistic regression and decision tree. Expert Syst Appl 38(12):15273–15285
Payne A, Frow P (2005) A strategic framework for customer relationship management. J Mark 69(4):167–176
Reinartz WJ, Kumar V (2003) The impact of customer relationship characteristics on profitable lifetime duration. J Mark 67(1):77–99
Reinartz W, Krafft M, Hoyer WD (2004) The customer relationship management process: its measurement and impact on performance. J Mark Res 41(3):293–305
Renko S, Ficko D (2010) New logistics technologies in improving customer value in retailing service. J Retail Consum Serv 17(3):216–223
Saradhi VV, Palshikar GK (2011) Employee churn prediction. Expert Syst Appl 38(3):1999–2006
Slater SF, Narver JC (2000) Intelligence generation and superior customer value. J Acad Mark Sci 28(1):120–127
Tan PN, Steinbach M, Kumar V (2006) Introduction to data mining. Addison Wesley, Boston
Tsai CF, Chen MY (2010) Variable selection by association rules for customer churn prediction of multimedia on demand. Expert Syst Appl 37(3):2006–2015
Tsai CF, Lu YH (2009) Customer churn prediction by hybrid neural networks. Expert Syst Appl 36(10):12547–12553
Van den Poel D, Larivière B (2004) Customer attrition analysis for financial services using proportional hazard models. Eur J Oper Res 157(1):196–217
Verbeke W, Dejaeger K, Martens D, Hur J, Baesens B (2012) New insights into churn prediction in the telecommunication sector: a profit driven data mining approach. Eur J Oper Res 218(1):211–229
Waters D (2003) Logistics: an introduction to supply chain management. Palgrave Macmillan, New York City
Wei CP, Chiu IT (2002) Turning telecommunications call details to churn prediction: a data mining approach. Expert Syst Appl 23(2):103–112
Wei JT, Lin SY, Weng CC, Wu HH (2012) A case study of applying LRFM model in market segmentation of a children’s dental clinic. Expert Syst Appl 39(5):5529–5533
Xu M, Walton J (2005) Gaining customer knowledge through analytical CRM. Ind Manag Data Syst 105(7):955–971
Yeh IC, Yang KJ, Ting TM (2009) Knowledge discovery on RFM model using Bernoulli sequence. Expert Syst Appl 36(3, Part 1):5866–5871
Yildiz H, Ravi R, Fairey W (2010) Integrated optimization of customer and supplier logistics at Robert Bosch LLC. Eur J Oper Res 207(1):456–464
Yu X, Guo S, Guo J, Huang X (2011) An extended support vector machine forecasting framework for customer churn in e-commerce. Expert Syst Appl 38(3):1425–1430
Zhang JQ, Dixit AA, Friedmann RR (2010) Customer loyalty and lifetime value: an empirical investigation of consumer packaged goods. J Mark Theory Pract 18(2):127–140

Acknowledgments

This research was supported by the National Science Council of the Republic of China under the Grant NSC 102-2410-H-194-104-MY2.

Author information

Authors and Affiliations

Department of Business Information Systems, Western Michigan University, 3344 Schneider Hall, Kalamazoo, MI, 49008-5412, USA
Kuanchin Chen
Department of Information Management, National Chung Cheng University, Chiayi, 62102, Taiwan, ROC
Ya-Han Hu & Yi-Cheng Hsieh

Corresponding author

Correspondence to Ya-Han Hu.

About this article

Cite this article

Chen, K., Hu, YH. & Hsieh, YC. Predicting customer churn from valuable B2B customers in the logistics industry: a case study. Inf Syst E-Bus Manage 13, 475–494 (2015). https://doi.org/10.1007/s10257-014-0264-1

Received: 15 October 2013
Revised: 28 June 2014
Accepted: 22 September 2014
Published: 02 October 2014
Issue Date: August 2015
DOI: https://doi.org/10.1007/s10257-014-0264-1
