1 Introduction

Daily online activities generate plenty of opportunities for businesses to understand consumer behavior on E-commerce platforms [1]. Indeed, consumers around the globe purchased $2.86 trillion on the web in 2018, which represented an 18% growth in online sales compared to the $2.43 trillion sold in 2017. Based on predictions of the purchasing behavior of customers, companies aim to anticipate their needs and provide personalized services [2, 3].

However, consumer behavior itself is well known as a complex pattern among the data mining community [4]. Aiming to predict the likelihood of such patterns, researchers have applied multiple probabilistic and machine learning (ML) models to historical data of online customers, resulting in somewhat reliable probabilities to predict the customer's next steps [5, 6]. That has also increased the complexity of analyzing this literature, given the multiple approaches and datasets available. Previous reviews and surveys related to this topic have usually focused on the specific literature of recommendation systems [7,8,9,10]. In contrast, our focus is on reducing the complexity of understanding the step before recommendations, which is the prediction of customers' next purchases, and on visualizing research opportunities in the field.

Therefore, the contributions of this paper are a novel conceptual framework for analysis and a research agenda. The framework systematically maps this literature regarding datasets adopted, predictive methods, and tasks with their applications. Specifically, the framework reveals three main tasks, namely, prediction of buying sessions, purchase decisions, and customer intents. Next, it provides eight applications enabled by those tasks. Finally, it illustrates three perspectives on predictive methodologies, and a research agenda with future work opportunities in the field.

The rest of this paper is organized as follows: Sect. 2 describes the research methodology of the literature review; Sect. 3 presents results and the main contributions, followed by final remarks in Sect. 4.

2 Research Methodology

To provide the framework and research agenda proposed, we performed a literature review following systematic guidelines from Watson (2002) [11] and Kitchenham et al. (2009) [12]. Inspired by [13], two research questions and a search query were developed to collect comprehensive literature within the research scope of purchase prediction in E-commerce. Then, the search query was applied in the following scientific databases, well known for containing literature in the field of behavior analytics: Scopus, Web of Science, Science Direct, EBSCO Host (Business Source Complete and Academic Search Complete), Emerald, IEEE Xplore, Association of Information Systems (AIS) library and ACM Digital Library.

  • Search Query: “(consumer or customer) AND (purchas* OR buy* OR sale* OR shop* OR behavi*) AND (predict* OR forecast*)”

The searches were performed in the abstract field, except for Web of Science (abstract, title, and keywords were used) and the AIS library (full text was used), due to the characteristics of their search engines. The search covered papers published from 2014 to 2019, only in the English language, which provided a total of 9824 exported proposals. The next step removed duplicates and applied an inclusion filter to retrieve only papers focused on the problem of consumer purchase behavior prediction. That provided a total of 429 papers.
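To make the structure of the search query concrete, the following is a minimal sketch that reads it as three AND-ed groups of OR-ed terms and checks it against an abstract string. The names `QUERY_GROUPS`, `term_to_regex`, and `matches_query` are hypothetical, and treating a trailing `*` as a prefix wildcard with whole-word matching otherwise is an assumption; each database implements its own matching rules, which this sketch does not reproduce.

```python
import re

# Each inner list is one OR-group; the groups are AND-ed together.
# This mirrors the search query stated above.
QUERY_GROUPS = [
    ["consumer", "customer"],
    ["purchas*", "buy*", "sale*", "shop*", "behavi*"],
    ["predict*", "forecast*"],
]


def term_to_regex(term):
    """Turn 'purchas*' into a word-prefix pattern and 'consumer' into a whole-word pattern."""
    if term.endswith("*"):
        return r"\b" + re.escape(term[:-1]) + r"\w*"
    return r"\b" + re.escape(term) + r"\b"


def matches_query(abstract):
    """Return True if every AND-group has at least one matching term (assumed semantics)."""
    text = abstract.lower()
    return all(
        any(re.search(term_to_regex(term), text) for term in group)
        for group in QUERY_GROUPS
    )


print(matches_query("We predict customer purchases from clickstream data."))
print(matches_query("Forecasting server load in data centers."))
```

The first abstract satisfies all three groups, while the second mentions forecasting but neither consumers nor purchases, so it would be filtered out under these assumed semantics.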

Next, the exclusion criteria were applied to remove papers not focused on the E-commerce context. At this stage, 35 papers remained. Based on those proposals, backward and forward searches were conducted via Google Scholar, adding 18 and 10 studies, respectively. The final number of papers for the extraction and mapping steps was 63. All those results are available in a GitHub repository (https://github.com/dougcirqueira/literature-review-purchase-prediction).

3 Results

Tables 1 and 2 provide non-exhaustive lists of the proposals selected for this literature review. Table 1 brings single task proposals (prediction of one outcome), while Table 2 provides multi-task proposals (prediction of multiple outcomes).

Table 1. Selected proposals in single task settings (A: Aggregation; R: Rule; P: Personalized Function; L: Learning; CDM: Classical Data Mining; PC: Probabilistic Classifier; DLC: Deep Learning Classifier; CF: Collaborative Filtering)
Table 2. Selected proposals in multi-task settings (A: Aggregation; R: Rule; P: Personalized Function; L: Learning; CDM: Classical Data Mining; PC: Probabilistic Classifier; DLC: Deep Learning Classifier; CF: Collaborative Filtering)

A Conceptual Framework of Analysis for Customer Purchase Prediction in E-commerce

A conceptual framework of analysis aims to ease the understanding of a complex topic by breaking it down into smaller, comprehensible components [48]. We adopted a systematic literature review approach to develop the proposed conceptual framework of analysis, illustrated in Fig. 1.

Fig. 1. A conceptual framework of analysis for the literature in behavior and predictive analytics for customer purchase prediction online. (Legend for applications enabled by tasks: A = Product Recommendations; B = Targeted Marketing; C = Layout Personalization; D = Server Load Balance Optimization; E = Stock Management; F = Real-time Customer Service; G = Purchase Trends Discovery; H = Offers Awareness)

The framework has six components. Component 1 defines the dataset types adopted in this literature. Component 2 classifies into dimensions the input data present in those datasets. Component 3 shows the methodologies adopted for constructing features out of the input data, illustrating how consumer behavior is modeled for predictive analytics. Component 4 introduces the predictive methods, summarized into four categories. Component 5 shows which tasks enable which applications from component 6, as identified in Subsect. 3.1. Details on each component are given under the research questions developed in the literature review.

The two research questions developed to conduct the systematic literature review guided the scoping of our findings. The results are presented according to those questions in Subsects. 3.1 and 3.2.

3.1 RQ 1. What Tasks and Applications Have Been Addressed in the Problem of Consumer Purchase Behavior Prediction in E-Commerce?

This research question addresses components 5 and 6 of the proposed framework. It reveals the literature targeting three main tasks within the online purchase prediction problem. Every task has a different prediction outcome, described as follows:

  • Predict Customer Intent (PCI): Predict the intention of customer visits online. Examples of intention types reported in the literature are purchase-oriented or general [35], and browsing, searching, purchasing, and bouncing [37]. This task is essential for identifying similar groups of customers, and for applications in which customer segmentation is needed.

  • Predict Buying Session (PBS): Predict whether a current online user session will end with a purchase or not. This task is useful for applications that need to capture the general likelihood of user conversion during a visit online, without details regarding preferences for specific products.

  • Predict Purchase Decisions (PPD): Predict customers' purchase behavior concerning their buying decisions. For instance, to foresee what product or category a customer will buy; to predict the time or period likely to witness a purchase; or to predict the next amount customers are likely to spend in their purchases. PPD is the most complex task, as the aim is to predict fine-grained decisions. It is the ideal task for recommending specific products or services to customers.

Those three identified tasks enable a variety of business intelligence applications for online retailers, such as: A) Product Recommendations [29]; B) Targeted Marketing [16, 42]; C) Layout Personalization of E-commerce Landing Pages [17]; D) Load balance Optimization to Prioritize Quality of Service for Likely Buyers [14]; E) Stock Management Optimization of Products [28, 32]; F) Real-time Customer Service [49]; G) Purchase Trends Discovery [15]; H) Offers Awareness Based on the Detected Intention of Consumers [35].

3.2 RQ 2. What Methodologies Have Been Adopted to Predict Consumer Purchase Behavior Online?

This research question addresses components 1 to 4 of the proposed framework. It provides three perspectives on the predictive methodologies adopted in this literature.

Online Customer Behavior Datasets and their Features.

Customer behavior in E-commerce is captured through datasets of past online sessions and shopping logs, which are described in Table 3:

Table 3. Dataset types identified in the literature

The input data is further classified in the data layer, inspired by [2], into dimensions, each of which has specific input data features. Every dimension and its features help to explain and predict customer behavior from a different perspective, which brings benefits for predictive tasks on that data, as illustrated in Table 4.

Table 4. Classification of E-commerce data in dimensions and its benefits

Feature Construction for Purchase Prediction.

In this Subsection, we use a formal notation to explain the feature construction process. The input data features \( feat_{in} \) described previously serve as the basis for feature construction, from which new descriptive features \( feat_{eng\_out} \) are derived to capture historical patterns that can indicate the probability of purchase. Two methodologies are adopted to create descriptive features. The first is Feature Engineering, where domain expertise is used to devise a function or rule \( f_{eng} \) to apply to the input data features \( feat_{in} \) present in a dataset \( D \), which are related to a current customer transaction \( Ti \). This process can be shaped by conditions \( cond_{1}, \ldots, cond_{n} \) to capture relationships between multiple input data features. The Feature Engineering process is described in Eq. 1.

$$ feat_{eng\_out} = f_{eng}\left( D,\, Ti,\, feat_{in},\, cond_{1},\, \ldots,\, cond_{n} \right) $$
(1)
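As an illustration of Eq. 1, the following is a minimal sketch of one possible \( f_{eng} \): it counts the past transactions in \( D \) that share the values of the current transaction \( Ti \) on the selected input features and satisfy every condition. The record fields (`customer`, `category`, `time`, `purchased`) and the two condition functions are hypothetical and chosen only for the example; they are not taken from the reviewed proposals.

```python
def f_eng(D, Ti, feat_in, *conds):
    """Count past transactions in D that match Ti on feat_in and satisfy all conds."""
    return sum(
        1
        for t in D
        if t["time"] < Ti["time"]                      # only look at history
        and all(t[f] == Ti[f] for f in feat_in)       # same values on the input features
        and all(cond(t, Ti) for cond in conds)        # cond_1, ..., cond_n
    )


def was_purchase(t, current):
    """Condition: the past transaction ended with a purchase."""
    return t["purchased"] is True


def is_recent(t, current):
    """Condition: the past transaction happened at most 3 time units ago."""
    return current["time"] - t["time"] <= 3


# Toy dataset D (hypothetical records) and current transaction Ti.
D = [
    {"customer": "c1", "category": "phones", "time": 1, "purchased": True},
    {"customer": "c1", "category": "phones", "time": 2, "purchased": False},
    {"customer": "c1", "category": "cases", "time": 3, "purchased": True},
    {"customer": "c2", "category": "phones", "time": 4, "purchased": True},
]
Ti = {"customer": "c1", "category": "phones", "time": 5, "purchased": None}

past_purchases_in_category = f_eng(D, Ti, ["customer", "category"], was_purchase)
recent_visits_by_customer = f_eng(D, Ti, ["customer"], is_recent)

print(past_purchases_in_category)  # 1
print(recent_visits_by_customer)   # 2
```

Passing the conditions as separate functions keeps the rule itself generic: the same counting function yields different engineered features depending on which input features and conditions are supplied.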

The second methodology for feature construction is Feature Learning, in which the function \( f_{learn} \) that creates new features is an unsupervised ML model, which automatically derives new explanatory features. For instance, researchers extract Latent Representations, or hidden layer weights \( feat_{learn\_out} \), learned during the training of a Recurrent Neural Network or Autoencoder model, which carry hidden correlations and relationships between variables. This learning process is conditioned by the target outcome \( targ_{out} \) and a cost function \( cost_{f} \), which represent, respectively, the desired outcome of the learned representation and how the weights of the hidden layer will be learned. The desired outcome is, for instance, a binary label for predicting buying sessions, or a multi-category label for predicting purchase decisions regarding products. The Feature Learning process is described in Eq. 2.

$$ feat_{learn\_out} = f_{learn}\left( D,\, Ti,\, feat_{in},\, targ_{out},\, cost_{f} \right) $$
(2)
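As an illustration of Eq. 2, the following is a minimal sketch of a possible \( f_{learn} \). It swaps in a small one-hidden-layer network, trained with plain stochastic gradient descent, for the Recurrent Neural Network or Autoencoder mentioned above, and it returns the hidden-layer activations for \( Ti \) as the learned features. The cost function is fixed to binary cross-entropy, the target is a binary buying-session label, and the field names (`clicks`, `minutes`, `bought`) and hyperparameters are assumptions made only for the example.

```python
import math
import random


def f_learn(D, Ti, feat_in, targ_out, hidden=2, epochs=300, lr=0.5, seed=0):
    """Train a tiny network on D; return (hidden activations for Ti, loss per epoch).

    cost_f is fixed to binary cross-entropy in this sketch.
    """
    rng = random.Random(seed)
    n = len(feat_in)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def encode(row):
        """Map the input features of a row to the hidden representation."""
        x = [row[f] for f in feat_in]
        h = [
            math.tanh(sum(w * v for w, v in zip(W1[j], x)) + b1[j])
            for j in range(hidden)
        ]
        return x, h

    history = []
    for _ in range(epochs):
        total = 0.0
        for row in D:
            x, h = encode(row)
            z = sum(w * a for w, a in zip(w2, h)) + b2
            p = 1.0 / (1.0 + math.exp(-z))            # predicted purchase probability
            y = row[targ_out]
            total += -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
            dz = p - y                                # gradient of the cost w.r.t. z
            for j in range(hidden):
                dh = dz * w2[j] * (1 - h[j] ** 2)     # back-propagate through tanh
                w2[j] -= lr * dz * h[j]
                for k in range(n):
                    W1[j][k] -= lr * dh * x[k]
                b1[j] -= lr * dh
            b2 -= lr * dz
        history.append(total / len(D))
    return encode(Ti)[1], history


# Toy sessions (hypothetical, features already scaled to [0, 1]).
D = [
    {"clicks": 0.1, "minutes": 0.2, "bought": 0},
    {"clicks": 0.2, "minutes": 0.1, "bought": 0},
    {"clicks": 0.8, "minutes": 0.9, "bought": 1},
    {"clicks": 0.9, "minutes": 0.7, "bought": 1},
]
Ti = {"clicks": 0.85, "minutes": 0.8}

feat_learn_out, history = f_learn(D, Ti, ["clicks", "minutes"], "bought")
print(feat_learn_out)
print(history[0], history[-1])
```

The returned vector plays the role of \( feat_{learn\_out} \): it can be passed as input to any of the predictive methods, in place of or alongside engineered features.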

Table 5 illustrates examples of those methodologies in action.

Table 5. Methodologies for Customer behavior Feature Construction

Predictive Methods.

Researchers have been working with ML and probabilistic methods to predict the complex customer purchase behavior online [5]. Based on the conceptual framework, we summarize the predictive models adopted into four categories, with their advantages and disadvantages. Examples of particular methods within each category are provided, specifically for purchase prediction in E-commerce. We illustrate in Table 6 how those models compare concerning their characteristics and suitability for the tasks identified in Subsect. 3.1.

The characteristics analyzed are a) Suitability for Real-Time: concerning the usual time required for training, if any, and for providing predictions in production settings; b) Interpretability: concerning the capacity to explain why the model gives a predicted outcome; c) Sequential Modeling: whether a predictive method is able to model customer activities sequentially, which is important when researchers want to explicitly analyze the influence of past purchases on current customer actions; d) Feature Construction Function: which methodology and function are usually adopted for feature construction when applying the predictive method analyzed.

Table 6. Predictive methodologies

Details regarding each predictive methodology are provided as follows.

  • Probabilistic Classifier: A model that uses probability theory to model the uncertainty in the data. Advantage: It usually requires only a few engineered features, which makes it a feasible choice for real-time settings, and it can naturally model short-term sequential patterns in events. Disadvantage: It is difficult to capture the effects of long-term patterns in customer behavior; this capacity can be achieved only at the cost of increased model complexity and processing time.

      • Bayesian Classifier: Estimates conditional probability distributions based on the influence of given features to output a specific prediction. In [42], the authors predict purchase decisions by analyzing the influence of sequential purchases and of the number and duration of visits to compute probabilities for the customer's choice of a specific product or time of purchase.

      • Hidden Markov Model: A generalization of a probabilistic mixture model, where the probability of an event, such as a purchase, depends on hidden variables through a sequential Markov process modeling previous customer actions [24].

  • Classical Data Mining Classifiers: Those models work by learning similarities between feature vectors of buying sessions, intents, and purchase decisions. Advantage: Most of the approaches in this category perform well even with small or medium dataset sizes, which makes some of them suitable for real-time settings. Disadvantage: Authors adopting this methodology usually need to perform extensive Feature Engineering to achieve good prediction results, including for detecting sequential patterns.

      • Unsupervised Clustering: Unlabeled sessions and purchase transactions are input to a model that discovers patterns in similar instances and groups them for providing predictions. For example, [37, 38] adopt the K-means algorithm to segment customers based on variables regarding their clickstream behavior.

      • Association Rules: Enable the discovery of associations between features, which can reveal rules with high confidence to indicate probabilities of sessions ending with a purchase [16].

      • Instance-Based: Models that classify new data instances based on similar cases and their features. In [22], the authors employ K-Nearest Neighbor to predict buying sessions according to previous examples of sessions with similar features that ended with a purchase.

      • Linear ML: Machine learning models that assume a linear decision boundary between buying and non-buying sessions, or between feature vectors representing purchase decisions of customers. However, the kernel trick can be adopted to detect non-linear relationships between features [50], or Feature Engineering can be used to create combinations of multiple features [27].

      • Ensemble Learning: Stacking of various weak predictive models to build a robust model for providing predictions [20].

  • Deep Learning Classifiers: ML models that can naturally learn complex and non-linear decision boundaries and relationships in the dataset. Advantage: These models can be powerful in modeling long-term influences of past customer events on current decisions [25], and do not require extensive Feature Engineering, as they have Feature Learning built in. Disadvantage: This method usually requires massive amounts of data, which makes it hard to use with new customers who have few purchases [40, 41]. The interpretability of predictions is also an issue.

  • Collaborative Filtering: Classical model applied in recommendation systems. This approach models customers and products in a utility matrix based on their clicks, views, reviews, or purchases, which is then factorized to provide latent factors representing the likelihood of customers choosing similar products [29, 30, 44]. It is the model most often adopted by researchers focusing on predicting purchase decisions, but it is also utilized in predicting buying sessions [14]. Advantage: One of the most flexible approaches for multiple types of features in different E-commerce platforms. It also scales well as more customers and products are added to a dataset. Disadvantage: The utility matrix is usually sparse, as most customers have not viewed many of the products available on an E-commerce platform. Therefore, it is challenging to predict purchases for new customers, and it is important to consider Feature Engineering for creating features that can overcome such issues.
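To illustrate the Collaborative Filtering idea described above, the following is a minimal sketch that factorizes a small, sparse utility matrix with stochastic gradient descent and scores unobserved customer-product pairs from the learned latent factors. The function names `factorize` and `score`, the toy interactions (1 = purchased, 0 = viewed without purchase), and all hyperparameters are assumptions made for the example; it is not the Latent Factor Model or Matrix Factorization implementation of any specific reviewed proposal.

```python
import random


def factorize(observed, n_users, n_items, k=2, epochs=1000, lr=0.05, reg=0.01, seed=0):
    """Learn latent factors P (customers) and Q (products) from observed entries only."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)   # regularized SGD step
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def score(P, Q, u, i):
    """Predicted affinity of customer u for product i."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


# Sparse utility matrix as (customer, product, value) triples.
# Customers 0 and 1 tend to buy products 0 and 1; customers 2 and 3 buy products 2 and 3.
observed = [
    (0, 0, 1), (0, 1, 1), (0, 2, 0), (0, 3, 0),
    (1, 0, 1), (1, 2, 0),
    (2, 0, 0), (2, 1, 0), (2, 2, 1), (2, 3, 1),
    (3, 1, 0), (3, 2, 1), (3, 3, 1),
]

P, Q = factorize(observed, n_users=4, n_items=4)

# Customer 1 has never interacted with products 1 and 3.
print(score(P, Q, 1, 1), score(P, Q, 1, 3))
```

In this toy setting, customer 1 resembles customer 0, so the unobserved product 1 should receive a higher score than product 3. A customer with no observed entries at all would keep the random initial factors, which is the cold-start difficulty noted in the disadvantage above.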

3.3 State-of-the-Art Performance

To have a fair comparison between the identified predictive methodologies, for every specific task, we grouped the existing proposals by the predictive methodology adopted. We evaluated only the F1 score and the Area Under the Receiver Operating Characteristic Curve (AUC) reported by those proposals. Our choice of those metrics considers the fact that datasets in this literature are usually unbalanced, with few occurrences of purchases, and F1 and AUC scores are known to be well suited to unbalanced scenarios [51]. Table 7 illustrates the average results obtained by the predictive methodologies for the tasks where they can be applied. Performance for predicting customer intent is not reported, as the authors of those proposals did not adopt the mentioned metrics.
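The motivation for preferring F1 and AUC on unbalanced data can be seen in a small worked example. The sketch below is a plain-Python illustration on hypothetical labels (2 purchases in 20 sessions), not a computation on any dataset from the reviewed literature; the function names are chosen for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of sessions classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (purchase) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random buyer is scored above a random non-buyer (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 20 sessions, only 2 of which end with a purchase.
y_true = [1, 1] + [0] * 18

# A trivial model that never predicts a purchase.
never_buy = [0] * 20
# A model that finds one buyer, misses the other, and raises one false alarm.
some_signal = [1, 0, 1] + [0] * 17
some_signal_scores = [0.9, 0.4, 0.6] + [0.1] * 17

print(accuracy(y_true, never_buy), f1_score(y_true, never_buy))      # 0.9 0.0
print(accuracy(y_true, some_signal), f1_score(y_true, some_signal))  # 0.9 0.5
print(auc(y_true, [0.0] * 20))                                       # 0.5
print(auc(y_true, some_signal_scores))
```

Both models reach the same accuracy of 0.9, yet only F1 and AUC separate the model that detects a buyer from the one that never predicts a purchase.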

Table 7. State of the Art Results for Predicting Buying Sessions and Purchase Decisions

Classical Data Mining Classifiers are the current state-of-the-art for Predicting Buying Sessions, specifically Ensemble learners [20] and Support Vector Machines [19]. Those are followed by Deep Learning classifiers. It is interesting to observe the drop in performance when moving to the task of Predicting Purchase Decisions, which suggests it is the most complex task due to the fine-grained predictions it aims at. For this task, the classical Collaborative Filtering approach performs best, represented by a Latent Factor Model [30] and Matrix Factorization [31]. Those are followed by Classical Data Mining and Deep Learning classifiers.

3.4 Research Agenda

We derive a research agenda based on the targeted research gaps and findings of this review, containing the following directions:

  • Sequential Learning: Few proposals have explored sequential ML models in this literature. Examples are recurrent neural networks, which are only adopted in three studies [25, 33, 40]. Such models are indicated to learn the evolving consumer behavior over time, and sequential patterns such as “She is buying a phone case after purchasing a smartphone”.

  • Interpretability: Most authors reporting higher performance apply Classical Data Mining and Deep Learning classifiers, which also have a black-box nature. Indeed, interpretability does not seem to be the focus of this recent literature.

  • Customer Data and General Data Protection Regulation (GDPR): Given the rise of privacy policies with GDPR in Europe, more research is needed on the trade-off between the amount of data required and the protection of customers’ privacy, regarding the performance of purchase prediction tasks.

  • Dataset for benchmarking: There is no clear consensus on datasets for state-of-the-art comparison in this literature, as many studies have used private data. However, we observed a significant adoption of the Recsys 2015 challenge data [17, 25, 31, 39, 40, 42], which suggests this dataset as a candidate in this regard.

  • Evaluation in Multiple E-commerce Platforms: Most researchers evaluate their proposed predictive methods in a single dataset, or focus on specific E-commerce settings. Therefore it is hard to argue their methodologies are general for multiple E-commerce platforms, such as general-purpose and specialized marketplaces.

  • Feature Engineering and Feature Learning: The well-performing proposals adopting Classical ML models invested heavily in Feature Engineering. However, more investigation of Feature Learning is recommended in this area, as well as of the combination of those two methodologies in online purchase prediction.

  • Creation Process of Personalized Feature Engineering Functions: Some researchers explore the creation of personalized functions in Feature Engineering, such as the popularity of a product [17], the diversity of customer behavior [18, 35] and graph metrics [21]. It could be relevant to map this creation process, and help other researchers in establishing such novel features for customer behavior online.

  • A Framework for Purchase Prediction Tasks in E-commerce: Existing proposals focus on one of the three tasks identified, but a view of how those tasks can work together is lacking. Therefore, further research could be undertaken to provide a framework that aligns the tasks identified in this review.

4 Final Remarks

This study presents a systematic literature review of recent proposals in consumer purchase prediction in E-commerce. A novel conceptual framework provides a lens on the state-of-the-art of this field. Despite the broad literature, there is still a need for an in-depth investigation of specific directions. Therefore, a research agenda is provided, illustrating potential future work demands.

A next step would be to adopt a benchmark dataset and evaluate predictive methodologies in multi-task settings, such as forecasting the next product a customer will buy, the purchase time, or the amount a customer will likely spend. Therefore, it is relevant to investigate the construction of a framework for purchase prediction that considers the combination of the three tasks identified in this review.