Purchase Prediction via Machine Learning in Mobile Commerce

Lv, Chao; Feng, Yansong; Zhao, Dongyan

doi:10.1007/978-3-319-50496-4_43

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10102)

Abstract

In this paper, we propose a machine learning approach to the purchase prediction task launched by the Alibaba Group. Specifically, we treat the task as a binary classification problem and explore five kinds of features to model how historical behaviors influence future purchases: user quality, item quality, category quality, user-item interaction and user-category interaction. Because the task is set on a mobile platform, we pay special attention to the time and spatial factors. Our approach ranked 26th among 7186 teams in this task.

1 Introduction

Recently, the Alibaba Group launched a purchase prediction task known as Ali Mobile Recommendation Algorithm (see Note 1). The task provides one month of users’ historical behavior data on the mobile platform and asks participants to predict which purchases will happen on the following day. The historical behaviors include click, collect, add-to-cart and payment. Conventional recommender-system methods [6], such as collaborative filtering and matrix factorization, do not perform well on this task.

In this paper, we propose a machine learning approach to this purchase prediction task instead of CF-based methods. We treat the task as a binary classification problem and explore five kinds of features that model the historical browsing behaviors from different aspects: user quality, item quality, category quality, user-item interaction and user-category interaction. These features reflect the willingness of users to buy items. In particular, we concentrate on the time and spatial factors. The time factor is incorporated into the feature families, with features extracted over different time spans. The spatial factor is used in the filtering module: for any purchase we predict, if the item is located far away from the user, we remove it from our prediction results.

2 Related Work

The most prominent technique in recommender systems is Collaborative Filtering (CF) [8]. The basic insight behind this technique is a sort of continuity in the realm of taste: if users Alice and Bob have the same utility for items 1 through k, then the chances are good that they will have the same utility for item \(k + 1\). Usually, these utilities are based on ratings that users have given to items with which they are already familiar. CF methods are roughly classified into two categories: memory-based approaches [5, 9] and model-based approaches [1, 3].

The Netflix million-dollar challenge boosted interest in CF and led to the publication of a number of new methods. Several matrix factorization techniques have been successfully applied to CF, including Singular Value Decomposition (SVD) [7] and Non-negative Matrix Factorization (NMF) [4]. A joint non-negative matrix factorization method proposed in [2] tries to solve the purchase prediction task launched by the Alibaba Group in 2014. The goal of that 2014 task was to predict purchase behaviors in the following month based on four months of historical behavior data.

3 Problem Definition

Notations: U stands for the set of users, I stands for the whole set of items, P stands for a subset of items, \(P \subseteq I\), and D stands for the user behavior data set over the whole set of items I. Our objective is to develop a recommendation model for users in U on the business domain P using the data D. Specifically, our goal is to predict purchase behaviors over P on the following day based on one month of behavior data in D.

4 Method

We treat the target problem as a binary classification problem, i.e. every (user, item) pair is assigned to one of two classes: “buy” and “not buy”. The framework is shown in Fig. 1.

First, in the training module, we learn a model from the behavior data over the whole set of items. The model should capture why users buy items on the following day and how their historical behaviors influence their future purchases. Specifically, if a user buys an item on the following day, this (user, item) pair is labeled as a positive instance, while pairs that are not bought are labeled as negative instances. Next, in the prediction module, the trained model is applied to the behavior data over the subset of items, and the pairs predicted as positive are taken as the purchases that will happen on the following day. Then, in the filtering module, we take the spatial factor into consideration and remove the positive pairs in which the item is too far from the user. Finally, in the evaluation module, the filtered predictions are compared with the real purchase behaviors to evaluate the performance of our approach.

4.1 Training Module and Prediction Module

The training set is the basic component of the training module, just as the test set is of the prediction module, but the two are generated slightly differently. Because we cannot use future information, i.e. we do not know the purchase behaviors on the whole set of items on the following day, we split off the behavior data of the last day of the month and use it to label the (user, item) pairs that appear in the remainder of the month. This process is illustrated in Fig. 2.
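The labeling step described above can be sketched as follows. This is a minimal illustration, not the authors’ implementation: the tuple layout (user_id, item_id, behavior, day), the behavior label "buy" and the function name build_labeled_pairs are assumptions made for the example.

```python
from datetime import date


def build_labeled_pairs(records, label_day):
    """Label (user, item) pairs seen before label_day by purchases on label_day.

    records: iterable of (user_id, item_id, behavior, day) tuples (assumed layout).
    Returns a dict mapping each historical pair to 1 (bought) or 0 (not bought).
    """
    history_pairs = set()
    bought_on_label_day = set()
    for user, item, behavior, day in records:
        if day < label_day:
            history_pairs.add((user, item))
        elif day == label_day and behavior == "buy":
            bought_on_label_day.add((user, item))
    # Only pairs with some history become training instances.
    return {pair: int(pair in bought_on_label_day) for pair in history_pairs}


records = [
    ("u1", "i1", "click", date(2014, 12, 16)),
    ("u1", "i1", "cart", date(2014, 12, 17)),
    ("u1", "i1", "buy", date(2014, 12, 18)),
    ("u1", "i2", "click", date(2014, 12, 17)),
    ("u2", "i1", "buy", date(2014, 12, 18)),  # no earlier history for this pair
]
labels = build_labeled_pairs(records, date(2014, 12, 18))
```

In this sketch the pair (u2, i1) is bought on the label day but never appears in the earlier data, so it does not become a training instance.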

4.2 Feature Engineering

Feature engineering is an important component of our machine learning approach, and we discuss the feature families in detail in this section.

For a given (user, item) pair, where the item belongs to a category, we consider the following five feature families: user quality, item quality, category quality, user-item interaction and user-category interaction.

User Quality estimates the purchasing power and activity level of users. In mobile commerce, some users are active and have a strong desire to purchase, while others are inactive and seldom buy items.

Last Login Day represents the last day on which a user logged in.
Conversion Ratio represents the ratio of a user’s purchase behaviors to his total behaviors.
Behaviors Statistics stands for the count of a user’s behaviors. The more a user browses, the more likely he is to buy. The example on the left of Fig. 3 illustrates the definition: a user clicks 3 times on the first day, once on the second day and twice on the fourth day, so the count of his total behaviors in the last four days equals \(3 + 1 + 0 + 2 = 6\).
Active Days means the count of days on which a user is active. This feature directly represents how active a user is. The example on the right of Fig. 3 illustrates the definition: a user logs in on the first day, the second day and the fourth day, so the count of his active days in the last four days equals \(1 + 1 + 0 + 1 = 3\).
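The four user-quality features can be computed along the following lines. This is a hedged sketch rather than the authors’ code: the record layout, the behavior label "buy", the function name user_quality and the choice to express Last Login Day as the number of days since the last activity are all assumptions. The sample data reproduces the Fig. 3 example.

```python
from datetime import date, timedelta


def user_quality(records, user, ref_day, window_days):
    """Compute user-quality features over the window_days days before ref_day.

    records: iterable of (user_id, item_id, behavior, day) tuples (assumed layout).
    """
    start = ref_day - timedelta(days=window_days)
    rows = [(behavior, day) for u, _item, behavior, day in records
            if u == user and start <= day < ref_day]
    if not rows:
        return {"days_since_last_login": None, "conversion_ratio": 0.0,
                "behavior_count": 0, "active_days": 0}
    days = [day for _behavior, day in rows]
    purchases = sum(1 for behavior, _day in rows if behavior == "buy")
    return {
        "days_since_last_login": (ref_day - max(days)).days,
        "conversion_ratio": purchases / len(rows),
        "behavior_count": len(rows),
        "active_days": len(set(days)),
    }


# The Fig. 3 example: 3, 1, 0 and 2 clicks on four consecutive days.
ref_day = date(2014, 12, 18)
records = []
for offset, clicks in enumerate([3, 1, 0, 2]):
    day = ref_day - timedelta(days=4 - offset)
    records += [("u1", "i1", "click", day)] * clicks

features = user_quality(records, "u1", ref_day, window_days=4)
print(features["behavior_count"], features["active_days"])  # 6 3
```

The item-quality and category-quality features could be obtained the same way by grouping on the item or category instead of the user.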

Item Quality reflects the popularity of an item. More popular items are more likely to be sold.

Last Browsed Day represents the last day on which an item was browsed.
Conversion Ratio represents the ratio of an item’s purchase behaviors to its total browsed behaviors.
Behaviors Statistics stands for the count of an item’s browsed behaviors. The more an item is browsed, the more likely it is to be sold.
Active Days means the count of days on which an item is browsed. This feature represents the popularity of an item.

Category Quality describes the popularity of a category. Its Last Browsed Day, Conversion Ratio, Behaviors Statistics and Active Days features are defined in the same way as those in Item Quality.

User-Item Interaction describes the interaction between the user and the item. It directly reflects how willing the user is to buy the item.

Behaviors Statistics represents the count of a user’s browsing behaviors on one item.
Active Days means the count of days on which the user browses the item.

User-Category Interaction represents the interaction between the user and the category. It is similar to User-Item Interaction, and its Behaviors Statistics and Active Days features are generated in the same way.
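As an illustration of the interaction features, the sketch below counts a user’s behaviors and active days on one item over several trailing time spans. The record layout, the function name interaction_features, the window lengths (1, 3 and 7 days) and the decision to count every behavior type are assumptions for the example; the paper does not specify them.

```python
from datetime import date, timedelta


def interaction_features(records, user, item, ref_day, windows=(1, 3, 7)):
    """Count behaviors and active days of `user` on `item` per trailing window.

    records: iterable of (user_id, item_id, behavior, day) tuples (assumed layout).
    """
    features = {}
    for n in windows:
        start = ref_day - timedelta(days=n)
        days = [day for u, i, _behavior, day in records
                if u == user and i == item and start <= day < ref_day]
        features[f"count_{n}d"] = len(days)
        features[f"active_days_{n}d"] = len(set(days))
    return features


ref_day = date(2014, 12, 18)
records = [
    ("u1", "i1", "click", date(2014, 12, 17)),
    ("u1", "i1", "cart", date(2014, 12, 17)),
    ("u1", "i1", "click", date(2014, 12, 15)),
    ("u1", "i1", "click", date(2014, 12, 12)),
    ("u1", "i2", "click", date(2014, 12, 17)),
]
features = interaction_features(records, "u1", "i1", ref_day)
```

For user-category interaction, the same function could be applied to records in which the item identifier is replaced by the item’s category.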

4.3 Filtering Module

This purchase prediction task is based on a typical O2O (online-to-offline) business model, in which users pay online and consume offline. This means that users are unwilling to buy items that are far away from them, because they have to travel there to consume them. Based on this observation, we propose the filtering module to remove pairs whose distance is too long. Specifically, for a (user, item) pair, if the distance between the location of the item and that of the user is greater than L, we assume that no purchase will happen on this pair. We set \(L = 100\) km empirically in this paper.
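A possible form of the distance filter is sketched below. It assumes that the geohash fields are standard geohash strings that can be decoded to latitude and longitude, and that distance is measured along the great circle; neither point is stated in the paper, and the decoding would not apply if the strings are obfuscated. The function names and the rule of keeping pairs with an unknown location are likewise assumptions.

```python
import math

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def decode_geohash(geohash):
    """Return the (lat, lon) centre of a standard geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    refine_lon = True  # bits alternate, starting with longitude
    for char in geohash:
        value = _BASE32.index(char)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if refine_lon:
                mid = (lon_lo + lon_hi) / 2
                lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
            else:
                mid = (lat_lo + lat_hi) / 2
                lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
            refine_lon = not refine_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def filter_by_distance(pairs, user_geohash, item_geohash, max_km=100.0):
    """Drop predicted (user, item) pairs whose distance exceeds max_km."""
    kept = []
    for user, item in pairs:
        u_hash, i_hash = user_geohash.get(user), item_geohash.get(item)
        if u_hash is None or i_hash is None:
            kept.append((user, item))  # assumption: keep pairs with unknown location
        elif haversine_km(*decode_geohash(u_hash), *decode_geohash(i_hash)) <= max_km:
            kept.append((user, item))
    return kept


user_geohash = {"u1": "ezs42"}
item_geohash = {"i1": "ezs42", "i2": "u4pruydqqvj"}
predicted = [("u1", "i1"), ("u1", "i2"), ("u2", "i1")]
kept = filter_by_distance(predicted, user_geohash, item_geohash)
```

Here the pair (u1, i2) is dropped because the two cells are far more than 100 km apart, while (u2, i1) is kept only because the user’s location is unknown.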

4.4 Reduced Data

Because our data set is very large, training on all of it would take an unacceptably long time. Hence, we use a reduced data set, which shortens training while maintaining prediction performance, as shown in Fig. 4.

Instead of using all the (user, item) pairs that occur during the month, we train and predict only on the pairs that show up in the last N days. \(N = 1\) means that we use the pairs from the last day only, while \(N = 30\) means that we use the pairs from the whole month.
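The reduction can be expressed as a simple selection of candidate pairs. The sketch below is illustrative only; the record layout, the function name candidate_pairs and the convention that the window ends the day before ref_day are assumptions.

```python
from datetime import date, timedelta


def candidate_pairs(records, ref_day, n_days):
    """Return the (user, item) pairs seen in the n_days days before ref_day.

    records: iterable of (user_id, item_id, behavior, day) tuples (assumed layout).
    """
    start = ref_day - timedelta(days=n_days)
    return {(user, item) for user, item, _behavior, day in records
            if start <= day < ref_day}


ref_day = date(2014, 12, 18)
records = [
    ("u1", "i1", "click", date(2014, 12, 17)),
    ("u1", "i2", "click", date(2014, 12, 16)),
    ("u2", "i3", "cart", date(2014, 12, 10)),
]
pairs_n1 = candidate_pairs(records, ref_day, n_days=1)
pairs_n2 = candidate_pairs(records, ref_day, n_days=2)
pairs_n30 = candidate_pairs(records, ref_day, n_days=30)
```

Only the set of candidate pairs shrinks with N in this sketch; whether the features of those pairs are still computed from the whole month is not fixed by the selection itself.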

5 Experiments

5.1 Data Description

The data consists of two parts. The first part is the data set D, the mobile behavior data of users over the whole set of items, with the following columns: \(user\_id\), \(item\_id\), \(behaviors\_type\), \(user\_geohash\), \(item\_category\) and time. The second part is the data set P, the data on the subset of items, with the following columns: \(item\_id\), \(item\_geohash\) and \(item\_category\). The training data contains the mobile behavior data (D) of a sample of users from November 18, 2014 to December 18, 2014. The evaluation data is the purchase data of these same users on the items in P on December 19, 2014. Summary statistics of the data are listed in Table 1.

Table 1. The statistics of the data set

5.2 Two Rule-Based Baselines

CartRule is the first strategy tried by most participants, and we select it as our first baseline. CartRule assumes that if a user adds an item to his cart and does not buy it that day, he is likely to buy it the next day. In addition, we propose CartRuleTime, which extends CartRule with a time factor. CartRuleTime assumes that if a user adds an item to his cart after m o’clock (\( m \in \{ 0, 1, ..., 23 \} \)) and does not buy it that day, he is likely to buy it the next day. According to our experiments, the performance is best when m is set to 15.
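Both baselines can be written as one small function. The sketch below reads “after m o’clock” as an add-to-cart event whose hour is at least m on the last observed day; this reading, the behavior labels "cart" and "buy", the record layout and the function name cart_rule are assumptions rather than details given in the paper.

```python
from datetime import date, datetime


def cart_rule(records, last_day, min_hour=0):
    """Predict pairs carted on last_day (at or after min_hour) but not bought that day.

    records: iterable of (user_id, item_id, behavior, timestamp) tuples (assumed layout).
    min_hour=0 gives CartRule; min_hour=m gives CartRuleTime.
    """
    carted, bought = set(), set()
    for user, item, behavior, timestamp in records:
        if timestamp.date() != last_day:
            continue
        if behavior == "cart" and timestamp.hour >= min_hour:
            carted.add((user, item))
        elif behavior == "buy":
            bought.add((user, item))
    return carted - bought


last_day = date(2014, 12, 18)
records = [
    ("u1", "i1", "cart", datetime(2014, 12, 18, 9)),
    ("u1", "i2", "cart", datetime(2014, 12, 18, 20)),
    ("u2", "i3", "cart", datetime(2014, 12, 18, 16)),
    ("u2", "i3", "buy", datetime(2014, 12, 18, 17)),   # already bought, so excluded
    ("u3", "i4", "cart", datetime(2014, 12, 17, 22)),  # carted on an earlier day
]
cart_rule_pred = cart_rule(records, last_day)
cart_rule_time_pred = cart_rule(records, last_day, min_hour=15)
```

With these sample records, CartRule predicts (u1, i1) and (u1, i2), while CartRuleTime with m = 15 keeps only (u1, i2).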

Table 2. Performance of different approaches

5.3 Result

We set \(N \in \{ 1, 2, 3, 4\}\) for the reduced data in this paper and apply three classifiers: LR (Linear Regression), RF (Random Forest) and GBDT (Gradient Boosting Decision Tree). Table 2 shows the prediction performance of the different approaches on this purchase prediction task. N1_LR means that \(N = 1\) and \(classifier = LR\), N4_GBDT means that \(N = 4\) and \(classifier = GBDT\), and the other names are read in the same way. CartRuleTime improves slightly on CartRule because it takes the time factor into consideration. The machine learning approaches we propose perform much better than the two rule-based methods, which supports the effectiveness of our approaches to some extent. Comparing the different classifiers, we can easily see that GBDT is the best choice. As N increases, the F1 score changes little. This indicates that the reduced data speeds up the machine learning process while maintaining performance.
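For reference, an F1 score over predicted and actual purchases can be computed as below. The sketch assumes that precision and recall are taken over sets of (user, item) pairs, which the paper does not spell out; the function name and the sample pairs are hypothetical.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall and F1 between two sets of (user, item) pairs."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


predicted = {("u1", "i1"), ("u1", "i2"), ("u2", "i3"), ("u3", "i4")}
actual = {("u1", "i1"), ("u2", "i3"), ("u4", "i5"), ("u5", "i6"), ("u6", "i7")}
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.5 0.4 0.444
```

F1 is the harmonic mean of precision and recall, so a method cannot score well by predicting very many pairs or very few.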

Table 3. Performance of different features

To assess the effectiveness and robustness of the feature families explored, we test a series of combinations of feature families on N3_GBDT, which gives the best result above (Table 3). U+I+C means that we use the quality features only, and UI+UC means that we use the interaction features only. U+I+C performs worse than UI+UC, which shows the importance of the interaction features. All performs better than UI+UC, which reflects the supporting role of the quality features.

6 Conclusion

We present a machine learning approach to the purchase prediction task launched by the Alibaba Group. Five kinds of features are explored to describe users’ willingness to purchase items. In particular, we take the time and spatial factors into consideration. Experimental results demonstrate the effectiveness of our proposed approach.

Notes

1. http://tianchi.aliyun.com

References

1. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005)
2. Ju, B., Ye, M., Qian, Y., Ni, R., Zhu, C.: Modeling behaviors of browsing and buying for alidata discovery using joint non-negative matrix factorization. In: 2014 Tenth International Conference on Computational Intelligence and Security (CIS), pp. 114–118. IEEE (2014)
3. Koren, Y., Bell, R., Volinsky, C., et al.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)
4. Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)
5. Linden, G., Smith, B., York, J.: Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput. 7(1), 76–80 (2003)
6. Lü, L., Medo, M., Yeung, C.H., Zhang, Y.C., Zhang, Z.K., Zhou, T.: Recommender systems. Phys. Rep. 519(1), 1–49 (2012)
7. Paterek, A.: Improving regularized singular value decomposition for collaborative filtering. In: Proceedings of KDD Cup and Workshop, vol. 2007, pp. 5–8 (2007)
8. Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009, 4 (2009)
9. Wang, J., De Vries, A.P., Reinders, M.J.: Unifying user-based and item-based collaborative filtering approaches by similarity fusion. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 501–508. ACM (2006)

Acknowledgement

The work reported in this paper was supported by the National Natural Science Foundation of China under Grants 61272344 and 61370116.

Author information

Authors and Affiliations

Institute of Computer Science and Technology, Peking University, Beijing, 100871, China
Chao Lv, Yansong Feng & Dongyan Zhao

Corresponding author

Correspondence to Yansong Feng.

Editor information

Editors and Affiliations

Microsoft Research Asia, Beijing, China
Chin-Yew Lin
Brandeis University, Waltham, Massachusetts, USA
Nianwen Xue
Peking University, Beijing, China
Dongyan Zhao
Fudan University, Shanghai, China
Xuanjing Huang
Peking University, Beijing, China
Yansong Feng

About this paper

Cite this paper

Lv, C., Feng, Y., Zhao, D. (2016). Purchase Prediction via Machine Learning in Mobile Commerce. In: Lin, CY., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_43

DOI: https://doi.org/10.1007/978-3-319-50496-4_43
Published: 02 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50495-7
Online ISBN: 978-3-319-50496-4
eBook Packages: Computer Science, Computer Science (R0)
