Deep Learning-Based Efficient Customer Segmentation for Online Retail Businesses

Benchmarks and Hybrid Algorithms in Optimization and Applications

Abstract

Growing competition has intensified the pressure on companies to attract new consumers and retain existing ones. As a result, businesses of every size need innovative tactics for consumer retention and attraction. An efficient approach is to segment consumers by behavior. Market segmentation clusters consumers into a small number of segments whose members share common behavior or characteristics. Segmentation lets a business target distinct groups of customers according to their characteristics, their purchasing behavior, or their interest in specific merchandise. It also significantly improves a firm's ability to understand the needs of each customer, which supports targeted customer service and customized marketing plans. The big data ecosystem and machine learning/deep learning offer suitable ways to improve the recognition and segmentation of consumers. Statistics-based approaches need to be updated and often lead to incorrect results. In this chapter, we address the problem of customer segmentation. We provide an in-depth treatment of unsupervised machine learning algorithms such as K-Means, K-Means++, and Principal Component Analysis, and of unsupervised deep learning algorithms such as AutoEncoders. Next, we implement customer segmentation by proposing a learning-based framework and applying it to an open-source dataset available on Kaggle. The proposed framework uses K-Means clustering and an AutoEncoder to segment the customers. We optimize the hyperparameters of these algorithms using deep learning frameworks such as TensorFlow and machine learning frameworks such as scikit-learn. The elbow method is used to find the optimal number of segments. The framework can also be extended to cluster groups in diverse businesses using semantic features, temporal affinity, and sentiment analysis.
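As a rough illustration of how K-Means with K-Means++ seeding and the elbow method fit together, the following is a minimal, dependency-free sketch. It is not the chapter's implementation: the synthetic two-feature "customers", the helper names (`make_blobs`, `kmeans`, `elbow`), the restart count, and the second-difference rule used to locate the elbow are all assumptions made for this example, and the AutoEncoder stage is omitted.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-Means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(squared_distance(p, c) for c in centres) for p in points]
        centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm; returns (centres, inertia), where inertia is the
    within-cluster sum of squared distances."""
    centres = kmeans_pp_init(points, k, rng)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centres[i]))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its members
        # (an empty cluster keeps its previous centre).
        new_centres = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centres[i]
            for i, members in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    inertia = sum(min(squared_distance(p, c) for c in centres) for p in points)
    return centres, inertia


def elbow(inertias):
    """Assumed heuristic: pick the k with the largest second difference of
    the inertia curve. Expects consecutive integer keys."""
    ks = sorted(inertias)
    return max(ks[1:-1], key=lambda k: inertias[k - 1] - 2 * inertias[k] + inertias[k + 1])


def make_blobs(rng, centres, n_per_blob=40, spread=0.4):
    """Hypothetical stand-in for scaled customer features."""
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centres
        for _ in range(n_per_blob)
    ]


data = make_blobs(random.Random(0), [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)])

# Keep the best of a few restarts for every candidate number of segments.
inertias = {
    k: min(kmeans(data, k, random.Random(seed))[1] for seed in range(5))
    for k in range(1, 7)
}
best_k = elbow(inertias)
print("inertia by k:", {k: round(v, 1) for k, v in inertias.items()})
print("elbow at k =", best_k)
```

In practice the same curve can be obtained from scikit-learn by fitting `KMeans(n_clusters=k, init="k-means++")` for each candidate `k` and reading its `inertia_` attribute. The elbow is often judged by eye from a plot rather than by a fixed rule like the one assumed here.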

Author information

Corresponding author

Correspondence to Jayesh Soni.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this chapter

Soni, J., Prabakar, N., Upadhyay, H. (2023). Deep Learning-Based Efficient Customer Segmentation for Online Retail Businesses. In: Yang, XS. (eds) Benchmarks and Hybrid Algorithms in Optimization and Applications. Springer Tracts in Nature-Inspired Computing. Springer, Singapore. https://doi.org/10.1007/978-981-99-3970-1_9
