Abstract
Mobile apps (applications) have become a popular form of software, and user reviews of these apps have become an important source of feedback. In their reviews, users may raise issues they encounter while using an app, such as a functional bug, network lag, or a request for a feature. Understanding these issues can help developers focus on users’ concerns, and can help users evaluate similar apps before downloading or purchasing them. However, a review does not indicate which types of issues it raises. Moreover, the volume of user reviews is huge, and their text is unstructured and informal. In this paper, we analyze 3 902 user reviews from 11 mobile apps in a Chinese app store, 360 Mobile Assistant, and uncover 17 issue types. We then propose an approach, CSLabel, that labels user reviews with the issue types they raise. CSLabel uses a cost-sensitive learning method to mitigate the effects of imbalanced data, and optimizes the setting of the kernel function of the support vector machine (SVM) classifier. Results show that CSLabel labels reviews with a precision of 66.5%, a recall of 69.8%, and an F1 measure of 69.8%. Compared with the state-of-the-art approach, CSLabel improves the precision by 14%, the recall by 30%, and the F1 measure by 22%. Finally, we apply our approach to two real scenarios: 1) we provide an overview of 1 076 786 user reviews from 1 100 apps in 360 Mobile Assistant, and 2) we find that some issue types are negatively correlated with users’ evaluation of apps.
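To illustrate why cost-sensitive learning helps with imbalanced data, the following is a minimal sketch, not the CSLabel implementation. It assumes a hypothetical binary task for a single issue type and uses the common "balanced" heuristic, in which each class is weighted inversely to its frequency; the abstract does not state the actual costs used by CSLabel. The function names and the toy labels are illustrative assumptions.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return one weight per class, inversely proportional to class frequency.

    This is the common 'balanced' heuristic: n_samples / (n_classes * count).
    It is an assumption for illustration, not the cost setting used by CSLabel.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Total misclassification cost: each error is charged the weight of its true class."""
    return sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)


# Hypothetical data for one issue type: 1 = the review raises the issue, 0 = it does not.
y_true = [1] * 10 + [0] * 90
weights = balanced_class_weights(y_true)

# A trivial classifier that always predicts the majority class.
always_negative = [0] * 100

plain_errors = sum(t != p for t, p in zip(y_true, always_negative))
cost = weighted_error(y_true, always_negative, weights)

print(weights)       # weight of the rare class is much larger than that of the common class
print(plain_errors)  # number of unweighted mistakes
print(cost)          # cost-sensitive penalty for the same mistakes
```

Without weighting, ignoring the rare issue type costs only 10 mistakes out of 100 reviews. With the weights above, the same mistakes account for half of the total possible cost, so a learner that minimizes weighted error can no longer ignore the minority class.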
Cite this article
Zhang, L., Huang, XY., Jiang, J. et al. CSLabel: An Approach for Labelling Mobile App Reviews. J. Comput. Sci. Technol. 32, 1076–1089 (2017). https://doi.org/10.1007/s11390-017-1784-1
DOI: https://doi.org/10.1007/s11390-017-1784-1