CSLabel: An Approach for Labelling Mobile App Reviews

Zhang, Li; Huang, Xin-Yue; Jiang, Jing; Hu, Ya-Kun

doi:10.1007/s11390-017-1784-1

Regular Paper
Published: 08 December 2017

Volume 32, pages 1076–1089 (2017)

Journal of Computer Science and Technology

Abstract

Mobile apps (applications) have become a popular form of software, and user reviews of apps have become an important source of feedback. In their reviews, users may raise issues they encounter while using an app, such as a functional bug, network lag, or a feature request. Understanding these issues can help developers focus on users’ concerns and help users evaluate similar apps before downloading or purchasing them. However, the types of issues raised in a given review are not known in advance. Moreover, the volume of user reviews is huge, and review text is unstructured and informal. In this paper, we analyze 3 902 user reviews from 11 mobile apps in a Chinese app store, 360 Mobile Assistant, and uncover 17 issue types. We then propose CSLabel, an approach that labels user reviews according to the issue types they raise. CSLabel uses a cost-sensitive learning method to mitigate the effects of imbalanced data, and optimizes the kernel function setting of the support vector machine (SVM) classifier. Results show that CSLabel labels reviews with a precision of 66.5%, a recall of 69.8%, and an F₁ measure of 69.8%. Compared with the state-of-the-art approach, CSLabel improves precision by 14%, recall by 30%, and the F₁ measure by 22%. Finally, we apply our approach to two real scenarios: 1) we provide an overview of 1 076 786 user reviews from 1 100 apps in 360 Mobile Assistant, and 2) we find that some issue types are negatively correlated with users’ evaluation of apps.
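
To make the two technical ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) one common way to derive misclassification costs for an imbalanced issue type, and (b) how precision, recall, and F₁ are computed for a single issue label. The function names, the inverse-frequency cost heuristic, and the toy data are all assumptions made for illustration; the abstract does not state which cost setting or kernel CSLabel actually uses.

```python
from collections import Counter


def balanced_costs(label_flags):
    """Return a misclassification cost per class, inversely proportional
    to class frequency (the n / (k * count) heuristic).

    Assumption: this heuristic is illustrative only; it is not claimed
    to be the cost setting used by CSLabel.
    """
    counts = Counter(label_flags)
    n, k = len(label_flags), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for one binary label
    (1 = the review raises this issue type, 0 = it does not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical toy data: 10 reviews, only 2 of which raise the issue type.
y_true = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

costs = balanced_costs(y_true)               # rare class gets the higher cost
scores = precision_recall_f1(y_true, y_pred)
```

In a cost-sensitive setup, costs like these would typically be passed to the classifier as per-class weights, so that misclassifying a review from a rare issue type is penalized more heavily than misclassifying one from a frequent type.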

Author information

Authors and Affiliations

State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China
Li Zhang, Xin-Yue Huang, Jing Jiang & Ya-Kun Hu

Corresponding author

Correspondence to Jing Jiang.

Electronic supplementary material

ESM 1 (PDF 211 kb)

About this article

Cite this article

Zhang, L., Huang, XY., Jiang, J. et al. CSLabel: An Approach for Labelling Mobile App Reviews. J. Comput. Sci. Technol. 32, 1076–1089 (2017). https://doi.org/10.1007/s11390-017-1784-1

Received: 20 April 2017
Revised: 15 September 2017
Published: 08 December 2017
Issue Date: November 2017
DOI: https://doi.org/10.1007/s11390-017-1784-1
