Sentiment Analysis on Different Domains Using Machine Learning Algorithms

Ahuja, Ravinder; Sharma, S. C.

doi:10.1007/978-981-16-5689-7_13


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 318)


Abstract

In sentiment analysis, we try to determine a writer's view of a product, event, government policy, service, topic, individual, etc., from the text they write on social media platforms such as Twitter and Facebook. This study considers two datasets (STS-Gold and IMDb) from different domains and with texts of different lengths. Its objective is to determine which classification algorithm performs best on each. We applied six machine learning algorithms (support vector machine, logistic regression, K-Nearest Neighbors, random forest, Naïve Bayes, and decision tree) and compared them on the basis of f-score, precision, recall, and accuracy. On the IMDb dataset, logistic regression performs best, with the highest accuracy of 96.3% and an f-score of 80.6%; Naïve Bayes is second, with an accuracy of 95.89% and an f-score of 80.05%. On the STS-Gold dataset, Naïve Bayes gives the highest accuracy of 81.08% and an f-score of 42.45%; logistic regression is second, with an accuracy of 80.09% and an f-score of 41.52%. Overall, logistic regression and Naïve Bayes perform best among the six algorithms on both datasets.
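
The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary task and scores a single positive class. The abstract does not say which averaging scheme the chapter uses, so this is not a reproduction of the authors' evaluation. The function name `evaluate`, the `BinaryMetrics` container, and the toy labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_score: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for the `positive` class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f_score)


# Hypothetical toy labels (1 = positive sentiment, 0 = negative), not data
# from STS-Gold or IMDb.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

On these toy labels the accuracy is 0.7 while the f-score is 0.4, because most of the correct predictions fall in the majority negative class. This shows one way the two measures can diverge. Whether that explains the gap between accuracy and f-score reported for STS-Gold cannot be determined from the abstract alone.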

Author information

Authors and Affiliations

Electronics and Computer Discipline, IIT Roorkee, Saharanpur Campus, Saharanpur, Uttar Pradesh, India
Ravinder Ahuja & S. C. Sharma

Editor information

Editors and Affiliations

Krishna Engineering College, Ghaziabad, India
Shailesh Tiwari
National Institute of Technology Agartala, Agartala, Tripura, India
Munesh C. Trivedi
Department of Engineering and Science, University of Agder, Kristiansand, Norway
Mohan Lal Kolhe
Motilal Nehru National Institute of Technology, Allahabad, Uttar Pradesh, India
K.K. Mishra
Department of Computer Science and Engineering, RBS Engineering Technical Campus, Agra, Uttar Pradesh, India
Brajesh Kumar Singh

About this paper

Cite this paper

Ahuja, R., Sharma, S.C. (2022). Sentiment Analysis on Different Domains Using Machine Learning Algorithms. In: Tiwari, S., Trivedi, M.C., Kolhe, M.L., Mishra, K., Singh, B.K. (eds) Advances in Data and Information Sciences. Lecture Notes in Networks and Systems, vol 318. Springer, Singapore. https://doi.org/10.1007/978-981-16-5689-7_13

DOI: https://doi.org/10.1007/978-981-16-5689-7_13
Published: 08 February 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-5688-0
Online ISBN: 978-981-16-5689-7
eBook Packages: Intelligent Technologies and Robotics (R0)
