Sentiment Analysis on Different Domains Using Machine Learning Algorithms

  • Conference paper
Advances in Data and Information Sciences

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 318)

Abstract

Sentiment analysis aims to determine a writer's view of a product, event, government policy, service, topic, or individual from the text they post on social media platforms such as Twitter and Facebook. This study considers two datasets from different domains and with different text lengths: STS-Gold (tweets) and IMDb (movie reviews). The objective is to determine which classification algorithm performs better on these two domains. We applied six machine learning algorithms (support vector machine, logistic regression, K-nearest neighbors, random forest, Naïve Bayes, and decision tree) and compared them on the basis of f-score, precision, recall, and accuracy. On the IMDb dataset, logistic regression performs best, with the highest accuracy (96.3%) and f-score (80.6%), followed by Naïve Bayes with an accuracy of 95.89% and an f-score of 80.05%. On the STS-Gold dataset, Naïve Bayes performs best, with an accuracy of 81.08% and an f-score of 42.45%, followed by logistic regression with an accuracy of 80.09% and an f-score of 41.52%. Overall, logistic regression and Naïve Bayes outperform the other four algorithms on both datasets.
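The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the scores were averaged, so this example assumes a binary f-score (F1) computed for a single positive class. The function name `score_binary`, the `BinaryScores` container, and the toy labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryScores:
    accuracy: float
    precision: float
    recall: float
    f_score: float


def score_binary(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells relative to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return BinaryScores(accuracy, precision, recall, f_score)


# Hypothetical toy labels (1 = positive sentiment, 0 = negative); not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = score_binary(y_true, y_pred)
print(scores)
```

On these toy labels, accuracy and f-score differ noticeably, which shows only that the two measures can diverge on imbalanced data; it makes no claim about the cause of the figures reported in the abstract.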

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ahuja, R., Sharma, S.C. (2022). Sentiment Analysis on Different Domains Using Machine Learning Algorithms. In: Tiwari, S., Trivedi, M.C., Kolhe, M.L., Mishra, K., Singh, B.K. (eds) Advances in Data and Information Sciences. Lecture Notes in Networks and Systems, vol 318. Springer, Singapore. https://doi.org/10.1007/978-981-16-5689-7_13
