Sentiment Analysis: Choosing the Right Word Embedding for Deep Learning Model

Conference paper in Advanced Computing and Intelligent Technologies

Abstract

Sentiment analysis helps data analysts gauge public opinion, monitor brand and product reputation, perform nuanced market research, and understand customer experiences. It is the process of determining whether a given piece of text is positive, negative, or neutral, combining techniques from natural language processing and artificial intelligence to assign weighted sentiment scores to the entities within a sentence or passage. Word embeddings, also called feature vectors, represent text in a vector space; they solve the problem of mapping very large sparse vectors into a lower-dimensional dense space. Deep learning models require their text input to be encoded as such feature vectors. This paper explores various methods of generating feature vectors, focusing on the application of several pre-trained word embeddings to a deep learning model and comparing them through a range of evaluation metrics. The results show that Word2Vec outperforms the other word embeddings.
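The Python snippet below is a minimal sketch of the workflow the abstract describes: it loads pre-trained Word2Vec vectors via gensim, builds an embedding matrix, and feeds it into a small Keras LSTM sentiment classifier. The toy vocabulary, layer sizes, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: pre-trained Word2Vec embeddings feeding a Keras LSTM
# sentiment classifier. Vocabulary, layer sizes, and hyperparameters are
# illustrative assumptions, not the configuration used in the paper.
import numpy as np
import gensim.downloader as api
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.initializers import Constant

# 300-dimensional Word2Vec vectors trained on Google News.
w2v = api.load("word2vec-google-news-300")

# Toy vocabulary; index 0 is reserved for padding.
vocab = ["good", "bad", "movie", "plot"]
word_index = {word: i + 1 for i, word in enumerate(vocab)}

# Embedding matrix: row i holds the pre-trained vector for word i.
embedding_dim = 300
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    if word in w2v:                      # out-of-vocabulary rows stay zero
        embedding_matrix[i] = w2v[word]

# Frozen embedding layer followed by an LSTM and a binary output.
model = Sequential([
    Embedding(input_dim=embedding_matrix.shape[0],
              output_dim=embedding_dim,
              embeddings_initializer=Constant(embedding_matrix),
              trainable=False),          # keep pre-trained vectors fixed
    LSTM(64),
    Dense(1, activation="sigmoid"),      # positive vs. negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(padded_sequences, labels, ...) would then train the classifier.
```

Swapping in a different pre-trained embedding (e.g., GloVe or fastText) only changes how embedding_matrix is filled, which is what makes the kind of comparison the paper performs straightforward.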



Author information

Authors: Sarita Bansal Garg and V. V. Subrahmanyam

Corresponding author: Sarita Bansal Garg


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Garg, S.B., Subrahmanyam, V.V. (2022). Sentiment Analysis: Choosing the Right Word Embedding for Deep Learning Model. In: Bianchini, M., Piuri, V., Das, S., Shaw, R.N. (eds) Advanced Computing and Intelligent Technologies. Lecture Notes in Networks and Systems, vol 218. Springer, Singapore. https://doi.org/10.1007/978-981-16-2164-2_33
