Abstract
Sentiment analysis helps data analysts gauge public opinion, monitor brand and product reputation, perform nuanced market research, and understand customer experiences. It is the process of determining whether a particular text is positive, negative, or neutral. This process combines techniques from natural language processing and artificial intelligence to assign weighted sentiment scores to the entities within a sentence or piece of text. Word embeddings, or feature vectors, represent text in a vector space; they solve the problem of mapping very large sparse vectors into a lower-dimensional dense space. Deep learning models require text input to be encoded as feature vectors, and various methods of generating such vectors are explored. The paper focuses on applying various pre-trained word embeddings to a deep learning model and compares them through several evaluation metrics. The results obtained showed that Word2Vec outperformed the other word embeddings.
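The usual way pre-trained embeddings feed a deep learning model is as an embedding matrix whose row i holds the vector for word i of the task vocabulary. A minimal sketch of that step (not taken from the paper; the tiny 4-dimensional vectors stand in for real Word2Vec or GloVe files, and `build_embedding_matrix` is an illustrative helper):

```python
# Illustrative stand-in for pre-trained word vectors; real Word2Vec/GloVe
# vectors would be loaded from a file and have 100-300 dimensions.
pretrained = {
    "good":  [0.8, 0.1, 0.3, 0.5],
    "bad":   [-0.7, 0.2, 0.1, -0.4],
    "movie": [0.1, 0.9, -0.2, 0.0],
}
DIM = 4

def build_embedding_matrix(vocab, vectors, dim):
    """Row i holds the pre-trained vector for vocab[i].

    Out-of-vocabulary words get a zero vector, a common simple choice
    (random initialization is another option).
    """
    return [vectors.get(word, [0.0] * dim) for word in vocab]

vocab = ["good", "movie", "unseenword"]
embedding_matrix = build_embedding_matrix(vocab, pretrained, DIM)
```

In a framework such as Keras, a matrix like this would typically initialize an `Embedding` layer, optionally frozen so the pre-trained vectors are not updated during training.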
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Garg, S.B., Subrahmanyam, V.V. (2022). Sentiment Analysis: Choosing the Right Word Embedding for Deep Learning Model. In: Bianchini, M., Piuri, V., Das, S., Shaw, R.N. (eds) Advanced Computing and Intelligent Technologies. Lecture Notes in Networks and Systems, vol 218. Springer, Singapore. https://doi.org/10.1007/978-981-16-2164-2_33
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2163-5
Online ISBN: 978-981-16-2164-2
eBook Packages: Intelligent Technologies and Robotics