Abstract
Recurrent neural networks still deliver excellent results in sentiment analysis tasks; variants such as LSTM and Bidirectional LSTM have become a reference for building fast and accurate predictive models. However, this performance is difficult to obtain because of the complexity of the models and the choice of hyperparameters. LSTM-based models can easily overfit to the studied domain, and tuning the hyperparameters to obtain the desired model is the keystone of the training process. In this work, we study the sensitivity of a selection of LSTM-based models to various hyperparameters, and we highlight important aspects to consider when using similar models in the context of sentiment analysis.
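To make the object of study concrete, the following is a minimal NumPy sketch of a single LSTM cell, the unit whose hyperparameters (e.g. hidden size) such a sensitivity study varies. The weight shapes, gate ordering, and names here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step: input, forget, and output gates plus a
    candidate cell state, all computed from the current input x and
    the previous hidden state h_prev."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # stacked pre-activations, shape (4H,)
    i = sigmoid(z[0:H])             # input gate
    f = sigmoid(z[H:2 * H])         # forget gate
    o = sigmoid(z[2 * H:3 * H])     # output gate
    g = np.tanh(z[3 * H:4 * H])     # candidate cell state
    c = f * c_prev + i * g          # new cell state
    h = o * np.tanh(c)              # new hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 5, 4                         # embedding dim and hidden size: two key hyperparameters
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for _ in range(3):                  # unroll over a toy sequence of 3 token vectors
    x = rng.normal(size=D)
    h, c = lstm_step(x, h, c, W, U, b)
```

In practice a sentiment model stacks such cells (optionally bidirectionally) over embedded tokens and feeds the final hidden state to a classifier; the hidden size H, embedding dimension D, dropout rate, and optimizer settings are the hyperparameters whose interaction drives the sensitivity discussed above.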
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
El Haddaoui, B., Chiheb, R., Faizi, R., El Afia, A. (2022). On the Sensitivity of LSTMs to Hyperparameters and Word Embeddings in the Context of Sentiment Analysis. In: Lazaar, M., Duvallet, C., Touhafi, A., Al Achhab, M. (eds) Proceedings of the 5th International Conference on Big Data and Internet of Things. BDIoT 2021. Lecture Notes in Networks and Systems, vol 489. Springer, Cham. https://doi.org/10.1007/978-3-031-07969-6_40
DOI: https://doi.org/10.1007/978-3-031-07969-6_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07968-9
Online ISBN: 978-3-031-07969-6
eBook Packages: Intelligent Technologies and Robotics (R0)