Abstract
As information and communication technology has advanced, social media, e-commerce, review, and blogging websites have become important sources of knowledge. On these platforms, individuals share their thoughts, complaints, feelings, and views on a wide range of topics. Sentiment analysis, which seeks to identify the orientation of the sentiment present in source materials, is a key natural language processing (NLP) task that has received the attention of many researchers and practitioners. Most earlier studies in sentiment analysis focused on traditional machine learning (i.e., shallow learning) and, to some extent, deep learning algorithms. Recently, transformer-based models have been developed and applied in different application domains. These models have shown considerable potential to advance text classification and, in particular, sentiment analysis research. In this paper, we investigate the performance of transformer-based sentiment analysis models. The case study was performed on four Turkish datasets. First, preprocessing methods were used to remove links, numerals, meaningless characters, and punctuation from the data. Unsuitable data was eliminated after the preprocessing phase. Second, each dataset was split into two parts: 80% for training and 20% for testing. Finally, the transformer-based models BERT, ConvBERT, and ELECTRA, along with traditional deep learning and machine learning algorithms, were applied to classify sentences into two or three classes drawn from positive, neutral, and negative. Experimental results demonstrated that transformer-based models could provide superior performance in terms of F-score compared to the traditional machine learning and deep learning models.
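The preprocessing, 80/20 split, and F-score evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names (`clean_text`, `prepare`, `f1_score`), the regular expressions, the fixed random seed, and the interpretation of "unsuitable data" as samples that become empty after cleaning are all assumptions made for the example.

```python
import random
import re
import string

# Assumed patterns for links and numerals; the paper does not give its exact rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
DIGIT_RE = re.compile(r"\d+")
# Map every ASCII punctuation character to a space (Turkish letters are kept).
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def clean_text(text):
    """Remove links, numerals, and punctuation, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.split())


def prepare(samples, train_ratio=0.8, seed=42):
    """Clean (text, label) pairs, drop empty texts, and split 80/20."""
    cleaned = [(clean_text(text), label) for text, label in samples]
    # Assumption: "unsuitable" is approximated here as "empty after cleaning".
    cleaned = [(text, label) for text, label in cleaned if text]
    rng = random.Random(seed)
    rng.shuffle(cleaned)
    cut = int(len(cleaned) * train_ratio)
    return cleaned[:cut], cleaned[cut:]


def f1_score(y_true, y_pred, positive):
    """F-score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In practice the training portion would be passed to a classifier (for example, a fine-tuned Turkish BERT model) and `f1_score` would be computed on the held-out 20%; for the three-class setting, one common choice is to average the per-class scores.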
Acknowledgements
We thank Stefan Schweter for providing a fine-tuned Turkish BERT model to the community.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ozturk, O., Ozcan, A. (2023). Sentiment Analysis in Turkish Using Transformer-Based Deep Learning Models. In: Hemanth, D.J., Yigit, T., Kose, U., Guvenc, U. (eds) 4th International Conference on Artificial Intelligence and Applied Mathematics in Engineering. ICAIAME 2022. Engineering Cyber-Physical Systems and Critical Infrastructures, vol 7. Springer, Cham. https://doi.org/10.1007/978-3-031-31956-3_1
DOI: https://doi.org/10.1007/978-3-031-31956-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31955-6
Online ISBN: 978-3-031-31956-3
eBook Packages: Intelligent Technologies and Robotics (R0)