Erratum to: Cogn Comput DOI 10.1007/s12559-016-9415-7
Unfortunately, the original version of the article was published with a few errors in the Abstract, Conclusion, Acknowledgments, and References.
In addition, Dr. Erik Cambria is a co-corresponding author of the article.
The corrected versions of the sections are given below.
Abstract
With the advent of the internet, people actively express their opinions about products, services, events, political parties, etc., in social media, blogs, and website comments. The amount of research work on sentiment analysis is growing explosively. However, the majority of research efforts are devoted to English language data, while a great share of information is available in other languages. We present a state-of-the-art review on multilingual sentiment analysis. More importantly, we compare our own implementation of existing state-of-the-art approaches on common data. Precision observed in our experiments is typically lower than that reported by the original authors, which we attribute to lack of detail in the original presentation of those approaches. Thus, we compare the existing works by what they really offer to the reader, including whether they allow for accurate implementation and for reliable reproduction of the reported results.
Conclusion
We gave an overview of state-of-the-art multilingual sentiment analysis methods. We described data pre-processing, typical features, and the main resources used for multilingual sentiment analysis. Then, we discussed different approaches applied by their authors to English and other languages. We have classified these approaches into corpus-based, lexicon-based, and hybrid ones.
The real value of any sentiment analysis technique for the research community lies in the results that can be reproduced with it, not in the results its original authors reportedly obtained with it. To evaluate this real value, we have implemented eleven selected approaches as closely as we could, based on their descriptions in the original papers, and tested them on the same two corpora. In the majority of cases, we obtained lower results than those reported by the corresponding authors. We attribute this mainly to the incompleteness of the descriptions in the original papers. In some cases, though, the methods were developed for a specific domain, so comparison on our test corpora may not be fair to them. A lesson learnt was that, for a method to be useful for the research community, its authors should provide sufficient detail to allow its correct implementation by the reader.
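To make the idea of comparing approaches on common data concrete, the following is a minimal Python sketch, not the implementation used in the article. It scores two toy classifiers on the same labeled corpus using precision for the positive class. The lexicon, the corpus, and the names lexicon_classifier, always_positive, precision, and compare are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would rely on a proper sentiment lexicon.
TOY_LEXICON: Dict[str, int] = {
    "good": 1, "great": 1, "excellent": 1,
    "bad": -1, "poor": -1, "terrible": -1,
}


def lexicon_classifier(text: str) -> str:
    """Label a text by summing the polarity of the lexicon words it contains."""
    score = sum(TOY_LEXICON.get(token, 0) for token in text.lower().split())
    return "pos" if score > 0 else "neg"


def always_positive(text: str) -> str:
    """Trivial baseline that labels every text as positive."""
    return "pos"


def precision(predicted: List[str], gold: List[str], target: str = "pos") -> float:
    """Precision for `target`: correct `target` predictions / all `target` predictions."""
    true_pos = sum(1 for p, g in zip(predicted, gold) if p == target and g == target)
    predicted_target = sum(1 for p in predicted if p == target)
    return true_pos / predicted_target if predicted_target else 0.0


def compare(approaches: Dict[str, Callable[[str], str]],
            corpus: List[Tuple[str, str]]) -> Dict[str, float]:
    """Run every approach on the same corpus and report its precision."""
    texts = [text for text, _ in corpus]
    gold = [label for _, label in corpus]
    return {name: precision([clf(t) for t in texts], gold)
            for name, clf in approaches.items()}


# Hypothetical four-document corpus with gold labels.
TOY_CORPUS: List[Tuple[str, str]] = [
    ("great phone with excellent battery", "pos"),
    ("terrible service and poor food", "neg"),
    ("good value", "pos"),
    ("bad screen", "neg"),
]

results = compare(
    {"lexicon": lexicon_classifier, "baseline": always_positive},
    TOY_CORPUS,
)
print(results)
```

Because every approach is scored on identical texts and gold labels, differences in the reported numbers reflect the approaches themselves rather than differences in the test data.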
According to our results, the approach proposed by Singh et al. [52] outperforms other approaches. However, this approach is computationally expensive and has been tested only on English language data. The least accurate approaches of those that we considered were the ones proposed by Zhu et al. [73], Habernal et al. [23], and Mizumoto et al. [34].
The main problem of multilingual sentiment analysis is the lack of lexical resources [18]. In our future work, we are planning to develop a multilingual corpus, which will include Persian, Arabic, Turkish, and English data, and compare a range of state-of-the-art methods.
References
Agarwal B, Poria S, Mittal N, Gelbukh A, Hussain A. Concept-level sentiment analysis with dependency-based semantic parsing: a novel approach. Cogn Comput. 2015;7(4):487–99.
Ahmad K, Cheng D, Almas Y. Multi-lingual sentiment analysis of financial news streams. In: Proceedings of the 1st international conference on grid in finance; 2006.
Al-Ayyoub M, Essa SB, Alsmadi I. Lexicon-based sentiment analysis of arabic tweets. Int J Soc Netw Min. 2015;2:101–14.
Balahur A, Turchi M. Multilingual sentiment analysis using machine translation? In: Proceedings of the 3rd workshop in computational approaches to subjectivity and sentiment analysis. Association for Computational Linguistics; 2012, p. 52–60.
Balahur A, Turchi M. Improving sentiment analysis in twitter using multilingual machine translated data. In: RANLP; 2013, p. 49–55.
Bautin M, Vijayarenu L, Skiena S. International sentiment analysis for news and blogs. In: ICWSM; 2008.
Berger AL, Pietra VJD, Pietra SAD. A maximum entropy approach to natural language processing. Comput Linguist. 1996;22:39–71.
Bhaskar J, Sruthi K, Nedungadi P. Enhanced sentiment analysis of informal textual communication in social media by considering objective words and intensifiers. In: Recent advances and innovations in engineering (ICRAIE), 2014. IEEE; 2014, p. 1–6.
Blitzer J, Dredze M, Pereira F, et al. Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: ACL; 2007, p. 440–47.
Boiy E, Moens M-F. A machine learning approach to sentiment analysis in multilingual Web texts. Inf Retr. 2009;12:526–58.
Cambria E, Olsher D, Rajagopal D. SenticNet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis. In: AAAI, Quebec City; 2014, p. 1515–21.
Carroll TZJ. Unsupervised classification of sentiment and objectivity in Chinese text. In: Third international joint conference on natural language processing. 2008, p. 304.
Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol TIST. 2011;2:27.
Chikersal P, Poria S, Cambria E. SeNTU: sentiment analysis of tweets by combining a rule-based classifier with supervised learning. In: Proceedings of the international workshop on semantic evaluation (SemEval 2015). 2015.
Croft WB, Lafferty J. Language modeling for information retrieval. Berlin: Springer; 2003.
Cruz-Garcia IO, Gelbukh A, Sidorov G. Implicit aspect indicator extraction for aspect based opinion mining. Int J Comput Linguist Appl. 2014;5(2):135–52.
Das N, Ghosh S, Gonçalves T, Quaresma P. Comparison of different graph distance metrics for semantic text based classification. Polibits. 2014;49:51–7.
Denecke K. Using SentiWordNet for multilingual sentiment analysis. In: IEEE 24th international data engineering workshop, 2008. ICDEW 2008. IEEE; 2008, p. 507–12.
Duwairi RM, Qarqaz I. Arabic sentiment analysis using supervised classification. In: 2014 international conference on future internet of things and cloud (FiCloud). IEEE; 2014.
Evans DK, Ku L-W, Seki Y, Chen H-H, Kando N. Opinion analysis across languages: an overview of and observations from the NTCIR6 opinion analysis pilot task. In: Applications of fuzzy sets theory. Berlin: Springer; 2007, p. 456–63.
Ghorbel H, Jacot D. Further experiments in sentiment analysis of French movie reviews. In: Advances in intelligent web mastering–3. Berlin, Heidelberg: Springer; 2011, p. 19–28.
Ghosh M, Kar A. Unsupervised linguistic approach for sentiment classification from online reviews using SentiWordNet 3.0. Int J Eng Res Technol. 2013.
Habernal I, Ptácek T, Steinberger J. Sentiment analysis in Czech social media using supervised machine learning. In: Proceedings of the 4th workshop on computational approaches to subjectivity, sentiment and social media analysis. 2013, p. 65–74.
He Y, Zhou D. Self-training from labeled features for sentiment analysis. Inf Process Manag. 2011;47:606–16.
Holmes G, Donkin A, Witten IH. Weka: a machine learning workbench. In: Proceedings of the 1994 second Australian and New Zealand conference on intelligent information systems. IEEE; 1994, p. 357–61.
Hu M, Liu B. Mining and summarizing customer reviews. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM; 2004, p. 168–77.
Jimenez S, Gonzalez FA, Gelbukh A. Soft cardinality in semantic text processing: experience of the SemEval international competitions. Polibits. 2015;51:63–72.
Liu B. Sentiment analysis: mining opinions, sentiments, and emotions. Cambridge: Cambridge University Press; 2015.
Liu Z, Dong X, Guan Y, Yang J. Reserved self-training: a semisupervised sentiment classification method for Chinese microblogs. In: Proceedings of IJCNLP; 2013.
Mahyoub FHH, Siddiqui MA, Dahab MY. Building an Arabic sentiment lexicon using semi-supervised learning. J King Saud Univ Comput Inf Sci. 2014;26(4):417–24.
Manning CD, Raghavan P, Schütze H. Introduction to information retrieval. Cambridge: Cambridge University Press; 2008.
Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J. 2014;5:1093–113.
Mirchev U, Last M. Multi-document summarization by extended graph text representation and importance refinement. Innov Doc Summ Tech Revolut Knowl Underst. 2014;28.
Mizumoto K, Yanagimoto H, Yoshioka M. Sentiment analysis of stock market news with semi-supervised learning. In: 2012 IEEE/ACIS 11th international conference on computer and information science (ICIS). IEEE; 2012, p. 325–28.
Morency L-P, Mihalcea R, Doshi P. Towards multimodal sentiment analysis: harvesting opinions from the web. In: Proceedings of the 13th international conference on multimodal interfaces. ACM; 2011, p. 169–76.
Narayanan V, Arora I, Bhatia A. Fast and accurate sentiment classification using an enhanced Naive Bayes model. In: Intelligent data engineering and automated learning–IDEAL 2013. Berlin: Springer; 2013, p. 194–201.
Pang B, Lee L. A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd annual meeting on association for computational linguistics. Association for Computational Linguistics; 2004, p. 271.
Pang B, Lee L, Vaithyanathan S. Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on empirical methods in natural language processing, vol 10. Association for Computational Linguistics; 2002, p. 79–86.
Posadas-Durán J-P, Markov I, Gómez-Adorno H, Sidorov G, Batyrshin I, Gelbukh A, Pichardo-Lagunas O. Syntactic N-grams as features for the author profiling task. Notebook for PAN at CLEF 2015. CEUR Workshop Proceedings 1391; 2015.
Raina P. Sentiment analysis in news articles using sentic computing. In: 2013 IEEE 13th International conference on data mining workshops (ICDMW). IEEE; 2013, p. 959–62.
Rajagopal D, Cambria E, Olsher D, Kwok K. A graph-based approach to commonsense concept extraction and semantic similarity detection. In: Proceedings of the 22nd international conference on world wide web companion. International World Wide Web Conferences Steering Committee; 2013, p. 565–70.
Ravi K, Ravi V. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl Based Syst. 2015.
Read J. Recognising affect in text using pointwise-mutual information. Unpubl. M Sc Diss. Univ. Sussex UK; 2004.
Remus R, Quasthoff U, Heyer G. SentiWS-a publicly available German-language resource for sentiment analysis. In: LREC. 2010.
Saraee M, Bagheri A. Feature selection methods in Persian sentiment analysis. In: Natural language processing and information systems. Springer; 2013, p. 303–308.
Seki Y, Evans DK, Ku L-W, Sun L, Chen H-H, Kando N, Lin C-Y. Overview of multilingual opinion analysis task at NTCIR-7. In: Proceedings of the 7th NTCIR workshop meeting on evaluation of information access technologies: information retrieval, question answering, and cross-lingual information access. 2008, p. 185–203.
Shi H-X, Li X-J. A sentiment analysis model for hotel reviews based on supervised learning. In: 2011 international conference on machine learning and cybernetics (ICMLC). IEEE; 2011, p. 950–54.
Sidorov G. Should syntactic n-grams contain names of syntactic relations? Int J Comput Linguist Appl. 2014;5(2):25–47.
Sidorov G, Miranda-Jiménez S, Viveros-Jiménez F, Gelbukh A, Castro-Sánchez N, Velásquez F, Díaz-Rangel I, Suárez-Guerra S, Treviño A, Gordon J. Empirical study of opinion mining in Spanish tweets. MICAI 2012. Lect Notes Comput Sci. 2012;7629:1–14.
Sidorov G, Velasquez F, Stamatatos E, Gelbukh A, Chanona-Hernández L. Syntactic n-grams as machine learning features for natural language processing. Expert Syst Appl. 2014;41(3):853–60.
Sindhwani V, Melville P. Document-word co-regularization for semi-supervised sentiment analysis. In: Eighth IEEE international conference on data mining, 2008. ICDM’08. IEEE; 2008, p. 1025–30.
Singh VK, Piryani R, Uddin A, Waila P, et al. Sentiment analysis of textual reviews; Evaluating machine learning, unsupervised and SentiWordNet approaches. In: 2013 5th international conference on knowledge and smart technology (KST). IEEE; 2013, p. 122–27.
Stone PJ, Dunphy DC, Smith MS. The general inquirer: a computer approach to content analysis; 1966.
Tan S, Zhang J. An empirical study of sentiment analysis for Chinese documents. Expert Syst Appl. 2008;34:2622–9.
Tong S, Koller D. Support vector machine active learning with applications to text classification. J Mach Learn Res. 2002;2:45–66.
Tromp E. Multilingual sentiment analysis on social media. Master’s Thesis, Dep. Math. Comput. Sci. Eindh. Univ. Technol.; 2011.
Wan X. Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics; 2008, p. 553–61.
Wang S, Li D, Song X, Wei Y, Li H. A feature selection method based on improved fisher’s discriminant ratio for text sentiment classification. Expert Syst Appl. 2011;38:8696–702.
Wiebe J, Mihalcea R. Word sense and subjectivity. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics; 2006, p. 1065–72.
Wong K-F, Xia Y, Xu R, Wu M, Li W. Pattern-based opinion mining for stock market trend prediction. Int J Comput Process Orient Lang. 2008;21(4):347–61.
Xia Y, Cambria E, Hussain A, Zhao H. Word polarity disambiguation using Bayesian model and opinion-level features. Cogn Comput. 2015;7(3):369–80.
Xia Y, Wang L, Wong K-F. Sentiment vector space model for lyric-based song sentiment classification. Int J Comput Process Orient Lang. 2008;21(4):331–45.
Xia Y, Zhao T, Yao J, Jin P. Measuring Chinese-English crosslingual word similarity with HowNet and parallel corpus. In: Computational linguistics and intelligent text processing, 12th international conference, CICLing 2011, vol. 2. 2011, p. 221–33.
Xia Y, Li X, Cambria E, Hussain A. A localization toolkit for SenticNet. In: 2014 IEEE international conference on data mining workshop (ICDMW). 2014, p. 403–8.
Xia R, Zong C. Exploring the use of word relation features for sentiment classification. In: Proceedings of the 23rd international conference on computational linguistics: posters. Association for Computational Linguistics; 2010, p. 1336–44.
Xu Y, Jones GJ, Li J, Wang B, Sun C. A study on mutual information-based feature selection for text categorization. J Comput Inf Syst. 2007;3:1007–12.
Xu R, Wong K-F, Lu Q, Xia Y, Li W. Learning knowledge from relevant webpage for opinion analysis. In: IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology, WI-IAT’08. 2008, p. 307–13.
Yang Y, Pedersen JO. A comparative study on feature selection in text categorization. In: ICML; 1997, p. 412–20.
Ye Q, Shi W, Li Y. Sentiment classification for movie reviews in Chinese by improved semantic oriented approach. In: Proceedings of the 39th annual Hawaii international conference on system sciences, HICSS’06. IEEE; 2006, p. 53b–53b.
Ye Q, Zhang Z, Law R. Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Syst Appl. 2009;36:6527–35.
Zagibalov T, Carroll J. Automatic seed word selection for unsupervised sentiment classification of Chinese text. In: Proceedings of the 22nd international conference on computational linguistics, vol. 1. Association for Computational Linguistics; 2008, p. 1073–80.
Zhang Z-Q, Li Y-J, Ye Q, Law R. Sentiment classification for Chinese product reviews using an unsupervised Internet-based method. In: International conference on management science and engineering, 2008. ICMSE 2008. 15th Annual conference proceedings. IEEE; 2008, p. 3–9.
Zhu S, Xu B, Zheng D, Zhao T. Chinese microblog sentiment analysis based on semi-supervised learning. In: Semantic web and web science. New York: Springer; 2013, p. 325–31.
Cambria E. Affective computing and sentiment analysis. IEEE Intell Syst. 2016;31(2):102–7.
Cambria E, Hussain A. Sentic computing: a common-sense-based framework for concept-level sentiment analysis. Cham: Springer; 2015. ISBN 978-3-319-23654-4.
Poria S, Gelbukh A, Cambria E, Das D, Bandyopadhyay S. Enriching SenticNet polarity scores through semi-supervised fuzzy clustering. In: Proceedings of ICDM; 2012, p. 709–16.
Poria S, Gelbukh A, Cambria E, Yang P, Hussain A, Durrani T. Merging SenticNet and WordNet-affect emotion lists for sentiment analysis. In: Proceedings of ICSP; 2012, p. 1251–55.
Cambria E, Schuller B, Liu B, Wang H, Havasi C. Statistical approaches to concept-level sentiment analysis. IEEE Intell Syst. 2013;28(3):6–9.
Poria S, Cambria E, Howard N, Huang G-B, Hussain A. Fusing audio, visual and textual clues for sentiment analysis from multimodal content. Neurocomputing. 2016;174:50–9.
Poria S, Cambria E, Hussain A, Huang G-B. Towards an intelligent framework for multimodal affective data analysis. Neural Netw. 2015;63:104–16.
Cambria E, Wang H, White B. Guest editorial: big social data analysis. Knowl Based Syst. 2014;69:1–2.
Acknowledgments
This work was partly supported by the Royal Society of Edinburgh (RSE) and National Natural Science Foundation of China (NNSFC) joint project Grant No. 61411130162, and the UK Engineering and Physical Sciences Research Council (EPSRC) Grant No. EP/M026981/1. We also wish to thank the anonymous reviewers who helped improve the quality of the paper.
Additional information
The online version of the original article can be found under doi:10.1007/s12559-016-9415-7.
Cite this article
Dashtipour, K., Poria, S., Hussain, A. et al. Erratum to: Multilingual Sentiment Analysis: State of the Art and Independent Comparison of Techniques. Cogn Comput 8, 772–775 (2016). https://doi.org/10.1007/s12559-016-9421-9
DOI: https://doi.org/10.1007/s12559-016-9421-9