Abstract
Twitter analytics is a well-established research area, particularly given the widespread presence of Big Data in online media such as social network sites, online shopping portals, e-commerce, forums, chats, recommendation systems, and online services. Ascertaining the sentiment behind tweets posted by different people can provide valuable insights into various aspects, including behavioral patterns. Besides highlighting the newest trends in the field, we retrieved real-time Twitter data pertaining to three currently popular hashtags in the Indian context and carried out extensive experimentation and analysis of the prevailing sentiment of a stratum of the population. The inclusion of current challenges, future trends, and applications of sentiment analysis from Twitter data makes this novel work useful for fellow researchers.
Keywords
- Machine learning
- Natural language processing
- Sentiment analysis
- Twitter data analytics
- Opinion mining
- Data visualization
- Recommendation systems
- Attitude analysis
- Polarity determination
- Sentiment classification
1 Introduction
Analytics of Twitter data is predominantly conducted to ascertain the underlying sentiment behind tweets and images. Hence, it effectively narrows to sentiment analysis. Sentiment analysis is the computational study of the opinions, emotions, sentiments, and attitudes that users express in text about an entity of interest. Sentiment analysis is also called review mining, opinion mining, or attitude analysis [1, 2, 3, 4, 5].
The surge in voluminous user content globally is attributed to technological advancements as well as increased Internet activities such as discussion forums, conferencing, online transactions, e-commerce, chatting, surveillance, ticket booking, merchant websites, widespread and continual communication on social media, and a variety of other online activities [1, 3, 6, 7].
This paper is organized as follows: Sect. 2 covers motivation, Sect. 3 covers the literature survey, Sect. 4 covers experimentation and results, Sect. 5 presents observations, Sect. 6 highlights the novelty, Sect. 7 presents applications, Sect. 8 presents challenges, Sect. 9 presents the research contribution, and Sect. 10 concludes.
2 Motivation
This novel technique will help people analyze Twitter data and understand the public opinion or sentiment behind specific keywords, which is useful in sectors such as business, marketing, forecasting, politics, and tourism.
3 Literature Survey
There are mainly two approaches in the existing literature [1, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18] for performing sentiment analysis: lexicon based and machine learning based. The former relies on the concept of polarity, while the latter develops suitable classification models.
A detailed survey of recent work is presented in Table 1, and research gaps are highlighted.
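To make the distinction between the two approaches concrete, a lexicon-based scorer can be as simple as averaging per-word polarity scores. The sketch below uses a hypothetical four-word lexicon and naive whitespace tokenization; it illustrates the idea only and is not taken from any of the surveyed works. A machine learning approach would instead fit a classifier on labeled tweets.

```python
# Hypothetical toy lexicon; real lexicons contain thousands of scored words.
LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "poor": -1.0}


def lexicon_polarity(text: str) -> float:
    """Average the scores of known words; return 0.0 if none are known."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0
```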
4 Experimentation and Results
We used Tweepy to fetch tweets in real time for three currently popular hashtags in India: #MakeInIndia, #AtmNirbharIndia, and #VocalforLocal. The tweepy.Cursor() helper was used to fetch the latest tweets. Preprocessing was performed using the ‘re’ library of Python. TextBlob was used for polarity determination. We wrote a Python program to encode the seven class labels as follows: -1 negative, -0.6 to -1 strongly negative, 0 to -0.3 weakly negative, 0 neutral, 0 to 0.3 weakly positive, 0.6 to 1 strongly positive, and 1 positive. We then performed three experiments, as described below.
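The preprocessing and labeling steps described above can be sketched as follows. This is a minimal illustration, not the authors' program. The regular expressions in `clean_tweet` are assumed cleaning rules, and treating the band between 0.3 and 0.6 as the plain positive and negative classes is also an assumption, since the text gives explicit ranges only for the weak and strong classes. The polarity value is assumed to be a score in [-1, 1], such as the one TextBlob exposes as `sentiment.polarity`.

```python
import re


def clean_tweet(text: str) -> str:
    """Strip URLs, @mentions and non-alphanumeric characters (assumed rules)."""
    text = re.sub(r"https?://\S+", " ", text)    # drop links
    text = re.sub(r"@\w+", " ", text)            # drop user mentions
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # drop punctuation, '#', emoji
    return " ".join(text.split())                # collapse whitespace


def classify_polarity(p: float) -> str:
    """Map a polarity score in [-1, 1] to one of seven sentiment classes.

    The cut points 0.3 and 0.6 follow the ranges in the text; assigning the
    band between them to plain positive/negative is an assumption.
    """
    if p == 0:
        return "neutral"
    if p > 0:
        if p <= 0.3:
            return "weakly positive"
        if p <= 0.6:
            return "positive"
        return "strongly positive"
    if p >= -0.3:
        return "weakly negative"
    if p >= -0.6:
        return "negative"
    return "strongly negative"
```

With TextBlob installed, a tweet could then be labeled with `classify_polarity(TextBlob(clean_tweet(tweet)).sentiment.polarity)`.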
4.1 Experiment 1: #MakeInIndia
We fetched 1000 tweets in real time and analyzed them to ascertain the sentiment. Visualization results for the seven sentiment classes are illustrated in Fig. 1.
4.2 Experiment 2: #AtmNirbhar
We fetched 1000 tweets in real time and analyzed them to ascertain the sentiment. Visualization results are illustrated in Fig. 2.
4.3 Experiment 3: #VocalforLocal
Figure 3 illustrates the outcome of analyzing 200 tweets.
We performed a comparative analysis of the two hashtags with respect to the seven sentiment classes, as illustrated in the stacked bar chart in Fig. 4.
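The per-class shares behind charts of this kind can be computed from the list of class labels. The sketch below is one possible way to do it; the function name `class_percentages` and the sample labels are hypothetical and are not the paper's data.

```python
from collections import Counter


def class_percentages(labels):
    """Return the percentage of tweets per sentiment class (1 decimal place)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: round(100 * n / total, 1) for cls, n in counts.items()}


# Hypothetical labels, for illustration only (not the paper's data).
sample = ["positive", "neutral", "positive", "weakly negative"]
shares = class_percentages(sample)
```

The resulting dictionary can be handed to a plotting library, for example as the values and labels of a matplotlib pie chart, or computed once per hashtag to build a stacked bar chart.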
To validate the obtained results, we assigned the task of annotation to two human experts and noted the findings. Figures 5 and 6 illustrate the differences in annotation between the two experts using RMSE and standard deviation, respectively.
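The two disagreement measures named above can be computed as in the following sketch. The text does not state how the annotations were encoded or exactly which quantities were compared, so the sketch assumes each expert assigns a numeric score to every tweet; the function names and the sample scores are hypothetical.

```python
import math
import statistics


def rmse(a, b):
    """Root-mean-square error between two equal-length score sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def diff_std(a, b):
    """Population standard deviation of the per-tweet score differences."""
    return statistics.pstdev([x - y for x, y in zip(a, b)])


# Hypothetical scores from two annotators (illustrative values only).
expert_1 = [1.0, 0.0, -1.0, 1.0]
expert_2 = [1.0, 1.0, -1.0, 0.0]
```

Identical annotations give a value of zero for both measures; larger values indicate stronger disagreement between the experts.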
5 Observations
- From Figs. 1, 2, 3 and 4, we infer that the highest percentage of positive tweets was for #MakeInIndia, while the highest percentage of negative tweets was for #AtmNirbhar.
- From Table 1, it is observed that although some standard datasets exist, most researchers prefer to gather tweets in real time. Tweepy was observed to be the predominant choice. Also, SVM and Random Forests have frequently yielded accuracy of over 95%.
6 Novelty
This technique visualizes the results as a pie chart over seven classes, which gives a clearer idea of the sentiment behind a keyword. This novel approach to result visualization helps people understand the results in detail.
7 Applications
Twitter data analytics has a variety of applications, such as:
- Generating reputation for brands or products [26, 27, 28],
- Increasing customer engagement, making better-informed decisions in risk analysis, producing efficient credit ratings for customers, and performing competitive analysis [29],
- Increasing the productivity and efficiency of restaurants [30],
- Improving market intelligence and customer satisfaction [3, 36],
- Increasing tourism [37],
- Monitoring and analyzing public opinion concerning political issues [3],
- Forecasting price changes based on news sentiment [1],
- Developing new products and services and promoting products based on customer reviews [1] and social advertising [38, 39].
8 Challenges in Twitter Data Analytics
i. Determining the contextual information for sentiments and forming a generalized foundation globally is difficult [30].
ii. There is increased difficulty due to the widespread use of onomatopoeias, idioms, homophones, alliterations, and acronyms [30]. Hence, complex NLP techniques are required to decipher the correct context and meaning of various words.
iii. Aspect-based sentiment analysis is an important challenge [36].
iv. Opinion summarization, subjectivity classification, and opinion retrieval [36].
v. Lack of large annotated data to train models across various domains [40].
9 Research Contribution
- The current work is a novel approach to visualizing and analyzing three currently popular hashtags in India. Our extensive experimentation and analysis of the prevailing sentiment should be beneficial for fellow researchers.
- We have also covered important aspects such as current challenges, future trends, and applications of sentiment analysis.
10 Conclusion
We have successfully implemented a proof of concept for gathering tweets in real time and analyzing the sentiment of a part of the population using the lexicon-based technique. We performed extensive experimentation and analyzed the sentiments for sets of 100, 200, 500, and 1000 tweets for three currently popular hashtags. The ample data visualization in this work should be an asset to fellow researchers, paving the way for future research.
References
Kumar R, Vadlamani R (2015) A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowledge Based Systems
Liu B (2010) Sentiment analysis: a multi-faceted problem. IEEE Intell Syst 25(3):76–80
Tubishat M, Idris N, Abushariah MAM (2018) Implicit aspect extraction in sentiment analysis: review, taxonomy, opportunities, and open challenges. Inf Process Manage 54:545–563
Montoyo A, Martínez-Barco P, Balahur A (2012) Subjectivity and sentiment analysis: an overview of the current state of the area and envisaged developments. Decis Support Syst 53:675–679
Kumar S, Morstatter F, Liu H (2013) Twitter data analytics. Springer
Liu J (2008) Opinion spam and analysis. In: Proceedings of the international conference on web search and web data mining, ACM
Tan LKW, Na JC, Theng YL et al (2012) Phrase-level sentiment polarity classification using rule-based typed dependencies and additional complex phrases consideration. J Comput Sci Technol 27(3):650–666
Wang T et al (2014) Product aspect extraction supervised with online domain knowledge. Knowledge-Based Syst 71:86–100
Kunte AV, Panicker S (2020) Analysis of machine learning algorithms for predicting personality: brief survey and experimentation. In: 2019 global conference for advancement in technology (GCAT)
Kunte A, Panicker S (2020) Personality prediction of social network users using ensemble and XGBoost. In: Das H, Pattnaik P, Rautaray S, Li KC (eds) Progress in computing, analytics and networking. Advances in intelligent systems and computing, vol 1119. Springer, Singapore
Kunte AV, Panicker SS (2019) Using textual data for personality prediction: a machine learning approach. In: 2019 4th international conference on information systems and computer networks (ISCON)
Panicker S, Kunte A (2019) Personality prediction using social media. In: 2019 5th international conference for convergence in technology (I2CT), Pune (in Press)
Mane VL, Panicker SS (2015) Knowledge discovery from user health posts. In: IEEE 9th international conference on intelligent systems and control (ISCO)
Mane VL, Panicker SS (2015) Summarization and sentiment analysis from user health posts. In: 2015 international conference on pervasive computing (ICPC). IEEE
Salunke V, Panicker SS (2021) Image sentiment analysis using deep learning. In: Ranganathan G, Chen J, Rocha A (eds) Inventive communication and computational technologies. Lecture notes in networks and systems, vol 145. Springer, Singapore
Dangra BS, Rajput D, Bedekar MV, Panicker SS (2015) Profiling of automobile drivers using car games. In: International conference on pervasive computing (ICPC). IEEE
Bedekar MV, Atote B, Zahoor S, Panicker S (2016) Proposed used of information DisPersal Algorithm in user profiling. In: International conference on ICT for sustainable development, Goa, India
Khan M, Malviya A (2020) Big data approach for sentiment analysis of twitter data using Hadoop framework and deep learning. In: 2020 international conference on emerging trends in information technology and engineering (ic-ETITE), Vellore, India
Hu T, She B, Duan L, Yue H, Clunis J (2020) A systematic spatial and temporal sentiment analysis on geo-tweets. IEEE Access 8:8658–8667
Murakami A, Nasukawa T, Watanabe K, Hatayama M (2020) Understanding requirements and issues in disaster area using geotemporal visualization of Twitter analysis. IBM J Res Develop
Kumar TS, Nabeem PM, Manoj CK, Jeyachandran K (2020) Sentimental analysis (opinion mining) in social network by using SVM algorithm. In: 2020 fourth international conference on computing methodologies and communication (ICCMC), Erode, India
Phan HT, Tran VC, Nguyen NT, Hwang D (2020) Improving the performance of sentiment analysis of tweets containing fuzzy sentiment using the feature ensemble model. IEEE Access 8:14630–14641
Bhatnagar D, SubaLakshmi RJ, Vanmathi C (2020) Twitter sentiment analysis using Elasticsearch, Logstash and Kibana. In: 2020 international conference on emerging trends in information technology and engineering (ic-ETITE)
Oyasor J, Raborife M, Ranchod P (2020) Sentiment analysis as an indicator to evaluate gender disparity on sexual violence tweets in South Africa. In: 2020 international SAUPEC/RobMech/PRASA conference, Cape Town, South Africa
Joshi PA, Simon G, Murumkar YP (2018) Generation of brand/product reputation using Twitter data. In: 2018 international conference on information, communication, engineering and technology (ICICET), Pune
Wang W, Li B, Feng D, Zhang A, Wan S (2020) The OL-DAWE model: tweet polarity sentiment analysis with data augmentation. IEEE Access 8:40118–40128
Li YM, Shiu YL (2012) A diffusion mechanism for social advertising over microblogs. Decis Support Syst 54:9–22
Du J et al (2013) Box office prediction based on microblog. In: Expert systems with applications
Al-Moslmi T, Omar N, Abdullah S, Albared M (2017) Approaches to cross-domain sentiment analysis: a systematic literature review. IEEE Access 5:16173–16192
Li SK, Guan Z, Tang LY et al (2012) Exploiting consumer reviews for product feature ranking. J Comput Sci Technol 27(3):635–649
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ahire, K., Bagul, M., Dhanawate, S., Panicker, S.S. (2021). A Novel Proof of Concept for Twitter Analytics Using Popular Hashtags: Experimentation and Evaluation. In: Goyal, V., Gupta, M., Trivedi, A., Kolhe, M.L. (eds) Proceedings of International Conference on Communication and Artificial Intelligence. Lecture Notes in Networks and Systems, vol 192. Springer, Singapore. https://doi.org/10.1007/978-981-33-6546-9_31
DOI: https://doi.org/10.1007/978-981-33-6546-9_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-6545-2
Online ISBN: 978-981-33-6546-9
eBook Packages: Engineering, Engineering (R0)