Abstract
Conversation over social media has become commonplace because of the widespread use of mobile devices and the availability of the Internet. These conversations take the form of short texts with limited contextual words, which makes them difficult to analyze. One of the most common topics among social media users is health, yet health-related content is difficult to identify and analyze. This study explores a model for discovering health-related topics and issues from tweets. To demonstrate topic modeling and its effectiveness, latent Dirichlet allocation (LDA) together with the Topic Aspect Model was applied to tweets related to tobacco and alcohol use. Public health researchers can use such large-scale conversational Twitter data to understand health-related issues.
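To make the idea of topic discovery on short texts concrete, the following is a minimal sketch of plain LDA fitted with collapsed Gibbs sampling. It is not the chapter's implementation and does not cover the Topic Aspect Model. The example tweets, the stopword list, the function names, and the hyperparameters (`alpha`, `beta`, the number of topics) are all illustrative assumptions.

```python
import random
from collections import Counter

# Hypothetical toy data: these tweets are invented for illustration only.
TWEETS = [
    "Trying to quit smoking, cigarettes are ruining my lungs",
    "Tobacco ads should be banned, smoking kills",
    "Too much beer last night, hangover is awful",
    "Alcohol and driving never mix, skip the beer",
]
STOPWORDS = {"the", "and", "are", "too", "should", "much", "last"}

def tokenize(text, stopwords):
    """Lowercase, drop mentions/URLs, keep alphabetic words longer than 2 chars."""
    tokens = []
    for raw in text.lower().split():
        if raw.startswith(("@", "http")):
            continue
        word = "".join(ch for ch in raw if ch.isalpha())
        if len(word) > 2 and word not in stopwords:
            tokens.append(word)
    return tokens

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns document-topic and topic-word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

def top_words(topic_word, n=5):
    """Most frequent words per topic, ignoring words with a zero count."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]

docs = [tokenize(t, STOPWORDS) for t in TWEETS]
doc_topic, topic_word = fit_lda(docs, num_topics=2, iterations=100)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

On a corpus this small the resulting topics are unstable and depend on the seed. The sketch only shows the mechanics: each tweet is reduced to a bag of words, and every word is repeatedly reassigned to a topic according to how often that topic already appears in the tweet and how often the word already appears in the topic.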
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Avasthi, S. (2021). Topic Modeling on Twitter Data and Identifying Health-Related Issues. In: Goyal, D., Bălaş, V.E., Mukherjee, A., Hugo C. de Albuquerque, V., Gupta, A.K. (eds) Information Management and Machine Intelligence. ICIMMI 2019. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-4936-6_6
DOI: https://doi.org/10.1007/978-981-15-4936-6_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4935-9
Online ISBN: 978-981-15-4936-6
eBook Packages: Intelligent Technologies and Robotics (R0)