Topic Modeling on Twitter Data and Identifying Health-Related Issues

  • Conference paper
Information Management and Machine Intelligence (ICIMMI 2019)

Part of the book series: Algorithms for Intelligent Systems (AIS)


Abstract

Conversation over social media has become commonplace because of the widespread use of mobile devices and the availability of the Internet. These conversations take the form of short texts with only some contextual words, which makes them difficult to analyze. One of the most common topics among social media users is health, yet health-related content is difficult to identify and analyze. This study explores a model for discovering health-related topics and issues from tweets. To demonstrate topic modeling and its effectiveness, LDA, along with the Topic Aspect Model, is applied to tweets related to tobacco and alcohol use. Public health researchers can use this approach to understand health-related issues through large volumes of conversational Twitter data.
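As a rough illustration of the kind of topic modeling the abstract refers to, the sketch below fits plain LDA with collapsed Gibbs sampling on a handful of tokenized short texts. It is not the chapter's implementation and does not cover the Topic Aspect Model. The function names (`lda_gibbs`, `top_words`), the hyperparameter values, and the toy tweets are all assumptions made up for this example.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit plain LDA by collapsed Gibbs sampling (illustrative sketch only).

    docs is a list of token lists. Returns (topic_word, doc_topic, vocab),
    where the first two are count matrices, not normalized probabilities.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                             # tokens per topic
    assignments = []                                           # topic of every token

    # Start from a random topic assignment for each token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dt + alpha) * (n_tv + beta) / (n_t + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return topic_word, doc_topic, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Made-up toy tweets (already tokenized); not data from the study.
toy_tweets = [
    ["quit", "smoking", "cigarette", "health"],
    ["cigarette", "tobacco", "smoking", "lungs"],
    ["beer", "drinking", "party", "alcohol"],
    ["alcohol", "drinking", "hangover", "beer"],
]

topic_word, doc_topic, vocab = lda_gibbs(toy_tweets, num_topics=2)
for k, words in enumerate(top_words(topic_word, vocab)):
    print(f"topic {k}: {words}")
```

On a corpus this small the recovered topics are unstable and depend on the seed; the point is only to show how per-document and per-topic counts drive the sampling step. Real tweets would first need tokenization, stop-word removal, and handling of hashtags and URLs.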

Author information

Corresponding author

Correspondence to Sandhya Avasthi.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Avasthi, S. (2021). Topic Modeling on Twitter Data and Identifying Health-Related Issues. In: Goyal, D., Bălaş, V.E., Mukherjee, A., Hugo C. de Albuquerque, V., Gupta, A.K. (eds) Information Management and Machine Intelligence. ICIMMI 2019. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-4936-6_6
