Comparison of Machine Learning Models for Early Depression Detection from Users’ Posts

Mothe, Josiane; Ramiandrisoa, Faneva; Ullah, Md Zia

doi:10.1007/978-3-031-04431-1_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 1018)

Abstract

With around 300 million people worldwide suffering from depression, detecting this disorder is crucial and remains a challenge for individual and public health. As with many diseases, early detection means better medical management, and social media messages offer potential clues that can support early detection by automatic means. This chapter is based on the participation of the CNRS IRIT laboratory in the eRisk task on early detection of depression at the CLEF evaluation forum. Early depression detection differs from depression detection in that it considers temporality: the system must make its decision about a user’s possible depression with as little data as possible. In this chapter, we re-evaluate the models we developed for our eRisk participations over the years on the different collections to obtain a more robust comparison. We also add new models. We use well-established classification methods such as logistic regression, random forest, and support vector machine (SVM). The users’ data, from which the system should detect whether they are depressed, are represented as vectors composed of (a) various task-oriented features, including depression-related lexicons, and (b) word and document embeddings extracted from the users’ posts. We perform an ablation study to identify the most important features for our models. We also use the BERT deep learning architecture for comparison, both for depression detection and for early depression detection. According to our results, well-established machine learning models still outperform more modern models for early detection of depression.
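The temporal constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the lexicon, the scoring function, and the threshold below are hypothetical stand-ins for a trained classifier, chosen only to show how a decision can be emitted after reading as few posts as possible.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical toy lexicon; the chapter's actual lexicons are not listed in the abstract.
TOY_LEXICON = {"hopeless", "worthless", "insomnia", "antidepressant"}


def toy_score(text: str) -> float:
    """Fraction of whitespace-separated tokens found in the toy lexicon.

    This stands in for the probability a trained classifier would return.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in TOY_LEXICON)
    return hits / len(tokens)


def early_decision(
    posts: Iterable[str],
    score: Callable[[str], float] = toy_score,
    threshold: float = 0.2,  # assumed value, for illustration only
) -> Tuple[bool, int]:
    """Read posts in chronological order and stop at the first confident decision.

    Returns (flagged, posts_read). A smaller posts_read means an earlier decision.
    """
    history: List[str] = []
    for post in posts:
        history.append(post)
        # Re-score everything seen so far after each new post.
        if score(" ".join(history)) >= threshold:
            return True, len(history)
    # No decision was triggered before the posts ran out.
    return False, len(history)


if __name__ == "__main__":
    stream = ["nice weather today", "i feel hopeless and worthless", "more text"]
    print(early_decision(stream))
```

In this sketch the third post is never read, because the threshold is already reached after the second one. In a real setting the score would come from a trained model, and the threshold would trade off decision delay against false alarms.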

Notes

1. https://www.la-depression.org/, accessed January 28, 2021.
2. http://www.doctissimo.fr/psychologie/news/la-france-pays-le-plus-touche-par-la-depression, accessed January 28, 2021.
3. CES-D stands for Center for Epidemiologic Studies Depression Scale, a questionnaire that can be used to detect depression [35].
4. Reddit is a social news aggregation, web content rating, and discussion website (https://www.reddit.com).
5. https://www.who.int/news-room/fact-sheets/detail/depression.
6. http://en.wikipedia.org/wiki/List_of_antidepressants, accessed on 23/02/2017.
7. http://www.webmd.com/depression/guide/depression-medications-antidepressants, accessed on 10/01/2018.
8. http://empath.stanford.edu.
9. https://github.com/google-research/bert, accessed on 02/02/2021.

References

Burdisso, S. G., Errecalde, M., & Montes-y-Gómez, M. (2019). t-ss3: a text classifier with dynamic n-grams for early risk detection over text streams. arxiv:1911.06147.
Burdisso, S. G., Errecalde, M., & Montes-y-Gómez, M. (2019). A text classification framework for simple and effective early depression detection over social media streams. Expert Systems with Applications, 133, 182–197.
Cacheda, F., Iglesias, D. F., Nóvoa, F. J., & Carneiro, V. (2018). Analysis and experiments on early detection of depression. In Working Notes of CLEF 2018—Conference and Labs of the Evaluation Forum, Avignon, France, September 10–14, 2018.
Choudhury, M. D., Counts, S., Horvitz, E., & Hoff, A. (2014). Characterizing and predicting postpartum depression from shared facebook data. In Computer Supported Cooperative Work, CSCW ’14, Baltimore, MD, USA, February 15–19, 2014 (pp. 626–638).
Choudhury, M. D., Gamon, M., Counts, S., & Horvitz, E. (2013). Predicting depression via social media. In Proceedings of the Seventh International Conference on Weblogs and Social Media.
Colombo, G. B., Burnap, P., Hodorog, A., & Scourfield, J. (2016). Analysing the connectivity and communication of suicidal users on twitter. Computer Communications, 73, 291–300.
Dai, A. M., Olah, C., & Le, Q. V. (2015). Document embedding with paragraph vectors. arXiv:1507.07998.
Dalloux, C., Claveau, V., Cuggia, M., Bouzillé, G., & Grabar, N. (2020). Supervised learning for the ICD-10 coding of french clinical narratives. In Digital Personalized Health and Medicine—Proceedings of MIE 2020, Medical Informatics Europe, Geneva, Switzerland, April 28–May 1, 2020 (pp. 427–431).
Deveaud, R., Mothe, J., Ullah, M. Z., & Nie, J.-Y. (2018). Learning to adaptively rank document retrieval system configurations. ACM Transactions on Information Systems (TOIS), 37(1), 1–41.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805.
Fast, E., Chen, B., & Bernstein, M. S. (2016). Empath: Understanding topic signals in large-scale text. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, San Jose, CA, USA, May 7–12, 2016 (pp. 4647–4657).
France, D. J., Shiavi, R. G., Silverman, S. E., Silverman, M. K., & Wilkes, D. M. (2000). Acoustical properties of speech as indicators of depression and suicidal risk. IEEE Transactions on Biomedical Engineering, 47(7), 829–837.
Funez, D. G., Errecalde, M. L., Villegas, M. P., Ucelay, M. J. G., & Cagnina, L. C. (2017). Temporal variation of terms as concept space for early risk prediction. In Working Notes of CLEF 2017—Conference and Labs of the Evaluation Forum, Dublin, Ireland, September 11–14, 2017.
Funez, D. G., Ucelay, M. J. G., Villegas, M. P., Burdisso, S., Cagnina, L. C., Montes-y-Gómez, M., & Errecalde, M. (2018). Unsl’s participation at erisk 2018 lab. In Working Notes of CLEF 2018—Conference and Labs of the Evaluation Forum, Avignon, France, September 10–14, 2018.
Hoang, T. B. N., & Mothe, J. (2018). Predicting information diffusion on twitter-analysis of predictive features. Journal of Computational Science, 28, 257–264.
Iarivony Faneva, R. (2020). Extraction et fouille de données textuelles: application à la détection de la dépression, de l’anorexie et de l’agressivité dans les réseaux sociaux. Ph.D. thesis, Université de Toulouse.
Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2017). Bag of tricks for efficient text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017, Valencia, Spain, April 3–7, 2017, Volume 2: Short Papers (pp. 427–431).
King, G., & Zeng, L. (2001). Logistic regression in rare events data. Political Analysis, 9(2), 137–163.
Kulkarni, A. B. K. (2018). Early detection of depression. Master’s thesis, University of Houston.
Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. In International Conference on Machine Learning, PMLR (pp. 1188–1196).
Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R. P., Tang, J., & Liu, H. (2017). Feature selection: A data perspective. ACM Computing Surveys, 50, 6.
Li, Z., Xiong, Z., Zhang, Y., Liu, C., & Li, K. (2011). Fast text categorization using concise semantic analysis. Pattern Recognition Letters, 32(3), 441–448.
Low, L. A., Maddage, N. C., Lech, M., Sheeber, L., & Allen, N. B. (2011). Detection of clinical depression in adolescents’ speech during family interactions. IEEE Transactions on Biomedical Engineering, 58(3), 574–586.
Malam, I. A., Arziki, M., Bellazrak, M. N., Benamara, F., Kaidi, A. E., Es-Saghir, B., He, Z., Housni, M., Moriceau, V., Mothe, J., & Ramiandrisoa, F. (2017). IRIT at e-risk. In Working Notes of CLEF 2017—Conference and Labs of the Evaluation Forum, Dublin, Ireland, September 11–14, 2017.
Marriott, T. C., & Buchanan, T. (2014). The true self online: Personality correlates of preference for self-expression online, and observer ratings of personality online and offline. Computers in Human Behavior, 32, 171–177.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger, Eds. Advances in Neural Information Processing Systems (vol. 26). Curran Associates, Inc.
Mohammad, S., & Turney, P. D. (2013). Crowdsourcing a word-emotion association lexicon. Computational Intelligence, 29, 3.
Mowery, D., Park, A., Conway, M., & Bryan, C. (2016). Towards automatically classifying depressive symptoms from twitter data for population health. In Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (pp. 182–191).
World Health Organization. (2017). Depression and other common mental disorders: Global health estimates. Geneva: WHO.
Øverland, S., Woicik, W., Sikora, L., Whittaker, K., Heli, H., Skjelkvåle, F. S., Sivertsen, B., & Colman, I. (2020). Seasonality and symptoms of depression: A systematic review of the literature. Epidemiology and Psychiatric Sciences, 29.
Ozdas, A., Shiavi, R. G., Silverman, S. E., Silverman, M. K., & Wilkes, D. M. (2004). Investigation of vocal jitter and glottal flow spectrum as possible cues for depression and near-term suicidal risk. IEEE Transactions on Biomedical Engineering, 51(9), 1530–1540.
Paul, S., Jandhyala, S. K., & Basu, T. (2018). Early detection of signs of anorexia and depression over social media using effective machine learning frameworks. In Working Notes of CLEF 2018—Conference and Labs of the Evaluation Forum, Avignon, France, September 10–14, 2018.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al. (2011). Scikit-learn: Machine learning in python. The Journal of Machine Learning Research, 12, 2825–2830.
Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) (pp. 2227–2237).
Radloff, L. (2015). A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 413–449.
Ramiandrisoa, F., & Mothe, J. (2020). Early detection of depression and anorexia from social media: A machine learning approach. In Proceedings of the Joint Conference of the Information Retrieval Communities in Europe (CIRCLE 2020), Samatan, Gers, France, July 6–9, 2020.
Ramiandrisoa, F., Mothe, J., Benamara, F., & Moriceau, V. (2018). IRIT at e-risk 2018. In Working Notes of CLEF 2018—Conference and Labs of the Evaluation Forum, Avignon, France, September 10–14, 2018.
Resnik, P., Armstrong, W., Claudino, L. M. B., Nguyen, T., Nguyen, V., & Boyd-Graber, J. L. (2015). Beyond LDA: exploring supervised topic modeling for depression-related language in Twitter. In Proceedings of CLPsych@NAACL-HLT.
Rude, S., Gortner, E., & Pennebaker, J. (2004). Language use of depressed and depression-vulnerable college students. Cognition & Emotion, 18(8), 1121–1133.
Sadeque, F., Xu, D., & Bethard, S. (2017). Uarizona at the CLEF erisk 2017 pilot task: Linear and recurrent models for early depression detection. In Working Notes of CLEF 2017—Conference and Labs of the Evaluation Forum, Dublin, Ireland, September 11–14, 2017.
Trotzek, M., Koitka, S., & Friedrich, C. M. (2017). Linguistic metadata augmented classifiers at the CLEF 2017 task for early detection of depression. In Working Notes of CLEF 2017—Conference and Labs of the Evaluation Forum, Dublin, Ireland, September 11–14, 2017.
Trotzek, M., Koitka, S., & Friedrich, C. M. (2018). Utilizing neural networks and linguistic metadata for early detection of depression indications in text sequences. IEEE Transactions on Knowledge and Data Engineering, 32(3), 588–601.
Trotzek, M., Koitka, S., & Friedrich, C. M. (2018). Word embeddings and linguistic metadata at the CLEF 2018 tasks for early detection of depression and anorexia. In Working Notes of CLEF 2018—Conference and Labs of the Evaluation Forum, Avignon, France, September 10–14, 2018.
Villegas, M. P., Funez, D. G., Ucelay, M. J. G., Cagnina, L. C., & Errecalde, M. L. (2017). LIDIC—unsl’s participation at erisk 2017: Pilot task on early detection of depression. In Working Notes of CLEF 2017—Conference and Labs of the Evaluation Forum, Dublin, Ireland, September 11–14, 2017.
Wang, X., Zhang, C., Ji, Y., Sun, L., Wu, L., & Bao, Z. (2013). A depression detection model based on sentiment analysis in micro-blog social network. In Trends and Applications in Knowledge Discovery and Data Mining, PAKDD 2013 International Workshops: DMApps, DANTH, QIMIE, BDM, CDA, CloudSD, Gold Coast, QLD, Australia, April 14–17, 2013, Revised Selected Papers (pp. 201–213).
Xue, Y., Li, Q., Jin, L., Feng, L., Clifton, D. A., & Clifford, G. D. (2014). Detecting adolescent psychological pressures from micro-blog. In Proceedings of the Health Information Science—Third International Conference, HIS 2014, Shenzhen, China, April 22–23, 2014 (pp. 83–94).


Acknowledgements

This work is partially supported by the PREVISION project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under GA No 833115 (https://cordis.europa.eu/project/id/833115). The paper reflects the authors’ view and the Commission is not responsible for any use that may be made of the information it contains.

Author information

Authors and Affiliations

ESPE, Univ. Toulouse Jean-Jaurès, Univ. de Toulouse, IRIT, UMR5505 CNRS, 118 Rte de Narbonne, Toulouse, France
Josiane Mothe
Univ. de Toulouse, IRIT, UMR5505 CNRS, 118 Rte de Narbonne, Toulouse, France
Faneva Ramiandrisoa & Md Zia Ullah

Corresponding author

Correspondence to Josiane Mothe.

Editor information

Editors and Affiliations

Faculty of Informatics, University of Lugano, Lugano, Switzerland
Fabio Crestani
Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
David E. Losada
Centro de Investigación en Tecnoloxías da Información e as Comunicacións (CITIC), Universidade da Coruña, A Coruña, Spain
Javier Parapar

About this chapter

Cite this chapter

Mothe, J., Ramiandrisoa, F., Ullah, M.Z. (2022). Comparison of Machine Learning Models for Early Depression Detection from Users’ Posts. In: Crestani, F., Losada, D.E., Parapar, J. (eds) Early Detection of Mental Health Disorders by Social Media Monitoring. Studies in Computational Intelligence, vol 1018. Springer, Cham. https://doi.org/10.1007/978-3-031-04431-1_5

DOI: https://doi.org/10.1007/978-3-031-04431-1_5
Published: 15 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04430-4
Online ISBN: 978-3-031-04431-1
eBook Packages: Intelligent Technologies and Robotics (R0)
