Abstract
In today’s world of cut-throat competition, where everyone is running an invisible race, we often find ourselves alone in a crowd. Advances in technology are making our lives easier, yet humans, as social animals, are losing touch with society. As a result, a large part of the population today suffers from psychological disorders. An inferiority complex, the inability to fulfil one’s dreams, and loneliness are among the common causes of disturbed mental stability, which may lead to disorders such as depression. In extreme cases, depression costs precious lives when an individual decides to commit suicide. The primary focus of this work is assessing an individual’s mental health in an interactive way with the core help of machine learning. To realize this objective, we use a long short-term memory (LSTM) architecture, an artificial recurrent neural network (RNN) from the field of deep learning, trained on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and FastText word embeddings, achieving 86% accuracy when fed with model–patient conversational data. Further, we discuss the scope for enhancing cognitive control over psychiatric disorders, which may otherwise lead to severe depression and suicide attempts. The proposed system helps determine the severity of a person’s depression and supports the recovery process. It comprises a wristband that measures biological parameters, a headband that analyses mental health, and a user-friendly website and mobile application with a built-in chatbot. The AI-based chatbot talks to patients and helps them reveal thoughts they are otherwise unable to communicate to their peers. A person can chat via text message, which is stored in a database for further analysis.
The novelty of this work lies in the sentiment analysis of voice chat, which creates a comfortable environment for the user.
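To make the pipeline named in the abstract concrete, the following is a minimal illustrative sketch of an embeddings-to-LSTM-to-emotion-label classifier: a single LSTM cell unrolled over a sequence of FastText-style word vectors, with a softmax over the eight RAVDESS emotion classes. The dimensions, random initialisation, and function names here are assumptions for illustration only, not the authors’ actual trained configuration.

```python
import numpy as np

# The eight emotion categories defined by the RAVDESS dataset.
EMOTIONS = ["neutral", "calm", "happy", "sad",
            "angry", "fearful", "disgust", "surprised"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_classify(seq, d_hidden=16, seed=0):
    """seq: (T, d_in) array of word embeddings. Returns class probabilities."""
    rng = np.random.default_rng(seed)
    d_in = seq.shape[1]
    # One weight matrix per gate (input i, forget f, output o, candidate g),
    # each acting on the concatenated [h_{t-1}, x_t] vector.
    W = {g: rng.normal(0, 0.1, (d_hidden, d_hidden + d_in)) for g in "ifog"}
    b = {g: np.zeros(d_hidden) for g in "ifog"}
    W_out = rng.normal(0, 0.1, (len(EMOTIONS), d_hidden))

    h = np.zeros(d_hidden)  # hidden state
    c = np.zeros(d_hidden)  # cell state
    for x in seq:
        z = np.concatenate([h, x])
        i = sigmoid(W["i"] @ z + b["i"])   # input gate
        f = sigmoid(W["f"] @ z + b["f"])   # forget gate
        o = sigmoid(W["o"] @ z + b["o"])   # output gate
        g = np.tanh(W["g"] @ z + b["g"])   # candidate cell state
        c = f * c + i * g                  # update cell state
        h = o * np.tanh(c)                 # update hidden state

    logits = W_out @ h                     # classify from the final hidden state
    p = np.exp(logits - logits.max())      # numerically stable softmax
    return p / p.sum()

# Toy input: a 12-token "utterance" of 8-dimensional embeddings.
probs = lstm_classify(np.random.default_rng(1).normal(size=(12, 8)))
print(EMOTIONS[int(np.argmax(probs))])
```

In a real system the weights would of course be learned (e.g. with a deep-learning framework rather than hand-rolled numpy), and the final hidden state could equally be fused with acoustic features extracted from the RAVDESS audio before classification.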
Acknowledgements
MM is supported by the AI-TOP (2020-1-UK01-KA201-079167) and DIVERSASIA (618615-EPP-1-2020-1-UKEPPKA2-CBHEJP) projects funded by the European Commission under the Erasmus+ programme.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Bhagat, D., Ray, A., Sarda, A., Dutta Roy, N., Mahmud, M., De, D. (2023). Improving Mental Health Through Multimodal Emotion Detection from Speech and Text Data Using Long-Short Term Memory. In: Mandal, J.K., De, D. (eds) Frontiers of ICT in Healthcare . Lecture Notes in Networks and Systems, vol 519. Springer, Singapore. https://doi.org/10.1007/978-981-19-5191-6_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-5190-9
Online ISBN: 978-981-19-5191-6
eBook Packages: Intelligent Technologies and Robotics (R0)