Abstract
In today’s world of cut-throat competition, where everyone is running an invisible race, we often find ourselves alone in a crowd. Advances in technology are making our lives easier, yet humans, as social animals, are losing touch with society. As a result, a large part of the population today suffers from psychological disorders. An inferiority complex, the inability to fulfil one’s dreams, and loneliness are among the common causes of disturbed mental stability, which may lead to disorders such as depression. In extreme cases, depression costs precious lives when an individual decides to commit suicide. The primary focus of this work is assessing an individual’s mental health in an interactive way with the core help of machine learning. To realize this objective, we use a long short-term memory (LSTM) architecture, an artificial recurrent neural network (RNN) from the field of deep learning, trained on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and FastText word embeddings, achieving 86% accuracy when fed with model–patient conversational data. Further, we discuss the scope for enhancing cognitive control over psychiatric disorders, which may otherwise lead to severe depression and suicide attempts. The proposed system helps determine the severity of a person’s depression and supports the recovery process. It comprises a wristband that measures biological parameters, a headband that analyses mental health, and a user-friendly website and mobile application with a built-in chatbot. The AI-based chatbot talks to patients and helps them reveal thoughts they are otherwise unable to communicate to their peers. A person can chat via text message, which is stored in a database for further analysis.
The novelty of this work lies in the sentiment analysis of voice chat, which creates a comfortable environment for the user.
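To make the pipeline named in the abstract concrete, the following is a minimal illustrative sketch of an embeddings-to-LSTM-to-emotion-label classifier: a single LSTM cell unrolled over a sequence of FastText-style word vectors, with a softmax over the eight RAVDESS emotion classes. The dimensions, random initialisation, and function names here are assumptions for illustration only, not the authors’ actual trained configuration.

```python
import numpy as np

# The eight emotion categories defined by the RAVDESS dataset.
EMOTIONS = ["neutral", "calm", "happy", "sad",
            "angry", "fearful", "disgust", "surprised"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_classify(seq, d_hidden=16, seed=0):
    """seq: (T, d_in) array of word embeddings. Returns class probabilities."""
    rng = np.random.default_rng(seed)
    d_in = seq.shape[1]
    # One weight matrix per gate (input i, forget f, output o, candidate g),
    # each acting on the concatenated [h_{t-1}, x_t] vector.
    W = {g: rng.normal(0, 0.1, (d_hidden, d_hidden + d_in)) for g in "ifog"}
    b = {g: np.zeros(d_hidden) for g in "ifog"}
    W_out = rng.normal(0, 0.1, (len(EMOTIONS), d_hidden))

    h = np.zeros(d_hidden)  # hidden state
    c = np.zeros(d_hidden)  # cell state
    for x in seq:
        z = np.concatenate([h, x])
        i = sigmoid(W["i"] @ z + b["i"])   # input gate
        f = sigmoid(W["f"] @ z + b["f"])   # forget gate
        o = sigmoid(W["o"] @ z + b["o"])   # output gate
        g = np.tanh(W["g"] @ z + b["g"])   # candidate cell state
        c = f * c + i * g                  # update cell state
        h = o * np.tanh(c)                 # update hidden state

    logits = W_out @ h                     # classify from the final hidden state
    p = np.exp(logits - logits.max())      # numerically stable softmax
    return p / p.sum()

# Toy input: a 12-token "utterance" of 8-dimensional embeddings.
probs = lstm_classify(np.random.default_rng(1).normal(size=(12, 8)))
print(EMOTIONS[int(np.argmax(probs))])
```

In a real system the weights would of course be learned (e.g. with a deep-learning framework rather than hand-rolled numpy), and the final hidden state could equally be fused with acoustic features extracted from the RAVDESS audio before classification.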
Acknowledgements
MM is supported by the AI-TOP (2020-1-UK01-KA201-079167) and DIVERSASIA (618615-EPP-1-2020-1-UKEPPKA2-CBHEJP) projects funded by the European Commission under the Erasmus+ programme.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Bhagat, D., Ray, A., Sarda, A., Dutta Roy, N., Mahmud, M., De, D. (2023). Improving Mental Health Through Multimodal Emotion Detection from Speech and Text Data Using Long-Short Term Memory. In: Mandal, J.K., De, D. (eds) Frontiers of ICT in Healthcare . Lecture Notes in Networks and Systems, vol 519. Springer, Singapore. https://doi.org/10.1007/978-981-19-5191-6_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-5190-9
Online ISBN: 978-981-19-5191-6
eBook Packages: Intelligent Technologies and Robotics (R0)