An ensemble approach to detect depression from social media platform: E-CLS

Tiwari, Shashank Shekher; Pandey, Rajnish; Deepak, Akshay; Singh, Jyoti Prakash; Tripathi, Sudhakar

doi:10.1007/s11042-023-17971-6

An ensemble approach to detect depression from social media platform: E-CLS

Published: 06 February 2024

Volume 83, pages 71001–71033, (2024)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Multimedia Tools and Applications Aims and scope Submit manuscript

An ensemble approach to detect depression from social media platform: E-CLS

Download PDF

Shashank Shekher Tiwari¹,
Rajnish Pandey¹,
Akshay Deepak¹,
Jyoti Prakash Singh ORCID: orcid.org/0000-0002-3742-7484¹ &
…
Sudhakar Tripathi²

276 Accesses
2 Citations
1 Altmetric
Explore all metrics

Abstract

Depression, a prevalent adult symptom, can arise from various sources, including mental health conditions and social interactions. With the rise of social media, adults often share their daily experiences, potentially revealing their emotional state on social platforms, like X (formerly Twitter) and Facebook. In this study, we present Ensemble (E) of Convolutional Neural Network (C), Attention-based Long Short-Term Memory (L) Network, and Support Vector Machine (S) (E-CLS), utilizing Term Frequency-Inverse Document Frequency (TF-IDF) vectors, Global Vectors for Word Representation (GloVe) and Bidirectional Encoder Representations from Transformers (BERT) word embeddings. This model effectively identifies depressive posts. Validated with a Twitter-derived depressive dataset, E-CLS achieves an impressive $F_{1}$-score of 0.91, surpassing existing machine-learning and deep-learning models by 2%. This research advances the detection of depression in social media posts, holding promise for enhanced mental health monitoring. Furthermore, our work contributes to the burgeoning field of mental health informatics by leveraging state-of-the-art techniques in natural language processing. The ensemble approach synergizes the strengths of Convolutional Neural Network (CNN) for local pattern recognition, Long Short-Term Memory (LSTM) Network for sequential context understanding, and Support Vector Machine (SVM) for robust classification. The incorporation of TF-IDF vectors and GloVe embeddings enriches feature representation, enhancing the model’s ability to discern nuanced linguistic cues associated with depression. By demonstrating superior performance over established models, E-CLS showcases its potential as a valuable tool in digital mental health interventions.

Comparison of Machine Learning Models for Early Depression Detection from Users’ Posts

A Hybrid Deep Learning Model Using Grid Search and Cross-Validation for Effective Classification and Prediction of Suicidal Ideation from Social Network Data

Article 16 October 2022

Depression Detection Using Deep Learning and Natural Language Processing Techniques: A Comparative Study

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

The advancement of internet and communication technology [19, 21], particularly online social media, has revitalised how people engage with one another and express themselves in public. Users of online social media platforms such as Facebook, X (formerly Twitter), and Instagram can express their emotions and sentiments about a topic, subject, or any social or general issues in writing or through multimedia content. These social media platforms have given people infinite power to communicate their feelings, emotions, and thoughts [15, 42], which was previously confined to only physical contacts.

Depression is a common illness that affects an estimated 3.8 percent of the global population, including 5.0 percent of adults and 5.7 percent of people over the age of 60^{Footnote 1}. Depression affects around 280 million individuals globally and can be harmful to one’s health, particularly if it occurs regularly and with moderate or severe intensity. It can cause a person to suffer greatly and perform poorly at work, school, and at home. At its worst, depression can lead to suicide, which is the fourth leading cause of mortality among people aged 15 to 29. Despite the fact that there are recognised, effective therapies for mental disorders, more than 75% of people in low- and middle-income countries do not receive therapy and do not discuss their situation with others. A lack of funds, a shortage of competent health-care staff, and the societal stigma associated with mental illnesses, all serve as impediments to proper diagnosis and treatment. During a depressive episode, the person feels dissatisfied, irritable, or empty for the most of the day, almost every day for at least two weeks. Impaired focus, feelings of excessive guilt or poor self-worth, hopelessness about the future, thoughts of dying or suicide, disrupted sleep, changes in diet or weight, and feeling particularly tired or low in energy are all possible symptoms. Depression can be caused by job discontent, everyday life troubles, sickness issues, a terrible relationship, exam failure, comparing oneself to others, financial situations, or an overdose of medications [28]. Social withdrawal and isolation are primary consequences of depression [27, 38] and other mental diseases [4, 52], which may lead to suicide.

Online social sites provide a simple platform for people of all ages, even those who are depressed. Examples of depressive posts are shown in Figs. 1 and 2.

The detrimental effects of depression, as well as the lack of resources in many countries throughout the world, inspired us to create a system for identifying a person suffering from depression by analysing his social media activities, particularly tweets. One of the most significant challenges to developing a machine or deep learning based system for depression detection is an adequate dataset. We began by gathering depression-related datasets from Twitter as people commonly use these social media platforms to express their feelings and emotions [50]. So identifying different people’s mental levels by looking at their social media posts can be a viable alternative. Most people utilise unpleasant or negative posts to describe their level of depression. According to these findings, peer-to-peer communities on social media can challenge stigma, raise the possibility of professional aid, and directly offer online assistance to persons suffering from mental illness. In the United States, a similar study [5] found that internet users with stigmatisation such as depression or urinary incontinence are more likely to use online resources for health-related information and communication than people with other chronic diseases, emphasising the importance of research to assist depressed individuals on social media platforms and the Internet.

The openness of social media allows people to openly and freely engage and comment on any issue on these media; on the other side, it allows specialists working in the health area to obtain insight into what is going on in someone’s mental state. Machine-learning algorithms [14] may be able to provide such insight by evaluating the unique patterns buried in online communication and processing them to reveal the mental state (such as despair, sadness, happiness, rage, anxiety, etc.) among social media users.

Despandey et al. [12], Lin et al. [32], and Orabi et al. [41] were among the researchers who attempted to diagnose depressive persons based on their tweets. On their proprietary dataset, they used both traditional machine-learning and deep-learning models.

In light of the above discussion, this paper focuses on the following Research Questions (RQs):

1.
RQ1: How does E-CLS improve depressive and non-depressive tweet classification compared to the individual models?
2.
RQ2: What is the impact of word embedding and TF-IDF in E-CLS for sentence categorization?

The notable contributions of this research, including answers to the above raised research questions, are summarized as follows:

1.
To mitigate the scarcity of publicly available depressive datasets, this paper proposes a new Depressive Dataset (DepDs), which has been labeled manually by a psychiatric expert, a professor, and a post graduate student (M. Tech.) DepDs consist of 5,000 depressive sentences and 7029 non-depressive sentences.
2.
ARQ1: We propose an ensemble model, named E-CLS, that effectively utilities the GloVe and BERT embeddings and TF-IDF features of the DepDs dataset for depression detection.
3.
ARQ2: The integration of word embedding and TF-IDF characteristics within the E-CLS network significantly amplifies its effectiveness in categorizing depressive and non-depressive sentences. Word embedding captures semantic relationships, while TF-IDF quantifies term importance. Their combined utilization enriches the model’s understanding of text, leading to more accurate categorization outcomes.
4.
We investigated the performance of seven conventional machine-learning classifiers (Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbor (KNN), Naive Bayes (NB), Gradient Boosting (GB), Decision Tree (DT), and Random Forest (RF)) and five deep-learning classifiers (Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) Network, Bi-LSTM (Bi-Directional LSTM), Attention-based (LSTM-A), and Bidirectional Encoder Representations from Transformers (BERT)). All the classification models are evaluated on DepDs dataset. Among these, E-CLS model achieves an impressive $F_{1}$-score of 0.91, surpassing existing machine-learning and deep-learning models by 2%.

To the best of our knowledge, this is the first study on the detection of depressive utterances using an ensemble of a traditional machine-learning model SVM and two deep-learning models, namely, CNN and LSTM-A, leading to an ensemble model E-CLS .

The remainder of the paper is organized as follows: Section 2 addresses the related literature of depression detection briefly. The dataset section is described in sction 3. The suggested model and its operations are detailed in Section 4. The findings of the proposed methodology are listed in Section 5. The detailed discussion is dictated in Section 6. Finally the study is concluded in Section 7.

2 Related works

Several studies on detecting depressive posts on social media have lately been published. One group of researchers [3, 6, 9, 11, 22, 25, 30, 43] used traditional machine-learning strategies to detect depressive sentences. They extracted text features such as n-grams, word tokens, TF-IDF vectorizer, sentiment polarisation, and so on. Another group of researchers, [10, 29, 41, 44, 47, 53,54,55,56], used deep-learning strategies to detect depressive statements in text. These researchers used several embedding approaches such as GloVe, word to vector (Word2Vec), and fastText. Tables 1 and 2 contain a summary of some of the prospective efforts for depression identification from social media posts.

Table 1 Prominent works in literature based on conventional machine-learning approaches

Full size table

Table 2 Prominent works in literature based on deep-learning approaches

Full size table

This section is divided into two subsections for better organisation of the associated literature: (i) depression detection using machine-learning approaches and (ii) depression detection using deep-learning approaches.

2.1 Depression detection using conventional machine-learning approaches

This section highlights research in which researchers used machine-learning techniques to identify depressed postings. Table 1 summarizes prominent works in existing literature based on conventional machine learning approaches. Khan et al. [25] used traditional machine-learning classifiers such as Multinomial NB, SVM, Random Forest RF, Decision Tree DT, KNN, and Extreme GB with countvector of the text data as a feature to achieve an accuracy of 0.87. Arun et al. [3] used an XGB classifier to reach an accuracy of 0.98 on clinical data comprising cognitive performance, mental health, and cardiometabolic illness. Hosseinifard et al. [22] employed KNN, Linear Discriminant Analysis (LDA), and LR on non-linear aspects of an electroencephalogram (EEG) signal to reach an accuracy of 0.90. Cohn et al. [9] contrasted clinical diagnosis of major depression with automatically measured facial motions and vocal prosody in patients undergoing depression treatment. To evaluate facial and vocal expression, they used manual Facial Action Coding System (FACS) coding, active appearance modelling (AAM), and pitch extraction. They had an accuracy of 0.88, 0.79, and 0.79 for diagnosing depression in a patient using FACS, AAM, and vocal prosody-based characteristics, respectively. Choudhury et al. [11] used crowdsourcing methods to produce a large corpus of depressive vs non-depressive datasets. They performed Principle Component Analysis (PCA) on the text characteristics to minimise their dimension before classifying them with SVM with an accuracy of 0.75. Jain et al. [24] suggested a depression analysis and suicidal ideation detection method based on the Parent Health Questionnaire (PHQ), which includes questions like “What’s your age?” or “Are you a regular in school/college?”. They tested multiple machine-learning models on the available answer data and reached a maximum accuracy of 0.86 with Logistic Regression. Asad et al. [1] gathered data on depression from Facebook and X (formerly Twitter). They achieved an accuracy of 0.74 by combining a text feature based on TF-IDF with SVM and NB classifiers.

Sadeque et al. [45] provided latency-based metrics for early identification of depressive posts and demonstrated that the new metrics outperformed the current statistic Early Risk Detection Error (ERDE). Peng et al. [43] presented a multi-kernel SVM-based model with user microblog content, user profile, and user behaviours as features to recognise depressed persons and achieved an F1-score of 0.76. Leiva et al. [30] employed numerous cutting-edge machine-learning models, as well as TF-IDF vectorizer and sentiment polarization, to reach an F1-score of 0.59. Islam et al. [23] worked on Facebook depressed data posted by diverse individuals. Various aspects such as emotional variables, temporal categories, and language dimension were extracted from these posts and employed with a KNN to distinguish depressive and non-depressive people with an F1-score of 0.61. Nadeem et al. [37] collected depressive and non-depressive data using a crowdsourced method. On this data, many typical machine-learning classifiers, such as DT, SVM, LR, NB, and Ridge regression were used, yielding an F1-score of 0.84. Hassan et al. [17] presented a model to detect a person’s depression state by monitoring and extracting emotions from text. The author achieved an accuracy of 0.91 by combining SVM, NB, and maximum entropy classifiers with N-gram, Part of Speech (POS), Negation, and Sentiment Analyzer data. Burdisso et al. [6] presented a new framework called Sequential S3 (Smoothness, Significance, and Sanction). When evaluated on the CLEF’s eRisk2017 dataset^{Footnote 2}, this SS3 framework used word token as a feature and produced an F1-score of 0.61, which was much higher than other state-of-the-art machine-learning and deep-learning models. Almeida et al. [2] introduced an ensemble classification strategy that includes supervised learning, information retrieval, and feature selection methods and achieved an F1-score of 0.53.

2.2 Depression detection using deep-learning approaches

Table 2 summarizes prominent works in existing literature based on deep learning learning approaches. Zogan et al. [56] suggested a Multi-Modal Depression Detection Hierarchical Attention Network (MDHAN) with social information, interaction, emoji sentiment, topic distribution, and symptoms as features and attained an $F_1$ score of 0.89. Oh et al. [40] used the depression score among panic disorder, generalised anxiety disorder, and depressive disorder diagnostic modules with dense-layer, feedforward, and neural network to achieve an Area Under Curve (AUC) of 0.91 and AUC2 of 0.89 respectively. Mumtaj et al. [36] used EEG signals to extract relevant features and used the one-layered Deep CNN and LSTM models to reach an accuracy of 0.98. Lam et al. [29] achieved a $F_1$ score of 0.87 by using 1DCNN and transformer deep-learning models using transcript text and Mel-frequency spectrogram as features. Yang et al. [55] employed deep-learning models Deep neural network (DNN) and Deep CNN (DCNN), as well as paragraph vector, Histogram of Displacement Range (HDR), Geneva Minimalistic Acoustic Parameter Set (GeMAPS), and Low Level Descriptors (LLD), to achieve Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) of 5.974 and 5.163, respectively. Orabi et al. [41] achieved an F1 score of 0.87 by combining Bi-LSTM and CNN deep-learning models with Word2Vec embedding techniques. Wu et al. [54] proposed a model called Deep Learning-based Depression Detection with Heterogeneous Data Sources (D3-HDS) to predict an individual’s depression label by assessing his/her living environment, behaviour, and social media posting contents. They achieved an accuracy of 0.74 using the Word2Vec embedding technique.

Shetty et al. [49] presented a model to predict depression in people based on their online behaviour. To reach an accuracy of 0.95, an ensemble of LTSM, CNN, and SVM, as well as TF-IDF, count vectorizer, and n-gram, were used. Wang et al. [51] investigated deep-learning methods with pre-trained language representation models for depression risk prediction, including BERT, a robustly optimised BERT pretraining approach (RoBERTa), and generalised autoregressive pretraining for language understanding (XLNET), and compared them to previous methods on a manually annotated benchmark dataset. Their results produced an $F_1$ score of 0.86. The depression risk was rated at four levels ranging from 0 to 3, with 0 denoting no inclination and 1, 2, and 3 denoting mild, moderate, and severe depression risk, respectively. Wongkoblap et al. [53] built a predictive MIL model with and without anaphora resolution to recognise users with depression from their generated textual material on Twitter, as well as predictive models that may highlight posts pertinent to mental health. The authors utilised the GloVe embedding approach and received an $F_1$ score of 0.92. Cong et al. [10] utilised a hybrid deep-learning model that combined XGBoost with attention-based Bi-LSTM. The XGBoost component was utilised to deal with the imbalance dataset, while attention-based Bi-LSTM was employed to improve the suggested model’s performance. The authors claimed to have an $F_1$ score of 0.61. Rosa et al. [44] proposed a Knowledge-Based Recommendation System (KBRS), a new system for monitoring and detecting potential users with emotional problems based on language classification with depressive or stressed content. For character-level representation a CNN is used and Bi-LSTM-RNN is used for disorder entity recognition. The researcher reported an accuracy of 0.89. Shah et al. [47] proposed a hybrid deep-learning model comprised of Bi-LSTM with various word embedding approaches and metadata variables to diagnose sad people using social media posts. The authors claimed to have an F1 score of 0.71.

To the best of our knowledge, this work is the first study on ensemble of one conventional machine-learning (SVM) model and two deep-learning models (CNN, LSTM) to detect depressive posts.

3 Dataset

3.1 Depressive dataset creation

To assemble our Depressive Dataset (DepDs), we meticulously compiled a comprehensive collection of English tweets. The selection of specific keywords and hashtags, such as “depressed", “sad", “anxiety", and “lonely", aimed to capture a diverse spectrum of tweets reflecting various aspects of mental health expression on social media. This careful curation ensured the richness and representativeness of the dataset, aligning with our primary goal of developing a robust model for the classification of depressive and non-depressive tweets.

3.1.1 Collection procedure

To address the lack of publicly available X (formerly Twitter) datasets focused on identifying depressive posts, we initiated the creation of a new dataset leveraging the X (formerly Twitter) API. The collected dataset comprises 12,029 English tweets. For detailed insights of the tweet collection process is presented in Algorithm 1.

3.1.2 Annotation

The collected tweets exhibited characteristics such as hashtags and Uniform Resource Locators (URLs) pointing to various websites and multimedia content. These features served as valuable cues for the labelling process, aiding in the classification of tweets as either depressive or non-depressive presented in Algorithm 2.

3.1.3 Label assignment

The labeling process was carried out by a psychiatric doctor, a professor and one Post Graduate (M. Tech.) student. Each person independently assessed and labeled the tweets based on their content. Subsequently, a voting-based methodology was employed to determine the final label. In cases where two or more students reached a consensus on a specific label, that label was assigned to the respective tweet. The details are shown in Algorithm 3.

3.1.4 Inter-rater agreement

To assess the consistency of labels assigned by different annotators, we utilized the Kohen Kappa (K) measure. The resulting k-value for inter-rater agreement was calculated to be 0.85, indicating a substantial level of agreement among the persons involved in the labeling process as shown in Algorithm 4.

3.1.5 Final dataset composition

Out of the 12,029 collected tweets, 7,029 were labelled as depressive, while the remaining 5,000 were designated as non-depressive. This balanced representation of both classes within the dataset contributes to its overall robustness and effectiveness for subsequent analyses.

The detailed of dataset creation were given in Algorithms (1-4)

3.2 Analysis of DepDs using BERT

Given that the dataset pertains to depressive content, where a level of 1 signifies a depressive sentence and 0 denotes a non-depressive one, the analysis was performed utilizing a pre-trained ‘bert-base-uncased’ model. The confidence distribution plot (see Figs. 3 and 4) offers valuable insights into the model’s certainty in distinguishing between depressive and non-depressive sentences. This visualization provides a clear overview of the model’s confidence scores across the dataset. For instance, when the ‘bert-base-uncased’ model and ‘sentiment-analysis’ analyzes a sentence like “I feel overwhelmed and sad", it assigns a confidence score of 0.85, indicating a high level of certainty that this sentence is depressive. Conversely, for a sentence like “The weather is lovely today", the model assigns a confidence score of 0.12, suggesting a low likelihood of depressive content.

The 2-dimensional distribution plot (see Fig. 5) provides an illuminating perspective on the relationship between confidence scores and assigned depressive levels. Imagine two axes where the x-axis represents confidence scores ranging from 0 to 1, and the y-axis represents the assigned levels (0 for non-depressive, 1 for depressive). In this plot, each data point corresponds to a sentence, with its position indicating both its confidence score and assigned level. For instance, a data point at coordinates (0.85, 1) signifies a highly confident prediction of a depressive sentence, while a point at (0.12, 0) indicates a low-confidence assignment of a non-depressive sentence. This visual representation encapsulates how the model associates confidence with actual assigned levels. The 2-dimensional distribution plot thereby offers a powerful visualization of the model’s proficiency in distinguishing between depressive and non-depressive content based on confidence scores.

Suppose we have an input text: “I feel overwhelmed and sad, and it’s hard to find joy in anything lately". This text expresses emotions commonly associated with depression. The BERT classifier analyzes this text and assigns a sentiment label and a corresponding confidence score. In this case, the classifier predict a label of ‘Depressive’ and assign a confidence score of 0.87. Here, the score of 0.87 indicates a high level of confidence in the classifier’s prediction that the input text is depressive. This means that the model is quite certain in its assessment that the text exhibits characteristics associated with depression. On the other hand, if we had a different input text like “The weather is lovely today, and I’m feeling great!", the BERT classifier predicts a label of ‘Non-Depressive’ with a lower confidence score of 0.12 for depressive and 0.88 for non-depressive. It indicates that the input text is less likely to contain depressive content, and the model’s confidence in this prediction is relatively high for non-depressive label. So, the score (see Figs. 6 and 7) provides a numerical representation of the model’s confidence in its depression detection , with higher scores indicating greater confidence for the depressive posts.

The ‘Faceted distribution’ plot (see Figs. 8 and 9) presents a detailed view of confidence score distributions for both depressive and non-depressive sentences. It partitions the dataset into separate segments based on assigned levels. One facet represents sentences labeled as depressive (assigned level 1), while the other represents non-depressive sentences (assigned level 0). Each facet displays a histogram of confidence scores, showing the frequency of sentences at different confidence levels. This visual breakdown allows for a targeted examination of the model’s performance within specific level categories. For example, in the facet for depressive sentences, a peak at high confidence scores indicates accurate classifications, while a spread across various confidence levels may signal areas for improvement. This facet-based analysis refines our understanding of how the model performs within distinct depressive and non-depressive contexts.

To the best of our knowledge, none of the X (formerly Twitter) datasets used in existing works on identifying depressive posts are publicly available. The detailed datasets used in this work can be seen from Table 3. Figure 10 shows the word cloud distribution of the depressive tweets in the dataset, while Fig. 11 shows the same for nondepressive tweets.

4 Methodology

In this section, the detailed working of the proposed ensemble model (E-CLS) is presented. We consider the prediction of all the three models (LSTM-A, CNN, SVM) and label information associated with them based on the majority voting of the three models. Along with the proposed model, several machine-learning and deep-learning models were also implemented to compare the performance of the proposed hybrid network (E-CLS). The detailed flow diagram for the proposed model can be seen in Fig. 12.

Table 3 Statistics of Datasets Used in this Work

Full size table

4.1 Problem statement

The primary objective of this research is to develop a robust and accurate model for the classification of depressive and non-depressive tweets. In pursuit of this goal, we introduce an ensemble model E-CLS, amalgamating three distinct models: LSTM-A, CNN, and SVM. Each model contributes to the prediction process based on features extracted from the input text data. To facilitate the training and evaluation of our model, we meticulously curated a novel dataset, DepDs. This dataset consists of 12,029 English tweets acquired from the X (formerly Twitter) platform through the X (formerly Twitter) API. Out of these, 70% (8,420 tweets) were used as our training set, 20% (2,406 tweets) constituted our testing set, and remaining 10% (1,203 tweets) was used as the validation set.

The classification process within the E-CLS model involves the prediction of all three models, resulting in label information associated with each prediction. These labels are further consolidated using a majority voting approach, enhancing the overall prediction accuracy. Additionally, several established machine-learning and deep-learning models are implemented for comparative evaluation.

4.2 Proposed approach

Let $M$ be the set of models, including LSTM-A, CNN, and SVM, denoted as $m_1, m_2, m_3$ respectively. For each model $m_i$, $i = 1, 2, 3$, a prediction $P_i$ is generated based on the input features. Let $L_i$ represent the label associated with the prediction of model $m_i$. The majority voting strategy is defined as:

$$ L_{\text {majority}} = {\text {*}}{argmax}_{l \in \{0, 1\}} \sum _{i=1}^{3} \mathbb {I}(L_i = l) $$

where $\mathbb {I}(x)$ is the indicator function, yielding 1 if $x$ is true and 0 otherwise. Consequently, the final prediction $L_{\text {E-CLS}}$ of the E-CLS model is determined as:

$$ L_{\text {E-CLS}} = L_{\text {majority}}(L_1, L_2, L_3) $$

where $L_{\text {E-CLS}} \in \{0, 1\}$ denotes the predicted label by the E-CLS model.

The (E-CLS) model combines the strengths of LSTM-A, CNN, and SVM models through majority voting, resulting in a powerful hybrid network capable of accurate tweet classification.

The abstract diagram (see Fig. 13) of the proposed E-CLS framework for classifying depressive tweets focusing on the classification of tweets with depressive content. Data collection involves using the Twitter API to gather a balanced dataset of both depressive and non-depressive tweets. Following this, data preprocessing and text cleaning steps ensure that the collected information is free from noise and irrelevant details. Feature extraction techniques like TF-IDF and GloVe word embedding are then applied to represent the textual data in a numerical format. The ensemble model, composed of CNN (see Fig. 14), LSTM-A (see Fig. 15), and SVM classifiers, is trained on the preprocessed data. This diagram encapsulates the entire process, emphasizing the framework’s robustness in addressing mental health-related content on social media.

4.3 Machine learning models

We used the following classifiers: (i) SVM, (ii) LR, (iii) NB, (iv) KNN, (v) DT, (vi) RF, and (vii) GB. To train these machine-learning classifiers, we extracted features from sentences using 1-gram, 2-gram, and 3-gram TF-IDF. Section 5 presents the experimental findings of these classifiers.

4.4 Deep learning models

We implemented six deep learning models: (i) Long Short-Term Memory (LSTM), (ii) LSTM-Attention (LSTM-A), (iii) Bidirectional-LSTM (Bi-LSTM), (iv) Convolutional Neural Network (CNN), (v) Ensemble of CNN, LSTM, and SVM (E-CLS) network (with BERT-embedding) and (vi) Ensemble of CNN, LSTM, and SVM (E-CLS) network (proposed model). In the case of the proposed E-CLS network and other deep-learning models, pre-trained GloVe word embedding was used to give input to the system. Word-embedding techniques generate word vectors for words that have similar contextual meanings. The reason for using GloVe is that it generally performs well as it is pre-trained over a large corpus and reduces the computation overheads. The complete architecture of the proposed E-CLS network can be seen in Fig. 12. The embedded word vectors are passed through CNN and LSTM block, whereas TF-IDF vectors were passed to the SVM block. All these three blocks are placed in parallel, as shown in Fig. 12. The output feature vector coming from the above three blocks were passed through majority voting block. This block will mark any sentence as depressive if at least any two block will detect this sentence as depressive otherwise the majority voting block will mark it as non-depressive. The hyperparameter details of the deep learning-based approaches are given in Table 4, while the same for the SVM classifier is given in Table 5.

Motivation for choosing CNN, LSTM, and SVM as baselines By incorporating these diverse baselines into our ensemble model, we aim to leverage the complementary strengths of each individual model to improve overall performance. This approach allows us to benefit from the local feature extraction capabilities of CNN, the sequential modeling strengths of LSTM, and the robustness of SVM in separating classes in a high-dimensional feature space.

Table 4 Hyper-parameter settings for the deep neural network-based approaches

Full size table

Table 5 Basic Hyperparameters for SVM Classifier

Full size table

4.5 Word embedding vectors

Representing each text into its corresponding vectors [20] is a necessary step to provide input to the deep-learning models. In this work, we have used word embedding to extract the robust features from the text. We separated each sentence into words, and then we created a 200-dimensional embedding vector for each word. Word-embedding vectors are initialized randomly from the uniform distribution. These weights are then trained using back-propagation.

4.6 Convolutional neural network

The convolution process of the CNN [26] model is used to extract the semantic features of the sentence by using n-gram information. In the CNN model, the convolution process involves a filter $ F\epsilon R^{h*k} $, with the size of h words with k dimensions (same as the embedding vector). This filter is applied to the text matrix $ T_{i} $ and element-wise multiplication is performed, then the summation of all values are passed through a non-linear function to produce a new feature. Next time, this filter is again applied to the text matrix by moving one column towards the right and convolved with the next h words of the matrix $ T_{i} $. Subsequently, it is passed through the non-linear function to produce a new feature, and so on. A feature $ c_{i} $ for a window of words $ W_{i:i+h-1} $ can be generated as:

$$\begin{aligned} c_{i} = \phi (F. W_{i:i+h-1} + b) \end{aligned}$$

(1)

where, $ b\epsilon R $ is the bias and $ \phi $ is a non-linear function. This filter is carried out to each feasible word window having m words, $ W_{1:h} $, $ W_{2:h+1} $, .......,$ W_{m-h+1:m} $, and each is embedded in n-dimension to produce a feature map.

$$\begin{aligned} c = [c_{1}, c_{2}, \ldots .., c_{m-h+1}] \end{aligned}$$

(2)

where $ c\epsilon R^{m-h+1} $.

A simple convolution operation with a filter having size $h=3$ is represented as:

$$ \begin{bmatrix} &{} W_1 &{} W_2 &{} W_3 &{} \dots &{} W_m \\ &{} \alpha _{11} &{} \alpha _{21} &{} \alpha _{31} &{} \dots &{} \alpha _{m1}\\ &{} \alpha _{12} &{} \alpha _{22} &{} \alpha _{32} &{} \dots &{}\alpha _{m2}\\ &{} \alpha _{13} &{} \alpha _{23} &{} \alpha _{33} &{} \dots &{}\alpha _{m3}\\ &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots \\ &{} \alpha _{1K} &{} \alpha _{2K} &{} \alpha _{3K} &{} \dots &{} \alpha _{mK}\\ \end{bmatrix} \bullet \begin{bmatrix} &{} f_{11} &{} f_{21} &{} f_{31} \\ &{} f_{12} &{} f_{22} &{} f_{32} \\ &{} f_{13} &{} f_{23} &{} f_{33} \\ &{} \vdots &{} \vdots &{} \vdots \\ &{} f_{1K} &{} f_{2K} &{} f_{3K} \\ \end{bmatrix} $$

$$\begin{bmatrix} &{} c_1 \\ &{} c_2 \\ &{} c_3 \\ &{} \vdots \\ &{} c_{m-h+1} \\ \end{bmatrix} = \begin{bmatrix} &{} f\{\alpha _{11}f_{11} +\alpha _{21}f_{21}+\alpha _{31}f_{31}+\alpha _{12}f_{12}+ \alpha _{22}f_{22}+\alpha _{32}f_{32}+ \dots + \\ {} &{} \alpha _{1K}f_{1K} + \alpha _{2K}f_{2K}+\alpha _{3K}f_{3K}\} \\ &{} f\{\alpha _{21}f_{11} +\alpha _{31}f_{21}+\alpha _{41}f_{31}+\alpha _{22}f_{12}+ \alpha _{32}f_{22}+\alpha _{42}f_{32}+ \dots + \\ {} &{} \alpha _{2K}f_{1K} + \alpha _{3K}f_{2K}+\alpha _{4K}f_{3K}\} \\ &{} f\{\alpha _{31}f_{11} +\alpha _{41}f_{21}+\alpha _{51}f_{31}+\alpha _{32}f_{12}+ \alpha _{42}f_{22}+\alpha _{52}f_{32}+ \dots + \\ {} &{} \alpha _{3K}f_{1K} + \alpha _{4K}f_{2K}+\alpha _{5K}f_{3K}\} \\ &{} \vdots \\ &{} f\{\alpha _{(m-2)1}f_{11} +\alpha _{(m-1)1}f_{21}+\alpha _{(m)1}f_{31}+\alpha _{(m-2)2}f_{12}+ \alpha _{(m-1)2}f_{22}+ \\ {} &{} \alpha _{(m)2}f_{32}+ \dots + \alpha _{(m-2)K}f_{1K} + \alpha _{(m-1)K}f_{2K}+\alpha _{(m)K}f_{3K} \} \\ \end{bmatrix} $$

where, $[\alpha _{m1}~ \alpha _{m2}~ \alpha _{m3} \ldots .. \alpha _{mK}]$ represents the embedding of word $W_{m}$.

We used a Rectified Linear Unit (ReLu) as an activation function. The ReLu activation function is defined as:

$$\begin{aligned} \tau (x) = max (0, x) \end{aligned}$$

(3)

This means for $\tau (x) < 0 $ it returns 0 and for $\tau (x) \ge 0 $ it returns x itself.

We used this function because it improves the training of CNN by speeding up the training process, as the computation step in ReLu is easier.

The purpose of applying the pooling operation is to get the most important feature in each of the windows, i.e., the one with the highest value. Similarly, we got a number-of-features matrix, one for each of the filters. After the convolution layer, we concatenated each of the matrices and flattened it to a single feature vector as seen in Fig. 12. Obtained features from the flattened layer are passed through a fully connected dense layer with a softmax activation function, which predicts the output as a probability of occurrence of a depressive and non-depressive sentence. To overcome the situation of over-fitting at the dense layers, we used dropout as the regularization technique. Dropout prevents the interdependency between the hidden neurons by simply dropping it out randomly with the probability of $ D_{p} $. This allows the neural network to learn more robust features and speed up the training.

4.7 Long short-term memory

The long short-term memory (LSTM) [46] is the extended form of recurrent neural network (RNN) consisting of special units in the recurrent hidden layer known as memory blocks. All the memory cells are interconnected to each other. Each memory cell consists of an input gate, output gate, forget gate, and cell state. LSTM has the capability to keep long-term dependencies using these gates and cell state. The information is added or deleted from the cell state based on their importance and is encoded by different gates. The detailed working of an LSTM cell can be seen in [18, 46].

4.8 Support vector machine

In this section we give some theoretical definitions of SVMs. Assume that we are given the training data

$(x_{i}, y_{i}),...,(x_{l}, y_{l}), x_{i}\in R^n, y_{i}\in (+1, -1).$

The decision function $\phi $ in SVM framework is defined as:

$$\begin{aligned} \phi (x) = sgn(\theta (x)) \end{aligned}$$

(4)

$$\begin{aligned} \theta (x) = \sum _{i=1}^{l} y_{i} \alpha _{i}K(x_{i}, x) + b \end{aligned}$$

(5)

where K is a kernel function, $b\in R$ is a threshold, and $\alpha _{i}$ are weights. Besides the weights $\alpha _{i}$ satisfies the following constraints:

$\forall i : 0 \le \alpha _{i} \le C$ and $\sum _{i=1}^{l}\alpha _{i} y_{i} = 0$

where C is a misclassification cost. The vectors $x_{i}$ with non-zero i are called Support Vectors. For linear SVMs, the kernel function K is defined as:

$K(x_{i}, x) - x_{i}.x$

In this case, (5) can be written as:

$$\begin{aligned} \theta (x) = w.x + b \end{aligned}$$

(6)

where $w = \sum _{i=1}^{l} y_{i}\alpha _{i} x_{i}.$ To train an SVM is to find i and b by solving the following optimization problem:

maximize $\sum _{i=1}^{l} \alpha _{i} - \frac{1}{2} \sum _{i=1}^{l} \alpha _{i} \alpha _{j} y_{i} y_{j}K(x_{i}, x_{j})$

subject to $\forall i : 0 \le \alpha _{i} \le C$ and $\sum _{i=1}^{l}\alpha _{i} y_{i} = 0.$

The solution gives an optimal hyperplane, which is a decision boundary between the two classes.

4.9 Proposed E-CLS network

In many cases single machine-learning and/or deep-learning models are not sufficient to correctly predict depressive or non-depressive post. For the proposed dataset of depression, an ensemble of SVM, CNN, and LSTM outperform various machine-learning and deep-learning models.

The E-CLS algorithm is devised for the classification of textual statements into either Depressive or Non-Depressive categories. Initially, each statement undergoes preprocessing to eliminate extraneous information, followed by tokenization into individual words. Subsequently, an embedding matrix is generated to represent the semantic information of each word in the statement. Simultaneously, TF-IDF vectors are computed to create a high-dimensional representation of the statement. These embeddings and TF-IDF vectors serve as inputs to an ensemble of LSTM, CNN, and SVM. These models are adept at capturing contextual, pattern-based, and high-dimensional information, respectively. The predictions from these models are amalgamated, and if the cumulative prediction score surpasses or equals 2, the statement is classified as depressive; otherwise, it is labeled as non-depressive. This amalgamation of LSTM, CNN, and SVM models, coupled with TF-IDF vectors, enables a comprehensive understanding of the sentiment and content of social media posts, enhancing the classification process’s accuracy and robustness.

5 Results

Extensive experiments were performed with conventional machine-learning and deep-learning models. The performance of the system is measured using precision, recall, $F_1$-score, confusion matrix, and AUC-ROC curves. The details of each of the evaluation metrics are listed in Section 5.1.

5.1 Evaluation metrics

To evaluate the performance of the proposed model, we used precision, recall, $F_1$-score, confusion matrix, and AUC-ROC curve. The precision of depressive/non-depressive sentences can be calculated by obtaining the ratio of predicted depressive/non-depressive sentences as depressive/non-depressive sentences to total predicted depressive/non-depressive sentences.

$$\begin{aligned} \text {Precision (depressive)} = \frac{\text {Number of accurately predicted depressive post}}{\text {Total number of predicted depressive post}} \end{aligned}$$

(7)

Recall of depressive/non-depressive sentences can be calculated as the ratio of predicted depressive/non-depressive sentences as depressive/non-depressive sentences to total actual depressive/non-depressive sentences.

$$\begin{aligned} \text {Recall (depressive)} = \frac{\text {Number of accurately predicted depressive sentences}}{\text {Total number of actual depressive statements}} \end{aligned}$$

(8)

$F_1$-score is the harmonic mean of precision and recall. The mathematical equations for the precision, recall, and $F_1$-score can be seen from (7), (8), and (9), respectively.

$$\begin{aligned} F_{1}\text {-score (depressive)} = 2 \times \frac{\text {Precision} \times \text {Recall}}{\text {Precision} + \text {Recall}} \end{aligned}$$

(9)

The confusion matrix is used to show the distribution of the predicted testing samples in both depressive and non-depressive classes. Confusion matrices using SVM, CNN, LSTM, and the proposed E-CLS network can be seen from Figs. 16, 17, 18, and 19, respectively.

The Area Under The Curve-Receiver Operating Characteristics (AUC-ROC) curve plotted with True Positive Rate (TPR) against the False Positive Rate (FPR). The detailed description of the AUC-ROC curve can be seen in (Fig. 20, 21, 22) for different deep-learning models (LSTM, CNN, and E-CLS)

Table 6 Results of conventional machine-learning model

Full size table

Table 7 Results of deep-learning models and the proposed E-CLS model

Full size table

The result of different conventional machine-learning classifiers and deep-learning models are listed in Tables 6 and 7. With a conventional machine-learning classifier, SVM reported the best performance with an F1-score of 0.89 whereas the KNN classifier yielded the worst performance with an F1-score of 0.61. While with deep-learning models, the proposed E-CLS with GloVe-embedding model achieved an F1-score of 0.91, which is greater than the other implemented machine-learning and deep neural network-based models. The confusion matrix for the proposed model can be seen in Fig. 19. The confusion matrix shows that our model detects 89% of depressive statements as depressive and 92% of non-depressive statements as non-depressive.

5.2 E-CLS with BERT-embedding

The E-CLS model, leveraging BERT-embedding, demonstrated promising performance on the Proposed Dataset (DepDs). Specifically, for the depressive class, the model achieved a precision of 0.87, recall of 0.85, and an F1-score of 0.86. In the non-depressive class, the precision, recall, and F1-score were 0.90, 0.92, and 0.91, respectively. The weighted average precision, recall, and F1-score for the E-CLS with BERT-embedding were 0.89, 0.89, and 0.89 (see Table 7).

5.3 E-CLS with GloVe-embedding

The E-CLS model, utilizing GloVe-embedding, exhibited strong performance on the proposed dataset (DepDs). For the depressive class, the precision, recall, and F1-score were 0.91, 0.88, and 0.90, respectively. In the non-depressive class, the model achieved a precision of 0.90, recall of 0.94, and F1-score of 0.92. The weighted average precision, recall, and F1-score for the E-CLS with GloVe-embedding were 0.90, 0.91, and 0.91 as shown bold in Table 7.

The major finding of this research is that the proposed E-CLS model performed best for identifying depression on the collected proposed dataset. The E-CLS achieved an F1-score of 0.91 (see Table 7) on the proposed datasets. From these results, it is evident that the proposed E-CLS model is performing significantly well.

The comparison of our work to others is not direct because some of them used different datasets. However, our model performed better than other similar models. In fact, our proposed E-CLS model outperformed as it achieved an F1-score of 0.91.

5.4 Qualitative analysis

The E-CLS model efficiently processes sentences through a combination of its components: CNN, LSTM, and SVM. For example, consider the sentence ‘I am feeling very low’ which is analyzed by the E-CLS model. The CNN, specialized in detecting local patterns, identifies the word ‘low’ as a potential indicator of low mood or depression. The LSTM, designed for understanding sequential context, recognizes the phrase’s structure and interprets it as an expression of the current emotional state. The SVM, synthesizing the CNN and LSTM outputs, integrates these insights to confidently predict that the sentence conveys feelings of low mood or depression. This analysis showcases how the ensemble model effectively harnesses the strengths of its components, demonstrating its proficiency in detecting signs of depression in social media text.

The E-CLS model exhibits varying performance metrics across demographic subsets, which we define as Group A (users aged 18-30 employing informal language) and Group B (users aged 40-60 with a tendency towards formal linguistic styles). Within Group A, the model demonstrates higher precision, likely attributed to the prevalence of specific keywords or expressions associated with depressive sentiments in this age bracket. Conversely, in Group B, characterized by a more formal language style among older users, the model tends to exhibit higher recall, potentially due to the presence of more overt indicators of depressive sentiments. This subgroup analysis illuminates the influence of demographic characteristics and linguistic styles on the model’s performance. For instance, expressions like ‘feeling down’ may be prevalent in Group A, while phrases indicating diminished pleasure in activities may be more common in Group B. These observations underscore how linguistic nuances and demographic factors play a significant role in influencing the model’s effectiveness.

5.5 Error analysis

In order to check the robustness of the model, we manually checked a few tweets, their associated label and the predicted labels. Some of the manually checked tweets are presented in Table 8. The error analysis of our E-CLS model involved examining individual predictions on a set of real-world sentences. For instance, the sentence ‘I can’t find joy in anything anymore’ was correctly classified as depressive due to the presence of negative language and expressions of hopelessness. Conversely, the sentence ‘Life is beautiful and full of opportunities’ was appropriately classified as non-depressive, given its positive and optimistic tone, which highlights the beauty of life and the potential opportunities it offers.

Table 8 Error Analysis of E-CLS Model

Full size table

6 Discussion

The major findings of this research is that an ensemble model of LSTM, CNN and SVM with tweets can be used to estimate the sign of depression in a person with a very good accuracy. The other major contributions of this research is the curated dataset for the said task. The dataset collected is relatively larger in size than existing datasets and hence a better option to be used for evaluating a deep learning model.

Due to the public unavailability of the datasets used by other prominent works on depression, it was not possible to compare our proposed model with state-of-the-art conventional machine-learning and deep-learning models on the dataset(s) used in the respective literature. Hence, we implemented and tested the prominent state-of-the-art models on our proposed dataest. We have implemented Khan et al. [25] SVM model on our depressive dataset (DepDs) and obtained an F1-score of .89. Similarly, we have also implemented LSTM-A derived from Orabi et al. [41] and CNN derived from Shetty et al. [49] to get an F1-score of .89 and .87, respectively; the results have been shown in Table 9. It can be clearly seen that our proposed E-CLS model achieves an improvement of at least 2% over state-of-the-art techniques.

Table 9 Comparative analysis of proposed work with state-of-the-art models

Full size table

One of the major limitations of the current work is the usage of textual content only. Nowadays emojis and hyperlinks are a major part of the most tweet and they play a significant role in expressing the emotions of a user. A tweet with negative sentiments having a happy emoji may indicate a depressive post. However, in the current work, we have ignored these emojis. The audio and video contents also form a part of some tweets with hyperlinks. We have not explored the associated hyperlinks and the related content.

7 Conclusion

To encapsulate our findings and contributions, we have delved into the intricate task of identifying depressive posts within social media textual content. Through the integration of Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Support Vector Machine (SVM) networks, our ensemble model, E-CLS, showcases remarkable efficacy. Leveraging TF-IDF and GloVe word embeddings, the model excels in discerning between depressive and non-depressive posts, achieving an impressive F1-score of 0.91. This performance outshines several established machine-learning and deep-learning counterparts.

Looking ahead, this study lays a solid foundation for further exploration. We envision extending our approach to accommodate multilingual scenarios, addressing the diverse linguistic landscape found in non-native English-speaking regions. Additionally, incorporating diverse media modalities such as images, videos, and audio clips holds promise for even more comprehensive mental health monitoring on social media platforms. Furthermore, the integration of emojis and hyperlinks, ubiquitous elements in social media communication, presents a compelling area for future investigation.

This study makes a substantial stride in the domain of depression detection from social media posts. The ensemble model showcases promising performance, and its adaptability to different languages and inclusion of diverse media types hold potential for significant impact in mental health monitoring through social media platforms.

Data Availability

The research data will be shared as requested.

Notes

References

Al Asad N, Pranto MAM, Afreen S, Islam MM (2019) Depression detection by analyzing social media posts of user. In: 2019 IEEE international conference on signal processing, information, communication & systems (SPICSCON) (pp. 13–17). IEEE
Almeida H, Briand A, Meurs MJ (2017) Detecting early risk of depression from social media user-generated content. In: CLEF (Working Notes) (pp. 1–12)
Arun V, Prajwal V, Krishna M, Arunkumar B, Padma S, Shyam V (2018) A boosted machine learning approach for detection of depression. In: 2018 IEEE symposium series on computational intelligence (SSCI) (pp. 41–47). IEEE
Bell V (2007) Online information, extreme communities and internet therapy: Is the internet good for our mental health? J Ment Health 16:445–457
Article Google Scholar
Bodenheimer T, Chen E, Bennett HD (2009) Confronting the growing burden of chronic disease: can the us health care workforce do the job? Health Aff 28:64–74
Article Google Scholar
Burdisso SG, Errecalde M, Montes-y Gómez M (2019) A text classification framework for simple and effective early depression detection over social media streams. Expert Syst Appl 133:182–197
Article Google Scholar
Chang B, Choi Y, Jeon M, Lee J, Han K-M, Kim A, Ham B-J, Kang J (2019) Arpnet: Antidepressant response prediction network for major depressive disorder. Genes 10. https://www.mdpi.com/2073-4425/10/11/907
Chen X, Sykora MD, Jackson TW, Elayan S (2018). What about mood swings: Identifying depression on twitter with temporal measures of emotions. In: Companion proceedings of the web conference 2018 (pp. 1653–1660)
Cohn JF, Kruez TS, Matthews I, Yang Y, Nguyen MH, Padilla MT, Zhou F, De la Torre F (2009) Detecting depression from facial actions and vocal prosody. In: 2009 3rd international conference on affective computing and intelligent interaction and workshops (pp. 1–7). IEEE
Cong Q, Feng Z, Li F, Xiang Y, Rao G, Tao C (2018) Xa-bilstm: A deep learning approach for depression detection in imbalanced data. In: 2018 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 1624–1627). IEEE
De Choudhury M, Counts S, Horvitz E (2013) Social media as a measurement tool of depression in populations. In: Proceedings of the 5th annual ACM web science conference (pp. 47–56)
Deshpande M, Rao V (2017) Depression detection using emotion artificial intelligence. In: 2017 International conference on intelligent sustainable systems (ICISS) (pp. 858–862). IEEE
Fatima I, Mukhtar H, Ahmad HF, Rajpoot K (2018) Analysis of user-generated content from online social communities to characterise and predict depression degree. J Inf Sci 44:683–695. https://doi.org/10.1177/0165551517740835
Article Google Scholar
Geelan T (2021) Introduction to the special issue-the internet, social media and trade union revitalization: Still behind the digital curve or catching up? New Technol Work Employ 36:123–139
Article Google Scholar
Glinow MAV, Shapiro DL, Brett JM (2004) Can we talk, and should we? managing emotional conflict in multicultural teams. Acad Manage Rev 29:578–592
Article Google Scholar
Hasib KM, Islam MR, Sakib S, Akbar MA, Razzak I, Alam MS (2023) Depression detection from social networks data based on machine learning and deep learning techniques: An interrogative survey. IEEE Trans Comput Soc Syst
Hassan AU, Hussain J, Hussain M, Sadiq M, Lee S (2017) Sentiment analysis of social networking sites (sns) data using machine learning approach for the measurement of depression. In: 2017 International conference on information and communication technology convergence (ICTC) (pp. 138–140). IEEE
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780
Article Google Scholar
Hossain MR, Hoque MM, Siddique N (2023) Leveraging the meta-embedding for text classification in a resource-constrained language. Eng Appl Artif Intell 124:106586
Article Google Scholar
Hossain MR, Hoque MM, Siddique N (2023) Leveraging the meta-embedding for text classification in a resource-constrained language. Eng Appl Artif Intell 124:106586. https://doi.org/10.1016/j.engappai.2023.106586. https://www.sciencedirect.com/science/article/pii/S0952197623007704
Hossain MR, Hoque MM, Siddique N, Sarker IH (2023) Covtinet: Covid text identification network using attention-based positional embedding feature fusion. Neural Comput Appl 35:13503–13527
Article Google Scholar
Hosseinifard B, Moradi MH, Rostami R (2013) Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from eeg signal. Comput Methods Programs Biomed 109:339–345
Article Google Scholar
Islam MR, Kamal ARM, Sultana N, Islam R, Moni MA et al (2018) Detecting depression using k-nearest neighbors (knn) classification technique. In: 2018 International conference on computer, communication, chemical, material and electronic engineering (IC4ME2) (pp. 1–4). IEEE
Jain S, Narayan SP, Dewang RK, Bhartiya U, Meena N, Kumar V (2019) A machine learning based depression analysis and suicidal ideation detection system using questionnaires and twitter. In: 2019 IEEE students conference on engineering and systems (SCES) (pp. 1–6). IEEE
Khan MRH, Afroz US, Masum AKM, Abujar S, Hossain SA (2020) Sentiment analysis from bengali depression dataset using machine learning. In: 2020 11th international conference on computing, communication and networking technologies (ICCCNT) (pp. 1–5). IEEE
Kim J, Lee M (2014). Robust lane detection based on convolutional neural network and random sample consensus. In: International conference on neural information processing (pp. 454–461). Springer
Kour H, Gupta MK (2022) An hybrid deep learning approach for depression prediction from user tweets using feature-rich cnn and bi-directional lstm. Multimed Tools Appl 81:23649–23685
Article Google Scholar
Kumar KS, Srivastava S, Paswan S, Dutta AS et al (2012) Depression-symptoms, causes, medications and therapies. Pharma Innov 1:37
Google Scholar
Lam G, Dongyan H, Lin W (2019) Context-aware deep learning for multi-modal depression detection. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 3946–3950). IEEE
Leiva V, Freire A (2017) Towards suicide prevention: early detection of depression on social media. In: International conference on internet science (pp. 428–436). Springer
Li D, Chaudhary H, Zhang Z (2020) Modeling spatiotemporal pattern of depressive symptoms caused by covid-19 using social media data mining. Int J Environ Res Public Health 17. https://www.mdpi.com/1660-4601/17/14/4988
Lin C, Hu P, Su H, Li S, Mei J, Zhou J, Leung H (2020) Sensemood: Depression detection on social media. In: Proceedings of the 2020 international conference on multimedia retrieval (pp. 407–411)
Lin E, Kuo PH, Liu YL, Yu YWY, Yang AC, Tsai SJ (2018) A deep learning approach for predicting antidepressant response in major depression using clinical and genetic biomarkers. Front Psychiatry 9. https://doi.org/10.3389/fpsyt.2018.00290. https://www.frontiersin.org/articles/10.3389/fpsyt.2018.00290.
Mehltretter J, Fratila R, Benrimoh D, Kapelner A, Perlman K, Snook E, Israel S, Miresco M, Turecki G (2019) Differential treatment benefit prediction for treatment selection in depression: A deep learning analysis of star*d and co-med data. BioRxiv. https://doi.org/10.1101/679779
Mehltretter J, Rollins C, Benrimoh D, Fratila R, Perlman K, Israel S, Miresco M, Wakid M, Turecki G (2020) Analysis of features selected by a deep learning model for differential treatment selection in depression. Front Artif Intell 2. https://doi.org/10.3389/frai.2019.00031. https://www.frontiersin.org/articles/10.3389/frai.2019.00031
Mumtaz W, Qayyum A (2019) A deep learning framework for automatic diagnosis of unipolar depression. Int J Med Inform 132:103983
Article Google Scholar
Nadeem M (2016) Identifying depression on twitter. arXiv:1607.07384
Naslund JA, Aschbrenner KA, Marsch LA, Bartels S (2016) The future of mental health care: peer-to-peer support and social media. Epidemiol Psychiatr Sci 25:113–122
Article Google Scholar
Nguyen KP, Fatt CC, Treacher A, Mellema C, Trivedi MH, Montillo A (2019) Predicting response to the antidepressant bupropion using pretreatment fmri. In: Rekik I, Adeli E, Park SH (eds) Predictive Intelligence in Medicine. Springer International Publishing, Cham, pp 53–62
Chapter Google Scholar
Oh J, Yun K, Maoz U, Kim T-S, Chae J-H (2019) Identifying depression in the national health and nutrition examination survey data using a deep learning algorithm. J Affect Disord 257:623–631
Article Google Scholar
Orabi AH, Buddhitha P, Orabi MH, Inkpen D (2018) Deep learning for depression detection of twitter users. In: Proceedings of the fifth workshop on computational linguistics and clinical psychology: from keyboard to clinic (pp. 88–97)
Park SJ, Lim YS, Sams S, Nam SM, Park HW (2011) Networked politics on cyworld: The text and sentiment of korean political profiles. Soc Sci Comput Rev 29:288–299
Article Google Scholar
Peng Z, Hu Q, Dang J (2019) Multi-kernel svm based depression recognition using social media data. Int J Mach Learn Cybern 10:43–57
Article Google Scholar
Rosa RL, Schwartz GM, Ruggiero WV, Rodríguez DZ (2018) A knowledge-based recommendation system that includes sentiment analysis and deep learning. IEEE Trans Industr Inform 15:2124–2135
Article Google Scholar
Sadeque F, Xu D, Bethard S (2018) Measuring the latency of depression detection in social media. In: Proceedings of the eleventh acm international conference on web search and data mining (pp. 495–503)
Sak H, Senior AW, Beaufays F (2014) Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In: INTERSPEECH (pp. 338–342)
Shah FM, Ahmed F, Joy SKS, Ahmed S, Sadek S, Shil R, Kabir MH (2020) Early depression detection from social network using deep learning techniques. In: 2020 IEEE region 10 symposium (TENSYMP) (pp. 823–826). IEEE
Shatte AB, Hutchinson DM, Fuller-Tyszkiewicz M, Teague SJ (2020) Social media markers to identify fathers at risk of postpartum depression: A machine learning approach. Cyberpsychol Behav Soc Netw 23:611–618
Article Google Scholar
Shetty NP, Muniyal B, Anand A, Kumar S, Prabhu S (2020) Predicting depression using deep learning and ensemble algorithms on raw twitter data. Int J Electr Comput Eng 10:3751
Google Scholar
Stieglitz S, Dang-Xuan L (2013) Emotions and information diffusion in social media-sentiment of microblogs and sharing behavior. J Manag Inf Syst 29:217–248
Article Google Scholar
Wang X, Chen S, Li T, Li W, Zhou Y, Zheng J, Chen Q, Yan J, Tang B (2020) Depression risk prediction for chinese microblogs via deep-learning methods: Content analysis. JMIR Med Inform 8:e17958
Article Google Scholar
Wang Z, Ho S-B, Cambria E (2020) A review of emotion sensing: categorization models and algorithms. Multimed Tools Appl 79:35553–35582
Article Google Scholar
Wongkoblap A, Vadillo MA, Curcin V et al (2021) Deep learning with anaphora resolution for the detection of tweeters with depression: Algorithm development and validation study. JMIR Ment Health 8:e19824
Article Google Scholar
Wu MY, Shen C-Y, Wang ET, Chen AL (2020) A deep architecture for depression detection using posting, behavior, and living environment data. J Intell Inf Syst 54:225–244
Article Google Scholar
Yang L, Jiang D, Xia X, Pei E, Oveneke MC, Sahli H (2017) Multimodal measurement of depression using deep learning models. In: Proceedings of the 7th annual workshop on audio/visual emotion challenge (pp. 53–59)
Zogan H, Wang X, Jameel S, Xu G (2020) Depression detection with multi-modalities using a hybrid deep learning model on social media. arXiv:2007.02847

Download references

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology Patna, Patna, India
Shashank Shekher Tiwari, Rajnish Pandey, Akshay Deepak & Jyoti Prakash Singh
Department of Information Technology, Rajkiya Engineering College, Ambedkarnagar, India
Sudhakar Tripathi

Authors

Shashank Shekher Tiwari
View author publications
You can also search for this author in PubMed Google Scholar
Rajnish Pandey
View author publications
You can also search for this author in PubMed Google Scholar
Akshay Deepak
View author publications
You can also search for this author in PubMed Google Scholar
Jyoti Prakash Singh
View author publications
You can also search for this author in PubMed Google Scholar
Sudhakar Tripathi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jyoti Prakash Singh.

Ethics declarations

Conflict of interest

The authors do not have any conflict of interest to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Tiwari, S.S., Pandey, R., Deepak, A. et al. An ensemble approach to detect depression from social media platform: E-CLS. Multimed Tools Appl 83, 71001–71033 (2024). https://doi.org/10.1007/s11042-023-17971-6

Download citation

Received: 10 February 2023
Revised: 27 November 2023
Accepted: 21 December 2023
Published: 06 February 2024
Issue Date: August 2024
DOI: https://doi.org/10.1007/s11042-023-17971-6

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

An ensemble approach to detect depression from social media platform: E-CLS

Abstract

Similar content being viewed by others

Comparison of Machine Learning Models for Early Depression Detection from Users’ Posts

A Hybrid Deep Learning Model Using Grid Search and Cross-Validation for Effective Classification and Prediction of Suicidal Ideation from Social Network Data

Depression Detection Using Deep Learning and Natural Language Processing Techniques: A Comparative Study

Explore related subjects

1 Introduction

2 Related works

2.1 Depression detection using conventional machine-learning approaches

2.2 Depression detection using deep-learning approaches

3 Dataset

3.1 Depressive dataset creation

3.1.1 Collection procedure

3.1.2 Annotation

3.1.3 Label assignment

3.1.4 Inter-rater agreement

3.1.5 Final dataset composition

3.2 Analysis of DepDs using BERT

4 Methodology

4.1 Problem statement

4.2 Proposed approach

4.3 Machine learning models

4.4 Deep learning models

4.5 Word embedding vectors

4.6 Convolutional neural network

4.7 Long short-term memory

4.8 Support vector machine

4.9 Proposed E-CLS network

5 Results

5.1 Evaluation metrics

5.2 E-CLS with BERT-embedding

5.3 E-CLS with GloVe-embedding

5.4 Qualitative analysis

5.5 Error analysis

6 Discussion

7 Conclusion

Data Availability

Notes

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation