1 Introduction

COVID-19 has made quarantine and isolation commonplace, with a strongly negative effect on individuals, who may develop depression, anxiety or suicidal ideation as a result. Workplaces and educational institutions are doing their best to address the situation, but their efforts are insufficient. The challenge is to develop a model that can accurately predict an individual’s mental health state and assist in monitoring and treating it at an early stage.

The primary goal of this paper is to predict a person’s mental well-being from their activity on social media platforms, specifically Twitter. Based on the individual’s tweets, the proposed system predicts a mental health score. This can benefit people who use social media platforms daily by helping them monitor their mental health and manage stress.

2 Related Work

This section summarizes related work on predicting a person’s mental well-being from social platforms.

S. E. Jordan et al. surveyed the use of Twitter data for public health prediction, covering the various methods used to mine Twitter data for public health monitoring. The survey considered research papers, published from 2010 to 2017, in which Twitter data is classified at the level of users or tweets. It distinguishes the approaches used to categorize Twitter content. Although the studies are difficult to compare because there are so many ways of using Twitter and interpreting its data, this state-of-the-art review highlights the great potential of Twitter for public health surveillance.

Heiervang et al. conducted a structured psychiatric interview with parents to assess their child’s mental state. Parents were interviewed face to face in 2003 and completed the interview online in 2006. In both surveys, the interviews were preceded by printed questionnaires covering child and family variables. Web-based surveys can be completed more quickly and at lower cost than traditional methods such as personal interviews. Point estimates of psychopathology appear to be particularly vulnerable to selective participation, although patterns of association appear to be more robust.

3 Proposed System

COVID-19 has spread across the globe, isolating numerous people and causing adverse mental health effects for some, such as anxiety, depression, suicide and self-harm. Workplaces and educational institutions that promote mental well-being and support people with mental health conditions are more likely to reduce absenteeism, improve productivity and reap the associated professional and personal benefits. The challenge is to build a model that predicts the mental health of individuals and thereby helps mental health providers to carry on with treatment in this period of scarcity.

Today, most mental health problems are identified and treated at a late stage. We propose an approach to mental health detection that uses users’ tweets as an early detection system to actively identify probable mental health conditions. A machine learning framework has been developed to assess a person’s mental well-being. We analyse a person’s tweets, along with a few of their personal details and their answers to a quiz, and predict whether or not the person should see a therapist. The proposed approach applies naïve Bayes and linear regression to a person’s tweets to assess their mental health. To improve the prediction, we analyse the quiz answers with a decision tree algorithm and predict a score.

4 Algorithm Description

4.1 Naïve Bayes Algorithm

Naïve Bayes is a classification method based on Bayes’ theorem and the assumption that the predictors are independent. In simple terms, a naïve Bayes classifier assumes that the presence of one feature in a class is unrelated to the presence of any other feature. For example, a fruit may be considered an apple if it is red, round and about 3 inches in diameter. Even if these features depend on one another or on the existence of other features, each of them is treated as contributing independently to the probability that the fruit is an apple, which is why the method is called ‘naïve’. The naïve Bayes model is simple to build and is particularly useful for very large data sets. Despite its simplicity, naïve Bayes can perform competitively with far more sophisticated classification techniques. Bayes’ theorem provides a way of computing the posterior probability P(c|x) from P(c), P(x) and P(x|c): P(c|x) = P(x|c) P(c) / P(x).
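As an illustration of how such a classifier could be applied to tweet text, the following is a minimal sketch of a multinomial naïve Bayes classifier with Laplace smoothing, written with the Python standard library only. The toy tweets, the labels "negative" and "positive", and the function names are hypothetical and are not taken from the system described in this paper.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Estimate class priors P(c) and per-class word counts from tokenised texts."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    return priors, word_counts, vocab


def predict(tokens, priors, word_counts, vocab):
    """Return the class c that maximises log P(c) + sum of log P(word | c)."""
    best_class, best_score = None, -math.inf
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for word in tokens:
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((word_counts[c][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical toy training data (not from the paper's dataset).
train_docs = [
    "i feel hopeless and tired".split(),
    "so anxious and alone tonight".split(),
    "great day with friends".split(),
    "feeling happy and calm today".split(),
]
train_labels = ["negative", "negative", "positive", "positive"]

model = train_naive_bayes(train_docs, train_labels)
print(predict("tired and alone".split(), *model))
```

The denominator P(x) of Bayes’ theorem is the same for every class, so the sketch omits it and compares only the numerators, in log space to avoid numerical underflow.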

4.2 Linear Regression Algorithm

Linear regression is a machine learning algorithm based on supervised learning. It performs a regression task: a regression model predicts a target value from independent variables. It is primarily used for estimating and exploring the relationship between variables. Regression models differ in the type of relationship they assume between the dependent and independent variables and in the number of independent variables they use.

Linear regression predicts the value of a dependent variable (y) from a given independent variable (x). The technique thus finds a linear relationship between x (input) and y (output), hence the name linear regression.
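The relationship between x and y described above can be sketched with an ordinary least squares fit of the line y = slope * x + intercept. This is a minimal illustration for a single independent variable using the Python standard library; the function name and the toy data are assumptions, not the implementation used in this paper.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept)
```

With several independent variables, the same least squares criterion applies, but the coefficients are usually obtained with a linear algebra routine rather than these closed-form sums.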

4.3 Decision Tree

The decision tree algorithm belongs to the family of supervised learning algorithms. Unlike some other supervised learning algorithms, it can be used to solve both regression and classification problems.

The type of target variable determines the type of decision tree. There are two types:

Categorical Variable Decision Tree: A categorical variable decision tree is a decision tree with a categorical target variable.

Continuous Variable Decision Tree: A continuous variable decision tree is a decision tree with a continuous target variable (Fig. 1).

Fig. 1 Decision tree
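For the categorical case, the core step of building a decision tree is choosing the split that best separates the classes. The following minimal sketch selects a threshold on one numeric feature by minimising the weighted Gini impurity, which is one common splitting criterion. The feature (a quiz score), the labels and the function names are hypothetical and serve only as an illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, impurity) for the best split of the form value <= threshold.

    If no split improves on the unsplit node, the threshold is None.
    """
    n = len(values)
    best_threshold, best_impurity = None, gini(labels)
    # The largest value is skipped because it would leave the right branch empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity


# Hypothetical toy data: one numeric feature and a categorical target.
quiz_scores = [2, 4, 5, 11, 13, 15]
risk_labels = ["low", "low", "low", "high", "high", "high"]
threshold, impurity = best_split(quiz_scores, risk_labels)
print(threshold, impurity)
```

A full tree applies this step recursively to each branch. For a continuous target variable, the impurity measure is typically replaced by a variance-based criterion.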

5 Implementation

Today, mental health problems are typically detected at a late stage. We present a novel approach that detects potential mental health problems by mining the data of social media users, serving as an early detection system. We develop a machine learning framework for this purpose; it can be deployed to provide early notification to potential patients. We analyse the user’s tweets and apply naïve Bayes and linear regression to obtain the user’s OCEAN analysis, i.e. openness, conscientiousness, extraversion, agreeableness and neuroticism. We use a quiz analysis to learn more about the user and a decision tree algorithm to compute a score from their answers. Along with these two values, we collect some personal information, and the mental health score is predicted, together with a message indicating whether the individual should see a therapist.

5.1 Module Description

  1. OCEAN Analysis: In this module, the users’ tweets serve as the dataset, which is processed to obtain the OCEAN (openness, conscientiousness, extraversion, agreeableness, neuroticism) analysis of each user. The tweets are cleaned, and naïve Bayes and linear regression are then applied to compute the emotion expressed in the tweets.

  2. Quiz Analysis: The quiz consists of 20 customized questions that give a clearer picture of the user’s mental state. We apply a decision tree algorithm to the answers and predict the score.

  3. Prediction of Mental Health: Here, we predict the final score from the scores obtained in the OCEAN analysis and the quiz analysis. Depending on the score, we suggest whether or not the person needs to consult a therapist.
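The flow through the three modules can be sketched as follows. This is a minimal illustration under several assumptions that the paper does not specify: the cleaning rules, the 0 to 100 score scale, the convention that a higher score indicates greater concern, and the threshold of 50 are all hypothetical. Only the averaging of the two scores follows the description in Sect. 6.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Lower-case a tweet, drop URLs, mentions and non-letters, and tokenise it."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)  # removes digits, punctuation and '#'
    return text.split()


def combine_scores(ocean_score, quiz_score, threshold=50.0):
    """Average the two module scores and flag whether a consultation is suggested.

    Assumes both scores lie in [0, 100] and that higher means greater concern;
    the threshold is a placeholder, not a clinically validated value.
    """
    for name, value in (("ocean_score", ocean_score), ("quiz_score", quiz_score)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be between 0 and 100")
    final_score = (ocean_score + quiz_score) / 2
    return final_score, final_score >= threshold


tokens = clean_tweet("Feeling SO tired today... @friend https://t.co/abc #alone")
final_score, see_therapist = combine_scores(ocean_score=70.0, quiz_score=40.0)
print(tokens, final_score, see_therapist)
```

In a real deployment, the threshold and the weighting of the two scores would need to be chosen and validated together with mental health professionals.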

5.2 System Architecture

See Fig. 2.

Fig. 2 System architecture

6 Result and Discussions

In this paper, we used several algorithms to assess the mental health of people who use social media sites, primarily Twitter. We predicted each user’s mental health score and recommended whether or not the individual should see a therapist. We performed the OCEAN analysis by applying naïve Bayes and linear regression to the users’ tweets. A second output was predicted by applying a decision tree to a quiz with several questions based on the candidate’s behaviour. Personal details such as work circumstances and family history were taken into account as well. These outputs were then averaged. The combined result was more accurate than those of previous works on various parameters (Figs. 3, 4, 5, 6, 7 and 8).

Fig. 3 Data pre-processing

Fig. 4 Calculating emotions

Fig. 5 Quiz analysis

Fig. 6 Quiz analysis score

Fig. 7 Personal details

Fig. 8 Mental health prediction