1 Introduction

Many people today are affected by mental health problems [1], such as persistent changes in mood, emotions, behavior and thinking, or a combination of these. Such problems cause stress, depression and anxiety, which in turn make it difficult to take part in family activities, work and other social events. According to survey data, around one in five adults (19%) in the USA is affected by mental illness [2]. In that analysis, 4.1% of people are affected by serious mental problems and 12% by diagnosable mental disorders. Affected people are identified through several symptoms [3], such as feeling sad, overthinking, inability to concentrate, extreme fear, feelings of guilt, withdrawal from friends and family, mood swings between low and high, drug addiction, stress, tiredness, sleepless nights, low energy, changes in eating habits, suicidal thoughts, back pain, headache and stomach pain. Genetic factors [4], environmental exposure before birth and brain chemistry are sometimes the main causes of mental illness. These factors give rise to further mental disorders, problems and illnesses. Once a person experiences mental depression [5], several factors can increase the risk of mental illness: mental illness in a blood relative, family problems, brain damage, chronic conditions, diabetes, traumatic experiences, child abuse, the breakup of a healthy relationship and a previous mental illness. These risks lead to several complications, such as mental disabilities [6], physical health problems, behavioral problems, unhappiness in life, family conflicts, social isolation, drug addiction, poverty, financial problems, a weakened immune system and other medical conditions.
Mental illness should therefore be addressed while the symptoms are still at an early stage: medical care [7] should be taken routinely, and people should be engaged in regular activities such as physical activity, sleep, eating, watching TV and listening to music. However, most patients fail to notice their symptoms at an early stage, so they require continuous medical care to recover from stress and other problems.

In mental care applications, robots are used to assist people with mental illness because these people require continuous assessment. The importance of robots in the mental health field was suggested by psychiatry doctor Joanne Pransky in California [8], because robots provide valuable services to human beings. Dr. Pransky said in 1986 that robots are as intelligent as humans and can assist people well in their day-to-day activities. She believes that robots are especially helpful in dealing with emotion-related issues through effective interaction with humans, and that they also reshape human behavior and family structure. In addition, she feels from 32 years of experience [9] that robotics is a highly effective therapeutic tool for helping people recover from emotion-related issues. Robots are designed according to people's health-related needs because they must help during psychiatric therapy. One effective robot is Paro, which was developed by the National Institute of Advanced Industrial Science and Technology in Japan. Paro is designed especially for people affected by Alzheimer’s disease, dementia and other mental conditions [10]. These people require more attention, so Paro is designed around people's emotional responses. Paro was certified in 2009 by the Food and Drug Administration in the USA [11] because it has a sealed sensor. With the help of this sensor, it can be held easily and learns people's needs. Paro helps to decrease stress in patients and their caregivers, stimulate interaction between caregivers and patients, motivate patients, reduce negative thoughts and improve patient socialization.

Robots [12] in psychiatric treatment offer several benefits to young and elderly patients, because these people are mostly online and often have no opportunity to access behavioral health assistance. Where interaction between humans is minimal, effective companionship is needed. Robots are widely accepted by people as a way to reduce stress and other mental health problems. In addition, robots help reduce autism symptoms in children and keep people engaged in their activities. Dr. Pransky stated that robots [13] play a vital role in psychiatry in reducing anxiety problems. However, robots require basic concepts in order to understand patient activities, diseases, treatment procedures and the remote treatment process. To address these difficulties, robots [14] are used to support the entire medical assistance system. To overcome the issues of accuracy and precise assistance, this work uses an intelligent learning concept to supply the robot with patient mental health details. The main contributions of the paper are as follows.

  • To assess the needs and requirements of patients with mental illness without missing any details.

  • To improve the assistance accuracy using the robotic learning process.

  • To reduce errors while training robots.

The remainder of the manuscript is organized as follows. Related work, from which the idea for the learning concept is drawn, is discussed in Sect. 2. The detailed working process of the intelligent learning approach is presented in Sect. 3. The efficiency of the robot assistant based on the intelligent learning concept is evaluated in Sect. 4, and Sect. 5 concludes the paper.

2 Related works

This section analyzes various robot-based mental health assistance systems, processes, procedures, functions and learning concepts, because they inform the development of the robot-based mental health assistant. Yousif et al. [15] identified mental health problems and illnesses by applying a soft computing technique, namely a neural network. A multilayer perceptron network is designed to handle mental health information. The system uses a neurosolution to handle the data, which must be adapted to the network environment. The deviation between the computed and actual output (the error) is then propagated through the network by a learning rule. This process predicts schizophrenia from a given text or speech input. The efficiency of the system is evaluated using experimental analysis. Rudovic et al. [16] developed robots that provide autism therapy to patients with the help of a machine learning framework. The system uses contextual information that includes behavioral scores, demographic details, and audio, video and physiological data from 35 children. The relationships between the features of the gathered information are computed using a machine learning technique. Based on the correlation values, the robots continuously monitor the children and keep them engaged, which reduces autism in the children. The quality of the system is evaluated using experimental analysis, in which 60% of the system uses the robots for providing autism therapy.

Loh et al. [17] analyzed the impact of robots in the medical field for diagnosing diseases and supporting doctors in predicting them. Artificial intelligence techniques are used to train the robots and improve their efficiency in the health system. These techniques, which use a learning concept, bias values and activation functions, train the robots to provide support in the healthcare field. The analysis indicates that robots are widely used in healthcare. Durstewitz et al. [18] applied deep learning techniques to psychiatric treatment. Their deep learning system collects a large volume of mental health information. The collected details are processed by a deep recurrent neural network that recognizes the relationships in the information during prediction. Based on these relationships, psychiatric treatment is provided to help the patient recover from the mental problem. During this process, a semantic model is embedded with the collected context details, and people's behavior is analyzed to provide effective psychiatric treatment. The proficiency of the developed system is then evaluated using experimental analysis.

Galiatsatos et al. [19] identified mental health symptoms by applying a Bayesian network. The system uses 91 patient records processed with fuzzy logic, which classifies patients according to their mental depression. From the classified results, patient symptoms such as lack of interest, depression, mood, lack of concentration, negative thoughts and guilt are recognized. Based on the Bayesian process, mental health patients are identified effectively. As the above discussion shows, patient mental health has been analyzed with different machine learning techniques that examine the patient's thinking capabilities, thoughts and mental deviations. According to these authors, people with mental illness need more attention because they require a caregiver to assist with their needs, but a human caregiver may lack the patience to assist them continuously. Robots are therefore used to analyze and assist people with mental illness, but they need additional learning capabilities to improve their assistance. This paper uses the deep reinforcement learning (DRL) process to provide learning concepts to robots by analyzing people with mental illness. The introduced system is presented in Sect. 3.

3 Assisting mental health patients with the help of robots using a deep reinforcement learning approach

This section discusses how robots assist people with mental illness using a deep reinforcement learning approach. The system uses the collaborative psychiatric epidemiology survey (CPES) [20] dataset to analyze mental illness. The dataset groups together several mental disorders, mental illness risk factors, correlations and other special information. It covers mental illnesses, disorders, impairments and their treatments, as analyzed in the adult population of the USA. Moreover, the dataset combines three datasets: the National Comorbidity Survey Replication (NCS-R) [21], the National Latino and Asian American Study (NLAAS) [22] and the National Survey of American Life (NSAL) [23]. The Blaise computer-assisted interview software was used to collect the data from the population, and the data are stored along with metadata. The collected data are stored as project ID, case ID, weight (supplemental variable), constructed demographic variables, type of diagnosis, anxiety disorders and other diagnostic variables. The overall interview and data collection process [20] is summarized in Table 1.

Table 1 Representation of dataset interview, component and related response rate

Based on Table 1, the number of interviews and various items of mental illness information are collected and stored in the codebook, such as fear of being home alone, fear of crowds, fear of traveling, fear of traveling alone, fear of cars, public transportation, fear of auditoriums, fear of public places, panic attacks, being alone, sick, stomach, diarrhea, physically ill, embarrassing, impairment score, weight, severe anxiety, smoking habit, alcohol, drug problem, schizophrenia, BMI, suicide attempt, discomfort with neighbors, family, friends, health, religion, employment and other mental illness details. As discussed earlier, the dataset analyzes patients with mental illness from various aspects, so it contains missing values. The range of missing values is also defined in the dataset itself: 0.8% (24 of 3031 variables) have 0% missing values, 1.2% (35 of 3031 variables) have 0 to 1%, 3.1% (94 of 3031 variables) have 1 to 3%, 6.9% (210 of 3031 variables) have 3 to 5% and 1.1% (33 of 3031 variables) have 5 to 10%. The collected mental illness data must therefore be processed with data mining and machine learning techniques to improve the robotic assistance process. Accordingly, the general structure of the robotic mental assistance system is shown in Fig. 1.
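The percentages quoted for the missing-value brackets can be reproduced from the variable counts. The short Python sketch below is only an arithmetic check; the dictionary layout and the function name are illustrative assumptions, while the counts and the total of 3031 variables are taken from the text.

```python
# Variable counts per missing-value bracket, as quoted in the text.
TOTAL_VARIABLES = 3031
bracket_counts = {
    "0%": 24,
    "0 to 1%": 35,
    "1 to 3%": 94,
    "3 to 5%": 210,
    "5 to 10%": 33,
}


def share_of_variables(count, total=TOTAL_VARIABLES):
    """Return the percentage of variables in a bracket, rounded to one decimal."""
    return round(100.0 * count / total, 1)


shares = {name: share_of_variables(n) for name, n in bracket_counts.items()}
for name, pct in shares.items():
    print(f"{name:>9} missing: {bracket_counts[name]:>3} variables = {pct}%")
```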

Figure 1 depicts the architecture of the deep reinforcement learning-based robotic assistance system. The system consists of a few steps: data collection, data preprocessing and a learning process that teaches behavioral cues to the robots so that they can assist patients with mental illness. As discussed earlier, the data are collected from the collaborative psychiatric epidemiology survey (CPES) dataset, which has up to 10% missing values. A data preprocessing method is therefore applied to eliminate or replace the missing values before the robots are trained using deep learning approaches.

Fig. 1 Deep reinforcement learning-based robotic assistance system architecture

3.1 Mental illness data preprocessing

The first step of the work is data preprocessing, because the collected collaborative psychiatric epidemiology survey (CPES) dataset has up to 10% missing values. Some mental illness entries are missing from the CPES dataset because of failed data loads, data corruption and incomplete data extraction. The missing data must therefore be removed or replaced, and choosing the value that replaces a missing entry is one of the major challenges [24]. Handling this decision well makes the mental illness data model more robust. Traditionally, once a missing value appears in the dataset, the whole row is deleted, but this causes a large loss of data and raises the percentage of missing values up to 30%. To overcome these issues, missing values are replaced [25] by applying the random forest technique. Random forest is a nonparametric imputation approach that works effectively on data that are missing at random as well as not missing at random. The random forest [26] approach decides on each missing value according to an imputation error estimate. In addition, this work uses a large volume of mental health data to build the robotic assistance system, and the random forest approach handles such large datasets while limiting overfitting. The input samples \(X_{1} ,X_{2} , \ldots ,X_{n}\) and the dataset values are used to analyze the missing values through a majority voting process. During this process, several decision trees [27] are used instead of a single decision tree, which reduces the effect of noise. If a row contains a missing value, it is replaced using the mean, median, maximum or standard deviation [28] of the row. Initially, the mean value of the row is computed as follows,

$$\bar{x} = \frac{1}{n}\left( {\mathop \sum \limits_{i = 1}^{n} x_{i} } \right)$$
(1)

In Eq. (1), \(x_{i}\) is a particular data value, n is the number of data values present in a row and \(\bar{x}\) is the mean value.

Then the median value of the row is obtained by sorting the values and picking the middle one. Next, the maximum value of the row is computed as follows.

$$\bar{\bar{x}} = { \hbox{max} }\left( {x_{i} } \right)$$
(2)

Afterward, the standard deviation of the particular row is estimated as follows,

$${\text{sd}} = \sqrt {\frac{1}{n - 1}\mathop \sum \nolimits_{i = 1}^{n} \left( {x_{i} - \bar{x}} \right)^{2} }$$
(3)
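As a minimal sketch of Eqs. (1)–(3), the following Python snippet computes the four candidate statistics for a single row and fills the missing entry with one of them. The use of `None` to mark a missing entry, the function names and the choice of the mean as the fill value are assumptions made for illustration; the random forest and voting steps of the paper are not reproduced here.

```python
import statistics


def candidate_values(row):
    """Compute the candidate statistics of Eqs. (1)-(3) for one row.

    `row` is a list of numbers in which None marks a missing entry
    (an assumed encoding). Only observed entries enter the statistics.
    """
    observed = [v for v in row if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values for the sample sd")
    return {
        "mean": statistics.mean(observed),      # Eq. (1)
        "median": statistics.median(observed),  # middle value after sorting
        "max": max(observed),                   # Eq. (2)
        "sd": statistics.stdev(observed),       # Eq. (3), 1/(n - 1) form
    }


def fill_missing(row, value):
    """Return a copy of `row` with every missing entry replaced by `value`."""
    return [value if v is None else v for v in row]


row = [2.0, 4.0, None, 6.0, 8.0]
stats = candidate_values(row)
filled = fill_missing(row, stats["mean"])  # assumed choice: fill with the mean
print(stats)
print(filled)
```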

After these values are computed for the row with the missing entry, the replacement is decided by majority voting, which is determined from the maximum of the computed values. This process is repeated until every missing value in the dataset has been processed. The data are then normalized to simplify training. Normalization rescales the values to a common scale and is done as follows,

$${\text{normalized}}\,{\text{data}} = \frac{{x - \bar{x}}}{\text{sd}}$$
(4)

In Eq. (4), x is a particular data value, and \(\bar{x}\) and sd are computed from Eqs. (1) and (3). Based on this process, the mental health dataset is simplified effectively. After that, the data are used for training with the intelligent technique, which provides the learning concept to the robots that assist patients with mental illness.
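The normalization step can be sketched in Python as follows, assuming that the centering value in Eq. (4) is the mean of Eq. (1) and that sd is the sample standard deviation of Eq. (3). The function name is illustrative. Note that this form yields values with zero mean and unit standard deviation rather than values strictly bounded between 0 and 1.

```python
import statistics


def standardize(values):
    """Apply (x - mean) / sd to each value, with sd in the 1/(n - 1) form."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("standard deviation is zero; values are constant")
    return [(x - mean) / sd for x in values]


normalized = standardize([2.0, 4.0, 6.0, 8.0])
print(normalized)
```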

3.2 Deep reinforcement learning-based approach data training and learning process

The next step of the work is to create the training and learning processes for the robotic system using deep reinforcement learning. Reinforcement learning [29] is an effective goal-oriented technique for learning from complex data, which helps in assisting people with mental illness. Here, reinforcement learning is combined with a deep learning method [30] because a large volume of data is available for building an effective training model. The deep reinforcement learning approach acquires knowledge from past analysis [31], which is used to make decisions about the immediate problem. This decision-making process is used to predict the needs of a person with mental illness in emergency situations. Deep reinforcement learning is therefore used in this work to handle real-time applications and reach the goal. It works according to the state–action relationship [32]: the network defines actions for the respective states in order to attain a goal. The learning process has an agent, a set of states S and a set of actions A, in which every action \(a \in A\) can be performed in every state. When an action is executed, the agent receives a numerical score as a reward, and the cumulative future reward is to be maximized. Based on the rewards, the robots are trained with the respective actions and states. The state–action values [33] are then updated as follows,

$$Q\left( {S,A} \right) \leftarrow Q\left( {S,A} \right) + \alpha \left( {R + \gamma \max_{{a^{\prime}}} Q\left( {S^{\prime},a^{\prime}} \right) - Q\left( {S,A} \right)} \right)$$
(5)

In Eq. (5), S is a state, A is an action and \(\gamma\) is the discount factor, which weights the rewards expected from future steps. R is the reward, \(Q\left( {S,A} \right)\) is the old state–action value, \(\alpha\) is the learning rate, \(\max_{{a^{\prime}}} Q\left( {S^{\prime},a^{\prime}} \right)\) is the estimate of the optimal future value, \(R + \gamma \max_{{a^{\prime}}} Q\left( {S^{\prime},a^{\prime}} \right)\) is the learned value, and the bracketed term \(\left( {R + \gamma \max_{{a^{\prime}}} Q\left( {S^{\prime},a^{\prime}} \right) - Q\left( {S,A} \right)} \right)\) is the difference between the learned value and the old value. The computed state–action values are stored in a table because they are used to compute the future optimal value. From the data on people with mental illness, state-related actions are identified, and the actions for future states are learned from the respective outputs. This process improves the efficiency of the system while training the robots to assist people with mental illness. The deep reinforcement learning process thus takes the state as input; the respective network structure is shown in Fig. 2.
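The update of Eq. (5) can be illustrated with a small tabular sketch in Python. The state and action labels, the default values of the learning rate and discount factor, and the function name are hypothetical and chosen only for illustration; they are not taken from the paper's implementation.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update, Eq. (5), in place.

    q_table maps (state, action) pairs to values; unseen pairs default to 0.
    """
    # Estimate of the optimal future value: max over a' of Q(S', a').
    best_next = max(q_table[(next_state, a)] for a in actions)
    # Learned value: R + gamma * max Q(S', a').
    learned_value = reward + gamma * best_next
    # Difference between the learned value and the old value Q(S, A).
    difference = learned_value - q_table[(state, action)]
    q_table[(state, action)] += alpha * difference
    return q_table[(state, action)]


# Hypothetical states and actions, for illustration only.
actions = ["play_music", "tell_story"]
q = defaultdict(float)
q[("calm", "play_music")] = 2.0
new_value = q_update(q, "low_mood", "tell_story", reward=1.0,
                     next_state="calm", actions=actions)
print(new_value)
```

With these inputs the old value is 0, the learned value is 1 + 0.9 × 2 = 2.8, and the updated entry moves halfway toward it because the learning rate is 0.5.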

As shown in Fig. 2, the network produces a Q value for every state–action pair using the learning parameters. Through the training process, the network predicts future actions based on the provided input values. Among the analyzed inputs, the action with the maximum Q value [34] is taken as the assistance to provide. During this process, the collected, normalized inputs are given to the deep neural network and processed by its multiple layers. The input is treated as the state, and the respective action produces the output for the given mental illness input. The input processing of deep learning with the reinforcement process is shown in Fig. 3.

Fig. 2 Processing structure of deep reinforcement learning approach

Fig. 3 Deep learning with reinforcement process

During this process, the given input is processed by fully connected layers with the ReLU (rectified linear unit) activation function to obtain the output in each state. The ReLU value is computed as follows,

$$f\left( x \right) = \left\{ {\begin{array}{*{20}c} 0 & {{\text{for}}\,x < 0} \\ x & {{\text{for}}\,x \ge 0} \\ \end{array} } \right.$$
(6)

In Eq. (6), x is the given input. At the time of output estimation, a deviation may occur, which is computed as the difference between the actual and predicted values. The loss value is then calculated as follows,

$${\text{Error}} = \left( {{\text{actual}} - {\text{predicted}}} \right)^{2}$$
(7)
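Both formulas are simple enough to state directly in code. The Python sketch below assumes that Eq. (7) is the squared difference between the actual and predicted values; the function names are illustrative.

```python
def relu(x):
    """Rectified linear unit, Eq. (6): 0 for negative input, x otherwise."""
    return x if x >= 0 else 0.0


def squared_error(actual, predicted):
    """Squared deviation between the actual and predicted value, Eq. (7)."""
    return (actual - predicted) ** 2


print(relu(-1.5), relu(2.0))
print(squared_error(1.0, 0.5))
```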

If an error occurs during the computation, the same process is repeated until the output value is obtained. The maximum value is then chosen as the best output for the given input. The robots are trained continuously according to this deep reinforcement learning process. Based on the output, the robots address the patient's depression by playing music and providing positive stories and books. In this way, the robots are continuously trained and learn to adapt to people's mental illnesses. This learning and adaptation process improves the assistance accuracy compared with caregiver assistance. The efficiency of the system is evaluated next using experimental results and discussion.
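The way a normalized input passes through fully connected ReLU layers and the action with the maximum Q value is selected can be sketched as follows. This is a toy Python example under stated assumptions: the layer sizes, the hand-picked weights and the action labels are hypothetical, and no training is performed.

```python
def relu(x):
    """Rectified linear unit: 0 for negative input, x otherwise."""
    return x if x >= 0 else 0.0


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: one output per (weight row, bias) pair."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def q_values(state, params):
    """Map a normalized state vector to one Q value per action."""
    hidden = dense(state, params["w1"], params["b1"], activation=relu)
    return dense(hidden, params["w2"], params["b2"])


def greedy_action(state, params, actions):
    """Pick the action whose Q value is largest."""
    q = q_values(state, params)
    best = max(range(len(actions)), key=lambda i: q[i])
    return actions[best], q


# Hypothetical weights and action labels, chosen by hand for illustration.
params = {
    "w1": [[1.0, -1.0], [0.5, 0.5]],
    "b1": [0.0, 0.0],
    "w2": [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]],
    "b2": [0.0, 0.0, 0.0],
}
actions = ["play_music", "tell_positive_story", "suggest_book"]
action, q = greedy_action([1.0, 0.5], params, actions)
print(action, q)
```

In a trained system the weights would be learned by minimizing the squared error between the predicted Q values and the learned values, rather than set by hand as here.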

4 Results and discussion

The introduced deep reinforcement learning-based robot assistance system is developed as described in Sect. 3. The system uses the collaborative psychiatric epidemiology survey (CPES) dataset to examine the mental illness of people. The dataset includes details of several mental disorders, treatments, needs and other mental health information, which are examined by the processing methods described above. The state and action functions together produce the Q value that belongs to a specific action [35]. The system is implemented in the MATLAB simulation tool using the above dataset. During learning and training, the system uses a learning concept that reduces the deviation between actual and predicted values. The overall deviation is shown in Table 2.

Table 2 Error rate of deep reinforcement learning-based robot assistance system

Table 2 reports the error rate of the deep reinforcement learning-based robotic assistance system. The system uses multiple layers with an effective learning concept and a state–action-based process to predict the actions for the respective inputs. The deviation between actual and predicted values is smaller than for several machine learning techniques such as the Bayesian network (BN) [19], multilayer perceptron (MLP) [15] and deep learning neural network (DLNN) [18]. The use of fully connected layers, activation functions, a learning rate and a discount factor improves the overall training and learning of the robot. In addition, a large volume of data is used for learning, which reduces the complexity with respect to user needs and requests. The error rate is shown graphically in Fig. 4.

Fig. 4 DRL: error rate

The deviation between the actual and predicted values of the deep reinforcement learning-based robotic training is depicted in Fig. 4. The DRL method attains the lowest error rate (0.083) compared with the other machine learning techniques: Bayesian network (BN) (0.288), multilayer perceptron (MLP) (0.204) and deep learning neural network (DLNN) (0.131). The reduced deviation between actual and predicted values improves the overall network training, which in turn improves the robotic assistance process. The efficiency of the robotic training process is analyzed using the F1-score [36], computed as,

$$F1\,{\text{Score}} = 2 \cdot \frac{{{\text{precision}} \cdot {\text{recall}}}}{{{\text{precision}} + {\text{recall}}}}$$
(8)
$${\text{Precision}} = \frac{{{\text{True}}\,{\text{positive}}}}{{{\text{True}}\,{\text{positive}} + {\text{False}}\,{\text{positive}}}}$$
(9)
$${\text{Recall}} = \frac{{{\text{True}}\,{\text{positive}}}}{{{\text{True}}\,{\text{positive}} + {\text{False}}\,{\text{negative}}}}$$
(10)
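Equations (8)–(10) can be checked on a small example. The Python sketch below uses hypothetical confusion counts that are not taken from the paper's experiments; the function names are illustrative.

```python
def precision(tp, fp):
    """Eq. (9): share of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Eq. (10): share of actual positives that are predicted positive."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Eq. (8): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical confusion counts, for illustration only.
tp, fp, fn = 90, 10, 30
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
print(p, r, f1)
```

For these counts the F1-score also equals 2·TP / (2·TP + FP + FN) = 180/220, which is a convenient cross-check of Eq. (8).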

The efficiency of the deep reinforcement learning process is evaluated according to Eqs. (9) and (10) [37]. These metrics help to analyze how effectively the introduced DRL approach selects, from the dataset, the data that match the needs of people with mental illness. Precision measures how much of the selected data is relevant, and recall measures how much of the relevant data is selected. The obtained precision and recall values are shown in Table 3.

Table 3 Precision and recall values of deep reinforcement learning-based robot assistance system

Table 3 reports the precision and recall values of the deep reinforcement learning-based robotic assistance system. The system propagates the error value from one layer to the next, which minimizes the deviation. In addition, the optimized learning and training functions improve the overall selection, from the collected data, of the data related to the needs of people with mental illness. This intelligent learning process maximizes the precision and recall values compared with other methods such as the Bayesian network (BN), multilayer perceptron (MLP) and deep learning neural network (DLNN). The resulting improvement in training efficiency is shown in Fig. 5.

Fig. 5 DRL: precision and recall

Figure 5 depicts the precision and recall values of the deep reinforcement learning-based robotic assistance system. According to Fig. 5, the effective selection of data related to patient needs maximizes the precision (99.689%) and recall (99.35%) values. The obtained results are higher than those of the other machine learning techniques: Bayesian network (BN) (precision 97.80%, recall 97.37%), multilayer perceptron (MLP) (precision 98.28%, recall 98.13%) and deep learning neural network (DLNN) (precision 99.38%, recall 99.26%). From the obtained precision and recall values, the F1-score is analyzed, and the attained values are shown in Table 4.

Table 4 Deep reinforcement learning-based robot assistance system—F1-score

Table 4 reports the F1-score of the deep reinforcement learning-based robotic assistance system. The multiple fully connected layers, together with the optimized learning rate and discount factor, improve the overall training and learning of the system compared with several machine learning techniques such as the Bayesian network (BN), multilayer perceptron (MLP) and deep learning neural network (DLNN). In addition, the system performs effectively for each action in each state, which indicates that the robots can assist patients according to their individual requirements. The effective training process reduces the overall system complexity, and the F1-score is shown graphically in Fig. 6.

Fig. 6 DRL: F1-score

The F1-score of the deep reinforcement learning-based robotic training is depicted in Fig. 6. The DRL method attains a high F1-score (98.42%) compared with the other machine learning techniques: Bayesian network (BN) (95.03%), multilayer perceptron (MLP) (96.20%) and deep learning neural network (DLNN) (98.04%). The minimum deviation and the effective selection of features help to improve the overall robotic training process. In addition, the deep reinforcement learning process needs to select the function related to the requirements of people with mental illness. The relationship between the features is therefore examined using the Matthews correlation coefficient [38], which is computed as follows.

$${\text{Matthews}}\,{\text{correlation}}\,{\text{coefficient}} = \frac{{{\text{TP}} \times {\text{TN}} - {\text{FP}} \times {\text{FN}}}}{{\sqrt {\left( {{\text{TP}} + {\text{FP}}} \right)\left( {{\text{TP}} + {\text{FN}}} \right)\left( {{\text{TN}} + {\text{FP}}} \right)\left( {{\text{TN}} + {\text{FN}}} \right)} }}$$
(11)
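A minimal Python sketch of the Matthews correlation coefficient is given below, using its standard definition, in which the denominator is the square root of the product of the four marginal sums. The confusion counts are hypothetical, and returning 0 when the denominator vanishes is a common convention assumed here rather than something stated in the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined cases are reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion counts, for illustration only.
mcc = matthews_corrcoef(tp=90, tn=80, fp=10, fn=20)
print(mcc)
```

The coefficient ranges from -1 to 1, with 1 for a perfect prediction and 0 for a prediction no better than chance.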

According to Eq. (11), the correlation between the features is computed and the respective values are shown in Table 5.

Table 5 Deep reinforcement learning-based robot assistance system: Matthews correlation coefficient (MCC)

Table 5 reports the Matthews correlation coefficient of the deep reinforcement learning-based robotic assistance for people with mental illness. The suggested approach effectively determines the relationship between the requirements of people with mental illness and the appropriate help-related data. The Q value generated for each state and action helps to choose the right assistance information, which improves the overall system training process. According to Table 5, the system attains the highest relationship value compared with the other machine learning techniques such as the Bayesian network (BN), multilayer perceptron (MLP) and deep learning neural network (DLNN). The respective graphical analysis is shown in Fig. 7.

Fig. 7 DRL: MCC

According to the discussion, the deep reinforcement learning-based robotic assistance system predicts the relationship between patient needs and the respective assistance information with an MCC of 98.72%, which is higher than that of the remaining machine learning techniques: Bayesian network (BN) (96.61%), multilayer perceptron (MLP) (97.48%) and deep learning neural network (DLNN) (98.46%). Thus, the introduced system analyzes user requests and need-related assistance with minimum deviation and high accuracy. Effective robotic assistance helps to relieve mental stress, depression, anxiety and other mental disorders.

5 Conclusion

This paper analyzed a deep reinforcement learning-based robotic assistance system. Initially, the system collects mental health data from the collaborative psychiatric epidemiology survey (CPES) dataset, which comprises several sets of mental health information. The gathered data have up to 10% missing values, which are handled using the random forest approach. The approach makes a decision for each missing value: the missing value is replaced by one of the mean, median, maximum or standard deviation. After that, the learning system is built with multiple fully connected layers. The system processes each action in every state and generates a Q value for every input. The generated Q values are stored in a table that is used to obtain the future optimal value. Based on this process, the robot provides the respective assistance according to the patient's needs. The efficiency of the system is analyzed using MATLAB-based experimental results, in which the deep reinforcement learning system achieves an error rate of 0.083 and an F1-score of 98.42%. In the future, metaheuristic optimization techniques will be used to process the details of patients with mental illness to improve the assistance process.