Human Stress Detection in and Through Sleep Patterns Using Machine Learning Algorithms

Geetha, R.; Gunanandhini, S.; Srikanth, G. Umarani; Sujatha, V.

doi:10.1007/s40031-024-01079-y


ORIGINAL CONTRIBUTION
Published: 25 May 2024


Journal of The Institution of Engineers (India): Series B


R. Geetha ORCID: orcid.org/0000-0002-4541-3314¹,
S. Gunanandhini¹,
G. Umarani Srikanth² &
…
V. Sujatha³


Abstract

Stress has a marked impact on cognitive functions, so timely and effective detection calls for strategies drawn from several disciplines. It influences decision-making, attention, learning, and problem-solving abilities. As a result, stress detection and modeling have become important areas of study in both psychology and computer science. This study links psychology and machine learning to address the urgent need for accurate stress detection methods, and highlights sleep patterns as a key indicator of stress, presenting a novel approach to understanding and determining stress levels. Psychologists use affective states, which refer to an underlying emotional state, to measure stress. However, most stress classification work has been limited to user-dependent models, which new users cannot use without additional training. This can be a significant time burden for new users trying to predict their affective states. It is also critical to address basic mental health issues in children and adults to prevent them from developing more complex problems as a result of stress. The medical field processes vast amounts of data, and machine learning algorithms can sift through it for patterns that might escape the human eye. By revealing fine correlations and patterns, these algorithms support more precise and prompt diagnoses, particularly of fundamental mental health issues in individuals of all ages. This research work deploys an enhanced Multilayer Perceptron (MLP) with extensive feature analysis for processing medical datasets, resulting in improved effectiveness in predicting stress levels. This helps diagnose issues more accurately and swiftly, which improves patient outcomes.
The proposed enhanced MLP model is evaluated rigorously, and its reported performance is Accuracy 99%, Precision 98.6%, Recall 99%, and F1-Score 99.5%, compared against existing machine learning algorithms including Adaboost, Random Forest, Gradient Boosting, and Decision Tree across the stress levels considered. The results show that the MLP provides the best accuracy among the machine learning techniques compared for detecting stress via sleep patterns.


Introduction

Detecting stress from physiological cues is a popular approach that has been used in research and clinical settings for many years [16,17,18]. This technique is referred to as biofeedback, and it involves measuring physiological parameters and providing feedback to the individual. Galvanic Skin Response (GSR), pulse rate, body temperature, muscle tone, and blood pressure are all commonly used to measure stress. GSR is a measure of skin electrical conductivity and increases when a person is under stress. Heart rate rises as part of the body's "fight or flight" reaction to stress. Temperature may decrease because of reduced blood supply to the extremities. Muscle tone increases, which makes the individual feel uncomfortable, and blood pressure can rise. By measuring these parameters, it is possible to give individuals feedback on their state of mind and level of stress [19,20,21]. For example, an individual may be asked to relax their muscles or take deep breaths to lower their heart rate and muscle tension.

Stress arises when people face rough and unpredictable challenges, which creates a need for effective coping tools to get through difficult times. Training individuals in relaxation approaches such as meditation or progressive muscle relaxation is considered among the most effective of these tools, and it enables them to take an active role in managing their stress levels. Meditation helps calm the mind, while progressive muscle relaxation alleviates bodily strain. It is a holistic strategy that addresses both the mental and physical aspects of stress [22,23,24]. Overall, people who want to regulate their stress levels can benefit from using physiological indicators to identify stress, alongside other forms of stress management such as counseling, exercise, and a healthy lifestyle.

Artificial intelligence (AI) [27] refers to the ability of machines to carry out tasks that would normally require human-level intelligence. The objective of AI is to imitate human cognitive abilities in machines, empowering them to execute complex tasks and adjust to changing environments. Advanced AI algorithms are widely used in machine learning [28], expert systems, natural language processing, speech recognition, and machine vision. As technology progresses, AI is becoming a common, prominent, and indispensable tool. Optical Character Recognition (OCR) [29], the ability of machines to recognize and interpret text in images or scanned documents, is one prominent AI application; it has demonstrated outstanding success and helped resolve a number of difficult problems in both industry and academia. AI provides enterprises with valuable insight into their operations, unveiling patterns and trends that may have gone unobserved with conventional methods. This analytical capability supports strategic decision-making. Tasks such as analyzing large sets of legal documents require meticulous attention, and AI excels in this domain. Its ability to process large volumes of data swiftly reduces the likelihood of errors, making it a reliable and efficient tool for such tasks and a precise assistant in analyzing complex workflows.

Machine learning (ML) [26] uses historical data to forecast the future. ML enables computers to learn from data without being explicitly programmed, much like teaching a machine to observe patterns and draw conclusions from past experience. Its emphasis on programs that adapt to new data reflects the dynamic and evolving nature of ML applications, and leads to systems that continuously improve and optimize their performance as they encounter new information. Python is a popular programming language for implementing ML because of its simplicity and versatility, and it offers specialized algorithms for the training and prediction processes.

There are three types of learning: supervised, unsupervised, and reinforcement learning [25]. A supervised learning system, such as an artificial neural network [30], receives input data together with labels, which must first be assigned by a person, and learns from these labeled examples. Unsupervised learning uses no labels; the algorithm must find structure in the data on its own. Reinforcement learning interacts with its environment and learns from positive and negative feedback to improve performance. Data scientists use classical machine learning approaches in Python to identify novel patterns that yield insights, as shown in Fig. 1. The data used for classification can be binary or multi-class, depending on the task at hand, such as identifying the gender of a person or detecting spam messages. Classification problems are prevalent in sectors such as speech recognition, handwriting recognition, biometric identity verification, medical document analysis, and stress detection.

Supervised learning is the most prominent strategy. The algorithm learns from a labeled dataset of input-output pairs (X and Y), and the objective is to learn a mapping function (f) that accurately predicts the output variable (Y) for new, unseen input data (X). Logistic regression, multi-class classification, decision trees, and support vector machines are commonly used supervised learning models, each suited to different types of problems. Because the training data is labeled with the correct outcomes, the model makes predictions by finding patterns and relationships in this labeled data.
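As a minimal sketch of learning a mapping f from labeled pairs, the snippet below fits a logistic regression with scikit-learn on a tiny one-feature toy dataset and applies it to unseen inputs. The data values and variable names are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy training set: X holds the inputs, y the known labels.
X_train = np.array([[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# Fitting estimates the mapping f from the labeled (X, y) pairs.
f = LogisticRegression().fit(X_train, y_train)

# The learned mapping is then applied to inputs it has never seen.
X_new = np.array([[0.0], [10.0]])
y_new = f.predict(X_new)
print(y_new)
```

Any of the other supervised models named above (a decision tree or a support vector machine, for example) could be substituted for the estimator without changing the fit/predict pattern.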

Furthermore, the proposed framework has several potential applications. Identifying stress levels by thoroughly analyzing sleep patterns enables healthcare providers to apply preventative measures that alleviate the impact of stress. The framework can also be integrated into wearable devices or smartphone applications to provide real-time monitoring of individuals' stress levels. With the advent of telemedicine, it can enable remote monitoring of patients' stress levels. Employers can also use the research findings to initiate workplace wellness programs that mitigate stress among employees.

The remainder of the article is arranged as follows. Section "Related Work" surveys relevant work on stress detection using machine learning algorithms. Section "Methodology" presents the architecture of the proposed method. Section "Performance Evaluation" presents the outcomes and empirical performance evaluation and concludes the findings of the proposed solution.

Related Work

Hatoon Alsagri et al [1] used machine learning techniques to identify Twitter users who may be experiencing depression by observing their behavior and the keyword patterns in their tweets. Social media sites such as Facebook, Twitter, and Instagram have a significant influence on society. While social networking has its benefits, there are also significant downsides: researchers have observed that frequent social media usage is associated with higher rates of depression among users. The authors developed and tested classifiers that analyze a person's network activity and tweets to determine whether the individual is depressed. The results show that accuracy and F-measure scores for spotting depressed users improve as more features are included. This data-driven method can serve as a predictive strategy for early identification of depression and other mental illnesses. The key contribution of the work is its investigation of the traits that indicate depression and their impact on assessing its severity.

Meera Sharma et al [2] worked with unknown datasets to determine whether individuals are seeking treatment for mental health issues, employing a wide range of deep learning and machine learning classifiers and predictive techniques to obtain accurate predictions through statistical analysis. A study conducted in 2017 reported that more than 792 million individuals, around 10% of the world's population, suffered from mental disorders, leading to 78 million suicides. Previous efforts to predict suicidal tendencies using data science have been unsuccessful.

Sandhiya et al [3] used a dataset of questionnaire responses from IT employees to assess their mental health status. Several machine learning approaches were applied to study the outcome, which highlighted the importance of regular mental health screening for IT workers. Although mental health is a popular research topic, it is rarely discussed in everyday life, despite the fact that well-being is an indicator of mental health. With the increasing use of technology, individuals in various industries, including IT, may experience mental health issues such as stress, worry, and depression. Companies should provide medical care in the workplace and offer benefits to affected employees. Detecting and treating common childhood mental health issues early can greatly improve patients' quality of life. Machine learning techniques have proved effective in analyzing medical data and aiding diagnosis.

Sumathi et al [4] evaluated the performance of eight machine learning techniques in identifying five common mental health issues. The techniques were trained and tested on a dataset of 60 cases, with 25 attributes identified as crucial for determining the issue. Feature selection was used to reduce the attribute set, and classifier accuracy was measured on both the full and the reduced attribute sets. The Multilayer Perceptron, Multiclass Classifier, and LADTree classifiers produced the most accurate results, with little variation between the full and reduced attribute sets. It is important to continue developing and improving these techniques to effectively diagnose and treat childhood mental health issues.

Sarah Graham et al [5] provided an overview of the potential benefits and drawbacks of AI technology in mental healthcare, and also examined recent original research on AI and its current uses in healthcare. The review analyzed studies that applied diverse methods to e-health records, brain imaging data, monitoring systems, and social media platforms, with the objective of categorizing mental illnesses. Although the results are promising, the authors caution against premature conclusions and emphasize the need to bridge the gap between clinical treatment and AI-based mental health research.

Amir Mohammed Mohammadi et al [6] described a stress detection model that uses four signal types (body temperature, respiration, Electrocardiogram (ECG), and Electrodermal Activity (EDA)) and extracts 65 features from a public dataset. Kruskal-Wallis analysis showed that 43 of the 65 features differ significantly between stressed and relaxed states. The K-Nearest Neighbor (KNN) technique was used to classify the states, achieving an accuracy of 96.024%. The system is advantageous because it requires fewer sensors and less power, relying on ECG and EDA signals, which provide excellent accuracy. Additionally, a high-performance sensor was devised that measures ECG and EDA signals; it was tested on 18 healthy individuals aged 16-40 who were exposed to stress using the Stroop Color-Word Test and a mental arithmetic exercise. This sensor achieves an accuracy of 94.425% and can operate for up to 70 hours on a single battery charge.

Samriti Sharma et al [7] aimed to construct a simple pre-surgery stress detection method using Electrodermal Activity (EDA) measured through a minimally invasive wrist bracelet. The study recruited 41 participants from Sri Ramakrishna Hospital in Coimbatore, India, who underwent various surgical procedures. Using the EDA data collected, a supervised machine learning algorithm was developed to detect motion artifacts, achieving 97.83% accuracy on a new user dataset. Stress can harm individuals undergoing surgery both physically and mentally, which makes it important to identify preoperative stress levels. The findings point to the potential of this approach for detecting preoperative stress and mitigating its negative impact on surgical outcomes.

Ravinder Ahuja et al [8] investigated the influence of stress on university students during different phases of their academic life, specifically the week before examinations and periods of internet use. Mental stress, particularly among young people, is a significant problem in today's world. The supposedly carefree period of life is now fraught with increased stress levels, leading to issues such as depression, suicide, heart attacks, and stroke. The study highlighted the often overlooked mental stress caused by exams and the recruitment process, and the authors observed a connection between this type of stress and students' frequent internet usage. The authors collected a dataset from 206 university students and used classification methods, measuring sensitivity, specificity, and accuracy; Support Vector Machines exhibited the highest accuracy, 85.71%.

Shruti Gedan et al [9] provided an in-depth review of stress identification using wearable sensors together with machine learning approaches. Stress is an elevated state of both body and mind that arises in challenging or demanding situations. Stressors are the environmental factors that trigger stress. Exposure to multiple stressors simultaneously over an extended period can lead to chronic health problems. Wearable technology allows constant, real-time data collection, enabling individuals to monitor their own stress levels. The paper also suggests a multimodal stress identification architecture built on wearable sensors and deep learning techniques. Future research studies are expected to examine the stressors, methods, outcomes, benefits, limitations, and concerns of each study.

Can et al [10] devised an approach to stress detection that uses smart bands to collect physiological data. The architecture was used to monitor the stress levels of 216 individuals over an eight-day training session for an EU project. The study collected 2780 self-report questions from participants of various nationalities, as well as 1440 hours of physiological data. The system captured environmental information and various forms of physiological data to calculate each participant's subjective stress levels. The proposed system could be used effectively to determine perceived stress levels over sessions, days, and time.

Gjoreski et al [11] introduced a system that continuously detects stressful events using a commercially available wrist device. Long-term exposure to stress has detrimental effects on both physical and mental health and contributes to health issues such as cardiovascular disease, a weakened immune system, and mental health disorders. It is therefore imperative to detect stress early to prevent these negative impacts. The proposed architecture has three components: a stress detector that assesses short-term stress periodically; an activity monitor that continuously tracks user activity and records contextual data; and a context-based stress detector that combines the stress detector's output with the user context to make a decision every 20 minutes. The system was evaluated in both laboratory and real-world settings, achieved 92% on a two-class problem, and was launched as a smartphone app for managing physical and mental health issues.

Can et al [12] discussed the widespread use of smartphones, smart watches, and smart wristbands in people's lives. Stress has become a prevalent issue, which has prompted discussion of the potential for wearable sensors and smartphones to detect and prevent it. The researchers examined current research on the use of wearable technology and smartphones for detecting stress in various daily life settings, including office, campus, transportation, and unrestricted daily living situations.

Ayten Ozge Akmandor et al [13] focused on stress as a common and widespread psychological disorder that inevitably affects people's mood and behavior. If left unchecked, chronic stress has serious impacts on an individual's physical and mental health. There is potential to apply various nature-inspired computing techniques and deep learning methods, such as Deep Belief Networks, Convolutional Neural Networks, and Recurrent Neural Networks, to analyze multimodal data gathered from behavioral testing, electroencephalogram signals, finger temperature, respiration rate, pupil diameter, galvanic skin response, and blood pressure readings.

Furthermore, Kim et al [14] designed a hybrid model for stress-related problems that incorporates several computational approaches, including adaptation and parameter adjustment using chaos, Levy, and Gaussian distributions. Prolonged exposure to stress can harm the immune, cardiovascular, and endocrine systems. To deal with this problem, a team of researchers devised a Stress Detection and Alleviation system called SoDA. The system uses Wearable Medical Sensors (WMSs), including ECG, GSR, respiration rate, blood pressure, and blood oximeter sensors, to examine stress levels continuously. Its effectiveness was evaluated on data from 32 individuals who experienced four stressors and were subjected to three stress reduction techniques. SoDA combines supervised feature selection with unsupervised dimensionality reduction to identify stress with 95.8% accuracy.

Nath et al [15] created a stress prediction model for elderly people using a smart wristband that measures Electrodermal Activity (EDA), Blood Volume Pulse (BVP), and Heart Rate Variability (HRV). The signals were gathered from 40 individuals during a procedure known to induce stress, with stress measured through salivary cortisol. A supervised method was adopted to select 27 of the 47 features extracted from the signals.

Combining information from multiple signal streams markedly improved the model's ability to distinguish between stressed and not-stressed states, which suggests that the model effectively captures and leverages the relevant features from each stream. Using features from all four signals, the model achieved an accuracy of 94% and a macro-average F1-score of 0.92. The study lasted for a year, and the participants had an average age of 73.625 ± 5.39 years.

Methodology

The proposed method is a new approach to identifying stress in the decision-making process, evaluated on a dataset of stressful situations obtained from Kaggle. Unlike prior research that assessed stress levels only in general, this method aims to detect stress specifically in the decision-making process, providing insight for identifying stress in future decision-making scenarios. Stress can impair decision-making, so early recognition of stress is vital to enhance clinical performance. Although existing methods have shown potential in detecting stress, previous studies used only individual-level features for classification, without considering the inter-channel correlations in the brain that could reveal distinctive features for stress detection. The disadvantages of existing methods are that (a) the process is complex because instrument-based equipment is needed to detect the stress level, (b) performance metrics were not measured, and (c) deployment was not implemented.

The dataset combines stress data from numerous sources. The data is downloaded, verified for accuracy, cleaned, and trimmed. The acquired dataset is then separated into training and testing datasets. The system model pre-processes outliers and irrelevant data and handles a combination of continuous, categorical, and discrete variables, and the ML prediction model proved successful in predicting stress. The training set is used to fit a Multilayer Perceptron (MLP) classifier and random forest, decision tree, and gradient boosting algorithms, and predictions on the test set are assessed by their accuracy. The advantages are improved accuracy and a comparison of the performance metrics of each algorithm. The phases of the proposed methodology, shown in Fig. 2, are as follows:

Data Analysis and Model Deployment

Data Pre-processing

Validation procedures are useful for assessing the error rate of machine learning models, which is normally close to the true error rate on the dataset. Validation becomes essential when working with data samples that are not representative of the population. It involves identifying missing or duplicate values and checking data types to ensure data quality and accuracy. Incorporating information from the validation dataset into the model setup can lead to biased evaluations, so hyper-parameters should be adjusted on the validation set with care. Understanding the data and its characteristics during the data identification phase helps in choosing an appropriate method for constructing the model. Python's Pandas module can be used for various data cleaning tasks, particularly handling missing values, which is one of the most significant. It is essential to understand the various types of missing data from a statistical perspective. Ultimately, more time should be spent on modeling and analysis and less on data cleaning.
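The checks described above (missing values, duplicate rows, data types) might be sketched in Pandas as follows. The column names and values are hypothetical stand-ins, not the study's actual dataset, and dropping incomplete rows is only one of several possible strategies for missing data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one duplicated row and one missing value.
raw = pd.DataFrame({
    "snoring_rate": [45.0, 45.0, 60.0, np.nan, 92.0],
    "heart_rate": [55.0, 55.0, 62.0, 70.0, 81.0],
    "stress_level": [0, 0, 1, 2, 4],
})

# Inspect data quality before modeling.
column_types = raw.dtypes                      # data type of each column
missing_per_column = raw.isna().sum()          # count of missing values per column
duplicate_rows = int(raw.duplicated().sum())   # rows identical to an earlier row

# One simple cleaning policy: drop duplicates, then drop incomplete rows.
clean = raw.drop_duplicates().dropna().reset_index(drop=True)
print(missing_per_column, duplicate_rows, len(clean), sep="\n")
```

Imputing missing values (for example with a column median via `fillna`) is an alternative to dropping rows when the dataset is small.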

Data Collection

Splitting the dataset is a sound way to validate the models at hand. Random Forest, MLP, Decision Tree, Gradient Boosting, and Adaboost were adopted to build the data model. Each algorithm has its strengths, and ensemble methods such as Random Forest and boosting can often enhance overall performance. A 7:3 ratio of training to testing data is advisable to balance fitting the model against evaluating it on unseen data.
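A 7:3 split can be sketched with scikit-learn's `train_test_split`. The synthetic data below is generated only so the example runs on its own, and the number of features and classes is an assumption rather than a property of the study's dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the stress dataset (sizes are illustrative assumptions).
X, y = make_classification(
    n_samples=100, n_features=8, n_informative=4,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# test_size=0.3 gives the 7:3 ratio; stratify keeps class proportions similar
# in both parts, and random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
print(len(X_train), len(X_test))
```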

Data Manipulation

Data is loaded, checked for cleanliness, and then trimmed and cleaned for analysis. The cleaning decisions should be documented and justified.

Data Visualization

Data visualization provides a powerful set of tools for gaining a qualitative understanding of a dataset, helping to identify patterns, outliers, and other key relationships. Presenting data visually through charts and graphs makes it more understandable to stakeholders. Visualization is also essential for fast analysis in both applied statistics and machine learning, where various plot types are used to explore and analyze data samples in Python. Common Python visualization libraries include Matplotlib, Seaborn, Plotly, and Bokeh. These libraries provide interactive visualization options, allowing for a more engaging and informative presentation of data.

Building the Classification Model

The following factors make the prediction model for human stress robust and accurate: it produces satisfactory and reliable outputs in classification problems; it handles outliers during preprocessing; and it manages a combination of continuous, categorical, and discrete variables, which addresses real-world complexity. It also generates unbiased out-of-bag error estimates, which adds impartiality across numerous tests.

Construction of a Predictive Model

Machine learning often demands a large amount of training data, but raw, unprocessed information is not always suitable. Preprocessing is the process of cleaning and transforming data into a format suited to machine learning algorithms. It can involve several steps, such as removing outliers, normalizing data, and encoding categorical variables, and the steps are tailored to the specific needs of the data and the requirements of the machine learning task. Various machine learning models can be trained on preprocessed data to predict human stress levels accurately. Popular models for regression problems include linear regression, decision trees, random forests, and neural networks. The correctness of the model depends on the quality and quantity of the training data.
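The three preprocessing steps named above can be illustrated with a short Pandas sketch. The interquartile-range rule for outliers, the z-score normalization, and the one-hot encoding are common choices assumed here for illustration; the source does not say which variants were used, and the columns (including `sleep_position`) are hypothetical.

```python
import pandas as pd

# Hypothetical readings: one implausible heart rate and one categorical column.
df = pd.DataFrame({
    "heart_rate": [52.0, 55.0, 58.0, 60.0, 61.0, 63.0, 150.0],
    "sleep_position": ["back", "side", "side", "back", "front", "side", "back"],
})

# 1. Remove outliers with the 1.5 * IQR rule.
q1, q3 = df["heart_rate"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["heart_rate"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df = df[in_range].copy()

# 2. Normalize the numeric column to zero mean and unit standard deviation.
df["heart_rate"] = (df["heart_rate"] - df["heart_rate"].mean()) / df["heart_rate"].std()

# 3. Encode the categorical column as one indicator column per category.
encoded = pd.get_dummies(df, columns=["sleep_position"])
print(encoded)
```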

The dataset was obtained from Kaggle and pre-processed to eliminate duplicate and null values. The data is then plotted for visualization. The algorithms are implemented and their accuracies compared to identify the best-performing model. The model is deployed using input given by users, as shown in Figs. 3 and 4.

In the initial step, data related to sleep and stress is assembled. This data may include physiological signals recorded during sleep, such as heart rate, respiration rate, and snoring rate. The collected data is preprocessed to remove noise and artifacts, which may involve filtering, artifact removal, and normalization. The next step is to extract relevant features from the preprocessed data. These may include statistical metrics such as mean, standard deviation, and skew, and spectral features such as power spectral density and spectral entropy. The extracted features may be high-dimensional and contain redundant or irrelevant information. Hence, feature selection strategies such as Principal Component Analysis (PCA) or Mutual Information are used to choose the most relevant features.
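The statistical and spectral features listed above can be computed with NumPy alone, as in the sketch below. The function name `extract_features`, the periodogram-based PSD estimate, and the test signals are assumptions made for illustration; the source does not specify how its features were calculated.

```python
import numpy as np

def extract_features(signal, fs):
    """Return simple statistical and spectral features of a 1-D signal.

    fs is the sampling rate in Hz. The PSD is a plain periodogram estimate.
    """
    x = np.asarray(signal, dtype=float)
    mean = float(x.mean())
    std = float(x.std())
    # Skewness: the third standardized moment (0 for a constant signal).
    skew = float(np.mean(((x - mean) / std) ** 3)) if std > 0 else 0.0
    # Periodogram of the mean-removed signal.
    psd = np.abs(np.fft.rfft(x - mean)) ** 2 / (fs * x.size)
    total = psd.sum()
    # Spectral entropy: Shannon entropy of the normalized PSD, in bits.
    p = psd[psd > 0] / total if total > 0 else np.array([1.0])
    spectral_entropy = float(-(p * np.log2(p)).sum())
    return {"mean": mean, "std": std, "skew": skew,
            "psd": psd, "spectral_entropy": spectral_entropy}

fs = 4.0                                  # hypothetical sampling rate in Hz
t = np.arange(256) / fs
sine = np.sin(2 * np.pi * 0.25 * t)       # energy concentrated at one frequency
noise = np.random.default_rng(0).normal(size=256)  # energy spread over all bins

sine_feats = extract_features(sine, fs)
noise_feats = extract_features(noise, fs)
print(sine_feats["spectral_entropy"], noise_feats["spectral_entropy"])
```

A regular, periodic signal yields a low spectral entropy and a noisy one a high value, which is why the measure can help separate steady from irregular physiological rhythms.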

Machine Learning Model Training

Once the relevant features are selected, a machine learning algorithm is employed on the labeled dataset to predict stress levels. Some of the commonly used algorithms include Multi-Layer Perceptron (MLP), Decision Tree Classifier, Random Forest, Gradient Boosting and Adaboost Classifier.
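Training the five classifiers named above could look like the following scikit-learn sketch. The synthetic data, the hidden-layer sizes, and all other hyper-parameters are illustrative assumptions and not the settings used in the study; the printed scores therefore say nothing about the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic labeled data standing in for the selected features and stress labels.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    # The MLP is sensitive to feature scale, so it is paired with a scaler.
    "MLP": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0),
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# Fit each model on the training split and record its test accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```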

Model Evaluation

The trained model is validated on a test dataset to estimate its performance. Metrics include accuracy, precision, recall, and F1-score, depending on the task being solved. Accuracy measures overall correctness, while precision and recall describe how well the model performs on specific classes.
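To make the four metrics concrete, the sketch below computes them from scratch for a small hand-made set of true and predicted labels. The labels and the helper names `accuracy` and `per_class_metrics` are hypothetical and exist only for this illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical stress levels for six test samples.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]

acc = accuracy(y_true, y_pred)
precision_1, recall_1, f1_1 = per_class_metrics(y_true, y_pred, label=1)
print(acc, precision_1, recall_1, f1_1)
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two values for a given class.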

Deployment

After the model is trained, tested, and evaluated, it is deployed to detect stress levels during sleep. This may involve monitoring physiological signals in real time and making predictions with the trained model. The deployment process may also involve testing and validating the model's overall performance in real-world conditions to ensure that it can handle variations in signal quality, environmental noise, and user variability. Overall, deploying a machine learning model for real-time stress detection during sleep is a challenging task, but one with the potential to improve the understanding and management of stress-related disorders.
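One simple way to picture deployment is to serialize a trained model and wrap it in a function that accepts a user's readings. The sketch below does this with `pickle` and a small decision tree; the feature names, toy training values, and the `predict_stress` helper are hypothetical, and a production system would need input validation and a real trained model.

```python
import pickle

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature order expected by the deployed model.
FEATURES = ["snoring_rate", "respiration_rate", "heart_rate"]

# Toy training data (illustrative values only): three stress levels 0, 1, 2.
X_train = np.array([
    [45, 16, 50], [50, 17, 52],
    [70, 21, 60], [75, 22, 62],
    [95, 27, 80], [98, 28, 82],
], dtype=float)
y_train = np.array([0, 0, 1, 1, 2, 2])
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Serialize the trained model, as would be done before shipping it,
# then load it back the way a serving process would.
blob = pickle.dumps(model)
deployed = pickle.loads(blob)

def predict_stress(reading):
    """Map a dict of user-supplied readings to a predicted stress level."""
    row = np.array([[float(reading[name]) for name in FEATURES]])
    return int(deployed.predict(row)[0])

level = predict_stress({"snoring_rate": 96, "respiration_rate": 27, "heart_rate": 81})
print(level)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.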

Modified Multilayer Perceptron

In this work, we propose a modified Multilayer Perceptron that incorporates dropout layers, which randomly deactivate a percentage of neurons during training, preventing overfitting and enhancing generalization. Additionally, we explored variations in the activation functions and the number of hidden layers to optimize the MLP for human stress level detection.
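The effect of a dropout layer can be sketched as below. This is an illustration only, assuming the common "inverted dropout" variant in which surviving activations are rescaled during training; the paper does not state which variant is used, and the function name and activation values are hypothetical.

```python
import random

def dropout(activations, rate, training, rng):
    """Inverted dropout: zero each activation with probability `rate` during
    training and rescale the survivors so the expected value is unchanged."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(42)
hidden = [0.2, 1.5, 0.7, 3.0, 0.9, 1.1]   # hypothetical hidden-layer outputs
train_out = dropout(hidden, rate=0.5, training=True, rng=rng)
eval_out = dropout(hidden, rate=0.5, training=False, rng=rng)
```

At prediction time the layer passes its inputs through unchanged, so the deactivation only affects training.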

MLP is a type of artificial neural network that consists of multiple layers of interconnected nodes, each layer contributing to the learning and abstraction of complex patterns. By employing hidden layers and activation functions, MLPs can effectively model non-linear relationships, making them versatile for various machine learning tasks such as image recognition and natural language processing.

The process begins with the input layer, which receives the raw data or features to be processed as shown in Fig. 5. Each node in this layer represents a feature of the input data. Every connection between nodes in adjacent layers is associated with a weight, representing the strength of the connection. Additionally, each node has a bias term, allowing for greater flexibility in modeling. Following the input layer are one or more hidden layers. These layers perform transformations on the input data using weighted sums and activation functions. Nodes within each hidden layer apply an activation function to the weighted sum of inputs and biases. Common activation functions include sigmoid, tanh, ReLU, and softmax. These functions introduce non-linearity, enabling the network to learn complex patterns in the data. The input data is propagated forward through the network layer by layer, with each layer's output serving as the input to the next layer. The final layer, known as the output layer, produces the network's predictions or outputs. The number of nodes in this layer depends on the nature of the problem (e.g., classification, regression).
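The forward propagation described above can be sketched in a few lines. The sketch assumes ReLU in the hidden layers and softmax at the output, which is one common arrangement rather than a confirmed detail of the proposed model; all weights are hypothetical.

```python
import math

def relu(z):
    return max(0.0, z)

def softmax(scores):
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Weighted sum plus bias for every node in one layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Propagate x through hidden layers (ReLU) and an output layer (softmax)."""
    a = x
    for weights, biases in layers[:-1]:
        a = [relu(z) for z in dense(a, weights, biases)]
    out_w, out_b = layers[-1]
    return softmax(dense(a, out_w, out_b))

# Hypothetical weights: 2 input features -> 2 hidden nodes -> 3 output classes.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
]
probs = forward([2.0, 1.0], layers)
```

The output is a probability per class; the predicted class is the index of the largest value.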

A loss function measures the difference between the network's predictions and the actual target values. The goal during training is to minimize this loss by adjusting the network's parameters (weights and biases).

To update the network's parameters, an optimization algorithm such as stochastic gradient descent (SGD) is used. Backpropagation, a key concept in training neural networks, calculates the gradients of the loss function with respect to the network's parameters. The gradients obtained from backpropagation are used to update the weights and biases in the direction that minimizes the loss function.
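A single parameter update can be illustrated on the simplest possible case. The sketch below assumes one linear neuron with a squared-error loss, chosen only because its gradients are easy to verify by hand; the values and function names are hypothetical.

```python
def predict(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def loss(w, b, x, target):
    return 0.5 * (predict(w, b, x) - target) ** 2

def sgd_step(w, b, x, target, lr):
    """One gradient-descent update for a linear neuron with squared loss."""
    error = predict(w, b, x) - target          # dL/d(prediction)
    grad_w = [error * xi for xi in x]          # dL/dw_i = error * x_i
    grad_b = error                             # dL/db = error
    new_w = [wi - lr * g for wi, g in zip(w, grad_w)]
    new_b = b - lr * grad_b
    return new_w, new_b

w, b = [0.0, 0.0], 0.0
x, target = [1.0, 2.0], 3.0
before = loss(w, b, x, target)
w, b = sgd_step(w, b, x, target, lr=0.1)
after = loss(w, b, x, target)
```

With these numbers the loss falls from 4.5 to 0.72 after one step, which shows the update moving the parameters in the direction that reduces the loss.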

The learning rate determines the size of these updates. Training typically occurs over multiple iterations called epochs. In each epoch, the entire dataset is passed through the network. Batch training involves dividing the dataset into smaller batches to update the parameters more frequently. Techniques such as dropout and L2 regularization are commonly employed to prevent overfitting, where the model performs well on training data but poorly on unseen data. Once training is complete, the model's performance is evaluated on a separate validation set to assess its generalization ability. Finally, the trained model can be used to make predictions on new, unseen data by passing it through the network and obtaining output values.

The first layer of the MLP is the input layer, which takes the raw input data (such as images or text) and forwards it to the next layer. The following layers, known as hidden layers, perform a series of nonlinear transformations on the input data to capture complex patterns from the input features. The final layer, called the output layer, produces the classification output based on the patterns learned in the previous layers. Training the MLP uses a process called backpropagation to update the weights and biases of the neurons in each layer so as to reduce the difference between the predicted and actual outputs.

Using a non-linear kernel (activation) function, the output is computed as given in Eq. (1), where 'w' denotes the weight vector, 'x' represents the input vector, 'y' is the resulting output, and 'b' signifies the bias, with the kernel function denoted as 'ϕ'.

$$y=\phi \left(\sum_{i=1}^{n}{w}_{i}{x}_{i}+b\right) =\phi \left({w}^{T}x +b\right)$$

(1)
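As a worked instance of Eq. (1), assume purely for illustration that the kernel function is the rectified linear unit, ϕ(z) = max(0, z), and take the hypothetical values w = (0.5, −1, 2), x = (1, 2, 3), and b = 0.5:

$$y=\phi \left(0.5\cdot 1+\left(-1\right)\cdot 2+2\cdot 3+0.5\right)=\phi \left(5\right)=\max \left(0,5\right)=5$$

Had the weighted sum been negative, the same kernel function would have returned 0, which is the source of the non-linearity.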

The training process of the multilayer perceptron (MLP) involves a two-phase back-propagation approach. In the initial forward phase, Eq. (1) is employed to compute categorized outputs based on the provided input data. Subsequently, in the backward phase, partial derivatives of the loss with respect to the network parameters are computed through the kernel function and propagated back through the network. Following this, a gradient descent algorithm is applied to update the network's weights, and the entire procedure iterates until the weights converge.
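The two phases can be made concrete with a very small network. The sketch below is an illustration under stated assumptions: one hidden layer with tanh units, a single linear output, a squared-error loss, and plain gradient descent for the update step. None of these choices is taken from the paper, and all parameter values are hypothetical.

```python
import math

def forward(params, x):
    """Forward phase: hidden layer with tanh, then a linear output node."""
    W1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(w2, h)) + b2
    return h, y

def loss(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2

def backward(params, x, target):
    """Backward phase: propagate the output error back to every parameter."""
    W1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - target                                          # dL/dy
    grad_w2 = [dy * hi for hi in h]
    grad_b2 = dy
    dz = [dy * w * (1.0 - hi ** 2) for w, hi in zip(w2, h)]  # through tanh
    grad_W1 = [[d * xi for xi in x] for d in dz]
    grad_b1 = dz
    return grad_W1, grad_b1, grad_w2, grad_b2

def train(params, x, target, lr, epochs):
    """Repeat the forward phase, backward phase, and gradient-descent update."""
    for _ in range(epochs):
        gW1, gb1, gw2, gb2 = backward(params, x, target)
        W1, b1, w2, b2 = params
        W1 = [[w - lr * g for w, g in zip(r, gr)] for r, gr in zip(W1, gW1)]
        b1 = [b - lr * g for b, g in zip(b1, gb1)]
        w2 = [w - lr * g for w, g in zip(w2, gw2)]
        b2 = b2 - lr * gb2
        params = (W1, b1, w2, b2)
    return params

params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.5], 0.0)
x, target = [1.0, 2.0], 1.0
trained = train(params, x, target, lr=0.05, epochs=200)
```

A useful sanity check for any hand-written backward phase is to compare its gradients against finite differences of the loss.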

Hyperparameter Optimization

Training an MLP involves updating the hidden layer weights to maximize performance, and hyperparameters play a significant role in this process. Fine-tuning through hyperparameter optimization is therefore crucial: even with the same MLP architecture, accuracy can vary greatly with the hyperparameter combination. We chose four hyperparameters for optimization. The first two were the number of hidden layers and the number of nodes. Increasing these allows the network to capture complex features, but excessive complexity risks overfitting, so careful adjustment is needed. The third hyperparameter, the learning rate, determines the size of the weight updates during training; a value that is too large can prevent convergence, while one that is too small slows progress. Dropout, the fourth hyperparameter, limits node participation during training to prevent overfitting, although it may extend training time.

To ensure the prediction model's accuracy and prevent overfitting, we optimized the four hyperparameters described above, defining the tuning sets for each through trial and error. Twenty percent of the total data was allocated as the validation dataset (SD_SL_test). Unlike the training set, this portion was not directly involved in training but served to monitor and evaluate the model's predictive accuracy during the training process.
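The 80/20 hold-out can be sketched as follows. The function name, the use of a seeded shuffle, and the placeholder records are assumptions for illustration; only the 20 percent fraction and the dataset names come from the text.

```python
import random

def train_validation_split(rows, validation_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and hold out a fraction for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

rows = list(range(100))  # placeholder records, not the study's data
sd_sl_train, sd_sl_test = train_validation_split(rows)
```

Every record ends up in exactly one of the two sets, so the validation data never influences the weight updates.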

During training, the network is supplied with data (stress level data inputs L0, L1, L2, L3, L4 and corresponding outputs, SD_SL_train), and the weights are updated to reduce the prediction error. This process continues for numerous epochs until the error is minimized or a predefined stopping criterion is met. Increasing the number of hidden layers can lead to a proliferation of unnecessary features, hindering accurate predictions. Hence, for this model, we opted for two hidden layers, each with 10 neurons, to balance complexity and performance. Additionally, we experimented with 20 hidden nodes and found improved accuracy. The maximum number of iterations the solver can perform is set to 1000. The random seed is set to 42, which ensures that the MLP is initialized with the same random weights every time it runs. This is beneficial for repeatability. The activation parameter is set to “relu”, which means that the MLP uses a rectified linear unit activation function in its hidden layers. We determined that a learning rate of 0.01 yielded better average performance, avoiding suboptimal solutions or local minima. Setting the dropout value to 0.5 further enhanced average performance, even though the model did not strictly require regularization. Dropout is well suited to large datasets and deep neural networks.
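The reported settings and a candidate search space can be written down compactly. In the sketch below the `reported` values come from the text, while the key names, the `search_space` candidates, and the exhaustive enumeration are assumptions; the text describes the tuning sets only as found through trial and error.

```python
from itertools import product

# Values reported in the text; the key names are illustrative.
reported = {
    "hidden_layers": 2,
    "nodes_per_layer": 10,
    "learning_rate": 0.01,
    "dropout": 0.5,
    "activation": "relu",
    "max_iterations": 1000,
    "random_seed": 42,
}

# Hypothetical candidate values for the four tuned hyperparameters.
search_space = {
    "hidden_layers": [1, 2, 3],
    "nodes_per_layer": [10, 20],
    "learning_rate": [0.001, 0.01, 0.1],
    "dropout": [0.0, 0.5],
}

def candidate_configs(space):
    """Yield every combination of candidate hyperparameter values."""
    names = list(space)
    for values in product(*(space[n] for n in names)):
        yield dict(zip(names, values))

configs = list(candidate_configs(search_space))
```

Each configuration would then be trained and scored on the validation set, keeping the one with the best average performance.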

The pseudocode of the proposed algorithm, which detects the various levels of human stress from the dataset collected from Kaggle, is explained below.

The provided pseudocode outlines a workflow for stress detection and prediction using an MLP model. Initially, the dataset is split into training and testing sets to facilitate model training and evaluation. For each individual data entry in the training set, a conditional check is performed to ascertain if there are any missing or null entries. If such inconsistencies are detected, preprocessing steps are applied to ensure data integrity. Conversely, if the data is complete, the MLP model is trained for stress level classification. This iterative process continues until all data entries in the training set are utilized for training. Following model training, each entry in the testing set undergoes evaluation using the trained model. The model predicts the stress level for each entry, categorizing it into one of the predefined stress levels (L0 to L4). Finally, the classification results are returned, providing insights into the stress levels present in the dataset. This approach enables the automated detection and prediction of stress levels based on input data, facilitating proactive intervention and support strategies. The proposed model is evaluated against the other machine learning algorithms such as AdaBoost Classifier, Random Forest Classifier, Gradient Boosting Classifier, and Decision Tree Classifier for accuracy, precision, recall and F1 score.
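The workflow described above can be sketched end to end. To keep the example short and dependency-free, a nearest-centroid classifier stands in for the MLP; this is a swapped-in technique, not the proposed model. Mean imputation is likewise an assumed preprocessing step, and the feature values are hypothetical.

```python
def impute_missing(rows):
    """Replace None entries with the column mean of the observed values."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        seen = [r[c] for r in rows if r[c] is not None]
        means.append(sum(seen) / len(seen))
    return [[means[c] if v is None else v for c, v in enumerate(r)] for r in rows]

def fit_centroids(features, labels):
    """Stand-in for MLP training: store the mean feature vector per level."""
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(col) for col in zip(*xs)] for y, xs in groups.items()}

def predict_level(centroids, x):
    """Assign the stress level whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda level: sq_dist(centroids[level]))

# Hypothetical two-feature records (snoring rate, heart rate) for levels 0 and 4.
train_x = [[45.0, 50.0], [47.0, None], [98.0, 84.0], [96.0, 82.0]]
train_y = [0, 0, 4, 4]
test_x = [[46.0, 52.0], [97.0, 83.0]]

# Conditional check for missing entries, then preprocessing if needed.
if any(v is None for row in train_x for v in row):
    train_x = impute_missing(train_x)
model = fit_centroids(train_x, train_y)                   # training step
predictions = [predict_level(model, x) for x in test_x]   # evaluation step
```

Replacing `fit_centroids` and `predict_level` with the training and prediction routines of an MLP leaves the surrounding workflow unchanged.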

Other Machine Learning Algorithms

Decision Tree Classifier

This classifier adopts a supervised approach that segments the training data based on specific parameters. The segmentation process produces decision nodes and leaves, which form a tree-like structure. Decision trees are useful in various machine learning applications, such as classification and regression, because they resemble real-world decision-making and represent decisions formally and graphically in decision analysis. In classification and regression problems, decision trees are a non-parametric method that aims to create a model using decision rules derived from data attributes to predict the value of the target variable.
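How a single decision node is derived from the data can be sketched as follows. The example assumes Gini impurity as the splitting criterion, which is a common default but is not stated in the text; the readings and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best = (None, float("inf"))
    points = sorted(set(values))
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical heart-rate readings with two stress levels.
heart_rate = [52.0, 55.0, 58.0, 76.0, 80.0, 84.0]
level = [0, 0, 0, 3, 3, 3]
threshold, impurity = best_split(heart_rate, level)
```

A full tree applies the same search recursively to each resulting subset until the leaves are pure or a depth limit is reached.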

Gradient Boosting Classifier

When decision trees are used as the weak learners, the resulting method, known as gradient-boosted trees, often outperforms random forests. A gradient-boosted model is built in the same stage-wise fashion as other boosting techniques, but it generalizes them by allowing an arbitrary differentiable loss function to be optimized. The primary premise of the gradient boosting classifier is to fit a series of decision trees to the training data, where each tree tries to fix the mistakes produced by the preceding trees. During training, each new tree is fit to the errors of the current ensemble, so the samples that are still predicted poorly receive more attention in succeeding iterations. Gradient boosting classifiers can therefore cope well with unbalanced datasets. The method can achieve a high degree of accuracy and can manage very large and intricate datasets. Because it emphasizes fixing the errors made by the prior models, it can also generalize well, although in practice this depends on regularization settings such as the learning rate and the number of trees.
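The idea of each tree fixing the mistakes of its predecessors can be sketched in one dimension. The example is a simplification: it uses a squared-error loss (for which the negative gradient is simply the residual), depth-one trees, and a regression target rather than class labels. All names and data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Regression stump: pick the threshold that minimizes squared error.
    Assumes xs contains at least two distinct values."""
    best = None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, rounds, lr):
    """Fit each new stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + lr * sum(s(x) for s in stumps)

    history = []  # mean squared error before each round
    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        history.append(sum(r * r for r in residuals) / len(ys))
        stumps.append(fit_stump(xs, residuals))
    return predict, history

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.0, 0.0, 1.0, 1.0, 3.0, 4.0]
predict, history = boost(xs, ys, rounds=5, lr=0.5)
```

The recorded training error never increases from one round to the next, since every stump is fitted to what the ensemble still gets wrong.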

Random Forest Classifier

This classifier draws several random samples from the training data and considers a randomly selected subset of features at each split. Each tree is trained on a different sample. To produce a prediction for a new data point, the algorithm passes it through all the decision trees in the forest and takes the majority vote of the individual tree predictions; the class with the most votes is the final prediction. Random Forest offers various benefits compared with other classification methods. The method is robust, resists overfitting, and can handle noisy or missing data. Moreover, it can handle high-dimensional datasets and trains rather quickly. Applications of random forests include text classification, image classification, and prediction in the financial and medical fields.
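The two ingredients that distinguish a random forest from a single tree, sampling with replacement and voting, can be sketched as below. The function names, the placeholder records, and the per-tree predictions are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(42)
rows = list(range(10))            # placeholder training records
sample = bootstrap_sample(rows, rng)
votes = [2, 3, 2, 2, 1]           # hypothetical per-tree predictions
final = majority_vote(votes)
```

Here three of the five trees predict stress level 2, so that level becomes the forest's prediction.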

AdaBoost Classifier

Adaptive Boosting (AdaBoost) is an ensemble learning method that combines the predictions of multiple weak classifiers to create a strong classifier. A weak classifier is a model that performs slightly better than random chance. Initially, the algorithm assigns equal weights to all training samples, trains a weak classifier on the data, and evaluates its performance. It then calculates the weighted error of the weak classifier, which determines that classifier's weight in the ensemble. Next, the weights of the misclassified examples are increased, making them more important for the next classifier. This process is repeated until a predefined number of classifiers has been trained or the training error is sufficiently low.
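One round of the reweighting can be sketched as follows. The sketch assumes the classic two-class (discrete) AdaBoost formulas and a weighted error strictly between 0 and 1; the function name and the five-sample example are hypothetical.

```python
import math

def adaboost_round(weights, correct):
    """Given sample weights and whether each sample was classified correctly,
    return the classifier weight (alpha) and the renormalized sample weights."""
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

weights = [0.2] * 5                       # equal initial weights
correct = [True, True, False, True, True]
alpha, new_weights = adaboost_round(weights, correct)
```

After this round the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.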

In this work, the implementation is carried out in Jupyter Notebook under Anaconda Navigator, with Django acting as middleware. The frontend consists of HTML and CSS. Human stress can be detected from numerical values such as snore rate, breathing rate, body temperature, limb movement, blood oxygen, eye movement, sleep time, and heart rate. Django is a web framework for developing web applications quickly and easily. It has a built-in administrative interface that can be customized to manage the data in the application. Django's templating engine allows developers to create reusable templates for building consistent user interfaces across the application. The framework also includes a URL routing system that maps URLs to the appropriate views, making it easy to organize and manage the application's logic. After a machine learning model is trained, it is saved in pickle format (a .pkl file), which is then deployed behind the user interface to serve predictions. In this way, the trained model can be readily accessed and used for real-time decision-making.
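Saving and reloading a model in pickle format can be sketched with the standard library alone. In the sketch, a plain dictionary of hypothetical parameters stands in for the fitted model, and the file name `stress_model.pkl` is an assumption.

```python
import os
import pickle
import tempfile

# Hypothetical trained parameters standing in for the fitted model.
model = {"weights": [0.4, -0.2, 0.1], "bias": 0.05, "levels": [0, 1, 2, 3, 4]}

path = os.path.join(tempfile.mkdtemp(), "stress_model.pkl")
with open(path, "wb") as fh:
    pickle.dump(model, fh)          # done once, after training

with open(path, "rb") as fh:
    loaded = pickle.load(fh)        # done by the web application at start-up
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.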

Performance Evaluation

The performance of the proposed algorithm is evaluated against the existing machine learning models using measures such as true positives, true negatives, false positives, false negatives, accuracy, precision, recall, F1-score, and the confusion matrix, as discussed below.

False Positive (FP): An FP occurs when the model predicts a positive outcome, but the real outcome is negative. Reducing FPs matters most in scenarios where the consequences of false alarms are significant. It involves a balance between sensitivity and precision in the model.

False Negative (FN): An FN occurs when the model predicts a negative outcome, but the actual outcome is positive; that is, the model fails to identify a true positive case. Reducing FNs is crucial in situations where missing a positive case has severe consequences.

True Positive (TP): A TP occurs when the model predicts a positive outcome and the actual outcome is positive. It represents instances where the model and reality agree, correctly recognizing the positive class.

True Negative (TN): A TN occurs when the model predicts a negative outcome and the actual outcome is indeed negative. It represents instances where the model correctly identifies the absence of the positive class.

$$ \begin{gathered} {\mathbf{True}} \, {\mathbf{Positive}} \, {\mathbf{Rate}} \, \left( {{\mathbf{TPR}}} \right) \, = \, {\mathbf{TP}} \, / \, \left( {{\mathbf{TP}} \, + \, {\mathbf{FN}}} \right) \hfill \\ {\mathbf{False}} \, {\mathbf{Positive}} \, {\mathbf{Rate}} \, \left( {{\mathbf{FPR}}} \right) \, = \, {\mathbf{FP}} \, / \, \left( {{\mathbf{FP}} \, + \, {\mathbf{TN}}} \right) \hfill \\ \end{gathered} $$

Accuracy

It is the most common evaluation metric and provides an overall measure of how well a classification model performs.

$$ {\mathbf{Accuracy}} \, = \, \left( {{\mathbf{TP}} \, + \, {\mathbf{TN}}} \right) \, / \, \left( {{\mathbf{TP}} \, + \, {\mathbf{TN}} \, + \, {\mathbf{FP}} \, + \, {\mathbf{FN}}} \right) $$

Precision

Precision is the proportion of positive predictions that are correct. It is the ratio of correctly predicted positive observations to all observations predicted as positive.

$$ {\mathbf{Precision}} \, = \, {\mathbf{TP}} \, / \, \left( {{\mathbf{TP}} \, + \, {\mathbf{FP}}} \right) $$

Recall

Recall is a metric that measures the ability of a classification model to capture all the relevant positive instances.

$$ {\mathbf{Recall}} \, = \, {\mathbf{TP}} \, / \, \left( {{\mathbf{TP}} \, + \, {\mathbf{FN}}} \right) $$

F1 Score

The F1-score combines precision and recall into a single value, providing a balanced measure of a model's performance.

$$ {\mathbf{F1}} \, {\mathbf{score}} \, = \, {\mathbf{2TP}} \, / \, \left( {{\mathbf{2TP}} \, + \, {\mathbf{FP}} \, + \, {\mathbf{FN}}} \right) $$
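The formulas above translate directly into code. The counts in this sketch are hypothetical and are not results from this study; the function name is illustrative.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics defined above from the four confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),           # also the true positive rate
        "fpr": fp / (fp + tn),              # false positive rate
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts, for illustration only.
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

The F1 value obtained this way equals the harmonic mean of precision and recall, 2PR / (P + R), which is the more familiar form of the same formula.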

Results and Discussion

In this work, several measures are used to assess how well the different classification algorithms performed in tests. Evaluation measures such as accuracy, sensitivity, specificity, and precision, as well as the F1 measure, were used to gauge the effectiveness of the classification models.

Multi Layer Perceptron (MLP)

MLP is capable of providing accurate results by leveraging multiple layers of interconnected neurons. By fine-tuning the model, the MLP can achieve high sensitivity, ensuring that it detects a large portion of positive instances correctly. It also exhibits good specificity, meaning it can correctly identify negative instances. Additionally, the MLP's precision is noteworthy, as it delivers precise predictions by minimizing false positives. Overall, the MLP's performance can be evaluated using the F1 measure, which combines both precision and sensitivity, providing a balanced assessment of its predictive capabilities.

Precision

Precision measures the proportion of truly positive instances among all instances that the model predicted as positive. The precision values measured for the five stress level categories 0, 1, 2, 3, and 4 are shown in Fig. 6. From the graph, it is observed that the MLP algorithm identified stress levels 0, 1, 2, 3, and 4 with maximum precision for each category and performed exceptionally well in categorizing stress levels. This high level of effectiveness indicates the model's ability to accurately predict stress levels based on the provided data.

Recall

Recall measures the proportion of actual positive instances that the model correctly identifies. The recall rates for the five stress level categories are 100, 97.8, 100, 100, and 100, as shown in Fig. 7. It is observed that the algorithm can successfully identify nearly all relevant instances of each stress level category from the dataset applied during testing. In other words, the MLP algorithm has a high sensitivity to each stress level category, which demonstrates its robustness in accurately identifying stress levels, a property that is essential for applications such as stress monitoring and prediction.

F1–Score

The F1-score is a machine learning evaluation metric that measures the overall performance of a classification model by combining its precision and recall scores. It is a useful metric for dealing with imbalanced classes. The model is trained to classify data into five different categories representing stress levels ranging from 0 to 4. The outcome for these categories is plotted as a scatter graph in Fig. 8. An F1-score of 1 indicates that perfect precision and recall are achieved for the five stress level categories 0, 1, 2, 3, and 4. This is attributed to the high-quality training data and to the model's capability of learning complex patterns in the data, allowing it to make accurate predictions across all stress level categories.

Confusion Matrix

The confusion matrix is a performance evaluation tool in machine learning that summarizes the accuracy of a classification model. It is an N × N matrix that compares the actual target values with the values predicted by the machine learning model. In Fig. 9, the confusion matrix of the MLP is drawn using the matrix values. The entries of the matrix fall along the diagonal, showing the overall support for the algorithm across the stress levels. The diagonal values imply that this algorithm has good support for correctly predicting stress levels across the dataset. The provided confusion matrix disclosed that the MLP algorithm is making predictions exclusively in favor of certain classes while completely neglecting others.
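How an N × N confusion matrix is built, and how per-class precision and recall are read from it, can be sketched as follows. The labels are hypothetical and do not reproduce Fig. 9; rows are assumed to hold actual classes and columns predicted classes.

```python
def confusion_matrix(actual, predicted, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def per_class_scores(matrix):
    """Precision and recall for each class, read off the matrix."""
    scores = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)   # column total
        actual_k = sum(matrix[k])                     # row total
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        scores.append((precision, recall))
    return scores

# Hypothetical labels for the five stress levels, for illustration only.
actual    = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
predicted = [0, 0, 1, 2, 2, 2, 3, 3, 4, 4]
matrix = confusion_matrix(actual, predicted, n_classes=5)
scores = per_class_scores(matrix)
```

Correct predictions accumulate on the diagonal; the single off-diagonal entry here lowers the recall of level 1 and the precision of level 2.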