Introduction

Stress is widespread in modern life. For some people it is sporadic, while for others it is a constant part of life. For example, study pressure during exams can be a sporadic source of stress for students. Other contributing factors include fear of job loss, work pressure, target-oriented deadlines, family stress, financial crises, and the loss of loved ones.

The classification of stress can be based on the duration of exposure to stressors, leading to three types: acute, episodic, and chronic stress [1]. Acute stress is a normal and inborn response to short-term exposure to stressors and usually does not lead to negative consequences. Episodic stress is experienced when individuals face frequent and ongoing stressful situations that occur intermittently. Chronic stress, on the other hand, is a persistent stress caused by stressors from family or work environments, and it is considered harmful to an individual's health.

Research has shown that stress, especially when severe and continuous, can seriously harm mental and physical health. Prolonged stress can lead to conditions such as high blood pressure, depression, anxiety, fatigue, and sleep deprivation. Many studies have identified stress as a root cause of diabetes, heart disease, and cancer. Stress also impairs the ability to think, make decisions, concentrate, and enjoy life. It is therefore necessary to detect and manage stress. Advances in technologies such as the Internet of Things (IoT), the Internet of Medical Things (IoMT), and sensors have made it possible to monitor a person's mental state, so it is important to understand the different modalities that contribute to stress detection.

Machine learning models such as logistic regression classifier (LRC), support vector machine (SVM), decision tree (DT), random forest (RF), and K-nearest neighbor (KNN) have been applied to the stress detection problem, achieving high accuracy rates. Deep learning models such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and autoencoders have also shown promising results in stress classification using physiological and behavioral data.

This review paper aims to provide an overview of the modalities used in stress detection, together with the machine and deep learning algorithms, datasets, and application areas reported in the literature. The abbreviations used in this study are listed in Table 1.

Table 1 List of Abbreviations used in this study

Significance of Study

Stress is a global concern because it contributes to many chronic diseases. Figure 1 shows that a significant amount of research on stress has been conducted in the United States and China, but further research is still needed to better understand and address this issue. The negative effects of stress in the workplace are a worldwide problem, and it is important for researchers and organizations to study and develop effective solutions that minimize stress and promote well-being. Monitoring and detecting stress is therefore of utmost importance. Artificial intelligence is increasingly used in healthcare, where medical professionals are experimenting with it to improve prediction and decision-making. By accurately detecting stress, ML/DL models can provide valuable insights for developing personalized stress management strategies and interventions. Stress is highly multimodal: it is often expressed through facial expressions, gestures, voice, or physiology. This study aims to provide an understanding of the different modalities that can be used to identify stress, and it also provides a comprehensive review of available research that used ML/DL algorithms to detect stress.

Fig. 1 Countries contributing to the study of Stress Detection

Research Goals

In this review paper, the following important aspects of mental stress detection are explored:

  • An overview of different modalities that aid in detecting stress.

  • The use of machine and deep learning techniques for automatic stress detection.

  • The datasets that are available for stress detection.

  • The various areas where stress detection has been conducted.

  • Obstacles in detecting stress.

Outline of the Paper

The structure of the paper is as follows: Section II outlines the methodology used for selecting and collecting relevant studies, including the inclusion and exclusion criteria. Section III reviews various modalities for stress detection and provides an overview of existing research in the literature. The standard datasets available for stress detection are discussed in Section IV. Sections V and VI provide a brief literature survey of machine learning and deep learning techniques. Section VII discusses the application of stress detection in different areas. Section VIII presents problems in stress recognition. Finally, Section IX concludes the paper.

Methods

Search Strategy for Research Studies

This study conducted a systematic literature review following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) framework to ensure the inclusion of relevant studies [80]. The identification and selection of articles for the review are illustrated in Fig. 2. The literature search was restricted to articles published between January 2018 and November 2022. The initial search identified 345 articles, of which 271 remained after removing duplicates. Review articles and book chapters were not selected, so 187 papers were excluded at this stage. A further 22 were excluded after title and abstract screening, and 17 papers could not be retrieved. The remaining 45 articles, all published in English, were considered for review.
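The selection counts reported above can be tallied as a quick consistency check. The following is a minimal sketch in Python; the function and variable names are illustrative and are not part of the review protocol.

```python
# Tally of the article selection flow reported in the text.
# All counts come from the paragraph above; the names are illustrative.

def remaining_after(start, exclusions):
    """Subtract each exclusion count from the starting number of records."""
    remaining = start
    for excluded in exclusions:
        remaining -= excluded
    return remaining

identified = 345          # records found by the initial search
after_duplicates = 271    # records left after duplicate removal
duplicates_removed = identified - after_duplicates

# Exclusions applied after duplicate removal:
# article-type filter, title/abstract screening, papers not retrieved.
included = remaining_after(after_duplicates, [187, 22, 17])

print(f"duplicates removed: {duplicates_removed}")  # 74
print(f"articles included:  {included}")            # 45
```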

Fig. 2 Search approach used for this Review Paper

The search covered four databases (SCOPUS, Google Scholar, PubMed, and Dimensions), and the search term used was: ("multi-modal" OR mental OR psychological) AND (stress OR anxiety OR depression) AND ("Machine Learning" OR "Deep Learning"). The search was conducted on the title and abstract fields.
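To make the logic of the search term concrete, the sketch below checks whether a record's title and abstract satisfy the three AND-connected groups of OR-connected terms. It is only an illustration: each database applies its own query syntax, the example titles are hypothetical, and plain substring matching is an assumption that over-matches (for example, "mental" also matches "fundamental").

```python
# Illustrative matcher for the boolean search term quoted above.
# Real databases parse queries themselves; this only mirrors the AND/OR logic.

QUERY_GROUPS = [
    ["multi-modal", "mental", "psychological"],  # group 1 (OR)
    ["stress", "anxiety", "depression"],         # group 2 (OR)
    ["machine learning", "deep learning"],       # group 3 (OR)
]

def matches_query(title, abstract):
    """Return True if every AND-group has at least one term in title or abstract."""
    text = f"{title} {abstract}".lower()
    return all(any(term in text for term in group) for group in QUERY_GROUPS)

# Hypothetical records used only to exercise the matcher.
print(matches_query("Deep learning for mental stress detection", ""))  # True
print(matches_query("Stress testing of steel beams", ""))              # False
```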

Inclusion and Exclusion Criteria

The articles included in the review had to identify and detect psychological stress using ML/DL. Articles focused on multimodal data were included in the review process. Figure 3 shows a co-occurrence analysis of keywords for the papers considered for review using VOSviewer [79]. The included articles were published from January 2018 to November 2022 (inclusive). The titles and abstracts of the articles were screened and excluded based on the following criteria: (1) publications unrelated to psychological stress, (2) articles not involving ML/DL algorithms for data analysis, and (3) papers that were purely psychological or medical. Review articles, editorials, letters, book chapters, theses, conference abstracts without full text, and non-English articles were also excluded. A total of 310 articles were excluded, and 45 records were extracted at this stage.

Fig. 3 Keyword Co-occurrence Analysis

Quality Assessment Criteria for Research Studies

To ensure the quality and relevance of the academic literature included in the review process, this systematic literature review only considered original research articles, which were thoroughly checked for duplicates. Abstracts were carefully analyzed and screened, and each study was evaluated in detail at a later stage.

Data Extraction

In this phase, 55 articles were selected and the features extracted were:

  1. Only original papers were considered for the study, while case studies, review papers, and published reports were excluded.

  2. Extracted articles were published between 2018 and 2022.

  3. The articles must be from the engineering field and use ML/DL techniques.

  4. The language of the articles must be English.

Features for Stress Detection

Psychological, physiological, and behavioral responses are all interconnected forms of our reaction to various stimuli. Psychological responses refer to internal mental processes that are not physically visible, while physiological responses are unconscious physical responses of the body. On the other hand, behavior encompasses actions that can be controlled and observed from the outside. Emotional responses such as anger, anxiety, or depression are part of psychological responses, whereas changes in hormone levels, heart rate, and muscle activation are examples of physiological responses. Finally, variations in eye gaze, blink rate, and facial expressions fall under the category of behavioral responses. According to Figure 4, the different modalities for stress detection are categorized into the following groups:

  • Psychological or Self-reported measures: This includes questionnaires and surveys that ask individuals to self-report their stress levels.

  • Physiological signals: This includes measurements of heart rate, electrocardiogram (ECG), galvanic skin response (GSR), electroencephalogram (EEG), blood pressure, and respiration rate.

  • Behavioral signals: This includes features such as facial expressions, voice/speech patterns, body gestures, and eye movements.

  • Multimodal approaches: This includes the combination of two or more of the above modalities to improve the accuracy of stress detection.
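One simple way to combine modalities is decision-level (late) fusion, in which each modality produces its own stress probability and the results are averaged with weights. The sketch below illustrates this idea only; it is not the fusion method of any reviewed study, and the modality names, scores, and weights are hypothetical.

```python
# Decision-level (late) fusion sketch: weighted average of per-modality
# stress probabilities. Names, scores, and weights are hypothetical.

def fuse_probabilities(probabilities, weights):
    """Weighted average of stress probabilities over the modalities present."""
    total_weight = sum(weights[name] for name in probabilities)
    if total_weight <= 0:
        raise ValueError("weights of the available modalities must sum to > 0")
    weighted_sum = sum(probabilities[name] * weights[name] for name in probabilities)
    return weighted_sum / total_weight

modality_scores = {"physiological": 0.80, "behavioral": 0.60, "self_report": 0.40}
modality_weights = {"physiological": 0.5, "behavioral": 0.3, "self_report": 0.2}

fused = fuse_probabilities(modality_scores, modality_weights)
label = "stress" if fused >= 0.5 else "no stress"
print(f"fused probability = {fused:.2f} -> {label}")  # 0.66 -> stress
```

Because the weights are renormalized over the modalities that are actually present, the same function still works when one signal (for example, the self-report) is missing.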

Fig. 4 Categorization of features to detect stress

Psychological

Traditionally, stress detection relied on self-reported questionnaires filled out by participants, which were then evaluated by experts. Commonly used instruments include the PSS, DASS, and DSM-IV [12, 14, 28]. However, this method is time-intensive and lacks accuracy and reliability, because it relies on subjective self-reported data and requires manual intervention to recognize and interpret the results.

In addition, this method only provides a snapshot of the individual's current stress level and may not accurately reflect stress levels over a longer period. Moreover, the results may not be consistent, as stress levels can fluctuate from day to day and even within the same day. As a result, this method may not provide a comprehensive understanding of an individual's stress levels and may not be effective in detecting stress at an early stage.

Physiological Signals

Psychological responses involve mental activity and emotions, such as anger, anxiety, or depression. Physiological responses, on the other hand, involve changes in bodily functions that are beyond our control, such as an increase in heart rate or muscle activation due to changes in hormone levels. Wearable sensors can be used to capture physiological signals. Figure 5 shows common locations on the human body where sensors can be placed.

Fig. 5 Diagram showing typical positions on the human body where wearable sensors can be placed

Hormone Levels

Cortisol, also known as the "stress hormone," plays a significant role in regulating the body's stress response as well as metabolism, inflammation, and immunity. Its levels can be measured through blood, saliva, or urine samples. Studies have shown that cortisol levels tend to increase during stressful situations [34, 35]. However, it's important to note that cortisol levels are also influenced by various factors such as sex, age, medication, weight, blood pressure, type-2 diabetes, and saliva flow rate. Cortisol levels tend to be higher in the morning and decrease throughout the day, so timing should be taken into consideration when collecting samples.

Electro Dermal Activity (EDA)

EDA measures the electrical conductance of the human skin. Sweating is regulated by the sympathetic nervous system: sweat gland activity increases with psychological arousal, which in turn increases the conductance of the skin. EDA [33] is therefore used to assess the psychological state and emotional level of a person. However, it can be affected by humidity and room temperature. In [5], stress was estimated by recording EDA and respiratory changes. The experiment was carried out in two stages: physiological signals were recorded in stage 1, and neural signals from EEG were recorded in stage 2. Cognitive experiments were used to induce stress. A CNN was used for classification and achieved an accuracy of 90%.

Electrocardiogram (ECG)

ECG records the electrical activity of the heart, including irregular rhythms (arrhythmias). It is a plot of voltage against time, measured by placing electrodes on the body. ECG has been widely used by researchers to measure stress because it is easy to record. Heart rate variability (HRV) is computed from the ECG.

Heart rate variability quantifies the variation in the time interval between successive heartbeats. Studies show that it is a useful indicator for assessing stress and mental health conditions. However, HRV alone cannot be used to detect stress, because other factors such as breathing rhythm, obesity, smoking, and drinking can also alter it. The patient's medical history and psychological state should therefore also be considered. ECG signals recorded in different environments [7, 8, 9, 27] have been widely used by researchers to detect stress. In [8], college students' stress levels were identified from ECG using a CNN model. Similarly, Amin et al. [7] used ECG signals to detect driver stress, applying deep transfer learning and fuzzy logic to classify stress levels. Zhang et al. [10] used a CNN with Bi-LSTM (Bidirectional Long Short-Term Memory) to classify stress into low, medium, and high levels.
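As an illustration of how HRV can be quantified, the sketch below computes two widely used time-domain measures from a list of intervals between successive heartbeats (RR intervals): SDNN, the standard deviation of the intervals, and RMSSD, the root mean square of successive differences. The reviewed studies do not necessarily use these exact features, and the RR values here are hypothetical.

```python
import math
import statistics

def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals, in milliseconds."""
    return statistics.stdev(rr_ms)

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def mean_heart_rate(rr_ms):
    """Mean heart rate in beats per minute (60,000 ms per minute)."""
    return 60000.0 / statistics.mean(rr_ms)

rr_intervals_ms = [800, 810, 790, 805, 795]  # hypothetical RR intervals

print(f"mean HR: {mean_heart_rate(rr_intervals_ms):.1f} bpm")  # 75.0 bpm
print(f"SDNN:    {sdnn(rr_intervals_ms):.2f} ms")              # 7.91 ms
print(f"RMSSD:   {rmssd(rr_intervals_ms):.2f} ms")             # 14.36 ms
```

In practice the RR intervals would first be obtained by detecting R peaks in the ECG, and ectopic beats and artifacts would be removed before computing these measures.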

Electroencephalogram (EEG)

EEG is a technique that records the electrical activity of the brain using metal electrodes placed on the scalp. The resulting signals are divided into frequency bands: Delta (< 4 Hz), Theta (4–8 Hz), Alpha (8–12 Hz), Beta (13–30 Hz), and Gamma (> 30 Hz). Alpha and beta activity are linked to stress or negative emotions. EEG recordings are affected by eye blinks and head movements, so careful examination and pre-processing of the signals is required. The authors of [17] captured frontal lobe EEG in real time to assess stress in students. The Fast Fourier Transform (FFT) was used for feature extraction, SVM and naive Bayes (NB) were used as classifiers, and a cold pressor test served as the stressor. Saeed et al. [14] used wearable sensors with EEG electrodes to record brain activity and claimed that EEG signals can be used to classify long-term stress. They extracted 45 features from the EEG signals to separate stress and control groups. Of the five ML classifiers used, the support vector machine gave the highest accuracy, 85.20%. The authors of [15] recorded EEG signals from 22 participants and used mRMR to select features and improve accuracy. Zhang et al. [12] fused EEG with voice signals to detect depression.
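Band power is a common way to turn the frequency bands listed above into features. The sketch below computes it for a synthetic signal. It uses a direct discrete Fourier transform from the standard library, which gives the same spectrum as an FFT but more slowly. The sampling rate, the test signal, and the lower delta edge of 0.5 Hz are assumptions made for the example.

```python
import cmath
import math

# Band edges in Hz follow the text; each band includes its lower edge and
# excludes its upper edge, so 12-13 Hz falls outside every band here.
BANDS_HZ = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 64.0),
}

def dft_power(signal, fs):
    """Return (frequencies, power) for the non-negative-frequency DFT bins."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power

def band_powers(signal, fs):
    """Sum the spectral power that falls inside each EEG band."""
    freqs, power = dft_power(signal, fs)
    return {
        name: sum(p for f, p in zip(freqs, power) if low <= f < high)
        for name, (low, high) in BANDS_HZ.items()
    }

# Synthetic two-second "EEG": strong 10 Hz (alpha) plus weak 20 Hz (beta).
fs = 128
samples = [math.sin(2 * math.pi * 10 * t / fs) + 0.3 * math.sin(2 * math.pi * 20 * t / fs)
           for t in range(2 * fs)]

powers = band_powers(samples, fs)
print(max(powers, key=powers.get))  # alpha
```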

Skin Temperature (ST)

Many researchers [18, 32, 33] have used skin temperature in stress and emotion detection. Skin temperature is measured by placing sensors on the fingertips. Researchers report that skin temperature is negatively correlated with stress. In some cases, however, stress leads to a rise in temperature, which is referred to as "psychogenic fever". Skin temperature can also be influenced by other factors, such as room temperature.

Blood Pressure (BP)

When the body is under stress, it releases the hormones adrenaline and cortisol, which can elevate heart rate and raise blood pressure by constricting blood vessels. Blood pressure is also affected by other factors, such as being overweight, consuming too much alcohol or caffeine, and smoking, so it cannot be used alone to detect stress. BP has been used for stress detection together with other physiological signals.

Behavioral Signal

Behavior can be defined as the manner in which an individual or a group of individuals conduct themselves in a particular situation, guided by the prevalent norms, rules, or accepted social practices. Stress can affect behavior, resulting in changes such as increased irritation or anger. Some of these changes are difficult to measure, but others, such as changes in a person's interaction with technology, have been studied as a way to gauge stress levels. Measuring behavioral responses has the advantage of being unobtrusive and, in some cases, not requiring expensive equipment. Many studies have demonstrated the effectiveness of stress detection using physiological signals. Over the past few years, research has also fused speech, facial expressions, eye movements, and text with physiological signals to detect stress. The authors of [28] observed head movements, facial action units, and eye-gaze approximation and then used a multi-head LSTM to classify stress. J. Zhang et al. [31] fused ECG, voice, and facial expression for stress detection.

Facial Expression

Facial expressions reveal many different emotions, such as anger, sadness, happiness, stress, and likes and dislikes. Researchers have investigated facial expressions as a means of recognizing human emotions. Under stress, facial muscles change: for example, the eyebrows are pulled and vertical or horizontal marks appear. Head and mouth movements are also associated with stress. Facial expressions are decoded by computing action units, whose activation levels are defined by the Facial Action Coding System (FACS). The authors of [31] designed a temporal attention module (TAM) to find the keyframes from which facial features are extracted and used the ResNet50 and I3D deep learning models for classification, achieving an accuracy of 79.2%.

Eye Movements

Eye blink rate, gaze, and pupil diameter are also linked with stress. According to the American Optometric Association, stress may cause the eyes to make uncontrolled movements. Eye blink rate usually increases under stress. The study in [52] reported that pupil diameter and blink rate increase with stress, but that blink rate decreases when cognitive load increases. G. Giannakakis et al. [53] captured facial videos to extract features related to head, eye, and mouth movement and used K-NN, Generalized Likelihood Ratio, SVM, NB, and AdaBoost classifiers. The AdaBoost classifier achieved the highest accuracy, 91.68%.

Audio

According to the National Alliance on Mental Illness (NAMI), people suffering from stress have difficulty interacting and speaking clearly. Their speech may become slower or faster, and it is sometimes jumbled or slurred. Mel-frequency cepstral coefficients (MFCCs) are commonly used for feature extraction in speech recognition. In [25], an LSTM (Long Short-Term Memory) network using MFCC features extracted from speech reached an overall validation accuracy of 76.27%. X. Zhang et al. [12] fused prosodic, MFCC, and spectrum features with EEG to detect stress with an accuracy of 76.40% using ensemble classifiers. The authors of [20] first extracted features at the frame level to preserve the original temporal association of a speech sequence and then analyzed the differences between depressed and non-depressed speech. They used a multi-head LSTM, which showed an improvement of 2.3% over the traditional LSTM method. Minghao Du et al. [33] described the process of speech generation and perception using linear predictive coding (LPC) and MFCCs to capture relevant information. A one-dimensional CNN and an LSTM were then used for classification, with an accuracy of 77% on the MODMA dataset and 85% on the DAIC-WOZ dataset.
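MFCCs are built on the mel scale, which spaces frequencies roughly the way human hearing does. The sketch below shows only that first ingredient: converting between hertz and mel with a commonly used formula and placing filter centers evenly on the mel scale. The filter count and frequency range are assumptions for the example, and the remaining MFCC steps (framing, spectrum, log filterbank energies, discrete cosine transform) are omitted.

```python
import math

def hz_to_mel(freq_hz):
    """Convert hertz to mel using the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(num_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (mel_max - mel_min) / (num_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(num_filters)]

# Assumed settings: 10 filters between 0 Hz and 8 kHz.
centers = mel_center_frequencies(num_filters=10, f_min_hz=0.0, f_max_hz=8000.0)
print([round(c) for c in centers])
```

The centers are close together at low frequencies and spread out at high frequencies, which is why MFCCs resolve the lower part of the speech spectrum more finely.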

Text

H. Zogan et al. [10] proposed a novel model to detect depressed users by analyzing their social media activity. The model extracts features from the user's behavior and online timeline (i.e., posts). The authors of [28] used text, video, and audio to detect stress using deep learning, and those of [36] used text, images, and exercise data to detect stress in teenagers.

Dataset for Stress Detection

In this section, we present the publicly available datasets for stress detection. Table 2 provides an overview of these datasets.

Table 2 Overview of Existing Datasets for Stress Detection

WESAD

The WESAD (Wearable Stress and Affect Detection) dataset [56] is publicly accessible and provides a wealth of physiological and motion data for detecting stress and affect using wearable devices. The dataset contains information from 15 participants, recorded in a laboratory setting using wrist- and chest-worn sensors. The data includes multiple sensor modalities, such as blood volume pulse, electrocardiogram, electrodermal activity, electromyogram, respiration, body temperature, and three-axis acceleration. The dataset also features data from three different affective states (neutral, stress, and amusement) and self-reports collected through established questionnaires. This multimodal dataset is a valuable resource for researchers in the field of stress and affect detection.

SWELL-KW

The Smart Reasoning for Well-being at Home and at Work (SWELL-KW) dataset was created by Koldijk et al. [57] to study the stressful behavior of knowledge workers in an office environment using a pervasive, context-aware system. The dataset captured various modalities, including computer interactions, facial expressions, body postures, ECG, and skin conductance (SC), and used time pressure and e-mail interruptions as stressors. The dataset also included self-reported stress questionnaires as the ground truth. Initial results showed the ability to distinguish between normal and stressful work conditions.

DEAP

The Database for Emotion Analysis using Physiological Signals (DEAP) [58] is a multimodal dataset that can be used to analyze human affective states. It consists of EEG and peripheral physiological signals collected from 32 participants while they watched 40 one-minute music video excerpts. The participants rated each video in terms of arousal, valence, like/dislike, dominance, and familiarity. Additionally, a frontal face video was recorded for 22 of the participants. The dataset also includes a unique method for selecting stimuli, which involves using affective tags from the last.fm website, video highlight detection, and an online assessment tool.

Driverdb

The dataset for detecting drivers' overall stress levels [59] has been widely used in various studies and was collected by measuring stress responses while driving on planned routes with varying cognitive loads. It captures physiological modalities such as respiration, EMG, ECG, HR, and GSR in an ambulatory environment, along with video recordings of the driver to estimate their stress levels based on head movements and confirm cognitive load. However, the dataset has certain limitations, such as the unsynchronized video and sensor clocks that make it impossible to measure a driver's response time to stress stimulus, and the absence of self-reports for the driver's cognitive state.

MuSE

The MuSe dataset [60] is a publicly available dataset for multimodal stress detection, which includes physiological signals (ECG, respiration, skin conductance, temperature) and audio-visual data (facial expressions, speech, body gestures) of participants undergoing different stress-inducing tasks. The dataset was collected using a mobile sensing platform and includes data from 40 participants from different age and gender groups. The stress-inducing tasks include public speaking, mental arithmetic, and interpersonal interaction tasks. The dataset aims to facilitate research in multimodal stress detection using machine learning techniques.

CLAS

Cognitive Load, Affect, and Stress Database (CLAS) [61] is a publicly available dataset that was created for the recognition of cognitive load, affect, and stress. The dataset contains physiological signals, including electroencephalogram (EEG), PPG, and galvanic skin response (GSR), as well as facial expression videos and self-reports of the participants' stress and cognitive load levels. The dataset was collected from 62 participants who performed a cognitive task with varying levels of difficulty. The CLAS dataset has been used for the development and evaluation of machine learning algorithms for stress and cognitive load detection.

MAUS

MAUS (Mental workload Assessment on n-back task Using wearable Sensor) [62] is a publicly available dataset for mental workload assessment using wearable sensors. The dataset includes physiological signals such as electrocardiogram (ECG), fingertip PPG, wrist PPG, and GSR signals. The data was collected from 22 participants performing 2-back and 3-back tasks, which are commonly used to measure cognitive workload. Self-reported workload scores were also collected using the NASA Task Load Index (TLX). The dataset aims to facilitate the development and evaluation of methods for mental workload assessment using wearable sensors in real-world settings.

Machine Learning Algorithms for Classification

AI has revolutionized healthcare by leveraging data collected through sensor devices to make accurate predictions about patients' health status. This has been made possible through the use of advanced AI algorithms that analyze vast amounts of medical data, such as electronic health records, medical images, and genomics information. By identifying patterns in this data, AI can provide early warning signs of potential health issues, assist in the diagnosis of diseases, and even tailor personalized treatment plans. This approach not only enhances patient outcomes but also reduces the workload of healthcare professionals. Machine learning is broadly categorized into supervised and unsupervised learning. Supervised learning is a type of machine learning where the training data is labeled, meaning it has predetermined outcomes, and the algorithms used with this data are known as supervised learning algorithms. These algorithms are mainly used for diagnosis and prediction. Some supervised algorithms used for stress detection are linear and logistic regression, k-nearest neighbor, decision tree, random forest, support vector machine (SVM), and naive Bayes classifiers.
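As a small illustration of supervised learning, the sketch below implements a k-nearest neighbor classifier, one of the algorithms named above, on a labeled toy dataset. The two features (mean heart rate in beats per minute and skin conductance in microsiemens) and all values are hypothetical. Feature scaling, which matters for distance-based methods, is left out for brevity.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled training data: (mean heart rate, skin conductance).
train_features = [(62, 2.1), (65, 2.4), (68, 2.0), (88, 5.9), (92, 6.3), (95, 5.5)]
train_labels = ["no stress", "no stress", "no stress", "stress", "stress", "stress"]

print(knn_predict(train_features, train_labels, (90, 6.0)))  # stress
print(knn_predict(train_features, train_labels, (64, 2.2)))  # no stress
```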

On the other hand, unsupervised learning algorithms use “unlabeled data” to identify hidden relationships between the data and group them according to their similar characteristics. Unsupervised learning algorithms are used to classify or group patients based on their characteristics. K-means clustering, hierarchical clustering, and Hidden Markov model are some examples of unsupervised learning algorithms. Ensemble techniques in machine learning involve combining the predictions of multiple models to produce a more accurate final prediction. They are widely used in machine learning because they can improve the accuracy and robustness of models, reduce overfitting, and handle noisy data. Figure 6 shows the various machine learning algorithms used for classifying stress in research studies.
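The ensemble idea can be sketched with the simplest combination rule, majority voting, in which each model casts one vote per sample. This is only one of several ensemble strategies (bagging, boosting, and stacking are others), and the model outputs below are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    for sample_votes in zip(*predictions):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined

# Hypothetical predictions of three models for the same four samples.
model_outputs = [
    ["stress", "no stress", "stress", "no stress"],     # e.g. an SVM
    ["stress", "stress", "no stress", "no stress"],     # e.g. a decision tree
    ["no stress", "no stress", "stress", "no stress"],  # e.g. a k-NN model
]

print(majority_vote(model_outputs))
# ['stress', 'no stress', 'stress', 'no stress']
```

Using an odd number of models avoids ties in the binary case; each individual model above makes a different prediction on at least one sample, yet the vote smooths those disagreements out.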

Fig. 6 Machine Learning Algorithms Used for Classification

The algorithms used for stress classification through machine learning are presented in Table 3. The majority of the research has focused on using physiological signals to distinguish between stress and non-stress conditions. This review indicates that stress detection using physiological signals is more accurate than detection using other modalities. However, this does not imply that behavioral information cannot effectively detect stress, as the literature shows that it can. There is still significant room for study in this area. Binary classification of stress is prevalent in the literature.

Table 3 Classification of Stress Using Machine Learning Techniques

Deep Learning Algorithms for Classification

Deep learning is an area of machine learning that employs artificial neural networks to acquire knowledge from data. It is a form of artificial intelligence that allows computers to learn from experience and carry out tasks that typically require human intelligence, including image recognition, natural language processing, and decision-making.

Deep learning is characterized by its use of deep neural networks, which consist of multiple layers of artificial neurons. These networks are designed to learn increasingly complex representations of the input data as the layers get deeper. The learning process in deep neural networks is typically supervised, meaning that the network is trained on labeled data and adjusts its weights and biases to minimize the difference between its predictions and the true labels [55].
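The supervised training loop described above can be illustrated at the smallest possible scale: a single sigmoid neuron whose weight and bias are adjusted by gradient descent to reduce the cross-entropy between its predictions and the labels. This is a deliberately simplified sketch, not a deep network, and the one-dimensional data, learning rate, and epoch count are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(weight, bias, inputs, labels):
    """Mean binary cross-entropy between predictions and true labels."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for x, y in zip(inputs, labels):
        p = sigmoid(weight * x + bias)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(inputs)

def train_neuron(inputs, labels, learning_rate=0.5, epochs=200):
    """Fit one sigmoid neuron with batch gradient descent; return weight, bias, losses."""
    weight, bias = 0.0, 0.0
    losses = []
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(inputs, labels):
            error = sigmoid(weight * x + bias) - y  # prediction minus label
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / len(inputs)
        bias -= learning_rate * grad_b / len(inputs)
        losses.append(cross_entropy(weight, bias, inputs, labels))
    return weight, bias, losses

# Hypothetical normalized feature with labels 0 = no stress, 1 = stress.
inputs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
labels = [0, 0, 0, 1, 1, 1]

weight, bias, losses = train_neuron(inputs, labels)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

A deep network repeats the same principle across many layers, with the gradients for the earlier layers obtained by backpropagation.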

Powerful computing resources such as graphics processing units (GPUs) that can efficiently perform the complex calculations required by deep neural networks and the availability of large datasets have contributed to the success of deep learning. Deep learning has revolutionized a wide range of fields, including speech recognition, computer vision, natural language processing, and autonomous driving, and has enabled the development of intelligent systems that can make decisions based on complex input data. Deep learning is rapidly transforming healthcare by enabling the development of intelligent systems that can analyze large amounts of patient data and help healthcare professionals make more accurate diagnoses and treatment decisions. Some commonly used deep learning techniques are listed in Table 4. Figure 7 shows the classification of deep learning algorithms.

Table 4 Brief description of several Deep Learning Techniques

Fig. 7 Classification of Deep Learning Algorithms

Deep learning has demonstrated impressive outcomes in multiple domains, including speech recognition, computer vision, and natural language processing. It has also been applied to stress detection, using models such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks to analyze physiological and behavioral data. Figure 8 illustrates the percentage of deep learning techniques employed for stress classification. These models have shown better accuracy and performance than traditional machine learning algorithms, as summarized in Table 5.

Fig. 8 Deep Learning Algorithm Used for Classification

Table 5 Classification of Stress Using Deep Learning Techniques

The performance of deep learning algorithms tends to decrease when dealing with a small dataset. To address this concern, one can explore techniques such as data augmentation, data synthesis or generation, and K-fold cross-validation. Researchers have found that hybrid deep learning approaches, such as CNNs combined with RNNs or LSTMs, along with attention mechanisms, enhance feature extraction and accuracy. In [88], the presence of noise in the signals was found to cause underfitting when using a CNN alone. To overcome this drawback, an ensemble of CNN and LSTM was employed, resulting in improved accuracy.

The generalized model for analyzing mental stress is presented in Fig. 9. Several studies have been conducted using different stressors and classification algorithms. A common approach is to categorize stress into the binary states of stress and no stress, while other studies use three, four, or at most five levels.
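K-fold cross-validation makes better use of a small dataset by letting every sample serve as test data exactly once. The sketch below generates the index splits only. It is a simplified illustration: shuffling, stratification by class, and subject-wise grouping (keeping all recordings of one participant in the same fold) are omitted, and the sample count is hypothetical.

```python
def k_fold_indices(num_samples, k):
    """Yield (train_indices, test_indices) for k consecutive folds."""
    if not 2 <= k <= num_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [num_samples // k + (1 if i < num_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, num_samples))
        yield train_idx, test_idx
        start += size

# Hypothetical dataset of 10 samples split into 5 folds.
for fold, (train_idx, test_idx) in enumerate(k_fold_indices(10, 5)):
    print(f"fold {fold}: test={test_idx}")
```

The model is trained on `train_idx` and evaluated on `test_idx` in each fold, and the k scores are averaged to estimate performance.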

Fig. 9 Generalized Model for Stress Detection

Numerous problem-solving disciplines have benefited from the use of nature-inspired algorithms, which are frequently based on the behavior of biological or natural systems. Researchers have considered applying these algorithms to feature selection, classification, or optimization tasks. Well-known examples of optimization algorithms that take their inspiration from nature include the genetic algorithm, particle swarm optimization (PSO), ant colony optimization, the cuckoo search algorithm, and the bat algorithm [89, 90].

These algorithms can improve feature selection, model parameters, or both in stress detection systems, thereby enhancing the precision of stress detection. Nature-inspired algorithms also help in handling high-dimensional data. In [85], the Grey Wolf Optimizer was used to select optimized features from the EEG signal, and a hybrid model of BLSTM and LSTM was then used for classification.
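To show how a nature-inspired optimizer works in principle, the sketch below implements a basic particle swarm optimization (PSO) loop and uses it to minimize a simple test function. It is a generic illustration rather than the method of any reviewed study: the objective, swarm size, and coefficients are assumptions, and in a stress detection system the objective would instead score a feature subset or a set of model parameters.

```python
import random

def sphere(position):
    """Toy objective: sum of squares, minimized at the origin."""
    return sum(x * x for x in position)

def particle_swarm_minimize(objective, dim, num_particles=20, iterations=100,
                            bounds=(-5.0, 5.0), inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic PSO; returns (best_position, best_value, best_value_history)."""
    rng = random.Random(seed)
    low, high = bounds
    positions = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(num_particles)]
    velocities = [[0.0] * dim for _ in range(num_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_index = min(range(num_particles), key=personal_best_val.__getitem__)
    global_best = personal_best[best_index][:]
    global_best_val = personal_best_val[best_index]
    history = [global_best_val]

    for _ in range(iterations):
        for i in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                new_x = positions[i][d] + velocities[i][d]
                positions[i][d] = min(high, max(low, new_x))  # stay in bounds
            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = value
                if value < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = value
        history.append(global_best_val)
    return global_best, global_best_val, history

best_position, best_value, history = particle_swarm_minimize(sphere, dim=2)
print(f"best value found: {best_value:.6f}")
```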

Application Areas for Stress Detection

Stress detection has broad applications across various fields. One of the most common is healthcare, where stress detection can be employed for the diagnosis and management of stress-induced illnesses such as anxiety disorders, depression, and cardiovascular diseases. In addition, stress detection can be used in sports training to monitor the stress levels of athletes and prevent overtraining. It can also be used in workplaces to monitor the stress levels of employees and take measures to prevent burnout and improve productivity. Moreover, stress detection can be used in educational settings to monitor the stress levels of students and prevent stress-related problems (Fig. 10).

Fig. 10

Application areas for Stress Detection

Stress Detection in Drivers

Identifying driver stress levels can help prevent hazardous situations, as stressed drivers are more likely to make mistakes and have slower reaction times. Continuous monitoring of driver stress levels can therefore provide early warning signs and allow corrective action to be taken. To detect driver stress, researchers have employed various models utilizing physiological signals such as GSR, ECG, and respiration (RESP). One commonly used dataset in this research is the PhysioNet Drivers Data Set, created by Jennifer Healey, which includes ECG, electromyogram (EMG) of the right trapezius, GSR measured on both the hand and the foot, and respiration. Rastgoo et al. [42] used CNN and LSTM to fuse ECG, vehicle data, and contextual data for stress level detection and achieved an accuracy of 92.8%. Similarly, Amin et al. [7] utilized deep transfer learning to classify the stress level of drivers into three categories: low, medium, and high. The classification was performed on ECG signals using seven pre-trained network models: GoogLeNet, DarkNet-53, ResNet-101, InceptionResNetV2, Xception, DenseNet-201, and InceptionV3. The maximum accuracy achieved was 95%. In [14], Saeed et al. used SVM, KNN, NB, logistic regression, and MLP for classification, and the highest accuracy, 85.2%, was obtained with SVM. Luntian M. et al. [51] proposed a framework for detecting driver stress levels using a deep learning-based multimodal fusion system. The approach combines various non-invasive data streams, such as eye-tracking, vehicle, and environmental data, and uses an attention-based convolutional neural network (CNN) and long short-term memory (LSTM) model to extract features and assess the stress level. The framework was validated using a driving simulator, where it achieved an accuracy of 95.5%.
The authors of [52] proposed a novel approach to identify drivers' stress from short-term physiological signals, namely foot GSR (FGSR), hand GSR (HGSR), and heart rate (HR). The method converts each signal into a two-dimensional nonlinear representation, the continuous recurrence plot (Cont-RP), which a multimodal convolutional neural network (CNN) then transforms into representation vectors. This allows the model to effectively differentiate between states of stress and relaxation.
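The idea of turning a one-dimensional signal into a two-dimensional recurrence representation can be sketched as follows. This is a simplified illustration that assumes the plot is built from pairwise absolute differences of raw samples; the exact construction in [52] (for example, any embedding or normalization) may differ, and the heart-rate values and the threshold `eps` are invented.

```python
def continuous_recurrence_plot(signal):
    """Return the matrix of pairwise absolute differences |x_i - x_j|."""
    return [[abs(a - b) for b in signal] for a in signal]

def binary_recurrence_plot(signal, eps):
    """Return 1 where two samples lie within eps of each other, else 0."""
    return [[1 if abs(a - b) <= eps else 0 for b in signal] for a in signal]

heart_rate = [72, 75, 90, 74]  # hypothetical HR samples (beats per minute)
cont_rp = continuous_recurrence_plot(heart_rate)
bin_rp = binary_recurrence_plot(heart_rate, eps=3)
```

The continuous variant keeps the distance values instead of thresholding them, so the resulting matrix can be treated as a grayscale image and passed to a CNN.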

Stress Detection in Academic Environment

Stress detection in the academic environment helps in understanding the level of stress students face during their studies and in addressing the potential negative effects of stress on their mental and physical health, academic performance, and well-being. Several methods have been used to detect stress levels in academic settings, including self-report questionnaires, physiological measures, and behavioral observations. Tian [8] denoised student ECG signals using the wavelet transform and used a CNN for classification, achieving an accuracy of 98%. Al Shorman O et al. [17] utilized support vector machine (SVM) classifiers with RBF, linear, polynomial, and sigmoid kernels, as well as a Naive Bayes (NB) classifier, to analyze five features extracted from frontal lobe EEG data obtained from undergraduate university students. Zhang P. et al. [10] used CNN-BiLSTM for stress classification from the ECG data of 34 participants from the Chinese Academy of Sciences, with an accuracy of 85%. Morales-Fajardo et al. [43] proposed a remote photoplethysmography (rPPG) method to collect PPG signals from engineering students. They report that the J48, random forest, and KNN classifiers achieve an accuracy of 96%.
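Wavelet-based denoising of the kind used for ECG pre-processing can be illustrated with a one-level Haar transform and soft thresholding. The choice of the Haar wavelet, the single decomposition level, and the threshold value are assumptions made to keep the sketch short; the wavelet family and settings used in [8] are not stated here, and practical systems typically rely on a dedicated wavelet library with several decomposition levels.

```python
import math

def haar_denoise(signal, threshold):
    """One-level Haar wavelet denoising with soft thresholding (even-length input)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]   # low-frequency coefficients
    detail = [(a - b) / s for a, b in pairs]   # high-frequency coefficients
    # Soft thresholding: shrink every detail coefficient toward zero.
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, shrunk):
        out.extend([(a + d) / s, (a - d) / s])  # inverse Haar transform
    return out

noisy = [1.0, 3.0, 2.0, 2.0]          # hypothetical samples
unchanged = haar_denoise(noisy, 0.0)   # zero threshold reconstructs the input
smoothed = haar_denoise(noisy, 10.0)   # large threshold removes all detail
```

With a threshold of zero the transform is perfectly invertible, and with a very large threshold each pair of samples is replaced by its average.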

Stress Detection in Work Environment

It is common for people to experience high levels of stress in their workplace, particularly in high-pressure office environments where there are tight deadlines, heavy workloads, and high expectations. The effects of this kind of stress can be substantial on a person's health, general contentment, and job satisfaction. People may experience physical symptoms such as headaches, fatigue, muscle tension, and difficulty sleeping, and may also suffer from emotional symptoms such as irritability, depression, and decreased self-esteem. It is important for individuals and organizations to address stress in the workplace in order to promote a healthy work-life balance, improve job satisfaction, and maintain productivity and well-being. Research has demonstrated that machine learning is a viable method for detecting workplace stress [38]. In [44], W. Seo et al. fused physiological signals with video data and used deep learning to classify stress in working people.

Stress Detection During COVID-19

The authors of [45] used the Perceived Stress Scale (PSS) to identify stress in university students during COVID-19. They applied several ML algorithms for stress detection; of all the classifiers, the Logistic Regression Classifier (LRC) demonstrated the highest accuracy, reaching 97.8%. S. N. Gamage et al. [46] used questionnaire-based methods to study IT professionals working from home for 40 h a week for a year during the pandemic. The CatBoost algorithm gave a maximum accuracy of 97.1%. The authors of [47] used Twitter posts to analyze stress during the pandemic, utilizing machine and deep learning algorithms in combination with natural language processing for classification. C. A. V. Palattao et al. [49] used machine learning classifiers to determine factors contributing to mental health issues based on questionnaires. H. A. Khan et al. [50] developed a secure framework that uses wearable sensors to measure and transmit physiological signals to a cloud-based server, which then uses machine learning to predict stress levels.
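Questionnaire-based labels such as PSS scores are typically computed before any classifier is trained. The sketch below assumes the 10-item version of the scale (PSS-10), in which each item is answered from 0 to 4 and items 4, 5, 7, and 8 are reverse-scored; the low, moderate, and high bands are a commonly cited convention rather than something specified in [45], whose PSS variant is not stated here.

```python
REVERSED_ITEMS = {4, 5, 7, 8}  # 1-based positions of the positively worded PSS-10 items

def pss10_score(responses):
    """Sum ten 0-4 responses, reverse-scoring the positively worded items."""
    if len(responses) != 10 or any(r not in range(5) for r in responses):
        raise ValueError("expected ten integer responses between 0 and 4")
    return sum(
        4 - r if position in REVERSED_ITEMS else r
        for position, r in enumerate(responses, start=1)
    )

def pss10_band(score):
    """Map a total score to a band (assumed cut-offs: 0-13, 14-26, 27-40)."""
    if score <= 13:
        return "low"
    if score <= 26:
        return "moderate"
    return "high"

example_score = pss10_score([2] * 10)   # hypothetical respondent answering 2 throughout
example_band = pss10_band(example_score)
```

The resulting band, or the raw score, can then serve as the target label for classifiers such as logistic regression.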

Issues and Challenges

Detecting, assessing, and analyzing stress in humans is an important process to address this phenomenon. Although stress has a subjective dimension, researchers aim to find reliable and objective measures that can effectively represent stress, and which cannot be controlled or manipulated. Physiological measures are often regarded as a more dependable indicator of stress levels. By measuring physiological signals such as heart rate, blood pressure, electrodermal activity, and cortisol levels, researchers can gain insights into an individual's stress levels and develop effective stress detection and assessment techniques. Behavioral information can be effective for detecting stress, as it can provide additional information about a person's mental state and level of stress. While physiological signals such as heart rate and skin conductance are commonly used for stress detection, they may not always be accurate indicators of stress. For example, a person may have a high heart rate due to exercise rather than stress. Behavioral signals such as facial expressions, body language, speech patterns, and typing behavior can provide complementary information to physiological signals and help confirm the presence of stress. Therefore, incorporating both physiological and behavioral signals in stress detection systems can potentially increase the accuracy and reliability of stress detection. Combining multiple modalities, including physiological, behavioral, and contextual, may lead to more accurate and comprehensive stress detection systems. However, it is important to consider the limitations and potential biases in using physiological measures for stress detection, as these signals may be influenced by factors such as age, gender, and medication use.
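One simple way to combine modalities is late fusion, in which each modality produces its own stress probability and the outputs are merged. The sketch below assumes a weighted average; the modality names, probabilities, weights, and the 0.5 decision threshold are all hypothetical values chosen to mirror the exercise example above, where an elevated heart rate alone would suggest stress.

```python
def late_fusion(probabilities, weights):
    """Return the weighted average of per-modality stress probabilities."""
    total_weight = sum(weights[m] for m in probabilities)
    return sum(probabilities[m] * weights[m] for m in probabilities) / total_weight

# Hypothetical outputs of three per-modality classifiers for one time window.
probabilities = {"physiological": 0.9, "behavioral": 0.2, "contextual": 0.1}
weights = {"physiological": 0.4, "behavioral": 0.3, "contextual": 0.3}

fused = late_fusion(probabilities, weights)
label = "stress" if fused >= 0.5 else "no-stress"
```

Here the physiological classifier alone would flag stress, but the behavioral and contextual evidence pulls the fused probability below the threshold. Early fusion, in which features from all modalities are concatenated before a single classifier, is a common alternative.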

Identifying stress and calm states is a difficult task due to the complex nature of stress. Nevertheless, physiological signals demonstrate specific patterns that could serve as a suitable foundation for distinguishing primary states like arousal. These signals can also aid in identifying more intricate emotional states, such as stress, with proper modeling. Utilizing these physiological signals may enhance the precision and dependability of stress detection and analysis, and provide deeper insight into the underlying mechanisms of stress. However, additional investigation is necessary to fully explore the potential of these signals and to overcome the obstacles in stress detection. Some of the major challenges identified are:

  • Physiological signals are susceptible to noise and can be affected by external factors; skin temperature, for example, varies with room temperature.

  • There is a lack of standardization in terms of data collection protocols, feature extraction methods, and evaluation metrics, making it difficult to compare results across studies.

  • Most research has been conducted in controlled environments, where stressors are intentionally induced to ensure a measurable amount of acute stress. These stressors rarely model real-life stressors. Therefore, proper characterization of stressors requires further investigation.

  • There are significant differences in stress responses between individuals, making it difficult to create a one-size-fits-all model for stress detection.

  • There is a limited amount of publicly available annotated data for stress detection, which makes it challenging to train and evaluate machine learning models.

  • The most challenging tasks in developing a stress detection model are collecting real-time data and removing artifacts and noise.

  • Stress detection is a highly multimodal task, so integrating data from multiple modalities is another crucial consideration.

  • Researchers have investigated a variety of deep learning and hybrid deep learning algorithms for classification; however, feature selection and parameter tuning remain difficult tasks.

  • To the best of our knowledge, nature-inspired algorithms remain largely unexplored in stress detection, apart from isolated examples such as the Grey Wolf Optimizer used in [85], despite their widespread use in other fields.
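Two of the challenges listed above, inter-individual differences and comparable evaluation, are often approached with per-subject normalization and subject-wise data splits. The following is a minimal sketch under stated assumptions: each sample is a hypothetical `(subject_id, value)` pair, normalization is a z-score against the subject's own data, and evaluation uses leave-one-subject-out splits.

```python
import statistics

def normalize_per_subject(samples):
    """Z-score each value using the mean and standard deviation of its own subject."""
    by_subject = {}
    for subject, value in samples:
        by_subject.setdefault(subject, []).append(value)
    stats = {
        s: (statistics.mean(v), statistics.pstdev(v) or 1.0)  # guard against zero spread
        for s, v in by_subject.items()
    }
    return [(s, (x - stats[s][0]) / stats[s][1]) for s, x in samples]

def leave_one_subject_out(samples):
    """Yield (held_out_subject, train_indices, test_indices) for every subject."""
    for held_out in sorted({s for s, _ in samples}):
        train = [i for i, (s, _) in enumerate(samples) if s != held_out]
        test = [i for i, (s, _) in enumerate(samples) if s == held_out]
        yield held_out, train, test

# Hypothetical heart-rate readings from two subjects with different baselines.
samples = [("A", 60), ("A", 80), ("B", 90), ("B", 110)]
normalized = normalize_per_subject(samples)
splits = list(leave_one_subject_out(samples))
```

After normalization, both subjects' readings lie on the same scale even though their raw baselines differ. In a deployed system the subject statistics would have to come from a separate calibration or baseline recording rather than from the data being classified.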

Conclusion

Stress is a common psycho-physiological response to various events or demands encountered in daily life. Several studies have been conducted in controlled laboratory environments to detect stress using physiological reactions measured by sensors. These experiments have reported higher accuracy than real-time stress detection methods. However, with the advent of IoT, wearable devices have become more user-friendly and more accurate in data collection. Additionally, some researchers have developed their own low-cost sensor devices that have shown promising results. In most studies, multiple physiological signals, obtained using one or more wearable devices, have been used for stress detection.

Research has also been conducted on detecting stress using behavioral signals such as speech, facial expressions, and body movements. For example, changes in speech patterns, such as increased pauses, changes in pitch, and a slower speech rate, have been found to be indicative of stress. Facial expressions, such as frowns and furrowed brows, and body movements, such as fidgeting and hand wringing, have also been studied as potential indicators of stress. However, these signals are not as reliable as physiological signals, and more research is needed to fully understand their potential in stress detection. The raw data collected from these signals are pre-processed by removing artifacts and noise using filters. The pre-processed data are then used for feature extraction and selection. In addition to numerous machine learning algorithms, the use of deep learning algorithms has increased significantly. This review paper aims to provide researchers with insights into stress detection by presenting an overview of the different modalities used in stress detection, along with available datasets. It also highlights various deep learning algorithms that can be explored for stress detection.