Introduction

Stress has been recognized as the “health epidemic of the 21st century” by the World Health Organization [1]. The Transactional Model of Stress, the leading model in psychological stress research, defines stress as a dynamic interaction between individuals and their environments. Specifically, stress results from an individual’s perception of an imbalance between the demands of the environment and the resources available to respond to them [2]. Stress therefore occurs only when individuals perceive a situation as exceeding their resources to cope with it. Perceived stressors can range from daily situations (e.g., conflict with partners) to major life events (e.g., moving home) [3].

Medical education is inherently stressful: it is considered difficult and time-demanding, and it requires commitment and dedication. The literature shows that medical students exhibit higher distress levels than non-medical students [4]. A study in the United States comparing medical students with students from other courses found greater alcohol dependence or abuse among medical students [5]. In Portugal, previous work has shown that stress is a prevalent risk factor among medical students, affecting the decision-making process and altering the activity of brain networks [6, 7].

In fact, students are subject to increasing workloads, with a progressive focus on autonomy and continuous assessment. In addition to the high stress potential of medical training and practice, other factors contribute, such as the student’s first contact with patients, often living alone and away from home, long hours of study, and concerns about professional performance at the end of the course.

Specifically, concerns about the curriculum and academic exams are the most frequently cited in the literature as potential sources of stress [8]. Although assessment is a fundamental phase in the training and certification process, it is also one of the strongest stress factors because of its high-stakes implications for academic progress and self-perceived image. Generally, the high levels of stress could be due to the frequency of examinations, lack of time to revise the subject, sleepless nights, difficulty in understanding the subject, poor academic performance, and a competitive environment [9]. For these reasons, medical education is commonly associated with anxiety, depression, addictions, and even suicidal ideation [10]. It is therefore not rare for medical students to present more frustration, exhaustion, helplessness, and psychological disorders [11].

According to a literature review [12], the total number of articles on stress has almost doubled during the last five years. Although the term stress often carries a negative connotation, a mild level of stress can be beneficial and can enable students to become more dynamic and to perform better. However, it is very difficult to distinguish between an optimal stress level, called eustress, and an exacerbated level, called distress.

Eustress, distress and their assessment

In the present study, we aimed to monitor the stress of medical students by comparing stress levels during academic exams and during a regular week. We compared different statistical methods in order to identify the one best suited to developing a predictive model of stress. We intend this model to help students monitor their own stress levels so that they can reach their optimal performance during academic exams.

The first distinction between eustress and distress was made in the 1960s by Selye [13] to distinguish two different types of responses to stress. The author defined eustress as the stress level that creates challenge and constitutes a positive motivating force [1, 13]; it is associated with positive feelings and outcomes (e.g., well-being, work satisfaction) [14, 15]. In contrast, distress has been characterized by stress experiences in which the individual appraises the stressor as a source of harm or threat [14, 15]. Distress is mainly related to negative emotions and unhealthy bodily states, such as headaches and sleep problems [1, 14, 16]. Over-exposure to a large number of stressors during an extended period of time, in the absence of enough resources to deal with them, can lead to burnout [3, 5]. Burnout can be defined as a psychological syndrome encompassing three key dimensions: overwhelming exhaustion, depersonalization of the beneficiaries of one’s work, and a lack of accomplishment [5, 11, 17]. A burned-out person has, among other consequences, an increased risk of addiction and deteriorating health. In particular, the continuous experience of distress (that is, chronic stress) has significant consequences for individuals’ mental health (e.g., depression, anxiety, memory loss) and physical health (e.g., musculoskeletal disorders, high blood pressure, cardiovascular problems) [18, 19]. Stress is also related to the leading causes of death, such as heart disease, cancer, accidents, and suicide [16].

Given the complexity of the stress construct, its evaluation poses some challenges [20]. Among the different methods used to evaluate stress at its different levels, the most common in the literature fall into the psychological, physiological, and behavioral domains. Traditionally, psychological stress in humans has been assessed mainly through self-report questionnaires. A number of instruments that measure different responses to stressful stimuli are widely used and considered reliable. Examples include the Stress Self-Assessment Scale, the Perceived Stress Scale (PSS) [21], and the Stress Response Inventory (SRI) [22]. However, these questionnaires have some disadvantages: they have lower sensitivity to physiological variations, and they only provide information on stress levels at the time of evaluation, without covering the stressors or the evolution of stress levels [23].

Concerning physiological data, biological markers include acute-phase response hormones and mediators (cortisol, interleukins, ferritin) that are released during the stress response. Several types of bio-signals can therefore be used for physiological assessment, such as hormone levels (e.g., cortisol). However, these measures require careful design of the data collection in order to control for the influence of other variables (e.g., time of day, consumption of substances or, in women, the menstrual phase in which the samples were collected) [24].

To overcome these limitations, another approach is to evaluate stress in real life using simple wearable devices, such as belts or smartbands. These devices have become a new tool for stress evaluation and management and are widely used in the health field [25]. The rationale for their use is that stress affects physiological processes. A recent meta-analysis provided evidence that heart rate variability (HRV) changes in response to stress, which supports the use of HRV as an indicator of psychological stress [26]. Specifically, exposure to a stressor triggers the autonomic nervous system, activating the sympathetic nervous system and inhibiting the parasympathetic nervous system [27, 28]. As a result, there are changes in heart rate (HR; the number of heartbeats per minute) and in heart rate variability (HRV; the variation in the time intervals between consecutive heartbeats, known as RR intervals) [28]. Wearable devices are equipped with physiological signal sensors that can provide an effective means of continuous, non-invasive assessment of heart rate measures [23, 28].

The EuStress solution

The research project named “EUSTRESS – Information system for the monitoring and evaluation of stress levels and prediction of chronic stress” aims to develop and evaluate an information system (IS) that monitors and evaluates, in real time, the stress levels of an individual in order to develop a predictive model of stress response and chronic stress. An Android smartphone application, developed specifically for this project, received data from the wearable device, a Microsoft Smartband 2. This smartband measures skin conductance, body temperature, heart rate variability, calorie intake and expenditure, and sleep patterns and quality [23]. The RR signal was obtained from a photoplethysmogram (PPG), an optical technique used for heart rate monitoring. PPG measures blood volume changes using a light source and a photodetector at the surface of the skin [29, 30]. All of these data were sent to the mobile application via Bluetooth.

Data from all of these sources will be integrated through statistical models in order to develop stress profiles (based on baseline stress patterns and on stress and reactivity patterns) and, consequently, to predict the stress states of individuals. In the current study, we analyzed the stress data offline. Computations to define the model were done on a server, and computations to evaluate stress were done on the smartphone.

In the architecture of this project, a set of proprietary and open-source technologies, linked and exchanging data, allows the collection and storage of biometric data. Figure 1 shows the architecture of this information system.

Fig. 1 Architecture of EuStress Solution

The solution has two main goals. The first is to determine stress profiles (based on baseline levels of stress and on patterns of stress reactivity and recovery) in order to classify individuals, making the information system adaptable to any individual. The second is to interpret the stress reactivity patterns of the individual and to predict stress states, cumulative effects of stress, and chronic stress.

Data collection and ethical procedures

This project was reviewed and approved by the ethics committee of the Life and Health Sciences Research Institute at the University of Minho (Portugal) and by the National Data Protection Commission. Informed consent was obtained from all individual participants included in the study. Data were collected from September 2017 to October 2018. Participants were medical students at the Life and Health Sciences Research Institute at the University of Minho. All students received an informative email about the project and, during classes, the researcher explained the purpose of the study and invited them to participate.

Data were collected in two different conditions for each participant: the first was at the beginning of the academic year, a time without the stress induced by evaluations (baseline condition), and the second was during university exams (stress condition), which occurred at three different times. At baseline, participants completed some global self-report measures (e.g., a sociodemographic questionnaire covering gender, age, nationality, and academic year; the Portuguese version of the PSS [31]) and wore the smartband for a week. In the baseline condition, physiological data were collected for 5 min each hour. At the end of each day, participants answered the 4-item PSS. In the stress condition, participants wore the smartband during multiple-choice computer-based exams. They also answered the PSS at the end of their exam. All these data were part of the broader project.

Data analyses

Thirteen HR and HRV time-domain indices [28], which quantify the “amount of variability in measurements of the interbeat interval”, were calculated from the heartbeat interval data of the smartband sensors. Examples of these indices are the mean heartbeat interval (Mean RR), the minimum (Min RR) and maximum (Max RR) values of RR, the median value of RR (Median RR), the standard deviation of the RR intervals between normal beats (SDNN), the root mean square of successive differences between consecutive RR intervals (RMSSD), and the percentage of consecutive RR intervals that differ by more than 50 ms (pNN50). The analyses were performed using SPSS 22.0 and Orange 3.20.
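As an illustration of the indices named above, the following is a minimal Python sketch of how they can be computed from a list of RR intervals expressed in milliseconds. The function name `hrv_time_domain`, the dictionary keys, and the sample values are illustrative assumptions and not part of the EuStress software; SDNN is computed here as the sample standard deviation, which is also an assumption.

```python
import math
import statistics


def hrv_time_domain(rr_ms):
    """Return common time-domain HR/HRV indices for RR intervals given in ms."""
    if len(rr_ms) < 2:
        raise ValueError("at least two RR intervals are required")
    # Successive differences between consecutive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    mean_rr = statistics.fmean(rr_ms)
    return {
        "mean_rr": mean_rr,
        "min_rr": min(rr_ms),
        "max_rr": max(rr_ms),
        "median_rr": statistics.median(rr_ms),
        # SDNN: standard deviation of the RR intervals (sample SD assumed).
        "sdnn": statistics.stdev(rr_ms),
        # RMSSD: root mean square of the successive differences.
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
        # pNN50: percentage of successive differences larger than 50 ms.
        "pnn50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
        # Mean heart rate in beats per minute (60 000 ms per minute).
        "mean_hr": 60000.0 / mean_rr,
    }


# Hypothetical RR series (ms), for illustration only.
example = hrv_time_domain([800, 810, 790, 860, 800])
print(example["mean_rr"], example["rmssd"], example["pnn50"])
```

In this hypothetical series, two of the four successive differences exceed 50 ms, so pNN50 is 50%.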

Results

Data were collected from 83 medical students who volunteered to take part in both the baseline and stress conditions. Sixty-three (76.8%) were female and 19 (23.2%) were male, aged 17 to 38 years (M = 22.13; SD = 5.55). For approximately 63% of the participants, attending the course did not require a change of residence.

Participants’ academic year ranged from the 1st to the 5th, with 19.5% of the participants from the 3rd year alternative program for graduate individuals. About 85% of the sample cited vocational interest as the reason for applying to higher education.

We performed an independent-samples t-test in order to compare the HR and HRV variables in the baseline and exam conditions. Table 1 presents the mean and standard deviation for each variable. Significant differences were found between the baseline and exam conditions, except for Diff Mean, Diff p mean, and pNN50. Mean, Min, Median, and Diff Min were significantly higher in the baseline condition than in the exam condition. In contrast, SDNN, Max, Diff SDSD, and Diff Max were significantly higher in the exam condition than in the baseline condition.
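The comparison described above can be sketched as follows. This is a minimal standard-library illustration of the independent-samples t statistic and its degrees of freedom, not the SPSS procedure used in the study. The function name `independent_t` and the sample values are hypothetical; the pooled-variance (Student) form is the default, and the Welch form is offered as an alternative because the text does not state which was reported.

```python
import math
import statistics


def independent_t(a, b, equal_var=True):
    """Return (t, df) for two independent samples a and b."""
    na, nb = len(a), len(b)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    if equal_var:
        # Student's t-test: pooled variance, df = na + nb - 2.
        pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
        se = math.sqrt(pooled * (1 / na + 1 / nb))
        df = na + nb - 2
    else:
        # Welch's t-test: separate variances, Welch-Satterthwaite df.
        qa, qb = var_a / na, var_b / nb
        se = math.sqrt(qa + qb)
        df = (qa + qb) ** 2 / (qa ** 2 / (na - 1) + qb ** 2 / (nb - 1))
    return (mean_a - mean_b) / se, df


# Hypothetical Mean RR values (ms) per recording, for illustration only.
baseline = [820.0, 790.0, 805.0, 840.0, 815.0]
exam = [700.0, 735.0, 690.0, 720.0, 710.0]
t_stat, dof = independent_t(baseline, exam)
print(t_stat, dof)
```

Obtaining a p-value additionally requires the cumulative distribution function of Student's t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.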

Table 1 Independent-samples t-test comparing HR and HRV across two different conditions (baseline and exam)

Following these results, we compared several classification methods (logistic regression, neural network, naïve Bayes, support vector machines, random forest, and k-nearest neighbors) in order to predict stress from the HR and HRV variables. Each model was trained on 70% of the sample and tested on the remaining 30%, with 10 repetitions. For the neural network, we used 100 hidden layers and the hyperbolic tangent (“tanh”) activation function.
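The repeated 70/30 evaluation scheme can be sketched as below. This is a minimal illustration under stated assumptions: the splits are drawn by simple (unstratified) random shuffling with a fixed seed, and the function name `repeated_holdout` and its parameters are hypothetical. The text does not specify whether the sampling was stratified or how it was seeded.

```python
import random


def repeated_holdout(n_samples, train_fraction=0.7, repetitions=10, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)  # fixed seed so that the splits are reproducible
    n_train = round(train_fraction * n_samples)
    for _ in range(repetitions):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # The first n_train shuffled indices train the model; the rest test it.
        yield indices[:n_train], indices[n_train:]


# Illustration with a hypothetical dataset of 200 recordings.
for train_idx, test_idx in repeated_holdout(200):
    # A classifier would be fitted on train_idx and scored on test_idx here;
    # the scores would then be averaged over the 10 repetitions.
    print(len(train_idx), len(test_idx))
```

Averaging the scores over the repetitions reduces the dependence of the reported performance on any single random split.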

For each method, two models were established. Model 1 included all the variables listed; Model 2 included only the significant variables. Table 2 presents the comparison between Model 1 and Model 2 for each method. The neural network yielded the best results for both models. Specifically, sensitivity was 75.2% for Model 1 and 74.2% for Model 2, and specificity was 77.9% and 78.1%, respectively.
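For reference, sensitivity and specificity can be derived from true and predicted labels as in the following minimal sketch. The function name `sensitivity_specificity`, the label coding (1 for the exam/stress condition, 0 for baseline), and the example labels are illustrative assumptions.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # stressed and predicted as stressed
            else:
                fn += 1  # stressed but predicted as not stressed
        else:
            if pred == positive:
                fp += 1  # not stressed but predicted as stressed
            else:
                tn += 1  # not stressed and predicted as not stressed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    # Sensitivity: proportion of positive cases correctly identified.
    # Specificity: proportion of negative cases correctly identified.
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels, for illustration only.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(truth, predicted))
```

In this hypothetical example, three of the four stressed cases and three of the four non-stressed cases are classified correctly, so both values are 0.75.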

Table 2 Comparison of classification methods for predicting stress from the HR and HRV variables

From our sample, only 19 individuals had measures for all conditions: biometric data and questionnaire answers in both the baseline and exam conditions. A Shapiro-Wilk test was performed to check the normality of the data. PSS13 at baseline was not normally distributed (p < .05), whereas for the other variables the null hypothesis of normality could not be rejected (p > .05). Therefore, to evaluate possible differences between the two conditions, we used the Wilcoxon test for the PSS13 variable and the paired-samples t-test for the other variables.

The results revealed a statistically significant difference in Mean RR between the two conditions, and the PSS13 variable also differed. A Wilcoxon signed-rank test indicated that PSS13 scores in the stress condition were significantly higher than in the baseline condition (Z = −2.62, p < .05).
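To make the paired comparison concrete, the sketch below computes a Wilcoxon signed-rank statistic and its normal-approximation Z value for paired scores. It is a minimal illustration under stated assumptions: zero differences are dropped, tied absolute differences receive average ranks, a tie correction is applied to the variance, and no continuity correction is used. The function name `wilcoxon_signed_rank_z` and the sample scores are hypothetical, and the details of the SPSS procedure used in the study are not given in the text.

```python
import math


def wilcoxon_signed_rank_z(x, y):
    """Return (W, Z) for paired samples, with W the smaller signed-rank sum."""
    diffs = [b - a for a, b in zip(x, y) if b != a]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Normal approximation with tie-corrected variance.
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    return w, (w - mean_w) / math.sqrt(var_w)


# Hypothetical paired PSS scores (baseline, exam), for illustration only.
baseline_scores = [10, 12, 14, 16, 18]
exam_scores = [11, 14, 17, 20, 17]
print(wilcoxon_signed_rank_z(baseline_scores, exam_scores))
```

Because the smaller of the two rank sums is used, the resulting Z is never positive, which matches the sign convention of the value reported above.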

Conclusions and future work

In this paper, we aimed to describe the initial steps towards developing an information system. Specifically, we focused on identifying the best machine learning algorithm. To this end, we applied several classification methods and compared them. The neural network provided the best model fit. In fact, this technique is robust enough to deal with some possibly misclassified data. Nevertheless, our models have low levels of sensitivity and specificity. These values may be explained by the sample size, which does not seem to meet the requirements of machine learning techniques in the context of biometric measurements, given the high variability of these continuous data.

The use of wearable devices to assess stress during exams could provide useful information to help students manage their stress levels and, consequently, perform better. In our study, we consider the reduced sample size a limitation. In addition, the dataset contains missing data for some participants, as most students were busy with curricular activities and were not able to participate in all the data collection moments.

As a future step, we intend to design an intervention program in which failing students are identified from data collected in real time during exams (reactivity to stress/anxiety, biological markers, cognitive performance, and decision-making behavior) and helped to recover through customized coaching programs. According to the literature, people react in different ways to stimuli, which may be a stressor for some but not for others. Personality traits and previous experiences are also important aspects that should not be neglected when evaluating performance on a task, as they may affect the way a person reacts in a stressful situation. Therefore, for students with low academic performance, it may be important to have a psychology professional evaluate such aspects and provide assistance towards improving their grades. In addition, promoting extra-curricular activities such as sports, which foster cooperation and social support between students, may be a strategy to reduce the competitive environment and strengthen the support network among students.