Introduction

Psychological stress is increasingly common in industrialized countries and represents a major health risk factor, nearly as detrimental as smoking or diabetes (Yusuf et al. 2004). Accordingly, interventions that help people cope with or buffer stressful experiences are of major importance. Several important physiological stress-induced alterations can be observed in the autonomic nervous system. For example, heart rate (HR) and heart rate variability (HRV) adequately reflect the balance of the autonomic nervous system in real life (Vrijkotte et al. 2000; Clays et al. 2011). Psychological stress has been associated with decreased HRV and increased HR (Sloan et al. 2011), and exaggerated cardiovascular reactivity to stress is associated with the development of cardiovascular disease (Lovallo 2011). In contrast, a higher aerobic capacity and regular aerobic exercise training (AET) are related to increased HRV, mainly due to increased vagal modulation of HR (Hautala et al. 2009). Whether regular AET also leads to decreased sympathetic activity is less clear (Aubert et al. 2003; Hautala et al. 2009).

The cross-stressor adaptation hypothesis suggests that regular exercise leads to adaptations in the stress response systems that induce decreased physiological responses to psychological stressors (Sothmann et al. 1996; Sothmann 2006). Over recent decades, numerous studies have empirically investigated the cross-stressor adaptation hypothesis, mostly by comparing the physiological reactivity of physically fit versus unfit people to psychological stress tasks (Albright et al. 1992; de Geus et al. 1993; Calvo et al. 1996; Spalding et al. 2000; Jackson and Dishman 2006; Rimmele et al. 2007, 2009; Klaperski et al. 2013; Childs and de Wit 2014).

Unfortunately, reviews and meta-analyses revealed inconsistent results in these previous studies (Sothmann et al. 1991; de Geus et al. 1993; Hamer et al. 2006; Jackson and Dishman 2006; Forcier et al. 2006; Gerber and Pühse 2009). While two meta-analyses found evidence for diminished physiological stress reactivity to psychological stressors in physically fit and exercising individuals (Crews and Landers 1987; Forcier et al. 2006), another meta-analysis (Jackson and Dishman 2006) did not indicate empirical support for reduced reactivity. Specifically, Forcier et al. (2006) reported decreased physiological reactivity, HR and systolic blood pressure to psychological stressors in physically fit versus unfit individuals including cross-sectional and longitudinal studies. In contrast, Jackson and Dishman (2006) did not present any support for reduced physiological stress reactivity, but empirical evidence for improved physiological recovery after psychological stress. In almost the same manner, the results of previous randomized, controlled trials are inconsistent. While some studies found decreased physiological stress reactivity (Stein and Boutcher 1992; Spalding et al. 2004; Klaperski et al. 2014), others did not demonstrate differences in physiological stress reactivity between control and experimental groups (Albright et al. 1992; de Geus et al. 1993; Sloan et al. 2011; Lindgren et al. 2013).

In our opinion, two main issues related to study design are responsible for the aforementioned discrepancies in previous studies. First, even though the cross-stressor adaptation hypothesis has been described as a within-subject hypothesis with a clear temporal order (physical exercise does reduce the physiological stress reactivity to psychological stressors over time within given subjects), most studies use between-subject designs, i.e., they do not investigate subjects over time, but out of convenience, compare physically fit and unfit participants in a cross-sectional manner. However, to properly test the cross-stressor adaptation hypothesis, a within-subject design with intervention is necessary.

Second, the majority of previous studies used artificial stressors in a laboratory environment. While laboratory studies deliver important insights under controlled conditions, several authors have argued that studies conducted in the laboratory may reveal effects with smaller size or the outcomes may not be comparable to real-life findings at all, principally due to the non-personally relevant stressors and the artificial setting (Gauvin et al. 1996; Schwartz 2003; Sloan et al. 2011; Zanstra and Johnston 2011; Wilhelm et al. 2012). To improve ecological validity, physiological processes, physical activity and self-reported psychological variables (e.g., perceived stress) can be assessed in real life using ambulatory assessment (Trull and Ebner-Priemer 2013). The few real-life studies on the cross-stressor adaptation hypothesis did demonstrate encouraging results. Specifically, Ritvanen et al. (2007) showed that aerobic fitness was associated with reduced HR and perceived stress during work in teachers. Brooke and Long (1987) found maintained lower HR levels in fit compared to unfit subjects during rappelling.

Despite the encouraging results of previous cross-sectional studies, they do not allow us to draw causal conclusions. It remains unclear whether regular exercise leads to lower physiological stress responses to psychological stressors, or whether people with low physiological stress reactivity tend to exercise more (de Geus and Stubbe 2007). To the best of our knowledge, no study using ambulatory measures and a randomized controlled design has investigated the effects of regular exercise on physiological stress responses to real-life stressors.

To test the cross-stressor adaptation hypothesis with real-life stressors, we conducted a randomized, controlled trial. We hypothesized that students participating in a 20-week AET would exhibit a lower physiological stress response during a real-life psychological stressor compared to a control group (CG). We used the academic examination period (end of the semester) as a real-life stressor, because examinations have been shown to induce noticeable stress in students (Hazlett et al. 1997; Spangler 1997) accompanied by significantly elevated cardiovascular reactivity (Sausen et al. 1992; Zanstra and Johnston 2011; Hazlett et al. 1997). Physiological stress response was operationalized by continuous measures of HRV in everyday life using ambulatory assessment (Trull and Ebner-Priemer 2013).

Methods

Participants

Sixty-one male inactive (≤60 min of moderate activity per week) electrical engineering students (M age = 21.4 ± 1.8) were recruited by presenting the study information during first week lectures of the semester. Interested students signed up to be contacted via e-mail for participation in an informational kick-off meeting. To control for differences between exam periods, subjects had to be in the 3rd or 5th semester of their studies. At the kick-off meeting, participants received detailed information about the study and eligible students willing to participate gave written consent. The sample resulted from two phases (each including 30 participants) lasting for one semester (October 2011–March 2012 and October 2012–March 2013). Participants with similar exercise levels were first identified via a pre-questionnaire (Jekauc et al. 2013) and then randomly allocated to either a wait list CG or an AET group in a 1:1 allocation. Students did not receive financial compensation, but did receive credit points within a course if they completed all measurements and the intervention (AET group).

Procedure

To test the effects of the AET intervention on the physiological reactivity to real-life stressors, we used two specific real-life assessment periods (Fig. 1). We set the pre-intervention baseline assessment to the beginning of the semester because we assumed that it represents a regular academic period where students experienced low stress (Loft et al. 2007). The post-intervention assessment was set to a real-life stressor, an academic examination (Fig. 1). Ambulatory ECG, physical activity and perceived stress were assessed pre- and post-intervention for a 2-day period. To obtain valid data during academic examinations, every participant completed two separate assessment periods during academic examinations (Post1 and Post2; 2 days each). The time span between the exam and the end of data recording was restricted to at least 1 day and at most 2 days.

Fig. 1

Illustration of the study design: timeline with pre- and post- assessments of physiological measures, cardiopulmonary exercise testing (CET) and the duration of the exercise intervention

As a manipulation check of the AET intervention, we assessed aerobic capacity pre- and post-intervention. We conducted a CET pre-test (CETpre) during the first 2 weeks of the study (start of semester), prior to the intervention of the AET group. After completion of the pre-tests (November 1st), the AET for the experimental group started. After 20 weeks of AET (see Fig. 1), we conducted the CET post-test (CETpost) with exactly the same procedure as the pre-tests. Participants completed the post-tests prior to the examination period to avoid any influence on performance. The wait list CG received AET after the end of the study.

Measures

Participants wore the device ecgMove (movisens GmbH, Karlsruhe, Germany) at the chest with a textile dry electrode chest belt to record a full 1-channel ECG (electrocardiogram) waveform (1024 Hz), three-axis acceleration (64 Hz) and barometric altitude (8 Hz) as raw data on internal memory. Raw data were read out via USB. Participants received the devices one day prior to the assessments. The start time was programmed to 10 a.m. and the measurement ended on the next day at 10 p.m., thus recording 36 h including one night. Participants had to complete an assessment protocol addressing times during which the device was not worn and corresponding physical activity, special incidents, and consumption of caffeine, alcohol and cigarettes. Data processing was done using DataAnalyzer 1.7 (movisens GmbH). The following steps were performed internally in DataAnalyzer:

HR/HRV

Activities of daily living are characterized by permanent movement-induced noise. Artifacts were detected by looking for non-physiological ECG signal sections in terms of QRS amplitudes as well as the number of zero crossings. Artifact regions were marked as such and excluded from further processing. After detecting R peaks in the ECG waveform, RR intervals were calculated (Hamilton 2002). To obtain only normal to normal (NN) intervals, filtering of RR intervals was performed by checking for non-physiological changes in consecutive RR as well as changes in the R amplitude (Clifford et al. 2002). HR was calculated from the valid NN intervals. From the filtered intervals, segments with a length of 120 s and a shift of 30 s were built. Segments with a cumulated NN interval time smaller than 90 % of the segment length were dismissed.
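The segmentation rule described above (120 s segments, 30 s shift, dismissal below 90 % NN coverage) can be sketched as follows. This is an illustration only, not the DataAnalyzer implementation; the function name, its arguments and the convention that each NN interval is assigned to the time of the beat that ends it are assumptions.

```python
def build_segments(nn_times, nn_intervals, start, end,
                   length=120.0, shift=30.0, min_coverage=0.90):
    """Slide a window over valid NN intervals and keep well-covered segments.

    nn_times:     time (s) of the beat that ends each NN interval, ascending
    nn_intervals: NN interval durations (s), same length as nn_times
    Returns a list of (segment_start, [NN intervals inside the segment]).
    """
    segments = []
    t = start
    while t + length <= end:
        inside = [nn for bt, nn in zip(nn_times, nn_intervals)
                  if t <= bt < t + length]
        # Dismiss the segment if the summed NN time covers less than
        # 90 % of the segment length.
        if sum(inside) >= min_coverage * length:
            segments.append((t, inside))
        t += shift
    return segments
```

For a 300 s recording with a 40 s artifact gap in the middle, only the first and the last window keep enough valid NN time to survive.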

HRV can be measured through time domain and frequency domain analyses. From popular time domain indices, we decided to calculate the root mean square of successive differences (RMSSD), a short-term component of HRV reflecting predominantly vagally mediated autonomic alterations (Task Force 1996). In addition, we calculated the LF/HF ratio, which is suggested to display the sympathovagal balance (Malliani et al. 1998). While the low-frequency (LF) component (0.04–0.15 Hz) is modulated by both the parasympathetic and the sympathetic nervous system, the high-frequency (HF) component (0.15–0.40 Hz) is defined as a predominantly parasympathetic marker; the interpretation of the very-low-frequency (VLF) component is unclear. RMSSD (ms) was calculated for each segment (Task Force 1996). All segments were detrended to prepare them for spectral analysis (Eleuteri et al. 2012). Detrended segments were interpolated at 4 Hz, and the fast Fourier transform (FFT) was then calculated for each segment. Spectral power was calculated for the LF (0.04–0.15 Hz) and HF (0.15–0.40 Hz) regions, respectively, to calculate the LF/HF ratio (Task Force 1996). We used 1 min as output rate for HR and all HRV parameters.
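To make the two HRV parameters concrete, the following minimal sketch computes RMSSD and an LF/HF ratio for one segment of NN intervals. It is not the DataAnalyzer code: the function names are hypothetical, a least-squares line stands in for the detrending method of Eleuteri et al. (2012), linear interpolation is assumed for the 4 Hz resampling, and a plain discrete Fourier transform replaces the FFT (same values, slower).

```python
import cmath
import math

def rmssd_ms(nn_ms):
    """Root mean square of successive NN interval differences (ms)."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def lf_hf_ratio(nn_ms, fs=4.0):
    """LF/HF power ratio of one NN segment (detrend, resample, DFT)."""
    # Beat times in seconds: cumulative sum of the NN intervals.
    t, acc = [], 0.0
    for nn in nn_ms:
        acc += nn / 1000.0
        t.append(acc)
    # Step 1: remove a least-squares straight line (assumed stand-in).
    n_beats = len(t)
    mt, my = sum(t) / n_beats, sum(nn_ms) / n_beats
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, nn_ms))
             / sum((a - mt) ** 2 for a in t))
    y = [b - (my + slope * (a - mt)) for a, b in zip(t, nn_ms)]
    # Step 2: linear interpolation onto an evenly spaced 4 Hz grid.
    n = int((t[-1] - t[0]) * fs)
    x, j = [], 0
    for k in range(n):
        g = t[0] + k / fs
        while t[j + 1] < g:
            j += 1
        w = (g - t[j]) / (t[j + 1] - t[j])
        x.append(y[j] + w * (y[j + 1] - y[j]))
    # Step 3: sum spectral power in the LF and HF bands.
    lf = hf = 0.0
    for k in range(1, n // 2):
        f = k * fs / n
        if not 0.04 <= f <= 0.40:
            continue
        coef = sum(v * cmath.exp(-2j * math.pi * k * m / n)
                   for m, v in enumerate(x))
        power = abs(coef) ** 2
        if f < 0.15:
            lf += power   # LF band: 0.04 Hz up to, not including, 0.15 Hz
        else:
            hf += power   # HF band: 0.15-0.40 Hz
    return lf / hf
```

Because the ratio divides one band power by the other, the scaling of the transform cancels out, so no normalization constant is needed in this sketch.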

Physical activity

Acceleration and barometric signals were used for physical activity estimation. The raw acceleration signal was first high-pass filtered in each axis to remove the gravitational DC component. Then a low-pass filter was applied to remove accelerations not caused by body movement (Van Someren et al. 1996). The three axes were then combined by calculating the vector magnitude. The parameter movement acceleration intensity (in g) was calculated for each minute by averaging the vector magnitude signal.
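The last two steps of this pipeline, combining the axes and averaging per minute, can be sketched as below. The sketch assumes that the three input signals have already been high-pass and low-pass filtered; the filter design is not specified here and is therefore left out. The function name is hypothetical.

```python
import math

def movement_acceleration_per_minute(ax, ay, az, fs=64):
    """Mean vector magnitude (g) per full minute.

    ax, ay, az: already band-limited axis signals (g), sampled at fs Hz.
    Incomplete trailing minutes are dropped.
    """
    per_minute = fs * 60
    magnitude = [math.sqrt(x * x + y * y + z * z)
                 for x, y, z in zip(ax, ay, az)]
    return [sum(magnitude[i:i + per_minute]) / per_minute
            for i in range(0, len(magnitude) - per_minute + 1, per_minute)]
```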

Activity recognition was then performed by calculating various features from the acceleration and barometric signals. Activities were classified from the extracted features (lying, inactivity, walking, jogging, stairs). For each activity a regression model for activity energy expenditure was selected. Activity energy expenditure was calculated using the following model parameters: movement acceleration, altitude change, age, gender, weight and height. Activity energy expenditure was calculated for each minute by averaging (Anastasopoulou et al. 2012, 2014). Using the acceleration data, sleep/wake classification was performed based on the algorithm of Cole et al. (1992).

Aerobic capacity

Maximum oxygen consumption (VO2max) is the primary parameter for determining aerobic capacity. An effective AET intervention (significantly enhanced VO2max) was presumed for the hypothesized effects of the intervention on real-life stressors. Thus, as a manipulation check to assess the aerobic capacity, VO2maxrel (ml/min/kg) was determined via CET (MetaMax 3B, cortex-biophysik) on a treadmill until exhaustion, with an initial pace of 6 km/h, a continuous slope of 1 % and an increase of 2 km/h per stage every 3 min (Roecker 2007). HR was recorded continuously using POLAR HR monitors (RS800). Within a standardized interruption of 20 s between stages, an arterial blood sample was taken from the participants’ hyperaemic earlobe into a 20-μL end-to-end capillary for the analysis of blood lactate. All participants achieved a respiratory exchange ratio (RER = ratio between the amount of carbon dioxide produced and the oxygen consumed during one breath) of ≥1.1 and a HR of at least 220 beats/min minus age (Meyer and Kindermann 1999), and were appraised as exhausted by an experienced instructor. Thus, for all participants an achievement of VO2max was assumed.
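The three exhaustion criteria can be written as a single check. This is only an illustration of the rule as stated; the function and argument names are hypothetical, and the instructor's appraisal is passed in as a plain yes/no value.

```python
def met_exhaustion_criteria(rer, peak_hr, age, judged_exhausted):
    """True if all three criteria for assuming VO2max are met."""
    return bool(rer >= 1.1                 # respiratory exchange ratio
                and peak_hr >= 220 - age   # age-predicted HR (beats/min)
                and judged_exhausted)      # instructor appraisal
```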

Perceived stress

To assess perceived stress, we used electronic diaries (Touch Diamond 2, HTC Germany GmbH, Frankfurt, Germany) and the software MyExperience movisens Edition (movisens GmbH, Karlsruhe, Germany) to install the scales on the diaries and to program time stamps and intervals between diary prompts. Participants answered the following question presented on the diary screen by choosing the appropriate number on the presented scale (based on stress-rating scales of previous studies; e.g., Åkerstedt et al. 2007) using the integrated diary touch pen: “How much stress did you experience during the last two hours?” (0 = no stress at all to 7 = extreme stress). All diary prompts were initiated automatically by a vibrating signal every 2 h, with the timing of the signal varied by 1–5 min to prevent participants from waiting for the signal. All diary prompts occurred between 10 a.m. and 10 p.m. If participants missed a signal, reminders were sent after 5, 10, and 15 min.

Intervention

Participants of the experimental group attended a 20-week AET until they had completed both measurements during the academic examinations. Besides VO2max, the main outcomes of the CET were maximum running speed, lactate threshold, and individual anaerobic threshold. Based on these, we calculated individual HR zones for each participant (zone I: aerobic; zone II: aerobic–anaerobic) using the software Ergonizer (Sports medicine, Freiburg, K. Roecker) and installed the training schedules and the HR target zones on the RS800 HR monitors. Skilled exercise science students supervised and instructed the running groups of 7–8 people by adhering to a standardized training schedule for AET (Roecker 2007). At the beginning, running sessions lasted about 30 min and included walking phases of 2 min. Besides a continuous increase in duration of training (3 min/week) over time, intensity was progressively increased by adding intervals of 3 and 4 min in zone II after week 4. Participants completed the training in the group and wore the HR monitors to record the training and meet their individual HR requirements. To allow for individual but group-based running sessions, the training sessions took place outdoors in an area close to the campus with fixed trails. Training sessions were conducted during lunch time to allow for short routes and little time loss. Students had to complete two training sessions every week.

Data analysis

We used multilevel models (MLM) to calculate the effects of the exercise intervention on HRV in response to real-life stressors. Using MLM, we were able to account for our hierarchical data structure of multiple observations of HR, HRV and physical activity within participants. Furthermore, we were able to include participants with missing data, estimate within-subject effects as well as between-subject effects and handle uneven numbers of observations per participant (Hoffman and Rovine 2007). In addition, we were able to control for physical activity continuously. We calculated separate day- and night-specific models for the HRV parameters as dependent variables. To calculate MLM based on aggregated mean values per hour, we considered only hours with more than 30 min of valid data and participants with at least 18 h of valid data (>30 min/h). Due to their skewed distribution, we log transformed all HRV and physical activity variables.
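The inclusion rules for the hourly aggregation can be sketched as follows. The data layout (one tuple per minute) and the function name are hypothetical, and taking the natural log of the hourly mean, rather than averaging log values, is an assumption, since the order of the two steps is not stated.

```python
import math
from collections import defaultdict

def hourly_log_means(minute_rows, min_valid_minutes=30, min_valid_hours=18):
    """Aggregate 1-min values to log-transformed hourly means.

    minute_rows: iterable of (participant, hour_index, value), where
                 value is None for artifact or non-wear minutes.
    Returns {participant: {hour_index: ln(hourly mean)}}.
    """
    by_hour = defaultdict(list)
    for pid, hour, value in minute_rows:
        if value is not None:
            by_hour[(pid, hour)].append(value)

    result = defaultdict(dict)
    for (pid, hour), values in by_hour.items():
        # Keep only hours with MORE than 30 min of valid data.
        if len(values) > min_valid_minutes:
            result[pid][hour] = math.log(sum(values) / len(values))

    # Keep only participants with at least 18 valid hours.
    return {pid: hours for pid, hours in result.items()
            if len(hours) >= min_valid_hours}
```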

Day-specific analyses

The day-specific analyses were based on both days (10 a.m.–10 p.m.) of the second period of examination measurements because it took place during the middle of the examination period (Post2). To control for physical activity, we included the sum of activity intensity (sum/h based on movement acceleration) in the model. In addition, we included baseline HRV (Pre; mean over 2 days at baseline) as a predictor. To consider differences between the 2 days of measurement, we calculated a three-level model including a random intercept between days. Time was coded as 0 for the first hour during the day (10 a.m.) and as 11 for the last hour during the day (9 p.m.).

Night-specific analyses

For the night-specific analyses, we used data from the two nights of both measurements during academic examinations (Post1 and Post2). Within the time span of 10 p.m. and 10 a.m. (next morning), all data that indicated sleep (accelerometer-based sleep/wake detection; see methods section) were analyzed. In the statistical model, we added group as fixed effect. To control for physical activity of the previous day on nightly HRV values, activity energy expenditure of the past day (AEEpastday) was included in the model. In addition, we included baseline HRV (pre; mean HRV for the night at baseline) as a predictor. Time was coded as 0 for the first hour during the night (10 p.m.) and as 11 for the last hour during the night (9 a.m.).
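The time coding used in the day- and night-specific models maps the clock hour to an index from 0 to 11. A minimal sketch, with a hypothetical function name:

```python
def code_hour(clock_hour, period):
    """Map a clock hour (0-23) to the model's time index (0-11).

    period: "day"   -> window starts at 10 a.m. (10 -> 0, ..., 21 -> 11)
            "night" -> window starts at 10 p.m. (22 -> 0, ..., 9 -> 11)
    """
    start = 10 if period == "day" else 22
    index = (clock_hour - start) % 24
    if index > 11:
        raise ValueError("hour lies outside the analysed 12-h window")
    return index
```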

Furthermore, we included the perceived stress level (grand-mean centered) of students as a predictor in day- and night-specific models because the impact of perceived stress on physiological stress response can differ between individuals (Weekes et al. 2006, 2008). For the day-specific analysis we added VO2maxrel pre-intervention (VO2maxrel_pre) to the model to analyze whether the initial VO2maxrel value was associated with higher HRV during the day. For the night-specific analysis, we added the difference in VO2maxrel from pre- to post-intervention (VO2maxrel_Diff) to analyze whether a greater improvement in VO2maxrel predicted HRV. Both variables were grand-mean centered. The Level 1 (observations) variable was movement acceleration per hour, the Level 2 (days) variable was perceived stress, and the Level 3 (person) variables were baseline HRV, group, and VO2maxrel_pre. Note that the night-specific model was the same except that physical activity was controlled with a Level 2 instead of a Level 1 variable (activity energy expenditure aggregated over the past day) and VO2maxrel_Diff was included instead of VO2maxrel_pre.
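Grand-mean centering subtracts the overall sample mean from every value, so that a centered value of 0 corresponds to an average participant. A minimal sketch follows; the function name is hypothetical, and whether the mean is taken over persons or over all observations is an assumption left to the caller, who decides what goes into the list.

```python
def grand_mean_center(values):
    """Subtract the overall mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]
```

With centered predictors, the model intercept refers to a participant with average perceived stress and average VO2maxrel rather than to a value of zero on the raw scales.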

To determine whether the intervention had successfully improved the AET group’s aerobic capacity, we conducted an analysis of covariance (ANCOVA) with post-intervention VO2maxrel as the dependent variable (CETpost), pre-intervention VO2maxrel as the covariate (CETpre) and group (CG/AET) as the treatment factor.

Results

Aerobic capacity

Aerobic capacity operationalized via VO2max increased by 8.8 % (pre 48.5 ± 6.5; post 52.8 ± 6.4) in the AET group and decreased by 5.1 % (pre 50.2 ± 6.5; post 47.6 ± 6.8) in the CG. The ANCOVA revealed significance for the factor “group” in VO2maxrel, F(1, 56) = 55.3; p < 0.001; η² = 0.51, indicating a strong effect (Cohen 1988) of the intervention.

Our final data set indicated good subject compliance. For the day-specific analysis, 3.4 % of missing data were due to non-wear times and for the night-specific analysis, 3 % of missing data were due to non-wear times. Further missing data were caused by ECG artifacts (mainly caused by a faulty batch of textile ECG chest straps), with 12.2 % of artifacts caused during the day and 8.9 % caused during the night. Participants of the AET group and the CG did not differ in person characteristics or cardiopulmonary exercise test parameters prior to the intervention. We excluded six participants from the day-specific analysis because they had more than 50 % of missing data per hour. For the night-specific analysis, we removed data of six participants of the first measurement and data of five participants of the second measurement during examinations.

For the day-specific analyses, 1164 observations were available. For the night-specific analysis, 776 observations were available. Intraclass coefficients were σ = 0.39 for HR, σ = 0.78 for LF/HF and σ = 0.77 for RMSSD, indicating that intraindividual variation caused 61, 22 and 23 % of the variance, respectively.
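The step from the intraclass coefficients to the reported shares of intraindividual variance is a simple complement: whatever share of the variance is not between persons lies within persons. A short sketch with a hypothetical function name:

```python
def within_person_share(icc):
    """Percentage of variance within persons, given the intraclass coefficient."""
    return round((1 - icc) * 100)
```

Applied to the coefficients above (0.39, 0.78 and 0.77), this gives 61, 22 and 23 %, matching the values reported for HR, LF/HF and RMSSD.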

Descriptive statistics

Mean values of HR and LF/HF were similar in both groups during the night at baseline, but the AET group had a descriptively higher baseline level of RMSSD during the night. In contrast, baseline RMSSD during the day was similar between both groups, as were the mean LF/HF ratio and mean HR values (Table 1). Average group values of the night were similar at baseline and academic examinations in all HRV parameters. However, during the two days of academic examinations, the AET group showed descriptively lower mean HR (9.3 %) and higher mean RMSSD values (25.9 %) compared to baseline (Table 1). During the examination period, the AET group showed 42 % higher mean RMSSD compared to the CG on day 1 and 30.6 % higher mean RMSSD on day 2.

Table 1 Descriptive statistics (M, SD, Min and Max) of the HRV parameters for the night and day during baseline and examinations

Day-specific effects of AET on HRV

RMSSD

As hypothesized, the AET group showed significantly higher (β = 0.15) RMSSD during the examination period compared to the CG (Fig. 2) when controlling for activity, baseline RMSSD, perceived stress and VO2maxrel_pre. Higher hourly activity levels as well as higher perceived stress (per day) led to decreased RMSSD, whereas a higher RMSSD at the beginning of the semester predicted a higher RMSSD during the examinations (Table 2). Neither the Group × VO2maxrel_pre interaction nor VO2maxrel_pre explained noticeable variance.

Fig. 2

LF/HF and RMSSD (predicted values) during the day at baseline and during examinations

Table 2 Parameter estimates for multilevel models of LF/HF, RMSSD, and HR for the day

As CG was coded as 0 and the AET group was coded as 1, the model parameter estimates presented in Table 2 have the following interpretation (exemplary for RMSSD): (1) the intercept is the level of RMSSD (Ln) of the CG during the academic examinations, (2) the group estimate is the difference in RMSSD (treatment minus control), (3) the baseline estimate is the change in RMSSD with the level of baseline RMSSD in the CG, (4) VO2maxrel_pre is the change in RMSSD with the level of VO2maxrel pre-intervention, (5) the Group × VO2maxrel_pre estimate is the difference in RMSSD between the control and AET group in dependence of the VO2maxrel level pre-intervention. The random effects between-persons and -days illustrated that RMSSD differed between the 2 days of measurement and between persons. The significant estimate for autocorrelation indicates that RMSSD observations that were closer in time were more similar than more distant observations.
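Because the HRV variables were log transformed, a group estimate on the Ln scale can be read as a relative difference on the original scale. The sketch below shows the conversion under the assumption that the estimate is an unstandardized coefficient on the natural-log scale and that all other predictors are held constant; if the reported coefficient is standardized, this conversion does not apply. The function name is hypothetical.

```python
import math

def percent_difference_from_log_estimate(estimate):
    """Relative difference (%) implied by a difference on the ln scale."""
    return (math.exp(estimate) - 1) * 100
```

Under these assumptions, an estimate of 0.15 would correspond to a value roughly 16 % higher in the group coded 1 than in the group coded 0.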

LF/HF

In accordance with the results for RMSSD, the AET group showed significantly reduced LF/HF during the examination period (Table 2; Fig. 2) when controlling for activity, baseline LF/HF and perceived stress. A higher baseline level of LF/HF predicted higher LF/HF during the examinations. Furthermore, higher activity predicted higher LF/HF (Table 2). In contrast to the results for RMSSD, perceived stress did not predict LF/HF.

HR

The AET group showed significantly reduced HR during the examination period compared to the CG (Table 2) when controlling for activity, baseline and perceived stress. As expected, a higher activity level significantly predicted a higher HR during the examinations. In addition, participants with a higher baseline HR had significantly higher HR during the examination period. More importantly, higher perceived stress predicted a significantly higher examination HR. In addition, VO2maxrel_pre and the Group × VO2maxrel_pre interaction significantly predicted HR during examinations. Participants with a higher baseline VO2maxrel level showed reduced HR during the examinations. In the CG, a lower VO2maxrel_pre was associated with a higher HR during examinations. In contrast, within the AET group, participants had similar levels of HR irrespective of VO2maxrel_pre. The between-person and between-day random effects revealed that HR and the effect of the activity level varied significantly between persons and the 2 days of measurement (Table 2).

Night-specific effects of AET on HRV

RMSSD

As hypothesized, the AET group showed significantly increased RMSSD values compared to the CG when controlling for activity and perceived stress of the past day, baseline RMSSD and VO2maxrel_Diff. Participants’ RMSSD level at baseline predicted participants’ RMSSD level during examinations (Table 3). While VO2maxrel_Diff significantly predicted RMSSD at night during academic examinations, perceived stress and physical activity of the previous day did not.

Table 3 Parameter estimates for multilevel models of LF/HF, RMSSD, and HR for the night

LF/HF

Participants of the AET group showed reduced LF/HF compared to the CG (Table 3). Furthermore, participants with a higher LF/HF ratio during baseline showed higher values during the examination period and the difference in VO2maxrel significantly predicted LF/HF ratio. In accordance with the results for RMSSD, the activity level of the past day and the stress level did not predict LF/HF. Note that because of the frequent studying by students during academic examinations, the activity level was very low in both groups during the 2 days before the examination (Table 2). LF/HF and the effect of the baseline LF/HF varied between-persons and -nights (Table 3).

HR

The night-specific analyses of HR did not show the same distinct group influence as the day-specific analysis. Participants of the AET group did not show a remarkable reduction in HR (Table 3) during the examination period compared to their control counterparts. In contrast to RMSSD and LF/HF, the activity level of the past day significantly predicted the HR values during the examination period. Participants who were more active during the day had higher HR during the night. Furthermore, participants with higher HR at baseline had higher HR during the examination period.

Discussion

As hypothesized, the aerobic exercise intervention led to reduced physiological responses to real-life stressors, supporting the cross-stressor adaptation hypothesis. Specifically, participants of the AET group exhibited higher HRV (lower LF/HF and higher RMSSD) during the real-life stressor, the examination period, compared to the CG. However, our findings are partially in contrast to previous meta-analyses and reviews pertaining to the cross-stressor adaptation hypothesis, which have demonstrated inconsistent results: While two meta-analyses reported support for the cross-stressor adaptation hypothesis (Crews and Landers 1987; Forcier et al. 2006), another did not demonstrate any support for reduced stress reactivity, but it indicated faster recovery after stress (Jackson and Dishman 2006). We suppose that two methodological issues may have caused our favorable results: first, we used a personally relevant real-life stressor, an academic examination. Second, we were able to investigate within-subject associations as we used an intervention design. Both issues might increase the detectable effect size of the cross-stressor adaptation hypothesis. Our results are in line with the recent findings of Klaperski et al. (2014), who used a more personally relevant laboratory stress task as well as a within-subject intervention design. They reported reduced HRV reactivity in response to the Trier Social Stress Test (TSST) after 12 weeks of exercise training in healthy men. Our findings provide some evidence regarding the involved physiological processes. RMSSD has been shown to be a very good marker of vagal tone in ambulatory ECG studies (Goedhart et al. 2007). Thus, the significantly higher RMSSD values of the AET group compared to the CG observed in our study suggest that AET can buffer the stress-induced deterioration of HRV through improved parasympathetic activity.
The utility of the LF/HF ratio as a marker of autonomic balance has been the subject of some controversy (Eckberg 1997; Malliani et al. 1998). Unfortunately, we did not assess a parameter which clearly reflects sympathetic activity, such as the pre-ejection period (Cacioppo et al. 1994); however, the significantly lower LF/HF ratio of the AET group compared to the CG in the present study provides initial support that reduced sympathetic activity contributes to the buffering effect of AET on the stress-induced deterioration of HRV.

We found increased RMSSD and decreased LF/HF (and HR) in the AET group compared to the CG both during the night and the day of an academic examination period. Previous research has shown that real-life stressors can impair restorative function of sleep which may be illustrated by decreased HRV (Pichot et al. 2002; Jackowska et al. 2012). Specifically, Sakakibara et al. (2008) used ambulatory pulse wave recording and found decreased HF in university students during the night one day prior to examinations. Regular exercise can induce increased nocturnal HRV (Nummela et al. 2010; Vesterinen et al. 2013), especially through increased parasympathetic activity. Furthermore, regular exercise is associated with improved sleep quality (Youngstedt 2005). Although we did not assess sleep using sophisticated multichannel EEG in a sleep laboratory, our preliminary findings of significantly higher RMSSD in the AET group during the night may suggest that exercise also buffers the deleterious effects of stress through improved recovery during the night.

Besides other real-life stressors such as medical internships (Lin et al. 2013), shift work or other job demands (Vrijkotte et al. 2000), academic examinations are a useful real-life stressor because they can induce significantly elevated cardiovascular reactivity (Sausen et al. 1992; Hazlett et al. 1997; Lucini et al. 2002; Zanstra and Johnston 2011), have a discrete start and end as well as natural and frequent occurrence. Furthermore, they permit consistent characteristics of the situation across participants and allow for the assessment of a pre-stress baseline (Hazlett et al. 1997; Zanstra and Johnston 2011). The descriptively similar baseline and examination values in our study may question whether students experienced noticeable physiological stress during the examination period. Several studies reported comparable values under real-life examination stress in student populations (Lucini et al. 2005; Saito et al. 2008; Tharion et al. 2009). The comparison of studies examining autonomic changes during academic examination periods is difficult due to a discrepancy in the time points at which measurements were performed (1 week before, immediately before, during and after the examination), specific measures obtained and a variety of reported parameters (Hazlett et al. 1997; Spangler 1997; Lucini et al. 2002; Sakakibara et al. 2008). However, assessing HRV immediately before examinations, Lucini et al. (2002) reported lower HRV values.

We assumed that a significant improvement in aerobic fitness in the AET group was required to identify effects on HRV in response to the real-life stressor. We were able to significantly improve VO2max of the AET group by 9 % on average, while it decreased in the CG by 5 %. Previous studies reported very heterogeneous results for VO2max improvement after AET due to influencing factors such as target group, baseline vagal tone, age, gender, training load and genetic disposition (Hautala et al. 2009). We also observed these interindividual differences in VO2max improvement (SD = 6.4). We cannot draw a definite conclusion regarding the VO2max decrease of the CG. However, in a secondary analysis we identified a significant decrease in unstructured physical activity from baseline to the anticipatory period of the exams, during which the post-cardiopulmonary exercise tests were conducted. The reduction in unstructured physical activity may partly explain the significant decrease in VO2max of the CG. Furthermore, untrained participants might be less motivated to exert maximal effort in a maximal fitness test.

This study has several limitations. First, we used a male sample; thus, the results of the present study are not generalizable to women. Second, the use of a regular academic period has been suggested as useful to obtain “non-stressful” baseline values (Hazlett et al. 1997; Loft et al. 2007). To ensure that the intervention lasted the whole semester, and because we assumed that students had low stress during this “regular academic” period, we set the beginning of the semester as our baseline period. However, the average group HRV values at baseline are lower, and the HR values higher, in our sample compared to previous studies with student samples (Hazlett et al. 1997; Hughes and Stoney 2000; Lucini et al. 2002; Dimitriev et al. 2008). This may indicate a very sedentary group or more stress at baseline. For comparison, Dimitriev et al. (2008) reported LF/HF ratios ranging from 2.15 to 2.65 during baseline and from 2.65 to 4.08 during an examination period, dependent upon subjective prediction of success and actual examination marks. Furthermore, there is considerable between-subject variability regarding the examination period. While for some students the days immediately prior to an examination are the most stressful, other students experience more stress during the anticipation phase before exams (Sausen et al. 1992; Spangler 1997; Sakakibara et al. 2008). Specifically, Dimitriev et al. (2008) reported that 20 % of their student sample had no HR increase from semester to examination period. While some students in our CG sample showed no decrease in HRV (and no increase in HR) from baseline to the examination period, other students of the CG showed noticeable changes. Nevertheless, previous studies have demonstrated that, overall, examinations are useful real-life stressors (Sausen et al. 1992; Hazlett et al. 1997; Lucini et al. 2002).

Third, even though we used a stress item that worked quite well in previous studies (Åkerstedt et al. 2007), we cannot rule out that our participants had conceptions of stress that differ from scientific conceptions.

Finally, based on our objective accelerometer measures of physical activity, we controlled for the influence of physical activity on HRV. Although this is state of the art, future studies should also control for the influences of posture on HRV during everyday life.

Conclusion

The results of our randomized, controlled trial provide empirical support for the cross-stressor adaptation hypothesis during everyday life. Since repeated prolonged stress accompanied by enhanced dysregulation of the autonomic nervous system increases the risk for negative health outcomes (Jarczok et al. 2013), effective preventive strategies are needed. Our results indicate that regular AET can improve the regulation of the autonomic nervous system by attenuating the physiological stress response to real-life stressors. This indicates that exercise may be a useful strategy to prevent stress-related cardiovascular diseases. In addition, we demonstrated positive effects of AET during both day and night. To the authors’ knowledge, this is the first study to address the effects of regular AET on responses to real-life stressors using ambulatory measures and a randomized, controlled trial. Future studies should investigate the effects of exercise on physiological stress responses to various real-life stressors in more detail.