Introduction

Stress is one of the major threats to physical and mental health (Chrousos, 2009). Although the deleterious effects of stress are well known, stress levels in the general population are high, prompting researchers to focus their attention on the topic of stress regulation (American Psychological Association, 2013). Empirical research has found physical exercise to be a remedy for many stress-related physical and mental diseases (Raglin & Wilson, 2012; Warburton et al., 2006), and it has long been recognized that physical exercise and stress are closely interrelated (de Geus & Stubbe, 2007). One specific aspect of the exercise-stress relationship has been highlighted by the cross-stressor adaptation hypothesis. It states that regular exercise leads to biological adaptations that contribute to reduced physiological reactions not only to exercise-related stressors but to stressors in general (Gerber, 2008; Sothmann, 2006). As empirical findings show that chronically elevated activation and slow recovery of the stress systems have a negative impact on health (Chrousos, 2009), a cross-stressor adaptation would be a central health-protective mechanism. The purpose of this study is to test the cross-stressor adaptation hypothesis experimentally.

The cross-stressor adaptation hypothesis is typically tested in a laboratory setting in which stressors are applied to different exercise- or fitness-level groups. Thereafter, the reactions of the two stress-systems—the autonomic nervous system (ANS), with its sympathetic and parasympathetic division, and the hypothalamic–pituitary–adrenal (HPA) axis—are assessed in terms of (a) stress reactivity, i.e., the height of the stress response due to the exposure to a stressor, and (b) stress recovery, i.e., the degree to which the elevation in the physiological parameter persists after stressor cessation (Linden et al., 1997). To date, more than a hundred studies examining the effects of physical exercise and/or fitness on stress response exist; more than two dozen cardiovascular (e.g., heart rate, blood pressure) and endocrine (e.g., ACTH, cortisol) stress parameters have been assessed as the response to a broad variety of stress tasks (Jackson & Dishman, 2006). In the following, we first sum up existing results from meta-analyses and reviews as well as from more recent studies; second, we provide a short overview of the existing experimental findings.

To date, several review articles have examined the findings for the cross-stressor adaptation hypothesis (e.g., de Geus & Stubbe, 2007; Forcier et al., 2006; Gerber, 2008; Jackson & Dishman, 2006). The most profound meta-analytic results emerged for cardiovascular stress responses, as a majority of the studies examined cardiovascular stress parameters. Regarding cardiovascular reactivity, Forcier et al. (2006) stated that physical exercise is linked to a lower stress reactivity; however, Jackson and Dishman’s (2006) meta-analytic findings did not support this assumption. In both of these meta-analyses physically active persons were found to possess a better cardiovascular stress recovery. Thus, overall a majority of these findings support the assumption that physical exercise can reduce the cardiovascular stress response. Conversely, regarding the HPA axis van Doornen and de Geus (1993) and Sothmann (2006) concluded that there is no link between hormonal components of the stress response and fitness or exercise training. However, reliable data on neuroendocrine stress response has grown considerably and recent studies support the cross-stressor adaptation hypothesis for cardiovascular as well as endocrine stress responses: Rimmele et al. (2007, 2009), for instance, found a reduced cortisol response in addition to a lower heart rate reactivity in male sportsmen compared to untrained men in response to the Trier Social Stress Test (TSST; Kirschbaum et al., 1993), a standardized protocol for inducing psychosocial stress. Moreover, a study by Klaperski et al. (2013) on a female sample indicated that engagement in physical exercise is associated with a reduced heart rate as well as reduced cortisol reactivity in response to the Trier Social Stress Test for Groups. However, causal conclusions can only be drawn from experimental findings.

To date, at least 17 randomized controlled trials (RCTs) have examined the cross-stressor adaptation hypothesis, yet all of these studies examined the hypothesis only with regard to the cardiovascular stress response. Half of the experimental studies supported the assumptions of the cross-stressor adaptation hypothesis: They found that participation in an aerobic exercise training program led to more beneficial changes in the cardiovascular stress response (e.g., heart rate, blood pressure) than participation in a no-treatment group (Anshel, 1996; Georgiades et al., 2000; Spalding et al., 2004; Stein & Boutcher, 1992; Throne et al., 2000), a relaxation or meditation training program (Anshel, 1996; Keller & Seraganian, 1984), or an anaerobic exercise (mainly strength and/or flexibility training) program (Blumenthal et al., 1988, 1990; Sherwood et al., 1989; Spalding et al., 2004). Spalding et al. (2004), for instance, compared 15 subjects who had completed a 6-week aerobic exercise program with 15 subjects who had completed a 6-week weight training program and 15 participants of a no-training group; the authors found the greatest reductions in cardiovascular stress reactivity and recovery in the participants of the aerobic exercise program. While Spalding et al. (2004) used an arithmetic task with auditory distraction as stressor, other studies used a competitive motor task (Anshel, 1996), a mental arithmetic task (Blumenthal et al., 1988, 1990), a competitive reaction time task (Sherwood et al., 1989), a modified Stroop test (Stein & Boutcher, 1992), simulated fire scenes (Throne et al., 2000), or a series of different mental, psychosocial, and physical stressors (Georgiades et al., 2000; Keller & Seraganian, 1984) as the stress induction method. 
However, other experimental studies supported the cross-stressor adaptation hypothesis only partially (Blumenthal et al., 1991; Calvo et al., 1996; King et al., 2002; Kubitz & Landers, 1993; Sinyor et al., 1986) or found no support for the hypothesis at all (Albright et al., 1992; de Geus et al., 1990; Steptoe et al., 1990) when comparing stress responses. These studies also used a broad variety of stress tasks, e.g., mental arithmetic, memory search, public speaking, or reaction time tasks. For example, de Geus et al. (1990) compared 14 male undergraduates who completed a 7-week aerobic exercise training program with eight male undergraduates who did not train and found no effects of exercise on the stress response (e.g., heart rate, respiratory sinus arrhythmia) in response to a loud tone avoidance task, a memory search task, and a cold pressor test.

All in all, the experimental evidence predominantly supports the cross-stressor adaptation hypothesis. However, in reviewing the evidence, it becomes clear that it is limited by two points: First, all studies examined the effects of exercise on the cardiovascular stress response and thus rather focused on autonomic activation. Only one small RCT also examined respiratory sinus arrhythmia as a more specific index of sympathetic and parasympathetic influences (de Geus et al., 1990). Moreover, to the best of our knowledge, the stress response of the HPA axis has not been tested experimentally at all. Second, in almost every RCT different stressors were used; they were often not designed to induce stress and there is no clear protocol which can be replicated. As different stressors are known to elicit different physiological response patterns (Jackson & Dishman, 2006), it is hard to compare heterogeneous findings attained by means of different stressors. In addition, if a stressor is not a validated and replicable stress induction method, it cannot be ruled out that it was inappropriate for eliciting stress responses big enough to detect response differences.

It is therefore still very difficult to draw a final conclusion about the assumptions of the cross-stressor adaptation hypothesis. We set out to conduct the first test of the cross-stressor adaptation hypothesis that combines an RCT design with an assessment of effects on the cardiovascular system as well as on the HPA axis and with a standardized psychosocial laboratory stressor in a group format (TSST-G; von Dawans et al., 2011). In particular, we compared the heart rate (HR), heart rate variability (HRV), and salivary free cortisol responses of an aerobic exercise group and two control groups (a wait list control group and a relaxation group) before and after a 12-week intervention period.

According to previous experimental and correlational findings we expected that participants completing the 12-week aerobic exercise program would show: (1) a significantly lower stress response (HR, HRV, and cortisol reactivity and recovery) after the intervention in comparison to before the intervention; (2) greater stress response reductions than participants of the wait list control as well as the relaxation group.

Methods

Study design

The study was conducted as a randomized controlled trial with three groups (allocation ratio 1:1:1): a 12-week endurance exercise training group (EG) as the intervention group, a 12-week relaxation group (RG) as an attention control group with a non-exercise training program, and a wait list control group (WCG; receiving no intervention). Comparing the exercise group with a wait list control group allowed us to illustrate the intervention effect; additionally, comparing the exercise group with a relaxation group enabled us to control for unintended treatment effects (e.g., effects stemming from attention). We chose a relaxation training program as a means of establishing an attention control group with an appealing program for male employees. As can be seen in Fig. 1 (time bar to the left), we conducted a pre- and a post-test 1 to 2 weeks prior to and after the intervention period (assessment points “T1” and “T2”). We recruited the participants about 1 month prior to T1 (assessment point “T0”). Because the targeted number of subjects could not complete the study at once for technical reasons, and in order to control for possible seasonal effects, we conducted the study in two identical waves (wave 1: February to June 2012, n = 76; wave 2: August to December 2012, n = 73). The study was approved by the ethical review board of the University of Freiburg.

Fig. 1 Participant flow diagram and study design timeline. P(s), participant(s); *, one participant was sick during the first 4 weeks of the intervention and switched to the WCG

Participants

A priori power analyses to compute the required sample size indicated that a total sample size of 95 participants was needed to detect a medium to large effect (as expected due to previous cross-sectional findings) with a power of .80 or greater in analyses of covariance. The target sample consisted of healthy male office workers who did not engage in physical exercise and relaxation techniques at all or only infrequently. Subjects were recruited in companies from the banking and insurance industry as well as in civil service facilities located in Freiburg, Germany. In order to avoid biased self-reports of exercise, we did not disclose the true target sample; instead, we addressed all male employees via their employers. We invited all employees to participate in a study consisting of a short survey and informed them that some of them would also be invited to take part in a 12-week training study.

The detailed participant flow is presented in Fig. 1. Four hundred and seventy-four men participated in the screening survey (T0). From this pool n = 228 participants came into consideration for participation in the training study as they reported no or hardly any regular engagement in exercise and relaxation techniques during the previous 2 months (see “Measures”); these persons were telephoned and asked whether they would also like to participate in the training study. Two hundred and ten persons were willing to participate further. We informed them about the procedure of the training study and checked them for eligibility in a screening interview. Exclusion criteria were: (1) severe acute or chronic medical illness, a current psychiatric disorder, or psychotherapy; (2) disorders or injuries preventing them from participating in the program; (3) substance abuse; (4) lack of fluency in German. In total n = 149 participants were eligible and were randomized into a study group. We scheduled eligible subjects for the pre-test and sent them a written reminder with the behavior expected of them prior to the study. A total of n = 123 subjects completed the study.

Of the 123 subjects completing the study, n = 27 subjects were not included in the statistical analyses for the following three reasons: (a) We assessed for every subject whether he met the intervention requirements described below, as we wanted to prevent a dilution of intervention effects due to noncompliance (Footnote 1); eleven men did not meet the intervention requirements (EG: n = 5; RG: n = 5; WCG: n = 1) and were therefore excluded from the analyses. (b) Six men unexpectedly exercised more than 2 h/week at T1 (EG: n = 1; RG: n = 1; WCG: n = 4) and were likewise excluded. (c) As being overweight and consuming nicotine affect physiological stress responses, eight subjects who smoked more than five cigarettes/day (EG: n = 2; WCG: n = 4) or had a BMI over 35 (EG: n = 1; RG: n = 1) were excluded from statistical analyses. Thus, the final sample consisted of n = 96 subjects.

Procedures

Pre- and post-test

As depicted in Fig. 2, the testing sessions always comprised a stress response assessment (approximately 2 h) followed by a fitness test (approximately 1 h) (see further explanations below). All subjects participated in two testing sessions: one 1–2 weeks before the intervention period (pre-test, T1) and one 1–2 weeks after the intervention period (post-test, T2). The pre- and post-test procedures differed in only two respects: (a) The subjects received information and provided their consent only at T1; (b) at T2 the subjects were thanked and debriefed. All testing sessions at T1 and T2 took place between 3:00 p.m. and 9:00 p.m., with cortisol samples being taken between 3:30 p.m. and 8:00 p.m. to control for diurnal variations of cortisol release. The time and day of the testing session varied to the same extent at T1 and T2 within all study groups; there were no systematic changes in the day or time of the testing session from T1 to T2. In advance of the pre- and the post-test, the participants were instructed to refrain from exhaustive physical exercise 3 days prior to the study session, to refrain from exercise, alcohol, coffee, and green or black tea 24 h prior to the study session, to refrain from smoking and eating 1.5 h prior to the study session, and to have a regular breakfast and lunch but refrain from juices, cola, and chewing gum on the day of the study session. Each participant completed the sessions in random groups of three with two partners they were unacquainted with. After arrival, the subjects were randomly assigned to one of three tables and asked not to communicate with each other. All subjects provided written informed consent, were informed of their right to discontinue the study at any time as well as to withdraw their consent afterwards, and completed a health history questionnaire.

Fig. 2 Study design: sequence of events and measurements on timeline

Stress response assessment: As illustrated in Fig. 2 the testing session began with the stress response assessment, which lasted approximately 2 h. The time period of the stress response assessment can be further split into a preparation period (30–45 min), a presentation period (14 min), and a resting period (60 min). The core of the stress response assessment was stress induction by means of the Trier Social Stress Test for Groups (TSST-G). The TSST-G is a standardized motivated performance task protocol (von Dawans et al., 2011). It combines high levels of uncontrollability and socio-evaluative threat (subjects deliver an unrehearsed speech and complete a mental arithmetic task in front of a camera and two judges; further details below) which reliably and validly induces psychosocial stress (Dickerson & Kemeny, 2004). In the present study, the TSST-G was adapted to groups of three instead of six persons.

During preparation, the subjects first completed initial questionnaires, and then we explained saliva sampling to them and applied HR devices. Second, we assessed the subjects’ HR and HRV baselines for 5 min in an upright standing position, the same posture the subjects were in during the TSST-G and the first 10 min of the recovery period. Third, the participants were introduced to the upcoming TSST-G task: Subjects were given 10 min to prepare for a job interview in which they were supposed to convince two interviewers that they were the most suitable candidate for a position; at T1 the subjects applied for a job in their own field of work, at T2 they applied for a job outside of their normal field of work. Subjects were told that they would be videotaped, that the two interviewers were experts in the evaluation of non-verbal behavior, and that a video analysis of their performance would be conducted afterwards; in addition, the participants were informed that the panel could call them at any time to ask further questions (see von Dawans et al., 2011). At the end of the preparation period the subjects were led into the TSST-G presentation room, about 3 min slow walking distance away from the preparation room (see Fig. 3, minutes −5 to −2). In the presentation room, the subjects stood upright, separated by dividing walls, in front of a camera and a male and a female jury member representing the selection committee (Footnote 2); then the investigator explained the upcoming task again (see Fig. 3, minutes −2 to 0) and left. During the presentation period (see Fig. 3, minutes 0–14), only the male member of the committee talked to the subjects, while the female member continuously observed them; both judges were trained to withhold any verbal and non-verbal feedback at all times. Subjects were first asked to deliver their free speeches. Each of the three participants was given 3 min to present his speech. 
Following all three speeches, the subjects completed an unannounced serial subtraction task, with each subject alternately taking turns until three 30 s trials were completed. The subjects were asked to serially subtract the numbers 16 (at T1) and 17 (at T2) from a given number (e.g., 4,878) as quickly and accurately as possible. Participants received an individual starting number for each turn to prevent learning effects and had to restart if they made a mistake (see von Dawans et al., 2011). Finally, after 14 min the committee thanked the subjects and left after the investigator had reentered the room. At this point the resting period started. For the first 10 min of this period the subjects stayed in their upright standing positions and filled out questionnaires. After 10 min the investigator guided the subjects back to the first room, where they sat quietly and filled out questionnaires until saliva sampling was completed. At the end of the stress response assessment, the subjects drank a standardized drink of grape juice and water (250 ml) to obtain similarly high levels of blood glucose. At T2 the subjects were also thanked and debriefed by the selection jury.

Fig. 3 Mean salivary free cortisol levels (a), mean baseline HR and mean HR (b), and mean baseline HRV and mean HRV (c) before, during, and after the psychosocial stressor (TSST-G, shaded area) in the EG, RG, and WCG at T1 and T2. Error bars are SEM; the horizontal lines accentuate the highest physiological stress level during the stress task and facilitate comparisons between T1 and T2

Fitness test: After completing the stress response assessment the participants began the fitness test, which lasted approximately 1 h. They were guided in an adjoining building to the “Radlabor”—a professional performance testing company. They were weighed and underwent a graded exercise test with lactate measurement on a bicycle ergometer. The test protocol required that they start with an initial workload of 60 watts and pedal continuously with a cadence of approximately 90 revolutions per minute. Every 3 min the load was increased by 25 watts until the subject ended the test due to exhaustion. We monitored HR throughout the test and took a capillary blood sample from the subject’s earlobe before each load increase to determine blood lactate concentration.
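The load schedule of the graded exercise test can be illustrated with a minimal sketch. It assumes that each increment takes effect exactly at the 3-min mark; the function name and its parameters are hypothetical and are not part of the testing protocol itself.

```python
def workload_watts(elapsed_min, start_watts=60, step_watts=25, step_min=3):
    """Return the prescribed ergometer load (watts) at a given elapsed time.

    Defaults follow the protocol described in the text: 60 W initial load,
    increased by 25 W every 3 min. The step timing is an assumption.
    """
    if elapsed_min < 0:
        raise ValueError("elapsed time must be non-negative")
    completed_steps = int(elapsed_min // step_min)  # full 3-min stages finished
    return start_watts + completed_steps * step_watts


# Example: load during the first four stages of the test
for minute in (0, 3, 6, 9):
    print(minute, workload_watts(minute))
```

In this sketch a subject who stops after 10 min would have been pedaling at 135 W.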

Intervention

As depicted in Fig. 1, after the pre-test n = 50 participants started with the endurance exercise training (EG), n = 50 participants started with the relaxation training (RG), and n = 42 were in the wait list control group (WCG). The exercise and the relaxation group had two 60 min training sessions per week. One training session per week was supervised in a group format; the other training session was conducted individually by the subjects according to written training instructions, as two supervised training sessions per week would have been too time consuming. Subjects who were not able to attend the supervised training session were asked to conduct two individual training sessions that week. Exercise training: In the exercise group the subjects underwent a heart rate-controlled outdoor running training program for beginners, which aimed at increasing aerobic fitness. During the first 4 weeks, we had the participants alternate between walking and running in order to allow them to get used to the training load; after 5 weeks, the subjects ran, if possible, continuously at an intensity of 60–80 % of their maximal heart rate as determined by the lactate test at T1 (Janssen, 2001). The participants were instructed to warm up before and cool down after running. Relaxation training: Participants in the relaxation group underwent an indoor relaxation program for beginners. The aims of the training program were to increase the subjects’ feelings of relaxation and to enable them to apply relaxation techniques. The subjects learned four of the six major approaches to relaxation described by Smith (1999): progressive muscle relaxation, autogenic training, breathing exercises, and imagery. During the first 6 weeks of the program the main topic was progressive muscle relaxation, thereafter autogenic training. During relaxation the subjects mostly lay on the floor; no relaxation exercise required any strenuous movements. 
Control group: Members of the wait list control group did not receive any special instructions; they started with the training program of their own choice about a month after T2.
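As a simple illustration of the training intensity used in the exercise group, the following sketch derives a 60–80 % heart rate zone from a maximal heart rate. The function name is hypothetical and the example value of 180 bpm is invented for illustration; it is not taken from the study data.

```python
def target_hr_zone(max_hr_bpm, lower=0.60, upper=0.80):
    """Return the (lower, upper) training heart rate bounds in bpm.

    The 60-80 % range follows the text; max_hr_bpm is assumed to come from
    the individual's exercise test.
    """
    if max_hr_bpm <= 0:
        raise ValueError("maximal heart rate must be positive")
    return (round(lower * max_hr_bpm, 1), round(upper * max_hr_bpm, 1))


print(target_hr_zone(180))  # hypothetical maximal heart rate of 180 bpm
```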

The training was conducted by two trained female instructors, who conducted both training programs. Additionally, a running assistant accompanied the exercise group participants to allow each subject to run in his own training zone. In both groups each training session was monitored by a heart rate device, which in the exercise group also served as training control. The subjects were instructed to record all training sessions in a training journal. We used three methods to control for intervention compliance: First, we assessed each subject’s training adherence by means of the heart rate device and the training journal. Subjects who attended fewer than six supervised training sessions or undertook fewer than 12 sessions in total were excluded from analyses due to a lack of training adherence (EG: n = 2, RG: n = 1). Second, we asked the subjects whether they had increased their exercise or relaxation behavior due to their participation in the study; three members of the exercise group who declared that they had made no changes in their exercise behavior were excluded. Last, we controlled for increases in exercise amounts in the relaxation and wait list control group between T1 and T2; we excluded subjects who reported increases of more than 90 min/week (WCG: n = 1; RG: n = 4), as they would have diluted intervention effects.
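The three compliance checks can be summarized as a small decision rule. The sketch below is illustrative only: the `Participant` record and its field names are hypothetical, and applying the second check to both training groups is an assumption (the text reports exclusions on this criterion only for the exercise group).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    group: str                                    # "EG", "RG", or "WCG"
    supervised_sessions: int = 0                  # supervised sessions attended
    total_sessions: int = 0                       # supervised + individual sessions
    reported_behavior_change: bool = True         # self-report after the program
    exercise_increase_min_per_week: float = 0.0   # change in exercise, T1 to T2


def exclusion_reason(p: Participant) -> Optional[str]:
    """Return the first compliance criterion a participant fails, else None."""
    if p.group in ("EG", "RG"):
        # Check 1: fewer than 6 supervised or fewer than 12 total sessions
        if p.supervised_sessions < 6 or p.total_sessions < 12:
            return "insufficient training adherence"
        # Check 2 (assumed to apply to both training groups)
        if not p.reported_behavior_change:
            return "no self-reported behavior change"
    if p.group in ("RG", "WCG"):
        # Check 3: exercise increase of more than 90 min/week
        if p.exercise_increase_min_per_week > 90:
            return "exercise increase above 90 min/week"
    return None


print(exclusion_reason(Participant("EG", supervised_sessions=5, total_sessions=20)))
```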

Measures

At T0, T1, and T2 every participant filled out a questionnaire asking for information about physical exercise, relaxation behavior, and activities of daily living (e.g., housework). At T1 and T2 we furthermore assessed physiological stress responses and physical fitness. In order to control for systematic differences in trait anxiety and chronic stress between the groups, we also asked the participants to complete the State Trait Anxiety Inventory (STAI; Laux et al., 1981) and the short form of the Trier Inventory for Chronic Stress (TICS; Schulz et al., 2004) at T1 and T2. The internal consistency was good for both scales at T1 and T2, with a Cronbach’s alpha of .93 and .93 for “trait anxiety” and .86 and .91 for “chronic stress.”

Exercise and activities measures

For measurement of exercise and activities of daily living, we used the Measurement of Daily Activities and Exercise Questionnaire (Fuchs, 2012). Physical exercise: The subjects named a maximum of three exercise activities they had regularly engaged in within the last 4 weeks and indicated the frequency and duration per episode in minutes for each activity. We checked whether each activity they had named was a valid physical exercise activity, considering as valid only exercise activities involving larger groups of skeletal muscles and leading to maintenance of or increases in endurance, power, coordination, or flexibility. In total, eight activities named by the participants (e.g., riding a motor bike, walking a dog) were classified as invalid and excluded from further calculations. For each valid exercise activity, we computed an exercise amount in “minutes per week” by multiplying frequency by duration and—as the subjects reported the monthly frequency—dividing it by 4. To adjust the exercise amount for very long-lasting exercise activities typically containing a lot of resting periods, we divided amounts for hiking, downhill skiing, pair dancing, and physiotherapeutic exercise by 8 instead of 4. We then added up all single amounts to obtain a total physical exercise index value. Relaxation activities: We assessed relaxation activities the same way we measured physical exercise and used the same methods to calculate a total index value. Activities of daily living: Participants reported how often per week and for how long per episode they had engaged in five given physical activities (cycling to work, other cycling for locomotion, walking for locomotion, strolling, physically strenuous household activities [e.g., gardening]). Again, we multiplied the frequency by duration for each activity and added up all single amounts to obtain a total activities of daily living index value (minutes per week). 
The mean values and standard errors of the mean for exercise and activities of daily living in the study groups at T1 and T2 are displayed in Table 1.
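The physical exercise index described above can be sketched as follows. The function and variable names are hypothetical, the input is assumed to contain only activities already judged valid, and the example values are invented for illustration.

```python
# Activities whose amounts are divided by 8 instead of 4, as named in the text
LONG_ACTIVITIES = {
    "hiking",
    "downhill skiing",
    "pair dancing",
    "physiotherapeutic exercise",
}


def weekly_minutes(activity, monthly_frequency, minutes_per_episode):
    """Convert a monthly frequency and episode duration into minutes per week."""
    divisor = 8 if activity in LONG_ACTIVITIES else 4
    return monthly_frequency * minutes_per_episode / divisor


def exercise_index(activities):
    """Sum the weekly minutes over (activity, frequency, duration) tuples."""
    return sum(weekly_minutes(name, freq, dur) for name, freq, dur in activities)


# Example: running 8 times/month for 30 min, hiking twice/month for 240 min
print(exercise_index([("running", 8, 30), ("hiking", 2, 240)]))
```

Here both activities contribute 60 min/week, so halving the hiking amount keeps a few long outings from dominating the index.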

Table 1 Description (mean, SEM) of the study groups at T1 and T2

Physical fitness measures

We calculated the individual lactate performance curve and the Individual Anaerobic Threshold (IAT) by relating blood lactate concentrations to achieved power (Roecker et al., 1998). The power achieved at the IAT relative to body weight (p[IAT]/kg) was used as an indicator of physical fitness (Kindermann et al., 1980; Roecker et al., 1998; Wasserman, 2011). The fitness values for four subjects are missing: one man (WCG) aborted the test at T1; three subjects could not take the test due to injuries at T2 (EG: n = 2; RG: n = 1).

Physiological stress measures

We used salivary free cortisol as an indicator for endocrine stress response and assessed autonomic stress response by means of HR and HRV. Cortisol: Cortisol release is a valid indicator for HPA activity as a response to an acute psychosocial stressor, especially when psychosocial stress is induced by a performance task containing social-evaluative threat and uncontrollability (Dickerson & Kemeny, 2004). The free, biologically active cortisol fraction in the blood can be reliably and validly assessed through the measurement of salivary free cortisol. Salivary free cortisol gradually increases within about 10 min, reaching its peak 10 to 30 min after stressor cessation (Foley & Kirschbaum, 2010). We collected six saliva samples from each participant before (−3 min relative to stressor onset) and after stress exposure (+15, +25, +40, +55, +75 min; Fig. 2) using a commercially available sampling device (Salivette®; Sarstedt, Germany). The saliva samples were stored at −20 °C and sent to Dresden LabService GmbH (Germany) for biochemical analysis, where they were thawed and spun at 21 °C at 3,000 rpm for 3 min to obtain clear saliva; free cortisol concentrations (nmol/l) were determined by a luminescence immunoassay for the in vitro-diagnostic quantitative determination of cortisol in human saliva (IBL International). Interassay coefficients of variation were below 5 %. For cortisol analysis, one exercise group and one relaxation group subject were excluded due to usage of cortisone compounds and drinking a small amount of alcohol in the morning.

Heart rate and heart rate variability: For HR and HRV measurement, a wireless chest heart rate transmitter with a wrist monitor recorder (Polar RS800CX, Polar Electro, Finland) applied to each subject assessed HR in beats per minute (bpm) and HR beat-to-beat intervals in a 1 ms resolution. For HRV, the “Relaxation Count” (RLX) by Polar was recorded in milliseconds (ms). RLX is an approximation of SD1, which is a time-domain index for HRV obtained by means of the Poincaré Plot method (Karmakar et al., 2011; Tulppo et al., 1996). RLX is calculated by way of recursive analysis of consecutive R–R intervals and thus allows monitoring of changes as a function of time. Thereby, RLX provides information about the degree of the sympathetic and parasympathetic modulation of the heart rate, which reflects the adaptability of the heart to changing conditions (Karmakar et al., 2011; see “Discussion”). We monitored HR and HRV continuously from 5 min before the stress onset until 7 min after stressor cessation; for analyses, we aggregated 1 min intervals. During the preparation period we assessed the subjects’ HR and HRV baseline values in a 5 min interval in an upright standing position (see Fig. 2). Due to technical problems, we did not record the heart rate of nine subjects during T1 and that of two subjects during T2 (EG: n = 5; RG: n = 3; WCG: n = 3).
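For orientation, the standard Poincaré index SD1 that RLX approximates can be computed from a series of R–R intervals as the standard deviation of successive differences divided by the square root of 2. The sketch below shows this textbook definition only; it is not Polar's recursive RLX algorithm, the function name is hypothetical, and the use of the sample standard deviation is an assumption.

```python
import math
import statistics


def sd1_ms(rr_intervals_ms):
    """Return the Poincare SD1 (ms) for a sequence of R-R intervals (ms).

    SD1 = SD(successive differences) / sqrt(2). This is the standard
    definition, not the device-specific RLX computation.
    """
    if len(rr_intervals_ms) < 3:
        raise ValueError("need at least three R-R intervals")
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return statistics.stdev(diffs) / math.sqrt(2)


# Example with invented R-R intervals alternating between 800 and 810 ms
print(sd1_ms([800, 810, 800, 810, 800]))
```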

Statistical analyses

We tested for systematic differences in age, body mass index (BMI), chronic stress, trait anxiety, activities of daily living, physical exercise, and fitness between the study groups: We analyzed differences between the groups at T1 and T2 by means of separate analyses of variance (ANOVAs), and we analyzed differences between the groups over time by means of separate 3 × 2 (group by time) ANOVAs with repeated measurement. Furthermore, we used paired t-tests in each group to assess changes from T1 to T2 (see Table 1).

We analyzed the physiological stress response parameters as follows: First, we logarithmized the cortisol, HR, and HRV values to normalize the positively skewed physiological data. Second, we assessed for T1 and T2 separately whether the stressor elicited a significant stress response in terms of cortisol, HR, and HRV. For this purpose, we conducted two 3 × 2 (group by time) ANOVAs with repeated measurement for each physiological parameter: For cortisol, we contrasted levels directly before stress onset (sample 1) and 25 min after stress onset (sample 3); for HR and HRV, we contrasted the baseline value with the mean value of the 14-min stress period. Lastly, baseline differences of cortisol, HR, and HRV at T1 and T2 were analyzed by means of analyses of covariance (ANCOVAs) with the three study groups as group factor and age and BMI as covariates.

Stress reactivity: To assess stress reactivity, we calculated the “area under the individual response curve with respect to the ground” (AUC_G) for each participant by using the formula from Pruessner et al. (2003; see Footnote 3). The AUC_G is more suitable than repeated measurement analyses for analyzing physiological data with numerous points of measurement and sometimes different time distances between measurements (Pruessner et al., 2003). As cortisol peaked at the time of the third sample on average, we calculated the AUC_G for cortisol for the time period between the 1st and 3rd sample (sample before stress onset to sample 10 min after stressor cessation) to assess the absolute stress reactivity. For HR and HRV, we calculated the AUC_G during stress exposure (14 min TSST-G time interval). In order to analyze changes in reactivity over time, we calculated a difference value “ΔAUC_G” for each parameter by subtracting the AUC_G at T2 from the AUC_G at T1. We used separate ANCOVAs and a priori comparisons to determine differences in ΔAUC_G between groups with age and mean BMI (mean of BMI at T1 and T2) as covariates. For HR and HRV analyses, we further included the HR and HRV baseline at T2 and the HR and HRV baseline difference (T1–T2) as covariates, as regular exercise leads to a decrease in HR and an increase in HRV (Carter et al., 2003; Wittling & Wittling, 2012) and as we wanted to assess relative changes in HR and HRV stress reactivity (see Footnote 4). To complement the examination of differences between study groups, we also compared the AUC_G at T1 and at T2 in separate ANCOVAs (covariates as before); the results are displayed in Table 2. Last, for each group we used t-tests for paired samples to assess whether significant changes occurred from T1 to T2.

Stress recovery: We assessed stress recovery by analyzing the extent of elevation in the physiological parameter after stressor cessation. As no standard procedure for the analysis of recovery exists (every parameter has specific advantages and disadvantages), recovery parameters should be chosen with respect to the specific data (Linden et al., 1997). We decided on an AUC measure (Linden et al., 1997) to accommodate the frequent sampling intervals in our study and participants who did not fully recover in the time provided. As “recovery” refers to an elevation with regard to a physiological zero level reference, we analyzed the amount of physiological elevation during recovery with regard to the first sample (for cortisol) and the baseline (for HR and HRV). For cortisol, we accomplished this by first subtracting the first sample value from the values of the samples taken during the recovery phase (samples 4, 5, and 6) and computing the area under the curve for these values. For HR and HRV, we proceeded identically and calculated an AUC for each baseline-adjusted value during the first 7 min after stressor cessation. In the following we refer to these areas under the curve as “AUC with respect to baseline” (AUC_B). Again, we calculated a difference value “ΔAUC_B” for each parameter; all further analyses were performed just as described above for ΔAUC_G.

Generally, the level of significance was set at p < .05 for all analyses. Where the Mauchly test of sphericity indicated a violation of the sphericity assumption, we used Greenhouse–Geisser corrections. All data were analyzed with SPSS Statistics 20.0 and are presented as mean ± SEM.
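The two area measures described above can be sketched as follows. The area with respect to ground uses the trapezoid formula from Pruessner et al. (2003), which also handles unequal time distances between measurements. This is a minimal illustration under stated assumptions: the function names, the sample values, and the time points are hypothetical, and letting values below the reference contribute negative area in the baseline-adjusted measure is an assumption rather than a documented detail of the analysis.

```python
def auc_ground(values, times):
    """Trapezoidal area under the curve with respect to ground (zero).

    values: measurements m_1..m_n; times: matching time points (e.g. minutes).
    Sums (m_i + m_{i+1}) / 2 * (t_{i+1} - t_i) over all neighbouring pairs.
    """
    if len(values) != len(times) or len(values) < 2:
        raise ValueError("need at least two measurements with matching times")
    area = 0.0
    for i in range(len(values) - 1):
        dt = times[i + 1] - times[i]
        area += (values[i] + values[i + 1]) / 2.0 * dt
    return area


def auc_baseline(values, times, reference):
    """Area of the reference-adjusted curve (first sample or baseline).

    Assumption: values below the reference contribute negative area.
    """
    adjusted = [v - reference for v in values]
    return auc_ground(adjusted, times)


def delta_auc(auc_t1, auc_t2):
    """Change score as described in the text: AUC at T1 minus AUC at T2."""
    return auc_t1 - auc_t2


# Hypothetical reactivity samples at 0, 15 and 25 min
reactivity = auc_ground([10.0, 14.0, 18.0], [0, 15, 25])
# Hypothetical recovery samples, adjusted for a reference value of 10
recovery = auc_baseline([16.0, 13.0, 11.0], [0, 10, 20], 10.0)
```

With this sign convention, a positive change score means that the area was smaller at T2 than at T1.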

Table 2 Description (mean, SEM) of the AUC values for cortisol, HR and HRV reactivity and recovery for the study groups at T1 and T2

Results

Mean values and changes in participant characteristics

The characteristics of the sample at T1 and T2 are presented in Table 1. At T1 the three study groups did not differ significantly in terms of age, BMI, chronic stress, trait anxiety, activities of daily living, physical exercise level, or physical fitness (all p > .05; see Table 1 for exact p values). At T2 the study groups differed significantly only in levels of physical exercise, F(2,93) = 42.50, p < .001, ηp² = .478: During the last 4 weeks prior to T2, members of the exercise group exercised 103 min/week on average, while members of the relaxation group and wait list control group exercised only 22 and 17 min/week, respectively. Longitudinal analyses (change from T1 to T2) identified significant changes in the exercise group in terms of BMI, t(34) = 2.50, p = .017, d = .08, physical exercise, t(34) = −10.22, p < .001, d = 2.30, and physical fitness, t(32) = −2.50, p = .018, d = .31. The amount of physical exercise also significantly increased in the relaxation group from 8 min/week at T1 to 22 min/week at T2; however, this increase was markedly smaller than in the exercise group. Training adherence did not differ between the training groups (p > .05): Members of the exercise group and relaxation group participated in M = 9.94 (SEM = 0.27) and M = 10.15 (SEM = 0.22) supervised sessions and M = 10.66 (SEM = 0.42) and M = 10.68 (SEM = 0.51) free training sessions, respectively.

Cortisol response to stress

Mean salivary cortisol levels (nmol/l) are presented in Fig. 3a; all values used for analyses are log-transformed. There were no significant differences in cortisol levels between the three groups at the time of the first salivary sample, either at T1, F(2,89) = 0.84, p = .43, or at T2, F(2,89) = 2.80, p = .07. At T1 as well as at T2 the TSST-G induced a significant increase in cortisol levels in all participants, illustrated by significant main effects of time in the repeated measurement ANOVAs, with F(1,91) = 237.62, p < .001, ηp² = .723 for T1, and F(1,91) = 223.06, p < .001, ηp² = .710 for T2.
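Readers who wish to cross-check the reported effect sizes can recover partial eta squared (ηp²) from an F statistic and its degrees of freedom, since ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is a reader-side consistency check, not the procedure used in the original analyses (which were run in SPSS); the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic.

    Follows from eta_p^2 = SS_effect / (SS_effect + SS_error)
    together with F = (SS_effect / df_effect) / (SS_error / df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effects of time on cortisol reported above: F(1,91) at T1 and at T2
eta_t1 = partial_eta_squared(237.62, 1, 91)
eta_t2 = partial_eta_squared(223.06, 1, 91)
```

Rounded to three decimals, the two values agree with the reported .723 and .710.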

Cortisol reactivity: For ΔAUC_G as dependent variable we found a significant main effect of the group factor (F(2,89) = 3.58, p = .03, ηp² = .075), indicating that the three groups changed differently over time with respect to their stress reactivity. Contrast analyses showed that the ΔAUC_G of the exercise group differed significantly from that of the wait list control group (p = .01), but not from that of the relaxation group (p = .44); the difference in ΔAUC_G between relaxation group and wait list control group missed significance (p = .06). Analyses of stress reactivity changes within the groups indicated that the reactivity abated significantly from T1 to T2 in the exercise group (t(33) = 3.25, p = .003, d = .64) and relaxation group (t(32) = 2.70, p = .01, d = .41) but not in the wait list control group (p = .83).

Cortisol recovery: The three groups did not significantly differ in their ΔAUC_B (F(2,89) = 1.99, p = .14). Analyses of stress recovery changes within the groups revealed that members of the wait list control group significantly improved their cortisol recovery from T1 to T2 (t(26) = 2.57, p = .02, d = .34), whereas no significant changes occurred in the other groups. Finally, results concerning cross-sectional group differences indicated that at T2 cortisol stress recovery significantly differed between the three groups (F(2,89) = 3.11, p = .05, ηp² = .065). Exact AUC and p values for salivary free cortisol are displayed in Table 2.

Heart rate response to stress

Mean HR levels (bpm) are presented in Fig. 3b; all values used for analyses are log-transformed. HR at baseline significantly differed among the study groups at T1 (F(2,80) = 3.26, p = .04, ηp² = .075) as well as at T2 (F(2,80) = 3.36, p = .04, ηp² = .078). Contrast analyses showed that at T1 the wait list control group’s baseline was significantly lower than those of the exercise group and the relaxation group (both p < .05), whereas at T2 the exercise group’s baseline was significantly lower than that of the relaxation group (p = .02) and tended to be lower than that of the wait list control group (p = .08). Age proved to be a significant covariate at T1 and T2, with older subjects having a lower HR baseline (both p < .05). Significant large main effects of time in the ANOVAs at T1 (F(1,82) = 217.65, p < .001, ηp² = .726) and T2 (F(1,82) = 145.79, p < .001, ηp² = .640) indicated that the stressor induced a substantial increase in HR in all groups at both points of measurement. It is also noteworthy that at T2 the groups’ HRs during the slow 3 min walk to the presentation room rose equally (see Fig. 3b, minutes −5 to −2) and then diverged during the TSST-G.

HR reactivity: From T1 to T2, HR reactivity during stress exposure, as indicated by AUC_G, changed differently in the three groups, shown by a significant main effect of the group factor (F(2,78) = 3.64, p = .03, ηp² = .085). Contrast analyses revealed that the exercise group had a significantly greater ΔAUC_G than the relaxation group (p = .01), but not than the wait list control group (p = .10); the ΔAUC_G did not differ between the relaxation group and wait list control group (p = .49). The covariate HR baseline difference was positively associated with the ΔAUC_G (p < .001). Further analyses of HR reactivity within the groups indicated that HR reactivity decreased significantly from T1 to T2 in the exercise group (t(29) = 4.93, p < .001, d = .77), whereas no changes occurred in the relaxation group or wait list control group (p = .50 and p = .21).

HR recovery: For HR ΔAUC_B a significant group effect emerged (F(2,78) = 4.23, p = .02, ηp² = .098), indicating a different T1–T2 development in HR recovery. Contrast analyses revealed that the ΔAUC_B of the exercise group was significantly larger than those of the relaxation group (p = .01) and the wait list control group (p = .02), while the latter two groups did not differ in this regard (p = .90). The covariate HR baseline difference was significantly (p < .001) negatively associated with ΔAUC_B. Analyses of HR recovery within the groups showed a significant change, namely an improvement, in HR recovery only in the exercise group (t(29) = 2.83, p = .008, d = .60). Exact AUC and p values for HR are displayed in Table 2.

Heart rate variability response to stress

Mean values for HRV are presented in Fig. 3c; all values used for analyses are log-transformed. The HRV baselines at T1 (F(2,80) = 2.25, p = .11) and T2 (F(2,80) = 1.14, p = .33) did not differ significantly between the three study groups. Age at T1 turned out to be a significant covariate, with older subjects having a lower HRV baseline (p = .001). Main effects of time in the ANOVAs at T1 (F(1,82) = 10.37, p = .002, ηp² = .112) and T2 (F(1,82) = 5.33, p = .024, ηp² = .061) indicated at both times a significant decrease in HRV due to the stressor. As observed for HR, during the slow 3 min walk to the presentation room the groups’ HRV values rose equally at first (Fig. 3c, minutes −5 to −2) before diverging from one another during stress exposure.

HRV reactivity: The ΔAUC_G for HRV reactivity changed differently in the groups, which is indicated by the significant main effect for the group factor (F(2,78) = 4.28, p = .02, ηp² = .099). Contrast analyses showed that the HRV ΔAUC_G value in the exercise group was significantly higher than the corresponding values in the relaxation (p = .02) and the wait list control group (p = .01); the ΔAUC_G values of the relaxation group and wait list control group did not differ (p = .58). The HRV baseline difference was significantly positively associated with HRV ΔAUC_G (p < .001). Analyses of changes within the groups indicated that HRV reactivity significantly improved from T1 to T2 in the exercise group (t(29) = −4.15, p < .001, d = .54) but not in the relaxation or the wait list control group. Furthermore, HRV reactivity at T2 significantly differed between the groups (F(2,78) = 3.58, p = .03, ηp² = .084).

HRV recovery: For HRV ΔAUC_B the group effect just missed significance (F(2,78) = 2.87, p = .06). Age (p = .05) and HRV baseline difference (p < .001) were significant covariates, with age and HRV baseline improvements being negatively associated with HRV recovery improvements. Analyses of the HRV recovery within the groups yielded no significant changes, yet at T2 the groups significantly differed in their HRV recovery (F(2,78) = 4.18, p = .02, ηp² = .097). Exact AUC and p values for HRV are provided in Table 2.

Discussion

This is the first study to experimentally examine the effects of an endurance exercise program on the cardiovascular stress response and the response of the HPA axis to a standardized psychosocial laboratory stressor (TSST-G) using an RCT design. As major findings this study revealed that (1) a 12-week endurance exercise program significantly reduced the reactivity to a psychosocial stressor in terms of cortisol, heart rate (HR), and heart rate variability (HRV), while a 12-week relaxation program only reduced cortisol stress reactivity and a wait list control group did not show any changes. It was also shown that (2) the endurance exercise program alone led to an improved HR stress recovery. The exercise program seemingly did not affect cortisol or HRV recovery; however, these findings must be regarded with caution due to limitations in assessment. Further findings were that (3) participants of the 12-week exercise program showed significant improvements in their physical exercise and fitness levels. In the relaxation group the level of exercise also significantly increased, but the increase was small and did not result in a fitness change. Last, we found that (4) our laboratory stressor (the TSST-G) effectively induced stress responses in all three groups in the pre-test as well as in the post-test. In line with previous findings (Federenko et al., 2004), the cortisol response decreased from the first to the second TSST-G exposure in all groups, while such decreases did not emerge for HR and HRV (see Fig. 3). In the following, we discuss the major findings in more detail.

Cortisol

For cortisol reactivity, our findings confirm experimentally what has been hypothesized in cross-sectional studies by Klaperski et al. (2013) and Rimmele et al. (2009): Our 12-week exercise training significantly reduced cortisol stress reactivity, while no such changes occurred in the wait list control group. However, a comparison of the effects of the exercise and relaxation training does not reveal a clear advantage for the exercise group in terms of cortisol reactivity: Subjects of the relaxation group also significantly reduced their cortisol reactivity, and the groups did not differ regarding a change in cortisol reactivity over time. However, only the exercise group’s reactivity change was significantly different from that in the wait list control group (the difference between the two control groups only approached significance). From these findings we conclude that engagement in physical exercise leads to a reduction of cortisol reactivity under psychosocial stress. However, it cannot completely be ruled out that the reduction in cortisol stress reactivity in the exercise group was brought about by nonspecific effects (e.g., attention effects) rather than the exercise stimulus per se as we also found a significant reduction in cortisol reactivity in the relaxation group.

Our results for cortisol recovery stand in contrast to our findings for cortisol reactivity: Overall, we could find no evidence that the three groups’ stress recovery abilities changed differently over time. However, notwithstanding the fact that this finding is in accordance with previous results (Klaperski et al., 2013; Rimmele et al., 2009), it must be kept in mind that possible effects on cortisol stress recovery might be masked by limitations in assessment methods. Cortisol levels are not assessed continuously but at fixed assessment points, and thus the course of the stress response is reflected incompletely. Additionally, due to its delayed response onset, cortisol release cannot be related precisely to a certain timeline or course of events, and a clear distinction between cortisol stress reactivity and recovery cannot be made. It is therefore conceivable that recovery effects are not adequately depicted by present assessment methods. Apart from that, neither exercise nor relaxation training significantly affected cortisol stress recovery. However, participants of the wait list control group significantly improved their cortisol stress recovery from T1 to T2. We ascribe this finding to the effects of lower reactivity values due to habituation and a baseline measurement problem: Cortisol levels at the time of the first saliva sample did not significantly differ between the groups, yet the first cortisol values of the exercise group and relaxation group were lower at T2 than at T1, while the wait list control group’s level remained stable (see Fig. 3a). As the first saliva sample value is used as a reference point for the recovery calculation, the wait list control group achieved better recovery values at T2. In line with results from Gaab et al. (2005), we assume that the intervention groups already showed a reduction in the anticipatory stress response. However, we cannot test this assumption as we did not assess a valid baseline about 20 min after arrival and 15 min before stress onset. Future studies should control for possible anticipatory stress responses.

All in all, our experimental findings support the assumption of the cross-stressor adaptation hypothesis with respect to endocrine stress reactivity; in agreement with previous results (Klaperski et al., 2013) and unlike Rimmele et al. (2009), we conclude that even a small amount of exercise training can lead to adaptations of the HPA axis in formerly physically inactive persons. However, in order to draw definite conclusions, studies are needed that can entirely rule out the influence of nonspecific intervention effects.

Heart rate

Our data confirm a link between physical exercise and HR response to stress: HR stress response patterns changed differently in the three study groups in terms of HR reactivity and HR recovery. HR reactivity: Only the exercise group’s participants significantly reduced their HR reactivity from T1 to T2, and the HR reactivity change in these subjects was significantly higher than in the relaxation group but not higher than in the wait list control group. HR recovery: Similar to HR reactivity, only subjects in the exercise group showed a significant improvement in their HR recovery, while recovery changes in the relaxation and wait list control group were smaller and non-significant. Beyond that, we found a general reduction in HR in the exercise group at T2 as a consequence of the exercise program (Carter et al., 2003).

In sum, the results for heart rate support the cross-stressor adaptation hypothesis and are in accordance with empirical evidence suggesting that physical exercise reduces HR reactivity and improves HR recovery (Forcier et al., 2006; Jackson & Dishman, 2006; Klaperski et al., 2013). Furthermore, the results are in line with the majority of the existing experimental studies in which participation in an aerobic exercise program but not participation in a relaxation program or in no program improved HR reactivity and recovery.

Heart rate variability

Our findings also confirm a link between engagement in physical exercise and HRV response to stress; however, only for stress reactivity but not for stress recovery. HRV reactivity: Analyses revealed that the study groups changed differently over time, with the exercise group showing a significantly larger reduction in HRV reactivity from T1 to T2 than the relaxation group and the wait list control group. The exercise group was the only study group to significantly reduce its HRV reactivity during stress from T1 to T2. HRV recovery: Within and between the three groups, HRV recovery did not change significantly from T1 to T2. However, at T2 the three groups significantly differed in their HRV recovery, with the relaxation group achieving the best and the wait list control group the worst recovery values.

Hence, our 12-week exercise training program led to an improvement in HRV reactivity but not in HRV recovery, which supports the cross-stressor adaptation hypothesis only partially with regard to HRV. HRV parameters have rarely been examined in studies testing the cross-stressor adaptation hypothesis (e.g., Spalding et al., 2000). The only experimental result with regard to HRV stems from de Geus et al. (1990), who found no effects of aerobic exercise on the subjects’ respiratory sinus arrhythmia. Our findings contradict this previous result and indicate that physical exercise affects not only HR but also HRV reactivity. This result provides an initial indication that not only the sympathetic division (mainly responsible for activation) but also the parasympathetic division and the interplay between the two divisions are impacted by exercise. However, our results must be regarded as first tentative insights as we could only apply limited HRV assessment methods, and the RLX index we used is only an approximation of SD1 and reflects the modulation through both ANS subsystems. Our experimental results should encourage researchers to examine the influence of physical exercise on the HRV stress response by means of more specific assessment methods and measurement of parasympathetic activity because HRV, and in particular the parasympathetic activation, is being acknowledged more and more as an important indicator for stress and health states (Wittling & Wittling, 2012).

Strengths and limitations

The findings of the present study may be assumed to possess high internal validity because of five study features: (1) We used a randomized, controlled intervention design with a wait list control group and an attention control group; (2) we assessed three physiological parameters (heart rate, heart rate variability, cortisol) as indicators of the reactions of the sympathetic and parasympathetic nervous system and the HPA axis; (3) we applied a standardized and valid protocol to induce laboratory stress (TSST-G); (4) we assessed the effects of the exercise training by means of standardized graded exercise lactate tests; and (5) we assessed stress reactivity and stress recovery as two different parameters of the stress response. These characteristics support the notion that our findings can reflect a causal relationship between physical exercise and the physiological stress response.

Our results indicate that the cross-stressor adaptation hypothesis is valid for cardiovascular stress reactivity and presumably also for endocrine stress reactivity; with regard to endocrine stress reactivity, however, stronger experimental evidence is needed. The findings allow only limited conclusions regarding stress recovery, as it cannot be ruled out that recovery effects were masked by assessment limitations. Our results concerning HRV are promising, but future studies with more specific HRV assessment methods are needed to expand on these findings. Furthermore, it has to be acknowledged that we did not assess the effectiveness of the relaxation program as this program served as an attention control group; thus, our findings with regard to the effects of the relaxation program on the stress response are limited. Moreover, although the relaxation training served its purpose by constituting an appealing control group with almost no drop-out, it was not an optimal attention control group since relaxation may have also affected the stress response, as indicated by the result for cortisol reactivity. The generalizability of our findings is limited to healthy, male white-collar workers. Studies on females, other occupational groups, or specific age groups would provide more insight into the influence of gender, age, occupation, and education.

Outlook

Health is affected negatively by long-lasting hyper-elevated activations and slow recovery of the stress system (Chrousos, 2009). Moreover, low reactions to laboratory stressors are linked to a lower risk of stress-related diseases (Chida & Steptoe, 2010). As this study showed that engagement in regular physical exercise is likely to reduce the reactivity to a psychosocial laboratory stressor with regard to the autonomic nervous system and with regard to the HPA axis, the idea of physical exercise as a “buffer” against the negative health effects of stress is strengthened. Whether the cross-stressor adaptation indeed forms the basis for a stress-buffering effect and whether stress response changes are linked to changes in health will have to be examined further. Just as importantly, the role of physical fitness and psychological resources must be explored. Lastly, however, it should not be disregarded that some studies have shown chronic physiological hypo-reactivity to be linked to an increased risk of other chronic diseases such as obesity (Carroll et al., 2008). Reduced stress reactivity might thus be a double-edged sword which lowers risks for some diseases but heightens the risk for others.