Key Points

The psychological consequences of exercise training might be largely attributable to the placebo effect, but few exercise training studies have been conducted that permit an assessment of the placebo effect.

The results of this meta-analysis suggest that the true effect of exercise training on psychological outcomes is about half of what has been reported previously.

If the psychological consequences of exercise training are to be understood, more attention needs to be paid to the potential placebo effects associated with exercise training.

1 Introduction

The placebo effect, also referred to as the placebo response [1], has received increased attention during the last decade. The development of innovative research designs [2] has helped investigators understand the mechanisms that elicit placebo effects and the physiological systems that are influenced. One approach has been to manipulate expectations, such as those regarding pain, depression, and motor performance, and this work has provided insight into the specific neurotransmitters and brain regions that are linked to placebo effects [3]. An alternative approach has been to induce placebo responses in the autonomic, endocrine, and immune systems through classical conditioning [4]. Influential contributions such as these have permeated the scientific community and given researchers cause to recognize the potential implications of the placebo phenomenon within their own fields of expertise.

Until recently, applied placebo research has largely been restricted to medical settings, but a small body of evidence suggests that placebo effects influence physical performance. It is estimated that placebos can have a small-to-moderate impact on exercise performance [5, 6], although the specific mechanisms are not well understood. One barrier to elucidating the underlying mechanisms in the physical performance literature is the dearth of studies that permit a valid assessment of the placebo effect. Experts in this area have suggested that future investigators consider (1) including both a placebo and a natural history control group in randomized controlled trials, (2) alternative study designs to isolate variables that affect the magnitude of placebo responses (e.g., the balanced placebo design), and (3) using psychological instruments to measure differences between placebo responders and non-responders [5, 6].

Psychological benefits attributed to exercise may be confounded by the placebo effect. A narrative review suggests that the mental health benefits of acute exercise are plausibly placebo effects because acute psychological consequences of exercise have not been shown to be mediated by the stimulus characteristics such as exercise duration or intensity [7]. In the chronic exercise literature, stimulus characteristics of exercise that mediate or moderate psychological outcomes vary among studies, which possibly indicates the presence of placebo effects. However, a majority of exercise training studies have been designed in a way that does not permit a valid assessment of the placebo effect.

Prior conclusions about the psychological benefits of exercise training have been based on hundreds of studies designed with two groups: an exercise training group (the intervention) and a minimal treatment group (the control). This design fails to assess and control for the concomitant non-exercise, psychosocial context variables (e.g., expectations, conditioning, social interactions) that could account for some or all of the psychological benefits currently attributed to exercise training. Few randomized trials have been designed to assess the magnitude of the placebo effect in exercise or medical treatments by including an intervention, placebo, and control group. In studies that have used placebos, the placebo effect has been substantial for some subjective outcomes such as pain intensity [8–10]. The addition of a placebo group in exercise training studies could allow researchers to assess the extent to which non-exercise factors contribute to the psychological consequences of an exercise intervention and the extent to which psychological benefits are caused by exercise training per se [11, 12].

Some authors have argued that “the idea of a placebo group in exercise studies is, in practice, impossible” [13], but others have emphasized the need to incorporate placebos into exercise training studies, including using research designs that would elucidate the mechanisms that underlie placebo effects [7]. Exercise training interventions often include questionnaires or other measures of psychological outcomes that may be especially open to biases introduced by demand characteristics [14] and experimental artifacts such as the placebo effect [15, 16]. Double-blind designs are not possible in exercise training studies [17]; however, placebo groups have at times been used to assess the placebo effect.

The placebo effect has received increased attention as a plausible mediator of psychological outcomes of acute exercise [7] but has yet to be quantified in studies that examined psychological outcomes of exercise training. The aim of this investigation was to estimate the magnitude of the population placebo effect in psychological outcomes from placebo conditions used in exercise training studies and compare it with the effect of exercise training. The psychological outcomes that could be included were anxiety [18–26], cognitive performance [27–32], depression [21, 26, 33–46], and energy and fatigue [36, 37, 47–61]. Potential effect size moderators were also considered. Based on prior research, we hypothesized that placebo effects would be (1) present in both subjective and objective outcomes [1, 62], (2) larger with greater exposure to the placebo condition [63], and (3) moderated by placebo type [64].

2 Methods

The present meta-analysis is reported in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [65].

2.1 Search Methods for Identification of Studies

We searched for relevant articles using Google Scholar, MEDLINE, PsycINFO, and The Cochrane Library. The search was organized to locate articles that contained one of the following words: ‘anxiety’, ‘cognition’, ‘depression’, ‘energy’, ‘fatigue’, or ‘pain’ AND the exact phrase ‘exercise training’ or ‘chronic exercise’ AND at least one of the following words or phrases: ‘randomized controlled trial’, ‘placebo group’, ‘placebo control’, ‘expectancy’, ‘expectation’. In addition, we manually searched the reference lists of any relevant meta-analysis, systematic review, or narrative review. We concluded our search by extending an email request to current field experts for information about relevant randomized controlled trials that we did not identify. No additional articles were retrieved. The search produced 696 full-text articles; 687 were excluded. Figure 1 provides a flow chart of study selection.

Fig. 1 Flow chart of study selection

2.2 Study Selection

Studies were included in the analysis when the following criteria were met: (1) English language, (2) designed as a randomized trial with participants allocated to an exercise treatment arm, a control arm, and an arm that we or the authors classified as a placebo, (3) the treatment group engaged in at least 4 weeks of exercise training, (4) exercise training was not included as an adjuvant to another treatment, (5) the placebo group received an inert intervention for the outcome measure being reported, and (6) outcome data were reported for anxiety, cognitive performance, depression, energy, or fatigue.

The operational definitions for placebo interventions are inconsistent [9, 10, 66–71]. Here, a placebo intervention was defined as an intervention that was not generally recognized as efficacious, that lacked adequate evidence for efficacy, and that has no direct pharmacological, biochemical, or physical mechanism of action according to the current standard of knowledge [72]. Physical activity that involved small muscle groups (e.g., hand or facial movements) or very low intensities (e.g., ≤40 % peak) was classified as an exercise placebo condition. Convincing evidence that documents that psychological improvements result from these types of exercise stimuli is lacking. Some past meta-analyses have considered any intervention defined as a placebo by the study authors to be a valid placebo [9], but that approach was flawed for the present analysis. Instead, we verified that placebo groups did not receive an intervention that was later discovered to be efficacious for the outcome being studied. For instance, two early studies assigned participants to a ‘placebo’ strength and flexibility program [73, 74]. However, those studies were not included in the present analysis since it has been determined that resistance exercise can influence certain psychological outcomes [75].

We took into consideration the fact that some experimental conditions were both a treatment and a placebo depending on the outcome being measured. For instance, Roth and Holmes [76] assigned participants to an aerobic exercise program or progressive muscle relaxation training and measured anxiety and depression. In this case, progressive muscle relaxation training was considered to be a treatment for symptoms of anxiety. However, progressive muscle relaxation is not recognized as a treatment for depression and was considered to be a placebo for that outcome.

A similar approach was taken with a study that focused on participants with severe knee osteoarthritis. Williamson et al. [77] used a three-arm trial to compare the effects of physiotherapy, acupuncture, and no treatment on pain symptoms. Acupuncture is therapeutic for knee pain symptoms [78]. However, anxiety and depression symptoms were included as secondary measures, and we considered acupuncture to be a placebo for those outcomes because there is insufficient evidence to conclude that acupuncture is effective for reducing symptoms of anxiety or depression in the groups studied in the present meta-analysis [79, 80].

We operationally defined an exercise treatment group as any study allocation in which participants engaged in “planned, structured and repetitive bodily movement performed to improve or maintain one or more components of physical fitness” [81]. Wait-list, usual-care, and no treatment groups were classified as control groups. In some instances, the authors of a given study reported using a control group that for our purposes was considered to be a placebo intervention. For example, one study of elderly adults with mild cognitive impairment, in which cognitive performance was the outcome, allocated part of a usual-care control group to receive social visits [82]. Based on our definition, participants who received social visits were classified as a placebo group and those who did not receive social visits were classified as the control group.

2.3 Data Synthesis and Analysis

Two of the authors (JBL and PJO) independently extracted means and standard deviations. The original agreement between authors yielded an intra-class correlation of 0.96, and the discrepancies were resolved. Three instances occurred in which means and standard deviations were not explicitly reported. In one circumstance, between-group mean differences were reported in lieu of post-test means [83]. The corresponding author was contacted and post-test means were obtained. In another case, the authors did not provide pre- or post-test means and standard deviations [84]. In this situation, we estimated means from a figure and used normative data to approximate standard deviations [85]. In the third case, standard errors (SEs) were reported [86], which we converted to standard deviations by multiplying the SE by the square root of the sample size.
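The standard error conversion described above can be sketched as follows. This is a minimal illustration, assuming the reported SE is the standard error of a group mean; the function name and the example values are hypothetical and are not taken from any included study.

```python
import math


def se_to_sd(se: float, n: int) -> float:
    """Convert a standard error of the mean to a standard deviation.

    Uses SD = SE * sqrt(n), where n is the group sample size.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if se < 0:
        raise ValueError("standard error cannot be negative")
    return se * math.sqrt(n)


# Hypothetical example: SE = 1.5 in a group of 25 participants.
example_sd = se_to_sd(se=1.5, n=25)
```

The conversion applies per group, so each arm's SE would be converted with that arm's own sample size.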

2.4 Effect Size Calculation

We calculated between-group effect sizes to control for potential confounding threats to internal validity (e.g., regression to the mean) that remain unaccounted for when solely within-group effect sizes are used [11, 12]. Two between-group effect sizes were calculated: (1) placebo compared with control and (2) exercise compared with control. We considered the placebo–control comparison to represent the placebo effect and the exercise–control comparison to represent the observed effect of exercise.

One study included two placebo groups [82]. In this case, the number of extracted effects for the placebo versus control comparison (k = 18) was double the number of effects for the exercise versus control comparison (k = 9). This resulted in a higher total number of effects in the placebo versus control comparison (k = 50) than in the exercise versus control comparison (k = 41).

Data from the exercise, placebo, and control groups were entered into the following formula: $g = (\Delta M_{\text{treatment}} - \Delta M_{\text{control}}) / S_{\text{pooled}}$, where $g$ is the magnitude of the effect size, $\Delta M_{\text{treatment}}$ is the mean change of the intervention group, $\Delta M_{\text{control}}$ is the mean change of the control group, and $S_{\text{pooled}}$ is the pooled standard deviation [87]. These effects were then converted to Hedges’ d to correct for small-sample bias using Hedges and Olkin’s small sample size adjustment [87]. Effects were coded so that positive values represented an improvement in psychological outcomes.
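The effect size calculation above can be illustrated with a short sketch. It rests on assumptions that the text does not spell out: the pooled standard deviation is computed from the two groups' standard deviations weighted by degrees of freedom, and the small-sample adjustment is the common approximation 1 − 3/(4m − 1) with m = n1 + n2 − 2. The function names and the `higher_is_better` flag are hypothetical.

```python
import math


def pooled_sd(sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Pooled standard deviation of two groups, weighted by degrees of freedom."""
    return math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )


def small_sample_correction(n_t: int, n_c: int) -> float:
    """Approximate Hedges and Olkin correction factor: 1 - 3 / (4m - 1)."""
    m = n_t + n_c - 2
    return 1.0 - 3.0 / (4.0 * m - 1.0)


def hedges_d(change_t: float, change_c: float,
             sd_t: float, n_t: int, sd_c: float, n_c: int,
             higher_is_better: bool = True) -> float:
    """Bias-corrected standardized difference in mean change.

    change_t and change_c are mean changes in the treatment (exercise or
    placebo) and control groups. The sign is flipped for outcomes where a
    lower score is an improvement, so positive always means improvement.
    """
    g = (change_t - change_c) / pooled_sd(sd_t, n_t, sd_c, n_c)
    if not higher_is_better:
        g = -g
    return small_sample_correction(n_t, n_c) * g


# Hypothetical depression scores (lower is better), 20 participants per arm.
d_example = hedges_d(-4.0, -1.0, 6.0, 20, 6.0, 20, higher_is_better=False)
```

With these made-up inputs the uncorrected effect is 0.5 and the correction factor is 1 − 3/151, so the corrected value is slightly below 0.5.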

2.5 Aggregation of Effects

Effects were weighted using the inverse variance method and aggregated using a random effects model [88]. Mean effect size macros (MeanES and METAREG) [88] and SPSS version 21 (SPSS IBM, Armonk, NY, USA) were used to estimate mean effects and test for significant effect moderators. The number of effects needed to overturn the result (N+) was calculated [89].
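Inverse-variance weighting under a random effects model can be sketched as below. This is an illustration only: it assumes the DerSimonian and Laird method-of-moments estimate of between-effect variance, which may differ from the estimator used by the macros cited above, and every name in it is hypothetical.

```python
import math
from typing import NamedTuple, Sequence


class PooledEffect(NamedTuple):
    mean: float
    se: float
    ci_low: float
    ci_high: float
    z: float
    tau2: float


def random_effects_mean(effects: Sequence[float],
                        variances: Sequence[float]) -> PooledEffect:
    """Random-effects mean with inverse-variance weights.

    tau2 is the method-of-moments (DerSimonian-Laird) estimate of the
    between-effect variance, truncated at zero.
    """
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need at least two effects with matching variances")

    # Fixed-effect weights and mean, needed for the Q statistic.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments between-effect variance.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_ws = sum(w_star)
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_ws
    se = math.sqrt(1.0 / sum_ws)
    return PooledEffect(mean, se, mean - 1.96 * se, mean + 1.96 * se,
                        mean / se, tau2)


# Hypothetical inputs: two effects with equal sampling variance.
pooled = random_effects_mean([0.0, 0.4], [0.01, 0.01])
```

In this toy case the between-effect variance estimate is 0.07, which widens the standard error of the mean from about 0.07 under a fixed-effect model to 0.2.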

In most studies, multiple outcome measures or repeated measurements across time yielded nested effects within studies (median of five to six effects per study), which might systematically differ from each other. Hence, a multi-level model with robust maximum likelihood estimation was used to adjust for between-study variance and correlated effects within studies [90] according to standard procedures [91, 92] using Mplus 7.11 (Los Angeles, CA, USA) [90]. Parameters and their errors were estimated with clustering on study using the Huber–White sandwich estimator to calculate SEs that are robust to heteroscedasticity and correlated effects [93–95]. The effect of moderators in the multi-level nested model was tested by comparing the conditional model (which included the intercept and the moderator) with the unconditional intercept-only model using a likelihood ratio test and the adjusted Bayesian Information Criterion (BIC) [92].
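The idea behind clustering on study can be illustrated with a simplified sandwich estimator for an intercept-only weighted mean. This sketch is not the Mplus robust maximum likelihood procedure described above: it assumes fixed weights, applies no small-sample adjustment, and uses hypothetical names and data. It only shows why effects nested within the same study tend to widen the standard error.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence, Tuple


def cluster_robust_mean(effects: Sequence[float],
                        weights: Sequence[float],
                        study_ids: Sequence[Hashable]) -> Tuple[float, float]:
    """Weighted mean with a cluster-robust (sandwich) standard error.

    Weighted residuals are summed within each study before squaring, so
    correlated effects from one study are not treated as independent.
    """
    if not (len(effects) == len(weights) == len(study_ids)):
        raise ValueError("inputs must have the same length")
    sum_w = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Sum of weighted residuals per study (the "meat" of the sandwich).
    score_by_study = defaultdict(float)
    for y, w, study in zip(effects, weights, study_ids):
        score_by_study[study] += w * (y - mean)

    robust_var = sum(s ** 2 for s in score_by_study.values()) / sum_w ** 2
    return mean, math.sqrt(robust_var)


# Hypothetical data: four effects nested in two studies, equal weights.
effects = [0.1, 0.3, 0.5, 0.7]
weights = [1.0, 1.0, 1.0, 1.0]
clustered = cluster_robust_mean(effects, weights, ["a", "a", "b", "b"])
unclustered = cluster_robust_mean(effects, weights, [1, 2, 3, 4])
```

Both calls return the same mean, but the clustered standard error is larger because effects within each hypothetical study deviate from the mean in the same direction.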

2.6 Potential Primary Moderators of Placebo Interventions

To determine whether aggregated effect sizes varied according to heterogeneity between studies and to gain a better understanding of the literature [96], characteristics of each investigation were recorded in a spreadsheet and coded for moderator analysis. Potential moderators of placebo effects were selected to describe characteristics of a given study that could contribute to a heterogeneous effect size. The moderators were selected based on relevant literature [8–10] and availability of data. The three selected primary moderators were outcome type, total minutes of exposure to the placebo intervention, and the type of placebo intervention.

2.7 Potential Secondary Moderators of Placebo Interventions

A list of secondary moderators was generated for descriptive purposes. Secondary moderators of placebo included the following: placebo intervention session duration, daily frequency of exposure to the placebo intervention, placebo intervention program length, whether placebos were administered in a group or individual setting, blinding of test administrators to group allocation, type of control comparison, whether intent-to-treat analysis was used, whether the participant samples were clinical patients, whether the study outcome was reported as primary or secondary, the year of publication, and the geographic location of the study.

2.8 Potential Primary Moderators of Exercise Training

Moderators for exercise training were selected based on their empirical relevance to the psychological effects of chronic exercise and availability of data. Primary moderators included outcome type, total minutes of exposure to the exercise intervention, and the type of exercise intervention.

2.9 Potential Secondary Moderators of Exercise Training

Secondary moderators included the following: exercise intervention duration, frequency of the exercise intervention, exercise intervention program length, the presence or absence of supervision, whether exercise took place in a group or individual setting, blinding of test administrators to group allocation, type of control comparison, whether intent-to-treat analysis was used, whether the participant samples were clinical patients, whether the study outcome was reported as primary or secondary, and the geographic location of the study.

3 Results

3.1 Preliminary and Descriptive Results

Because this review focused on examining group differences at the conclusion of each intervention, only data that were reported immediately following the study (≤1 week) were used in the final analysis. Thus, 25 effects from four studies [76, 77, 82, 83] were excluded from the mean effect size calculation due to the length of the follow-up period (5 weeks to 4 months post-treatment).

A total of 50 effects from nine studies were included in the final placebo versus control mean effect size calculations [76, 77, 82–84, 86, 97–99]. The total number of effects for anxiety, cognitive performance, depression, energy, and fatigue were 1, 35, 9, 1, and 4, respectively.

A total of 41 effects were included in the exercise versus control mean effect size calculation. The numbers of effects for anxiety, cognitive performance, depression, energy, and fatigue were 1, 26, 9, 1, and 4, respectively.

Table 1 provides a description of the included studies. Tables 2 and 3 provide information about the subjective and objective outcomes, respectively. Table 4 provides a description of placebo intervention characteristics and Table 5 provides a description of exercise intervention characteristics. Table 6 provides a description of selected methodological features of the included studies and Tables 7 and 8 provide a list of univariate analyses of moderators for placebo versus control and exercise versus control analyses, respectively.

Table 1 Description of studies
Table 2 Information about subjective outcomes
Table 3 Information about cognitive outcomes
Table 4 Description of placebo interventions
Table 5 Description of exercise interventions
Table 6 Selected methodological features of included studies
Table 7 Summary of univariate moderator analysis for placebo–control comparison
Table 8 Summary of univariate moderator analysis for exercise–control comparison

3.2 Primary Results

3.2.1 Placebo-Control

The unadjusted mean effect size ∆ for placebo compared with control was 0.12 (95 % confidence interval [CI] 0.03, 0.21; z = 2.50; p = 0.01). The distribution of effects, which ranged from −0.81 to 1.16, was positively skewed (0.50, SE = 0.34) and leptokurtic (1.59, SE = 0.66). Of the placebo–control effect sizes, 56 % (28 of 50) were greater than 0, favoring a psychological improvement after the placebo intervention. The fail-safe number of effects was n = 33. Examination of the forest (Fig. 2) and funnel plots (Fig. 3) showed no evidence of publication bias. Egger’s test for bias was not significant, t(1,48) = −0.27, p = 0.79. In the multi-level, intercept-only model (χ² (2) = 135.1, BIC = 137), the mean was 0.20 (95 % CI −0.02, 0.41) with non-significant variance between effects (0.045, SE = 0.035, z = 1.29, p = 0.198).
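Egger’s test is usually run as an ordinary least squares regression of the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error), with the intercept tested against zero on k − 2 degrees of freedom. The sketch below assumes that standard form; the text does not state which variant was used, and the function name and data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def egger_test(effects: Sequence[float],
               standard_errors: Sequence[float]) -> Tuple[float, float, int]:
    """Return (intercept, t statistic, degrees of freedom) for Egger's test.

    An intercept far from zero suggests funnel plot asymmetry.
    """
    k = len(effects)
    if k != len(standard_errors) or k < 3:
        raise ValueError("need at least three effects with matching SEs")

    precision = [1.0 / se for se in standard_errors]
    standardized = [y / se for y, se in zip(effects, standard_errors)]

    x_bar = sum(precision) / k
    z_bar = sum(standardized) / k
    sxx = sum((x - x_bar) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((x - x_bar) * (z - z_bar)
              for x, z in zip(precision, standardized))

    slope = sxy / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and the standard error of the intercept.
    sse = sum((z - (intercept + slope * x)) ** 2
              for x, z in zip(precision, standardized))
    df = k - 2
    se_intercept = math.sqrt(sse / df * (1.0 / k + x_bar ** 2 / sxx))
    if se_intercept == 0:
        raise ValueError("perfect fit: t statistic is undefined")
    return intercept, intercept / se_intercept, df


# Hypothetical effects and standard errors (not data from this review).
intercept, t_stat, df = egger_test([1.0, 0.5, 2 / 3, 0.5],
                                   [1.0, 0.5, 1 / 3, 0.25])
```

With 50 effects this form gives 48 degrees of freedom, which matches the degrees of freedom reported for the placebo–control comparison.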

Fig. 2 Forest plot of Hedges’ d effect size for placebo compared with control (k = 50). Positive values favor placebo, and negative values favor control. Each row represents an individual effect that was extracted from a given study. The broken vertical line represents the mean effect size prior to adjusting for nesting effects. CI confidence interval, k number of effects

Fig. 3 Funnel plot of Hedges’ d effect size for placebo–control versus study standard error

3.2.2 Primary Moderator Analysis

The overall meta-regression model was significant (Q_R [df = 3] = 19.36; p = 0.0002; R² = 0.39; Q_E [df = 44] = 29.81; p = 0.95). Outcome type (β = 0.19; z = 3.65; p = 0.0003) significantly contributed to the total variation of the effect of placebo interventions on psychological outcomes. Effects were larger when subjective outcomes were measured (∆ = 0.31; 95 % CI 0.12, 0.42) compared with objective outcomes (∆ = −0.02; 95 % CI −0.15, 0.10).

In the multi-level model, outcome type (β = 0.193, SE = 0.050, z = 3.85, p < 0.001) and placebo type (β = −0.139, SE = 0.063, z = 2.21, p = 0.027) improved model fit (χ² (4) = 119.8, BIC = 123) compared with the intercept-only model (∆χ² (2) = −356.8, p < 0.001). There was zero residual variance (z = 0.06, p = 0.956), indicating that all of the variance between effects was explained by these two moderators.

3.2.3 Exercise-Control

The unadjusted mean effect size ∆ for exercise compared with control was 0.23 (95 % CI 0.12, 0.34; z = 4.07; p = 0.0001). The distribution of effects, which ranged from −0.58 to 1.73, was positively skewed (1.01, SE = 0.37) and leptokurtic (2.44, SE = 0.72). Of the exercise–control effect sizes, 68 % (28 of 41) were greater than 0, favoring an improvement in psychological outcomes after exercise. The fail-safe number of effects was n = 161. Examination of the forest (Fig. 4) and funnel plots (Fig. 5) showed no evidence of publication bias. Egger’s test for bias was not significant, t(1,39) = 1.975, p = 0.06. In the multi-level, intercept-only model (χ² (2) = 124.5, BIC = 126), the mean effect size was 0.37 (95 % CI 0.11, 0.63), with non-significant variance between effects (0.071, SE = 0.048, z = 1.48, p = 0.140).

Fig. 4 Forest plot of Hedges’ d effect size for exercise compared with control (k = 41). Positive values favor exercise, and negative values favor control. Each row represents an individual effect that was extracted from a given study. The broken vertical line represents the mean effect size prior to adjusting for nesting effects. CI confidence interval, k number of effects

Fig. 5 Funnel plot of Hedges’ d effect size for exercise–control versus study standard error

3.2.4 Primary Moderator Analysis

The overall meta-regression model was significant (Q_R [df = 3] = 20.17; p = 0.0002; R² = 0.34; Q_E [df = 37] = 38.33; p = 0.41). Outcome type (β = 0.17, SE = 0.05, z = 3.25, p = 0.001) and exercise type (β = −0.20, SE = 0.08, z = −2.39, p = 0.017) were related to effect size. Effects were larger when (1) subjective outcomes were measured (∆ = 0.47; 95 % CI 0.28, 0.67) compared with objective outcomes (∆ = 0.08; 95 % CI −0.03, 0.19) and (2) when combined exercise interventions were used (∆ = 0.52; 95 % CI 0.31, 0.72) compared with interventions that used resistance exercise (∆ = 0.08; 95 % CI −0.04, 0.42) or walking exercise (∆ = 0.30; 95 % CI 0.05, 0.54).

In the multi-level model, outcome type (β = 0.143, SE = 0.058, z = 2.45, p = 0.014) and exercise type (β = −0.135, SE = 0.050, z = 2.70, p = 0.007) improved model fit (χ² (4) = 117.2, BIC = 120) compared with the intercept-only model (∆χ² (2) = −21.08, p < 0.001). There was zero residual variance (z = 0.10, p = 0.923), indicating that all of the variance between effects was explained by these two moderators (i.e., variance in the conditional model including study duration/variance in the intercept-only model).

4 Discussion

This appears to be the first meta-analytic review of the placebo effect associated with psychological outcomes of exercise training. Below, the results are integrated into the literature to the extent possible given the paucity of directly relevant literature. Limitations of the current literature are outlined and suggestions for future research are discussed.

4.1 Observed Effect of Exercise Training

After adjusting for nesting effects, the magnitude of the observed effect of exercise training on psychological outcomes (∆ = 0.37) was comparable to the findings of previous meta-analyses that quantified the psychological consequences of exercise training; for example, the ∆ = 0.34 effect size for cancer-related fatigue [57] and the effect size of ∆ = 0.29 and ∆ = 0.30 for anxiety and depressive symptoms in adults with a chronic illness [20, 38].

4.1.1 Primary Moderators of the Observed Effect of Exercise Training

With regard to moderators of the exercise training effects, the influence of outcome type was consistent with prior research. Larger effects have been reported for subjective outcomes, such as anxiety [25], depression [39], and energy/fatigue [58], compared with objective cognitive performance [32]. We also found that exercise mode significantly moderated the effect of exercise training on psychological outcomes, which is consistent with one previous review [42]. Overall, these findings imply that the nine studies included in this meta-analysis were not unusual trials or likely to be outliers in the literature.

4.2 Placebo Effect

After adjusting for nesting effects, the magnitude of the mean population placebo effect in psychological outcomes from placebo conditions used in exercise training studies was estimated to be ∆ = 0.20. A key novel finding of this investigation is that the mean placebo effect was about half of the observed psychological benefits of exercise training. Put another way, when the placebo effect (∆ = 0.20) was subtracted from the observed effect of exercise (∆ = 0.37), the true effect of exercise training on psychological outcomes was estimated to be ∆ = 0.17. Therefore, the true effect of exercise training per se on psychological outcomes is likely to be substantially smaller than that suggested in previous reviews that have ignored the potential placebo effect [19, 28, 34].
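Written out, the subtraction and the "about half" proportion are as follows. The subscript labels are introduced here for illustration, and the calculation assumes, as the subtraction itself does, that placebo and exercise effects combine additively.

```latex
\[
\Delta_{\text{true}}
  = \Delta_{\text{exercise}} - \Delta_{\text{placebo}}
  = 0.37 - 0.20
  = 0.17,
\qquad
\frac{\Delta_{\text{placebo}}}{\Delta_{\text{exercise}}}
  = \frac{0.20}{0.37}
  \approx 0.54 .
\]
```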

Other authors have explicitly recognized that exercise training effects are attenuated when comparisons are made with placebo groups [58, 100]. These observations combined with the present findings underscore the usefulness of including both control and placebo groups in experimental designs aimed at understanding the true effect of exercise on psychological outcomes. Previous studies that restricted participant allocation to an experimental and control group presumably show effects attributed to exercise training that may well be confounded by non-exercise variables that are part of the psychosocial situation in which exercise takes place, including variables such as participant expectations, the predominant hypothesized mechanism of the placebo response [101, 102].

It is unclear why the placebo effect became non-significant in the multi-level model, but controlling for nesting effects revealed that some placebo treatments had a negative effect on psychological outcomes, which resulted in a wider range of the CI around the mean effect. Most of the nested effects came from a study that examined the influence of exercise training on objective measures of cognitive performance, and more studies would have enhanced the statistical power of this analysis.

4.2.1 Primary Moderators of the Placebo Effect

In the unadjusted meta-regression model, outcome type significantly contributed to the total variation in the placebo effect. After adjusting for nesting effects, placebo type also became significant. The discrepancy between the original model and the multi-level model suggests that the contribution of placebo type to the regression model was influenced by nesting effects.

4.2.2 Outcome Type

The placebo effect was substantially moderated by outcome type before and after controlling for nesting effects. Outcomes that were classified as subjective (e.g., anxiety, depression, energy, fatigue) showed larger effects than did objective outcomes (e.g., performance on cognitive tests). Therefore, studies that measure the impact of exercise on subjective psychological outcomes appear to be especially susceptible to placebo effects. While this seems plausible, others have found similar sized placebo effects when comparing subjective and objective outcomes [103, 104]. No studies yet have addressed this question using subjective and objective measures calibrated to be equally sensitive to change with exercise. It would have been useful to determine whether the size of the placebo effect varied between the different types of subjective outcomes or domains of cognition, but there were not enough data to conduct a meaningful analysis.

4.2.3 Placebo Type

The moderating role of placebo type found in this analysis is consistent with other literature [66, 105]. The size of the placebo effect varied depending on the classification of placebo type. The largest effects were shown for placebos that were classified as ‘very low-intensity exercise’. The very low-intensity exercise placebo conditions included here appear to have involved trivial increases in metabolic rate, much less than in typical exercise training programs. Few experiments have been designed to provide clear information as to whether there are dose–response relationships between exercise and psychological outcomes; perhaps the least is known about the minimum exercise dose needed to reliably produce psychological benefits. So, it is possible that these ‘exercise placebos’ represented exercise and had therapeutic effects. However, evidence is not compelling that the physical activity doses used were enough to produce true psychological benefits caused by the exercise per se.

4.3 Limitations

While the present investigation presents novel information potentially useful to the field, it also has several limitations. Only nine studies met the inclusion criteria, which reduced the statistical power and generalizability of the findings. Fewer than 2 % of the studies that were assessed for eligibility were placebo-controlled experiments with a concomitant control group. Moreover, the most recent study to meet the inclusion criteria was published in 2009, which is 4 years prior to the start of the literature search [99]. Potential reasons for the infrequent use of placebos in exercise training studies include ethical concerns [62], the lack of consensus as to what constitutes an appropriate placebo condition for exercise, and the inability to include a placebo group due to insufficient resources. The type of placebo groups that were used varied substantially; this was illustrated by the fact that no two studies used identical placebo conditions. Our analysis suggested that ‘very low-intensity exercise’ placebos showed the largest placebo effects, but only two studies used placebo interventions that could be categorized in that way.

Expectations about treatment outcomes are considered to be a likely antecedent to placebo responses [106], and a small body of evidence suggests that expectations mediate the effect of exercise on certain psychological outcomes [107]. Only two of the nine studies reviewed here measured participant expectations. This limited our ability to test for a moderating effect of expectations or interactions between expectations and other moderators (e.g., outcome type, exercise mode, placebo type, and total minutes of exposure to the exercise or placebo intervention).

Adherence to placebo and exercise interventions was inconsistently reported but could have moderated the psychological outcomes. It would be useful to know how adherence to a placebo or exercise intervention relates to both initial and subsequent expectations. A recent correlational study found that initial expectations predicted adherence to a 2-week walking program [108]. Therefore, it is plausible that participants with higher expectations are more likely to adhere to a study protocol than those who do not expect a benefit from exercise training. This worthwhile idea has yet to be explored in an experimental setting.

Inadequate reporting of the methods and interventions in a majority of the included studies limited our ability to evaluate the features of an exercise program that are important for psychological health. For instance, few studies reported the intensity of the exercise program [76, 83], adherence rate to the study protocol [83, 86], whether an intent-to-treat analysis was used [77, 83, 86], or whether test administrators were blind to group allocation [77, 82].

The generalizability of this meta-analysis is limited by the characteristics of the samples that were included. A majority of the studies included in the analysis focused on older adults [77, 82, 97–99], two studies were based on middle-aged adults [83, 86], and the remaining studies recruited college students [76, 84]. It is possible that age moderates the magnitude of placebo effects in intervention outcomes, but this has not been reported in previous literature and was not examined here due to the low number of studies that could be included.

4.4 Future Research

The present findings are consistent with the robust placebo effects reported in research and clinical settings. The results reported here suggest that placebo effects play a substantial role in the psychological outcomes of exercise training. Ultimately, placebo effects might be important in elucidating the psychological benefits of regular physical activity from the perspectives of explanatory mechanisms and the optimization of the benefits in clinical settings. However, there are numerous gaps that need to be addressed in order for placebo effects that may accompany exercise training to be fully understood.

It is recommended that future studies include placebo groups in randomized controlled trials that are designed to examine the psychological effects of exercise. Conditions that resemble some aspects of very low-intensity exercise include equipment that moves the limbs of an individual (passive exercise) [109], low-intensity electrical stimulation of muscle [110], hypnotic suggestion of exercise [111], and imagery of exercise [112]. Choices regarding the characteristics of an exercise placebo could be tailored to the psychological outcome being investigated. It is important to ensure that an ‘exercise placebo’ is truly inert and has not been proven to be an effective therapy, but is also administered in a psychosocial context that is believable to a participant; otherwise expectations may not be influenced [62]. Exploratory research is needed that evaluates the magnitude of the treatment and expectancy responses to different types of potential placebo conditions (e.g., passive exercise, low-intensity electrical stimulation, hypnotic suggestion of exercise, imagery of exercise, social contact, relaxation, sugar pills, sham ultrasound, sham acupuncture) compared with exercise training and no-treatment control groups. Such work would help determine the advantages and disadvantages of various approaches to placebo groups in exercise training studies that focus on psychological outcomes.

It is critical that future randomized controlled trials use more rigorous methods and carefully report each detail of an exercise program to limit experimental error and improve the ability to replicate and understand findings. One important methodological detail is the measurement of expectations. Past exercise studies have used a variety of methods for measuring participant expectations [76, 84, 107, 108], which makes between-study comparisons difficult. Questionnaires with psychometric evidence supportive of their validity should be used [113, 114], but this strategy has rarely been adopted in exercise training studies.

Exercise training appears to improve pain symptoms, but this could be largely caused by placebo effects [115–140]. No study with a pain outcome measure met the criteria to be included in the present analysis. One study that was considered included exercise training, control, and placebo groups but did not provide sufficient information to allow extraction of effects [141]. Placebo analgesia is a frequently researched topic in the placebo literature, and the largest placebo effects often are realized with pain outcome measures [64]. Studies with placebo and control groups that focus on the effects of exercise training on pain are needed.

Finally, a randomized controlled trial is a limited research design for studying placebo effects [1]. Due to the inability to blind participants to exercise training, expectations about the intervention are likely to introduce error to the observed effect of exercise [13]. A feasible alternative to the randomized controlled trial could be a between-subjects balanced-placebo design that provides a better controlled estimate of the placebo effect [70, 142]. To date, no studies have attempted to use the balanced-placebo design to study the size of placebo effects in psychological outcomes of exercise interventions.
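
As an illustration only, the logic of a between-subjects balanced-placebo design can be sketched as a 2 × 2 arrangement in which what participants are told is crossed with what they actually receive. The sketch below assumes that layout; the cell labels, the function name, and the change scores are hypothetical and are not taken from any study reviewed here. It shows how an expectancy effect, a treatment effect, and their interaction could be separated from the four cell means.

```python
from statistics import mean

# Hypothetical change scores for illustration only (not data from any study).
# Keys: (what participants are told, what participants actually receive).
cells = {
    ("told_exercise", "got_exercise"): [6.0, 5.0, 7.0],
    ("told_exercise", "got_placebo"): [4.0, 3.0, 5.0],
    ("told_placebo", "got_exercise"): [3.0, 2.0, 4.0],
    ("told_placebo", "got_placebo"): [1.0, 0.0, 2.0],
}


def balanced_placebo_effects(cells):
    """Return main effects and interaction from the four cell means."""
    m = {key: mean(scores) for key, scores in cells.items()}
    te_ge = m[("told_exercise", "got_exercise")]
    te_gp = m[("told_exercise", "got_placebo")]
    tp_ge = m[("told_placebo", "got_exercise")]
    tp_gp = m[("told_placebo", "got_placebo")]

    # Expectancy effect: told exercise vs. told placebo, averaged over receipt.
    expectancy = ((te_ge + te_gp) - (tp_ge + tp_gp)) / 2
    # Treatment effect: got exercise vs. got placebo, averaged over instruction.
    treatment = ((te_ge + tp_ge) - (te_gp + tp_gp)) / 2
    # Interaction: does the treatment effect depend on what was told?
    interaction = (te_ge - te_gp) - (tp_ge - tp_gp)
    return {
        "expectancy": expectancy,
        "treatment": treatment,
        "interaction": interaction,
    }


effects = balanced_placebo_effects(cells)
print(effects)
```

With these made-up numbers the expectancy contrast is larger than the treatment contrast and the interaction is zero; real data would require inferential tests (e.g., a two-way analysis of variance) rather than a comparison of raw means.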

5 Conclusion

Exercise training trials focused on psychological outcomes that include placebo and control groups are urgently needed to augment our understanding of both placebo responses and the true effect attributable to exercise training per se. The small body of studies reviewed here suggests that the effect of exercise on psychological outcomes is considerably smaller after accounting for the placebo effect. Placebo effects appeared to be stronger when subjective outcomes were measured and when ‘very low-intensity exercise’ placebos were used. Researchers should consider using this information to guide their own interpretation of previous and future studies.