This review focuses on heart rate variability biofeedback (HRVB), a method that has become increasingly popular in recent years among psychophysiologically-minded psychotherapists (Kaur et al. 2016; Lehrer 2016, 2018). A growing body of literature has consistently shown that organized variability in heart rate (HR) may be a reasonable index of general health, both physical and emotional (Joyce and Barrett 2019; Kristal-Boneh et al. 1995; McCraty and Shaffer 2015; Perna et al. 2019; Sessa et al. 2018; Young and Benton 2018), and that biofeedback as a method to increase heart rate variability has widespread beneficial effects.

The pattern of heart rate variability is complex, but, in the healthy heart, heart rate variability can be decomposed into a small set of overlapping oscillations. The complexity is organized in that it can be described using a set of nonlinear formulas that generally track control of heart rate by the central nervous system. The autonomic nervous system is the primary controller of these oscillations, which reflect a set of reflexes that help control various body functions. Heart rate variability biofeedback directly affects two of these reflexes: respiratory sinus arrhythmia (RSA) and the baroreflex (BR).

Respiratory Sinus Arrhythmia

RSA is the variation in HR that accompanies breathing, such that HR increases during inhalation and decreases during exhalation. It has an important function in controlling ventilation, such that the amount of blood flowing to the lung can be maximized when the greatest amount of oxygen is in the lung. This relationship can be important for respiratory disease as well as for athletic and mental performance requiring additional oxygen to the muscles and the brain. When breathing and heart rate oscillations are entirely in phase with each other, gas exchange efficiency is maximized (Yasuma and Hayano 2004). Moving the phase angle more out of phase can similarly be helpful to avoid hyperventilation. In everyday activity, heart rate and breathing usually oscillate at a 90° phase angle, such that the peak heart rate tends to occur in the middle of a breath, and gas exchange efficiency is at an intermediate level. When heart rate and breathing are completely out of phase, such that the peak of inhalation occurs during the lowest point in heart rate, gas exchange efficiency is at its lowest. In HRVB, our studies of young healthy people have shown that breathing and heart rate usually oscillate in phase with each other (Vaschillo et al. 2002).

Another important aspect of RSA for psychology is its neural control. It is mediated by the vagus nerve, a major parasympathetic nerve, such that it is stimulated in periods of calmness and relaxation and depressed during periods of stress. The amount of RSA can be quantified by the amplitude of peak-to-trough excursions in heart rate that occur with each breath. The amplitude of changes (in beats per minute) tends to be greater in healthy people than in sick people, in younger people than in older people, and in people who are aerobically more fit. It is smaller in states and traits of anxiety, anger, and depression, and in a host of physical diseases ranging from heart disease to febrile infection. It is quantified in many ways. In addition to average peak-to-trough amplitude, it can be measured by the root mean square of successive differences between adjacent heart periods (the time between adjacent heart beats), the percent of adjacent interbeat intervals (IBIs) differing by 50 ms or more, and by the spectral amplitude in the range of normal respiration rate, between 0.15 and 0.4 Hz, or nine to 24 breaths per minute, a range conventionally described as the “high frequency” range in heart rate variability. When doing HRVB, RSA amplitude increases dramatically (Lehrer et al. 2003, 2004).
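The time-domain indices named above can be illustrated with a minimal sketch. The function names and the sample interbeat intervals are hypothetical, and the 50 ms criterion is applied as "50 ms or more", following the wording used here; published software may define these indices slightly differently.

```python
import math

def rmssd(ibis_ms):
    """Root mean square of successive differences between adjacent IBIs (ms)."""
    diffs = [b - a for a, b in zip(ibis_ms, ibis_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def pnn50(ibis_ms, threshold_ms=50):
    """Percent of adjacent IBI pairs that differ by threshold_ms or more."""
    diffs = [abs(b - a) for a, b in zip(ibis_ms, ibis_ms[1:])]
    return 100.0 * sum(d >= threshold_ms for d in diffs) / len(diffs)

def peak_to_trough_bpm(ibis_ms):
    """Heart rate excursion (beats/min) within one window, e.g. one breath."""
    rates = [60000.0 / ibi for ibi in ibis_ms]  # convert each IBI to beats/min
    return max(rates) - min(rates)

# Hypothetical interbeat intervals (ms) spanning roughly one breath.
ibis = [800, 860, 920, 860, 800, 750]
print(round(rmssd(ibis), 1), pnn50(ibis), round(peak_to_trough_bpm(ibis), 1))
```

The spectral index (power between 0.15 and 0.4 Hz) is omitted from this sketch because it requires resampling the interval series, which adds choices not specified in the text.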

A third aspect of RSA of interest to psychologists is the relationship between RSA and sociability. Porges has pointed out that social animals, like people, dogs, horses, and certain monkeys, have large amounts of RSA. Less sociable animals, like cats and certain nonsocial rodents, have little RSA. It also is related to sociability among people (Doussard-Roosevelt et al. 2003; Porges et al. 1996; Porges and Furman 2011). Among people, those with major social deficits, particularly autism, have very low levels of RSA (Patriquin et al. 2019; Porges et al. 2013). Among married couples, those with good marriages tend to have high levels of RSA while interacting, while those with bad marriages tend to have less. Although little research has been done yet on effects of HRVB on sociability, this is a potential application.

The Baroreflex and the Brain

Another modulatory reflex that is greatly stimulated by HRVB is the BR (Lehrer et al. 2003). The BR sets the conditions for resonant effects of breathing between 4.5 and 6.5 breaths/min, which produce the large changes in RSA. It is a reflex that modulates changes in blood pressure. It is mediated through stretch receptors in the carotid artery and aorta, called baroreceptors. When blood pressure (BP) rises, the walls of these arteries stretch. When the baroreceptors sense an increase in BP, the BR causes an immediate decrease in HR, leading to a subsequent mechanical decrease in BP caused by less blood flowing through the vasculature, with a constant delay of close to five seconds (Eckberg and Sleight 1992), the length of which is apparently caused by the amount of blood in the system, with a longer delay among taller and more muscular people (Vaschillo et al. 2006). The BR helps control changes in BP, and is a modulating force for promoting blood pressure homeostasis. It is controlled through centers in the brain stem, chiefly the nucleus tractus solitarius, which communicates directly with structures in the limbic system and prefrontal cortex that both generate and modulate emotion (Henderson et al. 2004; Mather and Thayer 2018; Rogers et al. 2000; Sakaki et al. 2016; Shoemaker and Goswami 2015; Yoo et al. 2018). As with RSA, the heart rate component of the BR is also under parasympathetic control. BR gain can be quantified as the amount of change in HR that is triggered by each millimeter of mercury change in BP. As with RSA, BR gain also is smaller in various illnesses than in healthy people (Davydov et al. 2018; Haji-Michael et al. 2000; Suzuki et al. 2017; Peckerman et al. 2003) and depressed in various states and traits of negative emotion (Dawood et al. 2008; Vasudev et al. 2011).

Recent findings on effects of HRVB on the brain also show large increases in blood flow oscillations during HRVB throughout areas involved in emotional generation and modulation, particularly the limbic system and the cingulate and prefrontal cortices (Mather 2019; Vaschillo et al. 2019), and there is some evidence for greater connectivity between limbic and prefrontal structures, including evidence for increases in brain tissues in these connectivity pathways, after people have practiced the technique for several weeks. This may be a mechanism whereby HRVB can help modulate emotional swings.

Because of the time delay in the BR system, it tends to produce a rhythm in heart rate with a period of about ten seconds. This rhythm is found in everyone, and has long been identified as the ‘Mayer wave’. The rhythm varies among people in the range of 4.5 to 6.5 cycles per minute (Fuller et al. 2011; Vaschillo et al. 2002, 2006), and correlates highly with what has been termed the “low frequency” spectral range in heart period, between 0.05 and 0.15 Hz (three to nine cycles per minute). The BR system can be considered a “closed loop” system because it is characterized by an internal feedback loop that helps control cardiovascular stability when stimulated from the outside. Any closed-loop negative feedback system with a constant delay has the characteristics of resonance (Grodins 1963; Ringwood and Malpas 2001), and any resonant system stimulated at its resonance frequency produces very large amplitude oscillations at that frequency, recruiting most other oscillations at other frequencies (Ogata 2004). Thus, when breathing at resonance frequency, RSA stimulates the BR. Other forms of stimulation, such as rhythmical muscle tension (Lehrer et al. 2009) and rhythmical exposure to emotional pictures (Vaschillo et al. 2008) can produce a similar effect, although usually with a smaller amplitude of heart rate oscillations.
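Because the rhythms in this section are given in cycles per minute, in Hz, and as periods in seconds, a small conversion sketch may help relate them. The function names are hypothetical; the band limits are the low frequency limits quoted above.

```python
def cycles_per_min_to_hz(cpm):
    """Convert cycles (or breaths) per minute to frequency in Hz."""
    return cpm / 60.0

def period_seconds(cpm):
    """Duration of one cycle in seconds."""
    return 60.0 / cpm

def in_low_frequency_band(cpm, low_hz=0.05, high_hz=0.15):
    """True if the rhythm falls in the 0.05 to 0.15 Hz band quoted above."""
    return low_hz <= cycles_per_min_to_hz(cpm) <= high_hz

for cpm in (4.5, 6.0, 6.5):
    print(cpm, round(cycles_per_min_to_hz(cpm), 3), round(period_seconds(cpm), 1),
          in_low_frequency_band(cpm))
```

On these definitions, six cycles per minute corresponds to 0.1 Hz and a ten-second period, and the whole 4.5 to 6.5 cycles per minute range lies inside the low frequency band.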

HRVB thus creates tremendous increases in vagus nerve activity, with increases in RSA amplitude regularly increasing two to fivefold while people are breathing at their individual resonance frequency, which varies between 4.5 and 6.5 breaths/min (Lehrer et al. 2003, 2004; Vaschillo et al. 2006). This increase is entirely mediated by the vagus nerve, a major parasympathetic nerve. Vagus nerve activation, and parasympathetic activity in general, are characteristics of relaxation and lower levels of stress. Importantly, many of the vagus nerve fibers are afferent, meaning that activity in the vagus nerve affects the brain as well as vice versa, through the pathways described above.

Why HRV Biofeedback?

HRVB directly stimulates various homeostatic ‘negative feedback loops’ (Lehrer and Eddie 2013). In particular, because the baroreflex and RSA are both stimulated by HRVB and parasympathetic activity is increased, there is reason to believe that HRVB should improve emotional regulation (Mather and Thayer 2018; Thayer et al. 2012). Because of the in-phase relationship between heart rate and breathing during HRVB, there is reason to believe that it improves gas exchange efficiency and helps respiratory disease and other breathing disorders. Because it stimulates the BR, there is reason to believe it helps control blood pressure. Because it stimulates the vagus nerve, it might be expected to produce a sense of relaxation and well-being.

In addition to stimulating parasympathetic activity, HRVB directly stimulates a variety of homeostatic reflexes, perhaps a unique characteristic of this intervention, while other accepted methods of stress management directly target other mechanisms. HRVB stimulates an interaction between RSA and the BR, two reflexes with regulatory functions (Lehrer 2013).

Other methods work through other pathways. For example, Jacobson’s method of progressive muscle relaxation, when practiced as he described it, teaches relaxation of the muscles down to the level of underlying muscle tone (Jacobson 1938). Because the muscles are part of the sympathetic nervous system (Di Bona et al. 2019; Mitchell and Victor 1996; Notarius et al. 2015), the direct effect of progressive muscle relaxation is to decrease the level of sympathetic arousal (Cottier et al. 1984; Larkin et al. 1990), with more indirect effects on homeostatic functions, as reflected in stress recovery (English and Baker 1983). Similarly, methods such as hypnosis, cognitive therapy, and meditation focus more on thought processes than on direct physiological control, with a less direct pathway to physiology. It is the unique pathway of effects that prompts this evaluation of the usefulness of HRVB.

A particular symptom targeted by HRVB is hyperventilation. HRVB appears to inoculate people against this tendency in the face of various respiratory stimulants, including altitude, exposure to high levels of ambient carbon dioxide, and stress. Patients with panic disorder, many of whose symptoms are those of hyperventilation, are particularly trained to use the method to abort panic attacks when symptoms first start, and to avoid hyperventilation symptoms when exposed to various panic triggers. Having a reliable method of controlling panic symptoms then becomes a useful tool for decreasing fear of hyperventilatory body symptoms. The mechanism by which hyperventilation is targeted by HRVB has not been proven, although it is reasonable to hypothesize that it involves a combination of slow breathing, decreased emotional and autonomic reactivity, and attention to breathing mechanisms for controlling it. The increased gas exchange efficiency, described above, may also have an effect on modulating respiratory drive.

Perhaps because of these unique effects of HRVB, interest in the method has grown. The number of studies of HRVB published each year grew exponentially between the early 1990s and 2016 (Kaur et al. 2016).

Although not covered in this review, it is important to mention possible side effects of HRVB. These are quite minor in most cases. It is common for people to hyperventilate slightly when first doing slow breathing, where increased depth of breathing overcompensates for the slow pace. In the standard protocol, the trainee is specifically instructed to breathe shallowly, particularly in response to feelings of lightheadedness, which usually is the first hyperventilation symptom to occur. Another possible side effect, of unknown risk, occurs among people with frequent cardiac arrhythmias. In rare cases, individuals with frequent premature ventricular contractions may show an increase in these events, particularly toward the end of exhalation when doing slow breathing. These events may be caused by a buildup of carbon dioxide during a long exhalation. They are easy to detect from a biofeedback heart rate tracing. The cardiac risk of these biofeedback-induced arrhythmic episodes is unknown, so the method should be used with caution among people with this condition, although some people who have continued practicing the method despite this pattern of arrhythmias actually have shown a decrease in the spontaneous occurrence of them.

The Method of HRVB

In HRVB, people are taught to breathe slowly, at the particular rate of the baroreflex rhythm. Because of resonance characteristics of the BR system (Hammer and Saul 2005; Lehrer et al. 2009; van de Vooren et al. 2007; Vaschillo et al. 2002) and the particular phase relationships among HR, BP and breathing when people breathe at the BR frequency (Vaschillo et al. 2002, 2006), a very large increase in the amplitude of HR oscillations occurs when people breathe at the BR frequency, caused by an interaction between RSA and the BR.

In HRVB, people learn through biofeedback to detect the particular frequency at which HRV is maximized for each individual when they breathe at that rate. This can easily be detected by following a simple heart rate monitor. These are increasingly free or low cost, and easy to use (Hunkin et al. 2019). There are various methods for HRVB training, which usually include a combination of paced breathing at various rates in order to determine the rate producing the biggest swings in HR from inhalation to exhalation, and simply following a HR tracing on a computer screen or playing various computer games where displays are proportional to the change in HR with each breath. People are instructed to follow the tracing in order to maximize swings in HR with each breath, which only can be achieved by breathing at their individual resonance frequencies. When people practice HRVB daily over a period of time, amplitudes of HR excursions at both RSA and BR frequencies are increased even when people are not practicing the technique (Lehrer et al. 2003). Thus, these two important regulatory reflexes, RSA and the BR, appear to be strengthened by exercise during biofeedback, with expected effects of improved immunity to and recovery from stress and adaptability to various mental and athletic demands on the system.
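The rate-finding step described above (pace breathing at several rates, then keep the rate with the largest swings in HR) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 1 Hz sampling, the candidate rates, and the synthetic sinusoidal heart rate series are all hypothetical, and clinical protocols typically weigh additional criteria.

```python
import math

def mean_excursion(hr_bpm, samples_per_breath):
    """Average (max - min) heart rate over consecutive breath-length windows."""
    starts = range(0, len(hr_bpm) - samples_per_breath + 1, samples_per_breath)
    windows = [hr_bpm[i:i + samples_per_breath] for i in starts]
    if not windows:
        raise ValueError("recording is shorter than one breath")
    return sum(max(w) - min(w) for w in windows) / len(windows)

def pick_resonance_rate(trials):
    """trials maps breaths/min -> (heart rate series, samples per breath)."""
    return max(trials, key=lambda rate: mean_excursion(*trials[rate]))

def synthetic_hr(amplitude, samples_per_breath, n_breaths, baseline=70.0):
    """Made-up sinusoidal heart rate sampled at 1 Hz (illustration only)."""
    n = samples_per_breath * n_breaths
    return [baseline + amplitude * math.sin(2 * math.pi * k / samples_per_breath)
            for k in range(n)]

# Hypothetical paced-breathing trials: 5, 6, and 7.5 breaths/min.
trials = {
    5.0: (synthetic_hr(5, 12, 6), 12),
    6.0: (synthetic_hr(10, 10, 6), 10),
    7.5: (synthetic_hr(4, 8, 6), 8),
}
print(pick_resonance_rate(trials))
```

Here the made-up 6 breaths/min trial has the largest oscillation amplitude, so it is the rate the sketch selects.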

As will be reviewed below, a number of studies have found that HRVB does, in fact, produce improvement in a variety of physical and emotional conditions including anxiety, depression, hypertension, asthma, and pain, as well as improvement in various kinds of human performance including mental concentration and agility, athletics, dance, and music. The technique is easily learned and can be trained using inexpensive equipment including several free smartphone applications. HRVB has been proposed as a psychotherapy component that specifically targets the neurovegetative components of emotional problems and may improve treatment effectiveness (Caldwell and Steffen 2018; Lehrer 2018; Wheeler 2018). Most people can achieve high-amplitude oscillations in HR after just a few minutes of training, and almost everyone can master the technique within one to four sessions of coaching. After initial training some people still achieve better results by following a heart monitor, while others do just as well doing paced breathing at their resonance frequency, once this frequency has been determined by biofeedback, following the second hand on a clock or counting seconds silently. The exceptions are people with frequent cardiac arrhythmias, such as premature atrial or ventricular contractions, which make it difficult for them to determine their resonance frequency.

To test the hypothesis that HRVB promotes general health and performance, we conducted a systematic review and meta-analysis of all randomized controlled trials of HRVB, including all outcome measures used in all studies, regardless of the target problem or population, and whether the particular outcome measure was closely related to the target problem, e.g., measures of anxiety for a study on treatment of asthma even where baseline levels of anxiety are in the normal range at pre-test. We consider this to be a conservative test because it maximizes the possibility of floor effects on some variables, where little improvement is possible. We excluded HRV variables because the effect sizes would be very high for acute changes during biofeedback (Lehrer et al. 2003, 2004), and because it would be circular to impute higher levels of resilience from higher HRV. Although baseline changes in HRV after treatment vary widely among studies, they are mostly related to age and are unrelated to symptom changes (Lehrer et al. 2006; Wheat 2014). Older people have smaller HRV amplitudes and smaller changes in HRV after HRVB and, in some cases, greater symptom improvement (Alayan et al. 2019; Lehrer et al. 2006). It is possible that the symptom effects reported here may be due to frequent and cumulative application of HRVB to ameliorate acute symptom changes associated with the large changes in HRV, and that neural mechanisms for these effects may differ from peripheral effects on HRVB, perhaps due to age-related effects on the cardiovascular system.

Because resonant effects on heart rate variability tend to occur when the system is stimulated close to the resonance frequency but not at it exactly (Vaschillo et al. 2004), it is possible that simply doing paced breathing at about six breaths per minute would have the same salutary effects as breathing more exactly at resonance frequency. This can easily be taught by following a computer-generated pacing signal or a clock. For greater comfort, some respiratory biofeedback devices provide signals to gradually decrease respiration rate to the desired frequency. It has not yet been definitively established whether HRVB has better clinical effects than simple paced breathing at six breaths per minute. One small study on borderline hypertension found that both methods significantly decreased blood pressure and that, as would be expected from the mechanisms described above, the effect of HRVB was slightly, though nonsignificantly, greater (Lin et al. 2012). In this review we decided to include studies of breathing at approximately six breaths per minute as well as HRVB studies because the effects are so similar, and to compare the effect sizes for the two methods.

Methods

Identification of Studies for Inclusion

A literature search was performed to generate articles for the meta-analysis, with specific search criteria, using the databases CINAHL, Cochrane, PsycINFO, PubMed, Scopus, and Web of Science. The search terms included common HRVB maneuvers and the equipment used to conduct HRVB as well as various descriptors of voluntary control and various outcomes. The complete search criteria are in the supplement to this paper, Table S1.

A search of all published papers and grey literature (unpublished convention papers, dissertations, etc.) through November 15th 2018 generated 1868 papers, of which 1514 were unduplicated. Studies with a 2019 publication date were reviewed from prior convention presentations. At least two of five independent reviewers (KK, AS, KS, RH, and JB) performed a preliminary review of each abstract searching for inclusion criteria. The inclusion criteria were the use of HRVB or paced breathing (PB) at a rate of approximately six breaths/min (bpm, the approximate rate of breathing during HRVB), use of this maneuver for any condition, and consisting of a randomized controlled trial. The reviewers reconciled differences after their independent reviews and eliminated 1272 papers. Where reviewers disagreed, final decisions were made by PL.

The remaining 242 papers were analyzed in a secondary review by the same combination of reviewers, where papers were read in their entirety, additionally screening for the inclusion criteria and for the following exclusion criteria: lack of a treatment goal other than increasing HRV, use of PB at a rate other than six bpm, biofeedback for average heart rate but not HRV, a small sample size (n < 10), confounding effects of HRVB and other methods (e.g., HRVB along with another intervention, compared with a control group), or insufficient usable data for the Comprehensive Meta-Analysis (CMA) program, version 3.3.070 (Borenstein et al. 2009), which we used for all calculations. All of these papers were additionally reviewed by PL. Reasonable efforts were taken to contact the authors of studies with insufficient data so the studies could be included. Studies combining HRVB or PB with other interventions were included where the same additional interventions were given to control groups.

After the secondary review, 185 additional papers were excluded, and a total of 58 studies from 57 papers were entered into the CMA program. Coding of these papers for various mediators was done by three independent reviewers per study, who reconciled differences after coding.
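The screening counts reported in this section are internally consistent, which can be checked with a few lines of arithmetic. The variable names are hypothetical; the numbers are those given in the text.

```python
identified = 1868             # papers generated by the search
unduplicated = 1514           # papers remaining after removing duplicates
excluded_on_abstract = 1272   # eliminated in the preliminary abstract review
excluded_on_full_text = 185   # excluded in the secondary full-text review

duplicates = identified - unduplicated
full_text_reviewed = unduplicated - excluded_on_abstract
papers_included = full_text_reviewed - excluded_on_full_text

print(duplicates, full_text_reviewed, papers_included)
```

The 242 papers entering the secondary review and the 57 included papers both match the counts stated in the text.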

Data Extraction

We coded all outcome measures reported for each study other than heart rate variability measures and process measures (e.g., home practice time, treatment believability, etc.). The components extracted from each study included: study name and year, comparison used (HRVB vs control, PB vs control, HRVB vs PB, etc.), outcome measures, time points at which data were collected (e.g., pre- and post-treatment, follow-up, midpoint, etc.), data format (e.g., pre- and post-treatment means and standard deviations, effect sizes for therapeutic effects of treatment vs. control conditions, or values of F or chi square), sample size in the treatment and control groups, year of publication, type of treatment received (HRVB or PB), the number of weeks spanning the beginning and end points, number of treatment sessions, type of control used (e.g., active or inactive), control description (e.g., standard care, relaxation, cognitive therapy), disorder or target problem studied, description of each outcome measure, type of measure (e.g., self-reported, physiological), whether each particular outcome measure was specifically targeted to the study and the population studied, and measure direction (improvement indicated either by low or high scores). The outcome measures we analyzed are summarized in Table 1 along with ways in which we categorized each, and the types of control groups are in Table 2. The standardization method usually was pre-post standard deviation. Where various outcome statistics were reported, we favored using pretest and posttest means and standard deviations. Where only pre-post difference scores were reported, we calculated g based on these and standardized using the standard deviation of the difference score.
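As an illustration of how a standardized effect size is obtained from group means and standard deviations, the following minimal sketch computes Hedges' g for two independent groups using the usual textbook formulas (pooled standard deviation and small-sample correction). It is an assumption that this simplified two-group form is adequate for illustration; it does not reproduce the CMA program's handling of pre-post data, and the input values are hypothetical.

```python
import math

def hedges_g(m_treat, sd_treat, n_treat, m_ctrl, sd_ctrl, n_ctrl):
    """Return (g, variance of g) for two independent groups."""
    df = n_treat + n_ctrl - 2
    pooled_sd = math.sqrt(
        ((n_treat - 1) * sd_treat ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    d = (m_treat - m_ctrl) / pooled_sd            # Cohen's d
    j = 1 - 3 / (4 * df - 1)                      # small-sample correction factor
    var_d = (n_treat + n_ctrl) / (n_treat * n_ctrl) + d ** 2 / (2 * (n_treat + n_ctrl))
    return j * d, j ** 2 * var_d

# Hypothetical symptom scores where lower is better, so a negative g
# indicates a therapeutic effect.
g, var_g = hedges_g(10.0, 5.0, 20, 12.0, 5.0, 20)
print(round(g, 3), round(var_g, 3))
```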

Table 1 List and categorization of outcome measures
Table 2 Control conditions

Each outcome measure within each study was given a separate entry in the CMA program, such that multiple entries existed for many studies. Follow-up and post-test analyses were given separate entries, but the results of these were averaged in the analyses. Fifty-eight studies generated 360 entries of usable data in the CMA analysis, with 2485 participants across studies. Figure 1 shows the flow of procedures for this study, using PRISMA guidelines (Moher et al. 2009).

Fig. 1 PRISMA flow of study procedures

Statistical Analysis

Hedges’ g, a corrected version of Cohen’s d, was used as an effect size measure, using a random effects model (Borenstein et al. 2010). An average Hedges’ g was calculated across studies, averaging within-study outcome measures and time points, and weighting studies by sample size. When a study had two control groups, the n in the treatment group was divided by two. Follow-up and post-test data were not separated because some time points labeled as ‘follow-up’ were shorter than some intervals labeled as ‘post-test’. Hedges’ g for each study comprised the average of all measures and time points within the study. Where post tests and follow-up measures were reported in separate papers, a single g was calculated for each study across papers. Cohen (1988) suggests that effect sizes of 0.2 be considered small, 0.5 medium, and 0.8 large. A funnel plot was used to detect outlying studies. In this plot the Y axis represents the size of the sample, with smaller variation among studies expected among studies having a larger n. Although sometimes interpreted as having experimental bias, outlying studies also could represent unique characteristics of procedures or study participants, so we examined the outlying studies for unusual characteristics (Borenstein et al. 2009). Because it may be unclear whether the outliers represent bias or unique study characteristics, data are presented both with and without the outliers. We also report the significance of heterogeneity among studies using the Q statistic despite the fact that an analysis with many studies and large sample sizes, such as the current one, may yield a significant Q statistic with small amounts of heterogeneity, rendering the statistic less meaningful. We also assessed the percentage of heterogeneity among studies due to real heterogeneity vs. chance (within-study) variance using the I² statistic (Higgins and Thompson 2002), and calculated the prediction interval as the average g ± 2 × tau, the standard deviation of real effect sizes, to estimate the range of values within which there is 95% confidence that another study would find g (Higgins 2008). We additionally calculated separate effect size comparisons for HRVB/PB compared with control conditions from studies with active and inactive control groups, and for various individual target problems and types of outcome measures in order to examine the effect sizes of treatment vs. control conditions on specific problems. We used meta-regression analysis to examine the effect of treatment length and intensity (number of sessions), whether particular measures were targeted, whether controls were active or inactive, whether treatment was by HRVB or PB, and year of publication. We included the intercept in the model for these analyses. Data were coded such that more negative values, yielding negative g’s, represented a therapeutic effect, with positive values indicating a deterioration in the participant’s condition on that measure. We used a mixed effects analysis for examining dichotomous mediators (e.g., whether or not a particular measure was targeted to the population studied, whether the control group was active or inactive), and computed g for each alternative and computed differences between them using the Q statistic.
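The quantities named in this section (the pooled random-effects g, Q, I², tau, and the prediction interval) can be illustrated with a short sketch. It assumes inverse-variance weights and the DerSimonian-Laird estimate of tau², which is a common approach but is not stated in the text; the function name and the input values are hypothetical. The prediction interval follows the g ± 2 × tau rule described above.

```python
import math

def random_effects_summary(effects, variances):
    """Pool study effect sizes; return g, Q, I2 (%), tau, prediction interval."""
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                         # between-study variance
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    tau = math.sqrt(tau2)
    return {
        "g": pooled,
        "se": math.sqrt(1.0 / sum(w_star)),
        "Q": q,
        "I2": i2,
        "tau": tau,
        "prediction_interval": (pooled - 2 * tau, pooled + 2 * tau),
    }

# Hypothetical study-level effect sizes and variances (negative = therapeutic).
summary = random_effects_summary([-1.0, 0.0], [0.01, 0.01])
print(summary)
```

Under the same g ± 2 × tau rule, a reported prediction interval can be read backwards: its midpoint is the average g and one quarter of its width is tau.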

Results

Figure 2 shows a forest plot of all studies, with individual study statistics shown in Table 3 and summary statistics summarized in Table 4. The average effect size for HRVB/PB vs. control conditions was found to be small to medium (g = 0.37) with significant heterogeneity, considerable heterogeneity and error variance, and a 95% prediction interval between a large effect favoring HRVB and a small effect favoring a control group (g = −1.03 and +0.29). A funnel plot (Fig. 3) shows three outlying studies with greater therapeutic effect than others, studies by Lehrer et al. (2004), Munafo et al. (2016), and Paul and Garg (2012). Although the possibility of bias cannot be ruled out for these outlying studies, each had some unique characteristics that may have contributed to the very high effect sizes (g = 1.9–2.7). Lehrer et al. (2004) used an unusual design, including biweekly adjustment of medication (an outcome variable) and an unusually sensitive measure of pulmonary function, forced oscillation pneumography. Munafo et al. (2016) also used only physiological measures as outcomes, with systolic blood pressure closely related to baroreflex function, which was directly targeted by HRVB. Paul and Garg (2012) used acute measures of basketball performance. Although three additional studies also were slightly beyond the expected limits, one showing a slightly higher effect size than expected and two showing a slightly lower size, we decided not to treat these studies as outliers because they were not influential and did not create significant heterogeneity. When we recomputed meta-analytic statistics without the three outliers, we found a small but still significant effect size without significant heterogeneity (Table 4) and a funnel plot showing no influential outliers (Fig. 4).

Fig. 2 Forest plot of Hedges’s g. Studies with more than one control group are entered separately for each control

Table 3 Individual statistics for all studies (in order of Hedges’s g as in Fig. 2)
Table 4 Mixed effects meta-analyses on all studies
Fig. 3 Funnel plot of all studies

Fig. 4 Funnel plot deleting three outlying studies

When we compared HRVB with inactive control conditions (treatment as usual, sham procedures, etc.) we still found a significant small to medium effect size, both with and without the outliers (Table 5). When compared with paced breathing at a relaxed respiratory rate of about 15 breaths per minute, HRVB/PB (at 6 breaths per minute) was nonsignificantly superior, with a small effect size, g = − 0.26, p < 0.06 (Botha et al. 2015; Breach 2013; Carpenter et al. 2013; Lehrer et al. 2017; Tsai et al. 2015). When compared with all breathing interventions other than resonance frequency breathing, including attention to breathing, counting breaths, and deep breathing, a small but significant effect size was found (Table 5). Two studies using PB, both with inactive controls, found a nonsignificant small to medium effect size, g = 0.38, p = 0.32, and, with greater power, 41 studies using HRVB with inactive controls yielded a similar effect size without outliers, g = 0.33, p < 0.0005, and g = 0.45, p < 0.0005 with outliers. When HRVB was contrasted with effects of all other effective interventions (active interventions), we found little difference (Table 5). Effect sizes are similar for behavioral, physiological, and self-report outcome measures. We found small to medium effect sizes for physiological and self-report measures and a medium but nonsignificant effect size for behavioral performance measures, but no significant differences among these three types of outcome measures (Table 5). When we compared the effects of HRVB/PB on measures that were specific to the target population, the effects do not differ from those on nontargeted measures (Table 5).

Table 5 Effects of HRVB and PB on particular symptoms

For individual problems, effect sizes varied widely between small and medium to large across most disorders or targets. The number of studies for each symptom is low, so some of the results may be unreliable and nonsignificant due to lack of power (Table 6). Irrespective of statistical significance, the highest effect sizes were found for athletic/artistic performance, depression, gastrointestinal problems, anger, anxiety, respiratory disorders (including an outlying study), systolic blood pressure, substance craving, and pain. The lowest effect sizes were for self-reported stress, physical functioning/quality of life, diastolic blood pressure, post traumatic stress, general activation/energy, and sleep. Some effect sizes were slightly higher for measures related to the problems targeted in individual studies than for nontargeted measures. A particularly wide dispersion was found among studies of anxiety and artistic/athletic performance, where a significant effect was found only where outlying studies were included in the analysis. Larger effect sizes were found for anger and gastrointestinal problems, but the effects are not statistically significant due to lack of power.

Table 6 Mixed effects contrasts

Regression analyses for linear moderators are shown in Table 7. We found very small and nonsignificant regression coefficients both for number of treatment sessions (median = 6, range = 1–40) and for number of weeks between pre-treatment and post-treatment assessments (median = 5, range = 0–40). In all cases participants had been encouraged to practice the HRVB/PB techniques between sessions. The meta-regression on year of publication also yielded a nonsignificant regression coefficient.

Table 7 Meta regression analyses
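To make the logic of a moderator analysis concrete, the sketch below regresses study effect sizes on a single continuous moderator (for example, number of sessions) using inverse-variance weights. It is a simplified fixed-effect illustration under stated assumptions, not the model or software behind Table 7: the function name and the data are hypothetical, and a mixed-effects analysis would add a between-study variance term to each weight.

```python
import math


def meta_regression(effects, variances, moderator):
    """Inverse-variance weighted regression of effect size on one moderator.

    Hypothetical helper: returns (intercept, slope, se_slope, z_slope),
    treating the within-study variances as known (fixed-effect assumption).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    # Weighted means of the moderator and the effect sizes.
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Weighted sums of squares and cross-products.
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    sxy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, moderator, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With known variances, the slope's sampling variance is 1 / sxx.
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope, slope / se_slope


# Made-up studies: effect size (g), its variance, and number of sessions.
g_values = [0.30, 0.35, 0.28, 0.33]
g_variances = [0.05, 0.04, 0.06, 0.05]
sessions = [1, 6, 12, 40]
b0, b1, se_b1, z_b1 = meta_regression(g_values, g_variances, sessions)
print(f"slope per session = {b1:.4f} (SE {se_b1:.4f}, z = {z_b1:.2f})")
```

A slope near zero with |z| well below 1.96 would correspond to the pattern described above, in which effect size does not change detectably with the moderator.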

Discussion

The results of this review provide evidence that HRVB and PB at approximately six breaths per minute have positive effects on a variety of physical, behavioral, and cognitive conditions. The overall effect sizes are modest but highly significant, suggesting that these methods may not be sufficient for treating any one problem but may be useful as a complementary intervention. The effect sizes appear to be equivalent to those of other established psychological treatment modalities, although there are not enough studies to evaluate relative superiority to any particular other treatment. These results suggest that HRVB might be a useful addition to the skill sets of clinicians working in a variety of settings, including mental health, behavioral medicine, sports psychology, and education. The method is easy to learn and can easily be used along with other forms of intervention, with rare side effects.

Although the effect sizes are magnified by three outlying studies, they remain significant and small to medium with these studies removed. When this is done, the heterogeneity among studies is small and nonsignificant with a narrow prediction interval, suggesting that the effect size estimate is stable. Effect sizes tend to be similar for targeted and nontargeted measures, suggesting that the method may be as useful for helping various problems in the normal range as for those that generally require special treatment. Effect sizes ranging from small to medium-to-large were found for a variety of individual problems, although here too there are insufficient data for evaluating effects for most specific applications, due to lack of power. The largest numbers of studies are for anxiety and cardiovascular disorders, where the evidence for a significant although small to medium effect is strong. Across problems, the effect size appears greater than that of various placebo interventions or breathing exercises that do not affect the baroreflex system, which is the pathway that appears to mediate modulation of emotion. The differences in effect sizes between HRVB and placebo interventions are not significant, probably due to lack of power, but it seems probable that some specific ingredient in HRVB/PB contributes to effectiveness. It is also probable that suggestion is a component in the overall effect of HRVB, as it is in all pharmacological as well as nonpharmacological interventions (Petrie and Rief 2019).
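The heterogeneity and prediction-interval statements above can be illustrated with a short sketch. It assumes a DerSimonian-Laird random-effects model, which may differ from the estimator actually used in this review, and it approximates the prediction interval with a normal quantile, whereas the usual formula uses a t quantile with k − 2 degrees of freedom. The function name and the example data are hypothetical.

```python
import math
from statistics import NormalDist


def random_effects_summary(effects, variances, level=0.95):
    """Pooled effect, heterogeneity statistics, and approximate prediction interval."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    # Fixed-effect pooled estimate, needed for Cochran's Q.
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    # DerSimonian-Laird estimate of between-study variance (tau squared).
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # I squared: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    # Random-effects weights add tau squared to each study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    # Normal-quantile approximation to the prediction interval (assumption).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(tau2 + se ** 2)
    return {
        "pooled": pooled,
        "se": se,
        "q": q,
        "tau2": tau2,
        "i2": i2,
        "prediction_interval": (pooled - half_width, pooled + half_width),
    }


# Made-up effect sizes and variances, not data from this review.
summary = random_effects_summary([0.25, 0.40, 0.31, 0.36], [0.04, 0.05, 0.06, 0.05])
print(summary)
```

When the between-study variance estimate is close to zero, the prediction interval collapses toward the confidence interval of the pooled estimate, which is the sense in which a narrow prediction interval indicates a stable effect size.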

It is surprising that responses to direct questions about stress tended to have a low effect size, despite the wide use of HRVB for treating stress reactions. This may be due to the focus of stress questionnaires on sources rather than symptoms of stress. Measures of stress, quality of life, PTSD, and sleep may reflect impairments that are less directly related to the RSA-BR systems than are depression, anxiety, and some symptoms of physical disease.

Limitations and Questions Raised for Further Research

The possible pathways for mediating mechanisms for HRVB were not covered in this review. In addition to possible effects of suggestion, attention to breathing has a meditative component and may foster acceptance of various body sensations and processes, a purported mechanism for effects of mindfulness training and ‘acceptance and commitment therapy’ (Gaudiano 2017). Also, explanations to the client for how biofeedback can help may have a cognitive effect in decatastrophizing various problems by conveying a notion that various physiological, behavioral, and emotional events can be brought under voluntary control (Mizener et al. 1988; Nanke and Rief 2000; Wilson 2018). Nevertheless, whatever the complex mechanisms are for the placebo response and for other interventions (Brook and Fauver 2014; Jensen et al. 2015; Levine et al. 2013; Tu et al. 2019), the equivalence to other methods of known effectiveness suggests that the method has an active effect. Only one study compared HRVB with mindfulness meditation and found a minimally greater effect size for meditation, g = +0.137, for treating stress (de Bruin et al. 2016), and one study compared HRVB to cognitive behavior therapy and found a nonsignificantly greater effect size for HRVB, g = −0.038, for treating irritable bowel syndrome (Thompson 2010). Further research comparing HRVB to various specific treatments is warranted, as well as research on mechanisms by which HRVB has its effects.

It is possible that mechanisms for HRVB effects may differ for various applications. It may be found to have specific applications to emotional, cardiovascular, and perhaps gastrointestinal effects, where autonomic and baroreflex effects may be involved. More research on blood pressure effects is needed, particularly since HRVB directly impacts a blood pressure modulator, the baroreflex. Greater effects on systolic than diastolic blood pressure would be consistent with baroreflex action, which more directly affects systolic blood pressure. Similarly, the zero-degree phase relationship between breathing and heart rate oscillations may have specific effects on respiratory disease, athletic performance, and perhaps cognitive performance, where gas exchange in the lung and oxygen perfusion in the muscles and brain may play a role. Mediator analyses of these effects remain to be done.

Additionally, this review did not consider the effect size needed for clinically relevant results. HRVB and PB appear to have at least a moderate beneficial effect on almost all of the problems studied, and a small but significant incremental effect, on average, over other established treatments, but the clinical utility of this effect remains to be evaluated. Even where benefits or incremental benefits are small, the minimal risk involved in these methods may make them worthy of use. We have found no studies of effects on mortality or very severe exacerbations, where even a small effect would be worthwhile. One study (Lehrer et al. 2004), an outlier, did find a significant effect in preventing asthma exacerbations requiring additional medical intervention, and another study (Reineke 2008) found similar results for hypertension. At present these interventions appear to be useful as minimal-risk complementary methods for these and various other applications.

An interesting implication of our findings is that length of treatment and home practice do not appear to influence the effect size. It is possible that very short training periods may suffice. Perhaps learning how to breathe at resonance frequency provides a sufficient method for most of the beneficial effects, such that it is mostly used when needed. Study of acute effects and their influence on chronic effects could clarify this question.

Although the data look very consistent without the three outliers and the effect sizes did not diminish over the years, it still is possible that subtle biases may have contributed to assessments of HRVB effects. Few studies mentioned blinding of data analysts, and double blinding is never possible in behavioral intervention research, although comparisons with other credible treatments may have reduced the potential for experimenter bias, where biases of experimenters may have gone in either direction. Also, the large amount of effect size heterogeneity among studies does not provide confidence that positive effects will be obtained in particular cases or particular studies.

Additionally, our procedure of averaging across various outcome measures may obscure some effects because of the irrelevance or unreliability of some measures, a probable cause for the heterogeneity of effects between and within studies. Combining various disparate measures within and between studies, a hallmark of meta-analysis, may make it difficult to determine highly pinpointed effects. Nevertheless, because financial support for behavioral research does not reach the level necessary to test thousands of people to evaluate modest but important effects, meta-analysis, with all of its flaws, may be the best alternative for evaluating these effects.