Introduction

Irritable bowel syndrome (IBS) is a chronic gastrointestinal condition of unknown etiology characterized by abdominal pain and altered bowel function [1]. In fact, severe IBS can reduce health-related quality of life similar to congestive heart failure and rheumatoid arthritis [2, 3]. Beyond notable health concerns to the patient, the societal and economic impacts of IBS are substantial. More than 10% of the general population currently suffers from IBS [4, 5], resulting in exorbitant healthcare expenditures [6].

Although the pathophysiology of IBS remains unknown, recent studies suggest that bacteria may play a role. In support, a growing number of controlled studies suggest that antibiotics are effective in improving IBS symptoms [7–9]. In at least one study, a non-absorbed antibiotic had benefits that lasted for 10 weeks after therapy was discontinued [8]. This prolonged effect in IBS suggests that antibiotics influence a pathophysiologic, and likely bacterial, process in IBS.

While data supporting the effectiveness of antibiotics continue to mount, controversy remains about what bacterial abnormality the antibiotics are impacting. In the lead-up to the use of antibiotics in IBS, the speculation was that there was a high prevalence of small intestinal bacterial overgrowth (SIBO) in IBS subjects based on the indirect technique of lactulose breath testing [10]. This test involves obtaining breath samples every 15 min after ingesting a 10-g solution of lactulose. The rise of hydrogen within 90 min is believed to represent bacterial fermentation [8], and the timing suggests that this is occurring in the small intestine. Although a recent study seemed to support at least an increase in coliform counts in the small bowel (although not meeting the traditional definition of SIBO) [11], there remains a significant disagreement in the literature about the relevance of SIBO in IBS [12, 13]. Much of this controversy stems from questions about breath test accuracy. Critics argue that breath testing is unreliable and suggest culture as the gold standard for diagnosing SIBO. Problems exist with current methods of small bowel culture, however, as most of the intestinal flora are not culturable, the area most susceptible to SIBO (distal half of small bowel) is not accessible to culture, and anaerobes are difficult to collect without affecting colony counts. Thus, against this poor gold standard, breath testing cannot truly be judged as valid or invalid [14].

Recently, the STARD guidelines have provided guidance in validating new diagnostic testing for disease [15]. In this context, there exist two approaches with breath testing in IBS. One approach is to invalidate breath testing by stating that it is inaccurate for SIBO in the setting of no gold standard. Another is to assume that the breath test is able to identify IBS compared to healthy controls. If breath testing is more commonly abnormal in IBS compared to controls, this would further settle the controversy about the role for breath testing and perhaps provide an understanding of the success of antibiotic therapy in IBS. In this study, the literature on breath testing in IBS is examined to compare results among subjects with IBS to those of healthy control subjects. These data will be used to perform a meta-analysis of the utility of breath testing in IBS.

Methods

Search Methodology and Abstract Review

Using OVID MEDLINE (January 1950 to Week 2, January 2010), a comprehensive search was performed for abstracts matching one or more of the following search terms: ‘irritable bowel’ and/or ‘IBS.’ Within these initial results, a second abstract search was executed using the terms ‘breath test$’ and/or ‘breath hydrogen test$’ allowing for variations in the word ‘test’, such as ‘tests’, ‘testing’, and ‘tested’. It was assumed that studies not including the preceding search terms in the abstract would not have conducted breath testing in the original study protocol to evaluate IBS patients. Studies not published in peer-reviewed journals were excluded from our search. Studies of all languages were eligible for evaluation.
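The two-stage abstract filter described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the OVID query itself: the regular expressions and the function name `abstract_matches` are assumptions, and the `\w*` suffix stands in for the `$` truncation operator.

```python
import re

# Stage 1: abstracts mentioning 'irritable bowel' and/or 'IBS'.
STAGE1 = re.compile(r"\birritable bowel\b|\bIBS\b", re.IGNORECASE)

# Stage 2: 'breath test$' and/or 'breath hydrogen test$'; the \w* suffix
# emulates truncation, so 'tests', 'testing', and 'tested' all match.
STAGE2 = re.compile(r"\bbreath (?:hydrogen )?test\w*", re.IGNORECASE)


def abstract_matches(abstract: str) -> bool:
    """Return True when an abstract passes both search stages."""
    return bool(STAGE1.search(abstract)) and bool(STAGE2.search(abstract))
```

For example, an abstract containing "lactulose breath testing in IBS patients" passes both stages, whereas one that mentions IBS but no breath test does not.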

After the first literature search, the resulting abstracts moved into a full paper review and were thoroughly analyzed by two independent reviewers. To evaluate the paper for eligibility, the following inclusion criteria were applied: the case group must only have included patients previously diagnosed with IBS using predetermined criteria; the control group must have been in good health; the study must have included some form of breath testing in the original study protocol to determine the proportion of positive tests in both controls and IBS subjects. There was no minimum study size for inclusion, since, based on existing literature, we did not assume rarity of a breath test abnormality in IBS among general patient populations.

The substrate used by each study for the breath test to determine the presence of SIBO in IBS was noted. The only identified substrate leading to exclusion from the study was lactose, since this was primarily used in breath testing to determine lactose intolerance. Since there were no validated criteria for the determination of a positive breath test, the review did not bias against studies based on their a priori determined criteria for positive test in each individual study. Studies that met these criteria were then eligible for the meta-analysis.

Data Extraction and Analysis

Studies meeting inclusion and exclusion criteria were analyzed by two reviewers independently. During the data collection process, extracted data included the size of case and control groups and the number of positive breath tests among case and control groups according to the preset criteria outlined in each study protocol. The substrate and criteria for positive breath test used by each study for breath testing were noted. In cases where there was more than one substrate, the substrate that appeared to represent the primary outcome of the study was used (not a secondary outcome analysis). While the primary meta-analysis included all eligible studies, studies employing appropriately age- and sex-matched IBS and control subjects were identified. Any disagreement over study inclusion or identification of relevant data within a particular study during any stage of the review process was resolved by third-party consensus.

A number of evaluations were then made for the meta-analysis. The first evaluation included all identified papers as a group and analyzed the prevalence of an abnormal breath test (as identified in the paper) in IBS compared to controls. Since papers investigating pediatric populations with IBS were also identified, a similar evaluation was also conducted based on the adult-only papers. A final meta-analysis then examined the studies which were most appropriately designed using age and sex-matched controls in comparison to the IBS groups.

In an additional analysis, the substrates used for breath testing were analyzed separately. In this case, all data from all studies were used. For example, the primary outcome in the study by Walters et al. was based on lactulose substrate, findings which were included in the primary meta-analysis. However, xylose was also studied and provided information on testing dynamics for this substrate [16]. This xylose data was compiled to add to other studies examining xylose. This was done for each substrate studied among the selected papers to determine the pooled sensitivity and specificity of each type of breath test to identify IBS.
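As a rough illustration of how pooled sensitivity and specificity can be obtained from per-study counts, the sketch below sums positives and totals across studies. The text does not state the exact pooling method, so simple summation of counts is an assumption here, and the dictionary keys and the function name `pooled_sens_spec` are hypothetical.

```python
def pooled_sens_spec(studies):
    """Pool sensitivity and specificity by summing counts across studies.

    Each study is a dict with hypothetical keys:
      ibs_pos  - IBS subjects with a positive breath test
      ibs_n    - total IBS subjects
      ctrl_pos - control subjects with a positive breath test
      ctrl_n   - total control subjects
    """
    ibs_pos = sum(s["ibs_pos"] for s in studies)
    ibs_n = sum(s["ibs_n"] for s in studies)
    ctrl_pos = sum(s["ctrl_pos"] for s in studies)
    ctrl_n = sum(s["ctrl_n"] for s in studies)

    # Sensitivity: proportion of IBS subjects who test positive.
    sensitivity = ibs_pos / ibs_n
    # Specificity: proportion of controls who test negative.
    specificity = (ctrl_n - ctrl_pos) / ctrl_n
    return sensitivity, specificity
```

Grouping the input list by substrate before calling the function would give one sensitivity and specificity pair per substrate.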

Statistical Analysis

Summary effect estimates (pooled odds ratios) were computed using the DerSimonian and Laird random-effects model to produce more conservative estimates [17]. Where cell counts equaled zero, e.g., no control subjects with positive breath test, the standard approach of adding 0.5 to each cell of the study’s 2 × 2 table was applied [18]. For each evaluation, forest plots were generated containing odds ratios, 95% confidence intervals, and weights for each study, followed by an overall (pooled) odds ratio. Heterogeneity statistics were calculated to evaluate the extent to which odds ratios varied between studies. An I² statistic was calculated for each evaluation to assess between-study heterogeneity attributable to variability in the effect of positive breath test [19]. Funnel plots were created to assess evidence of significant publication bias or small-study effects within the studies included in each evaluation. Harbord’s test [20], the modified version of Egger’s test for binary data, was used to test for funnel plot asymmetry. All statistical analyses were performed using Stata Version 10 (StataCorp 2009. Statistical Software: Release 10.0. College Station, TX). Sensitivities and specificities were calculated for the pooled overall group of studies, studies which were age- and sex-matched, and groups of studies based on their substrates.
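The pooling procedure can be sketched as follows. This is a minimal Python illustration of a DerSimonian and Laird random-effects estimate with the 0.5 zero-cell correction and an I² statistic; it is not the Stata routine used in the study, and the function names and the (a, b, c, d) table layout are assumptions made for the example.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance for one 2 x 2 table.

    a: IBS positive, b: IBS negative, c: control positive, d: control negative.
    If any cell is zero, 0.5 is added to every cell of that table.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    y = math.log((a * d) / (b * c))
    v = 1 / a + 1 / b + 1 / c + 1 / d
    return y, v


def dersimonian_laird(tables):
    """Pool 2 x 2 tables with the DerSimonian and Laird random-effects model."""
    effects = [log_odds_ratio(*t) for t in tables]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(tables) - 1

    # Method-of-moments estimate of the between-study variance tau^2.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate on the log scale.
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    # I^2: percentage of variation attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "or": math.exp(y_re),
        "ci_low": math.exp(y_re - 1.96 * se),
        "ci_high": math.exp(y_re + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

When the between-study variance estimate is zero the result coincides with the fixed-effect estimate; otherwise the confidence interval widens, which is why the random-effects model is described as the more conservative choice.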

Results

Paper Selection Outcome

The search strategy returned 9,274 abstract titles, of which 115 matched the search term criteria (Fig. 1) for paper review. After application of paper review, 11 papers then proceeded to full review as they met all of the eligibility requirements [16, 21–30]. Ten of the 11 papers were written in English [16, 21–30], and one was in Italian [17]. This Italian paper was translated into English to acquire the data. The data from all final papers selected were summarized (Table 1).

Fig. 1 Flowchart of final paper selection for meta-analysis

Table 1 Summary of qualifying studies

The results used in our meta-analysis were generally based on each study’s principal definition for positive breath test (not sub-analyses). However, in several instances, this was less clear. Bratten et al. [24] evaluated both hydrogen peak within 90 min, and peak >20 ppm within 180 min, as criteria for breath test positivity. As reported by Bratten, positivity in either criterion was used to represent a positive breath test in our analysis. Results from dual hydrogen peaks were not included since this was not a breath test-positive criterion according to Bratten. Skoog et al. [27] compared fructose to high fructose corn syrup (HFCS) as substrate on the same case and control groups on separate occasions. Since fructose was the principal substrate used for analysis, only these data were used. Walters et al. [16] compared lactulose and xylose in a similar fashion. As lactulose was the principal outcome of the study and xylose the comparator, only results using lactulose substrate were included in the final meta-analysis. Lactulose was the more conservative result.

Study Heterogeneity

Although a number of papers were large and well designed, there was some heterogeneity among the final papers. A common problem was that the definition of an abnormal breath test differed between studies; each definition is explicitly described in Table 1. Another issue was the inclusion of subjects using proton pump inhibitors (PPI) [31]. While a recent study refuted the influence of PPI on breath testing [32], there was still some heterogeneity among the selected papers with regard to inclusion of subjects on PPI. Only two studies excluded patients based on previous use of proton pump inhibitors [26] or anti-secretive therapy [28]. Another significant issue among the papers was the selection of control subjects. In many cases, the control subjects were not selected carefully. For example, in the relatively large study by Bratten et al., the controls were neither age- nor sex-matched [24]. In fact, the mean age for controls in this study was more than a decade younger than IBS subjects. Finally, there existed a variety of evaluated substrates (see Table 1 for summary) among studies.

Total Group Meta-Analysis

In total, the 11 studies represented breath testing in 1,076 IBS and 509 control subjects (Fig. 2). Using a random-effects method of meta-analysis, we found that the odds ratio for abnormal breath test in IBS compared to controls was 4.46 (95% CI = 1.69–11.80, p = 0.003) with significant between-study heterogeneity among the complete group of studies (I² = 84.6%). Despite this high degree of heterogeneity, Harbord’s test did not find evidence of significant publication bias or small-study effects (p = 0.774). A pooled summary of hydrogen breath test data suggested a sensitivity of 43.6% and specificity of 83.6% in identifying IBS.
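As a consistency check, the p value reported with a pooled odds ratio can be recovered approximately from its confidence interval, assuming the interval is a Wald interval that is symmetric on the log scale. The short Python sketch below makes that assumption explicit; the function name `p_from_ci` is hypothetical.

```python
import math


def p_from_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Approximate standard error, z, and two-sided p from an OR and 95% CI.

    Assumes a Wald interval that is symmetric on the log odds ratio scale.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_point) / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = p_from_ci(4.46, 1.69, 11.80)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

For the overall estimate (OR = 4.46, 95% CI = 1.69–11.80) this gives a z statistic of about 3.0 and a two-sided p value of roughly 0.003, in line with the value reported above.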

Fig. 2 Forest plot of all studies meeting inclusion and exclusion criteria for breath testing in IBS compared to controls

Meta-Analysis Among Adult Studies Only

Two of the 11 studies were conducted in pediatric subjects with IBS [29, 30]. After excluding these two studies, there was still a significant propensity for an abnormal breath test in IBS subjects compared to controls in adult populations (OR = 3.03, 95% CI = 1.10–8.33, p = 0.032) (Fig. 3). Again, there was a high level of heterogeneity among studies (I² = 81.2%), but Harbord’s test again suggested no evidence of significant publication bias or small-study effects (p = 0.866). Pooled data from the adult studies of IBS demonstrated a sensitivity of 39.9% and specificity of 84.2%.

Fig. 3 Forest plot of studies involving adult IBS subjects only

Meta-Analysis of Age- and Sex-Matched Studies

The more rigorously conducted studies, comparing age- and sex-matched IBS and control subjects, were then analyzed. Of the 11 primary papers, six were age- and sex-matched [22, 23, 25, 28–30], representing 582 case-group patients and 317 controls. The odds ratio for finding an abnormal breath test in IBS compared to age- and sex-matched control subjects was 9.64 (95% CI = 4.26–21.82, p < 0.001). In this analysis, although there was significant between-study heterogeneity, the percentage of variation attributable to heterogeneity was only moderate (I² = 67.9%) and publication bias or small-study effects were not apparent (p = 0.272). In a pooled analysis, the breath test suggested a sensitivity of 48.4% and specificity of 89.2%.

Test Characteristics Based on Substrate

Among the identified studies, different substrates were used for breath testing. Of the 11 studies, five studies utilized lactulose (10 g) [16, 18, 19, 25, 26], representing 451 IBS subjects and 165 controls. Studies with lactulose substrate conferred a sensitivity of 72.2% and a specificity of 66.0% (Table 2). Three studies utilized glucose (50 g) as a substrate for breath testing [23, 26, 28] and represented a combined 420 IBS subjects and 272 controls. Studies with glucose as the substrate for breath testing produced a sensitivity of 15.7% and specificity of 97.0%. Other substrates were examined in fewer studies. Skoog et al. was a comparative study between 40 g fructose and 40 g high fructose corn syrup in separate breath tests [27]. Studies were also identified that examined xylose [16, 21] and sucrose [25] substrates. While there were not enough data representing each of these substrates to generate meaningful results for comparison with lactulose and glucose, all substrates are summarized in Table 2.

Table 2 Sensitivity and specificity of breath testing to identify IBS according to substrate used

Discussion

In this systematic review and meta-analysis, the utility of carbohydrate breath testing as a surrogate of small intestinal bacterial overgrowth (SIBO) was evaluated in IBS subjects compared to healthy controls. The results demonstrate that a “positive” breath test is more common in IBS patients compared to controls, irrespective of definition. This suggests that there may be merit in the use of breath testing in IBS, strengthening the justification for a bacterial hypothesis in IBS. More importantly, the prevalence of abnormal breath test was even more significant when examining high-quality age- and sex-matched studies. Although another recent interesting and well-conducted review of the literature examined the criteria for a positive breath test in IBS [33], new large-scale age- and sex-matched studies [28–30] have recently been published. In addition, this meta-analysis examines the more practical presence or absence of a positive result in case-control studies rather than the specific criteria used for testing.

Over the last decade, there has been a growing list of controlled studies suggesting the benefit of antibiotic therapy in IBS [7–9]. In these studies, the antibiotic approach has not only demonstrated immediate alleviation in IBS symptoms but also maintained improvement after the cessation of therapy over a follow-up period as long as 10 weeks [8]. No previous pharmacotherapy for IBS has demonstrated this durability of response. This suggests that antibiotics have in some way modified a causative factor in IBS, presumably bacterial in nature.

The concept of antibiotic treatment for IBS and the hypothesis that bacteria may play a role in IBS development were catalyzed by an initial study that proposed the presence of SIBO due to findings of lactulose breath test abnormalities in IBS subjects [10]. Since this time, there has been much controversy about the rationale of SIBO in IBS due to a lack of confidence in the validation of breath testing [12, 13], with many justifiable reasons. Critics argue that culture should be the gold standard, though this is problematic for a number of reasons [14]. First, most of the gut flora cannot yet be cultured. Second, and more importantly, samples should be acquired from mid to distal sections of the small bowel to prove SIBO. Unfortunately, this type of sampling has not been conducted to validate the breath test. Conversely, in the absence of a reliable gold standard for comparison, breath testing cannot be condemned. As such, investigators began examining the prevalence of abnormal breath test findings in IBS with mixed and polarizing results.

There are many difficulties that arise with the technique of breath testing in any patient. Hydrogen breath testing as an indirect method for diagnosing SIBO assumes that a rise in exhaled hydrogen gas occurs earlier than expected due to carbohydrate metabolism by abnormally high levels of small intestinal bacteria. The technique itself requires attention to detail. Frequent calibration, proper handling of samples, and proper patient preparation are vital to a successful result [34]. These intricacies make breath testing difficult to conduct in all but the most diligent of centers. In addition, traditional definitions of a positive test rely on an elevation of hydrogen following ingestion of carbohydrate substrate. However, many bacteria in the gut utilize hydrogen gas for their energy source, including methanogens and sulfate-reducing bacteria [35]. The presence of these bacteria can significantly impair the accurate detection of hydrogen. Interpretation of breath test results is compromised, since no technology currently exists to detect hydrogen sulfide and scaled down versions of the gas chromatograph do not even measure the more common methane gas. In addition, methane appears to have some importance in constipation-predominant IBS [36] beyond the scope of this meta-analysis. These various problems with breath testing could account for the heterogeneity in studies and make breath testing currently impractical for widespread clinical use without significant care in the technique.

Despite the existing limitations and controversy regarding the breath test, this study demonstrates that controlled breath testing results in abnormalities in IBS subjects much more commonly than in controls (Fig. 2). This is particularly true when examining only high-quality studies that utilized age- and sex-matched controls (Fig. 4). These data support the utility of breath testing in IBS, at least as a confirmation that abnormal fermentation patterns seen on test results seem indicative of some bacterial derangement in IBS. However, the proof that this is due to SIBO is still controversial. In one study, the authors suggest that SIBO among IBS patients is not common, based on traditional standards for SIBO-positive interpretation of proximal small-bowel culture (>10⁵ colony-forming units/ml) [11]. The threshold used in this study is now contested, since it was only validated based on typical findings in Billroth II patients [14]. When lower thresholds are used, IBS subjects demonstrate a significant increase in small-bowel coliform counts [11]. Thus, there may be some culture evidence of disturbed small-bowel bacteria levels.

Fig. 4 Forest plot of age- and sex-matched case-control studies in IBS

While this meta-analysis demonstrates increased odds of abnormal breath test findings in IBS subjects compared with controls, it is unlikely that the breath test will be used as a diagnostic tool for IBS. Breath testing in IBS has more theoretical utility in identifying patients with the potential to benefit from antibiotic therapy. In two of the controlled trials evaluating antibiotic therapy in IBS, the breath test was helpful in determining patient outcome. In one of these, normalization of the breath test with antibiotics predicted a greater proportion of subject response to treatment [7]. In the other study, subjects responding to the antibiotic rifaximin had a significantly greater reduction in hydrogen compared to non-responders [9]. These results need further validation before this use of breath testing can be considered in clinical practice.

Interestingly, the dynamics of breath testing as a diagnostic test (Table 2) fell within the expectations of the type of substrate. For example, glucose, sucrose, and xylose are absorbable carbohydrates and do not reach the distal small intestine. Therefore, a positive test is more likely to represent bacterial overgrowth, since there would be no normal expectation of hydrogen production in the proximal small bowel to account for an abnormal test. This likely explains the high specificity and low sensitivity (elevated false-negative rate) of the test in IBS and also supports a role of small bowel bacteria in IBS. In the case of lactulose, this carbohydrate traverses the entire small bowel and reaches the colonic flora. In the setting of rapid transit, lactulose could reach the colon, and be fermented there, in less than 90 min. Thus a false-positive test would be seen. This would produce a test with a high sensitivity and lower specificity (elevated false-positive rate).

There are a number of limitations with the current meta-analysis. First, there is some heterogeneity among studies. This includes differences in the substrate used to trigger gut fermentation (including lactulose, glucose, sucrose, and fructose). Second, there are also differences in what various investigators considered to be abnormal or positive. Third, the most serious problem for many studies was the lack of appropriate age- and sex-matched controls. When age and sex matching was accounted for, the pooled odds ratio and variation attributable to heterogeneity were more favorable, indicating that in better-quality studies there is more support for a positive breath test in IBS (Fig. 4).

Despite criticism of breath testing, this meta-analysis demonstrates that the breath test is a valid and important catalyst in the development of the bacterial hypothesis for IBS. The breath test findings imply that IBS subjects indeed have altered flora and fermentation patterns, supporting the development of novel antibiotic therapies that are now emerging in IBS. Despite this, the role of breath testing in IBS remains unknown. While the future of diagnostic breath testing remains in question, there is new and more-convincing data on the association between methane during breath testing and constipation [25, 36, 37] and in guiding therapy in this subgroup [38]. Further evaluation of various breath-testing methods, as well as future technologies, may help identify subsets of IBS patients to define the correct antibiotic treatment to use.