Introduction

Asthma, the most common chronic disease in children, is characterized by airway hyperresponsiveness and inflammation. The incidence of asthma is increasing worldwide [1]. Accurate detection of asthma is important for effective therapy. The current gold standard for asthma diagnosis is based on clinical history of respiratory symptoms, physical examination, and respiratory function testing [2]. However, these methods are far from perfect. The history of variable respiratory symptoms provided by children is less than objective. Lung function tests are complicated and expensive, and some, such as spirometry and bronchial challenge tests, are relatively invasive and carry a risk of bronchospasm. In addition, these tests do not provide information about airway inflammation.

A non-invasive and readily available technique, fractional exhaled nitric oxide (FeNO), has been developed for asthma diagnosis. Nitric oxide (NO), a signaling molecule in respiratory epithelial cells [3], functions as a vasodilator and bronchodilator in the lungs and serves a protective role in the asthma response. NO synthase enzymes, induced by inflammatory cytokines, mediate the synthesis of NO. NO is detectable in exhaled breath by a chemiluminescence analyzer; much evidence indicates that NO is increased in asthma. Similarly, some studies have demonstrated that FeNO is elevated in children with asthma compared to those without asthma. Further, the FeNO level reflects the amount of eosinophilic airway inflammation [4]—a key pathological feature of asthma.

FeNO testing is relatively easy for school-aged children, and even preschool-aged children, to perform [5]. Initially used in the 1990s and approved for clinical use in 2003, FeNO testing has been widely applied in the diagnosis of asthma in children [6]. Given its relatively convenient and non-invasive measurement, FeNO has been highlighted as a biomarker in recent years to quantitatively evaluate airway inflammation. Indeed, evidence of the diagnostic efficacy of FeNO is accumulating [7–9]. However, individual studies have yielded inconsistent or conflicting findings [10, 11], possibly due to limitations associated with individual studies. Further, the FeNO level can be influenced by a number of factors that may affect its diagnostic accuracy. A meta-analysis of 25 studies with a total of 3983 subjects showed that FeNO is a promising marker for the diagnosis of asthma in all populations [12]. However, more recent studies need to be considered. To our knowledge, there is no meta-analysis focused on the diagnostic value of FeNO for asthma exclusively in children.

Thus, to shed light on contradictory results and focus on a pediatric population, we performed a systematic review and meta-analysis of current research reports to assess the performance of FeNO in the diagnosis of childhood asthma. Our findings have implications for the use of FeNO testing in pediatric clinical practice.

Methods

Literature Search and Study Identification

We performed a literature search of relevant databases including PubMed, the Cochrane Library, EMBASE, MEDION, and Web of Science to identify eligible studies published through March 31, 2016. Various combinations of medical subject headings (MeSH) and non-MeSH terms were used, as follows: fractional exhaled nitric oxide, FeNO, and asthma. In addition to published studies in these electronic databases, a manual search of related reports from major annual meetings in the field of pediatrics and reference sections of studies and all relevant reviews was also performed. Studies were restricted to English-language, peer-reviewed publications.

Inclusion criteria for eligible studies were as follows: (1) diagnostic accuracy test design; (2) the index test was FeNO; (3) study subjects were limited to children, and the minimum number was 10; and (4) a two-by-two contingency table could be constructed from data presented by the study. Studies were excluded if they met any of the following criteria: (1) studies conducted on animals or in vitro systems; (2) the article was a review, case report, or editorial comment; (3) studies on the monitoring impact of FeNO in managing asthma in children; and (4) studies containing overlapping participants. Notably, articles by the same author or research group were included only when a different sample of patients was used. Two investigators (Songqi Tang and Yiqiang Xie) independently performed the literature search and study identification according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [13]. Any disagreement was resolved by discussion with the other three reviewers (Conghu Yuan, Xiaoming Sun, and Yubao Cui).

Quality Assessment

To assess the quality of each included study, we used the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool [14]. Briefly, QUADAS-2 comprises four key domains: patient selection, index test, reference standard, and flow and timing (the flow of patients through the study and the timing of the index and reference tests). All four domains were used for evaluating the risk of bias, and the first three were also applied to assess applicability. According to the investigators’ answers for all signaling questions in each domain, risks of bias were graded as “low risk,” “high risk,” or “unclear risk.” For applicability, review authors documented relevant information and then rated their concern about whether the study matched the review question. Similarly, concerns of applicability were rated as “low risk,” “high risk,” or “unclear risk.” A standardized table and figure, recommended by the QUADAS-2 official website, were used to display the summarized results of the QUADAS-2, with numbers of studies observed with low, high, or unclear risk of bias or applicability concerns for each domain.

Data Extraction

Characteristics of the included studies were extracted, including publication year, country, study design, blinding, number of study subjects, FeNO measurement standard, online measurement, reference standard for the diagnosis of asthma, and cut-off value of FeNO in each study. Absolute numbers of true positive (TP), false positive (FP), false negative (FN), and true negative (TN) results were also extracted.

Diagnostic Measures Combination

The pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), diagnostic score, and area under the summary receiver operating curve (AUSROC) with the corresponding 95 % confidence interval (CI) were obtained by a bivariate binomial mixed model [15]. The sensitivity, specificity, DOR, and AUSROC were considered as the major outcomes in this study.
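For orientation, the per-study quantities that feed such pooling can be sketched as below. This is only an illustration of how sensitivity, specificity, PLR, NLR, and DOR follow from a single two-by-two table; it is not the bivariate binomial mixed model used for pooling. The `TwoByTwo` container, the function name, and the example counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one study's two-by-two table (illustrative container)."""
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    fn: int  # index test negative, reference standard positive
    tn: int  # index test negative, reference standard negative


def diagnostic_measures(t: TwoByTwo) -> dict:
    """Per-study accuracy measures; assumes no cell count is zero."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                         # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": dor,
    }


# Hypothetical counts, for illustration only.
example = diagnostic_measures(TwoByTwo(tp=80, fp=20, fn=20, tn=80))
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because the DOR equals PLR divided by NLR, it summarizes both error types in a single number, which is why it is convenient as a major outcome.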

Heterogeneity

Cochran’s Q test of heterogeneity was performed, and the inconsistency index, I², was used to quantify the percentage of the total variability in effect estimates among studies that is due to heterogeneity rather than chance [16]. An I² value of more than 50 % was taken to indicate substantial heterogeneity. A two-sided p value <0.05 indicated statistical significance.
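The relationship between Cochran's Q and I² can be sketched as follows. This is a minimal illustration under the assumption of inverse-variance weights applied to generic per-study effect estimates (for example, log DOR values); the function name and the example inputs are hypothetical, and the statistical software used in this study may compute these quantities differently.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, I2 in percent) for per-study effects and their variances.

    Uses inverse-variance weights, which is an assumption of this sketch.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of Q in excess of its degrees of freedom, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical effect estimates and variances, for illustration only.
q, i2 = cochran_q_and_i2([0.0, 2.0], [1.0, 1.0])
print(f"Q = {q:.2f}, I2 = {i2:.1f} %")
```

Under the 50 % rule stated above, any I² value above 50 would be read as indicating substantial heterogeneity.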

Diagnostic Threshold Effects

Since the cut-off values were different among the included studies, diagnostic threshold effects were inspected [16]. The summary receiver operating curve (SROC) was visually evaluated at first. A Spearman correlation analysis was used to assess the heterogeneity derived from diagnostic threshold effects.
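A threshold-effect check of this kind can be sketched as below. The sketch assumes the common convention of correlating logit sensitivity with logit (1 − specificity) across studies; the text above does not state which pairing was used, so this is an assumption. It returns the Spearman coefficient only, without a p value, and all function names and input values are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_spearman(sensitivities, specificities):
    """Spearman correlation of logit(sens) with logit(1 - spec)."""
    x = [logit(s) for s in sensitivities]
    y = [logit(1 - s) for s in specificities]
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical studies in which sensitivity rises as specificity falls,
# the pattern expected when only the cut-off differs between studies.
rho = threshold_spearman([0.6, 0.7, 0.8, 0.9], [0.9, 0.8, 0.7, 0.6])
print(f"Spearman rho = {rho:.2f}")
```

A strong positive coefficient would suggest that differences between studies mainly reflect different cut-offs; a weak or non-significant one argues against a threshold effect.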

Meta-regression Analysis

To identify the sources of heterogeneity among studies, meta-regression analysis was carried out [16]. Possible sources of heterogeneity including publication year, country, study design, blinding, number of study subjects, FeNO measurement standard, online measurement, reference standard for the diagnosis of asthma, and cut-off value of FeNO in each study were included in the analysis. Other sources were not included because of data insufficiency in at least one study.

Publication Bias

Deeks’ funnel plot asymmetry analysis was performed to identify the publication bias [17]. Briefly, the Deeks funnel plot was a scatter plot of the inverse of the square root of effective sample size [1/root (ESS)] against the ln (DOR).
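The coordinates of such a plot can be sketched as follows. The effective sample size is computed as 4·n1·n2/(n1 + n2), with n1 and n2 the numbers of diseased and non-diseased subjects. The 0.5 continuity correction, the ESS-weighted regression slope, and all function names are assumptions of this illustration; no p value is computed, and the example counts are hypothetical.

```python
import math


def deeks_point(tp, fp, fn, tn):
    """Return (1/sqrt(ESS), ln DOR, ESS) for one study."""
    n_diseased = tp + fn
    n_healthy = fp + tn
    ess = 4 * n_diseased * n_healthy / (n_diseased + n_healthy)
    # Continuity correction of 0.5 per cell (an assumption of this sketch)
    # keeps ln DOR finite when a cell count is zero.
    a, b, c, d = (v + 0.5 for v in (tp, fp, fn, tn))
    ln_dor = math.log((a * d) / (b * c))
    return 1 / math.sqrt(ess), ln_dor, ess


def weighted_slope(xs, ys, weights):
    """Weighted least-squares slope of ys on xs."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    num = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    den = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    return num / den


# Hypothetical two-by-two counts (tp, fp, fn, tn), for illustration only.
studies = [(40, 10, 10, 40), (90, 30, 20, 160), (25, 5, 15, 55)]
points = [deeks_point(*s) for s in studies]
slope = weighted_slope([p[0] for p in points],
                       [p[1] for p in points],
                       [p[2] for p in points])
print(f"ESS-weighted slope of ln DOR on 1/sqrt(ESS): {slope:.2f}")
```

A slope far from zero would suggest asymmetry, meaning that smaller studies report systematically different DOR values than larger ones.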

Fagan’s Nomogram Analysis

The Fagan nomogram plot comprised three vertical axes [18]. The left axis represented the pre-test probability, which was derived from the prevalence in each included study. The middle axis displayed the likelihood ratio, showing the extent to which the index test could raise or lower the probability of having the disease. The right vertical axis showed the post-test probability, that is, the patient’s probability of having the disease (as defined by the reference standard) after the index test result was known.

Bivariate Box Plot

With logit specificity and logit sensitivity as horizontal axis and vertical axis, respectively, a bivariate box plot was applied to assess the distributional properties of sensitivity against specificity and investigate possible outliers [19].

Data Synthesis and Statistical Analysis

Data synthesis and most statistical analyses were undertaken by STATA software version 12.0 (College Station, TX, USA) apart from the Meta-regression analysis by Meta-Disc version 1.4 (Clinical Biostatistics Unit, Hospital Ramon y Cajal, Madrid, Spain).

Results

Literature Search and Quality of Studies

The initial search yielded 359 citations from PubMed, the Cochrane Library, EMBASE, MEDION, and Web of Science. Since the search strategy was relatively broad, most of the results were not eligible. After screening on the basis of title and abstract, 62 studies were excluded for various reasons, such as irrelevant topic, different data design, and insufficient sample size. Following full-text assessment, eight studies met the pre-defined inclusion criteria and were included in this systematic review and meta-analysis [20–27]. Figure 1 shows the procedure for inclusion. Characteristics of the included studies and patients’ baseline demographics are displayed in Table 1.

Fig. 1

Flow chart of search and selection of eligible studies. n number of studies

Table 1 Characteristics of the included studies and patients’ baseline demographics

Among the eight trials published from 2009 to 2015, three trials were prospective and the other five were retrospective. Four studies were conducted in Asian countries (China and Korea), while the remaining four were conducted in Israel, Norway, Spain, and Poland. The sample size of each study ranged from 88 to 1651 patients. For these included children, FeNO was measured in accordance with the American Thoracic Society (ATS) or European Respiratory Society guidelines. FeNO values were obtained using online nitric oxide monitors and expressed as parts per billion (p.p.b.). Two studies reported that the physicians providing the asthma diagnosis were blinded to the FeNO results [20, 24]. All the trials used a combination of symptoms, spirometry, bronchial provocation test results, and bronchial dilation test results as the reference standard, except one study by Yao et al. that used symptoms and spirometry only. In addition, the cut-off values, each chosen as the value with the highest Youden index (sensitivity + specificity − 1), differed among the included studies. The count data for primary studies including true positive (TP), false positive (FP), false negative (FN), and true negative (TN) were extracted and presented in Table 1.
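Selecting a cut-off by the Youden index can be sketched as below. The candidate cut-offs and their sensitivity and specificity values are hypothetical and serve only to show the selection rule; they are not data from any included study, and the function names are illustrative.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def best_cutoff(candidates):
    """Pick the (cutoff, sensitivity, specificity) tuple with the highest J."""
    return max(candidates, key=lambda c: youden_index(c[1], c[2]))


# Hypothetical candidates: (cut-off in p.p.b., sensitivity, specificity).
candidates = [
    (15.0, 0.90, 0.60),
    (20.0, 0.80, 0.80),
    (25.0, 0.60, 0.90),
]
cutoff, sens, spec = best_cutoff(candidates)
print(f"cut-off {cutoff} p.p.b., J = {youden_index(sens, spec):.2f}")
```

Because each study optimizes J on its own sample, the resulting cut-offs naturally differ from study to study, which is what motivates checking for a threshold effect.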

The criteria of Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2), which is an updated evaluation tool for the systematic review and meta-analysis of diagnostic test accuracy, identified the quality conditions of the included studies (Table 2 and Fig. 2). The quality evaluation was performed independently by two investigators.

Table 2 Summary of review authors’ ratings of bias risk and applicability concerns for each study
Fig. 2

Cumulative bar plot of bias risk and applicability concerns across all studies

Data Synthesis of Diagnostic Accuracy

In total, 2933 children from eight trials were included in this systematic review and meta-analysis. With a bivariate model, diagnostic performances of FeNO in asthmatic children were pooled and are summarized in Table 3.

Table 3 Summary of the pooled estimates of FeNO in the diagnosis of asthma in children

The combined estimates of sensitivity, specificity, and DOR for the detection of asthma in children were 0.79 (95 % CI, 0.64–0.89), 0.81 (95 % CI, 0.66–0.90), and 16.52 (95 % CI, 7.64–35.71) (Fig. 3 and Table 3). The AUSROC was 0.87 (95 % CI, 0.84–0.90) (Fig. 4). Through graphical examination of the SROC plot, the threshold effect was considered absent, with a Spearman correlation coefficient of 0.50 (p = 0.21). The corresponding summary PLR and NLR were 4.20 (95 % CI, 2.38–7.39) and 0.25 (95 % CI, 0.15–0.44), respectively, and the combined diagnostic score was 2.80 (95 % CI, 2.03–3.58) (Table 3).

Fig. 3

Forest plot of the combined diagnostic estimates of sensitivity, specificity, and diagnostic odds ratio of FeNO. DOR diagnostic odds ratio, ES estimates

Fig. 4

Summary receiver operating curve of FeNO in the diagnosis of asthma in children. AUSROC area under the summary receiver operating curve, SROC summary receiver operating curve. 1: study by Sivan et al.; 2: study by Sachs-Olsen et al.; 3: study by Yao et al.; 4: study by Pérez Tarazona et al.; 5: study by Woo et al.; 6: study by Jerzyńska et al.; 7: study by Zhu et al.; 8: study by An et al.

Fagan’s nomogram analysis (Fig. 5) revealed that, with a fixed pre-test probability of 20 % and a pooled PLR of 4.20, the post-test probability was increased to 51 %. Conversely, with a combined NLR of 0.25, the post-test probability was decreased to 6 %.
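The arithmetic behind these figures can be sketched with the odds form of Bayes' theorem, which is what a Fagan nomogram represents graphically: pre-test odds multiplied by the likelihood ratio give post-test odds. The function name is illustrative; the inputs are the pre-test probability and pooled likelihood ratios reported above.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


pre_test = 0.20                                     # fixed pre-test probability
positive = post_test_probability(pre_test, 4.20)    # pooled PLR
negative = post_test_probability(pre_test, 0.25)    # pooled NLR
print(f"after a positive FeNO result: {positive:.0%}")
print(f"after a negative FeNO result: {negative:.0%}")
```

With pre-test odds of 0.25, a PLR of 4.20 gives post-test odds of 1.05 (about 51 %), and an NLR of 0.25 gives post-test odds of 0.0625 (about 6 %), in agreement with the nomogram.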

Fig. 5

Fagan’s nomogram plot analysis to evaluate the clinical utility of FeNO for the detection of asthma in children. In each plot, a vertical axis on the left shows the fixed pre-test probability. Using the likelihood ratio in the middle axis, a post-test probability (patient’s probability of having the disease after the index test result was known) was acquired. With a fixed pre-test probability of asthma of 20 %, the post-test probabilities of having asthma, given positive and negative FeNO results, were 51 and 6 %, respectively

Meta-regression Analysis

As displayed by the forest plot of the pooled main diagnostic measures (sensitivity, specificity, and DOR) in Fig. 3, the heterogeneity for FeNO in the detection of asthma was significant (I² = 93.10 %, p < 0.05; I² = 97.62 %, p < 0.05; I² = 100.00 %, p < 0.05, respectively). Since the threshold effect was absent as mentioned above, the heterogeneity was not caused by this effect. To examine publication year, country, study design, blinding, sample size, reference standard, and cut-off value as potential sources of heterogeneity, meta-regression analysis was performed. None of these factors was an obvious source of heterogeneity.

Bivariate Box Plot for Evaluating the Outliers

To evaluate the distributional properties of sensitivity versus specificity and identify possible outliers of diagnostic results, a bivariate box plot analysis was used. As shown in Fig. 6, data from the study by Sachs-Olsen et al. nearly reached the limit of extreme value, indicating that the study had the potential to be heterogeneous with regard to other studies. In addition, data from three studies (Sivan et al., Woo et al., and Zhu et al.) were mild outliers. The entire shape of the bivariate box plot was symmetrical, indicating that the data were closely within a normal distribution.

Fig. 6

The bivariate box plot for evaluating the outliers. 1: study by Sivan et al.; 2: study by Sachs-Olsen et al.; 3: study by Yao et al.; 4: study by Pérez Tarazona et al.; 5: study by Woo et al.; 6: study by Jerzyńska et al.; 7: study by Zhu et al.; 8: study by An et al.

Publication Bias

Deeks’ funnel plot (Fig. 7) was applied to assess the publication bias. At first, visual evaluation revealed that the plot was a symmetrical funnel shape, indicating that publication bias was likely absent. Further, the p value for the Deeks funnel plot asymmetry test was 0.08. Therefore, publication bias was not identified in this meta-analysis.

Fig. 7

Deeks’ funnel plot for detecting publication bias. ESS effective sample size. 1: study by Sivan et al.; 2: study by Sachs-Olsen et al.; 3: study by Yao et al.; 4: study by Pérez Tarazona et al.; 5: study by Woo et al.; 6: study by Jerzyńska et al.; 7: study by Zhu et al.; 8: study by An et al.

Discussion

This systematic review and meta-analysis of eight current diagnostic accuracy studies including 2933 cases provides an overview of the diagnostic performance of FeNO in children with asthma. The findings indicate that the diagnostic accuracy of FeNO in the identification of asthma in children is moderate. As suggested by the ATS, FeNO plays an important role in the diagnosis of asthma, especially in the diagnosis of eosinophilic airway inflammation. Moreover, its predictive value is adequately robust to be used in this area [28], more reliable than peak expiratory flow and spirometry, and even comparable to the bronchial challenge test. Considering the diagnostic performance, relative convenience, and non-invasive procedure of measurement, FeNO is a valid tool in diagnosing asthma in children.

Evidence has demonstrated that the utility of FeNO monitoring in distinguishing children who are more at risk of developing asthma lies in the fact that bronchial inflammation and eosinophilic inflammation are found in children before the diagnosis of asthma [29]. We excluded one study for failing to meet the inclusion criteria of our meta-analysis, but it also supported the diagnostic value of FeNO. This study, by Malmberg et al., indicated that the performance of FeNO was better than impulse oscillometry in detecting asthma among preschool-aged children [30]. Moreover, ROC analysis showed that, compared with diagnosis based on history, FeNO had superior diagnostic accuracy in discriminating between children with probable asthma and healthy controls. As is widely known, the AUSROC provides an overall evaluation of diagnostic accuracy. Based on the recommended guideline for the interpretation of AUSROC values [31], the diagnostic utility of FeNO for asthma is moderate [AUSROC 0.87 (95 % CI, 0.84–0.90)]. Further, the DOR index combines sensitivity and specificity; therefore, the higher the value of the DOR, the higher the diagnostic accuracy of the test. Our combined estimate of the DOR was 16.52 (95 % CI, 7.64–35.71), indicating a moderate diagnostic value of FeNO in the detection of asthma in children. Indeed, Fagan’s nomogram revealed that a positive FeNO result would increase the post-test probability of children having asthma (confirmed by symptoms, spirometry, bronchial provocation test, and bronchial dilation test) from 20 to 51 %. Meanwhile, a negative FeNO result would decrease the probability of having asthma from 20 to 6 %. According to these findings, FeNO is of moderate value in identifying asthma in children.

A recent systematic review and meta-analysis included 25 studies and 3983 subjects to pool results in determining the effectiveness of FeNO in detecting asthma in different populations [12]. The sensitivity, specificity, and DOR for the entire population were 72 % (95 % CI, 70–74 %), 78 % (95 % CI, 76–80 %), and 15.92 (95 % CI, 10.70–23.68), respectively. The SROC analysis revealed an area under the curve of 0.88. In subgroup analysis, the DOR for patients using corticosteroids and those for steroid-naive, non-smoking, smoking, chronic cough, and allergic rhinitis patients were 4.47 (95 % CI, 3.39–5.90), 21.40 (95 % CI, 15.38–29.76), 19.84 (95 % CI, 15.63–25.19), 5.41 (95 % CI, 2.97–9.86), 35.36 (95 % CI, 23.90–52.29), and 2.99 (95 % CI, 0.85–10.45), respectively. These results revealed the comparatively good performance of FeNO for the diagnosis of asthma in steroid-naive or non-smoking patients, particularly in chronic cough patients. Though the cases in our analysis were limited to children, the diagnostic estimates of FeNO in our analysis were in line with those for the entire population as reported by Guo et al. However, concern has been raised about the univariate model used in separately combining estimates of sensitivity and specificity, either with a fixed- or random-effects model, since it might lead to inaccurate results with the threshold effects and correlation between the two estimates being ignored.

Here, we applied a sophisticated method, the bivariate binomial mixed model, for combining diagnostic estimates. This specialized model utilizes a hierarchical structure of the distribution of data in terms of two levels corresponding to the variation within and between studies, with the two-dimensional nature of diagnostic accuracy being preserved [32]. Considering the significant between-study heterogeneity in our analysis, it is more appropriate to use this model. Additionally, this model takes into account a potential link between sensitivity and specificity and manages the differences in the precision of the two estimates. Last but not least, the bivariate model allows for the effect of covariates that influence the sensitivity and specificity. Notably, this method is at present regarded as the optimal method for acquiring pooled statistics for meta-analysis of diagnostic test accuracy and recommended by the Diagnostic Test Accuracy Working Group of the Cochrane Collaboration and the Agency for Healthcare Research and Quality (AHRQ) [31]. Therefore, our pooled estimates of diagnostic measures should be reliable.

Based on the ATS guidelines, FeNO levels less than 20 p.p.b. are considered below the “negative cut-off,” indicating that eosinophilic inflammation and responsiveness to corticosteroids is less likely [28]. On the other hand, FeNO levels greater than 35 p.p.b. represent the “positive cut-off,” signifying that eosinophilic inflammation does exist and the responsiveness to corticosteroids is good. Intermediate FeNO between 20 and 35 p.p.b. should be interpreted cautiously based on the clinical context. Among the included studies in our analysis, around 20 p.p.b. appears to be supported as the reasonable cut-off value as was used in relatively recent studies. Sivan et al. reported that children having values of FeNO more than 23 p.p.b. are very likely to have asthma, with a sensitivity of 0.86, specificity of 0.89, and DOR of 46.02 [20]. In a report of a birth cohort, Sachs-Olsen et al. found that 16.6 p.p.b. in children at age 10 was the optimal value in diagnosing current allergic asthma, but not in current non-allergic asthma [21]. The cut-off value in other studies ranged from 15.6 to 28 p.p.b. However, the Spearman correlation coefficient was 0.50 (p = 0.21), implying that the threshold effect was likely absent.
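The three interpretation bands described above can be written as a small rule, shown here purely to make the boundaries explicit. The function name and band labels are illustrative, and the handling of values exactly equal to 20 or 35 p.p.b. (treated as intermediate) is an assumption of this sketch; it is not a clinical decision tool.

```python
def interpret_feno(ppb: float) -> str:
    """Map a FeNO value in p.p.b. to the band described in the text."""
    if ppb < 20:
        return "low"           # eosinophilic inflammation less likely
    if ppb > 35:
        return "high"          # eosinophilic inflammation likely
    return "intermediate"      # interpret cautiously in clinical context


for value in (16.6, 23.0, 40.0):
    print(f"{value} p.p.b. -> {interpret_feno(value)}")
```

Several of the study-specific optimal cut-offs quoted above fall in or near the intermediate band, which is one reason a single universal threshold is hard to justify.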

When interpreting FeNO levels in diagnosing asthma, it is necessary to take neonatal history into account. As reported in a study by Ricciardolo et al., a history of neonatal respiratory distress in preterm infants without bronchopulmonary dysplasia (BPD) might lower the value of FeNO [33]. This finding paralleled the results of a study showing lower FeNO values in school-aged children with a history of BPD compared to controls [34]. Meanwhile, Filippone et al. found that FeNO levels were comparable among adolescents born preterm with BPD, adolescents born preterm without BPD, and healthy adolescents [35]. Thus, a FeNO value lower than expected warrants cautious interpretation in the clinic, since changes may occur in the airway in formerly premature infants with and without a history of BPD.

In addition to its diagnostic value, FeNO is recommended by the ATS as an indicator of medication management [28]. Peirsman et al. investigated the application of FeNO in asthma management with inhaled corticosteroids and leukotriene receptor antagonist medications in children with mild-to-severe asthma and allergic sensitization over a 52-week period. The researchers found that children in the FeNO group (in which FeNO, with a cut-off value of 20 p.p.b., was used to determine asthma management) experienced fewer asthma exacerbations over 1 year than those in the comparison group, in which clinical symptoms, rescue medication use, and the forced expiratory volume in 1 s were used to determine medication management. However, a recent meta-analysis of pediatric trials comparing the use of FeNO with conventional methods to manage asthma (including the studies of Jartti et al.) showed that, although the use of FeNO was associated with a lower frequency of >1 asthma exacerbation, this approach had no superior clinical benefit over the guideline-based method [36].

Our analysis had several limitations. First, the search strategy was restricted to articles published in English, which excluded some non-English literature studies, conference abstracts, and other study design articles. Second, the number of included studies in this systematic review and meta-analysis was relatively small, and the sample size varied widely in different studies. In addition, there was significant heterogeneity observed in the outcomes of sensitivity, specificity, and DOR, but we failed to identify the main source of heterogeneity since the information reported was limited.

Conclusions

Our systematic review and meta-analysis lends support to the view that FeNO is a promising tool for the detection of asthma in children, with moderate diagnostic accuracy. This is the latest meta-analysis assessing the diagnostic value of FeNO for asthma in children exclusively. Notably, however, more clinical trials are warranted to demonstrate its clinical benefits in real-world practice.