Introduction

Routine prefeed gastric aspiration to assess feed tolerance and determine the advancement of feeds is a widespread practice in preterm infants on gavage feeding [1,2,3]. However, the evidence for the benefits and risks associated with this practice is controversial. There is significant variability in the interpretation of gastric residue findings among healthcare workers [4, 5]. Feeds are frequently withheld or reduced in the presence of large-volume or discolored gastric residuals due to concerns about necrotizing enterocolitis (NEC). The frequent withholding of feeds and discarding of gastric residues lead to reduced intake of calories and micronutrients, including bile acids, which might result in extrauterine growth restriction [1, 6, 7]. In addition, the associated prolonged use of intravenous fluids and parenteral nutrition, and possible investigations for NEC, further burden the family and the health care system [1, 6, 8].

A recent guideline on feeding in low-birth-weight infants suggested checking prefeed gastric residual volume only after a minimum feed volume is attained rather than using it as a routine practice [2]. However, this recommendation was based upon observational studies and is not supported by recent trials [4, 9,10,11,12,13]. A Cochrane review concluded that there is uncertainty as to whether routine monitoring of stomach aspirates reduces NEC and warrants further studies [1]. A few more randomized controlled trials (RCTs) have been published after the Cochrane review [4, 14, 15]. We aimed to systematically synthesize the effect of avoiding the practice of routine prefeed gastric residue aspiration on the incidence of NEC (stage 2 or more) and other clinical outcomes in preterm (< 37 weeks) infants.

Methods

Search strategy

This review was registered with PROSPERO (CRD42020197657) and is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [16]. The electronic literature search was performed using The Cochrane Central Register of Controlled Trials (CENTRAL), CINAHL, Embase, PubMed, and Web of Science for RCTs published until March 8, 2021. The search was performed independently by two investigators (JK, JM) and included relevant medical subject heading (MeSH)/Emtree terms, keywords, and word variants for the study population (Neonates), intervention (gastric aspirate), and study design (Supplementary Table 1). The electronic search was supplemented by a hand search of the bibliography of the included studies and relevant review articles. In addition, we also searched the conference abstracts presented at the Pediatric Academic Societies meetings (https://plan.core-apps.com/year) in the last 3 years (2018–2020). We also searched for ongoing or completed but yet-to-be published trials in various registries namely Clinical Trial Registry of India (http://ctri.nic.in), ClinicalTrials.gov, Australian New Zealand clinical trial registry (http://www.anzctr.org.au/), and EU Clinical Trials Register (https://www.clinicaltrialsregister.eu/). For ongoing/unpublished trials, we contacted the corresponding author and enquired about the status of the trial (Supplementary Table 2). In case the trial was completed, we requested published/unpublished data and if available, included it in the analysis. We did not use any language restriction.

Study selection

RCTs comparing routine prefeed gastric residue aspiration with either no aspiration or any other intervention except prefeed aspiration were considered eligible for the review. Two investigators (JK, JM) independently screened the titles and abstracts for eligibility and identified studies that met the inclusion criteria. The same authors examined full texts of potentially eligible studies. Studies were included if they met all the following criteria: (i) population—preterm neonates on gavage feeds, (ii) intervention—routine prefeed gastric residue monitoring in one group, (iii) comparison—no prefeed gastric residue aspiration (may include another intervention), and (iv) outcome—reported at least one of the predefined clinical outcomes like time to reach full enteral feeds, NEC, sepsis, and mortality, etc. We excluded (i) studies involving term neonates or older children and (ii) studies reporting laboratory data only (like change in microbial flora or inflammatory markers, etc.).

Primary and secondary outcomes

Our primary outcome was the incidence of NEC stage 2 or more as per modified Bell’s staging. The secondary outcomes included time to reach full enteral feeds (at least 120 mL/kg/day), time to achieve higher volumes of enteral feeds (150 or 180 mL/kg/day), the incidence of sepsis (any sepsis and culture-positive sepsis), time to regain birth weight, number of days of parenteral nutrition, number of days of central venous line usage, episodes of feed intolerance requiring stopping feeds, anthropometry data at discharge and 40 weeks postmenstrual age, duration of hospital stay, and all-cause mortality. We expected a lot of variation in defining full-enteral feeds. Therefore, for this review, we considered the first point of achieving a volume of at least 120 mL/kg as the time to reach full enteral feeds. If the trial did not provide time to achieve 120 mL/kg feeds, the next highest volume was considered as time to full enteral feeds. Any sepsis refers to late-onset sepsis with or without culture-positive sepsis as defined by the primary authors.

Data extraction and quality assessment

Two investigators (JK, JM) independently extracted data from the full text of the eligible trials. The extracted data comprised the first author’s name, year of publication, study population characteristics, inclusion and exclusion criteria, detailed methodology including randomization and allocation concealment, details of the intervention and control groups, and relevant clinical outcomes. Another researcher (PM) rechecked the extracted data for accuracy and completeness. Two investigators (JM, JK) independently assigned an overall risk of bias to each trial using the Cochrane risk of bias tool [17]. Any disagreement was resolved through discussion with the senior investigator (PK).

Statistical analysis

Meta-analysis was undertaken for an outcome when relevant data were available from more than one study. For each outcome, we calculated a random-effects estimate of the pooled risk ratio (RR) (for dichotomous variables) or mean difference (MD) (for continuous variables) with 95% confidence interval (CI) using the Mantel–Haenszel method. The interquartile range, range, or 95% CI was converted to standard deviation by using the RevMan calculator or appropriate statistical conversion formulas as advocated by the Cochrane handbook [18]. In one study in which the results were given as least square mean (LSM) [4], we used the untransformed value directly as the mean. As the arithmetic mean and LSM are calculated differently, there might be some risk of heterogeneity in combining them. We addressed this heterogeneity by doing a sensitivity analysis as well as by calculating standardized mean differences (instead of mean differences) for relevant continuous outcomes (as recommended by the Cochrane handbook). Heterogeneity among studies was explored by inspection of the forest plots and the chi-square test on Cochran’s Q statistic and was quantified with the I² statistic. Publication bias could not be assessed due to an insufficient (n < 10) number of studies. Subgroup analysis was performed based upon gestational age and/or birthweight at enrolment, any additional intervention in the control group, and intrauterine growth restriction status, wherever feasible. RevMan 5.3 software was used for quantitative analysis. We assessed the quality of evidence for major outcomes using GRADE guidelines and GRADE Pro software (https://gdt.gradepro.org) [19, 20].
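The pooling and heterogeneity steps described above can be illustrated with a minimal sketch. It assumes an inverse-variance DerSimonian-Laird random-effects model on log risk ratios, which is a simplification and not necessarily identical to the RevMan Mantel–Haenszel computation used in this review. The function names and the trial counts are hypothetical and serve only as an illustration.

```python
import math

def log_rr(events_t, n_t, events_c, n_c):
    """Log risk ratio and its standard error for one two-arm trial.
    Assumes non-zero event counts in both arms (no continuity correction)."""
    rr = (events_t / n_t) / (events_c / n_c)
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return math.log(rr), se

def pool_random_effects(effects, ses):
    """DerSimonian-Laird pooled estimate, 95% CI, Cochran's Q and I^2 (%)."""
    w = [1 / se ** 2 for se in ses]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in ses]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, q, i2

# Hypothetical counts (NOT data from the included trials):
# (events, total) in the no-aspiration arm, then in the routine-aspiration arm.
trials = [(12, 70, 16, 72), (5, 40, 7, 40), (9, 50, 11, 50)]
ys, ses = zip(*(log_rr(*t) for t in trials))
pooled, (lo, hi), q, i2 = pool_random_effects(ys, ses)
print(f"RR {math.exp(pooled):.2f} (95% CI {math.exp(lo):.2f}-{math.exp(hi):.2f}), "
      f"Q = {q:.2f}, I2 = {i2:.0f}%")
```

Pooling is done on the log scale and the result is exponentiated at the end, so the confidence interval is asymmetric around the RR, as in the forest plots.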

Results

A total of 894 records were identified, of which 231 were duplicates and 620 were outside the scope of the current review. Forty-three studies were considered eligible for full-text screening; of these, six fulfilled the inclusion criteria and were considered for qualitative and quantitative synthesis (Fig. 1). Out of six included trials (451 participants), four (321 participants) compared routine prefeed gastric residue monitoring versus avoiding routine gastric residue monitoring [4, 12,13,14], whereas two trials (130 participants) used abdominal circumference monitoring along with avoiding routine prefeed aspiration [11, 15]. Table 1 shows the characteristics of the included studies along with the feeding schedule and the definition of “full enteral feeds,” which varied among the trials. The Cochrane risk of bias tool was used to assess the quality of the trials. Blinding of participants and personnel was not done in any of the trials except one; the remaining trials were therefore considered at high risk for performance bias. Similarly, blinding of outcome assessment was not done in any of the trials due to the nature of the intervention, subjecting them to a high risk of detection bias for subjective outcomes like time to reach full enteral feeds, feed intolerance, withholding feeds, hospital stay, etc. (Supplementary Figure 1). For one trial, only an abstract was available, which provided very limited information for risk of bias (ROB) assessment [14].

Fig. 1 PRISMA flowchart

Table 1 Characteristics of the included studies

Primary outcome (Fig. 2a)

Our primary outcome was the incidence of NEC stage 2 or more as per modified Bell’s staging. Five trials (421 participants) reported the primary outcome and did not find any statistically significant difference between the two groups (RR 0.80; 95% CI 0.31–2.08; I² 5%).

Fig. 2 Forest plot showing a comparison of necrotizing enterocolitis stage 2 or more (a) and time to reach full enteral feeds (b)

Secondary outcomes (Table 2)

There was great variability among studies in the definition of full enteral feeds. As decided a priori, we considered the first point of achieving a volume of at least 120 mL/kg as the time to full feeds. The group avoiding routine prefeed gastric aspiration achieved full enteral feeds about 3 days earlier (5 trials, 421 patients, MD − 3.19 days, 95% CI − 4.22, − 2.16; I² 0%) (Fig. 2b). We also compared the time to reach full enteral feeds for different definitions of full feeds (Table 2). There was no significant difference in the time to achieve enteral feeds of 120 mL/kg/day, but at higher volumes (150 and 180 mL/kg/day), the group avoiding routine prefeed gastric aspiration achieved full enteral feeds earlier.

Table 2 Comparison of secondary outcomes (avoiding routine aspiration versus routine prefeed aspiration)

The group avoiding routine prefeed gastric aspiration also had a significantly lower incidence of late-onset sepsis (RR 0.77; 95% CI 0.60–0.99; I² 0%), but not of culture-positive sepsis (RR 0.80; 95% CI 0.60–1.06; I² 0%). This group also had a shorter hospital stay (MD − 5.32 days; 95% CI − 10.25, − 0.38; I² 22%) than the routine prefeed gastric aspiration group. There was no significant difference in time to regain birth weight, number of days of total parenteral nutrition, number of days of central venous line usage, or all-cause mortality. None of the trials reported anthropometry data at hospital discharge or 40 weeks postmenstrual age.

Other outcomes

Some of these studies also compared the effect of the intervention on the number of episodes of feed intolerance, the number of times feeds were withheld, days of feed interruption, rate of feed advancement at various intervals, and anthropometry at various time points. Since the definitions of feed intolerance and the thresholds for stopping feeds differed across studies, we did not combine them in the meta-analysis.

Two studies [12, 15] did not observe any significant difference in the number of times the feeds were withheld, whereas Kaur et al. [11] reported significantly more feed intolerance episodes (80% vs 35%, p < 0.001) and feed interruption days (p < 0.001) in the routine prefeed gastric aspiration group. Similarly, in one study, the episodes of abdominal distension were significantly higher (p 0.001) in the routine prefeed gastric residue aspiration group [4]. Two trials studied enteral intake at weekly intervals [4, 13]. Torrazza et al. compared enteral intake at 2 weeks and 3 weeks postnatal age and did not find any significant difference [13]. In contrast, Parker et al. observed that the group avoiding routine prefeed aspiration had faster feed advancement (mean weekly increase, 20.7 mL/kg/day vs 17.9 mL/kg/day; p 0.02) and tolerated significantly larger feed volumes at 5 and 6 weeks postnatal age (though the differences were not significant in the first 4 weeks) [4].

Torrazza et al. and Parker et al. also compared the anthropometric parameters at 3 and 6 weeks postnatal age, respectively [4, 13]. Parker et al. observed significantly higher mean estimated log weights (7.01 [95% CI, 6.99–7.02] vs 6.98 [95% CI, 6.97–7.00]; p = .03) in the prefeed aspiration avoidance group at 6 weeks, whereas Torrazza et al. did not find any significant difference in weights at 3 weeks. In both trials, there was no significant difference in length or head circumference.

Subgroup analysis

As decided a priori, we did subgroup analysis based upon gestation, weight, and the additional intervention (abdominal girth monitoring) in the “avoiding routine prefeed gastric residue aspiration” group. Out of six trials, three enrolled infants ≤ 32 weeks/1250 g and one enrolled infants > 1500 g, whereas the remaining two enrolled wider ranges of gestation (27–36 weeks) and weight (750–2000 g) without distinct subgroup analysis. Therefore, only four trials were available for gestation/weight subgroup analysis (< 1250 g versus ≥ 1250 g), and all of them compared routine prefeed aspiration versus avoiding routine aspiration (Table 3). There was no significant effect of gestation and weight on any of the clinical outcomes. We planned to assess the impact of gestational age on the magnitude and direction of the effect size by meta-regression analysis; however, it could not be performed owing to the small number of studies (< 10).

Table 3 Subgroup analysis for primary and secondary outcomes

We also compared the subgroups in which abdominal circumference (AC) monitoring was done in addition to avoiding routine prefeed aspiration with those in which AC was not monitored. Except for time to reach full enteral feeds (which was significantly shorter in the subgroup where AC was additionally monitored in the no-aspiration group), there was no significant effect of additional AC monitoring in the no-aspiration group (Table 3). The difference in time to reach full enteral feeds is likely related to the feed volume used to define full enteral feeds (rather than to AC monitoring), as both studies in the AC group defined full enteral feeds at higher volumes (150 and 180 mL/kg) than the other subgroup (in which all studies considered 120 mL/kg as full feeds). Due to the nonavailability of data, the subgroup analysis for intrauterine growth restriction was not done.

Sensitivity analysis

We decided a priori to do a sensitivity analysis based on ROB in studies. However, all trials were at high risk of bias, precluding this analysis. To address the heterogeneity introduced in continuous outcomes by the inclusion of the study by Parker et al. [4], we did a post hoc sensitivity analysis. On removing this study from the analysis, there was no significant effect on the overall direction of the outcomes (though the effect size decreased, as it is the largest trial so far), except for length of hospital stay (2 trials; MD − 4.63 days; 95% CI − 12.43, 3.18). We also calculated standardized mean differences (SMD) for continuous outcomes including this study. In this post hoc analysis too, the results (SMD with 95% CI) remained unchanged for duration of hospital stay (− 0.29; 95% CI − 0.54, − 0.03), duration of parenteral nutrition (− 0.41; 95% CI − 1.05, 0.23), and duration of central venous line use (− 0.14; 95% CI − 0.47, 0.19).
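For readers unfamiliar with the standardized mean difference, the following sketch shows how a single trial's SMD can be computed as Hedges' adjusted g, with the small-sample correction and standard error formula commonly used in Cochrane-style analyses. The function name and the input values are hypothetical and are not taken from the included trials.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (Hedges' adjusted g) and its standard error."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)   # small-sample correction, equal to 1 - 3/(4N - 9)
    g = j * d
    n_total = n_t + n_c
    se = math.sqrt(n_total / (n_t * n_c) + g ** 2 / (2 * (n_total - 3.94)))
    return g, se

# Hypothetical example: hospital stay (days) in two arms of 50 infants each.
g, se = hedges_g(mean_t=38.0, sd_t=12.0, n_t=50, mean_c=42.0, sd_c=13.0, n_c=50)
print(f"SMD {g:.2f} (95% CI {g - 1.96 * se:.2f}, {g + 1.96 * se:.2f})")
```

Because the SMD divides by the pooled standard deviation, it expresses each trial's effect in standard deviation units, which is why it was used here as a check on combining arithmetic means with least square means.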

Discussion

In this meta-analysis of six RCTs, avoidance of routine prefeed gastric residue aspiration was not associated with an increased incidence of NEC (stage 2 or more) or all-cause mortality before discharge and can therefore be considered a safe practice (low-quality evidence). Also, avoiding routine prefeed aspiration was associated with earlier achievement of full enteral feeds (moderate-quality evidence) and earlier discharge from the hospital (moderate-quality evidence). Furthermore, avoiding routine prefeed aspiration was associated with a lower incidence of late-onset sepsis (low-quality evidence; number needed to treat for an additional benefit, 10 (95% CI 5–54)). No significant difference was observed in culture-positive sepsis, days of parenteral nutrition, time to regain birth weight, or central venous line use. None of the trials reported long-term data on growth and neurodevelopmental outcomes. Due to the small sample size and wide confidence intervals of the effect sizes, we remain uncertain about most of the outcomes. The ongoing clinical trials (Supplementary Table 2) could provide more data on important clinical outcomes.
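The number needed to treat depends on the baseline risk in the comparison group as well as on the risk ratio. The sketch below shows that relationship for the pooled late-onset sepsis RR of 0.77. The control-group risks are assumed values chosen for illustration only, and the text does not state the baseline risk or the method behind the reported NNT, so this is not a reconstruction of that figure.

```python
def nnt_from_rr(rr, control_risk):
    """NNT implied by a risk ratio at an assumed control-group risk."""
    arr = control_risk * (1 - rr)   # absolute risk reduction
    if arr <= 0:
        raise ValueError("no absolute risk reduction; NNT is undefined")
    return 1 / arr

# Assumed (hypothetical) late-onset sepsis risks in the routine-aspiration group.
for control_risk in (0.20, 0.30, 0.40):
    print(f"control risk {control_risk:.2f}: NNT about {nnt_from_rr(0.77, control_risk):.1f}")
```

The same relative effect translates into a smaller NNT in units with a higher baseline sepsis rate, which matters when applying the pooled estimate to a particular setting.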

Preterm neonates frequently experience feed intolerance due to gastrointestinal immaturity and decreased intestinal motility. The presence of a large amount of gastric residue indicates decreased motility and is considered an indicator of feed intolerance. In clinical practice, it is often used to decide advancement or withholding of feeds. The presence of abnormally large gastric residual volume (GRV) is assumed to be an early indicator of NEC. This assumption is largely based on older case-control studies [8, 10]. Cobb et al. investigated the relationship of GRV and NEC in a case-control study and found that the infants who developed NEC had a maximum GRV of 4.5 mL as compared to 2 mL in the control group. Based on this finding, they suggested that a GRV greater than 3.5 mL or one third of the previous feed volume may be associated with a higher risk for NEC [8]. Although the difference in residuals was statistically significant, there was a significant overlap in maximal GRV between the groups, therefore, decreasing the confidence in the results. Similarly, in a case-control study of 34 infants comparing GRVs from birth to the diagnosis of NEC, Bertino et al. observed significantly higher maximal GRV among the infants diagnosed with NEC (7.46 mL vs 4 mL, p 0.04) [10]. Although the difference was significant, there was a 17-day delay between the obtainment of the maximum GRV and the diagnosis of NEC. Moreover, there was no difference in overall 24-h residual volume further complicating the clinical significance. Though both studies found a relationship between higher GRV and NEC, there was a lack of consensus concerning the GRV threshold. Cobb et al. suggested that a GRV of > 3.5 mL may be associated with a higher risk of NEC, whereas Bertino et al. reported that the mean maximum GRV of 4 mL is safe (as seen in the control group). Mihatsch et al. observed that in the absence of other clinical signs, there is no correlation between light green GRVs and NEC [21]. 
These findings suggest that no defined threshold of GRV can reliably predict NEC. Also, in the absence of other gastrointestinal signs, GRV alone may not be a good predictor of NEC. The delay in attainment of full enteral feeds, prolonged use of parenteral nutrition, and longer use of indwelling central lines that accompany routine GRV monitoring might be counterproductive. Therefore, it might be reasonable to forgo the routine evaluation of prefeed GRVs in the absence of other gastrointestinal manifestations.

This was further supported by a large retrospective (before and after) study [22]. Riskin et al. enrolled 239 gavage-fed infants in whom routine GRV estimation was done and compared them with 233 infants in whom it was not practiced. They found that avoiding routine prefeed aspiration was associated with earlier attainment of full enteral feeds without increasing the risk for NEC. Moreover, the practice of routine aspiration of gastric residuals negatively affects the nutritional intake of these infants [4, 7]. These observations were further supported by multiple prospective randomized controlled trials [4, 11,12,13,14,15].

Furthermore, the frequent aspiration of gastric residue may cause gastric mucosal damage and intestinal bleeding. Hydrochloric acid is an important intestinal barrier and is considered essential in limiting intestinal bacterial overgrowth. Because aspirated gastric residuals are frequently discarded, the practice can increase inflammation and alter the intestinal microbiome, leading to an increased risk of late-onset sepsis [5, 14]. A recent trial found that routine prefeed aspiration adversely affects the protective microbiome (Lactobacillus), leading to overgrowth of pathogenic bacteria (Escherichia, Shigella, and Citrobacter) [14]. Also, routine aspiration does not have any beneficial effect on gastrointestinal function, intestinal inflammation, or gastrointestinal mucosal bleeding [5, 14]. The increased risk of late-onset neonatal sepsis in the routine prefeed aspiration group seen in our meta-analysis might be related to the altered microbiome.

Along with gastric residue, the abdominal circumference is also not a reliable measure of feed tolerance. Limited low to very-low quality evidence does not favor routine abdominal girth measurement for assessing feed intolerance or other clinical outcomes. It is highly prone to intra- and inter-observer variation. Furthermore, studies have shown that even among healthy premature infants, AC may vary by 3.5 cm during one feeding cycle, further precluding the use of any clinically meaningful cutoff value [2]. In a nutshell, considering the harmful effects associated with this practice, it may be best to reserve it for infants showing other clinical signs of feed intolerance and NEC.

Strength and limitations

Although the systematic review was performed as per standard PRISMA guidelines and adhered to the published protocol, it has a few limitations. Most of the included trials were small and not adequately powered for serious clinical outcomes like NEC and mortality. Also, they were at high risk for performance and detection bias, thereby decreasing the overall certainty of evidence. Half of the studies used mixed feeding; therefore, the effects of formula milk on adverse outcomes like feed intolerance and NEC cannot be ruled out. There were wide variations in the feeding protocols. None of the trials provided separate data for infants with intrauterine growth restriction or perinatal asphyxia, precluding subgroup analysis. Furthermore, data on long-term growth and neurodevelopmental outcomes are lacking.

For trials comparing routine prefeed aspiration with no routine aspiration, NEC (stage 2 or more) should be the primary outcome. To reliably prove that avoiding routine prefeed aspiration does not lead to a higher incidence of NEC as compared to routine aspiration, 16,986 participants would be required for an equivalence trial (1% equivalence limit) and 6458 participants for a superiority trial, with 80% power. Such mega-trials may not be feasible. Future RCTs of routine prefeed aspiration vs. no aspiration should follow uniform study and feeding protocols with standard definitions for the outcomes.
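To show why such trials become very large when the outcome is rare, the sketch below uses the standard normal-approximation sample size formulas for comparing two proportions. The NEC rates, the two-sided 5% significance level for superiority, and the one-sided 5% level per test for equivalence are assumptions made for illustration. The text does not report the inputs behind its own figures, so the output is not expected to match them.

```python
import math
from statistics import NormalDist

def n_per_group_superiority(p_control, p_treatment, alpha=0.05, power=0.80):
    """Participants per arm to detect p_control vs p_treatment (two-sided alpha)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    var = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil((z_a + z_b) ** 2 * var / (p_control - p_treatment) ** 2)

def n_per_group_equivalence(p, margin, alpha=0.05, power=0.80):
    """Participants per arm for an equivalence trial (two one-sided tests),
    assuming both arms share the same true risk p."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(1 - (1 - power) / 2)
    return math.ceil((z_a + z_b) ** 2 * 2 * p * (1 - p) / margin ** 2)

# Hypothetical NEC rates, chosen only to illustrate the order of magnitude.
print("superiority, 6% vs 4%:", 2 * n_per_group_superiority(0.06, 0.04), "participants in total")
print("equivalence, 5% risk, 1% margin:", 2 * n_per_group_equivalence(0.05, 0.01), "participants in total")
```

The required size grows with the inverse square of the difference or margin, so halving the equivalence limit roughly quadruples the number of participants.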

Conclusions

There is low- to moderate-quality evidence to suggest that avoiding routine prefeed gastric aspirate monitoring helps in reducing late-onset sepsis, achieving full enteral feeds earlier, and earlier discharge from the hospital. Also, it does not increase the risk of death or NEC. Therefore, it seems worthwhile to forgo the practice of routine prefeed gastric residue monitoring in the absence of other signs of feed intolerance in preterm low-birthweight neonates.