Introduction

Acute appendicitis remains the most common indication for urgent abdominal surgery in children [1–5], yet the clinical evaluation for appendicitis is frequently nonspecific and insufficient to guide surgical decision-making [6–11]. During the last three decades, US and CT have dramatically improved the diagnostic accuracy of appendicitis in pediatric patients, and they have therefore become central to the workup of children with clinically suspected appendicitis at most institutions [12–18]. More recently MRI has been employed for evaluation of the pediatric appendix at some hospitals [19–28].

Enthusiasm for using MRI to diagnose pediatric appendicitis has been driven by a commitment to the ALARA (as low as reasonably achievable) principle of diagnostic imaging and radiation exposure in children. Reports from several academic institutions suggest that MRI might be comparable to CT in evaluation of the pediatric appendix, with both sensitivities and specificities over 95% [19–21, 24–26, 28]. However, the MRI criteria provided for appendicitis have generally been extrapolated from old US and CT literature rather than from rigorously analyzed individual MRI parameters [20–22, 24, 25, 28]. Further complicating the situation, some of the historically accepted criteria for defining a normal appendix on both US [29, 30] and CT [31] have been called into question, with recent descriptions of a greater range of normal variation than was previously acknowledged.

Regarding the normal appearance of the appendix on MRI in children, the primary literature remains sparse. Therefore we investigated MRI findings of the normal appendix in order to aid in establishing a reference standard in the pediatric population.

Materials and methods

Our institutional review board approved this retrospective study with a waiver of informed consent. It was conducted in accordance with the Health Insurance Portability and Accountability Act (HIPAA).

Study population

We included pediatric patients (≤18 years of age) who underwent lumbar spine or pelvis MRI examinations at Boston Children’s Hospital between Jan. 1, 2013, and Dec. 31, 2013, for clinical indications other than acute intra-abdominal or pelvic visceral inflammatory pathology or associated pain, as determined through a review of medical records. There were 838 eligible pediatric patients whose MRI examinations were screened by a board-certified radiologist (D.W.S., a pediatric radiology fellow in the final month of training) to determine whether the cecum was included in the field-of-view. We excluded patients whose cecum was not included (n = 492; 58.7%) in the imaging because it was considered likely that the appendix would also be absent. The cecum was included in MRIs of 346 of the 838 patients (41.3%), and it was considered possible that the appendix might also be identified on available images. Therefore, we included the 346 patients whose cecum was incidentally imaged on MRI (207 girls [59.8%]; median age 11 years, [25th, 75th] quartiles = [5.9, 14.6] years).

The indications for MR imaging were evaluation of (1) vertebral body anomaly including spinal dysraphism or scoliosis (n = 104; 30.1%); (2) musculoskeletal pain, infection or tumor (n = 102; 29.5%); (3) lower back pain with or without radiculopathy (n = 72; 20.8%); (4) pelvic tumor (n = 30; 8.7%); (5) congenital anatomical anomaly of pelvis (n = 17; 4.9%); (6) superficial soft-tissue lesion (n = 14; 4.0%), and (7) central nervous system neoplasm (n = 7; 2.0%). On review of medical records, none of the children included in this study reported acute abdominal pain suspicious for appendicitis at the time of imaging, nor did any of them undergo appendectomy at our institution during the 2 weeks prior to or following their MRI examination.

MRI technique

Multiplanar MRI was performed using 1.5-T Avanto (n = 46; Siemens Healthcare, Erlangen, Germany) or 3.0-T system (n = 300, including Tim Trio = 104 [Siemens Healthcare, Erlangen, Germany]; Skyra = 94 [Siemens]; Signa = 59 [GE Healthcare, Waukesha, WI] and Verio = 43 [Siemens]). The 346 patients included 221 with exclusive lumbar spine MRI exams (38 of which included gadolinium-enhanced imaging), 100 with exclusive pelvis MRI exams (54 with gadolinium-enhanced imaging) and 25 who had both lumbar spine and pelvis MRI exams (15 with gadolinium-enhanced imaging). In the patients with both lumbar spine and pelvis MRI exams, the studies were interpreted together, with appendix characteristics recorded by the readers from whichever sequences best depicted the findings.

The MR imaging planes and pulse sequences varied depending on the region of interest and specific clinical indication. The protocol for lumbar spine MR imaging consisted of standard axial and sagittal T1- and T2-weighted spin echo, with variable inclusion of sagittal T2-weighted spin echo with fat saturation or fast inversion recovery, and axial and sagittal T1-weighted spin echo with fat saturation prior to and following intravenous administration of gadolinium-based contrast material (with slice thickness of 3 mm, no skip, and in-plane resolution of 0.5 mm × 0.5 mm). The protocol for pelvis MR imaging consisted of standard axial and coronal T1- and T2-weighted spin echo, with variable inclusion of axial or axial oblique T2-weighted spin echo with fat saturation, and axial and sagittal T1-weighted spin echo with fat saturation prior to and following intravenous administration of gadolinium-based contrast material (with slice thickness of 3.5 mm, no skip, and in-plane resolution of 1 mm × 1 mm).

Image interpretation

Two board-certified radiologists (G.R.S. and D.W.S., both pediatric radiology fellows in their final month of training) reviewed each MRI examination independently, coding answers to the following questions:

  1. Is the appendix visualized? 1 = no, 2 = yes with low confidence, or 3 = yes with high confidence.
  2. If the appendix is visualized, what is the maximum transverse diameter (defined on short axis from outer serosal margin to outer serosal margin, and as the smaller of the perpendicular cross-sectional measurements if the appendix appears oval-shaped, in order to account for obliquity of imaging)?
  3. If distinguishable from appendix wall, what are the signal characteristics of the appendiceal intraluminal contents? 1 = predominantly hypointense to appendix wall on both T1- and T2-weighted sequences, similar to known gas-containing structures; 2 = predominantly hyperintense to appendix wall on T2-weighted sequences, similar to known fluid-containing structures; 3 = variable segments reflecting both options 1 and 2; or 4 = indeterminate because of motion artifact, decompressed lumen, or low-confidence appendix visualization.
  4. Is there periappendiceal edema (fat stranding on T1- or hyperintensity on T2-weighted or short tau inversion recovery, when available)? 1 = no, or 2 = yes.
  5. Is there periappendiceal free fluid (T2 hyperintense signal similar to known fluid-containing structures, including urine-filled bladder when imaged)? 1 = no, or 2 = yes.

Because MRI examinations were performed for a variety of indications, there was variation in available sequences for review. Although all exams included at least a T1- or T2-weighted sequence without fat saturation, some exams also included T2 with fat saturation and T1 with fat saturation prior to and following intravenous administration of gadolinium contrast agents. All available sequences for a given exam were reviewed, and answers to these questions were based on whichever sequence(s) best demonstrated the appendix.

Statistical analysis

We used the Cohen kappa statistic to assess agreement regarding qualitative assessments (e.g., appendix visualization, intraluminal content signal characteristics, presence of periappendiceal edema or free fluid). We used the Spearman rho to assess the correlation between the estimates of the appendix diameter obtained by the two readers. Ordinary linear regression models were developed to assess the correlation between appendix measurements and demographic data, such as patient age. All statistical analyses were performed using the software MATLAB (MathWorks, Natick, MA). The statistical significance level was set at 0.05. In cases of non-normally distributed data, the 95% confidence intervals were estimated via bootstrapping with replacement (2,000 draws).
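The bootstrap step described above can be illustrated with a minimal Python sketch. It assumes a percentile bootstrap for the median; the analyses in this study were run in MATLAB, and the function name, the fixed seed and the sample diameters below are hypothetical choices made only for illustration.

```python
import random
import statistics


def bootstrap_median_ci(values, n_draws=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the median (assumed method)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement n_draws times and record each resample's median.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_draws)
    )
    lower = medians[int((alpha / 2) * n_draws)]
    upper = medians[int((1 - alpha / 2) * n_draws) - 1]
    return lower, upper


# Hypothetical diameters in mm (illustrative only, not study data).
diameters_mm = [4, 5, 5, 5, 6, 6, 5, 7, 6, 5, 4, 6, 5, 8, 5]
low, high = bootstrap_median_ci(diameters_mm)
print(f"median = {statistics.median(diameters_mm)} mm, 95% CI = [{low}, {high}] mm")
```

With integer-valued measurements such as diameters rounded to the nearest millimeter, the interval endpoints from this approach are themselves whole or half millimeters, so the interval can be narrow.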

Results

Agreement on visualization of the appendix

Reader 1 saw the appendix with high confidence in 197/346 cases (56.9%) and did not see the appendix in 149 (43.1%). Reader 2 saw the appendix with high confidence in 203/346 cases (58.7%) and with low confidence in 3 (0.9%), and did not see the appendix in 140 (40.5%). Both readers identified the appendix in the same patients with high confidence in 191 cases (55.2%), and did not see an appendix in 135 overlapping cases (39.0%). In 5 cases (1.5%), reader 1 saw the appendix with high confidence whereas reader 2 did not see the appendix, and in 12 cases (3.5%) reader 2 saw the appendix but reader 1 did not. In 2 cases (0.6%) reader 1 saw the appendix with high confidence and reader 2 saw it with low confidence. Using a 3 × 3 contingency table (for high and low confidence and non-visualization) and the Fisher exact test, there was a statistically significant agreement between readers on visualization of the appendix (P < 0.0001). The Cohen unweighted kappa was k = 0.88 (P < 0.0001), indicating strong agreement between readers.
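The unweighted kappa depends only on the number of cases in which the readers chose the same category and on each reader's category totals, so it can be recomputed from the counts reported above. The following Python sketch does this; the function name is hypothetical, and the zero for "both low confidence" is inferred from reader 1 having recorded no low-confidence cases.

```python
def cohen_kappa(agreements, reader1_totals, reader2_totals):
    """Unweighted Cohen kappa from per-category agreement counts and marginal totals."""
    n = sum(reader1_totals)
    if n != sum(reader2_totals):
        raise ValueError("both readers must rate the same number of cases")
    observed = sum(agreements) / n  # proportion of cases with identical ratings
    # Agreement expected by chance, from the product of the marginal proportions.
    expected = sum(r1 * r2 for r1, r2 in zip(reader1_totals, reader2_totals)) / n**2
    return (observed - expected) / (1 - expected)


# Categories: not seen, seen with low confidence, seen with high confidence.
agreements = [135, 0, 191]      # both readers chose the same category (0 is inferred)
reader1_totals = [149, 0, 197]  # reader 1 counts reported above
reader2_totals = [140, 3, 203]  # reader 2 counts reported above

kappa = cohen_kappa(agreements, reader1_totals, reader2_totals)
print(round(kappa, 2))  # 0.88
```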

Agreement on measurement of appendix diameter

Both readers saw the appendix in the same patient (with either high or low confidence) in 192 cases and provided a maximum diameter measurement. Figure 1 shows the relationship between the readers’ measurements. Because these data were not normally distributed, correlation of appendix diameters was assessed using the Spearman correlation coefficient. Estimated appendix diameters were in the range 2–11 mm. Median appendix diameter was 5 mm for reader 1 and 6 mm for reader 2, with a 95% confidence interval (CI) = [5, 6] mm for both, and (25th, 75th) quartiles = (5, 6) mm for both. The two sets of measurements did not differ significantly (P = 0.11) and were well correlated (Spearman rho = 0.73, P < 0.0001).
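For readers unfamiliar with the Spearman coefficient, it is the Pearson correlation computed on ranks, with tied values sharing the mean of their ranks. Ties are common when diameters are recorded in whole millimeters. A minimal Python sketch follows; the function names and the paired measurements are hypothetical and are not the study data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired measurements in mm (illustrative only, not study data).
reader1_mm = [5, 6, 4, 7, 5, 6, 8, 5]
reader2_mm = [6, 6, 5, 7, 5, 7, 9, 6]
print(round(spearman_rho(reader1_mm, reader2_mm), 2))
```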

Fig. 1 Distribution of the two readers’ measurements of appendix diameter, shown on a scatter plot

Relationship between age and appendix diameter

The relationship between age and appendix diameter is shown in Fig. 2. Ordinary linear regression models were developed separately for the two readers, as well as for their averaged measurements, with diameter as the outcome and age as the predictor. There was a positive and statistically significant relationship between age and diameter, both for individual readers and their average. In all three models, the regression coefficient for age was statistically significant (P < 0.001; standard error = 0.02; Wald statistic = 32.26–43.82).
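As an illustration of this type of model, the Python sketch below fits an ordinary least squares line with diameter as the outcome and age as the predictor, and returns the slope, its standard error and a Wald statistic. It assumes the Wald statistic is the squared ratio of the coefficient to its standard error, which the text does not state explicitly; the function name and example values are hypothetical.

```python
import math


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    wald = (slope / se_slope) ** 2  # assumed definition of the Wald statistic
    return slope, intercept, se_slope, wald


# Hypothetical ages (years) and diameters (mm), illustrative only.
age_years = [1, 2, 3, 4, 5]
diameter_mm = [2, 4, 5, 4, 5]
slope, intercept, se_slope, wald = simple_linear_regression(age_years, diameter_mm)
print(f"slope = {slope:.2f} mm/year, SE = {se_slope:.2f}, Wald = {wald:.2f}")
```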

Fig. 2 Graph shows relationship between age and appendix diameter

Agreement on signal characteristics of appendix intraluminal contents

Data on this parameter were available in 197 cases for reader 1 and 206 cases for reader 2, given their differential visualization of the appendix. Figure 3 shows the distributions of the two readers’ characterizations of the appendix intraluminal contents. These were characterized as mostly air (1), mostly fluid (2), a combination of air and fluid (3) or uncharacterizable (4). The two readers characterized the intraluminal contents as mostly air in 139 of 197 (70.1%) and 108 of 206 (52.4%) cases, respectively; mostly fluid in 9 of 197 (4.6%) and 15 of 205 (7.0%) cases, respectively; a combination of air and fluid in 6 of 197 (3.0%) and 17 of 205 (8.3%) cases, respectively; and uncharacterizable in 43 of 197 (21.8%) and 66 of 205 (32.2%) cases, respectively. Consequently their agreement varied across categories, with significant agreement on the intraluminal contents being air (P < 0.001), both fluid and air (P = 0.02) and uncharacterizable (P = 0.01). In addition to the variability of intraluminal contents, there was substantial variability between the readers, with significant differences between them for all categories except the characterization of mostly fluid (P = 0.24).

Fig. 3 Luminal contents characterizations. Graphs show signal characteristics of appendix intraluminal contents. These were characterized as mostly air (1), mostly fluid (2), a combination of air and fluid (3) or uncharacterizable (4)

Agreement on presence of periappendiceal edema or fluid

Neither reader identified evidence of periappendiceal edema in any case. Both readers identified the same six cases (two girls, four boys) with free fluid near the appendix, corresponding to 3.0% of the visualized appendix cases for reader 1 and 2.9% for reader 2.

Signal characteristics of the appendix wall on various MRI sequences

When distinguishable from intraluminal contents, the normal appendix wall demonstrated consistent signal characteristics on standard MRI sequences, without observable variation among patients. Figure 4 provides representative transverse images of the normal appendix, including one image demonstrating how the appendix diameter was measured in short axis. On standard T2-weighted spin-echo imaging, the appendix wall was isointense to other bowel wall and mildly hyperintense to skeletal muscle. On standard T1-weighted imaging, the appendix wall was isointense to other bowel wall and iso- to slightly hyperintense to skeletal muscle. On fat-saturated T1-weighted images obtained after gadolinium administration, the appendix wall was isointense to other bowel wall and markedly hyperintense to skeletal muscle.

Fig. 4 Lumbar spine MRI follow-up after surgery for release of a tethered cord in a 6½-year-old girl. The appendix, demarcated by calipers (a) or white arrows (b and c), measured 7 mm in diameter by both readers. a Axial T2. b Axial T1. c Axial T1 with fat saturation following intravenous administration of gadolinium-based contrast material

Discussion

In our study the normal appendix was seen on MRI examinations in approximately half of pediatric patients. When visualized on MRI, the normal pediatric appendix had a median diameter of 5–6 mm. The normal appendix shows variable intraluminal signal characteristics, but there should be no adjacent inflammatory changes; surrounding free fluid is only occasionally present. Given the increasing use of MRI for evaluating appendicitis in recent years and a paucity of information regarding the normal appearance of the appendix on MRI in pediatric patients, our findings can aid in establishing a reference standard for assessment of the appendix on MRI in the pediatric population.

Recent publications describing the use of MRI in the evaluation of pediatric appendicitis report sensitivities and specificities over 95% [20, 21, 24–26, 28]. By comparison, a meta-analysis of US and CT for appendicitis performed by Doria et al. [14] in 2006 revealed pooled sensitivities and specificities of 88% (95% CI: [86%, 90%]) and 94% (95% CI: [92%, 95%]), respectively, for US, and 94% (95% CI: [92%, 97%]) and 95% (95% CI: [94%, 97%]), respectively, for CT in pediatric patients, suggesting that MRI may perform at least as well as US and CT for diagnosing and excluding appendicitis in children. However, a common factor in the studies describing the utility of MRI for the evaluation of pediatric appendicitis is that they have been performed at large academic hospitals, presumably with high-volume pediatric emergency departments, and with interpretations by sub-specialized pediatric radiologists who are experienced in the use of MR imaging in children. Therefore, the generalizability of these results to broader practice settings remains unproven.

If MRI can be shown to consistently perform as well as CT for diagnosing pediatric appendicitis, then the ALARA principle would suggest it should ultimately replace CT in many instances, whether following US as part of a stepwise imaging algorithm [19, 21, 32] or as a primary imaging modality [26]. This could be particularly relevant in adult-focused community emergency departments, where pediatric patients are more likely to undergo CT for suspected appendicitis than they would be at a dedicated pediatric hospital [18]. Unfortunately for clinicians and radiologists interested in performing MRI for the evaluation of the pediatric appendix, the variability of imaging protocols and the inconsistency of described criteria for distinguishing the normal appendix from acute appendicitis on MRI present challenges to replication of reported results. To the best of our knowledge, our study represents the largest series characterizing the normal appearance of the pediatric appendix on MRI.

In 2002 Hörmann et al. [33] described the normal MRI appearance of the appendix in a small study of 13 healthy pediatric volunteers, reporting a mean diameter of 4.5 mm (range 3–5 mm). The signal characteristics of the appendiceal wall and “center” were subjectively described; however, signal characteristics of intraluminal contents were not distinguished from those of coapted mucosa. In 2008 Baldisserotto et al. [34] described a small cohort of 19 healthy adolescent volunteers whose appendix was visualized, reporting mean appendix diameter of 5 mm (range 3–7 mm). In all cases, the appendiceal contents were reported as hyperintense to muscle on T2-weighted sequences and hypointense on T1-weighted sequences. In our study, the normal appendix was visualized in 192/346 (55.5%) pediatric patients, providing a dataset that is an order of magnitude larger than those of the prior studies reporting the MRI characteristics of the normal appendix. This allows for calculation of 95% confidence intervals and possible extrapolation to a larger patient population.

Appendix diameter in our study population ranged from 2 mm to 11 mm (median was 5 mm for reader 1 and 6 mm for reader 2). These values are similar to measurements reported in recent investigations using US and CT. For example, in 2011 Goldin et al. [35] described 127 pediatric patients with a fully visualized normal appendix, with median diameter of 5.0 mm (inter-quartile range, 4.0–5.7 mm). In 2012 Trout et al. [30] described 117 pediatric patients with a normal appendix imaged by US, in which the mean appendix diameter was 5.9 mm (standard deviation [SD] 1.8 mm) for one reader and 5.7 mm (SD 1.6 mm) for another. In 2014 Coyne et al. [29] reported 388 cases of normal appendix evaluated by a single pediatric radiologist with compression US, with mean appendix anteroposterior diameter of 4.4 mm (SD 0.9 mm), and mean transverse diameter of 5.1 mm (SD 1.0 mm). In 2014 Trout et al. [31] reported 420 cases of normal pediatric appendix imaged by CT, with mean appendix diameter of 5.57 mm (SD 1.35 mm) for one reader and 5.68 mm (SD 1.50 mm) for another.

These recent studies describing the normal US and CT characteristics of the pediatric appendix were conducted in response to a concern that the commonly stated 6-mm upper limit of normal for the appendix diameter might not account for normal variation among children and might therefore contribute to persistent false-positive diagnoses of appendicitis if used as a strict diagnostic threshold [29–31]. The >6-mm diameter criterion for appendicitis was originally suggested as a guideline by Jeffrey et al. [36] in 1988 based on their experience performing appendix US in a population of 250 adults with clinically suspected appendicitis. This guideline was adopted as a CT criterion for appendicitis, perhaps with less rigorous validation than would have been optimal, and has now been commonly referenced in both the US and CT literature for decades [30, 31]. Interestingly, several authors describing MRI for evaluation of the pediatric appendix have repeated the 6-mm appendix diameter cutoff [20, 22, 34]. Alternatively, some authors have used a 7-mm upper limit of normal as a diameter threshold for distinguishing normal from inflamed appendices, based on their institutional experience [21, 35]. Our data suggest that a substantial proportion of normal appendices measure >6 mm on MRI, reinforcing others’ suggestions that appendix diameter should be only one of several criteria for the diagnosis of appendicitis [26, 27, 30, 31].

In our study population there was a positive linear relationship between age and appendix diameter (Fig. 2). A relationship between age and appendix diameter has not been reported in the MRI literature. This finding is supported by the experience of Trout et al. [31] in their recent article characterizing the normal CT appearance of the pediatric appendix; they reported that the diameter increased by 0.4 mm/year until 6–7 years of age, after which they did not observe further diameter increase. Contradicting our data is the report of Coyne et al. [29], who observed a normal distribution of normal appendix diameters across their entire pediatric population but did not identify a direct age relationship.

Our patients showed considerable variation in the signal characteristics of intraluminal contents, including signal patterns interpreted to most likely represent gas, mixed content, or fluid. There was also substantial variability between the readers, which might be partly attributable to observations from different segments of the appendix, in addition to intrinsic interobserver variation. Nevertheless, this variability of the intraluminal contents is plausible and is consistent with US and CT findings of occasional gas, fluid and mixed fecal content within appendiceal segments, as observed in clinical practice. Interestingly, Hörmann et al. [33] described mild T2-hyperintensity in the “center” of all normal appendices (n = 13), and Baldisserotto et al. [34] described T2-hyperintense “internal contents” in all normal appendices (n = 19). Both described predominant T1-hypointensity of the center, or of internal contents. The homogeneity of their observations could relate to their small numbers of patients.

It is particularly noteworthy that several articles describe a fluid-filled appendix as a finding of appendicitis on MRI, and fluid is generally included as one of several suggestive findings [21, 22, 25]. We found that the normal appendix was predominantly fluid-filled in a small but notable minority of patients (4.6% for reader 1 and 7.0% for reader 2). The descriptions by Hörmann et al. [33] and Baldisserotto et al. [34] for the normal MRI appearance of the pediatric appendix also suggest that T2-hyperintense internal contents are seen in many normal cases, although it is unclear from their descriptions whether the T2-hyperintense signal was considered suggestive of fluid, per se. In combination, these findings support the statement of Moore et al. [27] that fluid may be present within the appendiceal lumen in the absence of inflammation.

Neither reader in our study identified any of the normal visualized appendix cases as demonstrating periappendiceal inflammatory edema or fat-stranding, while both readers identified free fluid around the appendix in the same six children. Free fluid is a nonspecific finding in both boys and girls, and alone it is not a useful diagnostic criterion for appendicitis [37, 38]. However inflammatory changes localizing to the periappendiceal region may be the most important finding of appendicitis [20, 22, 26, 27]. Moore et al. [27], for example, stated that “the most important finding to indicate acute appendicitis on MRI is focal periappendiceal inflammation.”

Our study has several limitations. First, while two prior studies of the normal MRI appearance of the pediatric appendix were performed on healthy, asymptomatic volunteers, our study cohort was composed of consecutive pediatric patients with imaging indications ranging from minor trauma to central nervous system neoplasms; therefore our cohort might not reflect a completely normal population of children. However, we excluded children who carried a diagnosis likely to involve the appendix. Assuming that our cohort reasonably reflects otherwise healthy patients, our results likely provide a valuable reference for the normal pediatric appendix on MRI. A second potential limitation of our study is that we did not evaluate the utility of individual MR imaging sequences or planes for identification of the normal appendix. We reviewed all available images for each patient, regardless of the MR sequence, plane of imaging or presence of intravenous contrast material. Therefore we cannot comment on whether the use of contrast material improves visualization of the appendix. On this topic, however, Rosines et al. [28] reported no significant difference in rates of visualization of the appendix among T2-weighted sequences only, T1-weighted unenhanced sequences only, T1-weighted contrast-enhanced sequences only, and all sequences combined. This suggests that minimal, if any, bias was introduced by the review of all available contrast-enhanced sequences for rate of appendix visualization or characterization of its appearance in our study cohort. A third limitation is that we did not evaluate differences in appendix imaging characteristics between 1.5-T and 3.0-T magnets. No inferences can therefore be drawn from our dataset regarding the relative performance of various MR imaging systems.

Conclusion

The normal appendix was incidentally visualized on MRI examinations in approximately half of pediatric patients. When visualized on MRI, the normal pediatric appendix had a median diameter of ~5–6 mm (95% CI: [5, 6] mm). The normal appendix may contain fluid or mixed contents, though it is more often predominantly gas-filled. There should be no periappendiceal inflammation around a normal appendix, although surrounding free fluid is occasionally seen. Our findings could be used to develop a reference standard for the normal appendix on MRI in the pediatric population in both clinical and research settings.