Introduction

Acute appendicitis is one of the most common surgical emergencies in pediatric patients. In the USA in 2016, it was diagnosed in approximately 100,000 children [1]. Ultrasound is a first-line imaging test for pediatric patients with right lower quadrant pain and suspected acute appendicitis based on American College of Radiology Appropriateness Criteria recommendations [2]. However, ultrasound is commonly described as a highly “operator dependent” imaging modality, in which operator variation impacts diagnostic performance [3,4,5]. This attributed characteristic is often listed as a limitation of ultrasound, particularly compared to other modalities like computed tomography (CT) and magnetic resonance imaging (MRI). Despite claims of operator dependence, the impact of the individual sonographer and radiologist on imaging and clinical outcomes remains largely unstudied, including in the realm of pediatric appendix ultrasonography.

A small number of recent studies have moved beyond the assessment of an imaging test via diagnostic performance and begun to evaluate the impact of individual radiologists on specific clinically meaningful outcomes [6, 7]. Interestingly, these studies found that while radiologist experience failed to predict clinical outcomes like hospital admission, hospital readmission, and need for surgical intervention, there were significant differences between radiologists in the use of downstream healthcare resources, such as MRI and subspecialty consultation. We believe such outcome measures may help further characterize the “operator dependent” nature of pediatric appendix ultrasound.

The purpose of the current study was therefore to determine if individual sonographers and radiologists impact appendix visualization by ultrasound and subsequent utilization of CT in children presenting to the Emergency Department with suspected acute appendicitis. We hypothesized that there would be no significant impact of individual sonographer or radiologist on these imaging-related outcomes at our pediatric tertiary care institution after adjusting for clinical variables and system factors thought to influence these outcomes.

Materials and methods

This single-institution retrospective cohort study was approved by the local Institutional Review Board and was compliant with the Health Insurance Portability and Accountability Act (HIPAA, USA). A waiver of documentation of informed consent was granted.

Study population

All patients 18 years old or younger who underwent an appendix ultrasound examination at Cincinnati Children’s Hospital Medical Center between January 2010 and September 2019 were eligible for inclusion. Patients were identified using software held in the Department of Radiology (Insight; Softek Illuminate).

Appendix ultrasound examinations meeting any of the following criteria were excluded (Fig. 1): examination performed in the inpatient or outpatient setting (i.e., only exams performed for Emergency Department patients were included); radiology report was addended; examination was performed by a sonographer with fewer than 100 appendix ultrasound examinations of experience during the study period (this number was determined a priori); examination was interpreted by an attending radiologist with fewer than 100 interpreted appendix ultrasound examinations of experience during the study period (this number was determined a priori); radiologist, sonographer, and/or other imaging data were missing (often related to the radiology report not following our agreed-upon departmental structured template, see below); impression of the radiology report did not fit into one of five categories (defined within the structured report, see below); ultrasound encounter could not be linked to a patient’s records during the electronic medical record (EMR) query for clinical data (see below); or no patient weight was recorded. Finally, repeat examinations on the same patients were excluded. Only the first appendix ultrasound examination was included per patient, so all included ultrasound examinations were obtained in unique individuals.
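As an illustration of the operator-volume criterion only, the following minimal Python sketch keeps examinations whose sonographer and radiologist each reach the a priori threshold of 100 examinations. The record layout (dictionaries with "sonographer" and "radiologist" keys) and the function name are hypothetical, and counting operators over the supplied list in a single pass is an assumption; the order in which the exclusion criteria were actually applied is not restated here.

```python
from collections import Counter

MIN_EXAMS = 100  # a priori threshold stated in the exclusion criteria


def filter_by_operator_volume(exams, min_exams=MIN_EXAMS):
    """Keep exams whose sonographer AND radiologist each have >= min_exams exams.

    `exams` is a list of dicts with hypothetical keys "sonographer" and
    "radiologist"; counts are taken over the list that is passed in.
    """
    sonographer_counts = Counter(e["sonographer"] for e in exams)
    radiologist_counts = Counter(e["radiologist"] for e in exams)
    return [
        e
        for e in exams
        if sonographer_counts[e["sonographer"]] >= min_exams
        and radiologist_counts[e["radiologist"]] >= min_exams
    ]
```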

Fig. 1

Flowchart depicting derivation of final study population through application of inclusion and exclusion criteria

Ultrasound protocol

For each appendix ultrasound examination, the patient was brought to a dedicated room in the Department of Radiology and placed on an examination table in supine position. One of 31 dedicated pediatric sonographers with 4 to 34 years of experience (mean = 16.2 ± 9.2 years) since completion of sonographer training performed the examinations. A linear high-frequency transducer (variable frequencies during the study period, ranging from 9 to 18 MHz) was utilized to image the right lower quadrant and appendix (if identified). Images were acquired in transverse and longitudinal planes using graded compression technique, and cine clips of the appendix were obtained to assess compressibility of the appendix. A curved low-frequency transducer (variable frequencies during the study period, ranging from 4 to 11 MHz) was also routinely used to assess the pelvis per protocol. Abdominal tenderness was assessed during graded compression of the appendix and recorded by the sonographer as absent, mild, moderate, or marked. Rebound tenderness was assessed by observing for pain immediately after removing the ultrasound transducer from the abdomen and recorded by the sonographer as absent, mild, moderate, or marked.

A pediatric radiology attending physician, pediatric radiology fellow, or radiology resident reviewed ultrasound examination images with the sonographer either in person or by telephone prior to completion of the imaging study. At the discretion of the reviewing radiologist, additional images may have been obtained by the sonographer, the pediatric radiology fellow, or the attending pediatric radiologist. Attending radiologists either completed the imaging report alone or reviewed and finalized a report drafted by a radiology resident or pediatric radiology fellow. All examinations were interpreted by one of 31 fellowship-trained pediatric radiologists with 2 to 32 years of post-fellowship experience (mean = 14.1 ± 8.1 years). All included cases were reported using a structured report (Fig. 2) [8]. Radiologists were able to make free-text edits to this report.

Fig. 2

Institutional structured report for appendix ultrasound examinations. Pick list selections and editable fields are highlighted in gray. Note: Impressions 1 and 4 were considered negative for acute appendicitis. Impression 3 was considered positive for acute appendicitis. Impressions 2 and 5 were considered equivocal or indeterminate for acute appendicitis. Section demarcated by “[ …]” was completed only if the appendix was visualized

Documentation of imaging and system data

Multiple demographic and imaging-related variables, including patient age and sex, sonographer performing the ultrasound examination, attending radiologist interpreting the ultrasound examination, facility where the ultrasound examination was performed (main hospital vs. smaller satellite hospital), and date and time of the examination were all collected from the Department’s Picture Archiving and Communication System (PACS) or the radiology information system (RIS). Time of ultrasound examinations was classified as “weekday” if performed Monday through Thursday between 07:00 and 22:00 h or Friday from 07:00 to 17:00 h. These intervals reflected the time daytime departmental radiology staff were present. Time of ultrasound examinations was classified as “weekend/evening” if performed during another time. The PACS and RIS also were queried to identify those patients who underwent a CT examination of the abdomen and pelvis within 24 h of the ultrasound examination.
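The time-of-examination rule described above can be expressed as a short Python sketch. The function name is hypothetical, and treating both interval endpoints as inclusive is an assumption, since the text does not specify how examinations performed exactly at 07:00, 17:00, or 22:00 h were handled.

```python
from datetime import datetime, time


def classify_exam_time(ts: datetime) -> str:
    """Return "weekday" or "weekend/evening" for an examination timestamp."""
    start = time(7, 0)
    day = ts.weekday()  # Monday = 0 ... Sunday = 6
    if day <= 3:        # Monday through Thursday: 07:00 to 22:00 h
        end = time(22, 0)
    elif day == 4:      # Friday: 07:00 to 17:00 h
        end = time(17, 0)
    else:               # Saturday or Sunday
        return "weekend/evening"
    # Assumption: interval endpoints are inclusive.
    return "weekday" if start <= ts.time() <= end else "weekend/evening"
```

For example, an examination on a Friday at 18:00 h would fall in the "weekend/evening" group, whereas the same clock time on a Tuesday would be classified as "weekday".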

The following data fields were extracted from the pertinent ultrasound imaging reports: appendix visualization (yes or no), abdominal tenderness with graded compression (see above), rebound tenderness (see above), and imaging impression (structured text). Clinically, patients were assigned to one of five impression categories as part of our structured report. If the impression was edited in such a way that the conclusion of the report was not different from one of the five defined categories, it was assigned to the parent category. If the impression did not fit one of these categories, the examination was excluded. For analysis purposes, ultrasound impression categories 1 and 4 were grouped as “negative for acute appendicitis,” 3 as “positive for acute appendicitis,” and 2 and 5 as “equivocal or indeterminate for acute appendicitis” (Fig. 2).
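The grouping of the five structured-report impressions into three analysis categories amounts to a simple lookup, sketched below in Python. The dictionary and function names are hypothetical; returning None for an impression outside the five categories mirrors the exclusion rule described above rather than any specific software used in the study.

```python
# Mapping from structured-report impression number to analysis group.
IMPRESSION_GROUPS = {
    1: "negative for acute appendicitis",
    4: "negative for acute appendicitis",
    3: "positive for acute appendicitis",
    2: "equivocal or indeterminate for acute appendicitis",
    5: "equivocal or indeterminate for acute appendicitis",
}


def group_impression(category):
    """Return the analysis group, or None if the exam would be excluded."""
    return IMPRESSION_GROUPS.get(category)
```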

Clinical data

The following clinical data related to each imaging encounter were documented from the EMR (Epic, Epic Systems Corporation): weight (kg) obtained closest to and within 6 months of the ultrasound examination (age, height, and body mass index [BMI, or kg/m²] were also recorded but not included in final multivariable analyses due to collinearity with weight), initial patient temperature in degrees Fahrenheit during the same ED encounter as the ultrasound examination, and white blood cell count in number of thousand cells per microliter obtained closest to and within 24 h of the ultrasound examination.

Statistical analysis

Continuous data were summarized as means and standard deviations, while categorical data were summarized as counts and percentages. Frequency of appendix visualization (determined by the imaging report, which was finalized by the interpreting radiologist, see above) was calculated per individual sonographer and radiologist. The Student’s t-test and Chi-square test were used to compare continuous and categorical variables, respectively, between patients with (1) ultrasound appendix visualization vs. non-visualization, and (2) CT within 24 h after ultrasound vs. no CT within 24 h after ultrasound. Multivariable logistic regression was used to further evaluate the associations of individual sonographers, individual radiologists, clinical variables, and system factors with ultrasound appendix visualization and CT utilization. Final models were fitted using stepwise variable selection, with sonographer and radiologist also forced into the models. Sonographer and/or radiologist variables were then removed from the model(s) if they failed to improve model goodness of fit. These models allowed the independent effect of sonographers on our outcomes to be studied following adjustment for interpreting radiologists, and vice versa. Based on institutional workflow at our main and satellite hospitals, the relationships between sonographers and radiologists were considered to be random. Adjusted odds ratios (OR) and OR plots with 95% confidence intervals were calculated, as appropriate.
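For readers less familiar with how adjusted odds ratios arise from a logistic regression model, the following Python sketch converts a fitted coefficient and its standard error into an OR with an approximate 95% confidence interval. It assumes a Wald-type interval (coefficient ± 1.96 standard errors on the log-odds scale); the interval method used in the actual SAS analysis is not stated above, and the function name and inputs are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE.

    Assumes a Wald-type interval: exp(beta - z*se) to exp(beta + z*se).
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )
```

A coefficient of zero corresponds to OR = 1.0 (no association), and an interval that excludes 1.0 corresponds to a significant association at the matching alpha level under this approximation.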

Finally, the associations between sonographer appendix visualization frequency and (1) number of ultrasound examinations performed for suspected acute appendicitis and (2) number of years’ experience since graduation from ultrasound training were assessed using Pearson correlation (r).
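The Pearson correlation coefficient used for this comparison can be computed as in the minimal Python sketch below. The function name and inputs are illustrative only; the study analyses themselves were performed in SAS.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

Here xs would hold, for each sonographer, either the number of examinations performed or the years of experience, and ys the corresponding frequency of appendix visualization.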

A p value less than 0.05 was considered statistically significant for all inference testing. Statistical analyses were performed using SAS, version 9.4 (SAS Institute Inc.).

Results

There were 17,984 ultrasound examinations for suspected acute appendicitis eligible for inclusion in our study. After applying exclusion criteria, 9271 unique patients and examinations (51.5%) were included in the final study population (Fig. 1).

Study population

Mean patient age was 9.9 ± 4.2 years; 5392 of 9271 (58.2%) included patients were boys. Thirty-one sonographers and 31 radiologists performed or interpreted (respectively) at least 100 ultrasound examinations for suspected appendicitis during the study period. The sonographers performed an average of 299 ± 139 examinations (range, 115–610), and the radiologists interpreted an average of 299 ± 157 examinations (range, 101–845). The appendix was visualized in 5449 of 9271 (58.8%) ultrasound examinations. An abdominopelvic CT was obtained within 24 h after the ultrasound examination in 843 of 9271 (9.1%) patients. Clinical variables, system factors, and imaging findings for the study population are summarized in Table 1.

Table 1 Summary of clinical variables, system factors, and ultrasound impressions for our study population (n = 9271). Significant p values in bold

Unadjusted frequencies of appendix visualization and CT utilization

Tables 2 and 3 summarize the frequencies of appendix visualization and CT utilization by individual sonographer and radiologist. The unadjusted mean frequency of appendix visualization was 57.8 ± 8.7% (range, 40.9–76.0%) per sonographer and 59.5 ± 4.1% (range, 51.7–66.3%) per radiologist. The unadjusted mean frequency of CT utilization was 9.2 ± 2.0% (range, 5.9–14.0%) per sonographer and 9.2 ± 1.8% (range, 3.4–12.1%) per radiologist. The number of examinations performed or interpreted and years of experience for sonographers and radiologists are presented in Tables 2 and 3.

Table 2 Frequencies of appendix visualization and CT utilization by sonographer for exams performed during the study period (2010 to 2019), organized by descending frequency of appendix visualization
Table 3 Frequencies of appendix visualization and CT utilization by radiologist for exams performed during the study period (2010 to 2019), organized by descending frequency of appendix visualization

Multivariable logistic regression and primary outcomes

Tables 4 and 5 present the results of multivariable logistic regression models, including odds ratios, with appendix visualization by ultrasound and post-ultrasound CT utilization serving as primary outcomes. Significant predictors of appendix visualization included patient weight (OR = 0.98, p < 0.0001), male sex (OR = 1.26, p = 0.0003), white blood cell count (OR = 0.97, p < 0.0001), temperature (OR = 0.92, p = 0.002), abdominal tenderness (OR = 0.61–1.05, p = 0.004), presence of appendicitis (OR = 2.92–221.55, p < 0.0001), sonographer (OR = 1.00–5.95, p < 0.0001), and radiologist (OR = 1.04–2.48, p = 0.02). Predictors of CT utilization following ultrasound included patient weight (OR = 1.02, p < 0.0001), white blood cell count (OR = 1.07, p < 0.0001), abdominal tenderness (OR = 1.62–3.40, p < 0.0001), presence of rebound (OR = 0.77–1.88, p = 0.0003), and presence of appendicitis (OR = 0.08–1.95, p < 0.0001), but not sonographer or radiologist. When forced into the model, neither the sonographer (p = 0.71) nor radiologist (p = 0.73) variable was significant. Odds ratio plots for appendix visualization by ultrasound and post-ultrasound CT utilization are presented for individual sonographers and radiologists, respectively, in Fig. 3.
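Odds ratios for continuous predictors are small per unit but compound over larger differences. The Python sketch below illustrates this, assuming the reported ORs are expressed per one-unit increase (e.g., per 1 kg of weight) and that the log-odds are linear in the predictor, as in a standard logistic model; the function name is hypothetical.

```python
def scale_odds_ratio(or_per_unit, units):
    """Odds ratio implied for a difference of `units`, given a per-unit OR.

    Assumes log-odds are linear in the predictor, so per-unit ORs multiply.
    """
    return or_per_unit ** units
```

Under these assumptions, the OR of 0.98 for weight in the appendix visualization model would correspond to an OR of roughly 0.82 for a 10-kg difference, and the OR of 1.02 in the CT utilization model to roughly 1.22.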

Table 4 Multivariable logistic regression model for appendix visualization by ultrasound. Significant p values in bold
Table 5 Multivariable logistic regression model for post-ultrasound CT utilization. Significant p values in bold
Fig. 3

Multivariable logistic regression odds ratio plots showing effect of sonographer and radiologist on appendix visualization by ultrasound and post-ultrasound CT utilization, respectively, following adjustment for one another, clinical variables, and system factors. Reference sonographer/radiologist set at OR = 1.0. a Plot showing odds ratios for appendix visualization by sonographer (reference = sonographer 13). b Plot showing odds ratios for post-ultrasound CT utilization by sonographer (reference = sonographer 29). c Plot showing odds ratios for appendix visualization by radiologist (reference = radiologist 23). d Plot showing odds ratios for post-ultrasound CT utilization by radiologist (reference = radiologist 26). Whiskers represent odds ratio 95% confidence intervals. Dashed line is OR = 1 reference line

Sonographer experience and appendix visualization

There was no significant correlation between the number of ultrasound examinations performed for suspected acute appendicitis and the frequency of individual sonographer appendix visualization (r = 0.24; p = 0.20). There was no significant correlation between the number of years’ experience since completion of sonographer training and the frequency of individual sonographer appendix visualization (r = −0.15; p = 0.43).

Discussion

To our knowledge, this is the first study within the area of pediatric appendicitis to rigorously compare the impact of individual sonographers and radiologists on important imaging-related outcomes in a very large patient cohort. In our cohort of 9271 appendix ultrasound examinations (unique patients), the frequency of appendix visualization was 58.8%. This is higher than in a prior study, which showed a 48.8% frequency of pediatric appendix visualization across 10 institutions in the USA [9]. Upon multivariable analysis and adjusting for the effect of one another, both sonographers and radiologists significantly varied in their likelihood of identifying the appendix (40.9–76.0% of examinations for sonographers and 51.7–66.3% of examinations for radiologists). The frequency of appendix visualization was not significantly correlated with the number of ultrasound examinations a sonographer performed during the study period (all sonographers in our study performed more than 100 examinations during the study period). Similarly, there was no statistically significant association between frequency of visualization and amount of time from completion of ultrasound training. This lack of correlation between sonographer experience (years, number of exams) and appendix visualization suggests that experience alone does not predict performance; further research in this area could be useful.

Discrepancies between sonographer years of experience and number of examinations were noted, which may relate to years of experience at an outside institution or part-time employment. While conjecture, variation in visualization of the appendix among technologists could be due to several factors, including specific training experience, ability to optimize the ultrasound system, sonographer size and strength impacting quality of graded compression, willingness to take extra time with challenging cases, and preference for sensitivity at the cost of specificity (or vice versa).

Ultrasound imaging of the appendix also was radiologist-dependent, with some radiologists identifying the appendix more often than others after adjusting for sonographer, clinical variables, and system factors. This could be due to several reasons. First, it is possible that some radiologists simply have a better “eye” for identifying the appendix on the ultrasound images and cine clips provided. Second, it is possible that certain radiologists are more likely to post-scan, and thus are more likely to identify the appendix. Third, it is conceivable that sonographers perform differently when scanning for different radiologists. Finally, similar to sonographers, it is likely that different radiologists operate at different points on the receiver operating characteristic curve, and thus have varying sensitivities and specificities.

Several clinical factors also were significantly associated with appendix visualization by ultrasound. Increasing patient weight was significantly associated with decreased odds of appendix visualization. Difficulty identifying the appendix in overweight and obese patients is not surprising, but the literature is inconsistent with regard to the effect of obesity on the sonographic diagnosis of appendicitis [10, 11]. Decreased body temperature and decreased white blood cell count also were associated with increased appendix visualization, implying that greater systemic inflammation was associated with a paradoxically lower likelihood of appendix identification, and perhaps indicating cases of perforation. Similarly, mild tenderness was associated with increased appendix visualization, while moderate or severe tenderness was associated with decreased appendix visualization. These findings could relate to the limited effectiveness of graded compression in the setting of greater pain and inflammation (i.e., patient guarding) or could reflect appendiceal perforation resulting in altered appendiceal morphology or obscuration of the appendix by robust right lower quadrant inflammation.

The final impression of the ultrasound report was highly associated with appendix visualization, with examinations both positive for appendicitis (OR = 217.09) and equivocal for appendicitis (OR = 2.89) having increased odds of seeing the appendix compared to examinations negative for appendicitis. This is not unexpected as selection of the positive and one of the two equivocal categories is predicated on identification of the appendix. There was no significant association between appendix visualization and time of day or examination location (main hospital vs. satellite hospital). There was a significant association between male sex and increased appendix visualization, which could be related to differences in body habitus or the lack of nearby anatomical distractors (i.e., ovaries and uterus).

In our population, fewer than 10% of patients with an ultrasound examination for suspected acute appendicitis underwent CT imaging within the following 24 h. Radiologist and sonographer differences did not predict CT utilization within the following 24 h. This suggests that in our practice, utilization of CT after ultrasound for suspected appendicitis is not based solely on visualization (or non-visualization) of the appendix, but instead primarily on other factors (i.e., clinical considerations). Clinical predictors that were associated with increased CT utilization included increased patient weight, increased white blood cell count, presence of abdominal tenderness, and presence of mild rebound tenderness. Moderate and severe rebound tenderness were associated with decreased CT use, possibly related to the presence of a surgical abdomen resulting in immediate clinical management without additional imaging. The ultrasound report impression was significantly associated with CT utilization. Specifically, patients with an ultrasound demonstrating appendicitis rarely underwent CT (OR = 0.08), while patients with an ultrasound equivocal for appendicitis were more likely to undergo CT (OR = 1.95). There was no significant association between CT use and time of day or examination location (main hospital vs. satellite hospital).

Our study has limitations. It is retrospective and is subject to selection bias. Approximately 8000 ultrasound examinations met the exclusion criteria. However, the final study population is large and includes many radiologists and sonographers, and all included sonographers and radiologists performed or interpreted (respectively) a minimum of 100 examinations during the study period. While exclusion of examinations could theoretically impart selection bias, we believe the study population is representative of pediatric patients undergoing ultrasound at our institution for suspected appendicitis. It is uncertain how generalizable our results and conclusions are to centers with less subspecialty expertise or centers that utilize a different workflow or protocol for appendix sonography. For example, other institutions may perform complete abdominal ultrasonography for suspected appendicitis, whereas ours performs dedicated appendix ultrasound examinations.

Additionally, use of CT within 24 h of appendix ultrasound was assumed to be for further evaluating the appendix and/or managing suspected appendicitis, although it is conceivable that it could have been performed for other reasons. It is also feasible that magnetic resonance imaging was obtained instead of CT in some patients, and this was not studied. Also, we were unable to identify the ultrasound examinations in which the radiologist post-scanned the patient, which would have allowed us to further understand our results. Finally, there were likely difficult-to-measure or unknown confounders (e.g., appendix perforation, exposure to analgesic medications) that could have affected the associations we identified. It is worth noting that we intentionally did not include any specific clinical appendicitis scores in our analyses, as they are inconsistently used at our institution, are often incomplete (missing necessary variables), and typically include many of the variables we included in our statistical models [12].

In conclusion, our study supports the supposition that visualization of the appendix by ultrasound in children with suspected acute appendicitis is both sonographer- and radiologist-dependent, with significant and wide variability (40.9–76.0% for sonographers and 51.7–66.3% for radiologists). Conversely, neither sonographers nor radiologists impacted the eventual use of CT. We believe these data can be used to inform sonographer and radiologist quality improvement efforts. Additional research is needed to assess the effect of individual sonographers and radiologists on important clinical outcomes, such as hospital admission, appendectomy, post-operative complications, and readmission within 30 days.