Introduction

Efficiency in healthcare systems has become a critical focus in public health research [1]. The aging population, rising rates of chronic morbidity, and advancements in medical technologies such as digital health services have contributed to the overload of global healthcare systems, resulting in an increased number of imaging exams, some of which may be deemed inappropriate [2, 3]. Given the widespread incorporation of imaging exams into global healthcare management, with a significant presence in hospital care pathways, there has been a notable worldwide increase in the volume of imaging procedures [4].

In Poland, for example, the number of CT and US examinations nearly doubled between 2010 and 2020 (from 87.4 to 155.7 and from 52.1 to 86.5 per 1000 patients, respectively) [4]. Unfortunately, approximately 20–30% of these imaging exams are considered unnecessary, indicating a significant prevalence of inappropriate imaging practices [5, 6]. Such practices not only lead to the overuse of medical resources but also expose patients to unnecessary radiation and a potentially increased cancer risk [5]. Addressing inappropriate use of medical resources is therefore crucial for improving patient care and enhancing the efficiency of medical systems [7, 8].

To tackle inappropriate imaging, the American College of Radiology (ACR) has developed the Appropriateness Criteria guidelines, which are freely accessible. However, these guidelines suffer from limited general awareness and adoption [9,10,11]. A systematic review of Clinical Decision Support Systems (CDSS) has shown that providing context-specific information at the point of care significantly increases healthcare providers’ adherence to guideline recommendations [12, 13]. By incorporating a CDSS, clinicians can be guided to prescribe the most suitable exams, leading to improved patient outcomes and better allocation of resources [12, 13].

To promote awareness and adoption of appropriate imaging guidelines, several evidence-based CDSS tools have been developed [14, 15]. Among them are Orderwise by MedCurrent, based on the Royal College of Radiologists (RCR) guidelines; CareSelect, sourced from the ACR Appropriateness Criteria and Mayo Clinic; and the ESR-iGuide, which is also sourced from the ACR Appropriateness Criteria but adapted to suit the European context [16]. The ESR-iGuide, a prominent CDSS, was developed to assist in selecting appropriate imaging referrals. This web-based, rule-based system takes into account patient data, estimated cost, and expected radiation exposure. It is distributed by Quality and Safety in Imaging (QSI) GmbH, an affiliate company of the European Society of Radiology (ESR). The ESR-iGuide suggests the most appropriate imaging exams and assigns each exam a graded score, with ratings of 7–9 corresponding to “usually appropriate,” 4–6 to “may be appropriate,” and 1–3 to “usually not appropriate” [17,18,19]. The literature has documented the benefits of incorporating the ESR-iGuide, including improved compliance with ESR guidelines, increased physician agreement, and the promotion of a cohesive medical workforce [18]. Developed using an evidence-based methodology [20], the ESR-iGuide sets a common standard for Europe and serves as a European imaging referral guideline CDSS platform. It offers real-time, evidence-based information and actionable decision support for imaging decisions [18].

Accordingly, the primary objective of this study was to investigate the application and suitability of the ESR-iGuide within a single medical center. By doing so, the study aimed to establish a solid foundation and identify the key requirements for its successful implementation at a national level. This exploration is essential for understanding the feasibility and potential benefits of implementing the ESR-iGuide on a broader scale, encompassing multiple medical centers across the country.

Objective

To assess the appropriateness of Computed Tomography (CT) examinations as ordered in a public-based medical center, using the ESR-iGuide, a CDSS tool.

Materials and methods

Study design

A retrospective study was conducted in 2022 at a medium-sized acute care teaching hospital, where approximately 6235 CT exams are performed annually. Nationally, approximately 575,000 in-hospital CT exams are conducted. These numbers were used to calculate the needed sample size.

Sample size

Based on these data and a literature review of the accuracy of imaging referrals, we assumed a 20% accuracy rate. Sample size was calculated using OpenEpi software, based on population size and statistical requirements for models of this type [21]. Assuming a frequency of 20%, a test power of 80%, a confidence level of 95%, and a significance level of 0.05, the minimum required sample size was 237 patients.
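The calculation can be illustrated with a minimal sketch. It assumes the standard finite-population formula for estimating a single proportion and an absolute precision of ±5%; the precision is not stated in the text and is an assumption here, as are the function and parameter names. Under these assumptions, the hospital's annual volume of 6235 CT exams and an expected frequency of 20% give the same minimum of 237.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(population, expected_freq, margin,
                               confidence=0.95, design_effect=1.0):
    """Finite-population sample size for estimating one proportion.

    n = DEFF * N * p * (1 - p) / ((d^2 / z^2) * (N - 1) + p * (1 - p))

    `margin` (d) is the absolute precision; 0.05 is an assumption,
    since the paper does not report the value it used.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pq = expected_freq * (1 - expected_freq)
    n = (design_effect * population * pq) / (
        (margin ** 2 / z ** 2) * (population - 1) + pq
    )
    return math.ceil(n)


# Annual CT volume and expected frequency are taken from the text.
n_required = sample_size_for_proportion(population=6235,
                                        expected_freq=0.20,
                                        margin=0.05)
print(n_required)  # 237
```

The 291 referrals actually collected, and the 278 retained for analysis, both exceed this minimum.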

Data collection

A sample of 291 consecutive CT imaging referrals for patients from several hospital departments (10%) and from the Emergency Department (90%) during September 2021 was obtained. After excluding total-body CT cases (N = 9), which are not suitable for the ESR-iGuide, and cases with invalid data (N = 4), the study sample for analysis consisted of 278 CT referrals.

For each case, we collected the original text referral (indication/s for the exam), the ordered exam, patient characteristics (age, gender, clinical background), and physician characteristics (gender, type of specialty, physician status—intern, resident, senior physician). We also collected data regarding the shift in which the imaging exam was carried out, as well as whether the image was interpreted as normal or abnormal.

Procedure

We assessed the ESR-iGuide recommendation for each scenario. For this purpose, we inserted anonymous case details into the system, including sociodemographic characteristics of the patient (age and gender), clinical indication/s, and red flags, defined as signs and symptoms found in the patient’s history as well as the clinical examination, that may help identify the presence of potentially serious conditions.

We then recorded the “appropriateness score” of the exam that was ordered as well as the “appropriateness score” of the exam most recommended by the ESR-iGuide, ranging from 9 (highly recommended) to 1 (not recommended). A rating of 7–9 corresponded to “usually appropriate,” 4–6 was defined as “may be appropriate,” and 1–3 was defined as “usually not appropriate” [16]. If a CT exam was not part of the recommendations, a score of 0 was assigned for the purpose of this analysis. In addition, for exams using ionizing radiation, we obtained the relative radiation level (RRL) according to the ESR-iGuide guidelines, for both the exam most recommended by the ESR-iGuide and the exam actually performed.
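The scoring scheme can be written as a small lookup, shown here as an illustrative sketch rather than as part of the ESR-iGuide itself. The function name and the label used for a score of 0 are hypothetical; the cut-offs are those given in the text.

```python
def appropriateness_category(score: int) -> str:
    """Map an appropriateness score (0-9) to its category label.

    Cut-offs follow the text: 7-9, 4-6, 1-3, plus the study-specific
    score of 0 for cases where CT was not among the recommendations.
    """
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 9:
        raise ValueError(f"score must be an integer from 0 to 9, got {score!r}")
    if score == 0:
        # Hypothetical label: the paper assigns 0 but does not name the category.
        return "CT not among recommendations"
    if score <= 3:
        return "usually not appropriate"
    if score <= 6:
        return "may be appropriate"
    return "usually appropriate"


for s in (0, 2, 5, 8):
    print(s, appropriateness_category(s))
```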

Data analysis

Descriptive measures, including box plots, were calculated for the ESR appropriateness score of the exam referred by the physician at the point of care, as well as for the radiation level of the performed imaging exam. Appropriateness and RRL scores of the actual referral according to the ESR-iGuide were compared with those of the top recommended examination according to the ESR-iGuide as the reference. A Wilcoxon signed-rank test was performed on the difference between the recommended exam’s appropriateness score and the actual referral’s appropriateness score, as well as on the difference between the recommended exam’s RRL and the actual referral’s RRL.
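The paired comparison can be sketched as follows. This is a minimal standard-library implementation of the Wilcoxon signed-rank test using the normal approximation with a tie correction; it is not necessarily the computation performed by the statistical software used in the study, and the scores in the example are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped; tied absolute differences receive
    average ranks, and the variance is corrected for ties.
    Returns (W+, z, p).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / sqrt(var)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p


# Hypothetical scores, NOT study data: recommended vs. ordered exam.
recommended = [9, 8, 9, 8, 9, 9, 8, 9]
ordered = [7, 8, 3, 8, 9, 5, 0, 9]
w_plus, z, p = wilcoxon_signed_rank(recommended, ordered)
print(w_plus, round(z, 3), round(p, 3))
```

With only four non-zero differences in this toy example the normal approximation is crude; at the study's sample size of 278 it is far more reasonable.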

We further performed a sub-analysis to explore the correlation between ESR-iGuide appropriateness level and other variables: physician characteristics (specialty (intern/internal medicine/surgery/emergency medicine/pediatrics), professional status (intern or resident/specialist)), patient characteristics (age and gender), and the shift during which the exam was ordered (morning 8:00–16:00, evening 16:00–24:00, night 00:00–8:00).

Physician specialties were grouped into two categories for statistical analysis: surgery/non-surgery. For subgroup analyses, ESR-iGuide level of appropriateness was classified using a binary variable (score less than 7: “may be appropriate/usually not appropriate”; score of 7–9: “usually appropriate”). Categorical variables are reported as frequencies (percentages) and were compared by Pearson’s chi-square test, or by Fisher’s exact test when the expected value of any cell was less than five. For the continuous variable age (expressed as mean ± SD), comparisons between the two groups (“usually appropriate” and “may be appropriate/usually not appropriate”) were performed using an independent samples t-test. A 2-tailed p value of less than 0.05 was considered significant.
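The test selection rule for categorical variables can be illustrated with a short sketch for a 2×2 table. It computes Pearson’s chi-square statistic without continuity correction and flags the table for Fisher’s exact test when any expected cell count is below five. The function name is hypothetical, the sketch assumes no row or column total is zero, and the counts are invented for illustration; they are not the study’s Table 2.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table (no continuity correction).

    Returns (chi2, p, needs_fisher). `needs_fisher` is True when any
    expected cell count is below 5, the rule described in the text.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
                for i in range(2)]
    chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    needs_fisher = min(min(row) for row in expected) < 5
    return chi2, p, needs_fisher


# Hypothetical counts, NOT study data.
# Rows: surgery, non-surgery. Columns: score 7-9, score below 7.
example = [[40, 10], [30, 20]]
chi2, p, needs_fisher = chi_square_2x2(example)
print(round(chi2, 3), round(p, 4), needs_fisher)
```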

To capture the contribution of each factor, we constructed a stepwise logistic regression model. The probability of a “may be appropriate/usually not appropriate” score (less than 7) was modeled. Associations having an alpha threshold level of 0.25 in the univariate analysis were entered into the multivariate model [20]. This resulted in the variables “physician specialty” and “physician status” being included in the model, as well as “age,” which was included as a covariate. Statistical analysis was performed using SAS Enterprise Guide v.8.3.

Ethical considerations

The study protocol was approved by the Institutional Human Subjects Ethics Committee (CM-0058–21) of the relevant medical facility. Written informed consent was waived by the Institutional Review Board. All performed procedures followed the ethical standards of both the institutional and national research committees; these complied with national ethical standards.

Results

The study sample included 278 CT imaging referrals.

The mean age of the patients was 59.05 ± 23.01 years, and the majority of the patients were female (n = 165, 59.3%). The majority of the exams performed were CT head (n = 177, 63.67%) and CT abdominal pelvis (n = 66, 23.74%). Most of the physicians were residents (n = 157, 56.47%), and 93 were specialist physicians (33.45%). The majority of the physicians were male (n = 64, 63.37%). The largest group of referring physicians was in general surgery (n = 102, 36.7%), followed by internal medicine (n = 95, 34.17%) and emergency medicine (n = 51, 18.35%). The most frequent shift when the exam was performed was the evening shift (n = 125, 44.9%) (see Table 1 for descriptive statistics of the sample).

Table 1 Descriptive statistics of the study sample

When comparing the appropriateness score of the actual referral and the ESR-iGuide recommended exam, the overall mean appropriateness for the actual referral was 6.62 ± 2.69 on a scale of 1–9, as opposed to 8.29 ± 0.85 for the recommended ESR-iGuide exam (p < 0.0001 according to the Wilcoxon signed-rank test) (see Fig. 1A).

Fig. 1

A box plot comparing the ESR appropriateness score (a) and the radiation level (b) of the actual examination and the best recommended examination according to ESR-iGuide, N = 278

Furthermore, when examining the RRL according to the recommended ESR-iGuide exam and the actual referral, the mean level of radiation was 3.26 ± 0.45 on a scale of 1–5 for the actual referral and 2.16 ± 1.56 for the recommended ESR-iGuide exam (p < 0.0001 according to the Wilcoxon signed-rank test) (see Fig. 1B).

When using a binary variable (a score of 7–9 was considered “usually appropriate,” otherwise “may be appropriate/usually not appropriate”), 70% of the actual imaging referrals (195 out of 278) received an ESR-iGuide score corresponding to “usually appropriate.” A significant relationship (p = 0.0045) was found between the specialty of the requesting physician (surgery/non-surgery) and the degree of appropriateness according to the ESR-iGuide (Fig. 2, Table 2).

Fig. 2

Stepwise logistic regression modeling the probability of “May Be Appropriate/Usually Non-Appropriate” score: ROC curve for model with specialty and status as explanatory variables

Table 2 Analysis of CT orders regarding ESR-iGuide appropriateness for ordered test (appropriate = 7–9), N = 278. Pearson chi-square test/Fisher’s exact test

When building a stepwise logistic regression model for the probability of a “may be appropriate/usually not appropriate” score (less than 7), both physician specialty and status were significant (p = 0.0011 and p = 0.0192, respectively). Thus, non-surgery and specialist physicians were more likely to order “may be appropriate/usually not appropriate” exams than surgery and training physicians (resident or intern). The odds of ordering an exam scored less than 7 according to the ESR-iGuide were 2.748 times higher for non-surgical physicians than for surgical physicians (95% CI 1.519–5.162). Furthermore, the odds were 1.975 times higher for specialist physicians than for resident/intern physicians (95% CI 1.119–3.506). Age was not statistically significant and therefore was not included in the final model. In the ROC curve analysis of the final model, specialty and status contributed only slightly to discrimination (AUC = 0.631). Of the 278 referrals, 28 were scored 0 (Table 3).
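For orientation, the final model can be written out under two assumptions that the text does not state explicitly: that it contains main effects only, with one indicator variable per factor, and that the reported values 2.748 and 1.975 are odds ratios. The symbols below are introduced here for illustration.

```latex
\[
\mathrm{logit}\, P(\text{score} < 7)
  = \beta_0 + \beta_1\, x_{\text{non-surgery}} + \beta_2\, x_{\text{specialist}},
\qquad
\mathrm{OR}_k = e^{\beta_k}
\]
\[
\beta_1 = \ln 2.748 \approx 1.011,
\qquad
\beta_2 = \ln 1.975 \approx 0.681
\]
```

Read this way, each estimate multiplies the odds, not the probability, of an exam scored below 7.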

Table 3 Stepwise logistic regression modelling the probability of “may be appropriate/usually non-appropriate” score

Discussion

This study explores the proportion of appropriate imaging referrals and the associated unnecessary radiation exposure as determined by the ESR-iGuide. Approximately 30% of CT exams, mostly head (64%) and abdominal-pelvis (24%) CTs, were defined as “may be appropriate/usually not appropriate,” similar to the rates reported in previous studies, in which approximately 20–30% of imaging exams performed did not generate information that improved diagnosis or treatment or had an impact on the patients’ health. These studies included several decision support interventions, but not the ESR-iGuide [2,3,4].

A 2022 study using the ESR-iGuide assessed the appropriateness of CT for acute abdominal pain. According to the ESR-iGuide and based on the clinical suspicion stated in the CT requests, CT examinations were considered crucial in 264 cases (45.05%). In 54.9% of patients, the referral reason for the CT exam could be considered “may be appropriate” according to ESR-iGuide criteria (scores of 4, 5, or 6). Of these, 135 had an inappropriate CT request according to image findings [22].

As reported by Ruhm et al., the rate of inappropriate imaging was 25%, with a margin of error of 7.5%, resulting in inadequate resource allocation [23], as well as inappropriate radiation exposure [23, 24].

In line with Ruhm et al.’s results, the findings of the current study suggest unnecessary patient radiation exposure as determined by the ESR-iGuide (1.07 on a scale from 0 to 5; p ≤ 0.0001 when compared to the recommended exam) [23, 24].

Recent data show that physicians are generally aware of radiation risks, are interested in learning about radiation protection, and are open to regulatory mechanisms to control the safety of annual radiation exposure. In practice, two-thirds of these physicians indicated that their referral decisions were based on clinical indications and radiation exposure from prior CT exams [25].

No association was found between the appropriateness of the exam and the shift in which the exam was ordered, the age of the patient, or the gender of the patient. Sixty-four percent of exams ordered by specialists were considered appropriate by the ESR-iGuide, compared with 73% of exams ordered by training physicians. This difference was not statistically significant (p = 0.12) according to the Pearson chi-square test; however, it was significant in the logistic regression model (p = 0.02). Evolving data suggest that a mix of strategies is essential for clinical decision-making, with models suggesting that novices and specialists use different principal strategic approaches. One such approach is intuitive decision-making, in which formulation of a decision uses intuitive principles. In contrast, analytic decision-making considers evidence-based protocols, clinical pathways, interdisciplinary input, and knowledge, using step-by-step logical thinking and reviewing the collected data [26].

Abate and colleagues suggest that novices rely mostly on analytical principles for decision-making, while specialists formulate decisions primarily through tactics involving intuition [27]. Further data by Kosicka and colleagues showed that specialty certification and higher workload were associated with higher use of intuitive decision-making [26]. To ensure optimal outcomes for both patient and resource management, it is important that CDSS be used in tandem with the intuition of physicians [22, 28, 29].

Eighty percent of exams ordered by surgeons were considered appropriate, compared with 64% of exams ordered by non-surgeons (using a binary variable), indicating that surgeons were more likely to order appropriate exams than other specialists. The higher appropriateness rate for imaging exams ordered by surgeons may be due, in part, to their facing specific clinical situations with more concrete treatment protocols (e.g., query appendicitis), while internal medicine deals with “less well-defined problems,” in which the patient presentation may fit more than one diagnosis [30, 31].

Future studies may focus on cases in which the exam ordered by the physician differed from the exam deemed appropriate by the ESR-iGuide, to better elucidate where the system can better guide the physician and vice versa. Similar tools designed to reduce unnecessary referrals have been evaluated, for example by Smith et al. (2022) [30]. In 2015, the Medical Ultrasound Society released a referral justification document for rejection of inappropriate ultrasound referrals to help manage increasing demand and ensure correct utilization of diagnostic imaging exams. Smith and colleagues evaluated canceled referrals to examine the accuracy of the tool and found that the majority of cancellations were justified. It is important to note, however, that 10% of cases were found to be pathologic, requiring further imaging exams. Identifying the clinical pitfalls of both physicians and guidance systems is necessary to continue improving clinical patient outcomes and resource management [31].

Study limitations

This study has several limitations. First, the retrospective design does not evaluate the use of the ESR-iGuide in real time, and thus findings may differ in a real-time setting. Second, the study sample comprised only cases in which imaging exams had already been performed; cases in which the physician would not have prescribed an exam but the ESR-iGuide would have recommended one are missing. Third, data on whether referrals prescribed by training physicians were made under the instruction of senior colleagues were not available to us, which may affect the assessed appropriateness of referrals. Moreover, due to the study design, no statement can be made as to whether physicians who requested exams intuitively may have saved more lives.

Finally, this study does not consider the cost and resource use of the exam, even though the ESR-iGuide may sometimes recommend an exam that is more expensive and utilizes a greater amount of system resources. Depending on the nature of the healthcare system, this may not be a concern to the patient, but it will always have an impact on the healthcare system. Irrespective of whether the additional costs are covered by the patient or some other health system actor, the additional resource depletion may impact on waiting times and other outcomes of interest. These issues could be important in any country, particularly in countries with a publicly financed health care system.

Conclusions

In conclusion, comparing the referrals prescribed by physicians with the exams recommended by the ESR-iGuide points to inappropriate exams and unnecessary radiation exposure for CT head and CT abdominal pelvis, particularly in the emergency department. These findings indicate an urgent need for appropriate physician use of imaging referrals to promote the most desirable health care delivery and resource management. We found the appropriateness of exams to be related to physician specialty and the seniority of the physician.

With similar rates of appropriate exams and unnecessary radiation exposure as shown in other countries globally [29], our results strongly indicate the necessity of implementing a national CDSS within the healthcare system. The findings highlight the potential benefits of utilizing CDSS to promote appropriate imaging referrals and effectively reduce unnecessary radiation exposure. By incorporating a national CDSS, healthcare providers can make more informed decisions regarding the appropriateness of imaging examinations, leading to improved patient care and enhanced radiation safety.

Future studies should explore the implementation, integration, and usability of CDSS which incorporate artificial intelligence in real time to promote the optimization of health care services, especially in times when medical systems are overwhelmed.