Introduction

The use of computed tomography (CT) imaging is rapidly increasing in healthcare [1, 2]. Despite physicians’ growing reliance on radiologic assessments, however, the reliability and accuracy of reads are highly variable. Small case series in neuroradiology suggest that substantial variability can exist between radiologists’ interpretations of cervical spine CTs and other related imaging [3, 4]. Similarly, a comparison of readings by external overnight radiology services with staff radiology readings at a level 1 trauma center showed a discordance of 16%, with 37% of the discordant readings determined to be clinically significant [5]. Thus, variations in interpretation of radiologic imaging can translate into significant differences in patient management plans and substantially affect patient care [6].

Inconsistencies may result from multiple factors. Studies have found that radiologists’ assessments may be affected by subspecialty training and even by images of patients’ faces provided with the imaging [7, 8]. Another important factor may be the clinical information provided. In a 1981 study on the assessment of seven chest films involving four radiology residents and one staff radiologist, provision of an accurate clinical history was shown to increase diagnostic sensitivity, while incorrect clinical information and patient history led to an increase in false-positive diagnoses [9, 10]. Large, well-designed studies on the effect of the presence and quality of clinical information on radiologists’ assessments of abdominal/pelvic CT scans do not currently exist.

Our aim was to assess how the presence or absence and the accuracy of clinical information affect radiologists’ interpretation of CT scans. To focus the aim, we studied a single, highly prevalent disease process: ventral hernias. We hypothesized that the presence and quality of clinical information would affect radiologists’ assessment of CT scans for the presence or absence of a ventral hernia.

Methods

This was a double-blind randomized controlled trial at a single academic institution performed within the Departments of Surgery and Radiology. Institutional Review Board approval was obtained. This study has been posted on Clinicaltrials.gov (Number NCT03121131) [11]. All patients seeking care in general surgery clinics from October to December 2016 with a CT scan of their abdomen/pelvis within the past year with no intervening surgery were enrolled. CONSORT guidelines were followed [12]. A surgical clinician blinded to the CT scan findings was trained specifically to perform a standardized physical examination to assess for abdominal wall hernias and determined if each patient had a clinically apparent ventral hernia or not. Patients with no clinically apparent ventral hernia were graded as having an indeterminate or low likelihood of hernia based upon clinical judgment. In general, exams were classified as indeterminate likelihood if obesity precluded what the clinician perceived to be an adequate physical exam or if no hernia was palpated but the patient complained of localized pain or discomfort in the region in question. If a hernia was palpable, the patient was judged as having a high likelihood of hernia.

Patients also completed a modified Activities Assessment Scale (AAS) patient-centered outcome questionnaire at the time of physical exam. The modified AAS is a validated, hernia-specific survey in which patients rate their satisfaction with their abdomen, abdominal pain, and abdominal function [13]. Modified AAS scores were normalized to a 1–100 scale, in which 1 signified the lowest quality of life (QOL) and 100 the highest.

The ventral hernia-related clinical information provided to radiologists with each CT scan was randomized into three groups: accurate clinical exam data, no clinical exam data, or inaccurate (purposely incorrect) clinical exam data. The file provided to radiologists therefore either contained clinical exam findings or contained no clinical information. Radiologists were aware that some files had clinical information and others did not, but they were blinded to the study hypotheses, to the randomization scheme, and to the fact that in one third of cases the information provided was purposely incorrect. Allocation was stratified by clinical exam findings: clinically apparent hernia versus no clinical hernia (indeterminate and low likelihood of clinical hernia). Randomization was then performed using a random number generator with equal allocation into each of the three randomization groups. The same clinician who had performed the initial physical exam and who remained blinded to the CT scan results generated the random number sequence. CT scans and clinical information were given to seven independent radiologists with expertise in body CT imaging to review. As the radiologists were instructed to assess only for the presence of ventral hernias for this study, the original indications for the CT scans were irrelevant and are not described in this manuscript.
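The allocation scheme described above can be illustrated with a minimal sketch. It assumes one simple way of obtaining equal allocation within each stratum (shuffle the patients, then deal them out to the three arms in turn); the manuscript does not specify the exact procedure, and the function name, patient identifiers, and seed below are hypothetical.

```python
import random
from collections import defaultdict

# The three randomization groups named in the text.
ARMS = ("accurate", "none", "inaccurate")

def stratified_allocation(strata_by_patient, seed=0):
    """Assign each patient to one arm, balanced within each stratum.

    strata_by_patient maps a patient id to its stratum label.
    This shuffle-and-deal approach is an assumption, not the study's code.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for pid, stratum in strata_by_patient.items():
        by_stratum[stratum].append(pid)

    allocation = {}
    for stratum, pids in sorted(by_stratum.items()):
        rng.shuffle(pids)
        # Dealing the shuffled list in turn keeps arm sizes within one of each other.
        for i, pid in enumerate(pids):
            allocation[pid] = ARMS[i % len(ARMS)]
    return allocation

# Hypothetical cohort matching the reported strata sizes:
# 46 patients with a clinically apparent hernia, 69 without.
patients = {f"P{i:03d}": ("hernia" if i < 46 else "no_hernia") for i in range(115)}
allocation = stratified_allocation(patients, seed=2016)
```

Fixing the seed makes the sequence reproducible for audit, while stratifying before shuffling keeps the proportion of clinically apparent hernias similar across the three arms.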

Categorical variables were analyzed using the chi-square test. Differences in QOL among the three groups were analyzed as a continuous variable using the Kruskal–Wallis rank test, with a p-value of 0.006 (0.05/9) considered significant following Bonferroni correction for multiple testing. Percent agreement and Fleiss’ kappa were also calculated to determine interobserver reliability. To obtain a “gold standard” for the presence or absence of a ventral hernia on radiologic imaging, three radiologists and one attending surgeon with specialization in ventral hernia management jointly assessed and discussed each CT scan and provided a consensus assessment as to whether a ventral hernia was present on radiologic exam. None of the members of the consensus group were involved in the other portions of the study or the initial review. Those in the consensus group had access to all patient information but were blinded to the votes by the radiologists in the study group. Consensus was defined as unanimous agreement. This process has been previously validated [14]. A confusion matrix, a contingency table showing actual and predicted classifications, was generated [15].
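For readers unfamiliar with the interobserver measures named above, the following is a minimal sketch of how percent (unanimous) agreement and Fleiss’ kappa can be computed from a table of rater counts. The function names and the four-scan example are hypothetical and are not study data; the kappa formula is the standard one for a fixed number of raters per subject.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects-by-categories table of rater counts.

    Each row gives, for one scan, how many raters chose each category.
    Every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(n_categories)]
    # Observed agreement among rater pairs for each subject.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]

    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

def unanimous_agreement(counts):
    """Fraction of subjects on which all raters chose the same category."""
    return sum(1 for row in counts if max(row) == sum(row)) / len(counts)

# Hypothetical example: 4 scans, 7 raters, columns = (hernia, no hernia).
toy_counts = [[7, 0], [0, 7], [5, 2], [3, 4]]
toy_kappa = fleiss_kappa(toy_counts)
toy_agreement = unanimous_agreement(toy_counts)
print(f"kappa={toy_kappa:.3f}, unanimous agreement={toy_agreement:.0%}")
```

The sketch does not handle the degenerate case in which every rating falls into a single category, where chance agreement equals 1 and kappa is undefined.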

We assumed an alpha of 0.05, beta of 0.20, and difference within each stratum of 0.20. Given the assumption that 1/3 of patients would fall into each stratum (low, indeterminate, high), a minimum total of 93 patients reviewed by seven radiologists (651 reads) would be needed. To account for a 25% variance in estimates, we sought a sample size of 115 patients or 805 reads. The primary outcome was the proportion of hernias detected on CT scan.

No changes to the methods were made after trial commencement. No interim analysis was performed. No stopping guidelines were necessary. No important harms or unintended effects occurred in any group, as radiologists’ reads for this trial were not revealed to patients or their providers and were not used to guide clinical care. Because radiologists were explicitly instructed to assess only for the presence of ventral hernias, we did not record the presence of any incidental cancers or other serious abnormalities. The study was also performed with the hospital Chief of Radiology as a co-principal investigator. With institutional review board oversight and approval, all patients received a written consent of participation; however, for participating radiologists a written consent was waived. Instead, radiologists were provided with a letter of information, which explained a broad scope of the project, voluntary nature of participation, risks and benefits, and contact information in the event of questions or to withdraw from the study (eTable 5).

Results

A total of 115 patients were enrolled in the study (Fig. 1). Baseline demographic and clinical characteristics for each group are provided (Table 1). No losses or exclusions occurred after randomization. The trial was stopped when our sample had reached the predetermined sample size. On clinical examination, 46 (40.0%) patients had a clinically apparent hernia. Among the 69 (60.0%) patients with no clinical hernia, roughly half were deemed to have a low likelihood of having a hernia (33/69, 47.8%) and the remainder were deemed to have an indeterminate likelihood of having a hernia (36/69, 52.2%). There were no differences in QOL scores among the three study groups: accurate exam data median 60.5 (IQR 35.8–82.0), no exam data median 56.8 (IQR 36.0–90.8), and inaccurate exam data median 62.8 (IQR 41.7–83.4) (p = 0.279). There were also no significant differences in the proportion of patients presenting with high, indeterminate, and low likelihood of a clinical hernia among the three study groups (p = 0.998) (Table 1).

Fig. 1 Flow sheet of patient enrollment

Table 1 Baseline demographic variables of patients per randomization group

Seven radiologists reviewed the CT scans of the abdomen and pelvis of the 115 enrolled patients for a total of 805 CT reads. All seven radiologists agreed on 43% of the scans, with a kappa value of 0.50 (p < 0.001). The proportion of hernias detected by individual-blinded radiologists differed by 10–20% depending on whether accurate, no, or inaccurate clinical information was provided (Table 2). In patients with no hernia on physical exam, inaccurate clinical data led to a higher radiologic hernia detection rate. In patients with a hernia on physical exam, the absence of clinical data led to a significantly lower radiologic hernia detection rate.

Table 2 Radiologic hernia determined by radiology consensus, stratified by physical exam findings

Following the consensus meeting, a total of 96/115 (83.5%) patients were determined to have a radiologic hernia, 17/115 (14.8%) patients were determined to have no ventral hernia, and no consensus could be obtained for 2/115 (1.7%) patients (Table 3). Clinical exam aligned with consensus radiologic findings (Table 4). Clinically detected hernias had the highest positive detection rate by consensus meeting. Hernias deemed to be not clinically apparent on physical exam had the lowest detection rate by radiologists’ consensus. In cases with hernia in which inaccurate clinical information was provided, radiologic reads had decreased positive predictive values, accuracy, positive likelihood ratios, and specificity.

Table 3 Radiologic hernia determined by individual-blinded radiologists stratified by clinical exam
Table 4 Radiologic hernia determined by consensus

A confusion matrix based upon clinical exam results is reported in Table 5. The confusion matrix with consensus radiologic findings as the gold standard is presented in Table 6.

Table 5 Confusion matrix results: radiologic hernia determined by individual-blinded radiologists
Table 6 Confusion matrix: radiologic hernia determined by individual-blinded radiologists

When accurate clinical exam information was provided, radiologists had the highest positive and negative predictive values, accuracy, positive likelihood ratio, and sensitivity (Table 5). Radiologists’ individual reads had the highest negative likelihood ratio when no clinical exam information was provided.
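The performance measures reported from the confusion matrices follow directly from the four cell counts. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and do not reproduce Table 5 or Table 6. It assumes that no denominator is zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion matrix.

    tp/fp/fn/tn are counts of reads judged against the reference standard.
    Assumes every denominator below is nonzero.
    """
    sensitivity = tp / (tp + fn)   # hernias correctly read as hernia
    specificity = tn / (tn + fp)   # non-hernias correctly read as no hernia
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Hypothetical counts for one randomization arm (not study data).
example = diagnostic_metrics(tp=80, fp=10, fn=20, tn=40)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Computing the same measures separately for the accurate, absent, and inaccurate information arms allows the arms to be compared on a common footing.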

Discussion

The presence and quality of clinical information provided to radiologists dramatically affect the assessment of CT scans for ventral hernias. Communication between clinicians and radiologists regarding abdominal wall hernias can alter diagnoses and potential management in up to 25% of patients. This is the first double-blind randomized controlled trial assessing the impact of clinical information and quality of information on ventral hernia assessments by radiologists. Previous studies, typically unblinded case series, have shown similar results. For example, provision of relevant clinical details affected the accuracy of chest X-ray interpretations by 46% among consultant-grade radiologists [16]. Simply placing radiologists’ reading rooms in clinical areas to increase rates of direct communication between radiologists and clinicians was also associated with a significant difference in the percentage of critical test result management messages delivered by radiologists [17].

Based upon the results of the present study, rates of over- and under-diagnosis of ventral hernias can be substantial depending on clinician, radiologist, modality of diagnosis (clinical or radiologic), and quality of communication between radiologist and clinician. This has considerable financial, clinical, and QOL implications for patients and the healthcare system. Ventral hernias missed on radiological assessment (false negatives) can cause patients who experience decreased QOL to be misdiagnosed and go untreated. False-positive diagnoses can also have harmful effects, including patient anxiety and additional diagnostic and treatment procedures [18].

Our results emphasize the importance of an accurate and thorough physical exam, as the clinical information provided to radiologists affected their reads. These findings may be explained by an extreme reliance on provided clinical information, resulting in radiologist under-reading or complacency, as defined by Renfrew et al. [19]. The physical examinations performed in this study were standardized and involved assessing patients in standing and supine positions with and without Valsalva maneuvers. The lack of a significant difference in QOL between groups also reflected our efforts to present similar groups of patients within each intervention arm so that any change in radiologists’ reads would result only from the clinical information that we provided. The implications of clinically apparent versus radiologic-only, or occult, hernias, however, must be elucidated both in terms of objective outcomes (including risk of hernia incarceration and strangulation) and patient QOL. Prospective trials should be conducted to assess the outcomes of occult hernias with no surgical intervention versus operative management. Given our current knowledge, we recommend that any patient who has a hernia read on CT scan undergo a focused physical examination. Patients with a clinically apparent bulge or symptoms should be referred to a surgeon. Oligosymptomatic patients can be counseled on the risks and benefits of non-operative management.

Limitations of this study include our gold standard, limited generalizability, and lack of information concerning the clinical implications of these findings. No ideal method exists for the detection of ventral hernias. We therefore relied on either clinician physical exam or a consensus meeting of radiologists and surgeons, which is also subject to some imperfections. While only one clinician performed the physical exams, we think that this mirrored reality, as radiologists are not aware of the accuracy of referring clinicians’ physical exams and multiple clinicians rarely provide input for one patient’s CT exam interpretation. Of note, because only one examiner performed the physical exam, the sensitivity of physical exam cannot be tested with this study; however, a barrier to using multiple examiners would be patient discomfort. We chose to perform a single, optimal exam: a trained surgeon performing a standardized exam to identify all clinically apparent hernias. Future studies should assess the reliability and accuracy of physical examination among different providers. Our patient population may also be considered high risk for ventral hernias, with an average (standard deviation) body mass index of 30.5 (6.4), a nearly 20% rate of diabetes mellitus, and the widespread presence of other major comorbidities (American Society of Anesthesiologists’ categories of 3 or 4, 35.2% of patients) [20, 21]. A high prevalence of ventral hernias may have biased radiologists’ assessments. We also did not assess the long-term impact of radiologists’ diagnosis of hernia or no hernia on patients’ management, although this is outside the scope of the present study and may be investigated in future research. Multiple radiologists also commented during the consensus meeting that in a few instances, they did not consider so-called small hernias to be of clinical significance and therefore did not report them during the initial independent reads. Because of the rarity of these instances, these omissions most likely did not affect our overall conclusions.

Previous publications have begun to establish clinical evidence and recommendations to improve the radiologic assessment of patients with ventral hernias [14]. Suggestions include developing a standardized definition for radiologic ventral hernias, tissue eventration, and mesh eventration; developing a systematic method for reviewing the entire abdominal wall; and standardizing communication between surgeons and radiologists, including how surgeons should provide mesh-related information and clinical concerns to radiologists [14]. There is a need among radiologists to develop consensus guidelines on the assessment and reporting of hernias seen on radiological imaging. Equally important is the need to identify clinically important hernias, not just radiologic hernias. Clinicians may consider performing a careful, standardized physical examination in standing and supine positions with and without Valsalva. Patients with asymptomatic hernias seen only on radiologic examination require no further treatment other than counseling and intermittent re-assessment. Repair of “symptomatic” hernias, that is, in patients presenting with a focal area of pain and a hernia seen only on radiologic examination, should be approached cautiously with careful patient counseling. Currently, there is no high-quality evidence to show whether surgical repair of these hernias will benefit the patient. Additional studies should also assess the accuracy of CT scan versus physical exam in obese subjects and perform further anatomical (CT diagnoses) versus functional (clinical examination) analyses.

Conclusions

This is the first double-blind randomized controlled trial demonstrating that the presence and quality of clinical information can affect radiologists’ reads of CT scans. Differences in the clinical information provided can alter the diagnosis in up to one quarter of cases. There is a need to improve the quality and accuracy of clinical examinations and to develop standardized guidelines that improve the accuracy and reliability of radiologic interpretation.