Background

Ultrasound of the head and neck is the diagnostic modality of choice for a wide variety of routine and emergency patients in otorhinolaryngology [1,2,3,4,5]. Potential reasons for this development include its high availability, the absence of radiation exposure, its applicability in claustrophobic patients and a favorable cost–benefit ratio [6, 7]. Whereas the technique of the examination is taught widely throughout medical school and residency, high-quality reporting remains a major challenge. This stands in sharp contrast to the high and still rising importance of the report and its content. Insufficient report quality may cause misunderstandings between the referring and the examining physician, which may result in inadequate clinical decision-making with potential medical and legal consequences [8,9,10].

Structured reporting has proven to be a promising approach to standardize report content and improve the overall report quality of several diagnostic modalities, including head and neck ultrasound [11,12,13,14,15,16,17]. Additionally, referring and examining physicians generally favor structured reports (SR) over free text reports (FTR) because of the standardized approach and the use of recommended terminology [18,19,20,21,22]. Since head and neck ultrasound is a key element in tumor follow-up and in the planning of operations, comprehensive and understandable reports are indispensable [21]. Inexperienced residents in particular may profit from using SRs because relevant anatomical structures are pointed out to the examiner and the recommended terminology is offered. This may result in more complete and comprehensive reports during the learning process [3, 13].

While clinical studies were able to demonstrate a superior report quality of SRs of head and neck ultrasound in the context of routine outpatient treatment and medical school training, there are no data concerning its impact on the longitudinal learning process during residency [14, 15]. It remains elusive at what point in time structured reporting should be implemented during training and how this affects the individual learning curve.

Therefore, the present study’s objective was to analyze the effects of using SRs of head and neck ultrasound studies on the longitudinal learning curve over the course of residency. As previously described, we hypothesized that training effects are characterized by obtaining new expertise and capacities that ultimately influence attitudes, decisions and actions [15, 23]. By monitoring the quality of participating residents’ reports over the course of a year, the additive training effect of each report type may be illuminated. In addition, we examined the user contentment of participating residents regarding each type of report.

Methods

Study design

In total, 24 residents of different training levels who participated in our 2018 tripartite course on head and neck ultrasound, accredited by the German Society for Ultrasound in Medicine (DEGUM), agreed to participate in this trial. All participants had been trained to create FTRs in their daily work routine before the course. The individual level of experience with regard to ultrasound diagnostics was evaluated prior to inclusion by individual self-assessment using a five-point scale (0: insufficiently experienced, 5: very experienced, see Table 1).

Table 1 Particularities of participating residents

Participating residents received training on how to use our department’s standard FTR template and were randomly allocated to pictures of various frequent diseases of the neck in each course. The pictures had been collected at our outpatient department ahead of the course and were selected in increasing order of complexity (see Table 2). This increase in complexity was intended to reflect the individual learning process and to prevent a ceiling effect. Subsequently, each participant created FTRs and SRs of the assigned pathology and completed a user contentment questionnaire at each course.

Table 2 Pathologies to be reported in the Mainz 2018 DEGUM-courses on head and neck ultrasound

Sample size calculation

The number of reports needed was computed based on the anticipated effect size when comparing the proportion of reports of each type with a completeness of 80% or higher [24]. Based on prior publications, we assumed that 40% of FTRs would receive a very high completeness rating [14, 15]. Additionally, we estimated that using SRs would increase the proportion of very high completeness ratings to 80%. The power was set to 80% with a significance level of α = 0.05. Consequently, the minimum number of reports required within this trial was computed to be n = 44 (22 reports of each type).
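
The arithmetic behind such a sample size estimate can be illustrated with a short sketch. It assumes Cohen's arcsine effect size with a normal approximation for a two-sided comparison of two proportions; the source does not state which formula or software routine was actually used, and the function name `reports_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def reports_per_group(p_ftr: float, p_sr: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Reports needed per report type to detect p_ftr vs. p_sr (two-sided).

    Assumption: Cohen's arcsine effect size h with a normal approximation.
    This is an illustrative method, not necessarily the authors' calculation.
    """
    # Cohen's h: difference of the arcsine-transformed proportions
    h = 2 * math.asin(math.sqrt(p_sr)) - 2 * math.asin(math.sqrt(p_ftr))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    # Round up to the next whole report
    return math.ceil(2 * ((z_alpha + z_beta) / h) ** 2)


n = reports_per_group(p_ftr=0.40, p_sr=0.80)
print(n, 2 * n)
```

Under these assumptions the sketch yields 22 reports per type and 44 in total, which matches the figure reported above. Other common formulas, for example with pooled variance or a continuity correction, give slightly different values.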

FTR and SR

In this study, our department’s standard form was used to create FTRs. As previously published, an online-based platform (Smart Reporting GmbH, Munich, Germany, https://smart-reporting.com) was utilized to create a specialized structured reporting template for head and neck ultrasound studies [14, 15]. The structured reporting template incorporates the current recommendations of the DEGUM with regard to anatomical structures and terminology and addresses a maximum variety of pathologies consistently in every report (see Fig. 1).

Fig. 1

Screenshot of a decision-tree within the structured reporting template. Shown is an exemplary report of a benign tumor of the parotid gland. On the left side, the examiner can select the type of pathology, side, size as well as pathological feature such as distal ultrasound pattern, duct obstruction and assessment of dignity while the template generates full semantic sentences on the right side

Report evaluation

Anonymized reports were assessed independently by two board-certified otorhinolaryngologists regarding their completeness with respect to lymph nodes, major salivary glands and blood vessels, and their accuracy concerning pathological features and terminology. In order to standardize the assessment, an evaluation form was used and reports were categorized as insufficient (0–20% overall report quality), poor (20–40%), moderate (40–60%), high (60–80%) and very high (80–100%) as previously described [14, 15]. Moreover, the legibility of each report type was subjectively rated on a five-point scale as previously described [14, 15]. Time spent on reporting was documented during report generation. User contentment was assessed with a questionnaire based on a ten-point visual analogue scale as previously published.
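
The five quality categories can be expressed as a simple mapping from an overall quality score to a label. The following is a minimal sketch; the function name `quality_category` is hypothetical, and because the published ranges share their boundary values, it assumes that a boundary score (for example exactly 80%) falls into the higher category, in line with the "80% or higher" criterion used in the sample size calculation.

```python
def quality_category(score_percent: float) -> str:
    """Map an overall report quality score (0-100 %) to its category label.

    Assumption: boundary values belong to the higher category, e.g. a score
    of exactly 80 % counts as "very high". The source does not specify this.
    """
    if not 0 <= score_percent <= 100:
        raise ValueError("score must lie between 0 and 100 percent")
    # Upper bounds (exclusive) of the four lower categories
    upper_bounds = [
        (20, "insufficient"),
        (40, "poor"),
        (60, "moderate"),
        (80, "high"),
    ]
    for upper, label in upper_bounds:
        if score_percent < upper:
            return label
    return "very high"


print(quality_category(35.1))
print(quality_category(91.8))
```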

Statistical analysis

Data are reported as mean ± standard deviation (SD). To compare report evaluations and questionnaire findings, the Wilcoxon signed-rank test for paired ordinal data was applied. Additional possible correlations were evaluated using linear regression analysis, and inter-rater reliability was tested by Fleiss’ kappa [25]. A p-value of less than 0.05 was defined as statistically significant. All statistical tests were performed utilizing SigmaPlot 12 (Systat Software, Inc., San Jose, CA, USA).
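
To illustrate how the inter-rater statistic is obtained, the following minimal sketch implements the standard Fleiss' kappa formula in plain Python. The function name `fleiss_kappa` and the example ratings are hypothetical and are not taken from the study data; the study itself used SigmaPlot.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa; counts[i][j] = raters who put report i into category j."""
    n_reports = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every report needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Observed agreement per report, averaged over all reports
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_reports

    # Agreement expected by chance from the overall category proportions
    total_ratings = n_reports * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total_ratings) ** 2
        for j in range(n_categories)
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined if all ratings share one category")
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: two raters, four reports, two categories
ratings = [[2, 0], [2, 0], [0, 2], [0, 2]]
print(fleiss_kappa(ratings))
```

In this hypothetical example both raters agree on every report, so the statistic equals 1. Values near 0 indicate agreement no better than chance.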

Results

Report analysis

In total, 144 anonymized reports (72 SRs and FTRs each) were derived from all three course parts. Report evaluation revealed that using an SR template led to a significantly increased comprehensiveness in all categories (95.6% vs. 26.4%, p < 0.001). To be more precise, structured reporting produced higher completeness ratings in terms of reported lymph node levels (92.3% vs. 17.3%, p < 0.001), major blood vessels (98.8% vs. 15.5%, p < 0.001) and salivary glands (97.8% vs. 59.3%, p < 0.001). Additionally, pathologies were reported significantly more accurately and in greater detail using structured reporting (72.3% vs. 58.9%, p < 0.001). The average time needed to finalize the report was also significantly shorter for SRs (99.1 s vs. 115.0 s, p < 0.001). SRs were significantly more legible than FTRs (100% vs. 52.4%, p < 0.001). Consequently, overall report quality was significantly better in SRs than in FTRs (91.8% vs. 35.1%, p < 0.001), with a positive correlation between high-quality reports and structured reporting (91.7% vs. 6.0%, p < 0.001). More details of the report analysis are given in Fig. 2.

Fig. 2

Results of overall report analysis. Structured reports (SR) received significantly better completeness ratings in terms of cervical lymph nodes, major neck vessels and salivary glands than free text reports (FTR, a). Moreover, pathologies were described in significantly greater detail and reports were significantly more legible, resulting in a significantly superior overall report quality when using SRs (b). Mean time needed to generate the report was significantly shorter using structured reporting (c). *p < 0.05

Next, the participants’ individual longitudinal learning progress throughout the three course parts was evaluated. For SRs, data analysis showed a progressive time efficiency in course II (− 16.1 s, p = 0.072), which continued and reached statistical significance in course III (− 20.1 s, p = 0.036) when compared to baseline. This effect was not observed in FTRs, which showed constant time requirements in courses II (− 1.1 s, p = 0.463) and III (− 0.48 s, p = 0.479). Moreover, FTRs revealed a significant absolute decrease in overall report quality in course II (− 15.8%, p = 0.009) as well as in course III (− 10.7%, p = 0.04) when compared to baseline. This significant decrease in overall report quality was not observed in SRs, neither in course II (− 2.2%, p = 0.09) nor in course III (− 6.2%, p = 0.084). More details concerning the report progress analysis can be found in Fig. 3.

Fig. 3

Results of report progress analysis throughout the three course parts. Structured reports (SR) showed a significant increase in time efficiency (b) without compromising overall report quality (a). In contrast, no increase in time efficiency (b) and a significant decrease in report quality (a) were seen in free text reports (FTR). *p < 0.05

Additionally, only structured reporting produced a very high inter-rater reliability with a Fleiss’ kappa of 0.9.

User contentment

Overall, the user contentment questionnaire showed that the participating residents significantly favored structured reporting (8.3 vs. 6.3, p < 0.001). In detail, using SRs was thought to generate a superior report quality (8.7 vs. 5.2, p = 0.005) and to be supportive for residents learning to report head and neck ultrasound studies (8.5 vs. 6.9, p = 0.017). All other questions revealed a tendency towards a preference for SRs without reaching statistical significance (see Fig. 4).

Fig. 4

Visual analog scale (VAS) of questionnaire findings. User contentment of participants was evaluated using a questionnaire incorporating a VAS (10: complete agreement, 0: complete disagreement). Examining residents were asked about practicability (Q1: practicability), usefulness in everyday practice (Q2: everyday practice), improvement in report-quality (Q3: quality improvement), time efficiency (Q4: time efficiency), justification of additional time needed (if applicable, Q5: justif. add. time), benefits for inexperienced physicians conducting (Q6: benefits conducting) and reporting (Q7: benefits reporting) ultrasound studies of the head and neck and usability by intuition (Q8: intuition) of structured reports (right side, blue bars) and free text reports (left side, red bars). The questionnaire revealed a significant overall preference for structured reports and a tendency in all subcategories. *p < 0.05

Discussion

Over the course of the last few decades, ultrasound studies of the head and neck have evolved into the gold standard in the diagnostic workup of a great variety of pathologies in otorhinolaryngology [1,2,3,4,5]. Despite its great importance for clinical practice and decision-making, there is almost no training in reporting in most departments [8]. The report of any imaging technique represents the essence of the examination since it transmits its content and conclusion. Additionally, it is the baseline for follow-up examinations, which are frequently carried out in head and neck oncology [5, 26]. The head and neck region comprises a multitude of delicate structures within a rather small space. This makes their three-dimensional topography more complicated to interpret, which affects the reporting of any imaging technique [27]. Therefore, the implementation of structured reporting tools has the potential to overcome these challenges [14, 15].

Structural report content, terminology as well as important anatomical structures and their mutual relevance may be incomprehensible to inexperienced physicians because of a general lack of report training. Structured reporting has been promoted by multiple societies and publications as a way to address these difficulties. It has the capability to lead inexperienced examiners through the process of examination and reporting by proposing important anatomical structures and their reciprocal orientation along with appropriate language to describe them [28].

Our analysis revealed that using SRs leads to a significantly higher report completeness, a more detailed description of pathologies and a better report legibility, resulting in a higher overall report quality. In addition, the average time to create a report was significantly shorter for SRs. Evaluation of user contentment revealed a significant overall preference for SRs with a focus on improvement of report quality and support in report training. These results are in accordance with previous publications that studied the impact of structured reporting on a variety of imaging techniques, including head and neck ultrasound [12, 14, 15, 18,19,20,21, 24].

Moreover, SRs have been shown to reduce grammatical or orthographical mistakes for inexperienced and especially non-native residents in the era of globalization and rural depopulation with an increasing need for telemedical consulting [29, 30]. Additionally, SRs have been associated with a reduced number of missed pathologies, a higher diagnostic accuracy and an improved intra- and inter-rater reliability, as underlined by our results [13, 16, 19, 31].

It remains unclear to what extent structured reporting supports the learning process of diagnostic modalities [28]. Previous publications from our study group have pointed out a positive influence of structured reporting on report quality and time efficiency during medical school [15]. It is not yet known whether inexperienced examiners, whether medical students or residents, benefit from an early implementation of this technology or whether fundamental knowledge, which is indispensable for free text reporting, should be acquired before implementation. Additionally, it is unclear whether these positive effects are attributable only to the implementation or whether they progress longitudinally over time. The latter would most likely indicate a sustainable additive educational effect of structured reporting. As far as we know, there have not been any longitudinal studies concerning the impact of structured reporting on potential training effects. Our data provide evidence for the first time that improvement of report quality is not exclusively caused by the implementation of an SR template itself. Participating residents created superior reports in terms of quality and time efficiency using structured reporting already at the time of implementation, which is in line with other recent studies [14, 15, 20]. Consequently, the implementation itself constitutes a benefit in report quality for trainees of the diagnostic modality. This conflicts with the hypothesis of earlier publications that the introduction of structured reporting results in an initial decrease of time efficiency [32]. In contrast to the latest SR technologies, first-generation SR templates were shown to be insufficiently intuitive, which resulted in an initial impairment of workflow [33].

As stated before, the initial loss in time efficiency in other studies may not be solely attributed to the introduction of structured reporting into clinical practice [32]. A more decisive factor seems to be that most physicians have received training in free text reporting over the past decades.

Whether this instant improvement in report quality and workflow may be offset over time by the ceiling effect of the individual learning curve with both report types is of central importance for characterizing the learning effect of structured reporting. The longitudinal analysis revealed a progressive time efficiency using SRs in course II, which was even more pronounced in course III. In contrast, no improvement in time efficiency was observed using FTRs, neither in course II nor in course III. Additionally, the overall report quality of FTRs deteriorated significantly in course II and remained significantly inferior in course III. Even though there was a tendency towards a decline in overall report quality in SRs as well, this trend did not reach statistical significance.

These findings may be explained by the fact that experienced physicians often concentrate on the main problem for which an examination is carried out, while neglecting other less important or unremarkable findings. This may result in a reduced overall completeness. Additionally, the pathologies presented to the participants during the three course parts were chosen to be increasingly complex and difficult to report. Reporting on a complex pathology in a detailed manner requires experience and is time-consuming. Consequently, a gain in speed due to improved routine may be offset by more dedicated and detailed reporting. Therefore, the increase in complexity may have outweighed the individual learning curve, resulting in a decrease in report quality and time efficiency. This was most evident in the FTR group in course II, in which a substantial decline in report quality was observed. The decline was partially compensated by the individual learning progress between courses II and III, but report quality remained significantly inferior to baseline values.

The early introduction and the consistent application of structured reporting resulted in a continuing increase in time efficiency while upholding report quality at the same time. Both factors are promoted by the pre-defined structure and redundancy of the report. Also, clickable decision trees prevent physicians from neglecting additional findings by repeated querying. All of these factors facilitate an efficient workflow and may therefore explain the significant preference for SRs in this study. Continued improvement of report quality and facilitation of training may be a remedy for the struggle most diagnostic departments face with queries caused by incomplete and ambiguous reports [32].

Finally, participating residents uniformly stated that the use of SRs offers an increase in report quality and supports the learning process and its continued improvement over time. Whether these factors lead to an improved quality of diagnostic and therapeutic services resulting in an improved patient outcome has to be evaluated by future studies. Nonetheless, studies have shown that structured reporting greatly facilitates the compliance with clinical guidelines and therefore with evidence-based medicine [28].

Conclusions

In conclusion, SRs should be considered as the report type of choice for head and neck ultrasound studies during residency. Early implementation of structured reporting results in an increased longitudinal time-efficiency while upholding the report quality at the same time. Superior outcomes in terms of comprehensiveness, legibility and time-efficiency can be observed immediately after implementation. Progressive time efficiency and maintained report quality over time may suggest a sustainable learning effect due to the use of SRs which reflects an improved workflow. These superior findings are substantiated by the fact that residents significantly favor SRs. Therefore, we recommend that structured reporting of head and neck ultrasound studies should be implemented early on during residency.