Introduction

Deep learning (DL) is an advanced form of artificial intelligence that has revolutionized computer vision and rapidly expanded into a broad range of science and engineering disciplines, including medical imaging [1].

New DL applications to musculoskeletal radiology have rapidly expanded over the past decade, including image reconstruction, image data transformation, tissue segmentation, body compositional analysis, opportunistic screening, workflow support, and disease detection [2,3,4,5,6].

We provide a perspective on the prospects and value propositions of DL applications in musculoskeletal radiology, their adoption into clinical practice, and the obstacles to that adoption.

Image reconstruction and transformation

Continuous improvement of the efficiency of musculoskeletal magnetic resonance imaging (MRI) is essential for retaining and further increasing its value by expanding MRI availability and accessibility, improving tolerability, decreasing motion artifacts, decreasing the need for sedation and anesthesia, and augmenting throughput [7, 8]. Novel DL-based image reconstruction algorithms are prime examples of how DL satisfies this need and delivers promised value propositions in musculoskeletal radiology.

Novel DL reconstruction algorithms correct aliasing artifacts, signal loss, and noise amplification with previously unobtainable effectiveness, permitting reconstruction of 4-fold accelerated MRI acquisitions with better image quality than widely used 2-fold parallel imaging acceleration [9]. Scanner-based DL image reconstruction algorithms can further reduce the acquisition times of a broad range of accelerated pulse sequences by 50% or more without delays, external software, or additional information technology infrastructure, enabling clinical MRI exams of under 5 min [10].

Similarly promising are DL algorithms for noise reduction of computed tomography (CT) images, which have great potential to redefine both the thresholds of reduced-dose CT and frontiers of ultra-high spatial resolution CT [11].

DL-based imaging data transformation is another exciting avenue that promises multiple benefits, including efficiency gains, reduced radiation exposure, and shorter acquisition times. DL algorithms can accurately and automatically transform MRI data into synthetic CT images with measurable Hounsfield unit data and show promise to synthetically add fat suppression to native MR images [12,13,14,15].

Tissue segmentation

Tissue segmentation is an essential component of articular cartilage mapping and compositional analysis of muscle bulk and fatty tissues, and it is often a prerequisite for advanced tasks, such as lesion detection [5, 16, 17]. Humans can perform segmentation tasks accurately, but most of these tasks are time-consuming and tedious. The speed of DL-based tissue segmentation promises great efficiency gains that may permit the routine inclusion of tissue composition information in radiology reports, including opportunistic screening for sarcopenia, obesity, and osteoporosis, as well as volumes and fat fractions of muscle tissues [18].

Workflow enhancements

DL algorithms give rise to multiple opportunities for improving musculoskeletal workflows, including intelligent and adaptive hanging protocols, speech recognition, report generation, scheduling, precertification, and billing.

Prospective protocoling of MRI and CT studies is an important task that ensures every patient receives the most appropriate imaging exam. However, the synthesis of demographic information, surgical and medical history, allergies, history of present illness, previous imaging studies, pathological diagnosis, microbiological analysis, and laboratory data is a time-consuming process, which can be greatly expedited with DL support [19].

The ability to customize hanging protocols is a much desired and utilized feature for the systematic evaluation of musculoskeletal imaging studies. However, the current hanging protocol capabilities of picture archiving and communication systems (PACS) are often unsatisfactory due to the myriad variations of imaging protocols and labels, which forces radiologists to spend a large amount of time manually rearranging image series. While many conventional hanging protocols depend on DICOM header information, DL algorithms can be trained to recognize countless image characteristics for accurately assigning viewports, display modes, and features, including imaging modality, anatomy, contrast weighting, reconstruction kernel, and spatial orientation [20].

Computerized speech recognition for report generation is now used almost ubiquitously in musculoskeletal radiology. While it has increased efficiency and is cost-effective, the phonetic error rate has increased. More than 20% of radiology reports may contain errors, including wrong laterality, wrong-word substitutions, nonsense phrases, missing words, and misspellings [21]. Advanced DL-based algorithms could provide important safeguards and reduce error rates by drawing on additional context about the patient, the study, and individual idiosyncrasies [22].

DL-based scheduling support is another potentially high-yield area where advanced algorithms could add tremendous value. By incorporating multiple layers of information and patterns based on patient demographics, past behavior, socioeconomic data, site-specific idiosyncrasies, referrers, planned procedures, as well as weather, traffic, holidays, public health, and seasonal information, DL algorithms could accelerate scheduling, reduce no-show rates, and permit sustainable overbooking to avoid empty imaging slots [23,24,25].
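As a rough illustration of how predicted no-show probabilities could inform overbooking, the following Python sketch books patients until the expected number of attendees reaches the number of available slots. The probabilities are hypothetical stand-ins for the output of a prediction model, and the greedy booking rule is a deliberately simple assumption rather than a method described in the cited studies.

```python
def expected_attendance(no_show_probs):
    """Expected number of patients who show up, given per-patient no-show probabilities."""
    return sum(1.0 - p for p in no_show_probs)


def bookings_to_fill(slots, candidate_no_show_probs):
    """Book candidates in order until expected attendance reaches the slot count."""
    booked = []
    for p in candidate_no_show_probs:
        if expected_attendance(booked) >= slots:
            break
        booked.append(p)
    return booked


# Hypothetical predicted no-show probabilities for patients awaiting an appointment.
candidates = [0.05, 0.30, 0.10, 0.40, 0.20, 0.25, 0.15, 0.35]
booked = bookings_to_fill(slots=4, candidate_no_show_probs=candidates)

print(f"Booked {len(booked)} patients for 4 slots")
print(f"Expected attendance: {expected_attendance(booked):.2f}")
```

A practical scheduler would also have to weigh the cost of the occasional overfilled session against the cost of empty slots, which this sketch ignores.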

Disease detection

The value proposition of disease-detecting DL algorithms rests primarily on improved diagnostic performance and on reduced human subjectivity and errors due to distraction and fatigue. DL algorithms can also incorporate many complex and higher dimensional radiomics features for diagnosis [26, 27].

Fundamental questions that require answers before the widespread adoption of disease-detecting DL algorithms into clinical practice include whether DL algorithms measurably improve the already low (3–5%) error rate of radiologists [21] and what type of systematic errors DL algorithms make (e.g., subtle errors versus egregious errors). A subtle error may constitute a missed single-bundle partial-thickness anterior cruciate ligament tear, which radiologists may or may not diagnose, whereas an egregious error may constitute a missed full-thickness double-bundle anterior cruciate ligament tear, which most radiologists would diagnose [28].

A recent meta-analysis summarized the performance of DL algorithms in detecting osseous fractures [29]. A total of 5914 radiographic examinations with 1343 fractures (prevalence, 22.7%) qualified for inclusion. The DL algorithms made 1086 true-positive (18.4%), 259 false-positive (4.3%), 257 false-negative (4.3%), and 4312 true-negative (73%) diagnoses, which suggests a pooled area under the curve of 87%, sensitivity of 80%, and specificity of 94%. While the diagnostic performance is promising, several of the included studies showed similar or higher diagnostic performance of human interpretations [30,31,32,33,34]. Similarly, studies evaluating the diagnostic performance of DL algorithms in detecting and classifying internal derangement of the knee joint on MRI have also demonstrated that human performance is similar to or better than DL performance [17, 35,36,37,38].
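The relationship between the reported counts and the summary metrics can be checked with a few lines of arithmetic. The Python sketch below derives crude prevalence, sensitivity, and specificity from the four cell counts given above; the function name is illustrative and not taken from the meta-analysis. Crude values computed this way need not match pooled estimates exactly, because meta-analyses typically pool study-level estimates with a statistical model.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Crude diagnostic metrics from the four cells of a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "total": total,
        "prevalence": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),   # detected fractures / all fractures
        "specificity": tn / (tn + fp),   # correct negatives / all non-fractures
    }


# Cell counts as reported in the text for the fracture meta-analysis [29].
metrics = diagnostic_metrics(tp=1086, fp=259, fn=257, tn=4312)

for name, value in metrics.items():
    if name == "total":
        print(f"{name}: {value}")
    else:
        print(f"{name}: {value:.1%}")
```

With these counts, the crude sensitivity is about 80.9% and the crude specificity about 94.3%, close to the pooled values of 80% and 94% quoted above.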

Although DL algorithms have demonstrated performance similar to that of humans, studies have also shown that the combination of DL algorithms and radiologists performs better than either alone, suggesting that the most viable clinical practice model might be musculoskeletal radiologists using DL algorithms for interpretations rather than DL algorithms working independently [36, 39].

Clinical efficiency gains

DL algorithms may reduce the time needed to read musculoskeletal studies. A recent study demonstrated reduced reading time when using a DL algorithm to detect rib fractures on chest CT exams compared to interpreting the CT exams without the algorithm [40]. However, promised efficiency gains by DL algorithms might require careful consideration if study designs did not include a clinically applicable workflow.

For example, a DL algorithm proposed for detecting a certain musculoskeletal condition in a certain anatomical region may be studied with a test set of 250 radiographs. The study may conclude that the radiologists read cases 30% faster with the DL algorithm while maintaining high diagnostic performance, which could equate to finishing a day’s work earlier or interpreting more cases during the workday. However, the efficiency gains derived by such a study design may not translate into clinical practice because the 250 radiographs of a certain anatomical region will most likely not be read back-to-back but will be scattered within a mixed volume of cases, meaning the efficiency gains will span over a longer period of time than proposed.
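The arithmetic behind this caveat can be sketched in Python as follows. The 250 radiographs and the 30% figure come from the example above, with "30% faster" interpreted here as 30% less reading time per case; the baseline reading time per radiograph and the number of eligible radiographs per workday are hypothetical values chosen only for illustration.

```python
import math


def time_saved_minutes(n_cases, baseline_min_per_case, time_reduction):
    """Reading time saved when each of n_cases takes time_reduction less time."""
    return n_cases * baseline_min_per_case * time_reduction


# From the example in the text.
N_STUDY_CASES = 250
TIME_REDUCTION = 0.30

# Hypothetical assumptions, not taken from any study.
BASELINE_MIN_PER_CASE = 2.0   # assumed minutes per radiograph without DL support
ELIGIBLE_CASES_PER_DAY = 10   # assumed eligible radiographs in a mixed daily worklist

total_saved = time_saved_minutes(N_STUDY_CASES, BASELINE_MIN_PER_CASE, TIME_REDUCTION)
saved_per_day = time_saved_minutes(ELIGIBLE_CASES_PER_DAY, BASELINE_MIN_PER_CASE, TIME_REDUCTION)
days_to_read_all = math.ceil(N_STUDY_CASES / ELIGIBLE_CASES_PER_DAY)

print(f"Saved if read back-to-back: {total_saved:.0f} min in one session")
print(f"Saved in a mixed worklist: {saved_per_day:.0f} min per day over {days_to_read_all} days")
```

Under these assumptions, the same 150 min of saved reading time accrues as roughly 6 min per day over 25 workdays rather than as a single block of freed time.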

Additional statistically robust studies comparing the impact of DL algorithms on diagnostic accuracy and interpretation speed are necessary to fully understand the value of DL algorithms for musculoskeletal disease detection.

Practice integration

In addition to the central focus on diagnostic performance and efficiency gains, an important consideration for delivering the value propositions of DL algorithms is the successful integration into the workflow of musculoskeletal radiologists. Despite a growing number of regulatory agency-approved DL algorithms and artificial intelligence products [41], widespread adoption by musculoskeletal radiologists has not yet occurred. While hesitancy may be due to uncertainties about diagnostic performance improvements and clinical efficacy gains, challenges in successfully integrating well-validated DL algorithms into the workflow of musculoskeletal radiologists also represent a major roadblock.

Given the current focus of DL algorithms on single tasks, comprehensive artificial intelligence support for musculoskeletal radiology reports will likely require the adoption of multiple algorithms. The most effective algorithms will likely be from different vendors and include detection, segmentation, classification, and prediction of multiple conditions, including fractures, masses, degeneration, deformities, malalignment, bone density, and soft tissue lesions.

While the practice integration of a single algorithm may already pose substantial challenges, the integration and management of multiple algorithms is difficult and may represent the largest roadblock for practical integration of DL algorithms.

Fundamental prerequisites for seamless integration of DL algorithms include routing imaging studies to the DL algorithm, fast image processing, presentation and formatting of the results, and interfacing with viewing software and dictation systems. Vendor-specific app stores featuring DL algorithms for their hospital and radiology information systems, PACS, and post-processing software may provide solutions that can utilize and integrate seamlessly into preexisting infrastructure. However, vendor-neutral infrastructures providing a universal plug-and-play environment may ultimately be needed for successful practice integration and management [42].

It is important to note that many DL algorithms may render a diagnosis regardless of what images are provided as input. For example, a bone age–determining algorithm will return a bone age even if presented with a chest radiograph or a photo of a cat [43]. Therefore, the routing of accurately preselected images to the DL algorithms is of paramount importance, as many DL algorithms cannot automatically select appropriate images and reject inappropriate ones. For such DL algorithms, imaging studies will have to be either curated digitally, which may be achieved with other DL algorithms, or routed manually [44].
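One simple way to guard against inappropriate inputs is a rule-based gate placed in front of the algorithm. The Python sketch below is a minimal illustration, assuming that each study carries DICOM-style header metadata with Modality and BodyPartExamined attributes, represented here as a plain dictionary; the function names, the accepted values, and the placeholder algorithm are hypothetical. Header values can be missing or mislabeled in practice, which is one reason image-based DL classifiers may be needed for this step.

```python
# Hypothetical acceptance criteria for a bone age algorithm that expects hand radiographs.
ACCEPTED_MODALITIES = {"CR", "DX"}   # computed and digital radiography
ACCEPTED_BODY_PARTS = {"HAND"}


def is_eligible(metadata):
    """Return True only if the header metadata matches the expected input."""
    modality = str(metadata.get("Modality", "")).strip().upper()
    body_part = str(metadata.get("BodyPartExamined", "")).strip().upper()
    return modality in ACCEPTED_MODALITIES and body_part in ACCEPTED_BODY_PARTS


def route_study(metadata, algorithm):
    """Run the algorithm on eligible studies; otherwise reject for manual review."""
    if is_eligible(metadata):
        return {"status": "processed", "result": algorithm(metadata)}
    return {"status": "rejected", "result": None}


def placeholder_bone_age(metadata):
    """Stand-in for a real DL model; returns a fixed dummy value."""
    return "bone age estimate (dummy)"


hand = {"Modality": "DX", "BodyPartExamined": "hand"}
chest = {"Modality": "DX", "BodyPartExamined": "CHEST"}
unlabeled = {"Modality": "CR"}   # body part missing from the header

for study in (hand, chest, unlabeled):
    print(study, "->", route_study(study, placeholder_bone_age)["status"])
```

In this sketch, the unlabeled study is rejected rather than processed, so that missing metadata leads to manual review instead of a potentially meaningless result.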

An additional requirement for the successful adoption of DL into daily practice is the PACS integration of DL results to eliminate the need to open and switch between multiple applications while interpreting exams using algorithms from different vendors.

Costs versus benefits

The ultimate question regarding using DL algorithms in clinical practice may pertain to the tradeoffs between benefits and costs. The answer will need to factor in the true diagnostic value, efficiency gains, and problem-solving abilities of DL algorithms.

The cost of DL algorithms may be offset through efficiency gains. However, musculoskeletal radiologists and DL algorithms may detect the vast majority of musculoskeletal diseases and injuries with similar accuracy, and indeterminate musculoskeletal tumors may still need tissue sampling despite a DL-augmented differential diagnosis.

For DL algorithms that increase the diagnostic performance for disease detection, multiple business models are under debate, including radiology departments bearing the cost, managed healthcare bearing the cost by paying for DL with the hope of recouping the cost through improved patient outcomes, or shifting the cost to patients.

Each radiology department will need to prioritize which DL algorithms should be purchased based upon their individual needs. The needs of musculoskeletal radiologists are different from those of neuroradiologists or breast imagers. Therefore, the impact and value of DL need to be carefully evaluated for each use case before adoption.

DL algorithms that do not result in meaningful improvements in diagnostic performance and productivity may still be beneficial by reducing fatigue and stress, which could help combat the concerning 80% prevalence of burnout symptoms that a study by the Society of Skeletal Radiology found among musculoskeletal radiologists [22].

Conclusion

DL offers musculoskeletal radiology exciting possibilities, including faster MRI, reduced-dose CT, image data transformation, automated tissue segmentation, workflow support, and disease detection. Novel DL-based image reconstruction algorithms are prime examples of how DL algorithms deliver promised value propositions in musculoskeletal radiology. Although additional studies are needed to understand the value and impact of DL algorithms on clinical practice, DL technology will likely play an important role in the future of musculoskeletal imaging.