Introduction

Ultrasonography (US) is currently regarded as a method of choice for noninvasive diagnosis of rotator cuff lesions. Yet, US is often cited as the most operator-dependent imaging test of the shoulder [1, 2]. For more than 20 years, many studies have documented the accuracy of sonography in the detection of rotator cuff tears, with variable results [3–6]. Although technical developments and increased experience have significantly improved sonographic results, US of the shoulder remains susceptible to interobserver variability and has a long learning curve owing to the complex shoulder anatomy and various pitfalls [2]. Standard views are difficult to reproduce, and changes in echogenicity within the tendon due to artifacts or malpositioning of the transducer are frequent. Thus, the success of US depends largely on the experience of the operator [2]. Operator dependence, which is frequently considered a limitation of US, may be the most likely cause of the variation in reported accuracy [1, 7]. Yet, interobserver variability of US has not been exhaustively investigated. Most previous studies dealt only with the performance of expert sonographic operators [6, 8, 9]. Hence, we chose to compare a standard sonographic operator, reflecting the average level attained by most radiologists in routine practice, with an expert sonographic operator. When US produces unclear findings, shoulder magnetic resonance (MR) arthrography may be used to achieve an accurate diagnosis, especially if conservative treatment is unsuccessful or if patients are destined for surgery [10, 11].

The purpose of this study was to evaluate the role of the operator’s experience in sonography of the painful shoulder and to establish the real diagnostic accuracy of US in routine practice. To refine the search for interobserver variability, we tested the interobserver reproducibility between two radiologists with different levels of sonographic experience with MR arthrography as the reference standard.

Materials and methods

Between July 2005 and February 2006, 65 consecutive patients with a high clinical suspicion of rotator cuff lesion and no previous history of shoulder trauma or surgery were referred to our institution for MR arthrography and considered for inclusion in this prospective study. All of these patients had been initially evaluated clinically for shoulder pain of more than 6 weeks' duration by a single orthopedic surgeon with subspecialty training in shoulder and elbow surgery. There were 32 men and 33 women, aged 23 to 75 years (mean, 52.4 years). Informed consent was obtained from all patients before their examination. Patients were examined during a single hospital visit with ultrasonography by two completely independent radiologists, one a “standard” operator and the other an “expert” operator, and with MR arthrography by a third independent radiologist.

Sonography was performed with the HDI 5000 ultrasound unit (ATL-Philips ultrasound) using a 5–12 MHz linear array transducer with optimized settings and automatic variable frequency adjustment depending on the focal depth. The standard sonographic operator had 6 months of experience in musculoskeletal sonography when the study started. The expert sonographic operator had more than 15 years of experience in musculoskeletal US. The two operators had trained in different centers before the study started. Each sonographic examination was performed with the radiologist blinded to the other radiologist’s findings. Patients were scanned while seated. Standard scanning techniques were used [2, 12]. Both observers used similar scanning protocols. Short-axis and long-axis US scans of each rotator cuff tendon and of the long head of the biceps tendon were obtained. The acromioclavicular joint was evaluated via anterior and superior approaches. Imaging parameters such as scanning frequency, focal zone number and placement, and gain were left to the discretion of the operator. After the examinations, each observer filled out a data sheet. The positive findings of interest were full-thickness rotator cuff tears, partial-thickness rotator cuff tears, intratendinous rotator cuff tears, supraspinatus tendinosis, abnormality of the long head of the biceps tendon (tendinosis, subluxation, dislocation, rupture), subacromial bursa abnormalities (fluid or synovitis), and acromioclavicular joint osteoarthritis. These items were scored in keeping with published sonographic descriptions [1, 13, 14]. US imaging criteria for supraspinatus tendinosis were tendon thickening, abnormal patterns of echogenicity with hypoechoic areas, and preserved contours [5]. Subacromial bursa abnormalities were defined as the presence of subacromial fluid or bursal thickening [1].

All arthrographic examinations were performed by a single musculoskeletal radiologist with more than 15 years of experience with the examination technique, who was blinded to the sonographic findings. MR arthrography was performed on the day after the sonographic evaluation. Fluoroscopically guided injection of 10 mL of diluted gadopentetate dimeglumine (Magnevist; Schering, Berlin, Germany) with a concentration of 2.5 mmol/L was performed via an anterior approach [10, 15, 16]. MR imaging was commenced within 30 min after contrast injection with a 1.5-T system (Symphony; Siemens Medical Solutions, Germany) and a dedicated phased array shoulder coil (Siemens). Patients underwent imaging with the arm in neutral position. T1-weighted spin-echo images (TR range/TE range, 500–700/14–20) with fat suppression were obtained in the transverse plane, coronal oblique plane (perpendicular to the glenohumeral joint space), and sagittal oblique plane (parallel to the glenohumeral joint). T2-weighted fast spin-echo images (3,000–4,200/90–120) with fat suppression were acquired in the coronal oblique plane. Slice thickness was 3 to 4 mm. The imaging matrix was 256 × 192 or higher, and the field of view was 14–16 cm.

Arthro-MR imaging criteria for full-thickness tears were extension of paramagnetic contrast through the entire thickness of the rotator cuff and presence of contrast medium in the subacromial bursa. Criteria for articular-sided partial-thickness tears were disruption of the smooth undersurface of the tendon with accumulation of paramagnetic contrast within the substance of the tendon, with no evidence of paramagnetic contrast within the subacromial bursa [10]. In bursal-sided partial-thickness tears, some tendinous fibers on the bursal surface were interrupted. In intratendinous tears, the split was only within the tendon itself, with no communication with the subacromial bursa or the shoulder joint. The original findings on the MR arthrograms were used as reference standards for evaluation of the diagnostic performance of US (Fig. 1).

Fig. 1

Longitudinal (a) and transverse (b) sonograms show a large bursal-sided partial-thickness supraspinatus tendon tear (arrowheads), misinterpreted by the standard sonographic operator as a full-thickness rotator cuff tear. Coronal oblique T2 FSE-weighted and fat-saturated MR image (c) shows a bursal-sided supraspinatus tear (arrowhead) with subacromial bursal fluid. Coronal oblique T1 SE-weighted with fat saturation MR arthrogram (d) demonstrates low signal intensity in the subacromial bursa (arrow), indicating a bursal-sided partial-thickness tear

Cross tabulations of arthro-MR assessments with the diagnoses based on standard operator sonography and expert operator sonography were created. First, diagnostic accuracy (sensitivity, specificity, positive predictive value, negative predictive value) of the standard and expert sonographic operators was calculated with MR arthrography as the reference standard.
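The four accuracy measures follow directly from a 2 × 2 cross tabulation of the US reading against MR arthrography. The sketch below is illustrative only and is not the authors' analysis code. The class and function names are hypothetical, and the example counts are the expert operator's full-thickness tear figures given in the Results (41 of 43 tears detected, 21 of 22 negative cases correctly identified). The resulting predictive values are derived from those counts and are not quoted from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """US finding cross-tabulated against the reference standard (hypothetical helper)."""
    tp: int  # US positive, MR arthrography positive
    fp: int  # US positive, MR arthrography negative
    fn: int  # US negative, MR arthrography positive
    tn: int  # US negative, MR arthrography negative


def diagnostic_accuracy(t: TwoByTwo) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as fractions.

    Zero denominators are not handled; this sketch assumes every
    row and column of the table contains at least one case.
    """
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
    }


# Expert operator, full-thickness tears: 41 of 43 detected,
# 21 of 22 cases without a full-thickness tear read as negative.
expert_full_thickness = TwoByTwo(tp=41, fp=1, fn=2, tn=21)
metrics = diagnostic_accuracy(expert_full_thickness)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The same function can be applied to each lesion category and each operator once the corresponding four counts are known.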

Then, we tested interobserver variability between the standard and expert operators. Statistical analysis was performed using Cohen’s kappa statistic calculated with SPSS 15.0 for Windows. The kappa statistic was interpreted as follows: <0.00 = poor agreement, 0.00–0.20 = slight agreement, 0.21–0.40 = fair agreement, 0.41–0.60 = moderate agreement, 0.61–0.80 = substantial agreement, and 0.81–1.00 = almost perfect agreement [17].
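For a binary finding (present or absent), Cohen's kappa compares the observed proportion of agreement between the two operators with the agreement expected by chance from their marginal rates. The following is a minimal sketch of that calculation, not a reproduction of the SPSS procedure. The function name is hypothetical and the four counts in the example are invented for illustration; they are not taken from the study data.

```python
def cohen_kappa(both_pos: int, a_only: int, b_only: int, both_neg: int) -> float:
    """Cohen's kappa for two raters scoring one binary finding.

    both_pos: both raters positive     a_only: only rater A positive
    b_only:   only rater B positive    both_neg: both raters negative
    """
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n          # raw agreement
    a_pos = (both_pos + a_only) / n               # rater A positive rate
    b_pos = (both_pos + b_only) / n               # rater B positive rate
    # Agreement expected by chance, given each rater's marginal rates.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 65 shoulders (illustration only).
kappa = cohen_kappa(both_pos=40, a_only=3, b_only=2, both_neg=20)
print(f"kappa = {kappa:.2f}")
```

Because kappa corrects for chance agreement, it can be noticeably lower than the raw percentage of agreement when a finding is very common or very rare in the sample.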

Results

MR arthrography

At MR arthrography, 74 rotator cuff tears were diagnosed in 65 patients, including 43 full-thickness, 17 partial-thickness (ten articular-sided and seven bursal-sided), and 14 intratendinous rotator cuff tears. The full-thickness tears involved the supraspinatus tendon only in 28 patients, both the supraspinatus and infraspinatus tendons in ten, the infraspinatus tendon only in one, and the subscapularis tendon in four.

The articular-sided partial-thickness tears involved the supraspinatus tendon in seven patients, the infraspinatus tendon in one, and the subscapularis tendon in two. The bursal-sided partial-thickness tears and the intratendinous tears were isolated to the supraspinatus tendon in all the patients (Fig. 2).

Fig. 2

Longitudinal (a) sonogram shows a small articular-sided partial-thickness supraspinatus tear (arrowhead) misinterpreted by both sonographic operators as a supraspinatus tendinosis. Retrospectively, transverse sonogram reveals slightly hyperechoic cartilage (arrowhead) at the site of the tear (b). Coronal oblique T1 SE-weighted with fat saturation MR arthrogram (c) exhibits an articular-sided partial-thickness rotator cuff tear

Supraspinatus tendinosis without tendon tear was observed in 18 cases. Long head of biceps tendon abnormality was diagnosed in seven cases. The diagnosis of subacromial bursa abnormality was established in 55 patients. Acromioclavicular osteoarthritis was documented in 36 patients (Fig. 3).

Fig. 3

Longitudinal sonogram (a) shows a small intratendinous supraspinatus tear (arrowheads) misinterpreted by the standard sonographic operator as a supraspinatus tendinosis. Coronal oblique T2 FSE-weighted and fat-saturated MR image (b) demonstrates an intratendinous supraspinatus tear with a split only in the tendon itself (arrowhead). Coronal oblique T1 SE-weighted with fat saturation MR arthrogram (c) exhibits low signal intensity in the intratendinous tear (arrowhead) with no communication with the shoulder joint

Ultrasound

The standard and expert ultrasound operators’ results are summarized in Tables 1 and 2. First, we detail the diagnostic performances of the expert sonographic operator. For full-thickness tears of the rotator cuff, the sensitivity was 95.3% (41 of 43 cases) and specificity was 95.5% (21 of 22 cases). For partial-thickness tears, the sensitivity was 70.6% (12 of 17 cases). Seven of ten articular-sided and five of seven bursal-sided tears were correctly diagnosed. For intratendinous tears, the sensitivity was 64.3% (nine of 14 cases). For abnormality of the long head of biceps tendon and tendinosis of the supraspinatus, the sensitivity was 100% (seven of seven cases) and 88.9% (16 of 18 cases), respectively. For subacromial bursa abnormalities and acromioclavicular joint osteoarthritis, the sensitivity was 96.4% (53 of 55 cases) and 91.7% (33 of 36 cases), respectively.

Table 1 Diagnostic accuracy (sensitivity, specificity, positive predictive value, negative predictive value) of the standard sonographic operator with use of MR arthrography as the reference standard
Table 2 Diagnostic accuracy (sensitivity, specificity, positive predictive value, negative predictive value) of the expert sonographic operator with use of MR arthrography as the reference standard

Then, we present the interobserver variation (Table 3). There was very good agreement between the expert and the standard sonographic operators on almost all criteria, except partial-thickness and intratendinous rotator cuff tears.

Table 3 Interobserver variability between the standard and expert sonographic operators (statistical analysis using Cohen’s kappa statistic)

Almost perfect agreement was observed for full-thickness rotator cuff tears (κ = 0.90), subacromial bursa abnormality (κ = 0.891), abnormality of the long head of biceps tendon (κ = 0.84), and acromioclavicular osteoarthritis (κ = 0.815). Highly substantial agreement was found for supraspinatus tendinosis (κ = 0.80). For intratendinous and partial-thickness rotator cuff tears, the interobserver agreement was lower, ranging from moderate to substantial (κ = 0.57 and κ = 0.63, respectively).
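The agreement labels used here can be recovered mechanically from the kappa values with the interpretation scale cited in the Methods [17]. The sketch below is an illustration under stated assumptions: the function and dictionary names are hypothetical, values below zero are treated as poor agreement, and each band is taken to include its upper bound. The kappa values are those reported in this section.

```python
def agreement_band(kappa: float) -> str:
    """Map a kappa value to an agreement label.

    Assumes values below zero count as "poor" and that each band
    includes its upper bound (for example, 0.80 is still "substantial").
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values as reported in the text for each sonographic finding.
reported_kappa = {
    "full-thickness tear": 0.90,
    "subacromial bursa abnormality": 0.891,
    "long head of biceps abnormality": 0.84,
    "acromioclavicular osteoarthritis": 0.815,
    "supraspinatus tendinosis": 0.80,
    "partial-thickness tear": 0.63,
    "intratendinous tear": 0.57,
}

bands = {finding: agreement_band(k) for finding, k in reported_kappa.items()}
for finding, band in bands.items():
    print(f"{finding}: kappa = {reported_kappa[finding]} ({band})")
```

On this scale, supraspinatus tendinosis sits at the very top of the substantial band, which is consistent with its description as highly substantial agreement.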

In summary, the level of interobserver variability in the sonographic detection and characterization of full-thickness rotator cuff tears, supraspinatus tendinosis, abnormalities of the long head of biceps tendon, subacromial bursa abnormalities, and acromioclavicular osteoarthritis was low in our study. The interobserver variation was higher for partial-thickness and intratendinous rotator cuff tears.

Discussion

Sonography is a noninvasive, dynamic, inexpensive, and widely available imaging technique for assessment of the painful shoulder [12, 18]. Many studies have documented the accuracy of sonography in the detection of full-thickness rotator cuff tears. Sensitivities range from 94% to 100%, and specificities from 91% to 94% [3, 4, 6, 19]. Hence, US is increasingly used for initial evaluation of patients with a painful shoulder. Several factors are known to influence the results of sonographic examination in clinical practice, such as morphological criteria, examination technique, and operator experience [2]. Operator dependence, which is frequently considered a limitation of US, may be the most likely cause of the variation in reported accuracy. Yet, very few published works have addressed interobserver variation [1, 7, 8]. Most previous studies dealt only with the performances of expert operators [6, 8, 9, 19]. Hence, we chose to compare a standard sonographic operator, reflecting the average level obtained by most radiologists in routine, to an expert sonographic operator.

Several limitations may be considered inherent in the materials and methods. First of all, the patients evaluated in this prospective study reflect the type of population seen by a subspecialized shoulder surgeon who primarily evaluates referred patients with a high probability of having significant rotator cuff abnormalities. Thus, the population evaluated in our study is unlikely to match most patient populations referred for shoulder sonography. However, the high percentage of torn cuffs examined probably makes interobserver agreement less likely [8]. In addition, although the total number of patients in our study is relatively high, the results for some specific lesions are of limited statistical value because of the small number of cases. The results based on small subgroups are nevertheless in agreement with those in the literature, confirming in particular the difficulty in diagnosing partial-thickness and intratendinous rotator cuff tears [5, 13].

The second limitation relates to the definition of the standard and expert sonographic operators. First, the standard operator, who had 6 months of experience in musculoskeletal sonography when the study started, may have improved his level during the 8 months the study lasted. Then, the diagnostic accuracy of the expert operator should be discussed. Our expert operator, who had more than 15 years of experience in musculoskeletal sonography, obtained results completely comparable to those found in the recent literature not only for the full-thickness but also for the partial-thickness rotator cuff tears [6, 20].

Another limitation is the lack of tissue harmonic imaging (THI), which may have the potential to help solve diagnostic problems, such as difficulty differentiating partial-thickness from full-thickness rotator cuff tears, or detecting intratendinous tears [21]. So, THI might have improved our results for both the standard and the expert sonographic operators. However, we chose to use conventional US to reflect the way most radiologists work routinely.

Finally, the reference standard chosen in this study should be discussed. Although surgical proof in all patients would have been desirable, many factors influence the choice of whether to perform surgery in patients with a rotator cuff tear. However, our study was conducted with MR arthrography as the “gold” standard. MR arthrography depicts partial-articular cuff tears better than conventional MR imaging because injection of contrast medium into the joint space helps to produce a cleft, especially when torn fibrils spontaneously remain close together [5, 11, 22]. When combined with fat suppression, it provides excellent results in the depiction of both full-thickness and partial-thickness tears [23, 24]. Limitations in the diagnosis of partial-thickness tears are mainly restricted to small articular-sided tears (Ellman grade 1). In addition, no improvement in diagnosing bursal surface partial-thickness tears and intratendinous tears has been documented with MR arthrography in comparison with conventional MR imaging, because no communication exists between these lesions and the articular joint [24]. So, MR arthrography may not be a perfect reference standard, but it is a reasonable one [21]. Interobserver variability related to MR arthrography may also be an issue in our work because, as in earlier reports, all examinations were evaluated by a single musculoskeletal radiologist who was blinded to the sonographic findings [24].

As we wished to refine the search for interobserver variability that may be encountered in routine clinical practice, we focused on the comparison between a standard sonographic operator and an expert sonographic operator. The strengths of our study are as follows: (1) this was a controlled, prospective, double-blinded sonographic analysis of patients with a high clinical suspicion of rotator cuff lesion; (2) the expert sonographic operator was highly experienced, with results comparable to those from the literature; and (3) MR arthrography was used as the most reasonable reference standard for comparison.

Contrary to an earlier report in which reasonable agreement between inexperienced and experienced sonographic operators was seen in tendon calcification only, there was very good agreement in our study between the standard operator and the expert operator on all criteria, except partial-thickness and intratendinous rotator cuff tears [1]. The differences between our study and the previous one might be explained by a better technique of standardization and a more robust assessment of trainees in musculoskeletal US.

For full-thickness rotator cuff tears and subacromial bursa abnormalities, the almost perfect interobserver agreement reflects the accuracy and reproducibility of US for these two major elements of routine shoulder assessment. Here, it is necessary to clarify that we chose to study only the presence or absence of bursal abnormality and not to evaluate whether this abnormality was fluid or synovitis. Interobserver disagreement can indeed be observed on this point because compression is required to elicit displacement of a hypoechoic area in order to confirm its fluid nature [1]. We must also clarify that, in cases of full-thickness rotator cuff tear, the arthrographic injection may have increased the bursal effusion seen at MR arthrography compared with the sonography performed previously.

Besides, our results show that interobserver variability is low concerning abnormalities of the long head of biceps tendon. Ultrasound is known to be very sensitive and specific for detection of long head of biceps tendon pathology in the intertubercular groove, such as tendinosis, ruptures, subluxations, and dislocations. US is, nevertheless, less reliable for detection of long head of biceps tendon intracapsular lesions and partial-thickness tears [25] (Fig. 4).

Fig. 4

Transverse anterior sonogram showing the biceps tendon (arrowhead) medial to the lesser tuberosity and the intertubercular groove (a). The corresponding transverse T1 SE-weighted MR arthrogram confirms a medial dislocation of the biceps tendon (arrowhead) into the subscapularis tendon (b)

Our results also confirm that interobserver variability is low for supraspinatus tendinosis and acromioclavicular osteoarthritis, which are frequently encountered in painful shoulders. On the other hand, the interobserver agreement was lower for intratendinous and partial-thickness rotator cuff tears. In addition, the accuracy of the expert sonographic operator was only moderate for these two lesions, in agreement with the literature [4, 6, 13, 26, 27]. Moreover, the difficulty in distinguishing partial- and full-thickness tears has been shown to be the primary cause of interobserver variability between two experienced observers [8, 9]. Another source of error is the difficulty in distinguishing tendinosis from a small Ellman grade 1 articular-sided partial-thickness tear [8, 9]. In summary, our results primarily reflect the difficulty that sonographic operators, whatever their experience, have in distinguishing extensive partial-thickness tears from full-thickness tears and tendinosis from partial-thickness tears. From a clinical perspective, this main diagnostic limit of sonography should not have a significant impact on patient care because the initial treatment options are generally similar.

Sonography is hailed as an accurate and widely available imaging technique for assessment of the painful shoulder [28]. However, US is often quoted as the most operator-dependent type of imaging tests of the shoulder. Only a limited number of individuals have the expertise to scan shoulders, and an expert operator is frequently not available at many institutions. Consequently, the interobserver variation of US needed to be documented to validate assumptions about its technical performance in routine practice.

Our results show that, in moderately experienced hands as in expert hands, sonography has a low level of interobserver variability for full-thickness rotator cuff tears, supraspinatus tendinosis, long head of biceps tendon abnormalities, subacromial bursa abnormalities, and acromioclavicular osteoarthritis. For partial-thickness and intratendinous rotator cuff tears, which are known to represent the main diagnostic limit of US, our results confirm that interobserver variability is higher. In conclusion, US provides valuable and reproducible information about the condition of the major elements of the painful shoulder, even when used as a routine technique by a standard sonographic operator.