Introduction

Photographs are often included in the operative report of knee arthroscopies but, in the event of a legal investigation, can they be interpreted a posteriori by an expert to hold the surgeon liable?

Inter-observer variations in arthroscopic diagnoses of intra-articular knee lesions exist [1]. Persistent pain and infection are potential complications of all knee arthroscopies and the most frequent reasons for litigation [2, 3]. In the United States, 5% of lawsuits after knee arthroscopy have been reported to be related to misdiagnosis [4]. Thus, accurately diagnosing and classifying intra-articular lesions is essential to justify the therapeutic decision-making process [5]. In the case of a legal investigation, the photographs within the patient file can be used, in addition to the MRI, to assess whether the indication for arthroscopy was justified [6].

Knee arthroscopy is a long-established and common procedure for both diagnostic and therapeutic purposes [7, 8]. More than 167,000 knee arthroscopies were performed in France in 2018, making it the most common procedure in orthopaedic practice [9]. By 2050, meniscal surgery and ligament reconstruction are expected to increase by 5.6% and 1.2% respectively [9], and the number of complications [2] and legal investigations may rise accordingly. The same holds true for other countries, as data from the United States of America are closely comparable [10].

The extent of inter-observer variability reported in the literature is heterogeneous and depends on the structures studied and the methodology used [11, 12]. Most authors use either video recordings or successive arthroscopies as the basis for assessment [13,14,15]. Even when observers perform the arthroscopy themselves, as opposed to simply viewing photographs, the differences in assessment can be significant. In 2002, Javed et al. [16] demonstrated an overall variability of 20% between successive arthroscopies performed by different surgeons evaluating the intra-articular structures of the knee. They stressed the importance of the surgeon’s level of experience.

To our knowledge, the inter-observer reliability of diagnoses made from photographs alone has not yet been evaluated. The objective of this study was to determine whether the a posteriori interpretation of photographs taken during knee arthroscopy by an external observer is reliable and reproducible, in order to judge the relevance of their use in legal investigations. The purpose was to assess:

  • the inter-observer reliability of the detection of a lesion (primary endpoint)

  • the inter-observer reliability of the classification of the lesion in question (secondary endpoint)

  • the comparison between the interpretation of the photographs and the data from the operative report, considered as the gold standard reference (secondary endpoint)

The hypothesis was that arthroscopic diagnoses made from photographs are neither reproducible nor accurate.

Material and method

This was a monocentric, retrospective, observational study conducted in accordance with the principles of the Declaration of Helsinki and the French MR004 methodology. All included patients gave their informed consent.

The study population included patients who underwent a first knee arthroscopy, regardless of indication, between January and May 2018, i.e. 66 arthroscopies. Among these files, six did not contain the necessary photographs and were excluded. This left a consecutive series of 60 patients operated on by the same surgeon, including 25 women (42%) and 35 men (58%). Patient characteristics and data collected from the operative reports are described in Table 1.

Table 1 Population characteristics and peroperative evaluation

Photographs were taken with a Smith & Nephew arthroscopy cart: 660HD Image Management System, 460P 3-CCD Digital Camera. The images were in 720 × 540 JPEG format.

Three observers with experience in knee arthroscopy were chosen from the panel of orthopaedic surgeons at Grenoble University Hospital. One of them was a court-appointed expert. All photographs had been taken through an antero-lateral portal. The investigator presented the photographs of each patient successively on a computer. For each patient, the observer had simultaneous access to all photographs of the diagnostic exploration of the knee, including at least one image per compartment (medial femoro-tibial, lateral femoro-tibial and patello-femoral) and one image of the intercondylar notch. Photographs taken after the therapeutic procedure were excluded. The only information provided was the side of the operated knee, and each observer was blinded to the other observers’ answers. Each observer had to decide on the presence or absence of lesions of the following structures: anterior cruciate ligament, medial meniscus, lateral meniscus, patellar cartilage, trochlear groove cartilage, cartilage of the medial and lateral tibial plateaus, and cartilage of the medial and lateral femoral condyles.

The inter-observer reliability for the detection of the lesions constituting the primary endpoint was thus calculated according to a binary evaluation (structure considered healthy or damaged).

Cartilage lesions were then classified according to the ICRS (International Cartilage Repair Society) classification [17]. A descriptive classification was used for meniscal tears: bucket-handle or longitudinal tear, horizontal tear, radial tear, or meniscal flap. Inter-observer reliability for lesion classification was in turn calculated from this qualitative assessment. The interpretation of the photographs was also compared with the conclusions of the operative report.
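To make this comparison step concrete, the following minimal sketch (not the authors' script; the data, structure names and helper functions are hypothetical) shows how one observer's photograph-based readings could be scored against the operative report taken as the gold standard, both for lesion detection and for lesion classification.

```python
# Hypothetical example: scoring one observer's photograph-based readings
# against the operative report (gold standard). Values are illustrative only.

OPERATIVE_REPORT = {          # conclusions of the operative report
    "medial_meniscus": "bucket_handle",
    "lateral_meniscus": "healthy",
    "ACL": "torn",
}

observer_readings = {         # one observer's interpretation of the photographs
    "medial_meniscus": "radial",        # lesion detected but misclassified
    "lateral_meniscus": "healthy",      # correct
    "ACL": "torn",                      # correct
}

def detection_match(reading: str, reference: str) -> bool:
    """Binary endpoint: both sides agree on healthy vs. damaged."""
    return (reading == "healthy") == (reference == "healthy")

def classification_match(reading: str, reference: str) -> bool:
    """Secondary endpoint: the exact lesion type (or 'healthy') matches."""
    return reading == reference

structures = list(OPERATIVE_REPORT)
detection_rate = 100 * sum(
    detection_match(observer_readings[s], OPERATIVE_REPORT[s]) for s in structures
) / len(structures)
classification_rate = 100 * sum(
    classification_match(observer_readings[s], OPERATIVE_REPORT[s]) for s in structures
) / len(structures)

print(f"correct detection: {detection_rate:.0f}%, "
      f"correct classification: {classification_rate:.0f}%")
```

Averaged over the three observers and all knees, such percentages correspond to the concordance values reported in the Results.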

The power calculation was carried out by the statistics department of Grenoble University Hospital. For an alpha risk of 5% and a power of 90%, 60 patients needed to be included, assuming a proportion of one lesion for every three arthroscopies. The expected effect size for the primary endpoint was an agreement of 0.5 between the three observers. Primary and secondary endpoints were evaluated using Fleiss' kappa index with its 95% confidence interval. According to the Landis and Koch criteria [18], a score below 0 was considered to indicate poor agreement, 0.01 to 0.20 slight, 0.21 to 0.40 fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, and 0.81 to 1 almost perfect. The concordance between the observers' answers and the peroperative diagnosis was described with descriptive statistics as the average percentage of correct photograph interpretations across the three observers.
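As an illustration of this analysis (assumed, not the authors' code), the sketch below computes Fleiss' kappa with the statsmodels implementation on simulated binary ratings of 60 knees by 3 observers and maps the result onto the Landis and Koch scale; the 95% confidence intervals reported in the Results would require an additional procedure (e.g., bootstrapping) not shown here.

```python
# Minimal sketch of the agreement analysis described above.
# `ratings` is a placeholder 60 x 3 matrix (knees x observers),
# coded 0 = structure considered healthy, 1 = structure considered damaged.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)
ratings = rng.integers(0, 2, size=(60, 3))   # simulated data, illustration only

table, _ = aggregate_raters(ratings)          # 60 x 2 table of category counts
kappa = fleiss_kappa(table, method="fleiss")

def landis_koch(k: float) -> str:
    """Qualitative interpretation of kappa after Landis and Koch [18]."""
    bounds = [(0.0, "poor"), (0.20, "slight"), (0.40, "fair"),
              (0.60, "moderate"), (0.80, "substantial"), (1.0, "almost perfect")]
    for upper, label in bounds:
        if k <= upper:
            return label
    return "almost perfect"

print(f"Fleiss' kappa = {kappa:.2f} ({landis_koch(kappa)} agreement)")
```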

Results

Inter-observer reliability for lesion detection, assessed using Fleiss' kappa (primary endpoint), was moderate to substantial for most structures. Agreement was almost perfect for the ACL (κ = 0.84, [0.71;0.96]). Concerning the menisci, agreement was substantial for the medial meniscus (κ = 0.61, [0.44;0.77]) and moderate for the lateral meniscus (κ = 0.49, [0.24;0.74]). Finally, the evaluation of the cartilage surfaces showed poor agreement for the lateral condyle (κ = -0.01, [-0.14;0.13]), slight agreement for the lateral tibial plateau (κ = 0.15, [-0.05;0.35]), and moderate agreement for the medial condyle (κ = 0.51, [0.34;0.68]), the medial tibial plateau (κ = 0.43, [0.23;0.63]) and the patellar cartilage (κ = 0.58, [0.26;0.89]). Agreement was substantial for the trochlear cartilage (κ = 0.61, [0.27;0.96]).

For classifying cartilaginous and meniscal lesions, inter-observer reliability was generally low. The agreement was slight for almost all structures studied: medial condyle (κ = 0.12, [0.02;0.22]), medial tibial plateau (κ = 0.10, [0.04;0.18]), medial meniscus (κ = 0.04, [0;0.25]), lateral condyle (κ = 0.011, [0;0.28]), lateral tibial plateau (κ = 0.03, [0;0.20]), patellar cartilage (κ = 0.14, [0;0.31]), and trochlear cartilage (κ = 0.035, [-0.02;0.33]). For the lateral meniscus, by contrast, the agreement was substantial (κ = 0.65, [0.45;0.84]).

Inter-observer reliability therefore dropped from the primary endpoint of lesion detection when it came to classifying the lesions.

For lesion detection, the concordance between the photograph evaluation and the peroperative diagnosis ranged from 60% to 92%, depending on the structure. The average concordance across all structures was 73.8%. Lesions were most reliably detected when involving the ACL (92%), the medial meniscus (81%) and the patellar cartilage (81%). The rate of correct lesion detection was 78% for the trochlear groove, 77% for the lateral meniscus, 66% for the lateral tibial plateau, 65% for the lateral condyle, and 64% and 60% for the medial condyle and the medial tibial plateau respectively.

These rates dropped by 2 to 23% depending on the structure (8.6% on average) when it came to correctly classifying the previously detected lesion. Finally, when the observer's answer did not match the peroperative diagnosis, the error led to an underdiagnosed lesion in 71% of cases.

Discussion

The a posteriori interpretation of photographs included in the operative report of a knee arthroscopy can be used as evidence in legal investigations. However, the relevance of such interpretations needs to be evaluated. The main finding of our study is that arthroscopic diagnoses made from photographs alone are not reproducible, confirming our hypothesis.

This study has some limitations which necessitate caution when interpreting the results.

The postero-medial compartment of the knee was not systematically explored during the procedure; thus, the posterior ramp of the medial meniscus was not evaluated in all patients, and the photographs of the postero-medial compartment available for some patients were removed from those presented to the observers. Photographs were examined in digital format, which may have improved inter-observer reliability, since image quality was better than on printed photographs and images could be enlarged at the observer's convenience. This made it possible to overcome printing defects, but this setting does not correspond to the medicolegal framework, which relies on expert reports including printed photographs. Finally, the sample size was probably too small to analyze inter-observer reliability for lesion classification, because of the high rate of uninjured structures among the knees evaluated.

The inter-observer reliability for lesion detection was moderate overall. Fleiss' kappa ranged from 0.43 to 0.61 for most structures; agreement was almost perfect for the ACL (0.84) but only slight for the lateral tibial plateau (0.15) and poor for the lateral condyle (-0.01). For grading cartilage and meniscal lesions, inter-observer reliability was slight for all structures except the lateral meniscus.

The originality of our study lies in the fact that only photographs were used. The inter-observer reliability we found is generally lower than that reported in the literature, where the medium used was either video or successive arthroscopies [1, 11,12,13,14,15,16,17] (Table 2).

Table 2 Studies evaluating inter-observer reproducibility of knee arthroscopy diagnosis from year 2000

Video evaluation seems to allow a better intra-articular assessment of the knee than photographs. Anderson et al. [11] studied the reproducibility of meniscal tear assessments using the International Society of Arthroscopy, Knee Surgery and Orthopaedic Sports Medicine (ISAKOS) system. Videos of 37 arthroscopies were reviewed by eight orthopaedic surgeons. The kappa coefficient ranged from 0.25 to 0.72 for the various qualitative meniscal tear criteria (pattern, depth, location, tissue quality). Similar conclusions were reported by Marx et al. [13] for classifying cartilage lesions from videos, with kappa coefficients ranging from 0.345 to 0.87 depending on location. The choice of classification system does not appear to affect inter-observer reliability according to Brismar et al. [1], who studied the reliability of diagnoses made by four observers on 19 arthroscopy videos while comparing three classification systems for cartilage lesions. They found acceptable agreement, with kappa coefficients between 0.45 and 0.49 depending on the classification used. Videos allow a dynamic evaluation, whereas arthroscopic photographs capture only a single moment in time. This may explain the better agreement coefficients reported in these video-based studies compared with our results. It might also explain the high proportion of errors that underestimated lesions in our study (71%), especially for ICRS grade I cartilage lesions, for which dynamic probing is required to reveal abnormally soft cartilage. Finally, the portal used and the angle between the scope and the structure when capturing an image may cause distortion [19, 20] that compromises its interpretation, compared with the multiple angles and depth cues offered by a video recording.

Regarding possible extrapolation of our results, it should be emphasized that low reliability of arthroscopic diagnoses was observed despite all observers being experienced in knee arthroscopy, working at the same center, and routinely using the same equipment and same classification systems. We can expect the variability to be even greater if the experts come from different centers, have different surgical norms, or are not specialists in knee arthroscopy.

In the event of a legal dispute, the interpretation of knee arthroscopy photographs by an expert does not constitute irrefutable proof of medicolegal liability. However, photographs can constitute an additional element with which to criticize the diagnosis a posteriori, despite the improvement of preoperative imaging techniques [21]. Questioning the surgeon over a potentially unjustified meniscectomy or ligamentoplasty can nevertheless give rise to legal proceedings. According to Randsborg et al. [22], an incorrect indication accounted for 2% of legal proceedings resulting in compensation after ACL reconstruction in the Norwegian register. Lawsuits for patient death following arthroscopy have been reported and almost exclusively involve knee arthroscopy, owing to the high number of procedures performed and the risk of pulmonary embolism [23]. This highlights the need for strong evidence when challenging the initial indication during the medicolegal process.

According to our results, the possible use of photographs as evidence in medicolegal proceedings should only be considered under strict conditions. First, photographs have to be analyzed in high-definition digital format by a minimum of two experts. Second, the interpretation of the photographs must be combined with clinical and radiological data as part of the body of evidence considered in the legal decision. Finally, photographs should be taken in a standardized manner to exhaustively document the normal or pathological appearance of all intra-articular structures. A protocol defining the number of photographs, the structures shown on each, the angulation of the scope, and the surgical portal used remains to be established.

The use of miniature photographs printed on the operative report for medicolegal purposes should be avoided. In our practice, arthroscopy photographs are no longer included in the operative report but are instead uploaded to the patient's electronic medical file, thus remaining available later for any use. As Brown [24] already suggested in 1989, video recording appears to be the best option for accurately and reproducibly diagnosing intra-articular knee lesions. He also raised the difficulty of using videotapes compared with "still images" for purposes such as publishing or teaching. What he pointed out is still relevant 35 years later, since video use remains limited in some institutions, such as ours, by the problem of storing large files in the patient's record.

Despite the widespread use of photographs in medical practice, their regulation by law in terms of capture method, data transfer, storage, copyright and medical privacy remains limited [25]. Arthroscopic photographs are non-identifiable images, for which informed consent seems sufficient for obtaining them and storing them in the patient's medical file.

Arthroscopic diagnoses of the knee based on photographs alone are not reproducible, particularly for classifying lesions. In the event of a legal investigation following knee arthroscopy, the photographs included in the operative report should not, by themselves, be used to hold the surgeon liable.