1 Historic Methods

The evolution of knee surgery has been predicated on the development, refinement, and evaluation of new surgical techniques. Historically, empiric assessment was used to document the relative efficacy of treatment. This unscientific approach often resulted in erroneous conclusions by researchers.

The problem lies not in veracity but rather in human nature, subjective interpretation of variables, and the difficulty of evaluating results. Even the most conscientious researcher, especially the surgeon, is subject to bias. The knowledge and perceptual ability of the examiner are important variables. Experienced examiners frequently produce appreciable differences in translation and rotation when evaluating the limits of knee motion. Even when the examiners produce the same displacement, the correct interpretation depends on accurate perception of the motion.

The complexity of the knee and the number of criteria used to assess results make accurate evaluation even more difficult. Anderson et al. [1] found that the problem was exacerbated by the number of operative procedures and diverse methods of evaluation described in the 1980s. They [1] reported that, during that decade, 52 articles were published in the American Journal of Sports Medicine and the Journal of Bone and Joint Surgery American on conservative or operative treatment of the ACL-deficient knee. Twenty-eight different operative procedures were described, including primary repair, five extra-articular procedures, 13 intra-articular procedures, and 9 combined intra- and extra-articular reconstructions. The results of these procedures were rated as good or excellent in the majority of cases, although they were evaluated with 38 different rating scales.

The consensus among the researchers who have compared rating scales is that the differences are sufficiently great to preclude predicting results from one scale based on another and that inconsistency among these scales creates an impediment to progress in the field.

2 Development of the IKDC Standard Knee Evaluation Form

The consensus was that a uniform scale was vital to the evaluation of treatment. Under the leadership of John Feagin from the United States and Werner Mueller from Switzerland, and the auspices of the American Orthopaedic Society for Sports Medicine and the European Society of Knee Surgery and Arthroscopy, the International Knee Documentation Committee (IKDC) was formed in 1987 to develop a standardized, international documentation system.

The initial objectives of the committee were to develop a one-page form that included only the essential reproducible criteria necessary to evaluate results and that was simple enough to be used by any clinician, with or without research assistance. Initially, the form was developed only for acute ACL injuries, but it was anticipated that this would serve as a foundation for a more comprehensive evaluation system, allowing for a valid scientific analysis of knee function. The first step was to agree on standard terminology to document knee motion and function. Next, the clinical examination of the limits of knee motion was critiqued, and a core of measurements was adopted. Finally, methods for documentation of activity, evaluation of limb function, and assessment of symptoms were evaluated, and a format was designed to record these observations.

3 Development of Standard Terminology

The discrepancy in the implied meaning of terms used in the literature has been an impediment to international communication. To improve communication, the IKDC met in New York in August 1987 to discuss standard terminology [6]. Noyes, Grood, and Torzilli [24] submitted definitions of terms for motion, position of the knee, and injuries of the ligaments. The committee critiqued, revised, and adopted a standard set of definitions. The following definitions are among those adopted [24]:

  • Motion: the act or process of changing position. Motion is described as the rate and direction of change.

  • Displacement: the net effect of motion; a change in position between two points without regard to the path followed. Displacement may be described by a change in translation or in rotation, each of which has three degrees of freedom.

  • Translation: motion of a rigid body in which all lines remain parallel to their original orientation. By convention, knee translation is described as motion of the tibia relative to the femur. Translation of the tibia may be mediolateral, anteroposterior, or proximodistal. Translation is measured in millimeters. The reference point normally used to measure translation is midway between the medial and lateral margins of the joint.

  • Rotation: a type of motion or displacement in which all points move about an axis. Rotations of the knee may be flexion-extension, internal-external, and abduction-adduction.

  • Range of motion: the displacement occurring between two limits of movement for each degree of freedom. Range of motion does not indicate the extremes of motion. For motion other than flexion-extension, range of motion depends on the angle of knee flexion.

  • Limits of knee motion: the extreme positions of movement possible for each of the 6 degrees of freedom. The term limits of knee motion is more specific than range of motion. It indicates where motion begins and ends and includes range of motion. There are 12 limits of motion, two for each of the 6 degrees of freedom. Ligament injury increases the limits of knee motion. The European system describes the limits of flexion and extension with three numbers: the maximum extension, neutral position, and flexion.

  • Coupled motions: a displacement or motion in 1 or more degrees of freedom caused by a load applied in another degree of freedom. Coupled motions occur during the clinical examination. An anterior displacement force applied during the Lachman test causes anterior translation and internal rotation of the tibia. A posterior displacement force results in posterior translation and external rotation. The amount of motion depends on the force applied and the constraints of the coupled motion. For example, constraint of rotation during the Lachman test significantly diminishes anterior translation.

  • Laxity: a lack of tension; looseness, referring to a normal or abnormal range of motion. In the first context, laxity is used to describe a lack of tension in a ligament and, in the second, as a looseness of a joint. This ambiguous term should be used to indicate lack of tension in the ligament. The degree of laxity should be specified as either normal or abnormal. Laxity should not be used in the context of looseness of a joint; the motion should be specified. The term anterior translation is preferable to anterior joint laxity.

  • Instability: another ambiguous term that has been used in two ways. First, it is used to describe the symptoms of giving way and, second, as the sign of increased joint motion. Rather than use instability to refer to symptoms, it is preferable to describe the event (i.e., giving way with activity). It is incorrect to attribute instability to a specific anatomic structure (e.g., "ACL instability"); rather, instability should only be used in the general sense to indicate excessive motion of the tibia as the result of traumatic injury.
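The relationship between the 6 degrees of freedom and the 12 limits of motion defined above can be made concrete with a short sketch. It is illustrative only: the dictionary and function names are hypothetical, the direction labels are taken from the definitions of translation and rotation, and the units follow the convention of millimeters for translation and degrees for rotation.

```python
# Degrees of freedom named in the definitions above. Each entry lists the two
# opposing directions and the unit of measurement (mm for translation,
# deg for rotation). The names used here are illustrative, not a standard API.
DEGREES_OF_FREEDOM = {
    "mediolateral translation": ("medial", "lateral", "mm"),
    "anteroposterior translation": ("anterior", "posterior", "mm"),
    "proximodistal translation": ("proximal", "distal", "mm"),
    "flexion-extension rotation": ("flexion", "extension", "deg"),
    "internal-external rotation": ("internal", "external", "deg"),
    "abduction-adduction rotation": ("abduction", "adduction", "deg"),
}


def limits_of_motion():
    """Return one (degree of freedom, direction, unit) tuple per limit."""
    limits = []
    for dof, (first, second, unit) in DEGREES_OF_FREEDOM.items():
        limits.append((dof, first, unit))
        limits.append((dof, second, unit))
    return limits
```

Calling `limits_of_motion()` yields two entries for each degree of freedom, which is where the count of 12 limits comes from.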

Fact Box 1

The discrepancy in the implied meaning of terms used in the literature has been an impediment to international communication.

4 Limits of Knee Motion Evaluation

The methods of examination that have been used to determine the limits of knee motion are qualitative and clinician specific. Even experienced examiners may produce and perceive appreciable differences in displacement. Accurate assessment of translation and rotation is more demanding in ligament injuries, which increase more than one limit of motion. In these circumstances, clinicians have difficulty identifying either the starting or ending positions for the tibia.

The objectives of the second IKDC meeting in Zurich, Switzerland, in 1988 were, first, to agree on the clinical tests essential to evaluation of knee motion limits and, second, to identify the conditions that maximize the accuracy and reproducibility of measurements.

The consensus was that reproducibility depends on specifying the conditions of the tests. Clinical and laboratory studies confirm that the position of the knee at the initiation of testing affects displacement. The site of measurement must be identified, and the magnitude, direction, and point of application of force should be specified. Measurements in translation should be reported in millimeters and rotation in degrees. Changes in any of these conditions will result in different interpretations of the tests.

Subsequently, the IKDC convened in Jackson Hole, Wyoming, in July 1988 for its third meeting. The objective of this meeting was to determine the accuracy of the clinical tests and conditions for testing adopted at the Zurich meeting. Three studies were performed to assess the reproducibility of clinical measurements, differences in test techniques, and clinical accuracy in estimating knee displacement [24–26].

Ten patients were examined by eleven IKDC members to determine the reproducibility of clinical measures [24]. Nine of the ten patients had sustained a ligament injury. The examination technique and recording system were standardized and reviewed by the examiners before testing. The patients also underwent an instrumented knee examination with the KT-1000, KSS, and Genucom.

The examiners estimated anteroposterior translation in millimeters and rotation in degrees, at both 25 and 90° of flexion. A thigh support was used to facilitate relaxation and standardize testing at 25° of flexion. When testing at 90° of flexion, the sole of the foot supported the limb. The sagittal knee profile or quadriceps active drawer test was used to evaluate the normal anatomic position.

Varus-valgus stress tests were measured at 0° and 25° of flexion. The pivot shift and reverse pivot shift tests were performed with the tibia in internal, neutral, and external rotation. These tests were graded in the following manner: 0 = none, 1 = glide, 2 = moderate, and 3 = severe.

The results of this study demonstrated that, even with the benefit of standardized test techniques, a significant discrepancy existed in the examiners’ estimation of displacement. The greatest differences occurred in the evaluation of anteroposterior translation. One clinician recorded a side-to-side difference of greater than 3 mm in all eight patients, and another examiner only reported one patient with a side-to-side difference of greater than 3 mm. Analysis of the data revealed that the correlation between the examiners was better for total anteroposterior translation than for either anterior or posterior displacement.

The second study was performed to identify the differences in examination techniques contributing to the discrepancy in estimation of displacement. Another objective of this study was to determine the accuracy of the clinicians’ estimate of tibiofemoral displacement [25]. In this study, 11 members of the IKDC examined two cadaver knees that were instrumented with a device to measure three-dimensional motion. The examiners’ estimation of joint displacement was compared with the actual measurements recorded by the instrumented spatial linkage system. The ACL and MCL were cut in one knee. The examination included estimation of anteroposterior displacement, mediolateral joint opening, and internal/external rotation.

The examiners were accurate in diagnosing injuries of these ligaments. Nine of the eleven examiners correctly diagnosed a complete tear of the ACL and MCL, and the other two diagnosed partial tears of the ACL and MCL.

The examiners were not as accurate with the rotation tests. Seven of the eleven examiners misinterpreted the external tibial rotation associated with MCL injury as injury to the posterolateral ligaments. This error indicated that the examiners were incapable of determining if the medial tibial plateau came forward or the lateral tibial plateau went back. The tests that assess rotation are not accurate, even for experienced examiners.

The actual measurements of anterior tibial translation produced during the Lachman tests ranged from 7 to 16 mm. The discrepancy in displacement was related to differences in the position of the knee at the initiation of testing (range of flexion of 2–25°) and the magnitude of displacement forces. The constraint of coupled motions did not significantly influence the measured displacement. Only three examiners estimated anterior displacement within 2 mm of the measured value, five estimated the displacement between 2 and 4 mm, and the estimates of two examiners were more than 5 mm different from the measured value.

Significant differences in displacement were produced by the examiners for both internal/external rotation and mediolateral joint opening. The knee flexion angle at the initiation of testing varied widely among the examiners. Some examiners started the mediolateral opening test with the femoral condyle in contact with the tibial plateau, and others did not. Even so, the examiners were more accurate in estimating medial joint opening; eight of the examiners estimated displacement within 3 mm of the measured displacement.

In summary, only six of the examiners estimated true anteroposterior displacement within a range of 2 mm, tibial rotation within 5°, and medial joint opening within 3 mm.
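The three tolerances in this summary can be expressed as a simple check of each examiner's estimates against the instrumented measurements. The following is a minimal sketch, not the committee's analysis: the function names, the dictionary keys, and every numeric value in the usage note are hypothetical, "within" is assumed to mean an absolute error no larger than the tolerance, and rotation is assumed to be expressed in degrees.

```python
# Tolerances quoted in the summary. Rotation is assumed to be in degrees.
TOLERANCES = {
    "anteroposterior_mm": 2.0,
    "rotation_deg": 5.0,
    "medial_opening_mm": 3.0,
}


def within_all_tolerances(estimated, measured, tolerances=TOLERANCES):
    """True if every estimate is within its tolerance of the measured value."""
    return all(
        abs(estimated[key] - measured[key]) <= tolerance
        for key, tolerance in tolerances.items()
    )


def count_accurate(examiner_estimates, measured, tolerances=TOLERANCES):
    """Count the examiners whose estimates meet all three tolerances."""
    return sum(
        within_all_tolerances(estimated, measured, tolerances)
        for estimated in examiner_estimates
    )
```

For example, with hypothetical measured values of 12 mm, 20°, and 8 mm, an examiner who estimates 11 mm, 23°, and 9 mm passes all three checks, whereas one who estimates 16 mm of anteroposterior displacement does not.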

Fact Box 2

The reproducibility of the clinical examination depends on specifying the conditions of the tests, including magnitude and direction of force, site of measurement, and point of application of force. However, even in the best of circumstances, large variations may exist in clinicians’ estimates of displacement. Consequently, objective estimation of pathologic knee laxity by clinicians is qualitative, at best, and therefore cannot be validated.

These studies demonstrate that limb position, site of measurement, and application of force should be standardized. Even under the best circumstances, large variations may exist in clinicians’ estimates of displacement. Consequently, instrumented or stress radiography measurements should be used to report clinical results. The rotation tests are even more difficult to assess than either anteroposterior or mediolateral displacement. Evaluation of rotary subluxation is subject to error, and the rotational tests cannot be validated.

5 Analysis of the Pivot Shift Test

In the third study conducted at the Jackson Hole meeting, each member of the IKDC performed their versions of the pivot shift test on the instrumented cadaveric limbs [25]. Like the anteroposterior displacement tests, the beginning test position varied between examiners, although it was typically close to extension. The difference in maximum anterior translation of the medial tibial plateau recorded during the pivot shift ranged from 6 to 17 mm, and the maximum subluxation of the lateral plateau ranged from 14 to 20 mm among the examiners.

Analysis of the data confirmed that the examiners constrain knee motion when performing the pivot shift test. The coupled knee motions of anterior translation and internal tibial rotation were induced to produce anterior subluxation. The examiners who internally rotated the tibia most in performing the test also limited anterior translation of the medial tibial plateau. One examiner performed the test in internal, neutral, and external rotation. The greatest translation of both the medial and lateral tibial plateaus occurred in neutral and external rotation of the tibia. The committee recommended avoiding internal tibial rotation when performing the pivot shift test.

The variability of measurement indicated the pivot shift could only be considered a qualitative test. At that time, in vivo measurement devices were not available to quantitate displacement in millimeters; consequently, the committee recommended grading the pivot shift: negative; 1+, glide; 2+, clunk; 3+, gross.

After analyzing the data of these three studies, the committee recommended, but did not validate, instrumented or radiographic measurement of the Lachman test at 25° of flexion, total anteroposterior translation at 70° of flexion, and medial and lateral joint opening at 20° of flexion, together with the qualitative pivot shift and reverse pivot shift tests.

6 Documentation of Activity

By consensus, the committee agreed that limitation of knee function may be masked by involuntary low-activity levels. The criterion “return to sports” was considered imprecise because different activities place different demands on the knee. The IKDC field tested a comprehensive form evaluating the level of difficulty, intensity, and exposure. Intensity describes the level of activity as occupational, light recreational sports, vigorous recreational sports, or competitive sports. Exposure, the best estimate of the number of hours per year at a given functional level and intensity, was recorded only for participation of more than 50 h/year.

Changes in activity may occur for knee-related or non-knee-related reasons. A decline in athletic activity and participation is inherent with aging, and a question was included to specify the reasons for any changes in activity.

After field testing the comprehensive form, the committee selected the minimum criteria necessary to evaluate activity. The activity levels are as follows: I, strenuous; II, moderate; III, light; and IV, sedentary. These are based on the demands that certain activities place on the knee. Assessment of activity is equally important for patients who do not participate in sports. Heavy manual work was assigned a level II rating, light work a level III rating, and activities of daily living a level IV rating.

The level of activity at which the patient is able to perform, without significant symptoms, is recorded before injury, before treatment, and after treatment. Credit is not given for participation in activities that cause significant symptoms (i.e., “knee abusers”). Two questions were included in the IKDC form to determine how the knee affected activity. One of these questions – “How does your knee affect your activity level?” – was graded 0–3.
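The rule that credit is given only for activity performed without significant symptoms can be sketched as a small lookup. This is an illustration under stated assumptions rather than the scoring logic of the IKDC form: the class and function names are hypothetical, the numeric values simply mirror levels I through IV, and the input is assumed to be a mapping from each attempted level to whether it caused significant symptoms.

```python
from enum import IntEnum


class ActivityLevel(IntEnum):
    """Activity levels I-IV; a lower number means a more demanding level."""
    STRENUOUS = 1   # level I
    MODERATE = 2    # level II
    LIGHT = 3       # level III
    SEDENTARY = 4   # level IV


# Work-related equivalents stated in the text.
WORK_LEVEL = {
    "heavy manual work": ActivityLevel.MODERATE,
    "light work": ActivityLevel.LIGHT,
    "activities of daily living": ActivityLevel.SEDENTARY,
}


def creditable_level(significant_symptoms):
    """Return the most demanding level performed without significant symptoms.

    `significant_symptoms` maps each attempted ActivityLevel to True when the
    activity causes significant symptoms. Returns None if no level qualifies.
    """
    symptom_free = [
        level for level, symptomatic in significant_symptoms.items()
        if not symptomatic
    ]
    return min(symptom_free) if symptom_free else None
```

A patient who participates in strenuous activity with significant symptoms but tolerates moderate activity without them would be credited with level II. The same function could be applied to the pre-injury, pre-treatment, and post-treatment records.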

7 Symptoms and Impairment

The committee recognized that the magnitude of symptoms and impairments is difficult to quantitate, and the collection of data is prone to bias. Even so, this important category has been included in every rating scale.

The symptoms and impairments were evaluated in the field test. The symptoms of pain, swelling, and giving way were universal to earlier knee rating systems. Giving way indicates an event precipitated by a pathologic tibiofemoral shift. It should not be mistaken for the buckling caused by weakness or other conditions. Partial giving way is not associated with falling or swelling, although these events are included in full giving way.

Patients with pathologic conditions frequently decrease activity to avoid symptoms. To detect these patients and prevent an exaggerated symptom score, the committee adopted the philosophy of relating symptoms to activity. Other patients who are capable of performing strenuous activities without symptoms may avoid them by choice. To prevent a reduction of a symptom score in these cases, patients are asked to grade the highest activity at which they can participate without symptoms, even if they are not participating at that level.

In general, the impairments had not been included in the published rating scales, and the IKDC did not consider them among the minimal essential criteria. The subjective assessment questions and evaluation of symptoms in the IKDC form provide an overall assessment of impairment.

8 Compartment and Roentgenographic Findings

Restoration of stability and prevention of degenerative changes are long-term goals of knee reconstruction, but evaluation of success in attaining these goals is difficult. Early degenerative changes cannot be accurately evaluated without visual inspection, and roentgenographic changes occur late in the course of osteoarthritis. Assessment of crepitation was included in the IKDC form to detect early compartment changes. Unfortunately, only limited conclusions may be drawn from the evaluation of crepitation. The collection of data is subject to examiner bias, and crepitation may not indicate articular cartilage abnormality. Crepitation associated with pain is a significant finding that is graded more stringently.

Roentgenographic changes are also qualitatively graded. A mild grade indicates flattening of the femoral condyle, subchondral sclerosis, or small osteophytes. The moderate and severe grades indicate progressive joint narrowing in addition to these changes.

Evaluations of compartment and roentgenographic findings are not included in the final evaluation of the IKDC form. These data are qualitative and influenced by investigator bias.

9 Functional Tests

The IKDC critiqued the methods that have been used to evaluate limb function. Gait analysis, instrumented strength testing (i.e., Cybex evaluation), agility tests, and hop tests provide quantitative data that compare the involved knee to the normal knee. Instrumented examination was excluded because it requires expensive equipment that is not universally available.

The single-leg hop is more accurate and easier to perform than the agility tests. Although a normal score does not preclude giving way with activity, an abnormal score is correlated with significant functional limitations. The single-leg hop test is a useful screening test that provides quantitative data [27]. Like the compartment and roentgenographic findings, the results are recorded but not graded.

10 Rating Results

Rating results are fundamental to the evaluation and comparison of different methods of treatment. The methods of grading that have been used reflect differences in philosophy, which are as diverse as the rating scales themselves. Most scales have used a numeric system to assign points to each variable. In some scales, points are added to produce a single total score, whereas others categorize the results as excellent, good, fair, or poor. Tegner and Lysholm [34] and Feagin and Blake [7] recommended separate scores for symptoms, subjective function, and clinical findings.

Numeric grading systems are popular because they are easy to understand, although some investigators condemn assigning points to variables, stating that this practice requires an arbitrary judgment of the relative importance of a variable to the knee as a whole. The numbers reflect the values of the author and not necessarily the clinical outcome. Apley once declared that “we should resist the seductive simplicity of numerical scores and we should abandon the practice of adding unrelated scores” [3].

The IKDC adopted the system used by Noyes et al. [23] and the Swiss knee group [21], in which the lowest grade within a group determines the group grade and the worst group grade determines the final evaluation.
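The worst-grade rule described above can be sketched in a few lines. This is a minimal illustration, not the form itself: the grade letters and their ordering from best ("A") to worst ("D"), the function names, and the group and item names in the usage note are all assumptions made for the example.

```python
# Assumed grade scale, ordered from best ("A") to worst ("D").
GRADE_ORDER = "ABCD"


def worst(grades):
    """Return the worst grade in an iterable of grade letters."""
    grades = list(grades)
    if not grades:
        raise ValueError("at least one grade is required")
    return max(grades, key=GRADE_ORDER.index)


def group_grade(item_grades):
    """The lowest (worst) item grade within a group determines the group grade."""
    return worst(item_grades.values())


def final_evaluation(groups):
    """The worst group grade determines the final evaluation."""
    return worst(group_grade(items) for items in groups.values())
```

For example, if a hypothetical "symptoms" group contains grades A and B, a "range of motion" group contains A, and a "ligament examination" group contains C and B, the group grades are B, A, and C, and the final evaluation is C. Unlike a summed numeric score, a single poor finding cannot be offset by good results elsewhere.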

Fact Box 3

The original IKDC Knee Ligament Standard Evaluation Form made an important contribution by serving as a rudimentary form that functioned as a foundation for a more advanced evaluation system.

The IKDC Knee Ligament Standard Evaluation Form was published in 1993 [12] but never validated. It made an important contribution by serving as a rudimentary form that functioned as a foundation for more advanced evaluation systems. The future goals of the IKDC were to refine the standard form, identify additional important and reproducible criteria, and develop a comprehensive method of evaluation.

11 Evidence-Based Medicine

A new paradigm of assessment, evidence-based medicine, called into question our fundamental basis of learning. An important tenet of evidence-based medicine is the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients. The best research evidence places emphasis on patient-centered research related to the accuracy of diagnosis, power of prognostic identification, and efficacy and safety of surgical interventions.

Historically, researchers did not have outcome instruments to accurately measure the complaints that affect quality of life (with an ACL tear, those complaints may include subjective pain, instability, and functional limitations). Therefore, researchers were forced to use surrogate measures (e.g., objective measures such as range of motion, strength, and laxity) for what the surgeon and patient really cared about. Although these impairment measures appear to have accuracy because they can be reduced to a number, they often suffer from poor intra-rater and inter-rater reliability because these measures contain elements of subjective measurement by the examiner, as documented by the IKDC studies. In addition, they correlate poorly with the domains of health that are important to the patient. Consequently, the relationship between impairment of body structures and function, on the one hand, and activity limitations and participation restrictions, on the other, is not direct. For example, some authors have demonstrated there is no relationship between anterior displacement measured with the KT-1000 and patient-reported activity and participation [17, 32].

Fact Box 4

Although these impairment measures appear to have accuracy because they can be reduced to a number, they often suffer from poor intra-rater and inter-rater reliability because these measures contain elements of subjective measurements by the examiner as documented by the IKDC studies.

In contrast to objective measures, many subjective clinical measures might not appear to be reliable or valid but, when rigorously tested using well-established scientific methods, actually can be shown to be very reliable and valid. Activity and participation are of utmost concern to the patients. Therefore, health-related quality of life should be the primary outcome measure: “how is the patient doing?” The secondary outcome should be “how is the knee doing?”

In March 1997, at John Feagin’s request, the AOSSM Board of Directors moved to support the revision of the knee ligament evaluation form created by the IKDC. The board’s interest in revision stemmed from the success of the initial form, as demonstrated by its widespread use, and the opportunity to integrate advances in the measurement of medical outcomes into the knee ligament form, making it more broadly applicable and credible.

Three members of the committee and Chad Munger from Data Harbor met in Sun Valley in June 1997 and developed the following objectives:

  • Update the current objective portion of the IKDC form, enhance assessment of injuries, and develop new modules for the objective evaluation of the PCL and patellofemoral components of the knee.

  • Develop a new subjective evaluation form to assess patient-reported outcomes for measurement of function and symptoms.

  • Evaluate the psychometric properties of each module of the knee ligament evaluation form.

  • Publish and disseminate results of testing.

Thereafter, between July and October 1997, the committee developed a work plan, budget, and list of additional individuals needed to ensure complete international representation and clinical expertise. The committee’s preliminary work plan estimated development and testing for approximately 2.5 years, including psychometric evaluation and publication. In the fall of 1997, work on the revision process began. Members of the AOSSM included Allen Anderson (Chairman), John Bergfeld, Art Boland, Mininder Kocher, John Feagin, Christopher Harner, Nick Mohtadi, John Richmond, Don Shelbourne, and Glenn Terry. ESSKA members included Hans Uli Staeubli, Roland Jakob, Philippe Neyret, Jorgen Hoeher, and Werner Mueller. APOSSM members included K. M. Chan and Masahiro Kurosaka. Also participating were James Irrgang, M.S., P.T., psychometrician/consultant; Chad Munger, Data Harbor consultant; and John Fulkerson, ex officio. Committee members were assigned to one of three work groups related to the ACL, PCL, or patellofemoral joint. Each member was charged with reviewing background material for the purposes of identifying new items or revisions to existing items that could be included in the objective portion of the form.

The other major objective included the development of a valid, reliable, and responsive IKDC Subjective Knee Form that would serve as an appropriate means by which to evaluate a variety of knee impairments, including ligament and meniscal injury, articular cartilage lesions, and patellofemoral pain. In this regard, the development of a single instrument that is valid for a variety of conditions affecting the knee could simplify data collection and also provide an opportunity to compare the impact of different knee conditions on the individual’s level of symptoms, function, and sports activity. This objective influenced all phases of the IKDC development.

Finally, the committee felt that it was critical to develop a worldwide consensus of opinion to create a standard outcome form that would provide a uniform method of evaluation and facilitate the sharing of results and solving clinical problems.

The committee devised a demographic module primarily from the current health assessment module of MODEMS. This module includes age, sex, race, and education items, as well as a fully tested comorbidity index. The general health questionnaire, SF-36, was included because patients with knee conditions may have other health-related problems which would be reflected in lower scores on outcome assessment.

Between October 1997 and March 1998, three revisions of each form were completed, involving the addition, deletion, and modification of hundreds of items. By March 1998, the committee agreed to a testable version of the form, consisting of 42 questions.

At that time, James Irrgang, Ph.D., P.T., A.T.C., a psychometrician who worked closely with the orthopedic community, was recruited by the committee to assist in the design and implementation of a study to evaluate the validity, reliability, and sensitivity of the revised form.

Field testing of the demographic, subjective, and objective assessment modules began in April 1998. Over an 8–10-week period, 144 patients completed the demographic and subjective modules. During this same period of time, the objective module was completed by 31 patients. The results were summarized and presented to the committee in Vancouver, British Columbia. Key findings were as follows:

  • There were very few missing data for all of the items on the demographic module.

  • There were substantial missing data for many of the items on the subjective module. This was particularly problematic for the items that were related to symptoms (i.e., pain, swelling, giving way, and locking). Additionally, the proportion of missing data was greater for items located at the end of the instrument, indicating the need to shorten the instrument to lessen the burden on patients.

  • There were substantial missing data for many of the items included on the objective module. Items that were related to prior surgery, procedure, and diagnosis codes, status of the menisci, range of motion, and KT-1000 and hop tests had the greatest proportion of missing data.

Fact Box 5

The IKDC Subjective Knee Form was pilot tested on 144 patients. The results were used to modify the objective and subjective modules. Field testing was performed by having 222 patients complete the subjective module and 211 the objective module. The results of analysis were used to modify the modules.

With the input of the committee, the results were used to modify the subjective and objective modules. Further testing of the revised subjective and objective modules was undertaken in August 1998. Two hundred twenty-two (222) patients completed the subjective module, and the objective module was completed for 211 patients. The results were summarized and reported to the committee in Boston, MA, in November 1998. A summary of the results presented to the committee follows:

  • Subjective Module

    • Problems with missing data for items on the subjective module were resolved. Most items had less than 10 % missing responses, and items with the highest proportion of missing responses continued to be those related to symptoms.

    • An exploratory factor analysis indicated there was a single dominant trait underlying responses to the subjective module. Most of the items had a high loading on this dominant trait (i.e., there were high correlations between the item and the dominant trait). Broadly, this dominant trait reflected a combination of symptoms, function, and sports activity, which implies that it is reasonable to combine the item scores into a single total score to reflect an individual’s level of function. Items with a low loading on the dominant factor were considered by the committee for elimination.

    • A Rasch analysis was also performed to evaluate the subjective module. Overall, the results indicated that the Rasch model adequately fit the data. Collectively, the items measured a broad range of function. Several misfitting items (i.e., those items that did not conform to the Rasch model) were identified and considered for elimination by the committee.

    • A stepwise regression analysis was performed using the individual items to predict the total score (i.e., the sum of the item scores). The results indicated that 99.9 % of the variance of the total score could be predicted by 24 of the 42 items included on the scale. The committee used these results during the item reduction process.

  • Objective Module

    • Problems with missing data on this version of the objective module were reduced. Most of the missing data were related to information that was not routinely measured or recorded during the history and physical examination, such as diagnosis and procedure codes, as well as the status of the menisci. Portions of the physical examination that continued to have a high proportion of missing data included crepitus, harvest site pathology, and one-legged hop and KT-1000 tests. A high proportion of the data for documentation of knee extension could not be interpreted as recorded on the form.

    • An exploratory factor analysis was performed to determine the structure of the objective module. The results indicated that there were four or five factors underlying the objective module, and as a result, an orthogonal rotation was performed to clarify the meaning of the factors. Components of the objective module that loaded on the first factor included most of the laxity tests (Lachman, A-P translation, varus and valgus rotation, and pivot shift). The second factor represented crepitus and radiographic narrowing of the joint. The third factor represented loss of motion. The fourth and fifth factors represented the posterior drawer and reverse pivot shift tests, respectively. Given that the correlation between each of these factors was zero, these results call into question the validity of combining the results of the objective module into a single score.

Fact Box 6

Factor analysis demonstrated that it was reasonable to combine all the questions in the IKDC Subjective Knee Form into a single score.

The above results were used to modify the subjective and objective modules. By considering the statistical properties and content of the individual items, the committee reduced the subjective module from 42 items to 19 items. To modify the objective module, findings from the physical examination were separated from the historical data.

At the conclusion of the meeting in Boston, the committee requested additional information concerning the reduced version of the subjective module. This included a comparison of an individual’s rating of function on an 11-point scale (i.e., 0–10) to a rating of function using the 4-point scale included in the original IKDC guidelines (i.e., normal to severely abnormal). Evidence that the items performed the same for those with and without a ligament injury was also requested. The data that were analyzed above were used to address these questions. To better describe the sample, the centers that submitted the original data were asked to provide demographic information including the subjects’ age, sex, and diagnosis. The results were provided to the committee at its meeting in Anaheim, CA.

A summary of the findings is as follows:

  • The rating of function on an 11-point scale was similar to the rating of function on the 4-point scale. The correlation between the two items was .71.

  • An exploratory analysis of the reduced item set demonstrated a single dominant trait underlying the item responses. All of the items, except for the item related to locking, loaded highly on this trait.

  • The Rasch model fit the data well. The items continued to measure a broad range of ability.

  • To compare performance of the items for those with a ligament injury to those without a ligament injury, the diagnosis code was used to split the sample into two subsamples (i.e., those with a ligament injury and those without a ligament injury). A Rasch analysis was performed separately on each subsample. If the items performed the same for each group, one would expect the item statistics (i.e., the item difficulty parameters) to be the same for each sample. The results supported this premise. Thus, it appears that the items performed the same for those with a ligament injury compared to those without a ligament injury. Similar findings were found when the sample was split by age (i.e., the items performed the same for young and old individuals).

  • Three scoring methods were compared. These included summing the item scores, summing the item scores using the results of the factor analysis to weight the items, and using the Rasch model to score the instrument. All three scoring methods yielded similar results. The distributions of the scores for each method were also similar. Additionally, the correlations among the three scoring methods ranged from .993 to .998. Thus, for simplicity's sake, summing the scores was a satisfactory method to score the subjective module.
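The comparison of scoring methods described above can be illustrated with a minimal sketch. It computes an unweighted sum and a loading-weighted sum for each respondent and then the Pearson correlation between the two sets of scores. The item responses and factor loadings below are hypothetical values chosen for illustration only, not the committee's data or the published loadings, and the Rasch-based method is omitted.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical item responses: one row per respondent, one column per item.
responses = [
    [4, 3, 4, 2],
    [1, 0, 2, 1],
    [3, 3, 3, 4],
    [0, 1, 1, 0],
    [2, 2, 3, 2],
]

# Hypothetical factor loadings used as item weights (not the published values).
loadings = [0.82, 0.75, 0.79, 0.68]

# Method 1: simple unweighted sum of the item scores.
unweighted = [sum(row) for row in responses]

# Method 2: sum of the item scores weighted by the factor loadings.
weighted = [sum(w * v for w, v in zip(loadings, row)) for row in responses]

# Agreement between the two scoring methods.
r = pearson(unweighted, weighted)
print(f"correlation between unweighted and weighted scores: {r:.3f}")
```

When the loadings are all of similar size, as they are in this toy example, the weighted and unweighted totals rank respondents almost identically, which is the practical argument for preferring the simpler sum.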

Several changes were made to some items in the subjective module during the committee meeting in 1999 in Anaheim, CA. To assess the effects of these changes and to describe the psychometric properties of the final version of the subjective module, additional data were gathered with the revised subjective module.

In 2001, the final version of the IKDC Subjective Knee Form (SKF), consisting of 18 questions, was administered to 590 patients with ligament injuries, meniscal injuries, patellofemoral pain, and osteoarthritis, to provide additional evidence that performance of the instrument was not dependent on diagnosis [13]. The average age of the patients was 37.5 years, and 52.6 % were male. In the sample, 76 % participated in sports activity; 19 % were competitive athletes, and 57 % were recreational athletes.

Fact Box 7

In 2001, the final version of the IKDC SKF, consisting of 18 questions, was administered to 590 patients with ligament injuries, meniscal injuries, patellofemoral pain, and osteoarthritis.

The factor analysis demonstrated that it is reasonable to combine all of the questions in the IKDC Subjective Knee Form into a single score. Other patient-reported measures of symptoms and function have applied differential scoring based on the authors' perception of what is important and how it should be scored rather than on statistical evidence.

Three different methods of scoring were evaluated. These included adding unweighted scores for the questions, a weighted sum of the questions that used the factor loadings from the factor analysis, and a method based on item response theory. The correlations among the three methods of scoring were all high. Additionally, the method of adding unweighted scores and the method based on item response theory identified the same five highest and lowest scoring subjects. Given these results and its simplicity, adding the unweighted scores was recommended over the other two methods of scoring.

The IKDC SKF has acceptable levels of internal consistency. A high value of coefficient alpha (0.92) indicated that the questions consistently measure the underlying construct of symptoms, function, and sports activity in patients with a variety of knee problems. The underlying concept for internal consistency is that the consistency with which a patient responds from one question to the next can be used to provide an estimate of reliability for the total test score [22].
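As an illustration of how coefficient alpha summarizes internal consistency, the following minimal sketch applies the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The response matrix is hypothetical and is not drawn from the IKDC validation data; the function name is likewise an assumption made for this example.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's coefficient alpha for a respondents-by-items score matrix.

    responses: list of rows, one row per respondent, one value per item.
    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    k = len(responses[0])  # number of items
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses from five patients to four items.
responses = [
    [4, 3, 4, 2],
    [1, 0, 2, 1],
    [3, 3, 3, 4],
    [0, 1, 1, 0],
    [2, 2, 3, 2],
]

alpha = cronbach_alpha(responses)
print(f"coefficient alpha: {alpha:.2f}")
```

Alpha approaches 1 when patients who score high on one question tend to score high on the others, so that most of the variance in the total score is shared across items rather than specific to any single item.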

Test-retest reliability and responsiveness are important characteristics of a rating scale designed to measure change over time [16]. Test-retest reliability reflects measurement error associated with repeated measurement when the patient’s status remains the same. Thus, high levels of test-retest reliability imply that repeated measurements yield consistent scores when a patient’s symptoms, function, and sports activity have remained constant. The IKDC SKF had high (0.94) levels of test-retest reliability.

Fact Box 8

Psychometric analysis demonstrated that the IKDC SKF functions similarly, regardless of age, sex, or diagnosis.

A major objective in the development of the IKDC SKF was to create a form that would be appropriate for patients with a variety of knee impairments, including ligament and meniscal injuries, articular cartilage lesions, and patellofemoral conditions. Item response theory was used to determine if the IKDC SKF would perform the same for young versus old, for men versus women, or for patients with different knee problems. The results indicated that, with few exceptions, the questions and therefore the entire form functioned similarly regardless of age, sex, or diagnosis.

12 Responsiveness

The next step in testing was to determine responsiveness of the IKDC SKF. Responsiveness is the ability of a form to detect minimal clinically important differences when the patient’s status has changed [9]. Demonstration of responsiveness requires administration of the instrument on two or more occasions to patients who are expected to undergo change. To provide evidence for responsiveness, the IKDC SKF was administered longitudinally to 207 patients who had a variety of knee problems [14].

In summary, the IKDC SKF, a well-standardized outcome instrument, has been shown to be reliable, valid, and responsive as a measure of change in symptoms, function, and sports activity over time in patients with a variety of knee impairments.

Fact Box 9

The minimal detectable change, the change in score necessary to be certain that the change is greater than the measurement error of the outcome instrument, was 12.5. The minimal clinically important difference, the change in score necessary for the patient to perceive change that is clinically relevant, was 11.5.
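The two thresholds above can be applied to an individual patient's change score, as in the sketch below. The text does not state how the minimal detectable change was derived, so the formula shown (standard error of measurement = SD × sqrt(1 − reliability); MDC = z × sqrt(2) × SEM, with z = 1.96 for 95 % confidence) is a commonly used convention offered as an assumption, not the authors' documented method. The function names, the example scores, and the choice of strict versus non-strict comparisons are hypothetical.

```python
from math import sqrt

# Thresholds reported in the text for the IKDC SKF.
MDC = 12.5   # minimal detectable change
MCID = 11.5  # minimal clinically important difference

def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and a reliability coefficient."""
    return sd * sqrt(1.0 - reliability)

def minimal_detectable_change(sd, reliability, z=1.96):
    """Commonly used MDC formula (an assumption here): z * sqrt(2) * SEM."""
    return z * sqrt(2.0) * standard_error_of_measurement(sd, reliability)

def interpret_change(baseline, follow_up, mdc=MDC, mcid=MCID):
    """Compare a patient's change in score against the MDC and MCID."""
    change = follow_up - baseline
    return {
        "change": change,
        "beyond_measurement_error": abs(change) > mdc,
        "clinically_important": abs(change) >= mcid,
    }

# Hypothetical patient: score of 55 before treatment and 67 at follow-up.
result = interpret_change(55.0, 67.0)
print(result)
```

In this hypothetical case the 12-point improvement reaches the minimal clinically important difference but does not exceed the minimal detectable change, which shows why the two thresholds should be reported and interpreted together.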

13 Normative Data

The next step in standardization of the IKDC SKF was the collection of normative data. The primary purpose of this study was to provide clinicians and researchers with normative data that would place scores, changes in scores, and scores from male or female patients of different ages within the context of normal population values. Normative comparison facilitates the interpretation of results on the IKDC form for patient management decisions and for comparison between groups of patients by demonstrating how close patients come to the normal range of functioning.

The Subjective Knee Evaluation Form was mailed to 600 people in each of 8 age/gender categories (18–24 years, 25–34 years, 35–50 years, and 51–65 years for both male subjects and female subjects) [2]. Participants were drawn from a panel of 550,000 households (1,300,000 subjects) representative of noninstitutionalized persons in the United States and were matched to data from the United States Census Bureau on geographical region, market size, income, and household size.

Fact Box 10

Normative data were determined in each of 8 age/gender categories by testing 5,246 subjects.

Results

Complete data were available for 5,246 knees. Twenty-eight percent of respondents reported an injury, weakness, or other problem with one or both knees. Normative data were determined for respondents as a whole and for the subset of respondents with no history of knee problems. Scores on the IKDC Subjective Knee Evaluation Form vary by age, gender, and history of knee problems. The normative data published in 2006 allow clinicians to interpret how patients with knee injuries are functioning relative to their age- and gender-matched peers and will enable researchers to determine the clinical outcome of treatment [2].
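One way to place a patient's score in the context of age- and gender-matched norms is a standardized (z) score, sketched below. The age bands follow the categories listed above, but the means and standard deviations in the lookup table are hypothetical placeholders, not the published normative values [2]; the function names are also assumptions made for this example.

```python
# Age bands used for the normative sample, as (low, high, label).
AGE_BANDS = [(18, 24, "18-24"), (25, 34, "25-34"), (35, 50, "35-50"), (51, 65, "51-65")]

# HYPOTHETICAL placeholder norms: (sex, age band) -> (mean, standard deviation).
# Replace with the published normative values before any real use.
HYPOTHETICAL_NORMS = {
    ("female", "18-24"): (90.0, 10.0),
    ("male", "18-24"): (90.0, 10.0),
    ("female", "51-65"): (80.0, 15.0),
    ("male", "51-65"): (80.0, 15.0),
}

def age_band(age):
    """Return the label of the age band containing the given age in years."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    raise ValueError(f"age {age} is outside the normative age range")

def z_score(score, sex, age, norms=HYPOTHETICAL_NORMS):
    """Standardized distance of a score from the matched normative mean."""
    mean, sd = norms[(sex, age_band(age))]
    return (score - mean) / sd

# Hypothetical patient: 20-year-old woman with a score of 70.
z = z_score(70.0, "female", 20)
print(f"z-score relative to matched peers: {z:.1f}")
```

A negative z-score indicates a score below the matched normative mean, expressed in standard deviation units, which makes scores comparable across age and gender groups whose normal values differ.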

14 Pedi-IKDC

Fact Box 11

The pediatric IKDC was developed and psychometric characteristics were determined on 589 patients, ages 6–18, with a variety of knee disorders.

A crucial feature of evaluating the psychometric properties of the IKDC SKF is demonstration of validity for the target population. The use of a validated outcome measure is not necessarily appropriate for pediatric patients. Patient-reported outcome measures rely on literacy and comprehension of questions that children may not understand. Consequently, cognitive interviews were conducted to determine how well children understood the components of the IKDC SKF [15]. This study revealed that children had difficulty comprehending and answering certain questions. Based on the specific areas of misunderstanding, a modified IKDC SKF (pedi-IKDC) was developed, and psychometric characteristics were determined on 589 patients, ages 6–18, with a variety of knee disorders [18]. The pedi-IKDC SKF demonstrated overall acceptable psychometric performance for outcome assessment of children and adolescents with various knee disorders [4].

15 Future Directions

In October 2014, the AOSSM Board voted to update the IKDC SKF by developing a computerized adaptive test (CAT) and integrating it with the Patient-Reported Outcomes Measurement Information System (PROMIS) physical and functional CATs. The rationale for converting the existing IKDC SKF to a CAT is that it would enable the IKDC SKF to continue to be used as a measure of physical function and other dimensions of health more efficiently overall, without increasing the total number of items administered to the patient. In addition, the IKDC SKF may be integrated with the PROMIS physical function and pain CAT for sports-related knee injury. This would be very valuable and could advance the field of measuring patient-reported outcomes for sports-related conditions.

16 Conclusions

The IKDC SKF was rigorously tested and found to be an instrument that was valid, reliable, and responsive and could be used to assess symptoms, function, and sports activity in patients with a variety of knee disorders including ligament and meniscal injuries, patellofemoral pain, chondral injuries [8, 10], and osteoarthritis [2, 8, 13, 14, 28]. Studies comparing the IKDC SKF to other outcome measures demonstrate superior psychometric characteristics of the IKDC for meniscal [5, 33, 35], ACL [36], and cartilage repair outcomes [10].

As a result of rigorous psychometric testing, the availability of normative data, a pediatric version [28, 30, 31], and comparison to other outcome instruments, the IKDC SKF has gained worldwide recognition and popularity. It has been culturally adapted and translated into 19 languages [11, 19, 20, 29]. The forms and translated versions are available at www.sportsmed.org.
