Introduction

Improvement in patients’ activity level is a key objective of total joint arthroplasties. Efforts to accurately assess this activity in patients undergoing total hip arthroplasty (THA) and total knee arthroplasty (TKA) have generated numerous activity scales during the past three decades. A review of the psychometric properties of the existing scales concluded that the University of California, Los Angeles activity scale (UCLA) and the Lower Extremity Activity Scale (LEAS) stand out as the two most rigorously developed and valid activity scales in orthopaedics [24].

The UCLA score, which was developed in 1984, includes 10 statements that cover the range of activity states from being “wholly inactive, dependent on others, and cannot leave residence” to “regularly participate in impact sports” [1, 27]. The LEAS, which was developed in 2005, has a similar structure to the UCLA, but provides more options, including 18 statements that range from “I am confined to bed all day” to “I am up and about at will in my house and outside. I also participate in vigorous physical activity such as competitive level sports daily” [20]. Both scales require the respondent to select the one statement that is most representative of their current activity. On both scales, a higher score reflects a higher level of activity. However, the descriptions of the activity levels are slightly different and, with eight more statements, the LEAS aimed to describe the activity level of the patient more accurately than the UCLA score.

The many similarities between these two scales provide an opportunity to create a crosswalk between them. A crosswalk is a concordance table that allows conversion of scores between scales [2, 9, 16, 17, 23, 25, 26]. These crosswalks are very helpful when comparing the results of different studies and when pooling those results for meta-analyses. They also facilitate combining datasets from multiple registries and data sources that may have used one activity scale or the other.

The UCLA and LEAS scales continue to be used in orthopaedics to assess patient activity [18]. We aimed to create a crosswalk between the UCLA scale and the LEAS for patients undergoing THA and TKA that estimates scores on one scale from scores on the other, and vice versa. This crosswalk between the two scales will allow the clinical and research community to compare results across studies and registries, and to pool data from multiple sources to conduct large-scale analyses.

Patients and Methods

We retrospectively studied all patients who had either a primary THA or TKA performed by three senior surgeons (MP, DP, GW), and were enrolled in the two total joint arthroplasty registries at the Hospital for Special Surgery between May 2007 and December 2011. The first registry, CORRe, was established in 2003 by a group of arthroplasty surgeons at the Hospital for Special Surgery and is focused primarily on documenting intraoperative factors and device use [21]. This ongoing registry collects the UCLA score on patients at baseline and 2 years after surgery. The second registry (Legacy) was a federally funded registry at the Hospital for Special Surgery that was focused primarily on long-term patient-reported outcomes of joint arthroplasty [3, 7, 8]. This registry collected the LEAS scores on patients at baseline and at 2 years followup. Recruitment for this registry spanned 2007 to 2011, while followup continues. The activity scales were self-administered by patients in both registries. If patients had more than one TKA or THA during the study period, only the UCLA and LEAS scores for the first surgery were included.

During the study period, the three participating surgeons performed a total of 1163 primary TKAs and 1191 THAs. For TKAs, 56% participated in the CORRe and the Legacy registries (n = 647). For THAs, 66% participated in the CORRe and the Legacy registries (n = 784). Of those, 56% who had TKAs (n = 364) and 51% who had THAs (n = 403) completed the preoperative UCLA and preoperative LEAS surveys. No differences were found in age (67 versus 68 years for TKA and 65 versus 66 years for THA), gender (69% versus 68% female for TKA and 59% versus 58% female for THA), BMI (31 kg/m² for TKA and 28 kg/m² for THA), or Charlson-Deyo comorbidity index (30% versus 36% greater than one for TKA and 29% versus 27% greater than one for THA) between patients who completed the baseline surveys and those who did not.

Of the patients who completed the preoperative surveys, 69 who had TKAs and 85 who had THAs returned the UCLA and LEAS 2-year surveys. We also compared patients who returned 2-year data versus those who did not. Patients who did not return 2-year surveys were not different from those who returned 2-year surveys in terms of age (67 ± 9 years versus 66 ± 8 years for TKA; p = 0.845) and gender (68% versus 72%; p = 0.399 for TKA and 58% versus 58%; p = 0.969 for THA). We also found that patients who did not return 2-year surveys were not different in terms of LEAS and UCLA scores at baseline (LEAS 9.0 ± 2.8 versus 9.5 ± 3.0; p = 0.162 and UCLA 4.4 ± 2.0 versus 4.8 ± 2.5; p = 0.267 for TKA; LEAS 9.2 ± 3.1 versus 9.7 ± 3.2; p = 0.150 and UCLA 4.3 ± 2.1 versus 4.7 ± 2.1; p = 0.103 for THA).

The final analytic dataset included 403 patients having THAs and 364 having TKAs who were enrolled in both registries and fulfilled the inclusion and exclusion criteria for the study. The 403 patients having THAs had a mean age of 65.5 ± 10.8 years, and 237 (58.8%) were female. Of those, 85 patients had the 2-year UCLA and LEAS followup scores available from both registries. The mean UCLA and LEAS scores of the baseline and 2-year pooled data (n = 488) were 4.7 ± 2.1 and 9.7 ± 3.2, respectively. The 364 patients having TKAs had a mean age of 67.1 ± 9.4 years, and 243 (66.8%) were female. Of those, 69 patients had the 2-year UCLA and LEAS followup scores available from both registries. The mean UCLA and LEAS scores of the baseline and 2-year pooled data (n = 433) were 4.8 ± 2.1 and 9.4 ± 2.9, respectively.

The equipercentile equating method then was used to create the crosswalk between the UCLA and LEAS scores [11]. In simple terms, the crosswalk was created by identifying the scores on both scales that have the same percentile ranks. The equipercentile equating method requires that the two scales measure the same construct and have at least moderate Spearman correlation (> 0.3) [4]. We calculated the percentile rank functions for the UCLA and LEAS scores and identified for every LEAS score a UCLA score that has the same percentile rank and vice versa using the SAS® EQUIPERCENT Macro (Price, Lurie and Wilkins, San Marcos, TX, USA) [19]. We calculated a crosswalk for patients having THA and repeated these methods for patients having TKA. Crosswalk results were summarized in a conversion table. The equipercentile equating was first performed separately for baseline and 2-year followup data. However, because the resulting crosswalk tables were consistent across times, we combined the baseline and the 2-year data.
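The idea of matching scores by percentile rank can be illustrated with a minimal Python sketch. This is not the SAS macro used in the study: it is a simplified nearest-percentile-rank version without interpolation or smoothing, and the function names and toy data (a 2-level and a 4-level scale) are hypothetical.

```python
from collections import Counter


def percentile_ranks(scores, lo, hi):
    """Mid-percentile rank (0 to 100) for every integer score from lo to hi."""
    n = len(scores)
    counts = Counter(scores)
    ranks, below = {}, 0
    for s in range(lo, hi + 1):
        # Share of respondents below s, plus half of those exactly at s.
        ranks[s] = 100.0 * (below + 0.5 * counts[s]) / n
        below += counts[s]
    return ranks


def crosswalk(from_scores, to_scores, from_range, to_range):
    """Map each source score to the target score with the closest
    percentile rank (ties go to the lower target score)."""
    from_pr = percentile_ranks(from_scores, *from_range)
    to_pr = percentile_ranks(to_scores, *to_range)
    return {
        s: min(to_pr, key=lambda t: (abs(to_pr[t] - pr), t))
        for s, pr in from_pr.items()
    }


# Hypothetical toy data, not study data: the same respondents are assumed
# to have answered a short (1-2) and a longer (1-4) ordinal scale.
short_scale = [1, 1, 2, 2]
long_scale = [1, 2, 2, 3, 4, 4, 4, 4]

short_to_long = crosswalk(short_scale, long_scale, (1, 2), (1, 4))
long_to_short = crosswalk(long_scale, short_scale, (1, 4), (1, 2))
print(short_to_long)
print(long_to_short)
```

Because the longer scale has more levels than the shorter one, the two directions generally need separate tables, and several scores on the longer scale can map to one score on the shorter scale.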

To assess the validity of the crosswalks, we first compared the mean scores for the actual and converted scores. We then compared the responsiveness to change from baseline to 2-year followup of the actual and converted UCLA and LEAS scores by applying the standardized response mean (SRM) method to the subset that had both scores [10]. Third, we calculated the areas under the receiver operating characteristic (ROC) curves to compare the ability of the actual and converted scores to discriminate different thresholds of function measured using the Hip dysfunction and Osteoarthritis Outcome Score (HOOS) activities of daily living (ADL) subscale for patients having THA and the Knee injury and Osteoarthritis Outcome Score (KOOS) ADL subscale for patients having TKA [6]. The HOOS and KOOS measures were obtained from the Legacy registry. We used the quintile cutoff values as these thresholds. An area under the ROC curve of 0.5 indicates that the discrimination ability of the scores is not better than chance in predicting the threshold, and an area of 1.0 indicates that the score perfectly predicts the threshold. Differences between mean scores and areas under the ROC curves were compared using the inequality test.
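The two validity metrics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's analysis code: the SRM is taken as the mean change divided by the sample standard deviation of the change, the area under the ROC curve is computed through its rank (Mann-Whitney) formulation, and all names and toy values are hypothetical.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, followup):
    """Mean of the paired changes divided by their sample SD."""
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)


def roc_auc(scores, above_threshold):
    """Probability that a patient above the functional threshold has a
    higher activity score than one below it (ties count as one half)."""
    pos = [s for s, y in zip(scores, above_threshold) if y]
    neg = [s for s, y in zip(scores, above_threshold) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical activity scores for four patients (not study data).
baseline = [4, 5, 3, 6]
followup = [6, 6, 5, 9]
srm = standardized_response_mean(baseline, followup)

# Hypothetical flags: 1 if the patient's ADL subscale score exceeds a
# chosen cutoff (for example, a quintile), otherwise 0.
activity = [3, 6, 6, 8]
above_cutoff = [0, 0, 1, 1]
auc = roc_auc(activity, above_cutoff)
print(round(srm, 3), auc)
```

Running the same two functions on the actual scores and on the crosswalk-derived scores gives the pairs of values that are then compared.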

Results

The assumptions for performing equipercentile equating were met for THAs and TKAs. The Cronbach alpha was 0.64 and the correlation between the UCLA and LEAS scores was 0.47. When the equipercentile method was applied, all 10 UCLA activity scores had equivalent LEAS scores (Table 1). Each of the UCLA scores was matched to a unique LEAS score for patients who had TKAs; however, for patients who had THAs, UCLA scores of 7 and 8 were both matched to a LEAS score of 14. When converting the LEAS to UCLA score, all 18 activity scores had matching equivalent UCLA scores; however, the matching rate ranged from 1:1 to 3:1 LEAS scores to a UCLA score.

Table 1 Equipercentile equating crosswalk between UCLA and LEAS scores

The mean converted UCLA and LEAS scores were not different, with the numbers available, from the mean actual scores for THA (converted LEAS versus LEAS scores: mean difference, −0.05; SD, 3.21; p = 0.73; converted UCLA versus UCLA scores: mean difference, 0.05; SD, 2.21; p = 0.638) and TKA (converted LEAS versus LEAS scores: mean difference, 0.01; SD, 3.06; p = 0.93; converted UCLA versus UCLA scores: mean difference, −0.03; SD, 2.20; p = 0.79). Responsiveness was compared for the actual and converted scores. The SRM of the converted scores was not different from that of the original scores (Table 2). The areas under the ROC curve for the original and converted scores also were not different (Table 3).

Table 2 Responsiveness of the original and the crosswalk-derived scores
Table 3 Area under the ROC curve for original and crosswalk-derived scores

Discussion

The UCLA and LEAS scales are the two most valid patient-reported measures of lower extremity activity for the population undergoing arthroplasty. Despite the many similarities between the two scales, comparing results of studies using these two measures is not currently possible. Creating a crosswalk, or concordance table, to easily convert scores between the two scales will facilitate this comparison, especially when pooling data for meta-analyses. We are not aware of prior studies that aimed at creating such a crosswalk. In this study, we successfully applied the equipercentile equating method to create a crosswalk between the UCLA and the LEAS scores. The crosswalk allows researchers to derive an equivalent score on the LEAS for a UCLA score and vice versa, for patients having TKA and THA (Table 1). We showed that the crosswalk-derived scores had similar responsiveness to change as the original scores, and had similar discriminant properties. The crosswalks were similar for patients having TKA and THA.

This study has limitations. Not all patients participated in both registries, and this could result in selection bias. However, patients who responded were not different from those who did not on numerous key demographic factors. The order of completing the UCLA and LEAS scores, that is, which one was completed first, was not recorded. Prior research has shown that the order of questions affects how participants respond to questions [12]. However, we have no reason to believe there was a specific pattern to this order that may systematically bias the LEAS or UCLA scores and thus the crosswalk values [22]. The recruitment for the two registry efforts was not coordinated. Responsiveness to change was based on a small subset of these patients. Although we have shown that patients who had completed their followup surveys were not different from those who did not from a demographic viewpoint, additional testing with more complete followup data, and in other settings, would be ideal to validate our findings. We cannot exclude the possibility that the patients we studied represent a nonrandom subset of those who had surgery; in general, patients lost to followup have inferior health status to those who return for followup. In addition, a disproportionally smaller number of our patients were at the bottom of the activity scales, and this relative lack of data may have prevented an accurate accounting of patients who are less active. Loss to followup and underrepresentation of less-active patients could result in the crosswalk being less precise than it otherwise would be.

This study represented a natural experiment born of inefficiency. Two simultaneously active patient registries in the same patient population are extremely rare and not likely to be repeated elsewhere. Use of the crosswalk in clinical studies should provide evidence of the generalizability of the crosswalk. Finally, patients for this study were recruited from a high-volume specialized orthopaedic hospital and may not be representative of the population of patients undergoing THA and TKA in the United States. However, prior research has shown that patients in our registries are generally similar to those in the nationally representative Function and Outcomes Research for Comparative Effectiveness in Total Joint Replacement and Quality Improvement (FORCE-TJR) registry [14, 15].

Crosswalks between survey instruments have been used in patient-reported outcomes research, largely to convert scores of patient-reported instruments that measure symptoms, such as activity scales, to utility measures, such as the EQ-5D™ [23]. We used the equipercentile equating method, an advanced statistical method, to derive an equivalent LEAS score for a UCLA score and vice versa, for patients having TKA and THA. Although there is no gold standard for developing the crosswalk between two instruments, some other methods currently exist, including regression-based linear equating and item-response theory (IRT) Rasch analysis-based equating methods [5]. Although the Rasch-based equating method is widely available and offers a flexible and powerful framework for score linking, it is based on strong assumptions that often are not a good approximation of the reality of testing instruments [13]. The Rasch or any IRT method assumes that the probability that a responder will answer a question correctly does not depend on whether the question is placed at the beginning, in the middle, or at the end of the test. It also does not apply to instruments with hierarchic scale structure. The regression-based linear equating method, in contrast, is straightforward to implement but produces results that depend heavily on the groups of patients or test-takers used for the equating process [13]. The means and standard deviations for each group directly influence the equating equation, and the transformation cannot be applied to other groups. For instance, the linear equating in a strong responder group can differ noticeably from the linear equating in a weaker responder group.
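The group dependence of linear equating can be made concrete with a short Python sketch. It assumes the common mean-and-SD form of linear equating (matching standardized scores on the two scales) and, purely for illustration, plugs in the pooled means and standard deviations reported in Patients and Methods; linear equating was not the method used in this study, and the function name is hypothetical.

```python
def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Mean-and-SD linear equating: the Y score with the same z-score as x."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)


# Pooled baseline and 2-year summary statistics from the text
# (UCLA mean, UCLA SD, LEAS mean, LEAS SD), used only as an illustration.
tha = dict(mean_x=4.7, sd_x=2.1, mean_y=9.7, sd_y=3.2)
tka = dict(mean_x=4.8, sd_x=2.1, mean_y=9.4, sd_y=2.9)

# The same UCLA score converts to a different LEAS value in each group,
# because each group's mean and SD enter the equating equation directly.
ucla_score = 6
leas_from_tha = linear_equate(ucla_score, **tha)
leas_from_tka = linear_equate(ucla_score, **tka)
print(round(leas_from_tha, 2), round(leas_from_tka, 2))
```

A line fitted in one group therefore carries that group's mean and spread with it, which is why the transformation does not transfer cleanly to groups with different score distributions.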

Because the crosswalk involved equating 18 LEAS activity states to 10 UCLA activity states, two crosswalks were necessary to properly convert scores between the scales. Therefore, up to three states on the LEAS scores were matched to one on the UCLA score in the conversion from LEAS to UCLA scores; whereas in the conversion from UCLA scores to LEAS scores, one state on the UCLA score was equated to only one state on the LEAS score. These conversions equated the UCLA score for a particular state to the highest score of the equivalent states of the LEAS score for the least-active states, and to the lowest or middle score for higher levels of activity (Table 1). These equating algorithms potentially might bias the LEAS scores derived from the UCLA scores upward or downward depending on the disability status of the study population. However, the consistency in crosswalk values that were derived from the baseline data, which include patients who were debilitated going into surgery, and the 2-year data, which include patients who have recovered and have higher functional ability, suggests that these biases are less likely to occur. Therefore, while the crosswalk conversion can be applied to scores at the individual level, comparisons of the converted and actual scores are most accurate at the group level. Finally, the similarity in the crosswalks between the two scales for patients having TKAs and THAs suggests that deriving one crosswalk for lower-extremity procedures is possible, especially since the UCLA and LEAS scores are not procedure- or joint-specific.

We derived and validated a crosswalk between the UCLA and the LEAS scores, the two most psychometrically robust activity scales in arthroplasty research, for patients having THAs and TKAs. The crosswalk should be helpful in comparing findings from different studies, especially when conducting meta-analyses, and when pooling data from multiple sources such as registries. In addition, because neither of the two lower extremity activity scales is joint-specific, reproducing the crosswalk for other lower extremity conditions or surgical procedures may extend its utility to studies assessing activity in patients with these conditions or procedures.