Introduction

The number of patients undergoing total knee arthroplasty (TKA) to relieve pain and improve functional status in symptomatic knee osteoarthritis (OA) has increased worldwide. However, despite substantial advances in patient selection, surgical technique, and implant design for TKA, one study has indicated that 11–18 % of patients remain unsatisfied with the operation [1]. It is well recognized that a proper evaluation of patients undergoing TKA is needed, especially to determine the patient’s expectations and satisfaction.

Some knee scoring systems have been translated effectively into Japanese and scientifically validated [2–4], but a better scoring system designed specifically for TKA evaluation is required. Therefore, in 2011, the new Knee Society Scoring System (KSS) was developed to better characterize the symptoms, expectations, satisfaction, and physical activities of patients who live more diverse lives [5]. This new scoring system is based on new scales and validation work [6], and its reliability has been evaluated in previous research [7–9] with satisfactory results.

To integrate self-assessments of Japanese patients into international clinical projects and to compare the results of TKA between Japan and other countries, it is crucial to have a Japanese version developed through a standardized method of cross-cultural adaptation and to validate it using self-assessment of Japanese patients who have undergone TKA. This is especially true because the incidence of knee symptoms is reportedly higher in Japan than in other countries, partly because of the super-aging society [10, 11] and because the demand for TKA and its proper evaluation has increased.

Before the Japanese version of the KSS becomes widely available, the scoring system must be adapted to the Japanese-speaking population and must be validated with patients who have undergone TKA. To facilitate comparisons of treatment results after TKA worldwide, the feasibility, reproducibility, and construct validity must be assessed using standardized methods. The aims of this study were to establish a Japanese version of the KSS through cross-cultural adaptation and to validate its psychometric properties in patients with OA who have undergone primary TKA.

Materials and methods

Cross-cultural adaptation

The cross-cultural adaptation of the KSS into Japanese was performed according to published guidelines [12, 13]. Briefly, the English version of the KSS was translated independently by three native Japanese bilingual translators, all of whom had medical backgrounds. After the three forward translators reached unanimous agreement, a pretest Japanese translation was established. This version was back-translated by two bilingual, nonmedical, professional English translators, who were blinded to the original English version. We continued this process until a final version was produced that had no disagreements between the English and Japanese versions. When the penultimate consensus version was formed, the back-translated English version was sent to and approved by the inventor of the KSS. Any requests for changes were discussed among the translators until they reached an agreement, and the revision was then sent back to the inventor of the KSS. In the discretionary knee activities of the KSS, we added “ground golf” and “hiking” because a previous survey showed these to be popular activities among Japanese people.

Validation study

Patients and data collection

Institutional review approval was obtained before the study. We sent the translated versions of the KSS (Suppl. Figure.), Oxford 12-item Knee Score (OKS), and Short Form 36 Health Survey (SF-36) to 93 patients with OA who consecutively underwent primary TKA in our institution from April 2011 to January 2015. Patients whose activities were markedly affected by other locomotive disease(s) or psychoneurological issues were excluded according to self-report or objective observation at the time of a regular visit. Sixty-three patients responded, and the 55 patients who completed the questionnaire and signed the informed consent form were included in this study. Each patient received a set of questionnaires for immediate completion. For reliability testing, patients were asked to complete the second questionnaire 7 days after completion of the first. For patients who had undergone TKA on both sides, the questionnaire was completed only for the first operation.

Psychometric characteristics of the Japanese KSS

Feasibility was evaluated based on the response rate and presence of a floor or ceiling effect (>15 % of patients reach the minimum or maximum score) [14]. The test-retest reliability of the KSS was determined by assessing the reproducibility of the results obtained 7 days apart without any treatment changes and was assessed by calculating the intraclass correlation coefficient (ICC) with 95 % confidence intervals [3]. An ICC >0.8 was considered excellent. For questionnaires missing one or two responses, the missing responses were replaced by the mean of the completed subscale responses. In cases with three or more missing responses, the subscale was not calculated for the patient. Cronbach’s alpha was used to measure the internal consistency of the test. Consistency was deemed satisfactory if this coefficient was ≥0.7. Construct validity was estimated using the correlation between the domains of the KSS and those of other questionnaires as assessed by Spearman’s coefficient. The correlation was considered strong, moderate, or weak if the coefficient was >0.5, 0.35–0.5, or <0.35, respectively [14].
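The imputation rule, Cronbach’s alpha, and the correlation thresholds described above can be illustrated with a minimal Python sketch. It is not the JMP procedure used in the study: the function names are hypothetical, missing responses are assumed to be coded as `None`, and the thresholds are assumed to apply to the absolute value of Spearman’s coefficient so that negative correlations are graded in the same way.

```python
from statistics import mean, variance


def impute_subscale(responses, max_missing=2):
    """Fill missing items (None) with the mean of the answered items.

    Returns None when more than `max_missing` items are missing,
    meaning the subscale is not scored for that patient.
    """
    answered = [r for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    if n_missing > max_missing or not answered:
        return None
    fill = mean(answered)
    return [fill if r is None else r for r in responses]


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per patient, all of equal length.
    Uses sample variances of each item and of the per-patient totals.
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def classify_correlation(rho):
    """Grade a Spearman coefficient with the thresholds stated in the text.

    Assumption: the sign is ignored and only the magnitude is graded.
    """
    strength = abs(rho)
    if strength > 0.5:
        return "strong"
    if strength >= 0.35:
        return "moderate"
    return "weak"
```

For example, a four-item subscale answered as `[4, None, 2, 3]` would be completed with the mean of the three answered items, whereas a subscale with three missing items would be left unscored.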

Statistical analysis

All analyses were performed using the JMP Pro statistical package, version 11.0.0 (SAS Institute Inc., Cary, NC). The scores were reported as mean ± standard deviation, and a significance threshold of P < 0.05 was used in all analyses.

Results

Sixty-three of the 93 patients completed and returned the questionnaire. After excluding questionnaires with insufficient answers, a total of 55 patients were included in the study group. Their mean age was 73.2 years (range 51–87), and 68 % were women.

Feasibility

Of the 93 patients who were sent the questionnaires, 63 (67.7 %) completed and returned both sets. Regarding the ceiling effect, the maximum score was obtained by one patient for the symptom subscale (1.6 %), four patients for the satisfaction subscale (6.3 %), three patients for the expectation subscale (4.8 %), and no patients for the activity subscale (0 %). Regarding the floor effect, the minimum score was obtained by three patients for the expectation subscale (4.8 %) and by no patients for the symptoms, satisfaction, and activity subscales (0 %). Because all of these proportions were below the 15 % threshold, we concluded that there was no ceiling or floor effect in this Japanese version of the KSS.
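The percentages above follow directly from the reported counts and the 63 respondents. The short Python sketch below reproduces that arithmetic and applies the 15 % criterion; the function and variable names are hypothetical, and the counts are those given in the text.

```python
def extreme_score_rate(n_at_extreme, n_respondents, threshold_pct=15.0):
    """Return (percentage at the extreme score, whether it exceeds the threshold)."""
    pct = 100.0 * n_at_extreme / n_respondents
    return pct, pct > threshold_pct


N_RESPONDENTS = 63  # questionnaires returned, out of 93 sent

# Number of patients at the maximum (ceiling) or minimum (floor) score.
counts = {
    "ceiling": {"symptoms": 1, "satisfaction": 4, "expectation": 3, "activity": 0},
    "floor": {"symptoms": 0, "satisfaction": 0, "expectation": 3, "activity": 0},
}

summary = {
    effect: {
        subscale: extreme_score_rate(n, N_RESPONDENTS)
        for subscale, n in per_subscale.items()
    }
    for effect, per_subscale in counts.items()
}

response_rate_pct = 100.0 * N_RESPONDENTS / 93

for effect, per_subscale in summary.items():
    for subscale, (pct, flagged) in per_subscale.items():
        print(f"{effect:7s} {subscale:12s} {pct:4.1f} %  exceeds threshold: {flagged}")
```

None of the subscales is flagged, which is consistent with the conclusion that no floor or ceiling effect was present.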

Reliability and internal consistency

Table 1 shows that the reliability was excellent in the majority of domains, with ICCs of 0.65–0.88, indicating adequate reproducibility. Internal consistency by Cronbach’s alpha was 0.78–0.94 for individual subscales and was good to excellent for all domains. The differences between the two sets of subscale scores ranged from 0.0 to 1.1 and were not significant. Both the satisfaction and expectation subscales showed excellent ICC values (0.88 and 0.83, respectively) and Cronbach’s alpha (0.94 and 0.91, respectively).
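For readers who wish to reproduce a test-retest ICC of this kind, a minimal Python sketch is given below. The ICC model used in the study is not stated in the text, so the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), is assumed here purely for illustration. The function name is hypothetical, and confidence intervals are not computed.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: one [first administration, second administration] pair per patient.
    Assumption: this ICC form is chosen for illustration only.
    """
    n = len(scores)       # patients
    k = len(scores[0])    # administrations (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between patients
    ms_cols = ss_cols / (k - 1)                 # between administrations
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because this form measures absolute agreement, a systematic shift between the first and second administrations lowers the coefficient even when the patients keep the same rank order.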

Table 1 Reliability data for the seven Japanese New Knee Society Score domains

Validity

Table 2 shows that all four domains of the KSS correlated significantly with the Japanese Oxford 12-item Knee Score. The activity domain of the KSS correlated significantly with all of the subscales of the SF-36, four of which showed strong, three moderate, and one weak correlations. The satisfaction domain showed positive correlations with the physical function, role-physical, bodily pain, general health, and vitality subscales. The symptom domain showed moderate negative correlations with the bodily pain and vitality domains, and the expectation domain showed a moderate negative correlation with the bodily pain domain and weak and moderate positive correlations with the physical function and vitality domains, respectively.

Table 2 Construct validity

Discussion

As the number of TKA cases is increasing in Japan as well as in other countries, the demand for validated assessment tools specific for TKA has also increased. Several scoring systems in Japanese are available for assessing knee ailments, some of which, such as the Japanese Knee Osteoarthritis Measure, the Knee Injury and Osteoarthritis Outcome Score, and the OKS, have been scientifically validated [2–4]. However, patient-oriented outcomes are now required more often than before, and the focus is now on the patient’s expectation and satisfaction domains when evaluating the results of the operation. This is why the new KSS, developed in the USA [5, 6], has been translated into other languages [8, 14]. However, the new facets of the scoring systems should be tested in a variety of national, ethnic, and cultural backgrounds. We have developed the Japanese version of the new KSS and tested its feasibility, reliability, and validity in patients with OA who have undergone primary TKA within the past 3 years.

The new KSS has been translated into Dutch and French so far [8, 14]. The English-speaking populations in the USA and Britain would have similar cultural backgrounds to those in European countries, but the Japanese population has distinct differences from the cultural perspective. In particular, the sport activities documented in the activity subscale may not be suitable for Japanese older people who undergo TKA. This is probably the main reason why the response rate was not high in this study and why the correlation coefficients were not strong in many subscales. The same may be true for the symptoms and expectation subscales. However, because elderly people have become more active in daily life and tend to participate in sport activities much more than before, the activity domain is expected to become more important for evaluating TKA. In our study, the activity domain showed moderate to good correlations with most subscales of the SF-36 (Table 2). Also, as satisfaction would be the main focus of TKA, the satisfaction domain showed significant correlations with the majority of subscales of the SF-36. The low correlations do not necessarily reduce the usefulness of this scoring system but do indicate that further studies are required to identify the causes of the differences and whether the differences would affect the validity of the instrument in the near future.

There are several limitations of this study. First, the translation from English to Japanese established in this study might not be the most suitable for the Japanese version because there are always limitations in translating from one language to another. If better words or phrases were suggested, those would require validation in similar standardized protocols in the future. Second, the number of cases tested might not be sufficient to test the validity, as indicated by the current results. Also, the response rate (67.7 %) was not very high. Moreover, the differences and similarities between various disorders and between primary and revision cases should be investigated in the near future. Third, the new KSS should be tested pre- and postoperatively in the same patients, especially the expectation and satisfaction subscales. Lastly, subjective scales might be supported better by objective evaluation by physicians. It is crucial to understand the differences between objective and subjective observations, which should lead to the development of better treatments to improve patient satisfaction. This Japanese version includes two additional activities in discretionary knee activities, so researchers should state in any publication that this is a regional deviation from the original KSS.

In summary, a Japanese version of the new KSS was developed successfully. The translated version showed excellent reliability and internal consistency, and the activity and satisfaction domains showed moderate to good validity. The Japanese version of the new KSS is a valid, reliable, and responsive instrument to capture subjective aspects of the functional symptoms and abilities of patients who undergo TKA.