Introduction

Total knee arthroplasty (TKA) is performed to treat advanced knee osteoarthritis (OA), with the aims of decreasing pain and increasing locomotor function and independence in daily life activities [1]. However, 37 to 55% of patients with TKA experience persistent deficiencies in walking performance, stair climbing, and functional mobility [2, 3]. These deficiencies are closely associated with impaired postural balance resulting from deterioration in motor coordination and a decrement in muscle strength [4]. The main consequences of poor balance are decreased functional mobility and an increased risk of falling [5]. Understanding and objectively monitoring balance-related issues following TKA surgery allows clinicians to design appropriate rehabilitative and fall-prevention interventions; hence, clinicians need valid and reliable tools to assess patients’ balance ability at baseline and post-intervention.

In clinical practice, clinical and performance-based tests are practical, easy to administer, and time- and cost-efficient methods for assessing balance ability and locomotor function in patients with TKA [6, 7]. These tools objectively evaluate standardized tasks mimicking specific aspects of postural balance and locomotor function [8, 9]. The Step Test (ST), a clinical-based balance test, counts the number of steps completed within a specified time. It requires supporting the body during single-leg and double-leg phases, steadying the center of gravity, weight-transfer strategies, and postural adjustments [10,11,12]. Therefore, the ST is assumed to resemble locomotion and to be closely related to gait ability [11, 12]. Given these aspects, the ST could be a useful method for assessing dynamic balance ability and locomotion in patients with TKA.

The validity and reliability of the ST have been established in the literature for individuals with various conditions, such as stroke [10], hip osteoarthritis [13], and surgical fixation after hip fracture [14]. However, to our knowledge, no study in the current literature has examined the reliability and validity of the ST in patients with TKA. Reliability and validity are population-specific, so the psychometric properties of the ST should be evaluated in TKA patients before it is used to assess their dynamic balance and locomotor function. Therefore, this study aimed to assess the test–retest reliability, minimal detectable change (MDC), and concurrent validity of the ST in patients with TKA.

Materials and methods

Participants

Fifty-six patients with TKA, operated on by the same surgeon using the paramedian approach, were included in this study. Patients were eligible if they had undergone TKA at least 3 months earlier and were able to understand the test procedure instructions. The exclusion criteria were undergoing revision TKA; being unable to understand verbal and written instructions; having pain at rest of more than 50 mm on a visual analog scale; having any orthopedic, neurological, or vestibular disorder limiting balance ability and locomotor function; and having surgery within 3 months. An a priori power analysis indicated that a sample size of 52 was necessary to detect a desired intraclass correlation coefficient (ICC) of 0.90 with a lower confidence interval (CI) bound of ICC = 0.70 in the reliability analysis [15]. All patients provided written informed consent, which was obtained in accordance with the Declaration of Helsinki. This study was approved by the Local Ethics Committee. This study conforms to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines.

Procedure

Demographic data (age, gender, weight, height, and body mass index (BMI)) of the TKA patients were obtained, and then a physical therapist demonstrated the ST, 10-m walk test (10MWT), and timed up and go (TUG) procedures to the participants. After the demonstration, all subjects completed the 10MWT and TUG, to determine concurrent validity, and the ST (time 1). Finally, subjects completed a second set of the ST (time 2) to determine test–retest reliability. Patients performed all tests on the same day. The order of tests was randomized at the first assessment session (time 1), and participants were allowed to rest for 5 min on a comfortable chair between tests at that session. Between the first and second trials of the ST, patients waited for an hour in a sitting position to minimize fatigue-related effects. The 1-h interval between the first and second tests was consistent with previous studies investigating test–retest reliability [16, 17]; therefore, this timeframe was considered long enough to prevent fatigue-related effects. All patients were assessed in the same clinical setting in both sessions to minimize the effects of the environment on test performance. Two physical therapists conducted all tests. Each had more than 5 years of clinical experience administering clinical-based tests in an orthopedic rehabilitation setting. Each participant was assessed by the same tester independently.

Outcome measurements

In all tests, time was measured in seconds with a stopwatch. For the TUG and the 10MWT, a shorter time indicated better balance ability and functional mobility. For the ST, a greater number of steps indicated better dynamic balance ability and locomotor function. Patients performed 2 trials of each test: the first was a practice trial to familiarize patients with the test, and the second was the actual test. The mean score of the actual testing was used for analysis.

Step test

Subjects were instructed to place the whole foot of the stepping limb onto the step block and then return it fully to the floor, repeating this as fast as possible for the test duration. The subjects stood 5 cm directly in front of the step block to provide a standardized starting position. The tester stood to one side of the subject to supervise and ensure safety. If a subject overbalanced, timing and step counting were stopped, and the score was recorded as the number of completed steps. The tester began the test period with the command “start,” starting the stopwatch at the same time, and ended the measurement with the command “stop” when the test duration expired [10]. For the ST, the number of times that a subject could step on and off a standard step block in 15 s was assessed for the study (supporting) limb.

Timed up and go test

The TUG is an objective, reliable, and simple clinical measure for assessing functional mobility and balance ability [17]. Subjects were instructed to start with their back against the chair, stand up, walk 3 m as fast as possible at a safe pace, turn in either direction, walk back to the chair, and sit down. A chair of standard height (47 cm) with armrests was used for all tests. The stopwatch was started when the subject got up from the chair and was stopped when the subject sat back down on the chair after walking the set distance. Participants were instructed to walk as fast as possible along the test line.

Ten-meter walk test

The 10MWT is a performance measure used to assess walking speed and functional mobility over a short distance [18]. The 10MWT requires 5 m of acceleration space and 5 m of deceleration space, with the inner 10-m zone being the distance over which gait is timed. Before the test, patients were asked to walk as fast as possible. The stopwatch was started as soon as the patient’s leg passed over the starting line and was stopped when the patient’s leg passed over the 10-m mark.

Statistical analysis

All data were analyzed using IBM SPSS Statistics (Version 23.0) software. The Kolmogorov–Smirnov test was used to assess whether the data were normally distributed. Shrout and Fleiss Type 2,1 intraclass correlation coefficients, ICC (2,1), were used to calculate test–retest reliability between the two trials of the ST. A coefficient value < 0.5 indicates poor reliability, a value between 0.51 and 0.75 indicates moderate reliability, and a value > 0.75 indicates excellent reliability [19]. The standard error of measurement (SEM) was calculated to assess the accuracy of the measurement method. The minimal detectable change at the 95% confidence level (MDC95) was calculated according to the following formula: MDC95 = SEM × 1.96 × √2. Pearson’s correlations were calculated to determine the concurrent validity between the ST and the TUG and 10MWT. A correlation coefficient between 0 and 0.49 was considered unacceptable, between 0.50 and 0.69 moderate, between 0.70 and 0.79 high, and between 0.80 and 1.00 excellent [20]. A value of p < 0.05 was considered statistically significant in all analyses.
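
The reliability statistics described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study. The function names (`icc_2_1`, `sem_from_icc`, `mdc95`) and the demo data are hypothetical, and the SEM formula SEM = SD × √(1 − ICC) is an assumption, since the text does not state which SEM formula was applied.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def icc_2_1(scores):
    """Shrout and Fleiss ICC(2,1): two-way random effects, single measure.

    `scores` is a list of rows, one row per subject and one column per trial
    (for example, ST step counts at time 1 and time 2).
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of trials per subject
    values = [float(v) for row in scores for v in row]
    grand = sum(values) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between trials
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_err = ss_total - ss_rows - ss_cols                   # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def sem_from_icc(sd, icc):
    """Assumed SEM formula: SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """MDC95 = SEM * 1.96 * sqrt(2)."""
    return sem * Z_95 * math.sqrt(2.0)


if __name__ == "__main__":
    # Hypothetical step counts (time 1, time 2) for six subjects; not study data.
    demo = [[14, 15], [10, 10], [17, 16], [12, 13], [9, 10], [15, 15]]
    print("ICC(2,1):", round(icc_2_1(demo), 3))
    print("MDC95 for SEM = 0.76:", round(mdc95(0.76), 2))
```

The √2 factor in `mdc95` reflects that a change score combines the measurement error of two assessments (before and after).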

Results

All measurements were completed safely, and no adverse effects were observed during tests. Fifty-six patients (42 female/14 male) with TKA were included for reliability and validity analysis. The demographic and clinical characteristics of the patients are shown in Table 1.

Table 1 Characteristics of the patients

The ST showed excellent test–retest reliability in patients with TKA (ICC = 0.90). The SEM and MDC95 for the ST were 0.76 and 2.11, respectively. A significant moderate association was found between the ST and the TUG (r = −0.69, p < 0.05) and between the ST and the 10MWT (r = −0.67, p < 0.05). The reliability and concurrent validity results of the ST are shown in Table 2.

Table 2 The test–retest reliability and concurrent validity of the Step Test in patients with TKA
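
As a quick consistency check, the reported MDC95 can be reproduced from the reported SEM, taking 1.96 as the two-sided 95% quantile of the standard normal distribution:

```latex
\mathrm{MDC}_{95} = \mathrm{SEM} \times 1.96 \times \sqrt{2}
                  = 0.76 \times 1.96 \times 1.414
                  \approx 2.11
```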

Discussion

This is the first study to investigate the test–retest reliability and concurrent validity of the ST, and to determine its SEM and MDC95, in patients with TKA. The findings showed that the ST is a valid and reliable method for assessing balance ability and locomotor function in patients with TKA. These findings are similar to those of previous studies, which reported ICCs between 0.81 and 0.92 for the ST in populations with stroke [10], hip osteoarthritis [13], and surgical fixation after hip fracture [14]. According to Shrout and Fleiss, an ICC above 0.75 is considered to indicate excellent reliability [19]. As in previous studies, an ICC above 0.75 was obtained in this study.

Many alternative measures, such as laboratory-based, patient-reported, and clinical-based balance tests, are available for measuring balance ability in clinical settings [6, 21]. Some studies have suggested that laboratory-based balance tests such as posturography and stabilography are more sensitive and specific in detecting slight changes in postural control [22, 23]. However, they are not clinically practical because the equipment is large, expensive, and not portable, and most clinical settings do not have access to laboratory-based balance measures [6, 7]. Scores on patient-reported balance measures can be substantially influenced by individuals’ perception of balance performance, memory, and candor [7, 21]. Therefore, the assessment of balance ability and/or disability in clinical settings has mostly relied on clinical-based balance measures.

Clinical-based balance tests objectively quantify standardized tasks of balance-related function resembling daily living activities, as opposed to relying on individuals’ perception of balance performance [6, 8, 21]. In clinical-based balance measures, the assessment is primarily of the subject’s ability or inability to complete a task, judged against direct, standardized, objective, and predetermined criteria. Alternatively, these tests record a time, count, or distance, thus measuring the degree of difficulty in a metric fashion [8, 21]. Current evidence suggests that these measures, which objectively evaluate specific aspects of balance-related function, have greater reliability than patient-reported measures and are likely to be more sensitive to changes over time [6, 21]. These measures can be used to assess the impact of a specific intervention or an overall treatment program. Thus, they have particular importance in patients with TKA because rehabilitation goals focus on improving balance ability, locomotion, and functional mobility [6, 24, 25]. Consistent with previous suggestions, the results of this study support the view that the ST, as a clinical-based test, could be a sensitive and clinically useful method for dynamic balance evaluation in clinical settings. Clinicians can use the ST to measure dynamic balance ability in patients with TKA.

The ST reveals balance ability by evaluating the step-taking performance of the individual [10]. In addition to balance ability, step-taking performance requires adequate functional status, proprioception, and muscle strength [10, 12]. Therefore, evaluating the step-taking performance of patients with TKA may also reflect their functional, proprioceptive, and muscle strength levels. In addition, falls may occur during mobility in TKA patients because of decreased balance and functional performance and, especially, inadequate muscle strength [5]. The ST may therefore also give an indication of the risk of falling, which is associated with disability and morbidity in TKA patients. Thus, by showing that the ST is a reliable and valid measurement tool in TKA patients, our study makes a significant contribution to the literature.

Sufficient postural control, muscle strength, and gait ability enable individuals to perform daily life activities at a desirable level [24]. However, altered joint movement and moment because of persistent pain, fear of loading, and muscle weakness reduce the ability of TKA patients to control the displacement of the center of gravity, particularly in the early post-operative phase [26, 27]. Consequently, TKA patients have reduced gait velocity and stride length [27]. In addition, previous studies using 3D motion analysis have concluded that impaired gait ability in TKA patients results from an impaired movement pattern of the step motion. They suggested that an assessment of stepping function should be considered to clarify and determine impairments related to body balance and gait ability [27, 28]. From this viewpoint, many systems need to be evaluated to understand what is wrong with a subject’s gait ability and postural control following TKA surgery.

Additionally, to assess balance ability comprehensively, complex locomotor tasks mimicking real-life activity should be evaluated. Representative components of the ST resemble basic locomotor tasks in real-life activity, such as gait and weight transfer between the legs [10, 11]. The ST therefore mimics real-life tasks; namely, it can indicate whether an individual has sufficient dynamic postural balance and body weight shifting strategy in standing as well as during gait, all of which depend on a complex interaction of physiological mechanisms. These features of the ST may enable an empirical yet more realistic assessment of the biomechanics of dynamic balance ability and locomotor function [10,11,12]. In line with previous studies of the ST in different populations, the results of the current study show that the ST is applicable and repeatable in patients with TKA as an advanced tool to estimate their dynamic balance ability and locomotor function in situations closer to real life.

In a previous study assessing the validity of the ST in a population with chronic stroke, measurement tools such as the functional reach test, gait speed, and stride length were used [10]. The existing literature reports that the TUG is one of the most frequently used clinical-based balance measures in orthopedic, neurological, geriatric, and general rehabilitation practice [7, 29,30,31]. Therefore, we used the TUG as our concurrent “gold standard” measurement to determine validity, since it is used not only as a performance-based measure in TKA patients but also widely as a tool to assess balance ability in clinical practice [17, 29]. Moreover, we additionally used the 10MWT, which measures functional mobility, as another comparative standard of concurrent validity because of the high association between balance ability and functional mobility [18, 32]. In the previous study assessing the validity of the ST in chronic stroke, the correlation of the ST with other balance and functional mobility tests varied between moderate and excellent (from r = 0.68 to r = 0.83) [10]. Similarly, the results of the current study showed a moderate correlation between the ST and the TUG (r = −0.69) and between the ST and the 10MWT (r = −0.67). The moderate correlation between these tests may relate to the fact that they measure tasks requiring similar physiological mechanisms to maintain postural stability and perform locomotor function. The significant moderate correlations of the ST with the TUG and 10MWT also suggest that the ST could be an indicator of functional mobility in TKA patients. Therefore, the ST could be added to the other functional performance tests available to clinicians as a supplemental test for the practical assessment of TKA patients.

Relative reliability (ICC) alone is not sufficient for a measure to be useful in clinical practice. Absolute reliability (SEM and MDC95) is also important when investigating the effect of interventions on postural balance and locomotor function in patients with TKA. This study evaluated both the relative and absolute reliability of the ST. The MDC is a statistical estimate of the smallest amount of change that can be detected by a measure and that corresponds to a noticeable change in ability. Changes in scores exceeding the MDC are, by definition, beyond measurement error. Clinicians can use changes above the MDC to represent a real clinical change in the rehabilitation process [33]. In the current study, the MDC95 for the ST was 2.11, and both the SEM and MDC95 values were low. The ST can therefore be performed in a clinical setting with a small measurement error. Clinicians can be confident that changes greater than 2.11 repetitions represent a “real” clinical change in dynamic balance and locomotor function between measurement sessions in patients with TKA.
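
To illustrate how the MDC95 might be applied to an individual patient, the following minimal Python sketch compares a change in ST score with the threshold of 2.11 reported in this study. The function names are hypothetical, and treating only changes strictly greater than the MDC95 as exceeding it is an assumption.

```python
import math

MDC95_ST = 2.11  # MDC95 for the Step Test reported in this study (steps in 15 s)


def exceeds_mdc(baseline_steps, followup_steps, mdc=MDC95_ST):
    """Return True if the absolute change in step count is greater than the MDC."""
    return abs(followup_steps - baseline_steps) > mdc


def smallest_whole_step_change(mdc=MDC95_ST):
    """Smallest integer change in step count that is strictly greater than the MDC."""
    return math.floor(mdc) + 1


if __name__ == "__main__":
    print(exceeds_mdc(12, 14))           # False: a change of 2 steps is below 2.11
    print(exceeds_mdc(12, 15))           # True: a change of 3 steps is above 2.11
    print(smallest_whole_step_change())  # 3
```

Because the ST score is a whole number of steps, a threshold of 2.11 means in practice that a change of at least 3 steps is needed to exceed the MDC95.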

The primary limitation of this study is that all recruited patients were from the same university-based orthopedic surgery clinic, and a single surgeon performed all of the surgeries. This may limit the generalizability of these findings to other clinical settings.

Conclusion

The results of the current study showed that the ST is a reliable and valid measurement tool that can be used to assess dynamic balance ability and locomotor function in patients with TKA. The advantages of the ST are that it is simple and time-efficient and requires inexpensive equipment and minimal staff; therefore, it could be used to assess balance and locomotor function in almost any clinical environment as part of the routine medical examination. The overall low MDC value also indicates that clinicians can use the ST in clinical settings to quantify small, meaningful changes in balance and locomotor function related to interventions in patients with TKA.