Introduction

Basic activities of daily living (BADL) are an essential aspect of the health status of older persons as is demonstrated in the World Health Organization’s International Classification of Functioning, Disability and Health [1]. Assessing disability in BADL is important in both clinical practice and clinical research for elderly people [2].

Disability in BADL can be defined either as dependence or difficulty [3, 4]. Dependence is the degree of assistance offered by another person or by special equipment, such as a cane for ambulation or a tub bench for bathing. Difficulty is the degree of subjective difficulty in performing certain activities. Using data from cross-sectional and longitudinal analyses, Gill and colleagues [5] suggested that elderly who were BADL independent but had difficulty had functional profiles, physical performance scores, and rates of health-care utilization and death that were intermediate to those of elderly who were independent without difficulty and to persons who were dependent. Their finding implied that questions about both BADL difficulty and dependence can depict the continuum of disability and frailty more fully than either question alone [5].

Several BADL scales [6–8] that include both independence and difficulty have been reported. These scales were developed for and validated in elderly people living in Western countries [6–8]. To our knowledge, however, these scales are not used in Japan. To assess BADL disability in Japanese elderly people, it is important to use a suitable scale that reflects the Japanese lifestyle, including activities such as standing up from the floor. We developed a new instrument, the Functional Independence and Difficulty Scale (FIDS), that reflects the Japanese lifestyle and assesses both independence and difficulty in performing BADL [9]. In our previous study, psychometric evaluation of FIDS using data from a random sample of 593 community-dwelling Japanese elderly people demonstrated acceptable item validity, internal consistency, and external validity [9].

Many different methods to assess BADL have been developed and described [10]. Because there are many existing measures, it is important to know whether the newly developed FIDS offers added benefit over these measures. Thus, we sought to determine whether the FIDS was more useful than existing measures.

To be useful, a measure should be valid, i.e., measure what it is supposed to measure, and show neither a ceiling nor a floor effect [11]. The ceiling/floor effect is the fraction of participants with the highest/lowest possible score. If ceiling/floor effects are present, participants with the lowest or highest possible score cannot be distinguished from one another, which limits responsiveness because changes cannot be measured in these participants [12].

The purpose of the present study was to examine the ceiling/floor effect and validity of the newly developed FIDS and to compare these measurement properties between FIDS and the Barthel Index (BI) [13] in community-dwelling elderly people in Japan. The BI is a representative existing measure of BADL that has been recommended, together with the Functional Independence Measure [14], as a measure of activity [15] and has been proposed as the standard index for clinical and research purposes [16].

We initially hypothesized that FIDS, which can capture information about not only BADL dependency but also difficulty and can provide information about the continuum of BADL disability more fully than the BI, would show a relatively small ceiling effect compared to the BI. To test this hypothesis, we calculated the ceiling/floor effect and compared it between FIDS and the BI. Our second assumption was that FIDS and the BI would differ in their relationship to a health-related quality of life (HRQOL) scale. Measures of HRQOL include several aspects of perceived physical and mental health [17] and have become popular because they have positive associations with physical activity [18] and physical performance [19]. Moreover, HRQOL is often a predictor of clinically meaningful adverse health outcomes, such as acute care readmission [20], nursing home placement [21], and mortality [22]. Both FIDS and HRQOL capture subjective aspects of daily living, whereas the BI captures an objective aspect of daily living, namely whether help is required from another person. Because of this difference between FIDS and the BI, we further hypothesized that FIDS would show a more positive relationship to an HRQOL scale than would the BI.

Methods

Study design and participant recruitment

This was a cross-sectional study. Participants were recruited by convenience sampling.

Participants

We included two separate participant groups of community-dwelling elderly people living in Japan: healthy elderly people not using Japanese long-term care insurance (LTCI) services (HE group), and frail elderly people using Japanese LTCI services (FE group). In Japan, a mandatory LTCI system was implemented in 2000. Municipalities are responsible for certification of long-term care and support needs based on the evaluation results by the Certification Committee for Long-term Care Need [23].

The HE group comprised elderly people from Tsumagoi district, Gunma Prefecture. Tsumagoi district is a rural area located about 150 km north of Tokyo. The population of the Tsumagoi district was estimated at 10,183 citizens of whom 28.5 % were aged ≥65 years (1 October, 2010) [24]. Participant recruitment in the HE group was conducted at a public hall in Tsumagoi district. The participants were individuals who voluntarily joined a specific health examination provided to those insured by public medical insurance. At the public hall, individuals not using Japanese LTCI services were invited to participate in this study by the researchers.

The FE group comprised elderly people from Kawasaki City, Kanagawa Prefecture. Kawasaki City is an urban area located 20 km south of Tokyo. The population of Kawasaki City was estimated at 1,425,512 citizens of whom 16.6 % were aged ≥65 years (1 October, 2010) [25]. Participants in the FE group were recruited from a home-visit nursing station located in Kawasaki City. The participants were individuals who were registered as users of LTCI services including home-visit nursing care or rehabilitation provided from the home-visit nursing station. The researchers visited each participant’s home and invited them to participate in this study.

For both groups, common inclusion/exclusion criteria and data collection methods were applied. The inclusion criteria were living in the community, age ≥65 years, and being able to respond to the questionnaire in Japanese. The exclusion criterion was blindness. We recruited 346 participants for this research (HE group, n = 252; FE group, n = 94). Subjects who did not meet these criteria or who did not wish to participate in the research procedures were excluded.

Data collection

Data collection was carried out from April to June 2014. Each subject answered the questionnaire by themselves, and the answers were checked by an examiner. When self-administration of the questionnaire was difficult for any reason, the examiner interviewed the subject.

Background variables

Background variables included age, sex, height, weight, care levels of LTCI [23], Tokyo Metropolitan Institute of Gerontology Index of Competence (TMIG-IC) [26], and degree of independence of daily living. The national uniform level of long-term care need [23] was based on the insured’s mental and physical conditions and on family doctors’ letters of opinion. The criteria for certification of long-term care need level in Japan are as follows: Requiring support 1, Requiring support 2, Requiring long-term care 1, Requiring long-term care 2, Requiring long-term care 3, Requiring long-term care 4, and Requiring long-term care 5. Benefits according to these long-term care levels are lowest for Requiring support 1 and highest for Requiring long-term care 5. People classified into the two support levels are able to perform BADL independently and are considered to need some support to prevent an increase in eligibility level due to physical or mental impairments, whereas people classified into the long-term care levels need assistance to perform BADL [27]. Typically, elderly people in care levels 1–2 can walk independently, whereas those in care levels 3–5 have difficulty in walking alone [27]. Some studies [28, 29] distinguished between care levels 1–2 and care levels 3–5, with the former called “moderately disabled” and the latter called “severely disabled.”

The TMIG-IC [26] was designed to evaluate capacity higher than BADL and consists of three subscales: instrumental self-maintenance, intellectual activity, and social role. The total score is the sum of all 13 items, with a higher score (maximum 13 points) indicating higher competence of the elderly. Koyano and colleagues developed normative data based on a probability sample of community-dwelling Japanese elderly people aged 65 and over, finding that the mean total score of the TMIG-IC was 11 points [26].

Criteria assessing the degree of independence of daily living were as follows: independent, going outside independently; house-bound, needing help to go outside but, in general, living independently within their house; and bed-bound, needing help for all BADL.

Functional Independence and Difficulty Scale

FIDS is a measure that assesses BADL performance in terms of both independence and difficulty. FIDS comprises 14 items of BADL, as detailed in our previous report [9]. The total score for FIDS ranges from 14 to 42, with higher scores representing better function.

Barthel Index

The BI includes ten items of BADL, as detailed in the original report [13]. This scale ranges from 0 to 100, for which higher scores are associated with a greater degree of independence.

Health-related quality of life

The Japanese version of the Medical Outcomes Study Short Form 8 Health Survey (SF-8) [30], a short eight-item form based on the original Short-Form 36 Health Survey [31], was applied as the measure of HRQOL. The SF-8 is a self-reporting form that subjectively assesses health based on physical functioning, role limitations due to physical and emotional health problems, freedom from physical pain, general health perception, vitality, social functioning, and mental health. From these eight dimensions, a physical component summary (PCS) score and mental component summary (MCS) score are calculated following the scoring algorithm outlined in the SF-8 manual [30]. Higher scores represent higher self-reported subjective health.

Data analysis

Continuous variables are reported as mean ± standard deviation (SD), and categorical variables are reported as numbers and percentages. We compared the characteristics of the participants between the HE group and FE group using the two-sample t test, unpaired Mann–Whitney test, and Chi-square test.

For each group, the ceiling/floor effect was quantified as the percentage of subjects with the maximum/minimum score. Next, in each group, participants were assigned to three subgroups according to their functional state, and the ceiling/floor effect was calculated for each subgroup. In the HE group, participants were assigned according to their TMIG-IC score: 13 points (maximum score), 12–11 points (less than maximum score but normative mean score [26] or more), and 10–0 points (less than normative mean score [26]). In the FE group, participants were assigned according to their LTCI level: support level (requiring support level 1–2), requiring long-term care with moderate disability (requiring long-term care level 1–2), and requiring long-term care with severe disability (requiring long-term care level 3–5) [27, 28].
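The subgroup calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the analysis code used in the study (the analyses were run in SPSS): the function names and the example records are hypothetical, while the FIDS score range (14 to 42) and the TMIG-IC bands are taken from the text.

```python
from collections import defaultdict


def ceiling_floor(scores, min_score, max_score):
    """Return (ceiling %, floor %) for a list of total scale scores."""
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    ceiling = 100.0 * sum(s == max_score for s in scores) / n
    floor = 100.0 * sum(s == min_score for s in scores) / n
    return ceiling, floor


def tmig_band(tmig_score):
    """Assign a TMIG-IC total score (0-13) to one of the three bands in the text."""
    if tmig_score == 13:
        return "13"
    if tmig_score >= 11:
        return "11-12"
    return "0-10"


def ceiling_floor_by_band(records, min_score, max_score):
    """records: iterable of (tmig_score, scale_score) pairs."""
    grouped = defaultdict(list)
    for tmig_score, scale_score in records:
        grouped[tmig_band(tmig_score)].append(scale_score)
    return {band: ceiling_floor(scores, min_score, max_score)
            for band, scores in grouped.items()}


# Hypothetical (TMIG-IC, FIDS) pairs, invented purely for illustration.
example_records = [(13, 42), (13, 40), (12, 42), (11, 38), (9, 30), (10, 14)]
by_band = ceiling_floor_by_band(example_records, min_score=14, max_score=42)
for band, (ceiling, floor) in sorted(by_band.items()):
    print(f"TMIG-IC {band}: ceiling {ceiling:.1f} %, floor {floor:.1f} %")
```

The same function applies to the BI by passing `min_score=0` and `max_score=100`, and to the FE group by replacing the banding helper with one based on LTCI level.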

The validity of FIDS was assessed using the Spearman correlation coefficient and partial correlations after controlling for subject age and sex. First, the relationships between the total FIDS score and background variables, the BI, and the SF-8 were examined. Next, the relationships between FIDS/BI and the SF-8 were examined. We interpreted the associations as negligible correlation (0.00–0.30), low correlation (0.30–0.50), moderate correlation (0.50–0.70), high correlation (0.70–0.90), and very high correlation (0.90–1.00) [32].
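One common way to obtain a rank-based partial correlation controlling for two covariates is to compute Spearman coefficients and then apply the recursive partial-correlation formula. The sketch below takes that approach as an assumption; the text does not state how the partial correlations were computed in SPSS, and all function names here are hypothetical. The labelling function follows the ranges given above, with boundary values (e.g. exactly 0.30) assigned to the higher category, which is also an assumption because the published ranges share their end points.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; fails if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman coefficient as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_first_order(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing one covariate z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, age, sex):
    """Rank correlation of x and y controlling for age and sex (recursive formula)."""
    xy_a = partial_first_order(spearman(x, y), spearman(x, age), spearman(y, age))
    xs_a = partial_first_order(spearman(x, sex), spearman(x, age), spearman(sex, age))
    ys_a = partial_first_order(spearman(y, sex), spearman(y, age), spearman(sex, age))
    return partial_first_order(xy_a, xs_a, ys_a)


def interpret(r):
    """Label the strength of a coefficient using the ranges cited in the text."""
    a = abs(r)
    if a < 0.30:
        return "negligible"
    if a < 0.50:
        return "low"
    if a < 0.70:
        return "moderate"
    if a < 0.90:
        return "high"
    return "very high"
```

For example, `interpret(0.81)` returns `"high"` and `interpret(0.25)` returns `"negligible"`. In practice a statistics package would also supply the P value, which this sketch omits.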

A two-tailed P value of <0.05 was considered significant. Statistical analyses were carried out using IBM SPSS Statistics (Version 22, IBM Japan Ltd.).

Results

Subjects and characteristics

Of the 346 participants, we excluded 3 subjects (HE group, n = 1; FE group, n = 2) from the analysis because they were blind, 5 subjects (HE group, n = 4; FE group, n = 1) because they refused to participate in this research, and 24 subjects (HE group, n = 22; FE group, n = 2) because of missing values. Therefore, the final sample for analysis comprised 314 subjects (HE group, n = 225; FE group, n = 89). There was a discrepancy in sample size between the two groups: the number of participants in the FE group was smaller than that in the HE group.
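The participant flow reported above can be checked with simple arithmetic. The snippet below only re-derives the final sample sizes from the recruitment and exclusion counts given in the text; the variable and function names are illustrative.

```python
# Recruitment and exclusion counts as reported in the text.
recruited = {"HE": 252, "FE": 94}
excluded = {
    "blind": {"HE": 1, "FE": 2},
    "refused to participate": {"HE": 4, "FE": 1},
    "missing values": {"HE": 22, "FE": 2},
}


def final_sample(recruited, excluded):
    """Subtract every exclusion count from the recruited count, per group."""
    return {
        group: n - sum(reason[group] for reason in excluded.values())
        for group, n in recruited.items()
    }


analysed = final_sample(recruited, excluded)
print(analysed, sum(analysed.values()))
# {'HE': 225, 'FE': 89} 314
```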

Background characteristics and other variables of the two groups are summarized in Table 1. Mean age in the HE group (126 women and 99 men) was 76.0 years. All participants satisfied the criteria of independence degree as “Independent.” Mean age in the FE group (49 women and 40 men) was 80.5 years. Among these 89 subjects, 31 satisfied the criteria of independence degree as “Independent,” 50 as “House-bound,” and 8 as “Bed-bound.”

Table 1 Characteristics of participants and results of comparison test between two study groups

Comparison analysis showed significant differences between the two groups in age, body mass index, independence degree of daily living, registered long-term care, and scores of FIDS, BI, TMIG-IC, SF-8 PCS, and SF-8 MCS. This analysis indicated that participants in the FE group were significantly older and frailer than those in the HE group.

Ceiling and floor effect

No floor effect was observed in either group on either FIDS or the BI. Among the 225 participants in the HE group, 139 (61.8 %) had a maximum score on FIDS, whereas more participants (204, 90.7 %) had a maximum score on the BI. Among the 89 participants in the FE group, 1 (1.1 %) had a maximum score on FIDS, whereas more participants (23, 25.8 %) had a maximum score on the BI.
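The percentages in this paragraph follow directly from the reported counts. As a quick check, the snippet below recomputes them; the dictionary layout and the helper name are illustrative only.

```python
# (number with maximum score, group size) as reported in the text.
counts = {
    ("HE", "FIDS"): (139, 225),
    ("HE", "BI"): (204, 225),
    ("FE", "FIDS"): (1, 89),
    ("FE", "BI"): (23, 89),
}


def percentage(k, n):
    """Percentage of n represented by k, rounded to one decimal place."""
    return round(100.0 * k / n, 1)


ceiling_pct = {key: percentage(k, n) for key, (k, n) in counts.items()}
for (group, scale), pct in ceiling_pct.items():
    print(f"{group} group, {scale}: {pct} % at maximum score")
```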

Ceiling effects by functional state are shown in Table 2. In both groups, relative to those with the lowest functional state, the percentages of participants with a maximum score rose as the level of functional state increased. In the HE group, 67.2 % of those who received a full score on the TMIG-IC had a maximum score on FIDS, whereas more subjects (94.9 %) had a maximum score on the BI. In the FE group, only 5.3 % of those assigned to LTCI levels as requiring support 1–2 had a maximum score on FIDS, whereas more subjects (63.2 %) had a maximum score on the BI.

Table 2 Percentage of participants with a ceiling effect on the BI and FIDS by functional state

Relationship between FIDS and other variables by groups

The relationships between FIDS and background variables, the BI, and the SF-8 are shown in Table 3. In the HE group, the FIDS score showed a positive but negligible partial correlation with the BI (r = 0.25) and TMIG-IC (r = 0.14). In the FE group, the FIDS score showed a positive and high partial correlation with the BI (r = 0.81) and TMIG-IC (r = 0.75). The SF-8 subscale scores, except for “Mental health” in the HE group and “Bodily pain” and “Social functioning” in the FE group, showed positive partial correlation with the FIDS score; however, the strength of correlation was negligible to low. Although the SF-8 PCS score showed significant positive partial correlation with the FIDS score, the SF-8 MCS score showed no significant partial correlation with the FIDS score.

Table 3 Spearman rank correlation coefficients and partial correlations after controlling for age and sex between FIDS and other variables

Relationship between SF-8 and FIDS/BI in the HE and FE groups

Table 4 shows the relationship between FIDS/BI and SF-8 in the HE group. For all scales except for the “Mental Health” and MCS, there was a significant partial correlation between the SF-8 score and FIDS score. However, the strength of the correlation was negligible to low. In contrast, the BI score showed a significant partial correlation only with the “role emotional” score.

Table 4 Spearman rank correlation coefficients and partial correlations after controlling for age and sex between FIDS, BI, and SF-8 in the HE group (n = 225)

Table 5 shows the relationship between FIDS/BI and SF-8 in the FE group. The FIDS score showed positive partial correlation with the SF-8 score except for “Bodily pain,” “Social functioning,” and MCS score. However, the strength of the correlation was negligible to low. In contrast, the BI score showed a significant partial correlation only with the “Role physical,” “General Health,” and PCS score.

Table 5 Spearman rank correlation coefficients and partial correlations after controlling for age and sex between FIDS, BI, and SF-8 in the FE group (n = 89)

Discussion

To determine the usefulness of the newly developed FIDS, we examined its measurement properties and compared them with those of the BI. Our two expectations were that FIDS would show a relatively small ceiling effect compared to the BI and that FIDS would show a more positive relationship to the HRQOL indicator, the SF-8, than would the BI. Our data supported the first hypothesis: FIDS showed a relatively small ceiling effect compared to the BI in healthy and frail elderly people. However, the resulting data only partially supported our second hypothesis. FIDS showed significant correlation with a broader range of SF-8 subscales than did the BI, but the strength of the correlation was not necessarily high.

The first point of discussion is the ceiling/floor effect of FIDS and the BI. The floor effect was 0 % and did not differ between the two measures. However, the ceiling effect was different. For both the healthy and frail elderly groups, the ceiling effect of FIDS was smaller than that of the BI. A possible explanation for this difference was that the BI could not fully capture BADL disability. Although both scales capture BADL disability, the FIDS captures BADL disability in dependency and difficulty, whereas the BI captures BADL disability in dependency only.

Floor or ceiling effects are considered to be present if more than 15 % [12] to 20 % [33] of respondents achieve the lowest or highest possible score, respectively. According to these criteria, our findings suggested that the BI might be affected by a ceiling effect even for frail elderly people using LTCI services, and particularly for those “requiring support.” In contrast, FIDS may be a suitable tool for assessing BADL disability in frail elderly people using LTCI services without the influence of a ceiling effect.
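As a small worked check, the 15 % and 20 % thresholds can be applied to the ceiling percentages reported in the Results. The function name and data layout below are illustrative, and treating the threshold as a strict "more than" comparison is an assumption based on the wording of the criterion.

```python
def has_ceiling_effect(pct_at_max, threshold):
    """True if the share of respondents at the maximum score exceeds the threshold."""
    return pct_at_max > threshold


# Ceiling percentages reported in the Results section.
reported = {
    ("HE group", "FIDS"): 61.8,
    ("HE group", "BI"): 90.7,
    ("FE group", "FIDS"): 1.1,
    ("FE group", "BI"): 25.8,
    ("FE group, requiring support 1-2", "FIDS"): 5.3,
    ("FE group, requiring support 1-2", "BI"): 63.2,
}

flags = {
    key: {thr: has_ceiling_effect(pct, thr) for thr in (15.0, 20.0)}
    for key, pct in reported.items()
}
for (group, scale), by_threshold in flags.items():
    print(group, scale, by_threshold)
```

With these figures, the two thresholds lead to the same classification for every entry, so the choice between 15 % and 20 % does not change the conclusion here.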

In contrast, for the healthy elderly not using LTCI services, both FIDS and the BI may be affected by a ceiling effect according to these criteria. Of importance, however, is that about 32 % of the 137 elderly subjects who received a maximum score on the TMIG-IC and 43 % of the 63 elderly subjects whose TMIG-IC score was below the maximum but at or above the normative mean did not reach the maximum score on FIDS, whereas the corresponding rates on the BI were about 5 and 11 %, respectively. These findings suggested that despite their high-level function, some healthy elderly people not using LTCI services might have subjective difficulty in performing certain BADL. This observation is consistent with that of a previous study, which reported that a scale defining BADL on the basis of difficulty produced BADL disability estimates 1.2–5 times greater than those estimated from a scale defining BADL on the basis of dependence [34]. For healthy elderly subjects, FIDS, which can capture BADL disability on the basis of both dependency and difficulty, might be a useful tool for clinicians and investigators to assess BADL disability that cannot be detected by the BI.

The second concern was the validity of FIDS and whether FIDS has any benefit over BI in terms of its validity as a measure of HRQOL. In both the HE and FE groups, a significant positive correlation between the FIDS score and TMIG-IC/BI score was obtained. These results provide evidence that FIDS is intrinsically equivalent to the BI and TMIG-IC for both healthy elderly and frail elderly people who use LTCI services. The strength of correlation was higher in the FE group than in the HE group. A possible explanation is that compared to the FE group, the distribution of BI scores and TMIG-IC scores in the HE group was concentrated on the higher score, and the correlation coefficient became smaller.

Although FIDS showed a significant positive correlation with a broader range of SF-8 subscales than did the BI, the strength of correlation between FIDS and the SF-8 was weak to negligible. These results suggested that our hypothesis, that FIDS would show a more positive relationship to the HRQOL scale than would the BI and would therefore have a benefit over the BI in terms of validity against the SF-8, was not fully supported. The strength of correlation between FIDS and the SF-8 would be influenced by sample size and distribution of the scores. Because we used convenience sampling as our recruitment method, the sample size was insufficient, and the distribution of the scores might have been influenced by sampling bias. Thus, the hypothesis that FIDS has a benefit over the BI in terms of validity against the SF-8 must remain tentative until further research using a sufficiently large probability sample is conducted.

Limitations

First, as mentioned above, additional studies using a sufficiently large probability sample are needed to better examine the relationships between FIDS and the SF-8. Moreover, assessment of the ceiling/floor effect by functional state also requires a sufficiently large probability sample. Terwee and colleagues [12] recommend a sample size of at least 50 participants to assess ceiling/floor effects. When we assessed the ceiling/floor effect according to functional state, participants were assigned to three subgroups, and the number of participants per subgroup dropped below 50, especially in the FE group. Second, we excluded participants who were blind. Therefore, our results may not be generalizable to elderly subjects with visual impairment. Third, because our subjects were Japanese elderly people, our results might not be generalizable to elderly people living in other countries.

In conclusion, we compared the measurement properties of the newly developed FIDS with the BI. Compared with the BI, FIDS showed a relatively smaller ceiling effect. Although the strength of correlation was not high, compared with the BI, FIDS showed significant partial correlation with the broader aspects of the SF-8 subscales. With additional studies, FIDS might be shown to offer added benefit over the BI and to be a more useful assessment tool to evaluate BADL in elderly people in Japan.