Background

A pressing issue in the current healthcare system is the growing burden of chronic disease and multimorbidity associated with the world’s aging population [1, 2]. An increasing number of older adults require home care or housing options that support additional needs, including retirement homes, assisted living, or long-term care facilities [1]. Maintaining functional ability in later adulthood is a key public health priority, and the promotion of physical activity (PA) is a central strategy for healthy aging initiatives [3]. Regular participation in PA has been shown to improve physical function, reduce impairments, promote independent living, and improve quality of life in older adults [4]. Physical activity can assist in maintaining cardiovascular, metabolic, and cognitive function, all of which reduce the risk of multimorbidity [5,6,7].

The World Health Organization (WHO) defines PA as “any bodily movement produced by skeletal muscles that requires energy expenditure” [8]. A growing body of evidence has demonstrated the importance of overall activity levels, including lighter intensity activities [9]. In addition to recommendations for moderate to vigorous activities, PA guidelines encourage shifting time from sitting activities to light intensity activities, including standing [8, 10]. Because older adults tend to favour lighter intensity activities (e.g., walking, gardening), clinicians and researchers must have tools to accurately assess and monitor the full spectrum of physical activities in this population.

Direct measures of PA (e.g., pedometers, accelerometers, and the gold standard of the doubly labelled water method) [11] can capture the full spectrum of activities. However, these measures can be more expensive, rely on equipment availability, and place a greater burden on participants [5]. Alternatively, self-report measures can be a low-cost, feasible tool for assessing and monitoring activity levels [12]. While not all questionnaires capture the same breadth of activities, the Physical Activity Scale for the Elderly (PASE) has been recommended for use in older adults for its inclusion of lighter intensity activities [5]. The PASE was designed to consider a greater number of activity domains more representative of the typical activities undertaken by older adults (e.g., gardening and household tasks) [13]. The questionnaire was developed for older adults (≥ 65 years), takes approximately 10 min to complete (10 questions), and asks participants to recall their activity over the last 7 days [13, 14]. Activity types include sitting, walking, sport/recreation, exercise, occupational, and household [13]. A total score for PA can be calculated using these answers and the predetermined weights associated with each activity [13]. The PASE has been described as a suitable PA outcome measure for older adults who have multiple chronic conditions and is recommended for measuring total PA in older adults based on evidence for its reliability and validity compared to other questionnaires [12].

To date, there has not been a comprehensive review of the populations and settings in which the PASE has been used. Rather, the literature on the PASE has focused on comparing the psychometric properties of multiple self-report measures of PA for specific populations. For example, Sattler et al. (2020) explored PA measures in healthy older adults and Garnett et al. (2019) in community-dwelling older adults with multiple chronic conditions. As part of their syntheses of all self-report PA measures, both included a summary of the PASE, based on ten and seven studies, respectively [5, 12]. As both of these reviews recommend the use of the PASE, a more thorough exploration of the PASE with broader criteria is warranted. Further, the extent of the literature on its psychometric properties has not been thoroughly investigated. Therefore, the purpose of this scoping review was to map the nature and extent of the literature on the PASE in older populations (mean age ≥ 60 years) and to consolidate knowledge about the characteristics of studies using the PASE as an outcome measure, including available data on its psychometric properties. Our research questions were as follows:

  1. To what extent has the PASE been used in older populations (e.g., number of studies, PASE administration, outcome operationalization from the PASE)?

  2. What are the characteristics of studies that have used the PASE as an outcome measure (e.g., locations, sample characteristics, study designs)?

  3. What is the nature and extent of the literature on the psychometric properties of the PASE in older populations (e.g., reliability, validity, cultural translation)?

Methods

The JBI guidelines for scoping reviews were followed in addition to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines (checklist available in Additional file 1 Table A1) [15, 16]. This review protocol was registered with Open Science Framework (https://doi.org/10.17605/OSF.IO/7BVHX).

Search strategy

A broad search strategy was created with the assistance of a research librarian at the Health Sciences Library at McMaster University using the following key terms: “Physical Activity Scale for the Elderly”, “PASE”, “physical activity profile”, and “older”. Unique search strategies were developed for the following electronic databases: MEDLINE (Ovid), Embase (Ovid), Allied and Complementary Medicine Database (AMED; Ovid), Emcare (Ovid), CINAHL (EBSCO), Ageline (EBSCO). Databases were searched from inception to January 25th, 2023. Backward citation searching was performed in Web of Science (Clarivate) for the original PASE article by Washburn and colleagues [13]. The complete search strategy for all databases is available in Additional file 1 Table A2. Reference lists of relevant systematic reviews, meta-analyses, and scoping reviews were screened and hand searched for additional articles.

Inclusion/exclusion criteria

To be included in this review studies must have populations consisting of older adults with a mean age greater than or equal to 60 years in line with the United Nations definition of older adults [17]. No restrictions were placed on sex, race or cultural background.

The overarching concept for this scoping review was the PASE; this included the original version and translated versions. Therefore, to be included studies must have incorporated PA in their aims and present results from the administration of the PASE. This criterion was further refined to specify that PASE must be included as a primary or secondary outcome (i.e., not just a covariate). The outcomes of interest to this review were the characteristics of the studies (e.g., cross-sectional vs prospective) and populations the PASE was used in (e.g., country, clinical populations, sex), mean total scores of the PASE, how the PASE was used (e.g., to look at relationships with PA, to determine intervention efficacy), as well as psychometric properties that have been evaluated.

Studies from any geographic location were included. After initial full-text screening, the inclusion criteria were further refined to reduce heterogeneity among included studies and to keep the project feasible given the large number of results. The setting was restricted to designated community-dwelling populations, which reflects the original context the PASE was designed in [13].

Studies were excluded if they were not written in English or if they were conference abstracts, presentations, systematic reviews, meta-analyses, scoping reviews, evidence maps, rapid reviews, literature reviews, narrative reviews, or critical reviews. Reviews were flagged and screened for additional citations.

Study selection

Results from the comprehensive literature search were organized in Endnote 20 (Clarivate, Philadelphia, USA) and uploaded to Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia) for screening. Duplicate records were removed using both programs prior to screening, and any remaining duplicates were removed by hand. Prior to each phase of screening, the reviewer team conducted pilot screening to improve agreement. For title and abstract screening and full-text eligibility, two independent reviewers (drawn from NB, LL, JL, IV, SH, and CD) assessed each record against the predetermined eligibility criteria. Due to the volume of full-text screening, authors were not contacted for further details; where information for a given eligibility criterion was not reported or was unclear, the paper was excluded. Any disagreements during the abstract or full-text review process were resolved by consensus or, when necessary, by arbitration by a third reviewer.

Data extraction and analysis

Data was extracted from the studies verbatim by two or more independent reviewers (NB, LL, JL, IV, SH, and CD). Modifications to the initial data extraction table made during the piloting process included the removal of details not necessary for a scoping review (e.g., funding sources, conflicts of interest) or for the aims of this study (e.g., setting, recruitment methods). Additionally, separate columns were added to distinguish values calculated or extrapolated by reviewers from those reported by authors (e.g., mean PASE scores, income classification). The following descriptive data was extracted: study details (geographical location, outcome measures, study design), population description (number of participants, mean age, sex, clinical population), PASE version and administration method, how the PASE was reported (e.g., mean vs categorical, subcategories vs full questionnaire), and psychometric properties reported.

Data was summarized in a descriptive manner through counts and percentages in tabular presentation. Weighted means and variances were calculated for total PASE scores across identified subgroups (sex, age, and clinical populations) where appropriate using the ‘metamean’ package in RStudio Team (R version 4.2.2, 2020, PBC, Boston, MA). In studies that reported only subgroup mean total PASE score or age, the authors combined the subgroup data using methods recommended in the Cochrane handbook [18]. Where possible, studies that provided median scores were converted to mean scores using the methodology developed by Wan et al. [19]. Studies that did not provide sufficient information for either transformation were omitted from some review syntheses. Studies were grouped by income based on the World Bank ratings from 2023 [20].
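The two transformations described above can be sketched in a few lines of Python. This is an illustration only, not the code used in the review: the function names and example values are hypothetical, the subgroup formula is the one given in the Cochrane handbook for combining two groups, the quartile-based mean is the estimator from Wan et al. for studies reporting a median and quartiles, and the SD step uses the common large-sample approximation that the interquartile range spans roughly 1.35 SDs. The review does not state which Wan et al. scenario applied to each study.

```python
import math


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two subgroups into a single (n, mean, SD) triple."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Within-group sums of squares plus the between-group contribution.
    within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    between = (n1 * n2 / n) * (m1 - m2) ** 2
    sd = math.sqrt((within + between) / (n - 1))
    return n, mean, sd


def mean_sd_from_quartiles(q1, median, q3):
    """Approximate a mean and SD from a median and quartiles."""
    mean = (q1 + median + q3) / 3   # quartile-based mean estimate
    sd = (q3 - q1) / 1.35           # large-sample normal approximation
    return mean, sd


# Hypothetical study reporting PASE scores separately for men and women.
n, mean, sd = combine_groups(50, 140.0, 60.0, 50, 120.0, 60.0)
print(n, round(mean, 2), round(sd, 2))

# Hypothetical study reporting a median of 110 with quartiles 80 and 155.
print(mean_sd_from_quartiles(80.0, 110.0, 155.0))
```

When a study reports more than two subgroups, the first function can be applied repeatedly, merging one subgroup at a time.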

Results

The database search produced 6,372 articles and hand searching citations produced another 24 articles, for a total of 6,396. A total of 886 studies were assessed for full-text eligibility and 536 articles were found to use the PASE in older adults, 232 of which met all inclusion criteria (i.e., community-dwelling and the PASE was a primary/secondary outcome). An overview of the screening process can be found in the PRISMA-ScR flow diagram (Fig. 1), and reasons for full-text study exclusions can be found in Additional file 2 Table A2.

Fig. 1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram. Searches run on January 25th, 2023

Summary of PASE use

The PASE was used for a variety of reasons with the most common being to explore the effect of PA on a health outcome(s) (e.g., an association of PA type with all-cause mortality) [21], and the relationship of a determinant with PA (e.g., the association between walkability and walking time) [22]. Almost all the studies used the PASE in its entirety (96.55%). The studies that used partial aspects of PASE often focused on leisure time PA (e.g., walking, sport/recreation, and exercise) [23,24,25], and two studies focused on walking exclusively [26, 27]. Most authors (93.97%) used total PASE scores (i.e., used provided activity weights). Nineteen studies (8.19%) included a measure other than central tendency for total PASE score (e.g., dichotomous, tertiles, quartiles, quintiles). Eleven studies did not use the PASE score but instead operationalized PA using different pieces of the PASE (e.g., frequency, time). Details on the use of PASE are summarized in Table 1.
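Because most results in this section are reported as percentages of the 232 included studies, it can help to convert between percentages and study counts when reading them. The sketch below is a reader's aid under that assumption; the helper names are hypothetical and the denominator of 232 is taken from the screening results above.

```python
TOTAL_STUDIES = 232  # included studies, as reported in the screening results


def count_from_percent(percent, total=TOTAL_STUDIES):
    """Recover the number of studies behind a reported percentage."""
    return round(percent / 100 * total)


def percent_from_count(count, total=TOTAL_STUDIES):
    """Express a study count as a percentage, rounded to two decimals."""
    return round(count / total * 100, 2)


print(count_from_percent(96.55))  # studies using the full PASE
print(count_from_percent(93.97))  # studies using total PASE scores
print(percent_from_count(19))     # studies reporting a non-central-tendency measure
```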

Table 1 PASE characteristics of included studies

The PASE was primarily delivered in person (69.40%) followed by mail (11.21%); 45 studies were either unclear or did not report how the PASE was administered to participants. A total of 15 different versions or languages were reported; the most common version used was English (63.79%). Six studies did not report which version or language the PASE was delivered in. In many cases, only the seminal paper on the English version by Washburn et al. was cited, with no further clarification of the version or modifications made, including several papers from countries where the primary language is not English (n = 29).

Study characteristics

A summary of the study characteristics can be found in Table 2. The PASE was used throughout the world; however, nearly half of the studies were completed in North America (49.57%). In total, studies from 35 different countries were included in this review; the most common countries outside of North America included China (n = 20), Australia (n = 19), and Japan (n = 10). Most studies were conducted in high-income countries (86.64%). The mean age for studies ranged from 60.00 [28] to 84.40 [29], with the largest share (43.10%) falling between 70–74 years old. Most studies included mixed sex samples (n = 184), with only 17 looking at females and 22 at males. Fifty-three studies looked specifically at 21 clinical conditions (e.g., musculoskeletal, cognitive impairment, and cardiorespiratory). The 232 studies of community-dwelling older adults included 171,206 participants, with individual study samples ranging from 8 [30] to 14,881 [31]. Studies were published between 1993 [13] and 2023 [32,33,34,35,36]. The PASE was used in a variety of study designs, including cross-sectional studies (60.78%), prospective studies (25.43%), and experimental studies (12.07%).

Table 2 Characteristics of included studies

Where possible, weighted means for different subgroups were summarised based on age, sex, and clinical population. Studies with a mean age between 60–64 years had the highest mean PASE scores (159.53 (95% CI 146.58, 172.49)) and studies with a mean age over 80 years old had the lowest mean PASE scores (67.17 (95% CI 51.95, 82.39)) (Fig. 2; forest plots available in Additional file 1 Figure B1-B5). Figure 3 presents forest plots for the combined total mean PASE score for female only studies (n = 13) 123.99 (95% CI 108.09, 139.88) [26, 37,38,39,40,41,42,43,44,45,46,47,48,49,50,51] and male only studies (n = 14) 136.27 (95% CI 122.46, 150.09) [52,53,54,55,56,57,58,59,60,61,62,63,64,65]. Based on data availability, pooled means were created for the following clinical populations: cancer (n = 2) [28, 66], chronic obstructive pulmonary disease (COPD) (n = 2) [67, 68], cognitive impairment (n = 6) [33, 69,70,71,72,73], diabetes (n = 3) [74,75,76], osteoarthritis (n = 12) [46, 77,78,79,80,81,82,83,84,85,86,87], and Parkinson’s disease (PD) (n = 10) [88,89,90,91,92,93,94,95,96,97]. Forest plots for clinical populations are available in Additional file 1 Figure B6.
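To make the pooled means and confidence intervals above easier to interpret, the following is a minimal sketch of an inverse-variance weighted mean with a 95% CI. It assumes a common-effect model for simplicity; the review does not state which model was used, and a random-effects model would generally give wider intervals. The function name and the study values are hypothetical.

```python
import math


def pooled_mean(studies):
    """Pool study means by inverse-variance weighting.

    studies: iterable of (n, mean, sd) tuples, one per study.
    Returns (estimate, ci_lower, ci_upper) for a 95% confidence interval.
    """
    studies = list(studies)
    # Each study's weight is 1 / SE^2, where SE^2 = sd^2 / n.
    weights = [n / sd ** 2 for n, _, sd in studies]
    total_weight = sum(weights)
    estimate = sum(w * m for w, (_, m, _) in zip(weights, studies)) / total_weight
    se = math.sqrt(1 / total_weight)
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se


# Two hypothetical studies of equal size and spread.
est, lower, upper = pooled_mean([(100, 120.0, 50.0), (100, 140.0, 50.0)])
print(round(est, 2), round(lower, 2), round(upper, 2))
```

Larger and less variable studies receive more weight, so a pooled mean can sit far from the simple average of the study means.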

Fig. 2 Pooled Mean PASE scores by age groups

Fig. 3 Pooled Mean PASE score forest plots for females (1) and males (2)

Psychometric properties of the PASE

Several papers evaluated the psychometric properties of the original PASE (n = 5), along with a number of validation studies (n = 14) for different translations and clinical populations (acute coronary event [98], COPD [68], cancer [28, 66], and Parkinson’s disease [89]). In total, ten different versions of the PASE were assessed for reliability and/or validity in community-dwelling older adults: English (n = 5) [13, 14, 66, 98, 99], Malay (n = 2) [100, 101], Arabic (n = 1) [102], Chinese (n = 2) [68, 103], Italian (n = 1) [104], Norwegian (n = 1) [105], Persian (n = 1) [106], Polish (n = 1) [107], Taiwanese (n = 2) [28, 108], and Turkish (n = 1) [109]; two studies did not report the version [65, 89].

Sixteen studies reported on the test-retest reliability of the PASE, with time frames ranging from 3 days [99, 105] to 3–7 weeks [13] and sample sizes ranging from 18 [98] to 349 [100] (details available in Table 3). Of the 12 studies reporting ICCs for the total score across all versions of the PASE, only two fell below the acceptable limits proposed in the COSMIN guidelines [110] (Malay version 0.49 (95% CI 0.37, 0.59) [100] and version not reported (NR) 0.66 (95% CI 0.46–0.71) [89]), and the majority of values were 0.90 and above (n = 8). Internal consistency was examined in seven versions, and all Cronbach’s alpha values fell within an acceptable range (from 0.70, the lowest subcategory value in the Arabic and Persian versions, to 0.82 for the Italian total score). Only four studies examined measurement error. Alqarni et al. reported the minimal detectable change (MDC95) for PASE subcategories (9.0–23.6) of the Arabic version [102], and MDC95 values for total scores were provided for the Chinese version (19.21) [68] and the Polish version (38.39) [107]. Two studies also included standard errors of measurement for the PASE total score (Chinese version 6.93 [68] and NR version 30.00 [89]).
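The measurement error statistics above are linked by standard formulas: the standard error of measurement (SEM) is commonly derived from the sample SD and the ICC, and MDC95 is 1.96 × √2 × SEM. The sketch below assumes these conventional definitions, which the included studies may or may not have applied in exactly this form; the function names are hypothetical. Under this assumption, the SEM of 6.93 reported for the Chinese version corresponds to the MDC95 of 19.21 reported for the same version.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a test-retest ICC."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2) * sem


# SEM reported for the Chinese version of the PASE.
print(round(mdc95(6.93), 2))  # 19.21

# Hypothetical example: SD of 50 PASE points and an ICC of 0.91.
print(round(sem_from_icc(50.0, 0.91), 2))
```

A change in an individual's total PASE score smaller than the MDC95 cannot be distinguished from measurement error with 95% confidence.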

Table 3 Reported reliability of the PASE

Four studies stated they were exploring criterion validity; however, each used a different measurement tool as their gold standard for PA: pedometer (walking steps and energy expenditure) [68], Actigraph (activity counts/minutes) [28], International Physical Activity Questionnaire (IPAQ) [109], and doubly labelled water (total energy expenditure, energy expenditure/resting metabolic rate) with VO2max [65]. The PASE was significantly correlated with all but the doubly labelled water outcomes and VO2max [65]. During the development of the PASE, Washburn et al. assessed the three aspects of content validity by asking participants (n = 36) about the appropriateness of the items, the completeness (i.e., comprehensiveness), and the comprehensibility; results were used to inform the final version of the PASE [13]. Three additional studies assessed and reported acceptable content validity for the PASE across three different clinical groups: acute coronary events (English) [98], COPD (Chinese) [68], and cancer survivors (Taiwanese) [28]. Only the English version had responsiveness and minimal important difference (MID) reported, and this was in a sample of individuals with lung cancer [66].

Construct validity was the most commonly assessed form of validity, predominantly exploring convergent validity (details available in Table 4). Physical function performance measures and self-report questionnaires were commonly cited, and relationships ranged from fair to moderate, including the Timed Up and Go (r = -0.45 to r = -0.69) [102, 106, 107], Berg Balance (r = 0.20 to r = 0.82) [14, 104, 107], and the physical function section of the Short Form-36 (r = 0.53 to r = 0.58) [68, 103, 109]. Muscle strength was another common construct with poor to fair correlations; specifically, grip strength (r = 0.29 to r = 0.43) [13, 68, 100, 102, 103], and lower limb strength (r = 0.18 to r = 0.37) [13, 66, 103]. There were also several self-report measures examining general health (r = -0.12 to r = 0.44) [13, 68, 98, 100, 103] and activities of daily living (r = 0.10 to r = 0.78) [100, 106]. The PASE demonstrated moderate correlations with the IPAQ (r = 0.65 to r = 0.74) [68, 107, 109]. Five studies compared the PASE to a direct measure of PA (e.g., accelerometers and pedometers), including outcomes such as steps per day (r = 0.39 to r = 0.61) [66, 68, 101] and activity counts (r = 0.43 to r = 0.64) with fair to moderate correlations [28, 99, 101]. Only Bonnefoy et al. used the gold standard doubly labelled water, and they found no significant correlations [65].

Table 4 Reported validity of the PASE

Discussion

To the authors’ knowledge, this is the first review to provide a comprehensive summary of the use of the PASE in community-dwelling older adults. The PASE has been used extensively to measure PA in older adults (536 primary papers before restricting to community-dwelling settings); however, it was mainly used in high-income countries with cross-sectional research designs. While strong evidence was summarized supporting test-retest reliability and construct validity, there was a paucity of evidence examining the PASE’s responsiveness, important change thresholds, and predictive validity. In addition, we have presented pooled means for different age groups and clinical populations to provide preliminary reference values to improve interpretations of total scores.

The PASE has been used extensively in community-dwelling older adults; 171,206 participants from 35 countries were included in this review. The PASE was developed in the United States, which is reflected in the greater uptake in North America and high-income countries [13]. However, the PASE has been used across five continents and in some middle-income countries (n = 8). Importantly, we have seen the validation of several translated versions including Arabic, Chinese, Malay, Persian, and Turkish. Furthermore, the PASE has also been applied to clinical and disease-specific populations, and the high content validity in these populations is promising. The use of the PASE in persons with chronic conditions has been supported previously based on feasibility and psychometric properties [5]. While the literature summarized is extensive, more is available outside of community-dwelling populations not captured in this review, including further translations and validations (e.g., Nigerian translation) [111]. Our results show the PASE is a commonly used measure of PA worldwide but has been used sparingly in countries outside of North America and in lower-income countries. Decreasing the heterogeneity in how PA is measured is imperative for meaningful comparisons and data harmonization. Large numbers of self-report PA measures already exist, and previous work has recommended using these rather than creating more [12, 112]. This review shows the large uptake of the PASE, which makes it a suitable choice for research on older adults. However, it is important that psychometric properties are assessed for the population of interest.

Psychometric properties are essential for outcome measures to ensure their validity, reliability, and interpretability. Of the 232 studies included, 19 studies aimed to examine the psychometric properties of the PASE in community-dwelling older adults. According to COSMIN, most studies (12/15) found acceptable test-retest reliability for the PASE total score. However, there was variability between studies that was more pronounced between subcategories of activity types (e.g., ICC subcategory values 0.56–0.94 [99], 0.76–0.93 [106], 0.78–0.99 [107]), which may suggest more variation week to week in single activity types and less for overall activity. There was a paucity of evidence on measurement error, including MDC and standard error of measurement. Of the four studies reporting in this area, one only provided values for activity subcategories, not total score [102], and two were for clinical populations (COPD and Parkinson’s disease). The varying populations may explain the large difference in values (e.g., MDC95 = 38.4 (general) vs MDC95 = 19.2(COPD); and SEM = 30 (PD) vs SEM = 6.9 (COPD)). Establishing the minimal detectable change values is essential for ensuring differences are real and not from measurement error. In addition, none of the included studies reported minimal clinically important differences (MCID), another important parameter for interpreting change in score. This paucity of evidence must be addressed across versions in community-dwelling older adults to support further use and interpretability of the PASE.

The PASE was validated in community-dwelling older adults in ten different languages. Content validity is regarded as the most important psychometric measurement property [113]; however, other than the seminal paper, only three included studies reported on the relevance, comprehensiveness, and comprehensibility [28, 68, 98]. As presented in these papers, PA appears to be influenced by cultural/societal norms, highlighting the importance and continued need to verify the content validity of PA questionnaires when validating in new populations [28]. Fair to moderate relationships between the PASE and performance-based measures of physical function and mobility, strength, and health outcomes were regularly reported for construct validity. Four studies stated they examined criterion validity, which compares the PASE score to the gold standard of the same construct. However, only one study used doubly labelled water, the commonly regarded gold standard for PA, and it did not find a significant relationship [65]. The remaining three studies found moderate correlations (> 0.60) using more accessible measures of PA: a pedometer [68], accelerometer [28], and a questionnaire [109]. The PASE-Polish [107] demonstrated the highest correlation at 0.74 with the IPAQ, which has been validated in 12 different countries, including low-income countries and rural samples [114]. The IPAQ was the only PA questionnaire reported, and only two other studies compared direct measures of PA (i.e., accelerometers). The correlations with the IPAQ ranged from 0.65–0.74, whereas correlations with direct measures tended to be lower and more variable (e.g., activity counts 0.43–0.64, walking steps 0.39–0.61). Several PASE versions did not contain a measure of PA in their validity analysis (n = 3). Further studies investigating these metrics using a wider variety of measures of PA (e.g., different questionnaires and more direct measures) are needed to clarify these relationships.

No studies reported on longitudinal validity, demonstrating a great need for studies to evaluate the PASE’s predictive validity for important health outcomes in community-dwelling populations across the globe. Despite almost 20 studies using the PASE to measure change in PA, responsiveness, which is critical for ensuring the PASE can accurately reflect change over time, has not been reported in any of the included studies. Therefore, research is needed to explore the predictive validity and responsiveness of the PASE to inform whether the PASE can be used to predict important health outcomes (e.g., future falls, hospitalization) and change in PA (e.g., over time or through intervention) for community-dwelling older adults.

A noteworthy finding of this review was the reporting of pooled means by age, sex, and clinical population. Pooled PASE scores decreased with increasing age groups from < 65 (159.53 (95% CI 146.58, 172.49)) to the 80 years and older group (67.17 (95% CI 51.95, 82.39)). In general, this is consistent with the literature where levels of PA progressively decrease with age for both men and women [115, 116]. Some clinical populations appeared to have greater decreases in PA than others (e.g., cognitive impairment 91.11 (95% CI 72.77, 109.40) vs osteoarthritis 129.53 (95% CI 110.40, 148.65)). Clinical groups also appear to be important in addition to age for PA level; for example, the studies in the cognitive impairment group were mostly younger age groups (5/6 less than 80 years old), but the mean PASE score was closer to the two oldest age groups. The provided reference data for age, sex, and clinical population can be used to improve the interpretability of PASE scores among similar populations of community-dwelling older adults. However, future research creating normative values for the PASE could further improve interpretability and uptake of this questionnaire.

There are several limitations of this scoping review that should be acknowledged. First, several eligibility criteria were placed on this review, resulting in papers related to the PASE being excluded. Specifically, studies were restricted to the English language, age of 60 years or older, and community-dwelling settings. These decisions were made for feasibility and to reflect the original PASE; however, they have limited our understanding of how far the PASE has been applied in different populations. With the robust search strategy reviewed by a health research librarian, we are confident that the summarized evidence accurately reflects the current literature for community-dwelling older adults. A second limitation is that only published studies were included, and grey literature was not considered, which opens the possibility that new and emerging research regarding the PASE was missed. Finally, several studies used data from the same databases/studies, resulting in the same or overlapping samples; we did not extract the information necessary to tease this apart. Therefore, pooled means will be biased toward samples included more than once. In addition, pooled mean PASE scores in clinical populations with only two studies should be interpreted cautiously due to limited sample sizes.

This review has identified areas for future consideration, including further expanding the validation of the PASE to middle- and low-income countries. A systematic review focused on the psychometric properties of the PASE with no setting restrictions may provide a valuable resource for researchers. Future investigations are needed on psychometric properties of the PASE, including thresholds of important change, responsiveness, and predictive validity for all versions of the PASE, as well as data on psychometric properties in specific clinical populations.

Conclusion

This review found that the PASE is a widely used PA measure among community-dwelling older adults, with evidence supporting its test-retest reliability and construct validity. The widespread use of a questionnaire increases the ability for data harmonization across studies and improves the ability to compare between studies. Further research is warranted to investigate the PASE’s ability to detect meaningful change (i.e., MDC, MCID) along with predictive validity and responsiveness. Pooled mean total PASE scores reported in this review can provide preliminary reference values for different age groups and clinical populations to help improve the interpretability of PASE scores until normative values are established.