Background

Lung cancer is associated with the highest cancer-related mortality worldwide [1]. Survivors of lung cancer experience high disease burden, physical hardship and morbidity [2, 3]. There are guidelines regarding recommended levels of physical activity for people with cancer [4]. However, evidence suggests that most people with lung cancer do not meet the guidelines even at time of diagnosis [5, 6]. After a diagnosis of lung cancer, there is often a reduction in physical activity, and functional decline is rapid [5, 6]. Higher levels of physical activity are seen in people with fewer symptoms and better health-related quality of life (HRQoL) [7, 8]. Increasingly, physical activity is being recognised as an important outcome in lung cancer, and interventions which aim to increase it are urgently needed.

There are a variety of methods available to measure physical activity in the clinical setting. These include objective movement sensors, such as accelerometers or pedometers, which detect and record movement [9], and subjective patient-reported measures, such as questionnaires, which ask the participant to recall their engagement in physical activity [9]. Questionnaires are advantageous because they are quick, inexpensive, associated with minimal participant burden and feasible to implement on a large scale. There are a number of different questionnaires available for use; however, there is no consensus as to which is the best questionnaire for the lung cancer population. When selecting a questionnaire, consideration of its clinimetric properties is vital. These include validity, that is, the ability of the questionnaire to measure what it is intended to measure: how well the data relate to data obtained from the gold standard instrument (criterion-concurrent validity), how well the questionnaire obtains data, as hypothesised, when compared to an instrument measuring a similar construct (convergent or construct validity) and how well the data predict an outcome (predictive validity/utility) [10, 11]. The clinical applicability of the questionnaire is also important. This includes whether there is a floor or ceiling effect, the ability of the questionnaire to detect meaningful change over time (responsiveness) [11], and whether there is a known minimal important difference (MID) (the smallest change in the questionnaire that patients and clinicians consider to be clinically relevant) [12]. Whilst a questionnaire may have excellent validity and clinical applicability for use with one patient group, these findings cannot always be extrapolated to other patient groups [11].

The Physical Activity Scale for the Elderly (PASE) is a questionnaire which asks the participant to recall their level of physical activity over the previous 7 days [13]. This questionnaire was originally developed in a population of healthy community-dwelling older adults in the USA [13] and since then has been widely used across many patient groups [14], including people with cancer. The PASE has well-established clinimetric properties in the healthy elderly population: moderate criterion validity with doubly labelled water analysis (r = 0.68) [15], fair convergent validity with accelerometry (r = 0.49) [16] and excellent test-retest reliability (r = 0.84) [13]. However, there is no research on the clinimetric properties of this questionnaire in the lung cancer population. Therefore, the aim of this study was to assess the clinimetric properties of the PASE when used in a population with lung cancer, specifically (1) the convergent validity with movement sensors, (2) the construct validity with measures of physical function, functional exercise capacity and muscle strength, (3) the predictive utility and (4) the clinical applicability (floor and ceiling effects, responsiveness and MID). It was hypothesised that the PASE would have fair positive convergent validity with movement sensors (correlation 0.25–0.49) and fair positive construct validity with measures of physical function, functional exercise capacity, and muscle strength (correlations 0.25–0.49). The consensus-based standards for the selection of health status measurement instruments (COSMIN) guidelines [17] and the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines [18] were followed in reporting this study.

Methods

Study design, setting, and participants

This was a nested observational study within two multicenter prospective cohort studies [6, 19]. Participants were recruited from three tertiary hospitals in Melbourne, Australia, between December 2008 and October 2012. All sites had institutional ethical approval, and participants provided written informed consent. Participants were included if they were English-speaking adults with newly diagnosed non-small cell lung cancer and had not commenced any form of cancer treatment. Exclusion criteria included a physician-rated Eastern Cooperative Oncology Group Performance Status (ECOG-PS) of three (capable of only limited self-care, confined to bed or chair more than 50 % of waking hours) or four (completely disabled, cannot carry out any self-care, totally confined to bed or chair) [20]. Participants were included in this substudy if they completed the PASE at least once.

Procedure

Baseline measures of the PASE were conducted at time of diagnosis, and thereafter, the PASE was administered at 2-, 4- and 6-month follow-up. As this study was nested within two larger trials, not all participants completed the follow-up measures (Electronic Supplementary Material 1). During the time period between follow-up testing, standard care at the institutions was followed and not modified. Participants were not offered formal education regarding physical activity or exercise, and referral to rehabilitation was not part of usual care at the centres. In addition to completing the PASE, participants completed a range of additional tests at the same time point allowing comparisons of the PASE with these measures. Participants were stable and unchanged in the time between completing the PASE and additional measures.

PASE

The PASE is a 28-item questionnaire that asks the participant to recall their physical activity performed over the previous 7 days [13]. The PASE assesses physical activity performed during 12 typical daily activities which form three subgroups: leisure time activities (walking outside of the home or backyard, light sport, moderate sport, strenuous sport and muscle strengthening), household activities (light housework, heavy housework, home repairs, lawn work, gardening and caring for a dependent person) and occupational activities (paid or unpaid work other than work which mainly involved sitting). The five leisure time activities are recorded categorically according to frequency and duration of activity performed per week. The average daily hours of each type of leisure activity across the week is then estimated according to published calculations to give five individual leisure time activity scores [21]. The six household activities are recorded according to whether or not they occurred during the previous week. Occupational activities are recorded as total hours worked per week. Each of the 12 activity scores is multiplied by its published PASE weighting, which is based on the estimated metabolic equivalent of task (MET) for that type of activity [13]. The weighted scores from the three subgroups are summed to give the total PASE score. Higher scores represent higher levels of physical activity. The maximum score attainable is 400, and the average score for elderly individuals is 103 points [13].
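The weighted-sum step described above can be sketched as follows. This is only an illustration of the arithmetic, not the published scoring procedure: the function name, the activity keys and, in particular, the weight values are hypothetical placeholders and are not the published PASE weightings [13].

```python
from typing import Mapping


def pase_total(activity_scores: Mapping[str, float],
               weights: Mapping[str, float]) -> float:
    """Return the sum of weight * activity score over every weighted activity."""
    missing = set(weights) - set(activity_scores)
    if missing:
        raise ValueError(f"missing activity scores: {sorted(missing)}")
    return sum(weights[name] * activity_scores[name] for name in weights)


# HYPOTHETICAL weights and scores for three of the 12 activities,
# chosen only to show the calculation (not the published values).
example_weights = {"walking": 20.0, "light_housework": 25.0, "occupational": 21.0}
example_scores = {"walking": 0.5, "light_housework": 1.0, "occupational": 0.0}

total = pase_total(example_scores, example_weights)
print(total)
```

Passing the weights in as an argument, rather than hard-coding them, keeps the sketch independent of any particular published weighting table.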

Additional outcome measures

Participants were asked to estimate the average duration and frequency of moderate intensity physical activity performed over the previous week, as defined by the American College of Sports Medicine [4]. Level of physical activity was classified categorically as sufficient (≥150 min/week), insufficient (1–149 min/week), or sedentary (0 min/week) according to physical activity guidelines for older adults [22] and individuals with cancer [4].
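A minimal sketch of this three-way classification is shown below. The function name is illustrative, and the handling of values between 0 and 1 min/week (treated here as insufficient) is an assumption, since the text specifies only whole-minute bands.

```python
def classify_activity(minutes_per_week: float) -> str:
    """Classify weekly moderate-intensity activity using the bands in the text."""
    if minutes_per_week < 0:
        raise ValueError("minutes_per_week must be non-negative")
    if minutes_per_week == 0:
        return "sedentary"       # 0 min/week
    if minutes_per_week < 150:
        return "insufficient"    # 1-149 min/week (assumed to include fractions below 1)
    return "sufficient"          # >= 150 min/week


for minutes in (0, 90, 150):
    print(minutes, classify_activity(minutes))
```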

Physical activity levels were measured objectively using a movement sensor (Sparkfun Electronics GPS-08725 Accelerometer, Colorado) [23]. This measure was only completed in one of the two trials, and thus movement sensor data were only available for a subset of the cohort (60 occasions) (Electronic Supplementary Material 1). The use of this particular device was opportunistic. The device was chosen as it had capacity to measure outdoor activity which was an outcome in one of the larger trials. This device has previously been used in a population with traumatic brain injury as well [24]. The movement sensor device contains captive beam elements which registered time-stamped acceleration of the body in three planes (vertical, medial-lateral and anterior-posterior). At each assessment time-point, participants were asked to wear the device around their waist during waking hours for five consecutive days, including at least one weekend day. A minimum of three ‘full days’ (defined by device turned ‘on’ for ≥8 hours/day) were required for participants’ data to be included [19]. The short time frame of three days was chosen to allow a preoperative measurement as often there is only a brief window between time of diagnosis and surgical intervention. Step data were analysed with computer software programs custom-designed for this study. Data were averaged across the number of full days that the device was worn.
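The inclusion rule for movement sensor data (at least three full days, each with the device on for ≥8 hours, averaged across the full days) could be sketched as below. This is not the custom software used in the study; the data layout (one pair of hours worn and step count per day) and all names are assumptions made for illustration.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple

MIN_HOURS_FULL_DAY = 8.0  # device 'on' for at least 8 hours
MIN_FULL_DAYS = 3         # minimum number of full days required


def mean_steps_per_day(days: Sequence[Tuple[float, int]]) -> Optional[float]:
    """days holds (hours_on, steps) per wear day.

    Returns mean steps across full days, or None when fewer than
    three full days are available (data excluded).
    """
    full_day_steps = [steps for hours_on, steps in days
                      if hours_on >= MIN_HOURS_FULL_DAY]
    if len(full_day_steps) < MIN_FULL_DAYS:
        return None
    return mean(full_day_steps)


# Hypothetical five-day wear period; the 6-hour day is not a full day.
week = [(10, 4000), (9, 5000), (6, 1000), (8, 6000), (12, 3000)]
print(mean_steps_per_day(week))
```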

Physical function was measured using the Eastern Cooperative Oncology Group Performance Status (ECOG-PS) rated by the participant [20]. Health-related quality of life was measured using the European Organization for the Research and Treatment of Cancer questionnaire and lung cancer module (EORTC-QLQ-C30 and LC13) [25]. The core questionnaire includes nine multi-item scales (five functional scales, three symptom scales and a global quality of life scale) and six single-item symptom scales [25]. Higher scores on functional domains and the global health status/quality of life scale represent better status, whereas lower scores on symptom domains and single items represent fewer symptoms [25]. Distress was measured using the Distress Thermometer. Functional exercise capacity was measured using the six-minute walk distance (6MWD) [26]. Quadriceps femoris muscle strength was measured with a Powertrack II Commander 1500 hand-held dynamometer (JTech Medical JT-AA104, USA). Demographic and medical data were recorded, and comorbidities were scored with the simplified Colinet comorbidity score [27], which is commonly used in cancer.

Sample size

Sample sizes of ≥50 participants are recommended for studies assessing clinimetric properties of questionnaires [28].

Statistical analyses

Data were analysed with SPSS Windows version 22.0 (SPSS, Chicago, IL, USA). Data were assessed for normality using the Kolmogorov-Smirnov statistic. Parametric data are presented as mean and standard deviation (SD), and nonparametric data are presented as median and interquartile range (IQR). To assess the validity of the PASE, Spearman’s rank correlation coefficient was used to assess the bivariate relationships between PASE scores and test outcomes (movement sensor steps/day, ECOG-PS, EORTC-QLQ-C30 physical function domain, 6MWD, handheld dynamometry quadriceps strength) on all available data [10]. Coefficients were interpreted as little (0.00–0.25), fair (0.25–0.50), moderate (0.50–0.75) and large association (0.75–1.0) [11]. The Kruskal-Wallis test was conducted to compare PASE scores between physical activity categories (sedentary, insufficient, or sufficient) according to the physical activity guidelines. Alpha was set at 0.05 for all analyses.
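For readers who want to reproduce the correlation step outside SPSS, the following is a minimal standard-library sketch of Spearman's rank correlation (Pearson correlation of average ranks) together with the interpretation bands above. It is not the authors' analysis code. Because the published bands share their boundary values, the sketch assumes each boundary belongs to the higher band (so 0.50 is read as moderate, as in the Results); all function names are illustrative.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


def interpret(rho: float) -> str:
    """Map |rho| to the bands in the text (boundary handling is an assumption)."""
    size = abs(rho)
    if size < 0.25:
        return "little"
    if size < 0.50:
        return "fair"
    if size < 0.75:
        return "moderate"
    return "large"


print(interpret(spearman_rho([1, 2, 3, 4], [10, 20, 30, 40])))
```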

Predictive utility of the PASE was assessed using linear regression analyses to investigate the ability of the PASE (when measured at baseline/time of diagnosis) to predict future physical function or global quality of life status 6 months from diagnosis. Baseline PASE was the variable of interest and was included in all regression models. The outcomes of interest were self-reported physical function and global quality of life measured by the EORTC-QLQ-C30 domains at the 6-month assessment. Potential covariates were age, gender, cancer stage, comorbidities, smoking pack year history, distress, symptoms and 6MWD. Potential covariates with significant univariate correlation with the outcome of interest were included in the model if collinearity was not identified. Mann–Whitney U test was used to assess differences in baseline PASE scores according to participants’ vital status at 6-month follow-up.

Floor and ceiling effects of the PASE were determined using the percentage of occasions when participants scored the lowest score (zero) or highest score (400) possible. Change over time from baseline to 2-month follow-up in the PASE was assessed using the Wilcoxon signed rank test [11]. Responsiveness (baseline to 2-month follow-up) of the PASE was determined by calculation of the effect size defined as r = Z divided by the square root of sample size, as recommended for nonparametric data [11, 29]. Thresholds for interpretation of the change were small (≤0.2), moderate (0.5) and large (≥0.8) [29, 30]. The MID for the PASE was determined using distribution-based estimation with calculation of the standard error of the measurement (SEM) and Cohen’s effect size. The SEM formula was SEM = σ1√(1 − r), where σ1 was baseline SD of the PASE score, and r was test-retest reliability coefficient of the PASE. As there is no literature reporting the test-retest reliability of the PASE specifically in lung cancer, the reliability coefficient (r = 0.89) was obtained from a previous study including a heterogeneous group of patients with cancer [31]. In addition, a moderate effect size is considered a clinically important effect and was calculated using the formula 0.5 × SD of the change scores from baseline to 2 months [32, 33].
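The calculations in this paragraph are simple enough to sketch directly. The functions below are illustrative rather than the authors' code, and every input in the example is hypothetical. One assumption is made explicit: the effect size is returned as a magnitude (the absolute value of Z is used), on the basis that the Results report positive effect sizes alongside negative Z values.

```python
from math import sqrt
from statistics import stdev
from typing import Sequence


def percent_at_score(scores: Sequence[float], target: float) -> float:
    """Percentage of testing occasions scoring exactly `target` (floor or ceiling)."""
    return 100.0 * sum(1 for s in scores if s == target) / len(scores)


def effect_size_r(z: float, n: int) -> float:
    """r = Z / sqrt(N), reported as a magnitude (assumption: sign dropped)."""
    return abs(z) / sqrt(n)


def standard_error_of_measurement(baseline_sd: float, reliability: float) -> float:
    """SEM = baseline SD * sqrt(1 - test-retest reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return baseline_sd * sqrt(1.0 - reliability)


def half_sd_mid(change_scores: Sequence[float]) -> float:
    """Moderate effect size estimate: 0.5 * SD of the change scores."""
    return 0.5 * stdev(change_scores)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
print(percent_at_score([0, 0, 100, 400], 0))      # share of scores at the floor
print(effect_size_r(-2.4, 100))                   # magnitude of Z / sqrt(N)
print(standard_error_of_measurement(50.0, 0.75))  # 50 * sqrt(0.25)
print(half_sd_mid([0, 10, 20]))                   # 0.5 * sample SD
```

Note that `statistics.stdev` computes the sample standard deviation (n − 1 denominator); whether the study used the sample or population form is not stated in the text.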

Results

The PASE was administered to 69 patients on a total of 176 separate occasions. The characteristics of the cohort studied are reported in Table 1. The median [IQR] PASE score across the 176 testing occasions was 54.0 [27.9–111.0]. At the baseline assessment, the median [IQR] PASE score was 65.7 [38.5–116.2].

Table 1 Demographics of cohort (n = 69)

Validity

There was moderate convergent validity between the PASE and the movement sensor (steps/day): n = 60, rho = 0.50 [95 %CI 0.29–0.66], p < 0.005 (Fig. 1). The PASE was able to discriminate between participants’ level of physical activity according to the physical activity guidelines (sufficient, insufficient, or sedentary): χ² (2, n = 119) = 22.17, p < 0.005, with those participants who met the physical activity guidelines having significantly higher PASE scores than those engaged in insufficient or sedentary levels (Fig. 2).

Fig. 1 Correlation between the PASE and the movement sensor (tri-axial accelerometry)

Fig. 2 PASE scores categorised according to the physical activity guidelines

The PASE demonstrated fair to moderate construct validity with measures of physical function, functional exercise capacity, and quadriceps muscle strength. A positive moderate strength relationship existed between the PASE and the EORTC-QLQ-C30 physical function domain: n = 168, rho = 0.57 [95 %CI 0.46–0.66], p < 0.005. Relationships between the PASE and ECOG-PS (n = 176, rho = 0.36 [95 %CI 0.23–0.49], p < 0.005), 6MWD (n = 134, rho = 0.40 [95 %CI 0.23–0.55], p < 0.005), and handheld dynamometry quadriceps muscle strength (n = 94, rho = 0.37 [95 %CI 0.18–0.54], p < 0.005) were fair in strength.

Predictive utility

The PASE, when administered at time of diagnosis, was not able to predict mortality at 6 months. The PASE, at diagnosis, demonstrated predictive utility with physical function and global quality of life outcomes at 6-month follow-up. At diagnosis, PASE scores and levels of pain were significant factors in determining EORTC-QLQ-C30 physical function domain scores at 6 months (PASE: B coef = 0.35, p = 0.008). Similarly, at diagnosis, PASE scores and levels of dyspnoea were significant factors in determining EORTC-QLQ-C30 global quality of life scores at 6 months (PASE: B coef = 0.35, p = 0.023).

Clinical applicability

Scores on the PASE ranged from zero to 303 across the 176 testing occasions (Fig. 3). There was a small floor effect with 3 % (n = 6/176) scoring zero. No ceiling effect was observed with the highest score achieved (303) below the maximum possible of 400 (Fig. 3).

Fig. 3 Distribution of the PASE scores across all testing occasions

There was a significant change in the PASE (decline) from baseline to the 2-month follow-up (Z = −2.4, p = 0.018) and from baseline to 6-month follow-up (Z = −2.3, p = 0.023). From baseline to 2 months, the effect size of the PASE was 0.23; and from baseline to 6 months, the effect size was 0.24; both values represent a small responsiveness to change. Distribution-based estimation indicated an MID for the PASE of 17 points which was 4.2 % (17/400) of the score width based on the SEM. The MID was 25 points based on Cohen’s effect size.

Discussion

Physical activity is an important outcome in lung cancer, and our study has demonstrated that the PASE is a valid and clinically applicable choice of self-reported activity measure. Results demonstrate that the PASE has moderate convergent validity with objective movement sensors; this is an important and promising finding. In addition, we have shown that the PASE, when administered at diagnosis, is a predictive factor for physical function and quality of life 6 months later. Given the poor prognosis associated with lung cancer and rapid functional decline, this is also a significant finding.

The correlation with movement sensors (rho = 0.50) is within acceptable limits (≥0.50) recommended for correlations between physical activity questionnaires and movement sensors [28]. Objective measures of physical activity are considered to be superior to self-reported measures because patients often under-report or over-report their engagement in physical activity [34]. However, there are many reasons why a questionnaire may be preferred over objective measurement. Measuring physical activity with patient-worn devices is challenging. It requires patients to be compliant (turning the device on/off, remembering to wear it, charging the battery, and protecting the device from water), and consequently, it is often difficult to obtain complete datasets. A valid questionnaire is advantageous for the measurement of physical activity in large-scale research studies and clinical practice.

Our findings are in contrast to the findings of the only other similar study in the cancer population, where Liu and colleagues demonstrated the PASE to have poor convergent validity with movement sensors (accelerometry r = 0.16) [31]. The difference in findings may be due to differences in the cohorts studied: Liu and colleagues included a younger sample (mean age 50 ± 12 years), who predominantly had haematological cancer (68 %, lung cancer < 6 %) and who had received treatment in the year prior to testing (chemotherapy 100 %, surgery 5 %). Importantly, their cohort was also more physically active than our cohort (median [IQR] PASE scores 86 [49–161]) [31]. Given the PASE was originally created for use in an elderly population [13], it is possible that the PASE is not valid in younger and more active individuals with cancer. Further research is required to confirm whether the PASE is valid in young and highly active people with lung cancer.

There was no ceiling effect, and only a small floor effect was seen in the PASE. Floor and ceiling effects are of concern for longitudinal analyses as these limit the ability to detect deterioration or improvement, respectively [10]. The 3 % floor effect observed in the PASE is well within the acceptable range (below 15 %) [35]. The strong performance of the PASE on this aspect may be due to the fact that the PASE was developed for use in elderly people. This is in contrast to other questionnaires, such as the International Physical Activity Questionnaire, which is designed for use in adults up to 65 years of age [36]. The advantage of the PASE is that it includes questions regarding physical activity encountered with participation in household activities and has less focus on sporting activities, therefore representing more typical physical activities performed by the elderly population [13]. In addition, the PASE has a shorter recall period (7 days) compared with other questionnaires, which may reduce recall bias from short-term memory loss.

Of concern is the fact that the PASE only demonstrated a small responsiveness to change from time of diagnosis to 2-month follow-up. Whilst the PASE did show statistically significant change over time, the effect size was small. We speculate that this finding is due to the low responsiveness of the questionnaire rather than limited change in physical activity over the measurement period. Thirty-nine percent of the cohort received surgery between the measurement time points (Table 1), and prior research has shown that physical activity is markedly reduced after surgery (compared to preoperative) when measured using objective movement sensor devices [37]. This has implications for the design of randomised controlled trials: to adequately power studies using the PASE as the primary outcome measure, it may mean that large sample sizes are required.

We determined the MID in our study using distribution-based estimation; however, there is controversy within the literature as to which is the best method to determine the MID [38]. Distribution-based methods utilise statistical analyses to determine the MID using the degree of variability of the test scores. The disadvantage of this method is that it does not take into account whether the patient or clinician feels the change is clinically meaningful. An alternative approach is anchor-based estimation of the MID, which utilises a patient-related anchor, such as a global rating of change scale, to determine if the patient is clinically changed [38]. Anchor-based approaches are potentially more clinically relevant as the patient’s and/or clinician’s opinion is measured and utilised as an anchor in order to determine if the patient has actually changed. The disadvantage of this method is that there can be a large amount of individual variation amongst patients and it cannot account for the measurement error of the test [12]. Consistencies in anchor-based and distribution-based MIDs are commonly reported in the chronic respiratory disease literature [39]. Given we identified the MID to be between 17 and 25 points with distribution-based estimation (SEM and Cohen’s effect size, respectively), further research is required to confirm if this is replicated using anchor-based methods [38].

Lack of physical activity is a global pandemic and a rising concern for the community [40]. Physical activity is now considered to be an important outcome in the cancer population as well. There is preliminary evidence in colorectal and breast cancer that higher levels of physical activity are associated with improved survival [41–43]. Whilst there is no direct link between increased physical activity and survival in lung cancer, there is growing evidence regarding the efficacy of exercise training to improve functional exercise capacity [44–46]. The link between higher functional exercise capacity and improved survival in lung cancer has been established [47]; however, the question remains as to whether higher levels of physical activity are associated with improved survival in lung cancer. Lack of physical activity is common in lung cancer, and interventions which aim to increase physical activity across the disease continuum are urgently needed [8]. From the chronic respiratory disease literature, we understand that participation in an exercise program and improvements in functional exercise capacity do not necessarily translate to improvements in physical activity [48]. Physical activity has been rarely measured as an outcome in studies evaluating exercise training in lung cancer [49]: we recommend future research studies to include measures of physical activity.

Limitations

Participants who were non-English speaking or who had very poor performance status (ECOG 3 or 4) at time of diagnosis were excluded. Therefore, results should not be generalised to such individuals. This study was limited by lack of movement sensor data and PASE follow-up data. Movement sensor data were only collected on a proportion of participants (n = 60/176 testing occasions), and repeat measures of PASE at 2-month follow-up were only available for 52 of 69 participants. This may introduce selection bias, and results regarding the convergent validity between the PASE and movement sensors, and the responsiveness of the PASE, should be viewed with caution. Participants were required to wear the movement sensor device for at least 3 days, including a weekend day. This is in contrast to the PASE which measured physical activity over a 7-day period, including two weekend days. Given the large variability in physical activity, particularly between weekend and weekdays, the validity of the PASE against the movement sensor device may be underestimated in this study.

Conclusions

Level of physical activity is an important outcome in lung cancer. The PASE is valid and has high clinical applicability for use in a population with newly diagnosed lung cancer. There is a moderate relationship between the PASE and objective measurement of physical activity, no ceiling effect, a small floor effect and a small responsiveness to change over 2 and 6 months. The minimal important difference of the PASE in lung cancer is between 17 and 25 points. The PASE should be considered as an outcome measure for the assessment of self-reported physical activity in the lung cancer population.