Introduction

Psychometric scales are critical tools for measuring quality of life outcomes. Through a process of feedback and refinement, psychometric scales often change over time, with items and multi-item scales being added, deleted or modified. While this process yields improved instruments, it also creates the problem of comparing scores across different versions of the scale.

The Impact of Cancer (IOC) [1–4] is an example of a psychometric scale that has evolved over time. The IOC was developed specifically to measure psychosocial aspects of long-term (more than 5 years post-diagnosis) cancer survivorship not captured by other health-related quality of life instruments. An initial scaling of the IOC was conducted using data collected from 193 long-term cancer survivors. These analyses yielded the IOC version 1 (IOCv1), which included 10 subscales measured by 41 items, as well as positive and negative summary scales [3, 4]. Due to the small sample size, factor analyses were conducted using a priori domains. A more comprehensive scaling of the IOC was conducted several years later with a sample of 1,188 long-term survivors. This scaling process involved de novo factor analysis without a priori domains as well as split-sample cross-validation and evaluation of construct and concurrent validity. This yielded the IOC version 2 (IOCv2), with eight subscales measured by 37 items, as well as positive and negative summary scales [1, 2].

Because IOCv2 underwent more rigorous psychometric development, its use is recommended over IOCv1. However, by the time IOCv2 was published, studies using IOCv1 had already been undertaken [4–9], and this earlier version may continue to be used. Investigators who have administered IOCv1 may therefore wish to score the instrument so as to obtain scores comparable to IOCv2 scores. Such “pseudo-IOCv2” scores may facilitate comparison with other cancer survivor samples surveyed using the IOCv2.

Our objective was to develop and test models for obtaining pseudo-IOCv2 scores using only IOCv1 responses.

Methods

Overview

We used data collected from long-term breast cancer (n = 1,176) and non-Hodgkin lymphoma (n = 652) survivors. Both samples had completed a questionnaire that included all items used in both the IOCv1 and IOCv2, described in [3]. Hence, it was possible to compute both IOCv1 and IOCv2 scores for these participants. Our goal was to develop predictive models that utilized as input only IOCv1 item responses to obtain pseudo-IOCv2 scores that closely matched participants’ observed IOCv2 scores.

Participants

The breast cancer survivors were participants in the Life After Cancer Epidemiology (LACE) Study, a prospective cohort study of early-stage breast cancer survivors [10]. The cohort consists of women diagnosed from age 18 to 79 years with a first primary breast cancer (Stage I ≥ 1 cm, II or IIIA) recruited primarily from the Kaiser Permanente Northern California Cancer Registry and the Utah Cancer Registry. Demographic, treatment and medical characteristics of this sample are described in Crespi et al. [1]. Twelve participants included in Crespi et al. [1] were subsequently found to have had recurrences and were excluded from the current sample, reducing the sample size from 1,188 to 1,176. The LACE participants were mailed the self-administered questionnaire of IOC items as part of a resurvey wave.

The non-Hodgkin lymphoma survivors were identified through the Duke University and University of North Carolina at Chapel Hill Lineberger tumor registries as previously described [11]. Patients were eligible if diagnosed with non-Hodgkin lymphoma, ≥19 years at diagnosis, and ≥2 years post-diagnosis. Characteristics of this sample are described in Crespi et al. [2]. These participants were mailed the self-administered questionnaire of IOC items as part of a cross-sectional survey.

Since the IOC is intended to apply to cancer survivors rather than individuals with active disease, respondents with active disease or unknown recurrence status were excluded from the analysis. Both studies were approved by human subjects review boards, and written informed consent was obtained from all participants.

Measures

IOC items are presented as statements regarding specific impacts of cancer, to which respondents indicate their level of agreement from 1 (strongly disagree) to 5 (strongly agree). IOCv1 uses 41 of these items; IOCv2 uses 37 items, some of which are also in IOCv1 and some of which are not. IOCv2 also includes employment and relationship scales not considered here.

The 37 items on IOCv2 are used to compute 8 subscale scores and 2 summary scores. Subscale scores are computed as the mean of items comprising the subscale. The IOCv2 Positive Impact Summary scale is scored as the mean of the items on the Altruism/Empathy, Health Awareness, Meaning of Cancer and Positive Self-Evaluation subscales. The IOCv2 Negative Impact Summary scale is scored as the mean of the items on the Appearance Concerns, Body Change Concerns, Life Interferences and Worry subscales.
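The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration only: the item identifiers and the item-to-subscale mapping are hypothetical placeholders rather than the actual IOCv2 assignments, and complete responses are assumed because the handling of missing responses is not described here.

```python
from statistics import mean

# Hypothetical mapping of item identifiers to Positive Impact subscales.
# The real IOCv2 item assignments are not reproduced here.
POSITIVE_SUBSCALES = {
    "Altruism/Empathy": ["ae1", "ae2", "ae3"],
    "Health Awareness": ["ha1"],
    "Meaning of Cancer": ["mc1", "mc2"],
    "Positive Self-Evaluation": ["ps1", "ps2"],
}


def subscale_score(responses, items):
    """Subscale score: mean of the 1-5 responses to the subscale's items."""
    return mean(responses[item] for item in items)


def summary_score(responses, subscales):
    """Summary score: mean over all items on the listed subscales.

    This is a mean of items, not a mean of subscale means, so subscales
    with more items carry more weight.
    """
    items = [item for item_list in subscales.values() for item in item_list]
    return mean(responses[item] for item in items)


# Made-up responses for one respondent.
example = {"ae1": 5, "ae2": 4, "ae3": 3, "ha1": 2,
           "mc1": 4, "mc2": 4, "ps1": 1, "ps2": 1}
altruism = subscale_score(example, POSITIVE_SUBSCALES["Altruism/Empathy"])
positive_summary = summary_score(example, POSITIVE_SUBSCALES)
```

The Negative Impact Summary scale would be scored the same way with a mapping for its four subscales.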

Statistical analysis

The combined sample of 1,828 survivors was randomly divided into training, validation and test sets using an approximate 50%/30%/20% split (n’s of 927, 509 and 392, respectively). We obtained a predictive linear regression model for each IOCv2 item missing from IOCv1 using the least absolute shrinkage and selection operator (LASSO) method [12, 13] for model selection on the training set, implemented in the GLMSELECT procedure in SAS 9.2. The LASSO is a shrinkage and selection method for linear regression that minimizes the sum of squared errors with a bound on the sum of the absolute values of the coefficients as a protection against overfitting. Each IOCv2 item response targeted for prediction was used as a dependent variable, and all IOCv1 item responses were included in the potential predictor pool. Average squared error for the validation data was used as the criterion for choosing among models at each step of the LASSO algorithm.
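The selection scheme can be illustrated with a small, dependency-free Python sketch. It is not the SAS GLMSELECT implementation used in the study: it fits the penalized (Lagrangian) form of the LASSO, which corresponds to the bounded form described above, by cyclic coordinate descent over a user-supplied grid of penalty values, and then keeps the fit with the smallest validation average squared error. The function names, the penalty grid and the exact-fraction split are all assumptions made for illustration.

```python
import random


def split_indices(n, fractions=(0.5, 0.3), seed=0):
    """Randomly partition range(n) into training, validation and test indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n)
    n_valid = round(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_valid], idx[n_train + n_valid:]


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_fit(X, y, lam, n_sweeps=200):
    """Minimize (1/(2n)) * SSE + lam * sum(|b_j|) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered predictors
    resid = [v - y_mean for v in y]  # residuals while all coefficients are zero
    col_ss = [sum(r[j] ** 2 for r in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if col_ss[j] == 0:
                continue  # a constant predictor carries no information
            rho = sum(r[j] * e for r, e in zip(Xc, resid)) / n + col_ss[j] * beta[j]
            new_b = soft_threshold(rho, lam) / col_ss[j]
            delta = new_b - beta[j]
            if delta:
                resid = [e - delta * r[j] for r, e in zip(Xc, resid)]
                beta[j] = new_b
    # The intercept is not penalized; it is recovered from the centering.
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


def average_squared_error(model, X, y):
    """Mean squared prediction error of an (intercept, coefficients) model."""
    intercept, beta = model
    preds = [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((obs - pred) ** 2 for obs, pred in zip(y, preds)) / len(y)


def select_by_validation(X_train, y_train, X_valid, y_valid, lambdas):
    """Fit one model per penalty value; keep the one with lowest validation ASE."""
    fits = [lasso_fit(X_train, y_train, lam) for lam in lambdas]
    return min(fits, key=lambda m: average_squared_error(m, X_valid, y_valid))
```

In this sketch, one call to `select_by_validation` would be made per targeted IOCv2 item, with the IOCv1 item responses as the columns of `X_train` and `X_valid`.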

The test set was used to assess the predictive performance of the selected models on independent data not used in model selection. For participants in the test set, the selected models were used to compute pseudo-responses for IOCv2 items missing from IOCv1. These pseudo-IOCv2 item responses were used to compute pseudo-IOCv2 scale scores, computed as the mean of items comprising the scale but using pseudo-IOCv2 item responses for items not in the IOCv1 where applicable. Predictive performance was assessed by examining the distribution of differences between observed and pseudo-IOCv2 scale scores and the Pearson correlation between them.
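The agreement summaries described above could be computed along the following lines. This is a sketch under stated assumptions: the function names are illustrative, differences are taken as observed minus pseudo-score (the direction is not specified in the text), and the sample standard deviation is used.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def agreement_summary(observed, pseudo):
    """Mean and SD of (observed - pseudo) differences, plus their correlation."""
    diffs = [o - p for o, p in zip(observed, pseudo)]
    return {
        "mean_diff": mean(diffs),
        "sd_diff": stdev(diffs),  # sample standard deviation (n - 1 denominator)
        "r": pearson_r(observed, pseudo),
    }


# Made-up scale scores for four respondents.
summary = agreement_summary([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

A constant offset between the two sets of scores leaves the correlation at 1 but shows up in the mean difference, which is why both quantities are worth reporting.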

Results

Table 1 lists the IOCv2 scales and indicates the number of items in each scale that are included in or missing from IOCv1. In total, IOCv2 uses 37 items to form 8 subscales and 2 summary scales; 30 of these items are also included in IOCv1, and 7 are not and thus were targeted for prediction. The Life Interferences subscale had the highest level of missingness, with 57% (4/7) of its items not in IOCv1. Three other subscales were each missing a single item. The Negative Impact summary scale had more items missing than the Positive Impact summary scale (25%, 5/20 compared to 12%, 2/17).

Table 1 Comparison of item content of IOCv1 and IOCv2

The LASSO algorithm selected models with 5–9 predictors for each IOCv2 item targeted for prediction (Table 4). Table 2 summarizes the performance of model selection in terms of the average squared error of prediction of the selected models in the training, validation and test sets for each of the seven predicted items. The average squared errors were comparable across sets, supporting the generalizability of the models to independent data.

Table 2 Comparison of average squared error of prediction of the selected models in the training, validation and test sets

Table 3 compares observed and pseudo-IOCv2 scores in the test set for IOCv2 scales with missing items. In all cases, the mean difference between observed and pseudo-scores was near zero, and the standard deviation of the differences was less than 0.33, and more typically less than 0.18. The correlations indicated close agreement, especially for the summary scales, in both survivor groups and overall. The Life Interferences subscale was predicted the least accurately, but still had a correlation of 0.896 in the overall test sample.

Table 3 Comparison of observed and pseudo-IOCv2 scores in the test set (N = 392)

Table 4 Regression coefficients to be used to compute IOCv2 pseudo-item responses from IOCv1 items

Conclusion

We have developed models for obtaining pseudo-IOCv2 scale scores from IOCv1 responses that may facilitate comparison of quality of life impacts across samples of survivors surveyed using different versions of the IOC. The models had very good predictive performance in an independent test sample.

The regression models for obtaining pseudo-IOCv2 item responses are provided in Table 4, and Table 5 provides an example of the calculations. A SAS macro for computing pseudo-IOCv2 scale scores is available from the first author.

Table 5 Example of calculation of pseudo-IOCv2 item responses, subscale and summary scores for one respondent
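The kind of calculation summarized in Table 5 can be sketched as follows. The coefficients and item identifiers below are hypothetical placeholders, not the values in Table 4, and no rounding or truncation of the predicted responses is applied because none is described in the text.

```python
from statistics import mean

# Hypothetical regression model for one IOCv2 item that is not in IOCv1.
# Actual intercepts and coefficients are those reported in Table 4.
EXAMPLE_MODEL = {
    "intercept": 0.5,
    "coefs": {"v1_item_a": 0.25, "v1_item_b": 0.5},
}


def pseudo_item_response(v1_responses, model):
    """Predicted response: intercept plus weighted sum of IOCv1 item responses."""
    return model["intercept"] + sum(
        coef * v1_responses[item] for item, coef in model["coefs"].items()
    )


def pseudo_scale_score(v1_responses, shared_items, missing_item_models):
    """Mean of observed responses to shared items and predicted missing items."""
    values = [v1_responses[item] for item in shared_items]
    values += [pseudo_item_response(v1_responses, m) for m in missing_item_models]
    return mean(values)


# Made-up IOCv1 responses for one respondent.
respondent = {"v1_item_a": 4, "v1_item_b": 2, "v1_item_c": 3}
predicted = pseudo_item_response(respondent, EXAMPLE_MODEL)
score = pseudo_scale_score(respondent, ["v1_item_a", "v1_item_c"], [EXAMPLE_MODEL])
```

Here `predicted` is 0.5 + 0.25 × 4 + 0.5 × 2 = 2.5, and the scale score averages that value with the two observed responses.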

Limitations must be acknowledged. The respondents completed an 81-item questionnaire rather than the shorter IOCv1 or IOCv2, and responses may have differed from what would have been obtained from the IOCv1 or IOCv2 due to the different context; in particular, similarity of item responses may have been enhanced. The sample was limited to breast cancer and non-Hodgkin lymphoma survivors, and the models may not perform as well for individuals with other diagnoses.

Overall, the predictive performance of the models together with the substantial overlap between IOCv1 and IOCv2 suggests that investigators can use pseudo-IOCv2 scores with confidence that they are comparable to actual IOCv2 scores. Our approach may be useful to other investigators seeking to compare participant samples surveyed using different versions of the same scale.