Introduction

Over the past two decades, the Institute of Medicine has consistently highlighted the importance of continuity of care (CoC) for achieving a high-quality health care system in the United States [1, 2]. Thus, CoC has become a cornerstone of many health policies, including the primary care-based model of health care delivery known as the patient-centered medical home (PCMH) [3, 4]. CoC within the context of the PCMH and other health policies, however, is difficult to define and measure [5]. Many CoC assessments used in evaluations derive from administrative claims data [6, 7], although the importance of patient reports is gaining recognition among those evaluating health policies such as the PCMH [8–11]. Yet there are few theoretically driven, patient-reported models of CoC, especially models that incorporate the constructs of knowledge, trust, and respect within the enduring patient–provider relationship, and particularly models that are specific to older adults [6, 7, 12].

Theoretical dimensions of CoC have been proposed previously. In 2003, John Saultz published a conceptual hierarchy for CoC that included informational (medical record knowledge), longitudinal (ongoing healthcare interactions), and interpersonal (patient–provider relationship) dimensions of continuity. The implication of this hierarchy is that at least some informational CoC is required to establish longitudinal CoC, and that interpersonal CoC can exist only in the presence of longitudinal CoC [13]. Within the interpersonal CoC dimension, there are affective (mode of provider behavior toward the patient) and instrumental (content of provider knowledge about the patient) subcomponents of the patient–provider relationship [14, 15]. In practice, medical records, billing claims, or patient reports could be used to measure informational and longitudinal CoC, but only patient and/or provider reports could adequately measure the interpersonal CoC dimensions.

The main objective of this research was to evaluate a theoretically derived, patient-reported CoC model for older adults, who are most likely to benefit from CoC given their propensity to have multiple chronic conditions needing management [16, 17]. To do this, we used patient reports from 2,620 Medicare beneficiaries who completed all of the necessary components of the 2004 National Health and Health Services Use Questionnaire (NHHSUQ) [18, 19]. The NHHSUQ collected self-reported data on usual primary provider and place of care, as well as data on the quality and duration of the patients’ relationship with their provider. These data enabled us to empirically evaluate a multidimensional model of CoC that incorporates two of the theoretically key patient-reported aspects of continuity—longitudinal (with site and provider) and interpersonal (of both the affective and instrumental relationship).

Methods

Study design

The NHHSUQ survey was designed to identify factors affecting enrollment in Medicare managed care plans [19]. It was mailed to a disproportionately stratified random sample of 6,060 community-residing Medicare beneficiaries 65 years old or older in the fall of 2004 to obtain equal numbers of participants with regard to race/ethnicity (white, black, Hispanic), Medicare plan type [Medicare fee-for-service (FFS) or Medicare managed care (MMC)], sex, and population density (metropolitan or nonmetropolitan). The sampling frame included six urban areas (Los Angeles, Phoenix, Chicago, Houston, New York City, and Tampa) and nonmetropolitan counties in three broad regions—the southwest (California, Nevada, and Arizona), mid-south (Texas and Louisiana), and southeast (Florida). These regions provided wide geographic diversity and comparable numbers of MMC and FFS enrollees in each of the race/ethnicity and sex groups. After adjusting for the 363 survey recipients who were ineligible (e.g., noncommunity residing, moved out of the geographic area, or died before the survey was mailed), the overall response rate was 53 % (2,997/5,697) [19]. Both the Baylor College of Medicine and University of Iowa institutional review boards approved this study.

Measures

We hypothesized and evaluated a model of self-reported continuity using the NHHSUQ data. This theoretically derived CoC model has four dimensions: longitudinal continuity of the care site and provider, and instrumental and affective interpersonal continuity. Figure 1 depicts this a priori conceptualization.

Fig. 1 Theoretically derived model of patient-reported continuity of care. Boxes indicate observed variables. Circles indicate latent variables. Solid arrows indicate directional causal pathways. Dashed arrows indicate covariation

Longitudinal continuity: care site

The NHHSUQ asked two questions about the usual place of care. The first was, “Of the places you go for medical care, where do you go most often for care if you are sick or need advice about your health?” Responses included a doctor’s office or clinic, walk-in urgent care center or emergency room, Veterans Affairs Medical Center (VAMC), other, or “no specific place I visit most often for care.” We created a binary marker for any usual place of care, and an ordinal variable for the type of care site that ranked the settings from least to most conducive to continuity of care (0 = no specific place, 1 = other, nonspecific, 2 = urgent care/emergency room, 3 = VAMC, 4 = doctor’s office). The second question was “Approximately how long have you been receiving your care at this place” and had categorical responses of “less than 6 months,” “6 months to 1 year,” “1 year to 2 years,” “2 years to 5 years,” and “5 years or more,” along with an option to choose no specific care place. We created a continuous variable using category midpoints truncated at 5 years, with those not indicating a specific care site coded as zeroes.
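
For illustration, this recoding can be sketched in pandas as below; the column names, response labels, and exact midpoint values are assumptions made for the sketch, not the study’s actual code.

```python
# Illustrative recoding of the usual-place-of-care items (hypothetical column
# names and labels; midpoint values are assumptions based on the text).
import pandas as pd

SITE_RANK = {                      # least to most conducive to continuity
    "no specific place": 0,
    "other": 1,
    "urgent care/emergency room": 2,
    "VAMC": 3,
    "doctor's office or clinic": 4,
}

DURATION_MIDPOINT_YEARS = {        # category midpoints, truncated at 5 years
    "less than 6 months": 0.25,
    "6 months to 1 year": 0.75,
    "1 year to 2 years": 1.5,
    "2 years to 5 years": 3.5,
    "5 years or more": 5.0,
}

def code_care_site(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    out["usual_site"] = (out["site_type"] != "no specific place").astype(int)
    out["site_type_rank"] = out["site_type"].map(SITE_RANK)
    out["site_duration"] = out["site_duration_cat"].map(DURATION_MIDPOINT_YEARS)
    # respondents with no specific care site are coded as zero duration
    out.loc[out["usual_site"] == 0, "site_duration"] = 0.0
    return out
```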

Longitudinal continuity: provider duration

Three questions tapped provider durational continuity. The first asked “When you go for regular medical care, is there a particular doctor that you usually visit?” with a yes/no response set. A “doctor” could mean a variety of practitioners who provide primary care services (e.g., general doctor, nurse practitioner, or physician assistant). The second question asked about the long-term duration of care with this provider. These questions were combined into a continuous measure indicating the number of years of care (truncated at 10 or more) with this usual provider, with those not identifying a usual provider coded as zeroes. The third question asked whether this relationship had changed during the past 12 months and was used to construct a measure of the one-year duration with the provider. This short-term duration variable was truncated at 1 year (for those who indicated at least 1 year duration with a provider and no change in the past year) with 0 indicating no usual provider, 0.25 for those indicating no usual provider because of a provider change within the past 6 months, 0.50 for those indicating no change in usual provider but the relationship duration was <6 months, and 0.75 for those indicating no change in usual provider with the length of the relationship from 6 months to 1 year. The correlation between the long- and short-term measures was 0.55.
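
A sketch of the short-term (one-year) duration coding is shown below; the argument names are hypothetical, and response combinations the text does not describe are deliberately left unhandled.

```python
# Illustrative coding of the one-year ("short-term") provider-duration variable
# described above; input names are hypothetical.
def short_term_duration(has_usual_provider: bool,
                        changed_within_past_6mo: bool,
                        changed_within_past_year: bool,
                        years_with_provider: float) -> float:
    if not has_usual_provider:
        # 0.25 if there is no usual provider because of a change in the past 6 months
        return 0.25 if changed_within_past_6mo else 0.0
    if not changed_within_past_year:
        if years_with_provider < 0.5:
            return 0.50   # no change, but relationship duration < 6 months
        if years_with_provider < 1.0:
            return 0.75   # no change, relationship 6 months to 1 year
        return 1.0        # at least 1 year with no change: truncated at 1 year
    raise ValueError("response combination not described in the text")
```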

Interpersonal continuity: instrumental

The instrumental component of interpersonal continuity involves physician competence in performing the technical aspects of care (e.g., performing diagnostic tests, physical examinations, or prescribing treatments) and, from the patient perspective, assesses the content of the provider’s behavior. Four items tapped instrumental continuity. The first three asked respondents to rate the “thoroughness of your primary doctor’s examinations,” “accuracy of your primary doctor’s diagnoses,” and “the explanations you are given of medical procedures and tests” on a scale from 5 to 1 for excellent, very good, good, fair, or poor. The fourth question asked respondents “How knowledgeable about your health and health care is your primary doctor or the providers at your usual place of care?” The response set was very knowledgeable, somewhat knowledgeable, unsure, and not knowledgeable.

Interpersonal continuity: affective

The affective component of interpersonal continuity reflects the “people skills” portion of the interaction, such as warmth, empathy, and how the physician approaches the patient [14, 15]. This component assesses providers’ interaction style and reflects communication, trust, and respect in the enduring patient–provider relationship. Four questions were used to tap the affective component. The first two asked participants to rate “your primary doctor’s interest in you” and “your primary doctor’s interest in your medical problems” on a scale from 5 to 1 for excellent, very good, good, fair, or poor. The third question asked participants “how satisfied are you with your health care” on a scale from 4 to 1 for very satisfied, somewhat satisfied, somewhat dissatisfied, and very dissatisfied, with “not sure” responses coded as 2.5. The fourth question asked “How comfortable are you with your primary doctor or with the providers at your usual place of care” on a scale from 5 to 1 for very comfortable, somewhat comfortable, not sure, somewhat uncomfortable, and very uncomfortable.
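
The numeric scoring of these interpersonal items, as we read it from the response sets above, can be written out as simple lookup tables; this is an illustration of the scoring described in the text, not the study’s code.

```python
# Numeric scoring of the interpersonal items, following the response scales
# described in the text.
EXCELLENT_TO_POOR = {"excellent": 5, "very good": 4, "good": 3, "fair": 2, "poor": 1}

SATISFACTION = {"very satisfied": 4, "somewhat satisfied": 3,
                "not sure": 2.5,                 # coded midway, per the text
                "somewhat dissatisfied": 2, "very dissatisfied": 1}

COMFORT = {"very comfortable": 5, "somewhat comfortable": 4, "not sure": 3,
           "somewhat uncomfortable": 2, "very uncomfortable": 1}
```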

Statistical analyses

AMOS version 20 [20] and SPSS version 20 were used for all analyses. As a first step, sensitivity analyses were conducted in which “not sure” responses were alternately assigned the lowest level of continuity (on the assumption that respondents expressing uncertainty might have limited continuity); because the results were essentially equivalent, the original coding was retained. We used confirmatory factor analysis (CFA) to formally evaluate the conceptual model shown in Fig. 1 and alternative models (based on modification and fit indices) to identify the best configural model representing the data. In the initial model, each item was allowed to load on a single latent factor only, errors were uncorrelated, and the factors were allowed to covary. We also evaluated two alternative higher-order models to account for the potential hierarchical nature of CoC and compared them with the four-factor model. The first included interpersonal continuity (from the instrumental and affective factors) and longitudinal continuity (from the care site and provider duration factors) as second-order factors, and the second included a single higher-order construct, Continuity.
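
For readers without AMOS, the a priori four-factor model can be expressed in lavaan-style syntax, for example with the open-source Python package semopy. The sketch below is an illustration under that assumption; the indicator names are placeholders for the NHHSUQ items, not the authors’ specification.

```python
# Sketch of the a priori four-factor CFA in lavaan-style syntax, fit with the
# open-source semopy package (the study itself used AMOS). Indicator names are
# placeholders; "=~" defines a factor and "~~" a covariance.
import pandas as pd
import semopy

MODEL_1 = """
CareSite =~ usual_site + site_type_rank + site_duration
ProviderDuration =~ provider_duration_long + provider_duration_short
Instrumental =~ thoroughness + accuracy + explanations + knowledge
Affective =~ interest_in_you + interest_in_problems + satisfaction + comfort
CareSite ~~ ProviderDuration
CareSite ~~ Instrumental
CareSite ~~ Affective
ProviderDuration ~~ Instrumental
ProviderDuration ~~ Affective
Instrumental ~~ Affective
"""

def fit_cfa(data: pd.DataFrame, description: str = MODEL_1):
    """Fit the CFA and return the model plus a table of fit statistics."""
    model = semopy.Model(description)
    model.fit(data)
    stats = semopy.calc_stats(model)  # chi-square, df, CFI, GFI, NFI, TLI, RMSEA, ...
    return model, stats
```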

We evaluated the CFA models using a range of fit measures. Because the overall chi-square goodness-of-fit statistic is sensitive to large sample sizes [21], we expected inflated chi-square values. Therefore, we also selected and reviewed several other fit indices that are less influenced by sample size, including the goodness-of-fit index (GFI), the normed fit index (NFI), the comparative fit index (CFI), and the Tucker–Lewis index (TLI). Values of these indices range from 0 to 1, with values ≥0.90 indicating a good fit and values ≥0.95 indicating an excellent fit [22]. We also evaluated the root mean square error of approximation (RMSEA), which is sensitive to model complexity [23]. RMSEA values also range from 0 to 1, with values ≤0.05 indicative of a good fit and values up to 0.10 suggesting adequate fit. Cronbach’s alpha [24] was calculated to assess the internal consistency of the subscales in the final four-factor model, with values greater than 0.70 considered acceptable. To evaluate whether the complex, stratified sampling design had any effect on our final model, we reestimated the model after applying the sampling weights, using a weighted correlation matrix as the input data file.
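
As a reference for these criteria, the standard formulas for Cronbach’s alpha and the RMSEA point estimate can be written down directly; the sketch below is a generic illustration, not the study’s code.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency for a respondents-by-items score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

def rmsea(chi_square: float, df: int, n: int) -> float:
    """RMSEA point estimate from the model chi-square, df, and sample size."""
    return float(np.sqrt(max(chi_square - df, 0.0) / (df * (n - 1))))
```

As a check, plugging in the values reported below for the revised model (chi-square = 1,091.8, df = 57, n = 2,620) gives an RMSEA of roughly 0.08.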

The final model was evaluated for factorial equivalence across sex, race/ethnicity, Medicare plan type, as well as a median split on general health status. Sex and race/ethnicity were self-reported, and factorial equivalence was expected. Medicare plan type was defined as FFS or MMC at the time of the survey. We hypothesized that differences might exist between FFS and MMC respondents because options for health care might be dictated by health plan restrictions. We created two health status groups based on responses to the self-rated health question from the SF-8 Health Survey [25]—good general health (responses of “excellent,” “very good,” or “good”) and not good general health (responses of “fair,” “poor,” or “very poor”)—because CoC perceptions might vary based on health status.

Because our objective was to evaluate the consistency of the final model across the various groups, our multigroup analyses fitted a model that constrained the factor loadings to be equal across groups and compared it with the baseline configural model without these constraints. Measurement invariance holds if imposing the constraints does not significantly worsen model fit. One way to assess this is to examine the Δχ² between the two models: failure to observe a statistically significant difference between the baseline configural model and the constrained model is one indicator of factorial invariance across groups. However, because Δχ² is a function of sample size and we have a relatively large sample, using the change in the fit indices noted above to determine whether factorial invariance holds is recommended [26–29].
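
A minimal sketch of the Δχ² comparison, assuming the constrained and configural chi-square values are available from the fitted models:

```python
from scipy.stats import chi2

def delta_chi2_test(chi2_constrained: float, df_constrained: int,
                    chi2_configural: float, df_configural: int) -> float:
    """p-value of the chi-square difference (delta chi-square) test between nested models."""
    delta = chi2_constrained - chi2_configural
    delta_df = df_constrained - df_configural
    return float(chi2.sf(delta, delta_df))
```

A nonsignificant p-value is consistent with invariance of the constrained parameters; with samples of this size, the change in CFI and the other indices noted above carries more weight.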

Results

Respondent characteristics

Of the 2,997 respondents to the NHHSUQ survey, 2,620 (87.4 %) had complete responses to all items used in the CFA models. Table 1 displays the characteristics of these respondents. Age ranged from 65 to 100 years (mean = 74.3; SD = 6.5). Most respondents had at least a high school education (65 %), 49 % reported an annual income <$20,000, and most (61 %) reported good to excellent health. By design, the sex, race/ethnic, and care plan distributions were nearly equivalent: 51 % were men; 38 % were white, 30 % black, and 30 % Hispanic; and about half (53 %) were in Medicare managed care plans. Also by design, most respondents were from urban areas (62 %).

Table 1 Characteristics of respondents to the NHHSUQ survey (n = 2,620)

Confirmatory factor analyses

CFA was initially conducted on the model shown in Fig. 1, which assumes that the error terms are independent and that the four factors are correlated. With the exception of “Site duration,” the items generally had strong loadings as hypothesized (ranging from 0.55 to 0.97). The chi-square goodness-of-fit statistic was 2,828.7 with 59 degrees of freedom (df) and was statistically significant (p < .001). The other fit indices indicated that the model did not fit the data adequately (GFI = 0.86, CFI = 0.89, NFI = 0.89, TLI = 0.86, and RMSEA = 0.13).

In our conceptual model, the “Site duration” item was included with the Care Site construct because the focus of the construct was continuity at a care site. However, based on the modification indices from the initial CFA, it was apparent that the “Site duration” item contributed far more to the Provider Duration construct than to the Care Site construct. In hindsight, this is intuitively plausible because providers are nested within care sites, so duration with a care site might also reasonably contribute to a Provider Duration construct. In effect, this created a Care Site construct specific to the identification of a usual care site/provider of care and a Provider Duration construct specific to the notion of duration of continuity, with both constructs theoretically contributing to longitudinality. Upon additional review of the modification indices, the error terms between “Satisfaction” and “Comfort,” and between “Site duration” and “Provider duration (long-term),” were allowed to correlate (i.e., were freely estimated). Standardized factor loadings were all above 0.50 except for the “Site duration” item, which had a factor loading of 0.42. These changes substantially reduced the chi-square (1,091.8, df = 57), although it remained statistically significant. The other fit indices, however, indicated that this revised model had an adequate to good fit (GFI = 0.94, CFI = 0.96, NFI = 0.96, TLI = 0.95, RMSEA = 0.08).
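
In the lavaan-style sketch introduced in the Methods, this respecification amounts to moving the “Site duration” indicator and freeing two error covariances; the indicator names remain placeholders and the block is illustrative only.

```python
# Sketch of Model 2: "site_duration" now loads on ProviderDuration, and the two
# error covariances suggested by the modification indices (Satisfaction with
# Comfort; Site duration with long-term provider duration) are freely estimated.
# The six factor covariances from MODEL_1 carry over unchanged and are omitted
# here for brevity; indicator names are placeholders.
MODEL_2 = """
CareSite =~ usual_site + site_type_rank
ProviderDuration =~ provider_duration_long + provider_duration_short + site_duration
Instrumental =~ thoroughness + accuracy + explanations + knowledge
Affective =~ interest_in_you + interest_in_problems + satisfaction + comfort
satisfaction ~~ comfort
site_duration ~~ provider_duration_long
"""
```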

Cronbach’s alpha coefficients for the longitudinal continuity scales of care site (two items) and provider duration (three items) were 0.88 and 0.75, respectively, for this second model. The instrumental and affective relationship continuity scales (each with four items) had Cronbach’s alpha coefficients of 0.88 and 0.87, respectively. Thus, all four scale constructs had acceptable internal consistency. The correlations between the four factors ranged from 0.11 (between care site and both affective and instrumental) to 0.89 (between affective and instrumental). Based on the modification indices for Model 2, we allowed the error term for “Knowledge of health” to be correlated with the error term for “Comfort” in Model 3. This cross-factor correlation (factorial complexity) improved the fit (chi-square = 752.4, df = 56; p < .001); however, the GFI, CFI, NFI, TLI, and RMSEA values remained virtually unchanged. Because the fit indices did not markedly improve with the addition of the cross-factor correlation, Model 2 was retained as the four-factor configural model to compare with the higher-order alternative models.

The first alternative hierarchical model included the two second-order latent constructs of Interpersonal and Longitudinal continuity; its chi-square was 1,093.59 (df = 58), and the addition of the two second-order constructs did not significantly worsen the model fit relative to the four-factor model (Δχ² = 1.8; df = 1; p > .05). The second alternative hierarchical model included one second-order latent construct (Continuity); its chi-square was 1,180.74 (df = 59), and the chi-square difference test (Δχ² = 88.9; df = 2; p < .001) indicated that this specification fit significantly worse than the four-factor model. Applying the sampling weights to these models did not appreciably alter the findings: factor loadings differed primarily in the second decimal place, and goodness-of-fit criteria differed in the third decimal place (results available upon request). Given these findings, and due to software limitations, the unweighted, first-order model in Fig. 2 was retained for evaluating factorial invariance.
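
For reference, plugging the reported chi-square values into the Δχ² sketch from the Methods reproduces these comparisons; the printed p-values (our calculation) are approximately 0.18 and well below 0.001, respectively.

```python
# Chi-square difference tests against the four-factor Model 2
# (chi-square = 1,091.8, df = 57), using the delta_chi2_test() sketch above.
print(delta_chi2_test(1093.59, 58, 1091.8, 57))  # two second-order factors: p ~ 0.18
print(delta_chi2_test(1180.74, 59, 1091.8, 57))  # single Continuity factor: p < 0.001
```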

Fig. 2 CFA model of continuity. Boxes indicate observed variables, and circles indicate latent variables. Single-headed arrows indicate direct causal pathways, and double-headed arrows represent covariation

Factorial invariance

Table 2 presents the multiple group CFA fit indices for the four-factor configural model across sex, race/ethnicity, Medicare plan type, and general health status. For each analysis (with the exception of sex), the model comparison chi-square values were statistically significant (p < .01), suggesting potential model differences across the groups. All factor loading differences between Medicare plan types, between health status groups, and among race/ethnic groups were less than 0.10, with the following exceptions. Factor loadings for two items, “Usual site” and “Site duration,” were higher for whites (+0.16 and +0.24, respectively) and Hispanics (+0.18 and +0.12, respectively) compared with blacks. For the “Provider duration (long term)” item, factor loadings were higher for whites than for either Hispanics (+0.17) or blacks (+0.24), whereas for the “Provider duration (short term)” item, factor loadings were lower for whites than for either Hispanics (−0.20) or blacks (−0.21). However, given these relatively small differences and the good overall model fit (RMSEA < 0.08 and CFI, GFI, NFI, and TLI > 0.95), the first-order, four-factor model (Model 2) is sufficiently consistent across sex, race/ethnicity, Medicare plan type, and general health status for use among older adults.

Table 2 Multiple group analysis fit indices by sex, race/ethnicity, Medicare plan type, and general health status for the four-factor final model

Discussion

CoC should have a significant positive impact on the health status of older adults. Yet, there is debate about whether the best way to define and measure CoC includes only the longitudinal aspect, or whether it should also include assessment of the provider–patient interpersonal relationship. If the latter is to be included, then the use of administrative data alone may not be sufficient, and patient (or provider) reports would need to be incorporated into CoC measurement [13, 17, 3032]. We used CFA to evaluate a theoretically derived model of CoC in older adults and found that both longitudinal and interpersonal dimensions of CoC can be evaluated by Medicare beneficiaries.

Our multidimensional patient-reported CoC model consists of 13 items tapping longitudinal continuity with a site and a provider, and interpersonal continuity through assessment of the patient’s experience with the provider’s instrumental knowledge and affective demeanor. All subscales had good internal reliability, with Cronbach’s alpha ranging from 0.75 for the provider duration subscale to 0.88 for the instrumental and care site subscales. Longitudinal assessments are the most commonly used measures of continuity. Yet, when we evaluated the alternative model with the two second-order constructs of Longitudinal and Interpersonal continuity (which was statistically equivalent to the final model in Fig. 2), the factor loadings for the two longitudinal constructs of Care Site (0.31) and Provider Duration (0.81) were not as large as those for the interpersonal constructs of Instrumental (0.91) and Affective (0.99) continuity. Given this finding, added emphasis on the interpersonal domains may be warranted.

A strength of this work is that the sample included almost equal numbers of men and women; of white, black, and Hispanic older adults; and of respondents in FFS and MMC plans. Thus, we were able to determine whether older adults’ perceptions of CoC were factorially invariant across these groups and across perceived health. Specifically, our results suggest that the “Usual site” item contributes more to the Care Site construct and the “Site duration” item contributes more to the Provider Duration construct for whites and Hispanics. In contrast, the “Provider duration (short term)” item contributes more to the Provider Duration construct for minorities, whereas the “Provider duration (long term)” item contributes more for whites. Overall, our results supported factorial invariance for males and females and indicated somewhat weaker factorial invariance across race/ethnicity, health plan, and perceived health. The weaker factorial invariance for these groups is not surprising. In the early 2000s, several studies showed discrepancies in continuity of care based on race/ethnicity, insurance type, and health status [33–37]. It is well documented that minorities face barriers in access to health care [33, 34], that individuals in managed care plans may have more discontinuities in care [34, 35], and that individuals with health problems have varying degrees of continuity [36, 37]. It is therefore important to account for these potential differences when assessing the implications of CoC.

There are some limitations to this work. One is that we were not able to design the content or format of the survey questions. Although the longitudinal and interpersonal continuity questions were designed to map well to the Consumer Assessment of Healthcare Providers and Systems (CAHPS) [38] and the Medicare Current Beneficiary Survey (MCBS) [39], we could not fine-tune the questions or control the number of items used to assess each construct. That said, our final model did have one subscale with only two items (one of which had a dichotomous response) stemming from a single survey question. This may limit the validity of Cronbach’s alpha as a test of the internal consistency of this subscale and increase the likelihood of measurement error. Another limitation is that we did not have access to information about nonrespondents for assessing the potential impact of differential response rates. Finally, the perceptions of CoC held by these older Medicare beneficiaries may not generalize to younger people.

The limitations imposed by our data and the fact that we were not able to include all known continuity of care domains (e.g., informational continuity from the Saultz hierarchy) limit our ability to recommend this 13-item scale as a definitive measure of care continuity. However, our results strongly suggest that both the longitudinal and interpersonal domains, as experienced by the patient, should routinely be included in the assessment of continuity. These findings support the work of Gulliford and colleagues [40] who developed and tested an experience-based measure of continuity of care for diabetic patients. By evaluating the patient experience of continuity across a more heterogeneous group of older adult patients, we expand upon the relevance of this earlier work in highlighting the importance of the patient experience when measuring CoC.

Our results are important for two reasons. First, the most commonly used measures of CoC are those that capture only longitudinal care. Our findings show that longitudinal continuity is only part of the concept. Second, the longitudinal measures are most commonly used because they are easily calculated from administrative claims. Yet there is no way to measure interpersonal continuity using claims data; it can only be measured through assessment of the patient experience. This supports the position advanced by organizations such as the Patient-Centered Outcomes Research Institute (PCORI) and the National Committee for Quality Assurance (NCQA), which advocate for incorporating the patient perspective into evaluations of health care quality. In future research, we will link the NHHSUQ data to each beneficiary’s Medicare claims to expand on this work by evaluating how well this patient-reported CoC measure relates to extant claims-based CoC measures, and subsequently by relating both types of CoC measures to health outcomes and service use in older adults.