Introduction

Demographic ageing has drawn the attention of policy makers to the negative effects that diseases and disabilities may have on participation by older people [1, 2]. Participation is defined by the International Classification of Functioning, Disability and Health (ICF) as ‘involvement in life situations’ [3]. Policy makers aim to promote participation in this group because of the expected benefits for society and increased quality of life for the individuals concerned [1, 2, 4]. For quality of life, research suggests that social roles may be a more important aspect of participation than daily activities [4, 5]. For society to function, social contact and exchange between people are imperative. Research would therefore benefit from a measurement instrument that focuses on the social aspects of participation. We developed such an instrument, the Maastricht Social Participation Profile (MSPP).

The MSPP is intended to measure actual social participation by older adults with a chronic physical illness. It builds on a definition given by older adults with a chronic physical illness themselves. They define social participation as a positive experience having one or more of the following three characteristics: social contact, contributing to society (like paying a visit) or receiving from society (like receiving a visit) [6]. This definition excludes behaviours which do not involve an exchange between people (like doing one’s own household chores), which distinguishes social participation from the broader concept of participation. Furthermore, the definition includes behaviours which involve receiving from society, while other definitions of social participation tend to exclude these behaviours [6–12].

The question of whether social participation involves only contributing or both receiving and contributing is particularly relevant in the case of people with a chronic illness, because their opportunities to contribute may diminish. In response, people may explore alternatives [13]. If social participation involves both receiving and contributing, there are more alternatives, making it easier to maintain a given level of social participation (substitution). For example, instead of paying a painful or fatiguing visit to a friend (contribute), the friend may come to visit (receive).

The MSPP measures actual social participation, which refers to the frequency and diversity of social participation: how often do people engage in social participation and in how many different types of social participation do they engage? Information about actual social participation of people with a chronic illness is important from a societal perspective, because it tells us to what extent people are integrated in society. From an individual perspective, actual social participation may play a less important role in quality of life than its subjective experience [5], because people are autonomous and differ in the frequency and types of social participation they prefer. Nonetheless, information about actual social participation may improve our understanding of subjective social participation and quality of life. This requires the use of additional instruments to measure subjective social participation and quality of life. Are people more satisfied with their social participation if they participate more often or if they participate in several different ways, or are both equally important (or unimportant)? Also, if a measure improves people’s subjective social participation, it is important to understand why: did the actual social participation change or did people afterwards feel better about the same actual social participation?

This paper addresses the development and clinimetric properties of the MSPP. The MSPP was developed as a self-administered generic measure for actual social participation by older adults with a chronic physical illness with the purpose of discrimination and evaluation.

Methods

The MSPP was developed with a sample of older adults with either chronic obstructive pulmonary disease (COPD) or diabetes mellitus type 2 (diabetes). COPD and diabetes both take a gradually deteriorating course, but COPD has intermittent exacerbations, while diabetes is characterized by a long stabilization phase followed by chronic complications [14].

The development process consisted of a number of steps, after each of which the MSPP was revised. Figure 1 outlines the sequence and purpose of the steps. In this section, we explain the steps. The results section will focus on the final step, the second field test, in which we evaluated the reproducibility and validity of the semi-final version in order to arrive at a final version of the MSPP.

Fig. 1 Development of the Maastricht Social Participation Profile

Medical ethics committee approval was granted.

Development of the semi-final version

Initially, we conducted a qualitative study [6]: a literature search and ten individual interviews resulted in a list of social participation examples, which was used in two focus group sessions. The discussion focused on participants’ reasons for classifying certain items as social participation while rejecting others. This resulted in the definition of social participation already presented in the introduction.

Next, following guidelines about questionnaire design [15–19], we constructed three indices, based on the presented definition and using items from the list of social participation examples [6]. We deviated from the definition in two ways. First, we did not operationalize that social participation should be a positive experience, because this involves subjective evaluation, while we wanted to develop a measure for actual social participation. Second, we treated ‘social contact’ as a necessary characteristic to focus the content of the MSPP. Consequently, all three indices included only participation behaviours involving social contact. The first index concerns consumptive participation (CP, nine items), which is characterized as benefiting from society (for example taking a course or visiting a restaurant). The second index concerns formal social participation (FSP, two items), which is characterized as contributing to society (participation in clubs and volunteer work). The third index concerns informal social participation (ISP, nine items), which is characterized as contributing to society, receiving or both (contact with family, friends and acquaintances).

The response format referred to the number of times something was done in the last 4 weeks, but the response key also indicated the weekly frequency to which this number corresponds. The former is easier to answer when counting (rare and salient behaviours), while the latter is easier when estimating (frequent and mundane behaviours) [20]. The MSPP includes both.

The authors GM, GK, IP, IM and JvE systematically evaluated response accuracy and content validity [21]. Content validity refers to ‘the extent to which an empirical measurement reflects a specific domain of content’ [22: 20]. The ISP was split into separate indices for acquaintances (ISP-A) and family (ISP-F) (identical items).

Next, we conducted ten cognitive interviews to test whether items were interpreted as intended (content validity). Participants were one man and two women with COPD and six men and one woman with diabetes, ranging in age between 65 and 83 years. Participants were asked to formulate retrospectively (or concurrently if they preferred) how they had interpreted items and decided on their answers [23, 24]. Probing techniques were used to check feasibility and response accuracy. Although revisions were made, the main result of the interviews was that items had been interpreted as intended.

The first field test involved a random sample of adults older than 59 years with either COPD (n = 71) or diabetes (n = 75) (May 2004). We analysed missing value patterns, frequency distributions, inter-item correlations and comments written on questionnaires. All statistical analyses in this study were done with the SPSS computer program version 12.0.1 [25]. The two highest response categories were combined.

Parallel to the first field test, content validity was assessed by six Dutch experts in the area of participation research, who had not been involved in the project so far. The experts received the MSPP, a schematic representation of its operationalization, and the argumentation behind it. They were asked to comment on both the operationalization and the underlying conceptualization and could freely structure their response or follow a more detailed list of questions which was provided. Generally, the experts were positive about the conceptualization and operationalization, but they also made some critical remarks. In response, items were added, removed and rephrased. Not all issues raised by the experts resulted in revisions, however, either because we could not (item overlap) or would not (for reasons of feasibility and conceptual choices).

In the second assessment of content validity, three colleague researchers not involved in the project sorted, independently of each other, the items of the MSPP to the hypothesized indices. The sortings were compared, and the intended ordering and differences discussed. No revisions were necessary.

Semi-final version of the MSPP

The semi-final version of the MSPP consisted of 26 social participation items in four indices: CP (seven items), FSP (three items), ISP-A (eight items) and ISP-F (eight items). All items had the same response format: did not do this in last 4 weeks (zero times), did this less than once a week (one to three times), did this once to twice a week (four to eight times), did this more than twice a week (nine times or more). Two types of scores could be calculated for each index: diversity and frequency. Diversity scores refer to the number of items on which a respondent had a score of at least one. Frequency scores reflect the mean score of the items. In addition, the total diversity score refers to the number of indices on which a respondent had a score of at least one. Higher scores indicate more diverse or more frequent social participation. In the present study, scores were only calculated if there were no missing values in a given index. The MSPP is included in the Appendix.
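The scoring rules described above can be illustrated with a minimal Python sketch. The function names, the use of `None` for a missing answer and the example data are hypothetical. The sketch also assumes that the total diversity score is only computed when all indices are complete, which the text does not state explicitly.

```python
from typing import Dict, Optional, Sequence

# Item scores follow the four response categories:
# 0 = not in the last 4 weeks, 1 = less than once a week (1-3 times),
# 2 = once to twice a week (4-8 times), 3 = more than twice a week (9+ times).
# None marks a missing answer (an assumption of this sketch).
Item = Optional[int]


def count_to_category(times_in_4_weeks: int) -> int:
    """Map a raw count over the last 4 weeks to its response category."""
    if times_in_4_weeks <= 0:
        return 0
    if times_in_4_weeks <= 3:
        return 1
    if times_in_4_weeks <= 8:
        return 2
    return 3


def diversity_score(items: Sequence[Item]) -> Optional[int]:
    """Number of items scored at least one; None if any item is missing."""
    if any(x is None for x in items):
        return None
    return sum(1 for x in items if x >= 1)


def frequency_score(items: Sequence[Item]) -> Optional[float]:
    """Mean item score; None if any item is missing."""
    if any(x is None for x in items):
        return None
    return sum(items) / len(items)


def total_diversity(indices: Dict[str, Sequence[Item]]) -> Optional[int]:
    """Number of indices with a diversity score of at least one.

    Assumption: returns None as soon as one index has a missing value.
    """
    scores = [diversity_score(items) for items in indices.values()]
    if any(s is None for s in scores):
        return None
    return sum(1 for s in scores if s >= 1)


# Hypothetical respondent (7 CP, 3 FSP, 8 ISP-A and 8 ISP-F items).
respondent = {
    "CP": [0, 1, 0, 2, 0, 0, 3],
    "FSP": [0, 0, 0],
    "ISP-A": [1, 1, 0, 0, 2, 0, 0, 0],
    "ISP-F": [3, 2, 0, 0, 0, 1, 0, 0],
}
for name, items in respondent.items():
    print(name, diversity_score(items), frequency_score(items))
print("total diversity:", total_diversity(respondent))
```

Returning `None` rather than a partial score mirrors the rule that scores were only calculated for indices without missing values.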

Methods of the second field test

Sample and data collection

The semi-final version of the MSPP was sent out in two waves to a new random selection of people older than 59 years with either COPD or diabetes. They had previously been screened for a study about chronic illness and depression carried out by the School for Public Health and Primary Care of Maastricht University, the Netherlands (Delta study) [26], which had also asked consent to participate in the present study. Those included in the Delta study (criterion: minor or mild to moderate depression) were not invited to participate in the present study to prevent high respondent burden. The Delta study recruited participants through 89 family practices in the south of the Netherlands.

In wave one (October 2004), 600 questionnaires were sent out to people with either diabetes (N = 300) or COPD (N = 300). To increase response, a telephone reminder was issued after 2 weeks. Respondents who returned questionnaires with missing values were also followed up by telephone. One-third of the participants who returned the questionnaire (random selection stratified by disease) received the questionnaire again 4 weeks after their first response to assess reproducibility (wave two). A period of 4 weeks was chosen, because the time frame of the items was ‘last 4 weeks’ and we wanted to avoid partially overlapping time frames. We considered 4 weeks long enough to prevent recall bias.

Instruments

Besides the MSPP and questions about background characteristics (socio-demographics and health), the questionnaire included parts of the RAND-36 [27, 28] to measure general health perception and physical functioning. The Frenchay Activities Index (FAI) [29, 30] was included to assess construct validity. It consists of fifteen activities (scored on a four-point scale) in three subscales: leisure/work, outdoors and domestic domain. Higher scores indicate that people are more active. The FAI has been validated in a Dutch sample of stroke patients and a control group of older adults. Construct validity was acceptable and Cronbach’s alpha was higher than 0.60 in both groups for all three subscales [29].

Analyses

Reproducibility

Reproducibility of the MSPP was evaluated with intraclass correlation coefficients (ICCs) [31] and with smallest real differences at group level (SRDsgroup) [32]. ICCs were computed for each index and each item separately using a two-way random effects model with absolute agreement between the scores of wave one and two [33]. ICCs are relevant if the MSPP is used for discrimination purposes and should be at least 0.70 [34]. SRDsgroup were computed for each index according to the following formula [35]:

$$ \text{SRD}_{\text{group}} = \frac{\text{SD}_{\text{wave}2 - 1}}{\sqrt{n}} \times 1.96 $$

SRDgroup is relevant if the MSPP is used for evaluation purposes, because it indicates the magnitude of difference that may, with 95% confidence, be expected between two measurements on the same, stable group of participants (‘noise’). The SRD is expressed in the same units as the indices and should be smaller than the minimal amount of change that is considered to be important (MIC) [34]. As we do not know yet which amount of change researchers and/or patients may consider important, readers should judge the SRD levels for themselves. To facilitate interpretation, we here define the MIC as the amount of change in the mean scores if half of the sample remains stable and the other half scores one point higher on one item (frequency scores) or scores on one item more (diversity scores).
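As an illustration, the reproducibility statistics described in this section can be sketched in Python using only the standard library. Several points are assumptions of the sketch rather than details given in the text: the ICC is computed as a single-measure coefficient (two-way random effects, absolute agreement), the standard deviation of the wave two minus wave one differences is the sample standard deviation, and the function names and example data are hypothetical. The `mic` function encodes one reading of the MIC definition above: if half of the sample gains one point on one of k items, the group mean frequency score rises by 0.5/k, and if half of the sample scores on one more item, the group mean diversity score rises by 0.5.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def icc_absolute_agreement(wave1: Sequence[float], wave2: Sequence[float]) -> float:
    """Single-measure ICC, two-way random effects, absolute agreement."""
    n, k = len(wave1), 2  # n respondents, k = 2 measurement waves
    rows = list(zip(wave1, wave2))
    grand = mean(list(wave1) + list(wave2))
    row_means = [mean(r) for r in rows]
    col_means = [mean(wave1), mean(wave2)]

    # Sums of squares for respondents (rows), waves (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def srd_group(wave1: Sequence[float], wave2: Sequence[float]) -> float:
    """SRD at group level: SD of (wave 2 - wave 1) / sqrt(n) * 1.96."""
    diffs = [b - a for a, b in zip(wave1, wave2)]
    return stdev(diffs) / math.sqrt(len(diffs)) * 1.96


def mic(n_items: int, score_type: str) -> float:
    """Change in the group mean under the MIC definition used here."""
    if score_type == "frequency":
        return 0.5 / n_items  # half the sample gains 1/n_items on the mean item score
    if score_type == "diversity":
        return 0.5  # half the sample gains one item
    raise ValueError("score_type must be 'frequency' or 'diversity'")


# Hypothetical frequency scores of six respondents in both waves.
w1 = [0.50, 1.00, 1.25, 0.75, 2.00, 1.50]
w2 = [0.75, 1.00, 1.00, 0.75, 1.75, 1.50]
print("ICC:", icc_absolute_agreement(w1, w2))
print("SRD_group:", srd_group(w1, w2), "MIC:", mic(8, "frequency"))
```

Under this reading, an index meets the evaluation criterion when its SRD at group level is smaller than the corresponding MIC.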

Convergent and discriminant validity

To evaluate convergent and discriminant validity of the MSPP, we used the FAI [29, 30], because it is a measure for actual participation, like the MSPP. To our knowledge, the FAI is the only concise instrument for actual participation validated in a Dutch sample. The FAI measures the broad concept of participation, rather than social participation and could, therefore, be used for convergent as well as discriminant validation.

The FAI domestic domain (preparing meals, washing up, washing clothes, light housework, heavy housework) does not measure social participation, but only activities which do not involve an exchange between people (not related to MSPP indices). By contrast, the FAI leisure/work domain (social outings, pursuing hobby, outings/car rides, house/car maintenance, gainful work) covers all three characteristics of social participation: social contact, contributing to society and receiving from society (positively related to all four MSPP indices). Finally, the FAI outdoors domain (local shopping, walking outdoors, driving/bus travel, gardening, reading books) includes items which may involve social contact and receiving from society, but not contributing (positively related to MSPP consumptive participation and MSPP informal social participation, not related to MSPP formal social participation).

We hypothesized that MSPP CPfrequency should correlate positively (Pearson correlation coefficient) with FAI leisure/work and outdoors (convergent validity), and that those correlations should be higher than the correlation with FAI domestic (discriminant validity) [36], tested with Steiger’s test [37] (four hypotheses). The correlation with FAI domestic should be lower rather than absent, because the MSPP and FAI might correlate for other reasons, like physical functioning. We hypothesized the same for MSPP ISP-Afrequency and ISP-Ffrequency (eight hypotheses). MSPP FSPfrequency should correlate positively with FAI leisure/work, and this correlation should be higher than the correlations with FAI outdoors and domestic (three hypotheses). Twelve of the fifteen hypotheses should find empirical support [34].
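Comparing a convergent with a discriminant correlation means comparing two dependent correlations that share one variable (the MSPP index). A minimal Python sketch of such a comparison follows. It assumes the Z statistic for overlapping dependent correlations commonly attributed to Steiger (1980); the text does not specify which of Steiger’s statistics was used, and the function names and input values are hypothetical.

```python
import math


def steiger_z(r_jk: float, r_jh: float, r_kh: float, n: int) -> float:
    """Z statistic for the difference between two dependent correlations.

    r_jk: correlation of the shared variable j with k (e.g. the convergent one)
    r_jh: correlation of the shared variable j with h (e.g. the discriminant one)
    r_kh: correlation between the two non-shared variables k and h
    n:    sample size
    """
    z_jk = math.atanh(r_jk)  # Fisher z-transformation
    z_jh = math.atanh(r_jh)
    r_bar = (r_jk + r_jh) / 2
    # Covariance term of the two z-transformed correlations.
    psi = r_kh * (1 - 2 * r_bar**2) - 0.5 * r_bar**2 * (
        1 - 2 * r_bar**2 - r_kh**2
    )
    c = psi / (1 - r_bar**2) ** 2
    return (z_jk - z_jh) * math.sqrt(n - 3) / math.sqrt(2 - 2 * c)


def one_sided_p(z: float) -> float:
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical values: an MSPP index correlates 0.45 with one FAI subscale
# and 0.20 with another; the two FAI subscales correlate 0.30 (n = 400).
z = steiger_z(r_jk=0.45, r_jh=0.20, r_kh=0.30, n=400)
print("Z:", z, "one-sided p:", one_sided_p(z))
```

A one-sided p-value is used in the sketch because the hypotheses are directional (the convergent correlation should be the higher one).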

Comparison between COPD and diabetes

Reproducibility and validity analyses were carried out for COPD and diabetes separately.

Results of the second field test

Response and sample characteristics

Of the 600 questionnaires sent out in wave one, 412 (69%) were returned (206 COPD and 206 diabetes). Four weeks later, in wave two, 125 of 137 questionnaires were returned (91%). The percentage of respondents in wave one without missing values on an index was 93% for CP, 97% for FSP, 91% for ISP-A and 91% for ISP-F (before telephone follow-up). Mean age was 70 (range 60–87). More men than women participated, as a result of a skewed sex distribution in the sampling frame. General health perception and physical functioning [27, 28] were significantly worse in participants with COPD than in participants with diabetes. Co-morbidity was common in both. Table 1 presents various characteristics of participants.

Table 1 Characteristics of the study population in the validation study (as measured in wave 1)

Scores on the MSPP

Table 2 presents the scores on the MSPP for participants with COPD and diabetes separately. Observed scores on the MSPP covered the entire range of theoretically possible scores for all indices except CPfrequency, ISP-Afrequency and total diversity. On this last score, the observed score range reveals that all participants engaged in at least one type of social participation as measured by the MSPP. The results further suggest that people with diabetes tended towards a more diverse and more frequent social participation than people with COPD, but differences were small. Only the differences in total diversity (P = 0.02) and FSPfrequency (P = 0.02) were significant at the 0.05 level.

Table 2 Scores on the MSPP by disease

Reproducibility

Tables 3 and 4 show reproducibility results. Index ICCs ranged from 0.63 for CPfrequency to 0.83 for FSPdiversity (should be at least 0.70). Item ICCs were partly low, except in FSP. One might expect low ICCs to be found in particular in items referring to irregular types of participation, but this was not evident. SRDsgroup were smaller than the MICs (as they should be), except for ISP-Afrequency and ISP-Ffrequency.

Table 3 Reproducibility of the MSPP indices
Table 4 Reproducibility of the MSPP items

Convergent and discriminant validity

Convergent and discriminant validity of the MSPP were supported by the correlations between the MSPPfrequency and the FAI, which Table 5 shows. Convergent correlations were higher than discriminant correlations, but not very high. Differences between correlations were significant except for one, which means that 14 of 15 hypotheses found significant empirical support.

Table 5 Convergent and discriminant validity of the MSPP (Pearson correlations with FAI)

Reproducibility and validity for COPD and diabetes separately

Separate analyses for COPD and diabetes suggested better reproducibility of CPdiversity, CPfrequency, ISP-Adiversity and ISP-Afrequency in diabetes than in COPD (e.g. ISP-Afrequency: ICC diabetes = 0.80, ICC COPD = 0.64), while ISP-Fdiversity and ISP-Ffrequency yielded worse reproducibility results in diabetes than in COPD (e.g. ISP-Fdiversity: ICC diabetes = 0.62, ICC COPD = 0.79). For FSP, results were similar in diabetes and COPD.

Regarding convergent and discriminant validity, analyses for COPD and diabetes separately yielded similar results, except that fewer differences between correlations were significant due to a lower power (data not shown).

Final version of the MSPP

The results of the second field test did not cause us to change the MSPP. The final version of the MSPP is, therefore, identical to the semi-final version (see Appendix).

Conclusion and discussion

Existing instruments for participation (in the broad sense) in the field of health and disability measure its performance [38–41], frequency [30, 42] or subjective experience [42, 43]. The MSPP also measures frequency of participation, but it is distinctive in that it yields both frequency and diversity scores and focuses on the social aspects of participation. Furthermore, it builds on a definition of social participation given by older adults with a chronic illness themselves. We first discuss the development, validity and reproducibility of the MSPP and then compare these with the development and validation of two other instruments measuring frequency of participation (in the broad sense), namely the FAI [30] and (the objective part of) the Participation Objective Participation Subjective (POPS) [42].

The development process of the MSPP did not include the use of standard techniques based on associations between items, like internal consistency and factor analysis. We decided against these techniques because the items of the MSPP are causal variables rather than indicator variables [44]. Indicator variables reflect an underlying concept, which completely explains the correlations between the indicator variables. In this case, techniques based on associations between items are appropriate. In contrast, causal variables ‘are part of the definition of what the concept being measured means. (…) if they are present (…) then the concept in question is present.’ [44: 237] There is, for instance, no underlying degree of consumptive participation, which instigates people to go to the cinema. Rather, people engage in consumptive participation because they go to the cinema. Causal variables may be associated irrespective of the relationship with the concept they are measuring (e.g. social participation items that are impeded by fatigue). This makes techniques based on associations inappropriate, because these techniques may suggest removing items at the cost of content validity [45], or may suggest grouping items together based on other factors (e.g. fatigue) than the concept in question (social participation). We, therefore, decided not to use these techniques and instead paid close attention to content validity. The results from the first field test show that inter-item Pearson correlations were partly low or even negative, which supports our decision not to use techniques based on associations.

Content validity of the MSPP was scrutinized by experts in the area of participation research and, after amendments, tested again by other researchers. Convergent and discriminant validity were supported by correlations between the indices of the MSPP and the Frenchay Activities Index, but differences between convergent and discriminant correlations were small. One reason might be that the FAI is not an optimal match for convergent validation: the social activities in the FAI are spread across subscales that also include daily activities. Another reason might be that circumstances like physical functioning produced correlations between the MSPP and FAI indices (convergent and discriminant).

Reproducibility of the MSPP is moderate rather than good for the purpose of discrimination, because two of nine ICCs were lower than the threshold of 0.70 (seven if the lower limit of the 95% confidence intervals is used). Reproducibility is good for the purpose of evaluation (SRDgroup), but it is a limitation of the present study that the MIC was rather arbitrarily defined.

Furthermore, reproducibility of the MSPP differs for COPD and diabetes. It is unclear whether this is a limitation of the MSPP or rather a limitation of the present study. As the MSPP measures actual social participation in the last 4 weeks and the interval between waves one and two in our study was likewise 4 weeks, social participation may really have been different between waves one and two. This is not unlikely, considering that reproducibility results of CPdiversity, CPfrequency, ISP-Adiversity and ISP-Afrequency were worse in COPD, which is characterized by intermittent exacerbations. In times of exacerbations, people may be forced to, or choose to, restrict social participation, causing social participation to fluctuate more in COPD than in diabetes. To test whether reproducibility really differs for COPD and diabetes, waves one and two would have to take place within the closest possible time, for example on the same day or on two consecutive days.

There might be reservations about the sample used, because the MSPP is intended to be a generic instrument for older adults with a chronic physical illness, but the development process involved only two types of chronic disease. By using only two types, we could compare measurement properties. A generic instrument should be robust across different types of disease, meaning that scores may differ, but measurement properties should be the same.

Comparison with the development of the FAI and the objective part of the POPS shows that the former was developed using factor analysis [30], while the latter, like the MSPP, was developed using methods for causal variables. These seem more appropriate for measures of observable activity, like frequency of participation [42]. In particular, the validity of the POPS was explored by comparing results with expectations about differences between groups and correlations between subscale scores. Results and expectations did not match well [42].

Regarding reproducibility, ICCs of the FAI, POPS and MSPP are similar [42, 46]. The ICCs of the POPS subscales vary considerably. The POPS authors suggest as an explanation that participation behaviours that are ‘not scheduled into an invariant behaviour’ may vary between measurements [42]. Likewise, we suggested true variability as an explanation for the differences between COPD and diabetes in reproducibility of the MSPP.

Future research should try to establish the minimal change in MSPP scores deemed important by people with a chronic illness to facilitate the evaluation of reproducibility. This would also allow the assessment of responsiveness, which is important for evaluation purposes. Furthermore, the MSPP still needs to be tested in other patient groups. Since the items are not specific to people with COPD or diabetes, it might also be worthwhile to test the MSPP in a general population of older adults and to use it to compare healthy older adults with those with a chronic illness. Given the social participation behaviours it covers, the MSPP does not appear to be valid for use in younger age groups.

Although there are some unresolved issues, we conclude that the Maastricht Social Participation Profile is a measure for actual social participation by older adults with a chronic physical illness, which appears to have good validity and acceptable reproducibility for discrimination purposes.