Introduction

Wellbeing may be considered a contented state of being happy, healthy or prosperous. Measures of subjective wellbeing are highly recognized for their useful application in the assessment of global quality of life and its components. One of the most common approaches to their evaluation is through satisfaction scales, which represent a cognitive judgment of one’s life as a whole or relating to its different aspects [1, 2]. Questions on satisfaction directed to people’s feelings about themselves or life domains are commonly used to measure subjective wellbeing [3]. With regard to this perspective, the international wellbeing group created a measure of subjective wellbeing for cross-cultural comparisons: the international wellbeing index (IWI). It consists of two scales, the personal and the national wellbeing index (PWI and NWI, respectively). The former is a measure of satisfaction with different life domains, proximal to the individual, and the latter is a more distal measure focused on satisfaction with aspects relating to the current situation of the country where people live [4, 5]. The IWI has been found to be a reliable and valid instrument, with application in many countries [6–8]. All validation studies used a classic psychometric approach.

The development of the IWI was based on two theoretical principles. According to the theoretical principle of deconstruction, subjective wellbeing is measured by the minimum set of domains that represent the first-level deconstruction of satisfaction of life as a whole [3]. The theory of subjective wellbeing homeostasis proposes that, under normal circumstances, each person’s subjective wellbeing level is maintained within a limited positive range, and this is more evident at a personal wellbeing level [9].

At an intermediate level, between the personal and national levels, research on the community has received special attention in relation to health and wellbeing from various perspectives such as social epidemiology, health geography, and social ecology [10–12]. Specifically, the importance of place of residence and neighborhood satisfaction has been stressed in studies about health, physical activity, life satisfaction, and quality of life [13–15], with a significant contribution to research on aging in place [12, 16, 17].

Community quality of life studies have been based on different objective and subjective indicators including poverty, education, health, participation, business services, government, non-profit organizations, environment, and other important aspects, and have contributed useful and exhaustive information to community planning [18, 19]. No agreement exists about a universal definition of community wellbeing. This work focuses on key aspects that contribute to the quality of life at a community level. Accordingly, community wellbeing could be defined as the satisfaction with the local place of residence, taking into account the attachment to it, the social and physical environment, and the services and facilities.

Some aspects of community wellbeing have been addressed in neighborhood satisfaction questionnaires such as a subscale of the neighborhood environment walkability scale [20] and also by a single item [14]. Other measures focus on sense of community [21, 22], which place emphasis on feelings of belonging, fulfillment of needs through cooperative community behavior, reciprocal influence of the individual and the community, and emotional connection. However, there is no instrument that explicitly assesses community wellbeing while providing a brief and comprehensive linear measure, focusing on subjective aspects, and allowing potential integration into the IWI.

Community wellbeing is an important indicator in studies focusing on the impact of the local place of residence on a person’s health and general wellbeing. The residents’ perception of their community should contribute to practice and social policy decisions. For this reason, measures of community wellbeing are valuable tools that should provide rigorous assessments to ensure valid and reliable results.

The goal of this study was to develop a community wellbeing index (CWI) using the same format as the PWI and NWI to measure residents’ evaluation of their community, considering several items from a subjective perspective. Comprising the personal and the national indices, the IWI has greatly contributed to the evaluation of subjective wellbeing from a personal (PWI) to a more distal (NWI) approach. In this context, the CWI has the potential to complement the IWI at an intermediate level.

To provide a rigorous validation of the proposed index, both classic psychometric analysis and a measurement theory approach based on Rasch analysis were used. Being stricter than the classic psychometric approach, Rasch analysis represents a step further and is currently considered the standard for developing new instruments and judging the quality of existing ones [23–27].

Methods

Study sample

The study followed a cross-sectional design, with a nationally representative sample made up of 1,106 community-dwelling older adult residents in Spain, aged 60 years and over. Proportionate stratified sampling was employed in order to improve representativeness. The strata consisted of Spain’s autonomous communities (14 groups), population size of municipalities (7 groups), age (3 groups), and sex [28]. People whose scores exceeded 4 points in the Pfeiffer’s Short Portable Mental Status Questionnaire [29], indicative of cognitive deficit that might have impaired the ability to complete the survey, represented 4.4% (51 subjects) of the initial sample and were excluded from the study. Subjects who did not meet inclusion criteria or who refused to answer were replaced by others from the same sampling stratum.

Assessments and procedure

The CWI was applied in the context of a larger quality of life survey. Besides sociodemographic information, the survey included the following measures: PWI, CWI, NWI, EQ-5D [30], and Duke-UNC Functional Social Support Questionnaire (DUFSS) [31, 32]. The survey was administered as a structured interview at the participant’s home, after obtaining written informed consent. The study was approved by the Ethics Committee of the Carlos III Institute of Health.

The PWI and NWI consist of 7 and 6 items, respectively [3, 4]. Both use the same 11-point Likert-type response scale, from 0 (completely dissatisfied) to 10 (completely satisfied), with the neutral point being represented by 5. The two indices are represented by the mean values of the respective items, once transformed into a percentage over the maximum possible score.
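
The scoring rule described above (mean of the item ratings, expressed as a percentage of the maximum possible score) can be sketched as follows. This is a minimal illustration, not the authors' scoring code; the function name and the example ratings are hypothetical.

```python
def index_percent(item_scores, scale_max=10):
    """Mean item score expressed as a percentage of the scale maximum."""
    if not item_scores:
        raise ValueError("at least one item score is required")
    mean_score = sum(item_scores) / len(item_scores)
    return 100.0 * mean_score / scale_max


# Seven hypothetical PWI item ratings on the 0-10 response scale
pwi_example = index_percent([7, 8, 6, 9, 7, 8, 7])
```

With this rule, a respondent who answers 5 (the neutral point) on every item obtains an index of 50.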

The CWI followed the IWI structure, using the same response scale. The CWI is an inductive-based measure of community wellbeing. Four different sources of information were used to develop the proposed questionnaire, the first one being a literature review on residential environment, community wellbeing and aging in place. The second source was an earlier quality of life survey of older adults living in the province of Madrid (n = 499) [33–35]. The questionnaire was designed according to different life domains, including residential environment. The survey also incorporated the SEIQOL-DW [36], and results indicated that health, family network, economic situation, social network, and free time or leisure activities were the 5 dimensions most mentioned [34]. The third source was a survey on residential environment in old age, showing that the environment, services, leisure, and integration in the community greatly influenced residential satisfaction and satisfaction with life in old age [17, 37]. Finally, we used the format of the IWI and some of its items to propose a subjective and multi-item wellbeing index at an intermediate level.

Experts were invited to a meeting where the results from the previous studies were summarized. Since the PWI and the NWI are made up of 7 and 6 items, respectively, and items are usually deleted based on statistical results, the goal was to generate more than 7 items. The experts’ discussion continued until item redundancy (saturation) was reached and no new items were identified. Eleven items were initially proposed, measuring “the level of satisfaction with the following aspects of the village/city/place where you live”: economic situation, environment, social conditions, distribution of wealth, health services, social services for older people, support to families, trust in people, leisure services, belonging, and security. An item about “life in general in your place of residence” was also included in the survey.

To analyze discriminant validity, measures of health status (EQ-5D) and social support (DUFSS) were also included. Both are widely validated, including in Spain, and are therefore available in the participants’ language. The EQ-5D is a utility-based measure of health outcome, made up of 5 items that provide a single index value for health states, ranging from 0 to 1 [30]. It also comprises the EQ-VAS, a self-rating of current health represented on a visual analogue scale from 0 to 100. The DUFSS is a self-assessment measure of perceived functional social support, consisting of 11 items rated on a 5-point Likert-type scale [31, 32]. A higher score indicates better health or social support.

Data analysis

Data analysis was performed in two parts: Rasch analysis and a classic psychometric approach. The Rasch model, the simplest application of the measurement theory approach [38], assumes that there is a functional relationship between the person’s ability (in this case, the person’s rating of subjective wellbeing) and the item difficulty (or the level of the construct being measured). Drawing from the application of the Rasch model to customer satisfaction survey data, the items’ difficulty may be interpreted in terms of quality [39]: the smallest item measures correspond to better quality, and the highest item measures to poor quality of several community aspects.

Rasch analysis is based on two basic premises: local independence and unidimensionality. Local independence means that the scores are related to each other only through the construct, whereas unidimensionality implies that only one construct is being measured.

Rasch analysis was performed using RUMM2020 [40]. With large sample sizes, Rasch analysis might flag small deviations from the Rasch model as statistically significant and lead to the deletion of items that otherwise fit the model. Therefore, for statistical analysis purposes, a random sample of 300 was taken, which is sufficient to obtain accurate estimates of item and person locations regardless of the scale targeting [41]. No specific procedure was applied for missing data, since the Rasch model is quite robust to missing data [42].

The unconstrained polytomous model [43], applicable in case of items with response categories functioning differently, was used. A step-by-step approach was taken [44, 45], and the following aspects were analyzed: fit of data to the Rasch model, reliability, local independence, unidimensionality, internal construct validity, response category ordering, differential item functioning (DIF) by gender, and scale targeting.
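
To make the role of thresholds concrete, the sketch below computes category probabilities for a single polytomous item. It assumes that the unconstrained polytomous model takes the usual partial credit form, in which each item has its own set of thresholds; the function name and the threshold values are hypothetical and are not taken from the RUMM2020 output.

```python
import math


def category_probabilities(theta, thresholds):
    """Probability of each score 0..m for one item under a partial credit form.

    theta      -- person location in logits
    thresholds -- item threshold locations in logits, one per step
    """
    # Cumulative sums of (theta - tau_j); score 0 has an empty sum of 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - tau))
    weights = [math.exp(e) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 4-category item with ordered thresholds at -1, 0 and 1 logits
probs = category_probabilities(0.0, [-1.0, 0.0, 1.0])
```

In this example the person is located exactly at the second threshold (0 logits), so scores 1 and 2 are equally likely, which is how a threshold is defined.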

If there were a good fit of data to the Rasch model at item (or person) level, then a non-significant item (person) chi-square difference would be expected. In addition, individual item and person fit residuals should be within the ±2.5 range. Item and person summary fit statistics should also follow a standardized normal distribution with a mean ± standard deviation (M ± SD) of 0 ± 1.

Reliability was measured through the person separation index (PSI), interpreted similarly as the Cronbach’s alpha; the PSI is calculated using person locations instead of raw scores. Local independence means that one item response is not biased by another item response and was examined through the residual correlation matrix; correlations noticeably higher than the rest might indicate local dependency [40].
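
A minimal sketch of a PSI calculation is given below. It assumes the commonly used definition, in which the PSI is the share of the observed variance of person locations that is not attributable to measurement error; the exact computation in RUMM2020 may differ in detail, and the input values are hypothetical.

```python
from statistics import mean, variance


def person_separation_index(locations, standard_errors):
    """PSI = (observed variance - mean error variance) / observed variance."""
    if len(locations) != len(standard_errors) or len(locations) < 2:
        raise ValueError("need matching lists with at least two persons")
    observed_var = variance(locations)  # sample variance of person locations
    error_var = mean([se ** 2 for se in standard_errors])
    return (observed_var - error_var) / observed_var


# Hypothetical person locations (logits) and their standard errors
psi = person_separation_index([-1.0, 0.0, 1.0, 2.0], [0.5, 0.5, 0.5, 0.5])
```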

Unidimensionality was specifically tested through a principal component analysis (PCA) of the residuals [46]. In case of multidimensionality, an exploratory factor analysis (EFA) of the CWI item raw scores was performed in PASW Statistics 17.0, using the principal components method. Varimax rotation was used to interpret the factors. The number of factors retained was determined by an examination of the scree plot [47]. Results from the EFA were taken to suggest unidimensional domains to be tested by means of confirmatory factor analysis (CFA) in AMOS 16.0.0 and Rasch analysis. The following standard values were taken as indicating an acceptable fit for CFA: chi-square approaching non-significance; chi-square value/degrees of freedom ≤2 [48]; root mean square error of approximation RMSEA ≤0.06 [49]; comparative fit index CFI ≥0.95 [49]; goodness-of-fit statistic GFI ≥0.95 [50].
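
The numeric CFA criteria listed above can be expressed as a simple check. The sketch below is illustrative only: the function name is hypothetical, the "chi-square approaching non-significance" criterion is left out because it has no fixed cutoff, and the example values are the ones reported for the final CFA model in the Results.

```python
def cfa_fit_checks(chi2, df, rmsea, cfi, gfi):
    """Compare CFA fit statistics with the cutoffs used in this study."""
    return {
        "chi2/df <= 2": chi2 / df <= 2.0,
        "RMSEA <= 0.06": rmsea <= 0.06,
        "CFI >= 0.95": cfi >= 0.95,
        "GFI >= 0.95": gfi >= 0.95,
    }


# Fit statistics reported for the final CFA model (n = 300)
checks = cfa_fit_checks(chi2=51.31, df=26, rmsea=0.058, cfi=0.98, gfi=0.97)
```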

In Rasch analysis, response categories’ ordering was examined through category probability curves. Disordered thresholds (the point where probability of scoring on two adjacent categories is the same) were corrected by collapsing adjacent categories. For example, after collapsing the second and third category of the initial response scheme (0–10), the new response scheme would be 01123456789.
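
The collapsing of categories can be read as a lookup: the digit at position k of the response scheme is the new score for raw response k. A minimal sketch, with a hypothetical function name, using the scheme from the example above:

```python
def collapse(response, scheme="01123456789"):
    """Return the rescored category for a raw 0-10 response."""
    if len(scheme) != 11 or not scheme.isdigit():
        raise ValueError("scheme must contain 11 digits, one per raw category")
    if not 0 <= response <= 10:
        raise ValueError("raw response must be between 0 and 10")
    return int(scheme[response])


# Raw responses 1 and 2 share the same new score; all later scores shift down
rescored = [collapse(raw) for raw in range(11)]
```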

Differential item functioning examines whether items perform similarly across different groups. An analysis of variance (ANOVA) was conducted for each item comparing scores across each level of the group factor (gender, in this case) and across levels of the construct. DIF was considered to be present if the ANOVA was significant using a Bonferroni correction. Item characteristic curves by gender were also visually inspected.
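
As an illustration of the Bonferroni step, the sketch below divides the nominal significance level by the number of tests and flags only the P values below the adjusted level. The nominal level of 0.05, the number of tests, and the P values are assumptions made for the example; the source does not state them.

```python
def bonferroni_flags(p_values, alpha=0.05):
    """Flag tests that remain significant after Bonferroni correction."""
    if not p_values:
        raise ValueError("at least one P value is required")
    adjusted_alpha = alpha / len(p_values)
    return [p < adjusted_alpha for p in p_values]


# Ten hypothetical P values, one per item; adjusted level is 0.05 / 10 = 0.005
flags = bonferroni_flags([0.20, 0.004, 0.03, 0.60, 0.01,
                          0.45, 0.08, 0.33, 0.72, 0.15])
```

In this made-up example only the second item would be flagged, although three P values fall below the uncorrected 0.05 level.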

Scale targeting refers to the distribution of item location estimates in relation to the distribution of person location estimates [51]. Person-item distribution was used to analyze whether the sample covered all levels of the scale through visual inspection of a graphic showing the frequency and spread of persons and items along the variable (Fig. 1). The distribution of person locations is expected to have a M ± SD = 0 ± 1 logits. One logit, the unit of measurement in Rasch analysis for items and person locations, is the log-odds transformation of the probability of a response [52].
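
The log-odds transformation behind the logit can be written in a few lines. The function names below are chosen for the example.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Probability that corresponds to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


even_odds = logit(0.5)              # a probability of 0.5 is 0 logits
one_logit_up = inverse_logit(1.0)   # about 0.73
```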

Fig. 1 Person-item threshold distribution, in logits, of the 3 CWI scales (n = 300, final Rasch models): community services (on the figure's left); community attachment (in the middle); and physical and social environment (on the right). On the right of the figure's vertical lines, the value before the dot represents the item number followed by a short description, and the value after the dot represents the threshold

Once fit to the Rasch model was obtained, Rasch analysis was cross-validated in the remaining sample (n = 806). The linear measures obtained from Rasch analysis of the whole sample, expressed on a 0–100 scale, were exported into PASW Statistics 17.0 to evaluate the CWI’s external construct validity (n = 1,106). For convergent validity [53], a moderate (r = 0.35–0.50) correlation was expected with the PWI, NWI, and the item “satisfaction with life in general in your place of residence.” A weak correlation (r < 0.35) was expected with the EQ-5D and DUFSS, as the constructs they measure are only loosely related in the general population. To control for measurement error, correlations were disattenuated, a correction that takes into consideration the reliability of the correlated measures [54].
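
The correction for attenuation can be sketched as below, assuming the classic formula in which the observed correlation is divided by the square root of the product of the two reliabilities. The function name and the example values are hypothetical.

```python
import math


def disattenuate(r_xy, reliability_x, reliability_y):
    """Observed correlation corrected for unreliability of both measures."""
    if reliability_x <= 0 or reliability_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(reliability_x * reliability_y)


# Hypothetical observed correlation of 0.40 between two measures,
# each with a reliability of 0.80
corrected = disattenuate(0.40, 0.80, 0.80)
```

Because reliabilities are below 1, the corrected coefficient is always at least as large in absolute value as the observed one.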

Known-groups validity was also analyzed by comparing groups with low and high satisfaction with place of residence, taking the median, 7, as a cutoff point. Since measures that fit the Rasch model provide true linear scales, parametric statistics (Pearson r correlations; Student’s t tests for group comparisons) were used after confirming normality assumptions.

Results

Descriptive information of the study sample was summarized according to sociodemographic and health characteristics, and the IWI scales (n = 1,106, Table 1).

Table 1 Descriptive data of the sample (n = 1,106): sociodemographics, health characteristics and IWI scales

Rasch analysis of the subsample of 300 indicated that the initial 11-item scale showed poor fit to the model. The item “distribution of wealth” displayed a significant misfit, with a high residual (5.333), and was therefore discarded. Almost all items displayed reversed thresholds, suggesting that the 11-point response scale was not working appropriately. A PCA of the residuals was performed, and when the person estimates of two subsets of items were compared, there was a high proportion (15.38%) of independent t tests falling outside the ±1.96 range, indicating a lack of unidimensionality.
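
The comparison of person estimates from two item subsets can be sketched as follows. The sketch assumes that the t statistic for each person is the difference between the two estimates divided by the square root of the sum of their squared standard errors, which is the usual form of this test; the function name and the data are hypothetical. A proportion well above roughly 5% is commonly read as a sign of multidimensionality.

```python
import math


def proportion_significant(est_a, se_a, est_b, se_b, critical=1.96):
    """Share of persons whose two subset estimates differ significantly."""
    if not (len(est_a) == len(se_a) == len(est_b) == len(se_b)) or not est_a:
        raise ValueError("need four non-empty lists of equal length")
    t_values = [
        (a - b) / math.sqrt(sa ** 2 + sb ** 2)
        for a, sa, b, sb in zip(est_a, se_a, est_b, se_b)
    ]
    significant = sum(1 for t in t_values if abs(t) > critical)
    return significant / len(t_values)


# Four hypothetical persons; only the third has clearly different estimates
share = proportion_significant(
    [0.0, 1.0, 2.0, -1.0], [0.5, 0.5, 0.5, 0.5],
    [0.1, 1.2, 0.0, -1.1], [0.5, 0.5, 0.5, 0.5],
)
```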

An EFA suggested the presence of 3 factors or subscales: community services, community attachment, and physical and social environment (eigenvalues: 5.08, 1.23, and 0.88, respectively), accounting for 71.9% of the variance (Table 2). Results from CFA showed a good fit after freeing parameters in the error covariance matrix: χ2(26) = 51.31, P = 0.002; χ2/df = 1.97; RMSEA = 0.058; CFI = 0.98; GFI = 0.97 (Table 2). Correlations between factors ranged from 0.73 to 0.82.

Table 2 Factor loadings for the exploratory factor analysis (EFA) and final confirmatory factor analysis model (CFA) of CWI items (n = 300)

These subscales were analyzed according to the Rasch model. All items but two (items 4 and 8) had disordered thresholds and were therefore rescored. The new response scheme was as follows (Table 2 shows a description of item content): items 1 and 3, 01123456789; item 2, 01112345667; items 5 and 6, 01123456789; items 7, 9, and 10, 01112345678. After rescoring for disordered thresholds, the three CWI subscales showed a satisfactory fit with a PSI of 0.82–0.85 (Table 3). All items were locally independent and displayed a satisfactory fit; unidimensionality of the subscales was verified by a low proportion of significant t tests [46]. All items were free from DIF by gender. DIF was also absent in the total sample (n = 1,106). Table 4 shows the item difficulty hierarchy and the individual item fit to the Rasch model by subscale. Online Resource 1 (See Supplementary material) presents the Spanish CWI.

Table 3 Global fit to Rasch model of CWI subscales (n = 300) after rescoring for threshold disordering and deleting item “wealth”
Table 4 Individual item fit to the Rasch model after rescoring for threshold disordering (n = 300)

The three subscales displayed adequate scale targeting, with a M ± SD person location of 0.240 ± 1.066, 0.429 ± 1.599, and 0.576 ± 1.524 (community services, community attachment, and physical and social environment, respectively), evenly spread item thresholds covering a wide range (−4 to 5 logits), and a normal distribution of person locations, with no floor or ceiling effects (Fig. 1).

Item hierarchies within the subscales are presented in Table 4, and Fig. 1 shows both individuals and item thresholds placed on the same logit scale. People with higher measures (top) have higher satisfaction levels on the constructs represented by the subscales. Item thresholds are ordered so that the easiest thresholds to endorse (those most likely to occur) are positioned at the top and the hardest at the bottom.

To cross-validate the analyses, the final models were satisfactorily replicated in the rest of the sample (n = 806 cases) and also in the total sample (n = 1,106), measured through fit indices to the Rasch model. Using independent sample t tests, person locations of the study sample (n = 300) were not significantly different from the cross-validation sample (n = 806), except for the subscale physical and social environment. The mean difference was less than 0.5 logits (0.222; confidence interval: 0.010–0.435) and approached non-significance (P = 0.04).

Online Resource 2 (See Supplementary material) provides the transformation of initial total scores (after rescoring for disordered thresholds) into linear measures for the total sample (n = 1,106). The person estimates of the three subscales, expressed in a 0–100 linear measure, were used to conduct the classic psychometric analysis. To calculate the total score for each subscale without having to perform Rasch analysis, first rescore items as explained above, then obtain a total score by summing the resulting item scores, and finally convert each subscale total score into the 0–100 linear scale following Online Resource 2 (See Supplementary material).
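
The three scoring steps can be sketched as below. The rescoring schemes are those given earlier in the Results (items 4 and 8 keep their original scores). Two things are assumptions: the grouping of items 1, 2 and 4 into one subscale is purely illustrative, since subscale membership is defined in Table 2, and the conversion table must be filled in from Online Resource 2, which is not reproduced here.

```python
def parse_scheme(scheme):
    """Turn an 11-digit scheme into a list indexed by raw response 0-10."""
    return [int(digit) for digit in scheme]


# Rescoring schemes by item number, as reported for the final models
RESCORING = {item: parse_scheme("01123456789") for item in (1, 3, 5, 6)}
RESCORING[2] = parse_scheme("01112345667")
RESCORING.update({item: parse_scheme("01112345678") for item in (7, 9, 10)})
RESCORING.update({item: list(range(11)) for item in (4, 8)})  # unchanged


def subscale_total(raw_responses):
    """Steps 1 and 2: rescore each item, then sum the rescored values.

    raw_responses -- dict mapping item number to raw 0-10 response,
                     restricted to the items of one subscale
    """
    return sum(RESCORING[item][raw] for item, raw in raw_responses.items())


def to_linear_measure(total, conversion_table):
    """Step 3: look up the 0-100 linear measure for a subscale total.

    conversion_table -- dict taken from Online Resource 2 (not included here)
    """
    return conversion_table[total]


# Hypothetical respondent and hypothetical item grouping
total = subscale_total({1: 7, 2: 10, 4: 8})
```

Here the raw responses 7, 10 and 8 are rescored to 6, 7 and 8, giving a subscale total of 21 to be looked up in the conversion table.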

Table 5 shows CWI descriptive information. The CWI subscales displayed a moderate association (disattenuated correlations) with the NWI (range: 0.35–0.56) and small-to-moderate correlations with the PWI (0.30–0.45). Finally, CWI subscales displayed a weak correlation (r < 0.25) with the EQ-5D (index and VAS) and DUFSS. People more satisfied with the place of residence had a significantly higher community wellbeing, in the three subscales, than people less satisfied [community services: t(1,000) = −5.82, P < 0.001; community attachment: t(1,038) = −9.16, P < 0.001; physical and social environment: t(975) = −7.29, P < 0.001].

Table 5 Linear measures of CWI subscales (n = 1,106): descriptive statistics and correlations (Pearson r coefficients) with other scales

Discussion

The main goal of this study was to develop and validate the CWI, using Rasch analysis and a classic psychometric approach. Scales developed under the Rasch model offer many advantages besides a guarantee of unidimensionality. They provide results on a true linear measurement scale, which supports the use of parametric statistics and the calculation of change scores. In addition, Rasch scales furnish ordered response categories. Frequently, Rasch analysis yields shorter scales that are as reliable as those developed following classic methods.

The CWI, centered on the local place of residence, provides an intermediate focus between the NWI, which measures satisfaction with the whole country, and the PWI, aimed at assessing people’s satisfaction with their own lives. With 10 items grouped into three subscales, the CWI is briefer than the 17-item Neighborhood Satisfaction Subscale [20], while providing more comprehensive information than single-item measures [14].

Rasch results indicate that the initial CWI was not a unidimensional scale and present empirical evidence against using a single total CWI sumscore. Instead of measuring a single construct, the CWI provides measures in three subscales, quite different from the ones found in an unpublished thesis, proposing a 24-item CWI version validated using classic psychometric methods in 255 adults [55]. It would be interesting to use parallel analysis to examine whether the factor structure of the CWI replicates in other samples.

The CWI subscales measure three constructs, identified by the names of the three subscales: community wellbeing in terms of satisfaction with the community services, with community attachment, and with physical and social environment. This is congruent with the idea that life satisfaction is represented by satisfaction with several life domains (such as personal, community or national wellbeing); following a satisfaction hierarchy, satisfaction with a life domain is influenced by lower levels of concern with that domain [56–58]. The importance of community services, community attachment, and perception of the community environment has been recognized by several authors [58–61].

The unit of measurement of community wellbeing is one logit, linearly transformed into a 0–100 continuum. The measurement continuum represents a low-to-high satisfaction with the 3 above-mentioned measured constructs. Items and persons may be ordered by quality in this same scale, from good (smallest item measures) to poor quality (highest item measures) of community aspects.

Community wellbeing index subscales showed adequate reliability, making it possible to statistically differentiate between at least 3 participant groups [62]. In previous studies, the PWI showed a Cronbach’s alpha of 0.70–0.85 [3] and the NWI 0.83 [8].

The initial 11-point response scale displayed disordered thresholds, indicating that older respondents were not able to reliably distinguish between response categories. With the new response scheme, all response categories were ordered as expected. Rasch analysis of the PWI and NWI would show how the 11-point response scale works in these scales. Rasch analysis makes an important contribution to the debate about response formats for ratings of life satisfaction [63] by offering an empirical test of the number of response categories.

The PWI and NWI are sensitive to gender differences in large community-level studies [4]. However, it is important to guarantee that they do not include items that produce gender bias. In other words, men and women with the same level of subjective wellbeing are expected to answer similarly. This was specifically analyzed for the CWI through DIF analysis, and all items were free from gender bias.

The CWI’s construct validity was determined in three ways. First, the CWI showed good internal construct validity, displayed by an adequate fit to the Rasch model in two samples. Second, a classic psychometric approach was used to analyze the CWI’s external construct validity, which showed moderate correlation coefficients with the NWI and low correlations with unrelated constructs (health status and social support). Finally, the CWI showed satisfactory known-groups validity, operationalized as the ability to differentiate between groups with high and low satisfaction with the place of residence. These results are in accordance with the construct validity of the PWI, established previously using multiple regression [3].

There are several limitations to this study, one of them being related to lack of a gold standard in the study of the CWI’s convergent validity. The PWI and NWI were used as related measures, although they actually measure non-overlapping constructs. On the other hand, known-groups validity was examined by comparing groups with high and low satisfaction with life in place of residence, an item created purposely for the study and therefore with uncertain reliability and validity. However, this is a similar item to the ones used in the validation work of the PWI and NWI [3, 4], with good results.

A second limitation refers to the absence of qualitative or quantitative pilot studies to help define the constructs and generate the initial CWI item pool. The source of item generation was based on the IWI index, literature review, and results of previous quality of life surveys carried out by the researchers. Further studies are also needed to more thoroughly examine validity issues (including content and external construct validity) and other psychometric attributes such as inter-rater and test–retest reliability, precision, and responsiveness [64, 65]. The high PSI obtained in Rasch analysis, giving many different points on the measurement scale, is supportive of responsive subscales. As with the PWI, the CWI is not proposed as a static instrument, but as one that will improve as new results and theory are developed [3].

This study proposes a new measure of subjective wellbeing related to the local place of residence, the CWI. This promising measure, when applied to older adults, provides valid and reliable results in three subscales. Being a measure that fits the Rasch model, it offers results in a linear scale. With the development of self-scoring forms, results in the linear scale could be easily obtained [66].

Further studies are needed to check the robustness of the CWI psychometric properties, both under the classic and the measurement theory approaches, with samples of different age-groups and cultural settings. It would also be very interesting to examine the fit of the PWI and the NWI to the Rasch model.