Introduction

In many European countries, the delivery of nursing home services occurs mainly through public and private nonprofit providers, which may be financed by different payment schemes. Widespread concerns about nursing home quality have played an important role in the recent political discussions about increasing funding [1]. The underlying assumption is that more resources are needed to boost quality levels. However, little evidence is available to support this hypothesis in the nonprofit sector and for different payment systems. The aim of this analysis is to provide evidence on the relationship between quality and costs in one European country, Switzerland. This evidence may help to inform policymakers in countries that share the same characteristics of the Swiss nursing home sector.

Ideally, we would want to investigate this relationship by pooling data from different European countries under the same delivery system. However, for our analysis, we choose to focus on one country since access to data represents a barrier, and country and regional differences in regulation and definition of nursing home services are likely to confound our findings.

Switzerland is a federal state with 26 independent cantons, which are granted extensive autonomy in the provision of long-term care services and other social services. Since this high level of autonomy has led to 26 heterogeneous systems, we limit the analysis to one homogeneous region (Canton Ticino). In Canton Ticino, nursing home services are mainly provided by public (46.5 %) and private nonprofit (48.5 %) nursing homes (NHs), while for-profit providers represent a small minority (5 %). However, private NHs are excluded from the cantonal administration and are not required to share their data. The provision of nursing home care is further decentralized at the local level (municipalities), and elderly people are commonly assigned to the NH in the community of residence, virtually excluding competition and patient self-selection. Prices and some aspects of quality are regulated at the cantonal level. In particular, quality is regulated in terms of structural elements and staffing levels.Footnote 1 Capital costs are covered through a retrospective payment system, while operating costs are subject to global budget payment. The global budget payment system replaced the previous cost reimbursement system in 2006 in order to increase transparency and efficiency in the sector. Consumer fees finance part of the system and are a function of residents’ wealth and income (pension payment).

A positive relationship between costs and quality is generally expected when higher levels of quality can be provided through structural and procedural improvements, such as obtaining more costly equipment or additional staff employment. However, adverse patient outcomes may be costly to treat because they involve additional resource utilization for extra care. The relationship between costs and quality may therefore depend on the dimension considered. Better procedures are expected to increase costs, while prevention of adverse outcomes may actually reduce costs. Recent studies on nursing home costs using clinical quality indicators generally include single indicators of quality in cost analyses, possibly neglecting the relationship between different quality dimensions. Since the correlation between quality indicators is usually low, more effort is needed to understand whether the multidimensional nature of quality is better captured by single or combined quality indicators.

Through this paper, we investigate the relationship between quality and costs in NHs using a cost function approach consistent with the economic theory of health care production. We contribute to the existing literature in two main respects. First, we use panel data models to address omitted variables bias, which is particularly relevant when a subselection of indicators is included in the analysis. Second, we disentangle the impact of process and outcome quality dimensions on costs. This is done using both composite and single measures of quality. To our knowledge, this is the first study providing evidence on the relationship between costs and quality in nonprofit nursing home care using panel data, if we exclude the analysis by Wodchis et al. [3], which does not specify a cost function.

The remainder of the paper is organized as follows: "Quality" defines quality measurement in nursing home care. "Empirical evidence on the impact of quality on costs" reviews previous studies on the relationship between costs and quality. "Model specification and data" describes the data set and discusses the choice of quality indicators and the empirical strategy. Estimation technique and results are presented in "Econometric estimation and results". "Conclusions" provides concluding remarks.

Quality

No universal definition of quality exists in health research. The US Institute of Medicine [4] states that “quality of care is the degree to which health services for individuals and populations increase the likelihood of desired health outcomes and are consistent with current professional knowledge”.Footnote 2 This definition has significantly influenced the literature on quality and is very much related to the paradigm of quality proposed by Donabedian [9]. His seminal article on the assessment of quality of care represents the foundation of modern quality assessment, providing a framework of reference with guidance validity. Donabedian proposed the so-called structure, process, and outcome (SPO) framework. Structure is defined by the attributes of the setting in which care is provided, such as material resources (e.g., equipment), human resources (e.g., staffing levels), and organizational structure (e.g., payment system). Process refers to the activities of practitioners in giving care, such as making a correct diagnosis and implementing the treatment accordingly. Outcome defines the change in health status of the patient.

Empirical studies are often unable to include information on all three dimensions of care because of measurement deficiencies and limited data availability. Recently, the introduction of the resident assessment instrument (RAI) in the United States and some European countries has enabled a comprehensive and multidimensional assessment of all nursing home residents’ health status. These data, also called the minimum data set (MDS), are used to develop a battery of clinical indicators of quality that fit the taxonomy of the SPO model [10, 11]. As such, they offer a unique tool to measure and compare the quality of nursing homes in different domains of care [12].

Table 1 Classification of quality indicators according to the SPO framework developed by Donabedian [9]

The success of the SPO paradigm lies in its broad scope, which encompasses older and newer definitions of quality. Table 1 shows how different measures of quality used in the literature fall within the dimensions of the SPO framework. The first three columns include non-clinical and clinical indicators of structure, process, and outcome. The last column comprises consumer- and family-reported indicators. In a recent study, Li et al. [13] show that satisfaction ratings are associated with other common indicators of quality, such as higher nursing staffing levels and fewer citations. The authors also find higher scores in public and private nonprofit nursing homes compared to for-profit nursing homes. Unfortunately, satisfaction ratings are available for only one year in the present study and suffer from lack of variation across facilities (\(<\)10 %). For this reason, these data cannot be used.

With the development of quality indicators derived from the RAI, clinical measures of quality regarding process and outcome are now available. However, countries still use different systems to measure quality in the nursing home sector [14], and only a few of them have adopted the RAI.

Previous studies attempt to capture nursing home quality differences mainly using indicators of structure or indirect signals, such as the number of deficiency citations [15], staffing levels [16–18], staff characteristics [19–22], and willingness to take leadership [23]. A recent systematic review by Bostick et al. [24] shows not only evidence of association between higher levels of licensing attained by staff members and quality, but also a significant relationship between staff turnover and quality indicators such as pressure ulcers, weight loss, and functional ability. Some relatively old indicators (non-clinical) are still considered valid and are often combined in empirical studies with clinical quality indicators derived from the RAI.

The advantages and disadvantages of quality indicators based on the SPO model are discussed in [15]. Structural indicators are easy to measure, and data are often available. The disadvantage is that the presence of structural attributes does not imply their best use. Indicators of process are usually easy to interpret as they inform on the provision of a particular treatment. Even in this case, it cannot be determined whether or not the provided treatment is appropriate. Finally, outcome indicators are of natural interest, as they measure the change in patients’ health status. The main problem with these indicators is that it is extremely difficult to isolate the effect of care and changes in health, as the latter may be influenced by many uncontrolled factors.

The recent development of clinical quality indicators has improved the measurement of quality, but with some limitations. Firstly, due to the absence of a universally accepted definition of quality, the selection of quality indicators to include in empirical analyses is, to some extent, arbitrary. This is an issue because of the usually low correlation among quality indicators. Indeed, facilities with excellent outcomes in some dimensions may perform poorly in others. The choice of indicators may therefore affect the perception of nursing home quality. Secondly, detection bias occurs if higher-quality nursing homes are more vigilant in looking for and detecting quality issues [25]. Since nursing home staff rather than an independent authority assess residents’ health status, a risk of detection bias exists. Thirdly, variation in clinical quality indicators may be due not only to changes in quality, but also to changes in risk or to error [26]. To cope with this issue, different risk adjustment techniques are used. While previous studies of nursing home quality mainly use adjustment methods at the facility level [27–29], more recently, risk adjustment is performed at the individual level when data are available. Different approaches include stratification, covariate models [30], and standardization [31]. For some clinical indicators of quality that are considered particularly relevant in detecting the presence of problematic cases of quality shortcomings, no risk adjustment is required. Among these are the presence of daily physical restraints [12], dehydration, and fecal impaction [26, 32]. The main issue with risk adjustment techniques is that they may only partially capture residents’ risk factors, resulting in biased estimates of quality coefficients. Risk adjustment is also of concern when risk adjustment factors are themselves a function of quality. In these cases, quality scores could be over-adjusted, giving credit for poor quality [33].

Empirical evidence on the impact of quality on costs

The literature on nursing home costs is extensive, but only marginally addresses quality of care. One challenge lies in the measurement of quality. Due to the absence of clinical indicators of quality, studies mainly use non-clinical measures, such as the number of deficiency citations, information about staffing (e.g., turnover rate or skill characteristics), or mortality rates. Others rely on modeling quality as a latent variable [34, 35].

Empirical models using non-clinical quality measures mainly focus on the impact of specific factors on costs, such as market structure, forms of organization, or reforms implemented in the nursing home sector. Quality measures are usually introduced as control factors. Some of these studies use staffing information [18, 21, 36] or deficiency rates [37]. Another strand of literature explores the determinants of quality variability. Factors considered include the impact of state regulations [38, 39], ownership form [40, 41], competition [42–45], and financial performance [46].

Table 2 Overview of selected studies investigating the relationship between costs and quality in nursing homes

We focus our review on studies that use clinical indicators derived from the RAI to investigate the relationship between costs and quality. The main contribution of these studies is summarized in Table 2, where details on the choice of quality indicators, the empirical approach, and the results are presented.

Mukamel and Spector [47] analyze nursing homes in New York State using regression-based risk adjustment. The authors report an inverted U-shaped relationship between costs and quality. An important contribution to the cost-quality relationship is provided by Laine et al. [48, 49], who implement stochastic frontier models for the Finnish long-term care sector. The prevalence of pressure ulcers is the only quality indicator positively associated with technical inefficiency. Laine et al. [49] provide a similar cross-sectional analysis that shifts the focus from productive efficiency to cost efficiency. The mean values of the indicators over a three-year period are taken without risk adjustment. The results show that a worse outcome in terms of higher prevalence of pressure ulcers is associated with higher costs, while poor process quality measured by the weekly use of antidepressants and hypnotics is associated with higher inefficiency. However, the impact of these quality indicators is relatively low.

Weech-Maldonado et al. [50] investigate the impact of quality on costs in US nursing homes. Using cross-sectional data from around 750 facilities, they test the inverted U-shaped theory. Indicators are adjusted for risk using the covariates model [51]. To our knowledge, this is the only study that addresses endogeneity by instrumenting the quality indicators with county-level variables associated with nursing home demand (e.g., poverty rate over 65, female older than 75 years, education levels, mortality rate, Medicare inpatient days). However, the validity of the instruments is not tested. The results show an inverted U-shaped relationship between costs and pressure ulcers. An opposite pattern arises for mood decline, showing that different indicators of quality may lead to different types of relationships. Additional evidence based on data from Ontario, Canada, is provided by Wodchis et al. [3], who estimate panel data models. The analysis shows a negative relationship between costs and daily use of physical restraints, as well as worsening incontinence. Antipsychotic use, the prevalence of pressure ulcers, and the prevalence of severe pain are not statistically significant.

Most of the studies presented above find a correlation between some quality indicators and costs. However, the association is weak and the approaches used are hardly comparable. The majority of these studies use a cross-sectional design and do not account for unobserved heterogeneity that may affect both costs and quality. Unobserved heterogeneity may represent a serious problem in analyses of the cost-quality relationship due to the difficulty in measuring quality. If the risk adjustment technique used in cross-sectional studies does not capture the facility-specific features perfectly, then the results may be biased. Also, only a few studies address the potential endogeneity of quality, and virtually no tests of the validity of the instruments are provided.

In the following section, we propose an empirical approach to investigate the relationship between costs and quality using data from Swiss nursing homes. The main novelty of this approach is the inclusion of process and outcome quality measured by composite or single quality indicators into a cost function. As compared to previous studies, we are also able to control for omitted variable bias by exploiting the panel structure of our data.

Model specification and data

Choice of quality indicators

Quality indicators measure adverse events such as the use of antipsychotic drugs, injuries, bedridden residents, and pressure ulcers. To select appropriate quality indicators from the 22 available in our data set, we consider two approaches. The first approach combines quality indicators to obtain composite measures of process and outcome quality. Conversely, the second approach selects single quality indicators of process and outcome.

Combining different quality indicators, as suggested by organizations including the US Institute of Medicine [52], allows us to condense the multidimensional nature of quality, limit the number of variables included in an econometric model, and overcome possible arbitrariness in the choice of quality indicators. However, combining different quality indicators requires a weighting mechanism, which may be subject to criticism. Differences in the number of eligible residents for different quality events across facilities may represent a serious problem in obtaining a composite measure of quality. To overcome this problem, quality indicators can be adjusted before aggregation to increase comparability across facilities. The avoidable number of residents potentially exposed to different quality events may offer a valid solution for adjustment. As an alternative, one can generate composite indicators using a principal component analysis (PCA), where many single indicators are reduced to a small number of orthogonal components (see for details [53, 54]). Since a composite measure of quality makes it difficult to identify the factors affecting costs, we rely on Donabedian’s classification of quality and derive separate composite indicators for both process and outcome. This allows us to identify the effect of the two quality dimensions separately.

Table 3 Descriptive statistics of costs, output, and structure variables
Table 4 Descriptive statistics of process and outcome quality indicators

To derive composite indicators of process and outcome quality, we then use two methods. First, we weight each quality indicator by the number of residents exposed to a given quality event within each nursing home. The second method applies PCA to single quality indicators of process and outcome to obtain a few orthogonal components, which can be used as composite measures of quality.Footnote 3 PCA is a statistical procedure that converts the observations of possibly correlated single quality indicators into a set of linearly uncorrelated variables, called principal components, through an orthogonal transformation. Each succeeding component accounts for as much of the variability in the data as possible under the constraint that it is uncorrelated with the previous components. As a rule of thumb, components with eigenvalues higher than 1 are generally considered. In our case, we consider the first two components to approximate composite measures of process and outcome quality. We have also replicated our analysis using the third component of outcome quality since its eigenvalue is slightly higher than 1, but the results are unchanged.
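The PCA step described above can be illustrated with a minimal sketch. It assumes NumPy is available and uses simulated indicator scores, since the study data are not reproduced here; the function name `pca_components` and the simulated matrix are hypothetical. The sketch standardizes the indicators, decomposes their correlation matrix, and keeps the components whose eigenvalue exceeds 1. In the paper, this would be applied separately to the process and the outcome indicators.

```python
import numpy as np

def pca_components(scores, min_eigenvalue=1.0):
    """Return all eigenvalues (descending) and the scores of the retained components."""
    X = np.asarray(scores, dtype=float)
    # Standardize each indicator (column) to mean 0 and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigen-decomposition of the correlation matrix of the indicators.
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns eigenvalues in ascending order; sort them in descending order.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Rule of thumb: retain components with eigenvalue above the threshold.
    keep = eigvals > min_eigenvalue
    return eigvals, Z @ eigvecs[:, keep]

# Simulated data (hypothetical): 40 facility-year observations, 4 indicators,
# two of which share a common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(40, 1))
indicators = np.hstack([common + 0.5 * rng.normal(size=(40, 2)),
                        rng.normal(size=(40, 2))])

eigvals, components = pca_components(indicators)
```

The retained columns of `components` are mutually uncorrelated by construction, which is what makes them usable side by side as regressors.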

As stated above, the second approach to select appropriate quality indicators is based on single quality indicators. Single quality measures are probably more reliable and meaningful than composite measures. However, a selection process is needed to limit the number of indicators used in an econometric model. Our selection process builds on two strands of literature: the medical recommendations literature and the medical-statistical literature.

Regarding medical recommendations, we consult the numerous lists of recommended indicators used in benchmarking analyses of nursing homes [23, 25]. From the medical-statistical literature, we derive three main criteria that should be satisfied for the empirical analysis (see for instance [49]): a relatively large variation in the quality scores, the absence of multicollinearity between the indicators and other variables, and a relatively large number of observations from which the quality indicators are calculated. The latter criterion is motivated by statistical properties, since some quality indicators capture the onset of rare events. In these cases, the relevant question is whether the observed frequency of the event can be considered a “true score”, or whether it is driven by random shocks. Indeed, standard errors of rare events are large and generate problems in the comparison of quality among facilities. Generally, the minimum number of observations for benchmarking is 20 [12].
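The concern about small denominators can be made concrete with a short sketch. It assumes that the observed prevalence behaves like a binomial proportion, which is a simplification introduced here for illustration and not a method taken from the study; the function name and the example figures are hypothetical.

```python
import math

def prevalence_standard_error(prevalence, n_residents):
    """Binomial standard error of an observed prevalence (simplifying assumption)."""
    if n_residents <= 0:
        raise ValueError("n_residents must be positive")
    return math.sqrt(prevalence * (1.0 - prevalence) / n_residents)

# Hypothetical illustration: the same 7 % prevalence observed on
# denominators of different size.
for n in (10, 20, 100):
    se = prevalence_standard_error(0.07, n)
    print(f"n={n:>3}: standard error = {se:.3f}")
```

Under this assumption, the standard error shrinks with the square root of the denominator, so scores computed on very few residents are dominated by noise.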

Based on these criteria, we select two process quality indicators and two outcome quality indicators. The two indicators of process are the presence of antipsychotic use for low-risk residents and the daily use of physical restraints. The two indicators of outcome include the prevalence of weight loss and the prevalence of severe pain. We also control for time-invariant quality features in the structure of nursing homes through the econometric specification of our model (see "Econometric estimation and results").

Detailing the cost function

In order to identify the impact of quality on costs, we consider a cost model that includes quality indicators as derived in "Choice of quality indicators". Total costs are a function of output (Y), measured by the number of patient-days of nursing home care, prices for labor, capital, and material (\(P_{\rm l}\), \(P_{\rm k}\), \(P_{\rm m}\)), the institutional form of the nursing home (IF), the case-mix of residents (MIX), the nursing staff ratio (SR), a vector of process and outcome quality indicators (\(\mathbf {q}\)), and a time trend (\(\tau\)) that captures technological progress:Footnote 4

$$C=f(Y\text {, }P_{\rm l}\text {, }P_{\rm k}\text {, }P_{\rm m}, \text {IF, }\text {MIX, }\text {SR} \text {, }\mathbf {q}\text {, }\tau )\text {.}$$
(1)

The price of labor is calculated as the weighted average wage of different professional categories employed in the nursing home (doctors, nurses, administrative and technical staff). The price of capital is calculated as the sum of mortgage costs, amortization, and costs related to capital purchases divided by the capital stock, which is approximated by the number of beds. The price for material and meals is computed by taking the remaining costs and dividing them by the number of meals provided each year. This item mainly includes costs for food, energy, and administrative expenses.
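The three input prices can be sketched as simple ratios. The example below is only an illustration under stated assumptions: full-time equivalents are used as weights for the average wage, "remaining costs" are taken to be total costs minus labor and capital costs, and all figures are hypothetical rather than taken from the data set.

```python
def labor_price(categories):
    """Weighted average wage; categories is a list of (fte, annual_wage) pairs.
    Using full-time equivalents as weights is an assumption of this sketch."""
    total_fte = sum(fte for fte, _ in categories)
    return sum(fte * wage for fte, wage in categories) / total_fte

def capital_price(mortgage_costs, amortization, capital_purchases, beds):
    """Capital costs per bed, with beds approximating the capital stock."""
    return (mortgage_costs + amortization + capital_purchases) / beds

def material_price(total_costs, labor_costs, capital_costs, meals):
    """Remaining costs per meal provided in the year."""
    return (total_costs - labor_costs - capital_costs) / meals

# Hypothetical nursing home.
staff = [(50, 80_000), (10, 70_000)]           # (FTE, annual wage in SFr.)
p_l = labor_price(staff)
p_k = capital_price(100_000, 200_000, 50_000, beds=70)
p_m = material_price(5_650_000, 4_700_000, 350_000, meals=50_000)
```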

The main difference between nonprofit nursing homes is in their institutional form. Public-law nursing homes are public administrative units without a separate judicial status from the local public administration. Conversely, private-law nursing homes usually take the form of a foundation. We include a dummy for the institutional form (IF) equal to 1 when the nursing home is a public-law organization, and 0 otherwise.

SR is the nursing staff ratio, i.e., the ratio between the number of nurses employed in a nursing home and the number of nurses that should be employed according to the guidelines of the regulator (prescribed amount of staff). Because nursing care is a labor-intensive service, staffing levels have been recognized as a good indicator for (structure) quality [24]. Note, however, that our indicator is conceptually different from other quality indicators related to staffing levels, since it captures deviations from the prescribed number of nurses.

The vector of process and outcome quality indicators (\(\mathbf {q}\)) leads to three different model specifications. In Model 1, the quality vector includes four composite indicators, two for process quality (\(Q_{\mathrm{process}}^{\mathrm{pc1}}\) and \(Q_{\mathrm{process}}^{\mathrm{pc2}}\)) and two for outcome quality (\(Q_{\mathrm{outcome}}^{\mathrm{pc1}}\) and \(Q_{\mathrm{outcome}}^{\mathrm{pc2}}\)), derived from PCA. As explained above, these are the first two principal components of the observed quality scores, i.e., those with the highest eigenvalues. In Model 2, the quality vector is represented by two composite indicators (\(Q_{\mathrm{process}}\) and \(Q_{\mathrm{outcome}}\)) derived using weights according to the number of residents exposed to different quality events. Finally, Model 3 includes a vector of four single quality measures: two process quality indicators, namely the prevalence of antipsychotic use for low-risk residents (\(Q_{\mathrm{antips}}\)) and the daily use of physical restraints (\(Q_{\mathrm{restr}}\)), and two outcome quality indicators, namely weight loss (\(Q_{\mathrm{weight}}\)) and severe pain (\(Q_{\mathrm{pain}}\)).
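One plausible reading of the weighting used for the composite indicators of Model 2 is an exposure-weighted average of the single prevalences within a nursing home. The sketch below implements that reading; the exact formula is an assumption made here for illustration, and the function name and figures are hypothetical.

```python
def weighted_composite(indicators):
    """Exposure-weighted average of prevalences.

    indicators: list of (prevalence, exposed_residents) pairs for one
    nursing home and one quality dimension (process or outcome).
    """
    total_exposed = sum(n for _, n in indicators)
    if total_exposed == 0:
        raise ValueError("no exposed residents")
    return sum(p * n for p, n in indicators) / total_exposed

# Hypothetical process indicators for one facility:
# antipsychotic use among 50 eligible residents, restraints among 80.
process = [(0.32, 50), (0.20, 80)]
q_process = weighted_composite(process)
```

With this weighting, an indicator computed on few eligible residents contributes less to the composite than one computed on many.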

\(Q_\mathrm{antips}\) is risk-adjusted based on the stratification approach, whereas \(Q_\mathrm{restr}\) is a sentinel indicator, and as such, no risk adjustment is required [12]. Due to lack of data at the resident level, we further control for case-mix differences using an index at the facility level (MIX). This is a cardinal index that measures the average patient’s need in terms of daily hours of personal and medical care, and is calculated on a yearly basis by the regulator. Patients are classified into one of five categories according to their severity level. A value between 0 and 4 is assigned, where higher values indicate more severe cases.Footnote 5 We expect this case-mix indicator to be correlated with patients’ risk factors that are not observable. Moreover, any unobserved time-invariant facility-specific risk factors are captured by the individual effects. We acknowledge that the risk-adjustment system used in this analysis may be less precise than adjustments based on clinical information at the individual level. However, as previously discussed, even complex systems of risk adjustment present serious shortcomings.

For the estimation of the cost model in Eq. (1), we use a log-log functional form. When choosing the functional form, parsimony in the number of coefficients to be estimated is traded off against flexibility. A translog functional form would require interacting all quality indicators with the production factors, leading to an important loss of degrees of freedom.Footnote 6 , Footnote 7

Input prices and total costs are divided by the material price in order to satisfy the homogeneity condition in input prices.Footnote 8 The log-log form of Eq. (1) is:

$$\begin{aligned} \ln \left( \frac{C}{P_{\rm m}}\right) & = {} \delta _{0}+\delta _{Y}\ln Y+\delta _{P_{\rm l}}\ln \frac{P_{\rm l}}{P_{\rm m}}+\delta _{P_{\rm k}}\ln \frac{P_{\rm k}}{P_{\rm m}} +\delta _\mathrm{IF} \mathrm{IF} \\&+\delta _\mathrm{MIX}\ln \mathrm{MIX} +\delta _\mathrm{SR} \mathrm{SR} +\mathbf {\delta }_{q}\mathbf {q}+\delta _{t}\tau +\varepsilon \text {,} \nonumber \end{aligned}$$
(2)

where \(\mathbf {\delta }_{q}\) is the vector of quality parameters and \(\varepsilon\) is the error term, which contains the individual effects \(\delta _{i}\). The individual subscript i and the time subscript t are omitted for simplicity.
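To see why dividing by the material price imposes linear homogeneity in input prices, Eq. (2) can be rearranged in terms of \(\ln C\). The expression below is only an algebraic restatement of Eq. (2), obtained by expanding the logarithms of the price ratios; it adds no new assumption.

$$\begin{aligned} \ln C &= \delta _{0}+\delta _{Y}\ln Y+\delta _{P_{\rm l}}\ln P_{\rm l}+\delta _{P_{\rm k}}\ln P_{\rm k}+\left( 1-\delta _{P_{\rm l}}-\delta _{P_{\rm k}}\right) \ln P_{\rm m} \\ &\quad +\delta _\mathrm{IF} \mathrm{IF}+\delta _\mathrm{MIX}\ln \mathrm{MIX}+\delta _\mathrm{SR} \mathrm{SR}+\mathbf {\delta }_{q}\mathbf {q}+\delta _{t}\tau +\varepsilon \text {.} \end{aligned}$$

The coefficients on the three log prices sum to one, so multiplying all input prices by a constant \(\lambda > 0\) raises \(\ln C\) by exactly \(\ln \lambda\). The implied elasticity of costs with respect to the material price is \(1-\delta _{P_{\rm l}}-\delta _{P_{\rm k}}\) and does not need to be estimated separately.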

The estimation of the cost function in Eq. (2) is based on the assumption that output, input prices, and quality are exogenous variables. In the case of nursing homes included in the sample, output is likely to be exogenous because nursing homes have to accept all residents in a given residential area, and residents do not have free choice of the facility. Also, excess demand due to subsidized prices leads to occupancy rates of about \(100\, \%\). For the same reasons, the case-mix is also likely to be exogenous. Moreover, the reimbursement system is linked to the nursing home-specific case-mix, which limits incentives to attract less costly patients. Input prices can be considered exogenous because nursing homes have to follow the guidelines imposed by the regulator.

With respect to quality, it is important to distinguish between the nursing staff ratio and clinical quality indicators derived from the RAI. The nursing staff ratio is strongly regulated by the canton, and nursing homes are not allowed to deviate significantly from the optimal staff size. Therefore, we can exclude the presence of endogeneity.Footnote 9 The potential endogeneity issue of unregulated clinical indicators will be discussed later, in "Econometric estimation and results".

Data and descriptive statistics

We merge two data sets on costs and quality of nursing home residents in southern Switzerland (Canton Ticino), which were provided by the regulator. The first data set includes yearly use of resources at the organization level extracted from the annual reports of nursing homes. It includes 45 nursing homes over a 10-year period, from 2001 to 2010. The second data set contains information derived from the MDS on 22 clinical quality indicators at the organization level for the period 2006–2010, excluding the year 2008.Footnote 10 Due to missing values in the data set, no quality scores are available for three nursing homes for the years 2006 and 2007. The total number of observations is 173 for the models with composite quality indicators. For the model with single quality indicators, we exclude observations for which the denominator of the quality scores is less than 20. This leads to a loss of ten observations.

In Tables 3 and 4, we provide descriptive statistics for the main costs and quality variables. The data show that on average, a resident day costs 247 Swiss francs (SFr.). The difference between the minimum and the maximum cost is almost SFr. 200. This may be due to differences in the output, as the number of resident days ranges between almost 9000 and more than 64,000. The average resident case-mix is 3.1, with important differences among nursing homes (0.80–3.83). The average price of labor is approximately SFr. 81,000, and nursing homes are highly homogeneous in this respect. The prices of capital and material show higher variation, from SFr. 1054 to almost SFr. 23,000 and from SFr. 5.16 to around SFr. 103, respectively. These differences are due to renovation or enlargement investments. At the approximation point, the shares of capital, material, and labor costs are 6.5, 12.1, and \(81.4\, \%\), respectively.

Regarding quality indicators, the data show that the nursing staff ratio is very close to 1, as expected. Variations larger than \(10\, \%\) are possible only for very short periods. On average, \(32\, \%\) of low-risk patients use antipsychotics, but in some nursing homes this value reaches \(88\, \%\), suggesting that serious problems may exist within the production process of nursing home care. The average prevalence of daily use of physical restraints is around \(20\, \%\) and ranges between 0 and \(50\, \%\). Regarding outcome quality, the average prevalence of residents who lost weight unexpectedly is about \(7\, \%\), and this percentage ranges between 0 and \(27\, \%\). Finally, the prevalence of residents suffering from severe pain is \(21\, \%\) on average, but reaches more than \(60\, \%\) in some cases.

An interesting question is whether quality domains are correlated. This may affect the selection process of appropriate composite quality scores as well as the choice of single quality indicators to be included in the econometric analysis. We compute both the correlation among indicators (including the staff ratio) and Kendall’s rank correlation coefficient [56]. The latter measures the similarity of the ordering of nursing homes when these are ranked according to quality scores. Both measures indicate a low correlation between quality indicators (\(<\)25 %). This could potentially undermine the use of composite quality scores derived from PCA (Model 1). However, it also strengthens the case for our strategy of measuring quality using three different approaches. Indeed, the use of other composite quality scores (Model 2) not derived from PCA and the use of single quality indicators (Model 3) may offer an answer to this criticism. Meanwhile, the use of composite quality indicators and of a small number of single quality indicators ensures that collinearity between quality scores is not an issue.
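Kendall’s rank correlation can be sketched in a few lines of plain Python. The version below is the tau-a variant, in which tied pairs count as neither concordant nor discordant; the choice of variant and the example scores are assumptions made here for illustration, as the text does not specify how ties are handled.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two observations are required")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1      # both indicators order the pair the same way
        elif sign < 0:
            discordant += 1      # the indicators order the pair in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical scores of four nursing homes on two quality indicators.
restraints = [1, 2, 3, 4]
pain = [1, 3, 2, 4]
tau = kendall_tau_a(restraints, pain)
```

A value close to zero means that ranking facilities by one indicator says little about their ranking on the other, which is the pattern reported in the text.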

Econometric estimation and results

Panel data models

When analyzing the impact of process and outcome quality on costs, two main issues that are likely to bias the results may arise: omitted variables and simultaneity of quality and costs. We exclude the ordinary least squares (OLS) estimator due to the presence of unobserved heterogeneity (shown by both F test and Breusch–Pagan test) and use panel data models with individual effects.

Table 5 Estimation results of fixed-effects (FE) and random-effects (RE) cost models with composite and single quality indicators

The results of the estimation of the three models with combined and single quality indicators described in "Detailing the cost function" are reported in Table 5. In all models, standard errors are corrected using the cluster-robust estimator based on Stock and Watson [57] and Kezdi [58].Footnote 11 Both the FE and the RE models have potential advantages and disadvantages, and the model choice involves a tradeoff between bias and variance [60]. Both approaches address the omitted variables issue. The RE model treats the individual effects as stochastic parameters, therefore assuming that they are independent of the other covariates. When this assumption does not hold, the RE estimates are biased. Instead, the fixed effects model treats the individual effects as fixed parameters and allows the individual effects to be partially correlated with regressors, thus accommodating a limited form of endogeneity deriving from constant omitted variables [61]. This feature is particularly appealing in studies of costs and quality due to unmeasurable dimensions that are likely to affect the relationship. The Hausman test casts doubt on the RE estimates, since it rejects at the \(5\, \%\) level the hypothesis that the individual-specific error terms are uncorrelated with the explanatory variables, i.e., the RE estimator may be inconsistent (see Cameron and Trivedi [62] for details). Compared with the RE model, the FE model could suffer from a lack of robustness in the case of small sample size or small within variation. However, our estimates show that the coefficients of interest (quality indicators) are very stable independently of the model specification. As expected, the variance is smaller in the RE estimates. Given that the percentage of within variation of the variables of interest with respect to the overall variation is satisfactory, the fixed effects estimates should be fairly precise [62].
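To illustrate why the fixed effects approach handles constant omitted variables, the following sketch compares a pooled OLS slope with the within (FE) slope on a tiny synthetic panel. The numbers are invented for illustration, the model has a single regressor and no noise, and the unit effects are deliberately correlated with the regressor. It is not the specification estimated in Table 5.

```python
def pooled_slope(panel):
    """OLS slope of y on x, ignoring the panel structure."""
    obs = [pair for unit in panel.values() for pair in unit]
    x_bar = sum(x for x, _ in obs) / len(obs)
    y_bar = sum(y for _, y in obs) / len(obs)
    num = sum((x - x_bar) * (y - y_bar) for x, y in obs)
    den = sum((x - x_bar) ** 2 for x, _ in obs)
    return num / den


def within_slope(panel):
    """Fixed-effects slope: demean x and y within each unit, then regress."""
    num = den = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


# Synthetic panel: y_it = a_i + 0.5 * x_it, with a_A = 10 and a_B = 2.
# The unit with low x has the high effect, so pooled OLS is badly biased.
panel = {
    "A": [(1, 10.5), (2, 11.0), (3, 11.5)],
    "B": [(4, 4.0), (5, 4.5), (6, 5.0)],
}

b_pooled = pooled_slope(panel)
b_within = within_slope(panel)
print(f"pooled OLS slope: {b_pooled:.3f}, within (FE) slope: {b_within:.3f}")
```

In this constructed case the within transformation recovers the true slope of 0.5, whereas the pooled slope even has the wrong sign. The price is that only within-unit variation is used, which is why the text checks that the within variation of the quality indicators is satisfactory.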

The small sample size may explain the difference in the magnitude of some coefficients (Y, MIX, and SR) that increase slightly in the RE estimates. However, the sign and statistical significance of all coefficients are basically unchanged, suggesting that FE estimates are unbiased. The only exceptions are a measure of process quality (\(Q_\mathrm{restr}\)), which becomes significant at the \(10\, \%\) level in the RE specification of Model 2, and a measure of outcome quality (\(Q_\mathrm{outcome}^\mathrm{pc2}\)), which becomes more significant in the FE specification of Model 1. Generally, the similarity of the random effects and the fixed effects estimates suggests a low correlation between the individual effects and our covariates.

Note that the estimated parameters are very similar across the three models. Consider first the main variables of interest: the quality indicators. The nursing staff ratio (SR) is highly significant. As expected, the higher the relative number of nurses working in a nursing home, the higher the costs. The estimated coefficient is stable across the three models. In Model 1 and Model 2, we consider composite quality indicators. Note that outcome quality (\(Q_\mathrm{outcome}^\mathrm{pc1}\), \(Q_\mathrm{outcome}^\mathrm{pc2}\), \(Q_\mathrm{outcome}\)) exhibits a negative and significant effect on costs (positive coefficient sign) in both models, although the magnitude of the effect is stronger when composite quality indicators are derived using weights according to the number of residents exposed to different quality aspects (Model 2). Conversely, process quality shows the opposite effect on costs (negative coefficient sign), although the impact is not significant. These results are in accordance with those obtained with single quality measures (Model 3). We observe a negative and significant association between costs and outcome quality measured by the prevalence of weight loss (\(Q_\mathrm{weight}\)) and the prevalence of severe pain (\(Q_\mathrm{pain}\)).Footnote 12 This means that worsening outcome measures lead to increased costs, while better control of patients’ outcomes reduces nursing home costs. Instead, process quality measured by the daily use of physical restraints (\(Q_\mathrm{restr}\)) and the prevalence of antipsychotic use for low-risk residents (\(Q_\mathrm{antips}\)) does not seem to have a significant impact on costs.

Note also that the other coefficients are very similar across the three models. The coefficient of output (Y) measures the elasticity of total costs with respect to output. A value lower than 1 suggests the presence of unexploited economies of scale. In our case, an increase of \(10\, \%\) in the number of patient-days increases total costs by roughly 7–\(8\, \%\). As expected, more severe patients (MIX) are more costly to treat. The coefficient can also be interpreted as a cost elasticity. An increase in the level of patients’ severity by \(10\, \%\) significantly increases costs by around \(2\, \%\). The above findings on the effect of outcome quality on costs may be questioned if less costly NHs select patients in better health status, resulting in better outcome indicators for quality. This presupposes that the case-mix variable (MIX) does not fully capture the information on patients’ health status. However, as explained in the “Introduction” section, patient selection is a negligible factor in our setting, since individuals are assigned to the NH of the former place of residence. Still, as a robustness check, we grouped observations into five categories of case-mix (quintiles) and did not find any evidence of a systematic increase or decrease in quality when moving from less severe to more severe residents. Finally, we also instrumented the case-mix with the spatial lag and the proportion of elderly population in the community area. The instrumental variables (IV) approach could not reject the hypothesis of exogenous case-mix in any of our models (estimation results and endogeneity tests are available upon request).
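The elasticity reading of the output and case-mix coefficients can be checked with a short calculation. The sketch below assumes a constant-elasticity (log-log) relation around the approximation point and uses hypothetical coefficient values of 0.75 for output and 0.20 for case-mix. These were chosen only to be consistent with the ranges quoted in the text and are not the estimates reported in Table 5.

```python
def cost_change(elasticity, growth):
    """Exact proportional cost change implied by a constant elasticity."""
    return (1.0 + growth) ** elasticity - 1.0


# Hypothetical elasticities (assumptions for illustration, not Table 5 values).
beta_output = 0.75
beta_mix = 0.20
growth = 0.10  # a 10 % increase in the regressor

exact_output = cost_change(beta_output, growth)
approx_output = beta_output * growth  # first-order (log) approximation
exact_mix = cost_change(beta_mix, growth)

print(f"output +10 %: costs {exact_output:+.2%} (approx. {approx_output:+.2%})")
print(f"case-mix +10 %: costs {exact_mix:+.2%}")
```

Because the assumed output elasticity is below 1, costs grow less than proportionally with patient-days, which is the sense in which the text speaks of unexploited economies of scale.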

The cost function is monotonically increasing in the vector of input prices, since input price coefficients (\(P_{\rm l}\) and \(P_{\rm k}\)) are positive and significant. Also, these coefficients provide information on the percentage of labor and capital costs over total costs of a representative NH. The share of labor costs (\(P_{\rm l}\)) is estimated between 91 and \(92\, \%\), while the estimated share of capital (\(P_{\rm k}\)) is between 6 and \(8\, \%\). The institutional form (IF) is dropped in the fixed effects regressions because it is time-invariant, and it is not significant in the random effects regressions.Footnote 13 The time trend (t) is statistically significant in Models 2 and 3, but the coefficient is very small. Total costs of nursing home care remained relatively constant over the time period considered in the analysis.

The issue of simultaneity may arise from the fact that costs and quality are codetermined. However, the results discussed so far are not expected to be significantly affected by simultaneity bias for two main reasons. First, our panel is relatively short. Second, as explained above, the nursing home sector under analysis is highly regulated. Nevertheless, to increase the robustness of our findings, we discuss the issue of endogenous quality indicators in detail in the next section.

Instrumental variable models

We believe that simultaneity between costs and quality is unlikely. Even if quality were endogenous, the resulting estimation bias would be very limited because of the institutional setting of the nursing home sector and its strong regulation: nursing home activities are regulated by the local government in a relatively effective way. Nonetheless, in order to test for potential endogeneity, we consider IV approaches using the efficient generalized method of moments (GMM) combined with the fixed effects model. The GMM approach has the advantage of consistency in the case of arbitrary heteroskedasticity and is more flexible than two-stage least squares (2SLS): it allows error clustering for panel data and provides a battery of tests to check the validity of the instruments.

A valid instrument must satisfy two requirements: the instrument z must be correlated with the endogenous variable x, \(Cov(z,x)\ne 0\), and uncorrelated with the error term u, \(Cov(z,u)=0\). In the case of multiple endogenous regressors, the Shea partial \(R^{2}\) [63] measure should be used to test the first condition, as this takes into account the intercorrelation among the instruments.Footnote 14 However, this does not exclude the possibility of weak instruments. The second condition can be tested only when the model is overidentified, i.e., when there are more instruments than endogenous variables. In this case, the C-statistic, also called “difference-in-Sargan” statistic, can be used [64].
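The two requirements can be made concrete with a minimal sketch of the just-identified IV estimator, \(\hat{\beta}_{IV} = Cov(z,y)/Cov(z,x)\), on constructed data. This is a simplification for illustration, not the efficient GMM estimator with fixed effects used in the paper. The data are synthetic and built so that \(Cov(z,u)=0\) holds exactly in the sample while x is correlated with u.

```python
def cov(a, b):
    """Sample covariance (population form; the scaling cancels in ratios)."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)


# Synthetic data (illustration only). True slope is 0.5.
z = [-1.0, 1.0, -1.0, 1.0]                      # instrument
u = [1.0, 1.0, -1.0, -1.0]                      # error term, orthogonal to z
x = [zi + ui for zi, ui in zip(z, u)]           # endogenous regressor
y = [0.5 * xi + ui for xi, ui in zip(x, u)]     # outcome

relevance = cov(z, x)    # must be non-zero for a relevant instrument
exogeneity = cov(z, u)   # must be zero; unobservable with real data

beta_ols = cov(x, y) / cov(x, x)   # biased because Cov(x, u) != 0
beta_iv = cov(z, y) / cov(z, x)    # uses only variation in x driven by z

print(f"Cov(z, x) = {relevance:.2f}, Cov(z, u) = {exogeneity:.2f}")
print(f"OLS slope = {beta_ols:.2f}, IV slope = {beta_iv:.2f}")
```

With real data the error term is not observed, so the exogeneity condition cannot be computed directly as it is here. This is why overidentification tests are needed to assess it.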

As shown in previous studies [3, 47], good instruments for quality are lacking. Moreover, finding good instruments for several quality indicators is even more challenging. We rely on three hypotheses. First, visits by residents’ relatives exert pressure on the management staff of the nursing home to keep adequate levels of quality. Hence, we identify two variables: the weighted average distance (travel time) between residents’ location and the nursing home facility, and the weighted population density of the area served by the nursing home. The relative distance variable has previously been used by other authors in the nursing home literature and is considered a valid exclusion restriction [40, 65, 66]. The second hypothesis assumes that the quality offered by the nursing home depends on the average quality offered by surrounding nursing homes. We build a variable to capture pressure from other nursing homes located in geographical proximity. For each year and nursing home, pressure is measured as the average score of quality indicators of nursing homes located in neighbouring districts.Footnote 15 Our third hypothesis is that the elderly population living in the area around the nursing home exerts indirect pressure on the quality of the nursing home care provided. We then consider the percentages of young, adult, and elderly people in the catchment area of each nursing home. Finally, we also consider lagged values of quality indicators as natural instruments.Footnote 16
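The neighbouring-district pressure variable described above could be constructed along the following lines. This is a sketch under stated assumptions: the data layout, the names `scores`, `district_of`, and `neighbours`, and the unweighted mean are all hypothetical choices made for illustration, and the exact construction used in the study may differ.

```python
def neighbour_pressure(scores, district_of, neighbours):
    """Mean quality score of homes in adjacent districts, per home and year.

    scores:      {(home, year): quality score}
    district_of: {home: district}
    neighbours:  {district: set of adjacent districts}
    Homes with no observed peers in adjacent districts are omitted.
    """
    pressure = {}
    for home, year in scores:
        adjacent = neighbours.get(district_of[home], set())
        peers = [
            score
            for (other, other_year), score in scores.items()
            if other_year == year
            and other != home
            and district_of[other] in adjacent
        ]
        if peers:
            pressure[(home, year)] = sum(peers) / len(peers)
    return pressure


# Hypothetical example: three homes in three districts along a line.
district_of = {"A": "d1", "B": "d2", "C": "d3"}
neighbours = {"d1": {"d2"}, "d2": {"d1", "d3"}, "d3": {"d2"}}
scores = {("A", 2008): 0.2, ("B", 2008): 0.5, ("C", 2008): 0.6}

pressure = neighbour_pressure(scores, district_of, neighbours)
for key in sorted(pressure):
    print(key, round(pressure[key], 3))
```

Excluding a home's own score from its pressure variable is the design choice that matters here: otherwise the instrument would mechanically contain the endogenous quality indicator it is meant to instrument.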

Table 6 Estimation results of second-stage IV-GMM cost models with fixed effects

The results of IV-GMM estimations are reported in Table 6. The table shows the three IV-GMM models with fixed effects. The results of FE estimations without IV are partially confirmed. Findings are mixed. Process quality indicators become significant at \(10\, \%\) in a couple of cases, whereas outcome quality indicators lose significance in some cases. The Hausman test suggests no evidence of endogeneity of quality indicators. The Hansen J test indicates that the over-identifying restrictions are valid. F tests of excluded instruments in the first stage are passed for most regressors. However, the Shea partial \(R^{2}\) statistics show that the percentage of variability in quality indicators explained by the instruments is relatively low. Because of the small sample and several potentially endogenous regressors, F statistics are not high and the instruments do not appear to be strong enough to safely conclude that quality endogeneity can be excluded.Footnote 17 As stated above, addressing endogeneity using multiple quality indicators and many instruments may not be very efficient. Consequently, we also tested exactly identified models with only one quality indicator and one instrument. In these cases, the null hypothesis that the excluded instruments are exogenous cannot be rejected, and the results appear more robust to weak identification.

Conclusions

In the nursing home sector, poor quality represents a main concern, and ongoing discussions are taking place to address this issue. How to increase quality in a context of financial pressure remains an open question. In this paper, we contributed to this debate by investigating the relationship between costs and quality in accordance with the SPO framework developed by Donabedian. We used recently published data on quality indicators derived from the resident assessment instrument and costs of Swiss nursing homes. In addition to structure quality indicators (e.g., nursing staff ratio), we considered single and composite clinical measures of process and outcome quality.

As compared to previous studies, we improved the estimation approach by using panel data models, in particular the fixed effects model to address endogeneity arising from omitted variables. In addition, we instrumented the quality indicators and tackled bias coming from potential simultaneity between costs and quality. While we did not find evidence of simultaneity bias and were able to control for constant omitted variables bias, we could not exclude bias from omitted, time-varying variables.

Our analysis showed evidence of a negative and significant relationship between clinical indicators of outcome quality (e.g., the prevalence of severe pain and the prevalence of weight loss) and total costs. Conversely, we did not find an impact of process quality on costs. Prevalence of daily physical restraint use, as well as the use of antipsychotics, were not found to be statistically significant. Interestingly, process measures typically interpreted as labor-saving and cost-saving factors did not seem to affect costs, while outcome measures did. Finally, structure quality indicators such as staffing levels were strongly associated with higher costs, as shown in previous studies.

A possible explanation for the negative effect of outcome quality on costs is that the use of cost-saving instruments, such as drugs and physical restraints, may initially reduce costs. However, this is only a temporary effect, since worsening patient outcomes lead to increased treatment costs that more than offset the initial savings. This explanation may be questioned, since our results are based on a relatively short panel. Note, however, that the large majority of Swiss nursing home residents (\(>60\, \%\)) are more than 85 years old and do not spend many years in a nursing home. Therefore, the effects of poor outcome quality on costs are expected to manifest in a relatively short period. Untreated patients may develop more severe dysfunctions and, consequently, require additional resources in subsequent treatments.

An alternative interpretation relies on the idea of patient selection, i.e., nursing homes that are less costly also select patients in better health states, resulting in better outcome indicators for quality. Although we cannot completely exclude this hypothesis, our regulatory setting and the analysis provide evidence against this interpretation. First, the tight Swiss regulation on individual access to nursing homes makes resident selection highly unlikely. Individuals are assigned to the NH of the former place of residence. Second, we did not find evidence of a systematic trend in quality when moving from less severe to more severe residents or of endogenous patient severity in IV regressions.

From a policy point of view, the assessment of the relationship between costs and quality may be valuable in informing payment systems for long-term care. Our results may lead to paradoxical conclusions on the properties of payment systems. Generally, funding schemes for long-term care do not compensate nursing homes for outcome quality. This is the case in the Canton Ticino, Switzerland. At the beginning of the period of analysis (2006), the Canton Ticino introduced a new payment system based on prospective payments (global budget) and started a system of quality measurement. The cantonal authority does not rule out integrating quality aspects in future revisions of the payment system. Given our results, the current payment system may provide adequate incentives for cost containment if managers are aware of the negative relationship between outcome quality and costs. Since payments are independent of actual costs, managers may have the incentive to better manage and prevent adverse clinical outcomes such as pain and weight loss to avoid increasing costs. Conversely, under a cost reimbursement system, managers may not have the incentive to prevent adverse outcomes, since additional costs to treat residents when adverse events occur would be covered. However, if managers are imperfectly informed about the relationship between costs and quality, the cantonal authority could consider two options: to improve the information available to NH managers on the effects of quality on costs, or to incorporate the effects of quality on costs into the payment mechanism, e.g., by rewarding quality improvements or providing negative financial incentives for poor outcome quality. The latter instrument suggests that incorporating quality aspects into retrospective payment schemes would lead to quality improvements.

To conclude, we do not advocate using a measure of the impact of quality on costs in a mechanical way to introduce financial incentives in payment schemes. Rather, policymakers could use this as an additional instrument to provide a guide to the relative levels of efficiency. However, it should be noted that our results could be sensitive to the assumptions adopted regarding the econometric approach, the model specification, and data limitations. Also, Swiss payment systems for long-term care are quite heterogeneous across cantons. The investigation of the relationship between costs and quality in long-term care in other regulatory settings was beyond the scope of the present study. Therefore, the contribution to the discussion on optimal design of payment schemes in nursing home care is likely to improve in future research.