Introduction

Provider payment systems for mental health care that incentivize both cost control and quality improvement have been a policy focus in a number of countries. In the Netherlands, psychiatric care is included in the prospective activity-based payment system for inpatient and outpatient care [1]. Cost control is incentivized by nationally agreed unit prices and the system also incentivizes quality improvements that reduce resource consumption [2]. In the US, psychiatric inpatient care provided under Medicare is reimbursed using a prospective per diem payment system that links payment to average cost in order to promote efficiency. The system employs variable per diem payments with higher payments at the beginning of the inpatient stay to reflect higher cost. The payment system is also designed to prevent adverse effects on quality of care by reimbursing readmissions within a short period of time at the per diem rate that was applied at the time of discharge [3].

In England, mental health services have historically been funded through block contracts that do not necessarily incentivize providers to control cost [4] nor has payment been aligned to patient outcomes [5]. The prospective activity-based system used in the acute physical health care sector—the National Tariff Payment System (NTPS)—has recently been introduced to mental health. A new classification system has been developed for mental health with a primary focus on patient need and severity, an important predictor of mental health resource use [6]. The currencies or units of activity for which payment will be made are 21 care clusters that are independent of care setting. Users of mental health care services are allocated to a care cluster by clinicians using the Mental Health Clustering Tool (MHCT). The MHCT incorporates items from the Health of the Nation Outcome Scales (HoNOS) [7] and the Summary of Assessments of Risk and Need (SARN) [8]. HoNOS is comprised of 12 items, each scored from 0 (no problem) to 4 (severe problem) giving a total score ranging from 0 (best) to 48 (worst). Ratings are made by an individual clinician (psychiatrist, nurse, psychologist, or social worker) or are based on a team rating. The rating is made on the basis of all information available to the clinician and is based on the most severe problem that arose during the 2 weeks leading up to the point of rating.

After a patient is allocated to a care cluster, they should be reviewed on a regular basis to ensure that the care cluster continues to meet their needs. The time between care cluster assessments or cluster review periods (CRPs) forms the basis of contracts and prices agreed between providers and commissioners (responsible for the organization and purchase of mental health care for their populations) [9, 10]. The aspiration is that each care cluster will have a fixed national price or tariff [10]. This will provide an incentive to control costs as providers with costs above (below) the tariff will incur financial losses (surpluses). Quality of care and patient outcomes will also be incentivized under the new payment system with the inclusion of quality and outcome measures in provider contracts and the intention to link these to prices [11]. While the care cluster currencies cover most services for working-age adults and older people, some services such as children and adolescent, drug and alcohol, and specialist mental health services are not included and will be reimbursed under separate non-cluster currencies [12].

A small number of studies [13,14,15,16] have examined the relationship between costs and quality in mental health care, with no clear consensus on whether a trade-off exists between the two. Previous studies investigating this relationship in both physical and mental health care have shown it to be a challenging endeavor. Particular challenges relate to the availability of adequate measures of quality, small sample sizes, and the endogenous relationship between costs and quality. Regarding the latter, a number of studies [17,18,19] have used instrumental variables in order to consistently estimate the causal relationship. Nevertheless, other studies [20, 21] have highlighted the inherent difficulty of addressing endogeneity, including the limited availability of suitable instrumental variables. Given this difficulty, we avoid making causal inferences. Instead, we analyze costs and outcomes using two separate equations and allow for a correlation in responses. A similar methodology has been used in studies that have assessed performance in physical health care [22, 23]. As in previous studies [14, 21], we measure quality in terms of an outcome measure, the Health of the Nation Outcome Scales (HoNOS). Following earlier studies [18, 21,22,23], we use multi-level modeling, which allows us to examine the correlation in residual responses at the provider level to provide insight into the relationship between costs and outcomes and whether a potential trade-off exists. The use of a large, nationally representative dataset with individual-level data moves us beyond prior studies in mental health care that were constrained by aggregate data [15, 16] or small patient sample sizes [13, 14].

The aim of this paper is to investigate the relationship between costs and outcomes for mental health providers in England, to ascertain whether the cost control incentivized by the NTPS can be achieved without negatively affecting patient outcomes. We do not attempt to estimate the causal relationship between costs and outcomes; rather, we estimate a multi-level bivariate model with costs and outcomes as responses. We calculate the correlation between the residual variation in costs and outcomes for providers, before and after controlling for a range of risk-adjustment covariates encompassing socio-demographic, need, and treatment variables. Our method also allows us to group providers according to their performance on residual costs and outcomes.

We contribute to existing evidence in several ways. This paper is the first to use a multi-level bivariate model to examine mental health cost and outcome responses simultaneously and calculate the correlation in residual variation between two responses. We isolate the residual variation in costs and outcomes attributable to mental health providers in order to assess provider performance. Moreover, while previous studies in mental health have used data with limited geographical or provider samples, we use a nationally representative dataset that contains data for all specialist mental health providers in England. This means that we can examine costs and quality for both admitted and non-admitted care and are not constrained to just one care setting as in previous studies. Finally, our findings provide a better understanding of the incentives introduced by the new payment system and whether a trade-off between cost containment and outcome improving efforts exists.

Methods

The CRP forms the unit of observation in this analysis. As a patient can have more than one CRP, we utilize a multi-level model to reflect CRPs nested within patients who are in turn nested within providers.

We estimate the following bivariate model with two response variables for CRP \(i\) in patient \(j\) in provider \(k\): cost \(y_{1ijk}\) and outcome \(y_{2ijk}\):

$$\left\{ \begin{array}{l} y_{1ijk} = \alpha_{1} + \beta_{1} X_{1ijk} + u_{1k} + v_{1jk} + \varepsilon_{1ijk} \\[6pt] y_{2ijk} = \alpha_{2} + \beta_{2} X_{2ijk} + u_{2k} + v_{2jk} + \varepsilon_{2ijk} \end{array} \right. \tag{1}$$

$$\begin{pmatrix} u_{1k} \\ u_{2k} \end{pmatrix} \sim N(0,\,\Omega_{u}), \qquad \Omega_{u} = \begin{pmatrix} \sigma_{u1}^{2} & \sigma_{u1u2} \\ \sigma_{u1u2} & \sigma_{u2}^{2} \end{pmatrix} \tag{2}$$

\(X_{1ijk}\) and \(X_{2ijk}\) represent vectors of risk-adjustment covariates for the cost and outcome equations, respectively. Provider-level residual variation in costs and outcomes is captured by the random effects \(u_{1k}\) and \(u_{2k}\), respectively. The patient-level residuals for each response are denoted by \(v_{1jk}\) and \(v_{2jk}\), while \(\varepsilon_{1ijk}\) and \(\varepsilon_{2ijk}\) signify the error terms at the CRP level for each response. The provider-level residuals, \(u_{1k}\) and \(u_{2k}\), are assumed to follow a bivariate normal distribution with zero mean and covariance matrix \(\Omega_{u}\). Our interest lies in the correlation between the residual variation in \(y_{1ijk}\) and \(y_{2ijk}\) at the provider level, which can be calculated as \(r = \frac{\sigma_{u1u2}}{\sqrt{\sigma_{u1}^{2}\sigma_{u2}^{2}}}\). We calculate this correlation for responses with and without risk-adjustment to gain an insight into the extent to which our risk-adjustment variables account for the correlation between costs and outcomes.
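As an illustration of the correlation calculation described above, the following minimal Python sketch computes the provider-level correlation from the three elements of the provider-level covariance matrix. The function name and the numerical values are hypothetical and are not estimates from our model; the estimation itself was carried out in MLwiN, not in Python.

```python
import math


def provider_correlation(var_u1: float, var_u2: float, cov_u1u2: float) -> float:
    """Correlation between provider-level residuals for cost and outcome.

    var_u1   : provider-level variance of the cost residuals
    var_u2   : provider-level variance of the outcome residuals
    cov_u1u2 : provider-level covariance between the two residuals
    """
    if var_u1 <= 0 or var_u2 <= 0:
        raise ValueError("provider-level variances must be positive")
    return cov_u1u2 / math.sqrt(var_u1 * var_u2)


# Hypothetical variance components chosen for illustration only.
rho = provider_correlation(var_u1=0.25, var_u2=1.0, cov_u1u2=-0.01)
print(round(rho, 3))
```

With these illustrative inputs the covariance is small relative to the two variances, so the resulting correlation is close to zero.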

Our cost response variable \(y_{1ijk}\) is modeled using a log-linear model and our outcome response variable \(y_{2ijk}\) using a linear model. The multilevel estimates are statistically efficient even if some observations have missing data for either response under the assumption that data is missing at random [24].

The coefficients for the cost response can be interpreted in terms of a percentage change in the geometric mean of cost. For covariates measured as dummy variables, this is the percentage change in the geometric mean resulting from a change in the variable from zero to one, which can be calculated as (exp(β) − 1) × 100. For continuous variables, the same expression gives the percentage change in the geometric mean of cost resulting from a one-unit change in the variable. The coefficients for the outcome response can be interpreted as marginal effects: the change in the follow-up HoNOS score arising from a change in a binary independent variable from zero to one, or from a one-unit change in a continuous variable.
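The conversion between a log-linear coefficient and a percentage change in the geometric mean of cost can be sketched as follows. This is a minimal illustration of the expression (exp(β) − 1) × 100; the function names and the coefficient value are hypothetical and do not come from our estimation results.

```python
import math


def pct_change_from_log_coef(beta: float) -> float:
    """Percentage change in the geometric mean implied by a log-linear coefficient."""
    return (math.exp(beta) - 1.0) * 100.0


def log_coef_from_pct_change(pct: float) -> float:
    """Inverse mapping: the coefficient that implies a given percentage change."""
    if pct <= -100.0:
        raise ValueError("a reduction of 100% or more has no finite log coefficient")
    return math.log(1.0 + pct / 100.0)


# Hypothetical coefficient on a dummy variable in the cost equation.
beta = 0.5
print(round(pct_change_from_log_coef(beta), 1))

# The two functions are inverses of one another.
print(round(log_coef_from_pct_change(pct_change_from_log_coef(beta)), 6))
```

For small coefficients, 100 × β is a close approximation to the percentage change, but the approximation deteriorates quickly as the coefficient grows, which is why the exponentiated form is used throughout.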

The model is estimated by restricted iterative generalized least squares (RIGLS), which is equivalent to restricted maximum likelihood [25], in MLwiN [24] using the runmlwin command [26] in Stata 13.0 [27].

We performed a sensitivity analysis by excluding a provider that is an outlier on the outcome response.

Data

Reference cost data

Reference cost data is submitted by publicly owned providers to the Department of Health and reflects the costs of providing mental health services. Reference cost data for mental health care is reported as per diem costs per care cluster for admitted and non-admitted care separately for each provider. The reference cost data was cleaned by omitting data for outliers defined as greater than four times the cost reported in the previous (for 2012/13 data) or following year (for 2011/12 data) (n = 102,121). This resulted in dropping one provider with consistently high costs for all clusters across both years giving a sample size of 55 providers.
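The outlier rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the code used for the analysis: the data layout (unit costs keyed by provider, cluster, and setting), the function names, and the decision to keep a value when no adjacent-year cost is available for comparison are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical layout: (provider, cluster, setting) -> {financial year: per diem cost}
UnitCosts = Dict[Tuple[str, int, str], Dict[str, float]]

YEARS = ("2011/12", "2012/13")


def is_outlier(cost: float, comparison: float, factor: float = 4.0) -> bool:
    """True if a unit cost exceeds `factor` times the adjacent-year unit cost."""
    return cost > factor * comparison


def clean_unit_costs(costs: UnitCosts) -> UnitCosts:
    """Drop unit costs flagged as outliers relative to the other financial year."""
    cleaned: UnitCosts = {}
    for key, by_year in costs.items():
        kept = {}
        for year in YEARS:
            if year not in by_year:
                continue
            other = YEARS[1] if year == YEARS[0] else YEARS[0]
            # Assumption: with no adjacent-year cost to compare against, keep the value.
            if other in by_year and is_outlier(by_year[year], by_year[other]):
                continue
            kept[year] = by_year[year]
        if kept:
            cleaned[key] = kept
    return cleaned


# Illustrative values only.
raw = {
    ("A", 1, "admitted"): {"2011/12": 300.0, "2012/13": 1500.0},
    ("A", 1, "non_admitted"): {"2011/12": 10.0, "2012/13": 9.0},
}
print(clean_unit_costs(raw))
```

In the illustrative data, the 2012/13 admitted cost is more than four times the 2011/12 value and is dropped, while the remaining unit costs are retained.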

To construct our cost-response variable, we measured all activity during a CRP that corresponded to mental health services reimbursed under the care clusters. For each observation (CRP), we calculated the total number of days or length of stay in admitted and non-admitted care. These length-of-stay variables were then multiplied by the per diem unit costs for admitted and non-admitted care for the particular care cluster and provider in order to construct a variable reflecting the total cost associated with a CRP. It is important to highlight that the use of cost data reported at a provider level, albeit disaggregated by cluster and by admitted and non-admitted care, will conceal the true variation in cost that would be evident in data reported at the patient level. We used the 2011/12 reference cost data for activity between 1 April 2011 and 31 March 2012 and the 2012/13 reference cost data for activity between 1 April 2012 and 31 March 2013. For CRPs with activity in both financial years, we calculated a weighted average cost that reflects the number of days during the CRP in each year.
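The construction of the cost response can be illustrated with a short Python sketch. It assumes, for the purpose of the example, that days and per diem unit costs are held in dictionaries keyed by financial year; the function names and all values are hypothetical.

```python
from typing import Dict


def weighted_unit_cost(days: Dict[str, int], per_diem: Dict[str, float]) -> float:
    """Day-weighted average per diem cost across the financial years of a CRP."""
    total_days = sum(days.values())
    if total_days == 0:
        return 0.0
    return sum(n * per_diem[year] for year, n in days.items()) / total_days


def crp_total_cost(
    admitted_days: Dict[str, int],
    non_admitted_days: Dict[str, int],
    admitted_per_diem: Dict[str, float],
    non_admitted_per_diem: Dict[str, float],
) -> float:
    """Total CRP cost: days in each setting times the weighted unit cost for that setting."""
    admitted = sum(admitted_days.values()) * weighted_unit_cost(
        admitted_days, admitted_per_diem
    )
    non_admitted = sum(non_admitted_days.values()) * weighted_unit_cost(
        non_admitted_days, non_admitted_per_diem
    )
    return admitted + non_admitted


# Hypothetical CRP spanning both financial years, for one cluster and provider.
total = crp_total_cost(
    admitted_days={"2011/12": 10, "2012/13": 5},
    non_admitted_days={"2011/12": 20, "2012/13": 20},
    admitted_per_diem={"2011/12": 300.0, "2012/13": 330.0},
    non_admitted_per_diem={"2011/12": 10.0, "2012/13": 8.0},
)
print(total)  # 15 days at a weighted 310.0 plus 40 days at a weighted 9.0 = 5010.0
```

Because the unit costs are provider averages by cluster and setting, two patients with the same days in each setting, cluster, and provider receive the same total cost in this construction.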

Mental Health Minimum Data Set (MHMDS)

The MHMDS is a patient-level data set with national coverage for England. It is mandatory for providers of specialist, including elderly, mental health services funded by the NHS to submit MHMDS data on a quarterly and annual basis. The MHMDS contains data on all the care and treatment received by a service user irrespective of setting. We use Version 4.0 of the MHMDS, which covers 2011/12 and 2012/13 and includes information pertaining to the NTPS for mental health. As patients are not potentially identifiable from the data, and are not directly involved in the research, ethical approval was not required. The MHMDS data was cleaned to remove observations that are duplicates, have age coded as less than 18 years or greater than 110 years, and are treated by private providers. We also dropped observations with inpatient days in the 99th percentile for clusters 1 (common mental health problems, low severity) and 2 (common mental health problems) (n = 833). Admission thresholds have increased [28,29,30] to the extent that patients are being admitted under the Mental Health Act (MHA) in order to access inpatient treatment [31]. Therefore, we would not expect patients in clusters 1 and 2 to receive long periods of inpatient treatment due to demand pressures on inpatient beds.

HoNOS is routinely collected as part of the MHMDS, both as part of the MHCT and as an outcome measure. We use the total HoNOS score recorded as part of the MHCT at the end of a CRP (follow-up HoNOS) as our outcome response. Risk-adjustment covariates can vary for each response. The total HoNOS score recorded at the start (baseline) of a CRP is included as a risk-adjustment variable for the outcome response only, as previous studies [32,33,34,35] show that baseline outcome is a consistent predictor of follow-up outcome. Risk-adjustment covariates for both responses cover demographic, treatment, need, and social variables. Demographic variables include age, gender (with female as the reference category), ethnicity (white (reference category), black, Asian, and other) and married/civil partner (with unmarried/no civil partner as the reference category). Age is grouped into five categories reflecting quintiles of the distribution in order to capture any non-linearities in the relationship with cost and outcomes, with age 18–34 years as the reference category. Variables reflecting whether a patient has care coordinated under the Care Programme Approach (CPA) (a method of assessing, planning and reviewing the needs of a person with severe mental illness) or has been admitted to hospital under the Mental Health Act (MHA) provide information on severity and treatment. Missing values of CPA and MHA were coded as zero under the assumption that it is unlikely these observations were subject to the MHA or under CPA, given the high levels of scrutiny of these activities. We include dummy variables for the 21 care clusters to investigate the extent to which these explain variations in costs and outcomes. We use the cluster with the lowest cost (cluster 1: common mental health problems, low severity) as the reference category. The MHMDS also contains data for a small area level geographic marker, the Lower Layer Super Output Area (LSOA) of the individual.
LSOAs are a geographic hierarchy with a minimum population of 1000 and a mean of 1500 [36]. We matched LSOA codes in the MHMDS to data on the Index of Multiple Deprivation (IMD) Income Domain [37]. The IMD Income Domain measures the proportions of the population experiencing income deprivation in an area [37]. A higher score for the Income Domain indicates a greater proportion of the population in the area in which the patient lives experiences income deprivation. A dummy variable is included as a covariate for both responses in order to capture the year (2011/12 and 2012/13) in which the cluster commenced with 2011/12 as the reference category.

Results

Response variables

Figure 1 shows our cost response variable and Fig. 2 our outcome response variable measured at the CRP level.

Fig. 1 Log of total cost

Fig. 2 Total follow-up HoNOS score

The graphs show that both variables are approximately normally distributed although the outcome response variable is slightly right skewed, reflecting a smaller number of observations with high follow-up HoNOS scores (and more severe mental health problems).

Descriptive statistics

Our estimation sample is 697,022 observations treated by 55 providers of which 269,525 observations have both cost and outcome responses, 419,879 observations have the cost response only, and 7618 have the outcome response only.

Table 1 displays the descriptive statistics for our estimation sample with reference categories in brackets.

Table 1 Descriptive statistics

Estimation results

Table 2 displays the estimation results. All variables are statistically significant with the exception of other ethnicity and cluster 11 (ongoing recurrent psychosis, low symptoms) for the total follow-up HoNOS dependent variable.

Table 2 Estimation results

For the log of total cost response variable, many of the cluster variables are associated with the largest effects. For example, cluster 14 (psychotic crisis) is associated with a 647% and cluster 17 (psychosis and affective disorder difficult to engage) a 555% increase in cost compared to cluster 1 (common mental health problems, low severity). Other clusters associated with psychosis (clusters 10, 13, and 15) are also associated with considerable increases of over 400% compared to cluster 1. CRPs with an admission under the MHA are associated with a 98% increase in cost. In terms of demographic variables, black ethnicity is associated with an increase of 9% in costs compared to white ethnicity, while age of 63–79 years is associated with an increase in costs of 34% compared to age 18–34 years. CRPs that started in 2012/13 are associated with a 39% reduction in costs compared to CRPs that started in 2011/12. Unit costs for non-admitted care reported by providers fell from 2011/12 to 2012/13 and we believe this is driving the negative association between total cost and CRPs that started in 2012/13.

For the follow-up HoNOS response variable, a positive coefficient signifies a worse outcome. Covariates associated with an improved outcome include being married or having a civil partner, Asian and black ethnicities compared to white ethnicity, and older age. Marriage/civil partnership and age 80 years or over are associated with a reduced HoNOS score of around 0.4 points while black ethnicity is associated with a reduction of 0.3 points. The MHA and CPA variables are associated with an increase in the follow-up HoNOS score of 0.4–0.5 points. Similar to the cost response, the clusters with higher severity are associated with greater magnitudes of effects. Compared to cluster 1 (common mental health problems, low severity) cluster 15 (severe psychotic depression) and cluster 16 (dual diagnosis, substance abuse and mental illness) are associated with an increase of 3.6 points and 3.9 points, respectively. Clusters for cognitive impairment or dementia are associated with an increase of 4.4 points (cluster 20—cognitive impairment or dementia, high need) and 5.5 points (cluster 21—cognitive impairment or dementia, high physical need) compared to cluster 1. A CRP that started in 2012/13 is associated with an increased HoNOS score of around 0.2 points.

Residual variation in provider costs and outcomes

The correlation between residual costs and outcomes at the provider level was calculated as −0.004 for unadjusted outcomes and −0.02 for risk-adjusted outcomes, suggesting little evidence of a meaningful relationship between the two variables.

Figure 3 shows the pairwise plot of risk-adjusted residual costs and outcomes for the providers in our analysis.

Fig. 3 Pairwise plot of residual costs and outcomes for providers

The providers fit quite evenly into four groups; those associated with (1) higher costs and lower follow-up HoNOS scores (better outcome) in the top left quadrant, (2) higher costs and higher follow-up HoNOS scores (worse outcome) in the top right quadrant, (3) lower costs and higher follow-up HoNOS scores (worse outcome) in the bottom right quadrant, and (4) lower costs and lower follow-up HoNOS scores (better outcome) in the bottom left quadrant. There is an outlier provider with a residual follow-up HoNOS score of just over four points above the average and slightly above-average residual costs. Compared to the average, this outlier provider has higher than average baseline and follow-up HoNOS scores, and higher than average proportions of observations of white ethnicity; in the older age groups (63–79 years and 80 years and over); under CPA; in cognitive impairment or dementia clusters (clusters 18 and 19); and with lower income deprivation.
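The grouping of providers into the four quadrants described above can be sketched as follows. This is an illustrative Python example only: the provider identifiers and residual values are hypothetical, and assigning a residual of exactly zero to the "lower cost" or "better outcome" group is an assumption made for the example.

```python
from collections import Counter
from typing import Dict, Tuple


def quadrant(residual_cost: float, residual_honos: float) -> str:
    """Label a provider by the signs of its residual cost and residual follow-up HoNOS.

    A positive residual HoNOS score signifies a worse outcome than average.
    Assumption: residuals of exactly zero fall in the lower-cost / better-outcome group.
    """
    cost = "higher cost" if residual_cost > 0 else "lower cost"
    outcome = "worse outcome" if residual_honos > 0 else "better outcome"
    return f"{cost}, {outcome}"


# Hypothetical provider-level residuals: (residual log cost, residual HoNOS).
providers: Dict[str, Tuple[float, float]] = {
    "P1": (0.30, -0.8),
    "P2": (0.10, 4.2),
    "P3": (-0.50, 0.6),
    "P4": (-1.20, -1.3),
}

groups = {name: quadrant(cost, honos) for name, (cost, honos) in providers.items()}
counts = Counter(groups.values())

for name, label in groups.items():
    print(name, label)
print(dict(counts))
```

A grouping of this kind describes position relative to the average provider only; it does not by itself distinguish poor performance from unmeasured differences in case mix.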

The estimates of risk-adjusted residual costs and outcomes at the provider level are normalized to have a mean of zero. The follow-up HoNOS response variable is measured on a continuous scale from 0 (best) to 48 (worst), meaning that a positive value for the residual total follow-up HoNOS score signifies a worse outcome. The best performer in relation to outcomes is associated with a risk-adjusted residual follow-up HoNOS score of 1.36 less than the average performer, while the worst performer is associated with a risk-adjusted residual follow-up HoNOS score of 4.23 greater than the average performer. The estimates of residual total cost can be interpreted as percentage increases or decreases in the geometric mean of total cost compared to the average, calculated as (exp(EB estimate) − 1) × 100. The best performing provider in relation to risk-adjusted residual total cost is associated with a total cost that is 71% lower than the average performing provider, while the worst performing provider is associated with a total cost that is 194% higher than the average performing provider.

Sensitivity analysis

The exclusion of the provider with the above-average risk-adjusted residual follow-up HoNOS score of 4.23 decreased the estimation sample to 681,305 observations. The estimation results were robust to this change. The correlation between risk-adjusted residual costs and outcomes at the provider level became −0.09. Figure 4 displays the pairwise plot of residual costs and outcomes for the 54 providers in the sensitivity analysis. The risk-adjusted residual follow-up HoNOS score for the worst performing provider on this response reduced to 1.39 above the average performing provider. There was also a small reduction in risk-adjusted residual total cost, with the worst performing provider associated with a total cost that is 191% higher than the average performing provider.

Fig. 4 Pairwise plot of residual costs and outcomes for providers for sensitivity analysis

Discussion

The reimbursement of mental health care providers in England is undergoing a considerable reform with the introduction of a prospective, activity-based payment system. From April 2017, all mental health providers and commissioners are required to link payment to locally agreed quality and outcome measures [11] and some local health economies have already developed outcome measures and indicators for payment purposes [38]. The new system will offer incentives for providers to deliver care more efficiently while better meeting patient needs and improving outcomes. This research has explored the relationship between mental health costs and outcomes in order to examine the scope for providers to respond to the incentives introduced by the new payment system and makes an important contribution to the limited literature for mental health care.

Prior to risk-adjustment, we find evidence of a very small negative correlation between costs and outcomes at the provider level. After controlling for a range of demographic, need, social and treatment factors, we find that residual variation remains in both costs and outcomes at the provider level. However, the correlation between residual costs and outcomes at the provider level is minuscule, which suggests that a trade-off between cost containment and outcome improving efforts on the part of providers is not a major concern in our data. Plotting the provider-level residual costs and outcome response variables reveals that providers broadly fall into four groups with an outlier provider. Providers with higher than average residual costs and higher than average residual follow-up HoNOS scores (worse outcome) may signify poor performance but may also indicate that certain providers are treating a case mix that our model has not fully accounted for. While patient case mix is controlled for to a certain extent by the care clusters, the clustering method does not explicitly take diagnosis into account and it is likely that the clusters are very variable in terms of diagnosis and case mix [4, 39]. It may also be the case that some patients have treatment-resistant variants of mental illness, which implies that they will be consuming large amounts of care and resources with little discernible change in outcome scores [40]. If certain providers have a higher caseload of such patients, this could well explain their unexplained higher costs and worse outcomes. If the higher costs are legitimate then these providers may warrant additional payments as defined by any outlier policy attached to the payment system. A number of providers are associated with lower residual follow-up HoNOS scores (better outcome) but also with higher residual costs.
These providers in particular may face a potential trade-off between costs and outcomes and efforts to reduce costs under a national tariff may compromise outcomes if providers are induced to undertake undesirable behaviors such as skimping on patient care. A number of providers have lower than average residual costs and lower than average residual follow-up HoNOS scores (better outcome). These providers are likely to benefit financially from the new payment system if a national tariff is introduced and patient outcomes are linked to provider payment. Providers with lower than average residual costs and higher than average residual follow-up HoNOS scores (worse outcome) may have scope to make financial profits under a national tariff but these may be offset if payment is linked to outcomes. If providers are achieving lower costs at the expense of patient outcomes then they would warrant particular scrutiny by commissioners under quality and outcomes standards established in the contracting process.

It is important to highlight a number of limitations regarding the data we use. Concerns have been raised regarding the quality of mental health reference cost data in particular in relation to variations in unit costs within clusters and between providers as well as issues around missing data [41]. A further limitation of the cost data used is that it is essentially provider-level cost data that underpins our dependent cost variable. While variation in the dependent cost variable will arise for patients in different clusters with different care patterns, a greater level of variation would be observed if we had access to data on the actual costs incurred by individual patients, rather than the provider average. While the MHMDS contains variables on primary and secondary diagnoses, poor coding of this data inhibited inclusion of diagnosis in our set of risk-adjustment variables. A further data limitation is the missing data for follow-up HoNOS scores. We assume that this data is missing at random but this assumption would not hold if, for example, patients with more severe mental health problems are more likely to drop out of care and not have a follow-up HoNOS score recorded. Due to the poor coding of diagnosis data we are unable to investigate this hypothesis. Despite these data limitations, this research provides a valuable insight into the relationship between mental health costs and outcomes that is pertinent in the context of prospective activity-based payment. Our data show no evidence of a strong relationship between costs and outcomes. This provides some reassurance that outcome improvements can be obtained without spending a lot more, or, conversely, that some savings are possible. The research will be useful to commissioners of mental health services by providing an indication of how providers perform on disparate objectives. This work also benefits policymakers in planning future refinements to the payment system.