Introduction

This paper examines alternatives for improving the prospective payment formulae that government funding agencies use to fund hospitals, so that they better reflect variations in costs and severity across hospitals and patients. While funders in Australia and elsewhere have been switching to diagnosis-related group (DRG) payment formulae to reimburse hospitals, DRG classification systems around the world have proved limited in their ability to predict cost differences between teaching and non-teaching hospitals. Part of this issue may relate to the state-wide referral role of some teaching hospitals, which increases the complexity of the patients they treat within the Australian-refined diagnosis-related groups (AR-DRG) related to such services. Data from the Victorian Department of Human Services, Australia, were analysed to investigate this issue.

This analysis was undertaken because of concerns that existing mechanisms for paying hospitals may be leading to systematic underpayment of some teaching hospitals, due to the averaging principle inherent in the use of AR-DRG cost weights and the funding policy that all centres should be paid the same for the same AR-DRG episode. The Victorian government established a committee called the Risk Adjustment Working Group (RAWG) in 2002 involving both government (Victorian Department of Human Services) and hospital industry representatives to examine this issue in consultation with international experts. The RAWG’s key Terms of Reference are to advise the government on the need for risk-adjusted funding arrangements for, inter alia, high-complexity patients of state-wide specialty services via risk-adjusted specified grants (RASG). In this paper, we examine several alternative approaches for changing payment systems for hospitals in Victoria through risk adjustment and consider the implications for hospital payment in an international context. Our analysis builds on the initial work undertaken by Antioch and Walsh [1] in the area.

A 2004 review of hospital prices and resource allocation by the Victorian Department of Human Services (DHS), the Department of Premier and Cabinet, and the Department of Treasury and Finance identified non-salary cost escalation and variable management performance as key determinants of declining hospital financial performance. For 2004–2005, it recommended a financial sustainability framework linked to demand management, strategic planning, accountability and performance reporting to eliminate deficits and control costs. Hospitals were asked to meet productivity targets of at least 0.5% over two years to contribute to deficit elimination. Hospital cost control mechanisms are to be strengthened through the development of guidelines for medical and surgical supplies and pharmaceutical cost control, following the recommendations of two independent consultancies on best practice in these areas (DHS, 2004; [14]). The hospital industry has argued that pricing reform is an important strategy for addressing hospital deficits; these arguments should be considered within the recommended broader assessment of the role of variable hospital management performance and non-salary cost escalation. Whilst the issues are complex, the risk adjustment analyses can, potentially, shed some light on new mechanisms to further assist the funding processes.

Australian health care system and the reform context

The Australian health care system is managed within the country’s federal structure of government, which includes Commonwealth (national), State and Local tiers. State and Territory governments have the major responsibility for the financing and public provision of health services, including public and psychiatric hospitals under what are now called Australian Health Care Agreements (AHCA) between the Federal and State governments. The Federal government funds a universal benefit scheme for private medical services called the Medical Benefits Schedule and pharmaceuticals via the Pharmaceutical Benefits Scheme. In addition to this universal public insurance program, many individuals also purchase private insurance that covers additional benefits, such as access to private hospitals, a choice of medical specialists in public and private hospitals, dentistry and certain ancillary services, such as physiotherapy (see [12, 17] for discussion).

Australia relies upon both demand-side and supply-side incentives to try to control costs. Demand-side measures include co-payments by consumers, while supply-side approaches to containing government outlays include limiting the range of items covered by the Medical Benefits Schedule and the Pharmaceutical Benefits Scheme. In recent years, governments have promoted competition and emphasised evidence-based medicine. They have also separated purchaser, provider and regulatory functions and improved primary care, prevention and systems integration functions [12]. Advances in risk adjustment are currently being explored as a key mechanism to aid funding reform in Australia at the Federal and State levels of government. An important element of health care reform in Victoria, one of Australia’s largest states, is improving the casemix funding system, particularly as it affects major teaching hospitals.

Since 1 July 1993, Victorian public hospitals have been funded based on customised AR-DRG casemix systems, which are updated annually. Initially limited to the AR-DRG funding of inpatient services, this system has since been extended to include virtually all episode-based funding of sub-acute and non-inpatient services [4, 11]. Hospital separations (elsewhere called discharges or visits) are coded using the International Classification of Diseases, 10th revision, Australian modification. Inpatient separations in Victoria are allocated to AR-DRGs using a slightly modified form of the national classification [11]; the Victorian modifications involve changes to the grouping criteria for only a few AR-DRGs.

Prior to 2000, Victorian inpatient casemix funding reimbursed variable and fixed costs separately. Since 2000–2001, casemix payments have been combined into a single payment rate, with allowances for rural areas and differential claw-backs for different levels of underperformance. The primary payment unit for each separation is its weighted inlier equivalent separation or WIES. Most separations are classed as “inliers,” meaning that their length of stay (LOS) falls between lower and upper trim points. “Outlier” separations, those with LOS falling outside the lower and upper trim points, receive a variable payment based in part on LOS and in part on their inlier equivalent [11].

The WIES value for a separation is determined by converting each separation into an “inlier equivalent” and multiplying that value by a cost weight. The calculated WIES value for the separation is then multiplied by the standard (WIES) payment per inlier equivalent and the payment for the separation is claimed from the Victorian Department of Human Services. The WIES value for a low LOS outlier is derived by conversion into a partial episode value, again described as an “inlier equivalent,” which is multiplied by a cost weight in the same way as an inlier payment. For example, in 2004, a 3-day stay for major small and large bowel procedures (AR-DRG G02A), which has a 5-day low boundary point, had an outlier equivalence of 0.6 (3-day stay/5-day low boundary), an inlier cost weight of 5.2949 and a WIES value of 3.1769 (0.6×5.2949). Similar to the US and other countries, high LOS outliers received additional payments for each day above the high outlier threshold. This payment was calculated at 70% or 80% of the AR-DRG average inlier cost per day (excluding operating theatre and prostheses costs). Final adjustments for high outlier weight payments sometimes distinguish rural and urban hospitals (DHS, 2002, 2004; [14, 16]).
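To make these mechanics concrete, the following minimal Python sketch reproduces the inlier, low-outlier and high-outlier logic described above. The function, the 28-day high trim point, the average inlier cost per day and the payment rate are illustrative assumptions, not official DHS parameters; only the G02A cost weight and 5-day low boundary come from the worked example in the text.

```python
# Illustrative sketch of the WIES calculation described above.
# Parameter values (trim points, average cost per day, payment rate)
# are assumptions for the G02A example, not official DHS figures.

def wies_value(los, low_trim, high_trim, cost_weight,
               avg_inlier_cost_per_day, per_diem_share=0.7,
               wies_payment_rate=2919.0):
    """Return (wies, payment) for a single separation."""
    if los < low_trim:
        # Low LOS outlier: partial "inlier equivalent" scaled by LOS.
        inlier_equivalent = los / low_trim
        wies = inlier_equivalent * cost_weight
    elif los <= high_trim:
        # Inlier: full cost weight.
        wies = cost_weight
    else:
        # High LOS outlier: inlier weight plus a per diem add-on for each
        # day above the high trim point (70-80% of the AR-DRG average
        # inlier cost per day, expressed here in WIES units).
        extra_days = los - high_trim
        per_diem_wies = (per_diem_share * avg_inlier_cost_per_day
                         / wies_payment_rate)
        wies = cost_weight + extra_days * per_diem_wies
    return wies, wies * wies_payment_rate

# Worked example from the text: 3-day stay for AR-DRG G02A with a 5-day
# low boundary point and an inlier cost weight of 5.2949.
wies, payment = wies_value(los=3, low_trim=5, high_trim=28,
                           cost_weight=5.2949,
                           avg_inlier_cost_per_day=1500.0)
print(round(wies, 4))  # 3.1769 (= 0.6 x 5.2949)
```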

Since 2001–2002, the total hospital inpatient budget has been capped by setting maximum WIES targets for each hospital. Until a hospital reaches this expenditure cap, it receives the standard WIES payment rate, which was set at $2,515 for 2001–2002 and at $2,919 for 2004–2005 for major providers, with rural, acute care hospital rates ranging from $3,055 to $3,235 (DHS, 2004; [14]). In addition to DRG-based WIES payments, additional WIES payments, called “co-payments,” are paid to hospitals by the state government for mechanical ventilation, thalassaemia, certain stents, atrial septal defect and Aboriginal and Torres Strait Islander loading (DHS, 2004; [14]). Victorian government hospital funding policy also embraces separate funding for non-admitted patients, sub-acute and non-acute care, purchasing arrangements with the private sector, teaching, research and capital funding, performance bonuses and coding audits [11].

In addition to the WIES-based casemix payments, other facility payments are made by the state government. Specified grants are provided for specific services not covered by casemix, general patient bed day funding or training and development. These include a mixture of historically paid service grants, specific one-time grants and financial payment grants that have not been put into the general WIES price. A few specified grants were rolled into WIES for 2004–2005, including the complexity component of the Training and Development grant, an outpatient base grant and a small rural services grant (DHS, 2004; [14]). The Victorian government continues to explore alternative funding models to facilitate integrated and coordinated care.

Price issues: base payments per case and AR-DRG price relativities

Every casemix payment system needs to calculate both the base payment and a set of relative values. In Victoria, the calculation of the base payment amount is made jointly by the Department of Human Services and the Department of Treasury and Finance. Antioch et al. [5] found hospital expenditure to be associated with Victorian State Gross Product, the proportion of the population under 4 years of age, the mix of public and private patients in public hospitals, the introduction of casemix funding and subsequent funding cuts, the state-wide proportion of public beds to total beds and technology. These same factors continue to influence annual increases in base payments. However, concerns persist that the base payments have increased too slowly [1, 8]. Setting relative values correctly takes on increased importance when hospitals are facing deficits which may jeopardise their performance.

AR-DRGs and teaching hospitals

This paper builds upon the earlier analysis by Antioch and Walsh [13], which documented that hospitals such as the Alfred hospital, a state-wide provider of services for trauma, cystic fibrosis, heart and lung transplantation and chronic heart failure, treat patients that are more complex and, hence, more expensive than the AR-DRG casemix arrangements would indicate. Antioch and Walsh [1] explored the potential for RASGs to reduce the budget shortfall facing hospitals such as the Alfred. They analysed five high-complexity AR-DRGs, encompassing respiratory, cardiology and stroke AR-DRGs. Collectively, these five AR-DRGs were responsible for annual deficits of $3.6 m at the Alfred. Five stepwise linear regressions found that age, LOS outliers, number of disease types, diagnoses, procedures and emergency status were all significant predictors of patient-imputed costs. They also identified diagnosis- and procedure-based severity markers related to the state-wide referral services. The regressions explained 64% of patient-level variance for the stroke AR-DRG, and 52% and 51% for severe respiratory infections and severe chronic obstructive pulmonary disease (COPD), respectively. The proportion of variance explained for some circulatory disorders without acute myocardial infarction (AMI) was lower, at between 6% and 20% [1].

Previously, Antioch and Walsh [2] highlighted the high-severity/complexity flow-on effects of state-wide referral services for trauma, impacting on AN-DRG 23 (craniotomy with complications and co-morbidities) and AN-DRG 3 (tracheostomy, except for mouth, larynx or pharynx disorders, age over 15 years). The Alfred hospital negotiated increases in RASGs totalling around $14 million over the period from 1998 to 2004 for these DRGs and also for cystic fibrosis [1, 3].

For casemix payments to be acceptable, the base price and the relative cost weights must be set appropriately; otherwise, underfunding problems will emerge. From the perspective of a large teaching hospital, the pursuit of equity in addition to efficiency would involve the principle of a fair price that covers the appropriate costs of an efficient provider. It would also enable a sustainable provider industry, avoid the need for cross-subsidisation between hospital services and avoid the need for additional specified grants. Antioch and Walsh [1] argued that the AR-DRG formula adjustments for complexity, age, sex and outliers do not go far enough and that RASGs may be a very helpful solution.

The Victorian experience is relevant for many other countries. Crafting a fair and efficient payment mechanism for hospitals is an enduring health policy challenge facing every country [7, 10]. Problems have emerged with the prospective payment systems used by US Medicare and other US payers, which are criticised for not adequately capturing differences in severity within DRGs. Many studies have examined the relationship between profitability and illness severity at the hospital level (for a review, see Carpenter et al. [7]). Carpenter et al. [7] found that two measures of severity, i.e. the number of unrelated diseases and disease stage, are significant predictors of cost per case and often have better predictive power than DRGs. In the majority of instances, DRG payments did not compensate adequately for severity, and higher values for the severity variable resulted in financial losses for the hospital.

Training and development grants

Equitable payment of teaching costs is a particular challenge for every country. The costs of clinical care and teaching are closely interwoven and are not easily allocated between these two functions. This problem is further compounded by the fact that teaching and other speciality hospitals tend to attract more complex patients. In Victoria, funding to recognise special teaching hospital costs has been provided through Training and Development (T&D) grants. Following the 2001–2002 review of T&D grants, funding was divided between a complexity component and a training and teaching component, with the latter based on actual staff numbers. In 2004, the complexity component of the T&D grant was aimed at compensating teaching hospitals for treating more complex patients within selected AR-DRGs. Patient complexity was measured by identifying complex AR-DRGs and, within them, the most expensive conditions, based on the highest cost patients and the related ICD-10 procedure and diagnosis codes that accounted for 30% of the workload. Each hospital’s proportion of WIES associated with “complex” patients in “complex” AR-DRGs was then estimated and the complexity grant was allocated based on this share of WIES (DHS, 2003, 2004; [14, 15]).

In summary, the setting for our analysis is one in which payments to health care facilities are based on AR-DRGs, but subject to numerous adjustments. Fixed AR-DRG payments are adjusted upwards and downwards for high and low LOS outliers. Further specified grants are made for certain services, and T&D grants are made to pay for complexity and teaching costs.

Methodology

Risk adjustment alternatives for hospital casemix funding

The starting point for our analysis was to identify the AR-DRGs for 2002–2003 that contributed the most to losses by major teaching hospitals in Victoria. The teaching hospitals participating in RAWG provided data on the ten AR-DRGs that contributed the most to their deficits. These deficit calculations were based on all costs incurred and all revenue for WIES-funded activity, including fixed, variable and specified grant components. Each hospital was requested to identify severity markers (particularly diagnosis and procedure codes) related to the 15 most expensive patients in each deficit AR-DRG. This methodology, along with the initial formulations of how to calculate the net RASGs, was based on that outlined in Antioch and Walsh [1]. The analysis was influenced by the international literature on risk adjustment by Van de Ven and Ellis [13].

Preliminary analysis

The initial regression models tested were based on variables identified by Antioch and Walsh [1], including:

  • Severity markers (selected diagnosis and procedure codes identified by leading clinicians as specifically relating to the state-wide referral service in each hospital)

  • Age

  • Sex

  • Number of diagnoses

  • Number of disease types (i.e. body systems)

  • Complexity as measured by the patient clinical and complexity level (PCCL, four different levels, created by the AR-DRG grouper)

  • Flag for high outlier on the length of stay

  • Emergency department admission

  • Number of procedures.

The dependent variable was per patient cost for the hospital stay. Severity marker procedure and diagnosis code data related to state-wide referral services were identified for three hospitals using clinical input and were applied to those AR-DRGs across all hospitals in the data set.

A linear model was specified with explanatory variables capturing the factors listed above and is discussed further below:

$$\begin{aligned} & Y = \beta _{0} + \beta _{1} {\left( {{\text{SEVERITY}}\;{\text{MARKERS}}} \right)} + \beta _{2} {\left( {{\text{AGE}}} \right)} + \beta _{3} {\left( {{\text{SEX}}} \right)} + \beta _{4} {\left( {{\text{DIAG}}} \right)} \\ & \quad \quad + \beta _{5} {\left( {{\text{DISEASE}}\;{\text{TYPES}}} \right)} + \beta _{6} {\left( {{\text{COMPLEX}}1} \right)} + \beta _{7} {\left( {{\text{COMPLEX}}2} \right)} \\ & \quad \quad + \beta _{8} {\left( {{\text{COMPLEX}}3} \right)} + \beta _{9} {\left( {{\text{COMPLEX}}4} \right)} + \beta _{{10}} {\left( {{\text{OUTLIER}}} \right)} + \beta _{{11}} {\left( {{\text{EMERG}}} \right)} \\ & \quad \quad + \beta _{{12}} {\left( {{\text{PROCEDURES}}} \right)} + \varepsilon \\ \end{aligned} $$
(1)
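As an illustration, the specification in Eq. 1 might be estimated separately for each AR-DRG along the following lines. This is a minimal sketch assuming a hypothetical patient-level extract; the file name and column names are assumptions rather than the actual DHS variable names, and the PCCL levels are represented as one categorical variable rather than four separate dummies.

```python
# Hedged sketch: estimating the Eq. 1 specification separately for each
# AR-DRG. Column names describe a hypothetical patient-level extract.
import pandas as pd
import statsmodels.formula.api as smf

episodes = pd.read_csv("episodes_2002_03.csv")   # hypothetical file

# Explanatory variables from Eq. 1; PROCEDURES (and later the PCCL levels
# and DIAG) can be dropped as discussed in the text.
formula = ("cost ~ severity_marker + age + sex + n_diagnoses + "
           "n_disease_types + C(pccl) + high_los_outlier + "
           "emergency_admission + n_procedures")

results = {}
for drg, group in episodes.groupby("ar_drg"):
    fit = smf.ols(formula, data=group).fit()
    results[drg] = {"r2": fit.rsquared, "coefs": fit.params}

for drg, res in sorted(results.items()):
    print(f"{drg}: R^2 = {res['r2']:.4f}")
```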

Some analyses excluded the number of procedures (PROCEDURES), which improved the stability of the model. The above specification in Eq. 1 was, therefore, further analysed excluding procedures and utilising data for financial years 2001–2002 and 2002–2003 for our sample of 23 hospitals, including some teaching and large rural hospitals. These analyses found R² values ranging from 0.0181 for AR-DRG L61Z (Admit for renal dialysis) to 0.6463 for AR-DRG L62A (Kidney and urinary tract neoplasms with catastrophic or severe CCs). Of the AR-DRGs analysed, approximately 31 (or 53%) had R² values over 0.400, indicating that over 40% of the variance was explained by the specification. However, there were often negative or insignificant coefficients for the four PCCL complexity variables (COMPLEX1–4), based on the complexity measure created by the AR-DRG grouper. This reinforced previous analyses reported in the Victorian Department of Human Services 2003–2004 Policy and Funding Guidelines (DHS, 2004; [14]), which indicated that the PCCL was not a significant severity adjustment variable once other predictors, such as outlier status, were included in the equation. Hence, refined models were conceptualised and tested below, excluding the PCCL variables, procedures and the number of diagnoses (DIAG), given that the number of body systems (DISEASE TYPES) was already captured.

Four funding models

Further analysis was undertaken using variations of Eq. 1 above. Four further funding policy models to risk-adjust casemix funding in Victoria were conceptualised.

Independent variables

All regression models excluded the number of procedures, PCCL level and the number of diagnoses as independent variables. A new variable, called “transfers in,” was also included, which detected whether the admission was the result of a transfer from another facility. Hence, the independent variables explored included:

  • Severity markers (aggregated or disaggregated)

  • Age

  • Sex

  • Number of disease types (i.e. body systems)

  • Outlier on length of stay

  • Emergency admission

  • Transfers in.

These variables were used in Models 2, 3 and 4 below. Model 1 used only the severity markers as an independent variable.

Dependent variables

For most of our analysis, the dependent variable was per patient costs. In the case of Model 1 below, another dependent variable was also used (cost minus WIES payment). This variable is of interest because it is an empirical approximation of the degree of underpayment (overpayment if negative) under the 2002–2004 WIES-based payment formula. Hence, this regression model attempts to explain costs not already predicted by the WIES-based payment formula. This variable represents only a proportion of the underpayment (i.e. the difference between cost and revenue), as other revenue sources are payable to hospitals in addition to the WIES price, such as specified grants. We also varied the explanatory variables used as severity markers, using both aggregated and disaggregated versions.

Severity markers

For all analyses, AR-DRG-specific severity marker variables were constructed using diagnoses and procedure codes identified as potential signals of higher costs and clinical complexity that were related to the state-wide referral services of the hospital with the AR-DRG deficit. Severity marker codes had been provided by clinicians at five teaching hospitals and were included in the analyses. Two variations were used. The first approach was to aggregate all severity markers into a single binary variable for the specific AR-DRG that simply distinguished whether a given hospitalisation had ANY of the relevant diagnoses or procedures. The other approach was to create separate (or disaggregated) severity flags for each of the diagnosis or procedure codes for the specific AR-DRG. Severity markers relating to only three hospitals were included in the disaggregated severity marker analyses. Only results for the aggregated severity flags are reported here for Models 2, 3 and 4. Results for Model 1 using both aggregated and disaggregated severity flags are reported below.
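The two ways of constructing severity marker variables can be sketched as follows. The marker codes, column names and helper functions are hypothetical; the real code lists were AR-DRG specific and supplied by clinicians.

```python
# Sketch of constructing aggregated vs. disaggregated severity marker
# variables for one AR-DRG. The marker codes shown are placeholders.
import pandas as pd

marker_codes = {"T862", "Y830", "S250"}          # hypothetical subset

def episode_codes(row):
    """Collect all diagnosis and procedure codes recorded for an episode."""
    return {c for c in row[["diag1", "diag2", "proc1", "proc2"]] if pd.notna(c)}

def add_severity_flags(df):
    codes_per_episode = df.apply(episode_codes, axis=1)
    # Aggregated: one binary flag, set if ANY marker code is present.
    df["severity_any"] = codes_per_episode.apply(
        lambda codes: int(bool(codes & marker_codes)))
    # Disaggregated: one binary flag per marker code.
    for code in sorted(marker_codes):
        df[f"severity_{code}"] = codes_per_episode.apply(
            lambda codes, c=code: int(c in codes))
    return df
```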

Payment models

The general specification for Models 2, 3 and 4 as defined above was as follows:

$$\begin{aligned} & Y = \beta _{0} + \beta _{1} {\left( {{\text{SEVERITY}}\;{\text{MARKERS}}} \right)} + \beta _{2} {\left( {{\text{AGE}}} \right)} + \beta _{3} {\left( {{\text{SEX}}} \right)} \\ & \quad \quad + \beta _{4} {\left( {{\text{DISEASE}}\;{\text{TYPES}}} \right)} + \beta _{5} {\left( {{\text{OUTLIER}}} \right)} + \beta _{6} {\left( {{\text{EMERG}}} \right)} \\ & \quad \quad + \beta _{7} {\left( {{\text{TRANSFERS}}\;{\text{IN}}} \right)} + \varepsilon \\ \end{aligned} $$
(2)

Specifications for Model 1 were:

$$ Y = \beta _{0} + \beta _{1} {\left( {{\text{Severity}}\;{\text{Markers}}\;{\text{Aggregated}}} \right)} + \varepsilon $$
(3)
$$Y_{1} = \beta _{0} + \beta _{1} {\left( {{\text{Severity}}\;{\text{Marker}}\;{\text{Aggregated}}} \right)} + \varepsilon $$
(4)
$$Y = \beta _{0} + \beta _{1} {\left( {{\text{Severity}}\;{\text{Marker}}\;1} \right)} + \beta _{2} {\left( {{\text{Severity}}\;{\text{Marker}}\;2} \right)} + \cdots + \beta _{n} {\left( {{\text{Severity}}\;{\text{Marker}}\;n} \right)} + \varepsilon $$
(5)

The dependent variable for Eqs. 2, 3 and 5 above was “cost per patient.” The dependent variable for Eq. 4 above was “cost per patient minus WIES payment.”

Model 1: severity marker co-payment model

Using this framework, all hospitals would receive additional payments based on selected severity marker variables. The amount provided would be based on coefficients from regressions that only include severity marker flags. This approach is analogous to what is currently called the “co-payment” concept by the Victorian DHS, used to pay for selected services such as stents. Three variations are considered, as shown by Eqs. 3, 4 and 5.
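Under this model, the co-payment attached to an episode would simply be the sum of the estimated coefficients for whichever severity flags it carries. The following is a minimal sketch; the coefficient values and flag names are invented for illustration.

```python
# Illustrative Model 1 co-payment: the add-on for an episode is the sum of
# the estimated coefficients for the severity flags it triggers.
# Coefficient values are invented for illustration only.
severity_coefficients = {"severity_T862": 4200.0, "severity_S250": 2800.0}

def severity_copayment(episode_flags):
    """episode_flags maps flag name -> 0/1 for one episode."""
    return sum(coef for flag, coef in severity_coefficients.items()
               if episode_flags.get(flag, 0) == 1)

print(severity_copayment({"severity_T862": 1, "severity_S250": 0}))  # 4200.0
```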

Model 2: expanded risk-adjusted specified grant (RASG)

Under this system, the predicted cost of each patient would be calculated by new payment formulae based on multivariate regression models estimated for selected AR-DRGs. The explanatory variables under this framework might include not only the AR-DRG, but also demographics, severity, number of disease types, day outlier, emergency status and “transfers in.” Each hospital could be paid one RASG based on the summation of the net RASGs (gross RASG minus current casemix revenue) for each of the selected AR-DRGs. The gross RASG would be based on the significant coefficients of each regression for each AR-DRG. It is called an “expanded” RASG because each hospital is paid only one “expanded” RASG based on the summation of all of the RASGs that would have been payable for each AR-DRG identified. Hence, all grants are rolled into one aggregated RASG, not a series of AR-DRG-specific RASG payments. As is the case for all of the models articulated here, adjustments would need to be made for the consumer price index (CPI), wages and technology, given that the calculations would be based on the year prior to the introduction of the new funding policy.
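The expanded RASG arithmetic for a single hospital can be sketched as follows, assuming regression-predicted costs and current casemix revenue are already available at the episode level; the data frame columns are assumptions for illustration.

```python
# Sketch of the expanded RASG for one hospital: the gross RASG is the
# regression-predicted cost summed over episodes in each selected AR-DRG,
# the net RASG subtracts current casemix revenue, and the hospital receives
# one grant equal to the sum of the AR-DRG-specific net RASGs.
import pandas as pd

def expanded_rasg(episodes: pd.DataFrame, selected_drgs) -> float:
    """episodes needs columns: ar_drg, predicted_cost, casemix_revenue."""
    subset = episodes[episodes["ar_drg"].isin(selected_drgs)]
    by_drg = subset.groupby("ar_drg")[["predicted_cost", "casemix_revenue"]].sum()
    net_rasg = by_drg["predicted_cost"] - by_drg["casemix_revenue"]
    # Whether negative net RASGs offset positive ones is a policy choice;
    # here they are simply summed.
    return float(net_rasg.sum())
```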

Model 3: training and development grant

Under this system, we would calculate the expanded RASG using a similar methodology to that specified in Model 2, but using data only for hospitals that receive T&D grants (i.e. RAWG hospitals). The percentage allocation by hospital of the summed expanded RASGs across all of the teaching hospitals would then be determined. This model would only be used to determine the percentage allocation by hospital of the additional funds available to be distributed across the teaching hospitals for the T&D grants; these percentage cost burdens would be multiplied by the available funds to determine the T&D grant for each hospital. For example, the RASG calculations could imply that $15 million is justified for reallocation, while only $10 million is available from the Treasury. Each hospital’s percentage share of the summed RASGs ($15 m) could be used to calculate the desired level of funding for each hospital, and the available funds ($10 m) could then be divided up among eligible hospitals based on these proportions. Evidence of the difference between required (i.e. risk-adjusted) and available funds could be used by the DHS in funding negotiations with central agencies. An advantage of this option is that it estimates the entire pool of funds needed for appropriate risk-adjusted funding; if this amount is not made available by the Treasury, which is a political decision, the relative distribution of funds between hospitals can still be estimated and applied to the available funds. This approach would logically build on the current methodology used in the complexity component of the T&D grant.
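The proportional allocation step in the $15 m/$10 m example can be sketched as follows; the hospital-level RASG amounts are hypothetical.

```python
# Sketch of the Model 3 allocation: each hospital's share of the summed
# expanded RASGs is applied to the funds actually made available.
# RASG amounts below are hypothetical.
rasg_by_hospital = {"Hospital A": 6.0e6, "Hospital B": 5.0e6, "Hospital C": 4.0e6}
available_funds = 10.0e6          # Treasury allocation; total RASG = $15 m

total_rasg = sum(rasg_by_hospital.values())
allocation = {h: available_funds * r / total_rasg
              for h, r in rasg_by_hospital.items()}
print(allocation)  # Hospital A: 4,000,000; B: ~3,333,333; C: ~2,666,667
```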

A key difference of this new option is that the new model is based, as a starting point, only on the deficit DRGs for the hospitals currently running at a deficit and in receipt of the T&D grant. It applies risk adjusters to identify the drivers of costs for those DRGs and uses them to allocate funds in an equitable way among all of the teaching hospitals. It, therefore, limits the regression analyses to data from the RAWG hospitals that would be in receipt of the T&D grant. This option is similar to Model 2 in its choice of predictive variables, differing primarily in how the predictions from the regression model would be used. However, the regression data sets are different, given that the coefficients in Model 2 are estimated using data from all hospitals (including some rural hospitals), whereas Model 3 uses only data for the RAWG hospitals, which are the major teaching hospitals.

Model 4: risk adjustment replacement formulae

Under this system, new risk adjustment formulae would be developed for a few AR-DRGs that are high deficit for teaching hospitals, replacing the current formulae (WIES and grants).

A summary of these options and the variables included in the regressions is provided in Table 1 below.

Table 1 Models and variables for current analyses

Models 1, 2 and 4 were analysed using data for the teaching and rural hospitals for which patient-level cost information was available, while Model 3 was based on teaching hospital data only.

Results

Model 1: severity marker co-payment

The three variations of Model 1 use only severity markers as independent variables. Equations 3 and 4 use a single aggregated severity marker. Equation 5 uses disaggregated severity markers. For the disaggregated severity markers, separate indicators were identified for each diagnosis or procedure identified by clinicians as an appropriate predictor of increased spending. Up to 13 markers were identified for each AR-DRG considered. The three variations differed in the level of aggregation of the severity markers and in the dependent variable used, i.e. either “cost per patient” (Eqs. 3, 5) or “cost per patient minus the WIES revenue” (Eq. 4).

Overall, very low R² values were obtained for Eqs. 3 and 4, ranging from 0.0032 for AR-DRG R63Z (Chemotherapy) to 0.2665 for AR-DRG A04A (Allogenic bone marrow transplantation) for Eq. 3. Negative coefficients were found for the single severity markers for AR-DRG E62B (Respiratory infections/inflammation). The R² values for various AR-DRGs increased modestly for Eq. 5 when the disaggregated severity markers were used in place of the single aggregated measure in Eq. 3. This approach holds some promise, since a higher proportion of variance is explained. The R² values varied from 0.00316 for R63Z (Chemotherapy) to 0.48196 for B76B (Seizure age >2 w/o catastrophic), which included up to ten severity markers. Negative coefficients persist for some severity markers, which are difficult to rationalise.

Models 2 and 4

Models 2 and 4 utilised the same regression specification (Eq. 2) and the same data set. Hence, they are discussed together in this section. Model 2 involved the “expanded RASG,” whereby each hospital could be paid one RASG based on the summation of the net RASGs. The net RASGs would be calculated based on the gross RASG minus the current casemix revenue for each AR-DRG. The gross RASG would be based on the significant coefficients of each regression for each AR-DRG. As in other models, adjustments would be required for CPI, wages and technology. Model 4 involved risk-adjusted replacement formulae for a few AR-DRGs that were high deficit across a range of hospitals. The formulation for Model 4 could be incorporated into the current formulae (WIES and specified grants, etc.) to make payments reflect severity more accurately.

Table 2 provides results for Models 2 and 4. The data analysed are for all hospitals using cost per patient as the dependent variable. The R² values were relatively high for these options and ranged from 0.00426 for L61Z (Admit for renal dialysis) to 0.65536 for C01Z (Proc for penetrating eye injury). Of the AR-DRGs analysed, approximately 32 (or 46%) had an R² value over 0.400, indicating that, for these AR-DRGs, over 40% of the variance was explained by the specification, which is a very good outcome. A relatively large number of negative coefficients remain on selected severity markers, perhaps explained by collinearity. Any effort to include these coefficients in a payment model would need to be carefully considered.

Table 2 Model 2 (expanded risk adjusted specified grant (RASG)) and Model 4 (risk adjustment replacement formulae)

Model 2 is the expanded RASG, where each hospital is paid one RASG based on the summation of the net RASGs (gross RASG minus current casemix revenue) for each AR-DRG. This option implicitly requires calculation of the revenue under the current arrangements, which extends beyond WIES revenue to also include T&D grants, specified grants, etc. A major survey of revenue modelling was underway with RAWG representatives during 2004. Further consideration of this modelling is required to enable a consistent approach between hospitals and greater validity of revenue modelling approaches. Further details of the survey results are discussed below. Hence, the feasibility of this funding option will depend on the further development of this revenue modelling framework state-wide. The regression modelling undertaken to date is very promising, showing that a high proportion of the variance is explained by the variables included. Like the other options explored, the work could be further advanced via a wider incorporation of severity markers from more hospitals and, hence, more AR-DRGs. In general, the R² value is higher where the severity marker variable has been included.

Model 4 involves replacement formulae for a few AR-DRGs that have “deficit” status across several hospitals. It would simply replace the current formulae (WIES plus various grants). This option has some appeal, given that it is easier to develop and implement compared to Model 2. It uses the same set of coefficients and the same data set as Model 2, but is a simple “replacement formula” that does not require the calculation of any net RASGs or the gross RASG. Hence, extensive revenue modelling is not a key requirement in the calculation of the price. When using the data from each AR-DRG, it would be important to identify the sub-group of AR-DRGs that have “deficit” status across a broad range of hospitals. This could involve the following DRGs identified to date via the RAWG data processes: AR-DRG A06Z Tracheostomy, any age, any condition; G02A Major small and large bowel procedures with catastrophic CCs; E62B Respiratory infections/inflammations with severe or moderate CCs; F06A Coronary bypass without invasive cardiac investigative procedures with catastrophic/severe CCs; F10Z Percutaneous coronary angioplasty with AMI; G44C Other colonoscopy, same day; L61Z Admit for renal dialysis; and R63Z Chemotherapy. This model might be relatively easy to implement compared to the other options, and might conceptually be the easiest for the industry to understand and accept.

Model 3: training and development grant

With this formulation, the percentage distribution of the total available funding for the complexity component of the T&D grant would be based on each hospital’s percentage share of the summed expanded RASGs, calculated using data from the RAWG hospitals in receipt of the T&D grant. The R² values for this option ranged from 0.00437 for AR-DRG L61Z (Admit for renal dialysis) up to 0.64173 for AR-DRG 901Z (Extensive OR procedures unrelated). Around 35 (or 50%) of the AR-DRGs had R² values higher than 0.40, which is a very good outcome. As before, negative coefficients on certain severity measures would warrant further consideration. The results are summarised in Table 3.

Table 3 Model 3: T&D grant (Risk Adjustment Working Group (RAWG) hospitals only)

The advantage of this model is that it builds upon a framework already used in Victoria for calculating the complexity component of the T&D grant. A challenge for its implementation, however, is the same as that outlined for Model 2 above: it requires careful calculation of both the gross RASG and the net RASG. The latter requires calculation of the revenue that would be derived from WIES and other sources, such as specified grants, and, hence, is dependent on good revenue modelling that is consistent between hospitals. This is not considered to be a major impediment but will require more work. Overall, it seems that Model 4 would be the easiest to implement and trial in the short term, pending additional severity marker data. Should more comprehensive severity marker data become available from the hospitals, then the use of an additional variable (i.e. disaggregated severity markers) might be considered for Model 4 (and also Model 2). This would simply involve the inclusion of these additional variables in Eq. 2. Models 2 and 3 will require much more work on the revenue modelling side. The validity of Model 1 is an issue, given the small size of the R² values, especially for aggregated severity markers. However, further exploration of the data using disaggregated severity markers may produce a better outcome.

Severity marker code analyses

Given that severity marker codes were provided by only five of the eight RAWG hospitals, the choice of severity marker codes for each high-deficit AR-DRG needs to be extended and validated across all of the teaching hospitals. For example, some specific AR-DRGs (e.g. A06Z) were high deficit across four teaching hospitals, but only one hospital provided severity markers for this AR-DRG. In such instances, there would be under-representation of all of the severity codes for that AR-DRG. Three other hospitals did not provide any severity markers for their high-deficit AR-DRGs; hence, those hospitals would be under-represented in the analyses. Whilst the preliminary regressions run to date do shed some light on the power of severity markers to explain costs, much more work is required to address these issues. The validity and reliability of these markers may be compromised, given the relatively small number of hospitals that have identified severity markers for deficit AR-DRGs and the variability in cost data across hospitals and time. For example, the following list of severity markers was identified by one hospital for AR-DRG A06Z (Tracheostomy any age, any condition) for 2002–2003:

Code      Description
T862      Heart transplant failure and rejection
Y830      Surgical op w transplant of whole organ
S250      Injury of thoracic aorta
S251      Injury innominate or subclavian artery
Z942      Lung transplant status
G8251     Tetraplegia, unspecified, acute
4001201   Third ventriculostomy
3901500   Insertion of external ventricular drain

An analysis of the 2002–2003 inpatient cost data for this one AR-DRG shows the following (a sketch of such a comparison is given after the list):

  • Across all cost-reporting hospitals, the distribution of costs for episodes with one or more severity marker fall within the distribution of costs for episodes without a severity marker

  • Six campuses report episodes with one or more of the above severity markers; for these six campuses combined, there is no significant difference in the average cost between episodes with and without severity markers (P>0.05)

  • Three campuses report the highest average cost for episodes without a severity marker

  • Three campuses report the highest average cost per bed day for episodes without a severity marker.
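A comparison of this kind might be run along the following lines, assuming a patient-level cost extract with a precomputed severity flag; the file and column names are assumptions, and the two-sample t-test simply mirrors the kind of significance result quoted above.

```python
# Sketch of comparing average episode cost with and without severity
# markers within one AR-DRG (here A06Z). Column names are assumed.
import pandas as pd
from scipy import stats

episodes = pd.read_csv("a06z_costs_2002_03.csv")   # hypothetical extract

with_marker = episodes.loc[episodes["severity_any"] == 1, "cost"]
without_marker = episodes.loc[episodes["severity_any"] == 0, "cost"]

print("mean cost, with marker:   ", with_marker.mean())
print("mean cost, without marker:", without_marker.mean())

# Two-sample t-test (unequal variances); P > 0.05 would indicate no
# significant difference, as reported for the six campuses combined.
t_stat, p_value = stats.ttest_ind(with_marker, without_marker, equal_var=False)
print(f"t = {t_stat:.2f}, P = {p_value:.3f}")
```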

This analysis of one AR-DRG could be replicated in the future once severity marker information is received across all hospitals with a deficit in the specified AR-DRG, since the array of severity markers will vary depending on their state-wide referral service. The above analysis only included severity markers identified by the hospital (related to transplantation and trauma) and it then analysed the impact across all hospitals. That hospital’s severity markers are not necessarily those of other teaching hospitals with different state-wide referral services. It might be reasonable to hypothesise that, if the costs associated with any individual severity marker represented the actual costs required to treat the patient condition rather than costs associated with an individual hospital practice, then a severity marker should be high cost for all hospitals. Where this is not the case, reimbursing hospitals for higher than average costs in their hospital alone could, potentially, result in funding inappropriate hospital practice rather than funding severity per se.

Other important issues involve the potential for self-selection bias in the identification of severity markers by various teaching hospitals. This is important because some hospitals may select a relatively high number of severity markers compared to other hospitals. In our exploratory work, not all hospitals identified severity flags and we did not consider how new severity measures might be added in the future. Moreover, the prevalence of these flags might change once hospitals know that they affect funding. All of these issues would need to be addressed, should this approach be implemented. Guidelines could be developed to clearly define codes that could be counted as severity indicators. The variability in cost data across hospitals and time has been emphasised and could be further explored in future.

Limitations and policy concerns

Some of the variables that we included in our regression models raise concerns about incentives and fairness. The inclusion of a high outlier flag and an emergency status variable are two examples.

Under Victoria’s casemix formula, high outliers are designed to be “loss” patients. The assumption of the RASG models is that high outlier status is a reflection of severity rather than inappropriate hospital practice. While this might be a valid assumption, refunding aggregated patient losses for high outliers through an RASG model could provide a perverse incentive for hospitals to retain patients with above-average hospital stays until they exceed the high boundary, thereby becoming high outliers and eligible for the augmented funding. If RASG models were implemented that included additional payments for outliers, it could encourage hospital inefficiency.

Similarly, while emergency patients do have higher costs than non-emergency patients in some DRGs, the Victorian DHS has considered and rejected the application of “emergency” WIES copayments on the basis of their potential to adversely impact on patient care. Without clear definitions of what represents an “emergency,” the reporting and counting of emergencies is problematic, relying largely on clinician judgement. Providing financial incentives for admitting emergency patients has the potential to change the types of patients reported as “emergency,” thereby reducing the hospital’s ability to identify those patients that are most in need of immediate admission. Funding hospitals for “emergency” patients through RASG could be associated with the same risks.

Our analysis is subject to other limitations that we wish to highlight here. In analysing the reasons for the deficit position of hospitals for some AR-DRGs, evidence about the relative efficiency of the hospitals is required, in addition to the results of econometric analyses of risk adjustment variables. This matter was previously explored by Antioch and Walsh [13], who used benchmarking data developed by the Health Round Table (HRT) to demonstrate relative efficiency.

Further, we have estimated our models without including any response by hospitals to any new incentives that would be created. More refined estimates could capture changes in efficiency in response to payment formula changes. Another issue is that there is a wide variation in methods for allocating costs among patients at different hospitals. Allocation methods will affect both the identification of high-deficit AR-DRGs and the estimated coefficients. We have used an incomplete set of severity measures. A more comprehensive approach might be to start with a comprehensive classification system, such as the DCG system of Ash et al. [6], for grouping diverse diagnosis codes. Some preliminary results from this approach are discussed later.

Hospital-level simulations of refined AR-DRG models

In order to better understand the implications of the regression models for the explanation of individual-level spending, we conducted policy simulations of hospital-level costs and predicted payments under a variety of assumptions. For this analysis, we focussed on the 59 problematic AR-DRGs for which this study’s clinicians had identified severity markers. We used a sample of 23 hospitals considered to have the most reliable cost information for 2002–2003. Altogether, the sample contained data on 743,628 separations, with a total cost of $2,058 million. The results of our simulations are presented in Table 4.

Table 4 Results from using regression models to simulate hypothetical risk-adjusted diagnosis-related group (DRG) payments

We started by simulating, as the base case, a very simple hospital payment model: for each patient in a given AR-DRG, every hospital received a constant payment amount just equal to the state average cost for that AR-DRG. By construction, this payment system pays out the same amount as the sum of the total cost for all hospitals combined. For specific hospitals, however, this constant AR-DRG payment will systematically over- and underpay relative to actual hospital costs. To avoid singling out specific hospitals, we grouped together the five hospitals with the largest amount of underpayment (all among the RAWG hospitals) and the five hospitals with the largest amount of overpayment under this simple system. The 13 remaining hospitals were grouped in an intermediate category which we call the “rest of the hospitals.” As shown in Table 4, the underpaid hospitals would collectively experience a loss of $88 million under this stylised system, representing a loss of $392 per case. The overpaid hospitals would experience a profit of $63 million ($320 per case), while the rest of the hospitals would experience a cumulative profit of $25 million.
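The base-case simulation can be sketched as follows, assuming a patient-level extract with hospital, AR-DRG and cost fields; the file and column names are assumptions for illustration.

```python
# Sketch of the base-case simulation: every episode is paid the state-wide
# average cost for its AR-DRG, and hospitals are ranked by the resulting
# over- or underpayment. Column names describe a hypothetical extract.
import pandas as pd

episodes = pd.read_csv("episodes_2002_03.csv")   # hypothetical file

# Constant AR-DRG payment equal to the state average cost; by construction
# this pays out exactly the total cost across all hospitals combined.
episodes["payment"] = episodes.groupby("ar_drg")["cost"].transform("mean")

by_hospital = episodes.groupby("hospital")[["payment", "cost"]].sum()
by_hospital["gain"] = by_hospital["payment"] - by_hospital["cost"]
by_hospital = by_hospital.sort_values("gain")

underpaid = by_hospital.head(5)   # largest losses under the constant payment
overpaid = by_hospital.tail(5)    # largest profits
print(underpaid["gain"].sum(), overpaid["gain"].sum())
```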

We then simulated a modified payment system in which the predictions from our Model 4 regression model (Eq. 2) were used to predict the payments for each case, rather than the constant AR-DRG mean. The results from this simulation are summarised in the second section of Table 4. Altogether, Model 4 increased payment to the five most underpaid hospitals by only $9 million, representing about 10% of the imputed deficit. Payments to the five most overpaid hospitals were reduced by about $5 million, with the rest of the hospitals seeing a reduction of about $4 million.

To see if this modest impact on the budget allocation to underpaid Victorian hospitals would differ if the existing structure of risk adjustment is superimposed on the simulated payment model, we repeated the simulations using a different dependent variable. Rather than using the total cost of each case, we used the total cost minus the existing WIES payment amount, which captures the total payment before teaching and certain other adjustments. The grand sum of this new payment decreases from $2,058 million to $609 million, reflecting that most (about 70%) of the hospital payments to the sampled hospitals in our selected AR-DRGs is captured by the existing WIES payment calculations. As shown in the bottom half of Table 4, the impact of using regression Model 4 rather than a constant amount for each case on top of the existing WIES amount was to increase payments to the most underpaid hospitals by about $9 million. In short, the hospital-case-based formulae developed here reduce the imputed deficits by only about 10%, regardless of whether they were implemented on top of the WIES system or in place of it.

Re-calibrating DCG/HCC using Victorian cost data

One limitation of the approach used here is that only a relatively small subset of all diagnoses was identified as possible risk adjusters. An alternative approach would be to start with a comprehensive classification system, such as the diagnostic cost group/hierarchical condition category (DCG/HCC) system described in Ash et al. [6]. The HCC system uses diagnoses generated during patient encounters to infer medical problems. Diagnostic profiles and patient demographics predict costs. The “condition categories” capture both chronic and serious acute disease manifestations and expected costs, while hierarchies on these conditions promote clinical coherence. When included in a regression framework, each condition category coefficient reflects the increment to expected costs that is associated with that condition [6].

In 2004, preliminary work was undertaken using solely diagnoses as classified by the DxCG risk adjustment software. That framework uniquely classifies every ICD-10 diagnosis into 763 detailed clinical groups, called DxGroups, as well as into 173 more aggregated categories, called hierarchical condition categories (HCCs). Victorian hospital cost data for 2002–2003 were processed using the DxCG 6.1 Global Edition software and two preliminary regressions were estimated. A regression using the full set of 763 DxGroups achieved an adjusted R² of 0.4422, while a second preliminary regression using the 173 HCCs achieved an adjusted R² of 0.3626. Although encouraging, both regressions produced negative coefficients for some covariates (including the intercept), suggesting that nonlinearities and interactions would need to be addressed.

In subsequent work at the DHS by Gillett [9], hospital data were analysed after merging individual cost data across episodes using a unique patient PIN to group multiple separations together, rather than considering each hospitalisation as a separate observation. A concurrent R² of 55% was obtained, a very good outcome. In that preliminary HCC model, 18 of the parameters had negative coefficients, which would need to be explored in further work.

Conclusions

Concerns about the viability of hospitals in the face of highly imperfect diagnosis-related group (DRG) payments have led many countries to explore various reforms to their hospital payment system. This paper has evaluated alternative possible hospital payment reforms using data from Victoria, Australia, with the goal of understanding how different explanatory variables and different payment frameworks affect hospital revenues.

The review of hospital prices and resource allocation by the Victorian Department of Human Services (DHS), the Department of Premier and Cabinet, and the Department of Treasury and Finance identified non-salary cost escalation and variable management performance as key determinants of declining hospital financial performance. Hence, the arguments advanced by the hospital industry about pricing reform should be carefully considered within this broader assessment of the role of variable hospital management performance and non-salary cost escalation as important additional factors impacting on hospital deficits. The Victorian government has already made significant inroads in trying to resolve issues of hospital deficits. In response to financial concerns, the DHS increased hospital base prices by $95 million in 2004–2005. Savings targets and transitional grants were developed for each health service still in deficit following the initial allocation. Transition grants will be in place for 1 year and a maximum of 2 years, and all health services are expected to achieve balanced budgets by the end of 2005–2006. Non-salary costs were indexed at 4.8%, and hospitals were asked to improve efficiencies by at least 0.75% of total operating revenue (DHS, 2004; [14]). The DHS has also made progress in risk-adjusting elements of the Training and Development (T&D) grant, including the recent separation of the training and development payments into complexity and teaching components.

Notwithstanding these initiatives, further refinements will still be needed. The various funding models explored by the Risk Adjustment Working Group (RAWG) in this paper may provide guidance on the most desirable directions to explore. The use of severity markers alone as independent variables (the Model 1 variants) appears to lack sufficient explanatory power to be worthy of further consideration. Models such as the expanded risk-adjusted specified grant (RASG) (Model 2) and the T&D grant (Model 3), both linked to deficit AR-DRGs, show some promise but would require more refinement to be useful. Even when risk-adjustment variables are included, simulations using the adjusters identified in Victoria reduce underpayment to the high-loss hospitals by only about 10%. The preliminary results presented here suggest that the most promising directions involve refinements similar to our Model 4, replacing the existing Australian-refined diagnosis-related group (AR-DRG) formulae with new formulae.

One approach that appears promising would use the diagnostic cost group/hierarchical condition category (DCG/HCC) classification system, involving patient relative risk scores to risk adjust the AR-DRGs, and better control for within-DRG severity. An alternative possibility would be to reimburse hospitals for the expected cost of individuals for a period of time (such as a year), rather than pay for an inpatient episode as the unit of payment. This might be appropriate for patients requiring chronic care.