Introduction

US federal policy has emphasized the use of performance pay contracts to transmit quality and efficiency incentives to health care providers. However, there is considerable debate over the design and effectiveness of performance pay programs (Christianson et al., 2008; Jha et al., 2012; Mullen et al., 2010; Rosenthal et al., 2005). A key source of contention has been the use of patient outcomes as performance metrics and their association with unintended adverse effects due to gaming responses (Chen & Grabowski, 2017; Ibrahim et al., 2018; Wadhera et al., 2018). Yet making pay contingent on outcomes can also be more efficient than using process (input) metrics if providers respond by pursuing more effective processes than would have been mandated otherwise. Policymakers need to factor in likely provider responses so they can optimally design performance pay contracts, but empirical evidence from health care remains limited. This paper attempts to fill this gap by documenting a novel response mechanism to outcomes-based penalties: overcoming market exchange frictions and shifting production to lower cost downstream providers. Our setting is the Hospital Readmissions Reduction Program (HRRP), one of the largest performance pay programs in US health care.Footnote 1

HRRP authorized the federal government to penalize hospitals if their re-hospitalization rate for Medicare patients admitted with select targeted conditions exceeds a specified benchmark. The Centers for Medicare & Medicaid Services (CMS) has imposed penalties of about $500 million on hospitals each year under HRRP. Prior studies vary in their assessments of the magnitude of its effect on the mean readmission rate, ranging from nearly zero to a one percentage point decrease, but the consensus is that it has decreased readmission rates for targeted patients (Carey & Lin, 2016; Desai et al., 2016; Mellor et al., 2017; Chen & Grabowski, 2017; Gupta, 2021; MedPAC 2018; Ziedan, 2019). While evidence on gaming responses such as up-coding (Ibrahim et al., 2018; Ody et al., 2019), patient selection and greater use of observation status (Gupta, 2021) has been forthcoming, evidence on quality improving mechanisms has remained speculative.

Hospitals could respond to the readmission penalty by sending more of their patients into post-acute care (PAC) as well as by leveraging their relationships with PAC partners to obtain more intense care for their patients, thus shifting some of the incremental care to a lower cost setting. Recent studies have shown the potential for PAC to improve hospital readmission rates (David & Kim, 2018; Rahman et al., 2017). However, using Medicare claims data we find no economically or statistically significant changes in the proportion of hospital patients sent to any type of PAC, consistent with findings of recent studies (Gupta, 2021; Kim et al., 2019; Ziedan, 2019). Hence, we focus our attention on the intensive margin.

Among different types of PAC, we focus on home health care for two reasons. First, in the HRRP context it is the only commonly used low-cost PAC. Forty percent of the patients targeted by HRRP and sent to PAC receive home health care, which costs only about 20% of the price of a stay at a skilled nursing facility or at a hospital. Second, it is an important service in its own right: Medicare spending on home health care doubled between 2001 and 2014 (MedPAC, 2016). With the growth of payment bundles and the emphasis on cost containment, it is expected to continue to grow rapidly (Sood et al., 2011).

Freestanding Home Health Agencies (HHAs) compete for hospital referrals and adjust their protocols and intensity of care to preserve or enhance their business with referring hospitals. With the introduction of the readmission penalty, hospitals may have identified greater coordination with and effort by their HHA partners as a mechanism to decrease readmissions for their patients. Hence, market competition for hospital referrals offers a pathway for the hospital penalty incentive to pass through to HHAs. To empirically test this hypothesis, we construct a measure of the salience of a hospital’s penalty for an HHA and use this as the primary source of cross-hospital variation in incentive for an HHA to choose resource intensity for patients. We assume penalty salience increases in proportion with two factors: the share of the HHA’s patient volume contributed by the hospital and the penalty rate determined under HRRP. To mitigate the potential for bias due to unobserved hospital quality, we use patients discharged with a condition not targeted by HRRP as a within-hospital comparison group. We combine these cross-hospital and within-hospital sources of variation in a difference-in-differences (D-D) research design where the first difference is between patients from targeted and non-targeted conditions and the second difference is between patients from hospitals at different levels of penalty salience for the HHA.
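To make the salience measure concrete, the following minimal Python sketch computes it as the product of the two factors named above. The function name, the simple multiplicative form, and the example numbers are illustrative assumptions and may differ from the exact construction used in the empirical analysis.

```python
def penalty_salience(hospital_patients, hha_total_patients, penalty_rate):
    """Hypothetical salience of one hospital's penalty for one HHA office.

    Assumes salience is the product of (a) the share of the HHA's patient
    volume contributed by the hospital and (b) the hospital's HRRP penalty
    rate. The exact functional form is an assumption for illustration.
    """
    if hha_total_patients <= 0:
        raise ValueError("HHA must have a positive patient volume")
    share = hospital_patients / hha_total_patients
    return share * penalty_rate


# Made-up numbers: two hospitals with the same penalty rate refer patients
# to the same office, but one contributes a much larger share of its volume.
office_volume = 200
salience_a = penalty_salience(80, office_volume, 0.05)  # large referrer
salience_b = penalty_salience(10, office_volume, 0.05)  # small referrer
```

Under these assumptions the large referrer's penalty is eight times as salient to the office as the small referrer's, even though both hospitals face the same penalty rate.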

We implement the above approach using novel operational and claims data on over 12,000 patients from a large Home Health firm with 70 offices in 16 states. For each patient treated by the firm between 2012 and 2015, we observe rich episode level and visit level information. This dataset provides the usual details available in administrative claims data, but also includes rarely available operational details (such as time spent with the patient, and caregiver specialty) and an accounting-based cost estimate for each patient episode used by the firm in its internal business analytics. This allows us to not only quantify patient specific cost of care (‘effort’), but also distinguish between alternative dimensions of effort, for example, increasing hours of care versus deploying more specialized caregivers who bill higher hourly rates.

Despite the advantages of this novel data, a key limitation is that we do not observe patients prior to the implementation of the HRRP penalty. This implies we cannot exploit the pre/post difference in outcomes in a traditional D–D framework as a source of identifying variation. Accordingly, the equivalent of the common trends assumption in our D–D design is that in the absence of the penalty, differences in outcomes between patients in targeted and non-targeted conditions would be identical across patients from hospitals at different levels of penalty salience. We assess the credibility of this assumption and provide supporting evidence using multiple falsification tests.

Our results confirm that the firm incurred greater episode costs for heart attack and heart failure patients relative to non-targeted patients of similar risk at hospitals with higher levels of penalty salience. The implied increase in cost is economically meaningful—a heart failure patient from a hospital with 1 standard deviation greater penalty salience received care that cost about $60 more per episode (4.7% of the mean). While modest in absolute terms, we note that the firm spent only about $150 more at baseline for the sickest quartile of patients relative to the least sick quartile. Relative to this benchmark, it incurred 40% greater cost for targeted patients from a hospital with 1 s. d. greater penalty salience. However, we find no changes for pneumonia patients, implying heterogeneity in the pass-through of the penalty incentive within the targeted conditions.
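The two magnitudes quoted above can be reproduced with simple arithmetic. The short Python sketch below uses the rounded figures from the text, together with the mean episode cost of about $1270 reported in the Data section; the variable names are ours, and the inputs are approximations rather than exact estimates.

```python
# Rounded figures taken from the text; all values are approximate.
extra_cost = 60.0              # added cost per episode, 1 s.d. more salience
mean_episode_cost = 1270.0     # mean episode cost for the firm
sickest_vs_least_gap = 150.0   # baseline cost gap, sickest vs least sick quartile

share_of_mean = extra_cost / mean_episode_cost    # about 0.047
share_of_gap = extra_cost / sickest_vs_least_gap  # 0.40
```

The first ratio corresponds to the 4.7% of the mean, and the second to the 40% benchmark relative to the sickness gradient.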

We identify three channels through which the firm increased resource use. First, there is consistent evidence of greater effort in the first week of the episode. We interpret this as frontloading of effort, as recommended in the clinical literature (Rogers & Madigan, 2007; O’Connor et al., 2014; Murtaugh et al., 2017). Second, we show that the increase in cost is driven primarily by using more experienced caregivers and spending more time with targeted patients. This dimension of effort is typically unobserved in conventional claims data and may be an important reason why previous studies have not detected the role of PACs in responding to HRRP. Third, in the case of heart failure, the firm focused greater resource use on sicker patients. This is consistent with the HHA optimally directing greater resources to patients posing greater penalty risk to hospitals.

We present evidence from three different falsification tests to support the validity of our identification assumption. First, we show that the effects on episode cost disappear if we eliminate variation in penalty salience across hospitals, and instead assign the mean penalty rate across local area hospitals. Hence, the results are not driven by spurious responses to market-level penalty. Second, we use Medicare Advantage (MA) patients to implement a placebo test. MA patients are excluded from HRRP penalty calculations. Absent spillovers, hospitals and PACs should not invest additional resources for these patients due to HRRP. Reassuringly, we do not find differentially greater episode cost or lower readmissions for MA patients. In fact, the coefficients typically suggest a reduction in resource intensity for MA patients, consistent with a multi-tasking effect. Third, we perform a randomization test, estimating the episode cost model 1000 times. In each iteration we randomly assign a subset of patients (from targeted or non-targeted conditions) as the treated group and use the remaining patients as the control group. Thus, we obtain an empirical CDF of estimated effects on episode cost to compare against the coefficient obtained in the main results. We find that the true coefficient is at the 94th percentile of this distribution and can therefore reject the null hypothesis at conventional levels of significance.
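The logic of the randomization test can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a difference in mean cost stands in for the coefficient from the episode cost model, the placebo assignment keeps the number of treated patients fixed, and the function names are hypothetical.

```python
import random


def mean_diff(costs, treated):
    """Difference in mean cost between treated and control patients."""
    t = [c for c, flag in zip(costs, treated) if flag]
    k = [c for c, flag in zip(costs, treated) if not flag]
    return sum(t) / len(t) - sum(k) / len(k)


def randomization_percentile(costs, treated, n_iter=1000, seed=0):
    """Share of placebo estimates that fall below the actual estimate."""
    rng = random.Random(seed)
    actual = mean_diff(costs, treated)
    labels = list(treated)
    below = 0
    for _ in range(n_iter):
        rng.shuffle(labels)  # randomly reassign treated status
        if mean_diff(costs, labels) < actual:
            below += 1
    return below / n_iter
```

A returned value of 0.94 would correspond to the actual estimate sitting at the 94th percentile of the placebo distribution.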

This paper makes two contributions. First, we hypothesize and empirically test a novel mechanism through which providers responded to HRRP, one of the largest performance-pay programs in US health care. This program has received considerable attention, but prior studies have typically examined outcomes (readmissions and mortality) without offering insights into causal mechanisms (other than selection and coding). Overall, our results suggest that HRRP may have succeeded in its goal to improve coordination between hospitals and community-based providers by aligning incentives along the care continuum. We demonstrate that performance penalties imposed on hospitals passed through to downstream firms for heart disease patients, reallocating tasks across firm boundaries. This adds to the recent evidence on vertical integration in health care (David et al., 2013; Konetzka et al., 2018; Lehrman & Shore, 1998) and indicates that performance pay incentives may ease market frictions without changes in firm ownership. Second, these results inform policymakers on performance pay design in health care. They confirm that using outcomes as performance metrics motivates providers to deploy low-cost solutions, as suggested by economic theory (Lazear, 1986; Shleifer, 1985). It is implausible that policymakers could have identified and stipulated these specific aspects of coordination as input metrics.

The remainder of the paper is organized as follows. The “Background” section provides the necessary institutional background. The “Data” section describes our data sources and key variables. The “Empirical strategy” section discusses our conceptual framework and presents the empirical strategy. We present our main results, including robustness and falsification checks, in the “Results” section, and conclude in the “Discussion and Conclusions” section.

Background

Performance pay in health care

Through the Affordable Care Act (ACA), the federal government committed to extensive use of performance pay in compensating hospitals for services to Medicare patients. With the passage of the Medicare Access and CHIP Reauthorization Act (MACRA) in 2015, performance pay incentives were extended to physician payments as well. Hence, performance pay is now a cornerstone of federal payment policies. However, there is still considerable debate over whether provider performance should be measured through process metrics or patient outcomes.

This debate is not limited to health care and has a long tradition in the large theoretical literature on the use of performance pay incentives by firms (Prendergast, 1999). Economic theory predicts that linking pay to outcomes is more efficient when they are more easily measured than effort, and agents are better informed about the production function and their ability than the employer (Shleifer, 1985; Lazear, 1986). However, economic theory also predicts that the use of outcomes may lead to gaming (Baker, 1992) or distract agents from tasks that society values but does not reward explicitly, known as the multi-tasking effect (Holmstrom & Milgrom, 1991). The possibility of unintended adverse effects on patient health has made the use of outcomes controversial in health care (Ibrahim et al., 2018; Wadhera et al., 2018).

Nevertheless, both approaches are widely prevalent and often overlap. For example, CMS penalizes hospitals under HRRP if their heart attack patients have high readmission rates (an outcome measure). But it simultaneously rewards hospitals under the Hospital Value Based Purchasing (HVBP) program if they provide heart attack patients with aspirin upon discharge (a process measure), regardless of readmission rates. Private insurers have taken the cue from Medicare and introduced performance pay programs of their own, using a mix of process and outcome measures (Chernew et al., 2011).

The hospital readmissions reduction program (HRRP)

HRRP authorized CMS to penalize hospitals whose 30-day re-hospitalization rates for select Traditional Medicare (TM) patients are greater than the national average. Hence the program explicitly discourages readmissions, treating them as adverse outcomes. CMS enjoys tremendous discretion in administering the exact rules of the program: the targeted conditions, the penalty formula, and timing. Hospitals were officially notified of these details in August 2011. Starting in fall of 2011, hospitals could predict the consequences of HRRP on their finances, and accordingly take actions to decrease their readmission rates. In the first three program years, hospitals were penalized for re-hospitalization following an index admission for heart attack, heart failure or pneumonia.Footnote 2 All unplanned readmissions, including those occurring at a different hospital, count against the hospital where the index treatment was performed.

The key source of variation in penalty applied in any year is the extent to which a hospital’s risk-adjusted readmission rate deviates from the national average, all computed over a three-year baseline period. While a hospital does not receive a bonus payment if it does better (i.e., lower) than the threshold, it is penalized if its baseline rate exceeds the threshold. This rate is normalized against the national average and adjusted by a multiplier based on the hospital’s total Medicare revenue, which is typically close to one.Footnote 3 Penalty values are computed separately for each targeted condition and a weighted sum is calculated at the hospital level. This ensures that hospitals have an incentive to improve on each condition, since excelling in one does not compensate for poor performance on other conditions. This aspect of the program motivates our empirical approach of using the variation in condition-specific penalty.

The condition-specific penalty dollar amount is directly proportional to the revenue billed by the hospital for the targeted condition (“base”), i.e., all else equal, hospitals serving more heart attack patients will incur a greater penalty for readmissions following heart attacks. To isolate variation due to baseline performance in readmission, we normalize the penalty amount by this base revenue, and refer to the ratio as the penalty per dollar of base revenue, or penalty rate in short. Figure 1 illustrates variation in this penalty rate across hospitals for pneumonia in the first year of the program. The dots represent penalty rates for individual hospitals, while the solid line plots a hypothetical linear increase. The plot confirms that baseline variation in readmissions performance is the major determinant of HRRP penalty variation.Footnote 4 The patterns are similar across the three targeted conditions. We use this as our preferred measure of penalty variation across hospitals in the empirical analysis.
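The normalization described above can be illustrated with a minimal Python sketch. The function name and the dollar amounts are made up for illustration; the sketch only shows why dividing by base revenue removes the mechanical effect of patient volume, and it does not reproduce the CMS penalty formula.

```python
def penalty_rate(penalty_dollars, base_revenue):
    """Penalty per dollar of base revenue for one targeted condition."""
    if base_revenue <= 0:
        raise ValueError("base revenue must be positive")
    return penalty_dollars / base_revenue


# Made-up example: two hospitals with the same readmission performance but
# a tenfold difference in volume face different dollar penalties and the
# same penalty rate.
small = penalty_rate(penalty_dollars=7_000, base_revenue=100_000)
large = penalty_rate(penalty_dollars=70_000, base_revenue=1_000_000)
```

Both hypothetical hospitals have a penalty rate of about 0.07, so remaining variation in the rate across hospitals reflects baseline readmission performance rather than size.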

Fig. 1

HRRP penalty incentive and mean readmission rate. Note This figure provides necessary background about the HRRP program penalty incentive. It illustrates the penalty incentive introduced by HRRP using data from pneumonia patients. The figure plots our measure of the penalty rate in the first year of the program on the Y-axis against the risk-adjusted readmission rate over the period 2009–2011 on the X axis. Each circle denotes an acute-care hospital. There are about 3000 hospitals represented in the figure. We compute the penalty rate as the ratio of the penalty value assessed due to pneumonia in dollars divided by the mean annual fee-for-service Medicare revenue for the hospital from pneumonia during the baseline period (2009–2011). Hence, it is the penalty per dollar of reimbursement, and we interpret it as the penalty rate. Following CMS, the risk-adjusted readmission rate is normalized by the mean, and this is known as the risk-standardized readmission rate (RSRR). The solid red line provides a schematic of the relationship between the RSRR and the penalty rate. The actual penalty formula depends on other factors and hence the circles deviate slightly from the red line. The incentive patterns are similar for other targeted conditions.

Table 1 Medicare sample descriptive statistics

In any given year, hospitals are penalized based on baseline performance, but the penalty is applied two years later. For example, the penalty was first applied to 2012–2013 revenue based on baseline performance over July 2008–June 2011. In every successive year, hospital penalty status is updated based on a correspondingly updated three-year performance window (e.g., July 2009–June 2012 is the performance window for the next program year). This program design has an important implication for our empirical analysis. Penalties in the first year were based on hospital performance during a period that ended prior to when the formula and other details were revealed. Hence, hospitals could not influence their first-year penalty status by changing patient mix, clinical condition mix, care protocols or coordination with other providers after August 2011. Penalty status in future years, however, could itself be influenced by hospital responses to HRRP, i.e., a reverse causality concern. Hence, in our empirical work we hold hospital penalty status constant as of the first year of the program. In fact, we go a step further and compute the applicable penalty rates based on hospitals’ readmissions over July 2007–June 2010 instead. This window is far removed from the outcomes we study, namely post-acute care use during 2012–2015.

Home health agencies (HHAs)

PAC providers care for patients and assist them with daily activities after receiving treatment in an acute care setting, such as inpatient care. PAC occurs in a variety of settings depending on the intensity of care required–in skilled nursing facilities, long-term acute care, and at home through home health or hospice care. The importance and role of PAC has grown over time as the share of hospital patients discharged to PAC has increased.Footnote 5 In fact, for Medicare patients discharged with heart attack or heart failure, PAC has been the fastest growing category of care (Chandra et al., 2013).

Home health care involves several general care and recovery tasks which could plausibly occur in the hospital. However, these tasks are much costlier to produce in the hospital setting, primarily due to the capital cost of hospital infrastructure and the alternative costs associated with lower throughput (Candrilli and Mauskopf, 2009; Metlife, 2011). HHAs also enable regular monitoring of patient health, offering the potential to prevent medical emergencies, at least in theory. Home health care agencies and industry associations emphasize that a key component of their value proposition for Medicare is their ability to avoid costly hospital readmissions (AHHQI, 2012). Since HRRP emphasized readmission reductions—an outcome that HHAs also target—it is a very salient program for home health agencies.

Freestanding home health care firms compete for hospital patient referrals (Study, 2014). While patients make the final choice, their decision is influenced by hospital physicians and discharge planners, who inform them about the different options, their strengths, and weaknesses. These decisions are subject to informal negotiations over the quality, intensity of care, and responsiveness that a home health firm can commit to the hospital and patient. Competitive pressure may induce HHAs to offer additional services or intensity of treatment to win patient referrals. In the aftermath of the readmission penalty, this could include additional effort to help the hospital reduce readmissions (Worth, 2014).

Home health care is relatively inexpensive. Table 1 Panel B shows that Medicare paid about $2,600 for an episode following a hospital stay for one of the HRRP conditions. In contrast, the average reimbursement for a nursing home stay was about $10,000—comparable to the cost of the hospital stay. HHAs determine the care resources necessary for patients when the episode begins. Home health care involves home visits and care by both skilled (nurses, physical therapists) and semi-skilled employees (assistants, aides). In case of traditional Medicare patients, the HHA receives a prospective payment for an episode that can be up to 60 days long. Like hospital prospective payments, the amount varies based on the provider’s assessment of intensity of care required for the patient and local market costs.Footnote 6 Hence, the HHA is not reimbursed on the margin for additional care within the same episode for the typical patient.

Data

Data sources

Traditional Medicare claims data

We use a 100% sample of Medicare inpatient claims data to compute hospital penalty under HRRP per dollar of base revenue, as described in Section II.B above. We compute hypothetical condition-specific penalties applicable based on the performance period of July 2007–June 2010, following the formula used by CMS.Footnote 7 The mean unconditional penalty per dollar of revenue for hospitals across target conditions is about 7%. This implies that hospitals would expect to be penalized about 7% of the annual revenue they billed Medicare for the targeted conditions over the baseline period of 2007–2010.

We also use a 20% random sample of Medicare claims to examine national patterns in use of PAC for hospital patients targeted by HRRP. This data spans July 2008 through June 2015 and therefore allows observing patterns before and after introduction of the penalty. Table 1 describes this sample. There were about 1.3 million index hospital stays for the targeted conditions, with heart failure contributing the largest share (~45%). About 40% of these stays were followed by a PAC episode within 7 days of discharge from the hospital, with skilled nursing (18%) and home health (16%) the two largest types of care. Figure 2 presents time series patterns on the probability of being sent to some type of PAC within 7 days of discharge from a hospital stay for one of the targeted conditions (Panel a), and the mean number of PAC providers used by hospitals for each of the categories (Panel b). The data suggests that the patterns remained relatively stable over this period, with no noticeable shifts immediately after 2011. We formally test for changes in PAC use in Section V.E. (a).

Fig. 2

Use of PAC in Traditional Medicare. Note This figure presents time series trends on the extensive margin use of post-acute care Panel a and the mean number of PAC providers used, by type Panel b, using claims data from a 20% random sample of Traditional Medicare beneficiaries. Panel a presents the prevalence of four different types of PAC starting within seven days of discharge from hospital stays for the targeted conditions. The vertical dotted line indicates the formal announcement of HRRP details in August 2011. Panel b excludes hospitals that used only one PAC throughout the period. In both panels, years are defined as July–June periods rather than calendar years to be consistent with CMS computations under HRRP.

Proprietary data from large HHA

The primary data source is a novel patient-level operational and claims dataset from a large for-profit, independent home health care firm that operates 70 offices in 16 states.Footnote 8 The firm allows each office to operate rather autonomously, with control over human resources management, care protocols and scheduling. Each office is also held accountable as an independent profit center. Offices logically serve as the unit of our analyses. This feature of autonomous operations within the firm also helps partially alleviate concerns about the generalizability of our results to other HHAs.

We have data on all Traditional Medicare and Medicare Advantage patients treated by the firm over calendar years 2012–2015. Patients in the sample were discharged following an inpatient hospitalization at one of 297 hospitals. The data contains detailed information for each care visit—length of the visit, date and time, specific provider who performed the care (and her service type), and estimated cost of the visit, including allocated personnel time. This information uniquely allows us to measure treatment intensity and patterns, such as time spent with the patient and personnel cost. These details remain unobserved in typical claims data. The files also contain rich patient health status assessment at the start of the episode, which allow us to control granularly for differences in patient status when they were discharged from the hospital.

We define years based on July–June periods, to be consistent with the annual measurement cycle used by CMS for HRRP. The analysis sample is limited to patients who received home health care during July 2012–June 2015 and were discharged from inpatient care.Footnote 9

Table 2 presents descriptive statistics on the analysis sample separately for patients with targeted and non-targeted conditions. The total sample contains approximately 9000 Traditional Medicare episodes, equivalent to 130 episodes per office and 30 per hospital on average. Targeted conditions contribute about 2200 patients of which heart failure is the largest (1250). The control group is highly diffused across many conditions, each making a small contribution to the sample. The most frequent diagnosis in the control group is Osteoarthrosis with about 1,000 episodes or 14% of the control group.Footnote 10

Table 2 Proprietary data descriptive statistics

We use data from 2012 to establish baseline values for key variables as well as to produce summary statistics on distributions of outcomes like resource use and readmissions. We assign severity scores to patients as the sum of observable risk indicators coded by the firm at the start of the episode.Footnote 11 We use the severity scores to assign patients to top or bottom quartile of sickness. Online Appendix Table A.1 Panel A presents descriptive statistics on demographics and risk factors for patients in the two groups. The sickest patients are more likely to be female and non-white and diverge dramatically on risk factors. For example, eighty percent of the sickest patients have a history of 2+ hospital stays, while less than 10% of the least sick patients do.Footnote 12
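The severity score and quartile assignment can be sketched as follows. This is a minimal Python illustration, assuming binary risk indicators and simple order-statistic cutoffs; the function names, the cutoff rule, and the handling of ties are our assumptions, not the firm's or the paper's exact procedure.

```python
def severity_score(risk_indicators):
    """Sum of 0/1 risk indicators recorded at the start of the episode."""
    return sum(1 for flag in risk_indicators if flag)


def quartile_labels(scores):
    """Label each score 'bottom', 'middle', or 'top' by quartile cutoffs."""
    ordered = sorted(scores)
    n = len(ordered)
    low_cut = ordered[(n - 1) // 4]         # approximate 25th percentile
    high_cut = ordered[(3 * (n - 1)) // 4]  # approximate 75th percentile
    labels = []
    for s in scores:
        if s <= low_cut:
            labels.append("bottom")   # least sick patients
        elif s > high_cut:
            labels.append("top")      # sickest patients
        else:
            labels.append("middle")
    return labels
```

With many tied scores the groups will not contain exactly one quarter of patients each, so any real implementation would need an explicit rule for ties.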

Online Appendix Table A.2 helps assess the representativeness of the proprietary sample. It presents data comparing inpatient stays and PAC use over 2009–2011 on the universe of hospitals (Col. 1), the 16 states where the firm operates (Col. 2), and for the approximately 300 hospitals appearing in the proprietary data (Col. 3). Hospitals dealing with the HHA were more likely to use PAC and home health care at baseline but had similar 30-day unadjusted readmission rates. Online Appendix Table A.2 Panel B presents descriptive statistics on demographics for home health care patients. The patient demographics at hospitals represented in the sample are very similar to the national averages, which is reassuring. Comparing these means to the corresponding demographics of patients in the proprietary sample in Table 2 panel C column 1, we find that patients at the firm have similar age and gender mix as in the population but are more likely to be non-white.

Outcome variables

Episode cost and other measures

Table 2 panel B presents descriptive statistics on resources deployed by the firm. Our preferred measure is an index of aggregate spending for each home health episode using detailed cost accounting data from the firm. The average episode costs the firm about $1270 and costs are similar across targeted and non-targeted patients. Total episode cost is composed of two main segments. The first segment, called visit cost, reflects the specialty mix of caregivers and materials used and accounts for about 60% of total cost. For example, for two otherwise identical episodes, if one received care entirely by nurses while the other received care by a mix of nurses and aides, then the former will have greater visit cost. The second segment is personnel cost, i.e., salary and overheads of the specific individual making the visit, and accounts for about 40% of total cost. This segment enables differentiation in cost between two caregivers of the same specialty. For example, a more experienced and qualified nurse will have greater personnel cost than a less experienced one. Finally, total cost also includes incidental expenses such as travel reimbursements, but these account for only 3% of total cost.

Total episode cost is our preferred measure since it incorporates multiple margins of effort (time spent with the patient, expertise and experience of the caregiver, and materials used) and translates them into dollars. It captures more information than using number of visits or total time spent with the patient, which we also observe.Footnote 13 Note that in the Medicare claims data we only observe total charges and number of visits.

Probability of readmission

We use readmission to a hospital within 30 days of discharge from the hospital as our key health outcome. In addition to being an important marker of quality, it is common among Medicare patients, and is the metric targeted by HRRP. The mean probability of readmission within 30 days for patients with targeted conditions in our sample is 16%.Footnote 14 Online Appendix Table A.1 presents baseline readmission rates for the healthiest and sickest patient quartiles. The sickest patients are more than twice as likely to be re-hospitalized within 30 days, at 20 percentage points (pp) versus 9 pp for those in the bottom quartile.

Empirical strategy

Conceptual framework

This section discusses the home health agency’s objective and why it would invest additional effort to reduce hospital readmissions in response to the penalty. Nearly 90% of agencies in the US are for-profit, hence it is natural to assume the average firm maximizes total profit (MedPAC, 2016). This is also true of the specific firm studied in this paper. According to MedPAC, home health firms make relatively high operating margins on Medicare patients–about 11% in 2014. This estimate accounts for fixed costs, implying the marginal profit on a Medicare patient is higher still. Hence, agencies are incentivized to serve as many Medicare patients as their capacity allows them to. Hospitals are the dominant source of patient referrals for agencies (HHCN, 2019). However, freestanding firms must compete for hospital referrals. Figure 2 shows that hospital patients are discharged to 3.5 different home health firms on average.

Hospitals have no contractual arrangement with home health firms, nor can the latter pay hospitals to obtain patient referrals. However, hospitals still have substantial leverage over home health firms since they direct Medicare patient volume to the firms. Hospital discharge planners and physicians are key influencers in helping patients choose home health agencies. This leads to a familiar situation in health care where hospital personnel serve as de facto agents for patients. Hence, agencies invest in building relationships with hospital administrators, physicians, and particularly discharge planners. They differentiate themselves on dimensions like responsiveness, specializing in certain disease types, and offering additional services, sometimes even outside of what Medicare reimburses (Study, 2014). Home health firms have to pitch their services to hospitals and tailor their services to suit the preferences of both hospitals and patients, rather than of patients alone. Since Medicare patients are financially lucrative, firms have a cushion to trade off higher cost per patient against greater patient volume and total profits.

In theory, home health care can help patients avoid hospital readmissions, though the empirical evidence is mixed (Linertova et al., 2011; Murtaugh et al., 2017; David & Kim, 2018). A review of the trade press reveals that home health firms and their industry groups emphasize this potential benefit as a key point in their marketing communications (Lotven, 2012; Mullaney, 2015). Both this background and discussions with executives at our data partner suggest that HRRP was a salient policy change for home health firms. Driven by demand from hospitals worried about the readmissions penalty and firms that strategically offered additional services to differentiate themselves and strengthen their market share, we hypothesize that the penalty incentive passed through from hospitals to home health firms that served their patients.

In addition to the per-episode cost of care, the home health firms also incur some fixed costs of business development to source referrals from hospitals and fixed costs of operations to maintain service capacity. Due to the presence of these fixed costs, hospitals contributing greater patient revenue become more profitable for the firm. We hypothesize these relationships accordingly become more valuable for the office manager and their demands more salient. All else equal, the agency should be willing to invest more resources in patients from hospitals that contribute a greater share of patients.Footnote 15

Penalty salience

This section develops a measure to empirically operationalize pass through of the penalty incentive from the hospital to the home health firm. Imagine that heart failure patients from three hospitals, A, B and C, contributing 40%, 40% and 20% of the patient volume at a HHA office are referred simultaneously to that office. Further, assume the three hospitals are penalized at 0.5%, 1% and 0.5%, respectively, on heart failure under HRRP. We hypothesize that the readmission penalty is more salient for the HHA office for patients arriving from hospital A relative to those from C even though both are penalized at 0.5%, since A contributes a greater share of patients to the office. We also posit that the penalty is more salient for patients discharged from B relative to those from A since B has a greater penalty rate and hence is more likely to assert this issue in discussions with the HHA.

The above discussion pertains to patients discharged from hospitals with one of the targeted conditions. Patients not subject to the penalty, i.e., those discharged with non-targeted conditions or those covered by other insurers, should not receive additional care from the HHA since effort is costly and not reimbursed on the margin.

We develop an empirical measure of condition (\(c\)) specific penalty salience, \(P_{hjc}\) for each HHA office \(j\) that varies across referring hospitals \(h\). Following the discussion above, we construct this variable as the product of the HRRP penalty per dollar of revenue, \({{penalty}}_{hc}\) and the share of patients at HHA office \(j\) contributed by hospital \(h\), \(share_{jh0}.\)

$$\begin{array}{*{20}c} {P_{hjc} = {{penalty}}_{hc} * {{share}}_{jh0} } \\ \end{array}$$
(1a)
$$\begin{array}{*{20}c} {{{share}}_{jh0} = \frac{{n_{hj0} }}{{n_{j0} }}} \\ \end{array}$$
(1b)

where \(n_{hj0}\) is the total number of patients across all conditions referred by hospital \(h\) to office \(j\) during the period January–June 2012, and \(n_{j0}\) is the corresponding total number of patients across all conditions served by office \(j\) during that period. We use the total patient volume (including MA patientsFootnote 16) to be consistent with our conceptual framework that the importance of the hospital depends on its total contribution to the office.
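As an illustration, the three-hospital example above can be worked through Eqs. (1a) and (1b) directly. The sketch below is ours and is not taken from the paper's replication code: the patient counts are hypothetical values chosen to reproduce the 40%, 40%, and 20% shares, and expressing the penalty in percent is an assumption, so the scale of the output need not match the summary statistics reported for the actual measure.

```python
# Hypothetical baseline counts for one HHA office (January-June 2012), chosen
# to reproduce the 40% / 40% / 20% shares in the three-hospital example.
baseline_patients = {"A": 40, "B": 40, "C": 20}   # n_hj0, all conditions
penalty_pct = {"A": 0.5, "B": 1.0, "C": 0.5}      # heart failure penalty, in percent (assumed unit)

def penalty_salience(patients, penalty):
    """Return P_hjc = penalty_hc * share_jh0 for each referring hospital h."""
    office_total = sum(patients.values())          # n_j0
    return {h: penalty[h] * patients[h] / office_total for h in patients}

salience = penalty_salience(baseline_patients, penalty_pct)
for hospital, value in sorted(salience.items()):
    print(hospital, round(value, 3))
```

Under these assumed inputs the ordering is B, then A, then C, which matches the two hypotheses stated in the example: a larger share raises salience at a given penalty rate, and a larger penalty rate raises salience at a given share.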

We hold penalty salience constant over time even though both penalty rate and hospital shares change annually, because all such updates are potentially contingent on the initial penalty and hence endogenous to it. The mean (unconditional) penalty salience across targeted patients is 0.02 with a standard deviation of 0.04. Consistent with the discussion above, we set \({{penalty}}_{hc} = 0\) for all patients with non-targeted conditions.

Research design

We develop a difference-in-difference research design that exploits the natural experiment created by HRRP’s readmission penalty and is feasible given our data. The first difference is taken across hospitals at different levels of penalty salience. This is a pre-determined, policy driven variation in the penalty perceived by the same office for patients discharged from different hospitals. A key limitation is that we do not have data from the period prior to the HRRP penalty. Hence, we cannot implement a traditional research design with pre/post variation providing the second difference. We overcome this limitation by using patients in non-targeted conditions as a within-hospital comparison group. Hence, the second difference is between patients discharged from the same hospital with targeted and non-targeted conditions. This approach therefore allows home health effort and readmissions to differ between patients from the same hospital belonging to the treated and comparison cohorts, but tests if these differences are correlated with penalty salience.

The key identifying assumption that enables interpreting the DD coefficient as causal is that in absence of the penalty, differences in outcomes between targeted and non-targeted patients would be similar across hospitals at different levels of penalty salience. This is equivalent to the common trends assumption in a conventional difference-in-difference design which would have assumed that treated patients from hospitals at different levels of penalty salience would have progressed on parallel trends in absence of the penalty.
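The logic of the two differences can be seen in a stylized two-hospital case. The sketch below uses made-up cell means for a single office (none of the numbers come from our data) and is only meant to show what is being compared. The actual estimates come from the regression model in the next subsection.

```python
# Hypothetical mean log episode cost within one office, by referring hospital
# and patient group. None of these numbers come from the paper.
cells = {
    # hospital: (penalty salience, targeted mean, non-targeted mean)
    "low":  (0.00, 7.10, 7.00),
    "high": (0.04, 7.25, 7.12),
}

def within_hospital_gap(targeted, non_targeted):
    """Second difference: targeted minus non-targeted, same hospital."""
    return targeted - non_targeted

gap_low = within_hospital_gap(*cells["low"][1:])
gap_high = within_hospital_gap(*cells["high"][1:])

# First difference: compare the gaps across hospitals with different salience,
# scaled by the salience difference so the result is per unit of salience.
salience_diff = cells["high"][0] - cells["low"][0]
dd_per_unit = (gap_high - gap_low) / salience_diff
print(round(gap_low, 2), round(gap_high, 2), round(dd_per_unit, 2))
```

Levels may differ freely between the two hospitals and between the two patient groups. Only the change in the targeted versus non-targeted gap as salience rises is attributed to the penalty.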

We assess the validity of our identifying assumption through three different falsification exercises under which we would expect null effects. These are, respectively, shutting down variation in penalty salience across hospitals for an office and simply using a mean market-level penalty rate, replicating the analysis with MA patients who are not subject to HRRP, and a randomization test where we estimate the model 1000 times assigning a random subset of patients as a placebo treatment group in each iteration. This exercise generates an empirical distribution of placebo effects on effort against which we can compare the estimate from the main model. Reassuringly, in all three cases, we find small and statistically insignificant effects relative to the coefficients from the main analysis.

A caveat with the proposed research design is that spillover effects to non-targeted patients will bias the estimated effects away from the true effect of the penalty. Spillover effects could be positive if the home health firm adopts new protocols uniformly across patients or hires better quality nurses for all patients. On the other hand, if the firm has capacity constraints, greater resources on targeted patients may come at the expense of non-targeted patients. Hence, it is difficult to predict the direction of the bias.

Prior studies on HRRP have found evidence of net positive spillovers for non-targeted patients (Chen and Grabowski, 2019; Soltani et al., 2021; Gupta, 2021), i.e., patients in non-targeted conditions or those covered by other insurers also experienced a net reduction in readmissions. If this is also the case in our setting, then our estimates will be biased downward and understate the true effect. Spillover effects have been found to be larger for patients in conditions related to those targeted by the program. To mitigate the potential for spillover effects in our analysis, we exclude patients discharged from the hospital with a cardiovascular or respiratory condition, i.e., those with a principal ICD-9 diagnosis code within the 390xx–519xx range.

Estimating equation

We conduct our analyses at the patient episode level. For each patient episode \(i\) receiving treatment at HHA office \(j\) and discharged from hospital \(h\) with condition \(c\), we model outcome \(Y_{i}\) as below:

$$\begin{array}{*{20}c} {Y_{i} = \alpha_{j} + \alpha_{h} + \alpha_{t} + \alpha_{{d_{c} }} + \sum\limits_{d_{c} = AMI,\, HF,\, Pneum} \beta_{c} P_{hjc} 1\left( {d_{c} = 1} \right) + \gamma_{1} X_{i} + \epsilon_{ijht} } \\ \end{array}$$
(2)

We include office (\(\alpha_{j}\)) and hospital (\(\alpha_{h}\)) fixed effects.Footnote 17 The former eliminate stable unobserved differences in personnel and managerial quality across offices. The latter eliminate time invariant differences in quality across hospitals and ensure we compare treated and control patients discharged from the same hospital. We include indicators for each targeted condition, \(\alpha_{{d_{c} }}\), to allow for stable differences in outcomes relative to non-targeted conditions. \(P_{hjc}\) is the penalty salience for patients discharged with condition \(c\) from hospital \(h\) to HHA office \(j\). It is set to zero for patients with non-targeted conditions and calculated separately for each of the three penalized conditions.Footnote 18

The coefficients of interest in this model are \(\beta_{c}\), estimates of the differential effect of being discharged with targeted condition \(c\) relative to non-targeted conditions from a hospital with penalty salience equal to one, relative to the same difference at a hospital with zero penalty salience, all within the same office. Table 2 Panel A shows that the variation in penalty salience is split nearly equally into the within-office and between-office components. Hence, there is substantial variation in penalty salience across hospitals served by the same office.

We control for observable differences across patients by including a vector of co-morbidity indicators \(X_{i}\). Table 2 Panel C presents descriptive statistics on these variables. \(\epsilon_{ijht}\) is an idiosyncratic error term capturing unobserved disturbances that may affect the outcome. We cluster standard errors by hospital and office to account for potential correlation in outcomes for patients discharged from the same hospital or served by the same office.

Results

Descriptive evidence

We first visually explore if unadjusted patterns of episode cost and patient readmissions suggest a differential response for patients from penalized hospitals. Figure 3 presents time series trends of mean episode cost (Panel a) and readmission rates (Panel b) using data from the proprietary data sample. We plot the mean values separately for patients discharged from hospitals with non-zero penalty salience (‘treated’) versus those from hospitals with zero penalty salience (‘comparison’). The first group includes approximately 1100 patients discharged with targeted conditions, while the latter group includes all patients with non-targeted conditions and about half the patients with a targeted condition discharged from hospitals for which we calculated a zero-penalty rate (approx. 7900 patients). In each plot, the mean values are presented over the calendar years 2012–2015, with the first and last data points representing six-month averages. For expositional clarity, we normalize the raw values for both groups by their respective means in 2012, but do not adjust the data otherwise. Note that we do not have data from what we would consider as the pre-HRRP period (prior to August 2011).

Fig. 3 Time series of key outcomes by penalty salience. Note: This figure presents time series trends on the key outcome variables in the proprietary home health care data, namely mean episode cost (Panel a) and the probability of readmission in 30 days (Panel b), against calendar year on the X-axis. Both panels plot the time series separately for patients from hospitals with zero penalty salience (dashed), and non-zero penalty salience (solid). Since the levels of the two groups of hospitals are quite different, we normalize the values against their respective levels in 2012. Note that episode costs are deflated to be in 2015 dollars. The zero-penalty salience group includes patients from targeted conditions that were discharged from a hospital assessed zero penalty.

Figure 3a shows that episode costs increased in 2013 for patients from hospitals facing a non-zero penalty salience and declined steadily later. In contrast, episode costs for patients from zero penalty salience hospitals do not meaningfully increase in 2013 but do exhibit the same declining trend. The declining trend over 2014 and 2015 is common to both groups and may also reflect a structural shift in how the firm computes these cost estimates. At the end of the period, the cost for the treated group is similar to what it was in 2012, while for the comparison group, it is about 10% lower. Panel b presents mean values of thirty-day readmission rates for patients in the two groups. There is a clear decline for the treated group in 2014, which persists in 2015 (about 10% lower relative to 2012), while the trend is flat for the comparison group.

This evidence suggests differential movements in episode cost and patient outcomes for patients from hospitals likely to be penalized and contributing greater volume to the HHA. These patterns are unadjusted and may therefore partly reflect differences in patient or office composition over time. Hence these are only suggestive, and in the next section we present results obtained by estimating Eq.  (2) discussed above.

Resource intensity

Table 3 columns 1–6 present results on home health resource intensity, obtained by estimating Eq. (2). For each measure of resource intensity, we present two sets of results–columns 1, 3, and 5 pertain to effects for the entire episode, while columns 2, 4, and 6 present effects measured over the first week of the episode only. We use outcomes at one-week duration to test if the firm responded by increasing effort disproportionately at the beginning, i.e., a frontloaded response. Each column presents two sets of condition specific coefficients. The average effect of being discharged with a targeted condition relative to a non-targeted condition regardless of penalty salience is presented in rows 1, 3, and 5. The coefficient of interest is that on the interaction of the condition dummy and penalty salience (rows 2, 4, and 6).

We begin by exploring effects on episode cost (columns 1 and 2). The coefficients suggest differentially greater episode cost for heart failure patients from hospitals with greater penalty salience. The implied effect for heart failure patients discharged from a hospital with 1 s. d. greater penalty salience is 4.7% greater episode spending (\((e^{0.78} - 1) \times 0.04 = 1.18 \times 0.04 \approx 0.047\)). Since the mean episode cost for targeted patients was about $1,280, the implied amount is $60. Since this is not a conventional DD design, a brief interpretation of this coefficient is warranted. The estimate implies that, within an office, the difference in episode cost between heart failure and non-targeted condition patients from the same hospital is $60 greater for hospitals with 1 s. d. greater penalty salience. The remaining DD estimates from our main model can be interpreted in a similar fashion.

When we limit our attention to the first week, we find statistically significant greater cost for heart attack patients, while the coefficient for heart failure patients is also positive but marginally significant. Accordingly, we focus more on the effect for heart attack patients. An increase of 1 s. d. in penalty salience is associated with a statistically significant 6.9% greater cost of care in the first week of the episode for heart attack patients (\((e^{1} - 1) \times 0.04 = 1.72 \times 0.04 \approx 0.069\)). This implies $28 greater cost in the context of our sample (6.9% of approximately $400).
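The conversion from a log-outcome coefficient to an implied effect at 1 s. d. of penalty salience follows the same arithmetic each time, so it can be reproduced in a few lines. The sketch below is a convenience check on the two calculations above. The function name and the choice to round the percentage before converting to dollars are ours, the latter mirroring how the dollar amounts are stated in the text.

```python
import math

def implied_effect(beta, salience_sd=0.04):
    """Proportional change in the outcome for a 1 s.d. increase in penalty
    salience, following the arithmetic in the text: (e^beta - 1) * s.d."""
    return (math.exp(beta) - 1.0) * salience_sd

hf_episode = implied_effect(0.78)   # heart failure, full episode
ami_week1 = implied_effect(1.0)     # heart attack, first week

# Dollar amounts: round the share to three decimals first, then apply it to
# the approximate baseline means quoted in the text ($1,280 and $400).
hf_dollars = round(hf_episode, 3) * 1280
ami_dollars = round(ami_week1, 3) * 400

print(round(hf_episode, 3), round(hf_dollars, 1))
print(round(ami_week1, 3), round(ami_dollars, 1))
```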

When interpreting the magnitude of these implied effects, it is useful to consider the range of costs observed in the baseline period between the sickest and least sick patient groups (about $150, see Online Appendix Table A.1). If we regard this as an empirical benchmark of the extent by which the firm can increase episode resource intensity, then a 1 s. d. greater penalty salience is associated with a 40% (60/150) greater episode cost for heart failure patients, and the first week costs for heart attack patients are differentially greater by more than the range of cost observed in the baseline period ($12). Taken together, the firm appears to use more resources in an economically meaningful way for heart disease patients from hospitals with greater penalty salience.

Online Appendix Table A.3 presents the results of estimating Eq. (2) on individual cost segments–visit and personnel costs (in logs). The results indicate that the differentially greater cost for heart failure patients is driven by both greater visit and personnel cost categories. A 1 s. d. greater penalty salience is associated with a 4% greater visit cost (\((e^{0.7} - 1) \times 0.04 \approx 1.03 \times 0.04 \approx 0.04\)), or about $30 (4% of $760), and a similar amount in additional personnel cost. Hence, there is a greater relative increase in personnel cost, since this category accounts for only 40% of total episode cost, but about half of the increase. The estimated effects on first week costs for heart attack patients indicate a similar pattern–personnel costs account for 55% of the total effect. The increase in personnel costs implies that the firm is deploying more experienced caregivers for the targeted patients from hospitals with greater penalty salience.

Greater visit cost could be driven by two channels–spending more time with the patient or deploying more specialized workers (e.g., registered nurses or physical therapists rather than semi-skilled aides). We directly observe the time spent with patients and study the effect on this dimension. Table 3 columns 3–4 and 5–6 present results on the total time spent and the number of visits, respectively. The point estimates imply an increase in the amount of time spent with both heart attack and heart failure patients. For example, the coefficients imply the firm spends about 20 min more (3% of the mean) for heart failure patient episodes from a hospital with 1 s. d. greater penalty salience. The coefficients imply that the increase in time spent with the patient is comparable to the estimated increase in total episode cost (e.g., we find a 7% increase in both the time spent and the episode cost in the first week for heart attack patients). Hence, we interpret these results as suggesting that greater visit cost is largely driven by spending more time with the patient.

Note that standard claims data would allow us to observe only the number of visits. Even for this outcome, the 20% Medicare sample would likely produce downward biased estimates due to measurement error. First, sampling error would lead to noise in the hospital’s share of HHA volume. Second, without the corresponding MA claims for each hospital, we cannot compute the hospital’s true share of the HHA’s Medicare patient volume accurately.

We note here that we do not find an increase in costs across all targeted conditions–the estimated effects for pneumonia patients are typically statistically insignificant. This may be due to sampling error, but the coefficients are consistently negative across the different components of cost, suggesting a decline in resource intensity.

Readmissions

Table 3 columns 7 and 8 present the estimated effects on the probability of 30-day and 7-day readmissions, respectively. The estimates indicate differentially lower readmission probability in the first week for heart failure patients and–surprisingly–no decline over the 30-day period. The estimated effects imply that heart failure patients from a hospital with a 1 s. d. greater penalty salience experience a 1 percentage point lower probability of readmission (0.26 × 0.04 ≈ 0.01) in the first week. This effect is about 20% of the mean readmission rate in the first week and is about half the difference in readmissions between the sickest and least sick patient groups. Alternatively, a heart failure patient from a hospital with mean penalty salience experienced a 0.5 percentage point lower probability of readmission (0.26 × 0.02 = 0.0052). This is well within the range of estimates from previous studies of the decrease in readmissions due to HRRP (Desai et al., 2016; Gupta, 2021). However, these gains do not persist to the 30-day point.

The estimated effects on patient readmissions are puzzling given that the firm invests greater resources for targeted patients without delivering consistent improvement in readmissions. It may signal lack of effectiveness of home health care in addressing hospital readmissions, or heterogeneity in patient outcomes that is masked by small average effects. Accordingly, in the next section we test for heterogeneous effects.

Heterogeneity

We hypothesize that since incremental costs borne for these patients come at the cost of the firm’s episode level surplus, it will selectively target patients for additional care. Since the objective behind the incremental effort is to help reduce penalty risk for hospitals, the firm should prioritize more resources for patients at greater readmission risk. We test this hypothesis by exploiting variation in observed risk across patients as an additional dimension that may determine the firm’s response.

The patient severity score (described in the “Data sources” section) is an excellent predictor of readmission risk–the sickest patient quartile has more than twice the 30-day probability of readmission of patients in the bottom quartile (see Online Appendix Table A.1). Accordingly, we construct an indicator that takes value one if the patient has a severity score at or above the median (i.e., \(d_{s} = 1[s_{i} \ge s_{50}]\)). We construct the indicator by comparing targeted and non-targeted patients to their group-specific median values. We then estimate the model in Eq. (3) below.
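A minimal sketch of how such an indicator can be built is given here, assuming a flat list of episodes with a targeted-condition flag and a severity score. The records are hypothetical, and the rule follows the formula above: at or above the median of the patient's own group.

```python
from statistics import median

# Hypothetical episodes: (patient id, targeted condition?, severity score).
episodes = [
    (1, True, 2.1), (2, True, 3.4), (3, True, 5.0), (4, True, 4.2),
    (5, False, 1.0), (6, False, 1.8), (7, False, 2.9), (8, False, 0.7),
]

def severity_indicator(rows):
    """Return {patient id: d_s}, with d_s = 1 when the score is at or above
    the median of the patient's own group (targeted or non-targeted)."""
    cutoffs = {
        group: median(score for _, targeted, score in rows if targeted == group)
        for group in (True, False)
    }
    return {pid: int(score >= cutoffs[targeted]) for pid, targeted, score in rows}

d_s = severity_indicator(episodes)
print(d_s)
```

Using group-specific cutoffs keeps roughly half of each group in the sicker category, so the indicator does not simply proxy for having a targeted condition.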

$$\begin{aligned} Y_{i} & = \alpha_{j} + \alpha_{h} + \alpha_{t} + \alpha_{{d_{c} }} + \sum\limits_{d_{c} = AMI,\, HF,\, Pneum} \beta_{1c} P_{hjc} 1\left( {d_{c} = 1} \right) + \sum\limits_{d_{c} = AMI,\, HF,\, Pneum} \beta_{2c} 1\left( {d_{c} = 1} \right) 1\left( {d_{s} = 1} \right) \\ & \quad + \sum\limits_{d_{c} = AMI,\, HF,\, Pneum} \delta_{c} P_{hjc} 1\left( {d_{c} = 1} \right) 1\left( {d_{s} = 1} \right) + \gamma_{1} X_{i} + \epsilon_{ijht} \end{aligned}$$
(3)

This model has three sets of coefficients of interest. \(\beta_{1c}\) is equivalent to the main interaction coefficient from Eq.  (2) and estimates the differential effect for patients discharged with targeted condition \(c\) from hospitals with penalty salience \(P_{hjc} = 1\). However, this coefficient now estimates the main effect for patients in the bottom half of sickness severity (henceforth, healthier patients). \(\beta_{2c}\) estimates the average effect for sicker patients discharged with targeted condition \(c\) from hospitals with zero penalty salience. This coefficient soaks up any differential responses for sicker patients even from non-penalized hospitals. Finally, \(\delta_{c}\) is the coefficient of interest in this model and it isolates the differential response for sicker patients from high penalty salience hospitals, in addition to the effect \(\beta_{1c}\) discussed above.Footnote 19 As in the base specification, we continue to control for observable differences across patients (\(X_{i}\), which now includes the indicator \(d_{s}\)). When interpreting the coefficients, note that the reference group is now healthier non-targeted patients.

We estimate Eq.  (3) with our main outcomes–total cost and probability of readmission and present the results in Table 4. Since the main results suggest greater effort and lower readmissions in the first week, we also present evidence on these outcomes after one week of care. For brevity of presentation, we omit the condition specific coefficients \(\alpha_{{d_{c} }}\). The coefficients estimating the effect for sicker targeted patients from non-penalized hospitals (\(\beta_{2c}\)) were generally small and statistically insignificant across both outcomes–suggesting little differential effort for sicker targeted patients, if discharged from a hospital with zero penalty salience. These are also not reported. The table presents the main interaction coefficients (\(\beta_{1c}\)) and differential response for sicker patients (\(\delta_{c}\)).

The results on total cost clarify that the firm provided greater resources to sicker heart failure patients, particularly in the first week of care. Column 2 indicates that while first week cost may have been lower for healthier heart failure patients, cost for sicker heart failure patients from a hospital with 1 s. d. greater penalty salience is about 5% greater (\((e^{0.97 - 0.16} - 1) \times 0.04 = 1.25 \times 0.04 = 0.05\)). Table 4 columns 3 and 4 present corresponding results on patient readmissions in 30-days and 7-days, respectively. Column 4 shows that the decrease in first week readmissions for heart failure patients is largely due to patients sicker than the median.

The heterogeneity analysis results on episode cost and readmissions help resolve some, but not all, of the puzzles raised by the average effects discussed in the previous section. For example, among heart failure patients, sicker patients appear to have received differentially greater treatment intensity. They also experience fewer readmissions at the 7-day mark. However, these gains do not persist to 30 days. Taken together, the evidence suggests that greater home health care intensity was successful in delaying readmissions but did not meaningfully avoid hospital care.

Table 3 Provider effort and patient readmissions

Threats to identification and robustness checks

Changes in hospital referral behavior due to HRRP

Since patients must receive home health care to enter the proprietary data sample, a natural concern is whether hospitals’ response to HRRP includes changes in the patient mix referred to home health care. If penalized hospitals responded by differentially sending more patients to home health, and if the marginal patients were more costly to serve, then we would spuriously find higher spending for these patients. The opposite response is also possible, that is, the marginal patient discharged to home health may be healthier than before, in which case we would be biased against finding an increase in spend. It is therefore difficult to sign the bias and assess its importance without additional analysis.

To formally study changes in extensive margin use of post-acute care, we used the Medicare claims data to estimate linear probability differences-in-differences models presented in Eq.  (4) below. The models were estimated independently for each targeted condition, and hence the subscript \(c\) is omitted. The dependent variable is an indicator for use of any PAC or specific sub-type within 7 days of discharge from the index hospital stay. The coefficient of interest is \(\beta\) which quantifies the mean change in use of PAC for a hospital with penalty rate, \(penalty_{h} = 1\) over the period 2012–2015 relative to the pre-HRRP period. Note that we hold hospital penalty rate constant as of 2012 to circumvent reverse causality concerns.

$$\begin{array}{*{20}c} {1\left ( {PAC} \right)_{i} = \alpha_{h} + \alpha_{t} + \beta penalty_{h} 1\left ( {T_{t} = 1} \right) + \gamma_{1} X_{i} +\epsilon_{iht} } \\ \end{array}$$
(4)

Online Appendix Table A.4 presents corresponding estimates for each of the targeted conditions. Barring a couple of coefficients, none are statistically significant. More importantly, the implied magnitudes are very small. For example, a 1 s. d. increase in penalty rate is associated with a decrease of 0.001 (0.06 × −0.02) in the probability of a heart failure patient being sent to home health care. This is negligible relative to the mean usage rate of home health care in the sample (0.17). The other coefficients are of comparable magnitude. We also estimated a flexible, non-parametric specification where we used an indicator \(d_{h}\) for hospitals in the top tertile by \({\text{penalty}}_{h}\). This is presented in Eq. (5) below. This model also tests for differential pre-trends in the use of PAC across hospitals. For brevity, this exercise focuses on the two largest categories of PAC–home health and skilled nursing.

$$\begin{array}{*{20}c} {1\left ( {PAC} \right)_{i} = \alpha_{h} + \alpha_{t} + \mathop \sum \limits_{s \ne 2008} \beta_{s} d_{h} 1\left ( {t_{t} = s} \right) +\epsilon_{iht} } \\ \end{array}$$
(5)

Online Appendix Figure A.1 plots the coefficients for any PAC (Panel a), home health (Panel b), and skilled nursing (Panel c), respectively. There is no distinct pattern before or after the implementation of the penalty for any of the conditions or any type of PAC. Taken together, these results imply no change in the use of PAC for the average patient with one of the targeted conditions.

Another concern is that while there is no change in the fraction of patients sent to PAC, the risk profile of patients in PAC may have changed. Online Appendix Table A.5 presents results from triple difference models testing for heterogeneous effects based on patient sickness severity.Footnote 20 Columns 1, 3, and 5 reproduce the results discussed above for comparison. Columns 2, 4, and 6 present the triple difference coefficient for each of the targeted conditions. These are also small and statistically insignificant. This helps alleviate the concern that hospitals may be sending sicker or healthier patients to home health care post-HRRP. These findings are consistent with those of previous studies on HRRP which have not found evidence of changes in PAC use by hospitals (Gupta, 2021; Kim et al., 2019; Ziedan, 2019).

Falsification tests

Eliminating variation in penalty across hospitals

We interpret differentially greater episode cost for targeted patients as evidence of greater effort by the home health firm in response to HRRP. The identification assumption is that if the penalty were not implemented, there would be no systematic variation in the difference in outcomes between targeted and non-targeted patients from hospitals at different levels of penalty salience. We assess the plausibility of this assumption through a falsification test where we shut down variation in penalty salience across hospitals and assign to targeted patients the average penalty rate across all hospitals sending patients to the same office. Non-targeted patients continue to be assigned zero penalty salience. We then re-estimate our main models. Since penalty salience now only varies across markets, the key coefficient now captures whether offices in higher penalty markets spend more on targeted condition patients relative to those in lower penalty markets. If our results largely reflect the firm’s response to local market sentiment regarding the penalty, then we should continue to find meaningful effects even with this aggregate penalty measure. Table 5 Panel A presents the corresponding results. Reassuringly, the estimates are small relative to the main coefficients and statistically insignificant across all three conditions. For example, the DD coefficient on episode cost for heart failure patients is 0.32 (about two-fifths the size of the main estimate) and is statistically insignificant. This exercise confirms that our results are not driven by a generic response to the introduction of the penalty.Footnote 21
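The construction of the aggregate regressor in this falsification test can be sketched as follows. The office and hospital labels and the penalty rates are hypothetical, and taking an unweighted mean across referring hospitals is an assumption made for illustration.

```python
# Hypothetical penalty rates (percent) for hospitals referring to each office.
referrals = {
    "office_1": {"A": 0.5, "B": 1.0, "C": 0.5},
    "office_2": {"D": 0.0, "E": 0.2},
}

def market_mean_penalty(hospital_penalties):
    """Unweighted mean penalty rate across an office's referring hospitals."""
    return sum(hospital_penalties.values()) / len(hospital_penalties)

def placebo_salience(office, targeted):
    """Falsification regressor: office-level mean for targeted patients,
    zero for non-targeted patients, regardless of the referring hospital."""
    return market_mean_penalty(referrals[office]) if targeted else 0.0

# Every targeted patient at office_1 gets the same value, whichever of
# hospitals A, B or C discharged them.
print(round(placebo_salience("office_1", True), 3))
print(placebo_salience("office_1", False))
```

Because the value no longer depends on the referring hospital, any remaining effect would have to come from differences across offices rather than across hospitals within an office.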

Table 4 Heterogeneity by patient severity

Table 5 Robustness and Falsification

Medicare Advantage patients only

CMS does not calculate or consider readmissions for Medicare Advantage (MA) patients when computing a hospital’s penalty under HRRP. MA patients provide a suitable control group since they are similar in age to fee-for-service patients. Under our conceptual framework, the hospital has no incentive to insist on differentially greater resource use for these patients, and the HHA has no reason to do so on its own. Since there are only 23 heart attack patients in the Medicare Advantage sample, we exclude heart attack from this analysis.

Table 5 Panel B presents the corresponding results when we replicate our main analysis on the sample of MA patients. Columns 1 and 2 present estimated effects on episode and first week costs, respectively. The coefficients imply that heart failure patients from high penalty salience hospitals receive either the same or differentially lower resource intensity. A slight decrease in resources for MA patients with targeted conditions could be due to the multitasking effect alluded to in Section II.A. This pattern is also found in the case of pneumonia patients, where we estimate a marginally significant reduction in episode cost in the first week.

Table 5 Panel B columns 3 and 4 present the corresponding results on the probability of readmission in 30 days and the first week, respectively. The coefficients do not display a consistent pattern across conditions or even within a condition. In the case of pneumonia, we find a positive and marginally significant coefficient on readmissions in the first week, which is plausible given the estimated reduction in resource intensity discussed above. But this pattern does not hold at 30 days for pneumonia or for heart failure patients.

Placebo treatment effects

Our research design exploits the fact that HRRP incentivized improvement in readmissions only for patients in select conditions. However, one may worry that our results are capturing spurious effects. We assess this concern by generating an empirical distribution of treatment effects by randomly assigning a subset of the sample as heart failure patients and re-estimating our main specification 1,000 times. We then compare the observed effect on heart failure patients against this distribution.Footnote 22 Figure 4 presents the CDF of the estimated effects and indicates the true coefficient by a red line. The true coefficient is larger in magnitude than about 94% of the placebo coefficients. Hence, we interpret the main results as capturing a systematic effect on episode cost for heart failure patients.

Fig. 4

Empirical distribution of placebo effects on episode cost. Note: This figure presents the empirical distribution of placebo effects on log episode cost obtained by implementing a randomization test. Our main result finds an increase in episode cost for heart failure patients with a coefficient of about 0.8. We obtain a distribution of placebo effects and plot them relative to the observed effect on heart failure (indicated by the red line). To do so, we excluded heart attack and pneumonia patients from the sample but retained heart failure patients. We then estimated the model 1,000 times. In each iteration, we randomly assigned a subset of patients as placebo heart failure patients, matching the actual proportion of heart failure patients. The figure plots the CDF of the estimates obtained. It indicates that the true effect is larger than about 94% of the placebo estimates.
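The randomization test described above can be sketched as follows. This is an illustration under strong simplifying assumptions, not the paper's specification: a simple difference in means stands in for the main regression model, the data are simulated, and all names and parameter values (sample size, effect size, seed) are hypothetical.

```python
import random
import statistics


def diff_in_means(outcomes, flags):
    """Stand-in estimator: mean outcome of flagged patients minus the rest."""
    treated = [y for y, f in zip(outcomes, flags) if f]
    control = [y for y, f in zip(outcomes, flags) if not f]
    return statistics.fmean(treated) - statistics.fmean(control)


def placebo_distribution(outcomes, n_treated, n_draws, rng):
    """Re-estimate the effect after randomly relabeling n_treated patients."""
    n = len(outcomes)
    estimates = []
    for _ in range(n_draws):
        chosen = set(rng.sample(range(n), n_treated))
        flags = [i in chosen for i in range(n)]
        estimates.append(diff_in_means(outcomes, flags))
    return estimates


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simulated sample: the first 400 of 2,000 patients are "heart failure" patients
# with a hypothetical true effect of 1.0 on the outcome.
n_patients, n_hf = 2000, 400
is_hf = [i < n_hf for i in range(n_patients)]
log_cost = [rng.gauss(0.0, 1.0) + (1.0 if hf else 0.0) for hf in is_hf]

observed = diff_in_means(log_cost, is_hf)
placebos = placebo_distribution(log_cost, n_hf, 1000, rng)

# Share of placebo estimates that are smaller in magnitude than the observed one.
share_below = sum(abs(b) < abs(observed) for b in placebos) / len(placebos)
print(f"observed = {observed:.3f}, share of smaller placebos = {share_below:.3f}")
```

Each placebo draw keeps the number of flagged patients equal to the actual number of heart failure patients, mirroring the proportion-preserving assignment in the figure note.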

Alternative characterization of penalty salience

We took a reduced form approach when we constructed our penalty salience measure and assumed that it increases in proportion to a hospital’s share of the firm’s patient volume. While this is intuitive, there are alternative ways to motivate this mechanism. For example, one may argue that hospitals will not bring up increased service standards with all their HHA partners. If enforcing these standards requires greater monitoring effort by the hospital, then it is costly for the hospital to do so. It follows that hospitals will prioritize monitoring for HHAs that account for a larger share of their patient discharges.

We replicate our main results using a modified measure of penalty salience to operationalize this interpretation. This measure, denoted \(\tilde{P}_{hjc}\), also incorporates the firm’s share of the hospital’s patients sent to home health care in the baseline period, \(\tilde{s}_{hj0}\), as shown below.Footnote 23

$$\tilde{P}_{hjc} = penalty_{hc} \times share_{jh0} \times \tilde{s}_{hj0} \tag{3a}$$

$$\tilde{s}_{hj0} = \frac{n_{hj0}}{n_{h0}} \tag{3b}$$

In the baseline period, the patient-weighted mean value of \(\tilde{s}_{hj0}\) across office-hospital pairs was \(0.41\), with a standard deviation of 0.39.Footnote 24 However, the firm’s share of hospital patients is positively correlated with the hospital’s share of the firm’s volume (\(share_{jh0}\)); hence, this alternative measure of penalty salience is only slightly smaller in magnitude than the main measure on average (mean of 0.017 vs. 0.02). Using this alternative measure as the key source of cross-hospital variation, we replicate our main results on episode cost and probability of readmission. Table 5 Panel C presents the corresponding interaction coefficients. Reassuringly, these results display similar patterns across conditions, outcomes, and time horizons, i.e., whether we consider the entire episode or the first week only. The coefficients are also similar in size and imply effects of similar magnitude to the main results. For example, the estimates imply a differentially greater cost of 4.9% (\((e^{0.8} - 1) \times 0.04 \approx 1.22 \times 0.04 \approx 0.049\)) for heart failure patients from a hospital with one standard deviation greater penalty salience.
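As a quick check on the construction in Eqs. (3a) and (3b) and on the effect-size arithmetic, the calculation can be written out in a few lines. The function names and input values are hypothetical illustrations, and the sketch assumes that the 0.04 in the arithmetic above is the one-standard-deviation change in penalty salience.

```python
import math


def firm_share_of_hospital(n_hj0, n_h0):
    """Eq. (3b): firm j's share of hospital h's baseline home health discharges."""
    return n_hj0 / n_h0


def alt_penalty_salience(penalty_hc, share_jh0, s_tilde_hj0):
    """Eq. (3a): penalty rate times both baseline shares."""
    return penalty_hc * share_jh0 * s_tilde_hj0


def implied_relative_effect(coef, delta_salience):
    """Relative change in cost implied by a log-outcome coefficient.

    Follows the arithmetic in the text: (exp(coef) - 1) * delta_salience.
    """
    return (math.exp(coef) - 1.0) * delta_salience


# Hypothetical office-hospital pair: the firm receives 41 of the hospital's 100
# baseline home health discharges, and the hospital supplies half the firm's volume.
s_tilde = firm_share_of_hospital(n_hj0=41, n_h0=100)
p_tilde = alt_penalty_salience(penalty_hc=0.01, share_jh0=0.5, s_tilde_hj0=s_tilde)

# Coefficient of 0.8 and an assumed one-standard-deviation change of 0.04.
effect = implied_relative_effect(coef=0.8, delta_salience=0.04)
print(f"s_tilde = {s_tilde:.2f}, P_tilde = {p_tilde:.5f}, effect = {effect:.3f}")
```

Because the modified measure multiplies the main measure by a share between 0 and 1, it can never exceed the main measure for a given office-hospital pair.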

Discussion and conclusions

Performance pay has become a pervasive feature of provider contracts in US healthcare, although there remains considerable debate over design. Using patient health outcomes to anchor performance pay contracts exposes providers to greater risk but may also be more efficient if providers respond by deploying effective and low-cost solutions. To design policies optimally, policymakers need to factor in potential provider response to incentives. But there is little empirical evidence from the health care sector to guide policymakers. This paper begins to fill these gaps by examining whether a performance pay penalty imposed on hospitals passed through to post-acute care providers and resulted in greater use of resources outside the hospital.

We do so in the context of one of the largest performance-pay programs in US health care, the Hospital Readmissions Reduction Program (HRRP). Previous studies of this program have documented improvements in 30-day readmissions; however, there is debate over the causal channels. Leveraging novel operational and claims data from a large home health care firm operating in 16 states, we find that the firm provided more inputs for patients targeted by HRRP who were discharged from hospitals that face greater penalty risk and account for a greater share of the firm’s patient volume. The data also permits us to identify three mechanisms used by the firm: frontloading care in the first week of the episode, using more experienced caregivers and spending more time with the patient, and targeting sicker patients.

The data is novel and allows us to observe changes in effort that would remain unobserved in traditional claims data. The results are also encouraging from a policy perspective. However, the use of data from one provider does limit the generalizability of the conclusions.

We find effects on episode cost for heart disease patients, but not for pneumonia patients. Within heart disease, our results at the 30-day mark are stronger for heart failure patients. However, prior studies on HRRP have found greater effects on readmissions for heart attack and pneumonia patients relative to heart failure (Desai et al., 2016; Mellor et al., 2017; Ziedan, 2019). Our results may differ because our sample is limited to home health care patients (about 16% of hospital stays for the targeted conditions). These episodes represent more complex patients, and effects for this group may differ from the population-level effects documented in prior studies. Further, it is plausible that the low-intensity, frequent care provided by home health care disproportionately benefits patients with chronic conditions such as heart failure (Naylor et al., 1999, 2004).

These results indicate that even in the absence of vertical integration, downstream post-acute care firms can internalize penalty incentives imposed on hospitals, to the extent quality of care affects demand for their services. As post-acute care gains prominence in policy proposals, quantifying the welfare effects for society assumes more importance. This was outside the scope of the current study but remains an important avenue for future research.