Introduction

In most developed countries, including Italy, the costs of cancer care are increasing [1], owing to a growing population of cancer survivors [2] and the rising cost of novel, more expensive treatments. As a consequence, the sustainability of the cancer burden is a challenge for the welfare system and is becoming increasingly central to the policy debate.

Recent estimates report about 2.6 million people living with a cancer diagnosis in Italy in 2010 and predict 3.6 million in 2020, as the combined effect of increasing survival and population aging [3]. This population of cancer survivors is a mixture of newly diagnosed patients, patients living with chronic disease, people who have been cured, and patients at the end of life, whose health care needs and corresponding costs differ widely [4, 5]. A sound assessment of cancer-related expenditures, together with evidence on how they are distributed along the care pathway, is fundamental support for policy makers, who face the challenge of providing health care services efficiently.

A number of studies comparing cancer costs in countries of the European Union investigate direct and indirect costs at national aggregate level [6, 7]. A study by Laudicella et al. [8] reports the cost of cancer in England using population-based, patient-level data and shows that early diagnosis and cancer prevention contribute to large cost savings for the health system. In the US, a number of studies use the SEER-Medicare database to investigate the direct costs of cancer care with individual-level data [5, 9, 10]. Their findings are generally restricted to the population aged 65 and over and cannot be generalized to the entire population. A more recent study involves younger patients enrolled in four health plans [11] and concludes that “higher costs among patients aged < 65 years highlight limitations on relying on SEER-Medicare data alone”. In Canada, a comparative study of the provinces of British Columbia and Ontario [12] uses individual-level data from cancer registries linked with administrative data sources and concludes that “comparative cost studies present many challenges but enable analyses within and between countries, and can produce comparable estimates for research, policy and decision-making”.

In Italy, current evidence is mainly based on clinical cohorts [13], or is limited to specific types of expenditure [14, 15] or to single phases of the disease pathway [16]. In this paper, we present the findings of the Epicost study, which merges individual-level information from different data sources (cancer registries and administrative health care databases) to reconstruct patterns of care and cost profiles of women with breast cancer.

Materials and methods

Data sources

In Italy, a public welfare system administered on a regional basis guarantees universal health care. Hospitals, clinics and pharmacies submit their health claims to the regional health authority for reimbursement.

This study includes data from four different sources: the population-based Cancer Registry database (CR), the Hospital Discharge database (HD), the Outpatient Services database (OPS) and the Drug Prescriptions database (DP). CR provides data on cancer patients, while data on health care services and the corresponding claims in Euros are provided by the other three sources.

CR collects data on all cancer diagnoses occurring in the population resident in the area covered by cancer registration and follows up patients with respect to their vital status. When a patient dies, CR registers the date and cause of death. This information is complete for all cases, except for those lost to follow-up, which represent on average about 1% of cases per year.

The HD database is a collection of hospital discharges; the OPS database provides information on outpatient services (for example, diagnostic tests and ambulatory interventions); the DP database contains data on drugs prescribed to a patient and sold by a pharmacy.

Chemotherapy is generally administered in hospital or outpatient settings. In either case, the information is included in the HD or OPS databases. The exception is high-cost drugs administered in hospital or outpatient settings, such as some types of chemo- and immuno-therapeutic drugs. These are recorded in a separate database that was not considered in the analysis because, at the time of the study, data management and the reimbursement system were incomplete and varied widely among regions.

In each of these databases, information is collected at individual level and includes a personal identification code for record linkage with the CR database. This makes it possible to trace all health care resources used by a single patient in a given period of time and to select the services related to breast cancer, according to the methodology illustrated below.

Study cohort

This study involved 8 population-based cancer registries (CRs) with at least 8 years of registration: Veneto (3 out of 21 Local Health Units), Friuli Venezia Giulia (FVG) and Milano in Northern Italy; Umbria, Firenze-Prato (Fi-Prato) and Latina in Central Italy; Napoli and Palermo in Southern Italy. These CRs belong to eight different regions and overall they cover about 5.3 million subjects, corresponding to 17% of the Italian female population.

The study cohort is cross-sectional and includes women diagnosed with malignant breast cancer (ICD-10 C50). Each CR contributed patients who had been diagnosed in the 8 years before the prevalence date and were still alive on that date (prevalence cohort). The CRs entered the study with the most up-to-date data available at the time of case extraction: prevalence dates range from January 1st 2009 to January 1st 2013, and consequently the eight-year diagnosis period ranges from 2001–2008 (Fi-Prato CR) to 2005–2012 (Milano CR), as illustrated in Table 1.

Table 1 Study cohort features by cancer registry: geographical area, population coverage, prevalence cohort, information on stage at diagnosis, information on screening

Women diagnosed with another cancer in the years following the breast cancer diagnosis, and those with a previous cancer diagnosis within 5 years of the breast cancer diagnosis, were excluded.

Women of the prevalence cohort were followed with respect to their vital status up to one year after the prevalence date. For those diagnosed during the last year before the prevalence date, CRs were asked to provide information on stage at diagnosis and modality of diagnosis (screen-detected cases versus not screen-detected cases). Table 1 reports the percentage of cases with unknown stage at diagnosis, the percentage of screen-detected cases and the percentage of target population who underwent mammography within screening programs active in 2010.

Prevalence by phase of care

Each patient contributed to the study with a 12-month time interval: we fixed the prevalence date and computed cancer-related costs in the 12 months around it. Short survivors, defined as patients whose time between diagnosis and death was less than 12 months (about 0.3% of the study cohort), and patients with censored follow-up (about 0.1% of the study cohort) contributed less than 12 months. We defined three mutually exclusive phases of care: initial (the first 12 months following diagnosis), continuing (the time between the initial and the final phase of care) and final (the final 12 months of life). Each case in the cohort was assigned to the phase of care she belonged to on the prevalence date, as follows:

if she had been diagnosed within 12 months before the prevalence date and was alive 12 months after it, she belonged to the initial phase;
if she had been diagnosed more than 12 months before the prevalence date and was alive 12 months after it, she belonged to the continuing phase;
if she died within 12 months after the prevalence date, regardless of when she had been diagnosed, she belonged to the final phase.

It should be noted that, although each woman can contribute to more than one phase of care during her life span, on the prevalence date each patient belonged to only one phase, i.e. the phases of care are mutually exclusive. Figure 1 illustrates the four possibilities, including the short-survivor case.
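The assignment rules above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the study's own code: the function name, the use of elapsed months as inputs, and the treatment of the exact 12-month boundary are assumptions, and censoring for non-cancer deaths is not modelled.

```python
from typing import Optional


def assign_phase(months_dx_to_prevalence: float,
                 months_prevalence_to_death: Optional[float]) -> str:
    """Assign a patient to one phase of care on the prevalence date.

    months_dx_to_prevalence: time from diagnosis to the prevalence date.
    months_prevalence_to_death: time from the prevalence date to death,
        or None if the patient is alive at the end of follow-up.
    Boundary handling (<= 12) is an assumption; the text only says "within".
    """
    died_within_year = (months_prevalence_to_death is not None
                        and months_prevalence_to_death <= 12)
    if died_within_year:
        # Overall survival below 12 months marks a short survivor.
        if months_dx_to_prevalence + months_prevalence_to_death < 12:
            return "short survivor"
        return "final"
    if months_dx_to_prevalence <= 12:
        return "initial"
    return "continuing"


print(assign_phase(5, None))   # diagnosed 5 months before prevalence, alive
print(assign_phase(40, None))  # diagnosed 40 months before prevalence, alive
print(assign_phase(40, 6))     # died 6 months after prevalence
print(assign_phase(3, 4))      # overall survival of 7 months
```

Because the death check comes first, a newly diagnosed woman who dies within the year is never counted in the initial phase, which matches the statement that the final phase applies regardless of the date of diagnosis.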

Fig. 1 Phase of care study design

Each line in the figure represents the life of a study patient, from diagnosis (Dx) to possible death (+); the thick solid line is the observation period in which costs are computed, defined as follows:

[Dx date, Dx date + 12 months] for patients in the initial phase (patient number 1);
[Prevalence date − 6 months, Prevalence date + 6 months] for patients in the continuing phase (patient number 2);
[Death date − 12 months, Death date] for patients in the final phase (patient number 3);
[Dx date, Death date] for short survivors (patient number 4), i.e. patients with overall survival shorter than 12 months.

For short survivors, the first two months after diagnosis are attributed to the initial phase and the remaining months to the final phase; if survival is shorter than 3 months, the entire survival time is attributed to the final phase, which takes precedence over the initial one.
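As a hedged illustration of the observation windows described for Fig. 1, the sketch below computes the start and end dates for each case. The helper names and the rule for clipping month-end days are assumptions introduced here; the split of a short survivor's months between initial and final phase is not implemented.

```python
import calendar
from datetime import date
from typing import Optional, Tuple


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clipping the day to the month length."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def observation_window(phase: str, dx: date, prevalence: date,
                       death: Optional[date] = None) -> Tuple[date, date]:
    """Return (start, end) of the period in which costs are computed."""
    if phase == "initial":
        return dx, add_months(dx, 12)
    if phase == "continuing":
        return add_months(prevalence, -6), add_months(prevalence, 6)
    if phase in ("final", "short survivor"):
        if death is None:
            raise ValueError("a death date is required for this phase")
        start = dx if phase == "short survivor" else add_months(death, -12)
        return start, death
    raise ValueError(f"unknown phase: {phase}")


# Example with made-up dates (not study data).
print(observation_window("continuing", date(2005, 3, 1), date(2010, 1, 1)))
```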

Note that patients who died of non-cancer causes were treated as cases with censored follow-up.

Cost indicators

Each case was linked to the three databases (HD, OPS, DP) to trace every event of interest during the observation period. The linkage was deterministic, through an anonymous personal identification code. To consider only those events that were related to breast cancer, a list of relevant events was used for each of the three databases. These lists were created by expert oncologists and referred to diagnoses, interventions and procedures coded according to the ICD-9-CM classification for HD and OPS, and to the ATC classification system for DP.
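A minimal sketch of this linkage and selection step is given below, assuming each claim is a record with a patient identifier, a service code and a cost. The field names and the placeholder codes are hypothetical and do not reproduce the lists drawn up by the oncologists.

```python
from typing import Dict, Iterable, List, Set


def link_related_claims(cohort_ids: Set[str],
                        claims: Iterable[dict],
                        related_codes: Set[str]) -> Dict[str, List[dict]]:
    """Deterministic linkage: keep claims whose patient identifier matches a
    cohort member exactly and whose code is on the cancer-related list."""
    linked: Dict[str, List[dict]] = {pid: [] for pid in cohort_ids}
    for claim in claims:
        pid = claim["patient_id"]          # hypothetical field name
        if pid in linked and claim["code"] in related_codes:
            linked[pid].append(claim)
    return linked


# Made-up example records; codes are placeholders, not real ICD-9-CM or ATC codes.
cohort = {"P1", "P2"}
related = {"CODE_A", "CODE_B"}
claims = [
    {"patient_id": "P1", "code": "CODE_A", "cost": 1200.0},
    {"patient_id": "P1", "code": "CODE_X", "cost": 80.0},   # unrelated service
    {"patient_id": "P3", "code": "CODE_A", "cost": 500.0},  # not in the cohort
]
linked = link_related_claims(cohort, claims, related)
print({pid: sum(c["cost"] for c in rows) for pid, rows in linked.items()})
```

Exact matching on the identifier means that no probabilistic matching or manual review is needed, at the price of losing any claim whose identifier is missing or mistyped.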

Costs were expressed in Euros and were defined as the direct expenditure paid by the Regional Health Authority to the health care providers (hospitals, outpatient clinics, pharmacies) as reimbursement for the services provided to a breast cancer patient.

The following indicators were used in this study:

Patient monthly cost \({c}_{i,j}\): all costs sustained in month i, i = 1,…,12, for a patient j, j = 1,…,N.

Person-months \({p}_{i,j}\): a binary variable equal to 1 if patient j is present in month i and 0 otherwise.

Patient monthly average cost \({C}_{i}\): all costs sustained on average for a patient in month i, obtained by dividing costs sustained for all patients in month i by the corresponding number of person-months, i.e. \({C}_{i}=\frac{{\sum }_{j=1}^{N}{c}_{i,j}}{{\sum }_{j=1}^{N}{p}_{i,j}}\)

Patient annual average cost \({C}_{A}\): all costs sustained on average for a patient in a year, obtained by dividing the sum of monthly costs by the sum of person-months and multiplying the ratio by 12.

Total annual cost: all costs sustained in 12 months for all patients, obtained by multiplying the patient annual average cost \({C}_{A}\) by the total number of patients.

These costs are computed by phase of care and/or by type of health care service.
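Written in the same notation as the monthly average, the verbal definitions of the last two indicators correspond to the expressions below. This is a transcription of the definitions in the text; the symbol \(T\) for the total annual cost is introduced here for convenience and is not the authors' notation.

\({C}_{A}=12\times \frac{{\sum }_{i=1}^{12}{\sum }_{j=1}^{N}{c}_{i,j}}{{\sum }_{i=1}^{12}{\sum }_{j=1}^{N}{p}_{i,j}}\)

\(T=N\times {C}_{A}\)

For a group in which every patient is observed for all 12 months, the denominator equals 12N, so \({C}_{A}\) reduces to the total cost divided by the number of patients.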

A cost profile is a series of patient monthly average costs \({C}_{i}\) computed for the three phases of care; since the study is cross-sectional and each patient contributes to one phase only, the cost profile is made of a series of monthly average costs: \({C}_{1}^{initial},\dots ,{C}_{12}^{initial},{C}_{1}^{continuing},\dots ,{C}_{12}^{continuing},{C}_{1}^{final},\dots ,{C}_{12}^{final}\).
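The indicators defined in this section can be illustrated with a small Python sketch. The function names, the list-of-lists layout (one row per patient, one column per month) and the toy figures are assumptions made for the example; they are not study data.

```python
from typing import List, Sequence


def monthly_average_costs(costs: Sequence[Sequence[float]],
                          presence: Sequence[Sequence[int]]) -> List[float]:
    """Monthly average cost: cost in month i over person-months in month i."""
    n_months = len(costs[0])
    profile = []
    for i in range(n_months):
        month_cost = sum(patient[i] for patient in costs)
        person_months = sum(patient[i] for patient in presence)
        profile.append(month_cost / person_months if person_months else float("nan"))
    return profile


def annual_average_cost(costs, presence) -> float:
    """Annual average cost: 12 x (sum of monthly costs / sum of person-months)."""
    total_cost = sum(sum(patient) for patient in costs)
    person_months = sum(sum(patient) for patient in presence)
    return 12 * total_cost / person_months


def total_annual_cost(costs, presence) -> float:
    """Total annual cost: annual average cost times the number of patients."""
    return annual_average_cost(costs, presence) * len(costs)


# Toy group: one patient observed for 12 months, one observed for 6 months only.
costs = [[100.0] * 12, [300.0] * 6 + [0.0] * 6]
presence = [[1] * 12, [1] * 6 + [0] * 6]

print(monthly_average_costs(costs, presence))
print(annual_average_cost(costs, presence))
print(total_annual_cost(costs, presence))
```

Dividing by person-months rather than by the number of patients is what lets short survivors and censored cases contribute fewer than 12 months without pulling the averages down.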

We identified homogeneous groups of patients according to the clinical and demographic variables that affect patterns of care. Each homogeneous group corresponds to a combination of age class (15–49, 50–69, 70–79, 80+) and stage at diagnosis (I, II, III, IV), and costs are computed as simple averages over the patients belonging to the same group.
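The grouping can be sketched as follows, assuming one record per patient with an age, a stage and an annual cost. The field names and the example values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def age_class(age: int) -> str:
    """Map age at prevalence to the four age classes used in the study."""
    if age < 15:
        raise ValueError("age below the youngest class (15-49)")
    if age <= 49:
        return "15-49"
    if age <= 69:
        return "50-69"
    if age <= 79:
        return "70-79"
    return "80+"


def group_average_costs(patients: Iterable[dict]) -> Dict[Tuple[str, str], float]:
    """Simple average cost per (age class, stage at diagnosis) group."""
    sums: Dict[Tuple[str, str], float] = defaultdict(float)
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for p in patients:
        key = (age_class(p["age"]), p["stage"])   # hypothetical field names
        sums[key] += p["annual_cost"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Made-up records for illustration only.
patients = [
    {"age": 45, "stage": "I", "annual_cost": 6000.0},
    {"age": 48, "stage": "I", "annual_cost": 8000.0},
    {"age": 82, "stage": "IV", "annual_cost": 5000.0},
]
print(group_average_costs(patients))
```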

Patterns of care indicators

To better describe and interpret the results on costs in the initial phase of care, the following indicators were computed by age at prevalence and stage at diagnosis: the percentages of patients receiving at least one surgical treatment, at least one chemotherapy and at least one radiotherapy, each over all patients in the initial phase of care; and the percentage of patients receiving at least one neo-adjuvant chemotherapy over patients with surgery in the initial phase. These indicators are reported in Supplementary Table 1 in the Appendix.

Statistics

Descriptive statistical methods were used to evaluate the clinical characteristics for each treatment strategy. Differences in proportions were compared using the chi-square test; linear trends in proportions were assessed with the Cochran-Armitage test for trend. All p values are two-sided; p values of less than 0.05 were considered significant. Statistical analysis was carried out using SAS 9.4 (SAS Institute, Cary, NC).
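For readers unfamiliar with the trend test, the sketch below implements the standard Cochran-Armitage statistic with a normal approximation in plain Python. It is not the authors' SAS code; the function name, the default equally spaced scores and the example counts are assumptions, and the counts are made up.

```python
import math
from typing import Optional, Sequence, Tuple


def cochran_armitage_trend(events: Sequence[int],
                           totals: Sequence[int],
                           scores: Optional[Sequence[float]] = None
                           ) -> Tuple[float, float]:
    """Return (z, two-sided p) for a linear trend in proportions.

    events[k] / totals[k] is the proportion in ordered group k.
    """
    if scores is None:
        scores = list(range(len(totals)))       # equally spaced scores 0, 1, 2, ...
    n_total = sum(totals)
    p_bar = sum(events) / n_total               # pooled proportion
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, events, totals))
    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    if variance <= 0:
        raise ValueError("trend test undefined: no variation in outcome or scores")
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Illustrative counts: patients receiving a treatment in three ordered groups.
z, p = cochran_armitage_trend(events=[10, 20, 30], totals=[100, 100, 100])
print(round(z, 3), p < 0.05)
```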

Results

Distribution of prevalent cases and total annual costs

The prevalence cohort included 49,272 patients, corresponding to 48,469 person-years. Of these (Table 2), 15.2% were in the initial phase (n = 7382), 79.7% in the continuing phase (n = 38,620) and 5.1% in the final phase (n = 2467).

Table 2 Prevalent cases by phase of care and cancer registry, patient annual average costs (in Euros), by phase of care, type of health service and cancer registry

Cases in the initial phase of care absorbed 42% of resources, cases in the continuing phase 44%, and cases in their last year of life 14% (Fig. 2).

Fig. 2 Distribution of total annual costs by phase of care and type of health service (a) and distribution of prevalent cases by phase of care (b), in the pool of Cancer Registries

Hospitalization was the most important cost driver, accounting for over 55% of the total costs, followed by outpatient services (29%) and pharmaceutical costs (16%). Hospitalization absorbed resources especially in the initial phase, while costs in the continuing phase were uniformly distributed among the three types of services.

Prevalent cases and patient annual average costs

Average annual costs per person were highest among newly diagnosed women (7577 Euros) and among patients in the last year of life (7563 Euros); in the continuing phase of care they amounted to 1507 Euros, as illustrated in Table 2.
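As a rough consistency check, the phase counts and annual average costs quoted in the text can be combined using the definition of total annual cost (average cost times number of patients). The sketch below treats the phase counts reported for the prevalence cohort as the number of patients, which is an approximation, and the function names are introduced here for illustration.

```python
from typing import Dict, Tuple

# (number of cases, patient annual average cost in Euros), as quoted in the text.
phases: Dict[str, Tuple[int, float]] = {
    "initial": (7382, 7577.0),
    "continuing": (38620, 1507.0),
    "final": (2467, 7563.0),
}


def total_annual_costs(data: Dict[str, Tuple[int, float]]) -> Dict[str, float]:
    """Total annual cost per phase: cases times annual average cost."""
    return {name: n * avg for name, (n, avg) in data.items()}


def cost_shares(data: Dict[str, Tuple[int, float]]) -> Dict[str, float]:
    """Share of the overall expenditure absorbed by each phase."""
    totals = total_annual_costs(data)
    grand_total = sum(totals.values())
    return {name: value / grand_total for name, value in totals.items()}


for name, share in cost_shares(phases).items():
    print(f"{name}: {share:.1%}")
# initial: 42.1%, continuing: 43.8%, final: 14.1%
```

Under this approximation the shares round to 42%, 44% and 14%, in line with the distribution of resources by phase reported for the pool of registries.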

This pattern varied across the CRs: in Veneto, Umbria and Latina the initial phase was more expensive than the final phase; the opposite held in Firenze-Prato, Milano, Napoli and Palermo; in Friuli Venezia Giulia costs were similar in the initial and final phases of care.

There were also some differences across CRs in the composition of costs by health service. Among newly diagnosed women, hospitalization costs were higher than outpatient costs, ranging from 60% (Latina CR) to 85% (Firenze-Prato CR) of total costs, while outpatient costs ranged from 14% (Firenze-Prato CR) to 31% (Milano CR) of total costs; the exception was Veneto CR, where hospitalization and outpatient costs were equivalent. At the end of life, hospitalization costs were four times higher than outpatient costs in the pool of CRs, ranging from 1.5 times in Veneto CR to 12 times in Firenze-Prato CR.

Cost profiles by health care service

Figure 3 illustrates the monthly cost profiles (in Euros) by type of health care service in the pool of CRs. The vertical axis represents patient monthly hospitalization costs (a) and outpatient costs (b). The horizontal axis represents time (in months) in each phase of care: in the initial and continuing phases, months 1–12 are counted from diagnosis; in the final phase, months 1–12 end with death.

Fig. 3 Cost profiles by type of health service in the pool of cancer registries

Hospitalization costs followed a U-shaped profile with two peaks: about 1600 Euros per woman in the first month after diagnosis and 1700 Euros in the last month of life. Outpatient costs were on average higher in the initial and final phases of care. During the initial phase, the curve of outpatient costs showed two peaks, in the first (160 Euros) and the seventh (225 Euros) month. Costs due to drug prescriptions were uniformly distributed over the entire disease pathway, with an average expenditure of 30 Euros per patient per month. It should be recalled that this database contains only the costs of drugs prescribed to patients and sold by pharmacies and does not contain the costs of chemotherapy administered in hospital or outpatient settings.

Prevalence and patient annual average costs by stage at diagnosis

Table 3 reports the distribution of prevalent cases and patient annual average costs in the initial phase of care by stage at diagnosis, type of cost and CR. Out of the 7359 patients in the initial phase, 42.2% were in stage I, 26.7% in stage II, 15.5% in stage III and 2.4% in stage IV. Stage was not available for 13.1% of cases.

The distribution of cases by stage varied widely across CRs: more severe cases (stages III and IV) ranged from 6% of total staged cases in Milano to about 50% in Palermo.

Health care costs showed a trend by stage at diagnosis: more advanced stages corresponded to higher average costs, with patients in stage III and IV costing 44% more than patients in early stages (about 10,000 Euros to treat patients in stage III or IV vs. about 7000 Euros for those in stage I or II). This difference was even higher (54%) when considering hospitalization costs only (about 4500 Euros per patient in stages I or II vs. about 7000 Euros per patient in stage III or IV). The trend was consistent in all CRs and for all cost components, with some variability among CRs: stage III or IV cost 24% more than stage I or II in Latina and 78% more in Palermo. This variation was even wider when considering hospitalization costs only, ranging from 27% extra costs for stage III or IV in Latina to 95% in Palermo. Veneto was an exception, with more severe patients having slightly lower hospitalization costs and higher outpatient costs than less severe patients.

Costs by phase of care and age

Patient annual average costs by age at prevalence are presented in Table 4. Nearly half of prevalent cases in the initial and continuing phases (45% and 48%, respectively) were in the 50–69 year age group, which is the target age of screening programs. Half of the cases in the final phase were elderly (80+ years), while younger patients (15–49 years) accounted for 23% of cases in the initial phase, 16% in the continuing phase and 8% in the final phase.

Table 3 Prevalent cases and patient annual average costs (in Euros) in the initial phase of care by type of health service, stage at diagnosis and cancer registry
Table 4 Prevalent cases and patient annual average costs (in Euros) by phase of care, age at prevalence and type of health service in the pool of cancer registries

Looking at costs, we generally observed an inverse relationship between age at prevalence and health care costs per patient across all phases of care and types of services.

The average cost of all services in the youngest age class was 80% and 65% higher than the cost for elderly women in the initial and continuing phases of care, respectively; in the final phase of care, the average cost for women aged 80 and over was about one fifth of the cost for young women.

Hospitalization costs for patients aged 15–69 were 30% and 40% higher than those for patients aged 70+ in the initial and continuing phases of care, respectively, and more than double in the final phase. In particular, annual hospitalization costs for patients aged 80 and over were the lowest across all phases of care.

Discussion

This paper represents the first attempt in Italy to estimate the economic burden of breast cancer at population level taking into account the entire disease pathway, from diagnosis to possible recovery or death, according to a three-phase of care framework and using individual information from multiple current health care and administrative databases.

At individual level, the results showed that the costs of cancer patients are not uniformly distributed along the disease pathway but follow a U-shape, with higher costs concentrated in the first months, when diagnostic work-up and main-course treatments are provided, and in the last year of life, when palliative care is provided. These findings are confirmed by the literature [5, 8, 10]. The cost of drug prescriptions is an exception, as it appears to be constant across the three phases of care. Note that when chemotherapy is administered in hospital or outpatient settings, its cost is not included in the drug prescription costs. Outpatient service costs within the initial phase decreased during the first months after diagnosis and then increased to a peak in the seventh month. This trend seems consistent with the care pathway: diagnostic work-up is performed in outpatient settings in the first month, followed by surgery in hospital and subsequently by chemotherapy in outpatient settings.

At aggregate level, 80% of patients belonged to the continuing phase and their total annual costs represented 44% of the total expenditure. This proportion is expected to rise, as the prevalence of breast cancer is rapidly increasing in Italy [3]. Moreover, an excess of procedures during the follow-up phase has been demonstrated, confirming the excessive utilization of eco marker, scintigraphy and PET (Positron Emission Tomography) with respect to clinical guidelines [16]. This phenomenon suggests the need for widespread adoption of evidence-based protocols for the post-therapy follow-up of patients with breast cancer.

In initial and final phases of care, hospitalization represented the main cost item, followed by outpatient services and drug prescriptions, while in the continuing phase costs were uniformly distributed. However, the recent introduction of targeted therapies (not considered in this study) could deeply modify the observed pattern.

Consistent with previous findings [14], stage at diagnosis is the main cost driver in the initial phase of care, with patients diagnosed at an advanced stage (III and IV) consuming on average about 10,000 Euros per year versus between 6000 and 7800 Euros per year for patients diagnosed at early stages (I and II). These costs vary considerably among CRs, with Palermo CR having the lowest costs across all stages, and Umbria, Latina, FVG and Napoli CRs having the highest costs in stages I, II, III and IV, respectively. It should be pointed out that in Italy the coverage of organized screening programs for breast cancer is still incomplete and varies by Region: in 2010, on average 35% of the target population underwent mammography within screening programs, but this percentage varied from 12% in Southern Regions to 55% in North-Eastern Regions [17]. The Epicost study confirms the budgetary gain from early diagnosis of breast cancer, thus supporting measures to increase the coverage of, and adherence to, organized screening programs, addressed especially to women resident in the Southern Regions.

Patterns of care and costs were also associated with age: on average, elderly patients cost about half as much as younger patients in the initial phase and one fifth as much in the last year of life. These results are possibly due to different clinical approaches: young patients tolerate more aggressive (and more expensive) treatments better and have a greater chance of surviving longer when treated aggressively, compared with elderly patients, who are more likely to have comorbidities. Additionally, elderly patients are more likely to spend time in nursing care facilities, whose costs were not included in our data.

This study has several strengths, partly related to the methodology and partly to the data used.

This is a real-world study, and the results are representative of what happens at population level. It includes as many claims data sources as were available at the time of data collection, linked at individual level with a prevalence cohort identified by the CRs.

Health care costs were obtained using a direct approach based on cancer-related procedures only, rather than using control cohorts for expenditure comparison. This made it possible to produce precise estimates of cancer-related costs that were comparable across regions.

The prevalence-based (cross-sectional) approach adopted in the study yields more up-to-date estimates than an incidence-based (longitudinal) approach. Further, the prevalence cohort is representative of a longitudinal cohort of women with breast cancer followed up for 8 years after diagnosis, without any selection with respect to prognosis or to the patients’ clinical and demographic characteristics.

Moreover, the phase-of-care framework adopted here is appealing because it simultaneously takes into account all the clinically relevant phases of the disease.

Some limitations might affect the results and should be taken into account.

Some data sources are missing in this study, as the following health care services were not included: high-cost drugs administered in hospital or outpatient settings, emergency room (ER) services, home care, nursing facilities for elderly people and hospices for terminal patients. This caused an underestimation of costs that varies according to the phase of care and the patient characteristics: home care and nursing facilities are generally provided to elderly patients and hospices to terminal patients, while ER services are not particularly relevant for a chronic disease like cancer.

In-hospital drug data were collected from all participating CRs, but they were not included in the analyses because the archives were incomplete and of poor quality. Consequently, the use of chemo- and immuno-therapeutic drugs and, more recently, of highly expensive targeted drugs was not considered. The pharmaceutical costs may therefore be underestimated. However, at the time of this study targeted therapies were not widely used; furthermore, this data source is expected to become more complete and standardized in the near future. Finally, from this experience we developed specific data check procedures that could be used in a future call for data.

The continuing phase included a mixture of patients with different clinical characteristics and patterns of care: patients in chronic conditions; patients cured from the disease; patients diagnosed with relapses. On the basis of the information collected by CRs, it is currently impossible to distinguish among these groups of patients. A further development would be to identify specific treatments and procedures targeted to specific patient categories and to use them to disentangle patterns of patients with homogeneous care needs within the continuing phase of care [18].

Several confounders might affect the geographical comparability of results. The availability of hospices, as an alternative to hospitalization in the final phase of disease, varies according to the Cancer Registry area. In addition, reimbursement for the same health care service might vary between Regions; in particular, each Region sets its own monetary value for outpatient services and establishes whether to include additional codes. Different Regions might provide the same procedure in different settings; for example, chemotherapy may be administered as an in-hospital service in some Regions and as an outpatient service in others. Stage at diagnosis is known to be the main cost driver in the first twelve months after diagnosis. CRs were asked to provide the relevant information on stage for newly diagnosed patients; however, the completeness of this information was variable, ranging from 65% in Friuli Venezia Giulia to 95% in Palermo. Lack of completeness in the information on stage was possibly caused by organizational problems in the development of the health care information system, as well as by inter-regional migration of patients: when a patient is treated outside her Region of residence, some clinical information (such as stage at diagnosis) may be lost.

Despite these limitations, and even though inter-regional comparison is not the focus of the study, some reasoning about the geographical variability of breast cancer costs might be useful to identify good practices and optimal health care organization models. One such case is Veneto, where a shift of some health care services from hospital to outpatient care is apparent and yields lower costs.

Conclusions

The approach of this study makes it possible to break down the health care budget by health care service component, phase of care and the patient characteristics that proved to be cost drivers, i.e. stage at diagnosis and age. This approach can be used by health care planners to predict the cancer burden in the near future according to specific interventions and corresponding scenarios. The model of analysis proposed here can be replicated in other Italian regions and possibly in other countries with different health care systems, provided that individual-level information on health care services and the corresponding claims is available.