Introduction

In patients with haematological and solid tumour malignancies, infections contribute to poorer clinical outcomes [1,2,3] compounded by high attributable hospitalization costs [4]. Immunosuppression related to the underlying malignancy or cancer-directed therapy, the need for indwelling medical devices (e.g. venous access catheters) and frequent contact with healthcare settings all contribute to increased risk of infection [5]. Cancer outcomes may be affected if therapy is delayed in the setting of infection [6], and high mortality attributable to infections (up to 85%) has been reported in some high-risk populations [7]. Few prior studies have comprehensively evaluated infection burden across the spectrum of malignant conditions with use of a single standardized, patient-level coded data source to inform prevention strategies in high-risk patient groups [2, 4, 5].

Internationally adapted coding systems endorsed by the World Health Organization [8] are routinely applied for hospitalized patients in Australia in accordance with the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision, Australian Modification (ICD-10-AM). These data are coded in an alphanumeric standardized framework [9] and enable sequential hospitalisations for individual patients to be identified and linked over time. Prevention and management of infections in patients with cancer requires targeted and multimodal programmes [10,11,12,13] and robust surveillance is essential to inform programme development.

Historically, data evaluating infection burden in patients with cancer have been difficult to compare, given heterogeneity in study design and data quality. There is also a paucity of literature characterizing the epidemiology of all infections stratified by malignancy type. The objectives of this study were to use ICD-10-AM codes to determine the (i) prevalence, (ii) incidence, (iii) time-trends and (iv) risk of in-hospital mortality of infectious diseases in Australian patients diagnosed with haematological and solid tumour malignancies between 2007 and 2017 in the current era of cancer therapy.

Methods

Study design and setting

This was an observational, retrospective, longitudinal cohort study of adult inpatients (≥ 18 years) diagnosed with a haematological malignancy (HM) or solid tumour neoplasm (STN) at the Peter MacCallum Cancer Centre (PMCC) from 1 January 2007 to 31 December 2017. The PMCC is a tertiary referral hospital in Victoria, Australia, with haematology, medical oncology, cancer surgery and radiation oncology services. This hospital is the largest specialized healthcare facility dedicated to the delivery of cancer care in Australia. Study design and reporting are consistent with criteria endorsed in the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE; Online Resource 1) [14] and the REporting of studies Conducted using Observational Routinely collected health Data (RECORD; Online Resource 2) Statement [15].

Data extraction

Episode-level, longitudinal data for each patient’s hospitalization were collected by the investigators. Administrative ICD-10-AM and Australian Classification of Health Interventions (ACHI) coding data and associated patient demographics and clinical characteristics for every recorded hospitalization at the PMCC were sourced from patient administration system archives (HosPro™ (2007–2010) and iPM (2011–2017)) and the Enterprise Master Patient Index system. This extract included updates to the ICD-10-AM and ACHI codes from the Fifth to the Tenth Edition.

Definitions

All inpatients in receipt of a principal diagnosis code denoting a primary malignant neoplasm (Australian Coding Standards (ACS) 0001 Principal diagnosis) [9] were included (Online Resource 3). Index hospitalization was defined as the patient’s first admission to hospital. An episode-of-care was defined as a hospitalization. An incident case was defined by the first ICD-10-AM code denoting an infectious disease from the date of index hospitalization (Online Resource 3) in receipt of either a complication (C) or present-on-admission (P) condition-onset-flag indicating time of infection onset (Vic 0048 Condition onset flag) [16].

Infection diagnostic codes (Online Resource 3) used for the current study were based on those employed for a previous large dataset analysis [17] and were adapted to the Australian healthcare setting. Any subsequent infection diagnoses in the same hospitalization were treated as secondary infectious disease events. Duplicate codes in the same hospitalization were removed. Hospitalization for autologous haematopoietic stem cell transplantation (auto-HSCT), radiotherapy, chemotherapy, brachytherapy and surgical procedures/interventions were defined by corresponding ACHI codes (Online Resource 3). Neutropenia was defined according to current coding conventions (codes D70 (Agranulocytosis) ± R50x (Fever of other and unknown origin); ACS 0109 Neutropenia). Inpatient complications were defined according to complications listed in the Charlson Comorbidity Index using well-accepted coding rules [18] (Online Resource 3).

The date of in-hospital mortality was defined as the last discharge date where a deceased flag appeared for the inpatient discharge record. The extracted dataset also contained a binary out-of-hospital death code sourced from the Victorian Births, Deaths and Marriages Registry.

To measure burden of illness and risk of in-hospital mortality associated with infection in the study cohort, HM or STN inpatients were defined as exposed and unexposed based on the presence or absence of an infection code post-index hospitalization, respectively. See Online Resource 3 for a list of exclusion codes and the episode-level data collected by the investigators.
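
The definitions above (index hospitalization, incident case and exposure status) can be illustrated with a minimal sketch. The `Episode` record, its field names and the example codes are hypothetical and are not the schema of the study extract; the sketch only assumes that each episode carries an admission date, an optional infection code and an optional condition-onset flag.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Episode:
    """Hypothetical episode-of-care record (illustrative fields only)."""
    patient_id: str
    admitted: date
    infection_code: Optional[str] = None  # infection diagnosis code, if any
    onset_flag: Optional[str] = None      # "C" (complication) or "P" (present on admission)


def index_admission(episodes):
    """Earliest admission date per patient, i.e. the index hospitalization."""
    index = {}
    for ep in episodes:
        if ep.patient_id not in index or ep.admitted < index[ep.patient_id]:
            index[ep.patient_id] = ep.admitted
    return index


def incident_cases(episodes):
    """First episode per patient with an infection code flagged C or P.

    Every episode falls on or after the index admission by construction,
    so no separate date check is needed.
    """
    first = {}
    for ep in sorted(episodes, key=lambda e: e.admitted):
        if ep.infection_code is not None and ep.onset_flag in {"C", "P"}:
            first.setdefault(ep.patient_id, ep)
    return first


def exposure_status(episodes):
    """Label each patient as exposed (any incident case) or unexposed."""
    cases = incident_cases(episodes)
    return {
        pid: "exposed" if pid in cases else "unexposed"
        for pid in index_admission(episodes)
    }
```

Under these assumptions, a patient whose second admission carries a flagged infection code would be labelled exposed, with that episode recorded as the incident case; later infection codes would not replace it.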

Statistical analyses

Infection burden, prevalence and incidence rate

Patient characteristics, procedures, all-cause 30-day age-specific in-hospital mortality and inpatient complications were compared between exposed and unexposed in the HM- and STN-cohorts using chi-squared (χ2) or Fisher’s exact and Wilcoxon rank-sum tests for categorical and non-parametric continuous covariates, respectively. Infection prevalence was calculated as the number of incident cases divided by the total number of patients stratified by underlying malignancy during the study period. Incidence rates for infection were reported and adjusted per 10,000 occupied bed days (OBD) as a measure of person-time. Exact binomial 95% confidence intervals (CI) were calculated for estimated proportions of prevalence and incidence. The burden of prognostic comorbid conditions was quantified using the Charlson Comorbidity Index [18] (Online Resource 3).
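
As an illustration of the prevalence and incidence calculations described above, the following sketch computes an exact binomial (Clopper-Pearson) interval and a rate per 10,000 OBD using only the Python standard library. It is not the Stata routine used in the study; the function names are illustrative, and the bisection solver is an assumption made to avoid external dependencies.

```python
import math


def _log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in a numerically safe way."""
    return min(1.0, sum(math.exp(_log_binom_pmf(i, n, p)) for i in range(k + 1)))


def _bisect_decreasing(f, target, iters=50):
    """Solve f(p) = target on (0, 1) for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x cases among n patients."""
    lower = 0.0 if x == 0 else _bisect_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else _bisect_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


def rate_per_10000_obd(events, occupied_bed_days):
    """Incidence rate expressed per 10,000 occupied bed days."""
    return events * 10_000 / occupied_bed_days
```

For example, `exact_binomial_ci(840, 953)` uses the auto-HSCT counts reported in the Results and should land close to the published interval, while `rate_per_10000_obd(50, 4000)` (hypothetical counts) gives 125 infections per 10,000 OBD.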

Time trends

Quarterly time trends were evaluated for coded infections using autoregressive integrated moving average (ARIMA(p,d,q)) time-series models capable of filtering out high-frequency noise in non-stationary time-series data [19]. Competing ARIMA(p,d,q) processes were compared using the Akaike and Bayesian Information Criterion. Potential seasonality in infection rates over the calendar year was assessed using the Walter and Elwood test of seasonality [20].
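
The comparison of competing ARIMA(p,d,q) processes by information criteria can be sketched as follows. The log-likelihoods, parameter counts and candidate orders are hypothetical values chosen for illustration, not results from this study, and conventions for the number of parameters and effective observations differ between software packages.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fitted models: (p, d, q) -> (log-likelihood, estimated parameters).
candidates = {
    (1, 1, 0): (-120.4, 2),
    (0, 1, 1): (-119.8, 2),
    (2, 1, 1): (-118.9, 4),
}

# Assumption: 44 quarters (2007 to 2017) with one observation lost to first differencing.
n_obs = 43

best_by_aic = min(candidates, key=lambda order: aic(*candidates[order]))
best_by_bic = min(candidates, key=lambda order: bic(*candidates[order], n_obs))
```

Lower values indicate the preferred model under each criterion; BIC penalizes additional parameters more heavily than AIC once the series has more than about seven observations.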

Risk of in-hospital mortality

To compare time to in-hospital mortality (in months) between the exposed and unexposed from the date of index hospitalization in admitted episodes subsequent to the hospitalization with coded infection, Kaplan-Meier analyses and Cox regression, adjusted for gender and age (in years), were used. Candidate predictors for the multivariate Cox model were screened using univariate regression, with covariates with p < 0.20 considered for inclusion, and hazard proportionality was assessed via a proportional hazards test.
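
The Kaplan-Meier (product-limit) estimate underlying the survival comparison can be sketched in a few lines. This is a minimal standard-library illustration with toy follow-up data, not the Stata procedure used in the study, and it does not cover the Cox regression or the proportional hazards test.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up from index hospitalization (e.g. in months)
    events: 1 if in-hospital death was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

With toy data `times = [2, 3, 3, 5, 8]` and `events = [1, 1, 0, 1, 0]`, the estimate steps down at months 2, 3 and 5; censored patients leave the risk set without changing the survival probability.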

All statistical analyses were undertaken using Stata/SE v15.1 software (StataCorp® LLC, College Station, Texas, U.S.A.). A two-sided p value < 0.05 was considered statistically significant.

Ethics

Ethics approval was granted by the Peter MacCallum Cancer Centre Human Research Ethics Committee (project number 18/72R).

Results

Study cohort

A total of 24,391 patients identified from 49,656 hospitalisations were captured between 1 January 2007 and 31 December 2017. Ninety-nine patients (899 hospitalisations) with no record of a primary malignant tumour diagnosis and 2887 patients (3641 hospitalisations) with a benign neoplasm were excluded from the study. Of 21,405 patients treated for a primary malignant neoplasm, 3033 (14%) and 18,372 (86%) were diagnosed with a HM and an STN, respectively (Fig. 1). Within the HM-cohort, 953 (31%) patients underwent auto-HSCT (Fig. 1).

Fig. 1

CONSORT diagram of studied patients and associated hospitalisations. HM, haematological malignancy; HSCT, haematopoietic stem cell transplantation; STN, solid tumour neoplasm

Infection burden, prevalence and incidence rate

Among the eligible study cohort, the infection prevalence was 35% (95% CI 34–35%; N = 7403). In subgroups with HM and STN, 1997/3033 (67% [95% CI 64–68%]) and 5406/18,372 (29% [95% CI 28–30%]) were coded with an infection post-neoplasm diagnosis, respectively (Fig. 1). For HM patients undergoing auto-HSCT, infection prevalence was high (N = 840; 88% [95% CI 86–90%]). The distribution of infectious disease diagnoses stratified by underlying malignancy is presented in Fig. 2 and Online Resource 4. Among the HM-cohort, infection prevalence was highest in patients with acute myeloid leukaemia (N = 343; 85% [95% CI 81–89%]) and multiple myeloma (N = 583; 78% [95% CI 75–81%]). Among the STN-cohort, infection prevalence was highest in patients with cancer of the heart, mediastinum and pleura (N = 23; 56% [95% CI 40–72%]), bone and articular cartilage of limbs (N = 160; 53% [95% CI 22–43%]) and Kaposi sarcoma (N = 5; 50% [95% CI 19–81%]).

Fig. 2

Prevalence (in percent) with 95% confidence intervals of all coded infectious diseases stratified by underlying malignancy diagnosis. Small and large red dashed lines denote the pooled average infection prevalence rate in the HM- and STN-cohorts, respectively. STN, solid tumour neoplasm

Incidence rates are presented in Fig. 3 and Online Resource 4. Among the HM-cohort, the incidence of coded gastrointestinal tract infections (GTI) was the highest, occurring most frequently in patients with multiple myeloma (192 [95% CI 174–211] per 10,000 OBDs), followed by Hodgkin and non-Hodgkin lymphoma (129 [95% CI 93–175] per 10,000 OBDs). Bloodstream infections (BSI) were most frequently identified in Hodgkin lymphoma (123 [95% CI 88–168] per 10,000 OBDs) and myelodysplastic syndrome patients (118 [95% CI 84–104] per 10,000 OBDs). Lower respiratory tract infection also occurred frequently in the HM-cohort, predominating in the chronic myeloid leukaemia (143 [95% CI 110–182] per 10,000 OBDs) and myelodysplastic syndrome (106 [95% CI 74–147] per 10,000 OBDs) subgroups. Among the STN-cohort, overall infection incidence rate was lower compared to the HM-cohort. GTIs constituted 24% of infection diagnoses in STN patients, occurring most frequently in patients with a diagnosis of pancreatic cancer (138 [95% CI 103–180] per 10,000 OBDs) and cancer of the lip and oral cavity (124 [95% CI 106–143] per 10,000 OBDs). Bloodstream infections had the second highest overall incidence (0–123 per 10,000 OBDs). Lower respiratory tract infection had the third highest overall incidence occurring most frequently in patients with lung cancer (194 [95% CI 189–199] per 10,000 OBDs). Age-specific infection incidence rates were highest in the 60–69 years group among the HM-cohort (741 [95% CI 713–769] infections per 1000 persons) and the 18–29 years group among the STN-cohort (386 [95% CI 343–430] infections per 1000 persons) (Online Resource 5).

Fig. 3

Heat map of infection incidence adjusted per 10,000 occupied bed-days, stratified by underlying malignancy diagnosis, 2007–2017. Refer to Online Resource 4 for incidence rates with 95% confidence intervals. ANS, autonomic nervous system; HIV, human immunodeficiency virus; PNS, peripheral nervous system; STI, sexually transmitted infection

Inpatient length of stay was more than five times greater in exposed than unexposed patients with HM (22 days versus 4 days; p < 0.001) and more than three times greater in patients with STN (15 days versus 4 days; p < 0.001). Among the HM-cohort, a significantly higher proportion of exposed than unexposed patients were diagnosed with concurrent neutropenia (73% versus 20%; p < 0.001), with a similar trend in the STN-cohort (15% versus 2.84%; p < 0.001). Admission to the intensive care unit (ICU) was more frequent in exposed than unexposed patients in both the HM- and STN-cohorts (Table 1). Median length of stay in ICU was approximately three times greater in HM-exposed (67 h) than HM-unexposed patients (22 h; p < 0.001) and approximately 1.9 times greater in STN-exposed (47 h) than STN-unexposed patients (25 h; p < 0.001). A significantly greater proportion of exposed than unexposed patients required mechanical ventilation in the HM (4.21% versus 0.29%; p < 0.001) and STN cohorts (4.64% versus 0.79%; p < 0.001), as well as haemodialysis (HM 2.00% versus 0.10%, p < 0.001; STN 0.28% versus 0.20%, p < 0.001). The incidence of inpatient complications was consistently higher in exposed than unexposed patients for both the HM and STN cohorts (Table 1). All-cause, 30-day, age-specific, in-hospital mortality was highest in exposed patients aged 70–79 years in both the HM (174 deaths per 1000 persons) and STN (134 deaths per 1000 persons) cohorts (Table 1).

Table 1 Characteristics of inpatients with coded infection compared to no coded infection

Time trends

Among the HM cohort, increasing linear quarterly infection rates were observed for GTI (2.11 per 10,000 OBDs per quarter; p = 0.573), BSI (1.06 per 10,000 OBDs per quarter; p = 0.002), genitourinary tract infection (0.22 per 10,000 OBDs per quarter; p = 0.907) and lower respiratory tract infection (0.17 per 10,000 OBDs per quarter; p < 0.001). Decreasing quarterly rates were observed for invasive fungal disease, other infection, skin and soft tissue infection and upper respiratory tract infection (Online Resources 6 and 7).

Among the STN cohort, increasing quarterly infection rates were observed for BSI (1.07 per 10,000 OBDs per quarter; p < 0.001), GTI (1.04 per 10,000 OBDs per quarter; p < 0.001), invasive fungal disease (0.13 per 10,000 OBDs per quarter; p = 0.901) and upper respiratory tract infection (0.03 per 10,000 OBDs per quarter; p < 0.001). Decreasing quarterly rates were observed for genitourinary tract infection, lower respiratory tract infection, other infection and skin and soft tissue infection (Online Resources 6 and 7).

No statistically significant seasonality in overall infection rates was observed in the STN (p = 0.823) or HM (p = 0.761) cohorts. Monthly peaks were observed in April and December among the STN and HM cohorts, respectively (Online Resource 8).

Risk of in-hospital mortality

Risk of in-hospital mortality was higher in the STN than HM cohort. The adjusted hazard ratio (aHR) for infection-onset post-index hospitalization for STN patients was 1.61 (95% CI 1.41–1.83; p < 0.001) (Fig. 4). The aHR was 1.30 (95% CI 0.90–1.90; p = 0.166) in the HM cohort (Online Resource 9). After stratifying by infection, risk of mortality was highest for patients with genitourinary tract infection in the HM cohort (aHR 3.39; 95% CI 0.46–25; p = 0.231) and lower respiratory tract infection in the STN cohort (aHR 3.13; 95% CI 2.64–3.71; p < 0.001) (Online Resource 10).

Fig. 4

Kaplan-Meier survival curves illustrating overall survival (in months) from index hospitalization between solid tumour neoplasm patients with coded infection (exposed) and patients without coded infection (unexposed). CI, confidence interval; STN, solid tumour neoplasm

Discussion

To our knowledge, this is the largest study evaluating infection prevalence, incidence, time trends and risk of in-hospital mortality in patients with haematological- and solid-tumour malignancies through use of administratively coded data. The most salient findings were: (i) high prevalence of infection in auto-HSCT recipients (88%), followed by patients with HM (67%) and STN (29%); (ii) high incidence rate of GTI, BSI and invasive fungal disease; and (iii) higher risk of in-hospital mortality in STN and HM patients with coded infection. These findings underscore the importance of understanding the local infection epidemiology and outcomes in patients with cancer, and ensuring that appropriate infection prevention, screening, prophylaxis and treatment strategies are in place [10, 21, 22].

Our data indicate that 67% (N = 1997) of hospitalized HM patients were diagnosed with at least one infection, which is comparable to local [23] and international [24] reports. Contrary to prevalence rates spanning 21% to 43% in earlier reports [25, 26], we note that 88% (N = 840) of auto-HSCT recipients received an infection code from index hospitalization (Fig. 1). Wider eligibility for aggressive antineoplastic therapies, including auto-HSCT in older and more vulnerable cancer patients at the PMCC, may be one explanation for this observation.

The higher incidence of infection in patients with haematological malignancies compared to solid tumour neoplasms is well accepted [27, 28]. In keeping with local [23] and international [29] estimates, patients with HM experience a higher rate of infectious complications (N = 1997; 67% [95% CI 64–68%]) compared to patients with solid tumour neoplasms (N = 5406; 29% [95% CI 28–30%]). Concordant with Valentine et al. [2], the HM cohort described here was characterized by higher comorbid indices than STN patients from index hospitalization, reflected by the presence of neutropenia (73% [N = 1457] versus 15% [N = 816]; p < 0.001) and the higher median [IQR] duration in ICU (67 h [36–143] versus 47 h [23–90]; p < 0.001) (Table 1). Although speculative, our observations suggest that the reasons for the higher incidence estimates in patients with HM compared to STN are multifactorial, attributed, in part, to higher rates of underlying comorbidities in combination with intravenous administration of immune-modulating multi-agent chemotherapy as detected in this study (HM: 74% [N = 1479]; STN: 21% [N = 1146], p < 0.001) (Table 1). This is further supported by a 1.98 times increased risk of infection in patients with chemotherapy-induced neutropenia described in Li et al. [30].

We detected a high incidence of GTI, BSI and invasive fungal disease. While higher risk for bloodstream [27] and fungal infections [2] in cancer patients is widely accepted, reasons for the observed high rate of coded GTI events are not clear. These codes may be applied in the setting of intestinal mucosal injury, together with neutropenic enterocolitis, both of which may be more frequent in patients with HM [31]. Similarly, reasons for the high incidence of GTI observed in patients with pancreatic cancer (138 [95% CI 103–180] GTIs per 10,000 OBDs) and cancer of the oral cavity (124 [95% CI 106–143] GTIs per 10,000 OBDs) are unclear, but may be a reflection of codes applied to hospital admission episodes where surgical resections, radiation to the head and neck and concurrent chemoradiation are routine for treatment of solid tumours [32]. Despite efforts made to standardize infectious disease coding classification [17], there is no consensus supported by the Australian Coding Standards [9] for defining GTI, nor neutropenic fever, as per current ICD-10-AM coding conventions. This is reflected by inclusion of non-discriminating unspecified codes denoting GTI, including codes A08 (Viral and other specified intestinal infections) and A09 (Other gastroenteritis and colitis of infectious and unspecified origin) coupled with D70 (Agranulocytosis), representing 62% and 51% of all GTI codes in the HM and STN cohorts, respectively, and likely contributing to the high incidence rates observed in this study (Fig. 3).

Poor sensitivity of administrative coding data may be a contributing factor to the high incidence of infection detected in this study due to nuances in clinical coding methodology. A low to moderate sensitivity, in combination with a low positive predictive value, may result in clinical overcoding and an overestimation of disease burden. Earlier studies reported that ICD-9 and ICD-10 codes overestimated the rate of drain-related meningitis [33] and surgical site infections [34] at three and four times the true incidence, respectively. More broadly, Stamm and colleagues [35] reported an overall sensitivity and positive predictive value using ICD-9-CM for detection of healthcare-associated infections of 0.18 and 0.57, respectively, and advocated the use of linked data to improve existing methods of infection surveillance. Reasons for overcoding of infection include incomplete or illegible discharge summaries, clinical coder experience, hybrid medical charts (i.e. electronic and paper-based) and the disconnection between standardized surveillance definitions [34] and ICD-10-AM coding conventions [36]. The aetiology of infections may also be poorly classified using administrative codes, given that ICD-9/10 coding does not extensively capture all pathogens, and that rarer organisms may be responsible for infections in immunocompromised cancer cohorts [37]. Looking ahead, consideration needs to be given to the myriad coding artefacts used for defining infection, as well as unfamiliarity with consensus surveillance criteria [34] and clinical coder experience [36] as strategic imperatives to safeguard coding data integrity and to maintain coding accuracy of infections in highly prevalent populations.

From 2007 to 2017, an increased rate of coded infections was observed for BSI, GTI, genitourinary tract infection, invasive fungal disease, and lower and upper respiratory tract infections. Our modelling revealed the highest rate increase in GTI among both the HM (2.11 GTIs per 10,000 OBDs per quarter; p = 0.573) and STN (1.04 GTIs per 10,000 OBDs per quarter; p < 0.001) cohorts, followed by BSI. Few studies have reported these trends in hospitalized cancer cohorts. Immunotherapy-related colitis and diarrhoea are common complications associated with the increasing use of novel immune-modulating therapies [38], which may be miscoded as GTIs, likely accounting for the increasing rate observed in this study. Further, an ageing population of patients receiving more systemic immunosuppressive treatment for solid tumours [39] from 2007 to 2017 may also contribute to the high rates detected in this work.

This study is the first to describe a higher overall survival among HM versus STN exposed patients with infection. Our risk of in-hospital mortality estimates (STN: aHR = 1.61 [95% CI 1.41–1.83]; p < 0.001; HM: aHR = 1.30 [95% CI 0.90–1.90]; p = 0.166) are in keeping with a higher overall case-fatality rate (STN: 38.6%; HM: 12.1%; p < 0.001) described in Marín et al. [27], and are further supported by a threefold higher mortality rate in STN compared to HM patients [28]. Our findings likely reflect improvements in supportive care for patients with HM, such as use of antimicrobial prophylaxis regimens [40], improved diagnostic investigations [13], availability of clinical guidelines and sepsis pathways [10, 12, 21], infection prevention strategies [41], improved access to infectious disease consultations [42] and differing disease state and treatment goals at the time of admission (i.e. more terminal care for solid tumour than haematological malignancy patients). Despite widespread use of cytotoxic chemotherapy, new strategies have emerged for the treatment of haematological malignancies, including biologic therapies and radiotherapy ablative doses with modern conformational techniques [43], which have shown considerable improvements in long-term survival. In drawing these parallels, it is likely that many infections in STN patients may have occurred in patients with poor prognoses, meaning it is difficult to attribute death due to advanced cancer versus infection. Our findings suggest the need for ongoing evaluation of infection risk in patients with solid tumours to better understand both the burden and outcomes of infection in these patients.

There are several limitations to this study. The use of ICD-10-AM codes as a measure of disease burden has been criticized when applied to non-cancer populations [44, 45]. However, large dataset analyses and validation of coded infections have not been performed for cancer populations, in whom infection risk is likely to be higher than other populations [46]. We believe our findings assist with estimating disease burden and providing relative estimates for sub-populations of patients with specific malignancies. We acknowledge that future studies are required to assess the validity of coded datasets in cancer populations, in order to evaluate the quality of ICD-10-AM codes and concordance with accepted definitions for infection (e.g. EORTC/MSG [47] and CDC/NHSN [48] criteria). Secondly, this is a single-site study covering a large and diverse patient population, potentially limiting the generalisability of the study findings to other hospital settings. Thirdly, a moderately high prevalence of radiation proctitis in STN-exposed patients (12%; Table 1) may correlate with coded GTI in the 37% of STN-exposed patients receiving radiotherapy. Fourthly, classification of the grade of infection and the severity of neutropenia (including neutrophil cell count) is not performed through current coding practices (Australian Coding Standards), and could therefore not be specifically evaluated in the current study. Finally, there is no consensus in the literature regarding the most appropriate denominator data for reporting standardized-adjusted rates of invasive fungal disease as a measure of patient exposure [1]. Strengths of this study include the fact that the study site is the largest public cancer specialized tertiary hospital in Australia, enabling a large and clearly defined cohort to be evaluated.
Although modifications to infection prevention, screening, prophylaxis and treatment strategies at our centre may correlate with changes in infection rates spanning 2007 to 2017, our data also represent a significant period (11 years), enabling longitudinal trends to be examined. Non-stationarity in the time-series data motivated the choice of ARIMA(p,d,q) simulations over other time-series models due, in part, to transformation of the series into a stationary one based on determination of optimal differencing orders (d) and estimation of model parameters in the autoregressive (p) and moving average (q) polynomials [49]. Despite local data indicating high rates of infectious complications in cancer patients who underwent cytoreductive surgery [50], our data cannot be used to reliably measure prevalence rates of surgical site infection due to uncertainty in causal inference as the exact timing of infection diagnosis and surgical intervention cannot be elucidated from the coding data alone. Derivation of surgical site infection rates in cancer patients would require an in-depth case review of medical charts and microbiology reports, taking the time of surgery and specimen diagnosis into consideration [48].
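
The role of the differencing order (d) in rendering a series stationary can be illustrated with a minimal sketch. The toy series below is hypothetical and is not study data; the function simply applies d rounds of first differencing, which is the transformation an ARIMA(p,d,q) model performs before the autoregressive and moving average terms are estimated.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing: y[t] - y[t-1]."""
    values = list(series)
    for _ in range(d):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


# Hypothetical quarterly rates with an accelerating upward trend.
toy_rates = [10, 12, 15, 19, 24]

first_diff = difference(toy_rates, d=1)   # trend remains: [2, 3, 4, 5]
second_diff = difference(toy_rates, d=2)  # constant: [1, 1, 1]
```

In this toy case one round of differencing still leaves a trend, whereas two rounds yield a constant series; each round shortens the series by one observation.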

In conclusion, this study estimates the relative burden of infections across a broad range of hospitalized immunocompromised cancer patients in the current era of cancer therapy. In rank order, gastrointestinal infections, bloodstream infections, and lower respiratory tract infections were most frequently reported, followed by invasive fungal disease. In particular, a predominance of coded gastrointestinal and bloodstream infections was identified in patients with haematological malignancy (specifically multiple myeloma and Hodgkin lymphoma cohorts). Our findings support the need to maintain good monitoring and prevention strategies, and to consider the impetus for targeted and customized surveillance in specific high-risk patient populations. Although we recognize the benefit of using ICD-10 codes to enable meaningful comparison of infection epidemiology internationally, future studies must validate the quality of hospital-level administrative data for infection monitoring in high-risk populations.