Introduction

Identification of older people at higher risk of adverse drug reactions (ADRs) is important for several reasons. ADRs are highly prevalent in hospitalized older people (on average 11% in one systematic review [1]), cause significant hospitalization in older people [2], and are mostly predictable [3] and therefore most likely preventable. In a study of ambulatory elders, Gurwitz et al. reported a prevalence of adverse drug events (ADEs) of 50.1 per 1000 person years, over one quarter of these ADEs being considered preventable [4]. More recently, Hamilton et al. [5] reported that of 219 ADEs considered causal or contributory to acute hospitalization in 600 consecutive older patients, almost 69% were preventable. ADRs and ADEs account for at least 6% of all adult acute hospitalizations [6] and in older people are a recognized major cause of serious morbidity, and possibly up to 20% of all in-patient deaths [7].

Although ADRs/ADEs represent an increasingly important cause of harm to older people in hospital, two large prospective studies indicate that in 42 and 46.5% of cases, ADRs/ADEs were entirely preventable [8, 9]. Recent systematic reviews indicate that female gender, multi-morbidity and polypharmacy are consistent ADR/ADE risk factors in older people [10, 11].

Prospective studies of incident ADRs during acute illness hospitalization in older people are relatively few, mainly undertaken in Europe and India. O’Connor et al. reported an ADR incidence of 26% in 513 unselected older patients during admission for acute illness in a tertiary referral hospital in Ireland [12]. Lattanzio et al. reported an average ADR incidence of 11.5% in older patients during hospitalization across a group of Italian acute medical centres [13]. They identified female gender (odds ratio, OR 2.29, 95% CI 1.18–4.45), number of medications (OR 1.12, 95% CI 1.06–1.18) and the combination of a falls history plus dependency in one or more activities of daily living (OR 2.18, 95% CI 1.13–4.19) as the significant risk factors associated with ADRs. A recent British prospective study of 560 patients aged over 80 years reported an overall ADR incidence of 13.2%, two thirds of these ADRs being considered preventable [14]. In a systematic review of 21 prospective ADR incidence studies of hospitalized adults of all ages in India, Patel and Patel identified older age group, female gender and polypharmacy as important risk factors for incident ADRs [15]. In another prospective study of older patients admitted to non‐surgical wards in two large Indian teaching hospitals, Harugeri et al. identified ADRs in 32.2% of 920 patients; 53.9% of the ADRs were of at least moderate severity and almost half (48.4%) were preventable [16].

Potentially inappropriate medications (PIMs) represent another important risk factor for ADRs and ADEs in multi-morbid hospitalized elders [5, 17,18,19]. In addition, ADR risk in this population is increased by chronic kidney disease [20, 21] and liver disease [22], which are often subclinical in older people. Some physiological changes of ageing (e.g. increased body fat composition and blood–brain barrier permeability) and age-related frailty predispose to ADRs [23]. Also, some commonly prescribed drugs, i.e. anticoagulants, insulin, oral antidiabetic drugs, antiplatelet agents, diuretics, calcium channel blockers, digoxin and nonsteroidal anti-inflammatory drugs heighten ADR Risk in Older People (ADRROP) [24, 25].

Although multi-morbidity, female gender, polypharmacy and PIMs appear to be consistent risk factors for ADRs and ADEs in older people, currently there are no widely used systematic methods for quantifying ADR/ADE risk on an individual basis. A recent systematic review by Stevenson et al. [26] examined the literature on ADR risk prediction models specifically designed for older people. Four models met the inclusion criteria of that review [27,28,29,30]. However, none of the four models had achieved all of the key stages of accurate risk prediction model creation, i.e. development, validation, impact and implementation. Among the four ADR risk predictive models, the area under the curve (AUC) values ranged from 0.623 to 0.73, i.e. modest to moderately good ADR risk prediction.

Of the four ADR risk prediction models identified by Stevenson et al., the one developed by the GerontoNet Research Consortium [29] has received the greatest attention in the literature. This 10-point ADR risk scale includes the following risk factors: ≥ 4 co-morbid conditions (1 point), heart failure (1 point), liver failure (1 point), 5–7 daily drugs (1 point), ≥ 8 daily drugs (4 points), previously documented ADR (2 points) and renal failure, i.e. estimated GFR < 60 ml/min/BSA (1 point). The GerontoNet ADR risk scale was validated prospectively in a population of 483 patients recruited from four hospitals in four separate European countries with a reported AUC value of 0.70. In an accompanying commentary to the GerontoNet publication, Schneider and Campese describe the GerontoNet ADR risk scale as “a valuable tool for clinicians to assess the risk of adverse drug reactions … in an older population” [31].
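The point allocation listed above can be sketched as a short function. This is only an illustration of the arithmetic as described in this paragraph, not the GerontoNet authors' own implementation; the function name, parameter names and the example patient are assumptions made for the sketch.

```python
def gerontonet_score(comorbidities: int, heart_failure: bool, liver_failure: bool,
                     daily_drugs: int, previous_adr: bool, egfr: float) -> int:
    """Sum the GerontoNet points exactly as listed in the paragraph above."""
    score = 0
    if comorbidities >= 4:      # >= 4 co-morbid conditions: 1 point
        score += 1
    if heart_failure:           # heart failure: 1 point
        score += 1
    if liver_failure:           # liver failure: 1 point
        score += 1
    # The two drug-count bands are mutually exclusive.
    if daily_drugs >= 8:        # >= 8 daily drugs: 4 points
        score += 4
    elif daily_drugs >= 5:      # 5-7 daily drugs: 1 point
        score += 1
    if previous_adr:            # previously documented ADR: 2 points
        score += 2
    if egfr < 60:               # renal failure (estimated GFR < 60): 1 point
        score += 1
    return score


# Hypothetical patient, for illustration only.
example = gerontonet_score(comorbidities=5, heart_failure=False, liver_failure=False,
                           daily_drugs=9, previous_adr=True, egfr=45.0)
print(example)
```

Because the two drug-count bands cannot both apply, the maximum attainable total is 10, which matches the description of the scale as a 10-point instrument.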

However, when O’Connor et al. applied the GerontoNet ADR risk scale prospectively in a comparable population of 513 unselected acutely ill hospitalized older patients in Ireland [12], they found that it had weaker predictive power (AUC = 0.62) than that reported in the original GerontoNet study. They also found the following variables to be significantly associated with ADRs in their study: (i) estimated GFR < 60 ml/min/1.73 m² (OR 1.81, 95% CI 1.12–2.92), (ii) increasing number of medications (OR 1.09, 95% CI 1.02–1.17), (iii) inappropriate medications (OR 2.40, 95% CI 1.26–4.50) and (iv) age ≥ 75 years (OR 2.12, 95% CI 1.23–3.70).

One of the shortcomings of the GerontoNet ADR risk scale was that the derivation population of 5936 patients, whilst substantial in size, was recruited during surveys between 1993 and 1997, i.e. 11–15 years before the risk scale was devised. Also, the reported ADR incidence was 11.6%, substantially lower than the 26% incidence observed by O’Connor et al. [12]. Furthermore, the prospective validation cohort of 483 patients represented less than 10% of the risk scale derivation patient cohort of 5936 patients.

Given the limitations of geriatric ADR risk prediction models to date, we set out to derive a new ADR risk prediction model for older people in hospital and to define its predictive power compared to the GerontoNet ADR risk assessment model [29].

Patients and methods

We merged four separate study databases of older patients admitted with acute illness to a large academic teaching hospital between 2008 and 2012, i.e. a total of 2217 patients in whom ADRs were ascertained according to the WHO ADR definition, i.e. a response to a drug that is noxious and unintended and that occurs at doses normally used in man for prophylaxis, diagnosis or therapy of disease or the modification of physiological function [32].

WHO-UMC criteria were used to define ADR causality [33] in these four studies. We defined non-trivial ADRs as those that (i) required immediate discontinuation of the culprit drug, or (ii) caused prolongation of hospital stay by > 48 h, or (iii) required urgent administration of an antidote or resuscitative treatment, or (iv) caused major derangement of blood biochemistry or haematology data, or (v) caused permanent disability or (vi) resulted in death. The prevalence of non-trivial ADRs at admission was consistent across the four studies, i.e. ranging from 23.9 to 26%. Table 1 shows demographic and clinical details of these four constituent databases, details of which have been previously published as separate papers [5, 12, 34, 35].

Table 1 Clinical characteristics of patient databases according to study of origin (total n = 2217)

Inclusion and exclusion criteria were very similar in the four studies and patients studied were broadly representative of the older in-patient population being managed on general medical or surgical hospital wards. Specifically, patients admitted under the care of specialists in Geriatric Medicine, Palliative Medicine, Psychiatry or Intensive Care were excluded to avoid data contamination. Those ADRs that were defined as ‘definite’ or ‘probable’ according to WHO-UMC criteria were added to the database for analysis and all ADR ascertainment was performed by a trained physician. In cases of uncertainty, ADRs were further evaluated by a senior physician in Geriatric Medicine with expertise in pharmacotherapy. The research physicians assessing ADRs had full access to patients’ clinical records, laboratory results and radiology reports. Patients’ medication lists were reconciled by collateral drug history taking from their carers, community pharmacists or primary care doctors where required.

This combined dataset was divided into two parts:

(a) An approximate 3/4 portion (n = 1687) for the purpose of ADR risk factor derivation, OR calculation and ADRROP scale construction (derivation cohort).

(b) An approximate 1/4 portion (n = 530) for the purpose of prospective validation of ADRROP (validation cohort).

Demographic and clinical details of the derivation and validation cohorts are shown in Table 2. There were no significant differences between these cohorts for any of the variables considered. Randomization to derivation cohort or validation cohort was stratified by the study of origin. We performed multiple logistic regression analysis of the derivation cohort patients to determine those risk factors that were significantly and independently associated with ADRs.
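A random split stratified by study of origin, as described above, might look like the following minimal sketch. The toy patient records, the field name `study`, the fixed seed and the exact 25% validation fraction are all assumptions for illustration; the procedure actually used for the 1687/530 split is not specified beyond stratification by study.

```python
import random
from collections import defaultdict


def stratified_split(patients, stratum_key, validation_fraction=0.25, seed=0):
    """Randomly split records into derivation and validation sets within each stratum."""
    rng = random.Random(seed)           # fixed seed so the split is reproducible
    by_stratum = defaultdict(list)
    for patient in patients:
        by_stratum[patient[stratum_key]].append(patient)

    derivation, validation = [], []
    for stratum in sorted(by_stratum):  # deterministic order over strata
        group = by_stratum[stratum][:]
        rng.shuffle(group)
        n_validation = round(len(group) * validation_fraction)
        validation.extend(group[:n_validation])
        derivation.extend(group[n_validation:])
    return derivation, validation


# Hypothetical toy data: four studies of 40 patients each.
patients = [{"id": i, "study": f"study_{i % 4}"} for i in range(160)]
derivation, validation = stratified_split(patients, "study")
print(len(derivation), len(validation))
```

Splitting within each study keeps the proportion of patients from each source database the same in both cohorts, which is the purpose of stratifying.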

Table 2 Details of the patient cohorts used for ADRROP development and ADRROP validation

Development of ADRROP

We used those risk factors independently and significantly associated with prevalent ADRs to construct the ADRROP risk prediction model. Possible predictors of ADRs were selected on the basis of available study data and previously established risk factors from published reports, including the ADR risk factors that make up the GerontoNet ADR risk scale [29].

The relationships of continuous variables with ADR occurrence were examined by initial parameterization of the variable as an ordinal, typically in quintiles or deciles of its distribution. Visualization of the resulting trends over the examined range of values determined whether the variable was entered as a continuous or categorical variable and the cutoff limits used. We examined categorical variables using the most clinically relevant reference value as the comparator group unless this was substantially smaller than other categories examined. Where possible, without substantive loss of explained variability, we simplified categorical variables into binary variables.

We examined the strength and significance of candidate predictors for ADR occurrence by means of uni-variate logistic regression. A parsimonious multivariate model was built, based on forced retention of known variables of scientific relevance and those selected from the remaining group of candidate variables, using backwards likelihood ratio stepwise regression. The variables selected for retention in the multivariate model were still retained by backward stepwise regression even when their retention was not forced. Potentially significant variable interactions were then examined.

The β coefficients of variables retained in the final model were inspected and converted to scores by rounding to the nearest 0.5 of a unit. To facilitate ease of calculation and to avoid the use of fractions, all scores were doubled, giving rise to a total potential score range that included only whole numbers. We then examined the newly developed scale for degree of explained variance using the Nagelkerke R² statistic and for calibration using the Hosmer and Lemeshow test. The ADRROP score was further refined by examining ADR occurrence rates across quartiles of the distribution of ADRROP scores, and tested for discrimination efficacy using AUC analysis.
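The rounding and doubling step can be illustrated with a few lines of code. This is a sketch of the arithmetic only: the coefficient values and risk-factor names below are hypothetical and are not the values in Table 3, and rounding exact ties upwards is an assumption, since the text does not say how ties were handled.

```python
import math


def to_points(beta: float) -> int:
    """Round a coefficient to the nearest 0.5, then double it to give a whole-number score."""
    nearest_half = math.floor(beta * 2 + 0.5) / 2   # nearest 0.5; exact ties round up (assumption)
    return int(nearest_half * 2)                    # doubling removes the fraction


# Hypothetical coefficients, for illustration only.
betas = {"risk_factor_a": 0.82, "risk_factor_b": 0.26, "risk_factor_c": 1.30}
points = {name: to_points(beta) for name, beta in betas.items()}
max_total = sum(points.values())
print(points, max_total)
```

The maximum possible total is simply the sum of the points for all retained variables, which is how a half-point scale converts to a whole-number range after doubling.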

Validation of ADRROP

The derived ADRROP score was applied to the validation cohort and assessed as outlined above. We defined a priori those criteria for the final ADRROP score to be clinically relevant: (a) significant association with ADR occurrence, (b) explained variability > 15%, (c) significant test result for trend across quartiles of ADRROP score distribution, and (d) ADR predictive capacity > 70%, as measured by AUC analysis. For comparison, we applied the GerontoNet ADR risk scale to all patients in the combined derivation and validation cohorts.
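For readers less familiar with criterion (d), the AUC of a risk score equals the probability that a randomly chosen patient with an ADR has a higher score than a randomly chosen patient without one, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the scores are invented toy values, and the function is an illustration of the statistic rather than the software used in the study.

```python
def auc(scores_with_adr, scores_without_adr):
    """AUC as the fraction of (ADR, no-ADR) patient pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for pos in scores_with_adr:
        for neg in scores_without_adr:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_with_adr) * len(scores_without_adr))


# Hypothetical risk scores, for illustration only.
with_adr = [12, 9, 15]
without_adr = [6, 9, 11, 4]
toy_auc = auc(with_adr, without_adr)
meets_criterion_d = toy_auc > 0.70   # the a priori AUC threshold stated above
print(toy_auc, meets_criterion_d)
```

A score that ranks patients no better than chance gives an AUC of 0.5, and a score that always ranks ADR patients higher gives 1.0.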

Ethical approval for each study in the database was granted by the Cork University Hospitals Research Ethics Committee.

Results

Overall, 467 ADRs were detected in 2217 patients (21%) using WHO-UMC criteria. With uni-variate regression analysis, we found that female gender, age > 70 years (compared to age ≤ 70 years), ≥ 6 daily medications, ≥ 4 co-morbid conditions, higher levels of co-morbidity on the modified cumulative illness rating scale (mCIRS), renal function impairment (estimated GFR < 30 ml/min/1.73 m²), liver disease (transaminase levels > twice the upper limit of normal), heart failure, dementia, history of recent falls and presence of ≥ 1 STOPP criteria PIMs were significantly associated with incident ADRs.

From the multivariate analysis, we found the following variables to be significantly and independently associated with incident ADRs: age > 70, ≥ 4 co-morbidities, liver disease, ≥ 1 fall in the previous year and the presence of ≥ 1 STOPP criteria medications.

We then analysed the model summary statistics, using likelihood ratio backward stepwise regression in the ADRROP development cohort (n = 1651; incomplete data on 36 patients). Potential independent ADR risk predictive variables (p < 0.1) were: age > 70 years, female gender, renal function impairment, ADL impairment, multi-morbidity, liver disease, ≥ 1 STOPP medications and ≥ 1 fall in the previous year (Table 3). Liver disease and ≥ 2 STOPP PIMs had the highest ORs for incident ADRs, i.e. 2.259 (95% CI 1.307–3.904) and 2.692 (95% CI 1.983–3.655), respectively. When these candidate variables were assigned a score to the nearest 0.5 number based on their respective ORs for incident ADRs, the nascent ADRROP scale had a range from 0 to 13.5. By multiplying each ADRROP scale score by two, we were able to define ADRROP scores in whole numbers ranging from 0 to 27 for greater ease of use. The final version of the ADRROP scale is shown in Table 3.

Table 3 Details of potential ADR risk predictive variables

Figure 1 demonstrates the face validity of the ADRROP scale, i.e. a rising score in the derivation cohort is associated with an increasing rate of incident ADRs. In the validation cohort, as with the derivation cohort, we found a similar relationship between ascending ADRROP score and incident ADRs (Fig. 1, lower histogram). In the derivation cohort, 78% of the patients had ADRROP scores between 6 and 15; almost half (48.1%) had ADRROP scores between 11 and 15.

Fig. 1

The upper histogram shows the relationship between ADRROP score and observed ADR rate in the development cohort (n = 1687). The y-axis shows the proportion of patients manifesting ADRs in relation to ascending ADRROP score on the x-axis. The lower histogram shows the relationship between ascending ADRROP score and observed ADR rate in the validation cohort (n = 530). In both cohorts, face validity of ADRROP is demonstrated.

Fig. 2

AUC analysis of ADRROP (development sub-cohort, n = 1687) is shown in the upper diagram (curved line). The diagonal line represents chance discrimination of ADR-positive versus ADR-negative cases using ADRROP, i.e. an AUC of 0.5. The area under the receiver operating characteristic curve was 0.632. The lower diagram shows the AUC analysis of ADRROP in the validation sub-cohort (n = 530), with an AUC value of 0.592, i.e. in the validation sub-cohort, a randomly selected patient with an ADR had a 59.2% probability of having a higher ADRROP score than a randomly selected patient without an ADR.

In the derivation cohort, the AUC was 0.632 (95% CI 0.598–0.665), i.e. a randomly selected patient who experienced an ADR had a 63.2% probability of having a higher ADRROP score than a randomly selected patient who did not. In the validation cohort, the AUC was 0.592 (95% CI 0.532–0.652).

Application of the GerontoNet ADR risk scale to all patients in the combined derivation and validation cohorts (n = 2217) yielded an AUC of 0.566 (95% CI 0.537–0.596); applying ADRROP to the combined cohorts gave an AUC value of 0.622 (95% CI 0.593–0.652). The explained ADR variance for all 2217 patients with ADRROP was 6.4% (Cox and Snell R²); with the GerontoNet ADR risk scale, it was 1.2%.
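Because both Cox and Snell and Nagelkerke pseudo-R² values are mentioned in this paper, the following sketch shows the standard relationship between them: the Nagelkerke value is the Cox and Snell value divided by its maximum attainable value, which depends only on the outcome prevalence. The inputs below are taken from the text purely as an illustration; the function names are assumptions, and the output is an approximation that need not reproduce any figure reported here, which may have been computed on a different cohort.

```python
import math


def max_cox_snell_r2(n_events: int, n_total: int) -> float:
    """Upper bound of Cox and Snell R-squared for a binary outcome with the given prevalence."""
    p = n_events / n_total
    # Log-likelihood per patient of the intercept-only (null) logistic model.
    null_loglik_per_patient = p * math.log(p) + (1 - p) * math.log(1 - p)
    return 1 - math.exp(2 * null_loglik_per_patient)


def nagelkerke_r2(cox_snell_r2: float, n_events: int, n_total: int) -> float:
    """Rescale Cox and Snell R-squared so that its maximum attainable value is 1."""
    return cox_snell_r2 / max_cox_snell_r2(n_events, n_total)


# Illustrative inputs: 467 ADRs in 2217 patients and a Cox and Snell R-squared of 6.4%.
upper_bound = max_cox_snell_r2(467, 2217)
rescaled = nagelkerke_r2(0.064, 467, 2217)
print(round(upper_bound, 3), round(rescaled, 3))
```

The rescaling matters when comparing explained variance across studies, since the Cox and Snell statistic cannot reach 1 for a binary outcome.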

Discussion

In the present study, ADRROP showed poor ADR prediction in the prospective validation cohort of 530 patients, i.e. an AUC of 0.592. Applying ADRROP and the GerontoNet ADR risk scale to the full patient dataset (n = 2217) yielded similarly poor ADR prediction. Although ADRROP has face validity, it could only explain 9.5% of the variability and AUC values were substantially less than 0.70, the level beyond which ADRROP would have clinical relevance.

It is unclear why ADRROP failed to predict ADRs well in the present study. ADRs in older patients in hospital are highly variable in causation, severity and clinical outcome [2, 36, 37]. ADRs also commonly occur in the context of multi-morbidity, such that ADRs can be difficult to discern even with recognized ADR causality criteria like WHO-UMC criteria [33] that reflect ADR ascertainment in routine clinical practice. ADR risk prediction models applied to older people with acute illness attempt to deal with highly heterogeneous conditions and highly variable clinical status. This variability may be so great that constructing any kind of ADR risk assessment scale around such a large number of variables is liable to fall short of the predictive level required for clinical relevance.

The absence of number of daily medications as a component of ADRROP may be considered curious given the well-known association between polypharmacy and ADRs in older people. In the univariate analysis, ≥ 6 daily medications were significantly associated with ADR occurrence. However, polypharmacy is not an independent ADR risk factor, since the number of daily drugs is strongly correlated with the number of concurrent active medical conditions and with the degree of multi-morbidity, as measured by validated instruments such as the CIRS—adapted for Geriatric patients (CIRS-G) [38]. In the multivariate analysis which identifies independent ADR risk factors, polypharmacy was no longer independently associated with ADR occurrence whilst multi-morbidity remained an independent risk factor for ADRs.

It is notable that the reported 21% incidence of ADRs in the present study was substantially higher than in the GerontoNet cohort, i.e. 6% [29]. The WHO definition of ADR was applied in both datasets. However, different ADR causality criteria were used in the two studies, i.e. WHO-UMC criteria [33] in the present study and Naranjo criteria [39] in the GerontoNet study. Also, ADRs that were identified at hospital admission or were the prime cause of hospital admission in the GerontoNet study were excluded from the data analysis. In contrast, in the present study we included ADRs identified within 72 h of admission, prior to applying any research-based ADR prevention strategies. The GerontoNet ADR score, which was derived from the GIFA database, involved ADR detection during the index hospitalization. Similarly, in the GerontoNet validation sub-study, only ADRs that were verified after admission were included. In its current form ADRROP, similar to the GerontoNet and other existing ADR risk prediction models, needs further improvement and cannot yet be recommended for ADR prediction in routine clinical practice.

In view of these limitations of ADRROP and the other geriatric ADR risk prediction models [26], there is a good case for collection of a new data set with high-precision ADR ascertainment for the purpose of re-evaluation of ADR risk factors in multi-morbid older people. The currently ongoing EU-funded Seventh Framework Programme SENATOR project [40] aims to provide such a data set from the prospective SENATOR clinical trial. In SENATOR, there is a novel method for ADR ascertainment that involves a trigger list of the 12 most common clinical manifestations of ADRs and detailed ADR description forms. The trigger list method is described in two recent studies undertaken by our research group [12, 35]. ADRs are defined according to evidence forms, which primary researchers submit for independent adjudication whenever one of the trigger-listed clinical events occurs. The evidence forms are reviewed by blinded experts who assess putative ADRs as being definite, probable, possible or unlikely. Prospective data will be obtained in approximately 2500 older people being hospitalized with acute illness in six large European academic medical centres. The SENATOR trial will create a large prospective database of ADRs defined by the trigger list method with concurrent high quality clinical data relating to multi-morbid older hospitalized patients. We will use the SENATOR database to derive a predictive ADR risk assessment tool that will hopefully be suitable for routine clinical use.

It is anticipated that the SENATOR database will show a substantially higher incidence and prevalence of ADRs in acutely ill hospitalized multi-morbid older people than has been reported in a recent systematic review by Alhawassi et al. [41].