Introduction

Inappropriate prescribing (IP) is common in frail older adults with a poor survival prognosis [1, 2] and can contribute to adverse drug reactions (ADRs) [3], increased morbidity and mortality [3] and negative economic consequences [4, 5]. Fried and colleagues define frailty as “a clinical syndrome in which three or more of the following criteria are present (i) unintentional weight loss (10 lbs in past year), (ii) self-reported exhaustion, (iii) weakness (grip strength), (iv) slow walking speed, and (v) low physical activity” [6]. Many medications prescribed to frail older adults with a poor 1-year survival prognosis were commenced at a time when life expectancy was favourable and disease prevention was paramount. When a significant deterioration in clinical status occurs, there can be great reluctance to stop long-term medications. In fact, the total number of prescribed medications tends to increase as treatments to manage the symptoms of terminal illness are prescribed alongside long-term chronic prescriptions [7]. Deprescribing is avoided for several reasons including (i) lack of evidence for efficacy of deprescribing, (ii) fear of adverse events following medication cessation, (iii) unavailability of up-to-date medical and medication records, (iv) limited training of nursing staff in some residential care settings, (v) physician/prescriber time restrictions and (vi) fear of litigation [8, 9, 10]. As a result, IP is highly prevalent in this patient population. One study of older nursing home residents with advanced dementia reported that 86.3% received a medication of questionable benefit in the last 120 days of life, and that 47.8% received a statin during this time [11].

STOPPFrail [Screening Tool of Older Persons Prescriptions in Frail adults with limited life expectancy] criteria were recently developed to assist physicians in identifying IP in frail older adults with a poor 1-year survival prognosis, where the goals of treatment are essentially symptom management rather than disease prevention or cure [12]. In order for this tool to apply to a patient, the inclusion criteria must be met, i.e. end-stage irreversible pathology, poor 1-year survival prognosis, severe functional impairment or severe cognitive impairment or both and patients where symptom control is the priority rather than prevention of disease progression. These 27 explicit criteria are organised according to eight physiological systems (Table 1). STOPPFrail criteria were developed and validated using the Delphi consensus methodology in which 17 experts participated, i.e. senior academic experts in the areas of geriatric medicine, clinical pharmacology, old-age psychiatry, palliative medicine, primary care medicine and clinical pharmacy [12]. The primary purpose of STOPPFrail criteria is to guide and assist physicians with the challenging task of identifying medications that warrant deprescription in a structured fashion in older frail adults with a poor survival prognosis. To date, no study has reported the prevalence of IP according to STOPPFrail criteria, in the proposed target population. Prior to an IP prevalence study, and the foreseen widespread use of this tool, it is important to determine whether STOPPFrail criteria can be used accurately by physicians. Thus, the objective of this study was to determine the inter-rater reliability (IRR) of STOPPFrail criteria between multiple physicians practising across different specialties who routinely deal with the medication needs of this patient population.

Table 1 STOPPFrail criteria [10]

Methods

Twenty clinical cases were collated (supplementary data S1). These clinical cases were based on a sample of participants enrolled in a study at Cork University Hospital that investigated the prevalence of adverse drug reactions (ADRs) causing hospitalisation. For that study, participants’ comorbid illnesses, concurrent medication use and cognitive and functional statuses were recorded. The structured history of medication use (SHiM) was employed to accurately capture concurrent medications, including medication adherence [13]. Each clinical case for this exercise was based on one patient from that study. Cases were amended, where necessary, to ensure that the 20 clinical cases described frail multi-morbid patients with an appreciable incidence of potentially inappropriate prescriptions (PIPs), according to STOPPFrail criteria. These 20 clinical cases were presented in a standardised format (see supplementary data S1) to include age, gender, comorbidities, concurrent medication use (with adherence clearly stated), medication allergies and current functional and cognitive statuses. Eighteen of the 20 clinical cases met the inclusion criteria for STOPPFrail, i.e. end-stage irreversible pathology, poor 1-year survival prognosis, severe cognitive or functional impairment or both and patients in whom symptom management was the priority.

The 18 clinical cases, that met STOPPFrail inclusion criteria, had a mean age of 79.5 (SD 6) years. The total number of prescribed medications was 165, a median of 9 (IQR 7.75–11.25) per clinical case. The median number of conditions per clinical case was 7 (IQR 6–8.25). The median number of PIPs, according to STOPPFrail criteria, was 5 (range 0–9). The median mini mental state examination (MMSE) score was 11 (0–28), with 80% (n = 16) totally dependent for activities of daily living (ADLs). The irreversible diagnoses for the 18 cases included severe dementia (n = 6), advanced metastatic cancer (n = 4), severe disabling stroke (n = 2), stage IV chronic obstructive pulmonary disease [COPD] (n = 2), advanced Parkinson’s disease with associated dementia (n = 1), motor neuron disease (n = 1), stage 4 congestive heart failure (n = 1) and advanced rheumatoid arthritis with dementia (n = 1).

Expert gold standard assessment of PIM use according to STOPPFrail criteria

For each of the 20 clinical cases, two physicians with expertise in geriatric pharmacotherapy first identified whether the patient described in the clinical case met STOPPFrail inclusion criteria. For the relevant cases, IP according to STOPPFrail criteria was determined. Complete agreement between the two expert assessors was reached in terms of prescribing appropriateness according to STOPPFrail criteria. This combined level of agreement (labelled “rater 1”) was set as the gold standard [GS], against which other physicians’ ratings were compared.

Physician selection

Twelve physicians were invited to participate, i.e. geriatricians (3 consultant geriatricians and 3 geriatric medicine trainees), general practitioners (GPs) (2 registered GPs and 1 trainee in general practice) and 3 palliative care physicians. Participants were selected on the basis of their practising specialty, their experience in managing older adults with a poor survival prognosis and their geographical location. This was a convenience sample with an optimum proportion of raters to subjects. It was anticipated that raters would agree 80% of the time with a relative error of 30%; thus, a minimum of 17 cases was required for review by raters [14]. Those invited had no prior knowledge of STOPPFrail criteria and did not routinely use other IP tools.

The study’s objectives were explained to each invited physician. Subsequently, all physicians agreed to participate. Physicians completed the exercise between January and February 2017. This was a theoretical exercise, which was completed by each physician at a time convenient to them. Each physician was supplied, in paper format, with (i) the STOPPFrail criteria (Table 1), (ii) the 20 clinical cases (supplementary data S1) and (iii) an answer booklet with clear instructions. A sample of the full booklet for case 1 can be found in the supplementary data S2. Participants were asked to decide, for each individual clinical case, (i) if the case in question met the inclusion criteria for the application of STOPPFrail, (ii) for the cases that did, to identify medications listed in the STOPPFrail criteria and (iii) to suggest which PIP, as determined by STOPPFrail criteria, could be deprescribed in theory, only if they deemed it clinically appropriate to do so. Participants were asked, after they had familiarised themselves with STOPPFrail criteria, to time themselves applying this tool to the clinical cases. Physicians were allocated the following rater numbers: consultant geriatricians [raters 2, 3, 4], specialist registrars in geriatric medicine [raters 5, 6, 7], general practitioners (GPs) [raters 8, 9], trainee in general practice [rater 10] and palliative care physicians [raters 11, 12, 13].

For criterion A1 (any drug that the patient persistently fails to take or tolerate despite adequate education and consideration of all formulations), raters were instructed that, for the purpose of this exercise, if it was documented that medication adherence was challenging, they should assume that all formulations and delivery mechanisms had been tried without success. For criterion A2 (drugs with no clear indication), raters were asked to base this on the known indications of the medications, as per the summary of product characteristics (SPC) of the medicine, the British National Formulary (BNF) and/or their clinical judgement.

Statistical analysis

For the purpose of this research, physician responses were dichotomised into whether each STOPPFrail indicator was applied or not, being cognisant that some indicators could be applicable to more than one drug prescription, e.g. STOPPFrail criterion A2 advises stopping any drug without clear indication. The responses of raters 2–13 were compared with those of the GS. Statistical analysis was performed using IBM SPSS® statistics package version 20. Cohen’s kappa statistic was used to determine the level of agreement between each rater and the GS. The Fleiss kappa statistic was used to determine the overall mean kappa rating between subgroups of raters (geriatricians, GPs and palliative care physicians) and the GS. The kappa statistic was interpreted as poor if ≤ 0.20, fair if 0.21–0.40, moderate if 0.41–0.60, substantial if 0.61–0.80 and good if 0.81–1.00 [15].
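
The interpretation bands above can be written as a small lookup. The following is a minimal sketch rather than part of the study's SPSS analysis; the function name `interpret_kappa` is hypothetical, and the moderate band is assumed to run from 0.41 to 0.60 so that the bands are contiguous.

```python
def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the descriptive label used in this study.

    Assumption: the bands are contiguous, i.e. 'moderate' covers 0.41-0.60.
    """
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "good"


# Example: the overall kappa reported in the Results section.
print(interpret_kappa(0.76))
```

Values are compared against the upper bound of each band, so a kappa reported to two decimal places always falls into exactly one category.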

Results

Of the 12 raters, 9 identified all 18 cases that met the STOPPFrail inclusion criteria. Two GPs identified 16 of the 18 cases and one consultant geriatrician identified 17 of the 18 cases as appropriate for the application of STOPPFrail criteria. On average, geriatricians, GPs and palliative care physicians took 2.33, 3.41 and 2.7 minutes, respectively, to apply STOPPFrail criteria to each clinical case, a combined overall average of 2.7 (SD 0.94) minutes. During this time, the physician read the clinical case in question and applied STOPPFrail accordingly. This time did not include the time taken for participants to read the instruction manual and familiarise themselves with the STOPPFrail tool.
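
As a rough consistency check, the combined average can be approximated from the three group means. This sketch assumes the overall figure is a mean across all 12 raters, so each group mean is weighted by its number of raters (6 geriatricians, 3 GPs and 3 palliative care physicians, as listed under Physician selection). The per-rater times are not given here, so the SD cannot be reproduced this way.

```python
# (mean minutes per case, number of raters) for each specialty group
groups = {
    "geriatricians": (2.33, 6),
    "GPs": (3.41, 3),
    "palliative care physicians": (2.70, 3),
}

# Weight each group mean by its number of raters.
total_minutes = sum(mean * n for mean, n in groups.values())
total_raters = sum(n for _, n in groups.values())
overall_mean = total_minutes / total_raters

print(f"{overall_mean:.1f} minutes per case across {total_raters} raters")
```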

Of the 165 medications prescribed for the 18 cases that met STOPPFrail inclusion criteria, the GS determined that 91 medications were inappropriate according to STOPPFrail criteria and should theoretically be deprescribed accordingly. Table 2 displays the kappa statistic for each rater compared to the GS. In Table 2, columns A, B, C and D indicate the status of agreement between raters and the GS. For example, rater 1 (GS) and rater 3 agreed that STOPPFrail criteria were not identified in 471 instances (column A). In 27 instances, rater 1 (GS) did not identify a STOPPFrail criterion but rater 3 did (column B). There were 20 instances where rater 1 (GS) identified a STOPPFrail criterion that rater 3 did not (column C). In 83 instances, both rater 1 and rater 3 identified a STOPPFrail criterion (column D). The Fleiss kappa coefficient between all 12 raters and the GS was 0.76 (SD 0.059). The Fleiss kappa coefficients between the GS and geriatricians, GPs and palliative care physicians were 0.80 (SD 0.6), 0.77 (SD 0.9) and 0.75 (SD 0.1), respectively, with no significant difference noted between groups or between participants within groups, as determined by one-way ANOVA (F(2, 9) = 0.712, p = 0.516).
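
To make the column A to D layout concrete, the sketch below computes Cohen's kappa from the four counts quoted for rater 3, and recovers the ANOVA p-value from the test statistic. It is an illustration under stated assumptions, not the study's SPSS output: the function names are hypothetical, 0.712 is read as the F statistic on (2, 9) degrees of freedom, and the closed-form tail probability used here holds only when the numerator degrees of freedom equal 2.

```python
def cohen_kappa_2x2(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for two raters and a binary rating.

    a: neither rater applied the criterion
    b, c: the two discordant cells (one rater applied it, the other did not)
    d: both raters applied the criterion
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement, from each rater's marginal proportions.
    p_both_no = ((a + b) / n) * ((a + c) / n)
    p_both_yes = ((c + d) / n) * ((b + d) / n)
    p_expected = p_both_no + p_both_yes
    return (p_observed - p_expected) / (1 - p_expected)


def f_upper_tail_df1_2(f_stat: float, df2: int) -> float:
    """Upper-tail p-value of an F(2, df2) statistic.

    Closed form valid only for numerator df = 2.
    """
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)


kappa_rater3 = cohen_kappa_2x2(a=471, b=27, c=20, d=83)
p_anova = f_upper_tail_df1_2(0.712, 9)
print(f"kappa = {kappa_rater3:.3f}, ANOVA p = {p_anova:.3f}")
```

Under these counts the kappa works out at roughly 0.73, and the p-value at roughly 0.516. Kappa is unchanged if the two discordant cells are swapped, so the result does not depend on which of columns B and C belongs to which rater.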

Table 2 Level of agreement in STOPPFrail criteria applied and number of drugs stopped

Table 2 (supplementary data S3) shows the discrepancies in the number of times each STOPPFrail indicator was applied by the GS and the 12 raters for the 18 clinical cases. Total agreement between all raters and the GS was seen for 4 STOPPFrail criteria, minor discrepancies were seen for 16 criteria and major discrepancies for 7 criteria. Table 3 displays the list of criteria according to their agreement with the GS. Minor discrepancies occurred when a STOPPFrail indicator was applied by a rater, across all 18 clinical cases, ≤ 2 times more or less frequently than by the GS. Major discrepancies occurred when a STOPPFrail indicator was applied by a rater, across all 18 clinical cases, > 2 times more or less frequently than by the GS. The criteria where most discrepancies were noted were A1 (any drug that the patient persistently fails to take or tolerate despite adequate education and consideration of all formulations), A2 (drugs with no clear indication), E1 (proton pump inhibitors), G1 (calcium supplementation), I1 (diabetic oral agents), J1 (multivitamin combination supplements) and J2 (nutritional supplementation).
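
The discrepancy rule can be sketched as follows. This is one possible reading, labelled as an assumption: the threshold of 2 is taken as an absolute difference in the number of applications across the 18 cases, and a criterion is graded by the largest difference between any rater and the GS. The function name and the example counts are hypothetical and are not taken from the supplementary data.

```python
def classify_criterion(gs_count: int, rater_counts: list[int]) -> str:
    """Grade one STOPPFrail criterion by how far raters deviate from the GS.

    gs_count: times the GS applied the criterion across all cases
    rater_counts: times each rater applied the same criterion
    Assumption: the worst-deviating rater determines the grade.
    """
    max_difference = max(abs(count - gs_count) for count in rater_counts)
    if max_difference == 0:
        return "total agreement"
    if max_difference <= 2:
        return "minor discrepancy"
    return "major discrepancy"


# Hypothetical counts for illustration only.
print(classify_criterion(3, [3, 3, 3]))  # every rater matches the GS
print(classify_criterion(5, [4, 6, 7]))  # largest difference is 2
print(classify_criterion(6, [2, 6, 9]))  # largest difference is 4
```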

Table 3 Agreement of criteria

Differences in opinion regarding drug indication were identified for warfarin, benzodiazepines and acetylcholinesterase inhibitors. Two consultant geriatricians and one GP with experience in attending patients in residential care units were more likely than other participants to identify these prescriptions as inappropriate. For 1 or more clinical cases, 10 raters overlooked that patients were having difficulty with medication adherence. Seven raters identified the lower dose of a proton pump inhibitor (PPI) as being inappropriate as part of criterion E1; this criterion suggests reducing the higher dose to a lower dose. Three raters suggested vitamin D was inappropriate as part of G1; this criterion suggests stopping calcium alone. When three diabetic oral agents were prescribed, raters’ opinions on appropriateness varied. Some raters identified one agent alone as inappropriate and suggested that deprescribing should occur in a staggered fashion, i.e. one agent at a time. Others identified all diabetic oral agents as inappropriate and suggested that they could, in theory, all be deprescribed at once. For three raters, folic acid and vitamin B12 supplementation were identified as inappropriate as part of criterion J1 (combination multivitamins). Other raters either deemed these drugs appropriate and suggested continuation or else deemed them inappropriate as part of criterion A2, i.e. no clear indication.

Discussion

The IRR of STOPPFrail criteria is substantial to good (mean 0.76 (SD 0.059)), when tested between multiple physicians practising across three different specialities, despite physicians having no prior knowledge of the tool or experience using it. It takes approximately 3 min to apply STOPPFrail criteria to one clinical case. No discrepancies in its application were identified for 4 STOPPFrail criteria. Minor discrepancies were identified for 16 criteria and major discrepancies were identified for 7 criteria. There was no difference between the three different physician groups, or between the participants within each group, in their ability to apply STOPPFrail criteria (F(2, 9) = 0.712, p = 0.516).

The strength of this study is the robust methodology employed. Three groups of physicians, all of whom had no experience using IP tools and all of whom were given the same clear instructions, participated in this research. The clinical cases used were based on real-life patients and therefore reflected common clinical practice. However, there were limitations. Firstly, this was a theoretical exercise, i.e. physicians assessed the suitability of STOPPFrail criteria according to a clinical case history presented to them in a structured format and identified IP accordingly. Assessments were not completed on patients in person and medications were not actually deprescribed. It could be suggested that physicians are more conservative when dealing with real-life patients rather than theoretical cases. Conversely, it could be suggested that the IRR is under-estimated here because, where ambiguity exists and patients are not there to clarify information, physicians could also assume medications are appropriate. Efficient and safe deprescribing depends on the quality of the available clinical data. The more comprehensive the clinical information available to clinicians is, the more accurately IP criteria can be applied, leading to higher levels of IRR [16]. However, ambiguity is often present in clinical practice due to incomplete records [17, 18], and consequently, physicians often make decisions based on limited information. Therefore, these cases, in this theoretical exercise, do reflect common clinical scenarios. Additionally, medication indications were not clearly documented for participants in this exercise. Physicians often have to decipher clinical indication based on documented comorbidities, the results of previous imaging investigations and previous laboratory tests when reviewing patients; thus, this exercise was designed to reflect this.

Major discrepancies, found in 7 STOPPFrail criteria, resulted from (i) differences in physician opinion regarding clinical indications, (ii) criteria misinterpretation and (iii) failure to acknowledge problems with medication adherence. Differing opinions on clinical indication for medications could be a consequence of physician specialty and/or physician level of training, e.g. consultant geriatricians deemed acetylcholinesterase inhibitors inappropriate in late-stage dementia more frequently than their trainee geriatrician or GP colleagues. Misinterpretation of criteria was identified for the prescription of vitamin D and low-dose PPIs. The identification of both these prescriptions as inappropriate was not necessarily incorrect; however, for the purpose of this exercise, they were deemed incorrect as they were not specifically listed as PIPs in STOPPFrail criteria. STOPPFrail criteria were developed and validated to guide physicians on deprescribing, as well as to open dialogue around the appropriateness of all medications and in doing so encourage medication review in its entirety; thus, the variations seen here cannot be assumed to be inappropriate.

Complete agreement was seen for the application of four STOPPFrail criteria: memantine, gastrointestinal antispasmodics, Selective Oestrogen Receptor Modulators (SORMs) and prophylactic antibiotics. Memantine was prescribed in three clinical cases. Patients described in these clinical cases had advanced dementia, i.e. they were bed-bound, fully dependent for ADLs and could not complete MMSEs; therefore, there was little ambiguity around the appropriateness of this prescription. Similarly, for the cases where prophylactic antibiotics and SORMs were prescribed, there was no uncertainty. It was clearly documented that recurrent urinary tract infections continued despite prophylaxis and that patients were not fully dependent and not at risk of falls. This further supports the view that the more comprehensive patients’ medical records are, the more accurate the application of IP tools. Gastrointestinal antispasmodics were not deemed inappropriate in any case.

Despite clear documentation of medication adherence in the clinical cases, physicians did not identify this every time. This was probably as a result of a reading error and, once not identified in one case, was unlikely to be identified in other cases. This is a challenge with a theoretical exercise as participants rely on their ability to assess the clinical information as it is presented to them, rather than confirming medications and adherence with a patient directly. This could also have been down to user fatigue as the exercise progressed and the process became repetitive.

Another limitation of this study was the use of Cohen’s kappa coefficient statistical test. Cohen’s kappa coefficient does not take into consideration where chance agreement occurs; thus, it can overestimate the actual level of agreement. Additionally, it identifies that disagreements have occurred, but the reasons for these disagreements are not captured. Therefore, further scrutiny of the data is required and reasons for disagreements need to be investigated further.

Physicians are frequently under time pressure when completing medication reviews, and using criteria like STOPPFrail can encourage identification of medications that can potentially be deprescribed in a time-efficient, structured fashion. Explicit criteria that require time to deploy often do not translate to clinical practice and inevitably are used primarily as research tools [19]. STOPPFrail criteria have been shown here not only to assist physicians with identifying inappropriate medications in frailer older adults with a poor survival prognosis, but also to do so in a time-efficient manner, which suggests that they will translate to clinical practice, where, hopefully, they will have an impact.

Deprescribing requires a culture change for many physicians, particularly physicians for whom contact with frail older adults with a poor 1-year survival prognosis comprises a small part of their everyday clinical practice. Deprescribing requires extensive knowledge of disease trajectory, pharmacological actions of medications and the likely risks involved with their use. Deprescribing in patients with a poor survival prognosis is more challenging than deprescribing specific drugs for specific reasons in older adults, as this process can often initiate a more extensive discussion around end-of-life care. Future studies, using STOPPFrail criteria, will be needed to ascertain the extent of PIP in this population cohort. The substantial to good IRR demonstrated in this study indicates that prevalence studies of PIP, according to STOPPFrail criteria, will be comparable between researchers and across research centres. Following this, randomised controlled trials can be planned to assess whether deprescribing in this population can affect patient outcomes and provide the evidence required to support physicians undertaking deprescribing. Our data suggest that STOPPFrail provides reliable explicit guidance for any clinician undertaking routine medication review in frailer older patients with poor 1-year survival prognosis.