Key Points

Evidence-based clinical decision support for drug–drug interactions (DDIs) requires consistent application of transparent and systematic methods to evaluate the evidence.

An expert workgroup developed recommendations by consensus for systematic evaluation of evidence for DDIs from the scientific literature, drug product labeling, and regulatory documents.

Workgroup members expect that editors of drug compendia and clinical decision support systems will be able to provide higher-quality information about DDIs.

1 Background

Exposure to potential drug–drug interactions (DDIs) is a significant source of preventable drug-related harm that requires proper management to avoid medical errors [1]. Studies indicate that DDIs harm 1.9–5 million inpatients and cause 2,600–220,000 emergency department visits in the US each year [2–4].

The importance of DDIs as a risk factor for patient harm led the Centers for Medicare and Medicaid Services (CMS) to include DDI clinical decision support (CDS) alerts in the agency’s guidelines for achieving meaningful use of electronic health records [5]. However, evidence indicates that DDI decision support systems have not successfully reduced exposure to DDIs [6–8]. In the US, most alerting systems rely on clinical content created, maintained, and sold by knowledgebase vendors [9]. Each organization implements its own approach to classifying DDIs, with limited agreement between systems [10–12]. Additionally, CDS systems often alert for DDIs that have limited clinical relevance, which may increase alert fatigue [13] and lead to inappropriate responses [14–16].

Despite the desire among providers of DDI decision support tools to produce clinically relevant content, improving the state of the art poses several challenges. High-quality evidence to support the existence of many DDIs is lacking: few controlled clinical studies are conducted in relevant populations [17–19], and individual case reports are underreported and often lack key information [20]. Compendia and knowledgebase editors use differing approaches to identify and evaluate evidence [10–12]. No guidelines or standards exist for systematically and consistently evaluating or classifying the clinical relevance of interactions [9, 21]. Without such guidance, DDIs may also be inappropriately extrapolated to other drugs within the same therapeutic or pharmacologic class [22]. In an effort to reduce legal liability, system vendors might have an incentive to include almost all possible DDIs, including those that confer extremely low risk to exposed patients [9, 23].

A conference series was conducted to develop specific recommendations to improve the quality of CDS alerts for DDIs. The goals of the conference series were addressed by three workgroups. This paper describes the recommendations of the Evidence Workgroup, which was charged with providing guidance for the systematic evaluation of evidence for DDIs from the scientific literature, drug product labeling, and regulatory documents for use in CDS. These activities were supported in part by a conference grant from the Agency for Healthcare Research and Quality (AHRQ) and donations from health information technology (IT) vendors. Use of funds was at the sole discretion of the University of Arizona and in accordance with Department of Health and Human Services requirements.

2 Methods

Nineteen individuals (listed as co-authors) with expertise in DDIs, clinical pharmacology, drug information, evidence evaluation, biomedical informatics, and health IT were invited to participate as workgroup members. No invited experts declined to participate; one individual was unable to contribute due to health reasons and is not listed as a co-author. Members represented diverse backgrounds such as academia; journal, compendia, and knowledgebase editors; healthcare organizations; US Food and Drug Administration (FDA); and the US Office of the National Coordinator for Health IT (ONC). Twelve 1-h webinar meetings were conducted from January 2013 to February 2014, with two in-person meetings held in Washington DC (May 2013) and Phoenix, Arizona (September 2013). A member with recognized leadership skills and experience facilitated the meetings. The group followed a structured consensus-development process that included clarifying issues to be decided; open discussion and debate among all members; iterative aggregation and refinement of ideas leading to proposals that incorporated the best elements of members’ ideas while addressing all key concerns; and active agreement on the final proposal. To ensure the most pressing issues were addressed, a nonsystematic search of the literature was conducted for papers describing methods for evaluating DDI evidence. From these articles, the following key questions were developed by the conference organizers and then reviewed and agreed upon by consensus of the members: (i) What is the best approach to evaluate DDI evidence? (ii) What evidence is required for a DDI to be applicable to an entire class of drugs? (iii) How should a structured evaluation process be vetted and validated?

Workgroup members were provided access to articles that were deemed relevant for consideration. They also identified relevant studies, and copies were obtained for all workgroup members to review. Each key question was evaluated in light of the available evidence and the collective experience of the workgroup members. Responses to each key question were written and then revised to improve clarity or address concerns. Workgroup recommendations were posted on a project website, and feedback was sought from other stakeholders via dissemination to professional societies and organizations. Consensus was achieved through an iterative process of drafting recommendations, collecting verbal or written comments from workgroup members and other content experts, and revising documents until no additional substantive comments were provided. Workgroup recommendations were presented at regional and national forums to solicit feedback from stakeholders such as healthcare providers, compendia and knowledgebase editors, professional and quality organizations, and government agencies. Members of the workgroup were informed of feedback during the regularly scheduled webinar meetings. No substantial changes were made to the recommendations based on comments and questions collected during this vetting process; changes were editorial in nature, made to improve clarity.

3 Results

A summary of our recommendations for evaluating DDI evidence is shown in Table 1, and each recommendation is described more fully below. The results are presented in the following order: (i) recommendations about terminology; (ii) best approaches for evaluating DDI evidence (Key Question 1); (iii) recommendations for evidence of drug-class interactions (Key Question 2); and (iv) procedures to validate a structured process for DDI evidence evaluation (Key Question 3).

Table 1 Summary of Evidence Workgroup recommendations for systematic evaluation of DDI evidence

3.1 Terminology

We recommend consistent use of relevant terminology for evaluation of DDI evidence. In the process of answering the key questions, several terms required clarification. A complete list of definitions agreed upon by the workgroup is provided in Electronic Supplementary Material (ESM) 1, with some key terms described below.

A DDI is defined as a clinically meaningful alteration in the exposure and/or response to a drug (object drug) that has occurred as a result of the co-administration of another drug (precipitant drug) [24, 25]. Response can refer to either precipitating an adverse event or altering the therapeutic effect of the object drug. A potential DDI is defined as the co-prescription of two drugs known to interact, and therefore a DDI could occur in the exposed patient [25]. Although the distinction between a DDI and a potential DDI is important, both are referred to as DDI throughout this paper for simplicity. A clinically relevant DDI is defined as one associated with either toxicity or loss of efficacy that warrants the attention of healthcare professionals. We recommend use of the term seriousness, rather than severity, to describe the extent to which a DDI can or does cause harm [26].

We developed a working definition for narrow therapeutic index (NTI) because many clinically relevant DDIs involve drugs with an NTI. Similar terms include narrow therapeutic ratio and narrow therapeutic range (NTR). Existing NTI/NTR definitions (see ESM 2) were considered inadequate for guiding the evaluation of DDIs [27]. Although the FDA is developing a definition of NTI drugs in the context of bioequivalence, that definition would generally be stricter than is needed for managing DDIs in clinical practice. Therefore, we define NTI drugs as those for which even a small change in drug exposure may lead to toxicity or loss of efficacy. The following scenarios illustrate what may constitute a ‘small’ change in drug exposure: a <100 % (<2-fold) increase in the area under the concentration–time curve (AUC) of the object drug may lead to serious adverse events, or a <50 % decrease in AUC of the object drug may result in a loss of efficacy with serious therapeutic consequences (e.g., failure of contraception, or virologic failure due to subtherapeutic drug levels).
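
As a purely illustrative sketch, and not part of the workgroup’s recommendations, the thresholds above can be expressed as a simple check on the fold-change in object-drug AUC; the function name and structure below are hypothetical, and only the numeric cutoffs come from the working definition.

```python
# Illustrative sketch of the 'small change' thresholds in the working NTI
# definition above. The function name and structure are hypothetical; only
# the numeric cutoffs (<2-fold increase, <50 % decrease) come from the text.

def is_small_exposure_change(auc_ratio: float) -> bool:
    """Return True if the change in object-drug AUC falls within the 'small
    change' window that may still be clinically important for an NTI drug.

    auc_ratio: AUC with the precipitant drug divided by AUC without it.
    """
    if auc_ratio >= 1.0:
        # An increase of less than 100 % (<2-fold) may still cause serious
        # adverse events when the object drug has an NTI.
        return auc_ratio < 2.0
    # A decrease of less than 50 % may still cause loss of efficacy
    # (e.g., contraceptive or virologic failure) when the object drug has an NTI.
    return auc_ratio > 0.5


print(is_small_exposure_change(1.6))  # True: a 1.6-fold increase is 'small'
print(is_small_exposure_change(0.6))  # True: a 40 % decrease is 'small'
print(is_small_exposure_change(2.5))  # False: a >=2-fold increase is a large
                                      # change, of concern for any drug
```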

3.2 Key Question 1: What is the Best Approach to Evaluate Drug–Drug Interaction (DDI) Evidence?

The first step in evaluating DDIs to guide clinical decision making is establishing that sufficient evidence exists for a DDI; questions of clinical relevance and of how to present DDI recommendations to health professionals follow. Our recommendations primarily focus on identifying the best approach to evaluate whether a DDI exists, with additional consideration of how to establish clinical relevance.

When publishing a recommendation about the risk of a DDI, it is essential first to assess the quality of individual studies to prevent drawing erroneous conclusions about the entire body of evidence. Evaluation of medical treatments commonly involves hierarchical rating schemes such as those used in evidence-based medicine. However, a unique approach is needed to summarize a body of DDI evidence, which often consists of case reports, retrospective reviews, and extrapolation from in vitro studies, with few controlled clinical studies conducted in relevant populations [17–19]. Some DDIs do not require randomized controlled trials to confirm their existence. It is possible to reasonably extrapolate many interactions based on pharmacokinetic and/or pharmacodynamic properties of a drug without placing patients at unnecessary or unethical risk. High-quality observational studies and evidence from real-world use can be applied to confirm the association with adverse clinical outcomes and to evaluate the magnitude of harm and relevant risk factors. Therefore, evidence supporting a DDI may be derived from what would be regarded as less rigorous study designs for other research questions.

3.2.1 Existing DDI Evidence Evaluation Methods

We conducted a nonsystematic search for published methods for evaluating DDI case reports using MEDLINE and also queried workgroup members for relevant articles and instruments (see ESM 3). Only one instrument, the Drug Interaction Probability Scale (DIPS) [20], was identified that had been specifically developed to evaluate individual case reports of DDIs. This 10-item scale was designed to assess whether an adverse event was caused by a DDI. DIPS was developed to address the limitations of previous assessment instruments, such as the Naranjo scale [28], that failed to evaluate causality associated with concomitant medications. DIPS takes into consideration previous credible reports, consistency with known interactive properties, time course of the interaction, results of de-challenge and re-challenge, and alternative explanations. DIPS also meets published criteria for causality assessment in that it guides users to conduct an explicit, transparent, complete, and balanced assessment of the attributes important to determining whether an adverse drug interaction occurred [29].
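
To illustrate how a DIPS-style assessment could be captured in structured form, the sketch below encodes the attributes named above as a scored checklist. The item wording and weights are hypothetical and are not the actual DIPS questions or scoring scheme [20].

```python
# Sketch of how a DIPS-style causality assessment could be captured as data.
# The item texts below paraphrase attributes mentioned in the text; they are
# NOT the actual DIPS questions or scoring weights (see [20]).

from dataclasses import dataclass

@dataclass
class CausalityItem:
    question: str
    answer: str          # "yes", "no", or "unknown"
    score_if_yes: int    # hypothetical weights
    score_if_no: int

    def score(self) -> int:
        if self.answer == "yes":
            return self.score_if_yes
        if self.answer == "no":
            return self.score_if_no
        return 0  # "unknown" contributes nothing

case_assessment = [
    CausalityItem("Previous credible reports of this interaction?", "yes", 1, -1),
    CausalityItem("Event consistent with known interactive properties?", "yes", 1, -1),
    CausalityItem("Time course consistent with the proposed mechanism?", "unknown", 1, -1),
    CausalityItem("Event abated on de-challenge of the precipitant?", "yes", 1, -2),
    CausalityItem("Alternative causes ruled out?", "no", 1, -1),
]

total = sum(item.score() for item in case_assessment)
print(f"Illustrative causality score: {total}")
```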

We also searched for published literature of methods that evaluate a collection of evidence relevant to establishing that a DDI exists (see ESM 3) and found two systematic approaches [30, 31]. The first is the approach used for developing a DDI knowledgebase in Swedish and Finnish computerized CDS systems (SFINX) [30]. This system categorizes level of documentation (0–4) and clinical relevance (A–D). A ‘0’ level of documentation reflects potentially dangerous interactions that have not been, and probably never will be, documented in clinical studies. The second approach described in the literature is a systematic assessment of DDIs for CDS systems in the Netherlands [31]. Four core parameters are used to assess each DDI: evidence supporting the interaction; clinical relevance of the potential adverse reaction; risk factors related to patient, drug, or disease characteristics; and incidence of the adverse reaction. A 5-item scale is used to assess the quality of evidence for a DDI. The approach requires the existence of a reasonable mechanistic explanation in order to establish a DDI based solely on pharmacokinetic or pharmacodynamic properties.
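
For comparison, the two published schemes can be represented schematically as follows. This is a minimal sketch assuming only the category axes named in the text; the field names and label orderings are illustrative rather than taken from SFINX [30] or the Dutch assessment [31].

```python
# Minimal sketch of the two published categorization schemes described above.
# Only the category axes named in the text are assumed; labels are illustrative.

from dataclasses import dataclass

# SFINX-style rating: documentation level 0-4 and clinical relevance A-D [30].
@dataclass
class SfinxStyleRating:
    documentation_level: int   # 0 = potentially dangerous but undocumented ... 4
    clinical_relevance: str    # "A" through "D"; ordering of labels assumed here

# Dutch-style assessment: four core parameters per DDI [31].
@dataclass
class DutchStyleAssessment:
    evidence_quality: int      # the 5-item evidence quality scale mentioned above
    clinical_relevance: str
    risk_factors: list[str]    # patient-, drug-, or disease-related factors
    incidence: str             # e.g., "unknown", "rare", "common"

example = SfinxStyleRating(documentation_level=0, clinical_relevance="D")
```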

While the DIPS approach to case report evaluation was considered acceptable, the existing methods for evaluating a body of evidence were considered more complex than necessary because they combine DDI evidence assessment with questions of clinical relevance. Additionally, existing methods do not explicitly address reasonable extrapolation of DDIs from in vitro findings, nor do they explicitly address study quality and interpretation in the context of DDIs.

3.2.2 The Need for a New DDI Evidence Evaluation Instrument

Given the limitations of the available tools, we agreed that a new assessment instrument was needed to objectively evaluate a body of evidence to establish the existence of a DDI. It was further agreed that this instrument should include concepts from previously published DDI evidence rating instruments [30, 31] but with fewer categories based on the presence or absence of specific, clearly defined, types of evidence.

Specific guidance is needed for reasonably extrapolating DDIs that are unlikely to be evaluated in clinical trials. Reasonable extrapolation refers to using the knowledge of the mechanism of a DDI to predict the risk of a DDI from one pair of drugs to multiple pairs with similar pharmacologic properties. Extrapolation of pharmacodynamic interactions is commonplace. For example, not every possible combination of a benzodiazepine and ethanol has been studied. Yet, all benzodiazepines are assumed to interact in a similar manner with ethanol. DDIs based on pharmacokinetic mechanisms present more of a challenge to extrapolation because numerous elimination pathways may be involved and it is difficult to predict the magnitude of the interaction without additional data.

Based on the needs described above for a new method to establish the existence of a DDI, we developed the DRug Interaction eVidence Evaluation (DRIVE) instrument. The rationale behind the DRIVE instrument is to (i) use simple evidence categories; (ii) include causality assessment of DDI case reports (via DIPS); (iii) apply reasonable extrapolation, including from in vitro studies; (iv) address evidence and statements provided in product labeling; and (v) describe study quality criteria and interpretation in the context of DDIs. The purpose of the DRIVE instrument is to promote consistent, transparent, and systematic evaluation of a body of evidence to establish that a DDI exists. Once formally evaluated and validated, the DRIVE instrument may be used by drug compendia and knowledgebase editors who develop and maintain DDI content for drug information and decision support systems. Health professionals, researchers, and journal editors may also find the instrument useful. Because systematic evidence review should include a thorough search for relevant published and unpublished literature, we also recognized that future work should develop systematic methods for conducting literature searches to assemble DDI evidence [32–36].
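
Purely as a sketch of what a DRIVE-based evidence record might contain, the structure below maps rationale points (i)–(v) onto hypothetical fields; the actual DRIVE categories and data elements are not specified in this paper, so every name here is an assumption.

```python
# Hypothetical sketch of an evidence record aligned with the DRIVE rationale
# (i)-(v) above. DRIVE's actual categories and fields are not described in
# this paper, so every field name here is an assumption.

from dataclasses import dataclass, field

@dataclass
class DdiEvidenceItem:
    source_type: str                       # e.g., "clinical study", "case report",
                                           # "in vitro", "product labeling"
    dips_score: int | None = None          # causality score for case reports (ii)
    extrapolated: bool = False             # reasonable extrapolation applied (iii)
    labeling_statement: str | None = None  # relevant labeling text, if any (iv)
    quality_notes: str = ""                # study quality and interpretation (v)

@dataclass
class DdiEvidenceSummary:
    object_drug: str
    precipitant_drug: str
    items: list[DdiEvidenceItem] = field(default_factory=list)
    evidence_category: str = "unrated"     # simple category assigned after review (i)
```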

Due to regulatory and clinical practice implications, special attention was given to evidence from product labeling. Formal regulatory documents, such as product labeling, preapproval reviews, and post-marketing analyses, are important sources when evaluating DDI evidence [37]. Preapproval reviews and post-marketing analyses by the sponsor and the FDA can identify unpublished DDI evidence because these documents are generally more thorough than product labeling, and they may be useful when information in product labeling is incomplete [36]. Despite the significant challenges and effort involved in reviewing FDA materials [37], a careful review of these documents should be performed when conducting an evaluation of DDI evidence [36].

In our collective experience and opinion, the majority of pre-marketing DDI studies conducted and described in product labeling are well designed and executed. However, we agree that DDI content in drug compendia and CDS systems does not always need to align with product labeling, even when that information is listed as a contraindication or boxed warning. This opinion is based upon examples in which the labeling is not consistent with existing evidence [38] and on the significant variation between the DDIs listed in product labeling and published information [33, 34, 39–41]. We acknowledge that the purpose of, and guidance governing, product labeling is unique, and also that the FDA has taken important steps to improve the quality and usefulness of DDI information in product labeling [42–44]. Continued effort is recommended to improve the consistency and timeliness of DDI information in product labeling, particularly for older nonproprietary drugs.

3.2.3 Assessing Clinical Relevance

If there is sufficient evidence that a DDI might require clinical management, further evaluation is needed to establish its clinical relevance. The clinical outcomes associated with the DDI must be determined, including the magnitude, variability, and frequency of effects (if known), as well as modifying factors that may increase or decrease the risk of patient harm. Depending on the context, exposure to a clinically relevant DDI might warrant a change in therapy, increased monitoring, and/or patient education. Assessing the clinical relevance of a DDI is an estimate, at best, because inter-patient variability is often unknown and, for pharmacokinetic DDIs, changes in exposure to the object drug can vary four- to six-fold [45, 46]. For some DDIs, it is reasonable to assign a general risk rating based on the properties of the object drug, such as those with an NTI.

We recommend providing estimates of the frequency (incidence) of adverse outcomes from DDIs when available. However, assessing these frequencies is difficult because data are largely limited to observational studies, which are susceptible to confounding. When such estimates are available, the definition of the adverse outcome and the applicable population should be clearly specified. For example, patients can be informed that the risk of upper gastrointestinal (GI) bleeding is estimated to increase by 19 % with combined use of selective serotonin reuptake inhibitors (SSRIs) and nonsteroidal anti-inflammatory drugs (NSAIDs) beyond the effects of each individual drug [47]. Estimates can also be provided that 179 (95 % CI 107–319) high-risk patients (e.g., elderly, previous GI bleed) and 645 (95 % CI 387–1,152) low-risk patients need to be treated with the SSRI and NSAID combination to cause one upper-GI bleed [47]. However, for most DDIs, even those that are well documented and potentially dangerous, only rough estimates of the incidence of adverse outcomes are available.
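
The arithmetic linking a relative risk increase to such number-needed-to-harm (NNH) figures can be sketched as follows. Only the 19 % relative increase is taken from the example above; the baseline risks in the code are hypothetical values chosen for illustration and are not reported in reference [47].

```python
# Worked sketch of the arithmetic behind number-needed-to-harm (NNH) estimates.
# The baseline risks below are hypothetical illustrations, not values from [47];
# only the 19 % relative increase comes from the text.

def nnh(baseline_risk: float, relative_increase: float) -> float:
    """NNH = 1 / absolute risk increase, where the absolute increase is the
    baseline risk multiplied by the relative increase attributable to the DDI."""
    absolute_risk_increase = baseline_risk * relative_increase
    return 1.0 / absolute_risk_increase

relative_increase = 0.19  # 19 % excess risk of upper GI bleeding (SSRI + NSAID)

# Hypothetical annual baseline risks of upper GI bleeding:
print(round(nnh(0.030, relative_increase)))  # ~175 for a higher-risk patient
print(round(nnh(0.008, relative_increase)))  # ~658 for a lower-risk patient
```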

Thorough evidence evaluation of DDI literature should include documented methods to mitigate harm (e.g., dosage adjustment, monitoring strategies, and therapeutic alternatives) [22, 48]. Reasonable therapeutic alternatives may include DDIs ruled out by mechanistic principles, preferably with one or more negative studies with limited bias and confounding.

We also recommend that modifying factors that may increase (risk factors) or decrease (mitigating factors) susceptibility to DDIs be considered when evaluating and reporting DDI evidence [48]. Drug-related modifying factors may include dose, duration, route of administration, order of administration, timing of dose, and co-medications. Patient-related modifying factors may include age, sex, pharmacogenomics [49], comorbidity, clinical status, vital signs, laboratory values, and indication for the drug. Identifying modifying factors is essential because research shows that providing patient-specific risk factors in CDS improves the specificity of alerts [50, 51]. There are many situations in which a particular DDI may not be clinically relevant to a specific patient because mitigating factors result in a negligible risk of adverse outcomes. For example, a precipitant drug that inhibits a cytochrome P450 (CYP) enzyme is unlikely to produce a clinically relevant DDI in a patient whose genetic variant already renders that enzyme nonfunctional (i.e., a poor metabolizer) [52]. However, information on factors that alter patient susceptibility to DDIs is not yet systematically reviewed in DDI guidelines [51]. In general, more research is needed to identify modifying factors to inform CDS algorithms and clinical decision making.
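
A minimal sketch of how such a mitigating factor might be used in a CDS rule is shown below. The rule structure and field names are hypothetical; this is not a validated algorithm or a workgroup recommendation.

```python
# Hypothetical sketch: suppressing a DDI alert when a patient-specific
# mitigating factor makes the interaction irrelevant. The rule structure and
# field names are illustrative, not a validated CDS algorithm.

def should_alert(interaction_mechanism: str, patient: dict) -> bool:
    """Decide whether to fire a DDI alert, given the interaction mechanism
    and patient-specific modifying factors."""
    if interaction_mechanism == "CYP inhibition":
        # If the patient is a poor metabolizer for the affected CYP enzyme,
        # an inhibitor cannot meaningfully reduce activity that is already
        # absent, so the alert adds little value [52].
        if patient.get("affected_cyp_phenotype") == "poor metabolizer":
            return False
    return True

print(should_alert("CYP inhibition", {"affected_cyp_phenotype": "poor metabolizer"}))      # False
print(should_alert("CYP inhibition", {"affected_cyp_phenotype": "extensive metabolizer"}))  # True
```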

More work is also needed to identify the most appropriate process to rate the quality of DDI evidence and to provide graded recommendations that reduce the risk of adverse consequences [48]. We recommend considering the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) system, a well-accepted standard for indicating the quality of evidence and strength of recommendations [53–56]. For example, the University of Liverpool has adapted the GRADE approach for communicating recommendations related to drug interactions with medications used for HIV and hepatitis C (described in ESM 4) [57, 58].

3.3 Key Question 2: What Evidence is Required for a DDI to be Applicable to an Entire Class of Medications?

CDS systems can generate nuisance alerts when they inappropriately define or represent a DDI as a ‘class’ effect. Knowledge of the mechanism of interaction is crucial to determining whether there is basis for a class effect. Most class-based DDIs are of a pharmacodynamic nature, with additive [e.g., angiotensin-converting enzyme (ACE) inhibitors + angiotensin II receptor blockers (ARBs)] or opposing (e.g., β-blocker + β-agonist) pharmacologic effects. In contrast, pharmacokinetic interactions are rarely generalizable to all agents within a class [22, 59]. Even when there is seemingly a class effect, the magnitude of effect can vary, which often makes it necessary to consider each drug in the class individually. For example, azole antifungal agents can inhibit CYP3A4. However, itraconazole and ketoconazole are much more potent inhibitors than fluconazole, so the magnitude of interaction with drugs that are primarily metabolized by CYP3A4 may differ significantly, which would impact the clinical relevance of the interaction. This can be shown by their effect on triazolam levels: itraconazole and ketoconazole increase the AUC of triazolam by 27- and 22-fold, respectively [60], whereas fluconazole causes a 4.4-fold increase in AUC [61].
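
The triazolam figures cited above can be used to illustrate, in a purely schematic way, why the magnitude of a pharmacokinetic interaction should be assessed per drug rather than per class; the 5-fold cutoff in the sketch is an arbitrary illustrative threshold, not a workgroup recommendation.

```python
# Illustrative sketch using the triazolam AUC fold-changes cited in the text
# to show why magnitude must be assessed per drug rather than per class.
# The 5-fold cutoff below is an arbitrary illustrative threshold.

triazolam_auc_fold_change = {
    "itraconazole": 27,   # [60]
    "ketoconazole": 22,   # [60]
    "fluconazole": 4.4,   # [61]
}

ILLUSTRATIVE_CUTOFF = 5.0  # hypothetical threshold separating 'large' from 'moderate'

for inhibitor, fold in triazolam_auc_fold_change.items():
    label = "large" if fold >= ILLUSTRATIVE_CUTOFF else "moderate"
    print(f"{inhibitor}: {fold}-fold increase in triazolam AUC ({label})")
```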

The overwhelming majority of extrapolated DDIs are pharmacodynamic, because few studies are conducted to investigate this type of interaction. In the absence of drug-specific data, a class-based interaction may be reasonably assumed if the purported mechanism of interaction is biologically plausible and consistent with the known pharmacology of one or both classes of drugs involved. Class examples include SSRIs plus other serotonergic drugs (risk of serotonin syndrome) and anticoagulants plus antiplatelet agents (risk of bleeding).

Occasionally, pharmacokinetic interaction data may be extrapolated from one agent to other agents in the class if the purported mechanism of interaction involves common pharmacologic effects. For instance, NSAIDs may reduce the renal excretion of lithium and therefore increase the risk of toxicity [62]. The proposed mechanism of interaction is inhibition of renal prostaglandin synthesis by NSAIDs, which leads to reduced renal blood flow. Although lithium toxicity has not been reported with all NSAIDs, the interaction is probably applicable to the entire class based on their common ability to inhibit prostaglandin synthesis.

We recommend that DDIs should be class-based only when the evidence (or reasonable extrapolation) applies to the entire pharmacological class of drugs.

3.4 Key Question 3: How Should a Structured Evaluation Process Be Vetted and Validated?

In Key Question 1, we recommended use of a new instrument as a standard to evaluate DDI evidence. However, any new DDI evidence evaluation instrument should undergo rigorous evaluation to ensure that it is easy for end users to apply and produces results that are generally concordant with other DDI evidence rating systems, except where differences are expected.

We recommend evaluating any new assessment tools using a subset of 15 ‘high-priority’ DDIs (drug pairs that should always generate an alert) approved by a panel of experts commissioned by the US ONC [63]. Several existing studies have also systematically collected evidence for sets of DDIs and then examined concordance among drug information sources regarding which DDIs they mention [12, 34, 39, 41, 64]. These studies can provide DDIs for which there are varying degrees of agreement across drug information sources (e.g., some DDIs that all sources mention and others that only one source mentions).

Evaluation tools should provide clear definitions and be easy to use. The results from such tools should be internally consistent and reproducible across users. Measures of agreement among experts, such as kappa statistics, should be reported in validation studies.
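
As an illustration of the agreement statistic mentioned above, the sketch below computes Cohen’s kappa for two hypothetical raters applying an evidence-evaluation instrument to the same set of DDIs; the rating categories and values are made up for the example.

```python
# Sketch of the agreement statistic mentioned above: Cohen's kappa for two
# raters applying an evidence-evaluation instrument to the same set of DDIs.
# The rating categories and values below are made-up illustrative data.

from collections import Counter

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    n = len(rater_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)

rater_a = ["established", "probable", "probable", "insufficient", "established"]
rater_b = ["established", "probable", "insufficient", "insufficient", "established"]
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.71 for this illustrative data
```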

4 Discussion

This expert workgroup was convened to recommend an approach for evaluating DDIs in order to provide consistent, evidence-based CDS systems for healthcare providers. Because of the numerous challenges to evaluating evidence for DDIs, we propose a systematic and transparent process to evaluate the evidence that supports the existence of clinically relevant DDIs. Furthermore, the use of a standardized approach to evaluating DDI evidence should help eliminate combinations with a low probability of harm and minimize legal liability for knowledgebase vendors and healthcare systems [23].

Our search for relevant tools to evaluate DDI evidence identified no instrument that possessed all of the attributes we believed to be important. Consequently, DRIVE was developed to evaluate the body of evidence for DDIs, drawing on important concepts from existing evidence evaluation methods while focusing on simplicity and explicit criteria for specific types of evidence [30, 31]. In this process, several terms were defined for use when evaluating DDIs because consistent application of terminology is a prerequisite for systematic evaluation.

We recommend that any systematic approach to evaluating DDI evidence be validated to ensure the method is reliable and worthwhile. To that end, the DRIVE approach requires further evaluation, including the development of explicit criteria for what constitutes a well designed and executed study. DDI evidence reported in product labeling should be evaluated by the same criteria as published studies to establish the sufficiency of DDI evidence.

For case reports, DIPS was judged to be the most appropriate published method to evaluate whether a DDI occurred [20]. Case reports may provide the first evidence of DDIs; however, using these reports as the sole evidence source has several disadvantages. They are often poorly described, leading to speculation and potentially inaccurate inferences about causal relationships. When case reports that are later found to be incorrect are used, erroneous information can be entered into prescription product labeling and/or drug information compendia, where it is very difficult to correct. Therefore, careful evaluation of case reports is needed to establish the existence of a DDI.

Limitations should be kept in mind when expert consensus is used to develop recommendations. The most significant limitation is that the results of consensus groups are driven by their membership. To mitigate this limitation, we invited members from diverse backgrounds. We refer the reader to other reports that outline additional limitations of consensus groups [65].

We accomplished our goal of identifying principles for establishing a systematic process for evaluating evidence for DDIs; however, more work remains in certain areas. Although the DRIVE instrument may be used in the future to affirm that a DDI exists, further evaluation is needed to establish criteria for assessing clinical relevance. This entails identifying the associated clinical effects and their magnitude, variability, and estimated frequency. Modifying factors that may increase or decrease the risk of patient harm should also be identified. Pharmacogenetic research can further improve the precision of DDI evidence and CDS by identifying patient-specific predisposing factors. More work is also needed to identify the most appropriate process to rate the quality of DDI evidence and provide graded recommendations to reduce the risk of adverse consequences [48]. We recommend considering the GRADE system, a well-accepted standard to indicate quality of evidence and strength of recommendations [53–56].

5 Conclusion

We convened a group of experts in pharmacology, drug information, biomedical informatics, and CDS to develop recommendations for best practices in evaluating DDI evidence. To promote agreement among DDI information systems, we recommend consistent use of relevant terminology, including DDI, potential DDI, clinically relevant DDI, and seriousness. We created a definition of what constitutes an NTI drug specifically related to evaluating the clinical relevance of a potential DDI. For evaluating case reports of DDIs, we recommend use of the DIPS tool. Once formally validated, the DRIVE instrument is recommended for evaluating a body of evidence for the existence of a DDI. Defining the clinical relevance of a DDI is extremely important because thousands of theoretical, but not clinically relevant, interactions exist. Evaluations of evidence should consider modifying factors as well as the frequency of occurrence (if known). Broad indictments of drugs in the same therapeutic class should be made cautiously and only when there is sufficient evidence based on biological plausibility and known pharmacology. Finally, we recommend that any tools used to assess DDI evidence undergo rigorous validation to ensure the approach yields reproducible and consistent findings.