Key Points

Disproportionality cannot be used for comparative drug safety analysis beyond basic hypothesis generation because measures of disproportionality are missing the incidence denominators, are subject to severe reporting bias, and are not adjusted for confounding.

Hypotheses generated by disproportionality analyses must be investigated by more robust methods before they can be allowed to influence clinical decisions.

1 Introduction

Because clinical trials have known limitations in detecting adverse drug events, particularly rare ones, post-marketing safety data are a valuable source of information for the early detection of new safety signals. The FDA Adverse Event Reporting System (FAERS) is one of the largest spontaneous reporting databases, containing reports of adverse events and medication errors from healthcare professionals, manufacturers, and consumers [1]. The number of reports has increased greatly in recent years, reaching 1,289,133 in 2014, more than double the annual number in 2009 [2]. Other large spontaneous reporting databases include EudraVigilance, maintained by the European Medicines Agency (EMA); VigiBase, maintained by the World Health Organization (WHO); and the French Pharmacovigilance Database, maintained by the French health authority [3,4,5,6].

To enable the screening of large volumes of spontaneous reporting data, data-mining methods have been developed based on the concept of disproportionality [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]. Disproportionality analyses compare observed reporting frequencies with those expected from the background of the database for pre-defined drug-event combinations and generate a “signal” or statistics of disproportionate reporting (SDR) when a pre-set threshold is exceeded. While no “gold standard” exists for determining which thresholds should be adopted to define an SDR, several metric/threshold combinations are commonly used [10, 16, 17].

Because disproportionality analysis is based on spontaneous reports submitted for a large number of drugs and adverse event types, one might consider using these data to compare safety profiles across drugs. Recent publications have promoted this practice, claiming to provide guidance on treatment decisions to healthcare decision makers [21,22,23,24]. In this article, we examine the validity of this approach.

In the following sections, we review key characteristics of a spontaneous case report, formally define the concept of disproportionality, state theoretical conditions that are needed for accurate comparison of drug safety based on disproportionality measures, and review interpretation of disproportionality measures in light of the likely ubiquitous violation of these conditions in practice.

2 A Spontaneous Case Report is Not Always an Adverse Drug Reaction

Systematic spontaneous reporting of possible drug-related adverse effects began in the UK in 1964 [25]. Since then, more than 50 countries, including all major health authorities, have adopted a voluntary spontaneous reporting system. As stated by the European Medicines Agency [26], “Case reports of suspected adverse reactions alone are rarely sufficient to confirm that a certain effect in a patient has been caused by a specific medicine.” Before a case report can be reasonably considered to reflect a true adverse drug reaction, it must typically pass two hurdles. First, the reporter must provide sufficient information to allow a reasonable assessment of the case. Second, the adverse event must appear to be causally related to the drug. Typically, the following criteria, derived from the work of Bradford Hill [27], are used for patient-level causality assessment:

  1. Temporality. Is the adverse event temporally related to the drug?

  2. Confounding/parsimony. Are there no other more likely reasons why the patient had the adverse event, for example other drugs the patient was receiving?

  3. Dechallenge/rechallenge. Did the adverse effect abate when the drug was discontinued or the dose lowered or did the adverse effect recur when the drug was restarted?

  4. Consistency with other knowledge. Given all available information about the drug’s safety profile (mechanism of action, pre-clinical, clinical, and epidemiological data, known class effects), does it seem likely that the reported adverse event was caused by the drug?

If, in the context of good medical judgment, a case report satisfies these criteria, then it is reasonable to assume that it reflects a credible, causally related adverse drug reaction. Unfortunately, the great majority of case reports do not meet these criteria, and this is one of the important general weaknesses of case reporting. The two most common reasons are a lack of sufficient information and confounding. Moreover, for most health authority sponsored databases, such as FAERS or VigiBase, the detailed information required to fully evaluate a case report and determine whether or not it represents a credible, causally related adverse drug reaction is not readily available. Therefore, while these databases can be used for disproportionality detection of possible safety signals, they do not allow the required causality assessments of the individual case reports that contributed to a disproportionality signal.

3 What is a Disproportionality Analysis?

Disproportionality analysis is a set of statistical signal detection techniques based on comparison of reporting proportions between the study drug and all drugs in the spontaneous reporting database combined [7, 8]. A reporting proportion is the number of reported adverse events of interest divided by the total number of reported adverse events. For illustration, consider the hypothetical data in Table 1.

Table 1 Hypothetical spontaneous reporting data on association of event X with drugs A and B

For event X in Table 1, the reporting proportion is 15/100 = 0.15 for drug A, 5/100 = 0.05 for drug B, and 5000/100,000 = 0.05 for all drugs combined. The latter quantity, known as the “expected” reporting proportion, is compared with the reporting proportion observed for any given drug [7, 8]. Disproportionality is inequality of the observed and expected reporting proportions. Its magnitude can be expressed by a reporting ratio (RR) [8]:

$$ {\text{Reporting ratio}} = \frac{{{\text{Observed}}\;{\text{reporting}}\;{\text{proportion}}}}{{{\text{Expected}}\;{\text{reporting}}\;{\text{proportion}}}} $$

In Table 1, the RR is 0.15/0.05 = 3.0 for drug A and 0.05/0.05 = 1.0 for drug B, indicating that disproportionality is present for drug A but not drug B. Other measures of disproportionality closely related to the RR include proportional reporting ratio (PRR), reporting odds ratio (ROR), information component (IC), and empirical Bayes geometric mean (EBGM) [7, 8].

Because the RR for each drug has the same denominator (the expected reporting proportion), the ratio of the RRs for event X estimated for any two drugs in a comparative safety analysis is simply the ratio of the drug-specific reporting proportions for event X. We will refer to this quantity as the relative reporting ratio (RRR):

$$ {\text{Relative reporting ratio}}\;({\text{RRR}}_{\text{ab}} ) \, = \frac{{{\text{RR}}\;{\text{for}}\;{\text{drug}}\;{\text{A}}}}{{{\text{RR}}\;{\text{for}}\;{\text{drug}}\;{\text{B}}}} = \frac{{{\text{Observed}}\;{\text{reporting}}\;{\text{proportion}}\;{\text{for}}\;{\text{drug}}\;{\text{A}}}}{{{\text{Observed}}\;{\text{reporting}}\;{\text{proportion}}\;{\text{for}}\;{\text{drug}}\;{\text{B}}}} $$
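The arithmetic behind these definitions can be sketched in a few lines of Python. This is only an illustration using the counts quoted in the text for Table 1; the dictionary layout and the function names are our own assumptions, not part of any pharmacovigilance library.

```python
# Counts quoted in the text for Table 1: (reports of event X, all reports).
# The data structure and function names are illustrative assumptions.
TABLE_1 = {
    "drug A": (15, 100),
    "drug B": (5, 100),
    "all drugs": (5_000, 100_000),
}


def reporting_proportion(event_reports, total_reports):
    """Reports of the event of interest divided by all reported events."""
    return event_reports / total_reports


def reporting_ratio(drug, table=TABLE_1):
    """Observed reporting proportion over the expected (all-drugs) proportion."""
    observed = reporting_proportion(*table[drug])
    expected = reporting_proportion(*table["all drugs"])
    return observed / expected


rr_a = reporting_ratio("drug A")
rr_b = reporting_ratio("drug B")
# The shared expected proportion cancels, leaving the ratio of the two
# drug-specific reporting proportions.
rrr_ab = rr_a / rr_b

print(f"RR(A) = {rr_a:.1f}, RR(B) = {rr_b:.1f}, RRR(A vs B) = {rrr_ab:.1f}")
```

Because the expected proportion cancels, the RRR depends only on the two drugs being compared and not on the rest of the database.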

It is important to remember that each disproportionality estimate is subject to random error and should therefore be interpreted along with a confidence interval (CI) or its Bayesian equivalent. If a 95% CI excludes 1, we can state (with 95% confidence) that disproportionality is present. On the other hand, if the interval includes 1, we cannot claim disproportionality, even if the estimated disproportionality value is large. Expressions for constructing CIs for disproportionality measures are readily available in the literature [6, 18].
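As a rough illustration of such an interval, the sketch below computes a PRR with an approximate 95% CI on the log scale, using the usual large-sample standard error for the ratio of two independent proportions. Two points are assumptions on our part and are not taken from the text: the choice of this particular approximation, and the use of "all drugs minus the drug of interest" as the comparator so that the two groups do not overlap. The function name `prr_with_ci` is hypothetical.

```python
import math


def prr_with_ci(a, n_drug, c, n_other, z=1.96):
    """PRR with an approximate CI based on a log-scale normal approximation.

    a, n_drug  : reports of event X and all reports for the drug of interest
    c, n_other : reports of event X and all reports for all OTHER drugs
    z          : normal quantile (1.96 for an approximate 95% interval)
    """
    prr = (a / n_drug) / (c / n_other)
    # Large-sample standard error of log(PRR) for two independent proportions.
    se_log = math.sqrt(1 / a - 1 / n_drug + 1 / c - 1 / n_other)
    lower = math.exp(math.log(prr) - z * se_log)
    upper = math.exp(math.log(prr) + z * se_log)
    return prr, lower, upper


# Table 1 counts; the comparator excludes the drug itself (our assumption).
prr_a, lower_a, upper_a = prr_with_ci(15, 100, 5_000 - 15, 100_000 - 100)
prr_b, lower_b, upper_b = prr_with_ci(5, 100, 5_000 - 5, 100_000 - 100)

print(f"Drug A: PRR = {prr_a:.2f}, 95% CI {lower_a:.2f} to {upper_a:.2f}")
print(f"Drug B: PRR = {prr_b:.2f}, 95% CI {lower_b:.2f} to {upper_b:.2f}")
```

With these counts, the interval for drug A lies entirely above 1, while the interval for drug B includes 1. The approximation is unreliable for very small counts, where Bayesian shrinkage measures such as the IC or EBGM are often preferred.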

4 What Conditions are Needed for Accurate Comparative Drug Safety Analysis Based on Disproportionality Measures?

A connection between disproportionality analysis and true treatment effects of drugs on safety end-points is always indirect because disproportionality measures use reporting proportions while safety profiles are defined in terms of event incidence. An incidence rate of event X in patients treated with a given drug is the ratio of the true event count (numerator) to the total person-time on treatment (denominator):

$$ {\text{Incidence rate}} = \frac{\text{True event count}}{{\text{Total person-time on treatment}}}$$

A background rate of event X in patients treated with a given drug is the rate that would be observed in the same patients in the absence of treatment with this drug. Causal effects of drugs A and B on event X can be expressed by the causal rate ratios (CRR) [28, 29]:

$$ {\text{CRR for drug A vs}}.{\text{ background }}({\text{CRR}}_{\text{a}} ) \, = {\text{ Incidence rate on drug A}}/{\text{background rate for drug A}} $$
$$ {\text{CRR for drug B vs}}.{\text{ background }}({\text{CRR}}_{\text{b}} ) \, = {\text{ Incidence rate on drug B}}/{\text{background rate for drug B}} $$
$$ {\text{CRR for drug A vs}}.{\text{ B }}\left( {{\text{CRR}}_{\text{ab}} } \right) = \frac{{{\text{CRR}}_{\text{a}} }}{{{\text{CRR}}_{\text{b}} }} = \frac{\text{Incidence rate on drug A/background rate for drug A}}{\text{Incidence rate on drug B/background rate for drug B}} $$

An incidence rate ratio (IRR) comparing drugs A and B with respect to event X is simply the ratio of their respective incidence rates:

$$ {\text{IRR for drug A vs}}.{\text{ B }}({\text{IRR}}_{\text{ab}} ) \, = {\text{ Incidence rate on drug A}}/{\text{incidence rate on drug B}} $$

Unlike the CRRab, the incidence rate ratio IRRab does not always have a causal interpretation because it does not account for differences in background rates between the drugs. Suppose for example that drugs A and B do not cause event X, but due to the differences in indications the background rate of event X for drug A is ten times higher than that for drug B. In this case, IRRab = 10, but the true CRRab = 1. Only when the background rates are the same for the two drugs will the incidence rate ratio IRRab coincide with the CRRab and thus have a causal interpretation.
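The example in the preceding paragraph can be checked numerically. The rates below are hypothetical values chosen only to reproduce the tenfold difference in background rates; the function names are ours.

```python
def incidence_rate_ratio(rate_a, rate_b):
    """IRR for drug A vs. B: ratio of the on-drug incidence rates."""
    return rate_a / rate_b


def causal_rate_ratio(rate_a, background_a, rate_b, background_b):
    """CRR for drug A vs. B: each rate is first divided by its background."""
    return (rate_a / background_a) / (rate_b / background_b)


# Hypothetical rates per 1000 patient-years. Neither drug causes event X,
# so each on-drug rate equals the background rate in its own population.
background_a, background_b = 10.0, 1.0
rate_a, rate_b = background_a, background_b

irr_ab = incidence_rate_ratio(rate_a, rate_b)
crr_ab = causal_rate_ratio(rate_a, background_a, rate_b, background_b)

print(f"IRR(A vs B) = {irr_ab:.1f}, CRR(A vs B) = {crr_ab:.1f}")
```

The IRR of 10 here reflects only the difference in patient populations, while the CRR of 1 correctly indicates no difference in causal effect.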

As is well known, consistent estimation of the causal rate ratios is guaranteed by randomization [28]. On the other hand, if we attempt to estimate the effect of drug A versus drug B on event X using disproportionality measures based on spontaneous reporting data, we encounter three major difficulties: (1) the incidence denominators for post-marketing events are missing, (2) the reported post-marketing events represent an unknown fraction of the true incident event count (under-reporting), and (3) background event rates in the post-marketing settings may differ between the two drugs, since the treatments are not randomized and may even have different indications (confounding).

We show in the “Appendix” that the causal rate ratio CRRab comparing drug A versus drug B with respect to a specific adverse event X can be accurately approximated by the relative reporting ratio RRRab obtained from disproportionality analysis if the following three conditions hold:

  1. Equal overall reporting rates: the two drugs have the same reporting rate for all adverse events combined (a reporting rate is a ratio of the reported event count to the total person-time on treatment).

  2. No differential reporting bias: under-reporting for event X is either absent or has the same relative magnitude for the two drugs (this magnitude is given by the ratio of the reporting rate to the incidence rate).

  3. No confounding: patients treated with the two drugs have the same background rate of event X (i.e., patients with a higher baseline risk of event X are not preferentially selected for treatment with one of the two drugs).

Condition (1) allows for replacement of the “missing” incidence denominators with overall reported event counts for each of the two drugs, since it implies that the relative reporting ratio from disproportionality analysis is equal to the reporting rate ratio (see “Appendix”). Condition (2) ensures that the reporting rate ratio is equal to the incidence rate ratio, while condition (3) guarantees that the incidence rate ratio has a causal interpretation (i.e., that it is equal to the causal rate ratio). In the following sections we explain these conditions individually and consider their plausibility.
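The chain of equalities described above can be sketched as follows. The notation is introduced here only for illustration and may differ from that used in the “Appendix”: for drug A, n_a is the reported count of event X, N_a is the reported count of all adverse events, T_a is the total person-time on treatment, and I_a is the true incidence rate of event X, with the corresponding quantities for drug B. The relative reporting ratio can always be written as the reporting rate ratio for event X multiplied by the inverse ratio of the overall reporting rates:

$$ {\text{RRR}}_{\text{ab}} = \frac{n_{a}/N_{a}}{n_{b}/N_{b}} = \frac{n_{a}/T_{a}}{n_{b}/T_{b}} \times \frac{N_{b}/T_{b}}{N_{a}/T_{a}} $$

Under Condition (1), the overall reporting rates are equal, so the second factor is 1 and the RRR equals the reporting rate ratio for event X. Under Condition (2), the reporting rate for event X is the same fraction f of the incidence rate for both drugs, so this fraction cancels:

$$ \frac{n_{a}/T_{a}}{n_{b}/T_{b}} = \frac{f \times I_{a}}{f \times I_{b}} = \frac{I_{a}}{I_{b}} = {\text{IRR}}_{\text{ab}} $$

Finally, under Condition (3), the background rates are equal, and dividing both incidence rates by the same background rate leaves the ratio unchanged:

$$ {\text{IRR}}_{\text{ab}} = \frac{I_{a}/{\text{background rate}}}{I_{b}/{\text{background rate}}} = {\text{CRR}}_{\text{ab}} $$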

4.1 Accounting for the Missing Incidence Denominators

Because disproportionality measures do not use the incidence denominators, which are unknown, they may easily point in the wrong direction in a comparative drug safety analysis, even when reporting bias and confounding are completely absent. For example, consider again the data in Table 1, and suppose that the reported cases of event X represent all true incident cases (no reporting bias), and that patients treated with drugs A and B have the same background rate of event X (no confounding). Table 1 shows that disproportionality for adverse event X is present for drug A (RR = 3) but not drug B (RR = 1), with a relative reporting ratio of 3/1 = 3. Does this mean that drug B is safer than drug A with respect to adverse event X?

The problem here is that the treatment exposure person-time giving rise to the observed cases of event X is completely unknown for each drug. Therefore, the risk of event X in patients treated with drug A versus drug B is also unknown. Suppose for example that the 15 cases of event X on drug A occurred after 5000 patient-years on treatment (event rate = three per 1000 patient-years), while the five cases of event X on drug B occurred after 1000 patient-years on treatment (event rate = five per 1000 patient-years). In this case, the causal rate ratio for drug A versus drug B is 3/5 = 0.6, which is pointing in the opposite direction from that suggested by disproportionality measures.
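The reversal described above can be reproduced with a short calculation. The counts and person-time are those given in the text; the dictionary layout and helper names are illustrative assumptions.

```python
# Counts from Table 1 plus the hypothetical exposure quoted in the text.
drugs = {
    "drug A": {"event_x": 15, "all_events": 100, "patient_years": 5_000},
    "drug B": {"event_x": 5, "all_events": 100, "patient_years": 1_000},
}


def reporting_proportion(d):
    """Share of event X among all reported events for one drug."""
    return d["event_x"] / d["all_events"]


def event_rate_per_1000(d):
    """Event X rate per 1000 patient-years on treatment."""
    return 1000 * d["event_x"] / d["patient_years"]


def overall_reporting_rate_per_1000(d):
    """All reported events per 1000 patient-years on treatment."""
    return 1000 * d["all_events"] / d["patient_years"]


a, b = drugs["drug A"], drugs["drug B"]
rrr_ab = reporting_proportion(a) / reporting_proportion(b)
rate_ratio_ab = event_rate_per_1000(a) / event_rate_per_1000(b)
overall_ratio_ba = overall_reporting_rate_per_1000(b) / overall_reporting_rate_per_1000(a)

print(f"RRR(A vs B) = {rrr_ab:.1f}")
print(f"Rate ratio(A vs B) = {rate_ratio_ab:.1f}")
print(f"Overall reporting rate ratio (B vs A) = {overall_ratio_ba:.1f}")
```

In this example the relative reporting ratio (3) differs from the rate ratio (0.6) by exactly a factor of 5, which is the ratio of the overall reporting rates of drug B to drug A. The discrepancy is therefore entirely due to unequal overall reporting rates.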

Nevertheless, as shown in the “Appendix”, when overall reporting rates are the same for the two drugs (Condition 1), the relative reporting ratio from disproportionality analysis accurately estimates the reporting rate ratio, which coincides with the incidence rate ratio if there is no differential reporting bias (Condition 2) and coincides with the causal rate ratio if there is also no confounding (Condition 3). Unfortunately, the assumption of equal overall reporting rates is not verifiable with spontaneous reporting data and may easily be violated in any given comparison. A typical example of this is shown in Fig. 1. When overall reporting rates differ between the drugs, disproportionality analysis cannot be expected to provide an accurate estimate of the causal rate ratio for any given adverse event even if reporting bias and confounding are completely absent.

Fig. 1 Real data on overall reporting rates of two Novartis drugs (labeled here as drugs A and B) illustrating typical variability in reporting

4.2 Reporting Bias in Disproportionality Analysis

Due to systematic under-reporting of adverse events in the post-marketing settings, reported adverse events represent an unknown fraction of the true incident event count. For any given adverse event, the magnitude of this under-reporting may easily differ between the drugs, giving rise to differential reporting bias, and may also change over time within a given drug, further complicating comparison of safety profiles across drugs based on disproportionality measures.

A good example is provided by Moore et al. [30]. Differential reporting of specific events (sudden death and fatal arrhythmia) for sertindole as compared to the other atypical neuroleptics of the class led to the suspension of this drug as it appeared to have a tenfold higher PRR than the comparator products. However, after taking into account additional data sources and recognizing the reporting bias for sertindole, the suspension was lifted after almost 3 years. The fact that electrocardiogram (ECG) monitoring at that time was mandatory only for sertindole, and not for the other neuroleptics, and physicians were informed about this by a “Dear doctor letter,” clearly led to a higher reporting proportion for QT-interval prolongation and asymptomatic arrhythmia for sertindole as compared to the other neuroleptics. According to the authors, in retrospect, differential reporting of death in general and cardiovascular death in particular had generated the tenfold increased PRR [30].

Changes in reporting patterns of individual drugs over time were first described by Weber [31]. The classical “Weber effect” is an increase in the absolute number of reports post-approval, which peaks at the end of the second year and then declines [31,32,33]. While reporting patterns of modern drugs show considerable variability and need not necessarily follow the classical Weber curve [34], it is clear that these patterns may change substantially over time within a given drug due to factors such as regulatory action, media attention, approval of new indications, and many others [32].

One of the many issues that can bias a comparative disproportionality analysis of two drugs is the time frame chosen for the analysis. Any time point selected for analysis is subject to multiple sources of bias that may capriciously favor one drug over another, making direct comparison inappropriate and uninformative.

For example, selecting a time of media attention on an adverse event may create a bias in favor of older drugs on the market. This simple scenario highlights two components of temporal bias: the point in a drug’s lifetime at which it is analyzed, and the period of increased media attention. In this scenario, when newer drugs are compared with established drugs, disproportionality scores may be high for the newer drugs. Research by Pariente et al. has shown that, for older drugs, the events reported during the high-reporting period were diluted by years of low reporting [35].

Another example, shown below (Fig. 2), clearly illustrates the effects of the time-frame bias. Selecting the time point of 2011, one could conclude that disproportionate reporting for embolic and thrombotic events is lower for dabigatran than for all other anticoagulants. However, if one considers 2012 and 2013, warfarin is clearly the lowest. Finally, beginning in 2014, apixaban has the lowest measure of disproportionate reporting. In this typical example, we see that the conclusion one reaches depends on the time frame one selects.

Fig. 2 Reporting proportions of several anticoagulants over time

If reporting proportions are compared across drugs, the time frame for analysis should be selected to minimize, as far as possible, differences in reporting patterns due to time on the market or other known factors. The time on the market can be standardized by using the same fixed-length post-approval analysis time frame for each drug. This, however, would not account for differences in the calendar periods covered for each drug or for the numerous other factors influencing reporting.

4.3 Confounding in Disproportionality Analysis

True incidence rates as well as reporting proportions for any given adverse event may differ between the drugs due to differences in drug indications, quite independently from any drug effects. For example, suppose that drugs A and B have no causal effect on adverse event X, but drug A is used in a population of patients where adverse event X is common and represents a large fraction of all adverse events, while drug B is used in a population of patients where adverse event X is rare and represents a small fraction of all adverse events. In this case, disproportionality analysis will produce a much larger RR for drug A than for drug B, even though neither drug causes this adverse event.
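A small numerical sketch of this scenario follows. All rates are hypothetical values chosen for illustration. To isolate confounding, we assume complete reporting and the same overall event rate in both populations, so that Conditions (1) and (2) of Sect. 4 hold.

```python
# Hypothetical background rates per 1000 patient-years (illustrative only).
# Neither drug affects event X, so on-drug rates equal background rates.
populations = {
    "drug A": {"event_x": 30.0, "other_events": 70.0},
    "drug B": {"event_x": 3.0, "other_events": 97.0},
}


def expected_reporting_proportion(rates):
    """Share of event X among all events, assuming complete reporting."""
    return rates["event_x"] / (rates["event_x"] + rates["other_events"])


prop_a = expected_reporting_proportion(populations["drug A"])
prop_b = expected_reporting_proportion(populations["drug B"])
rrr_ab = prop_a / prop_b
true_crr_ab = 1.0  # by construction: neither drug causes event X

print(f"Reporting proportion: A = {prop_a:.2f}, B = {prop_b:.2f}")
print(f"RRR(A vs B) = {rrr_ab:.1f}, true CRR(A vs B) = {true_crr_ab:.1f}")
```

Under these assumptions the relative reporting ratio is 10 even though the true causal rate ratio is 1, so the apparent difference between the drugs is produced entirely by the difference in patient populations.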

More generally, confounding in disproportionality analysis results from differences in background event rates in patient populations utilizing different drugs. For example, in one recent study, a safety profile of asthma drugs “as a class” was examined relative to all other drugs in FAERS based on disproportionality analysis [21]. Because patients with asthma may differ from “all other patients” with respect to baseline rates of specific adverse events, confounding can introduce serious interpretational difficulties in such analysis in addition to those already present due to the “missing” incidence denominators and reporting bias.

The effect of confounding may potentially be reduced by limiting disproportionality analysis to drugs within the same therapeutic area [9, 11], and possibly by adjusting for age and co-medications using methods such as logistic regression [12]. However, even in that case the magnitude of residual confounding can be large due to the actual differences in drug utilization within therapeutic areas and multiple unmeasured confounders.

5 Interpretation of Disproportionality Measures in the Context of Comparative Drug Safety Analysis

Disproportionality analysis is a method of signal detection. This is one of several approaches used to draw attention to particular drug-event combinations deserving further investigation. As such, disproportionality analysis is at most hypothesis-generating, and should not be viewed as a definitive characterization of safety profiles [36]. This is well recognized by health authorities [6, 13]. It has also been argued that due to their very limited scope as a signal detection tool, disproportionality analyses should be given low priority for publication in scientific journals [19, 20]. As a general rule, SDRs representing “technical hits” warrant further investigation, extending the evaluation to other data sources, such as toxicology, animal studies, clinical trials, epidemiological studies, and literature, in order to establish causality. Therefore, we feel that publication of disproportionality findings without consideration of such additional evidence may be of little value.

While conditions for accurate estimation of treatment effects based on disproportionality measures exist (Sect. 4), they are not verifiable with spontaneous reporting data and can be easily and severely violated in any given application. Definitive measurement of efficacy and safety profiles of therapeutic interventions is provided by randomized trials. In the post-marketing settings, the methods of pharmacoepidemiology play an important and growing role in investigation of safety signals, and may constitute the best option for getting a rapid and reliable answer to a safety question [9, 37].

6 Conclusion

Disproportionality cannot be used for comparative drug safety analysis beyond basic hypothesis generation because measures of disproportionality are missing the incidence denominators, are subject to severe reporting bias, and are not adjusted for confounding. Hypotheses generated by disproportionality analyses must be investigated by more robust methods before they can be allowed to influence clinical decisions.