1 Introduction

For some drugs with serious risks it may be necessary to improve the benefit-risk balance with measures extending beyond the routinely required summary of product characteristics (SmPC), package leaflet, packaging labelling, pack size and design, and legal status of a drug [1]. In the European Union (EU), such extra activities are referred to as additional risk-minimisation measures (aRMMs).

Examples of aRMMs include educational programmes for healthcare professionals (HCPs) and patients, and required interventions such as pre-prescription screening of patients to ensure the appropriate patient population [2]. The measures are often implemented through use of appropriately selected tools, such as a prescriber brochure and patient card. In the EU, when an aRMM is necessary, this is set out in the risk-minimisation section of the risk management plan (RMP), a mandatory part of all new marketing applications [3]. The RMP considers the important identified and potential safety concerns of the drug and missing information within the safety specification, the planned pharmacovigilance activities to monitor and further characterise these safety concerns during the post-marketing period (including post-authorisation safety studies), and risk-minimisation measures (RMMs) (both routine and additional) that aim to prevent and mitigate the safety concerns in clinical practice [1]. In the United States (US), the Food and Drug Administration (FDA) can require a risk evaluation and mitigation strategy (REMS) for a medicinal product, which can be a compulsory element of usage of the drug [4]. Although the specific legislation differs between the EU and the US, in both jurisdictions these additional measures are interventions that aim to minimise the occurrence and/or impact of adverse drug reactions (ADRs).

The risk management approach, including the use of RMMs, is evolving [5], and new standards are required under the EU pharmacovigilance legislation that came into effect in 2012 [1, 6, 7]. The need for aRMMs or a REMS is evaluated during the registration process of a drug and also on a continuous basis during the life cycle of the drug (Fig. 1). The number of products with aRMMs has grown. The proportion of centrally approved active substances with aRMMs in the EU was 5 % among products authorised before the RMP became mandatory in 2005 and 29 % among products approved afterwards [2]. As a result, the need for adequate evaluation has also increased.

Fig. 1 Evaluation of risk minimisation at different lifecycle stages

Measuring effectiveness of the RMMs and REMS post-marketing is an important part of the continuous re-evaluation of the benefit-risk balance of a drug. This also includes an assessment of whether the existing RMMs are sufficient and enables modification of the initial measures to improve the risk-minimisation strategy if warranted. A feedback loop to detect potential issues with the adopted RMMs, so that corrective actions can be implemented promptly, is a key component of the overall approach. With the recent EU pharmacovigilance legislation, monitoring the effectiveness of the aRMMs has become mandatory for both marketing authorisation holders (MAHs) and regulatory authorities [6].

In the US, a REMS assessment plan needs to be approved in advance of REMS implementation. The REMS effectiveness analysis is normally needed at 18 months, 3 years and 7 years after REMS approval [4]. The REMS assessments should include an evaluation of the extent to which each of the REMS elements is meeting the goals and objectives of the REMS, and whether or not the goals, objectives, or REMS elements should be modified. Other regions are developing their own RMP requirements. Many are based closely on the EU-RMP template (such as Brazil), some have a country-specific addendum (such as Australia), whilst others, including Japan, have implemented their own template. However, it is likely that all will require some form of post-launch evaluation of any aRMMs, so the principles discussed in this paper are relevant for their potential impact on global regulatory policies.
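As a minimal sketch of the assessment timetable described above, the nominal due dates can be derived from the REMS approval date. The function names and the approval date are hypothetical, and the offsets simply encode the 18-month, 3-year and 7-year points cited in the text; an actual REMS may specify a different schedule.

```python
import calendar
from datetime import date

# Offsets in months for the usual assessment points cited in the text:
# 18 months, 3 years and 7 years after REMS approval.
ASSESSMENT_OFFSETS_MONTHS = (18, 36, 84)


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`,
    clamping the day to the last day of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def assessment_dates(approval: date) -> list[date]:
    """Nominal assessment dates for a given REMS approval date."""
    return [add_months(approval, m) for m in ASSESSMENT_OFFSETS_MONTHS]


if __name__ == "__main__":
    # Hypothetical approval date, chosen to exercise the month-end clamp.
    for due in assessment_dates(date(2013, 8, 31)):
        print(due.isoformat())
```

Clamping to the month end is an illustrative design choice for approval dates that fall on a day the target month does not have.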

Currently, there is limited knowledge on the approaches for evaluating tools, and the determinants of success of RMMs. In this paper, the evaluation strategies and the challenges involved in evaluating RMMs during the post-marketing phase are discussed. Potential outcome measures and their interpretation are also considered.

2 Models of Evaluation of Effectiveness

Prieto et al. [8] described a model assessing RMMs and their implementation (process indicators), including measurement of tool delivery and acquired clinical knowledge, as well as the resulting clinical behaviours. Effectiveness of the risk-minimisation strategy as a whole can be evaluated by demonstrating a reduction in the occurrence or severity of ADRs (Fig. 2). A key goal for some programmes may be appropriate patient selection to ensure maximal benefit-risk balance. Proper patient selection may exclude high-risk patients from treatment or could optimise outcomes by maximising benefit even though the frequency and severity of ADRs may not change.

Fig. 2 Evaluating the effectiveness of risk-minimisation plans by a dual evidence approach: the implementation process and the outcomes (adapted from Prieto et al. [8]). ADR adverse drug reaction, RMM risk-minimisation measures, RMP risk management plan

Prieto’s model usefully differentiates between process indicators and final outcome indicators, providing a basic hierarchy of evidence for RMM evaluation. However, both this model and the good pharmacovigilance practices (GVP) Module XVI on the evaluation of RMMs remain relatively non-specific regarding detailed methodology for evaluating the risk-minimisation plan as a whole, other than suggesting the use of epidemiological outcomes studies [1, 7].

A 5-level model [9] (Fig. 3) with different evaluation levels, resulting in increasing utility of information, may be used to determine RMM effectiveness. The evaluation levels range from level 1 (risk-minimisation tool coverage), addressing the distribution of the risk-minimisation tools, to level 5 (safety outcomes), covering linkage of risk-minimisation tool usage to safety outcomes; i.e., occurrence of ADRs. Behavioural (level 4) and safety outcomes (level 5) data may be harder to obtain but generally have higher value than information on tool coverage (level 1), awareness and usage (level 2) and knowledge (level 3) metrics.

Fig. 3 A 5-level framework covers both individual risk-minimisation tools and programme evaluation, and focuses on the quality of evidence. PASS post-authorisation safety study, RM risk minimisation

The effectiveness of individual tools can be measured at all levels, whereas the success of the overall programme in meeting goals and objectives can be evaluated at levels 3 to 5. This 5-level model adds a detailed hierarchy of evidence into the evaluation of tools, and attempts to link the evaluation of individual tools and the risk-minimisation plan as a whole into a single continuum.
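The structure of the 5-level model can be written down as a small data structure. The following is an illustrative sketch only: the class and function names are hypothetical, and the level labels paraphrase the descriptions given in the text rather than reproducing Fig. 3.

```python
from enum import IntEnum


class EvaluationLevel(IntEnum):
    """Levels of the 5-level model, labelled as described in the text."""
    TOOL_COVERAGE = 1        # distribution of the risk-minimisation tools
    AWARENESS_AND_USAGE = 2
    KNOWLEDGE = 3
    BEHAVIOUR = 4
    SAFETY_OUTCOMES = 5      # linkage of tool usage to occurrence of ADRs


def levels_for(scope: str) -> list[EvaluationLevel]:
    """Levels at which a given scope can be evaluated.

    'tool'      -> all levels
    'programme' -> levels 3 to 5, as stated in the text
    """
    if scope == "tool":
        return list(EvaluationLevel)
    if scope == "programme":
        return [lvl for lvl in EvaluationLevel if lvl >= EvaluationLevel.KNOWLEDGE]
    raise ValueError(f"unknown scope: {scope!r}")


if __name__ == "__main__":
    for scope in ("tool", "programme"):
        print(scope, [lvl.name for lvl in levels_for(scope)])
```

Using an ordered enumeration reflects the model's premise that higher levels generally carry more useful information.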

A complementary model (Fig. 4) [10] evaluates effectiveness at various intervals. In this model, the complete RMM strategy and risk-minimisation tool content and face validity can be assessed in level 1 (pre-approval phase). Assessment of the implementation of the risk-minimisation tools includes use of the tools and the acquired clinical knowledge and behaviour (level 2). The overall effectiveness of the RMM and the impact on the occurrence or severity of the safety concern is assessed in level 3. The second and third levels provide complementary information relevant for the assessment of the RMM’s impact on the benefit-risk balance of the drug. This latter model makes the iterative elements of evaluation, correction, and re-audit integral to the overall chronological process.

Fig. 4 Evaluation steps increase in utility of information with time after implementation (from Zomerdijk et al. [10]). aRMM additional risk-minimisation measure, RM risk minimisation

These three models should prove useful for communicating the concept of assessing both implementation and outcomes, and aiding more detailed planning of components in the evaluation of effectiveness post-marketing.

3 Shortcomings of Current Risk-Minimisation Evaluation Approaches

Regardless of the model, a number of challenges emerge when considering effectiveness evaluation systems in general.

3.1 Appropriate Data Collection

Appropriate data collection is important to adequately assess the RMMs and many data sources can be used. Some examples of potentially suitable metrics and their interpretation are shown in Table 1 for each level of the 5-level model.

Table 1 Possible risk-minimisation tool evaluation metrics and suggested interpretations based on the 5-level model (see Fig. 3)

Information on patients’ or HCPs’ knowledge, behaviour, and drug use can be prospectively collected via surveys. With this type of data collection, specific and detailed information from the target group can be collected. However, issues with recruiting participants, and small or unrepresentative sample sizes may occur which make it difficult to draw robust conclusions [11].

An independent study of REMS assessments submitted by MAHs and reviewed by the FDA between 2008 and 2011 highlighted problems with timely submission, completeness and achievement of stated goals. The issues identified included difficulties with data collection (predominantly survey-based methods were used) and sample sizes that were too small to allow conclusions to be drawn. Almost half of the REMS assessments reviewed did not include all the information requested in FDA assessment plans. Based on these results, the study recommended that the FDA identify and implement reliable methods for assessing the effectiveness of REMS, decrease its reliance on survey data in REMS assessments and work with MAHs and healthcare providers to develop more accurate evaluation methods [12].
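The effect of sample size on the conclusions that can be drawn from a knowledge survey can be illustrated with a confidence interval for a proportion. The sketch below uses the Wilson score interval, which is one common choice and not a method prescribed by the sources cited here; the respondent counts are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95 % confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical surveys: the same 80 % of respondents answer a key
    # risk question correctly, but with 30 versus 300 respondents.
    for correct, respondents in ((24, 30), (240, 300)):
        low, high = wilson_interval(correct, respondents)
        print(f"n={respondents}: {low:.2f} to {high:.2f}")
```

With the same observed proportion, the smaller survey yields a much wider interval, so it may be unable to show whether a pre-specified knowledge threshold was met.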

Other known limitations for surveys include sample populations that do not reflect the demographics of the target population, bias caused by convenience samples (‘lower-risk’ patients and HCPs are often more likely to take part in such surveys [13, 14]) and a lack of objective standards to measure knowledge of risks. Furthermore, knowledge and behaviour surveys are usually based on a subject’s recall of events or expectations, rather than direct measurement of how risk education affects behaviour, meaning surveys may fail to reflect real tool use and utility for all intended users. Response rates to surveys are often low, indicating that they may represent a burden on clinical practice [15].

Evaluations of aRMMs or REMS effectiveness that allow assessment of safety outcomes often rely on integration with data sourced from electronic healthcare records, or from disease or drug registries [16, 17], as electronic healthcare databases include information on drug prescription (which reflects the HCP’s behaviour) and patient safety outcomes. These data sources are often used within pharmacoepidemiology and pharmacovigilance research and provide opportunities to study effectiveness of RMMs. Using routinely collected data reflecting actual care is efficient, and timely feedback on the RMMs may be provided.

However, electronic healthcare databases may not capture sufficient and relevant data [10] and only cover some drugs [18]. Examples of electronic healthcare databases are the administrative (claims) databases, routine primary care databases, pharmacy dispensing databases, hospital databases and disease/drug registries [19, 20]. Several claims databases (e.g., Premier in the US) and prescription databases, such as the Nordic prescriber databases and Clinical Practice Research Datalink (CPRD), can give valuable information on drug utilisation. Studies can examine the extent of off-label use, indication, dosage and prescriber characteristics, offering useful indirect information on safety outcomes that can be correlated with behaviours. However, although HCP behaviour on drug prescribing and patient follow-up information in the form of coded events is available, knowledge cannot be measured from these data sources.

Spontaneous reporting systems are also possible data sources as these include case reports of patients who developed adverse events. These systems, however, suffer from biases such as under-reporting and lack of a suitable overall denominator (i.e., total exposure to the drug, number of patients exposed or number of drug doses administered), which inevitably hampers interpretation of results. The use of spontaneous reports may therefore not be suitable or sufficient.

Another issue is the timing of effectiveness evaluation. The interval between tool deployment, data collection, interpretation and actioning changes can often be a number of years, whereas ideally, efficient and timely evaluation is needed to allow early closure of the audit loop and timely aRMM amendment if necessary. It takes time for a newly launched drug to sufficiently penetrate the market and often certain sample sizes are necessary to be able to observe desired effects and draw conclusions on the study outcomes. Therefore, the timing of assessment should be appropriate for the intervention and the expectations of all stakeholders, including regulators, should be realistic. REMS have a mandated timeline for assessments which may not be an appropriate fit for every REMS.

Overall, strengths and limitations of the type of data collection should be carefully considered on a case-by-case basis depending on the RMMs, the safety concerns and drug involved.

3.2 Lack of Comparators

Drugs with RMMs that are required at the time of initial marketing authorisation do not have defined, unbiased comparator groups of tool non-users for post-launch RMM evaluation. Post-hoc analysis may be used, where risk-minimisation tool users are compared with non-users, although in practice it may be difficult to distinguish them. Furthermore, there may be other confounding factors, such as a propensity to riskier clinical practice by ‘non-users’ that contributes to an increased risk of occurrence of a particular safety concern. This makes any observed difference difficult to attribute to any positive effects of the tools themselves.

It would also not be ethical in the post-approval setting to have a control group where RMMs that contribute to the favourable benefit-risk balance of the drug were not available. However, when the value of the risk minimisation is unclear, a potential approach could be a phased implementation of aRMMs that initially includes a comparator population that does not use the aRMMs. Such an approach has been utilised for modifying the risk minimisation of an already-launched antifungal product, with the management of unresolved hepatotoxicity safety issues involving a modified aRMM approach piloted in two EU countries, prior to interim evaluation and subsequent rollout in the rest of the EU [21].

An alternative solution would be to test the proposed RMM in a proportion of the phase III study population. This allows some valuable comparative information to be gained, albeit not in a real-world setting. It may also be possible to compare different drugs, with and without aRMMs, or compare the safety outcomes for the drug with a reference value for the target or general population [22]. Nevertheless, all comparisons will have their limitations and the most appropriate solution should be selected on a case-by-case basis, and will be dependent on the data available.

3.3 Lack of Meaningful Outcomes

RMMs aim to minimise the inherent risks of drug treatments, thus optimising the benefit-risk balance of the drug. Ideally, successful implementation should lead to a reduced ADR rate and/or severity, by increasing the patients’ and HCPs’ knowledge and adapting their behaviour (e.g., appropriate patient selection), as shown in Fig. 2. Since the aRMMs are developed for each drug product independently, a ‘gold standard’ set of outcome measures cannot currently be defined and only the broad outcomes can be outlined.

For example, in drugs with the risk of teratogenicity, a meaningful outcome is prevention of pregnancy or no foetal exposure to the drug in question. The aim is to guide desired behaviours (e.g., use of contraceptives and HCPs providing appropriate advice to patients) to meet this goal [23]. However, even with drugs such as isotretinoin, where strict RMMs have been implemented in the form of a pregnancy prevention programme, pregnancies still occur. The real outcome of interest might be minimisation of infants with congenital malformations by preventing pregnancies [24, 25]. In the case of pregnancies occurring, the issue of determining an acceptable threshold based on the benefit-risk profile of the drug still arises; that is, evidence of successful effectiveness of an RMM relies on a stated goal for defining success [26].

Most evaluations have so far concentrated on measures of process, such as tool distribution and utilisation results, rather than clinical outcomes, such as reducing or eliminating ADRs, or fewer patients with absolute or relative contraindications [26]. The three models discussed earlier highlight the need to also evaluate the latter.

Linking risk management activities to meaningful changes in safety outcomes remains a challenge, as demonstrated by the FDA’s recent exercise to address prescription opioid abuse and over-prescribing [27], which mirrors the experience in other markets. Although evaluation of effectiveness was being performed, the data collected failed to either support the goal of improving RMMs, or provide evidence to enable future ‘de-commissioning’ of RMMs that have outlived their original purpose.

3.4 Uncertainty About Interpretation of Evaluation Metrics

It is rare to be able to directly associate a reduction in ADRs with specific RMMs. Often, only cross-sectional data on safety outcomes are available that are not directly linked to data on the patients’ usage/non-usage of tools. Baseline data on knowledge and behaviour are also frequently not available.

Spontaneous reporting rates of ADRs have too many biases, such as under-reporting, to enable a change in the frequency of ADR reports to be directly attributed to a risk-minimisation intervention. This is particularly the case in the period shortly after the aRMM has been introduced or when there has been public communication about a serious event. The outcome may be an apparent rise in spontaneously reported ADRs, due to better prescriber and patient awareness of a risk increasing the reporting rate rather than an increase in the actual ADR rate.
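A simple worked example shows why a change in report counts cannot be read directly as a change in ADR occurrence. In this sketch the number of reports is modelled as exposure multiplied by the true ADR risk and by the fraction of ADRs that are reported; the model is a deliberate simplification and all figures are hypothetical.

```python
def expected_reports(exposed_patients: int,
                     true_adr_risk: float,
                     reporting_fraction: float) -> float:
    """Expected number of spontaneous reports under a simple model:
    reports = exposed patients x true ADR risk x fraction of ADRs reported."""
    return exposed_patients * true_adr_risk * reporting_fraction


if __name__ == "__main__":
    # Hypothetical scenario: after the aRMM is introduced the true ADR risk
    # falls from 1.0 % to 0.8 %, while heightened awareness doubles the
    # reporting fraction from 5 % to 10 %.
    before = expected_reports(10_000, 0.010, 0.05)
    after = expected_reports(10_000, 0.008, 0.10)
    print(f"expected reports before: {before:.1f}")
    print(f"expected reports after:  {after:.1f}")
```

Under these assumed figures the expected number of reports rises even though the underlying risk has fallen, which is the pattern described above.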

The impact of RMMs on drug use is difficult to predict. Reber et al. [28] examined changes in use for 58 new drugs following direct healthcare professional communications (DHPCs), colloquially known as ‘Dear Doctor letters’. The results showed that DHPCs have a complex effect in changing clinical prescription behaviour. In about half the evaluated cases, DHPCs lowered overall drug use in the short term, and for around a third of the drugs long-term use was reduced.

Examination of prescribing outcomes linked to the patient’s condition can identify patients in risk groups who may be receiving the drug inappropriately (i.e., not in line with the risk-minimisation recommendations). HCPs may decide to prescribe a drug not in accordance with the recommended RMMs based on valid clinical reasons for individual patients, whereas the aRMMs and REMS aim to improve the benefit-risk balance of a drug at a patient population level. Whilst this is acceptable, it means that 100 % adherence to risk-minimisation recommendations is not feasible in clinical practice. However, if the frequency or severity of reported ADRs remains high and the benefit-risk balance of the drug therefore remains uncertain, appropriate regulatory action should be taken. This may include modification of the risk-minimisation strategies.

3.5 Lack of Benchmarking

It is difficult to predict which levels of distribution, tool uptake and impact on knowledge, behaviours and attitudes constitute success. A first round of evaluation, following marketing authorisation, provides a benchmark against which future evaluations may be compared. As the number of drugs with aRMMs grows, and experience with evaluating these measures evolves, acceptable outcome measures will be developed.

Such benchmarking will allow newly introduced aRMMs to be compared against these first-round evaluation measures. However, in order to be meaningful, benchmarking will need to cover different patient groups, specialist versus generalist prescribers, geography and therapeutic area, as these and similar factors may alter risk-minimisation tool uptake. Further research is needed to understand the impact of these multiple factors on influencing the implementation of individual RMMs. Continued collaboration between industry and regulatory agencies will be necessary to facilitate and agree suitable benchmarking metrics, and will also require an overt regulatory policy on greater transparency on publication of the results of effectiveness evaluation.

4 Innovative Web-Based Tools Facilitate Data Collection

Many of the shortcomings of current evaluation methods could be addressed by implementation approaches that allow consistent, timely collection of evaluation data, relevant to the objectives of the specific RMMs deployed.

Web-based risk-minimisation tools can facilitate data collection for the evaluation of the aRMMs. They can be combined with simultaneous collection of data reflecting actual behaviour of tool users, providing timely and ongoing evaluation (Fig. 5) [29, 30]. Internet facilities in healthcare centres are widely available globally. For example, in the EU an estimated 80 % of hospitals have implemented electronic patient record systems, suggesting a potential to become paperless in the future [31]. This supports an increase in both web-based risk-minimisation tools and web-centric risk-minimisation evaluation programmes.

Fig. 5 Web-based behavioural risk-minimisation tools can combine education, communication and real-time evaluation. RM risk minimisation

Web-based tools can enhance HCP-to-patient communications (Pope Woodhead. User testing for new drug product X. 2013), for example, enabling HCPs to send automated reminder messages to patients. Furthermore, web-based tools can be used to confirm whether patient counselling was provided when the drug was prescribed, encourage ADR reporting and link tools to other post-launch data-gathering initiatives, including spontaneous reporting systems and registries.

Routine collection of anonymised data from risk-minimisation tool users, processed continuously in real time, permits rapid assessment of the effectiveness of the implementation and tools themselves, and provides a straightforward approach to periodic evaluation [32].

In the future, patient-reported outcomes of adverse events (PRO-AEs) may offer a useful and relevant approach for assessing the success of RMMs [33, 34]. However, a number of factors need to be considered when deploying web-based RMMs (for both HCPs and patients); for example, tool penetration and usability, how to achieve coverage where the internet is not available, and data protection issues (which may vary between countries).

Hence, a web-based risk-minimisation approach that incorporates evaluation requires careful design and supporting IT infrastructure over the lifetime of a product. Complexity can also arise in integrating data from multiple sources such as PROs, patient drug lists, health outcomes, provider education and assessment results. The numerous health records systems within the US and differences between EU member states provide further complexity.

5 Potential Impact on Regulatory Policies

From a regulatory policy perspective, aspects to consider include corrective actions, establishment of an agreed baseline, and potentially more combined/single RMPs for a product class to allow more impactful comparisons.

For effectiveness evaluation to be useful, an audit loop should be closed; i.e., corrective actions should be taken based on the collected results if necessary. Whilst appropriate actions may include modifying the aRMMs or individual risk-minimisation tools, sometimes the MAH in conjunction with regulatory authorities might consider making the programme less onerous (particularly if it is clear that adequate risk minimisation is successfully occurring without the need for tool intervention; e.g., despite low tool usage).

Effective processes in risk management are essential, so steps should be incorporated to allow elimination of ineffective tools or programmes. Measures impose a considerable burden on clinicians, support staff and patients, and are often expensive to implement. Effectiveness depends on the quality of the tools and overall programme and both should be scrutinised by establishing clear success criteria for each.

Table 2 outlines proposed actions to improve effectiveness of both risk-minimisation tools and the aRMMs overall for a product. The suggested action depends on the success of the deployment versus the effectiveness of risk minimisation as a whole. In this framework, deployment is defined by tool coverage and usage, measured by distribution/utilisation and tracking metrics (levels 1 and 2 of the 5-level model shown in Fig. 3); and effectiveness of the risk minimisation is measured by knowledge, attitudes and behaviours and the safety outcomes themselves (levels 3, 4 and 5 of the 5-level model in Fig. 3).
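The two axes of this framework can be sketched as a simple classification step that precedes the choice of action. The names below are hypothetical, and the rule that an axis succeeds only when every assessed level on it met its pre-defined criterion is an assumption made for illustration; the actions themselves are those proposed in Table 2 and are not encoded here.

```python
from dataclasses import dataclass

DEPLOYMENT_LEVELS = (1, 2)        # tool coverage; awareness and usage
EFFECTIVENESS_LEVELS = (3, 4, 5)  # knowledge; behaviour; safety outcomes


@dataclass(frozen=True)
class Quadrant:
    deployment_ok: bool
    effectiveness_ok: bool


def classify(level_met: dict[int, bool]) -> Quadrant:
    """Collapse per-level pass/fail judgements into the two axes.

    Assumption: an axis counts as successful only if every level
    assessed on that axis met its pre-defined success criterion.
    """
    def axis_ok(levels: tuple[int, ...]) -> bool:
        assessed = [level_met[lvl] for lvl in levels if lvl in level_met]
        if not assessed:
            raise ValueError(f"no assessed levels among {levels}")
        return all(assessed)

    return Quadrant(axis_ok(DEPLOYMENT_LEVELS), axis_ok(EFFECTIVENESS_LEVELS))


if __name__ == "__main__":
    # Hypothetical evaluation: low tool usage, yet knowledge, behaviour
    # and safety outcomes all met their criteria.
    print(classify({1: False, 2: False, 3: True, 4: True, 5: True}))
```

The example input corresponds to the situation in which adequate risk minimisation occurs despite low tool usage.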

Table 2 Possible actions following risk-minimisation effectiveness evaluation

Other issues that potentially impact regulatory policy include a move towards transparency of the results of effectiveness evaluation and corrective actions taken. This will eventually allow prospective benchmarking and, potentially, the setting of target standards of deployment expected from MAHs at the time of approval if possible. Greater harmonisation between the aRMMs, perhaps with class-specific risk minimisation, would allow greater efficiency and less confusion for prescribers and patients, promoting greater engagement of the end-user with voluntary aRMMs [40].

Regulators should ensure that the requirements for effectiveness evaluation of aRMMs with accepted methodologies are transparent.

6 Conclusions

Measuring the effectiveness of aRMMs and REMS is an important aspect of the benefit-risk evaluation of a drug. Limited detailed guidance is currently available on the evaluation of effectiveness of aRMMs, leading to a lack of consistency which is only partly addressed by the recent GVP guidance and the REMS guidance. Available models include assessment of effectiveness at different levels, with varying utility of information.

Specific challenges in evaluation include appropriate data collection, lack of comparators, uncertainty on the best outcome measures and lack of benchmarking or pre-defined aRMM objectives, which may be indicative of a weak underlying risk-minimisation strategy. The difficulty of collecting adequate and timely data may be impeding evaluation of effectiveness of aRMMs. The result is that the reaction time to safety issues is slow, and revisions to implemented RMPs to address potential safety issues are protracted.

Optimal risk-minimisation evaluation involves assessment of aRMM deployment and use, as well as the knowledge and behaviour of patients and HCPs, and the safety outcomes. Evaluation methods need to be tailored to specific safety concerns, the actual aRMMs deployed and the drug involved. Global regulatory policy must also embrace industry, patients and prescribers to pre-define comprehensive, but feasible, objectives and evaluation plans and iteratively close the audit loop in a timely way on any aRMMs.