Introduction

The issue of fraud in clinical trials comes under the general heading of research misconduct, which is defined as “fabrication, falsification or plagiarism in proposing, performing or reviewing research or in reporting research results” [1]. There are increasing reports of research misconduct in clinical trials, and the practice has even been the subject of popular fiction [2].

This paper proposes offensive and defensive strategies to deal with fraud in clinical trials. Offensive strategies are steps that might be taken to salvage a trial once fraud is detected, while defensive strategies entail the use of clinical trial designs that might minimize the effect of fraud should it occur. The types of trials considered here are pharmaceutical industry oncology phase III (pivotal) clinical trials. It is assumed that these strategies will be taken by the sponsor and are independent of any regulatory or legal actions that might also ensue.

Fraud is a deliberate action that results in irreproducible results of a scientific experiment or clinical trial. Irreproducible results can also be a consequence of carelessness or incompetence on the part of investigators or sponsor. Some of the strategies discussed here will be appropriate for this case as well.

Discussion of strategies against fraud is a timely topic for several reasons. First, the U.S. Food and Drug Administration (FDA) recently issued a guidance approving centralized statistical monitoring of clinical trials [3]. The use of statistical algorithms to find unusual patterns of data [4] coming from certain investigators will now replace the need for continual site monitoring by pharmaceutical company auditors. While this initiative is designed primarily to improve data quality, it will certainly uncover evidence of fraud as well. The availability of this information may create an overzealousness to find and declare fraud and, thus, many false positives.

The second reason for timeliness can be illustrated by referring to the “index” case of fraud in oncology clinical trials. In 1991, Dr. Roger Poisson of Saint-Luc Hospital in Montreal falsified eligibility requirements for women enrolling in National Surgical Adjuvant Breast and Bowel Project (NSABP) trial B-06 [5]. Dr. Poisson, an inexperienced cooperative oncology group investigator, admitted to falsifying the eligibility requirements under the justification that the deviations in eligibility in the fraudulently entered patients had “no oncologic relevance.” Today, many new investigators are enrolling patients on pharmaceutical industry oncology clinical trials, as sponsors cast their net to include many new geographic regions in clinical trials and offer incentives for enrollment. This situation can create an increase in similar fraud.

The U.S. Institute of Medicine recently commissioned a discussion paper on “The Clinical Trial Enterprise” [6]. One of the suggestions that is gaining traction is that electronic records should be used to enroll patients in clinical trials. Under this scheme, a patient will be seen by his/her own physician and, after the diagnosis and other patient characteristics have been entered, the computer screen will display all clinical trials for which this patient is eligible. If the patient consents, this physician will be an investigator in the trial. Clearly, the Enterprise proposal will bring considerable efficiency and pragmatism to trials, but also many inexperienced investigators who may not understand the discipline required by clinical trials. This could result in an increase in fraud, especially in the falsification of eligibility requirements.

The use of companion diagnostics for biomarker determination is increasingly present in oncology clinical trials. Some investigators are already entering into trials patients whose eligibility cannot be supported by diagnostic results. The justification given by the investigator is that the diagnostic result is likely a false negative.

There are several possibilities for fraud by the sponsor that need not be listed here. However, a recent regulation by the FDA requiring sponsors to submit SUSARs (Suspected Unexpected Serious Adverse Reactions) as part of the continual safety reporting process [7] deserves mention. The SUSAR list is generated by the sponsor, not the investigators, and clearly there is a broad gray area where sponsors could underreport events merely by saying that the events were not suspected and/or not unexpected. Indeed, risk-based centralized statistical monitoring would not detect this kind of fraud because of its subjective nature.

Offensive strategies

The offensive strategies are those that can be applied to salvage a trial once fraud is detected. Table 1 presents a list of the various types of “plans” that are created by the sponsor prior to the start of a clinical trial. These plans, such as the statistical analysis plan or the data management plan, must be pre-specified to remove the suspicion of bias in data handling and analysis methods when trial results are publicized. In the same spirit, this paper will make the case for a pre-specified fraud recovery plan (FRP).

Table 1 Clinical trial planning documents

The FRP would specify steps to be taken in the event that evidence of fraud is discovered. Like the other plans enumerated in Table 1, the FRP would have to be approved by the regulatory agencies prior to the start of the trial.

We now describe the use of the FRP for the five types of fraud listed in Table 2.

Table 2 Some types of fraud encountered in clinical trials

Falsification of eligibility criteria

As mentioned above, Dr. Poisson had falsified patient eligibility criteria for NSABP B-06. Possible options for an FRP would be to eliminate all patients from that site from analysis, eliminate only the ineligible patients from that site, or ignore the infraction by including all patients from the site. The last approach would be justified under the intent-to-treat methodology. In addition, slight deviations from eligibility are not a problem because patients treated in clinical trials are not generally representative of those seen in practice anyway. Eliminating all patients from the site will remove the site from continual suspicion, but there will be a loss of power in statistical hypothesis testing. Sponsors should avoid running every possible analysis and sending all of them to the regulatory agency without a pre-specified priority of analyses.

Underreporting of adverse events

The FRP might specify that once underreporting of adverse events is found or suspected, an independent auditing group should go to the site and restore the adverse events. If this cannot be done, the Independent Data Monitoring Committee (IDMC) for the trial should indicate how important the omissions are in the context of the trial. If there is underreporting of adverse events on an active control arm (already approved drug), this would probably not be considered serious, because it is the experimental treatment that is under scrutiny and the active control’s safety profile is probably well known. If the active control were, say, lapatinib, investigators might underreport moderate to severe diarrhea common to lapatinib [8, 9] if this adverse event is not expected on the experimental arm. This underreporting is unfortunate but not intentional and not serious. However, deliberate underreporting of adverse events on either arm comes under the category of fraud. If the IDMC considers the underreporting a serious omission and the data cannot be recovered, then the site must be dropped from the trial.

Fictional patients

If it is discovered that a site created fictional patients to meet enrollment quotas, all patients from that site should be dropped. Regulators would always be suspicious of the integrity of the data from the remaining patients if they were left in.

Fabrication of patient diaries

Patient diaries are often used to collect adverse event experiences during clinical trials. Typically, the diaries are handed in to the clinic at each visit. It is well known that, rather than complete the diaries on a daily basis, many patients fill them out just prior to a clinic visit, recalling adverse events and approximate dates from memory. This is not a case of fraud if the data are approximately correct. However, if clinic personnel fabricate diaries that are supposed to be authored by patients, then it is a case of fraud and, when detected, should result in the site being dropped from the trial. Of course, the sponsor could drop from analysis only those patients whose diaries are known to have been fabricated, but there will always be suspicion about the integrity of the diaries of other patients and about how the sponsor decided which diaries were fabricated.

Propagation of serial data

Vital signs and laboratory data are collected over time. Clinic sites might fabricate data that were not collected by just propagating the same observation forward to future clinic visits. This is similar to a missing data technique known as “last observation carried forward” [10]. However, in the latter case the data being imputed come from the last legitimately observed vital sign/laboratory value, and the clinic site declares them missing. There is no deception in this case. The solution for the deliberate propagation of serial data is to eliminate the site from analysis.
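As an illustration of how centralized monitoring might surface this pattern, the following minimal Python sketch flags patients whose serial readings contain a long run of identical consecutive values. The function names, the threshold of four identical readings, and the blood pressure values are assumptions made for illustration only. A flag of this kind is a prompt for an audit, not evidence of fraud.

```python
from itertools import groupby


def longest_identical_run(values):
    """Length of the longest run of consecutive identical readings."""
    return max((len(list(group)) for _, group in groupby(values)), default=0)


def flag_propagated_series(readings_by_patient, min_run=4):
    """Return IDs of patients with at least min_run identical readings in a row.

    min_run is a hypothetical threshold; a real plan would tune it to the
    measurement and its precision.
    """
    return sorted(
        patient_id
        for patient_id, values in readings_by_patient.items()
        if longest_identical_run(values) >= min_run
    )


# Hypothetical systolic blood pressure readings (mmHg) at successive visits.
readings = {
    "P001": [128, 131, 126, 134, 129, 127],
    "P002": [122, 122, 122, 122, 122, 122],
    "P003": [140, 138, 138, 135, 141, 139],
}
flagged = flag_propagated_series(readings)
print(flagged)
```

In this invented data set only the second patient is flagged; the third patient has two equal consecutive readings, which falls below the assumed threshold.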

Issues in creating the fraud recovery plan for the offensive strategy

There are several issues to be addressed in the creation and/or administration of the FRP.

Timing of fraud detection

It is important to understand that if fraud is detected during the clinical trial then the FRP can be followed. If it is found after the completion of the trial when the sponsor is unmasked, little can be done. If the fraud is detected after the product is marketed then it becomes strictly a legal issue.

Role of intent-to-treat

It is well known that the primary efficacy and safety analysis for a pivotal oncology trial uses the intent-to-treat population, i.e., all patients randomized. Occasionally, sponsors find a patient who did not receive a single dose of the study drug. In that case, a modified intent-to-treat is followed, with patients of this type dropped from analysis. The question arises of how intent-to-treat can be followed in the face of fraud. Do we include all patients and all data in the analysis, or do we drop the fraudulent patients as if they did not receive a dose? Clearly, carried to the extreme, if all patients had all of their data fabricated, an intent-to-treat analysis would make no sense. Thus, the FRP will have to define the modified intent-to-treat population to be used, e.g., all patients randomized except those from a site that committed fraud.
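The population definition described above can be written down unambiguously in advance. The sketch below is one hypothetical way to express a modified intent-to-treat rule in Python; the field names (site, doses_received) and the option to also exclude never-dosed patients are assumptions for illustration, not a prescribed format.

```python
def modified_itt(randomized, fraudulent_sites, exclude_never_dosed=True):
    """Select the modified intent-to-treat population.

    randomized: list of dicts, one per randomized patient (hypothetical fields).
    fraudulent_sites: set of site IDs at which fraud was confirmed.
    """
    population = []
    for patient in randomized:
        if patient["site"] in fraudulent_sites:
            continue  # drop every patient from a site that committed fraud
        if exclude_never_dosed and patient["doses_received"] == 0:
            continue  # conventional modified intent-to-treat exclusion
        population.append(patient)
    return population


# Hypothetical randomized patients.
randomized = [
    {"id": "P001", "site": "S01", "doses_received": 6},
    {"id": "P002", "site": "S02", "doses_received": 4},
    {"id": "P003", "site": "S01", "doses_received": 0},
    {"id": "P004", "site": "S03", "doses_received": 5},
]
analysis_set = modified_itt(randomized, fraudulent_sites={"S02"})
print([p["id"] for p in analysis_set])
```

Writing the rule as a function of the list of confirmed fraudulent sites makes the point that the rule itself, rather than the resulting patient list, is what would be pre-specified.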

Eliminate all patients from the clinic site or just those with fraudulent data?

If a clinic site contributed 100 out of, say, 900 patients on a trial, and only four had fraudulent data, sponsors would want to make a case for eliminating only those four “guilty” patients. To lose the entire site would result in a considerable loss of statistical power in the final analysis. The FRP might specify only dropping the guilty patients, but regulators would likely maintain a low credibility for all of the data at the site and adjudicate it to the satisfaction of all involved. Sponsors could offer to impute the data using missing data methodology [11]; however, while it is difficult enough to justify “missing completely at random” or “missing at random” in the absence of fraud, it would be very difficult to do so in the face of fraudulent data.
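To give a rough sense of the power at stake in this example, the sketch below applies the Schoenfeld normal approximation for a two-sided log-rank test with 1:1 allocation. The hazard ratio of 0.80, the assumption that 70 % of patients have an event by the final analysis, and the significance level of 0.05 are illustrative values that do not come from any particular trial.

```python
from math import log, sqrt
from statistics import NormalDist


def logrank_power(n_patients, hazard_ratio=0.80, event_fraction=0.70, alpha=0.05):
    """Approximate power of a two-sided log-rank test with 1:1 allocation.

    Uses the Schoenfeld approximation; all default values are assumptions.
    """
    events = n_patients * event_fraction
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(log(hazard_ratio)) * sqrt(events) / 2
    return NormalDist().cdf(z_effect - z_alpha)


power_all = logrank_power(900)        # all patients retained
power_drop_four = logrank_power(896)  # only the four "guilty" patients dropped
power_drop_site = logrank_power(800)  # the entire 100-patient site dropped

print(round(power_all, 2), round(power_drop_four, 2), round(power_drop_site, 2))
```

Under these assumed values, power falls from about 0.80 with all 900 patients to about 0.75 when the whole site is dropped, while dropping only four patients changes it very little. This is the trade-off the sponsor would be weighing against the loss of credibility.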

Punishment imputations

We can think of some regulator-imposed or sponsor-imposed “punishment” imputations. In a survival analysis, suppose a patient died at 14 months post treatment start, but the data were reported as alive at 14 months. If the previous observation of that patient was correctly reported as alive at 7 months, perhaps a punishment imputation would be to retain the patient but record it as a death at 7 months.
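The survival example above can be sketched as follows. The record layout and field names are hypothetical; the rule shown is simply the one described in the text, in which a fraudulently reported patient is retained but recorded as a death at the last correctly reported time.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SurvivalRecord:
    patient_id: str
    time_months: float                 # reported follow-up time
    died: bool                         # reported event indicator
    last_verified_alive_months: float  # last correctly reported observation
    fraudulent: bool                   # fraud confirmed for this record


def punishment_impute(record):
    """Retain the patient but record a death at the last verified time."""
    if not record.fraudulent:
        return record
    return replace(
        record,
        time_months=record.last_verified_alive_months,
        died=True,
    )


# The patient from the text: reported alive at 14 months, verified alive at 7.
reported = SurvivalRecord("P017", 14.0, False, 7.0, True)
imputed = punishment_impute(reported)
print(imputed.time_months, imputed.died)
```

Records without confirmed fraud pass through unchanged, so the rule can be applied to the whole data set.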

Another punishment imputation for fraudulent efficacy data would be to retain all patients but to impute the guilty experimental group patients with the worst results from the control group, and the guilty control group patients with the best results from the experimental group. This should result in shrinkage of effect size and possibly a loss of statistical significance.
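A minimal sketch of this worst-case and best-case imputation is given below. It assumes a continuous efficacy outcome for which higher values are better, and it takes the worst and best values from the patients not implicated in fraud; both choices, along with the data, are illustrative assumptions.

```python
from statistics import mean


def innocent_outcomes(patients, arm):
    """Outcomes of patients on the given arm who are not implicated in fraud."""
    return [p["outcome"] for p in patients if p["arm"] == arm and not p["guilty"]]


def punishment_impute_efficacy(patients):
    """Impute guilty patients against their own arm (higher outcome = better)."""
    worst_control = min(innocent_outcomes(patients, "control"))
    best_experimental = max(innocent_outcomes(patients, "experimental"))
    imputed = []
    for p in patients:
        if p["guilty"]:
            value = worst_control if p["arm"] == "experimental" else best_experimental
            p = {**p, "outcome": value}  # copy, so the input is left unchanged
        imputed.append(p)
    return imputed


def effect_size(patients):
    """Difference in mean outcome, experimental minus control."""
    def arm_mean(arm):
        return mean(p["outcome"] for p in patients if p["arm"] == arm)
    return arm_mean("experimental") - arm_mean("control")


# Hypothetical data: one guilty patient on each arm.
patients = [
    {"arm": "experimental", "outcome": 10, "guilty": False},
    {"arm": "experimental", "outcome": 12, "guilty": False},
    {"arm": "experimental", "outcome": 20, "guilty": True},
    {"arm": "control", "outcome": 8, "guilty": False},
    {"arm": "control", "outcome": 9, "guilty": False},
    {"arm": "control", "outcome": 3, "guilty": True},
]
imputed = punishment_impute_efficacy(patients)
print(effect_size(patients), effect_size(imputed))
```

In this invented example the estimated treatment difference shrinks substantially after imputation, which is the intended penalty.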

A common approach might be to analyze the efficacy data both ways: with the fraudulent observations included, and with them dropped without imputation. The less favorable result would prevail.

Considering these alternatives, eliminating all patients from the fraudulent clinic site might be the most expedient but it could possibly result in a loss of power. If the fraud is discovered early enough, there would, presumably, be time to enroll more sites to make up for the potential loss of power.

Fraud and planned interim analysis

An FRP should specify what happens if fraud affects key data in an interim analysis. One course of action should be specified for fraud detected before the interim analysis takes place, and a separate, definite plan should be in place for fraud detected after it.

Fraud in the covariates

Up to now we have assumed that the fraud occurred with outcome data. Covariates could also be affected. Indeed, there could be fraud in stratification variables or model (Cox model or logistic regression model) adjustment variables. In this case, the FRP might specify conditioning on the “innocent” observations and the performance of a sensitivity analysis by sampling from a family of distributions for the “guilty” observations. This would allow for the sponsor and regulators to consider a range of attained significance levels and effect sizes for the primary efficacy variable. Alternatively, the guilty observations could be replaced by multiple imputation and, with repetition of the procedure, produce a suitable range of outcomes for inspection.

Randomization implications

As we consider eliminating patients or clinic sites due to fraud, we must also consider the effect of this removal on randomization. Randomization is certainly compromised if we eliminate from the analysis those patients with fraudulent data. If we eliminate an entire site, randomization is compromised if trial-wide minimization methods are used [12].
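The dependence of minimization on other patients can be seen in a small sketch. The code below is a simplified, deterministic variant in the style of Pocock and Simon minimization, using a single hypothetical factor (disease stage) and breaking ties in favor of the first arm; real implementations usually add a random element and several factors. It is meant only to show that a new patient's allocation depends on the patients already enrolled, including those at a site that is later removed.

```python
def minimization_assign(new_patient, enrolled, factors, arms=("A", "B")):
    """Assign the arm that minimizes total imbalance across the given factors.

    Simplified deterministic sketch: imbalance is the range of arm counts
    among enrolled patients who share the new patient's factor level.
    """
    def imbalance_if(arm):
        total = 0
        for factor in factors:
            counts = {
                a: sum(
                    1 for p in enrolled
                    if p["arm"] == a and p[factor] == new_patient[factor]
                )
                for a in arms
            }
            counts[arm] += 1  # hypothetically place the new patient on this arm
            total += max(counts.values()) - min(counts.values())
        return total

    return min(arms, key=imbalance_if)  # ties go to the first arm listed


# Hypothetical enrolled patients from two sites.
enrolled = [
    {"site": "S1", "arm": "A", "stage": "II"},
    {"site": "S2", "arm": "B", "stage": "III"},
]
new_patient = {"stage": "II"}

with_site = minimization_assign(new_patient, enrolled, ["stage"])
without_site = minimization_assign(
    new_patient, [p for p in enrolled if p["site"] != "S1"], ["stage"]
)
print(with_site, without_site)
```

In this toy case the new patient is allocated to a different arm depending on whether the site S1 patient is counted, so removing a site after the fact leaves allocations that were driven by data no longer in the analysis.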

Administration of the fraud recovery plan

The FRP should indicate how it is to be administered. One plausible approach would be for the IDMC to review centralized statistical monitoring reports and note possible instances of fraud. When a potential fraud is found, an audit group not paid for by the sponsor, perhaps regulatory authorities, would investigate. If fraud were indeed found, the sponsor would then invoke the FRP. Care would have to be taken to prevent accidental unmasking of the sponsor in the fraud detection and recovery process. Finally, the professional journals might want to insist that, when fraud is detected, it is reported in a final publication of the trial; when it is not detected, perhaps a statement such as “centralized statistical monitoring was employed in this trial and no evidence of fraud was detected” should be included.

Defensive strategies

The defensive strategies are those design characteristics that sponsors could follow to minimize the effect of fraud. A list of defensive strategies is shown in Table 3.

Table 3 Defensive strategies against fraud

Oversample and add extra clinics

As we have seen in the previous section, a likely offensive strategy is to eliminate patients, or more likely entire sites, involved with fraud. There will be a resulting loss of power if this happens. A defensive strategy then would be to enroll a larger sample than is needed and enroll a few extra clinics in the trial.

Specify a maximum number of patients per investigator

Closely related to oversampling would be to specify a maximum number of patients per investigator, e.g., no clinic site can contribute more than 5% of the total number of patients on the trial. With this specification, if a site is eliminated, the effect on the power is less than if there were no restriction and one site with fraud enrolled, say, 20% of all the patients on the trial.
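The arithmetic linking the enrollment cap to oversampling can be sketched as follows. The function names and the required sample size of 900 are hypothetical; the calculation simply asks how many patients must be enrolled so that the required number remain after losing a given number of sites that each enrolled up to the cap.

```python
def min_sites(cap_percent):
    """Smallest number of sites that can fill the trial under the cap."""
    return -(-100 // cap_percent)  # ceiling division on integers


def inflated_enrollment(required_n, cap_percent, sites_lost=1):
    """Enrollment needed so that required_n patients remain after losing
    sites_lost sites that each enrolled the maximum allowed share."""
    retained_percent = 100 - cap_percent * sites_lost
    if retained_percent <= 0:
        raise ValueError("the lost sites would account for the whole trial")
    return -(-required_n * 100 // retained_percent)  # ceiling division


# Hypothetical trial requiring 900 analyzable patients.
print(min_sites(5), inflated_enrollment(900, 5))
print(min_sites(20), inflated_enrollment(900, 20))
```

With a 5% cap the trial needs at least 20 sites and about 948 enrolled patients to withstand the loss of one full site, whereas a site allowed to reach 20% of enrollment would require 1125 patients for the same protection.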

Use co-primary endpoints

The protocol should specify not just a single primary endpoint but co-primary endpoints when vulnerable endpoints such as diary data are used. As an example, in a supportive care trial for chemotherapy-induced infection, culture data could be specified in addition to a patient diary. If fraud is found in the diary data, the culture data can be substituted. Statistical significance levels would be adjusted to allow for the additional endpoint.

Solicited adverse event data collection

To guard against diary fraud in adverse event data collection, sponsors might specify solicited adverse event data collection for the most important anticipated adverse events. Under solicited adverse event collection, a clinic worker will question the patient at each visit on whether or not certain adverse events occurred [13].

Use of covariates in primary efficacy analysis

To prevent fraud in covariates interfering with the primary efficacy analysis, the statistical analysis plan should specify that the primary analysis is unstratified and without covariate adjustment.

Randomization

Avoid using trial-wide minimization methods. As pointed out above, the integrity of this randomization scheme is highly vulnerable to fraudulent covariate data.

Use of technology

Many instances of fraud come down to re-copying data from source documents to forms or a computer screen. Use technology that takes data from the patient and inserts them directly into the clinical trial database.

Summary and conclusions

It is clear that the time has come for pre-specified actions to take in the face of fraud (offensive strategy, FRP), and defensive strategies to prevent or minimize the effect of fraud.

Taking all factors into account, the best suggestion to sponsors is to use defensive strategies. If fraud occurs, drop all patients from the clinic site. Punish the guilty investigators.

This paper suggests that the time has come for industry standards and best practices to be defined and followed for dealing with fraud.