Introduction

Cases of data fraud in clinical trials, defined as the fabrication or falsification of data, are uncovered on a regular basis [1, 2]. Some prominent recent examples are summarized in Table 1.

Table 1 Some prominent cases of data fraud in clinical trials

These cases include Roger Poisson, who falsified eligibility data for patients entered on multi-center breast cancer trials sponsored by the National Surgical Adjuvant Breast and Bowel Project (NSABP) [3, 4]; Werner Bezwoda, who reported strikingly positive findings from a single-institution trial of high-dose chemotherapy with stem cell rescue in patients with high-risk breast cancer, but whose underlying data could not be verified in an independent audit [5, 6]; Robert Fiddes, who was a lead investigator on a large number of clinical trials sponsored by pharmaceutical companies but was discovered to have committed a wide range of fraud and misconduct in these trials over many years [7, 8]; Harry Snyder and Renee Peugeot, a husband-and-wife team who falsified data on a clinical trial of a topical agent for the treatment of psoriasis and cutaneous T-cell lymphoma [9, 10]; Yoshitaka Fujii, an anesthesiologist who fabricated data in a large number of clinical trials of agents used to control postoperative nausea and vomiting in humans and animals [11–16]; Anil Potti, who developed predictive models for therapeutic agents in cancer that were used in subsequent clinical trials, although the details underlying the development of those models could not be independently validated [17]; and Hiroaki Matsubara, who resigned his university position in the wake of allegations of data fabrication and falsification in clinical trials of valsartan [18, 19].

Since clinical trials are a special type of research study, such cases are part of the general problem of research misconduct, with the added risk of potentially serious consequences for patients treated on trials or treated based on the results of those trials. Here, I discuss the definitions of misconduct, ranging from the narrow definition of ‘fabrication, falsification or plagiarism’ to wider definitions which include other questionable research practices; evaluate the available evidence on the prevalence or incidence of misconduct; and discuss potential contributing or causal factors leading to misconduct and the implications for preventive measures.

Definitions of research misconduct and data fraud

A single universally accepted definition of research misconduct does not exist among the various professional societies, scientific journals, government agencies and regulatory bodies concerned with the issue. However, fabrication, falsification and plagiarism are so egregious that all definitions implicitly or explicitly include these practices. The US Public Health Service (PHS) defines research misconduct as limited specifically to these practices [20]:

“Research misconduct means fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results.

  (a) Fabrication is making up data or results and recording or reporting them.

  (b) Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.

  (c) Plagiarism is the appropriation of another person’s ideas, processes, results, or words without giving appropriate credit.

  (d) Research misconduct does not include honest error or differences of opinion.”

The National Institutes of Health [21], National Science Foundation [22], American Psychological Association [23] and others use identical or nearly identical definitions. This is a very narrow definition, covering only the most serious unethical behaviors. For clinical investigators, the US Food and Drug Administration uses a much broader definition of investigator misconduct, targeting practices that might create a safety risk for patients, including:

“Failure to report serious or life-threatening adverse events; serious protocol violations, such as enrolling subjects who do not meet the entrance criteria because they have conditions that put them at increased risk from the investigational drug, or failing to carry out critical safety evaluations; repeated or deliberate failure to obtain adequate informed consent, including falsification of consent forms or repeated or deliberate failure to disclose serious risks of the investigational drug in the informed consent process; falsification of study safety data; failure to obtain IRB review and approval for significant protocol changes; failure to adequately supervise the clinical trial such that human subjects are or would be exposed to an unreasonable and significant risk of illness or injury” [24].

Other organizations take a broader perspective, including the Council of Science Editors, which, in a white paper on integrity in scientific publications [25], defined research misconduct as

“Behaviour by a researcher, intentional or not, that falls short of good ethical and scientific standard.”

Similarly, Universities UK defines research misconduct to include “behaviour or actions that fall short of the standards of ethics, research and scholarship required to ensure that the integrity of research is upheld” [26]. These definitions are both broader and vaguer than the PHS definition, leaving open what constitutes a ‘good ethical and scientific standard’ or ‘research and scholarship standards’, and what exactly counts as falling short of those standards.

There has been much discussion in the literature of questionable research practices other than fabrication, falsification or plagiarism that may nevertheless produce unreliable results and other serious problems [27–34]. Some of these practices relevant to clinical trials are listed in Table 2.

Table 2 Questionable research practices in clinical trials other than fabrication, falsification and plagiarism

In Table 2, these questionable practices are grouped into several categories: design and analysis (e.g., use of improper design or analysis techniques, misrepresentation of the methodology used, or selective reporting); publication and authorship (e.g., failure to publish or gift authorship); patient safety (e.g., failure to follow protocol safety requirements or to obtain proper informed consent); and other practices (e.g., misuse of funds, conflicts of interest, or refusal to share data).

One view of data fraud in clinical trials is that it represents the extreme end of a spectrum of sources of data errors, ranging from inevitable honest errors at one end to data fraud at the other, with misunderstandings, incompetence and sloppiness in between. This spectrum is illustrated graphically in Fig. 1, where a clear dividing line, defined by intent, separates data fraud from the other sources of error. Other sources of data errors are regrettable, but data fraud involves a deliberate intent to deceive, an ‘intent to cheat’, which makes it a qualitatively different source of data errors.

Fig. 1
figure 1

Spectrum of sources of data errors in clinical trials

It is arguable that, in aggregate, more damage is caused by the less serious forms of questionable research practices and by sloppiness or incompetence than by data fraud, largely because these other sources of data errors are far more common.

Prevalence

There are fundamental difficulties in trying to estimate the prevalence of research misconduct in science in general and clinical trials in particular. First, there are definitional problems. Does ‘misconduct’ include only fabrication, falsification and plagiarism as in the PHS definition or should it include some of the other types of questionable research practices listed in Table 2?

Second, there are difficulties in the assessment of prevalence when applied to research misconduct. In epidemiology, prevalence is defined as the proportion of people in a defined population with a given condition at a specific time (point prevalence), or that have (or had) the condition during a specified time period (period prevalence), or that have ever had the condition at any time (lifetime prevalence). For assessing the prevalence of research misconduct there needs to be clarity in the type of prevalence being assessed as well as a clear statement of the population being studied, the defined population. Is it everyone engaged in research, or just primary investigators, or some other defined group of individuals? Even if the population can be defined in principle, how are the numbers of people in the population estimated? And how can we take a reasonable random sample, or some other reasonably representative sample, from the population in order to construct an estimate of prevalence?
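To make the estimation problem concrete: once a population, a time frame and a sampling scheme are actually fixed, a prevalence estimate reduces to a simple proportion with a sampling-based confidence interval. A minimal sketch, with entirely hypothetical numbers (12 cases among 400 sampled investigators):

```python
def point_prevalence(cases, population):
    """Proportion of the defined population with the condition."""
    return cases / population

def wald_ci(cases, population, z=1.96):
    """Normal-approximation (Wald) 95% CI for a prevalence estimate.
    A rough sketch; exact intervals would be preferred for rare events."""
    p = cases / population
    se = (p * (1 - p) / population) ** 0.5
    return p - z * se, p + z * se

# Hypothetical survey: 12 self-reported cases among 400 investigators.
p = point_prevalence(12, 400)
lo, hi = wald_ci(12, 400)
print(f"prevalence = {p:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

The sketch makes the text's point tangible: the arithmetic is trivial, but every quantity in it (the numerator, the denominator, and the representativeness of the sample) is exactly what is in doubt for research misconduct.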

Lastly, there is an ascertainment problem. Accurate responses to questions about misconduct may be difficult to obtain. This is a well-known problem when attempting to elicit responses from individuals for questions about behavior that is embarrassing, illegal or that is otherwise liable to result in evasive answers.

Because of these difficulties, the true prevalence of research misconduct in general, or in clinical trials in particular, is unknown, perhaps even unknowable. Nevertheless, there have been many attempts to address the issue. These efforts may be classified into studies providing indirect estimates of prevalence by assessing detected cases [35–37] and surveys providing direct evidence by asking some supposedly representative sample of subjects about knowledge of misconduct by others [28, 38–42] or about the respondent’s own behavior [29, 40, 43, 44]. The detected cases are obviously fewer than the actual cases and thus yield an unreliable, downwardly biased estimate of prevalence. Such indirect evidence has resulted in speculations ranging from the ‘tip of the iceberg’ metaphor, often favored by science journalists, at one extreme to the conclusion that fraud is extremely rare at the other (‘99.9999 % of all reports are accurate and truthful’ [45]).

The evidence from sample surveys has the advantage of producing a direct estimate of prevalence, despite the caveats noted earlier. However, these surveys differ greatly in study designs, sample sizes, questions asked, and other features, resulting in inconsistent outcomes. In addition, as noted above, the surveys ask questions about topics for which respondents might be expected not to be truthful. Although there is a 50-year history of using randomized response designs in this setting to minimize this problem [46, 47], only one survey of misconduct actually used this type of design to address the issue [48].
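The randomized response idea can be sketched as follows. In Warner's original design, each respondent privately answers either the sensitive question (with probability p) or its negation, so no individual 'yes' is incriminating, yet the population rate remains estimable from the overall 'yes' rate. A simulation under entirely hypothetical parameters:

```python
import random

def simulate_warner(true_rate, p, n, rng):
    """Simulate Warner's randomized response design: each respondent
    privately answers the sensitive question with probability p, and
    its negation otherwise, so no single 'yes' reveals true status.
    Returns the observed proportion of 'yes' answers."""
    yes = 0
    for _ in range(n):
        has_trait = rng.random() < true_rate   # respondent's true status
        direct = rng.random() < p              # which question was answered
        yes += has_trait if direct else not has_trait
    return yes / n

def warner_estimate(yes_rate, p):
    """Unbiased estimator of the sensitive-trait rate (needs p != 0.5):
    P(yes) = p*pi + (1-p)*(1-pi), solved for pi."""
    return (yes_rate - (1 - p)) / (2 * p - 1)

# Hypothetical: 2% true misconduct rate, coin bias p = 0.7, 200k respondents.
rng = random.Random(42)
lam = simulate_warner(true_rate=0.02, p=0.7, n=200_000, rng=rng)
print(warner_estimate(lam, p=0.7))  # recovers a value near the true rate
```

The design choice is the point: truthful answering becomes safe because the interviewer cannot tell which question any individual answered, at the cost of a larger variance in the estimate.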

In order to make some sense of the published survey results, Fanelli [49] conducted a meta-analysis of 21 surveys published in 1987–2008, restricting attention to surveys asking direct questions about the misconduct of the researchers themselves or of their colleagues. Only studies addressing fabrication, falsification or other questionable research practices that could produce biased or misleading results were included in the analysis (plagiarism, for example, was not). In addition, only studies that clearly separated fabrication/falsification from other questionable practices were included.

Figure 2 gives a Forest plot of the results from the meta-analysis of self-reported fabrication or falsification. Figure 3 gives similar results for personal knowledge of fabrication or falsification by others (i.e., of the respondent’s colleagues).

Fig. 2
figure 2

Forest plot of admission rates of data fabrication, falsification and alteration in self reports. Area of squares represents sample size, horizontal lines are 95 % confidence interval, diamond and vertical dotted line show the pooled weighted estimate. This figure is a reproduction without modification of Fig. 2 in Fanelli [49]

Fig. 3
figure 3

Forest plot of admission rates of data fabrication, falsification and alteration in non-self reports. Area of squares represents sample size, horizontal lines are 95 % confidence interval, diamond and vertical dotted line show the pooled weighted estimate. This figure is a reproduction without modification of Fig. 3 in Fanelli [49]

The pooled weighted estimate of the self-reported admission rate, rounded to one decimal, was 2.0 % (95 % CI 0.9–4.5) and the pooled weighted estimate of reported fabrication or falsification by others was 14.1 % (95 % CI 9.9–19.7). Significant heterogeneity among the surveys was observed. The self-reported admission rates are likely biased downward for the reasons noted above.
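As an illustration of how a pooled weighted estimate of this kind can be constructed (a generic fixed-effect inverse-variance sketch on the logit scale, not a reproduction of Fanelli's random-effects model, and with hypothetical survey counts):

```python
import math

def pool_logit(studies):
    """Fixed-effect inverse-variance pooling of proportions on the
    logit scale. `studies` is a list of (events, sample_size) pairs;
    returns the back-transformed pooled proportion and a 95% CI."""
    num = den = 0.0
    for events, n in studies:
        # 0.5 continuity correction guards against zero cells
        e, f = events + 0.5, n - events + 0.5
        logit = math.log(e / f)
        var = 1 / e + 1 / f          # approximate variance of the logit
        w = 1 / var                  # inverse-variance weight
        num += w * logit
        den += w
    mean = num / den
    se = math.sqrt(1 / den)
    inv = lambda x: 1 / (1 + math.exp(-x))   # inverse logit
    return inv(mean), inv(mean - 1.96 * se), inv(mean + 1.96 * se)

# Three hypothetical surveys: (admissions, respondents)
est, lo, hi = pool_logit([(4, 250), (2, 180), (7, 400)])
print(f"pooled = {est:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

The logit transform keeps the pooled proportion and its interval inside (0, 1); a random-effects version, as used by Fanelli, would additionally widen the weights to absorb the between-survey heterogeneity noted above.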

Other questionable research practices are likely to be much more prevalent than fabrication, falsification or plagiarism. For example, in a large survey of US scientists funded by the National Institutes of Health, Martinson et al. [29] reported that 15.5 % of the respondents admitted to ‘changing the design, methodology or results of a study in response to pressure from a funding source’ and 33 % admitted to at least one of the ‘top 10’ questionable practices.

Causal factors and prevention

“Why does research misconduct happen? The answer that researchers love is ‘pressure to publish’, but my preferred answer is ‘Why wouldn’t it happen?’ All human activity is associated with misconduct. Indeed, misconduct may be easier for scientists because the system operates on trust. Plus scientists may have been victims of their own rhetoric: they have fooled themselves that science is a wholly objective enterprise unsullied by the usual human subjectivity and imperfections. It is not. It is a human activity.”

R. Smith [50]

Unfortunately, in common with estimates of prevalence, reliable data concerning the possible causes of misconduct do not exist [30]; we are left largely with expert opinion and speculation. The lack of data is problematic for the formulation of effective prevention strategies. However, in most cases of research misconduct it is reasonable to assume that the perpetrator's motivation lies at least partly in the potential for personal gain. In some cases there may be financial advantages, either direct personal financial gain or indirect financial gain through research funding. In other cases, the pursuit of promotion, tenure or scientific prestige may be the primary motivating factor. Finally, there is always the possibility of some type of psychiatric condition or illness behind the misconduct. In any specific case, even when the perpetrator has admitted to the misconduct and offered some explanation for it, the testimony itself may be unreliable and the true motivating factors may remain unclear. Generalizations from individual cases are unreliable at best.

Despite the lack of reliable empirical evidence, there is a considerable literature addressing the contributing factors in misconduct, organized around three broad narratives: individual traits, institutional issues, and structural problems in science itself [28, 51, 52].

Individual traits include characteristics of the individual researcher that may lead to misconduct, including the inability to handle the ‘publish or perish’ and other competitive pressures, personal ambition, the desire for personal recognition or the wish for direct or indirect financial gain [53]. Some ascribe the presence of research misconduct or fraud primarily to ‘bad apples’ since, as in all human endeavor, there are individuals who violate established norms of behavior. Some of these individuals may have self-delusional, even self-destructive, tendencies.

Institutional issues include the ‘publish or perish’ pressures inherent in the promotion and tenure requirements, inadequacies of training and mentoring, lack of detailed oversight of research, competition for federal support and other issues [54, 55]. Structural issues in the way modern science is conducted may also contribute to the problem [51].

In discussing the causes of research misconduct and their implication for potentially effective prevention methods, information from the broader context of other illegal, immoral, inappropriate or unethical behavior in society may be useful. Adams and Pimple [56] address this directly. Reduction of criminal/deviant behavior has proven resistant to strategies based on rational decision analysis and to setting appropriate norms and values, however laudable these may be. This approach can be called the ‘individualist’ approach, based on the ‘bad actor’ or ‘bad apple’ assumption. A new approach from recent theories in criminology, ‘opportunity theory’, starts with the assumption that the population of potential offenders is essentially everyone (see Ariely [57] for support of this assumption). If this is so, then we need to create physical settings or situations that reduce the opportunity for misconduct and encourage appropriate behavior, especially via effective supervision and internal controls. As Adams and Pimple state: “It is sometimes far easier and more effective to control or change situations than it is to control or change individuals” [56].

In the case of clinical trials, especially multi-center clinical trials, institutional issues and structural issues in science in general are likely to be less important than individual factors. In addition, with a few exceptions such as the Fiddes case and the Snyder-Peugeot case noted in the introduction, direct financial gain does not appear to be a major motivating factor. One intriguing suggestion is that physician-scientists may simply be less rigorous than other scientists in their approach to clinical trials:

“It is our sense, primarily experiential and impressionistic in nature, that honesty in research work as a fundamental rule is valued more strongly among scientists than among physicians… Physicians tend to evaluate research in terms of harm or benefit to patients rather than in terms of adherence to the rigorous norms of scientific investigation” [58].

Years after this speculation was published, some support for this view was inadvertently supplied by Poisson in his explanation of why he falsified eligibility data on NSABP trials:

“I believed I understood the reasons behind the study rules, and I felt that the rules were meant to be understood as guidelines and not necessarily followed blindly. My sole concern at all times was the health of my patients. I firmly believed that a patient who was able to enter into an NSABP trial received the best therapy and follow-up treatment… Maintaining the proper balance between good clinical care and rigid research methods is not an easy task” [59].

In addition to the usual suggestions for preventing misconduct implied by consideration of the various factors involved (training in the ethics of research, improved mentoring, increased supervision, etc.), none of which have been proven to be effective, statistical procedures may also play a role. In particular, central statistical monitoring, an effective tool for detecting data fraud in clinical trials as part of a general data quality assurance program, may also function as a deterrent to committing such fraud in the first place for trials in which such monitoring is known to be in place [1, 60, 61]. Such procedures should be applied more commonly in multi-center clinical trials.

Summary

Despite the large and growing literature on the prevalence, causes and prevention of research misconduct in science in general and in clinical trials in particular, reliable empirical evidence to support the discussion remains in short supply. This situation exists partly because of difficulties of definition and partly because of the difficulties in designing and conducting studies in this area. However, the available evidence taken as a whole suggests that the most serious forms of misconduct, fabrication and falsification, are relatively rare, albeit perhaps more common than most scientists assume, whereas other questionable research practices are quite common. In addition, most discussions of the causal factors in misconduct are not based on reliable empirical evidence, so prevention measures based on assumptions about those factors are also liable to be misguided. More rigorous studies of the prevalence, causal factors and potential prevention strategies for research misconduct are needed.