What Constitutes Re-using Others’ Work (i.e., Plagiarism)?

For many who teach in the tertiary arena, such as colleges and universities, it is difficult to imagine a student who, at this point in time, is not familiar with the concept of plagiarism. Indeed, although evidence indicates that most students are able to define plagiarism (Barry 2006; Power 2009; Yeo 2007), other studies suggest a considerable amount of confusion and/or ignorance about plagiarism-related matters, such as the appropriate use of citations (McGowan and Lightbody 2008; Power 2009; Sutherland-Smith 2005a), quotations (Löfström and Kupila 2013), and proper paraphrasing (Hale 1987; Landau et al. 2002; Pecorari 2003; Roig 1997, 1999; Walker 2008). Study after study indicates that many students admit to plagiarizing. For example, the work of Donald McCabe and his colleagues, who have surveyed thousands of students, indicates that approximately 62 % of undergraduates and 59 % of graduate students admit to having plagiarized at least once (McCabe 2005). Moreover, instructional staff are not always in agreement about what forms of writing constitute plagiarism (Roig 2001; Sutherland-Smith 2005b). And judging by the many editorials (see Roig 2014) and articles that have appeared in the biomedical and social sciences literature (see Habibzadeh and Marcovitch 2011) and by the many articles that are retracted for plagiarism or self-plagiarism (Fang et al. 2012), far too many of those scientists and academics who should know better engage in plagiarism, as well as in many other forms of scholarly and scientific misconduct (Martinson et al. 2005). But, unlike outright fabrication and falsification, the ongoing situation with the misappropriation of others’ work should, perhaps, not be all that surprising, given the apparent lack of objective, quantifiable criteria for determining whether plagiarism has occurred. After all, there does not appear to be a widely accepted operational definition for what constitutes paraphrasing versus plagiarism, that is, how many consecutive words from another source an author may include in a phrase or sentence and/or how many copied phrases or sentences merit a plagiarism charge (Roig and deJacquant 2000). Thus, in spite of relatively detailed institutional policies on plagiarism (Pickard 2006; Sutherland-Smith 2011) and other existing guidance on this topic, instructors and journal editors may encounter many “borderline cases” involving plagiarism of text.
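To make the idea of a quantifiable criterion concrete, the following is a minimal sketch of how the "number of consecutive words" taken from a source could be counted. It is an illustration only: the function names, the tokenization rule, and the two sample sentences are assumptions introduced here, and no particular cutoff is implied, since the literature cited above offers no agreed threshold.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (an assumed, simple rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def longest_shared_run(source, candidate):
    """Return the longest run of consecutive words that appears in both texts."""
    a, b = tokenize(source), tokenize(candidate)
    best_len, best_end = 0, 0
    # prev[j] holds the length of the shared run ending at a[i-2] and b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical sentences, written for this illustration only.
source = "The provenance of evidence must always be made clear to readers."
candidate = "Authors agree that evidence must always be made clear to every audience."

run = longest_shared_run(source, candidate)
print(len(run), "consecutive shared words:", " ".join(run))
```

A count of this kind measures only verbatim overlap; it says nothing about paraphrase, attribution, or intent, which is part of why borderline cases persist.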

Plagiarism of other intellectual content (e.g., ideas, processes) presents additional challenges that make such transgressions much more difficult to operationalize. Moreover, the problem of intentionality (Sutherland-Smith 2011) and the seriousness of these infractions (Howard 1999) have been a source of concern for some instructors. Consequently, because of a certain degree of ambiguity inherent in how plagiarism is typically defined, some cases are likely to be classified by the “I know it when I see it” approach (see, e.g., Pecorari 2008, p. 38), a famous expression used in a US Supreme Court case (Jacobellis v. Ohio, 378 U.S. 184 (1964)) to explain the difficulty of determining whether material in a film should be considered obscene.

Self-Plagiarism

An analogous situation with respect to the quantifiable criteria occurs in some instances of self-plagiarism (Peh and Arokiasamy 2008), a somewhat controversial term used to describe situations in which authors re-use their previously disseminated work and pass it off as new. Even the term, self-plagiarism, has been the subject of recent criticism in the sciences (e.g., Andreescu 2013), with some observers pointing out that it is impossible to steal from oneself (Bird 2002; Callahan 2014). In spite of such criticisms, Bruton (2014) notes that “the term self-plagiarism has become too widespread for it to be replaced by different terminology anytime soon …” (p. 77).

Relative to its more famous cousin, self-plagiarism is often said to lie in a gray area (e.g., Bird 2002; Jacobs 2011), and it is generally not considered to be research misconduct according to the United States Public Health Service’s (PHS) Office of Research Integrity (ORI). In this regard, Dahlberg (2007) has noted that “ORI often receives allegations of plagiarism that involve efforts by scientists to publish the same data in more than one journal article. Assuming that the duplicated figures represent the same experiment and are labeled the same in both cases (if not, possible falsification of data makes the allegation significantly more serious), this so-called ‘self-plagiarism’ does not meet the PHS research misconduct standard” (p. 4). In the academic context, self-plagiarism is generally considered a form of cheating, and many tertiary institutions caution students against this dishonest practice on their university websites (Bretag and Mahmud 2009). However, other institutions do not mention this specific form of misconduct (Salhaney and Roig 2004), and the concept is unclear to some instructors (Hallupa and Bolliger 2013).

Unfortunately, the same questions about a lack of an operational definition of plagiarism apply equally to self-plagiarism. Moreover, when this type of transgression is covered in academic dishonesty policies, the policies tend simply to forbid re-using, in a new course, papers that have already been submitted to another course for credit (Bretag and Mahmud 2009). Thus, questions about what constitutes acceptable forms of re-use are seldom addressed in these policies. For example, students may understand that they cannot re-use a previously submitted paper, but what if they re-use three quarters of a paper, or half of a paper, or a quarter of a paper? These questions notwithstanding, awareness of self-plagiarism as a problematic practice does not seem to be as prevalent as that for plagiarism. In addition, given that many students seem to believe that plagiarism itself is not a serious transgression (Park 2003), the question remains as to the proportion of students, and even instructors (see Hallupa and Bolliger 2013), who view self-plagiarism as a form of cheating. With respect to instructors, and assuming that a large portion of contributors to science and scholarship are also university instructors, some evidence suggests that a significant proportion of them might not consider the practice unethical. For example, Price et al. (2001) presented various ethically questionable research scenarios to health educators and found that 64 % of the sample considered self-plagiarism behaviors acceptable. Certainly, the views of editors and authors can also differ substantially with respect to what constitutes appropriate re-use (Yank and Barnes 2003).

Self-Plagiarism in Science and Scholarship

Several basic forms of self-plagiarism have been identified in scholarly periodicals, and these are briefly summarized below. It should be noted that a common feature of all of these malpractices is that: (1) there is substantial recycled material (text and/or data) in the new paper from the previously published paper and (2) the reader is never informed about the nature or extent of the re-use. In some cases, citations to the earlier published work are, in fact, provided in the new publication, but this is sometimes done in such an ambiguous manner that the reader is unable to determine the extent of and/or true nature of the re-use, let alone whether re-use has taken place. All such cases in which readers are not informed, or are misled, about the re-use should perhaps be termed “covert” (covert duplicate publication, covert salami publication; see, for example, Tramer et al. 1997; Roig 2006). von Elm et al. (2004) and, more recently, Bruton (2014) provide a more extensive treatment of the various forms of this type of “double-dipping” in journal articles. A brief review of common forms of self-plagiarism in the sciences follows.

Duplicate publication. A common form of self-plagiarism, and one that appears to be on the rise since the mid-1990s (Larivière and Gingras 2010), occurs when an author submits a previously published paper to a different journal. There are many ways in which this type of duplication occurs, and these can range from publishing an identical copy of the earlier published version to one that contains some minor changes. The end result is that the “new” paper may appear to be different on the surface, but it is likely to contain substantial amounts of recycled text and, especially, old data that are presented as new. Tramer et al. (1997) have demonstrated the danger of this type of misconduct when duplicated data are interpreted as new data in a meta-analytic study. Yet, it is likely that some meta-analytic studies are contaminated by duplicates. For example, Choi et al. (2014) have reported that 69 % of the meta-analyses carried out by Korean biomedical researchers included duplicate publications. As in the Tramer et al. (1997) study, these authors also showed how, in two instances, the inclusion of the duplicates had led to higher effect sizes than would have occurred without them. It bears repeating that presenting old data as new data is tantamount to data fabrication, a major type of research misconduct, because the “new” data are data that, in reality, do not exist and, therefore, end up skewing the scientific record.
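The way a covert duplicate can inflate a meta-analytic result can be sketched with a small worked example. The code below assumes a simple fixed-effect, inverse-variance pooling model and uses entirely hypothetical effect sizes and variances; it is not a reconstruction of the analyses in the studies cited above.

```python
import math


def pooled_effect(studies):
    """Fixed-effect inverse-variance pooled estimate from (effect, variance) pairs."""
    weights = [1.0 / variance for _, variance in studies]
    weighted_sum = sum(w * effect for w, (effect, _) in zip(weights, studies))
    return weighted_sum / sum(weights)


def pooled_standard_error(studies):
    """Standard error of the fixed-effect pooled estimate."""
    return math.sqrt(1.0 / sum(1.0 / variance for _, variance in studies))


# Hypothetical data: three independent studies as (effect size, variance).
unique_studies = [(0.20, 0.04), (0.10, 0.04), (0.60, 0.04)]

# The same set with the largest study counted twice, as would happen if a
# covert duplicate publication were mistaken for an independent study.
with_duplicate = unique_studies + [(0.60, 0.04)]

for label, data in [("unique", unique_studies), ("with duplicate", with_duplicate)]:
    print(label, round(pooled_effect(data), 3), round(pooled_standard_error(data), 3))
```

With these assumed numbers the pooled estimate rises from 0.30 to 0.375 while its standard error shrinks, so the duplicated result looks both larger and more precise than the independent evidence supports.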

Another way in which duplicate publication may manifest itself is through translations of previously published works. For example, a paper that was first published in a low-circulation journal in one language may be later translated and then published in a journal of greater circulation or vice versa. An argument in support of this type of duplication is that such duplicates in a different language serve a greater purpose when others who cannot read in the language of the original paper can benefit from the wider dissemination of the research. Few would disagree with such a noble purpose, and, in fact, some journal editors (e.g., Dickens et al. 2011) will accept such manuscripts provided that the authors disclose the prior publication. Obviously, this approach is only meaningful and appropriate when the authors acknowledge the prior published version to readers, as per long-established criteria for republishing already-published journal articles. Thus, according to the guidelines published by the International Committee of Medical Journal Editors (ICMJE 2014), authors may submit for publication a previously published paper if the editors of both journals give their approval. The secondary publication must aim at a different audience; it must “faithfully” reflect the data and interpretations of the primary publication and respect the primary status of the prior publication. There must also be full disclosure to readers and all other relevant parties, such as documenting agencies, about the previous publication including its full citation. Finally, the title of the secondary publication must indicate that it is a secondary publication (e.g., a translation) of the original. Although these guidelines serve the biomedical research community, they should be equally applicable to other scientific and scholarly disciplines in which the scientific status of a claim rests on the number of independent observations made in its support. The overriding concern here is that the provenance of evidence must always be made clear to readers.

Augmented publication. A particularly problematic type of self-plagiarism occurs when a set of data is published once and is then republished with additional observations (see Smolčić and Bilić-Zulle 2013; also known as data aggregation, Kim et al. 2014). For example, consider the following fictitious scenario: Three surgeons decide to describe the effectiveness of a new surgical procedure with the results of twenty successful cases. Subsequently, two other surgeons who adopt the new technique contribute additional cases to the original database, and the combined data are analyzed and presented in a new paper with a modified title, a few additional authors, a larger set of cases, but no mention of the earlier publication (i.e., no cross-referencing). In some cases in which the previous publication is cited, it is done in an ambiguous manner such that readers are misled into believing that the new data set is independent from the old one. As with the traditional duplicate publications, the new publication is likely to have significant portions of verbatim text from the earlier published version. However, the more fundamental problem with cases of data augmentation is that old data are mixed with new data, and the combined data are presented as new, thus likely contributing to the skewing of the scientific record. An example of this type of self-plagiarism is briefly described by Bonnell et al. (2012) (see also level 4 of duplicate publication in Davidhizar and Giger 2002).

Salami publication. Generating two or more published papers from the same study is generally known as “salami slicing” (Hoit 2007; Huth 1986; Nature Materials 2005), but terms such as “data disaggregation” (Houston and Moher 1996) and “least publishable unit” (Broad 1981) have also been used. As an example, consider a fictional large-scale retrospective study on health gains and health-care cost outcomes in a sample of type II diabetes patients who are examined according to their dietary and exercise activity. The results of the study are published in a diabetes journal. Sometime after publication, the authors (again, and for a variety of reasons, new authors may be added and old authors dropped) decide to reanalyze the data by including other demographic variables that were not examined in the previous study and excluding a very small number of subjects from their sample, such as those who are underweight; they publish the results in an obesity journal with only ambiguous cross-referencing or no cross-referencing between the papers (see Houston and Moher (1996) for a detailed description of one case). Instances such as the one depicted in the above scenario are likely to mislead readers into believing that the later study provides new data that are independent from the data reported in the previously published paper.

In other versions of salami publication, there may not be any recycled data. That is, prior to any publication, the authors may decide to segment the data set into separate discrete units in order to maximize the number of publications produced from the larger, original data collection effort. For example, they may decide to publish the results of outcome costs in one journal and the results of the health gains data in another journal (see Martin 2013; Smolčić 2013 for additional examples). Although both papers will obviously share some text similarities in terms of sample descriptions and perhaps some other methodological characteristics, much of the rest of each paper could conceivably be very different from the other. It is perhaps for these reasons that at least one author has questioned the inclusion of salami publication as a type of self-plagiarism (Bruton 2014).

Admittedly, some instances of salami publication are entirely justifiable. For example, certain types of complex longitudinal studies will yield data about outcomes at various points during the course of the study, and such data need to be published, including later studies from additional follow-up analyses. A similar situation may occur with multicenter clinical trials in which it may be meaningful to report the results from a single center (see Houston and Moher 1996 case). Some cases of fragmented publication (Smolčić and Bilić-Zulle 2013) are the exact opposite of augmented publication in that rather than adding more data to the original data set, some of the data from the original set are excluded, and this may be done for a variety of legitimate reasons. But, again, the key issue is the lack of transparency regarding the provenance of data in terms of how these studies relate to each other. Thus, authors must always disclose relevant details regarding the provenance of the data and any related publications.

Text recycling. By far the most common form of self-plagiarism in science and scholarship occurs when authors re-use substantial portions of their own previously disseminated text in new publications. Evidence indicates that some academics recycle relatively minor portions of text (Bretag and Carapiet 2007; Roig 2005). However, other evidence suggests that, in some instances, the amount of re-use can be considerably greater than 50 % or 60 % (see, e.g., Neligan et al. 2010). Before reviewing this relatively common malpractice or “misdemeanor” so termed by Zigmond and Fischer (2002), it may be useful to discuss an approach to writing papers that would drastically reduce most instances of plagiarism and self-plagiarism.

Reader-Writer Contract

The reader-writer contract is an approach to reading and writing that has its origins in the humanities (Tierney and LaZansky 1980). This approach holds that readers of academic work operate under three basic assumptions about the material being read. The first assumption concerns the creation and ownership of the work, which conveys to readers that the material presented is the exclusive creation of the listed authors. In instances in which others’ ideas are being conveyed, the authors indicate others’ ownership of that material using standard scholarly conventions, such as citations, footnotes, or other literary mechanisms. In addition, the reader-writer contract stipulates that any facts, figures, and ideas are accurately represented by the authors to the best of their ability. Finally, readers are assumed to approach these works with the understanding that the material is new and that in instances where such is not the case, readers are, again, informed about prior disseminations using established scholarly conventions (e.g., citations or footnotes). For example, the author of a work that has earlier been published in another language informs the reader of this fact on the front cover, in the title, or elsewhere in a prominent manner or as per ICMJE conventions. A new edition of an older textbook is identified as a newer version of the previous edition by either the phrase “revised edition” or the edition number. In both of these latter cases, there is, or should be, a clear understanding on the part of the reader that a substantial amount of material has been recycled from the earlier version. With this context in mind, the problem of self-plagiarism is explored further.

The first two elements of the contract, originality and accuracy, are consistent with basic standards of ethical scholarship found in traditional writing guides for research papers, theses, and dissertations. These elements are also covered in many scholarly and scientific journals’ instructions to authors and in related guidance issued by professional organizations (e.g., ICMJE). The third element, which compels authors to be transparent with their readers regarding any prior dissemination of their work, is central to the problem of self-plagiarism. Various aspects of self-plagiarism are also addressed in the sources outlined above. However, the topic is often discussed in the narrower context of duplicate publication and/or duplicate submission of manuscripts and of copyright violation. Moreover, when the topic of potential duplication arises, the cautionary advice to authors is to inform the editor about any potential overlap so that she/he may decide whether a manuscript is sufficiently original to be published. In instances where the degree of overlap is acceptable to the editor and the paper is published, it is sometimes unclear whether readers are fully informed about any duplication.

Several authors (Bruton 2014; von Elm et al. 2004) have described the various forms of this transgression as outlined above, but mainly within the biomedical and, to some extent, the social sciences fields and almost always within the domain of academic and/or scientific journals. However, recent retractions in other disciplines (e.g., Bo et al. 2014; Leonard 2015; Saurin et al. 2014; Statement of retraction 2015a, b, c) suggest that many of the key issues related to self-plagiarism are equally applicable to other scientific disciplines as well as to other domains, such as theses, conference presentations, and books.

Beyond Recycling in Journal Articles: Some Considerations of Re-use in Other Scholarly Activities

Books

From old edition to new edition. As noted above, textbooks and similar works that are republished as revised editions of earlier works will contain significant amounts of recycled material; that is, the reader may not be directly informed that significant portions of textual material from an earlier edition will appear largely unchanged in the new edition. However, this is never considered an instance of self-plagiarism. For example, most university textbooks are revised a number of times over their lifetime, and each subsequent edition will likely include many portions of verbatim text of varying length without any modifications. The absence of changes may simply represent well-written content from the previous edition that continues to be relevant at the time of the revision. There may even be situations in which textbooks are republished in a subsequent edition two or three years later with only very minor revisions, as might be the case in certain disciplines, such as mathematics or statistics, in which content does not change as rapidly as it might in other subject areas, such as biology, chemistry, and psychology. While the ethics of such faster-rate publishing tactics may be debated, these types of situations are not labeled as self-plagiarism as there is, or should be, a general understanding on the part of the readership that repetition of verbatim text from one edition to the other is a given. Thus, in these cases, it is not necessary to alert the reader that recycling of earlier material has taken place. A similar situation occurs with concise/abridged versions of full-length textbooks. The concise version may even contain new writing, graphics, etc., and may even be titled somewhat differently. But the general assumption is that the work is, essentially, the same as the full-length version, though with fewer details and/or narrower coverage.

Re-using portions of chapters or entire chapters from one book to another. There are other situations where the re-use is less clear and may confuse readers. For example, the author of a general psychology textbook who later writes a textbook on child development for the same publisher may decide to recycle large portions or entire sections of some of the chapters from the general psychology textbook (e.g., conditioning, perceptual development) in the new textbook. Alternatively, if different publishers are involved, the author may obtain permission to re-use the content. The question arises, however, as to whether there is an expectation of novelty on the part of readers regarding the content of the second book relative to the first book. For this reason, readers should be informed as to the extent of the re-use.

From journal article to book. One can envision instances in which re-use from one source to another may be problematic, such as when authors are asked to write a review paper or a book chapter in their area of specialization. In these situations there may be a very strong inclination to re-use, without informing the reader, portions of literature reviews and discussion sections that have already been published by the same author in other journal articles, edited books, or monographs (see Martinson et al. 2011 for an example). However, in addition to potential copyright issues, a reader who has already acquired the earlier works may be expecting a fresher, more up-to-date perspective from the author. From a purely pedagogical perspective, if the primary purpose of academic work is to educate others, it would be more effective to convey the information in a different manner, rather than to merely repeat the same message verbatim.

Conference Presentations

Same paper presented at multiple conferences. In some disciplines, questions have been raised about the appropriateness of presenting the same or roughly the same paper at different conferences (Sigelman 2008). Certainly, issues regarding the provenance of data and the need for transparency with the audience may be similarly applicable in these situations. For example, as with many journals, some conference sponsors insist on original presentations that are exclusive to that conference, while other organizations do not have such requirements. Moreover, there are various types of presentations, such as invited addresses, conference submissions, and presentation formats that may determine the appropriateness of recycling previously disseminated material. Although a thorough exploration of recycling across conference domains is beyond the scope of the present work, authors should consider the principles of the reader-writer contract in guiding their conference presentation practices and alert their audiences about any material being recycled.

From conference presentation to journal article. In most disciplines, papers that are presented at conferences are subsequently submitted for publication to peer-reviewed journals. In some disciplines, such as psychology and education, it is common for the published papers to include an author note indicating any previous presentation of the paper. However, other disciplines, particularly within the biomedical sciences, may not follow this practice, and doing so may depend on the individual journal’s policy as detailed in the journal’s instructions to authors.

Although the publishing of expanded versions of presented papers has a long-standing tradition that should continue to be strongly encouraged, some issues can arise when authors fail to indicate a paper’s prior dissemination history. For example, in the past, conference proceedings were only available in print and were usually distributed mainly to association members or conference registrants. However, the advent of the Internet has made many conference proceedings widely available. If the title and authorship of a conference proceeding are different from those of the subsequently published paper, confusion can arise for those who might interpret each product as an independent contribution. Complicating the situation is the fact that conference proceedings come in many forms, ranging from compilations of paper titles with authors to compilations of full versions of presented papers. The latter situation can lead to confusion if changes are made to the paper in the published product (e.g., a change in the language and/or authorship, or the addition or deletion of only a few data points, which will most certainly change all of the data tables and perhaps even figures, such as line graphs). Thus, the question arises as to whether a reader would be able to recognize these two products as being the same. In addition, the full paper as proceedings presents additional challenges for authors and editors in some disciplines because many journals are reluctant to publish papers that are largely based on work that is already fully available online, whether in conference proceedings or from some of the fully searchable online repositories or preprint servers. Even more problematic are instances in which an association journal will publish the proceedings of its conference as full papers. Under some conditions, such instances represent primary publications according to long-established guidelines, and republication of a paper elsewhere, even if the paper is an expanded version of a conference proceeding, may be viewed as an instance of duplicate publication, not to mention the potential for copyright violation (see Vasconcelos and Roig (in press) for an example of this situation). In sum, in the absence of clear guidelines, authors can avert any confusion by being mindful of the reader-writer contract and ensuring full transparency with the editor and, especially, with their readers.

Doctoral Dissertations and Theses

From dissertation or thesis to publications. There is a tradition in many disciplines for authors to repackage portions of their dissertations such as dissertation chapters or empirical studies into one or more publications, such as journal articles or books. Doing so is perfectly acceptable and, in fact, some journals’ instructions to authors specifically accept this practice. Many authors include a note in the published work to indicate that it is a derivation of, or it is based on, their thesis or dissertation work. However, it appears that this clarification is not always made by authors in some disciplines, though from the perspective of the reader-writer contract it should always be made. An area of concern with respect to publishing portions of theses/dissertations in more than one journal article is whether it makes scientific or logical sense to break the thesis/dissertation work apart (see section on “Salami Publication”). Thus, to maintain transparency with readers and to avoid potentially misleading them about the context of the research, authors should be required, when appropriate, to indicate in the subsequent journal articles, books, or book chapters the existence of related publications that were also derived from the same thesis/dissertation.

From publications to doctoral dissertations. It is quite possible for some students who are completing doctoral work to have already published in the same area of research and, consequently, to wonder about re-using the content of such publications. There are two important issues to consider in this scenario. Copyright issues aside, and assuming there are no department or institutional guidelines against the practice, re-using content from the student’s earlier publications is entirely acceptable provided that there is full transparency between the student and dissertation committee members and, of course, full transparency with the readers. It should be noted that at some institutions a doctoral dissertation consists of an assemblage of journal articles published by the doctoral candidate as part of his/her dissertation work with perhaps the addition of a more comprehensive introduction and discussion of the entire corpus of work. In instances where the latter is not the standard procedure and assuming that the academic department accepts other forms of re-use of already-published material, there is a possible complication in situations where the published material was co-authored with other individuals. If the dissertation committee members are able to establish that the student’s contributions to the published work are sufficiently substantive and they accept this type of re-use, then permission from the co-authors must be requested to avoid issues of plagiarism. Obviously, such a request should be made at the earliest possible stage of the dissertation process.

Why Should Authors Be Concerned About Re-using Their own Previously Disseminated Work?

The apparent rise in student plagiarism in recent years has also given rise to technology that facilitates its detection (Royce 2003; see Scaife 2007 for a review). Thus, services like Turnitin® (http://www.turnitin.com/), which retain in their database a copy of every document that is submitted for analysis, should give students and others pause before they consider re-using, in part or in whole, an earlier submitted paper to satisfy the requirements of a new course. At the professional, academic level, the increasing digitization and wider availability of scholarly and scientific print material means that a point will soon be reached at which all academic written work will be easily identified, retrieved, stored, and processed in ways that are inconceivable at the present time. Actually, evidence suggests that students may already be sensitized to this possibility. For example, requiring them to submit their academic work electronically results in an increase in their awareness of various forms of plagiarism and may deter some of these behaviors (Mazer and Hunt 2012). Consider the fact that many academic journals use some type of plagiarism-detection software, such as Crosscheck®, to screen submitted manuscripts being considered for publication (see http://www.crossref.org/01company/06publishers.html). Editors using this technology have become alarmed at the large number of submissions containing plagiarized content (Baždarić et al. 2012; Bazdaric 2012; Shafer 2011) and likely self-plagiarized material as well. In addition, it is possible that other tools, such as eTBLAST and its resulting database, Déjà vu (Errami et al. 2008), which has already led to various retractions in the biomedical literature (Errami et al. 2008), will become established tools for screening scientific journal articles and perhaps material in other, nonpublication domains, such as grant proposals.

Summary

In view of the increasing attention being given to the topic of self-plagiarism and of the recent developments in software technology designed to detect text re-use, students and professionals may need to reconsider previous practices with respect to publication. Doing so will be difficult for some, particularly for those who fail to see self-plagiarism as a questionable practice and for those who may have limited language/writing skills and have relied heavily on the practice of recycling their previously written content. At a time when calls for transparency in science are at an all-time high, keeping in mind the reader-writer contract throughout all stages of their scholarly activity may lead authors to adopt writing and other research practices that are more sensitive to the principles of responsible scientific and scholarly conduct. In turn, it is possible that these same attitudes may extend to other areas of personal and professional academic behavior.