Introduction

The “collaborative imperative” is emerging as the norm in most fields of research (Bozeman and Boardman 2014). The increasing number of international collaborations allows for the formation of global research networks and a greater sharing of research resources, including materials, equipment, data, knowledge, specialization and diverse cultural perspectives (Wagner 2005; Witze 2016). Multidisciplinary, interdisciplinary, cross-disciplinary, and transdisciplinary collaborations are yielding innovative breakthroughs (Larivière et al. 2015; Petersen et al. 2018). And compared to single-authored publications, works of collaborative science are published in higher impact journals and result in significantly more citations and patents (Wuchty et al. 2007).

The rise of collaborative research has also resulted in a marked increase in the number of authors named in article bylines (Wuchty et al. 2007). Authorship is generally attributed to individuals who have contributed significantly to the research and remain accountable for their work (Council of Science Editors (CSE) 2012). Unethical authorship practices, such as honorary authorship (naming someone who has not contributed to research) or conversely, ghost authorship (not naming someone who has contributed to the research), have been highlighted in the literature as problematic (Teixeira da Silva and Dobránszki 2015; Wislar et al. 2011). And while such misuse of authorship itself is acknowledged as misbehavior, it has been deemed a “normal misbehavior” that is common and even mundane (De Vries et al. 2006). Unlike falsification or fabrication, naming a few more authors on the byline does not have a material impact on the integrity of the content of research. Indeed, if one overlooks the broader ramifications for science, authorship disagreements may be dismissed as peripheral incidents of self-aggrandizement that have little impact on the advancement of science.

In the larger system of science, authorship has become a proxy for productivity, and is an important determinant in decisions regarding funding, professional advancement, salary, and recognition (e.g., prizes and awards). For researchers, authorship provides opportunities for further research recognition and credibility in a competitive research environment. Given the value of authorship—referred to as “symbolic capital” (Bourdieu 2004) or the “coin of the realm” (Babor et al. 2017)—its fair distribution is important not only to duly recognize one’s work, but also to confirm the integrity and justness of the team itself, and of the research system more broadly. In workplace settings, perceptions of (in)justice have been linked to outcomes such as organizational commitment, job performance, and satisfaction (Viswesvaran and Ones 2002). Studies on perceptions of justice in science show that when researchers believe they are treated unfairly, they are more likely to misbehave, which in turn, may compromise the integrity of research (Martinson et al. 2006). Viewed in this context, authorship cannot be treated as a “marginal” issue, but instead should be seen as a matter of justice in the research system.

Notions of justice, fairness, and collegiality are implicit in decision-making regarding authorship naming and ordering. Authorship naming is about choosing who should be included as an author; and by extension, who should be excluded. Definitions of authorship often include naming individuals who have “contributed substantially” and consequently remain responsible for the work while receiving due credit. However, the task of attributing authorship can be challenging, especially when one considers the heterogeneity of research contributions—intellectual, conceptual, technical, methodological, supervisory—and the evolution of individual contributions depending on unforeseen circumstances of research (e.g., reiterative methodology, failed experiments, unforeseen results), existing interpersonal issues (e.g., inability to communicate or work together), and external pressures (e.g., institutional support or financing). Authorship decisions can be further complicated by the diverse disciplinary or field-specific norms and practices regarding authorship inclusion; for example, research published in high energy physics and genetics may include hundreds or even thousands of individuals—often referred to as hyperauthorship (Cronin 2001)—a practice that is unimaginable in the humanities or social sciences where one or only a few authors is the norm. Although research institutions, funders, journals and publishing committees have attempted to provide authorship definitions to help in naming and ordering, these definitions are not uniform, they are difficult to enforce, and they are often unknown or simply ignored (Bosch 2012; Smith and Williams-Jones 2012; Teixeira da Silva and Dobránszki 2015, 2016).

After considering who to name as an author, researchers must also decide the order in which collaborators will be named on the byline. In many research fields (notably, in the fundamental and applied sciences), ordering reflects the amount or value of the contribution made by respective team members. Ordering often requires comparing and contrasting the value of different individual contributions, giving more or less importance to various criteria such as the type of contribution (technical, intellectual) or the role in the project (supervisor, student). The practices of ordering vary considerably. For example, in almost all fields of research, the first author is the individual who has made the most significant contribution of the team. However, there is a growing trend where some authors state explicitly—often in a footnote—that they have contributed equally to the research (Akhabue and Lautenbach 2010). In the health sciences or lab sciences, the supervisor or senior author is often named last. Finally, large research teams may resort to alphabetical or partial alphabetical ordering if ranking is too difficult or burdensome (Mongeon et al. 2017; Waltman 2012).

Empirical studies reveal a significant number of problems with authorship attribution that can have potentially serious negative consequences for science. According to a meta-analysis of cross-sectional survey studies on authorship (Marušić et al. 2011), researchers reported or observed “misuse and/or problems” regarding authorship at a pooled weighted rate of 29% across 14 survey studies; heterogeneity was significant, with rates ranging from 1.5 to 71%. Such heterogeneity may be explained by the attribution of “misuse and/or problems” to a broad array of issues in different countries and populations, including: honorary co-authorship in radiology (O’Brien et al. 2009), gift authorship in medical schools (Bhopal et al. 1997), problems regarding ordering and inclusion in nursing (White et al. 1998), co-authors that did little or no work in business colleges (Manton and English 2006), bioethics students’ conflicts regarding authorship in Bangladesh (Ahmed et al. 2010), and post-doctoral fellows’ views on inappropriate authorship of supervisors in physics (Tarnow 1999). Other empirical studies have noted a substantial rate of disagreement regarding authorship. In a 2014 study of faculty, researchers and PhD students (n = 654) working in health science and medicine in Norway, 58% reported having been involved in authorship disagreements (Nylenna et al. 2014). In 2013, a study of Nigerian researchers (n = 133) found that 36.4% were involved in authorship disagreements (Okonta and Rossouw 2013); the same study also considered the perceived occurrence of disagreement, which according to participant researchers happens: never (16.7%), seldom (46.2%), occasionally (29.5%), or frequently (7.6%).

It is recognized as essential in any scientific endeavor to have continuous, open and honest conversations even if they lead to debate and collegial disagreement. Authorship decision-making is no exception and often includes important topics linked to integrity such as scientific responsibility, accountability, merit, and leadership. However, when authorship disagreements become hostile, disrespectful and anti-social they may contribute to problems with research integrity. There have been few large-scale studies of why such disagreements happen and what type of subsequent misbehavior may ensue. The present study aims to better understand (1) the rates and predictors of authorship disagreements, (2) the factors that lead to disagreements, and (3) the prevalence of resulting misbehavior. This study contributes to understanding conditions and dynamics of authorship disagreements in various fields of research, in order to identify more appropriate methods to promote responsible conduct of research in authorship.

Methods

An international survey was conducted to evaluate the association between various predictors (gender, field of research, rank) that influence authorship disagreements and misbehavior. The survey tool was designed using relevant information from a literature review of ethical principles and procedures regarding distribution of authorship (Babor et al. 2017; Clement 2014; Smith 2017; Smith and Master 2017), as well as the results of preliminary semi-directed interviews with researchers in Canada and the US (n = 40) (Master et al. forthcoming). The survey tool covers a broad number of issues regarding authorship including, but not limited to, authorship naming and ordering disagreements, reasons for authorship issues, and misbehavior related to authorship disagreements. Although literature on ethical issues regarding authorship is more prevalent in the health sciences (Marušić et al. 2011), this survey was not limited to one field or discipline given the growing context of multidisciplinary research. In order to select representative questions, ensure clarity and also seek diverse disciplinary inputs, the survey tool was circulated among collaborators (authors) of this paper to obtain expert viewpoints from bioethics, library and information science, and science and technology studies. This multidisciplinary approach was intended to create a survey tool that was relevant and applicable to researchers from different disciplines and fields of research.

To ensure the questions were intelligible and sequenced in an intuitive manner, the survey was pre-tested with a sample of 100 participants from various fields of research randomly selected from the larger sample of participants; 33 responded to the survey. More than 90% of questions were considered clear; more than 80% of respondents found the survey complete and intuitive; and 100% declared that the survey was well displayed on their web browser. The individuals who did not find the survey complete and intuitive mentioned that certain questions did not apply to their field of research. To further adapt the survey to capture the input of more research fields, the response choices “other” and/or “does not apply” were added to multiple-choice questions. We added the definitions of “authorship naming” and “authorship ordering” at strategic places throughout the survey to increase respondents’ understanding of and familiarity with these terms. Lastly, a qualitative question was added at the end of the survey to allow respondents who so desired to share their experiences with authorship in a narrative format.

Using the Web of Science (WoS), 103,297 researchers who had published in collaborative teams of two or more co-authors between 2011 and 2015 were identified as prospective survey participants. The sample was specifically designed to include researchers working in collaborative teams in all disciplines and fields of research as well as researchers with multidisciplinary profiles. It was stratified based on disciplinary diversity, hereafter referred to as “multidisciplinarity”. The level of multidisciplinarity was determined by (1) the diversity of department affiliations of authors and (2) the disparities between the specialty of the journal in which they publish and the specialties they cite. For example, a paper published in a public health journal that cited papers published in cancer, surgery, or nursing was considered more multidisciplinary than a paper from the same journal that only cited other public health papers. While consideration of disciplinary practices was important, this study aimed to explore the significant growth of multidisciplinarity in teams and the dynamic that allows for diverse and sometimes conflicting disciplinary norms to co-exist in one research team. The level of multidisciplinarity is a continuous variable ranging from a researcher working within a single discipline to one with a higher level of disciplinary diversity. High rates of multidisciplinarity (i.e., the coexistence of many disciplines) may well point to teams that are interdisciplinary in nature and that combine knowledge from various fields in their study. However, this study did not examine how various types of knowledge were juxtaposed, combined, or integrated (e.g., the level of interdisciplinarity).
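As a purely illustrative sketch of the second component, the citation-based disparity in the public health example could be approximated as the share of cited papers whose specialty differs from that of the publishing journal. The function name `citation_disparity`, the specialty labels, and the scoring rule itself are assumptions made for illustration; the text does not specify the formula actually used, nor how the two components were combined.

```python
def citation_disparity(journal_specialty, cited_specialties):
    """Share of cited papers whose specialty differs from the journal's.

    Hypothetical scoring rule: 0.0 means every citation stays within the
    journal's own specialty, 1.0 means every citation comes from outside it.
    """
    if not cited_specialties:
        return 0.0
    outside = sum(1 for s in cited_specialties if s != journal_specialty)
    return outside / len(cited_specialties)


# Two hypothetical papers from the same public health journal.
mixed = citation_disparity(
    "public health", ["cancer", "surgery", "nursing", "public health"]
)
focused = citation_disparity("public health", ["public health"] * 4)

print(mixed, focused)
```

Under this assumed rule, the paper citing cancer, surgery, and nursing scores higher than the paper citing only public health, which matches the ordering described in the example.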

The survey was sent via email on the 24th of May 2016, and two reminders were sent during the following month. Once the data were collected, analysis included a general descriptive analysis of reasons for misbehavior and rates of disagreement, as well as engagement or observed misbehavior due to authorship disagreements. Logistic regression analysis was used to identify factors that were associated with disagreements and resulting misbehavior. The logistic model included the following variables as covariates: variables declared by the participants including gender and rank, as well as variables calculated using data from the WoS, such as country, number of papers published, multidisciplinarity and field of research. To avoid sparse data, countries with fewer than 20 researchers were collapsed into an “others” category. The following were identified as the response variables: disagreement regarding authorship naming, and disagreement regarding authorship order. Since the study considered researchers’ past experiences in various teams, analysis did not include the number of individuals in their research teams because this number can fluctuate considerably over time. However, the size of research teams is often related to the particular field of research (e.g., social science research tends to have smaller teams than in the natural sciences).
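The collapsing of sparse country categories described above can be sketched as follows. This is a minimal illustration in Python, not the analysis code used for the study; the function name `collapse_sparse` and the example country codes and counts are hypothetical, and only the threshold of 20 researchers and the “others” label come from the text.

```python
from collections import Counter


def collapse_sparse(countries, min_count=20, other_label="others"):
    """Replace each country with fewer than `min_count` researchers by `other_label`."""
    counts = Counter(countries)
    return [c if counts[c] >= min_count else other_label for c in countries]


# Hypothetical sample: one country code per respondent.
sample = ["CA"] * 25 + ["US"] * 20 + ["NO"] * 19 + ["NG"] * 3
collapsed = Counter(collapse_sparse(sample))

print(collapsed)
```

Countries with exactly 20 researchers are kept as their own category, since the text collapses only those with fewer than 20.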

Of the 103,297 emails sent, 14,526 were rejected for various reasons, including: outdated email addresses, mailboxes being full, recipient server offline, or junk filtration settings. According to correspondence with participants, some emails did go directly to spam (without being considered by Qualtrics as rejected email) because of institutional firewalls which again reduced the possibility that researchers actually received the survey. 11,295 respondents opened the survey, and 8364 respondents completed part or all of the questions (response rate of 9.4%). For this specific study, 6641 respondents were included because they answered the main questions pertaining to authorship disagreement and misbehavior (response rate of 7.5%).
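The reported response rates can be reproduced with the short calculation below. It assumes that the denominator is the number of delivered emails (emails sent minus emails rejected); the text does not state the denominator explicitly, but this assumption recovers both reported figures. The function name `response_rate` is illustrative.

```python
def response_rate(responses, sent, rejected):
    """Responses as a percentage of delivered emails (sent minus rejected)."""
    delivered = sent - rejected
    if delivered <= 0:
        raise ValueError("no emails were delivered")
    return 100.0 * responses / delivered


SENT = 103_297      # emails sent
REJECTED = 14_526   # emails rejected (outdated address, full mailbox, etc.)

partial_or_complete = response_rate(8_364, SENT, REJECTED)
analysed = response_rate(6_641, SENT, REJECTED)

print(f"{partial_or_complete:.1f}% {analysed:.1f}%")
```

Emails that reached a spam folder without being flagged as rejected remain in the denominator, so the true rate among researchers who actually saw the invitation is likely somewhat higher.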

Table 1 presents the main characteristics of participants, namely: field of research, gender, and academic seniority (rank). Researchers self-identified their current rank at the time of data collection (not during past disagreements). Many of these characteristics can be compared to those in the population sample of the WoS to assess response bias, which is further discussed in the limitations section. Regarding the publication records of participants, the mean number of publications was 3.85, with a wide range from 1 to 155 publications per individual.

Table 1 Participant characteristics

Results

Rates and Predictors for Authorship Disagreement Regarding Order and Naming

In the course of their research careers, respondents were more likely to face disagreements regarding naming (46.6%) than ordering (37.9%) of authors (Table 2). Among individuals who had been involved in authorship naming or ordering disagreements, the reported frequency of the two types was similar: 70% of respondents reported that disagreements regarding naming and ordering rarely happen.

Table 2 Proportion of disagreement regarding naming and ordering

Multivariate regression analyses regarding naming and ordering disagreements (Table 3) suggest that, compared to individuals in the natural sciences and engineering (identified as the baseline group), researchers in the medical sciences were significantly more likely to face authorship naming disagreements (OR 1.50; 95% CI 1.31, 1.72), while individuals in arts and humanities were less likely to face such disagreements (OR 0.48; 95% CI 0.33, 0.69). Ordering disagreements were likewise more likely in the medical sciences than in the natural sciences (OR 1.66; 95% CI 1.45, 1.90). Contrary to what one might expect, when individuals had a high rate of multidisciplinarity, authorship disagreements were less likely for both naming (OR 0.77; 95% CI 0.63, 0.96) and ordering (OR 0.80; 95% CI 0.64, 0.99).

Table 3 Predictor for naming and ordering disagreements

With respect to career stage, mid-career researchers appeared somewhat more likely than tenured and/or senior researchers to face naming disagreements, although the confidence interval includes 1 (OR 1.17; 95% CI 0.99, 1.37), while researchers at the beginning of their careers were less likely to be involved in authorship naming disagreements (OR 0.73; 95% CI 0.58, 0.90). There was also a statistically significant negative association between rank (e.g., graduate student, postdoctoral researcher, junior vs senior professor) and authorship ordering disagreements (OR 0.76; 95% CI 0.61, 0.96), showing that new researchers were less likely to encounter disagreement. In addition, men were less likely than women to encounter authorship disagreements regarding both naming (OR 0.75; 95% CI 0.66, 0.85) and ordering (OR 0.83; 95% CI 0.73, 0.94).
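To help read the odds ratios reported in this section, the sketch below takes a few of the published values and checks whether each 95% confidence interval excludes 1 (the usual criterion for statistical significance at the 5% level), expresses the odds ratio as a percentage change in odds relative to the baseline group, and backs out an approximate standard error on the log-odds scale. The standard error step assumes a symmetric Wald interval, which is an assumption of this illustration and is not stated in the text; the function names are likewise hypothetical.

```python
import math

# (label, odds ratio, CI lower bound, CI upper bound), copied from the text.
REPORTED = [
    ("medical sciences, naming", 1.50, 1.31, 1.72),
    ("arts and humanities, naming", 0.48, 0.33, 0.69),
    ("high multidisciplinarity, naming", 0.77, 0.63, 0.96),
    ("mid-career, naming", 1.17, 0.99, 1.37),
    ("men vs. women, naming", 0.75, 0.66, 0.85),
]


def excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below OR = 1."""
    return lower > 1.0 or upper < 1.0


def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percentage change in odds versus baseline."""
    return (odds_ratio - 1.0) * 100.0


def approx_standard_error(lower, upper, z=1.96):
    """Approximate log-odds standard error, assuming a Wald 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


for label, odds_ratio, lower, upper in REPORTED:
    print(
        f"{label}: {percent_change_in_odds(odds_ratio):+.0f}% odds, "
        f"SE~{approx_standard_error(lower, upper):.3f}, "
        f"CI excludes 1: {excludes_one(lower, upper)}"
    )
```

Note that an odds ratio describes a change in odds, not in probability, so an OR of 1.50 should not be read as a 50% higher probability of disagreement.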

Factors Causing Disagreements According to Participants

Respondents who reported being involved in authorship disagreements were asked to select one or more factors that influence authorship disagreement for both naming and ordering. The rates of factors were generally slightly higher for authorship disagreement regarding naming than order (Table 4). Valuing or measuring the importance of contribution surfaces as the most important factor for disagreements, especially regarding authorship naming. Proportionally, “differing ethics” was identified by respondents as more important in authorship naming disagreements than in ordering disagreements. Further, lack of discussion in teams was often noted as an area of contention for both naming and ordering.

Table 4 Factors linked to naming disagreement and ordering disagreement

More than 10% of the time, multiple factors were selected, including: confusion and lack of clarity regarding process or criteria; lack of discussion and agreement within the team; and differing ethics. Only 37% of participants identified a single factor as the cause of naming disagreement, while 29% selected two factors, 20% selected three factors and 13% selected four or more factors. This trend of co-occurring factors was also present in authorship ordering.

Misbehavior Resulting from Authorship Disagreements

All participants were asked if they had observed or engaged in misbehavior related to authorship naming and ordering disagreements. Not surprisingly, significantly more individuals reported observing misbehavior than engaging in such misbehavior. As seen in Tables 5 and 6, results for authorship naming and ordering followed the same trends but remain slightly lower for ordering. Survey respondents reported having witnessed a wide range of misbehavior following authorship disagreements, including: instances of hostility (24.6%), undermining of a colleague’s work during meetings/talks (16.4%), cutting corners on research (8.3%), sabotaging a colleague’s research (6.4%), or producing fraudulent work to be more competitive (3.3%).

Table 5 Observed and engaged misbehaviors resulting from authorship disagreements regarding naming
Table 6 Observed and engaged misbehaviors resulting from authorship disagreements regarding ordering

Behaviors considered particularly problematic, such as hostility, undermining the work of colleagues, cutting corners, sabotage and deliberately engaging in fraud (Table 5, rows 1–5), were collapsed into one category in order to examine whether there were predictors associated with such misbehavior. The answer “limitation of further collaboration” was excluded from this specific analysis: although limiting future collaboration may be counterproductive in a team science environment, it is generally not considered misbehavior.

“No specific behavior” (Table 5, row 8) was identified as the baseline group in logistic regression analysis of misbehavior engagement. Results showed a high level of variance and no statistical significance given the limited number of individuals in the population (Table 7). However, larger populations of individuals who had observed misbehavior did allow for further analysis (Table 8). Compared to researchers in the natural sciences, researchers in the medical sciences were statistically more likely to observe misbehavior as a result of authorship naming (OR 1.26; 95% CI 1.09, 1.45) and authorship ordering (OR 1.34; 95% CI 0.84, 1.45). Mid-career researchers also observed more misbehavior resulting from authorship ordering than senior researchers or tenured researchers (OR 1.22; 95% CI 1.02, 1.47).

Table 7 Presence of engaged misbehavior
Table 8 Presence of observed misbehavior

Discussion

An unexpected finding of this study was the lower reported frequency of authorship ordering disagreements as compared to naming disagreements. In the recent literature, authorship ordering practices seem to be related to particular disciplines, fields, or teams. Although studies have shown that the first and last author positions are associated with a higher number of contributions (Larivière et al. 2016; Sauermann and Haeussler 2017), ordering is still subject to little or no explicit guidance (Smith and Boulanger 2011; Smith and Master 2017). Conversely, naming of authors is more widely addressed in the definitions of authorship found in institutional policies and in guidelines provided by journals, the ICMJE and the Committee on Publication Ethics (COPE). Excluding an individual in naming may seem more ethically questionable and significant than slightly undervaluing their contribution through ordering. However, in the long term, undervaluing contributions through ordering may create the same ethical problems (i.e., discrimination, exclusion and even exploitation) as does naming guest authors or failing to provide credit to those deserving authorship. For example, consistently excluding certain individuals as first or last author when they merit that recognition could negatively affect the individual’s career in the long term. This may actually be an insidious manner of appropriating the work of certain individuals and a de facto means of creating a glass ceiling that bars them from leadership roles.

This study suggests that authorship order disagreements in the social sciences are similar to those of the natural sciences (baseline group). Yet, when naming authors, social scientists seem to be less likely to be in a disagreement. When compared to the natural sciences and medical sciences, there is a limited amount of literature on authorship order in the social sciences. Bebeau and Monson (2011) conducted a historical study that considered authorship guidance in the social sciences and suggested that order should be determined by “creative intellectual contributions.” Henriksen (2016) has demonstrated the significant rise in collaborative social science research mainly in fields that resort to experimentation, large data sets and quantitative methods in their research. But it should be acknowledged that collaboration in the social sciences is relatively recent; these fields of research may still be trying to establish norms and practices for ordering, which already exist in the medical sciences, engineering and the natural sciences. Some researchers from philosophy of science have suggested that the social sciences are more theoretically and epistemologically fragmented than most fields in the fundamental or applied sciences (Abbott 2000; Moody 2004). Given this fragmentation, individuals may agree on the importance of a research project but may at the same time differ as to the value of various types of contribution necessary in the ordering process.

The argument that disciplinary norm diversification is at the center of authorship disagreements does not seem to apply to researchers in multidisciplinary research teams; our study found that these researchers are less likely to witness or have such disagreements. Multidisciplinary research teams cite, use, and come from various fields that are epistemologically complex and varied. Although further research is necessary to understand why multidisciplinary teams are less prone to disagreements, certain hypotheses are worth considering. For instance, diversity may well be a recognized strength in such teams, and increased communication may be inherently valued and explicitly encouraged in research practices. Moreover, as shown by Larivière, Haustein and Börner (2015), multidisciplinary research papers are rated as successful based on citation counts. Thus, it may also be the case that individuals in multidisciplinary teams are less likely to undervalue or disagree about other researchers’ contributions because they recognize the potential for synergistic success, i.e., that the research will yield more than the sum of its parts. Lastly, one could also argue that many researchers may not understand the specificities of the work of those from another discipline. From an epistemological standpoint, Kukla has suggested that such radically multidisciplinary research can be considered “unauthored” if authorship requires all researchers to be accountable for, and thus understand, the totality of the work being published (Kukla 2012).

Not surprisingly, this study found authorship disputes to be most prevalent in medical research. The current literature (Nylenna et al. 2014) corroborates the high rates of disagreement reported in our survey, for both ordering and naming of authors. Certain researchers in the biomedical sciences have expressed misgivings about multiple authorship diluting responsibility and accountability (Cronin 2005). Journals in the health and biomedical sciences have issued the most guidelines including those that, for example, limit the number of researchers who can be named as authors on the byline (Weeks et al. 2004). This may be explained to some extent by the fact that biomedical research is entrenched in a medical system where legal ramifications are of significant concern (something that may not be the case in other fields). As a result, authorship in the medical sciences entails that an individual—or a limited number of easily identifiable individuals—take responsibility for the work in a way that is directly linked to accountability.

Although such disagreements may be collegial and productive in a manner that promotes fairness, this does not seem to always be the case. This study found medical science researchers to be the most likely to experience misbehavior in relation to authorship disagreements. The heightened competitive pressures within the medical field, as compared to other fields, may well contribute to a greater incidence of disagreement and misbehavior. Indeed, studies have suggested that competition is linked to questionable behaviors (Anderson et al. 2007b). Competition may also explain to some extent why mid-career researchers, who will be evaluated for their tenure based on their publication record, were involved in more disagreements than individuals who were tenured and/or near the end of their careers.

The increased likelihood of authorship disagreements experienced by women may be the unfortunate result of contextual discrimination. Studies often show the presence of the Matilda effect, where women’s contributions are more likely to be overlooked than those of men (Rossiter 1993). This has been shown for the field of engineering (Ghiasi et al. 2015; Rossiter 1993), as well as more generally in academia (Lincoln et al. 2012). Increased disagreement is not surprising, given that women are more likely to be relegated to less prestigious tasks, such as performing experiments (Macaluso et al. 2016).

In studies on funding distribution in research, the probability of gender bias is related to the gender balance of the funding committee members (Mutz et al. 2012). Commonly referred to as the “salience hypothesis”, this notion suggests that the importance of a characteristic will depend on its prominence. Although there has been an increase in the number of women in science, there remain global gender disparities (Larivière et al. 2013). While recent studies suggest that women in the younger generation are outperforming men in strictly quantitative measures (van Arensbergen et al. 2012), the deck remains stacked against women, as evidenced by unequal pay, gender discrimination and funding disparities (Shen 2013). The relationship between equity in authorship allocation and gender disparities in science requires additional scrutiny.

The presence of disagreement regarding authorship does not in itself suggest wrongdoing that will seriously undermine science. Most disagreements can be resolved through collegial discussion to achieve a mutually agreeable and fair outcome; as such, this would make disagreement of much less concern than egregious behaviors. However, this study showed that authorship disagreements can result in limiting collaboration (40.1%), increasing hostility in a team (7.0%), and undermining colleagues (4.3%). As a result, some authorship disagreements do seem to negatively affect collaboration. In a context where team science has achieved prominence—and in some fields become an imperative—a lack of team cohesion may well undermine collaborative initiatives, and threaten individual and collective success. If the researcher’s motivation, focus, and ability to work in a team are hampered, ultimately the integrity and productivity of science are affected.

In this study, a relatively small number of researchers admitted to having engaged in a behavior that had a direct negative effect on science, such as: fraud (0.8%), sabotage of an individual’s work (1.1%), and cutting corners to compete with a colleague (1.8%). Although it is impossible to make any direct comparisons with other surveys regarding misconduct that have used different tools (e.g., survey questions) and samples, this finding nonetheless highlights ethical issues. For instance, in a meta-analysis on questionable behaviors in science, Fanelli (2009) found that a pooled weighted average of 1.97% of scientists admitted to having fabricated, falsified, or modified data or results at least once. One may wonder if and to what extent the respondents of the various surveys may have been subject to authorship disagreements.

This study is the first to consider the rate of participants who observed (6.4%) and engaged in (1.1%) sabotage after an authorship dispute. In the literature on responsible conduct of research, there is limited information regarding sabotage. Certain scandalous cases have been briefly discussed in the literature. For example, Magdalena Koziol at Yale University alleged that her supervisor was tampering with her transgenic zebrafish experiments (Enserink 2014). Vipul Bhrigu systematically and continuously tampered with the work of colleague Heather Ames by poisoning her cell-culture media used in cancer research at the University of Michigan in Ann Arbor (Maher 2010). The US Office of Research Integrity officially considered this latter case of sabotage to be misconduct, but the judgment was criticized as a significant stretch from the official definition, which is limited to falsification, fabrication and plagiarism (Rasmussen 2008). Although it would be difficult to define “sabotage” or even include this concept in a misconduct policy, it clearly has important effects on science. Not only can sabotage (e.g., tampering with a colleague’s experiment) distort science, as do fabrication and falsification, it can also introduce doubt or suspicion in research collaborations and erode the trust that is necessary for team science.

To avoid disagreement and ensuing potential misbehavior, researchers in the responsible conduct of research have attempted to create quantitative models, comparative processes, and detailed guidelines to help with the complex task of measuring contributions for authorship distribution (Clement 2014). However, there will likely always be a level of subjectivity in valuing schemes (Dyck 2012). As such, it is not surprising that researchers mentioned that “measuring or valuing contribution” was the most important factor leading to disagreements. It is also important to consider that 63% of respondents selected multiple factors that led to authorship naming disagreements, and not simply one factor such as measuring contributions. Other factors that remain important include “Confusion and lack of clarity (e.g., process or criteria)” as well as “Lack of discussion and agreement within the team”, which seem to belong in the realm of procedural and relational justice. Disagreements will happen in any relationship. However, even if perfect fairness is impossible, adopting practices that enable discussion and agreement may be quite feasible. Further research into these other factors could pay dividends in better understanding and mitigating authorship disagreements.

Limitations

This study has two limitations that must be considered: (1) response rate, and (2) disclosure of morally questionable behavior.

Although a 7.5% response rate may seem low, it is actually comparable to many wide-scale cross-sectional surveys, which average below 10% (Kohut et al. 2012; Salganik 2017). Even in the most prestigious national social sciences surveys on US households, it is not surprising to see nonresponse rates over 60% for phone and face-to-face interviews (Massey and Tourangeau 2013). There are many social and technological reasons to explain the decreasing response rates (Salganik 2017). In the present case, the low response rate may be explained by the fact that a portion of individuals in the initial sample may not have received or seen the email. The email addresses were from the Web of Science database, which indexes the email addresses that authors list on publications. With the reduction of tenure positions in academia, researchers are very mobile and may change email addresses often during their careers. Some researchers mentioned that the email was received as spam. Although minor modifications were made to minimize this issue (such as removing active links in the email), it is impossible to account for the settings in all institutional firewalls.

It has been suggested that declining response rates over the last two decades can threaten data validity (Tourangeau et al. 2013). Although survey methodologists argue that high response rates are a good way to obtain unbiased results, this does not mean that lower response rates necessarily lead to biased results (Massey and Tourangeau 2013). The introduction of bias in studies with a low response rate, called nonresponse bias, occurs when one part of the population with a particular view is not adequately represented in the study. In other words, if non-respondents have characteristics that are very different from those of respondents with regard to the questions being asked, then bias would result. In this study, when comparing respondent and non-respondent groups, demographic factors such as gender, field, subfields, or rank were actually quite similar.

There was a slightly lower response rate among natural scientists and engineers, which may be explained by the fact that individuals in these fields are more mobile and frequently change institutional emails. This may also explain the slight discrepancy between men and women, since there are proportionally more men in the natural sciences than in other fields of research. When looking at survey study trends, women are also generally more likely to respond to online survey studies (Sax et al. 2003; Smith 2008). Since this study also suggests that women are involved in more disagreements than men, the topic may have been of greater interest to women than men. However, such discrepancies are so small that they have limited effect on the results of this study.

This study provides some large-scale insight regarding authorship but, as in the case of any research on morally questionable behavior, it will most likely experience a degree of underreporting regarding participants’ own behavior as well as their perceived behavior. Individuals may be ashamed of past misbehavior or they may be lying to themselves regarding the extent of their actions. As a result, responses to questions about the participants’ own questionable behavior are most likely understated and possibly rationalized. Conversely, when researchers are asked if they perceive morally problematic behavior, they are not blamed or ashamed and may feel disengaged from the situation. This makes respondents more likely to report their colleagues’ behavior than their own, creating underreporting of engaged behavior and a possible overreporting of observed behavior.

Both engaged and observed behaviors were presented because they represent two very different measurements. One aims to describe individual misbehavior (engaged) and the other aims to describe a colleague’s misbehavior (as perceived by another individual). The two measurements are intertwined, because the presence of perceived misbehavior may be conducive to that misbehavior becoming the norm, and thus to others engaging in it. For example, if a team of researchers is commonly cutting corners, and is seen to do so, junior scholars may be more likely to adopt such behavior to normatively align with a common practice. Not doing so will cause “normative dissonance”, i.e., confusion about the right action; unfortunately, colleagues might align themselves with the dominant practice so as to avoid alienation and to keep a competitive edge (Anderson et al. 2007a).

Recommendations and Conclusion

Researchers’ productivity and integrity are influenced by their workplace, collaborations, sharing, recognition, as well as the rules and norms of the system of science itself. The naming and ordering of individuals as authors is critically important in recognizing those responsible and accountable for the published work. This study allowed consideration of the predictors of authorship disagreements, the factors leading to disagreements that respondents believe to be important, as well as the misbehavior that might ensue as a result of unresolved disagreement. Some results of this study have been previously identified in the literature, but several are novel and merit further discussion. These include, notably: significant gender disparities regarding rates of disagreement, the decreased odds of disagreements in multidisciplinary research, and the resulting questionable conduct, including sabotage.

What Can or Should be Done?

Further elaborating definitions of authorship is unlikely to produce results given the fact that the notion of “substantial contribution” is highly subjective. Based on this paper’s findings, the following novel approaches are proposed to support ethical authorship distribution and the prevention of misbehavior.

  1. Since multidisciplinary research seems to result in less disagreement, perhaps more flexibility regarding what is considered a “valuable contribution” is warranted. Although researchers identified valuing and measuring contribution as an important factor leading to disagreement, one must be cognizant of the important limitations of these criteria.

  2. Since authorship is a multifaceted problem, discussions should promote effective communication, effective division of labor, and the management of bias and discrimination regarding gender, rank, discipline, etc. Teaching issues around authorship should not be limited to definitions or criteria for inclusion and exclusion, as this often ends in an impasse.

  3. Since misbehavior following from disagreements seems commonplace and is often of a retaliatory nature within a group (e.g., hostility, sabotage), management strategies may be more effective if rooted not only in systemic justice but also in relational and interactional justice. More research on such misbehavior is warranted given its negative effects on science.

  4. Teaching researchers about the detrimental effect that unresolved disagreements can have on research may help highlight the importance of this issue, while making clear that sometimes there is no perfect “fair” solution to authorship and that trade-offs will be necessary. Institutional support through a research integrity consultation or ombudsperson service may help in situations that are difficult for teams to work through independently (Master et al. 2018).

Teamwork and collaborative arrangements are a modification to the social structure of science that is of significant scientific benefit. However, this modification also comes with important epistemic challenges in areas of communication, coordination, shared responsibility, credit, and accountability (Andersen 2016). As mentioned by Peterson and colleagues (2014), the rise in team science necessitates a new type of team ethics that must encourage cooperation. Research, scientific rigor and the pursuit of science are inextricably bound to the human dynamic of the system of science. And as science becomes more collaborative and team-oriented, that human dynamic is all the more prevalent. Not taking the time to discuss, plan for, and manage these complex issues undermines the collaborative arrangements that are necessary for more productive, novel and ethical science.