Introduction

For someone who entered the academic field within the last half decade, it may feel as if the h-index has always been around, so extensive is its use (Lee et al. 2009; Ehteshami Rad et al. 2010; Lacasse et al. 2011; Ciriminna and Pagliaro 2013; Svider et al. 2013). Yet, it was proposed only a little more than a decade ago (Hirsch 2005), and its perception as an established metric just underlines how quickly it was adopted into wide use, despite its limitations.

The h-index is the largest number h such that an individual has (co)authored h publications that have each received at least h citations. Some other proposed bibliographic indexes are summarized in Table 1. Similar to the h-index, each of these indexes summarizes the publication and citation record of a researcher as a scalar number. Any bias in the underlying data translates into a bias in the resulting index. While numerous kinds of bias have been studied in the literature (see Footnote 1), this work focuses on what I call the coauthor problem: for publications with more than a single author, the coauthor problem relates to distributing credit among the authors. This credit comes in many forms, ranging from the inclusion of that publication in a coauthor’s publication count and contribution of the publication to other measures such as citations or impact factors, to monetary benefits.
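The definition of the h-index can be made concrete with a minimal sketch. The function name `h_index` and the example citation counts are illustrative assumptions, not taken from any cited work.

```python
def h_index(citations):
    """Return the largest h such that h publications have at least h citations each."""
    # Sort the citation vector c in descending order, as in Table 1.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, c_p in enumerate(ranked, start=1):
        if c_p >= rank:
            h = rank  # the top `rank` publications all have >= rank citations
        else:
            break
    return h


# Hypothetical citation record of one researcher (illustrative values only).
example = [10, 8, 5, 4, 3]
print(h_index(example))
```

Note that every coauthor of a publication is counted with the full citation count here, which is exactly the uniform treatment that gives rise to the coauthor problem.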

One possible assumption is that the contributions of multiple authors are uniform and that the same credit shall be assigned to all coauthors. In this case, the only open question is how much total credit shall be distributed, with a number of different answers (Lindsey 1980; Hirsch 2005; Mesnard 2017).

When we suspect that the coauthors’ contribution is not uniform, the ideal solution to the coauthor problem would be to provide some quantitative information about each author’s contribution, which could accompany each publication and be used to distribute credit to authors. However, such a scheme was proposed at least two decades ago (Lukovits and Vinkler 1995), and it has yet to be agreed upon or implemented by a sufficiently large number of publishers to enter common use. In the absence of such additional information, the author sequence usually offers the only clues to this distribution of credit. In many scientific fields, authors are listed alphabetically, or even arbitrarily, severely limiting the possibility of inferring author contributions and rendering the uniform approach the only viable one. Fortunately, in some areas of academic publishing, authors are listed in order of decreasing contribution. In these areas, it is commonly assumed that contributing as a coauthor requires less effort than contributing as a first author. This has been referred to as the sequence-determines-credit (SDC; Tscharntke et al. 2007) assumption. I restrict my investigation to areas where this assumption holds.

In this context, it has been argued that without appropriate correction, researchers may benefit disproportionately from working in larger research groups, as they may more easily (see Footnote 2) become coauthors of more publications by their increased number of peers (Butson and Yu 2010; Sahoo 2016). A different, yet related, problem is that of honorary (see Footnote 3) authorship. In both of these cases, bibliometric indexes for such researchers may be higher than those for other researchers with similar productivity.

The following briefly presents ethical criteria and previous proposals for solving the coauthor problem. I then propose a method of first-author-credit normalization as one part of a solution. I discuss three example applications where the new normalization method is well-suited, and I discuss why a new counting method, adaptable geometric counting, is more appropriate than harmonic counting. In fact, I show that adaptable geometric counting can be used as an alternative to harmonic counting: when using the golden ratio as a parameter, geometric counting is equally compatible with the empirical data. These data have previously been used to show that among four counting methods, harmonic counting best represents human expectations of author contributions (Hagen 2010b).

Table 1 Bibliometric indexes based only on the citation vector \(\mathbf {c}\), where \(c_p\) denotes the number of citations received by publication \(p\), sorted in descending order

Ethical criteria

Many criteria have been put forward in the literature regarding ethically acceptable solutions to the coauthor problem. I do not intend to imply that the following criteria are axioms and immune to discussion; rather, they serve to formalize previous practically oriented solutions to the problem of deriving ill-defined notions of credit and contribution from author lists. According to these criteria, coauthor credit shall be:

  1. Strictly positive. All coauthors shall receive strictly positive credit for a publication, assuming that an author list does not contain authors without a contribution.

  2. Strictly non-inflationary. The total credit of a publication shall not increase with the number of authors, thus preventing inflationary bias, and instead the total credit should be normalized to 1. In particular, full authorship credit shall not be issued repeatedly to every coauthor, which prevents the most extreme form of inflationary bias (Hagen 2008). Inflationary bias can cause the issues discussed in the introduction, which complicate the process of evaluating scientific productivity by means of publications. It also poses a problem whenever credit for individual researchers is aggregated, for example, when institutions or countries are compared; or when total credit is to be assigned from a fixed (e.g., monetary) resource.

  3. Proportional. Coauthors’ relative credit shall reflect their relative importance, instead of awarding the same credit to all authors (hence avoiding equalizing bias; Hagen 2008). Without additional information, equalizing bias can only be avoided in fields where the author list is supposed to provide information about the relative importance of authors, in particular, in fields where authors are not generally ordered alphabetically.

    Of course, even in the ideal case of authors self-reporting their contributions, relative importance needs to be defined, and consensus must be reached regarding how much different elements such as acquisition of funding, initial conception, work performance, manuscript writing, supervision, and future accountability contribute to this importance. However, whatever the exact definition, the main purpose of this criterion is to avoid having to assign the same credit to all authors.

  4. Empirically founded. If possible, credit assigned based on a rule shall reflect the contribution practically perceived by readers and other users of the publications (Hagen 2010b). One reviewer of an earlier version of this manuscript rightfully objected that perception may vary from one reader to another and that small experimental studies cannot be generalized “to all papers published in the past and the future.” As to past papers, it is obviously impossible to measure the individual perception at the time of writing, and even more so to influence it. So, while there will be systematic bias not only from inter-individual but also from temporal variation, we must work with the few data that are available or choose not to use the data at all.

    Concerning future papers, it is possible that a consensus on awarding of credit may shape, and homogenize, future perceptions of contribution. Nonetheless, while one might conclude that this makes any global credit rule acceptable in the long run, it seems a longer step to establish an arbitrary credit rule than one that is based on current, even average perceptions.

  5. Parsimonious. Credit shall be assigned based on a simple rule with as few arbitrary parameter choices as possible. For example, a formula interpolating between harmonic and fractional counting with an additional parameter (Liu and Fang 2012) may be less preferable than just harmonic counting, even if the latter is slightly worse in terms of explaining variability in empirical data (Hagen 2013).

In addition to these previously discussed criteria, in this work, I introduce the following ones according to which coauthor credit shall be:

  2’. Loosely non-inflationary. I propose to consider a relaxed version of criterion 2, according to which the total credit of a publication need not necessarily be fixed, but bounded; that is, it may increase with the number of authors, but with an upper bound. This criterion is particularly appropriate when the credit is assigned not from a fixed, but still from a limited resource.

  6. Independent of lower-ranked coauthors. For some types of credit, it may be inappropriate to decrease first-author credit, relative to the credit of a single author, in order to credit coauthors: for example, when publications are counted as a criterion for graduation or promotion; when encouraging collaboration between researchers; or when assigning credit based on accountability. These three cases are discussed in detail in subsections of the "Example applications" section. In particular, under this criterion, the first author of a multi-author publication shall receive the same credit as a single author of that same publication.

Obviously, not all criteria can be fulfilled simultaneously; in particular, criteria 2 and 6 are mutually exclusive under the positivity criterion 1. As we will see in the following section, at least one method fulfills all previously proposed criteria (criteria 1 to 5), yet fails to fulfill the criterion to not punish the first author for collaborating (criterion 6). After presenting the state of the art, I will argue under which circumstances this new criterion is most relevant, and how it is compatible with the relaxed criterion 2’. I will then propose to replace criterion 2 by criterion 2’, which will open the door to a solution of the coauthor problem that fulfills all criteria.

Related work

Easily computable post-computation corrections to the coauthor problem of the h-index were proposed not long after the h-index itself was presented: Batista et al. (2006) proposed an individual h-index \(h_I\), which is calculated by dividing the h-index by the average number of authors of the top h publications (the Hirsch core). This avoids inflationary bias, but not equalizing bias. A different approach is the first-author h-index, \(h_{\mathrm {fa}}\), which especially rewards first authors, including single authors (Butson and Yu 2010). Effectively, first-author publications are counted twice after h is determined, avoiding some equalizing bias at the cost of inflationary bias.

More advanced, source-based adaptations to the h-index have been proposed, by decreasing the weight of a publication with increasing number of authors before calculating the h-index: either by dividing the number of citations (Lozano 2013), or by considering fractional numbers of publications (Schreiber 2008). Both approaches require a counting method to determine the weight of a publication, the need for which had been recognized long before the h-index was proposed. As a result, most counting methods—in particular, all those briefly summarized in the remainder of this section—are universally applicable to almost any bibliometric index (compare Table 1).

In 1973, Cole and Cole suggested, with caution, counting only first-author publications; this has found application in Opthof and Wilde’s (2009) h-index of first-author publications, and similarly in h-maj by Hu et al. (2010), who considered as major contributors only first and corresponding authors. Both approaches disregard all other authors and thus violate positivity (criterion 1).

Lindsey (1980) proposed fractional weights \(1 / n_p\), with \(n_p\) being the number of authors of publication \(p\). That approach was reiterated in many contexts, e.g., by Schreiber (2008), and revived most recently by Sahoo (2016), who proposed an \(I\)-index as an author’s “share” of their total citations, again using fractional weighting. Mesnard (2017) proposed a parallelization bonus for multiauthored publications, effectively scaling the fractional weight \(1 / n_p\) to \((n_p+ 2) / (3 n_p)\), thereby changing the total publication credit from 1 to the potentially unbounded value \((n_p+ 2) / 3\). The Hirsch, fractional and parallel approaches, while crediting all coauthors (criterion 1), introduce equalizing bias (criterion 3) through the uniformity assumption. Preventing equalizing bias is the main advantage of the following three counting methods, which take into account author ranks within author lists.
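As a brief worked check of the parallelization bonus (the choice \(n_p = 4\) is only an illustrative assumption), the total credit follows from multiplying the per-author weight by the number of authors:

$$\begin{aligned} n_p\cdot \frac{n_p+ 2}{3 n_p} = \frac{n_p+ 2}{3}, \qquad \text {e.g., for } n_p= 4{:}\quad 4 \cdot \frac{6}{12} = 2 . \end{aligned}$$

Each of the four authors thus receives a credit of \(1 / 2\) instead of the fractional \(1 / 4\), while a single author (\(n_p= 1\)) still receives exactly 1.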

Van Hooydonk (1997) used arithmetic (also referred to as proportional) counting using linearly decreasing weights, normalized such that the sum equals 1, to weight numbers of publications and citations. This normalization will be referred to as total-credit normalization in this work, and it is applicable to a range of counting methods. Geometric counting (with raw weights of \(2^{-i_p}\), subject to total-credit normalization as above; \(i_p\) being the position of an author in the author list of publication \(p\)) was proposed by Egghe et al. (2000); however, a convincing argument for the choice of the base factor 2 has yet to be made.

Finally, harmonic counting, using weights proportional to \(1 / i_p\), has been proposed and studied multiple times (Hodge and Greenberg 1981; Hagen 2008, 2009, 2013); this includes one proposal of using the raw weights \(1 / i_p\) without further normalization (Tscharntke et al. 2007).
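The rank-based counting methods described above can be sketched as follows. This is a minimal illustration, not the implementation of any cited work: the function names are hypothetical, and the raw arithmetic weight \(n + 1 - i\) is an assumption about what "linearly decreasing" means here. Exact fractions are used so that the normalization to 1 can be checked without rounding.

```python
from fractions import Fraction


def raw_weights(method, n):
    """Raw (unnormalized) weights for authors at positions 1..n."""
    positions = range(1, n + 1)
    if method == "fractional":
        return [Fraction(1) for _ in positions]
    if method == "arithmetic":
        # Assumption: linearly decreasing raw weights n, n-1, ..., 1.
        return [Fraction(n + 1 - i) for i in positions]
    if method == "geometric":
        return [Fraction(1, 2 ** i) for i in positions]
    if method == "harmonic":
        return [Fraction(1, i) for i in positions]
    raise ValueError(f"unknown counting method: {method}")


def total_credit_weights(method, n):
    """Total-credit normalization: rescale raw weights so they sum to 1."""
    raw = raw_weights(method, n)
    total = sum(raw)
    return [w / total for w in raw]


for method in ("fractional", "arithmetic", "geometric", "harmonic"):
    weights = total_credit_weights(method, 3)
    print(method, [str(w) for w in weights])
```

Under total-credit normalization, the first author's weight in each of these methods shrinks as authors are added, which is the effect criterion 6 is meant to address.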

Table 2 Weights for credit of author \(i\) in a publication with \(n\) authors according to several counting methods

Table 2 summarizes all aforementioned counting methods with their weights \(\mathbf {\alpha }\), as well as the criteria they violate. Equations 1 and 4 (see Table 2) result in considerable inflationary bias (criteria 2 and 2’: credit of a publication greater than 1 and increasing, without bounds, with the number of authors), encouraging honorary authorship (Hagen 2008). Other counting methods either ignore some coauthors and violate positivity (Eqs. 2 and 3 and criterion 1) or punish first authors of multi-author publications compared to single authors (Eqs. 5–9 and criterion 6). The latter is a side effect of ensuring positivity and preventing inflationary bias by total-credit normalization. Note that total publication credit before normalization is bounded (\(\lim _{n \rightarrow \infty } \sum _i\beta _i< \infty\)) only for geometric counting; yet harmonic counting has been found to agree best with empirical data (Hagen 2010b).

With the exception of Eq. 3, none of the counting methods consider the special importance that last or corresponding authors are credited with in some fields, such as medicine (Baerlocher et al. 2007; Burrows and Moore 2011). In these cases, those authors tend to find extra consideration (instead of just that of the \(n\)th author) by the ad-hoc postulate of equal credit for first and last authors. This correction approach usually consists of a) ignoring the last author in the regular credit computations, b) then assigning credit equal to that of the first author to the last author, and c) if necessary, re-normalizing. This can be applied across various counting methods (Sekercioglu 2008; Zhang 2009a; Hu et al. 2010; Butson and Yu 2010; Hagen 2010b). Moreover, the same reasoning can be applied to multiple “first” authors, which are usually indicated by “equal contribution” comments in the publication. Alternative solutions, e.g., for the issue of last authors, have been offered by Baerlocher et al. (2007) and Burrows and Moore (2011). Therefore, I consider the multiple-first-authors problem, as well as the last-author problem, orthogonal to the general coauthor problem and outside the focus of this manuscript (similarly to the academic-age problem, compare Footnote 1). In particular, while some empirical data for the last-author problem do exist (Wren et al. 2007), there are no comparable data related to the multiple-first-authors problem.

Proposed solutions

In the following section, I propose first-author-credit normalization as a way to satisfy criterion 6. With this, criterion 2 cannot be satisfied simultaneously, underlining the importance of bounded total publication credit (criterion 2’) which will be further discussed. I then discuss three example applications of first-author credit normalization, one of which is credit based on author accountability rather than relative contribution.

Table 3 Weights for different counting methods using total (\(\alpha\)) and first-author (\({\bar{\alpha }}\)) credit normalization; normalized quantities indicated in bold

First-author-credit normalization

To satisfy criterion 6, it is indispensable to assign the first author full credit (\({\bar{\alpha }}_1 := 1\)). To maintain relative credits between coauthors (\({\bar{\alpha }}_{i}\big /{\bar{\alpha }}_{j}\); criterion 3), starting from the above total-credit normalization weights, other authors’ weights need to be rescaled the same way:

$$\begin{aligned} {\bar{\alpha }}_i = \alpha _i \cdot {\bar{\alpha }}_{1}\big /\alpha _{1} = \alpha _{i}\big /\alpha _{1}. \end{aligned}$$
(10)

The result is numerically illustrated in Table 3 by comparing total-credit normalization to first-author-credit normalization for a number of counting methods; note that as expected, first-author-normalized credits of all authors are independent of the number of authors.
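Equation 10 can be sketched in a few lines. The example below is a minimal illustration using harmonic counting; the function names are hypothetical and not part of any cited work.

```python
from fractions import Fraction


def harmonic_total_credit(n):
    """Harmonic weights for n authors under total-credit normalization (sum = 1)."""
    raw = [Fraction(1, i) for i in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def first_author_normalized(alpha):
    """Eq. 10: divide every weight by the first author's weight alpha_1."""
    first = alpha[0]
    return [a / first for a in alpha]


for n in (2, 3, 5):
    alpha = harmonic_total_credit(n)
    alpha_bar = first_author_normalized(alpha)
    # The first author always receives 1; the total is no longer fixed at 1.
    print(n, [str(a) for a in alpha_bar], "total:", str(sum(alpha_bar)))
```

The rescaled weights of the first three authors are identical for three and for five authors, illustrating that first-author-normalized credit does not depend on the number of coauthors, while the printed total grows with \(n\).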

Unfortunately, as \(\alpha _1\) is equal to or less than 1 and usually decreasing with the number of coauthors \(n\), this raises the problem of introducing inflationary bias, especially when starting from counting methods which do not suffer from inflationary bias due to total-credit normalization. The relevant quantity to study is

$$\begin{aligned} \lim _{n \rightarrow \infty } \sum _{i= 1}^{n} {\bar{\alpha }}_i, \end{aligned}$$
(11)

which can be infinite or finite; the former case violates criteria 2 and 2’ (unbounded total credit), while the latter violates only criterion 2 (bounded total credit). The importance of bounded total credit is discussed in the following section. Note that implicitly, first-author-credit normalization has been used by Tscharntke et al. (2007); and a variant can be found in the proposals by Mesnard (2017), where the first-author credit is made less dependent on the number of coauthors by ensuring that it be at least \(1\big /3\), regardless of the number of coauthors.

Importance of bounded credit

The notion that the total publication credit may exceed 1 is generally independent of the actual counting method used. However, after setting \({\bar{\alpha }}_1 = 1\) and rescaling \({\bar{\alpha }}_i\), the sum of the weights may or may not be bounded (\(\sum _{i= 1}^\infty {\bar{\alpha }}_i< \infty\)). With first-author-credit normalization, geometric weights \({\bar{\alpha }}^{\text {geo}}\) are bounded (fulfilling criterion 2’), while this is not true for harmonic weights \({\bar{\alpha }}^{\text {harm}} = 1 / i\), as \(\sum _{i= 1}^n 1 / i\) tends towards \(\infty\) as \(\log n\) (Sondow and Weisstein 2016). Thus, harmonic counting with first-author-credit normalization fulfills criterion 6 at the expense of violating not only strict criterion 2, but also relaxed criterion 2’: with increasing number of authors, the total credit grows without bounds.
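The contrast between the two cases can be made explicit with a short worked comparison. It is a sketch using only the standard geometric series and the integral bound on the harmonic sum; the value \(n = 1000\) is chosen purely for illustration.

$$\begin{aligned} \sum _{i= 1}^{n} {\bar{\alpha }}^{\text {geo}}_i&= \sum _{i= 1}^{n} 2^{1 - i} = 2 - 2^{1 - n} < 2, \\ \sum _{i= 1}^{n} {\bar{\alpha }}^{\text {harm}}_i&= \sum _{i= 1}^{n} \frac{1}{i} > \ln (n + 1) . \end{aligned}$$

For \(n = 1000\) authors, the geometric total stays just below 2, whereas the harmonic total is approximately 7.49 and keeps growing with every additional author.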

From the above, one concludes that the importance of bounded credit manifests most drastically in combination with the trend towards publications having excessively high numbers of authors, with papers even in the life sciences exceeding \(n = 1000\) authors (Castelvecchi 2015). In cases like these, not even the logarithmic character of the total harmonic credit serves to control its size effectively. But those seeming exceptions only highlight the general trend that the average number of authors per publication is steadily growing, for example, from under 2 to over 5 between 1965 and 2015 in the MEDLINE database (U.S. National Library of Medicine 2016), and there is no indication this trend might be slowing down. Hence, with first-author-credit normalization, it is important to prevent explosion of total credit of current (and future) publications compared to past ones.

A similar argument is directed at the nature of the assigned credit. For a research or funding entity that advertises monetary incentives for publications (Shao and Shen 2011) and aims to encourage collaborations, on the one hand, it is counterproductive to normalize total credit and thus punish the first author for more extensive collaborations. On the other hand, being subject to the potential obligation of rewarding several dozen or even a few hundred authors of a single paper may pose unwanted risks. A counting method with bounded credit features an implicit barrier against such risks.

An improved counting method will thus limit the imbalance between papers written in different periods by limiting (while not fixing) the total assigned credit; maintain the principle of parsimony, which is a particular strength of the harmonic counting method compared to other methods; and improve the agreement with empirical data of geometric counting, which is lower than for harmonic counting (Hagen 2010b). Therefore, adaptable geometric counting is proposed and evaluated in the following.

Adaptable geometric counting

Inspired by Hagen (2010b), who compared four counting methods (Eqs. 6–9) in terms of agreement with empirical data, this work extends the comparison to a new, adaptable counting method based on regular geometric counting (Eq. 8). In particular, as the base factor 2 lacks a theoretical or empirical foundation, I replaced it by a variable parameter \(\gamma\) and applied both total-credit and first-author-credit normalization:

$$\begin{aligned} \beta _i^\mathrm {adapt} = \gamma ^{-i} \quad \Rightarrow \quad \alpha _i^\mathrm {adapt} = \frac{\gamma ^{-i} \left( \gamma - 1 \right) }{1 - \gamma ^{-n}} ,\quad {\bar{\alpha }}_i^\mathrm {adapt} = \gamma ^{1 - i} . \end{aligned}$$
(12)

Here, \(\gamma\) represents the ratio between credits assigned to two subsequent authors. In the “Results” section, the golden ratio \(\phi\) will be shown to have a special significance for this parameter. For \(\gamma = \phi = ( 1 + \sqrt{5} ) / 2 \approx {1.618}\), Eq. 12 simplifies to

$$\begin{aligned} \beta _i^\mathrm {gold} = \phi ^{-i} \quad \Rightarrow \quad \alpha _i^\mathrm {gold} = \frac{\phi ^{-(i+ 1)}}{1 - \phi ^{-n}} ,\quad {\bar{\alpha }}_i^\mathrm {gold} = \phi ^{1 - i} . \end{aligned}$$
(13)

Like regular geometric counting, both adaptable and golden-ratio geometric counting feature bounded total credit with first-author-credit normalization (compare Table 3).

Example applications

Often, publication credit is based on author contributions, and in these cases, total-credit normalization reflects that the sum of relative author contributions is 100%. This notion can be supported by the idea that researchers with many collaborators should be more productive than those with fewer, and will thus compensate for the credit they give up to coauthors; or by the fact that collaborative publications have some bibliographic benefits such as a higher average number of citations (Katz and Hicks 1997). It relies on the assumptions that credit cannot be assigned multiple times without unfairly putting the rest of the scientific community in an unfavorable position; and that instead it can be divided and transferred to other coauthors. However, there are forms of credit where these assumptions do not hold: each of the following three subsections discusses one example supporting criterion 6, in which first-author-credit normalization may be more appropriate.

Graduation and promotion requirements

One example mentioned in the introduction is when publications are counted as a criterion for graduation or promotion, which is becoming more common (Wilson 2002; Hagen 2010a). The key feature of this example is the usually short time period between publication and evaluation, in particular, in the case of graduation. This reduces the possibility of evaluating the quality of a publication by features such as citations. Therefore, the increased quality of collaborative publications is usually irrelevant for researchers when evaluated shortly after publication of their work. In these cases, with total-credit normalization, a researcher working in a larger group or collaboration would have to author more publications than an isolated researcher, unless it could be shown that the usually low gains for coauthorships offset the losses on first-author publications implied by total-credit normalization; or that less effort per publication and author is required for a publication with more collaborators, allowing the researcher to publish more in the same time. Until this can be shown, first-author normalization appears to be a valid alternative.

The above issue becomes even more relevant when one considers that a first author subject to graduation or promotion requirements may not be in the position to determine who is an author or not; in particular, in fields where the principal investigator is usually the last author. In the most basic, but very relevant scenario, the principal investigator, PhD advisor, and/or department head ask for themselves to be named as coauthors without having contributed as such (Slone 1996; Bonekamp et al. 2012), which unfairly decreases first-author credit with total-credit normalization: clearly, the effort required to publish does not decrease, while the awarded credit does.

Collaborations

Another example application is when collaboration between researchers shall be encouraged. The key defining property of this example is the researcher’s possibility of choosing potential coauthors, assuming that entities such as policy makers aim to encourage researchers to collaborate with each other. In practice, the modest increase of total credit for a collaborative publication may not be enough to counter the decreased relative credit of, for example, the first author. This implies that researchers may suffer bibliographically from initiating a collaboration, unless they are convinced that publishing becomes much less of an effort through the collaboration, and that the quality of the publication increases. As a consequence, researchers considering initiating a collaboration may find themselves torn between two conflicting objectives: one to initiate a collaboration, and another to maximize their bibliographic indexes. A funding agency aiming to encourage collaboration may consider using first-author credit normalization to assure researchers that collaborative efforts will not hurt their bibliographic indexes used to evaluate future proposals.

Credit based on accountability

This subsection challenges the assumptions that credit cannot be assigned multiple times, and that it can always be divided. For this final example, consider the situation of two researchers authoring a publication in a field where, for simplicity, first and last authors are perceived as identical. I assume a junior first author carrying out the research and writing the manuscript, and a principal investigator contributing through study design and ideas, advice and guidance. Compared to a single-author publication, there may be only a modest increase in total effort due to additional reporting and synchronization, and double checks and management (as opposed to only self-management). Similarly, there may only be a small increase of scientific contribution by having two authors. Thus, according to relative-contribution-based credit, the total credit should increase only to a small degree and be divided among the two authors; effectively, the first author transfers some of their credit to the second. However, there is a duplication of accountability, as each author can be held accountable for the full publication, which is one form of an investment. Effectively, accountability is assigned multiple times and cannot be transferred to other coauthors (see Footnote 4). This implies that the return, each author’s relative-contribution-based credit with total-credit normalization, would not be on a par with that investment.

Taking the above reasoning further, the question that arises is whether the addition of more authors should further decrease the first (and last) author’s credit, while maintaining their full accountability, thus weakening the first author’s incentive to collaborate. This effect is only reinforced by the fact that publishers acknowledge that some authors of a publication may not be held fully accountable (Nature 2007). Thus, while middle authors are granted some credit (see Footnote 5) for generally low accountability, first authors’ credit–accountability ratios decrease with each additional coauthor, due to decreasing credit and constant full accountability. In this case, the argument that without correction methods, researchers benefit from working in larger groups (compare the “Introduction” section) is thus reversed by total-credit normalization: first authors pay for working in larger research groups, by more colleagues easily becoming coauthors on their publications and receiving a share of the first authors’ credits.

At the same time, it should be noted that tackling honorary authorship, which is one of the main arguments for avoiding inflationary bias, does not necessarily require total-credit normalization: advanced counting methods (such as Eqs. 7–9) fight honorary authorship by attributing less credit to less important authors. In that case, total-credit normalization not only hurts first authors, but also has a very limited effect, in particular in fields with many authors.

One way of dealing with the first author’s problem, in line with criterion 6, is to assign credit based on accountability (see Footnote 6). For an arbitrary counting method, this can be implemented by first-author-credit normalization, accepting that a publication’s total credit can exceed 1 (see Footnote 7). In the above example, the researchers’ contributions could be equal to one another, as are their accountabilities. The difference between relative-contribution-based and accountability-based credit is how these equal credits will be normalized. Based on relative contributions, each would receive 0.5; based on accountability, each receives 1.

Empirical data and evaluation

The available empirical data have been collected under the total-credit-normalization assumption; therefore, counting methods are only compared for this case. The following sections describe the data and the methods.

Empirical data

For validation of adaptable geometric counting, I evaluated the ability to model empirical data and compared the results to four other counting methods. Initially, I chose the same data sources as Hagen (2010b), which are assumed to describe the perceived contributions of authors of multiauthored publications, published in fields where authors are not ordered alphabetically (sequence determines credit). These data have been collected from 37 faculty members and advanced graduate students in psychology (Maciejovsky et al. 2009), from 87 promotion committee representatives in medicine (Wren et al. 2007), and from 68 researchers in chemistry (Vinkler 2000), and presented as averages. In particular, I used the average assigned contribution credit (Maciejovsky et al. 2009, Fig. A2); the mean perceived credit, averaging values for initial conception, work performed and supervision (Wren et al. 2007, Table 1); and the individual credit shares of coauthors (Vinkler 2000, Table 4). The three datasets cover different numbers of authors: 2 to 4 (psychology), 3 and 5 (medicine), and 2 to 6 (chemistry). In total, \(K= 37\) individual data points are available, not counting the trivial case of single authors, based on an estimated 2000 total individual assessments. I denote each data point \(k\) by a tuple

$$\begin{aligned} \left( i_k, n_k, \alpha ^\mathrm {emp}_k\right) , \end{aligned}$$
(14)

where \(\alpha ^\mathrm {emp}_k\) represents the perceived contribution of an \(i_k\hbox {th}\)-ranked author of a publication with a total of \(n_k\) authors.

Since the psychology data had been published only by means of plots, these were re-digitized based on their vector-graphics representation (Maciejovsky et al. 2009, mksc.1080.0406-sm-appendix.pdf). In the case of medicine data, perceived contributions of the last authors were so similar to those of the first authors that I set \(\alpha ^m_n := \alpha ^m_1\) for all counting methods \(m\) (compare the discussion at the end of the “Related work” section), and re-normalized to ensure \(\sum \alpha ^m_i= 1\).

Evaluation

The counting methods compared here are those described by Eqs. 6–9 and 12; for adaptable geometric counting, I varied the value of \(\gamma\) between 1 and 2.5 with a step size of 0.01, and with a finer step size of 0.001 for \(\gamma \in [1.5, 1.7]\). For each counting method \(m\) and each value of \(\gamma\), to quantify the agreement between empirical and modeled contributions, I computed the lack of fit (LOF) between the two quantities by

$$\begin{aligned} \mathcal {L} = \frac{1}{K- 1} \sum _k\frac{1}{\alpha _k} \left( \alpha ^\mathrm {emp}_k- \alpha _k\right) ^2, \end{aligned}$$
(15)

where \(\alpha _k= \alpha ^m_{i_k} \left( n_k, \gamma \right)\). For four out of five methods, values of \(\mathcal {L}\) were compared to those published by Hagen (2010b).
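As a minimal sketch of how the lack of fit in Eq. 15 could be evaluated, the following Python snippet may help. Several things are assumptions rather than part of the original work: the function names are hypothetical, the closed form used for adaptable geometric counting is reconstructed from the description of \(\gamma\) as the ratio between two subsequent authors' credits under total-credit normalization, and the observations are synthetic placeholders generated from harmonic counting, not the published data.

```python
from typing import Callable, Sequence, Tuple

DataPoint = Tuple[int, int, float]  # (i_k, n_k, alpha_emp_k)


def harmonic_weight(i: int, n: int) -> float:
    """Harmonic counting with total-credit normalization: (1/i) / sum_j (1/j)."""
    return (1.0 / i) / sum(1.0 / j for j in range(1, n + 1))


def geometric_weight(i: int, n: int, gamma: float) -> float:
    """Assumed form of adaptable geometric counting: gamma**(-i) / sum_j gamma**(-j)."""
    return gamma ** (-i) / sum(gamma ** (-j) for j in range(1, n + 1))


def lack_of_fit(data: Sequence[DataPoint], weight: Callable[[int, int], float]) -> float:
    """Eq. 15: squared residuals, each scaled by 1/alpha_k, summed and divided by K - 1."""
    total = 0.0
    for i_k, n_k, alpha_emp in data:
        alpha_k = weight(i_k, n_k)
        total += (alpha_emp - alpha_k) ** 2 / alpha_k
    return total / (len(data) - 1)


# Synthetic placeholder observations (NOT the published data): generated from
# harmonic counting itself for publications with 2 and 3 authors.
synthetic = [(i, n, harmonic_weight(i, n)) for n in (2, 3) for i in range(1, n + 1)]

lof_harmonic = lack_of_fit(synthetic, harmonic_weight)
# gamma = 1 reduces the assumed geometric weights to fractional counting (1/n each).
lof_fractional = lack_of_fit(synthetic, lambda i, n: geometric_weight(i, n, 1.0))
```

Because the placeholder observations are produced by harmonic counting, the harmonic lack of fit vanishes here, while fractional counting yields a positive value; with real data, neither would be expected to be exactly zero.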

For adaptable geometric counting, I determined the values of \(\gamma\) for which \(\mathcal {L}\) takes its minimum. To gain insight into the uncertainty associated with that minimum of \(\mathcal {L}\), I repeated the analysis after adding zero-mean Gaussian noise with a standard deviation of 1 percentage point (p.p.) to the observations (10,000 noise realizations). Finally, correlation of \(\alpha _k\) with \(\alpha ^\mathrm {emp}_k\) was studied for harmonic and golden-ratio (\(\gamma = \phi\)) geometric counting, respectively.
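The grid search and the noise analysis described above could be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the helper names are hypothetical, the geometric weight uses the same reconstructed closed form, the observations are synthetic placeholders generated with \(\gamma = 1.6\), only the coarse grid is used, and the number of noise realizations is reduced from 10,000 to 200 to keep the example fast.

```python
import random
import statistics


def geometric_weight(i, n, gamma):
    """Assumed adaptable geometric weight with total-credit normalization."""
    return gamma ** (-i) / sum(gamma ** (-j) for j in range(1, n + 1))


def lack_of_fit(data, gamma):
    """Eq. 15 evaluated for adaptable geometric counting at a given gamma."""
    total = 0.0
    for i_k, n_k, alpha_emp in data:
        alpha_k = geometric_weight(i_k, n_k, gamma)
        total += (alpha_emp - alpha_k) ** 2 / alpha_k
    return total / (len(data) - 1)


def best_gamma(data, grid):
    """Grid value of gamma at which the lack of fit is smallest."""
    return min(grid, key=lambda g: lack_of_fit(data, g))


def noisy_minima(data, grid, sigma=0.01, realizations=200, seed=0):
    """Minimum positions after adding zero-mean Gaussian noise (sigma = 1 p.p.)."""
    rng = random.Random(seed)
    minima = []
    for _ in range(realizations):
        noisy = [(i, n, a + rng.gauss(0.0, sigma)) for i, n, a in data]
        minima.append(best_gamma(noisy, grid))
    return minima


# Grid from 1 to 2.5 in steps of 0.01 (the finer 0.001 grid is omitted here).
grid = [1.0 + 0.01 * s for s in range(151)]

# Synthetic placeholder observations (NOT the published data), generated with gamma = 1.6.
synthetic = [(i, n, geometric_weight(i, n, 1.6)) for n in (2, 3, 4) for i in range(1, n + 1)]

gamma_hat = best_gamma(synthetic, grid)
minima = noisy_minima(synthetic, grid)
mean_min, sd_min = statistics.mean(minima), statistics.stdev(minima)
```

The sample mean and standard deviation of the minimum positions stand in for the Gaussian fit to the histogram; fitting a Gaussian explicitly would be a separate step.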

Results

Figure 1a shows the LOF for each of the four previously proposed counting methods and, as a function of \(\gamma\), for adaptable geometric counting. For \(\gamma = 1\), adaptable geometric counting equals fractional counting, confirming the approximately 18-fold LOF compared to harmonic counting (Hagen 2010b). Similarly, for \(\gamma = {2}\), adaptable geometric counting equals regular geometric counting (6-fold LOF compared to harmonic counting). Between these extremes, the LOF curve shows a broad minimum for \(\gamma \in \left[ {1.5}, {1.7}\right]\), where the harmonic-counting LOF is exceeded by only 51–95% (\(\min _\gamma \mathcal {L} = {0.00564}\), at \(\gamma = {1.573}\)). LOF variations with different weights confirmed this trend, with minima at \(\gamma ' = {1.641}\) and \(\gamma '' = {1.620}\), respectively, and LOF values as low as 20% worse than for harmonic counting.

Fig. 1

a Lack-of-fit (LOF) values \(\mathcal {L}\), with respect to empirical data as used by Hagen (2010b), of adaptable geometric counting as a function of \(\gamma\) compared to four other counting methods. b Histogram of minimum positions of \(\mathcal {L}\) for adaptable geometric counting with Gaussian noise (zero mean, 1 p.p. standard deviation, 10,000 noise realizations)

Concerning the sensitivity to uncertainties in the data, Fig. 1b shows the histogram of LOF minimum positions after adding zero-mean Gaussian noise with a standard deviation of 1 p.p. to the observations, after 10,000 repetitions. The Gaussian fit to that histogram reveals a mean value of 1.587 with a standard deviation of 0.035.

Fig. 2

Individual data points (\(K= {37}\)), linear regression lines, and linear correlation coefficients for harmonic and golden-ratio geometric counting

Figure 2 compares the correlation with empirical data on the basis of \(K= 37\) individual data points for harmonic and golden-ratio geometric counting, respectively. Considering the degree of uncertainty in the empirical data (e.g., standard deviations of 5–25 p.p. in the medical data; Wren et al. 2007), these two methods can be considered practically identical in the context of the available data.

Fig. 3

Lack-of-fit values \(\mathcal {L}\) and histograms of minimum positions of \(\mathcal {L}\) as in Fig. 1 with respect to three variations of the empirical data, where processed chemistry data was a replaced by an earlier processed version, b replaced by the empirical data, c ignored

The above comparison yielded LOF values for fractional, arithmetic, geometric, and harmonic counting that are in excellent agreement with previously reported ones (\(R^2 = {0.997}\)), despite my re-digitization of the results reported by Maciejovsky et al. (2009) and of half of the LOF values from Hagen (2010b). Interestingly, the numbers reported by Vinkler (2000) and used by Hagen (2010b) had already undergone some processing since their initial collection (Vinkler 1993), such as rounding to 0.05 and removal of last-author effects.

I have thus studied, as before, the specific impact of the chemistry dataset by

  a. using an earlier version of said processed data based on the same empirical data (“Cooperativeness” values from Vinkler 1993, Table 1),

  b. using the original empirical data (“Total Contribution Factors” values from Vinkler 1993, Table 5) while correcting an apparent last-author effect by setting \(\alpha ^m_5 := \alpha ^m_1\) (only for 5-author publications), or

  c. ignoring this dataset, respectively.

These result variations are shown in Fig. 3. Notably, while the LOF functions are changed, they still exhibit broad minima around \(\gamma\) values of 1.410, 1.648 and 1.624 (Fig. 3, left), which is confirmed by LOF minimum positions in the noise analysis (Fig. 3, right). The minimum around 1.4 for processed chemistry data illustrates how much of an effect processing of these data can have. Both with unprocessed chemistry data and without chemistry data, the mean of the Gaussian lies less than one standard deviation away from the golden ratio (0.98 and 0.35 standard deviations, respectively).

Discussion

While the minimum of the lack-of-fit curve (Fig. 1a) is not exactly at \(\gamma = \phi \approx {1.618}\), the golden ratio is a parameter value that appears to be generally compatible with the empirical data.

This is confirmed by two observations in particular. First, the LOF at \(\phi\) is only 3.8% higher than at the minimum due to a very low curvature of the LOF curve, making \(\phi\) as good a candidate as the true minimum, and only 57% worse than that of harmonic counting. This is further supported by the excellent correlation between the empirical data and golden-ratio geometric counting, compared to harmonic counting (Fig. 2). Second, as the relatively high uncertainty in the available data points limits the accuracy of the best estimate, I have studied its impact by repeating the evaluation with a very moderate noise influence (Fig. 1b). As \(\phi\) is within less than one standard deviation of the mean value, it is not unreasonable to consider that the true minimum is near \(\phi\).
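As a quick check of the second observation, using the mean (1.587) and standard deviation (0.035) of the Gaussian fit reported for Fig. 1b, the distance of \(\phi\) from the mean in units of the standard deviation works out to roughly

$$\begin{aligned} \frac{\phi - 1.587}{0.035} \approx \frac{1.618 - 1.587}{0.035} \approx 0.89 < 1. \end{aligned}$$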

The high sensitivity of the LOF minimum to noise implied by that initial comparison is further confirmed by Fig. 3 (left). Interestingly, both with the original chemistry data (Fig. 3b) and without it (Fig. 3c), the LOF minimum is closer to \(\phi\) than with either of the processed chemistry data sets (Figs. 1, 3a) from Vinkler (1993, 2000). This is similar in the noise study in Fig. 3 (right). This result further supports \(\phi\) as a good parameter choice, while it should not be understood as the only, or best, choice in all cases.

Since harmonic and golden-ratio geometric counting are close to each other numerically, as both model the same empirical data equally well, drastically different results for any of the derived bibliometric indexes should not be expected—in line with the review of 37 h-index variants (Bornmann et al. 2011), of which most yield little additional information over the original one. In fact, in the terminology used by Hagen (2013), the variability in the empirical data explained by harmonic and golden-ratio counting, respectively, differs by only 1.1%. However, there are several distinct advantages of (golden-ratio) geometric counting over harmonic counting.

First, one has to acknowledge that most popular bibliometric indexes have not yet incorporated a bibliometric counting method that considers the author sequence. In practice, adoption of such a method in the public may be aided by the link between geometric counting and the perceived attribution of author contributions through the golden ratio. Therefore, one may even hypothesize that the golden ratio has had an intrinsic influence on the empirical data, in particular, in those studies where participants were asked to divide a bar representing 100% to quantify relative contributions (compare Maciejovsky et al. 2009, Fig. 1). For a two-author (\(n = 2\)) publication by A and B and total-credit normalization, the normalization factor is given by \(\phi ^{-1} + \phi ^{-2} = 1\). Then, the weighting is simply

$$\begin{aligned} \alpha _i^\mathrm {gold} = \phi ^{-i}, \end{aligned}$$
(16)

and the ratio of A’s and B’s credit equals the ratio of the full and A’s credit:

$$\begin{aligned} \phi ^{-1}\big /\phi ^{-2} = 1 \big /\phi ^{-1} = \phi . \end{aligned}$$
(17)
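Both statements follow from the defining property of the golden ratio, \(\phi ^2 = \phi + 1\). As a short worked step (the decimal values are rounded), the normalization factor for two authors becomes

$$\begin{aligned} \phi ^{-1} + \phi ^{-2} = \frac{\phi + 1}{\phi ^{2}} = \frac{\phi ^{2}}{\phi ^{2}} = 1, \end{aligned}$$

so that, in the two-author case, A receives \(\phi ^{-1} \approx 0.618\) and B receives \(\phi ^{-2} \approx 0.382\) of the total credit.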

Second, in the absence of additional parameterization, thus assuming the single parameter \(\gamma\) fixed, golden-ratio counting is similar in parsimony to harmonic counting (Hagen 2013), having a compact, closed-form expression and requiring no extra parameter. In addition to that, adaptable geometric counting offers a direct way of modifying the parameter \(\gamma\) as a means to account for additional considerations, such as custom valuations of multi-author publications. As already mentioned, the \(\gamma\) parameter has a straightforward interpretation as the ratio between two subsequent authors’ credits. Thus, \(\gamma\) may be subject to adaptation whenever a research field perceives fair credit sharing differently from those I focused on here. Arguably, harmonic counting could be made more flexible by introducing a decay parameter \(\gamma\) in weights such as \({\bar{\alpha }}= 1 / (n_p)^{\gamma }\). However, for \(\gamma < 1\), the problem of unbounded total credit with first-author normalization remains; the bound is still excessively large for other parameter values only slightly larger than 1; and \(\gamma\) does not offer as simple an interpretation as in adaptable geometric counting.

Finally, publication credit can be normalized to full first-author credit while maintaining an upper bound for total publication credit; for example, to attribute credit as a function of accountability, and not (only) of contribution. This implies a second interpretation of \(\gamma\), as \(\gamma / \left( \gamma - 1 \right)\) represents the upper bound of the total contribution of a publication compared to a single-author publication. For \(\gamma = \phi\), this bound equals \(\phi ^2 \approx 2.618\). However, it is possible that perceived author accountabilities behave differently from perceived relative author contributions, and additional empirical data may be required to validate values of \(\gamma\) in that case. For any \(\gamma > 1\), the adaptable geometric counting rule for accountability-based credit reads

$$\begin{aligned} {\bar{\alpha }}_i^\mathrm {acc} = \gamma ^{1 - i} \end{aligned}$$
(18)

(compare Table 3). Note that there is no harmonic counting method having both full first-author credit, \({\bar{\alpha }}_1= 1\), and bounded total credit, regardless of normalization.
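The contrast between the bounded total credit of Eq. 18 and the unbounded total of first-author-normalized harmonic weights can be illustrated with a small sketch. The function names are hypothetical, and the harmonic variant with weights \(1/i\) is used here only as an assumed illustration of first-author normalization.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def accountability_total(n: int, gamma: float) -> float:
    """Total credit of an n-author publication under Eq. 18: sum of gamma**(1 - i)."""
    return sum(gamma ** (1 - i) for i in range(1, n + 1))


def harmonic_first_author_total(n: int) -> float:
    """Total credit if harmonic weights 1/i are normalized to first-author credit 1."""
    return sum(1.0 / i for i in range(1, n + 1))


# Upper bound gamma / (gamma - 1); for gamma = PHI this equals PHI**2.
bound = PHI / (PHI - 1)

# The geometric total approaches the bound, while the harmonic total keeps growing
# (roughly logarithmically) with the number of authors.
for n in (1, 2, 5, 20, 1000):
    print(n, round(accountability_total(n, PHI), 4), round(harmonic_first_author_total(n), 4))
```

For a two-author publication with \(\gamma = \phi\), the total credit is \(1 + \phi ^{-1} = \phi\), already more than half of the bound \(\phi ^2\).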

One advantage of harmonic counting is that it describes empirical data slightly better than golden-ratio geometric counting, but with a negligible difference for low numbers of authors. In order to study potential differences, these two counting methods should be compared in terms of describing data for publications with long author lists, where the difference between the two methods may be more significant.

Conclusion

For contribution-based publication credit of individual researchers when we suspect that the coauthors’ contribution is not uniform, I have shown that adaptable geometric publication counting with a parameter \(\gamma\) can model empirical data with accuracy similar to that of harmonic counting, and that the golden ratio \(\phi\) appears as a sensible parameter value of \(\gamma\). First-author normalization of golden-ratio counting allows credit to be made independent of coauthors while maintaining an upper bound for the total credit of a publication. As an alternative to author contributions, author accountability could be considered to define author credit; this can be achieved with first-author credit normalization, unlike with harmonic counting. The parameter \(\gamma\) can be adapted, if necessary, having two straightforward quantitative interpretations related to the ratio of two subsequent authors’ credits as well as the upper bound of a publication’s total credit.