Introduction

The species problem—the problem of defining the species category and delimiting species taxa in an objective, consistent and biologically meaningful way across all living taxa—is one of the most debated issues in biology. Few topics have triggered more publications, but a consensus has not been reached. Many researchers might therefore be inclined to consider it a waste of time to publish even more on it. However, over the last 12 months or so a number of noteworthy attempts at ameliorating this predicament have appeared, in particular two new species concepts and a radical new proposal of how to deal with the inherent vagueness around the ‘species level’. A detailed discussion or appreciation of these publications is beyond the scope of a research commentary, so instead I will only briefly summarize them and place them in the broader context of what I think are the inherent limitations of taxonomy. An admission that these limitations are real is key not only to taxonomy but also to wide areas of biology and beyond that depend in one way or another on species as the currency of many of their research questions, e.g., (macro-)ecology, evolutionary biology, conservation, and environmental policy. One only has to think of the large number of studies comparing intraspecific vs interspecific patterns (microevolution vs macroevolution) to acknowledge how critically all these analyses hinge on the way we define the species category and delimit species taxa.

Background: the plurality of species concepts

There are at least 34 different species concepts (see Zachos 2016, chapter four, and below), i.e. definitions of the species category. While this plurality was long seen as an obstacle to solving the species problem, a different interpretation has recently caught on. Mayden (1997) and de Queiroz (1998, 2007) have pointed out that not all species concepts are the same. All biologists seem to agree that species are separately evolving (meta-)population lineages as conceptualized by the evolutionary species concept (ESC, Wiley 1978) or the very similar general lineage species concept (GLSC, de Queiroz 1998) and unified species concept (USC) (de Queiroz 2007). These three therefore serve as primary or ontological concepts, while the other ca. 30 are rather identification criteria of these separate population lineages. The nuisance of having to deal with so many species concepts can be reinterpreted as a situation in which various lines of evidence (the content of the identification ‘concepts’) can be used to discover species lineages. The fact that, for instance, the biological species concept is only applicable in synchrony and to sexually reproducing organisms would be less of a problem because there are many alternative ways to identify species. Shanker et al. (2017) have recently pointed out that this approach embraces rather than overcomes pluralism and is more of a framework in which the process of lineage divergence can be understood than a truly universal species concept. An alternative approach is to view the species category (similar attempts exist for the species taxon) as a cluster or family resemblance concept sensu Wittgenstein, i.e. as a fuzzy group of natural entities with a cluster of characteristics that need not all be instantiated at the same time (Pigliucci 2003). 
Both these approaches are theoretical steps forward towards an understanding of the species category and its ontology, but the most pressing practical problem remains unsolved—species delimitation.

New concepts and an old problem

Recently, two new species concepts have been introduced, the mitonuclear compatibility species concept (MCSC, Hill 2017, see also Hill 2016) and the inclusive species concept (ISC, Shanker et al. 2017).

The MCSC is based on the hypothesis that mitonuclear interactions are key to speciation processes, which in turn is based on the fact that mitonuclear compatibility is a prerequisite for intracellular energy production, in particular a functioning electron transport system and oxidative phosphorylation in the mitochondria. Hill calls nuclear genes that interact with mitochondrial genes in this way \(\hbox {N}_{\text {O}}\)-mt genes, and he holds that ‘the process of speciation is the process of divergence of sets of coadapted mt and \(\hbox {N}_{\text {O}}\)-mt genes. These species-specific sets of mt and \(\hbox {N}_{\text {O}}\)-mt genes will both define a species and maintain its identity. Fitness loss in offspring with mixed mt and \(\hbox {N}_{\text {O}}\)-mt genes will serve as a barrier to gene flow between species’ (Hill 2017, p. 397). The MCSC accordingly reads as follows: ‘A species is a population that is genetically isolated from other populations by incompatibilities in uniquely coadapted mt and \({N}_{{O}}\)-mt genes’ (ibid., italics in the original). Hill focusses on birds, but in principle, the concept could be applicable more widely. It is similar to the genetic and the differential fitness species concepts but aims specifically at coadaptation between the mitochondrial and the nuclear genome. Hill provides a number of findings that could be explained by the MCSC, e.g. mitochondrial barcode gaps and their concordance with phenotypic differences such as plumage colouration, the genetics of hybrid zones, Haldane’s rule and others. Irrespective of its future fate as a generally applicable species concept, the research into mitonuclear compatibility promises exciting new insights into the genetic underpinning of lineage divergence.

The ISC considers a species as ‘that inclusive group of individuals that have finite probabilities of contributing to a common gene pool’ (Shanker et al. 2017, p. 416). The idea of genetic distinctness is prominent, as in so many other species concepts, and the authors particularly highlight similarities with the genotypic cluster species concept, but emphasize that their approach allows for probabilities to contribute to more than one cluster and that the ISC is to be seen within the larger framework of the GLSC. They explicitly acknowledge nature’s fuzzy boundaries at or around what we perceive as the species level. Importantly, Shanker et al.’s inclusive approach includes practical guidelines for species delimitation. They acknowledge the dilemma that species as products of evolution are historical entities and thus individuals in the philosophical sense, but that in classification we ‘treat’ them like classes when we group individual organisms according to presence or absence of characters (for an epistemological analysis of the species problem based on this discrepancy, see Hey 2001). They propose an approach to species delimitation that combines genetic data with geographical distribution information and morphological divergence on the lineages retrieved by the genetic analysis. What they call ‘morphometric terrain’ is basically the density distribution of phenotypic variation in morphospace and similar to morphological diagnosability approaches, e.g. under some versions of the phylogenetic species concept (PSC).

So, where do these newly published concepts leave us? While their impact on taxonomy is as yet unknown, they both add to our arsenal for dealing with nature’s fuzzy boundaries and, particularly in the case of the MCSC, point towards a concrete research programme. Any attempt at diminishing the validity of the old quip that ‘a species is whatever a competent expert in the group says it is’ is highly welcome. However, in spite of all the (justified!) claims that taxonomy at its best is not only descriptive but also a hypothesis-driven science, taxonomists suffer from a fundamental limitation inherent in their discipline. Ultimately, taxonomy will always be a discrete binary system (species yes or no) that is imposed on a continuous process (evolution). This is why Dobzhansky (1937, p. 312) famously stated that ‘Species is a stage in a process, not a static unit’, and it leads to a two-fold problem: delimitation and ranking. We may agree that species are independent lineages, but independence comes in different degrees. Applied to the two new species concepts this raises the question ‘how’ isolated by mitonuclear incompatibilities two populations must be to count as different species, and ‘how’ high or low the finite probabilities of contributing to a common gene pool must be to count as one or two species. Analogous questions can be asked for every single species concept or criterion. All these criteria are continuous, even seemingly discrete opposites like allopatry vs sympatry (Zachos 2016, chapter 6.1). From this it follows that one can delimit species more or less inclusively, and both will be based on biological realities (Zachos 2015). At this point, the old argument between lumpers and splitters cannot, by virtue of the process of evolution itself, be decided unequivocally and objectively. 
Attempts have been made over and over again to find the outlier level in the hierarchy of life that separates, in the words of Hennig, tokogeny from phylogeny, but this is an area or space, not a well-defined single level. This area of fuzziness is the grey area in which a species truly is what a taxonomist says it is (or what the taxonomic community agrees on), and has to be, because neither splitting nor lumping is right or wrong. Therefore, while taxonomy in large parts undoubtedly is a true science with testable hypotheses, in this grey area of lineage divergence taxonomic conclusions on species status are more like ‘executive decisions’ because both alternative views are equally right or wrong (an analogous case in linguistics is the sometimes fuzzy distinction between dialects and languages) (Zachos 2018). In the light of evolution, many such cases are to be expected. In fact, if there were none, evolution as a historical truth would probably be refuted. This grey area is also an area of exciting research into how population lineages diverge in the process that we ultimately perceive as speciation. But this dream for evolutionary biologists is a nightmare for taxonomists. Whether two closely related populations are to be considered one or two species is therefore sometimes a terminological/nomenclatural rather than a scientific issue, analogous to the naming of higher monophyletic taxa: whether the name Mammalia refers to the crown group of synapsid amniotes (i.e., the least inclusive clade comprising monotremes, marsupials and placentals) or whether it also comprises the sister group to that taxon is a matter of words and convention, not of science, making the essentialist question ‘What is a mammal?’ almost rhetorical (see below). Taxonomic theorists sometimes seem obsessed with objectivity, which in itself of course is not a bad thing. However, where objectivity is not to be had, claims to the contrary are flawed. 
In what is called the Hennigian species concept, it is, among other things, absolute reproductive isolation that makes a species, which would result in horses and donkeys being conspecific because fertile hybrids, although extremely rare, have occurred. At the other end of the spectrum, the diagnosability version of the PSC acknowledges every single diagnosably distinct population as a separate species, so that in effect the question ‘What is a species?’ ultimately would be equal to ‘What is a population?’ (see Zachos 2015 for a discussion of this species concept). Because of the continuous nature of most relevant biological phenomena, this objectivity is an illusion, and deciding on a threshold that can be measured objectively is not the same as an objective delimitation criterion!

Because delimitation along a continuum can result in equally valid more or less inclusive entities, the next problem arises: that of ranking. The very fact that the lion and the dandelion are called species suggests that they are the same kind of thing. But are they really? And is there even a way of finding out? Or is the species level not unique among the Linnaean categories but just as arbitrary as families, orders and classes? It is important to realize that species taxa are real historical entities (just like higher monophyla such as Mammalia or Coleoptera). The category, however, may not be, i.e. what we call the species level has nothing in common but the name: ‘Species are equivalent by designation, only not in terms of their state of evolutionary, genetic or ecological differentiation or divergence’ (Heywood 1998, p. 211). Accordingly, there have been demands that the species category be abolished (e.g. Mishler 1999; Hendry et al. 2000; Mishler and Wilkins 2018). This is a worrying thought, but I am not aware of any convincing refutation, or of convincing evidence that species across the board (or even within a relatively small part of the Tree of Life like mammals or mosses) are really directly comparable entities. All attempts to pinpoint such a universal species level have failed so far, and the two new concepts are no solution to this conundrum either. This problem and its solution, or at least a conscious appreciation of the predicament it entails, are of utmost relevance not just to taxonomy but to large parts of biology and related disciplines that use species (and particularly species numbers) as the universal currency to quantify biodiversity. The ‘genuine problem with species counts, even repeatable ones that are arrived at with a consensus on methods, is that we do not know just what they are counts of’ (Hey 2001, p. 187). 
If this is true, and at the moment there is unfortunately little evidence that it is not, the question ‘What is a species?’ may be misguided. Rather than asking fundamental taxonomic questions in an essentialist way, methodological nominalism may be more appropriate: instead of ‘What is a species?’ we should perhaps more humbly ask ‘What should we call reproductively isolated/diagnosably distinct/reciprocally monophyletic etc. populations?’ This would perhaps sensitize biologists to the potential arbitrariness involved in the assignment of the species rank, and it would raise awareness that some of the species controversies may be just as much about words (names) as about real biological phenomena. Also, it would contribute towards an insight that we must ‘choose’ the level that we want to call species. This should be done with some kind of biological relevance in mind because ‘given the degree of focus on them (species) in biology and more broadly, we need to choose carefully the entity that receives this designation’ (Freudenstein et al. 2017, p. 644).

The realization that the species rank’s elusiveness may be due to nature’s inherently fuzzy boundaries and not due to our ignorance has resulted in a number of pragmatic approaches. Rather than continuing to search for the Holy Grail, it has been suggested that we agree on a consistent and quantifiable delimitation procedure for what is then called species (Hey 2001). This way, given the same raw data, different taxonomists would at least come up with the same species delimitations in a consistent and repeatable (albeit still not completely nonarbitrary) way. One such approach that has been worked out in detail is the set of so-called Tobias criteria (Tobias et al. 2010) in ornithology. Based on a quantitative scoring system of phenotypic, acoustic, ecological, behavioural and geographical raw data, species status is assigned or denied. Of course, this scheme only works for birds and potentially for some other groups, so it is not universal, but general application is not impossible for such systems. A framework that is standardized as far as possible and includes multidisciplinary datasets could be adjusted to be more widely applicable. One idea could be to identify populations, collect the relevant data and then group these populations into units such that among-group variance is maximized. An analogous algorithm, spatial analysis of molecular variance (SAMOVA), already exists in population genetics. The datasets or algorithms could then be modified so that the groups yielded conform best to our a priori notion of the species level. The result would not be objectivity, but at least consistency, and after all, one hallmark of science is retrieving the same results from the same input data.
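The grouping idea in the preceding paragraph can be illustrated with a minimal sketch. This is not SAMOVA and not a procedure proposed in any of the cited works: it is a toy exhaustive search, under the assumption that each population is summarized by a single standardized trait value, and all names and numbers below are hypothetical. It returns the assignment of populations to k groups that maximizes the among-group share of the total sum of squares.

```python
from itertools import product


def among_group_fraction(values, labels):
    """Share of the total sum of squares that lies among groups (0 to 1)."""
    n = len(values)
    grand_mean = sum(values) / n
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0
    among_ss = 0.0
    for group in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == group]
        group_mean = sum(members) / len(members)
        among_ss += len(members) * (group_mean - grand_mean) ** 2
    return among_ss / total_ss


def best_grouping(values, k):
    """Try every assignment of populations to exactly k non-empty groups.

    Exhaustive search is only feasible for a handful of populations;
    it is used here so that the result is fully repeatable.
    """
    best_labels, best_score = None, -1.0
    for labels in product(range(k), repeat=len(values)):
        if len(set(labels)) != k:  # skip assignments with empty groups
            continue
        score = among_group_fraction(values, labels)
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels, best_score


# Hypothetical population means for one standardized trait (invented data).
population_means = [1.0, 1.2, 0.9, 5.0, 5.3, 9.8]
labels, score = best_grouping(population_means, k=2)
print(labels, round(score, 3))
```

Given the same input and the same k, the sketch always returns the same grouping, which is the consistency argued for above. It does not remove the arbitrariness, though: the among-group share of the best grouping cannot decrease as k increases, so the number of groups still has to be chosen by the user.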

A focus on species as lineages, however important and well-founded, is not enough. The Tree of Life is hierarchically structured into lineages within lineages, and species delimitation approaches tailored to detect structuring, for example coalescent algorithms, do exactly that: they detect structures, but these structures need not necessarily be species (Sukumaran and Knowles 2017; see also Sites and Marshall 2004, who hold that species delimitation necessarily entails qualitative judgement). The level that we choose to call species is increasingly required to also show some kind of phenotypic and ecological distinctness (required again, one should add, as this demand has often indiscriminately been dismissed as ‘typology’ or pheneticism). Most recently this has been put forward in the ‘phenophyletic’ view of species by Freudenstein et al. (2017), which highlights phenotypic and ecological uniqueness and makes explicit reference to similar ideas in the ESC, one of the most fundamental (ontological) species concepts (see above). More precisely, Freudenstein et al. view their idea of species as ‘a lineage or a group of connected lineages with a distinct role’ (p. 650, italics in the original) as a combination of Simpson’s early version of the ESC (Simpson 1951) and the ecological species concept (Van Valen 1976). They consider being a lineage a necessary but not sufficient condition for the species category as there are so many different levels of lineages in the Tree of Life, including within what is usually perceived as species (there is a whole discipline dealing with intraspecific lineages: phylogeography). The criteria of ecological role and its concomitant phenotypic distinctness also need to be met to make a species in their approach. In this context, it is noteworthy that in some conservation approaches not all species are the same: in the evolutionarily distinct and globally endangered (EDGE) approach (Isaac et al. 2007), species status and threat status are just two of the necessary conditions to be prioritized; the third is evolutionary (and thus also phenotypic) divergence.

‘Taxonomy anarchy’ and the Garnett and Christidis debate

A more radical proposal was made by Garnett and Christidis (2017). In a comment paper in Nature entitled ‘Taxonomy anarchy hampers conservation’, the authors, after emphasizing the arbitrariness and personal taste of individual taxonomists involved in species delimitation, ‘contend that the scientific community’s failure to govern taxonomy threatens the effectiveness of global efforts to halt biodiversity loss, damages the credibility of science and is expensive to society’ (p. 25). They ‘propose that the governance of the taxonomy of complex organisms be brought under the purview of the International Union of Biological Sciences (IUBS)’ (ibid.). In other words: the decision on which species taxa to accept is supposed to be centralized to ‘restrict the freedom of taxonomic action’ (p. 26). Importantly, this restriction should include the creation of ‘boundaries for species (and other taxonomic units) that can be applied consistently across multiple life forms’ (ibid.). Further, to secure legal status of species (‘vagueness is not compatible with conservation’, p. 27) and to also take into account social and financial ramifications of species status (as when the habitat of an endangered species needs protection), not only biologists but also lawyers, anthropologists and sociologists should be included in the taxonomic process. This is undoubtedly the most far-reaching proposal so far and would amount to nothing less than a revolution in taxonomy. Accordingly and expectedly, it has triggered criticisms and outcries from within the taxonomic community (see e.g. the Correspondence section of the subsequent Nature issues and Raposo et al. 2017). In reaction, too much bureaucracy and a lack of scientific freedom have been bemoaned, and more money and recognition for taxonomy and taxonomists have been demanded. Much of what Garnett and Christidis have said does not sit well with many of us. 
However, while nobody has to agree with the conclusions they have drawn, it is worthwhile, indeed indispensable, to also look at the first part of their argument. It is a fact that hardly any two taxonomies of any given group are identical, and sometimes the differences are huge. As argued above, this is not a sign of bad science, but of inherent limitations in trying to impose a discrete system on a continuous process. It is not biology’s fault that nature’s boundaries are fuzzy, but neither taxonomists nor all the other biologists who use species as a proxy in their analyses can afford to keep closing their eyes to the uncomfortable truth that as yet there is not, and perhaps never will be, a fully objective way of ranking and delimiting species. Just like the use of higher Linnaean categories, that of species as a category may be flawed, which means that many quantitative studies on diversification rates, the distribution of biodiversity and conservation priorities may be seriously skewed. Faurby et al. (2016) have recently published a worrying analysis of the impact of different taxonomic opinions on quantitative studies. It should, at long last, be a wake-up call to all of us. One may disagree with Garnett and Christidis, even passionately so, but then one has to come up with a better solution. Criticisms in the name of scientific freedom and taxonomy as a hypothesis-driven science are important, but they are not enough. Business as usual is not an option if taxonomy is to be a full-fledged scientific discipline.