1 Introduction

Molds from the genus Trichoderma (Hypocreales, Ascomycota) are among the most common fungi; they are easy to isolate and handle in pure culture (Migheli et al. 2009; Zachow et al. 2009; Chen et al. 2021). Consequently, the taxonomy of Trichoderma started with the beginning of modern fungal taxonomy in the eighteenth century (Persoon 1794). Similar to other fungi, it remained in the descriptive stage for two centuries before entering a period of turbulence caused by molecular methods (Bissett 1984; Bissett 1991a, b, c; Kuhls et al. 1997; Kindermann et al. 1998; Kullnig et al. 2000). Ideally, taxonomy should reflect the nature of the organism and help its investigation. The biology of Trichoderma offers a convenient example to illustrate this relationship. Many Trichoderma strains have properties of environmental opportunism, meaning that they are capable of fast colonization of a great variety of natural and artificial substrates, are highly competitive in microbial communities, are resistant to xenobiotics including chemical fungicides, and are potent producers of various metabolites such as enzymes, secondary metabolites, or surface-active proteins (Druzhinina et al. 2011; Sun et al. 2019; Gao et al. 2020; Druzhinina and Kubicek 2017; Pang et al. 2020). Some Trichoderma species can survive in soil and colonize the rhizosphere, causing almost no harm to plants but stimulating their growth and development (Druzhinina et al. 2011; Harman et al. 2004; Marra et al. 2019; Rivera-Méndez et al. 2020). Being mycoparasitic, a growing number of Trichoderma species are proposed as biofungicides for plant protection in agriculture (Ding et al. 2020; Wu et al. 2018). However, the same property also makes Trichoderma species causative agents of the green mold disease on mushroom farms (Komoń-Zelazowska et al. 2007; Kredics et al. 2010) (see Kredics et al. in this book).
Finally, some Trichoderma strains also have clinical significance as causative agents of nosocomial mycoses in immunocompromised humans (Chouaki et al. 2002; Myoken et al. 2002; Kredics et al. 2003). These versatile, largely beneficial, but also harmful properties of Trichoderma make the taxonomy of this genus a high-priority task because the correct identification of a species can predict its properties and thus facilitate applications. The taxonomy of Trichoderma has been intensively studied over the last two decades, resulting in a hundred-fold increase in the species number from a few “species aggregates” of Rifai (1969) to several hundred molecularly defined species enumerated in several recent reviews (Druzhinina et al. 2006; Atanasova et al. 2013; Bissett et al. 2015; Cai and Druzhinina 2021). Thus, today Trichoderma is a genus of very common fungi in which most species have been characterized using modern molecular techniques.

The large number of species in Trichoderma appears to be reasonable: whole-genome investigations of this genus and other hypocrealean fungi have estimated the origin of the genus at the edge of the Cretaceous-Paleogene mass extinction event 66–67 million years ago (Kubicek et al. 2019). The most recent phylogenomic tree (Kubicek et al. 2019) indicates that the major infrageneric clades such as Sections Trichoderma and Longibrachiatum recognized by John Bissett in the 1990s or the Harzianum Clade (Bissett 1984; Chaverri et al. 2003) were formed some 20–25 million years ago, while some closely related species such as T. reesei and T. parareesei shared a common ancestor 4–8 million years ago. This vast evolutionary time and the relatively high evolutionary rates (compared to, e.g., vertebrates) offer the genus Trichoderma tremendous possibilities for adaptation to environmental conditions and for speciation. However, similar to other fungi, many evolutionarily distinct strains of Trichoderma still share remarkable morphological and ecophysiological similarities. It appears that many traits suitable and accessible for direct examination by taxonomists are homoplasious and appeared due to convergent evolution. Thus, the most difficult task of modern taxonomy of Trichoderma is to retrieve the traits that would allow one to distinguish a great number of species.

General fungal taxonomy is regulated by the Code, i.e., the International Code of Nomenclature for algae, fungi, and plants (Turland et al. 2018), which now contains an advanced section for fungi, the San Juan Chapter F (May et al. 2019). Even though the Code strictly regulates nomenclatural acts, it assumes a heterogeneity of approaches to define species (Turland et al. 2018). This can be explained by the complexity of lineage-dependent evolutionary processes (Steenkamp et al. 2018; Inderbitzin et al. 2020) or numerous pragmatic criteria used by the taxonomists for the classification of particular fungal groups. Lücking et al. (2020) found that the best practice depends on the group in question and the required level of precision. Some fungi can be grouped based on phenotype characteristics; however, most fungi, especially asexual forms such as Trichoderma, require time-consuming and labor-intensive methods that include culturing, DNA barcoding, and phylogenetic analysis as well as discipline- or taxon-specific approaches such as physiological profiling (Lücking et al. 2020). Therefore, it is common for species concepts determined by the taxonomy providers to vary even within one genus. However, taxonomy users expect that the identification of species should be precise and accurate. For Trichoderma, this collision of possibly vague species delimitation and the need for exact species identification was recently addressed in Cai and Druzhinina (2021). This topic requires a thoughtful discussion that will also be presented in this chapter and continued elsewhere.

The biology of Trichoderma offers a number of exclusive opportunities to taxonomists. Fungi from this genus are ubiquitous and relatively simple to recognize and collect in natural and human-made habitats. They are easy to isolate directly from specimens and from a broad range of substrates based on the characteristic genus-specific features. Most strains grow fast in vitro on all common laboratory media and do not require demanding cultivation conditions such as temperature, illumination, or humidity. Importantly, and as will be described in most chapters of this book, many Trichoderma spp. have highly valuable properties for industry and agriculture. Accordingly, Trichoderma has attracted the attention of classical mycologists and people focusing on applied microbiology and developmental applications. Therefore, all collections of microorganisms have numerous Trichoderma isolates. Public depositories of gene sequences contain thousands of Trichoderma DNA barcodes, and the number of whole genome sequences has grown exponentially. However, the identification of Trichoderma is also considered to be extremely difficult. Fungal taxonomists, including experts who have worked with this genus for many years, now frequently fail to determine the species (Cai and Druzhinina 2021).

In this chapter, we investigate the theoretical background of these collisions in Trichoderma research aiming for a concise review of the taxonomic state of the genus. We present a brief synopsis of Trichoderma taxonomy through January 2021, list all Trichoderma species names, and explain the latest identification protocol for Trichoderma species.

2 The Numerical State of Trichoderma Taxonomy and Species Identification

After the implementation of the “One fungus – One name” concept of fungal nomenclature (Taylor 2011)—and based on the voting organized by the International Commission on Trichoderma Taxonomy (ICTT) (formerly www.isth.info, now www.trichoderma.info) of the International Commission on the Taxonomy of Fungi (ICTF, www.fungaltaxonomy.org)—Trichoderma was selected as a single generic name that should be used for all stages such as holo-, ana-, and teleomorphs. Consequently, the taxonomy of the genus Trichoderma was updated to include the species names previously attributed to teleomorphs from such genera as Hypocrea, Sarawakus, and Protocrea (Jaklitsch 2009a; Jaklitsch et al. 2014). The formal transfer of a few species of Hypocrea to Trichoderma is still pending (Cai and Druzhinina 2021); nevertheless, these species are valid names of the genus (Table 1).

Table 1 The alphabetic list of all species names deposited for Trichoderma in Index Fungorum (http://www.indexfungorum.org/), MycoBank (https://www.mycobank.org/), NCBI Taxonomy Browser (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi), and scientific literature as of February 2021

As of January 2021, the genus Trichoderma contains 468 species epithets, among which 379 names are currently in use, while 89 names (19%) are synonyms of different categories (abandoned names, orthographic variants, synonyms) (Cai and Druzhinina 2021) updated with materials from Gu et al. (2020). Forty names were introduced before the twentieth century. Of these, only five are currently in use including such important species as T. viride and T. atroviride. Sixty species were introduced in the twentieth century based on their morphology, (sometimes) ecophysiological properties, and biogeography (Rifai 1969; Bissett 1984, 1991a, b, 1992). The end of the century coincided with the introduction of molecular methods in Trichoderma taxonomy and the proposal of the genealogical concordance phylogenetic species recognition concept (GCPSR) as the most powerful approach to distinguish fungal taxa (Taylor et al. 2000; Lücking et al. 2020). These changes resulted in a rapid increase in the number of taxa adding the majority of modern Trichoderma species names (364, 78%) delineated in the first two decades of the twenty-first century. Consequently, only 14 (4%) currently valid Trichoderma species have not been characterized by molecular markers (Cai and Druzhinina 2021), while 365 species (96%) have been DNA barcoded. This makes the genus Trichoderma a suitable model for DNA barcoding and molecular evolutionary studies in fungi.
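The proportions quoted in this paragraph follow directly from the counts. As a quick arithmetic check, a minimal sketch is given below; the variable names are ours, and only the counts are taken from the text.

```python
# Counts as stated in the text (January 2021); variable names are illustrative.
total_epithets = 468
names_in_use = 379
synonyms = 89
named_in_21st_century = 364
no_molecular_data = 14
dna_barcoded = 365


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


shares = {
    "synonyms among all epithets": percent(synonyms, total_epithets),
    "21st-century names among all epithets": percent(named_in_21st_century, total_epithets),
    "names in use without molecular data": percent(no_molecular_data, names_in_use),
    "names in use with DNA barcodes": percent(dna_barcoded, names_in_use),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

Note that the first two shares are taken relative to all 468 epithets, whereas the last two are relative to the 379 names currently in use.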

The largest database of Trichoderma names is available in MycoBank (http://www.mycobank.org/) followed by Index Fungorum (http://www.indexfungorum.org). Most species names are recorded in both taxonomic depositories, but MycoBank still has 14 and Index Fungorum has 8 unique records. Therefore, none of the official depositories of fungal taxonomy has the full list of Trichoderma species names (Fig. 1). To date, the most complete list of Trichoderma species can be found in Table 1 (sorted alphabetically for convenience). Alternatively, the newly re-established website of the ICTT (www.trichoderma.info) contains another copy of the complete list of species and is designed to be regularly updated. The interactive, updated, and searchable version of the complete list of Trichoderma species is available as a supplementary tool in the species identification protocol (www.trichokey.com) (Cai and Druzhinina 2021). However, as the number of species grows rapidly (Cai and Druzhinina 2021), users are advised to screen the most recent taxonomic literature and compare it to the data on recent website updates.

Fig. 1 The numerical representation of Trichoderma taxonomy. The left Venn diagram shows the number of Trichoderma species deposited in the major depositories of fungal taxonomy such as Index Fungorum (http://www.indexfungorum.org/), MycoBank (https://www.mycobank.org/), and NCBI Taxonomy Browser (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi). The right Venn diagram shows the numbers of species that have one or several of the three DNA barcode sequences required for the molecular identification of Trichoderma. The bar plot illustrates the alarming situation related to identifiability of Trichoderma species. Numbers near the bars show the numbers of species (based on the estimates updated from Cai and Druzhinina 2021, www.trichokey.com and www.trichoderma.info)

The introduction of molecular methods in Trichoderma taxonomy not only resulted in the rapid growth of the species number but it also ended the morphological identification of Trichoderma (Kullnig-Gradinger et al. 2002; Druzhinina and Kubicek 2005; Druzhinina et al. 2005). Regardless of the experience and training of the taxonomist, the analysis of many morphological features cannot lead to unambiguous diagnosis of Trichoderma taxa even at the level of clades or sections. Thus, identification can only be achieved via analysis of DNA barcodes.

Even though 96% of Trichoderma species are characterized molecularly and the sequences are preserved in public databases, the Taxonomy Browser of NCBI (https://www.ncbi.nlm.nih.gov/taxonomy) contains only 340 species names (89% of all species and 93% of the molecularly characterized ones). This means that sequence records for at least several dozen described species have not been updated and are still deposited as taxonomically undefined records (i.e., Trichoderma sp. strain ID). Consequently, these species will not appear in the results of the sequence similarity search using NCBI BLAST. The vouchered sequences can be retrieved based on sequence accession numbers provided in the publications.

Due to the high number of cryptic and closely related species, the accurate molecular identification of Trichoderma species requires analysis of at least three DNA barcodes (Cai and Druzhinina 2021) (see below). Considering the updated records for early 2021, the largest number of species have been DNA barcoded for tef1 (86%) followed by rpb2 (82%) and ITS (78%); only 270 (71%) have all 3 DNA barcodes (Fig. 1). Other commonly provided DNA barcodes (chi18-5=ech42, cal1, act, acl1, 18S rRNA=SSU, and 28S rRNA=LSU) are sequenced for less than one-half of the species; therefore, they currently have limited or no suitability for molecular identification regardless of their properties.

We note that the number of species suitable for accurate species identification based on molecular markers is even lower than the estimate provided above (71%, Fig. 1). Our analysis showed that the identification of at least 50 recently described species is compromised by either incomplete reference sequences or sequences indistinguishable from the sister species (Cai and Druzhinina 2021). Thus, we counted only 224 (60%) Trichoderma species that can be potentially identified based on available DNA barcodes (ITS, tef1, and rpb2). Still, this number appears to be an overestimate because the individual analysis of species frequently reveals further taxonomic collisions and leads to ambiguous results.

Thus, we conclude that while the taxonomy of Trichoderma attracted considerable attention over the last two decades, the taxonomic situation in the genus is alarming and requires urgent improvements (Fig. 1). The reasons for this unfortunate state of Trichoderma taxonomy and possible measures that can be taken for its improvement will be discussed below.

3 Three Stages of Trichoderma DNA Barcoding

The development of DNA barcoding of Trichoderma went through three pronounced stages. First, the species could be identified based on the combination of diagnostic oligonucleotide sequences in specific areas of ITS sequences of the rRNA gene cluster when the total diversity of the genus did not exceed 100 taxa (Druzhinina et al. 2005). This method was implemented in the web-based tool TrichOKEY and was supported by the public database of the reference sequences. For at least a decade, the TrichOKEY tool was appreciated by users of Trichoderma taxonomy because of its simplicity. For most species recognized at that time, pasting an ITS sequence into the web form provided an unambiguous and final identification result that did not require further analyses (reviewed in Druzhinina et al. 2006). The identification could be performed by people having no experience in fungal taxonomy or molecular phylogeny. However, there were already several pairs of species that shared the same phylotypes of ITS and therefore were not distinguishable. Upon subsequent introduction of more and more new species, insufficient variability of ITS was demonstrated for many infrageneric groups, especially for the clades within Section Trichoderma and Section Longibrachiatum as well as the Harzianum Clade. Therefore, ITS started to lose its reputation as the diagnostic marker for Trichoderma species (Druzhinina et al. 2012; Atanasova et al. 2010).

A new effort was focused on a search for the so-called “secondary” DNA barcode loci that would aid in unambiguous species identification. At that stage, the suitability of various loci was tested based either on the random use of recently cloned and characterized genes (e.g., ech42 = chi18-5) or more commonly following the practices used for the large DNA barcoding initiatives such as the Fungal Tree of Life project (Lutzoni et al. 2004). Thus, rpb2 (Liu et al. 1999), cal1 (Carbone and Kohn 1999), act (Carbone and Kohn 1999), 18S rRNA=SSU (White et al. 1990), and 28S rRNA=LSU were sequenced for a broad range of species, but only tef1 locus received broad support by the community (Cai and Druzhinina 2021). Therefore, the second phase of Trichoderma DNA barcoding was associated with the use of the large intron of tef1 gene (Kopchinskiy et al. 2005) for sequence similarity search. The sequences of tef1 were sufficiently polymorphic and allowed species identification with quite high precision versus the curated database of vouchered sequences using such tools as TrichoBLAST or (with more caution) NCBI BLAST. At that stage, we estimated that intraspecific variability of tef1 large (4th) intron could be as high as 4–5% meaning there was a 95% similarity threshold for most of the species in BLAST.
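The similarity threshold mentioned above can be illustrated with a minimal sketch that computes the percent identity of two already aligned sequences and compares it with a cutoff. This is not how BLAST or TrichoBLAST calculate their scores; the function names, the toy sequences, and the convention of skipping gap columns are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which both sequences have a base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are ignored, one of several conventions
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def passes_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the pair meets the similarity threshold (95% as in the second stage)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 20-bp sequences differing at one position (hypothetical, not real tef1 data).
query = "ACGTACGTACGTACGTACGT"
reference = "ACGTACGTACGTACGTACGA"
print(percent_identity(query, reference), passes_threshold(query, reference))
```

With a tolerated intraspecific variability of 4–5%, a strain would be assigned to a species when its tef1 intron sequence is at least about 95% similar to a vouchered reference.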

Rahimi et al. (2021) recently offered a way to identify T. reesei strains by searching for a long (400 bp) sequence of a tef1 fragment that they postulated to be diagnostic for this species. However, no such hallmarks were reported for other Trichoderma spp. This “tef1” stage ended with the so-called species boom that occurred in Trichoderma in 2014–2015 when more than 100 new species were added mainly due to the taxonomic studies in Europe and China (reviewed in Cai and Druzhinina 2021). Dou et al. (2020) were the first group to realize that the single secondary barcode, the partial tef1 sequence, was no longer sensitive enough for the identification of Trichoderma species. For this purpose, they programmed MIST (the Multiloci Identification System for Trichoderma; http://mmit.china-cctc.org/), which relied on the gradual application of sequence similarity search for the three loci: ITS, tef1, and rpb2. This started the third stage of Trichoderma DNA barcoding. This program offered a reasonable replacement for TrichOKEY, which was consequently shut down (Cai and Druzhinina 2021). The strength of MIST was that it had the most complete database of reference sequences for Trichoderma and included the three DNA barcoding loci for many type strains; however, it also contained numerous unverified records and thus could not provide highly accurate or precise identification. Interestingly, the two secondary DNA barcodes (the partial sequences of tef1 and rpb2) have unequal levels of polymorphism. Therefore, no single value of the similarity threshold could be used for both markers. To overcome this issue, we recently collected all DNA barcoding records for all contemporary valid Trichoderma species and proposed the species identification protocol (Cai and Druzhinina 2021). There, we reviewed the interspecific polymorphism of ITS, tef1, and rpb2 sequences of closely related Trichoderma species to find the most reasonable sequence similarity values for each of the three DNA barcoding loci.
This allowed us to formulate the sequence similarity standard:

$$ \mathit{Trichoderma}\quad \left[\mathrm{ITS}_{76}\right] \sim \mathrm{sp}\ \exists!\quad \left(\mathit{rpb2}_{99} \cong \mathit{tef1}_{97}\right). $$

Here, “Trichoderma” means the genus Trichoderma, “sp” means a species, “~” indicates an agreement between ITS and other loci, “≅” refers to the concordance between “rpb2” and “tef1,” and “∃!” indicates the uniqueness of the condition (only one species can be identified). Subscripts show the similarity (in percent) per locus that is sufficient for identification based on the assumptions of the protocol. This standard was then implemented in the molecular identification protocol (Cai and Druzhinina 2021) that required a manual analysis of every set of sequences per individual strain. Still, due to the high number of poorly characterized reference taxa, this protocol would also result in some ambiguous identifications. Moreover, the application of the identification procedure requires training in sequence analysis and can be difficult for inexperienced people. However, no “easy” solution appears to be feasible at this phase of Trichoderma taxonomy.
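One possible reading of this standard as a decision rule is sketched below. It is only an illustration under stated assumptions and not the published protocol, which relies on manual analysis: the type and function names are ours, the thresholds are applied as "greater than or equal to," and the similarity values in the example are hypothetical.

```python
from typing import Dict, NamedTuple


class Similarity(NamedTuple):
    """Percent similarity of a query strain to one reference species (trimmed loci)."""
    its: float
    rpb2: float
    tef1: float


# Thresholds taken from the subscripts of the standard.
ITS_MIN, RPB2_MIN, TEF1_MIN = 76.0, 99.0, 97.0


def apply_standard(hits: Dict[str, Similarity]) -> str:
    """Return a species name only if exactly one reference meets all three thresholds."""
    if not any(s.its >= ITS_MIN for s in hits.values()):
        return "not Trichoderma"
    candidates = sorted(
        name
        for name, s in hits.items()
        if s.its >= ITS_MIN and s.rpb2 >= RPB2_MIN and s.tef1 >= TEF1_MIN
    )
    if len(candidates) == 1:
        return candidates[0]  # the uniqueness condition is satisfied
    if not candidates:
        return "Trichoderma sp. (no concordant reference; further analysis needed)"
    return "ambiguous: " + ", ".join(candidates)


# Hypothetical similarity values for a single query strain.
example_hits = {
    "T. asperelloides": Similarity(its=100.0, rpb2=100.0, tef1=99.5),
    "T. asperellum": Similarity(its=99.0, rpb2=98.9, tef1=93.0),
}
print(apply_standard(example_hits))
```

The sketch makes the role of the uniqueness condition explicit: when no reference or more than one reference meets both the rpb2 and the tef1 thresholds, no species name is returned.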

The current (third) stage of DNA barcoding of Trichoderma is based on the three DNA loci that are considered to be the most reliable. Still the identification process remains complex. Even though Cai and Druzhinina (2021) argue that all three loci are required for the accurate and precise species identification, ITS can only be used to identify Trichoderma at the generic level. Most species recognition comes from the diagnostic fragments of tef1 and rpb2 gene sequences. The choice of these loci is not determined by their particular suitability for the purpose but rather by their availability in public databases for most species (Fig. 1).

The advantage of tef1 is the high polymorphism of its large (4th) intron sequence that is 250–300 base pairs long. We determined that individual strains within most of the contemporary species share >97% similarity of this fragment meaning that the polymorphism can reach up to 3% or 20–25 single mutations. This “identification window” is small compared with that of the second stage of DNA barcoding, but it still offers a reasonable resolution and may potentially lead to unambiguous identification of strains having tef1 phylotypes highly similar to that of the type strain for a given species. However, the disadvantage of tef1 is also linked to its high polymorphism because it prevents combining strains from different infrageneric clades in a single alignment (Jaklitsch 2009a, 2011). Consequently, many Trichoderma taxonomy providers keep sequencing tef1 for newly described species but have largely abandoned the polymorphic fragment and shifted toward the 3′ end of the gene to the highly conserved fragment of the last (6th) exon (Jaklitsch 2009b, 2011). As a result, the taxonomic value of this version of the tef1 DNA barcode locus is negligible. This shift coincided with the “species boom” and resulted in the description of a large number of species that cannot be distinguished based on existing DNA barcodes (Cai and Druzhinina 2021).

The properties of rpb2 are the reverse of those of tef1: The DNA barcoding fragment of this gene covers an area of relatively highly conserved exon sequence. Contrary to tef1, these sequences are easily aligned genus-wide and therefore are suitable for the construction of whole genus phylograms (Atanasova et al. 2013; Cai and Druzhinina 2021). Consequently, the polymorphism of rpb2 is substantially lower than that of tef1, and well-defined pairs of sister species such as T. asperellum and T. asperelloides, T. reesei and T. parareesei, and T. harzianum and T. afroharzianum differ by only 1% or a few single mutations of rpb2 (usually less than eight). Unfortunately, we have detected numerous recently described species that share identical or highly similar (>99%) sequences of rpb2 (Cai and Druzhinina 2021). The above-described limitations of the tef1 and rpb2 DNA barcodes are the main but not the only source of identification complexity.

The other issue causing identification ambiguity is related to the cases of discordant similarities of the three DNA barcoding loci. For example, Cai and Druzhinina (2021) pointed to the ambiguous taxonomic position of their model whole genome sequenced strain NJAU 4742 (Zhang et al. 2016, 2019; Pang et al. 2020; Cai et al. 2020; Gao et al. 2020; Druzhinina et al. 2018; Kubicek et al. 2019; Jiang et al. 2019; Zhao et al. 2021). This strain has the tef1 DNA barcode identical to that of the type strain of T. guizhouense. Therefore, it was attributed to this species at the second stage of DNA barcoding of Trichoderma. However, the rpb2 sequence of this strain is less than 95% similar to that of the type strain of T. guizhouense and has most affinity to T. pyramidale (97.8%, which is still below the identification threshold). Interestingly, we came across several other strains with the same haplotype of tef1 and rpb2 as NJAU 4742. These data suggest the existence of a putative new species (T. shenii nom. prov., Cai and Druzhinina 2021). This and numerous other cases of incongruent similarities point to the need for phylogenetic analyses of tef1 and rpb2 alignments along with the consideration of the similarities. In turn, these data explain why attempts at fully automated identification of sequences, such as TrichOKEY and MIST, do not appear feasible.
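The NJAU 4742 case can be restated as a small check for concordance between the two secondary barcodes. The sketch below is illustrative only: the function names are ours, the thresholds are applied as "greater than or equal to," and the value 94.9 is a placeholder standing in for "less than 95%" because the exact figure is not given here.

```python
from typing import Dict, Set


def passing(hits: Dict[str, float], threshold: float) -> Set[str]:
    """Reference species whose similarity to the query meets the threshold."""
    return {species for species, similarity in hits.items() if similarity >= threshold}


def check_concordance(
    tef1_hits: Dict[str, float],
    rpb2_hits: Dict[str, float],
    tef1_min: float = 97.0,
    rpb2_min: float = 99.0,
) -> str:
    """Compare the species supported by tef1 with those supported by rpb2."""
    shared = passing(tef1_hits, tef1_min) & passing(rpb2_hits, rpb2_min)
    if len(shared) == 1:
        return "concordant: " + next(iter(shared))
    if shared:
        return "ambiguous: " + ", ".join(sorted(shared))
    return "discordant: phylogenetic analysis required"


# Values for NJAU 4742 as described in the text; 94.9 is a placeholder for "<95%".
njau4742_tef1 = {"T. guizhouense": 100.0}
njau4742_rpb2 = {"T. guizhouense": 94.9, "T. pyramidale": 97.8}
print(check_concordance(njau4742_tef1, njau4742_rpb2))
```

Such a check can only flag a strain for further work; resolving the flagged cases still depends on phylogenetic analysis of the tef1 and rpb2 alignments.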

4 Notes on the Identification of Trichoderma Species

The protocol for molecular identification of a single Trichoderma strain is detailed in Cai and Druzhinina (2021). That work also contains several dozen practical examples that provide an overview of various situations related to the implementation of this protocol. In this chapter, we do not repeat the description of the protocol but rather comment on it and highlight a few aspects that appear critical for its understanding and correct use (Fig. 2).

Fig. 2 The summary of the current molecular identification protocol for Trichoderma species (Cai and Druzhinina 2021)

First, it is important to bear in mind that neither the choice of DNA barcode markers nor the sequence similarity threshold values were selected based on their properties or particular suitability for the species recognition in Trichoderma. The decision to use these loci was merely pragmatic because these were the only three DNA barcoding markers that were available in public databases for the majority of species (Fig. 1). Accordingly, the similarity values were picked such that they could distinguish most of the contemporary species (Cai and Druzhinina 2021). We admit that the whole genome sequences for Trichoderma (Druzhinina et al. 2018; Kubicek et al. 2019) could be used for the detection of essentially more powerful DNA barcoding loci in a hypothetical situation of a taxonomic revision of the entire genus. However, it is important to understand that no such revision appears to be envisioned in the near future for nonscientific reasons. The comparison of closely related Trichoderma strains is impeded by the strain exchange barriers between countries. For instance, at least 100 Trichoderma species have been recently described in China, and this number will likely keep growing (Cai and Druzhinina 2021). Due to the quarantine rules, sending strains across the borders between some specific countries for examination in other laboratories appears to be difficult. Thus, at this stage of DNA barcoding of Trichoderma, the selection of diagnostic loci and criteria for the identification were determined by the availability and other practical considerations.

Second, the protocol largely relies on the sequence similarity values, and its successful implementation requires precisely defined sequence fragments for each locus. Consequently, trimming the sequences in preparation for the protocol is an essential step that must not be omitted (Fig. 2). Every DNA barcoding locus can be PCR amplified using a variety of primer pairs (Jaklitsch et al. 2005; Carbone and Kohn 1999; Liu et al. 1999) resulting in fragments of different lengths. Therefore, the base pairs flanking the diagnostic regions must be removed either manually following the instructions in Cai and Druzhinina (2021) or using online support such as www.trichokey.com (Fig. 2).
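The trimming step can be pictured as cutting a sequence down to the region between two anchor motifs. The sketch below only illustrates the idea: the function name, the anchor motifs, and the toy sequence are hypothetical, and the actual boundaries of the diagnostic regions (including whether the anchors themselves are retained) must be taken from Cai and Druzhinina (2021).

```python
def trim_to_region(seq: str, start_motif: str, end_motif: str) -> str:
    """Return the region from the start motif through the end motif (both included)."""
    s = seq.upper()
    start = s.find(start_motif.upper())
    if start == -1:
        raise ValueError("start motif not found")
    # Search for the end motif only downstream of the start motif.
    end = s.find(end_motif.upper(), start + len(start_motif))
    if end == -1:
        raise ValueError("end motif not found downstream of start motif")
    return s[start : end + len(end_motif)]


# Toy example with hypothetical anchors; lowercase letters mark the flanks to remove.
raw = "ttttGTAAGacgtacgtCCTAGgggg"
print(trim_to_region(raw, "GTAAG", "CCTAG"))
```

Trimming both the query and the reference sequences to the same fragment ensures that similarity values are calculated over comparable regions regardless of the primer pair used.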

Third, sequencing ITS is compulsory for the identification of Trichoderma species and the analysis of infrageneric diversity. Unfortunately, to date, the database of vouchered ITS sequences is smaller compared to tef1 and rpb2 (Fig. 1) because sequencing of ITS was abandoned by some providers of Trichoderma taxonomy after this locus lost its power in distinguishing many pairs or groups of closely related species. However, ITS still has an exceptional value in fungal taxonomy (Schoch et al. 2012). Even in Trichoderma, many species have unique phylotypes of ITS and can therefore contribute to the identification precision. More critically, ITS is highly diagnostic at the generic border of Trichoderma where the limited polymorphism of the protein-coding genes appears to be less informative (Cai and Druzhinina 2021). It is also necessary to determine ITS sequences for all new fungal taxa because it is the main locus used for fungal metagenomic studies and has a vast database of environmental records (reviewed in Lücking et al. (2020)).

Fourth, it is important to specify that the protocol allows one to identify some species through the analysis of sequence similarity values with no need to run phylogenies. For example, it might be common when a certain strain has the trimmed ITS and rpb2 phylotypes identical to that of T. asperelloides CBS 125938 (type) and the trimmed tef1 phylotype having one or two SNPs different from that of the above strain. In this case, the application of the Trichoderma [ITS₇₆]~sp∃!(rpb2₉₉≅tef1₉₇) standard is unambiguous and leads to the molecular identification of the query strain as T. asperelloides. Many other cases require phylogenetic analysis. This is in particular necessary when tef1 and rpb2 are not concordant or the reference DNA barcoding material is incomplete. The quality of phylogenetic analysis is also strongly influenced by the taxonomic completeness of the reference materials. The dataset suitable for phylogeny should have no gaps, i.e., it should include all species reported for this infrageneric group. The protocol of Cai and Druzhinina (2021) offers a list of Trichoderma species and reference strains sorted based on their phylogenetic relation (PhyloOrder in Table 2 there and on www.trichokey.com). This should assist people searching for a taxonomically complete set of sequences required for their analysis.

The fifth note on the implementation of the molecular identification protocol for Trichoderma species refers to the validation and verification steps (Fig. 2). These steps were not considered important at the first and second stages of Trichoderma DNA barcoding but now appear critical.

In Cai and Druzhinina (2021), validation refers to the quality control step in the reference materials for DNA barcoding. The most common issue leading to ambiguous identifications is the deposition of the reference tef1 sequences that contain only a portion of the last large intron (Jaklitsch 2009a) that is diagnostic for Trichoderma DNA barcoding. One or the other end of this sequence is missing (more frequently the 5′ end of the intron sequence). The taxonomically relevant map and the structure of the tef1 gene were provided in Rahimi et al. (2021). As mentioned above, many taxonomists sequence the 3′ end of the tef1 gene spanning the last large exon, which can be aligned across the genus but has limited or no suitability for DNA barcoding. This refers to numerous new species introduced from Europe and China prior to and during the recent “species boom” in 2009–2015. The missing diagnostic tef1 DNA barcodes should be provided as a matter of priority because with the current high number of taxa, even a single incomplete reference sequence per species will result in ambiguous identification.

This situation is less frequently noticed for rpb2 sequences. However, rpb2 references can sometimes be of poor quality and thus also unsuitable for identification. For the cases when the DNA barcoding sequences for the reference strains are either incomplete or of poor quality, the protocol of Cai and Druzhinina (2021) suggests using the T. cf. [species name] construct. The users of taxonomy (researchers who perform the identification) are advised to seek or request the completion of reference materials from their respective taxonomy providers. Alternatively (and as was practiced at early stages of Trichoderma DNA barcoding), the reference strains can be obtained from the respective strain collections and sequenced.

The validation step can also fail when several species share the same phylotype of one or several DNA barcodes. Unfortunately, this is also a common situation in Trichoderma taxonomy (Cai and Druzhinina 2021). For example, T. afarasin and T. endophyticum share a highly similar tef1 phylotype (>99% similarity); T. yunnanense and T. kunmingense share highly similar phylotypes of rpb2 with each other and with T. asperellum (>99%). In such cases, the ambiguity of the final identification can be recorded as T. aff. asperellum if, for instance, the query strain was isolated in Europe. If sampling was performed in the Chinese province of Yunnan, then the strains can be identified as T. aff. yunnanense or T. aff. kunmingense, depending on other properties.

After the results of molecular identification have been validated through the quality control of reference materials, the next important step is the biological verification of the identification result. Biological verification requires critical evaluation of such criteria as morphology, ecophysiology, biogeography, habitat, and occurrence. At this stage, the consideration of micromorphological features appears to be reasonable. For example, the three sister species T. pleuroti, T. amazonicum, and T. pleuroticola share numerous morphological and ecophysiological features but also differ sharply in others, which verifies their distinct taxonomic statuses. Cai and Druzhinina (2021) provide a detailed explanation of the verification stage of their protocol.

Finally, the “new species hypothesis” can be an unambiguous, accurate, and precise result of molecular identification. This case ultimately requires validation of reference materials, phylogenetic analysis, and biological verification. In this chapter, we avoid discussing the criteria applicable for the delineation of species in Trichoderma, as Cai and Druzhinina (2021) presented a comprehensive discussion of this topic. However, we would like to stress that the correct implementation of the genealogical concordance phylogenetic species recognition concept (Taylor et al. 2000) requires the analysis of single gene topologies. The common use of a single tree based on a combined multilocus alignment is insufficient for a new species proposal.
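The requirement to inspect single gene topologies can be illustrated with a minimal Python sketch that asks whether a candidate group of strains forms a clade in each single-locus tree separately. The sketch assumes rooted trees written as nested tuples of strain names; the strain labels and both topologies are invented for illustration. It covers only the monophyly check and does not assess branch support or the other elements of a full concordance analysis.

```python
from typing import Dict, FrozenSet, Iterable, Set, Union

Tree = Union[str, tuple]  # a leaf name or a tuple of subtrees (rooted)


def collect_clades(tree: Tree, out: Set[FrozenSet[str]]) -> FrozenSet[str]:
    """Return the leaf set of `tree` and record every internal node's leaf set."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.add(leaves)
    return leaves


def is_clade(tree: Tree, group: Iterable[str]) -> bool:
    """True if `group` is exactly the leaf set of some internal node."""
    clades: Set[FrozenSet[str]] = set()
    collect_clades(tree, clades)
    return frozenset(group) in clades


def per_locus_support(trees: Dict[str, Tree], group: Iterable[str]) -> Dict[str, bool]:
    """Check the candidate group against every single-gene tree."""
    members = frozenset(group)
    return {locus: is_clade(tree, members) for locus, tree in trees.items()}


# Hypothetical single-gene topologies; q1 and q2 are the candidate strains.
gene_trees = {
    "tef1": ((("q1", "q2"), "A"), ("B", "C")),
    "rpb2": (("q1", ("q2", "A")), ("B", "C")),
}
support = per_locus_support(gene_trees, ["q1", "q2"])
print(support)
```

In this invented example the candidate strains group together in one locus but not in the other, which is the kind of discordance that a single tree built from a concatenated alignment can hide.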

5 Conclusions

The identification of Trichoderma species is an intricate and laborious task that requires a background in mycology, molecular biological skills, training in molecular evolution, and in-depth knowledge of taxonomic literature (Cai and Druzhinina 2021). The contemporary diversity of Trichoderma spp. cannot be identified by automated sequence similarity searches (such as NCBI BLAST or MIST BLAST) or oligonucleotide DNA barcodes. All molecular identification results require in silico validation and biological verification. Similarly, Trichoderma spp. cannot be identified by phylogenetic analysis without considering the sequence similarity values relative to the complete set of closely related species. The complexity of the identification process points to the need for close interactions between Trichoderma taxonomy experts.

In this chapter, we used Trichoderma to address the modern taxonomic collision that can also occur in many other genera of common and well-investigated fungi. The taxonomy of these fungi was visited and revisited many times and seemingly progressed with the introduction of new species. The delineation of cryptic species is considered to be a useful practice because it increases the accuracy and precision of property prediction. However, many of the newly recognized species appear to be difficult to identify. Ultimately, the failure to identify species leads to ambiguity but, more dangerously, to the description of more new species that further complicate the identification. This loop has been reported before: it was noted that every fungal species has been named 2.5 times on average (Hawksworth and Lucking 2017). Good taxonomic practice should include the verification of species identifiability. Even though this process appears to be implemented as a reverse operation to species recognition, it is frequently obscured by the application of vague species criteria. In an unfortunate case, a species can be recognized based on a comparison with a taxonomically incomplete set of references or based on species criteria that do not correspond to the state of the art in this genus. Even now, the Code allows the application of the morphological species concept or a description of a Trichoderma species based on morphological characters and the analysis of any single locus, e.g., ITS.

In this chapter, we tried to emphasize that such cases will result in a valid species name, but it will not be possible to identify this species because most sister species were delineated based on advanced molecular species criteria such as GCPSR or even an integrated polyphasic approach. The example above is an exaggeration, but the taxonomic reality of Trichoderma is highly ambiguous. We assume that this turbulent state was caused by the recent introduction of highly powerful molecular techniques in fungal taxonomy and that the situation will find a rational solution. However, we add a further warning related to the introduction of the whole genus genomic data in Trichoderma taxonomy. Whole genome sequences have still unexplored inter- and intraspecific polymorphism and thus offer essentially more options for taxonomic splitting: species within the genus may share only 75% similarity genome-wide (Kubicek et al. 2019), and the genomes of two strains of the same clonal species T. harzianum have up to 1000 unique genes each. Therefore, the discussion of a unified species concept suitable for such fungi as Trichoderma is an urgent task for Trichoderma researchers and fungal taxonomists.