Introduction

An estimated 1.5 to 5 million species of fungi are found across diverse environmental conditions [1]. Many fungal species are symbiotic or pathogenic and thrive in close associations with other organisms. Having evolved independently from non-pathogens, over 8000 fungi are plant pathogens and around 200 are pathogenic to humans [2]. Annually, more than one billion people contract a fungal infection, over 300 million people suffer from a serious fungal-related disease, and more than 2 million people die, making fungal disease the fifth largest cause of death worldwide [3, 4]. While the majority of fungal infections are superficial and relatively easy to cure, invasive fungal infections, commonly caused by Candida albicans, Aspergillus fumigatus, and Cryptococcus neoformans, are more difficult to diagnose and treat, resulting in a mortality rate that can reach 90% in immunocompromised individuals [5]. The increase in antifungal resistance further challenges our ability to treat these diseases, contributing to high mortality rates [5].

Very few fungal pathogens depend on a human host for their life cycle, and their pathogenicity is unintended [2]. Because these fungi are opportunistic pathogens, many of the genetic traits required for virulence are likely not specific markers for causing disease and were instead selected for based on the pathogen’s ability to survive in its natural habitat. The ability of fungal pathogens to adapt to their host (i.e., expression of virulence factors, antifungal tolerance) depends on their ability to generate genomic variation. Stable and prolonged changes to the genome (gene gain or loss, genomic rearrangements, horizontal gene transfer, changes in ploidy, and sexual reproduction) contribute to the genetic variability, virulence, and antifungal resistance of human fungal pathogens [6].

The first sequenced eukaryotic genome was fungal and fungi have more genomes sequenced than any other eukaryotic group (Table 1). Genome sizes in the fungi are highly variable, ranging from 8.97 to 117.57 Mb with an average genome size of 36.91 Mb in Ascomycota, 46.48 Mb in Basidiomycota, and 74.85 Mb in Oomycota phyla (Table 2). The depth of fungal genome sequencing has enabled direct comparisons between species and lineages, contextualizing the genetic diversity that enables fungi to flourish in disparate habitats and invade plants and animals. This review will focus on the genomic features of the most prevalent human fungal pathogens (Aspergillus, Cryptococcus, and Candida) and endemic fungal pathogens (Histoplasma, Blastomyces, Coccidioides, Paracoccidioides, and Sporothrix).

Table 1 Summary of the most common human fungal pathogens and their distribution
Table 2 A summary of the sequenced genomes for the most common human fungal pathogens

Aspergillus

Aspergillus is a genus of widespread and diverse filamentous saprobes with clinical and agricultural significance. Most Aspergillus species are not pathogenic, specializing instead in the breakdown of botanical matter. As a genus, the genetic variation in Aspergillus is equal to that of the Vertebrate phylum; the close relatives A. fumigatus and A. fischerianus are as dissimilar as humans and mice [7•]. There are hundreds of described Aspergillus species, but only a fraction of them are capable of infecting humans, with infections primarily caused by A. fumigatus and A. flavus. Currently, reference genomes are available for 194 Aspergillus species through the NCBI Genome Database [8].

Aspergillus fumigatus

Disease and Diversity

A. fumigatus causes the greatest number of deaths and the second highest number of human infections, and is responsible for up to 90% of aspergillosis cases [9]. The global distribution of A. fumigatus and its ability to grow well at 37 °C result in 11 million allergic reactions and over 3 million chronic and invasive lung infections annually [9]. Phylogenetic analyses separating A. fumigatus into clades have been inconclusive, with no significant variation found between clinical and environmental isolates [10, 11]. However, the subdivision of A. fumigatus into two broad clades is supported by the uneven distribution of alleles of cyp51 (erg11), the target for azoles [12].

Genome

A. fumigatus was first sequenced in 2005 (strain Af293), with recent genomes providing telomere-to-telomere coverage for strains CEA10 and A1160 [13•]. Comparisons of the A1160, CEA10, and Af293 genome assemblies revealed several chromosomal rearrangements, the most significant occurring between chromosomes 1 and 6 [13•]. Pan-genome analysis identified a core set of orthologs (69%), with 16% to 22% of the genome varying between strains [10]. Variation is primarily found in accessory genes affiliated with transmembrane transporters, iron-binding activity, and carbohydrate and amino acid metabolism, which may explain the wide range in virulence observed in A. fumigatus isolates [10]. Chronic disease isolates are more genetically diverse than strains from invasive aspergillosis or the environment and are more likely to engage in parasexual or sexual recombination, contributing to the development of azole resistance [10, 14].
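The core versus accessory split described above can be illustrated with a minimal Python sketch. It assumes ortholog groups have already been called and are available as a presence/absence map; the orthogroup identifiers and the presence data below are hypothetical toy values, not results from [10], and the function name `classify_pangenome` is our own.

```python
from typing import Dict, Set, Tuple


def classify_pangenome(
    presence: Dict[str, Set[str]], strains: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Split orthogroups into core (found in every strain) and accessory (found in only some)."""
    core = {og for og, found in presence.items() if strains <= found}
    accessory = {
        og for og, found in presence.items() if found and not strains <= found
    }
    return core, accessory


# Hypothetical toy data: orthogroup -> strains in which it was detected.
strains = {"Af293", "CEA10", "A1160"}
presence = {
    "OG0001": {"Af293", "CEA10", "A1160"},
    "OG0002": {"Af293", "CEA10", "A1160"},
    "OG0003": {"Af293"},
    "OG0004": {"CEA10", "A1160"},
}

core, accessory = classify_pangenome(presence, strains)
core_fraction = len(core) / len(presence)
print(sorted(core), sorted(accessory), core_fraction)
```

In practice the presence/absence map would come from an orthology-calling step, and the resulting core fraction depends on how many strains are sampled.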

Aspergillus flavus

Disease and Diversity

A common plant pathogen, A. flavus produces several aflatoxins, causes pulmonary and systemic infections in humans, and can be up to 50 times more virulent than A. fumigatus [15]. However, infection by A. flavus is less common than infection by A. fumigatus and is responsible for less than 10% of pulmonary aspergillosis cases [16]. A. flavus forms a single monophyletic clade, but whole genome analysis breaks A. flavus isolates from the USA into 3 populations, with population C more closely related to A. oryzae [17]. Populations A and B are widely distributed and have similar geographic distribution, while population C is often isolated from Iowa, Indiana, and Pennsylvania [17]. Notably, populations B and C have lower diversity than population A [17].

Genome

Several A. flavus isolates have been sequenced [18,19,20], with the nearly complete assembly of isolate NRRL3357 released in 2021 [21•]. This 37.75 Mb genome assembly completed 7 of the 8 chromosomes from telomere to telomere and is considerably larger than other Aspergillus genomes [21•].

Compared to A. fumigatus, there is significantly less genetic diversity among the clinical isolates of A. flavus. The A. flavus genome is remarkably similar to that of its closest relative, A. oryzae; only 43 genes are unique to A. flavus [22]. A. flavus produces carcinogenic secondary metabolites known as aflatoxins, which are absent from its close relatives. Furthermore, the regulatory proteins of aflatoxin biosynthesis are necessary for A. flavus asexual development [23].

Cryptococcus

Cryptococcus is a genus of basidiomycete yeasts found worldwide in soil, bird droppings, decaying wood, and trees. Cryptococcus is the etiological agent of cryptococcosis and fungal meningoencephalitis, among the most lethal fungal infections [24]. The vast majority of infections, up to 95%, are caused by the globally distributed C. neoformans, although cases caused by C. gattii are increasing annually [24]. C. gattii is a primary human pathogen, causing disease in both immunocompetent and immunocompromised individuals [24]. C. gattii is endemic in tropical climates, with climate change likely playing a role in the Pacific Northwest outbreaks [25].

Cryptococcus species are typically haploid with a 19 Mb genome on 14 chromosomes. However, changes in ploidy, hybrid genomes, and chromosome duplications are not uncommon and karyotype variation has occurred in strains over the course of infection [26]. C. neoformans and C. gattii share a genetic identity of ~ 85%; however, hybrids between the two species have been reported, increasing the genetic variability of the genus [27]. Multilocus sequence typing has identified 5 major molecular types of C. neoformans and 4 major molecular types of C. gattii [28•]. Genomic rearrangements and changes in chromosome length in Cryptococcus likely contribute to chronic infection, adaptation to the host, and antifungal resistance [29, 30].

Cryptococcus neoformans

Genome

Both the reference strain H99 and a recently completed ungapped genome of C. neoformans VNII span 19 Mb across 14 chromosomes [31, 32•]. Comparisons between C. neoformans and C. gattii genomes found 2 large inversions, 3 translocations, and extensive rearrangements in C. neoformans [30, 33•].

C. neoformans undergoes ploidy changes during sexual development and in response to various environmental and host cues [26]. During infection, the haploid C. neoformans can form polyploid titan cells [34] and form diploid blastospores during unisexual reproduction [26]. These genomic variations correspond with phenotypic differences and alter transcriptional regulation, signal transduction, and glycolysis pathways, impacting the course of infection [35]. Segmental aneuploidy has been detected on multiple chromosomes, which conferred azole resistance in some isolates during host infection [26]. Aneuploidy formation in C. neoformans may be related to an increased rate of transposon movement [29].

Cryptococcus gattii

Genome

The most complete C. gattii assembly contains 14 chromosomes and 18.4 Mb with eight internal gaps [36]. A number of other strains and variants have been sequenced, but they remain incomplete scaffolds. The genome structure is highly conserved across C. gattii variants, with on average only 7% sequence divergence among C. gattii VGI and VGII genomes [36]. Across all four C. gattii variants, ~ 87% of the genome has been identified as a core set of genes [37]. The limited genome evolution of C. gattii has not changed genome size or structure but has instead acted on conserved gene families, like drug transporters, and gene expansions that likely facilitate survival in the human host [37].

Candida

Candida encompasses non-pathogenic species, harmless commensals or endosymbionts, and pathogens of humans and plants. Several Candida species can cause superficial infections, systemic fungemia, or invasive candidiasis. C. albicans, a normal constituent of the human skin, gastrointestinal, and genitourinary tracts, causes the majority of Candida bloodstream infections but other non-albicans Candida species, including C. glabrata, C. parapsilosis, C. tropicalis, C. krusei, and C. auris are responsible for an increasing number of cases [38]. Resistance to commonly used antifungals may explain the rise in cases caused by other Candida species [39].

A polymorphic fungus, Candida is able to express several different morphologies. Generally, the environmental yeast-phase of Candida species switches to a multicellular filamentous form during infection [40]. The highest genetic diversity is observed in species that are most frequently human commensals—C. albicans, C. tropicalis, and C. glabrata [41]. Below, we discuss the genome characteristics of C. albicans, non-albicans Candida, and the emerging pathogen C. auris.

Candida albicans

Disease and Diversity

C. albicans is the most prevalent human fungal pathogen. It is the fourth most common hospital-acquired infection in the USA and is responsible for nearly half a million life-threatening infections annually, primarily in immunocompromised individuals [42]. Multi-locus sequence typing split C. albicans into 17 predominantly clonal populations that separate independent of geography [43]. In C. albicans, C. tropicalis, and C. parapsilosis, the CUG codon is translated to serine instead of leucine [44]. C. albicans demonstrates a wide range of morphological forms (yeast, true hyphae, pseudohyphae, and chlamydospores) that likely aid in its survival, growth, and dissemination throughout its mammalian host as a commensal and pathogen.
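The practical consequence of the CUG reassignment for sequence analysis can be shown with a minimal Python sketch. It builds the standard nuclear codon table and a variant in which CTG (the DNA form of CUG) encodes serine, then translates the same open reading frame with both. The short sequence is a hypothetical example, and the function names are our own; real pipelines would typically select the alternative yeast nuclear code in an established toolkit rather than hand-build a table.

```python
BASES = "TCAG"
# Standard nuclear genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(overrides=None):
    """Return a codon -> amino acid map, optionally reassigning specific codons."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD_AA))
    table.update(overrides or {})
    return table


STANDARD = build_table()
CTG_CLADE = build_table({"CTG": "S"})  # CUG read as serine instead of leucine


def translate(dna, table):
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


orf = "ATGCTGAAATAA"  # hypothetical ORF containing one CTG codon
print(translate(orf, STANDARD))   # MLK
print(translate(orf, CTG_CLADE))  # MSK
```

Annotating a genome from one of these species with the standard table would therefore mistranslate every CUG-containing gene, which is why the translation table must be set explicitly.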

Genome

Multiple sequencing efforts have assembled the diploid C. albicans genome [45,46,47]. Long-read sequencing generated a haploid assembly of pathogenic C. albicans [48] and a diploid assembly for environmental C. albicans [49]. C. albicans is naturally diploid with a 14 to 16 Mb haploid genome organized into eight pairs of chromosomes [45]. However, C. albicans can maintain stable ploidy states ranging from haploid to tetraploid [50•].

Chromosomal rearrangements, aneuploidy, point mutations, and loss of heterozygosity (LOH) contribute to C. albicans genome plasticity and have been extensively reviewed [51, 52, 53•]. C. albicans is heterozygous with more than 1% nucleotide divergence between isolates [54]. Excessive polymorphisms are present on chromosomes 5 and 6 with low instances of polymorphism found on chromosomes 3 and 7 [45]. Host pressures and other stressors, like exposure to antifungals, can result in a temporary increase in C. albicans ploidy, driving diploid cells up to 16N [55]. In patients treated with azoles, C. albicans aneuploidy frequency increased over time [56]. Additional stressors may also lead to non-disjunction events as C. albicans often loses chromosome 5 when forced to grow on sorbose and strains that are resistant to fluconazole have frequently lost chromosome 4 or gained chromosome 3 [57].

Although the vast majority of mutational events occur somatically, mating and parasexual mating are strong drivers of genetic diversity in C. albicans [58]. C. albicans primarily reproduces through asexual clonal division, but the machinery needed for mating and meiosis has been retained [59•]. However, the products of diploid C. albicans mating are tetraploid and carry out “concerted chromosome loss” by losing chromosomes at random until they reach a near-diploid genome [60, 61].

Non-albicans Candida (NAC) species

The non-albicans Candida (NAC) species C. glabrata, C. tropicalis, C. parapsilosis, and C. krusei are increasingly responsible for candidiasis globally [39]. C. glabrata and C. krusei were recently renamed as Nakaseomyces glabrata and Pichia kudriavzevii, respectively; however, we have maintained the former naming scheme in this review article to align with previously published literature. Their prevalence varies with geographical location: C. glabrata infections are highest in Asia–Pacific and Europe, C. tropicalis is the leading cause of infection in Africa and the Middle East, and C. parapsilosis is the predominant cause of infection in North America and Latin America [38].

Candida glabrata (Nakaseomyces glabrata)

Typically a harmless commensal, C. glabrata can cause superficial mucosal and serious disseminated infections in older, immunosuppressed patients, and those with diabetes [62, 63]. Phylogenetically, C. glabrata is more closely related to Saccharomyces cerevisiae than to C. albicans [64]. C. glabrata is haploid, and its completed genome has 13 chromosomes with a total size of 12.3 Mb [65•, 66]. Most of the genomes sequenced recover between 97.3 and 98.7% of the genes annotated in the reference genome, showing little variation in gene content [67]. Genetic variation in C. glabrata results from copy number variation, aneuploidy, or single-nucleotide polymorphisms and affects biofilm formation, GPI-anchored cell wall adhesins, and protease expression [65•, 68].

Candida tropicalis

C. tropicalis is a globally distributed opportunistic fungal pathogen found in numerous ecological environments [69]. Primarily infecting neutropenic patients, C. tropicalis is the most common cause of candidiasis in Southeast Asia and Africa and second most common species in Central and South America [69]. C. tropicalis isolates are genetically diverse and have arisen from disparate environments, with no clear geographic separation [70]. First sequenced in 2009, the diploid C. tropicalis genome is 14.6 Mb across seven pairs of chromosomes [71•]. Interestingly, early research identified 12 chromosomes in C. tropicalis with chromosomal length polymorphisms between three strains, suggesting that chromosomal rearrangements occur frequently in C. tropicalis [72]. Like C. albicans, C. tropicalis has a known parasexual cycle that often results in a high level of aneuploidy [73]. Single-nucleotide polymorphisms and copy number variants, including ERG11 and TAC1, were present in fluconazole-resistant isolates, indicating that stress and selection pressure are mechanisms through which C. tropicalis may acquire resistance [70].

Candida parapsilosis

In contrast to most other Candida species, C. parapsilosis cases are higher in neonates [63]. C. parapsilosis infections are increasing because of its global distribution, broad range of virulence factors, and antifungal resistance. The completed diploid genome of C. parapsilosis has 8 chromosome pairs spanning 13 Mb [74]. With low levels of heterozygosity, there is little evidence for significant diversity among C. parapsilosis isolates [74, 75]. Multi-locus sequence typing divided C. parapsilosis into three distinct species: C. parapsilosis, C. orthopsilosis, and C. metapsilosis [76]. Additional sequencing of clinical strains discovered hybrids between these species, with major translocations occurring between C. parapsilosis and C. orthopsilosis chromosomes [77]. In both C. parapsilosis and C. orthopsilosis, expansion of cell wall gene families involved in biofilm formation has been associated with increased virulence [78].

Candida krusei (Pichia kudriavzevii)

C. krusei is an opportunistic fungal pathogen of high medical importance because of its natural resistance to fluconazole [79]. Causing invasive candidiasis in immunocompromised individuals, C. krusei responds poorly to antifungal therapies and has a mortality rate up to 58% [79]. While genetically split into two clusters, different populations of C. krusei co-exist in the same geographic environment [79]. C. krusei is a diploid, highly heterozygous yeast, and its first assembly contained 626 contigs covering 10.4 Mb [80]. PFGE analysis estimates that C. krusei has 4 to 6 chromosomes and a genome size of 11.4 Mb [81]. Compared to other Candida species, C. krusei is understudied, and the genomic mechanisms supporting its high genetic diversity have not been investigated. Exposure to antifungal agents is believed to act as a selection factor and may play a role in the evolution of C. krusei biofilm formation [79].

Candida auris

Disease and Diversity

C. auris is a newly emerging human fungal pathogen that poses a significant threat because it rapidly develops resistance to antifungals and spreads easily through hospital environments on skin and surfaces. C. auris mainly manifests as a bloodstream infection, but it is also found in wound and ear infections [82]. Diagnosing a C. auris infection requires molecular methods, which are not always available, contributing to an underestimation of the global spread of C. auris [82]. C. auris is a thermotolerant, multidrug-resistant ascomycete, with 80.8% of strains showing resistance against fluconazole, 38.1% against voriconazole, and 26.2% against amphotericin B [83].

First described in 2009, C. auris has spread across six continents with outbreaks occurring in more than 30 countries [82, 83]. Genomic analyses have confirmed a near-simultaneous evolution of C. auris in multiple areas around the world [84•]. C. auris has been separated into 5 genetically distinct, geographically distributed clades: South Asian (Clade I), East Asian (Clade II), African (Clade III), South American (Clade IV), and Iranian (Clade V) [84•, 85].

Genome

The majority of C. auris assemblies remain highly fragmented and inconsistently annotated. A haploid ascomycete, C. auris has a 12.1 to 12.7 Mb genome spread across five to seven chromosomes [86•]. Each clade differs from the other four by tens of thousands of single-nucleotide polymorphisms but exhibits a highly clonal population structure within the clade, with on average fewer than 70 single-nucleotide polymorphisms within each geographic cluster, even in isolates thousands of miles apart [84•, 87]. Comparisons of Clades I through IV (comparisons with Clade V have not been reported) show a high level of similarity, with a shared 98.7% nucleotide identity [86•]. Clade II is the most rearranged, with two inversions and nine translocations, but is most similar to Clade III, with a 99.3% shared identity [86•]. Conservation of C. auris as a species complex is supported by its more distant relationship to other Candida species, with on average 88% similarity to its closest relatives, C. haemulonii, C. duobushaemulonii, and C. pseudohaemulonii [86•].
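The within-clade and between-clade comparisons above rest on two simple quantities, pairwise SNP distance and percent nucleotide identity. The Python sketch below shows one way to compute both from sequences that are already aligned to the same coordinates. The isolate names, clade labels, and ten-base sequences are hypothetical toy data, and the function names are our own; genome-scale analyses work from variant calls against a reference rather than raw strings.

```python
from itertools import combinations

VALID = set("ACGT")


def snp_distance(a, b):
    """Count mismatches between two aligned sequences, ignoring gaps and ambiguous bases."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x in VALID and y in VALID)


def percent_identity(a, b):
    """Percent of comparable aligned positions that are identical."""
    compared = sum(1 for x, y in zip(a, b) if x in VALID and y in VALID)
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * (compared - snp_distance(a, b)) / compared


# Hypothetical toy isolates: name -> (clade label, aligned sequence).
isolates = {
    "I_a": ("I", "ACGTACGTAC"),
    "I_b": ("I", "ACGTACGTAT"),
    "III_a": ("III", "ACGAACCTAC"),
}

within, between = [], []
for (_, (clade1, seq1)), (_, (clade2, seq2)) in combinations(isolates.items(), 2):
    target = within if clade1 == clade2 else between
    target.append(snp_distance(seq1, seq2))

print("within-clade SNP distances:", within)
print("between-clade SNP distances:", between)
```

A clonal population structure shows up as small within-clade distances relative to between-clade distances, as in the toy output.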

C. auris genome variation results from changes in copy number and gain or loss of chromosomes as there is no evidence for alterations in ploidy states [87]. These mutations contribute to differences in antifungal resistance between C. auris clades and increased virulence. Compared to other Candida species, C. auris has higher resistance to cationic, cell wall, and oxidative stressors and can maintain viability and higher proteinase and phospholipase activity at 42 °C [88].

C. auris genomes have conserved mating loci, but only one of the two mating types, MTLa or MTLα, has been detected in each clade [83]. Mating between clades has not yet been reported, but in countries where multiple clades have been identified, mating may occur where MTLa and MTLα strains are no longer geographically separated [83, 88].

Endemic Fungal Pathogens

Thermally dimorphic fungal pathogens, which alter their morphology and virulence in response to temperature, are responsible for hundreds of thousands of infections and deaths annually [89]. Globally distributed, but geographically and ecologically restricted, these organisms exist in the environment as saprotrophic hyphae that transition to parasitic forms (yeasts or spherules) in mammalian hosts [90]. These pathogens are all found within the phylum Ascomycota, but are spread across a number of orders, exemplifying the convergent evolution of dimorphism and pathogenesis in fungi [90]. Assessing the global burden of these diseases is difficult, but mortality rates can reach up to 70% for infected individuals [89]. Below, we discuss the genome characteristics of the thermally dimorphic fungi: Histoplasma, Blastomyces, Coccidioides, Paracoccidioides, and Sporothrix.

Histoplasma

The fungal pathogen Histoplasma is found on every continent. It causes mild flu-like symptoms in most people but the infection may develop into a life-threatening systemic disease, especially for immunocompromised individuals. Previously, Histoplasma was divided into three varieties based on clinical presentation, morphology, and geographic distribution: H. capsulatum var. capsulatum, responsible for pulmonary histoplasmosis; H. capsulatum var. duboisii, responsible for African histoplasmosis; and H. capsulatum var. farciminosum, responsible for equine histoplasmosis [91]. H. capsulatum associates with river valleys, particularly in the Central and Eastern United States and Central and South America, while H. duboisii is primarily found in Africa [92].

Phylogenetic analyses have revealed at least eight clades that are tightly associated with specific geographical regions: North American classes 1 and 2 (NAm 1 and NAm 2), Latin American groups A and B (LAm A and LAm B), Eurasian, Netherlands, Australian, African [93, 94], and a recently identified Indian lineage [95]. The LAm groups were later divided into six phylogenetic groups [96]. Speciation and admixture have been shown between Histoplasma isolates [97,98,99]. Comparative genetic analyses have suggested new nomenclature for H. capsulatum as four new subspecies: H. capsulatum (Panama or H81 lineage), H. mississippiensis (NAm 1), H. ohiensis (NAm 2), and H. suramericanum (LAm A) [97].

Early studies identified 5–7 chromosomes [100]. The original genome assembly contained > 3000 contigs spanning 43.5 Mb across the highly repetitive Histoplasma genome (strain G217B) [101]. Completed assemblies of 5 Histoplasma strains revealed genomes ranging in size from 31 to 40 Mb due to differences in repeat content with extensive synteny among geographically segregated isolates [102•]. The observation of transposon and transposon-embedded gene upregulation in the yeast phase of strain G217B suggests that repetitive DNA may play a role in the dimorphic lifestyle [102•].

Blastomyces

Blastomyces dermatitidis and Blastomyces gilchristii are the etiological agents of blastomycosis, an invasive fungal infection in humans. Identifying the environmental niche that Blastomyces inhabits has proven elusive, but epidemiological data suggest that Blastomyces species live in soil and wet, decaying wood [103]. B. gilchristii is primarily found in Canada and the Northern United States [104]. B. dermatitidis is endemic to Eastern North America, found from northern Ontario to the Mississippi and Ohio River Valleys, but its range is expanding toward the Appalachian Mountains and the Eastern United States [103]. The genome of B. dermatitidis is incompletely sequenced, with four strains represented by up to ~ 4000 scaffolds. Only one isolate of B. gilchristii has been sequenced, with a genome scaffold of ~ 1800 contigs. Compared to other fungi, the gene content of Blastomyces species is highly conserved, but the genome contains large, highly variable repetitive long terminal repeat transposon regions [102•, 105]. An increase in gene copy number is likely associated with gene expression changes in proteases, antioxidants, and trace metal acquisition, which are involved in host interactions and virulence [105].

Paracoccidioides

Paracoccidioides brasiliensis and Paracoccidioides lutzii are responsible for paracoccidioidomycosis, a disease that forms granulomas in the nose, sinuses, and skin. Up to 80% of cases occur in Brazil, with the severity of disease increasing in patients with HIV and other immunocompromised patients [106]. Four genomes of P. brasiliensis and one genome of P. lutzii have been sequenced and assembled to the scaffold level with ~ 2000 contigs [106, 107]. Paracoccidioides species have haploid genomes that vary from 29.1 to 32.9 Mb and are highly divergent [107,108,109,110]. Gene family expansions specific to Paracoccidioides include the fungal-specific kinase family and genes encoding secreted proteins, with gene losses in cell wall and carbohydrate metabolism detected across dimorphic fungal pathogens [107, 111].

Coccidioides

Coccidioides immitis and Coccidioides posadasii are the etiological agents of coccidioidomycosis, also known as valley fever. Coccidioides is endemic to the Southwestern United States and Mexico; an estimated 60% of infections are asymptomatic, with less than 1% of patients developing disseminated disease [112]. Although morphologically identical, C. immitis and C. posadasii are genetically distinct [112, 113]. There are 5 scaffold genome sequences with at most ~ 4000 contigs available for C. immitis and 13 genome sequences available for C. posadasii, with one recent chromosome-level reference genome released [112, 114•]. Genomes for both C. immitis and C. posadasii are ~ 28 Mb organized into 9 chromosomes [114•]. Hybridization has occurred between the two species, mainly from C. posadasii to C. immitis, transferring coding genes that likely function in immune evasion and cell wall biosynthesis [112, 115]. C. posadasii is divided into two main clades: Clade I isolates are found in Arizona and Clade II isolates are found in Texas and South America [116]. Phylogenetic analyses of Coccidioides species have proven useful in molecular epidemiology studies [117].

Sporothrix

The common route of Sporothrix infection introduces spores through a cut or wound in the skin, as opposed to pulmonary routes. S. brasiliensis, S. schenckii, and S. globosa are found worldwide, but are endemic in Peru and Asia, which experience a higher incidence of disease [118]. There is a high level of similarity between Sporothrix genomes with an average sequence identity of 97.5% between S. schenckii and S. brasiliensis [119•]. There is one assembly for S. brasiliensis with 13 contigs spanning 33.2 Mb [106, 119•]. The S. globosa genomes have only been assembled to the scaffold-level with at most 571 contigs for the 33.5 Mb genome [120]. The S. schenckii genome has been assembled to 16 contigs, covering 32.8 Mb [121, 122]. S. schenckii has the greatest genetic variation and evidence of genetic recombination, but all Sporothrix species have lost polysaccharide lyase genes suggesting that they have switched from plant to animal hosts [119•].

Conclusion

Fungal genomics has been gaining importance in recent years. More than 50% of research articles cited in this review were published within the last 5 years, underlining the attainability of fungal genome sequencing and analysis tools. Accordingly, the next steps that will expand upon our understanding of fungal genetic diversity are to (1) generate complete telomere-to-telomere sequences for all notable pathogens and their non-pathogenic relatives, (2) expand the number of strains and isolates sequenced by carrying out clinical and environmental population-level analyses, and (3) establish a system for identifying and detecting emerging pathogens. Given our current understanding of genetic diversity in the fungi, one or a few reference genomes are insufficient for describing the full range of variation present in the population. With the reduction in cost of long-read sequencing, the number of complete fungal genome assemblies will continue to increase. The subsequent limiting factor will be characterizing the impacts of genetic variability on gene expression, translational efficiency, and function, which may shed light on the molecular mechanisms of fungal pathogenesis.