Introduction

Aromatic amines (AAs) are a group of chemicals used in industrial processes to produce many products, including polyurethane. These compounds are also by-products of chemical manufacturing (pesticides, dyes, rubbers or pharmaceuticals) and combustion. Many of them are cytotoxic or genotoxic (Kim and Guengerich 2005). It is, for example, estimated that 12% of the known carcinogenic substances in humans are AA (U.S. Department of Health and Human Services PHS, National Toxicology Program 2005). These chemicals may accumulate in the environment as xenobiotics, where they have noxious effects. This is especially true for 3,4-dichloroaniline (DCA). DCA is a by-product of aniline-derived herbicides such as diuron (Voelker et al. 2008). Indeed, in treated soil, diuron is transformed into DCA and is then acetylated by soil bacteria and fungi into the less toxic acetyl-DCA (Tixier et al. 2002; Westwood et al. 2005). Microorganisms are thus good candidates to be used for decontamination of AA-polluted soil, a process known as bioremediation.

Whilst attention was first paid to bacteria in soil AA-remediation studies (Rodrigues-Lima et al. 2006), much less is known about the potential of fungi. This is surprising since fungi are the dominant life form in topsoil in terms of biomass (Durrieu 1993), and they are known to carry out numerous metabolic reactions. Nevertheless, we recently showed (Martins et al. 2009) that the fungus Podospora anserina is able to acetylate DCA using two arylamine N-acetyltransferases (NATs), PaNAT1 and PaNAT2. Although both enzymes acetylate AA in vitro, only the PaNAT2 isoform appears to confer on P. anserina resistance to a wide range of AA in vivo. Interestingly, the P. anserina enzymes are efficient enough to enable this fungus to acetylate DCA in contaminated soil and thereby to bioremediate it (Martins et al. 2009). Another NAT enzyme has been characterized from Fusarium verticillioides (Glenn and Bacon 2009). This species infects maize and wheat in the field and is frequently found on stored seeds, probably because it is able to tolerate benzoxazolinones, antimicrobial compounds produced by these plants. It does so by hydrolysing 2-benzoxazolinone into 2-aminophenol through enzyme(s) encoded at the FDB1 locus. 2-Aminophenol is then acylated by the FDB2 enzyme into the nontoxic metabolite N-(2-hydroxyphenyl)malonamic acid. The FDB2 locus of F. verticillioides has been shown to encode a putative NAT enzyme (Glenn and Bacon 2009). To date, the full extent of NAT genes in fungal genomes and their role in fungal physiology are unknown.

NAT enzymes are found only in a subset of organisms (Vagena et al. 2008), and the reason for their presence or their absence in an organism is presently not clear. In our previous study (Martins et al. 2009), we reported that NATs are present in a wide range of fungi, but did not perform a phylogenetic analysis to understand how NAT genes have evolved in this important group of eukaryotes. The availability of 96 genome sequences (Table 1) and the recent deciphering of the fungal phylogeny (Hibbett et al. 2007; James et al. 2006) enable us to perform an analysis of NAT genes in Eumycetes. We present here the results of a phylogenetic survey of NAT genes in all the available fungal genomes. Data analysis reveals a complex evolutionary pattern for NAT genes in Eumycetes, including recurrent gene amplifications and losses, as well as one possible horizontal transfer from a bacterium.

Table 1 NAT genes in fungal genomes

Materials and Methods

Available fungal genomes and deduced protein sets from complete sequencing projects (Table 1) were searched for the presence of putative NAT genes by TBLASTN and BLASTP with e-value cutoffs of 10^-4 and 10^-6, respectively (Altschul et al. 1990). Representative NAT genes from all the species were used as queries on genomic sequences in which no isoform was detected in the first round of queries. Because automatic annotation often fails to give accurate intron splicing boundaries, each putative NAT gene was manually re-annotated. Manual annotation was based on sequence conservation with the closest species as well as on consensus sequences for intron boundaries (Kupfer et al. 2004; Silar et al. 2003). Putative functional genes as well as putative pseudogenes were detected (Table 1). Pseudogenes were validated by checking the trace chromatograms to ensure that the mutations observed in the coding sequences were not sequencing errors. A few genes were incomplete due to gaps in the genomic sequences (Table 1). The deduced protein sequences for complete bona fide genes were aligned with the MUSCLE algorithm with the default parameters (Edgar 2004; Supplementary Fig. 1 online). The Neighbour Joining tree was made with MEGA4 (Tamura et al. 2007) with the default parameters (complete deletion, Poisson correction and uniform rates amongst sites) and 1,000 bootstrap replicates. The PhyML tree (Guindon and Gascuel 2003) was constructed using the www.phylogeny.fr website with the default parameters (Dereeper et al. 2008) and 100 bootstrapped data sets. Trees were visualized with the iTOL server (Letunic and Bork 2007). Both methods yielded similar results (Fig. 1, Supplementary Fig. 2 online). In all the annotated proteins, the Cys-His-Asp catalytic triad was conserved, indicating that the enzymes are likely to be functional (Supplementary Fig. 1 online). Positions of introns were mapped onto the alignment (Supplementary Fig. 1 online).
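To make the distance settings named above more concrete, the following minimal Python sketch shows what "complete deletion" and "Poisson correction" usually mean for a protein alignment: every alignment column containing a gap in any sequence is discarded, the proportion p of differing sites is computed for a pair of sequences, and the distance is d = -ln(1 - p). This is an illustration only and not the MEGA4 implementation; the function names and the toy alignment are hypothetical.

```python
import math


def complete_deletion(seqs, gap="-"):
    """Drop every alignment column in which at least one sequence has a gap."""
    if not seqs or len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    keep = [i for i in range(len(seqs[0])) if all(s[i] != gap for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p) between two gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    # p is the observed proportion of sites at which the two sequences differ.
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 1.0:
        raise ValueError("distance undefined when all sites differ")
    return -math.log(1.0 - p)


# Hypothetical toy alignment (not taken from the NAT data set).
toy_alignment = ["MKC-HDA", "MRCAHDA", "MKCAH-A"]
cleaned = complete_deletion(toy_alignment)
distance_0_1 = poisson_distance(cleaned[0], cleaned[1])
```

In the toy alignment, two of the seven columns contain a gap and are removed, so the first two sequences differ at one of the five remaining sites.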

Fig. 1

Maximum Likelihood phylogenetic tree of fungal NAT genes. Tree generated from the alignment of Supplementary Fig. 1 online by PhyML (Guindon and Gascuel 2003) with 100 bootstrap replicates. NAT families I–VII are from Pezizomycotina. The arrow indicates the M. globosa NAT gene.

Results and Discussion

Complex Set of Gain and Loss of NAT Genes in Eumycete Fungi

Metazoa is the sister group of Eumycetes (James et al. 2006). Most deuterostomian animals possess NAT genes, whilst other animals (Cnidarian, Acoelomata, Pseudocoelomata and Protostomia) do not (Vagena et al. 2008). Yet, the dog, a deuterostomian, lacks NAT genes (Trepanier et al. 1998). The presence of NAT in animals was once thought to be due to horizontal transfer, but this hypothesis was immediately challenged (see Genereux and Logsdon 2003 for a review) and is not favoured anymore (Rodrigues-Lima and Dupret 2002). A patchy phylogenetic distribution similar to that found in animals is also found in fungi. First, no NAT gene is detected in the genomes of unicellular protists related to animals and fungi, i.e. Monosiga brevicollis and Proterospongia sp. ATCC50818, two choanoflagellates, as well as Capsaspora owczarzaki, a mesomycetozoan (Table 1). Second, NAT genes are also absent from the genomes of all the investigated lower fungi (Spizellomycetales, Blastocladiomycota, Microsporidia and Mucoromycotina), except for the genome of Batrachochytrium dendrobatidis that contains a single NAT gene (Table 1). Lack of NAT in the Mucoromycotina is in line with their sensitivity to AA (Martins et al. 2009). Third, NAT genes are found in both Basidiomycota and Ascomycota, albeit with a patchy phylogenetic distribution (Table 1).

The presence of NAT genes in both animals and fungi suggests that they were present in the common ancestor of both groups and lost independently in some lineages, as evidenced by the detection of pseudogenes in fungal genomes (Table 1). Trees for all the fungal full-length proteins and selected NAT from other organisms were constructed to reconstruct their evolution (Fig. 1, Supplementary Fig. 2 online). Unfortunately, due to the small size of the NAT proteins and their rapid evolution, the branching order for the most divergent NAT is not defined with good statistical support, especially when the whole diversity of fungi is considered. This makes it impossible to decide whether the present patchy distribution is due to artefacts in the methods used for phylogenetic analysis, to recurrent gene losses or to multiple horizontal transfers early during evolution. For example, in both analyses, the B. dendrobatidis protein branches within the NAT from Basidiomycetes, which could be indicative of horizontal transfer or may result from long-branch attraction. However, apart from this discrepancy and that of a probable horizontal transfer from a bacterium to Malassezia globosa (see below), the NAT gene phylogeny follows that known for species, an argument in favour of an early presence of NAT genes followed by recurrent gene losses. Figure 2 proposes a plausible scenario for NAT gene evolution based on the data of Table 1 and the known phylogeny of fungi. Whatever the scenario for the early evolution of NAT, additional recent gene losses are clearly needed to explain the lack of NAT genes in particular organisms, such as the dog and some species of higher fungi (Table 1, Fig. 2).
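The reasoning behind an "early presence followed by recurrent losses" scenario can be sketched as a small counting exercise. Assuming that the gene was present in the common ancestor and was never regained, the minimum number of loss events on a species tree equals the number of maximal subtrees in which no species carries the gene. The Python sketch below illustrates this count only; it is not the procedure used to draw Fig. 2, and the tree, the leaf names and the presence/absence values are hypothetical.

```python
def count_losses(tree, present):
    """Return (gene_seen, losses) for a tree given as nested tuples of leaf names.

    Assumes the gene was present at the root and can be lost but never regained.
    """
    if isinstance(tree, str):  # a leaf: losses are charged at the parent node
        return present[tree], 0
    results = [count_losses(child, present) for child in tree]
    if not any(seen for seen, _ in results):
        # The gene is absent from the whole subtree: one loss, counted higher up.
        return False, 0
    losses = sum(n for _, n in results)
    # Each child subtree entirely lacking the gene costs exactly one loss here.
    losses += sum(1 for seen, _ in results if not seen)
    return True, losses


def min_losses(tree, present):
    """Minimum number of losses, given ancestral presence at the root."""
    seen, losses = count_losses(tree, present)
    return losses if seen else 1


# Hypothetical toy tree and gene presence/absence (not the data of Table 1).
toy_tree = ((("A", "B"), "C"), ("D", ("E", "F")))
toy_presence = {"A": True, "B": False, "C": False,
                "D": True, "E": False, "F": False}
toy_losses = min_losses(toy_tree, toy_presence)
```

In this toy case three losses are needed: one on the branch to B, one on the branch to C and a single one on the branch leading to the E + F clade.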

Fig. 2

Possible scenario of NAT gene evolution. The figure gives the most parsimonious scenario of NAT gene gains and losses based on the data of Table 1 and the currently known evolution of fungi (Hibbett et al. 2007; James et al. 2006). Alternative scenarios involving additional gene losses, gene duplications or lateral gene transfers are possible. a NAT gene evolution in fungi other than Pezizomycotina. Note the presumptive losses of NAT genes in many phyla. Triangle: gene deletion; 2×: gene duplication; HGT: horizontal gene transfer. b NAT gene evolution in Pezizomycotina. The NAT isoforms putatively present at the origins of the different classes are indicated on the left. This scenario suggests a highly dynamic pattern of gene gains and losses. Deletion (triangle) or duplication (2×) of specific isoforms is indicated above the corresponding branches. The lengths of the branches do not reflect the evolutionary distances between the species.

In the higher fungi, NAT genes also present a patchy phylogenetic distribution and are found mostly in filamentous species (Table 1). Indeed, whilst they are present in the Pucciniomycotina yeasts and M. globosa, they are absent in all the other yeasts, i.e. Cryptococcus neoformans and the Taphrinomycotina and Saccharomycotina species. This pattern is in line with the numerous gene losses that accompany the yeast lifestyle (Arvas et al. 2007; Zeng et al. 2001). Intriguingly, a NAT activity has been reported in Candida albicans (Fang et al. 1997). It is likely that other enzymes supply the NAT activity in this species, whose genome does not contain a canonical NAT gene. Therefore, a species lacking canonical NAT genes may still retain some NAT activity carried out by another acetyltransferase, although no such enzyme has yet been characterized enzymatically. In the filamentous Basidiomycota, NAT genes have been lost in many species of the higher Agaricomycotina (Fig. 2), whilst they are still present in Ustilaginomycotina and Pucciniomycotina. Interestingly, Basidiomycota genomes, and especially those of Agaricomycotina, are often poor in polyketide synthases (PKS) and non-ribosomal peptide synthases (NRPS), which are enzymes involved in secondary metabolism. For example, the genomes of the mycorrhizal fungus Laccaria bicolor and the saprobic Coprinopsis cinerea contain a single PKS as well as one and five NRPS, respectively, whilst the genome of the white rot Phanerochaete chrysosporium has no PKS and three NRPS. Similarly, yeasts that do not contain NAT are devoid of PKS and contain few NRPS. In line with these observations, the most diverse array of NAT genes is found in the higher Ascomycota, the filamentous Pezizomycotina, which are endowed with numerous PKS and NRPS genes involved in the production of secondary metabolites (Bushley and Turgeon 2010; Kroken et al. 2003).

The Genomes of Higher Ascomycetes Contain Numerous NAT Genes

Most species of Pezizomycotina contain at least one NAT gene and often several, with up to five functional paralogs and two pseudogenes in Haematonectria (Nectria) haematococca. Yet a few species appear devoid of NAT genes (Table 1), which can only be accounted for by gene loss in these fungi. The history of NAT genes in Pezizomycotina appears quite complex since sequence comparison permits the definition of at least seven groups (I–VII) supported by strong statistical significance (Fig. 1, Supplementary Fig. 2 online). However, the branching order between these families is poorly resolved, and differs between the Neighbour Joining and PhyML methods. Intron presence and conserved position (see Supplementary Fig. 1 online) support a common ancestor for these genes and are in accordance with their grouping into seven families. Genes belonging to family IV usually have two introns (I1 and I2, Supplementary Fig. 1 online) with conserved positions, except in Coccidioides immitis, C. posadasii and Uncinocarpus reesii, which only have I1. Since these three species are closely related, a single loss suffices to explain the lack of I2. I1 is present in many genes of families I, II, III and VII, whilst a few genes of the same families lack I1. The intron pattern of Pezizomycotina NAT genes is thus best explained by recurrent intron loss, starting with I2 and then I1. However, there appear to be two exceptions, the Mycosphaerella fijiensis Mycfi1|51513 and Cryphonectria parasitica NA2 genes. M. fijiensis Mycfi1|51513 possesses two introns located at unique positions and may have gained its introns recently, as seen for other fungal genes (Nielsen et al. 2004). The intron of C. parasitica NA2 is shifted by 7 nucleotides towards the 3′ end of the gene, a phenomenon known as intron sliding (Rogozin et al. 2005). It is noteworthy that none of the intron positions in Pezizomycotina genes is shared with those of Basidiomycota genes (Supplementary Fig. 1 online). The selection pressure and the mechanisms for intron gain and loss in fungi are still controversial (Nielsen et al. 2004).

Some NAT families are restricted to a specific clade of Pezizomycotina, whilst others are present in the whole diversity of Pezizomycotina. Family II is found only in Sordariomycetes, families IV and VII in Eurotiomycetes and family VI in Dothideomycetes. On the contrary, family I is found in the four investigated classes of Pezizomycotina. However, whilst it is frequent in Sordariomycetes and possibly Leotiomycetes, it is rare in Eurotiomycetes and Dothideomycetes. Retracing the evolution of the NAT families (Fig. 2) is difficult since they are too divergent to obtain good statistical support for branching order (Fig. 1, Supplementary Fig. 2 online). Interestingly, closely related species may harbour NAT genes belonging to different families. For example, Aspergilli may contain genes of families III, IV and VII (A. niger) or of families I, III and IV (A. oryzae). Conversely, paralogs resulting from duplication events may be present in the same species, as seen in H. haematococca, which contains two functional paralogs of family I, two corresponding pseudogenes inactivated by mutations and two functional paralogs of family II. Overall, this complex pattern of absence/presence of NAT paralogs suggests intense gene duplications and losses during the evolution of Pezizomycotina (Fig. 2), as observed for other protein families (Espagne et al. 2008; Wapinski et al. 2007). Amongst all these NAT, only the activities of PaNAT1/PaNAT2 of P. anserina (families I and II, respectively) and F. verticillioides FDB2, which belongs to family III (Table 1), have been explored (Glenn and Bacon 2009; Martins et al. 2009). In both species, the enzymes are involved in the resistance to toxic compounds. It is known from studies in Streptomycetes that NAT-related enzymes may function as amide synthases that catalyse the ring closure of macrocyclic chemicals produced by PKS, such as rifampicin (see Sim et al. 2008 for a review). There is thus a possibility that some enzymes detected in Pezizomycotina may act in secondary metabolite production. Intriguingly, there seems to be no correlation between the known lifestyle of the fungi and the presence of one or several NAT genes (Table 1), suggesting that NAT genes respond to particular demands, as exemplified by the FDB2 enzyme of F. verticillioides, which may facilitate the growth of this species on some cereals. It is likely that a finer knowledge of the biotope and lifestyle of the various fungi will be necessary to understand the selection pressure that may lead to the retention or loss of NAT genes during evolution.

A Horizontal Transfer from a Bacterium to Malassezia globosa

Amongst the NAT genes present in Eumycota, the one present in M. globosa (MGL_4060) stands out because it branches with high statistical support amongst bacterial genes: a bacterial NAT clade including MGL_4060 receives 81 and 87% bootstrap support in the Neighbour Joining and PhyML analyses, respectively (Fig. 1, Supplementary Fig. 2 online). It is, therefore, likely that this gene results from a recent transfer from a bacterium to M. globosa. A BLASTP search of GenBank shows that the NAT most similar to the M. globosa NAT is encoded by the NAT1 gene of Mesorhizobium loti (Rodrigues-Lima et al. 2006). M. loti is unlikely to be the donor, as it lives in symbiosis with plant roots, whilst M. globosa is associated with several seborrheic conditions in humans, such as dandruff. The sequenced strain (Xu et al. 2007) derives from a patient affected with pityriasis versicolor. Therefore, in addition to gain by duplication, fungi may gain NAT genes by horizontal transfer, as has also been reported for PKS genes (Schmitt and Lumbsch 2009).
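For readers unfamiliar with bootstrap values, the support for a clade is commonly read as the percentage of trees, built from resampled alignments, in which that exact grouping is recovered. The short Python sketch below illustrates this bookkeeping only; it does not reproduce the MEGA4 or PhyML procedures, and the replicate trees and all labels other than MGL_4060 are hypothetical.

```python
def clade_support(replicate_trees, clade):
    """Percentage of replicate trees that contain `clade` as one of their groupings.

    Each replicate tree is represented as a set of frozensets of leaf labels.
    """
    if not replicate_trees:
        raise ValueError("at least one replicate tree is required")
    target = frozenset(clade)
    hits = sum(1 for groupings in replicate_trees if target in groupings)
    return 100.0 * hits / len(replicate_trees)


# Hypothetical replicates: three of four recover MGL_4060 with the two bacteria.
bacterial_clade = {"MGL_4060", "bact_1", "bact_2"}
toy_replicates = [
    {frozenset(bacterial_clade), frozenset({"fung_1", "fung_2"})},
    {frozenset(bacterial_clade)},
    {frozenset({"MGL_4060", "fung_1"})},
    {frozenset(bacterial_clade), frozenset({"bact_1", "bact_2"})},
]
toy_support = clade_support(toy_replicates, bacterial_clade)
```

With these toy replicates the grouping is recovered in three trees out of four, giving a support of 75%.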

Lack of NAT Genes in Oomycetes and Presence in Related Algal Groups

The fungal lifestyle has evolved in another group of eukaryotes, the Oomycetes. Although similar to Eumycetes in appearance and ecology, these organisms possess quite different biological properties and are related to diatoms and other algae of the Chromista eukaryotic supergroup (Lamour et al. 2007). It has been proposed that genes were horizontally transferred from Eumycetes to Oomycetes during their long evolutionary cohabitation (Richards et al. 2006). To test whether NAT genes were transferred, we mined the sequenced genomes of Oomycetes (Lamour et al. 2007). BLAST analysis did not reveal any NAT gene in the genomes of four oomycetes (Hyaloperonospora parasitica at http://genomeold.wustl.edu/index.cgi, Phytophthora sojae, P. ramorum and P. infestans at http://genome.jgi-psf.org/) nor in the genome of Thalassiosira pseudonana, a diatom related to oomycetes (at http://genome.jgi-psf.org/). However, one NAT gene is present in the diatom Phaeodactylum tricornutum (this gene has not been detected by the automatic annotation), two in the genome of Fragilariopsis cylindrus, another diatom (Fracy1|264642 and Fracy1|243479 at http://genome.jgi-psf.org/) and at least two in the more distantly related haptophyte, Emiliania huxleyi (Emihu1|222756 and Emihu1|451400 at http://genome.jgi-psf.org/). Therefore, NAT genes present the same patchy distribution in Chromista as observed in Opisthokonts and have likely not been transferred from Eumycetes to Oomycetes.

Conclusion

The genomes of filamentous Ascomycetes contain numerous NAT genes, unlike those of the other fungal clades. The fungal NAT genes are most likely derived from an ancestral gene that was present in the ancestor of all the opisthokonts, except for one gene potentially resulting from the horizontal transfer from a bacterium to M. globosa. During evolution, the fungal NAT gene family has undergone a complex history of duplications and gene losses (Fig. 2), as observed in other fungal gene families. The role of NAT in fungi is not clear. In the investigated cases, NAT enables fungi to resist toxic metabolites, be they xenobiotics or plant secondary metabolites. However, the presence of NAT genes follows a similar pattern of taxonomic distribution and amplification as seen for PKS and NRPS genes in fungal genomes, suggesting that some NAT could be involved in the production/modification of fungal secondary metabolites.