Introduction

Filamentous fungi have played an important role in the history of drug discovery and development. The secondary metabolites (SMs) that these organisms produce have served as a source of low molecular weight molecules with a variety of biological activities. Examples of these are antibiotics such as penicillin, immunosuppressants such as cyclosporine, antifungals such as griseofulvin and the echinocandins, and antihypercholesterolemic drugs such as lovastatin [11, 24, 35]. Many of the bioactive SMs that are easily accessible under conventional laboratory conditions have already been isolated and patented for drug development. However, advances in genome sequencing [23, 32, 38, 40] revealed that fungal species harbor an abundance of SM gene clusters and these far exceed the number of known metabolites produced by the species [45]. This potential abundance of SMs may reflect their importance in nature as a chemical arsenal for niche security [41]. The carefully controlled growth conditions in laboratory culture settings prevent any competition or life-threatening circumstances that would trigger the production of SMs, thereby leaving many of the gene clusters dormant. Activating these silent gene clusters, revealing their biosynthetic pathways, and isolating the SMs produced by these pathways is a major challenge in the search for new SMs.

Various approaches have been taken in attempts to activate silent SM gene clusters [14], including fusion of regulatable promoters to a pathway-specific transcription factor [5, 17], removal of genes required for heterochromatin formation [7], genome-wide analysis of mutants of LaeA, a global regulator of SM [8], co-incubation with microorganisms to mimic conditions in nature [48], and the “one strain many compounds” (OSMAC) strategy [6]. Most of these approaches were developed in Aspergillus nidulans because highly efficient gene-targeting systems are available in this model organism. The developed approaches are often subsequently applied to other filamentous fungi.

In this review, we focus on recent advances in genome mining of secondary metabolism genes in A. nidulans. We also describe the current status of the annotation of the products of secondary metabolism genes in A. nidulans. We would also like to direct readers to the accompanying review in this issue by our collaborators Nancy Keller and Philipp Wiemann on general strategies for mining fungal natural products and to other recent reviews on this subject [10, 28, 53, 56, 61].

The status of annotating secondary metabolite genes in A. nidulans

Among the Aspergillus species, A. nidulans has been used as a model organism, making it the most comprehensively studied and best-characterized species in the genus with the largest body of literature. Most studies of secondary metabolite biosynthesis in A. nidulans have used strains derived from a common reference strain, A. nidulans FGSC A4. A. nidulans FGSC A4 was initially sequenced by Cereon Genomics (Monsanto) in 1998 to three-fold genome-equivalent coverage and the sequence was publicly released in 2003. Shortly thereafter, additional sequencing was completed at the Whitehead Institute/MIT Center for Genomic Research to give a total of 13-fold genome-equivalent coverage. The seminal paper describing the A. nidulans genome was published in 2005 [23]. Access to this sequenced genome has allowed investigators to use sequence similarity to known genes from other species to mine for core genes that are involved in secondary metabolism in A. nidulans. Algorithms such as SMURF (Secondary Metabolite Unknown Regions Finder) [27] and antiSMASH (antibiotics and Secondary Metabolite Analysis Shell) [34] are extremely useful in predicting the core SM biosynthetic genes. Taking into consideration the most recent annotation and additional analysis of available genomic data, our group’s most recent estimate is that the A. nidulans genome contains 56 putative secondary metabolism core genes: 27 polyketide synthase (PKS) genes, two PKS-like genes, 11 nonribosomal peptide synthetase (NRPS) genes, 15 NRPS-like genes, and one hybrid NRPS-PKS gene. Table 1 and Fig. 1 show our current understanding of the products of these genes and the products from the pathways.

Table 1 Secondary metabolism gene clusters in A. nidulans
Fig. 1 Structures of compounds isolated from A. nidulans

Bioinformatic advances

Since the original publication of the genome sequence data [23], A. nidulans gene annotations have been refined repeatedly to correct incomplete or inaccurate content [3, 4, 25, 39, 57]. The Aspergillus Genome Database (AspGD; http://www.aspgd.org/) provides gene and protein sequence data that are curated based on submitted information and published literature. Although the wealth of data and the availability of the algorithms mentioned previously have provided accurate predictions of core SM biosynthetic genes, it is still not possible to predict with accuracy the boundaries of secondary metabolite gene clusters or the functions of each member of the clusters based solely on genome sequence data. This is because many of the genes surrounding the core SM biosynthetic genes have unknown functions, making predictions of their involvement in SM biosynthesis almost impossible. Elucidation of biosynthetic gene clusters has thus been heavily dependent on experimental verification, a laborious process that involves deleting, one at a time, each gene with a suspected role in SM biosynthesis, followed by identification and characterization of the SMs produced by the deletion strains. Improvements in “omics”-based methods for accurate prediction of SM gene cluster members and the availability of more precise annotations are desirable for more rapid and efficient experimental verification of novel SM gene clusters.

Andersen et al. [2] recently published a novel strategy for the accurate prediction of SM gene cluster boundaries based on the observation that expression of the genes of a given SM cluster is coordinately regulated. A DNA expression microarray was used to identify genes that were co-regulated with SM gene cluster backbone enzymes. A variety of culture media were selected that, based on SM profiling experiments, would elicit expression of as many gene clusters as possible. Samples were then taken from A. nidulans growing on the selected culture media for transcriptional profiling, and the generated data were combined with previously published data to form a superset of 44 expression conditions for analysis. Andersen et al. developed clustering scores (CSs) that reflected the degree to which each gene was co-regulated with its neighbors. They developed statistical guidelines for identifying the extent of gene clusters, which were applied to the microarray data to generate cluster predictions. Comparisons with published data demonstrated that their algorithm predicted gene clusters with high accuracy and could even predict gene clusters that are scattered across different chromosomes. Using this algorithm, a list of 58 predicted SM gene clusters was generated.
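The idea of scoring co-regulation between neighboring genes can be illustrated with a minimal sketch. This is not the scoring scheme or the statistical guidelines of Andersen et al.; it assumes a plain Pearson correlation, a hypothetical fixed threshold of 0.8, a hypothetical neighbor window, and invented toy expression values, and it only handles genes that are contiguous on one chromosome.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / sqrt(var_x * var_y)


def neighbour_score(profiles, i, window=2):
    """Mean correlation of gene i with genes up to `window` positions away.

    `profiles` lists expression profiles in chromosomal order. The window
    size is an illustrative assumption, not a published parameter.
    """
    lo, hi = max(0, i - window), min(len(profiles), i + window + 1)
    others = [j for j in range(lo, hi) if j != i]
    if not others:
        return 0.0
    return sum(pearson(profiles[i], profiles[j]) for j in others) / len(others)


def extend_cluster(profiles, backbone, threshold=0.8):
    """Grow a contiguous cluster outward from the backbone gene index.

    Extension stops on each side at the first gene whose correlation
    with the backbone falls below the (hypothetical) threshold.
    """
    left = right = backbone
    while left > 0 and pearson(profiles[backbone], profiles[left - 1]) >= threshold:
        left -= 1
    while (right < len(profiles) - 1
           and pearson(profiles[backbone], profiles[right + 1]) >= threshold):
        right += 1
    return left, right


# Invented toy data: six genes in chromosomal order, five growth conditions.
profiles = [
    [1, 5, 2, 4, 3],   # gene 0: unrelated pattern
    [1, 2, 8, 9, 1],   # gene 1: tracks the backbone
    [2, 3, 9, 10, 2],  # gene 2: backbone gene
    [1, 2, 7, 9, 2],   # gene 3: tracks the backbone
    [5, 1, 4, 2, 3],   # gene 4: unrelated pattern
    [3, 3, 3, 3, 3],   # gene 5: not expressed differentially
]
cluster_bounds = extend_cluster(profiles, backbone=2)
```

In this toy case the predicted cluster spans genes 1 to 3. A fixed threshold is the simplest possible stopping rule; a score averaged over a window of neighbors, as in `neighbour_score`, is less sensitive to a single noisy gene inside a cluster.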

These data have been curated at AspGD and applied as a criterion for the manual annotation of computationally predicted gene clusters as a part of a continued effort to improve and refine the prediction of SM gene cluster boundaries [25]. This updated gene cluster boundary annotation also incorporates published experimental data, synteny between clustered genes among different species, functional annotation of putative gene cluster members, and an increase in the distance between a predicted boundary gene and the directly adjacent gene that is not included in the cluster. This new and improved set of comprehensive SM gene cluster predictions will facilitate future investigation of novel Aspergillus SMs.

Genome-wide kinase knock-outs

The molecular genetic system of A. nidulans is powerful and technical advances in recent years have made genome-wide, systematic approaches more feasible. The Fungal Genetics Stock Center (FGSC) provides a systematic gene deletion construct collection, a valuable experimental resource for the A. nidulans research community. De Souza et al. [19] have generated a set of gene deletion constructs for 9,851 genes, which represents 93.3 % of the encoding genome. Mutant strains generated with the cassettes are deposited with the FGSC after construction.

Using this deletion construct resource, a genome-wide kinase knock-out library consisting of deletion strains of most A. nidulans non-essential kinase genes was generated and deposited at the FGSC [19]. The kinase deletion strains were used for genome-wide functional analysis of kinases, resulting in identification of many previously unknown functions for kinases [19]. This kinase knock-out library was screened to test the hypothesis that manipulation of kinase expression has the potential to activate silent SM gene clusters [58]. This led to the discovery of an mpkA knock-out strain that produced aspernidine A, a compound that had been discovered previously in A. nidulans [47] but whose biosynthetic pathway remained unknown. The mpkA knock-out strain produced a sufficient amount of aspernidine A to allow the identification and analysis of the gene cluster involved in its biosynthesis. From the chemical structure of aspernidine A combined with previous data [1], it was predicted that a nonreducing polyketide synthase (NR-PKS) gene, pkfA (AN3230), is involved in the biosynthesis of aspernidine A. Deletion of pkfA confirmed this, and the boundary of the gene cluster was identified through a series of deletions of the genes surrounding pkfA. Analysis of the SMs produced by mpkA deletion strains resulted in isolation and characterization of novel intermediates that aided in generating a proposed pathway for aspernidine A.

A similar deletion set of 28 protein phosphatase genes was generated and used to identify four essential phosphatases and four required for normal growth [50]. The deposited deletion constructs were also used in a study that identified multiple kinases and phosphatases involved in the sensing of carbon and energetic status, and also contributed to the understanding of the signaling cascades that result in regulation of CreA derepression and hydrolytic enzyme production [13].

Genome-wide analysis of all nonreducing polyketide synthases and NRPS-like enzymes in A. nidulans

Despite the success of various strategies to activate silent gene clusters, a large number of potential SM gene clusters remain untapped. To analyze clusters resistant to activation through existing approaches, a strategy was developed that completely bypasses normal regulation [1]. It takes advantage of recent advances in the construction of transforming fragments by fusion PCR and effective gene targeting to replace promoters of SM genes with the regulatable alcA promoter. It was applied to obtain a comprehensive understanding of the products of nonreducing polyketide synthase (NR-PKS) genes, a class of key genes of SM biosynthetic pathways [1]. The A. nidulans genome harbors 14 NR-PKS genes, and combined efforts by several groups over the years led to the identification of the chemical products of six of them [7, 12, 16, 17, 29, 42, 48, 52, 55, 62]. To determine the products of the remaining eight NR-PKS genes, the native promoters for each NR-PKS and other genes necessary for product formation or release were replaced with the alcA promoter. Induction of expression resulted in the production and release of compounds from each of the NR-PKS and allowed the completion of the determination of the products of NR-PKS genes of A. nidulans.

This approach can be applied to the discovery of other classes of SM biosynthetic gene clusters. This was demonstrated by systematically targeting nonribosomal peptide synthetase (NRPS)-like genes for promoter replacement, resulting in the discovery that one of the NRPS-like genes, micA, is the sole gene responsible for the biosynthesis of the metabolite microperfuranone [59].

In another strategy carried out by Nielsen et al. [36], a genome-wide PKS deletion library was constructed by systematically deleting all 32 putative PKS genes. A reference strain was cultured on an array of culture media to find conditions that would induce production of SMs that were not previously linked to a gene cluster, and this was followed by screening of the genome-wide PKS deletion library to establish the genetic link to the SMs. This approach provided novel links between PKS genes and SMs, demonstrating its strength and the potential usefulness of the deletion library as a resource for further PKS studies.
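The logic of linking a metabolite to a gene through a deletion library can be sketched as a set comparison: a compound detected in the reference strain but missing from exactly one deletion strain is a candidate product of the deleted gene. The sketch below is an illustration only, not the analysis pipeline of Nielsen et al.; the strain labels (`pks01`, ...) and compound labels (`compound_1`, ...) are hypothetical placeholders, and real screens compare chromatographic profiles rather than clean presence/absence sets.

```python
def link_metabolites(reference, deletion_profiles):
    """Map each reference compound to the deletion strains that lost it.

    reference: set of compounds detected in the reference strain.
    deletion_profiles: dict of deleted gene -> set of compounds detected
    in that deletion strain under the same culture condition.
    Compounds still present in every deletion strain are left unlinked.
    """
    links = {}
    for compound in sorted(reference):
        lost_in = sorted(
            gene
            for gene, detected in deletion_profiles.items()
            if compound not in detected
        )
        if lost_in:
            links[compound] = lost_in
    return links


# Hypothetical placeholder data for three deletion strains.
reference = {"compound_1", "compound_2", "compound_3"}
deletion_profiles = {
    "pks01": {"compound_2", "compound_3"},
    "pks02": {"compound_1", "compound_3"},
    "pks03": {"compound_1", "compound_2", "compound_3"},
}
links = link_metabolites(reference, deletion_profiles)
```

Here `compound_1` is linked to `pks01` and `compound_2` to `pks02`, while `compound_3` remains unlinked because no deletion abolishes it. A compound lost in several strains would be reported with all of them, which flags cases that need follow-up rather than a one-to-one assignment.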

Use of A. nidulans as a host for heterologous expression of SM genes from other Aspergillus species

The highly advanced and established molecular genetic system of A. nidulans can be applied to the study of SM production in other fungal species that have poor or nonexistent molecular genetic systems [60]. Heterologous expression of fungal genes in other fungi has been used with some success, but this approach is not without limitations, including finding a suitable host and the difficulty of handling large genes and gene clusters. An advantage of fungal systems over bacterial systems for expressing fungal secondary metabolism genes is that fungi can correctly splice the introns of secondary metabolism genes from other fungi, resulting in successful expression [15, 22, 26]. Since many fungal SM genes are quite large and contain introns (often several), this is of considerable benefit.

Major advances have recently been made in establishing A. nidulans as a host for heterologous expression of fungal SMs. First, entire SM gene clusters have been deleted to eliminate production of unwanted A. nidulans SMs, resulting in reduced SM background and facilitating detection and isolation of compounds produced by the heterologously expressed genes [15].

Second, a system for transferring SM genes from other fungi while placing them under control of the alcA promoter has been developed [15, 33]. This system uses a strategy that involves (1) PCR amplification of each gene, (2) the use of fusion PCR to place each gene under control of the alcA promoter and to construct a transforming fragment, and (3) integration of the fragment into a target A. nidulans locus. For larger clusters several genes must be transferred into A. nidulans and, to avoid running out of selectable markers for transformation, a marker recycling strategy was developed [15]. Each time a new gene is introduced into A. nidulans a selectable marker is evicted, and this marker can be used in the subsequent transformation. In principle, this strategy allows an unlimited number of genes to be transferred into and expressed in A. nidulans. The use of this approach resulted in the successful expression of all six genes of the gene cluster that encodes the production of asperfuranone, a cryptic gene cluster from A. terreus. Furthermore, various combinations of the expressed genes were tested, leading to clarification of the asperfuranone biosynthetic pathway.
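The bookkeeping behind marker recycling can be sketched as a short simulation. It assumes that two selectable markers alternate, so that each incoming fragment carries one marker and evicts the other; the marker and gene names are hypothetical placeholders, and the molecular details of eviction are not modeled.

```python
def transfer_genes(genes, markers=("marker_A", "marker_B")):
    """Simulate sequential integrations that alternate between two markers.

    Each transformation brings in one gene plus the marker that is not
    currently resident, and evicts the resident marker so that it is
    free to select for the next transformation.
    Returns the integrated genes, the marker left in the final strain,
    and a log of (gene, incoming marker, evicted marker) per step.
    """
    integrated = []
    resident = None  # no marker at the target locus before the first step
    log = []
    for step, gene in enumerate(genes):
        incoming = markers[step % 2]
        evicted = resident
        resident = incoming
        integrated.append(gene)
        log.append((gene, incoming, evicted))
    return integrated, resident, log


# Six hypothetical genes, matching the size of a six-gene cluster.
genes = [f"gene_{n}" for n in range(1, 7)]
integrated, final_marker, log = transfer_genes(genes)
```

Only two markers are ever needed regardless of the number of genes, which is why the number of transferable genes is not limited by marker availability. After six steps the strain carries all six genes and a single resident marker.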

Another recent approach to transferring the members of entire SM gene clusters is to assemble the PCR-amplified individual cluster fragments into a single large transforming fragment using USER fusion, followed by insertion into the integration vector by USER cloning [37]. Using this technique, a total of 13 genes of a putative gene cluster responsible for geodin biosynthesis from A. terreus were transferred into A. nidulans in a two-step process, successfully enabling geodin biosynthesis in A. nidulans.

Conclusions

Advances in genome sequencing in fungi have provided us with a wealth of information that suggests that the number of SM gene clusters far exceeds the number of discovered compounds. A combination of bioinformatics and experimental verification is fundamental to elucidating the SM biosynthetic pathways that these SM gene clusters encode. Among the many species of Aspergillus, A. nidulans is used as a model organism; it is the species with by far the most abundant literature and the most advanced, highly efficient molecular genetic system. Recent advances in the development of prediction algorithms in A. nidulans and updated curation by AspGD have given us access to improved SM gene cluster predictions, which we can use as a basis for subsequent experimental verification. Advances in transforming fragment construction techniques and effective gene targeting expedite the experimental verification process. These advances, in combination, have enabled quick and systematic approaches to uncover the potential of SM production by A. nidulans. The application of these advances is not limited to the SMs of A. nidulans. Combined efforts such as the “1,000 Fungal Genomes Project” (http://1000.fungalgenomes.org/home/) by the DOE Joint Genome Institute (JGI) are dedicated to sequencing numerous different species of fungi and providing a database for the research community. Many of these fungi do not have good molecular genetic systems, which makes experimental verification a major challenge. Heterologous expression of fungal genes in other host fungi is one approach that is being used, and major advances have been made to establish A. nidulans as a host. Newly developed methods for constructing transforming fragments and improved transformation strategies have made it possible for large or multiple genes to be transformed into A. nidulans. These approaches will contribute greatly to uncovering the untapped resources of SMs that the fungal genomes encode.