I. Introduction

Fungi are well-established producers of foods, food additives, industrial enzymes, and pharmaceutical drugs and contribute significantly to human health and the economy. Production of industrial enzymes alone had an annual worth of €3.5 billion in 2015, which is a mere fraction of the general white biotechnology products estimated to reach €450 billion in 2020 (Meyer et al. 2016). The vast majority of fungal processes are performed with species that have not been genetically engineered. Hence, the fungi have either been domesticated through millennia of human use, or their performance and yields have been improved via classical genetic methods like mutagenesis. The importance of fungi has sparked several multi-species genome sequencing projects where one aim is to uncover novel fungal genes involved in enzyme secretion and secondary metabolite (SM) formation. The global fungal diversity currently encompasses 144,000 classified species, with estimates reaching 3.8–6.0 million species (Taylor et al. 2014; Willis 2018), of which ~1500 species have been genome sequenced (de Vries et al. 2018; Grigoriev et al. 2014). As such, it is becoming increasingly clear that fungi represent a vast reservoir of potentially useful enzymes and secondary metabolites yet to be discovered. As part of the sequencing project of genus Aspergillus, Vesth et al. analyzed the genomes of 36 species of Aspergillus section Nigri, predicting 40,424 unique genes, including 17,903 carbohydrate-active enzymes and an average of 70 putative secondary metabolite gene clusters per species (near 2700 total), comprising 450 distinct compound classes (Vesth et al. 2018). With the increasing number of fungal genome sequences, many new genes and potentially interesting products will be uncovered in this post-genomic era.

However, many of these new products are likely made by natural fungal producers that are challenging to incorporate into industrial processes, either because they are problematic to propagate efficiently in bioreactors, difficult to engineer, or opportunistic human pathogens. Based on the logical assumption that fungi, through evolution, are likely the best producers of fungal products, we envision that there will be an increasing demand for transferring relevant genes and pathways from novel, exotic fungi to well-characterized fungal cell factories.

Filamentous fungi have been widely used for heterologous production of industrial enzymes, taking advantage of their large protein secretory capacity. The most commonly used fungal species for this purpose are Trichoderma reesei and members of the genus Aspergillus, e.g., Aspergillus niger and Aspergillus oryzae, and thus, most studies have been performed with these species. More recently, filamentous fungi have also been employed for heterologous production of SMs, typically as a part of an SM pathway elucidation strategy. For these studies, classical model fungi like Aspergillus nidulans and Neurospora crassa have been added to the repertoire of cell factories. The usefulness of fungal industrial workhorses and general model systems is also reflected by the development of substantial genetic toolboxes and methodologies for heterologous gene expression for these species.

The current trend for heterologous gene expression is to implement a synthetic biology approach. Hence, a host strain is transformed with a gene-expression cassette, generated from libraries of bio-blocks (see Sect. III), which are individually functional molecular units that can be combined by simple and seamless DNA fusion strategies (Fig. 10.1). According to this concept, we will first describe strategies to set up synthetic biology-based expression systems followed by an overview of popular bio-blocks that can be used for construction of expression cassettes. Next, we will provide two sections with examples where these concepts have been partly or entirely used to produce either enzymes or secondary metabolites in a heterologous fungal host.

Fig. 10.1

Assembly of a basic gene-expression cassette. Bio-blocks are PCR amplified using primers (arrows) with tails containing sequences that are complementary to the tails of the adjacent bio-block (indicated by color); see main text for details. This allows matching overhangs (small boxes) to be formed ensuring that the bio-blocks are fused in the correct order. Tails for inserting the gene-expression cassette into a vector are not shown

II. Expression Systems

Heterologous expression is achieved by transforming the new fungal host with a suitable gene-expression cassette, which is integrated into a chromosome or an extrachromosomal vector and thereby maintained during growth. Below, we review strategies for assembling gene-expression cassettes for heterologous enzyme or pathway expression, along with methods for introducing these into the fungal host and the benefits of these methods. The functional bio-blocks for fungal heterologous gene expression are presented in Sect. III.

A. Construction of Simple Gene-Expression Cassettes

In the simplest setups, heterologous production depends on the expression of a single gene of interest (GOI). In these cases, gene-expression cassettes (GEC) can be constructed by fusing the GOI bio-block to relevant components of a basic set of bio-blocks that includes promoters, terminators, and selectable markers (Fig. 10.1). If necessary, this set can be expanded with additional bio-blocks for specialized purposes such as sequences encoding secretion signals, epitope and purification tags, fluorescent proteins, etc., or bio-blocks that are designed to target the expression cassette for integration at a specific site in the genome by homologous recombination. Since many individual bio-blocks need to be combined in a single cloning step, it is important that they can be joined with high efficiency. Several reliable systems are available for this task, e.g., In-Fusion assembly (Zhu et al. 2007), Gibson assembly (Gibson et al. 2009), USER fusion (Bitinaite et al. 2007), and Golden Gate cloning (Engler and Marillonnet 2014). Often these systems are directly compatible with accompanying vectors dedicated to gene transfer into a desirable host, and the construction work is commonly facilitated by ligation and propagation in Escherichia coli. Alternatively, bio-blocks can be assembled by in vivo homologous recombination using, e.g., the yeast Saccharomyces cerevisiae as a host (Finnigan and Thorner 2015).
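The principle of joining bio-blocks through primer tails that match the adjacent block (Fig. 10.1) can be illustrated with a minimal Python sketch. It is not tied to any of the assembly systems cited above. The three sequences, the function names, and the annealing and tail lengths are hypothetical toy values chosen for readability; real designs use longer annealing regions and overlaps.

```python
# Toy sketch: tailed primers that give adjacent bio-blocks matching overhangs.
# All sequences and lengths below are hypothetical placeholders.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def design_tailed_primers(blocks, anneal=6, tail=4):
    """Return {name: (forward, reverse)} for an ordered list of (name, seq).

    Each primer anneals to its own block; its 5' tail copies the end of the
    neighboring block, so that adjacent PCR products share an overlap.
    """
    primers = {}
    for i, (name, seq) in enumerate(blocks):
        fwd = seq[:anneal]
        rev_template = seq[-anneal:]
        if i > 0:  # tail matching the end of the upstream block
            fwd = blocks[i - 1][1][-tail:] + fwd
        if i < len(blocks) - 1:  # tail matching the start of the downstream block
            rev_template = rev_template + blocks[i + 1][1][:tail]
        primers[name] = (fwd, reverse_complement(rev_template))
    return primers

def amplicon(blocks, i, tail=4):
    """Sequence of the PCR product for block i, including both tails."""
    left = blocks[i - 1][1][-tail:] if i > 0 else ""
    right = blocks[i + 1][1][:tail] if i < len(blocks) - 1 else ""
    return left + blocks[i][1] + right

BLOCKS = [
    ("promoter", "ATGCCGTTAGCA"),
    ("GOI", "ATGGCTAAGTGA"),
    ("terminator", "TTGACCGGTAAC"),
]
PRIMERS = design_tailed_primers(BLOCKS)
```

In this sketch, neighboring amplicons share an overlap of twice the tail length, which is what fixes the order of the blocks during fusion. Tails for inserting the finished cassette into a vector are omitted, as in Fig. 10.1.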

B. Introducing Gene-Expression Cassettes into Fungal Hosts

The gene-expression cassettes can be introduced into the fungal host according to different principles. Below, we briefly review how gene-expression cassettes can be maintained by being inserted into an extrachromosomal vector or by being integrated into a random or defined position on a chromosome (Fig. 10.2).

Fig. 10.2

Strategies for introducing gene-expression cassettes into fungi. The GEC can be (a) inserted into a self-replicating AMA1 plasmid via ligation or by in vivo HR or (b) integrated into the host genome by random or (c) targeted integration through NHEJ or HR, respectively

1. Plasmid-Based Expression Systems

Plasmid-based expression systems utilize self-replicating plasmids carrying autonomous replication sequences (Fig. 10.2a). In many filamentous fungi, this can be achieved via the AMA1 (autonomous maintenance in Aspergillus) element. AMA1 was originally discovered in a lab strain of A. nidulans (Gems et al. 1991) in a search for genetic elements enhancing transformation in A. nidulans (Johnstone et al. 1985). In subsequent studies, it has been shown that AMA1-based vectors increase fungal transformation efficiency in A. nidulans up to 2000 times in comparison to earlier developed ectopically integrating vectors (Aleksenko and Clutterbuck 1997). Moreover, AMA1 plasmids propagate in other Aspergillus species as well as in species belonging to other fungal genera, including Penicillium (Fierro et al. 1996), Talaromyces (Nielsen et al. 2017), Trichoderma (Kubodera et al. 2002), and Rosellinia (Shimizu et al. 2012).

Gene-expression cassettes can easily be incorporated into extrachromosomal AMA1-based shuttle vectors, using the cloning systems highlighted above. As an extreme case, fungal AMA1 plasmids have been combined with a bacterial artificial chromosome (Zhu et al. 1997) to form fungal artificial chromosomes (FACs). These vectors have been instrumental in the search for new secondary metabolite gene clusters as they allow up to 300 kbp of fungal DNA to be cloned into the FAC in E. coli and subsequently transferred into a new fungal host for product discovery (Bok et al. 2015; Clevenger et al. 2017). Importantly, AMA1-based plasmids are not very stable and their maintenance requires selection, thereby limiting their use to small-scale exploratory heterologous production experiments. However, we note that for some purposes, such as cas9 expression in CRISPR experiments, plasmid instability is desirable as plasmid loss allows for transient gene expression, diminishing potential Cas9 off-target effects (Zhang et al. 2015).

2. Chromosome-Based Expression Systems

Integration of the expression cassette into the host genome provides more stable expression than AMA1-based expression and reduces the need to maintain selection pressure. Chromosomal integration of foreign DNA occurs via one of two DNA repair mechanisms (Krappmann 2007), either non-homologous end joining (NHEJ), where the DNA integrates into a random locus (Fig. 10.2b), or homologous recombination (HR), where the DNA integrates into a defined locus (Fig. 10.2c). Although both pathways are active in all fungi, foreign DNA typically integrates more efficiently via the NHEJ pathway in most filamentous fungal species (Meyer et al. 2007; Nødvig et al. 2015).

a) Random Chromosomal Integration of Gene-Expression Cassettes

Since NHEJ is the dominant pathway for chromosomal integration of DNA in most filamentous fungi, random chromosomal integration of gene-expression cassettes via NHEJ is straightforward and can be efficiently performed in essentially all transformable fungal species. A drawback of the method is that the chromatin structures of the insertion sites are unpredictable and expression levels may therefore vary substantially between different transformants (Lubertozzi and Keasling 2006). On the other hand, if several gene-expression cassettes enter a nucleus, they may undergo recombination prior to genomic integration. As a result, transformants may contain multiple copies of the GOI inserted into a single locus. In these cases, the transformants will often be better producers of the product due to the many gene copies. However, since the cassettes will be organized as direct and/or inverted repeats, expression levels of such strains may be unstable as gene copies may be lost due to direct repeat recombination or formation of hairpin structures (Leach 1994; Petes 1988). Lastly, it is important to note that when gene-expression cassettes integrate randomly into the genome, they may disrupt important genes, or alter the expression levels of neighboring genes, causing undesired phenotypes. In summary, with this integration method, it is advisable to screen for transformants that produce high yields over many generations and at the same time do not display undesirable fitness defects.

b) Defined Chromosomal Integration of Gene-Expression Cassettes

The use of defined and well-characterized loci as integration sites for GECs allows for gene expression that is much less prone to clonal differences. Hence, insertion of gene-expression cassettes into defined loci enables comparative screening of expression levels or enzyme activities based on different genetic elements employed in the cassette, such as promoter, secretion signal, and terminator sequences. Similarly, the effects achieved by changing the GOI sequence can be directly compared. In this way it is possible to address whether changes in the GOI codon composition increase yields or whether mutations confer changes in the heterologous protein that influence its activity, specificity, folding, and/or stability (Hansen et al. 2011b; Holm 2013).

Gene-expression cassettes can be inserted into integration sites in the genome by HR. To ensure high expression levels, it may be useful to position integration sites in intergenic regions located in transcriptionally highly active sections of a chromosome. With this method, the gene-expression cassettes must be flanked by up- and downstream targeting sequences matching the genomic expression site. Since fungi rarely integrate foreign DNA into their genomes via the HR pathway, extensive screening for the desired transformant may be necessary. The screening workload can be reduced by using bipartite gene-targeting substrates that select for HR-proficient protoplasts (Nielsen et al. 2006) or avoided by using NHEJ-deficient strains (Meyer et al. 2007). Alternatively, methods based on restriction enzymes or CRISPR technology can be employed (Ouedraogo et al. 2016; Zheng et al. 2017). Importantly, with the latter technology, GOIs can be inserted in a marker-free manner and multiplexing is possible (Liu et al. 2015; Nødvig et al. 2018; Zhang et al. 2016a). In fact, CRISPR mediates very efficient gene targeting even in NHEJ-proficient strains (Nødvig et al. 2018), eliminating the risks of working with strains with a defective DNA repair pathway. We therefore envision that CRISPR-based methods will deliver the preferred tools for introducing gene-expression cassettes into defined chromosomal expression platforms.

C. Bio-Block-Based Multi-GOI Expression Strategies

In many cases, two or more genes are required for the synthesis of a desired product. For example, for heterologous production of SMs, several genes are commonly required to synthesize the product. Above, we described that it is possible to transfer large chromosomal fragments containing entire gene clusters from one fungus to another (Sect. II.B.1). However, in many cases the activation mechanism for the genes in the cluster is not known and may depend on unknown or even host-specific transcription factors (TF) (Keller 2019); see Chap. 11. Moreover, not all genes contributing to the biosynthetic pathway of interest may be situated in the cluster (Schäpe et al. 2019). It may therefore be desirable to reconstruct the genes involved in formation of the heterologous compound using a setup that allows necessary GOIs to be equipped with known and well-characterized promoter and terminator sequences. Here we present three strategies that allow for bio-block-based multi-GOI expression (Fig. 10.3). Although multi-GOI expression cassettes can be assembled on AMA1-based vectors, we recommend using defined chromosomal expression sites for GOI insertion to gain genetic stability and to reduce dependency on selectable markers.

Fig. 10.3

Strategies for multi-GOI expression cassette assembly. Individual GECs composed of a promoter (Px), a gene of interest (GOIx), a terminator (Tx), and a selection marker (Mx). Individual GECs can be (a) integrated in different loci, or (b) assembled into a combined multi-GOI cassette for single locus integration, or (c) assembled as a single polycistronic GOI where coding sequences of individual polypeptides are separated by the sequence encoding the picornavirus 2A peptide (A2p)—see main text for details

In the first strategy (Fig. 10.3a), individual expression cassettes for different GOIs are inserted into a number of unique integration sites. This strategy is advisable if all of the individual gene-expression cassettes contain an identical bio-block, such as a specific promoter or terminator. In this way, copy loss due to direct repeat recombination between the identical bio-blocks is avoided. With conventional gene-targeting methods, insertion of gene-expression cassettes into individual sites requires a number of selectable markers that matches the number of gene-expression cassettes required for establishing the biosynthetic pathway. Alternatively, the genes can be inserted by iterative gene targeting using a recyclable marker (see Sect. III.B.5). Strain construction with these methods is cumbersome and restricts the method to biosynthetic pathways composed of a small number of genes. However, these limitations can be dramatically reduced by constructing the strains via CRISPR-based multiplexed marker-free gene insertions.

In the second strategy (Fig. 10.3b), all genes required to support a biosynthetic pathway are inserted into the same expression site. Arranging the GOIs as a synthetic gene cluster may be preferred if only a few selectable markers are available. Note that if two markers are available, then an unlimited number of consecutive gene-targeting events into the same locus can, in principle, be performed. Hence, even very large synthetic gene clusters can be constructed at a defined expression site via multiple integration steps. In this case, the selectable marker used in a given integration step replaces the marker used in the previous integration step (see Sect. III.B.5). In addition, if the heterologous host has a sexual cycle, strains that contain a synthetic gene cluster in a single defined locus can be crossed to other strains without risking that the individual genes of the cluster segregate during meiosis. This feature can be used to combine the heterologous pathway with other beneficial traits harbored by other strains.
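The marker-swapping logic of the second strategy can be made explicit with a small bookkeeping sketch in Python. It models only the alternation of two markers at a single locus and says nothing about targeting efficiency. The function name, the gene names, and the marker names "markerA" and "markerB" are hypothetical.

```python
# Toy bookkeeping model of iterative integration into one locus.
# Gene and marker names are hypothetical placeholders.
def iterative_integration(genes, markers=("markerA", "markerB")):
    """Add genes one step at a time, alternating between two markers.

    At each step the incoming marker replaces the marker left behind by
    the previous step, so two markers suffice for any number of steps.
    """
    if len(set(markers)) < 2:
        raise ValueError("at least two distinct markers are required")
    locus = {"genes": [], "marker": None}
    history = []
    for step, gene in enumerate(genes):
        incoming = markers[step % len(markers)]
        # The incoming marker must differ from the resident one,
        # otherwise the new integration event could not be selected.
        if incoming == locus["marker"]:
            raise ValueError("incoming marker is already present in the locus")
        locus["genes"].append(gene)
        locus["marker"] = incoming  # replaces the previous marker
        history.append((gene, incoming))
    return locus, history

LOCUS, HISTORY = iterative_integration(["geneA", "geneB", "geneC", "geneD"])
```

After four steps the locus in this sketch carries all four genes but only one marker, which is why the number of available markers does not limit the size of the synthetic cluster.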

In the third strategy (Fig. 10.3c), the GOIs are arranged as a polycistronic unit where each open reading frame (ORF) is separated by a bio-block encoding the 2A peptide from the Picornaviridae virus family; see Table 10.3 (Schuetze and Meyer 2017). This strategy is based on two facts. Firstly, the ribosome fails to link glycine and proline residues during translation of the 2A spacer, resulting in a break in the polypeptide chain. Secondly, this error does not release the ribosome from the mRNA, allowing translation to continue. Hence, in a simple manner, several proteins can be encoded from a single transcript generated from a single expression cassette. One drawback of this method may be that the final size of the polycistronic gene-expression cassette makes construction work difficult. Additionally, “cleavage” at the 2A spacer results in a partial 2A peptide sequence remaining at the C-terminus of the protein, which may interfere with protein function and therefore may need to be removed, e.g., by combinatory usage with other proteolytic sites (Hoefgen et al. 2018). Moreover, as “cleavage” at the 2A spacer is not 100% efficient, the relative efficiency of protein production often depends on the position of the protein's ORF in the transcript (Schuetze and Meyer 2017).
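The consequence of the 2A "cleavage" for the resulting polypeptides can be sketched at the protein level in Python. The sketch assumes the commonly used P2A variant as the spacer and assumes that skipping occurs at every spacer, which, as noted above, is not the case in practice. The three short protein sequences and the function names are hypothetical.

```python
# Toy model of ribosome skipping at 2A spacers (protein level only).
# Assumption: P2A is used as the spacer; the protein sequences are made up.
P2A = "ATNFSLLKQAGDVEENPGP"

def build_polyprotein(proteins, spacer=P2A):
    """Concatenate the ORF products with a 2A spacer between each pair."""
    return spacer.join(proteins)

def ribosome_skip(polyprotein, spacer=P2A):
    """Split at the Gly-Pro junction that ends each spacer.

    The upstream product keeps the spacer minus its final proline; the
    downstream product starts with that proline. Skipping is assumed to
    happen at every spacer, which overstates the real efficiency.
    """
    products = []
    rest = polyprotein
    while True:
        idx = rest.find(spacer)
        if idx == -1:
            products.append(rest)
            return products
        cut = idx + len(spacer) - 1  # position between the final G and P
        products.append(rest[:cut])
        rest = rest[cut:]

PROTEINS = ["MKTAY", "MSLQW", "MGHIK"]
PRODUCTS = ribosome_skip(build_polyprotein(PROTEINS))
```

In this sketch, every product except the last carries the 2A remnant at its C-terminus, and every product except the first gains an N-terminal proline, which illustrates why the remnant may need to be removed.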

Lastly, we note that for all strategies, two genes can be inserted simultaneously as a single cassette if bidirectional promoters (Wiemann et al. 2018; Rendsvig et al. 2019) are used to control gene expression of the heterologous gene pairs. This feature can be used to speed up strain construction.

III. Bio-Blocks

A bio-block is a DNA sequence that encompasses a molecularly functional unit. This includes nucleotide sequences, e.g., promoters and terminators, or sequences encoding a protein, e.g., the GOI and selection marker, or shorter protein sequences such as secretion signals and purification tags. Generation of the basic gene-expression cassette for production of a heterologous protein requires several bio-blocks, including the GOI, promoter, and terminator. Selection markers may also be included in the construct, offering versatile applications in different scenarios. In recent years, the use of -omics data has driven the discovery of natural promoters, which are active during the desired cultivation conditions for production. While the majority of strain engineering has been based on natural genetic elements for controlling gene expression, significant headway has been made in the development of synthetic promoters and terminators. In extension, even full synthetic gene-expression systems have been established, in which several individual genetic elements (natural or modified) are combined, to facilitate controlled gene expression and enhance productivity. Moreover, to enable secretion of the protein, a transport signal must be included as an additional bio-block (Sect. IV.A.1). For specialized purposes like protein visualization or purification, protein tags fulfilling such tasks are also included in the construct, thereby increasing the pool of bio-blocks.

A. Gene of Interest

Since the goal of heterologous expression is to produce products derived from one or more genes of interest, this element can be considered the most important bio-block. As the GOI's coding potential in most cases should not be changed, it constitutes the least flexible bio-block; however, its design requires several considerations. The GOI sequence may be derived from different sources. In the simplest scheme, if the donor strain is related to the new host, genomic DNA can typically serve as a template to make a functional PCR-derived GOI bio-block. However, if the donor strain is more distantly related, intron splicing and codon bias may compromise gene expression and translation, and we will briefly review these two issues below.

1. Introns

The presence of introns in a heterologous gene may result in reduced mRNA production due to mis-splicing and incomplete splicing (He and Cox 2016; Zhao et al. 2013). In principle, the problem can be solved by employing cDNA as a PCR template for making the GOI bio-block, or by fusing PCR-derived exons in another round of PCR or with a one-step cloning method that allows for seamless assembly of multiple fragments (An et al. 2007). Alternatively, an entirely synthetic gene may be acquired from a commercial source. However, transcription, splicing, polyadenylation, and mRNA export are coordinated processes (Bentley 2014), and as a result the presence of introns may affect gene expression.

Indeed, sequential elimination of three introns in a gene encoding a protease from the thermophilic fungus Malbranchea cinnamomea cumulatively reduced production levels in T. reesei (Paloheimo et al. 2016). Interestingly, the largest reduction was observed when the intron closest to the transcriptional start site was removed. Similar results were obtained when a gene encoding an antifungal protein from Aspergillus giganteus was expressed in Trichoderma viride (Xu and Gong 2003). Moreover, the inclusion of an artificial intron was required for the heterologous production of GFP and mRFP in the basidiomycete Armillaria mellea (Ford et al. 2016).

Since it is not straightforward to predict the impact of retaining introns in a foreign gene on the final yield of mRNA, it may be useful to try different gene variants containing no, a few selected, or all introns, to optimize expression levels from the GOI.

2. Codon Optimization

Codon usage may vary dramatically between species (Iriarte et al. 2012) and, consequently, influence how well a heterologous mRNA is translated into a protein, reviewed in (Hanson and Coller 2018; Tanaka et al. 2014). For example, codon composition may change the chromatin structure of the GOI affecting gene transcription efficiency (Zhou et al. 2016), whereas rare codons in the beginning of the transcript may determine the efficiency of translation initiation (Pop et al. 2014). Moreover, if the mRNA contains abnormally high levels of rare codons, this may result in premature transcription termination and production of reduced length mRNA for the ORF (Zhou et al. 2018). Conversely, rare codons may be used in the native host as translation pause sites required for proper and timely folding of subdomains of the protein structure. If such codons are replaced by frequently used codons, overall folding may be compromised (Yu et al. 2015; Zhou et al. 2015).

Different algorithms exist in order to address the different challenges concerning optimal codon choices for heterologous protein production (Gould et al. 2014). Construction of a codon-optimized gene usually requires de novo synthesis of the entire gene, which can be done by fusing a set of overlapping oligonucleotides (Sect. III.E) (Hoover and Lubkowski 2002) or simply be acquired from a commercial source. Codon optimization may benefit heterologous protein production significantly, especially if the source of the foreign gene and the new host are distantly related species. However, it is important to stress that predicting the optimal codon composition for heterologous gene expression is a discipline still in its infancy. It is therefore often an advantage to test more than one gene variant during cell factory construction. This can be exemplified by the extracellular yield from T. reesei expressing a codon-optimized mammalian monoclonal immunoglobulin (IgG) antibody, which showed a 40-fold difference, dependent on which of two companies had performed the codon optimization (Lin et al. 2006).
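As a first-pass inspection before choosing among gene variants, the positions of codons that are rare in the new host can be listed with a few lines of Python. This is a minimal sketch and not one of the published optimization algorithms. The usage table contains hypothetical relative frequencies, not measured values for any fungus, and the threshold and function names are likewise assumptions.

```python
# Toy sketch: flag codons of a GOI that are rare in the intended host.
# The usage values below are hypothetical, not data for a real species.
TOY_HOST_USAGE = {
    "ATG": 1.00,  # Met
    "CTC": 0.30,  # Leu
    "CTG": 0.25,  # Leu
    "CTA": 0.08,  # Leu
    "TTA": 0.05,  # Leu
}

def split_codons(orf):
    """Split an in-frame ORF into a list of codons."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]

def flag_rare_codons(orf, usage, threshold=0.10):
    """Return (codon index, codon) for codons used below the threshold.

    Codons absent from the usage table (here, e.g., stop codons) are
    skipped rather than flagged.
    """
    return [
        (index, codon)
        for index, codon in enumerate(split_codons(orf))
        if codon in usage and usage[codon] < threshold
    ]

RARE = flag_rare_codons("ATGTTACTCCTATGA", TOY_HOST_USAGE)
```

Because rare codons can also serve as translation pause sites that support folding, such a list is better read as a set of positions to examine than as a set of codons to replace automatically.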

B. Selection Markers

Selection markers are mainly used to facilitate identification of transformants containing the gene-expression cassette (Dave et al. 2015). Since targeted integration of the gene-expression cassette can be conducted in a marker-free manner via CRISPR technology, the presence of this bio-block in the cassette is no longer essential. Nevertheless, selective markers will likely remain a common feature in a gene-expression cassette since strain identification is simplified and more efficient with a marker and since the marker may also serve useful post-transformation roles. For example, a selectable marker can be used to quickly identify strains, or to promote breeding via sexual and parasexual cycles, or to maintain AMA1 plasmid-based expression. Moreover, to achieve high expression levels, it may be desirable to obtain a fungus with many copies of the expression cassette. Such strains can be selected by employing cassettes containing a weak selection marker as the strains with a high copy number will gain a fitness advantage at heavy selection pressure (Wernars et al. 1985). In some cases, it is necessary to eliminate the marker after transformation, for example, if a marker-free strain is desirable or if a marker needs to be recycled during iterative genetic engineering.

The many uses of selectable markers in strain development are reflected in the fact that several different types of markers have been developed, which can be divided into four main categories: resistance, visual, auxotrophic/nutritional, and counter-selectable markers. Importantly, the choice of selectable marker depends on the strain that needs to be engineered and on the purpose of the experiment. Below, we will briefly review the different categories of selectable markers and discuss in which situations a given marker can be advantageous.

1. Resistance Markers

Resistance markers encode proteins that neutralize the toxic effect of antimicrobials and thereby convey resistance as a dominant trait. Since the functionality of these markers (like some visual markers, see below) typically does not require any modifications of the host genome, they can be used with wild-type model fungi or with fungi where no or few genetic tools are available. Several antimicrobial/resistance-marker systems have been introduced for fungal genetic engineering (Table 10.1).

Table 10.1 Commonly used fungal antimicrobials and their corresponding resistance genes

The mechanism of these antimicrobials typically involves interfering with protein translation, inhibition of metabolism, or induction of lethal DNA double-stranded breaks (DSBs). The resistance to the antimicrobials is achieved through protein-based mechanisms including 1:1 drug binding, drug turnover by chemical modification, and drug-resistant target enzymes; see Table 10.1.

The availability of an array of different marker systems is highly useful because it allows several cycles of genetic engineering to be performed. Perhaps more importantly, it also expands the number of species that can be engineered, since many fungal species possess an inherent resistance to certain antimicrobials due to cell wall impermeability, native efflux pumps, or catalytic activities (Garneau-Tsodikova and Labby 2016). Sometimes, susceptibility depends on the media composition (Roller and Covill 1999), as exemplified by Aspergillus species, which show resistance toward hygromycin B and bleomycin at pH values below five and in complex or hypertonic media (Punt and van den Hondel 1992). Hence, before using resistance markers, it is necessary to conduct susceptibility assays of the fungus to determine appropriate selective antimicrobials and to keep the properties of selective media batches consistent. It should be noted that the use of antimicrobials may be undesirable, especially for large-scale cultivations, due to the risk of developing drug-resistant strains, the price of the antimicrobial, and the fact that some of the compounds are toxic to humans.

2. Visual Markers

Visual markers provide a phenotypic trait that can be easily visualized by conferring, e.g., a color change due to disruption of a host gene, or by heterologous expression of a color, fluorescent, or bioluminescent marker. Like with antimicrobial markers, visual markers can be used for selection in wild-type strains, but without posing any risks in relation to compound toxicity or transference of resistance genes to other organisms.

Three types of heterologous visual markers are frequently used, all of which may be applied for spectrophotometric assays. Firstly, β-galactosidase LacZ (lacZ) (Lubertozzi and Keasling 2006) and β-glucuronidase GusA (uidA) of E. coli (Tada et al. 1991) convert the synthetic substrates X-gal and X-gluc, respectively, to 5-bromo-4-chloro-3-hydroxyindole, which dimerizes to form a blue pigment. Secondly, luciferases convert substrates into bioluminescent products. Thirdly, fluorescent proteins can be used, such as GFP derived from the jellyfish Aequorea victoria, which was adapted for fungal use by codon optimization and by eliminating a non-conventional/cryptic intron splice site (Lorang et al. 2001), or RFP.

Visual markers do not offer any fitness advantage, and, if the goal is to detect transformants containing the expression cassette, selection requires visual screening. Selection by solely visual inspection can be achieved by inserting the cassette into conidial pigmentation genes (e.g., homologs of A. nidulans wA and yA genes), resulting in a distinct change in conidial pigmentation from the native black or green spores of Aspergillus species to white or yellow (Jørgensen et al. 2011; Nielsen et al. 2006). Similarly, the expression cassette can be inserted into adeA or adeB as this will produce easily detectable red colonies due to polymerization and oxidation of 4-amino-imidazole ribotide, the intermediate resulting from the disrupted purine biosynthetic pathway (Jin et al. 2004). Alternatively, the gene-expression cassette can be equipped with a heterologous marker that produces color, fluorescence, or bioluminescence to allow for visual selection. This is useful if the goal is to insert the cassette into an intergenic section of the genome by HR, thereby avoiding host gene disruption. In cases where correct transformants are rare, fluorescent markers set the stage for high-throughput detection schemes via FACS analysis, which can be employed if the fungus produces discrete entities like conidia (Bleichrodt and Read 2019; Vlaardingerbroek et al. 2015). Color, fluorescent, or bioluminescent markers can also be advantageously used if the goal is to insert the gene-expression cassette in multiple copies, e.g., via integration events catalyzed by the NHEJ pathway, as transformants with a high copy number can be selected for by the strength of the marker signal (Throndset et al. 2010).

When heterologous enzymes and fluorescent proteins are used as selective markers, it should be noted that some fungi produce enzymes that may catalyze the same reactions as those provided by the marker enzyme, or the fungi may display auto-fluorescence. It is therefore always necessary to test whether significant native background signals exist.

3. Nutritional Markers

The use of nutritional markers relies on auxotrophies in the host organism, which therefore requires supplementation with specific metabolites to sustain growth. Introducing a functional copy of the given nutritional marker into the host complements the defective native gene function, allowing for growth without supplementation. In this way, nutritional markers offer selective pressure without the negative effects of antibiotics. On the other hand, prior to cell factory construction, the host needs to be mutagenized or genetically engineered to create the relevant auxotrophic mutations. If the starting point is a wild-type strain, this may require the use of mutagens, traditional genetic engineering using antimicrobial or visual markers, or CRISPR technology. Both homologous and heterologous complementing genes can be used as markers, but often a heterologous marker is preferred as the sequence differences reduce the risk of generating false positives due to homologous recombination between the marker and the corresponding mutated locus.

Some commonly used nutritional markers include argB (Buxton et al. 1985), trpC (Goosen et al. 1989), adeA and adeB (Jin et al. 2004), pyroA (Osmani et al. 1999), and pyrG (Goosen et al. 1987), which are required for synthesis of arginine, tryptophan, adenine, pyridoxine, and uracil/uridine, respectively. Other frequently used markers are niaD required for nitrate assimilation (Unkles et al. 1989) and the amdS gene from A. nidulans allowing growth on acetamide as the sole nitrogen source (Kelly and Hynes 1985).
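
The marker-to-metabolite pairing listed above can be captured in a small lookup, which is handy when planning selection media for a host carrying several auxotrophies. The following is a minimal sketch in Python; the function name `required_supplements` and the example host are hypothetical illustrations, and only the gene-to-metabolite pairs come from the text.

```python
# Marker gene -> metabolite whose synthesis it is required for (pairs taken from the text).
MARKER_TO_SUPPLEMENT = {
    "argB": "arginine",
    "trpC": "tryptophan",
    "adeA": "adenine",
    "adeB": "adenine",
    "pyroA": "pyridoxine",
    "pyrG": "uracil/uridine",
}


def required_supplements(host_auxotrophies, cassette_markers):
    """Return the supplements a transformant still needs in the medium.

    host_auxotrophies: marker genes that are defective in the host.
    cassette_markers:  functional marker genes introduced by transformation.
    """
    still_defective = set(host_auxotrophies) - set(cassette_markers)
    return sorted({MARKER_TO_SUPPLEMENT[gene] for gene in still_defective})


# Hypothetical host with two auxotrophies, transformed with a pyrG-marked cassette.
print(required_supplements(["pyrG", "argB"], ["pyrG"]))
```

In this toy case the transformant grows without uracil/uridine but still needs arginine, which is what leaves argB available for a second round of engineering.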

Like visual markers, the design of the nutritional marker can be used to select for strains containing multiple expression cassettes. For example, the complementary marker gene can be equipped with a poor promoter to ensure that only strains with many copies produce sufficient amounts of the missing enzyme.

Often several auxotrophies are introduced in the host to allow for several rounds of genetic engineering using a set of complementing nutritional markers. It is important to note that nutritional markers may change the metabolism of the host cell even though the defects are compensated by addition of the missing metabolites to the growth medium. For physiological characterizations of the engineered strains, this may lead to undesired artefacts; and for cell factories, it may lead to suboptimal growth or to growth that is restricted to specific media. It is therefore advisable to restore the functionality of those nutritional markers after genetic engineering. To facilitate this process, it has recently been shown that it is possible to functionally revert several mutated genes by multiplexing CRISPR technology (Nødvig et al. 2018).

4. Counter-Selectable Markers

Some nutritional markers can be counter-selected by using analogs of their natural substrates that are converted into antimetabolites. The two most frequently used counter-selectable wild-type markers are pyrG and amdS, which are lethal in the presence of 5-fluoroorotic acid (5-FOA) and fluoroacetamide (FAA), respectively. For pyrG, and to a lesser extent amdS, a gene copy is naturally present in wild-type fungi. Hence, the endogenous gene needs to be disrupted before the marker can be applied in selection/counter-selection experiments. In pyrG+ strains, 5-FOA is converted by the orotidine-5′-phosphate decarboxylase (PyrG) into 5-fluorouracil, which is further metabolized into toxic substances that interfere with RNA and DNA synthesis (Boeke et al. 1984; Longley et al. 2003). Similarly, in amdS+ strains, FAA is converted by the acetamidase (AmdS) into the toxic compound fluoroacetate (Hynes and Pateman 1970), which forms a stable complex with Coenzyme A (fluoroacetyl-CoA), hence hindering normal functionality of the tricarboxylic acid cycle. Other counter-selectable markers include genes in the tryptophan pathway, most often trpA (Foureau et al. 2012), and genes in the lysine pathway, most often lysB (Alberti et al. 2003), which can be counter-selected in the presence of 5-fluoroanthranilic acid (5-FAA) and α-aminoadipate, respectively. Similarly, expression of genes encoding herpes simplex virus 1 thymidine kinase can be counter-selected as the viral thymidine kinase, unlike the thymidine kinases of the hosts, activates nucleoside analogs like 5-fluoro-2′-deoxyuridine for toxic incorporation into nucleic acids (Lupton et al. 1991). Counter-selectable markers are highly useful as they facilitate marker recycling and iterative genetic engineering.

5. Marker Recycling for Iterative Engineering

In most fungi, the number of available markers is a limiting factor for multi-step genetic engineering strategies. If, e.g., an entire biosynthetic pathway needs to be implemented into a new host, marker recycling is therefore often a necessity. Specifically, if two markers are available for transformation of a host strain, an endless number of genes can in principle be inserted into the same site in the host genome by iterative marker-swapping. In this method, the marker gene, which was used for gene integration in the previous transformation step, is replaced by a new gene targeting construct that contains new GOIs and another marker (Nielsen et al. 2013).

A marker gene can also be eliminated from the genome by spontaneous direct-repeat recombination if the marker is flanked by sufficiently long direct repeats, typically 500–1000 bp (Nielsen et al. 2006). This method may be preferred if it is desirable to leave a genetically neutral DNA sequence (i.e., the direct repeat) as the only scar after genetic engineering, or if it is desirable to integrate several genes or sets of genes into several different loci in the genome. As spontaneous direct-repeat recombination events are rare, this method requires that marker loss is selectable, e.g., by using a counter-selectable marker; see above (Sect. III.B.4). Alternatively, marker loss can be achieved by induced recombination involving either a site-specific recombinase like Cre and a marker flanked by its target sequences, or gene deletion catalyzed by site-specific nucleases like I-SceI or a CRISPR nuclease (Ouedraogo et al. 2016; Zhang et al. 2013).
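
The outcome of direct-repeat recombination can be illustrated at the sequence level: the marker and one repeat copy are lost, and a single repeat remains as the scar. The sketch below is a toy string model in Python, not a simulation of the recombination mechanism; the function name `pop_out_marker` and all sequences are hypothetical, and real repeats would be hundreds of base pairs long.

```python
def pop_out_marker(locus, repeat):
    """Model recombination between the first two copies of `repeat` in `locus`.

    The DNA between the two repeats and one repeat copy are removed,
    leaving a single repeat as the only scar.
    """
    first = locus.find(repeat)
    second = locus.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("locus does not contain two direct repeats")
    # Keep everything up to and including the first repeat,
    # then resume directly after the second repeat.
    return locus[: first + len(repeat)] + locus[second + len(repeat):]


# Toy sequences (hypothetical); lower case marks the marker gene.
repeat = "ACGTACGT"
marker = "ttttmarkertttt"
locus = "GGGG" + repeat + marker + repeat + "CCCC"
print(pop_out_marker(locus, repeat))
```

After the pop-out, the flanking sequences are joined by one repeat copy, so the same marker can be reused at another locus.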

C. Promoters

Among the different bio-blocks, promoters have drawn most attention since they control gene transcription initiation. Promoters are typically divided into two groups: those that are constitutively active, i.e., promoters that are active under all/most circumstances, and those that are inducible/repressible. Often constitutive promoters are preferred for large-scale heterologous production as they are active throughout the fermentation process in inexpensive media. In cases where the product is toxic or unstable, it may be necessary to restrict production to a specific growth phase to maximize yields; and in these cases, it is necessary to use an inducible/repressible promoter. However, as addition of inducing or repressing agents comes with an additional cost, use of this type of promoter may be restricted to small-scale production or exploratory studies.

The vast majority of heterologous gene-expression studies are based on natural promoters of which we have listed sets of frequently used promoters derived from different fungal genera (Table 10.2). Interestingly, many promoters involved in basic metabolism do not seem to be species specific. Hence, promoter versatility allows the same genetic element to be used in many different species, even species belonging to different genera. However, to our knowledge, no comparative studies have systematically analyzed how active a given promoter is in different species. Based on the fair assumption that a promoter is most active in the species from which it originates, and less active when applied in other fungi, it is advisable to equip the GOI with a promoter derived from the intended production host. Importantly, the efficiency of promoters, even constitutive promoters, often depends on the growth environment and growth phase of the host. This has motivated establishment of synthetic promoters that offer orthogonal setups that work independently of the host metabolism. Below we present examples of natural and synthetic promoters.

Table 10.2 Commonly used constitutive and inducible/repressible fungal promoters

1. Natural Promoters

A wide selection of natural fungal promoters for species belonging to Aspergillus, Penicillium, and Trichoderma genera as well as others (Table 10.2) has been experimentally characterized as constitutive or inducible, along with their mode of induction and repression, as reviewed in Fitz et al. (2018), Fleissner and Dersch (2010), and Kluge et al. (2018). Typical constitutive promoters for heterologous expression in fungal cell factories are derived from genes involved in major metabolic pathways or cell functions that require high, steady protein levels. Examples are the gpdA promoter (PgpdA) and the stronger tef1 promoter (Ptef1), which control production of glyceraldehyde-3-phosphate dehydrogenase acting in glycolysis and translation elongation factor 1α assisting in protein synthesis, respectively. Promoters of genes whose products catalyze basic reactions in the cell are often functional if transferred to other species. In line with this view, PgpdA and Ptef1 from A. nidulans have been shown to work in other species of Aspergilli (Nødvig et al. 2015) and in Talaromyces atroroseus (Nielsen et al. 2017), and the PgpdA of A. nidulans was even applied in the fungus Metarhizium anisopliae (Nakazato et al. 2006). Inducible/repressible promoters are also recruited from basic metabolic genes. This includes the commonly used alcohol and threonine inducible alcA promoter, which controls expression of alcohol dehydrogenase I involved in alcohol catabolism, and the thiamine repressible thiA promoter, controlling expression of thiamine thiazole synthase involved in thiamine synthesis. This type of promoter can typically also be expected to function in other species, such as the PthiA of A. oryzae, shown to also be functional in A. nidulans (Shoji et al. 2005), and the A. nidulans PalcA applied in A. fumigatus (Romero et al. 2003). Some very strong inducible promoters depend on degradation of the feedstock, e.g., the starch-inducible promoter of A. niger glaA, controlling expression of a secreted glucoamylase/1,4-alpha-glucosidase for starch hydrolysis. Similarly, a favorite promoter from T. reesei is the cel7a promoter controlling expression of secreted cellobiohydrolase I, which is induced on media containing cellulose or lactose. For promoters that normally drive expression of secreted biomass-degrading enzymes, the conditions inducing the gene expression typically involve co-induction of other endogenous genes of secreted biomass-degrading enzymes. This may lead to production of these enzymes as undesired side-products. The activation mechanisms of such promoters are more specialized than for genes of the central metabolism and may not be directly transferable from species to species. For example, activation of T. reesei Pcel7a involves recruitment of a general transcription factor of cellulase-encoding genes, Xyr1 (Castro et al. 2016), in complex with a non-coding RNA, HAX1 (Till et al. 2018; Table 10.2).

a) Approaches for Identification of Natural Promoters

Classical promoters may often be chosen due to their availability and history rather than due to considerations on whether their expression profile and strength fit the heterologous production process. Hence, in many cases the default choice is to employ the strongest promoter available as a basis for heterologous production. However, in some cases, maximum transcription levels are not desirable. For example, if the goal is to produce a complex secondary metabolite, the set of genes required to synthesize this compound may need to be expressed at different levels to achieve a balanced biosynthetic pathway. Expanding the library of fungal promoters to include members displaying a wide range of different expression strengths and other properties is therefore desirable. Fortunately, their discovery is accelerated by the availability of fully or partially sequenced fungal genomes. For example, by analyzing genome-wide transcription profiles obtained at different growth stages on different relevant growth media, it is possible to identify promoters with properties that are tailored to fit a given process. In the simplest scheme, one may apply small-scale targeted transcriptomic approaches, like RT-qPCR, to determine the promoter strength under certain conditions (Li et al. 2012). Alternatively, large publicly available datasets can be applied, such as the A. niger transcriptome microarrays covering several different growth conditions (Andersen et al. 2008; Breakspear and Momany 2007). Currently, transcriptomic microarray datasets covering 155 different cultivation conditions are available for A. niger (Schäpe et al. 2019). Using this approach to their advantage, Blumhoff and co-workers identified six novel constitutive A. niger promoters with varying expression strength (Blumhoff et al. 2013). The library contains promoters that are stronger and weaker than A. niger PgpdA, displaying ten-fold differences in expression levels of native genes, and when applied to produce the β-glucuronidase (uidA) from E. coli, 1000-fold differences in specific activity yields were obtained. A similar set of T. reesei promoters displaying varying expression strengths has been derived from its glycolytic genes (Li et al. 2012), and in Penicillium chrysogenum both homologous and heterologous promoters have been evaluated for expression efficiency (Polli et al. 2016). It is important to note that although genome-wide transcription profiles provide mRNA levels for specific genes, they do not define the exact promoter sequences that control these genes. To this end, a global map of transcription initiation sites in A. nidulans was generated based on full-scale RNA-sequencing of gene transcripts obtained at six different growth conditions (Sibthorp et al. 2013), and this will facilitate extraction of new A. nidulans promoters as the map simplifies promoter sequence identification.
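
Mining transcription profiles for constitutive promoters of different strengths can be sketched as a simple filter-and-rank step. The Python example below assumes a table of expression values per gene across growth conditions and uses the coefficient of variation as a stability criterion. The cutoff, the function name `constitutive_candidates`, and the data are all hypothetical, and this is not the procedure used in the cited studies.

```python
from statistics import mean, pstdev


def constitutive_candidates(expression, max_cv=0.25):
    """Rank genes whose expression is stable across conditions.

    expression: dict mapping gene name -> list of expression values,
                one value per growth condition.
    max_cv:     highest coefficient of variation (stdev / mean) still
                accepted as "constitutive"; an arbitrary illustrative cutoff.
    Returns a list of (gene, mean_expression) sorted from strong to weak.
    """
    candidates = []
    for gene, values in expression.items():
        avg = mean(values)
        if avg > 0 and pstdev(values) / avg <= max_cv:
            candidates.append((gene, avg))
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Hypothetical expression values for three genes in four conditions.
data = {
    "geneA": [100, 110, 90, 100],  # stable and strong
    "geneB": [10, 12, 9, 11],      # stable and weak
    "geneC": [5, 400, 8, 300],     # strongly condition dependent
}
for gene, level in constitutive_candidates(data):
    print(gene, level)
```

The ranked list gives stable genes spanning a range of expression levels. As noted above, the promoter sequences themselves still have to be delimited separately, e.g., from mapped transcription initiation sites.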

2. Synthetic Promoters and Gene-Expression Systems

Natural promoters are often influenced by the metabolic state of the host, and this may be undesirable for heterologous production. Synthetic biology-based approaches have therefore been adopted to generate artificial/synthetic gene-expression systems that ideally act independently of the host metabolism.

For example, Gressler et al. developed an expression system based on a fusion of the maltose-inducible amyB promoter from A. oryzae with the ORF encoding the transcriptional activator TerR from the Aspergillus terreus terrein gene cluster (Gressler et al. 2015). In addition, the system contains an expression cassette with a bidirectional promoter that binds TerR. Subsequently, the system was used to heterologously express two GOIs in A. niger by addition of maltose.

Another synthetic expression system was developed in T. reesei to facilitate cellulose degradation. Cre1 and Ace1, two general glucose-stimulated repressors of genes involved in production of cellulolytic enzymes, were fused to the activation domain of herpes simplex virus protein 16 (VP16), hence turning glucose repression into gene activation (Zhang et al. 2018).

More advanced orthogonal systems based on synthetic transcriptional regulators and matching synthetic promoter sequences have made gene expression much more controllable. For example, synthetic biology-based methods have elegantly been applied to develop a constitutive expression system, which is almost equally functional in distantly related fungi including A. niger, T. reesei, and several yeasts (Rantasalo et al. 2018). Such synthetic tools bypass the organism specificity of conventional promoters and may become a valuable bio-block for multi-species studies. This system consists of a synthetic transcription factor (sTF) composed of the LexA repressor from E. coli fused to the activation domain from VP16. sTF-dependent expression of the GOI is controlled by promoters with different arrays of LexA binding sites positioned upstream of a fungal core promoter, which was selected for functionality in several species. Low constitutive expression of the gene encoding the synthetic transcription factor is ensured via another universally active fungal core promoter. Importantly, promoter strength can be regulated by varying the number of LexA binding sites in the synthetic promoter. The system was recently combined with CRISPR-Cas9 multiplexing technology enabling triple targeted integration of the GOI cassette in three different loci in T. reesei (Rantasalo et al. 2019).

Orthogonal inducible expression platforms based on the Tet-on and Tet-off systems have been implemented in fungi (Fig. 10.4) (Meyer et al. 2011; Wanka et al. 2016). The Tet-off system is based on the synthetic transcription factor tTA, a fusion of the E. coli repressor TetR with the activation domain of VP16 from herpes simplex virus, which binds to tetO operator elements in the absence of the tetracycline derivative doxycycline (Dox) (Gossen and Bujard 1992). In contrast, the Tet-on system is based on rtTA-M2 (or other variants of rtTA) (Gossen et al. 1995; Urlinger et al. 2000), which are mutated versions of tTA that bind to tetO operator elements in the presence of Dox. A Dox-responsive synthetic promoter is made by positioning tetO elements upstream of a core promoter, which provides a low or undetectable basal expression level in the absence of the synthetic transcription factors tTA and rtTA-M2 (Wanka et al. 2016). In the Tet-on system, which constitutively produces rtTA-M2, the promoter is activated by addition of Dox, whereas in the Tet-off system, which constitutively produces tTA, the promoter is activated in the absence of Dox. Note that the Tet-on system is often preferred over Tet-off due to its faster response time (Wanka et al. 2016).
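
The switching behavior of the two systems reduces to a small truth table. The following Python sketch encodes only the qualitative logic described in the paragraph above (binding of tTA or rtTA to tetO as a function of Dox); the function name `promoter_active` is a hypothetical illustration, and basal expression, dose response, and response time are ignored.

```python
def promoter_active(system, dox_present):
    """Return True if the tetO-containing promoter is expected to be active.

    system:      "Tet-on" (cells produce rtTA) or "Tet-off" (cells produce tTA).
    dox_present: whether doxycycline (Dox) is present in the medium.
    """
    if system == "Tet-on":
        # rtTA binds the tetO elements only in the presence of Dox.
        return dox_present
    if system == "Tet-off":
        # tTA binds the tetO elements only in the absence of Dox.
        return not dox_present
    raise ValueError(f"unknown system: {system!r}")


# Print the full truth table for both systems.
for system in ("Tet-on", "Tet-off"):
    for dox in (False, True):
        print(system, "Dox present:", dox, "-> active:", promoter_active(system, dox))
```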

Fig. 10.4

Graphic representations of the Tet-on and Tet-off expression systems. (a) Tet-on system. (b) Tet-off system. In the Tet-on system, the Dox ligand acts as an activator of the synthetic TF rtTA2s-M2, allowing it to bind to the tetO7 sites of the promoter. In the Tet-off system, the Dox ligand represses binding of the synthetic TF tTA2s to the tetO7 sites of the promoter; see main text for details

D. Terminators

Terminators coordinate transcription termination and the extent of polyadenylation at the 3′-end of the new transcript, which in turn is important for its nuclear export and stability (Bentley 2014). In fungi, gene-expression cassettes have typically employed terminators derived from tef1 and trpC from Aspergilli, and it appears that terminators can be functionally transferred from one fungal species or genus to another. For example, A. nidulans tef1 and trpC terminators have successfully been used in various Aspergilli (Nødvig et al. 2015) and T. atroroseus (Nielsen et al. 2017), while the terminators of T. reesei cbhII and pdc were applied in A. niger (Blumhoff et al. 2013) and the basidiomycete Ganoderma lucidum (Qin et al. 2017), respectively. Since functionality is often conserved during heterologous application of terminators, an alternative approach for gene expression uses the GOI and its natural terminator directly as a functional unit for assembly of the gene-expression cassette (Gressler et al. 2015; Li et al. 2018).

We note that no thorough comparative analyses of the impact of terminators on the overall protein production have been performed for filamentous fungi despite the prominent roles of terminators in the RNA life-cycle. However, in other expression systems, e.g., in yeasts, the choice of terminator has been demonstrated to significantly influence production yields (Curran et al. 2013; Morse et al. 2017), and it may therefore be useful to assemble expression cassettes with terminator variants if production yields are suboptimal. To this end, we also note that small functional synthetic terminator bio-blocks have been developed for S. cerevisiae to facilitate gene-expression cassette assembly and control of production yields (Curran et al. 2015).

E. Protein Tags and Linkers

In many cases, it is desirable to expand the sequence of a protein with additional sequences encoding domains that provide new properties to the protein to facilitate its secretion, purification, or visibility (Table 10.3). These new domains often need to be attached to the main protein via linker sequences, which may act solely as spacers (Chen et al. 2013), or contain a proteolytic site that allows for removal of the attached domain (in vivo or in vitro) when its function is no longer required. This adds to the complexity of the gene-expression cassette structure as additional bio-blocks need to be designed and incorporated into the cassette at the appropriate positions.

Table 10.3 Protein-tag and functional-linker sequences

The design of these bio-blocks depends on the size of the new functional unit, and strategies for their assembly are presented in Fig. 10.5. Firstly, bio-blocks encoding large additions (>30 amino acid residues) like fluorescent proteins can be made as individual bio-blocks by PCR. Similar to the assembly of the basic gene-expression cassette (in Fig. 10.1), proper incorporation of the new bio-block is ensured by sequences in the tails of the primers used to generate the individual bio-blocks, which can be enzymatically cut to provide overhangs that direct the correct fusion order of all components. Secondly, for medium-size bio-blocks encoding sequences of 10–30 amino acid residues (e.g., secretion signals), the information can be incorporated into each of two oligonucleotides that are annealed in vitro to form the complete bio-block. Note that small sequence extensions may be added to the oligonucleotides to produce bio-blocks that are directly equipped with short ssDNA overhangs for bio-block assembly. Thirdly, for shorter sequences (<10 amino acid residues), e.g., purification tags or proteolytic sites, the information may be incorporated into the primer tails of neighboring bio-blocks. For example, if the C-terminus of a protein needs to be extended with a poly-histidine tag for purification, the primer tail used to link the GOI to the next bio-block could be elongated with the sequence encoding the tag. The combinations in which these basic strategies can be employed to construct GECs are numerous, and as the prices of DNA synthesis decrease, it is likely that even large protein tags will be synthesized de novo rather than by PCR amplification.
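
The size-based choice between the three strategies can be written as a short decision function. The Python sketch below uses the thresholds given in the text (>30, 10–30, and <10 amino acid residues); the function name `bioblock_strategy` and the example lengths for a secretion signal and a fluorescent protein are illustrative assumptions.

```python
def bioblock_strategy(n_residues):
    """Suggest how to build a bio-block encoding `n_residues` amino acids."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    if n_residues > 30:
        return "PCR amplification as an individual bio-block"
    if n_residues >= 10:
        return "annealing of two oligonucleotides"
    return "primer tail of a neighboring bio-block"


# Illustrative additions; the lengths are assumptions for the example.
examples = {
    "hexa-histidine tag": 6,
    "secretion signal": 18,
    "fluorescent protein": 238,
}
for name, length in examples.items():
    print(f"{name} ({length} aa): {bioblock_strategy(length)}")
```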

Fig. 10.5

Assembly strategies for employing protein tags. (a) Bio-blocks can be generated by three general strategies depending on their size: PCR amplification, annealing of two oligonucleotides, or incorporation into the tail of a primer used to PCR amplify another bio-block. (b) The resulting bio-blocks can be fused in a directed manner as illustrated in Fig. 10.1

IV. Heterologous Protein Production in Filamentous Fungi

Filamentous fungi serve as favorite hosts for production of industrial enzymes due to their superior protein secretory potential. This potential has been further improved by classical mutagenesis yielding strains with superior protein secretion properties. This includes A. niger CBS 513.88 and T. reesei RUT-C30, which can reach titers of 30 and 100 g/L of endogenous cellulolytic enzymes by optimized cultivations, respectively (Cairns et al. 2018; Cherry and Fidantsef 2003). Detailed mechanistic insights into the secretory pathway combined with fully sequenced genomes and a diverse range of omics data set the stage for rational strain engineering. In this section, we will present examples of genetically modified strains, which serve as platforms for heterologous protein production. This includes strains with improved production and secretion physiology obtained by introducing defined mutations influencing processes ranging from the delivery of amino acid building blocks for protein synthesis to processes involved in exocytosis.

Specifically, we will treat two major topics: firstly, engineering the secretory pathway, focusing on transport signals, the associated processes of protein glycosylation and folding, and vesicular trafficking; and secondly, expression in protease-deficient strains as a method for reducing degradation of the heterologous protein.

A. Engineering the Secretory Pathway

The protein secretion pathway is complex and provides several processes that can be engineered to enhance production yields. Generally, protein secretion processes can be grouped into three major themes: firstly, secretion signals that mediate translocation into the endoplasmic reticulum (ER); secondly, folding and glycosylation of secretory proteins; and thirdly, the complex machinery that transports proteins through the secretory pathway and leads to secretion by exocytosis. Below, we provide examples showing how each of these steps can be optimized by rational genetic engineering.

1. Transport Signals

A significant obstacle in optimizing secretion of a heterologous protein is choosing the right pre- and pro-sequence, as protein secretion is inefficient when suboptimal signals are applied. Pre-sequences encode secretion signals that mediate protein translocation into the ER, while pro-sequences may facilitate the folding process and maintain proteins in an inactive form until the pro-sequence is removed. One of three general approaches is typically used to ensure secretion of a heterologous protein. In the first approach, secretion of the recombinant protein simply relies on the secretion signals of the native protein. In the second approach, the N-terminus of the recombinant protein is extended with either a pre- or pre-pro-sequence from a highly secreted protein of the production host (or from a closely related species). In the third approach, the N-terminus is extended with a secreted carrier protein.

The first, and simplest, approach employs the native secretion signal of the heterologous protein. However, the fact that the sorting signals are from a different organism raises questions concerning their ability to support efficient secretion in the host. The observation that heterologous proteins originating from species closely related to the host often sort quite efficiently suggests that their secretion signals are functional.

For example, glucoamylase (glaA) from A. niger and rhamnogalacturonate lyase A (rglA) from Aspergillus sojae have been efficiently produced via their own secretion signals in A. nidulans (Schalén et al. 2016) and in A. oryzae (Yoshino-Yasuda et al. 2012), respectively.

Secretion signals may also be functional between more distantly related fungi, as observed with two basidiomycete laccases, Lcc1 of Pycnoporus coccineus and Lcc of Pycnoporus sanguineus, successfully secreted by the hosts A. oryzae and A. nidulans using the proteins' native secretion signals, albeit with low yields (Hoshida et al. 2005; Li et al. 2018). More efficient secretion may be achieved by using host endogenous signals as they are expected to be more proficiently recognized and processed by the host. The second approach therefore employs the pre- or pre-pro-sequence of a highly secreted protein originating from the host, or a closely related species, to enhance secretion of the heterologous protein.

A successful example of this approach is production of basidiomycete laccase, Lac1, from Pycnoporus cinnabarinus in A. niger (Record et al. 2002). In this study, an 80-fold increase in extracellular activity was achieved by replacing the natural pre-sequence of Lac1 with the pre-pro-sequence of A. niger glaA.

In the final approach, a carrier protein is fused to the heterologous protein to improve production by alleviating post-translational bottlenecks and, in some cases, by increasing mRNA levels (Gouka et al. 1997). Comparative DNA microarrays of A. oryzae strains expressing heterologous bovine chymosin, with and without a carrier protein (AmyB), showed that inclusion of the carrier protein promoted induction of the UPR (see Sect. IV.A.2.b) and increased expression of genes encoding secretory chaperones and proteins involved in intracellular trafficking, thereby facilitating folding and secretion (Ohno et al. 2011). Examples of proteins that have been used as carriers are GlaA, AmyB, and CbhI from A. niger, A. oryzae, and T. reesei, respectively. To liberate the heterologous protein from the carrier protein in vivo, the two proteins are fused via a linker containing a Kex2 cleavage site (KR/RR) to enable proteolytic separation in the late-Golgi (Jin et al. 2007; Landowski et al. 2016).

Employing this approach, Gouka et al. used A. niger GlaA as carrier protein for heterologous production of human interleukin-6 in Aspergillus awamori, thereby improving the extracellular protein yield 100-fold (Gouka et al. 1997).

In an interesting variant of this approach, Jin et al. produced human lysozyme in A. oryzae using a gene cassette containing the amyB gene fused to tandem gene copies of HLY encoding lysozyme, which were separated by sequences encoding Kex2 cleavage sites (Jin et al. 2007). However, releasing the heterologous protein from the carrier protein may be a bottleneck. For example, during production of human interferon (IFNα-2b) in T. reesei using Cbh1 as a carrier protein, 44% of the fusion protein was not cleaved by Kex2 (Landowski et al. 2016). More efficient Kex2 cleavage has been achieved by overexpressing kex2 (Landowski et al. 2016) or by optimizing Kex2 cleavage sites (Lin et al. 2006; Nakajima et al. 2006; Yang et al. 2013), e.g., by using the entire pro-sequence of GlaA, NVISKR, as a Kex2 cleavage site (Landowski et al. 2016).
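
The expected products of a carrier fusion can be previewed in silico by cutting the fusion sequence after each dibasic Kex2 site. The Python sketch below is a deliberately naive model: it cuts after every KR or RR in the sequence, assumes complete cleavage, and ignores sequence context, so internal dibasic motifs in a real protein would be flagged as well. The function name `kex2_fragments` and the carrier and target sequences are hypothetical placeholders; only the NVISKR linker is taken from the text.

```python
def kex2_fragments(protein, sites=("KR", "RR")):
    """Split a protein sequence C-terminally of every dibasic site in `sites`."""
    fragments = []
    start = 0
    i = 0
    while i < len(protein) - 1:
        if protein[i:i + 2] in sites:
            # Cut after the second residue of the dibasic site.
            fragments.append(protein[start:i + 2])
            start = i + 2
            i += 2
        else:
            i += 1
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


# Toy fusion: carrier, linker, and two tandem target copies (placeholder sequences).
carrier = "MAAAGGG"
linker = "NVISKR"
target = "TTTEEE"
fusion = carrier + linker + target + linker + target
print(kex2_fragments(fusion))
```

In this toy model the linker remains attached to the C-terminus of the upstream fragment, while the final target copy is released without any extension. Running such a scan on a real GOI before cassette design is one way to spot internal KR/RR motifs that might be worth checking experimentally.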

It is important to stress that the choice of optimal signal sequences is not straightforward. For example, Rantasalo et al. optimized secretion of lipase B from Candida antarctica (CalB) in T. reesei (Rantasalo et al. 2019) by using natural secretion signals of the host proteins CbhI and CbhII and secretion signals from heterologous AmyA and GlaA from A. awamori and A. niger, respectively, and by using CbhI as a carrier protein. When the lipase yields obtained with the different strains were compared, the strain employing the secretion signal of AmyA performed best. On the other hand, studies on heterologous production of Cbh1 in A. awamori using different secretion signals showed no differences, illustrating that gains may be dependent on the heterologous protein and/or the host (Adney et al. 2003; Chou et al. 2004).

2. Glycosylation and Folding

Glycosylation of the heterologous protein is important for its subsequent folding, and if either process fails, the protein activity may be impaired or abolished. In the heterologous production host, these processes may be executed differently than in the native producer, leading to reduced yields or non-functional proteins. Therefore, substantial efforts have been made to improve both processes by increasing or reducing expression of genes encoding proteins involved in the glycosylation and folding machinery of the production host to enhance protein production.

a) Glycosylation

Protein glycosylation has a multifaceted impact on heterologous protein production. Firstly, it affects protein folding and secretion efficiencies; secondly, it influences the final properties of the enzyme including its stability, solubility, and catalytic parameters; and finally, it acts to protect the enzymes from proteases (Gupta and Shukla 2018; Li and d’Anjou 2009). In agreement with this view, mutation of the four N-glycosylation sites of A. terreus β-glucosidase resulted in reduced thermal stability and 15–35% decreased specific activity and catalytic rate as compared to the wild-type variant when heterologously produced by T. reesei (Wei et al. 2013). Hence, proper glycosylation is often required to achieve high production yields. For example, analysis of bovine chymosin produced by A. niger showed that an N-glycosylation site was poorly glycosylated by this host. Importantly, the extracellular activity yields could be increased by up to 100% in A. niger by optimizing the glycosylation efficiency of this site by site-directed mutagenesis (van den Brink et al. 2006). Similarly, production yields and quality may also be increased by engineering entirely new glycosylation sites into proteins. Using this approach, heterologous production activity yields of cellobiohydrolase Cel7A from Penicillium funiculosum in A. awamori were increased by 70% by engineering a novel N-glycosylation motif into the protein at position N194 (Adney et al. 2009). On the other hand, production yields and quality may be decreased if non-native glycosylation sites are unintentionally or aberrantly glycosylated by the glycosylation machinery of the heterologous host. Hence, a 70% increase of T. reesei Cel7A activity yields was achieved by removing a glycosylation site (N384), which contained a larger glycan structure when using A. awamori as production host than in the native organism T. reesei (Adney et al. 2009).
Fungi may also be used as hosts for production of therapeutic proteins. The fact that different species add different sugar structures to proteins poses an additional production challenge, as therapeutic proteins containing aberrant sugar moieties may cause immune responses in patients. In yeasts, this challenge has been successfully addressed by humanizing the glycosylation pathway by genetic engineering (Gupta and Shukla 2018), indicating that production of therapeutic proteins may also be possible in filamentous fungi after dedicated strain engineering (Anyaogu and Mortensen 2015).
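
When planning the removal or introduction of N-glycosylation sites as described above, a first step is usually to locate candidate sites in the protein sequence. The following is a minimal sketch, assuming the canonical N-X-S/T sequon (X being any residue except proline); the helper names and the toy peptide are hypothetical and do not correspond to any of the enzymes cited. Whether a predicted sequon is actually glycosylated in a given host must still be determined experimentally.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def mutate(protein: str, position: int, residue: str) -> str:
    """Return a copy of the sequence with one residue replaced (1-based position)."""
    i = position - 1
    return protein[:i] + residue + protein[i + 1:]


# Made-up peptide for illustration only; not a real enzyme sequence.
toy = "MKANGTLLNPSAANKSQ"

print(find_sequons(toy))                    # sequons in the original peptide
print(find_sequons(mutate(toy, 4, "Q")))    # N4Q removes the first sequon
print(find_sequons(mutate(toy, 10, "A")))   # P10A creates a new sequon at N9
```

The same scan can be run on the native and the engineered sequence to confirm that a planned substitution removes or creates only the intended site.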

b) Chaperones and ER Stress

Overexpression of secretory proteins may result in accumulation of misfolded proteins in the ER, and this problem poses a severe bottleneck toward high yields of secreted proteins. Cells respond to misfolded proteins in the ER by triggering the unfolded protein response (UPR). Accumulation of misfolded proteins is sensed by the ER transmembrane protein Ire1. Normally, Ire1 forms an inactive complex with the chaperone BipA, but as BipA is increasingly recruited to assist in folding, Ire1 forms a nucleolytically active dimer that catalyzes splicing of the non-conventional intron in the hac1 mRNA (Krishnan and Askew 2014). The processed hac1 mRNA encodes the mature Hac1 transcription factor, which in turn triggers transcription of UPR genes. These include genes encoding ER-resident molecular chaperones like BipA, the heat-shock protein family of chaperones (Hsp104, Hsp70, Hsp90) and the protein disulfide-isomerase PdiA, thus satisfying the increased demand for folding capacity (Zubieta et al. 2018). Persistently misfolded proteins are deleterious to the cell and are degraded through the ERAD (ER-associated degradation) pathway that includes transport to the cytoplasm and ubiquitin-mediated degradation by the proteasome (Carvalho et al. 2011).

Several strategies have been pursued to optimize folding. These include defensive methods, like lowering the growth temperature or applying weaker promoters, but also potentially more rewarding methods based on stimulating the folding potential via genetic engineering of the host organism's folding machinery.

To promote heterologous protein production in A. awamori, the UPR pathway has been constitutively activated by overexpressing an intron-free variant of the hac1 gene. Using this strain, production of Trametes versicolor laccase (Lcc1) was increased sevenfold and bovine chymosin 2.8-fold (Valkonen et al. 2003). Similarly, heterologous production of A. niger glucose oxidase (Gox) in a T. reesei strain overexpressing intron-free hac1 increased production 1.8-fold (Wu et al. 2017). In comparison, overexpression of the bipA chaperone alone in T. reesei increased production of Gox by 1.5-fold.

Folding of proteins containing multiple disulfide bonds may be particularly challenging and may benefit from increased levels of the protein disulfide isomerase PdiA. In agreement with this, production of thaumatin, a sweet-tasting plant protein containing eight disulfide bonds, was increased twofold by using an A. awamori strain overexpressing pdiA (Moralejo et al. 2001). Since heterologous protein production appears to benefit from increased levels of host chaperones, it is tempting to speculate that more process-specific folding assistance could be achieved by co-expressing genes encoding one or more chaperones from the natural source of the protein of interest.

3. Vesicular Trafficking and Polarized Growth

Filamentous fungi propagate as hyphae and the general view is that most protein secretion occurs at the hyphal tip (Cairns et al. 2019). Secreted proteins are packed into vesicles, either by cargo-receptors or as the result of bulk flow, thereby mediating transport from the ER to the Golgi and from the Golgi to the plasma membrane (Barlowe and Miller 2013). Most vesicles accumulate at the Spitzenkörper at the apex of the hyphae before fusing to the membrane in a process that depends on the exocyst octamer (Ahmed et al. 2018; Riquelme and Sánchez-León 2014). Since proteins are synthesized throughout the hyphae, vesicles are transported by an elaborate transport system based on actin filaments and microtubules to ensure efficient transport to the hyphal apex (Steinberg et al. 2017). Many of the individual steps in the secretory pathway are understood in molecular detail, and a popular strategy aiming at enhancing secretory transport is to overexpress genes encoding proteins that are directly involved in transport. For example, loading of cargo proteins into vesicles and targeting vesicles to a destination membrane have been engineered to increase secretion as exemplified below.

To stimulate secretion of a heterologous protein, the vesicular transport was enhanced by overexpressing rabD, encoding a Rab GTPase involved in transport of exocytic post-Golgi vesicles (Pantazopoulou et al. 2014), the deletion of which reduces protein secretion (Punt et al. 2001). Specifically, production of mRFP, fused to the carrier glucoamylase via a Kex2-cleavable linker (Kex2cl), was increased by 40% in an A. nidulans rabD overexpression background (Schalén et al. 2016). Hoang and co-workers showed that Vip36, a putative lectin-type cargo receptor inferred to be involved in vesicular cargo loading of glycoproteins between ER and Golgi, facilitates secretion of the recombinant fusion proteins AmyB-Kex2cl-GFP and AmyB-Kex2cl-chymosin in A. oryzae, presumably by reducing protein retention in the ER (Hoang et al. 2015). Importantly, overexpression of vip36 in strains producing the recombinant GFP fusion protein led to a sevenfold increase in the levels of extracellular GFP (Hoang et al. 2015). Finally, docking of vesicles to a target membrane was stimulated by overexpressing snc1 encoding a vSNARE, and this feature increased secretion of A. niger glucose oxidase in T. reesei by 2.2-fold (Wu et al. 2017).

Similarly, specific genes involved in the secretory pathway can be deleted to positively influence heterologous protein production. For example, the vacuolar protein sorting receptor Vps10 mediates transport of proteins from the Golgi to the vacuoles, and it has been observed that heterologous proteins designated for secretion partially accumulate in the vacuoles (Masai et al. 2003). This principle has been exploited in A. oryzae where a deletion of vps10 improved production of recombinant human lysozyme and bovine chymosin 2.2- and 3-fold, respectively (Yoon et al. 2010). Even the micromorphology of the mycelium can be manipulated to enhance protein secretion. In an elegant study, Fiedler et al. produced a hyperbranching A. niger strain by deleting the gene encoding the Rho G-protein RacA, a key protein in maintaining polarized growth (Kwon et al. 2011), and this feature increased glucoamylase production fourfold (Fiedler et al. 2018).

C. Building Blocks for Protein Synthesis

Many filamentous fungi secrete large amounts of enzymes, and production of these proteins consumes building blocks that could be used for the desired heterologous protein. Hence, additional building blocks for heterologous protein production can be obtained by deleting genes encoding the most abundant secreted proteins. This strategy has the further advantages that it reduces pressure on the secretory pathway machinery and that subsequent product purification is simplified. In its simplest form, this approach can be implemented by deleting genes encoding major secreted products, such as α-amylase A and B of A. oryzae, thereby providing a strain with no detectable α-amylase activity (Kitamoto et al. 2015). Toward the same goal, genes encoding transcriptional activators of major cellulolytic enzymes have been deleted in both A. niger and T. reesei.

In A. niger, deletion of amyR reduced the total amount of secreted protein 16.4-fold (Zhang et al. 2016b). Likewise, deletion of xyr1 in T. reesei abolished expression of the genes cbhI and cbhII encoding the major cellobiohydrolases CBHI and CBHII, which may constitute up to 80% of the total amount of secreted protein (Bergquist et al. 2004; Stricker et al. 2006).

While these approaches diminished expression of endogenous enzymes, it may also be advantageous to modify the GOI expression cassette. To this end, Rantasalo et al. replaced the carrier protein used to drive secretion of a heterologously produced protein with a smaller secretion signal, thereby reducing the total secreted protein twofold and implying a substantial increase in recombinant enzyme purity (Rantasalo et al. 2019).

Comparative transcriptomics approaches have been used to uncover engineering targets that enhance heterologous protein production. For example, processes related to biosynthesis of amino acids and tRNAs are upregulated in strains of A. nidulans overexpressing heterologous enzymes (Zubieta et al. 2018) and in A. niger CBS 513.88, a classic enzyme production strain (Andersen et al. 2011). Hence, building block availability may be a limiting factor that could be improved by genetic engineering, e.g., by increasing expression of genes encoding specific transporters or genes required for making aminoacyl-tRNAs.

D. Protease-Deficient Strains

The saprophytic lifestyle of filamentous fungi is key to their high secretion capacity. In addition to facilitating heterologous protein production, a consequence of this lifestyle is secretion of large amounts of proteases, which the fungi use for extracellular biomass degradation. For example, proteomic analysis of the T. reesei QM6a secretome under submerged cultivation showed pH-dependent secretion of 39 peptidases and proteinases, with more than 20 secreted simultaneously (Adav et al. 2011). Since these proteases need to digest a broad spectrum of substrates, the heterologous protein is likely to be degraded as well, leading to decreasing activity of the recombinant protein over time (Kamaruddin et al. 2018). In addition, intracellular proteases in the secretory pathway, such as Kex2, may also degrade recombinant proteins, which can be resolved by mutagenizing the cleavage site (Lin et al. 2006). The fact that addition of protease inhibitors, such as trypsin and aspartic protease inhibitors (Landowski et al. 2015), or use of strains with different protease profiles (Sun et al. 2016) can lead to significantly higher protein yields suggests that a strategy aiming at producing protease-deficient strains could be advantageous. However, whereas elimination of some proteases does not impact fitness, others may be involved in cell wall maintenance, and their absence often causes an altered, undesired morphology. Hence, a gain in product yield may be lost in growth rate or overall productivity, and the relevant strain modifications vary from product to product even with the same host (Landowski et al. 2015).

In a simple case, Li et al. applied A. nidulans as host for heterologous production of a P. sanguineus laccase (Pslcc) and demonstrated that deletion of two genes encoding major proteolytic activities, the dipeptidyl-peptidase DppV and the aspartic protease PepA, increased laccase production 13-fold (Li et al. 2018).

In a more elaborate study, Kitamoto et al. generated an A. oryzae strain with five protease gene deletions (alp1, npI, npII, pepA, pepE) thereby reducing extracellular protease activity to 1% (Kitamoto et al. 2015) and providing a production strain with minimal proteolysis.

In the post-genomic era, it is possible to make a rational gene deletion strategy based on proteomics where abundant proteases can be identified and linked to their genes. Using this strategy in T. reesei, nine target genes (pep1, tsp1, slp1, gap1, gap2, pep4, pep3, pep5, amp2) were identified and deleted to produce a strain where the majority of proteolytic product degradation is abolished (Landowski et al. 2015, 2016). The authors employed this strain for production of human interferon alpha-2b (IFNα-2b) and obtained 2.4 g/L of correctly processed IFNα-2b.

As an alternative to deleting individual protease genes, similar effects may be achieved by RNAi silencing of individual genes (Kitamoto et al. 2015; Qin et al. 2012). In another approach, reduction of a set of proteases was achieved by deleting a general transcription factor gene, namely, prtT in A. niger, which reduced the levels of several secreted proteases, including PepA, PepB, and PepF. When Glomerella cingulata cutinase was produced in this A. niger strain, a 25-fold increase in residual activity was observed in culture filtrates after 2 weeks (Kamaruddin et al. 2018).

V. Heterologous Production of Secondary Metabolites in Filamentous Fungi

The genome sequencing projects have uncovered a vast repertoire of fungal SMs, which serves as an underexploited source of new food additives and drug candidates. This process has been facilitated by the fact that all genes required for production of a specific SM are typically organized in a biosynthetic gene cluster (BGC) (Keller 2019; Keller et al. 2005; Rokas et al. 2018). However, it is difficult to exploit the fungal SM potential for several reasons. Firstly, the vast majority of the fungal SM BGCs are silent; secondly, many producer species are difficult to propagate in bioreactors; thirdly, many compounds are produced in tiny amounts; fourthly, no genetic toolbox exists for new natural production hosts; and finally, the “generally recognized as safe” (GRAS) production status may be more difficult to achieve with a new species.

Heterologous expression of fungal BGCs using synthetic biology-based approaches in well-characterized fungal cell factories provides avenues to speed up SM discovery, characterization, and production. On the other hand, heterologous SM production is challenged by several features including toxicity of some SMs, poor gene annotations, intron splicing differences, compartmentalization of pathways, and the requirement for simultaneous expression of many SM genes. Unlike heterologous production of industrial enzymes, heterologous production of SMs is still in its infancy, and most studies aim at product discovery and pathway elucidation rather than large-scale production. In this section, we will present examples of how heterologous SM-gene expression has contributed to expanding our insights into SM biosynthesis, as well as challenges toward their production.

A. Challenges in Heterologous Secondary Metabolite Production

1. Product Toxicity

Many SMs are antimicrobials, and they may therefore impair the growth of, or even kill, the new heterologous producer strain. In case the task is to produce large amounts of a known SM, the first experiment should therefore be to test whether the host can tolerate the desired SM. If SM toxicity is a problem, it may be necessary to develop a resistant strain. If the resistance mechanism is known in the native producer species, it may be possible to transfer the resistance mechanism from the native host to the new producer strain.

For example, mycophenolic acid (MPA) from Penicillium brevicompactum kills A. nidulans by inhibiting its inosine-5′-monophosphate dehydrogenase (IMPDH). However, A. nidulans can be engineered to tolerate MPA by inserting mpaF of the P. brevicompactum mpa BGC, which encodes an MPA-insensitive IMPDH, into its genome (Hansen et al. 2011a).

Toxicity can also be avoided or reduced by introducing a pump that exports the new compound. For example, the gliotoxin sensitivity of a sirodesmin transporter-deficient strain of Leptosphaeria maculans can be rescued by introduction of the transporter gene gliA from A. fumigatus (Gardiner et al. 2005). Similarly, S. cerevisiae expressing the mlcE efflux pump gene from the compactin BGC is protected against statins (Ley et al. 2015). A different mode of detoxification is based on glycosylation of the SM, a principle which is commonly used in plants (Sandermann Jr. 1992), but which has also been observed in fungi. For example, a toxic intermediate is glycosylated during yanuthone production by A. niger (Holm et al. 2014), and during co-culturing, Trichoderma species use glycosylation as a defense against deoxynivalenol, a toxin produced by Fusarium graminearum (Tian et al. 2016). For cell factory construction, this strategy has been applied for vanillin production in S. cerevisiae by expressing a gene encoding a glycosyltransferase from Arabidopsis thaliana (Brochado et al. 2010).

If the task of heterologous production is product discovery, rather than large-scale production, toxicity issues may be less important as only small amounts are sufficient for product identification. However, if weak growth of the new cell factory is observed, it may be advantageous to use inducible promoters to control expression of key genes in the pathway. For example, the inducible alcA promoter was fused to genes in the asperfuranone BGC to avoid toxicity issues during pathway elucidation (Chiang et al. 2013).

2. Genome Annotation and Intron Splicing

Heterologous production requires that the genes in the native producers are correctly annotated, and for many sequenced genes this may not be the case. In cases of doubt, it may be necessary to determine the correct 5′-end of the transcript by the rapid amplification of cDNA ends (RACE) technique (Frohman et al. 1988). Subsequently, introns and the correct stop codon can be identified by generating a complete cDNA of the gene by RT-PCR, exploiting that the 3′-end of the transcript is polyadenylated. However, in cases where the GOI is silent, this strategy is not possible, and it may therefore be advisable to produce a set of gene constructs covering different combinations of start and stop codon possibilities.
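
The construct set suggested above can be enumerated systematically. The sketch below is a hypothetical illustration, not a published tool: the function name, the coordinates, and the toy sequence are assumptions. It pairs every candidate start codon with every candidate stop codon on a genomic fragment and deliberately does not check the reading frame, because unannotated introns between the two codons may shift it.

```python
from itertools import product

STOPS = {"TAA", "TAG", "TGA"}


def candidate_constructs(sequence, starts, ends):
    """Yield (start, end, fragment) for every valid start/stop pairing.

    starts: 0-based indices of candidate ATG start codons.
    ends:   0-based indices just past each candidate stop codon.
    No reading-frame check is made, since introns may lie in between.
    """
    seq = sequence.upper()
    for s, e in product(sorted(starts), sorted(ends)):
        if s < e and seq[s:s + 3] == "ATG" and seq[e - 3:e] in STOPS:
            yield s, e, seq[s:e]


# Made-up genomic fragment with two possible starts and two possible stops.
toy = "CCATGAAATGCCCTAAGGTTGACC"
for start, end, fragment in candidate_constructs(toy, [2, 7], [16, 22]):
    print(start, end, fragment)
```

Each fragment would correspond to one gene construct to be tested in the host; the number of constructs grows as the product of the candidate counts, so it pays to keep both lists short.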

Intron recognition and splicing differ between species, especially for phylogenetically distant organisms (Kupfer et al. 2004), and flawed splicing may reduce or even prevent formation of the desired protein as described in Sect. III.A.1. For example, introns are more abundant in basidiomycetes than in ascomycetes (Stajich et al. 2007). Moreover, introns from basidiomycetes may not be recognized properly in ascomycetes.

In agreement with this, a recent study using A. oryzae as a cell factory demonstrated that only half of the mRNA species produced by 30 terpene synthase genes from two basidiomycetes, Clitopilus pseudo-pinsitus and Stereum hirsutum, exhibited correct splicing patterns (Nagamine et al. 2019).

Even among species that are more closely related, e.g., within ascomycetes, transfer of genes between species may cause splicing errors. For example, heterologous expression of the 3-methylorcinaldehyde synthase gene from Acremonium strictum (Fisch et al. 2010), the avirulence gene ACE1 from Magnaporthe oryzae (Song et al. 2015), and the citrinin synthase gene from Monascus ruber (He and Cox 2016) in A. oryzae resulted in incorrect processing of introns.

Moreover, some mRNAs of filamentous fungi are regulated by alternative splicing (Kempken 2013; Zhao et al. 2013). Hence, populations of different splice variants were detected in the transcriptome (Wang et al. 2010) with consequences for protein levels (Chang et al. 2010). Indeed, new splice variants may be inactive or the correct variant may be produced only in small amounts (He and Cox 2016; Kempken and Windhofer 2004). Mapping mRNA splicing in the native producer, if possible, is therefore advisable; and if it is different in the new host, it may be necessary to produce different synthetic gene variants covering different splice variants to achieve heterologous production.
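
To illustrate how a set of synthetic gene variants covering different splice outcomes might be drawn up, the sketch below enumerates every retain-or-remove combination of the annotated introns and flags the transcripts that still form a single open reading frame. It assumes that only intron retention is considered (alternative splice sites and exon skipping are ignored); the function names, intron coordinates, and toy sequence are hypothetical.

```python
from itertools import product

STOPS = {"TAA", "TAG", "TGA"}


def splice_variants(gene, introns):
    """Yield (retained_flags, transcript) for every retain/remove combination.

    gene:    genomic coding region from the start codon through the stop codon.
    introns: list of (start, end) 0-based half-open intervals, non-overlapping.
    """
    introns = sorted(introns)
    for flags in product([False, True], repeat=len(introns)):
        pieces, pos = [], 0
        for (start, end), retained in zip(introns, flags):
            # Keep the intron sequence only if it is retained in this variant.
            pieces.append(gene[pos:end] if retained else gene[pos:start])
            pos = end
        pieces.append(gene[pos:])
        yield flags, "".join(pieces)


def has_intact_orf(transcript):
    """True if the transcript is one ORF: ATG start and a stop codon only at the end."""
    if len(transcript) % 3 or not transcript.startswith("ATG"):
        return False
    codons = [transcript[i:i + 3] for i in range(0, len(transcript), 3)]
    return codons[-1] in STOPS and not STOPS.intersection(codons[:-1])


# Made-up gene: three exons separated by two GT...AG introns.
e1, i1, e2, i2, e3 = "ATGGCA", "GTAAGTCTAG", "GGTCCA", "GTATGTAG", "TTCTAA"
gene = e1 + i1 + e2 + i2 + e3
introns = [(6, 16), (22, 30)]

for flags, transcript in splice_variants(gene, introns):
    print(flags, has_intact_orf(transcript), transcript)
```

An intact reading frame is only a necessary condition; which variant encodes the active enzyme still has to be established experimentally, ideally guided by splicing data from the native producer.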

3. Compartmentalization of Secondary Metabolite Biosynthetic Pathways

Fungal secondary metabolite biosynthetic pathways are often compartmentalized into different subcellular locations, such as ER, vesicles, peroxisomes, cytoplasm, or vacuoles (Kistler and Broz 2015; Roze et al. 2011). The reason for this is either to channel the pathway into the compartment with relevant precursors to increase biosynthesis efficiency, as in the case of penicillin in A. nidulans (Herr and Fischer 2014), or to sequester the toxic product or intermediates from the rest of the cell, as in the case of aflatoxin synthesis in Aspergillus parasiticus (Chanda et al. 2009). Little is known about how SM biosynthetic pathways are compartmentalized in fungi, and it is unclear how often this occurs. However, in cases where compartmentalization is important for product formation or to avoid toxicity effects, it is likely essential that the pathway is organized in a similar manner in a new host. Otherwise, it may result in lack of production or even cell death. In the future, we expect that pathway compartmentalization will receive significant attention in SM cell factory construction. As a standard control experiment, we advise determining whether individual enzymes in the pathway localize to the same compartments in the natural producer and in the new host, e.g., by GFP tagging. In cases where the host sorts an enzyme incorrectly, it may be possible to ensure proper localization of the relevant enzyme by fusing it to transport signals or a carrier protein that is sorted to the correct compartment.

B. Secondary Metabolite Discovery via Different Gene-Expression Systems

Multiple gene-expression strategies and different fungal hosts have been used for SM discovery and pathway elucidation by heterologous gene expression. A. oryzae, unlike most other common fungal cell factories, produces few SMs. A. oryzae is therefore often chosen as a host, as the subsequent detection of the new SM in the metabolite profile is relatively simple. However, genetic engineering of A. oryzae is complicated by the fact that its asexual spores contain several nuclei (Maruyama et al. 2001), which makes purification of transformed strains more laborious. In contrast, A. nidulans produces asexual spores that contain only a single nucleus (Yuill 1950), and transformants are therefore relatively easy to purify. Hence, if pathway elucidation requires more elaborate genetic engineering, species like A. nidulans may be a better choice of host. Moreover, in some cases, the more complex host chemistry can advantageously contribute to formation of novel synthetic and potentially useful compounds. Alternatively, it may facilitate construction of a new synthetic pathway by delivering a missing activity in the desired pathway. Below we will provide successful examples of heterologous SM production using different hosts and expression systems, which may serve as an inspiration for construction of new fungal SM cell factories.

1. Heterologous Expression of Secondary Metabolite Synthase Genes

To investigate whether a new SM can be produced in a given cell factory, it may often be advantageous to establish the first step of a multi-enzyme pathway, which produces the scaffold of the final compound. Indeed, the first demonstration of a heterologously produced SM was 6-methylsalicylic acid (6-MSA) (Fujii et al. 1996), which serves as a precursor of several SMs including patulin and yanuthones (Beck et al. 1990; Holm et al. 2014; Petersen et al. 2015; Read and Vining 1968). Hence, it is of interest to develop efficient systems for single gene transfer. In A. oryzae, a plasmid was constructed to allow a gene to be inserted ectopically into the genome in multiple copies (Fujii et al. 1995).

This plasmid has been widely applied for heterologous gene expression, and it has been used to deliver basic SM scaffolds including production of the naphthopyrone YWA1 (Watanabe et al. 1998, 1999), alternapyrone (Fujii et al. 2005), ferrirhodin (Munawar et al. 2013), and astellifadiene (Matsuda et al. 2016).

In A. nidulans, where genetic engineering is easier, genes have often been inserted into a defined locus to facilitate strain characterization and to obtain better gene-expression control. To ensure high expression levels, it may be useful to position integration sites in intergenic regions located in transcriptionally highly active sections of a chromosome.

Using this approach, it has been demonstrated that MpaC from P. brevicompactum produces 5-methylorsellinic acid (5-MOA) as the first intermediate toward production of the immunosuppressant drug mycophenolic acid by expressing mpaC in A. nidulans (Hansen et al. 2011b).

The same setup was used to make a cell factory yielding 1.8 g/L of 6-MSA by expressing yanA from A. niger in A. nidulans (Knudsen 2015), which is sixfold higher than the yield obtained by ectopic integration in A. oryzae reported by Fujii et al. (1996). Using the A. nidulans 6-MSA cell factory, it was possible to produce 13C-labelled 6-MSA, which was subsequently used to clarify the biosynthetic pathway for yanuthone D production in A. niger (Holm et al. 2014). The wA locus of A. nidulans has also been used as a gene-expression site. In this case, correct transformants can easily be identified as they produce white conidia. Using this approach, Chiang and co-workers discovered several new compounds by expressing A. terreus polyketide synthase genes in A. nidulans (Chiang et al. 2013).

Finally, we note that synthesis of some scaffolds requires the action of additional enzymes, such as a trans-enoyl reductase, a trans-acyltransferase, a trans-thioesterase, or another synthase. For example, tenellin synthetase from Beauveria bassiana requires a trans-acting enoyl reductase for correct polyketide scaffold assembly (Halo et al. 2008); during lovastatin synthesis, an acyltransferase and a thioesterase are necessary to release the polyketide products from the synthases LovF (Xie et al. 2009) and LovB (Xu et al. 2013), respectively; and for the synthesis of the first intermediate toward asperfuranone, two polyketide synthases are needed (Chiang et al. 2009). In these cases, a multi-gene insertion strategy is required; see Sect. II.C.

2. Reconstitution of Secondary Metabolite Pathways in a Heterologous Host

Establishing entire pathways is more challenging as several genes need to be functionally expressed. In the following section, we will describe various strategies that have been used to ensure expression of some or all genes from a BGC.

In an early study by Sakai et al., a 20 kb fragment of DNA containing the citrinin BGC from Monascus purpureus was inserted into an E. coli-Aspergillus shuttle cosmid and ectopically integrated into the genome of A. oryzae to produce 4 μg/L of citrinin (Sakai et al. 2008). This yield was increased approximately 400-fold to 1.48 mg/L when the cluster-specific transcriptional regulator encoding gene (ctnA) was constitutively expressed under control of the A. nidulans trpC promoter. Pathways may also be transferred using a vector set developed by Pahirulzaman et al., which allows up to 12 different genes to be ectopically integrated into the genome via three vectors containing different selectable markers (Pahirulzaman et al. 2012). Hence, pathways may be partly or entirely established in a fungus to produce key intermediates for pathway elucidation as well as to identify the final product. Using this system and A. oryzae as a host, He and Cox produced a detailed model of the citrinin pathway and generated a cell factory producing citrinin to final titers of 19.1 mg/L (He and Cox 2016).

Defined integration of entire BGCs into fungal chromosomal expression sites has also been used to establish functional SM pathways. One advantage of this method is that reverse genetics can be applied to dissect the pathway once the pathway has been functionally established in a new host. This principle has been exploited to successfully transfer all 12 biosynthetic genes required for geodin production from A. terreus to A. nidulans. Additionally, to activate the biosynthetic genes, the TF gene of the cluster (gedR) was equipped with a constitutive A. nidulans promoter prior to genomic integration. Subsequently, the efficient gene deletion toolbox of A. nidulans was used to identify gedL as the gene encoding the halogenase necessary for geodin production (Nielsen et al. 2013).

A recent screening-friendly method for linking BGCs to metabolites uses FACs as vectors for gene transfer (see Sect. II.B.1) as they can accommodate DNA fragments containing even large BGCs. FAC-based BGC libraries from A. terreus, Aspergillus wentii, and Aspergillus aculeatus have been transferred and analyzed in A. nidulans, resulting in the assignment of 17 compounds to BGCs (Bok et al. 2015; Clevenger et al. 2017).

3. Synthetic Pathway Setups for Secondary Metabolite Production

For some compounds of interest, it is not possible to establish a cell factory based on the natural pathway. This is the case if the biosynthetic pathway is unknown or has only been partly elucidated, or if enzymes are compartmentalized in the native producer in a manner that cannot be implemented in the new host. In these cases, it may be possible to make a cell factory based on a semi-synthetic or synthetic pathway with enzymes of known (or predicted) activities from various species. In this way, it may be possible to synthesize the desired compound, but in a manner that differs from the pathway in the native host. For example, production of the pharmacologically relevant meroterpenoid daurichromenic acid (DCA) from the plant Rhododendron dauricum in A. oryzae represents a successful case of a functional semi-synthetic pathway (Okada et al. 2017). Using fungal expression vectors for ectopic genome integration, the production of DCA was achieved by expressing two fungal genes from Stachybotrys bisbyi encoding a polyketide synthase (StbA) and a prenyltransferase (StbC), as well as a plant gene from the native producer encoding the DCA synthase. Similarly, production of carminic acid, an important food colorant produced by the scale insect Dactylopius coccus, has been achieved in A. nidulans via a semi-synthetic pathway (Frandsen et al. 2018). In this case, the five-step pathway was based on a single gene from the natural producer encoding a C-glucosyltransferase, a plant gene from Aloe arborescens encoding a type III octaketide synthase, two bacterial genes from Streptomyces sp. R1128 encoding a cyclase (ZhuI) and an aromatase (ZhuJ), and finally an unknown fungal gene from A. nidulans, which putatively encodes a monooxygenase. Importantly, unlike the DCA cell factory, all non-fungal genes were inserted into fungal expression cassettes and integrated into defined expression sites.
These examples highlight the potential of puzzling together pathways by combining genes from different organisms. Similar strategies will likely gain an increasing role in the future development of fungal cell factories for production of known and novel small molecules.

VI. Concluding Remarks and Perspectives

This review demonstrates that heterologous production in fungi is already well established and that synthetic biology-based methods are increasingly used to create new cell factories, typically via bio-block-based strategies. In the future, we envision that this trend will gather momentum and that libraries of bio-blocks containing mutated genes, scrambled homologous genes, or synthetic genes encoding new combinations of functional domains will serve as common resources for developing cell factories that produce, e.g., industrial enzymes with new or improved properties or SMs that may target new diseases. In addition, increasingly efficient high-throughput CRISPR-based methods will allow for the generation of genome-wide mutant-strain libraries. Hence, rather than using a single strain for construction of a heterologous cell factory, mutant libraries may serve as the preferred starting point toward developing new cell factories. In combination, these strategies will create an enormous number of potential cell factories, and screening for the successful candidates will very likely constitute a bottleneck. The expanding synthetic biology toolbox may also contribute to addressing this challenge. For example, it may offer biosensors that can be used to screen for high-yielding strains. In agreement with this view, membrane-bound RFP was recently used as a biosensor, allowing for high-throughput FACS-based screening of T. reesei mutants to identify strains with improved enzyme secretion potential (Gao et al. 2018). Interestingly, biosensors that monitor the levels of specific intracellular metabolites are already available for E. coli and S. cerevisiae (Adeniran et al. 2015; Morris 2010; Rogers et al. 2016; Zhang and Keasling 2011), and it is likely that this type of biosensor will be used in the future to identify superior fungal cell factories for production of specific SMs.

Novel methods for controlling gene function during fermentation are highly desirable. To this end, we envision that new improved orthogonal TFs with programmable specificities, e.g., by using catalytically dead Cas9 variants as TFs (Qi et al. 2013), will be developed. Such TFs have already been shown in other organisms to facilitate the activation of silent genes (Cheng et al. 2013; Perez-Pinera et al. 2013), to regulate metabolism of heterologous cell factories for increased production yields (Deaner and Alper 2017; Jensen et al. 2017; Vanegas et al. 2017), to study genetic interactions (Du et al. 2017; Peters et al. 2016), or even to induce directed evolution (Hess et al. 2016; Ma et al. 2016). We also envision that synthetic biology tools will change the fungal production platforms dramatically. Specifically, highly efficient genetic engineering tools will allow for the development of cell factories with minimal genomes, where all undesirable cell functions that do not significantly impact fitness during fermentation have been eliminated. For this purpose, multiplex gene editing has already been implemented in fungi (Foster et al. 2018; Liu et al. 2015; Nødvig et al. 2018; Pohl et al. 2016; Shi et al. 2019). Deletion of a number of BGCs in A. nidulans with the aim of increasing the pool of available SM precursors and simplifying metabolite analysis (Chiang et al. 2013, 2016) already serves as a small-scale example of this strategy. More ambitiously, one may envision that entirely synthetic filamentous fungi, inspired by the synthetic yeast 2.0 project (Pretorius and Boeke 2018), may be developed. Altogether, we conclude that the rapidly expanding synthetic biology toolbox in combination with the large number of fully sequenced genomes will set the stage for a very exciting future of fungal heterologous production.