Introduction

Malaria is one of the most severe endemic infectious diseases. According to annual reports from the World Health Organization (WHO), more than one billion people live in regions at high risk of malaria, such as Africa and other tropical regions (O’Neill 2004; WHO 2013). Millions of people, particularly children under 5 years old and pregnant women, have died of malaria. Governments, charities, and the private sector have undertaken numerous global efforts to control and fight this disastrous disease (Gelband and Seiter 2007; WHO 2015). Hundreds of millions of dollars have been invested annually to decrease malaria-caused deaths (Bell et al. 2005; Hopkin 2006; WHO 2015). Over the past 10 years, the number of yearly deaths has decreased dramatically from more than one million to slightly under half a million (Ridley 2003; WHO 2015). According to a recent WHO report, there were still approximately 450,000 deaths in 2015 (WHO 2015). Clearly, there is still a long road to the eradication of malaria.

Malaria is caused by parasites of the genus Plasmodium, of which five species, Plasmodium falciparum, P. vivax, P. ovale, P. malariae, and P. knowlesi, infect human beings (Mita and Tanabe 2012; Salvador et al. 2012). P. falciparum and P. vivax are the two most prevalent protozoan species infecting humans (Mita and Tanabe 2012; Ridley 2003). In particular, P. falciparum is responsible for most deaths (Mita and Tanabe 2012; WHO 2013, 2015). This protozoan causes severe cerebral and lethal effects in people (Grant et al. 1960; John et al. 2010; McGregor et al. 1968; Schmidt et al. 2015), particularly children under 5 years old and pregnant women (John et al. 2010; Mita and Tanabe 2012; Salvador et al. 2012; Uneke 2009). Symptoms include high fever, severe chills, coma, acidosis, and/or severe anemia (Bell and Molyneux 2007). In addition, severe malaria caused by this species can lead to cognitive damage (Kihara et al. 2006). Unfortunately, this parasite is highly resistant to most current medicines, such as quinine and chloroquine (Gelband and Seiter 2007; Mita and Tanabe 2012; Sidhu et al. 2002). Fortunately, artemisinin from the medicinal plant Artemisia annua forms the effective frontline therapy for falciparum malaria (Ridley 2003; Tu 2011; WHO 1994).

A. annua (Qinghao in Chinese and sweet wormwood in English) is an effective antimalarial herbaceous plant in the family Asteraceae (Compositae). This herb is a Traditional Chinese Medicinal (TCM) plant that has been used to treat fever and chills for more than 2300 years (Zhong-Yi-Yan-Jiu-Yue-Zhong-Yao-Yan-Jiu-Suo 1978). Its antimalarial function was unknown until the 1970s, when the Chinese government funded a group of scientists to investigate TCM plants in order to develop a novel medicine to treat malaria. Professor YouYou Tu, the Nobel Laureate in Physiology or Medicine in 2015, led a group of talented scientists who studied the ancient Chinese medical literature in 1967–1969 and unearthed an ancient prescription for the treatment of malaria-like symptoms, documented in “A Handbook of Prescriptions for Emergencies” by Ge Hong (284–346 CE) (Tu 2011). The prescription was recently translated into English as “A handful of qinghao immersed with 2 L of water, wring out the juice and drink it all” (Tu 2011). This information helped Tu and her team form the hypothesis that Qinghao might have antimalarial activity. Tu and her team immediately developed different protocols to extract active phytochemicals and test them for antimalarial function. According to Tu’s description, on the 4th of October, 1971, her team obtained crude extracts that were effective for the treatment of malaria. After improving the extraction methods, they isolated an active crystal in 1972 and named it Qing Hao Su (artemisinin). Later, another group elucidated the structure of artemisinin, which has the molecular formula C15H22O5 (Liu et al. 1979; Qinghao-Kanglue-Hezuo-Yanjiu-Zu 1977) (Fig. 1). By the end of the 1970s, artemisinin had been successfully developed into an effective antimalarial medicine to treat malaria caused by different parasites. In particular, artemisinin effectively kills P. falciparum, the parasite that causes the most severe cerebral and lethal effects and that was highly resistant to quinine and other antimalarial medicines.
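As a quick arithmetic check on the molecular formula C15H22O5 given above, the molar mass and the degrees of unsaturation can be computed directly. The following is a minimal sketch, not part of the original work: the function names are hypothetical, and the rounded standard atomic masses are an assumption of this illustration.

```python
# Rounded standard atomic masses in g/mol (an assumption of this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Elemental composition of artemisinin, taken from the formula C15H22O5.
ARTEMISININ = {"C": 15, "H": 22, "O": 5}


def molar_mass(composition):
    """Sum atomic mass times atom count over all elements (g/mol)."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def degrees_of_unsaturation(carbons, hydrogens):
    """Rings plus pi bonds for a compound containing only C, H, and O.

    Oxygen atoms do not change the count, so they are not an argument.
    """
    return (2 * carbons + 2 - hydrogens) // 2


if __name__ == "__main__":
    print(f"Molar mass: {molar_mass(ARTEMISININ):.2f} g/mol")
    print("Degrees of unsaturation:",
          degrees_of_unsaturation(ARTEMISININ["C"], ARTEMISININ["H"]))
```

With these atomic masses, the molar mass comes to roughly 282.3 g/mol and the formula has five degrees of unsaturation.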

Fig. 1

A scheme showing updated pathways to artemisinin via the MVA pathway located in the cytosol and the MEP/DOXP pathway located in plastids. Abbreviations for enzymes at each step in the MVA pathway: AACT, acetoacetyl-CoA transferase; HMGS, HMG-CoA synthase; HMGR, HMG-CoA reductase; MK, mevalonate kinase; PMK, phosphomevalonate kinase; MPDC, mevalonate 5-pyrophosphate decarboxylase. Abbreviations for enzymes at each step in the MEP/DOXP pathway: DXS and DXR, 1-deoxy-D-xylulose 5-phosphate synthase and reductase; MCT, 2-C-methyl-D-erythritol 4-phosphate cytidyltransferase; CMK, 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; MDS, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase; HDS and HDR, 4-hydroxy-3-methylbut-2-enyl pyrophosphate synthase and reductase. IPPI, isopentenyl pyrophosphate isomerase. Abbreviations for enzymes leading to different classes of terpenoids: IS, isoprene synthase; GPPS, geranyl pyrophosphate synthase; FPPS, farnesyl pyrophosphate synthase; GGPPS, geranylgeranyl pyrophosphate synthase; MS, monoterpenoid synthase (e.g. LS, limonene synthase); SQTS, sesquiterpenoid synthase. Abbreviations for enzymes in the artemisinin pathway: ADS, amorpha-4,11-diene synthase; ADH1, alcohol dehydrogenase 1; ALDH1, aldehyde dehydrogenase 1; CPR1, cytochrome P450 reductase 1; CYB5, cytochrome b5 mono-oxygenase; CYP71AV1, cytochrome P450 mono-oxygenase; DBR2, artemisinic aldehyde delta-11(13)-double bond reductase. Dashed arrows indicate unknown steps

Artemisinin (Fig. 1) is an unusual endoperoxide sesquiterpene lactone (Liu et al. 1979). Its structure contains five oxygen atoms, two of which form a unique oxygen bridge in a trioxane ring (Fig. 1). Specifically, O1, O2, and O13 form a 1,2,4-trioxane ring. The oxygen bridge has been reported to be associated with antimalarial activity. The opening of the O1–O2 bridge and subsequent binding to sarco/endoplasmic reticulum Ca2+-ATPase (SERCA) orthologs of P. falciparum, such as PfATP6, is a favored mechanism to explain its medicinal efficacy (Eckstein-Ludwig et al. 2003; Haynes et al. 1999; Kamchonwonggpaisan and Meshnick 1996; Kapetanaki and Varotsis 2000; Naik et al. 2011; Wang and Wu 2000). Other antimalarial models have also been proposed. It has been observed that when the ring of artemisinin is opened to form an intermediate with a free hydroperoxide in the presence of benzylamine, an oxygen atom is transferred to tertiary amines to form N-oxides. This mechanism suggests that artemisinin likely binds proteins to kill parasites (Haynes et al. 1999). Another proposed mechanism is that artemisinin kills parasites by perturbing hemoglobin catabolism and heme polymerization (Pandey et al. 1999). Pandey et al. (1999) observed that artemisinin treatment led to the accumulation, rather than degradation, of hemoglobin in the parasites. They also observed that artemisinin inhibited histidine-rich protein II-mediated heme polymerization in P. falciparum. Fourier transform infrared (FTIR) and resonance Raman (RR) spectroscopies were used to characterize the interaction between artemisinin and the plasmodial hemin dimer. A ferryl-oxo heme intermediate resulting from cleavage of the endoperoxide bridge was observed in the reaction (Kapetanaki and Varotsis 2000). Other proposed modes of action include interference with parasite mitochondrial functions, membrane damage, and alkylation of heme and other proteins (Pandey and Pandey-Rai 2016). 
Although additional mechanisms likely remain to be explored, those described above have greatly enhanced the understanding of how artemisinin acts against malaria. Furthermore, this understanding is particularly significant for developing new artemisinin derivatives to overcome parasite resistance, given that the falciparum parasite has been found to show a delayed response (resistance) to artemisinin monotherapy in Cambodia and Myanmar (Talundzic et al. 2015; WHO 2015; Ye et al. 2016).

To date, artemisinin-based combination therapy (ACT) forms the frontline of malaria treatment (Banek et al. 2014; Gogtay et al. 2013; Jelinek 2013; Shunmay et al. 2008; Taylor 2013). According to annual reports from WHO, the application of ACT together with other preventive strategies has reduced deaths by nearly 50% over the past decade (Straimer et al. 2015; WHO 2009, 2015). However, approximately half a million malaria patients still lose their lives each year. One of the reasons is that ACT is not accessible to all patients because the supply of artemisinin produced by plants or synthetic approaches is insufficient. To improve access to ACT, numerous laboratories around the world conduct both basic and applied research to increase artemisinin production. Over the past few decades, many promising advances have been achieved in understanding artemisinin biosynthesis and increasing its yield. Here, we review those fundamental accomplishments in molecular biology, biochemistry, metabolic engineering, and synthetic biology. Furthermore, we discuss central questions relating to artemisinin formation to guide future efforts to increase artemisinin yield.

Biochemical pathway starting with amorpha-4,11-diene

The understanding of the biosynthetic pathway of artemisinin started with radioactive-isotope labelling experiments in the 1980s (Akhila et al. 1987). Although not all data have supported the early hypothetical pathways (Akhila et al. 1987), radioactive labelling and bioconversion observations provided constructive suggestions for later investigations. For example, bioconversion of arteannuin B to artemisinin and conversion of artemisinic acid to arteannuin B and artemisinin were observed by radioactive labelling (Akhila et al. 1987; Nair and Basile 1993; Sangwan et al. 1993). Although artemisinic acid and arteannuin B were recently argued to be unlikely precursors (Brown 2010), semi-synthesis using artemisinic acid as a substrate now successfully supplements the artemisinin supply for ACT (Turconi et al. 2014).

Artemisinin is an endoperoxide sesquiterpenoid lactone. Its biosynthetic pathway is localized in the cytosol (Fig. 1). Its building blocks, isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP), are synthesized via the mevalonate (MVA) and 2-C-methyl-D-erythritol-4-phosphate (MEP) pathways (Ma et al. 2015). A CO2 feeding experiment using the 13C isotope revealed metabolic crosstalk between the cytosol and plastids in A. annua (Schramek et al. 2010), suggesting that both pathways can contribute to the formation of artemisinin. To date, genes involved in the MVA and MEP pathways have been characterized from numerous plants (Chang et al. 2013; Rodriguez-Concepcion and Boronat 2015; Vranova et al. 2013). Our recent sequencing also yielded cDNA sequences that encode all enzymes involved in the two pathways (Ma et al. 2015). In this review, we summarize those enzymes in Fig. 1 but do not describe their functions. In addition, we do not discuss the transcription factors that have been demonstrated to regulate the pathway from amorpha-4,11-diene to artemisinic and dihydroartemisinic acid (Lu et al. 2013; Ma et al. 2009; Zhang et al. 2015), given that a recent review summarized and discussed their functions in A. annua (Shen et al. 2016).

To date, gene isolation, transgenic analysis, and synthetic biological research have characterized ADS, CYP71AV1/CPR/CBR, ADH1, ALDH1, and DBR2. These genes encode enzymes that have been demonstrated to catalyze different reactions from amorpha-4,11-diene to artemisinic and dihydroartemisinic acids (Fig. 1). Here, we review and characterize those reactions from farnesyl pyrophosphate (FPP) to artemisinin, including cyclization, oxidoreduction, alcohol dehydrogenation (oxygenation), aldehyde dehydrogenation, and trioxane formation.

Cyclization

The correct understanding of the artemisinin biochemical pathway did not start until 1999, when the discovery of amorpha-4,11-diene synthase (ADS) (Fig. 1) opened a new era in understanding artemisinin formation. A native ADS was isolated from A. annua and enzymatically demonstrated to convert FPP to amorpha-4,11-diene (Bouwmeester et al. 1999). This catalytic activity was then confirmed by cDNA cloning and gene function analysis (Chang et al. 2000; Mercke et al. 2000; Wallaart et al. 2001). The overexpression of ADS in tobacco plants led to the formation of amorpha-4,11-diene, although its level was low (Wallaart et al. 2001). Subsequent recombinant enzyme analyses further elucidated the catalytic mechanism by which ADS cyclizes acyclic FPP to the two-ring skeleton of amorpha-4,11-diene (Kim et al. 2006; Picaud et al. 2006). Synthetic biology using yeast further demonstrated that ADS controls the first committed step toward amorphadiene, artemisinic acid, and other derivatives (Martin et al. 2003; Ro et al. 2006; Teoh et al. 2006). In addition, its overexpression and downregulation were conducted in both heterozygous and homozygous A. annua varieties. Transgenic experiments showed that its overexpression in heterozygous A. annua increased the production of artemisinin, although most transgenic data lack consistency in artemisinin production (Alam and Abdin 2011; Tang et al. 2014). In homozygous A. annua, the overexpression and downregulation of ADS increased and reduced artemisinin contents in transgenic plants, respectively (Ma et al. 2015). Furthermore, both overexpression and downregulation shifted the metabolic balance between the amorphadiene pathway and other sesquiterpene pathways (Ma et al. 2015). Taken together, all data support the conclusion that the cyclization reaction is the first committed step to artemisinin.

Oxidoreduction

A cytochrome P450 mono-oxygenase has been shown to catalyze an oxidoreduction reaction that converts amorpha-4,11-diene to artemisinic alcohol. Two laboratories almost simultaneously characterized this cytochrome P450 mono-oxygenase, namely CYP71AV1 (Ro et al. 2006; Teoh et al. 2006). Ro et al. (2006) used a yeast system to express CYP71AV1 and CPR (a cytochrome P450 reductase). Recently, CPR was renamed CPR1 by the same laboratory (Paddon et al. 2013). Accordingly, we use CPR1 in this review. When CYP71AV1 and CPR1 were co-expressed in Saccharomyces cerevisiae, the engineered yeast cultured in liquid medium in a flask or bioreactor produced artemisinic acid and secreted this compound into the medium (Ro et al. 2006). A low level of artemisinic alcohol was also detected in cultures. In contrast, artemisinic aldehyde was hardly detected. To demonstrate the enzymatic reactions, Ro et al. (2006) incubated microsomal proteins containing both CYP71AV1 and CPR1 with three substrates, amorphadiene, artemisinic alcohol, and artemisinic aldehyde, separately. The microsomal proteins converted amorphadiene to artemisinic alcohol, artemisinic aldehyde, and artemisinic acid, dehydrogenated artemisinic alcohol to artemisinic aldehyde and artemisinic acid, and dehydrogenated artemisinic aldehyde to artemisinic acid. By contrast, Teoh et al. (2006) did not use CPR1 in their in vitro assay. Without CPR1, the microsomal proteins containing CYP71AV1 alone could also use amorphadiene, artemisinic alcohol, and artemisinic aldehyde as substrates. These microsomal proteins oxidized amorphadiene to artemisinic alcohol and artemisinic aldehyde, dehydrogenated artemisinic alcohol to artemisinic aldehyde, and further dehydrogenated artemisinic aldehyde to artemisinic acid. In comparison, the assay by Ro et al. (2006) produced higher yields of each of those metabolites. 
In summary, the results from these two experimental systems imply that the yeast strains used by the two laboratories themselves express CPR1-like enzymes that partner with CYP71AV1 to catalyze the entire enzymatic conversion from amorphadiene to the later metabolites. The potential mechanism is discussed in the section on synthetic biology of artemisinic acid below.

Subsequent studies further suggest that a small group of enzymes participates cooperatively in the oxidoreduction step. In addition to CPR1, recent studies have demonstrated that other proteins are involved in the catalysis from amorphadiene to artemisinic alcohol. Paddon et al. (2013) reported that the yield of artemisinic acid produced by engineered yeast was significantly increased by co-expression of CYB5 from A. annua. This gene encodes a cytochrome b5 mono-oxygenase. Based on the high yield of artemisinic acid produced in engineered yeast lines that expressed CYP71AV1, CPR1, and CYB5, Paddon et al. (2013) proposed that the step from amorphadiene to artemisinic alcohol is catalyzed by these three proteins rather than by CYP71AV1 and CPR1 only. We recently sequenced six cDNA libraries and identified a CPR1 homolog from A. annua, namely CPR2, which is specifically expressed in inflorescences and whose expression correlates strongly with the production trend of artemisinic acid and artemisinin in different tissues (Ma et al. 2015). This observation suggests a possible association of CPR2 with the formation of artemisinic acid and artemisinin.

Alcohol dehydrogenation

Alcohol dehydrogenase (ADH) (EC 1.1.1.1) is a family of enzymes that catalyze the interconversion between alcohols (–OH) and aldehydes or ketones (C=O). An ADH1 gene was cloned from A. annua (Paddon et al. 2013). Its co-expression with CYP71AV1, CPR1, and CYB5 doubled the yield of artemisinic aldehyde in engineered yeast strains. These data suggest that ADH1 is involved in the biochemical conversion of artemisinic alcohol to artemisinic aldehyde. To date, although its enzyme kinetics and catalytic mechanism remain open for investigation, this enzyme is placed at the step responsible for the formation of artemisinic aldehyde (Fig. 1). Further characterization of its kinetics and in vivo function will enhance the understanding of its role in the formation of artemisinin.

Aldehyde dehydrogenation

Aldehyde dehydrogenase (ALDH) (EC 1.2.1.3) catalyzes the dehydrogenation (oxidation) of aldehydes. Teoh et al. (2009) isolated a cDNA, namely ALDH1, from trichomes and flowers of A. annua. It is highly expressed in aboveground tissues, particularly in trichomes. Teoh et al. (2009) expressed a recombinant protein in E. coli and demonstrated that the cell-free protein converted artemisinic and dihydroartemisinic aldehydes to artemisinic and dihydroartemisinic acids, respectively, in the presence of NADP or NAD as a cofactor. The Km value for artemisinic aldehyde is lower than that for dihydroartemisinic aldehyde. In terms of structural features, it is apparent that the enzyme removes the hydrogen at the C12 position of both artemisinic and dihydroartemisinic aldehyde in the presence of cofactors. In addition, a hydroxyl (–OH) group is added to C12 (Fig. 1), indicating that ALDH1 is also involved in the hydroxylation of C12.

ALDH1 has been used for metabolic engineering of artemisinin precursors in both yeast and plants. Paddon et al. (2013) introduced ALDH1 into engineered yeast strains and achieved industrial-scale production of artemisinic acid. Zhang et al. (2011) overexpressed ALDH1 in tobacco plants. The transgenic plants produced dihydroartemisinic alcohol, although neither artemisinic acid nor dihydroartemisinic acid was detected (Zhang et al. 2011). These results indicate that heterologous host conditions can control the final products when ALDH1 is expressed ectopically. In addition, transcriptional analyses indicated that the expression of ALDH1 in A. annua is closely correlated with the production of artemisinin (Dilshad et al. 2015; Xiang et al. 2015), supporting its involvement in the biosynthetic pathway.

Double bond reduction

The C9 of artemisinin is characterized by a methyl group (Fig. 1). This single C–C linkage results from the reduction of the Δ11(13) double bond in amorpha-4,11-diene, artemisinic alcohol, and artemisinic aldehyde (Fig. 1). Zhang et al. (2008) first tested the reduction activity of crude protein extracts from flowers, then cloned a cDNA and showed that it encodes a double bond reductase, namely DBR2. Both recombinant and native enzymes were then shown to reduce artemisinic aldehyde to dihydroartemisinic aldehyde in the presence of NADPH. However, DBR2 did not utilize other artemisinin precursors as substrates, indicating its specificity for the aldehyde moiety. Interestingly, DBR2 also used 2-cyclohexen-1-one and (+)-carvone as substrates. In addition to biochemical analysis, two research groups heterologously expressed this gene in tobacco plants to produce artemisinin and its precursors, and the two groups observed different results. Zhang et al. (2011) overexpressed ADS, CYP71AV1, DBR2, and ALDH1 and obtained a low level of dihydroartemisinic alcohol but no dihydroartemisinic acid. By contrast, Farhi et al. (2011) overexpressed ADS (targeted to either the cytosol or the mitochondrion), CYP71AV1, CPR1, and DBR2 as well as HMGR (an upstream MVA pathway gene) and obtained 0.75 to 6.8 µg/g (dry weight) artemisinin. In addition, the overexpression of DBR2 in A. annua increased artemisinin contents in a few selected heterozygous transgenic plants (Yuan et al. 2015). These transgenic results demonstrated its involvement in the biochemical pathway of artemisinin. Furthermore, several transcriptional analyses showed that the expression level of DBR2 is correlated with the production of artemisinin (Jiang et al. 2014; Lu et al. 2013; Olofsson et al. 2011; Xiang et al. 2015; Yang et al. 2010). Although further investigation is required, all of these studies support its involvement in the biosynthesis of artemisinin.

Formation of trioxane and oxygenation

Since the elucidation of the artemisinin structure (Liu et al. 1979), how artemisinin is formed in plants has remained unknown. Although the steps described above have been biochemically characterized, it is unknown how the trioxane formed by O1, O2, and O13 (a 1,2,4-trioxane) and the C12–O–C10 ether linkage are formed. Additionally, the direct precursor of artemisinin is unknown. These problems have hindered efforts to improve artemisinin production.

Regarding its direct precursors, Brown (2010) reviewed and discussed several hypotheses. The proposed precursors include artemisinic acid, arteannuin B, dihydroartemisinic acid, dihydroarteannuin B, seco-cadinane, and artemisitene, all of which were fed to plants or used in cell-free enzymatic assays. Of these metabolites, the most recent isotope feeding experiments indicated that dihydroartemisinic acid is the most likely precursor (Brown and Sy 2004). As Brown (2010) appropriately discussed, all of these hypotheses were based on radioactive-labeling and feeding experiments, isotope feeding, or other in vitro chemical conversion experiments. However, given that no solid experiments have demonstrated artemisinin formation via an enzymatic reaction in plants, the final step(s) to artemisinin has (have) been proposed to be spontaneous photo-oxidation-based reactions, which were characterized as four steps: a photo-sensitized reaction at the Δ4,5 double bond in dihydroartemisinic acid, Hock cleavage, oxygenation, and cyclization (Brown 2010; Sy and Brown 2002). To date, this hypothesis is supported by semi-synthetic research. Semi-synthetic technology has successfully used in vitro photo-oxidation-based reactions to convert artemisinic acid and dihydroartemisinic acid to artemisinin in high yield (Paddon et al. 2013). In particular, the recent industrial-scale production of artemisinin from artemisinic acid via dihydroartemisinic acid (Turconi et al. 2014) strongly supports the mechanism of spontaneous photo-oxidation. However, the chemical reaction conditions used for semi-synthesis may not exist in plant tissues, particularly in trichomes, given that plant cells are unlikely to tolerate the extremely intense light and high concentrations of catalysts used in the photoreactor (Turconi et al. 2014). 
Therefore, a possible hypothesis is that an unknown biochemical process is responsible for the formation of the trioxane structure in those vulnerable trichomes and other tissues. This hypothesis is supported by preliminary data from several different studies (Dhingra and Lakshmi Narasu 2001; Farhi et al. 2011; Sangwan et al. 1993; Tatineni et al. 2006). Tatineni et al. (2006) observed the formation of artemisinin when Microbacterium trichothecenolyticum was used to convert arteannuin B. Farhi et al. (2011) developed transgenic tobacco plants that formed artemisinin, as described above. Sangwan et al. (1993) observed the incorporation of 14C-labelled artemisinic acid into artemisinin in a cell-free system. Although these data provide only indirect evidence, the mechanism of trioxane formation may be elucidated in the future as more experiments are performed using new plant materials and approaches.
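The reaction sequence reviewed in the subsections above can be summarized as an ordered list of steps. The following is a minimal sketch in Python, not a model taken from any of the cited studies: the `Step` class and helper names are hypothetical, only the dihydroartemisinic acid branch is shown, and the final step is recorded without a catalyst because its enzyme, if any, is unknown.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    """One conversion in the pathway (illustrative structure only)."""
    substrate: str
    product: str
    catalysts: Tuple[str, ...]  # empty tuple = no enzyme identified
    reaction: str


# Simplified linear route assembled from the text above. CYP71AV1 can also
# oxidize the alcohol and aldehyde further; that branch is omitted here.
PATHWAY = [
    Step("farnesyl pyrophosphate", "amorpha-4,11-diene",
         ("ADS",), "cyclization"),
    Step("amorpha-4,11-diene", "artemisinic alcohol",
         ("CYP71AV1", "CPR1", "CYB5"), "oxidoreduction"),
    Step("artemisinic alcohol", "artemisinic aldehyde",
         ("ADH1",), "alcohol dehydrogenation"),
    Step("artemisinic aldehyde", "dihydroartemisinic aldehyde",
         ("DBR2",), "double bond reduction"),
    Step("dihydroartemisinic aldehyde", "dihydroartemisinic acid",
         ("ALDH1",), "aldehyde dehydrogenation"),
    Step("dihydroartemisinic acid", "artemisinin",
         (), "trioxane formation (proposed photo-oxidation)"),
]


def is_connected(steps):
    """Check that each product is the substrate of the following step."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def route(steps):
    """Return the metabolites in order, from first substrate to final product."""
    return [steps[0].substrate] + [s.product for s in steps]


def uncharacterized(steps):
    """List the steps for which no catalyst has been identified."""
    return [s for s in steps if not s.catalysts]


if __name__ == "__main__":
    assert is_connected(PATHWAY)
    print(" -> ".join(route(PATHWAY)))
    for s in uncharacterized(PATHWAY):
        print("No enzyme identified:", s.substrate, "->", s.product)
```

Representing the final conversion with an empty catalyst tuple keeps the open question explicit: the only step flagged by `uncharacterized` is the one leading from dihydroartemisinic acid to artemisinin.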

Genetic breeding

Germplasm selection and genetic breeding for high artemisinin production have been two primary approaches to alleviate the shortage of artemisinin supply. A. annua is naturally a diploid and cross-pollinating species (Delabays et al. 1993; Xie et al. 1995). In the 1980s, numerous research efforts focused on the phytochemical screening and isolation of artemisinin and its precursors (Acton and Klayman 1985; Klayman 1985; Klayman et al. 1984; Liersch et al. 1986; Tu et al. 1982). Since the late 1980s, multiple laboratories have endeavored to select elite germplasms and to characterize ecotypes for high artemisinin production (Alejos-Gonzalez et al. 2011; Charles et al. 1990; Delabays et al. 1993, 2001; Duke et al. 1994; Elhag et al. 1992; Ferreira and Janick 1995; Graham et al. 2010; Singh et al. 1988; Wallaart et al. 1999, 2000). Germplasm investigations have demonstrated that the artemisinin content in plants is relatively low and highly variable, from 0.003 to 0.2% (g/g, dry weight) in different ecotypes and from 0 to 0.39% (DW) in individual plants of the same ecotype (Charles et al. 1990; Paul et al. 2014). To overcome the low content problem, breeding efforts have focused on creating superior cultivars. Over the past two decades, different laboratories reported new cultivars with artemisinin contents of 1–2% and even higher (Brisibe et al. 2012; Delabays et al. 1993, 2001; Ferreira et al. 2005; Graham et al. 2010; Simonnet et al. 2008). For example, four hybrids, namely Hyb1252r ‘Jewel’, Shennong hyb1209r, Hyb8003r ‘Verdant’, and Hyb8001r ‘Zenith’, which were reported to produce more than 1% artemisinin (g/g, dry weight), were obtained by the CNAP (http://www.york.ac.uk/org/cnap/artemisiaproject/). In addition, doubling the chromosome number generated a novel tetraploid cultivar. Field growth of the new variety also showed a promising increase in artemisinin (Banyai et al. 2010; Wallaart et al. 1999). 
To understand the genetics of artemisinin biosynthesis, a fundamental genetic map for a hybrid was established using transcriptomics and quantitative trait loci (QTL) (Graham et al. 2010). The resulting map characterized the primary linkage groups and traits controlling artemisinin variation. This type of genetic map provides the potential to develop robust crops.
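The content figures quoted above are percentages of dry weight (g/g), and simple arithmetic translates them into field-scale terms. The following is a minimal sketch with hypothetical function names; it assumes complete recovery of artemisinin during extraction, which is an idealization of this illustration and not a claim made by the cited studies.

```python
import math


def kg_per_tonne(percent_dw):
    """Artemisinin (kg) contained in 1 tonne (1000 kg) of dry biomass."""
    return percent_dw * 10.0  # (percent / 100) * 1000 kg


def biomass_for_one_kg(percent_dw):
    """Dry biomass (kg) containing 1 kg of artemisinin.

    Assumes 100% extraction recovery (an idealization of this sketch).
    """
    if percent_dw <= 0:
        raise ValueError("content must be positive")
    return 100.0 / percent_dw


if __name__ == "__main__":
    # Upper end of the ecotype range (0.2%) versus a bred cultivar (1%).
    for content in (0.2, 1.0):
        print(f"{content}% DW: {kg_per_tonne(content):.1f} kg per tonne, "
              f"{biomass_for_one_kg(content):.0f} kg dry biomass per kg artemisinin")
    assert math.isclose(biomass_for_one_kg(0.2) / biomass_for_one_kg(1.0), 5.0)
```

Under these assumptions, raising the content from 0.2% to 1% cuts the dry biomass needed per kilogram of artemisinin from about 500 kg to about 100 kg.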

Genetic breeding efforts have also revealed that increasing and stabilizing artemisinin yield is challenging. One of the primary reasons has been the instability of artemisinin production in plants (Paddon et al. 2013; Ro et al. 2006). A. annua is a cross-pollinating species (Alejos-Gonzalez et al. 2011; Delabays et al. 2001; Graham et al. 2010; Simonnet et al. 2008), and all current commercial cultivars are heterozygous. This heterozygosity leads to segregation in the progeny. Accordingly, although breeding efforts created a few high-content lines, with 1 to 2.4% (DW) artemisinin (Brisibe et al. 2012; Cockram et al. 2012; Delabays et al. 2001; Graham et al. 2010; Larson et al. 2013; Simonnet et al. 2008), the unstable yield of those cultivars has neither solved the problem of insufficient artemisinin supply nor decreased the cost of ACT (Paddon et al. 2013; Pilloy 2007; Shretta and Yadav 2012; White 2008). In addition, previous germplasm selection demonstrated that artemisinin biosynthesis is localized in glandular trichomes consisting of 10 cells (Duke and Paul 1993; Lommen et al. 2005; Tellez et al. 1999). This trichome specificity is also a limiting factor for the improvement of artemisinin content.

To overcome progeny segregation, self-pollination was investigated using commercial cultivars. Unfortunately, previous trials failed to achieve self-pollination (Delabays et al. 2001; Peter-Blanc 1992). Although this failure was discouraging, we have developed a self-pollinated A. annua cultivar (Alejos-Gonzalez et al. 2011). At the beginning of selection, we observed that progeny from wild-type plants were phenotypically variable, showing, for example, quick elongation, no elongation, no flowering, or no branching. During successive selection, we focused on individual plants that flowered relatively early and produced appropriate biomass under growth chamber conditions. After we obtained F2 plants, we screened the progeny for artemisinin production. When the plants were grown in a growth chamber, the artemisinin contents of F2 plants, measured from the bottom leaves to the higher positional leaves and inflorescences, ranged from 0.01 to 0.1% (g/g, DW) (Alejos-Gonzalez et al. 2011). The artemisinin content increased to 0.3–0.4% (g/g, DW) in F3 and F4 plants during continuous selection for homozygous plants (Ma et al. 2015). To date, we have obtained F7 progeny, in which the trend of artemisinin biosynthesis is associated with leaf position and flowering status. In addition, all progeny grow similarly without phenotypic segregation, demonstrating the homozygosity of the plants. Furthermore, self-pollinated plants display an approximately 100% regeneration capacity from different tissues via both organogenesis and somatic embryogenesis (Alejos-Gonzalez et al. 2013) and are amenable to genetic transformation. Therefore, the self-pollinating plants can form a new platform to enhance the understanding of artemisinin biosynthesis and improve artemisinin production (Ma et al. 2015).

Metabolic engineering

Metabolic engineering of plants has been another primary research approach to improve artemisinin production. In addition, metabolic engineering is a potent approach to demonstrate the artemisinin biosynthetic pathway (Fig. 1). In the 1980s and 1990s, most research focused on plant tissue culture and genetic transformation. Although it was debated whether callus, cell, shoot, and hairy root cultures can produce artemisinin, most investigations provided solid data that artemisinin biosynthesis occurred in cultures in vitro (Brown 1994; Cai et al. 1995; Elhag et al. 1992; Ferreira and Janick 1996; Ferreira et al. 1995b; Gupta et al. 1996; Kudakasseril et al. 1987; Nair et al. 1986; Qin et al. 1994; Weathers et al. 1994; Woerdenbag et al. 1993; Xie et al. 2000). Since 1999, when amorpha-4,11-diene synthase was identified as catalyzing the first committed step to artemisinin (Bouwmeester et al. 1999), five other pathway genes have been cloned from trichomes, as described above. Transcription factors in four different families, including WRKY, AP2/ERF, bHLH, and bZIP, were also cloned and shown to be associated with the regulation of artemisinin biosynthesis (Han et al. 2014; Ji et al. 2014; Lu et al. 2012, 2013; Ma et al. 2009; Zhang et al. 2015). To date, all known pathway and transcription factor genes have been overexpressed in A. annua (Chen et al. 2013; Liu et al. 2011; Saxena et al. 2014; Shen et al. 2012; van Herpen et al. 2010; Zhu et al. 2014). Recent reviews summarized the effects of the overexpression of all known genes on artemisinin production (Shen et al. 2016; Tang et al. 2014). Tang et al. (2014) in particular summarized artemisinin production in multiple transgenic A. annua plants generated in their laboratory. It is worth noting that the artemisinin contents in transgenic A. annua plants ranged from approximately 0.06 to 2.5% (g/g, DW) (Tang et al. 2014). 
This variation was characterized by the inconsistency of artemisinin contents in transgenic versus wild-type plants. For example, wild-type plants produced higher contents of artemisinin than some transgenic plants and vice versa. This problem might be associated with segregation that can occur in progeny resulting from heterozygosity. Although it is discouraging that none of transgenic A. annua plants has been applied for artemisinin production, all promising engineering investigations have showed a potential to increase artemisinin yield. As described above, we have bred a homozygous cultivar to overcome the segregation problem of the artemisinin biosynthesis in engineered plants. We recently overexpressed ADS in the homozygous cultivar. All transgenic plants showed consistent increase of artemisinin to approximately 0.4% (DW) when grown in a growth chamber (Ma et al. 2015). In contrast, the down-regulation of ADS consistently decreased artemisinin production. These promising results indicate that homozygous plants may provide a new platform to overcome segregation for metabolic engineering and to improve and stabilize artemisinin production in the future.

Tobacco species have been studied as a new plant source for metabolic engineering of artemisinin. Tobacco plants are also a useful tool for elucidating the artemisinin biosynthetic pathway. A few laboratories have introduced ADS, CYP71AV1, CPR and DBR2 alone or together into tobacco plants (Farhi et al. 2013; Liu et al. 2011). When ADS alone was introduced into Nicotiana tabacum, transgenic plants could synthesize amorpha-4,11-diene, although the level was low (Wallaart et al. 2001). When ADS, CYP71AV1, DBR2, and ALDH1 were co-expressed in N. tabacum, transgenic plants produced detectable levels of amorphadiene and artemisinic alcohol. In addition, when ADS and CYP71AV1 were expressed in N. benthamiana, transgenic plants produced artemisinic acid diglucoside in the leaves (van Herpen et al. 2010). More importantly, when ADS, CYP71AV1, CPR and DBR2 were stacked together in a megatransgene construct and then introduced into N. tabacum, transgenic plants synthesized a low level of artemisinin (Farhi et al. 2011). Although these laboratories obtained different final products in their transgenic plants, these promising results showed the potential of tobacco plants as an alternative resource for metabolic engineering of artemisinin and its precursors.

Synthetic biology and semi-synthesis

To date, the semi-synthesis of artemisinin via engineered yeast strains is likely the most successful example of synthetic biology in plant natural products. Sanofi, a pharmaceutical company that owns the semi-synthetic patent to produce artemisinin, reported that it could produce 60 tons of artemisinin in 2014 using synthetic biology and semi-synthetic chemistry approaches (Turconi et al. 2014). Although this approach has not been cost-effective for producing artemisinin (Peplow 2016), it has shown significant potential to supplement the artemisinin supply in the battle against malaria. Several recent reviews highly applauded this success (Abdin and Alam 2015; Corsello and Garg 2015; George et al. 2015; Majdi et al. 2016; Paddon and Keasling 2014). Accordingly, we do not review the biochemical theories and technologies developed in synthetic biology. Instead, we discuss what this success implies for us. In 2006, two laboratories independently produced artemisinic acid using engineered yeast strains (Ro et al. 2006; Teoh et al. 2006). One of the main differences between their experiments was that Ro et al. (2006) expressed ADS, CYP71AV1, and CPR (now named CPR1) in their engineered yeast lines, while Teoh et al. (2006) expressed CYP71AV1 alone. Using ADS, CYP71AV1, and CPR1, Ro et al. (2006) produced approximately 32 mg/l artemisinic acid and trace levels of artemisinic alcohol and aldehyde. In this system, CPR1 was used as a partner of CYP71AV1. In contrast, Teoh et al. (2006) could only detect a low level of artemisinic alcohol when amorphadiene was incubated with microsomes containing CYP71AV1 alone. In addition, Teoh et al. (2006) could detect artemisinic aldehyde and artemisinic acid when they incubated CYP71AV1-containing microsomes with artemisinic alcohol and aldehyde, respectively. These data indicated that CYP71AV1 expressed alone in yeast could also convert amorphadiene to artemisinic acid when the concentrations of substrates were sufficient in the system. 
This is likely because yeast expresses diverse CPRs (Lamb et al. 1999). In addition, a plant CPR homolog has been reported to complement cpr-deficient yeast (Cabello-Hurtado et al. 1999). Accordingly, results from the two laboratories not only showed the sequential reaction steps catalyzed by CYP71AV1, but also implied that the engineered yeast strains could express a CPR1-like homolog.

Paddon et al. (2013) further isolated another native partner of CYP71AV1 from A. annua, namely CYB5. When CYB5 was co-expressed with CYP71AV1 and CPR1, the yield of artemisinic acid was increased to approximately 300 mg/l (Paddon et al. 2013). This experiment demonstrated the importance of a cognate reductase that is associated with the oxidation of amorphadiene into artemisinic acid.

Based on these accomplishments in synthetic biology, it can also be inferred that ADS and CYP71AV1 are likely specific to A. annua or other plants, while CPR1 and CYB5 homologs are expressed in yeast, because yeast strains can produce artemisinic acid in the presence of CYP71AV1 alone. In addition, synthetic biology showed that the final product of the co-expression of ADS, CYP71AV1, CPR1, CYB5, and ADH1 in yeast strains is artemisinic acid but not artemisinin. This result provides indirect evidence that does not support the currently favored hypothesis that the final steps of artemisinin formation occur via a spontaneous photo-oxidation reaction in the presence of singlet oxygen. This hypothesis was formed based on the synthetic steps from artemisinic acid to artemisinin in the presence of singlet oxygen, including reduction, esterification, ‘ene-type’ reaction, and Hock fragmentation and rearrangement (Paddon and Keasling 2014; Paddon et al. 2013; Turconi et al. 2014). However, these steps do not proceed in yeast culture or fermentation even though singlet oxygen can exist. Therefore, these results suggest that, in addition to the plant-specific ADS and CYP71AV1, A. annua may have a specific system to catalyze the formation of the trioxane.

Questions, challenges, and perspectives

The goal of elucidating artemisinin biosynthesis is to increase the yield of this effective anti-malarial medicine. As described above, past research endeavors have taught us that this task is challenging. Furthermore, previous efforts have made clear that many questions remain to be answered before this goal can be reached. A few fundamental questions are discussed below.

Is artemisinin biosynthesized from the biochemical pathway that starts with amorphadiene and proceeds through dihydroartemisinic acid (Fig. 1)? This question is raised based on the challenges of metabolic engineering (Liu et al. 2011; Tang et al. 2014) discussed above. As reported for other natural products (Dixon 2005), successful metabolic engineering is based on a known biosynthetic pathway elucidated by genetics, biochemistry, transgenics, and cell biology studies. For example, anthocyanin biosynthesis in different plants has been intensively elucidated by genetic, biochemical, transgenic, and cell biology studies (Holton and Cornish 1995; Shi and Xie 2014; Winkel-Shirley 2001). To date, metabolic engineering for high production of anthocyanins has become relatively feasible via manipulation of either transcription factors or pathway genes (Butelli et al. 2008; Shi and Xie 2011, 2014; Zhou et al. 2008; Zuluaga et al. 2008). Another example is the biosynthesis of proanthocyanidins via the anthocyanidin reductase (ANR) pathway (Peel et al. 2009; Peng et al. 2012; Sharma and Dixon 2005; Xie et al. 2003, 2006). To date, metabolic engineering of the ANR pathway is feasible in forage crops (Peel et al. 2009; Xie et al. 2006). As described above, most known genes of the proposed artemisinin pathway have been overexpressed either in A. annua or in tobacco plants (Farhi et al. 2013; Tang et al. 2014). However, no transgenic plants have been reported to be of agricultural significance for production of artemisinin (Tang et al. 2014; Xiang et al. 2015; Yuan et al. 2015). This discouraging outcome may result from different factors, such as gene expression levels, metabolic flux competition, trichome localization limitations, unknown final steps, and others. Of these, the two most fundamental factors that hinder production improvement are the incompletely known biosynthetic pathway and unclear genetics. 
To date, most known pathway data have resulted from cell-free enzyme assays, recombinant protein incubation, or synthetic combinations of genes in engineered yeast strains. Although these data have fundamentally enhanced the understanding of the biochemical pathway, solid genetic evidence has been lacking to support all known biochemical steps from artemisinic alcohol to artemisinic acid and dihydroartemisinic acid (Fig. 1). To confirm gene functions in planta, different research groups have generated transgenic A. annua plants. However, given that most transgenic data resulted from heterozygous progeny without the same genetic background, the effects of transgene overexpression on artemisinin production were confusing. As comprehensively reviewed by Tang et al. (2014), the contents of artemisinin were highly variable, from 0.06 to 2.5%, even among transgenic plants generated in the same laboratory. This severe variation was caused by the fact that most of those transgenic plants were generated from heterozygous wild-type progeny, in which segregation occurs. Accordingly, transgenic and wild-type control plants used for artemisinin comparison may not have the same original genetic background. If low artemisinin-content individuals were randomly selected as controls, the artemisinin contents of T0 transgenic and wild-type plants would not be comparable.
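
The control-selection problem described above can be illustrated with simple arithmetic. The sketch below uses entirely hypothetical artemisinin contents (they are not data from Tang et al. 2014 or any other cited study) to show how the apparent effect of a transgene depends on which segregating wild-type individual happens to serve as the control.

```python
def fold_change(transgenic_pct: float, control_pct: float) -> float:
    """Ratio of transgenic to control artemisinin content (% dry weight)."""
    if control_pct <= 0:
        raise ValueError("control content must be positive")
    return transgenic_pct / control_pct


# Hypothetical values, for illustration only (% g/g, DW).
TRANSGENIC_CONTENT = 0.8
SEGREGATING_CONTROLS = [0.1, 0.4, 0.8, 1.2]  # heterozygous wild-type progeny

# The same transgenic plant appears improved, unchanged, or worse,
# depending solely on the control individual chosen.
FOLD_CHANGES = [round(fold_change(TRANSGENIC_CONTENT, c), 2)
                for c in SEGREGATING_CONTROLS]

if __name__ == "__main__":
    for control, fc in zip(SEGREGATING_CONTROLS, FOLD_CHANGES):
        print(f"control {control:.1f}% -> apparent fold change {fc}")
```

With a homozygous background, all control plants would be expected to cluster around one value, so a single fold change would describe the transgene effect.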

To date, only the first committed step catalyzed by ADS has been solidly demonstrated by different research approaches. Bouwmeester et al. (1999) first demonstrated a native ADS activity. Later, the heterologous expression of ADS alone or coupled with other genes was shown to convert FPP to amorphadiene and other metabolites (such as artemisinin) in tobacco plants (Chang et al. 2000; Farhi et al. 2011; van Herpen et al. 2010; Wallaart et al. 2001; Zhang et al. 2011). Recombinant expression in bacteria or yeast also provided strong evidence for the catalytic function of ADS (Jiang-Qiang et al. 2009; Lindahl et al. 2006; Martin et al. 2003; Ro et al. 2006; Tsuruta et al. 2009). We recently provided molecular evidence for the effects of ADS on artemisinin formation in planta by overexpression and downregulation in homozygous A. annua plants (Ma et al. 2015). In addition, multiple transcriptional analyses revealed the association of ADS expression with artemisinin contents in A. annua (Alejos-Gonzalez et al. 2011; Arsenault et al. 2010; Kim et al. 2008; Ma et al. 2015). Together, these data support the conclusion that ADS catalyzes the first committed step of artemisinin biosynthesis. However, in comparison with ADS, genetic and other data are still needed to confirm the other steps in planta (Fig. 1). The new CRISPR/Cas9 technology (Lowder et al. 2015) provides a genome-editing tool to manipulate A. annua plants and elucidate the biosynthetic pathway of artemisinin.

In addition to glandular trichomes, can other cells produce artemisinin? Glandular trichomes are a morphological feature of A. annua (Duke and Paul 1993; Ferreira and Janick 1995). Two fundamental previous studies showed that glandular trichomes were tightly associated with artemisinin production. Ferreira and Janick (1995) designed a floral dipping experiment to extract artemisinin in petroleum ether or acetonitrile in 60 s. This rapid extraction indicated that trichomes were the most likely site of artemisinin biosynthesis, given that artemisinin could hardly be extracted from other cells in such a short time. Telle et al. (1999) identified a natural A. annua mutant lacking glandular trichomes. Metabolite analysis determined that the glandless mutant did not produce artemisinin. The results of these two studies formed an appropriate platform to search for candidate genes that are likely involved in artemisinin biosynthesis. Transcriptome sequencing of glandular trichomes from different ecotypes was subsequently conducted by a few laboratories (Covello et al. 2007; Graham et al. 2010; Teoh et al. 2006; Wang et al. 2009; Xiao et al. 2016). These sequencing data successively allowed cloning of genes such as CYP71AV1, CPR1, CYB5, ADH1, ALDH1 and others as described above, as well as a few transcription factor genes (Ma et al. 2009; Shen et al. 2016; Tan et al. 2015; Zhang et al. 2015). All these studies provided fundamental data supporting the view that artemisinin formation occurs in glandular trichomes.

The question raised here is based on a potential paradox between the reported artemisinin contents and the biomass of glandular trichomes in the different germplasm and transgenic plants studied. As discussed above in the section on genetic breeding, artemisinin contents were reported in a range of 0–1.4% (dry weight) (Chan et al. 1995; Charles et al. 1990; Delabays et al. 1993, 2001; Paul et al. 2010). In addition, artemisinin contents were reported to fall in a range of 0.3–2.5% (dry weight) in transgenic A. annua plants (Tang et al. 2014; Yuan et al. 2015). Plants with more than 1% content are promising for increasing the yield of artemisinin. However, if artemisinin is produced only in glandular trichomes, contents of 1–2.5% would require glandular trichomes to account for more than 1–2.5% of the dry biomass. We recently estimated trichome density in our self-pollinated plants (Alejos-Gonzalez et al. 2011). We observed that the dry biomass of glandular trichomes could not reach 1%. To date, the dry biomass of glandular trichomes has not been reported for other cultivars, particularly the high-content hybrid cultivars. Therefore, estimating glandular trichome biomass will be highly valuable for determining the localization of artemisinin biosynthesis. Two possible results can be expected from a measurement of glandular trichomes. On the one hand, if the high-content cultivars cannot develop more than 1–2.5% dry weight of glandular trichomes, it can be deduced that the formation of artemisinin also occurs in other tissues. On the other hand, if the dry biomass of glandular trichomes can reach more than 1–2.5% of the total plant tissue biomass, it may indicate that glandular trichomes are likely the sole site. In addition, to understand the roles of glandular trichomes, another useful method is to create non-natural A. annua mutants that lack glandular trichomes and then functionally characterize artemisinin biosynthesis. 
In conclusion, answers to this question are fundamental to directing metabolic engineering of artemisinin in the future.
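
The mass-balance reasoning behind this paradox can be written out explicitly. The sketch below is a minimal illustration, assuming that all artemisinin resides in glandular trichomes and that artemisinin makes up a given share of trichome dry mass; the function name and the 50% share used in the example are hypothetical and are not measured values.

```python
def min_trichome_fraction(artemisinin_pct_dw: float,
                          artemisinin_share_of_trichome: float = 1.0) -> float:
    """Minimum trichome share of plant dry weight (%) needed to hold the
    reported artemisinin content, if artemisinin occurs only in trichomes.

    artemisinin_pct_dw: artemisinin content of the tissue (% g/g, DW).
    artemisinin_share_of_trichome: fraction (0-1] of trichome dry mass that
        is artemisinin; 1.0 is the physical upper limit (assumption).
    """
    if not 0 < artemisinin_share_of_trichome <= 1:
        raise ValueError("share must be in the interval (0, 1]")
    if artemisinin_pct_dw < 0:
        raise ValueError("content must be non-negative")
    return artemisinin_pct_dw / artemisinin_share_of_trichome


if __name__ == "__main__":
    for content in (1.0, 2.5):
        # Even if trichomes were pure artemisinin, they would need to make
        # up at least `content` percent of the dry biomass.
        print(content, min_trichome_fraction(content))
        # Hypothetical case: artemisinin is half of trichome dry mass.
        print(content, min_trichome_fraction(content, 0.5))
```

Because trichomes also contain cell walls, proteins, and other terpenoids, the true artemisinin share is below 1, so the required trichome biomass is strictly larger than the artemisinin content itself.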

How does the terpenoid network control artemisinin biosynthesis and metabolic engineering? A. annua is rich in terpenoids. Brown (2010) comprehensively reviewed the previous literature and summarized 330 terpenoid molecules (including 134 monoterpenoids and 174 sesquiterpenoids) identified from this species. Common sesquiterpenoids and monoterpenoids include acyclic, monocyclic, bicyclic, and tricyclic molecules (Figs. 2, 3). For example, farnesene, germacrene, caryophyllene, and 8-epi-cedrol (Fig. 2) are prevalent sesquiterpenoids in A. annua. Linalool, camphor, 1,8-cineole, and terpinene are common monoterpenoids. Although many other metabolites are produced by different ecotypes or depend on field growth conditions, this large number of terpenoids indicates the complexity of the biosynthetic network existing in this anti-malarial plant. Previous field trials reported that the yield of essential oils was much higher than that of artemisinin (Ahmad and Misra 1994; Chalchat et al. 1994; Woerdenbag et al. 1994). For example, a study in India showed that a one-hectare field produced 7.1 kg artemisinin and 91 kg essential oil (Ram et al. 1997). Here, it should be noted that the production of artemisinin in field-grown plants resulted from accumulation over the entire growth period, while the production of essential oils only represented a “transient” yield at the moment of harvest, given that essential oil components are volatile. If the calculation were based on the entire growth season, the total yield of essential oils would be much higher than 91 kg/ha. These previous studies indicate that the biochemical activities toward monoterpenoids and non-amorphadiene sesquiterpenoids are much higher than those toward artemisinin. We recently identified 19 non-amorphadiene volatile sesquiterpenoid molecules (Fig. 2) and 25 monoterpenoids (Fig. 3) from a self-pollinated A. annua cultivar grown in a growth chamber (Ma et al. 2015). 
Estimating the content of five compounds showed that their summed production was significantly higher than the summed production of artemisinin, artemisinic acid and arteannuin B in different positional leaves and inflorescences at different flowering times (Fig. 4). In addition, gene expression profiling showed that the total transcript abundance of monoterpene and non-amorphadiene sesquiterpenoid synthase genes (Fig. 1) was much higher than that of ADS in different tissues. These results indicate that the total pathway activities toward other sesquiterpenoids and monoterpenoids are much higher than the pathway activity toward artemisinin. Furthermore, 47 pathway genes and all detected terpenoids were integrated to develop a metabolic network that visualized gene–gene, gene–metabolite, and metabolite–metabolite correlations in six analyzed tissues. The resulting network shows that four known artemisinin-pathway genes (CPR1, ADS, CYP71AV1, and DBR2) and one new candidate (CPR2) are five elements associated with the accumulation of 34 metabolites in six tissues (Ma et al. 2015). This observation suggests that metabolic network-based regulation critically controls artemisinin biosynthesis and is fundamental to metabolic engineering.
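
How a gene–metabolite correlation network of this kind might be assembled can be sketched in a few lines. The example below is only an illustration: it assumes Pearson correlation and an arbitrary cutoff of 0.8, neither of which is stated in the text as the method of Ma et al. (2015), and the six-tissue profiles are invented placeholder values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


def correlation_edges(profiles, threshold=0.8):
    """Return (node_a, node_b, r) for every pair with |r| >= threshold."""
    edges = []
    for a, b in combinations(profiles, 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 2)))
    return edges


if __name__ == "__main__":
    # Hypothetical profiles across six tissues (placeholder numbers only).
    profiles = {
        "ADS transcript": [5.0, 8.0, 12.0, 20.0, 35.0, 15.0],
        "artemisinin": [0.05, 0.08, 0.15, 0.22, 0.40, 0.18],
        "caryophyllene": [0.9, 0.7, 0.6, 0.4, 0.2, 0.5],
    }
    for edge in correlation_edges(profiles):
        print(edge)
```

Each returned edge links two nodes (genes or metabolites) whose profiles rise and fall together, or in opposition, across the tissues; with only six tissues, such correlations are suggestive rather than conclusive.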

Fig. 2 Structures of twenty-one sesquiterpenoids identified from a self-pollinated A. annua cultivar

Fig. 3 Structures of twenty-six monoterpenoids identified from a self-pollinated A. annua cultivar

Fig. 4 Comparison between the total production of artemisinin, artemisinic acid and arteannuin B and the total production of two monoterpenes and three non-amorphadiene sesquiterpenoids in 13 tissues. a Pictures of six different tissues; b total production summed from two monoterpenes and three non-amorphadiene sesquiterpenoids; c total production summed from artemisinin, artemisinic acid, and arteannuin B. It must be noted that the production of β-pinene, 1,8-cineole, α-copaene, caryophyllene, and β-farnesene only shows their yield in tissues at the harvest moment, while the production of artemisinin, artemisinic acid, and arteannuin B is the result of accumulation during the entire tissue development

Can artemisinin biosynthesis be made independent of the spatial and developmental regulation of plants via metabolic engineering? Multiple previous reports reproducibly showed that artemisinin accumulation in different heterozygous ecotypes reached a peak value prior to or during blooming (Arsenault et al. 2010; Chan et al. 1995; Ferreira 2008; Ferreira et al. 1995a; Liersch et al. 1986). Those researchers also showed that the time of maximum production sometimes shifted, causing yield instability. These disadvantageous features make it challenging to stabilize the artemisinin yield from plants (Corsello and Garg 2015; Paddon and Keasling 2014; Paddon et al. 2013; Ro et al. 2006; Turconi et al. 2014). To understand whether flowering can increase artemisinin biosynthesis, a flowering-promoting factor (FPF) was introduced into A. annua. Transgenic plants were reported to flower 20 days earlier than wild-type ones. However, the contents of artemisinin were similar between the early flowering transgenics and non-transgenic vegetative control plants (Wang et al. 2004), suggesting that the flowering time itself may not be essential in artemisinin biosynthesis. Recently, we also profiled artemisinin contents in 14 tissues of self-pollinated homozygous plants and reproducibly found that the peak value of artemisinin occurred in the head inflorescence buds (HB) (Ma et al. 2015). All of these data indicate that, in addition to glandular trichomes, other mechanisms are likely associated with the biosynthesis of artemisinin. To understand this metabolic phenomenon, we compared RPKM values of ADS, CYP71AV1, DBR2, and ALDH1. The RPKM values of ADS and ALDH1 in HB were the highest in six tissues. In addition, the RPKM values for CYP71AV1, DBR2, and ALDH1 in HB were the second highest. These transcriptional data indicate that the accumulation pattern is associated with gene expression (Ma et al. 2015). 
All these data reveal that peak artemisinin biosynthesis is spatially and temporally dependent upon plant growth. Therefore, to improve artemisinin biosynthesis, this dependence must be genetically perturbed using an effective technology, such as the development of master regulatory complexes as tools. For example, anthocyanin biosynthesis is also highly dependent upon plant development (Shi and Xie 2010; Xie et al. 2003). However, this dependence was overcome by expression of a master regulator, namely production of anthocyanin pigment 1 (PAP1, an R2R3-MYB transcription factor), and its partners (Shi and Xie 2011, 2014; Xie et al. 2006; Zhou et al. 2012). PAP1 transgenic plants and cells produce high levels of anthocyanins (Xie et al. 2006; Zhou et al. 2008). Taken together, although plant development controls artemisinin biosynthesis, new technologies have a high potential to perturb the dependence on a specific growth period and thereby improve yield.
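
For readers unfamiliar with the RPKM unit used above, the standard definition (reads per kilobase of transcript per million mapped reads) can be written as a short function. The read counts, gene length, and tissue values in the example are hypothetical and are not taken from Ma et al. (2015); only HB is a tissue label used in the text, and the other labels are placeholders.

```python
def rpkm(mapped_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = 1e9 * reads_on_gene / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    if mapped_reads < 0:
        raise ValueError("read count must be non-negative")
    return 1e9 * mapped_reads / (gene_length_bp * total_mapped_reads)


def rank_tissues(rpkm_by_tissue: dict) -> list:
    """Tissue names ordered from highest to lowest RPKM for one gene."""
    return sorted(rpkm_by_tissue, key=rpkm_by_tissue.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical: 500 reads on a 2,000 bp transcript in a 10 M read library.
    print(rpkm(500, 2000, 10_000_000))
    # Hypothetical per-tissue values for one gene (placeholder labels).
    print(rank_tissues({"HB": 120.0, "leaf_a": 45.0, "leaf_b": 80.0}))
```

Normalizing by both transcript length and library size is what allows expression of the same gene to be compared across tissues sequenced to different depths.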

In addition, there are many other questions. For example, how does metabolic crosstalk between plastids and the cytosol control artemisinin production? Can geranyl diphosphate be transported to the cytosol? What is the concentration of singlet oxygen in plant cells, particularly in glandular trichomes? What enzymes catalyze the final step(s) to artemisinin? Answers to these questions will help elucidate the biosynthetic pathway and support the development of metabolic engineering approaches to improve artemisinin production.

Conclusion

Past research endeavors discovered the biochemical pathway from amorphadiene to artemisinic acid and dihydroartemisinic acid. On the one hand, this discovery has led to an industrial-scale success in semi-synthesis and a promising increase of artemisinin content in plants. On the other hand, many questions remain to be answered before the biosynthetic pathway of artemisinin is fully elucidated. Genetic evidence for the pathway is particularly lacking. These unanswered questions hinder efforts in metabolic engineering for high production of artemisinin. Therefore, an integrated approach including forward genetics, reverse genetics, biochemistry, omics, and metabolic engineering is essential to elucidate the biosynthetic pathway and regulation of artemisinin and to increase yield for ACT.