Introduction

C4 photosynthesis is a distinct metabolic pathway found in certain plant species (so-called C4 plants) that initially fixes CO2 into a four-carbon (C4) molecule, oxaloacetate, rather than into the C3 molecule 3-phosphoglycerate, as occurs in the majority of photosynthetic organisms (C3 plants). Oxaloacetate is synthesized by the carboxylation of phosphoenolpyruvate (PEP) by PEP carboxylase (PEPC) and is converted into malate or aspartate in the mesophyll cells of the leaves. These molecules are then transported to the bundle sheath cells around the veins and decarboxylated to release CO2. The CO2 released in the bundle sheath cells is fixed by the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (RubisCO) and metabolized via the Calvin–Benson–Bassham (CBB) cycle in a similar fashion as in C3 species. The remaining carbon backbone is transported back to the mesophyll cells as pyruvate, alanine and PEP and then recycled to regenerate PEP, which serves as the carboxyl acceptor molecule (Fig. 1). Three pathways have been proposed to mediate C4 photosynthesis. These pathways are named the NADP+ dependent malic enzyme (NADP-ME), NAD+ dependent malic enzyme (NAD-ME) and PEP carboxykinase (PEPCK) subtypes, according to the enzymes mediating the decarboxylation reaction in the bundle sheath cells (Fig. 1). It should be noted that these pathway subtypes have different requirements for redox co-factors (NAD+/NADH and NADP+/NADPH) and ATP at specific cellular locations (Hatch 1987). Historically, each C4 plant species was thought to predominantly operate one of these pathway subtypes. However, recent analyses indicate that these pathways operate simultaneously in a single species (Furbank 2011; Wang et al. 2014b; Weissmann et al. 2016; Arrivault et al. 2017). Through the C4 photosynthesis cycle, CO2 is concentrated in bundle sheath cells with thick cell walls, which prevent CO2 from diffusing out of the cells.
As such, the C4 photosynthesis pathway serves as a CO2 concentrating shuttle that achieves a high CO2 concentration in the proximity of RubisCO in the bundle sheath cells. The high CO2 concentration leads to efficient carboxylation with a low rate of the oxygenation reaction by RubisCO due to the high CO2/O2 ratio in the chloroplast. As a result, C4 plants are estimated to be 50% more efficient in photosynthetic carbon fixation than C3 species (Sage and Monson 1999). Additionally, C4 species generally show higher water use efficiency. C4 plants also do not require a high CO2 concentration for carboxylation thanks to the high affinity of PEPC for HCO3−, allowing the stomata to open less to fix the same amount of CO2 as C3 plants (Sage and Monson 1999). Due to the superior features of C4 over C3 photosynthesis, worldwide efforts are currently underway to engineer C4 photosynthesis into C3 crops like rice (Suzuki et al. 2000, 2006; Ku et al. 2001; Agarie et al. 2002; Greco et al. 2012; Peterhansel and Offermann 2012; von Caemmerer et al. 2012; Leegood 2013). C4 photosynthesis evolved independently to form 61 lineages during the evolutionary history of vascular plants (Sage 2016). All essential enzymes of the C4 pathway exist in C3 plants, and some parts of the C4 pathway are operational in C3 plants (Hibberd and Quick 2002). A simulation study suggests that the modules of biochemical traits were acquired stepwise during the evolution of complete C4 pathways from C3 plants (Heckmann et al. 2013). Additionally, comparative transcriptomic studies identified sets of genes differentially expressed among different types of C4 and C3 species (Brautigam et al. 2011; Leegood 2013; Bräutigam et al. 2014; Wang et al. 2014a, b). These studies provided a blueprint for engineering C4 photosynthesis in C3 species.
It has been hypothesized and demonstrated that precise, quantitative coupling is vital to ensure that successful engineering of C4 photosynthesis in C3 plants simultaneously increases the efficiency of carbon fixation and of nitrogen and water use (von Caemmerer et al. 2012; Wang et al. 2012). However, a complete understanding of the C4 pathway operation is still lacking.

Fig. 1

Pathways of C4 photosynthesis. Species with all three active subtype pathways are assumed. The pathways of NADP+ dependent malic enzyme (NADP-ME), NAD+ dependent malic enzyme (NAD-ME) and phosphoenolpyruvate carboxykinase (PEPCK) subtypes are shaded with yellow, red and blue, respectively. The reactions shared by multiple pathways have mixed color. Arrows indicate reactions or transport between different compartments. Dashed arrows represent the diffusion of CO2. Enzymes mediating the reaction are described in italic. Co-factors generated by the reaction are shown by blue letters. 2OG, 2-oxoglutarate; 3PGA, 3-phosphoglycerate; Ala, alanine; AlaAT, alanine aminotransferase; Asp, aspartate; AspAT, aspartate aminotransferase; CBC, Calvin–Benson cycle, Glu, glutamate; Mito, mitochondria; MDH, malate dehydrogenase; OAA, oxaloacetate; PEP, phosphoenolpyruvate; PEPC, PEP carboxylase; PPDK, pyruvate phosphate dikinase; Pyr, pyruvate; TP, triose phosphate. (Color figure online)

Recent developments in metabolic flux analysis (MFA) and constraint-based reconstruction and analysis (COBRA), including flux balance analysis (FBA), have the potential to reveal the guiding principles of C4 metabolic pathways and their interactions with other parts of metabolism at the system level (Kruger and Ratcliffe 2009). MFA and FBA can analyze the metabolic pathway structure and activity (i.e., the rate of chemical conversion; flux) in vivo (Dieuaide-Noubhani and Alonso 2014). Here, we categorize MFA and FBA as experimental and theoretical approaches, respectively. MFA is conducted through feeding experiments with radio- or stable-isotope-labeled molecules, while FBA analyzes the flow of metabolites in network-based models reconstructed from genomic and/or gene expression information. Each approach has specific advantages and limitations and has contributed to elucidating the nature of C4 photosynthesis in different ways. In this article, we introduce different types of MFA, metabolic modeling, and FBA-based analyses, and review how these experimental and computational tools have contributed to elucidating the structure and operation of C4 photosynthetic pathways to date. We also discuss how MFA can be used in concert with metabolic network modeling to dissect the C4 photosynthetic machinery and plant systems as a whole, enabling a comprehensive understanding of the metabolic network in C4 plants.

Metabolic flux analysis

Metabolic flux analysis (MFA) is based on isotope tracer experiments, in which substrate molecules containing atoms of rare isotopes are fed to a biological system and the distribution of isotopic atoms in individual metabolites in the system is analyzed after a set period of time. Historically, relatively stable radioisotopes (e.g., 14C, 3H, 32P, 35S) or stable isotopes (e.g., 13C, 2H, 15N, 18O) have been used as tracers. Autoradiograms and scintillation counters are typically used to determine the radioisotope labels in metabolites, while mass spectrometry (MS) and nuclear magnetic resonance (NMR) are the methods used to quantify the stable isotope labels. Each type of isotope tracer has advantages and limitations (Batista Silva et al. 2016), which will be discussed in a subsequent section. MFA is also classified into two types, isotopically steady-state and nonstationary (also called “dynamic”) MFA. In both approaches, the metabolic pathways in the biological system should maintain stable fluxes (i.e., metabolic steady-state; Fig. 2b) during the labeling period. All variants of MFA assume metabolic steady-state to avoid complexity in the flux calculations. Therefore, experiments should be completed within a metabolic steady-state in order to obtain accurate metabolic flux information. Isotopically nonstationary MFA is designed as a time course labeling experiment, in which a labeled compound is fed to a biological system and label accumulation in individual metabolites is analyzed at multiple time points before the metabolite labeling reaches saturation (Fig. 2c). The kinetics of label accumulation in metabolites indicate the sequence of reactions in the metabolic pathways and the rates of reactions, which can be calculated from the slope of the label accumulation curve (Fig. 2c). Additionally, the “pulse-chase” experiment provides further evidence on the sequence of metabolic reactions.
In this experimental design, a labeled substrate is fed to the biological system for a short period (pulse) and then replaced with the unlabeled one (chase). The labels incorporated into metabolites are transferred to other molecules downstream in the metabolic pathways. The peak time of label accumulation in individual metabolites depends on the metabolic distances (i.e., number of reactions to produce the metabolites) and the reaction rates from the initially pulse-labeled metabolites. Nonstationary MFA is very useful in elucidating the structure of metabolic pathways and in the analysis of a few specific reactions. Therefore, it has been employed extensively in the analysis of photosynthetic metabolism. However, it is very difficult to estimate metabolic fluxes at metabolites with long metabolic distances from the applied substrate due to the distribution of label into multiple routes. The experimental setup also tends to be complicated because metabolic reactions must be quenched at exact time points. In steady-state MFA, label accumulation in metabolites is analyzed at one time point, after the label accumulation in all metabolites becomes stable (isotopic steady-state, Fig. 2c). Labeled substrates must contain both labeled and non-labeled atoms at fixed positions, which results in specific patterns of label accumulation in individual metabolites at isotopic steady-state. The obtained label accumulation values are applied to a pre-defined metabolic network model, and a set of metabolic flux parameters is estimated that best reproduces the experimentally determined label accumulation pattern (Fig. 2c). This method allows us to analyze the pathway fluxes in the overall metabolic network if an appropriate metabolic model is available.
Unfortunately, the steady-state MFA of C4 photosynthesis is very difficult due to the: (1) requirement of isotopic steady-state, which is difficult to achieve in plant systems due to significant diurnal alteration in metabolic flux, (2) lack of suitable labeled substrates since a CO2 molecule contains a single C atom, and (3) lack of good-quality metabolic models, including precise descriptions across multiple cell types. Therefore, the application of experimental MFA is limited to non-steady-state analysis for C4 photosynthesis research.
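The relationship between label accumulation kinetics, reaction sequence, and flux described above can be illustrated with a minimal sketch. It assumes a hypothetical linear pathway (substrate → a → b) at metabolic steady state, a fully labeled substrate supplied from time zero, well-mixed pools, and no competing routes; the flux and pool sizes are made-up numbers, not measured values. Under these assumptions, the initial slope of the enrichment curve of the first pool multiplied by its pool size approximates the flux, and the second pool labels with a lag.

```python
import math

def enrichment_first_pool(t, flux, pool):
    """Fractional enrichment of the pool fed directly by fully labeled substrate."""
    return 1.0 - math.exp(-flux / pool * t)

def enrichment_second_pool(t, flux, pool_a, pool_b):
    """Fractional enrichment of the next pool downstream (turnover constants must differ)."""
    k1, k2 = flux / pool_a, flux / pool_b
    return 1.0 - (k2 * math.exp(-k1 * t) - k1 * math.exp(-k2 * t)) / (k2 - k1)

# Hypothetical values for illustration only (arbitrary units).
FLUX = 1.0     # flux through the pathway, amount per second
POOL_A = 20.0  # pool size of metabolite a
POOL_B = 50.0  # pool size of metabolite b

times = [0.0, 1.0, 2.0, 5.0, 10.0, 30.0]
curve_a = [enrichment_first_pool(t, FLUX, POOL_A) for t in times]
curve_b = [enrichment_second_pool(t, FLUX, POOL_A, POOL_B) for t in times]

# Flux estimate from the initial slope of the first pool (first two time points).
initial_slope = (curve_a[1] - curve_a[0]) / (times[1] - times[0])
estimated_flux = initial_slope * POOL_A

print(f"estimated flux: {estimated_flux:.3f} (true value {FLUX})")
for t, ea, eb in zip(times, curve_a, curve_b):
    print(f"t={t:5.1f} s  a={ea:.3f}  b={eb:.3f}")
```

The estimate slightly underestimates the true flux because the curve already bends within the first sampling interval, which is one reason early and precisely timed quenching matters in nonstationary experiments.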

Fig. 2

Metabolic and isotopic steady state. a A metabolic pathway starting from substrate s to generate metabolites e and f is assumed here. b Changes in metabolite contents during a culture. At the metabolic steady-state, contents or accumulation/decrease rates of all metabolites are stable. All types of metabolic flux analysis assume metabolic steady state. c Changes in label enrichment in individual metabolites after the application of labeled substrates. Label accumulates in each metabolite over time and saturates at a certain level depending on the flux distribution in the metabolic network. At the isotopic steady-state, the label enrichment in all metabolites is stable. The label accumulation value reflects the metabolic flux and structure of the metabolic network. The fluxes at individual reactions are calculated based on the pre-defined metabolic network model (steady-state metabolic flux analysis). By a time-course experiment during the isotopic non-steady state, the metabolic flux is calculated from the slope of label accumulation. The time when the label enrichment reaches steady state reflects the pathway structure (isotopically nonstationary metabolic flux analysis)

Genome-scale metabolic modeling and constraint-based analysis

In addition to experimental approaches, cellular metabolic flux is also analyzed by a theoretical approach that uses genome-scale metabolic network models. A metabolic network model is a structured representation of the chemical transformations between cellular metabolites catalyzed by enzymes, which describes reaction stoichiometry and directionality as governed by thermodynamics, enzyme association and gene-protein-reaction relationships (GPRs), organelle-specific reaction and enzyme localization, transport of metabolites between intracellular organelles, nutrient and product exchange mechanisms within the growth environment, transcriptional/translational regulation, and biomass composition (Feist et al. 2009; Kumar et al. 2012). A metabolic network model can be used to predict metabolic flux under specific environmental conditions and/or genetic perturbations by defining the feasible metabolic space bounded by mass balance, thermodynamic, and environmental constraints. This approach facilitates the understanding of the physiology of a particular organism, hypothesis generation and experimentation, and metabolic engineering (Feist et al. 2009; Thiele and Palsson 2010; Saha et al. 2011; Islam and Saha 2018). The reconstruction of high-quality genome-scale metabolic models of microbes and plants has accelerated recently due to the rapid growth of genome sequencing and annotation data.

The major steps in a metabolic reconstruction procedure include: (a) developing a draft reconstruction, (b) manually refining the draft reconstruction, (c) converting the curated reconstruction to a mathematical model, and (d) validating and improving the model (Thiele and Palsson 2010). The draft reconstruction is based on the genome annotation of the organism of interest. The genome-encoded metabolic functionalities are included by utilizing information from biochemical databases like KEGG (Kanehisa and Goto 2000), BRENDA (Barthelmes et al. 2007), ModelSEED (Henry et al. 2010), and PlantSEED (Seaver et al. 2014). A number of automated reconstruction toolboxes such as Pathway Tools (Karp et al. 2002, 2010, 2016), KEGG Pathways (Kanehisa and Goto 2000; Du et al. 2014), MetaSHARK (Pinney et al. 2005), PUMA2 (Maltsev et al. 2006), and SimPheny (Price et al. 2003) have been developed in recent years. A fast and automated draft reconstruction is the first step in the model building process; in contrast, the tedious manual curation process in the model refinement step relies heavily on organism-specific experimental data covering the metabolic functionalities, biochemical reaction and thermodynamic information, correct gene-protein-reaction relationships, enzyme localization, and cellular constituent information. Once the metabolic reaction parameters are obtained, the metabolite charge and formula information are scrutinized to ensure correctness and proper reaction stoichiometry based on a standard cell protonation state, using information from available resources (Wheeler et al. 2000; Barthelmes et al. 2007; Henry et al. 2010; Pence and Williams 2010). Thermodynamic information about each reaction is obtained from literature data or estimated, and the reversibility of the reactions is assigned accordingly (Kümmel et al. 2006; Jankowski et al. 2008; Fleming et al. 2009).
Next, reactions are assigned to specific enzymes or genes using GPR relationships obtained from the literature, and enzymes are compartmentalized using available localization information or localization predicted by algorithms (Lu et al. 2004; Gardy et al. 2005; Yu et al. 2010; Finn et al. 2010). Once the localization is performed, intracellular transport reactions are added to allow metabolite flow between organelles, and exchange reactions are added to allow metabolite exchange with the extracellular environment. The next critical step is to experimentally estimate the biomass constituents (i.e., proteins, RNA, DNA, lipids, vitamins, pigments, and cofactors) of a cell, estimate the growth-associated energy requirement (for synthesis of macromolecules) and the non-growth-associated energy requirement (to maintain cellular functions), and compile a biomass production reaction with all biomass precursors in appropriate stoichiometric proportions. In the third step, all the reactions are assembled into a mathematical structure, namely the stoichiometric matrix, in which the stoichiometric coefficient of every metabolite in every reaction is listed. The next step involves the determination of auxotrophy and essential nutrients required for growth (biomass production) using experimental growth data or primary literature. Based on the growth yields and essentiality information, minimal, defined, and rich medium/environmental compositions are established and reported.
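As a minimal sketch of the third step, the snippet below assembles a stoichiometric matrix from a list of reactions and checks whether a candidate flux vector satisfies the mass balance for every internal metabolite. The network, the metabolite names (A, B, C), and the flux values are hypothetical and chosen only for illustration; a genome-scale model would contain thousands of reactions and would normally be handled by dedicated COBRA software.

```python
# Hypothetical toy network; reaction and metabolite names are illustrative only.
# Each reaction maps metabolite -> stoichiometric coefficient
# (negative = consumed, positive = produced).
REACTIONS = {
    "uptake":  {"A": 1},           # exchange reaction: -> A
    "r1":      {"A": -1, "B": 1},  # A -> B
    "r2":      {"A": -1, "C": 1},  # A -> C
    "r3":      {"C": -1, "B": 1},  # C -> B
    "biomass": {"B": -1},          # drain of B into biomass
}

def stoichiometric_matrix(reactions):
    """Return (metabolites, reaction names, S) with one row per metabolite."""
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    names = list(reactions)
    matrix = [[reactions[r].get(m, 0) for r in names] for m in metabolites]
    return metabolites, names, matrix

def mass_balance(matrix, fluxes):
    """Compute S * v; all entries are zero at metabolic steady state."""
    return [sum(s * v for s, v in zip(row, fluxes)) for row in matrix]

metabolites, names, S = stoichiometric_matrix(REACTIONS)

# Candidate flux vector in the order of `names` (arbitrary units).
fluxes = [10, 6, 4, 4, 10]
residual = mass_balance(S, fluxes)

for metabolite, row, r in zip(metabolites, S, residual):
    print(metabolite, row, "net production:", r)
```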

A myriad of tools (often called Constraint-Based Reconstruction and Analysis, or COBRA, tools) can be used to analyze the metabolic capabilities and nutrient flow behaviors, evaluate the model for further refinements, and use the model for efficient strain design and other bioengineering purposes (Varma and Palsson 1994a; Mahadevan et al. 2002; Burgard et al. 2003, 2004; Mahadevan and Schilling 2003; Pharkya et al. 2004; Palsson 2006; Joyce and Palsson 2008; Ranganathan et al. 2010). One of the most popular and widely used tools is flux balance analysis (FBA), which analyzes the flow of metabolites through a network model (Varma and Palsson 1994a; Oberhardt et al. 2009; Orth et al. 2010). FBA assumes a pseudo steady-state, in which the internal concentration of metabolites within a cellular system stays constant over a time period much shorter than the time scale of other biological processes, such as transcription or translation (Varma and Palsson 1994a, b; Orth et al. 2010). In addition to this constraint on mass balance, the availability of nutrients/electron acceptors and other environmental conditions can be imposed as environmental constraints, and thermodynamic information can be incorporated in terms of reaction reversibility. The effects of gene expression are captured by regulatory constraints on the metabolic fluxes that are subject to environmental changes (Terzer et al. 2009). Thus, the solution space of this under-determined system of equations represents the possible metabolic flux distributions under any given condition (Varma and Palsson 1993, 1994a), which can then be optimized with specific objective functions (e.g., the cellular growth rate or yield of a desired bioproduct) to simulate the biological behavior of the cell. The selection of the most biologically relevant objective function is critical to making an accurate prediction of the phenotypic behavior or physiological function of an organism.
Although maximization of the biomass reaction flux to mimic cellular growth has been the objective function of choice for the vast majority of in silico simulations of biological systems, there have been several studies pursuing the identification and interrogation of the likelihood of cellular objectives (Burgard and Maranas 2003; Gianchandani et al. 2008). An inherent property of a biological network is that the same solution of the objective function can have alternate optimal flux distributions, which are dependent on physiological and environmental growth conditions, and network properties. The extent and effect of this redundancy and degeneracy of the metabolic models can be assessed with mathematical tools based on linear and quadratic programming (Mahadevan and Schilling 2003). Although the steps of genome-scale metabolic model development and the utility of COBRA-based approaches have been evident mostly in microbial systems, much work remains to comprehensively explore the eukaryotic metabolism, especially in C4 plant species.
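The idea of optimizing an objective within a constrained flux space, and the existence of alternate optimal flux distributions, can be sketched on a deliberately tiny example. The network below is hypothetical: a substrate is taken up, converted to a biomass precursor by two parallel routes (r1 and r2), and drained into biomass. At steady state the biomass flux equals r1 + r2, so the problem reduces to a linear program in two variables, which is solved here by simple vertex enumeration rather than by the linear programming solvers that real FBA software relies on. All bounds are made-up numbers.

```python
from itertools import combinations

# Free fluxes are (r1, r2); each constraint is a*r1 + b*r2 <= c.
# Bounds are hypothetical and only serve to illustrate the idea.
CONSTRAINTS = [
    (-1.0, 0.0, 0.0),  # r1 >= 0 (irreversible)
    (0.0, -1.0, 0.0),  # r2 >= 0 (irreversible)
    (1.0, 1.0, 10.0),  # uptake limit: r1 + r2 <= 10
    (1.0, 0.0, 8.0),   # capacity of route 1: r1 <= 8
]

def feasible_vertices(constraints, tol=1e-9):
    """Intersect every pair of constraint boundaries and keep feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundaries never intersect
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in constraints):
            points.append((x, y))
    return points

def biomass(point):
    """Objective: biomass flux equals r1 + r2 at steady state."""
    return point[0] + point[1]

vertices = feasible_vertices(CONSTRAINTS)
best = max(biomass(p) for p in vertices)

# Vertices that reach the optimum span the set of alternate optimal solutions.
optimal = [p for p in vertices if abs(biomass(p) - best) < 1e-9]
r1_range = (min(p[0] for p in optimal), max(p[0] for p in optimal))
r2_range = (min(p[1] for p in optimal), max(p[1] for p in optimal))

print("maximum biomass flux:", best)
print("r1 can vary within", r1_range, "and r2 within", r2_range)
```

The objective value is unique, but the split between the two routes is not, which is the kind of degeneracy that flux variability type analyses are meant to quantify.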

Radioisotope-based metabolic flux analysis to elucidate metabolic pathway structure of C4 photosynthesis

Radioisotope-based MFA is a traditional method used in the analysis of photosynthetic carbon fixation pathways, which contributed to the discovery of pathways that include the CBB cycle and the C4 pathway (Calvin 1962; Hatch and Slack 1966). Radioisotope-labeled compounds are fed to the biological systems and the radioisotopes in metabolites are detected, typically by a liquid scintillation counter or an autoradiogram. Depending on the experimental requirements, metabolites need to be separated prior to the label determination by techniques which include chemical fractionation, liquid chromatography, thin-layer chromatography, and enzymatic degradation. The significant advantage of radioisotopes is the extremely high sensitivity during detection. This enables us to detect a tiny amount of the label at the very beginning of the feeding time course. While modern MS-based analyses also offer very high sensitivity, radioisotopes are still advantageous for the analysis of metabolites with a small pool size. Additionally, the radioisotope-based methods can be used for the bulk analysis of metabolic fluxes at the pathway- and system-levels. For example, a 14C-glucose feeding experiment can be used to analyze the respiratory flux distribution into the synthesis of sugar, starch, cell wall and protein, and respiratory CO2 release (Obata et al. 2017). The bulk information is very useful to understand overall metabolic flux in various levels of biological systems (i.e., organelle, cell, tissue, organ, organism, and community levels), and the radioisotope is readily used to analyze it (Batista Silva et al. 2016). On the other hand, it requires laborious separation steps for the analysis of individual metabolites; yet, the separation of molecules is sometimes still poor. The metabolite pool size, which is often required to analyze metabolic flux, must be determined by additional experiments.

14CO2-based MFA was used for the discovery of the C3 photosynthesis pathway by Calvin, Benson and Bassham, in which phosphoglycerate was found to be the metabolite initially labeled by 14CO2 feeding (Calvin 1962). The discovery of the C4 pathway began with the finding of much higher initial label accumulation in malate and aspartate than in phosphoglycerate in sugarcane (Kortschak et al. 1965). The pathways of CO2 fixation in sugarcane were further analyzed by radioisotope-based MFA in the milestone work by Hatch and Slack (1966). They observed rapid label accumulation in malate and aspartate within 1–2 s of 14CO2 application. The radioactivity in these compounds represented more than 90% of the total label in leaves, and this ratio decreased over time. Then the proportion of radioactivity in 3-phosphoglycerate transiently rose, followed by a continuous increase in the label accumulation in sucrose and glucan (Hatch and Slack 1966). The authors also conducted pulse-chase experiments in which the illuminated leaves were pulse-labeled for 15 s with 14CO2 and then chased with 12CO2 for up to 200 s. The 14C that accumulated in malate and aspartate moved into 3-phosphoglycerate, followed by hexose monophosphates, and was then incorporated into sucrose and glucan (Hatch and Slack 1966). These two experiments clearly revealed that C4 dicarboxylic acids, namely malate and aspartate, are the carbon fixation products and that the fixed carbon is transferred into sugars via 3-phosphoglycerate. Like malate and aspartate, oxaloacetate was also labeled within 1 s of labeling. In addition, the authors revealed that only the carbon atom at the C4 position of malate and aspartate was labeled in the primary phase. This transmission of the carbon atom at the C4 position is the hallmark of C4 photosynthesis and is clearly distinguishable from other metabolic processes involving carbon incorporation into malate by PEPC (Furbank 2016).
From these results, the authors proposed a carbon fixation pathway in sugarcane that is composed of two cycles: one is responsible for the fixation and transfer of a carboxyl group to an acceptor molecule, and the other regenerates the carboxyl acceptor (Hatch and Slack 1966). Although the spatial distribution of reactions and the decarboxylation/re-fixation process were not predicted, the overall scheme of the C4 pathways was established at that moment. The authors further tested leaves of different ages, under various CO2 concentrations and light intensities, as well as the leaves of 33 plant species, and confirmed the operation of the pathway in certain species including maize and sorghum, regardless of the growth conditions (Hatch et al. 1967). These findings were followed by the identification of the decarboxylation enzymes and their cell type specific localization in the leaves (Björkman and Gauhl 1969; Berry et al. 1970; Johnson and Hatch 1970; Edwards et al. 1970, 1971). Hatch further improved 14CO2 MFA to capture the label accumulation in the CO2 pool in planta (Hatch 1971). 14C-labeled leaves of maize and Amaranthus were kept in the dark in an air stream for a few seconds before fixation. This dark incubation step removes the 14CO2 remaining in the boundary air layer and intercellular spaces. Then the 14CO2 in the leaf cells was extracted, the radioactivity was counted, and the pool size of CO2 was determined. In the leaves of both species, the 14CO2 label in the intracellular CO2 pool declined with a half-time of 15–20 s, which was much slower than the expected half-time of 2 s that was calculated with the assumption of no intracellular CO2 donor (Hatch 1971). This research established the overview of the current model of C4 photosynthesis (Slatyer and Tolbert 1971).

At that stage of C4 photosynthesis research, many species were tested by 14C-MFA to be confirmed as C4 species (Hatch et al. 1967; Johnson and Hatch 1968; Hatch 1975). In these attempts, it turned out that the rates of label accumulation in malate and aspartate, and of label decrease during the chase phase, vary from species to species (Hatch et al. 1967; Johnson and Hatch 1968; Chen et al. 1971; Hatch 1971). Some species, including maize, sugarcane, and sorghum, preferentially accumulate 14C in malate, while other species, including Amaranthus, prefer aspartate (Johnson and Hatch 1968; Hatch 1971). These findings led to the establishment of the three pathway subtypes, together with the identification of species-specific decarboxylating enzymes localized in the bundle sheath cells (Johnson and Hatch 1970; Edwards et al. 1971; Hatch and Kagawa 1974). Hatch and Kagawa (1976) conducted 14C-MFA with isolated bundle sheath cells from plant species of different C4 subtypes to verify these proposed metabolic models. The bundle sheath cells isolated from C4 species were fed with 14C-labeled malate or aspartate together with an additional non-labeled substrate, and the rate of decarboxylation was analyzed. In the bundle sheath cells from maize, an NADP-ME subtype C4 plant, malate decarboxylation occurs exclusively in the light and also requires the provision of 3-phosphoglycerate. The low level of light-dependent O2 evolution suggests a very limited capacity for NADPH production by the photosynthetic light reaction in the bundle sheath cells of Z. mays. This suggests that the CBB cycle in the bundle sheath cells of NADP-ME type C4 plants relies on NADPH production almost exclusively from the NADP-ME reaction and requires an additional 3-phosphoglycerate supply from mesophyll cells to regenerate NADP+ to maintain the carbon-concentrating shuttle flux (Fig. 1).
In contrast, rapid light-dependent O2 evolution is observed from bundle sheath cells of NAD-ME-type C4 plants, e.g., Atriplex spongiosa and Panicum miliaceum. Isolated bundle sheath cells decarboxylate both malate and aspartate, and the decarboxylation of aspartate is stimulated by 2-oxoglutarate. This also supports the metabolic model in which 2-oxoglutarate works as an amino acceptor in the aspartate aminotransferase reaction, allowing aspartate to be converted into malate via oxaloacetate (Fig. 1). PEPCK-type plants also require 2-oxoglutarate for the decarboxylation of aspartate in the bundle sheath cells. The contribution of malate decarboxylation to the carbon transport shuttle is most likely very limited, since five times less 14CO2 is detected from 14C-malate than from 14C-aspartate feeding. This is in accordance with the model of the PEPCK subtype, with aspartate as the major source of CO2 (Fig. 1) (Hatch and Kagawa 1976).

As such, the radioisotope-based MFA played a crucial role in the discovery and pathway elucidation of C4 photosynthesis, together with enzymatic assays. The extremely high sensitivity of radioisotope detection makes the quantification of initial label accumulation possible. However, the low resolution in metabolite separation limits the more precise assessment of metabolic pathway fluxes and the simultaneous analysis of multiple pathways.

Stable isotope (13C)-based MFA to quantitatively analyze the C4 cycle operation

Recent development of metabolomics has enabled the analysis of label enrichment in a broad range of metabolites with high resolution. MS and NMR are the major platforms for metabolomics analysis, and both quantitatively discriminate compounds containing stable isotopes. Isotopically-labeled molecules emit NMR signals different from non-labeled ones, while MS can distinguish labeled and non-labeled metabolites by the difference in molecular mass (Fig. 3). NMR has the great capability to analyze labeling position in molecules, while MS offers superior sensitivity. MS-based analysis is the method of choice for the analysis of the photosynthetic metabolism due to the small pool sizes of the primary photoassimilates. MS is usually coupled with chromatographic separation including gas chromatography (GC) and liquid chromatography (LC), which provides fine separation of the metabolites in the biological samples. MS also contributes to discriminating the metabolites with similar retention times in the chromatographic separation. Collectively, the hyphenated technologies (e.g., GC-MS and LC-MS) can provide the analysis of each single metabolite in a broad range of chemical classes. Quantification of individual metabolite pools can also be done with the sum of peak intensities of isotopic molecules. A labeled molecule appears to have higher molecular mass than a non-labeled one, due to the number of heavy isotopic atoms incorporated into it (Fig. 3). Isotopically-labeled molecules are found even in non-labeled materials due to naturally abundant stable isotopes. Label accumulation is calculated as the ratio of isotopic atoms against the total number of atoms in the metabolite pool (Fig. 3). Importantly, MS-based analysis determines the number and the position of the labels in a molecule. Isotopologues are the molecules with different isotopic composition (i.e., number of isotope labels), while isotopomers are molecules with isotopes at specific positions. 
The composition of these isotopic molecules in a metabolite pool is often important to understanding the metabolic pathway organization/topology, which will be discussed in the subsequent paragraph (Weissmann et al. 2016; Arrivault et al. 2017). However, the stable isotope-based method is not well-suited for analyzing bulk metabolic fluxes, and sensitivity of detection is still an issue for analysis of metabolites with small pool sizes despite the dramatic increase in sensitivity of recent MS systems (Batista Silva et al. 2016).

Fig. 3

Determination of label accumulation by mass spectrometry (MS). The 13C label accumulation in a C3 molecule is analyzed. MS separately detects isotopically labeled molecules based on the mass/charge (m/z) ratio. Molecules containing n 13C atoms (isotopologues) have an m/z of m + n. Each isotopologue is composed of isotopomers, which have 13C at different positions. Non-labeled samples contain some isotopically labeled molecules due to naturally abundant 13C. 13C feeding leads to the enrichment of labeled molecules compared to the non-labeled sample. Label enrichment is calculated by dividing the sum of the isotopologue peak intensities (Im+n), each multiplied by its number of 13C atoms, by the total intensity of the isotopologue peaks multiplied by the number of C atoms in the molecule (3 in this example)
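The enrichment calculation described in the caption of Fig. 3 can be written as a short function. This is a minimal sketch: the function name and the peak intensities are hypothetical illustration values, and no correction for naturally abundant isotopes of other elements or of derivatization groups is applied.

```python
def label_enrichment(intensities, n_carbons):
    """Fractional 13C enrichment of a metabolite pool.

    intensities[n] is the peak intensity of the isotopologue with n 13C atoms
    (m/z of m + n), for n = 0 .. n_carbons.
    """
    if len(intensities) != n_carbons + 1:
        raise ValueError("expected one intensity per isotopologue (m+0 .. m+N)")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    labeled_atoms = sum(n * intensity for n, intensity in enumerate(intensities))
    return labeled_atoms / (n_carbons * total)

# Hypothetical peak intensities for a C3 molecule: m+0, m+1, m+2, m+3.
before_feeding = [96.7, 3.2, 0.1, 0.0]  # mostly natural abundance 13C
after_feeding = [70.0, 20.0, 8.0, 2.0]

print(f"before feeding: {label_enrichment(before_feeding, 3):.4f}")
print(f"after feeding:  {label_enrichment(after_feeding, 3):.4f}")
```

Summing the same intensities without weighting gives the relative pool size of the metabolite, which is why MS data can serve both purposes at once.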

The results of 14C-MFA (Hatch 1971) and recent transcriptomic approaches (Brautigam et al. 2011; Furbank 2011; Pick et al. 2011; Wang et al. 2014b) indicate the co-operation of multiple subtypes of C4 pathways in plant species. This gives rise to additional questions on the operation of C4 pathways, including: (1) How are the fluxes through these subtype pathways coordinated? (2) Is there interconnection between subtype pathways? (3) How do C4 pathways interact with other metabolic pathways, including the CBB cycle, respiratory pathways and photorespiration? Two 13C-MFA studies with similar experimental setups were conducted in recent years to address these questions (Weissmann et al. 2016; Arrivault et al. 2017). 13CO2-containing artificial air (mixture of approximately 78% N2, 22% O2 and 0.033 or 0.042% CO2) was fed to the middle of maize leaves, and the leaves were quenched after various time periods ranging from 1 s to 60 min. Custom chambers were used to ensure gas-tight sealing, rapid gas exchange, and controlled temperature and illumination. Metabolites were analyzed by GC-MS and LC-MS/MS to cover metabolites from the C4 and C3 cycles, photorespiration, and respiratory pathways (Heise et al. 2014). In both studies, very quick label accumulation in malate and aspartate was detected within the first 10 s and then a plateau was reached, supporting the operation of multiple subtype pathways in maize (Hatch 1971; Furbank 2011; Pick et al. 2011; Wang et al. 2014b). Isotopologues containing only one 13C were dominant in both compounds for the first 3 min (Weissmann et al. 2016; Arrivault et al. 2017), followed by accumulation of isotopologues containing a higher number of 13C (Arrivault et al. 2017). 
These observations reflect the constant fixation and release of 13C at one position of these molecules, and nicely reconfirm the model derived from the 14C experiments in which the C atom of CO2 is fixed into the C4 position of malate and aspartate in mesophyll cells and released in the bundle sheath cells (Hatch and Slack 1966; Hatch 1971, 1987). Contribution of malate and aspartate dependent pathways in the carbon transport shuttle is calculated as the initial 13C incorporation rates into each molecule. Weissmann et al. (2016) calculated a 2.9-fold higher transport ratio via malate than aspartate, while the ratio was estimated to be approximately tenfold by Arrivault et al. (2017). The difference in the pathway operation ratio likely reflects flexible operation of subtype pathways. Each pathway subtype requires different numbers of reducing equivalents, amino groups, and ATP in various cellular and subcellular compartments (Fig. 1). The changes in the ratio of C transport through subtype pathways are probably useful to meet metabolic demands in various environmental and developmental consequences (Stitt and Zhu 2014).
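The comparison of initial 13C incorporation rates described above can be illustrated as follows. This is a sketch under stated assumptions: the time points and labeled amounts are invented for illustration, the labeled amount is assumed to be enrichment multiplied by pool size, and an ordinary least-squares slope over the early linear phase stands in for whatever fitting procedure the cited studies actually used.

```python
def initial_rate(times_s, labeled_amounts):
    """Least-squares slope of labeled amount versus time (early linear phase)."""
    n = len(times_s)
    if n < 2 or n != len(labeled_amounts):
        raise ValueError("need at least two matched time points")
    mean_t = sum(times_s) / n
    mean_y = sum(labeled_amounts) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_s, labeled_amounts))
    return sxy / sxx


# Hypothetical early time course (seconds) and 13C atom equivalents per pool.
times = [0.0, 2.0, 4.0, 6.0]
malate_13c = [0.0, 6.0, 12.0, 18.0]
aspartate_13c = [0.0, 2.0, 4.0, 6.0]

# Ratio of carbon transport via malate relative to aspartate.
malate_to_aspartate = (initial_rate(times, malate_13c)
                       / initial_rate(times, aspartate_13c))
print(malate_to_aspartate)
```

Only the early, approximately linear part of the time course should enter such a fit, because label recycling and pool saturation flatten the curve at later time points.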

The possible interaction of C4 pathway subtypes is indicated by the metabolic flux phenotype of maize dct2 mutants (Weissmann et al. 2016). DCT2 (Dicarboxylic aCid Transporter2) is a chloroplast-localized transporter which transports 2-oxoglutarate and glutamate in exchange for malate. Its gene expression is induced by light specifically in bundle sheath cells, making DCT2 a candidate protein for mediating malate import into the chloroplast in bundle sheath cells in the NADP-ME subtype pathway (Taniguchi et al. 2004). The chloroplasts isolated from bundle sheath cells from the dct2 Activator-tagged mutant plants show severely diminished malate-dependent pyruvate production, indicating malate transport activity of DCT2 at the chloroplast in bundle sheath cells (Weissmann et al. 2016). NADP-ME subtype pathway activity is impaired in dct2 lines due to limited malate transport into the bundle sheath chloroplasts. Although the photosynthetic carbon assimilation is reduced in dct2 line to 3% of that in the wild type plants, contribution of the PEPCK pathway is increased to 55% of the net assimilation rate compared to 25% in the wild type (Weissmann et al. 2016). MFA using isolated bundle sheath cells with 2H- or 14C-labeled malate showed a higher rate of malate conversion to aspartate and alanine in dct2 than in the wild-type (Weissmann et al. 2016). Since the malate transport into the chloroplast is strongly diminished in dct2, malate accumulated in the cytosol is likely transported via a bypass as aspartate. The mitochondrial malate might also undergo a NAD-ME-like pathway to release CO2; although, experimental evidence is missing (Weissmann et al. 2016). These results clearly show that the metabolic flux of each of these three subtype pathways can be altered, and that the enzymes which are not directly involved in the C4 shuttle can also compose a bypass to ameliorate the compromised metabolic network.

13C-MFA also indicates complex interactions of C4 photosynthesis with other metabolic pathways. The C4 shuttle pathway provides CO2 to the CBB cycle, with part of the assimilated carbon going back to the shuttle as PEP via the interconversion of 3-phosphoglycerate, 2-phosphoglycerate and PEP, mediated by phosphoglycerate mutase and enolase (Fig. 1). The rate of carbon exchange between these pathways can be estimated from the ratio of 13C atoms in the carbon backbone of the C4 shuttle intermediates to the total amount of fixed 13C at the initial time points in the 13CO2 feeding experiments. Surprisingly, approximately 10% of total C fixation is incorporated into the carbon backbone of the C4 pathway intermediates (Arrivault et al. 2017). Additionally, the relatively slow label accumulation in the CBB cycle intermediate, between 5 and 20 min, suggests that the influx of unlabeled C is most likely from the carbon backbone (C1 to C3 position) of malate and aspartate (Arrivault et al. 2017). These suggest a substantial amount of C exchange between the C4 shuttle and the CBB cycle. The C4 shuttle also affects the photorespiratory flux, as CO2 is accumulated in the proximity of RubisCO by the shuttle, preventing the oxygenation reaction. Arrivault et al. (2017) compared label accumulation in photorespiratory intermediates in Arabidopsis (a C3 species) under ambient O2 (21%) and low O2 (2%) and maize under ambient O2 conditions. Label accumulation in serine and glycerate is faster in Arabidopsis under the ambient O2 condition than under the low O2 condition and maize under the ambient O2 condition, which show almost identical label accumulation rates (Arrivault et al. 2017). As photorespiration is very limited under the 2% O2 condition, even in C3 plants (Florian et al. 2014), C4 photosynthesis is shown to greatly affect the photorespiratory flux, and can minimize its activity. 
The interaction between the C4 and respiratory pathways seems quite likely since the C4 pathways share intermediates with respiratory pathways. Malate and oxaloacetate are intermediates of the tricarboxylic acid (TCA) cycle, and pyruvate and PEP are involved in glycolysis. However, 13C-MFA shows limited exchange of metabolites between these pathways. Firstly, the label accumulation in the TCA cycle intermediates, including 2-oxoglutarate and succinate, starts after 20 min of labeling, which is much later than label accumulation in malate. Part of the malate pool is considered to be converted to fumarate, most likely by the reversible reaction of fumarase, since the label accumulation kinetics of fumarate are very similar to those of malate (Arrivault et al. 2017). The carbon leak to fumarate is estimated to be 0.10–0.25% of the 13C atoms in the malate pool when the ratio of 13C atom equivalents between malate and fumarate is compared, based on the data presented in Arrivault et al. (2017; Supplementary Table 1). Secondly, Weissmann et al. (2016) investigated the fate of fumarate in the bundle sheath cells. In the experiment, positionally 2H-labelled malate was fed to isolated bundle sheath cells, and the isotopologue distribution in fumarate, aspartate and alanine was determined. Fumarate was quickly labeled and reached isotopic steady-state within 15 s (Weissmann et al. 2016). The isotopologues of aspartate and alanine containing a single 2H were preferentially accumulated compared with those containing two 2H atoms. Since the isotopologues with two 2H are synthesized only when oxaloacetate is formed directly from the 2,3,3-2H-malate fed as the substrate, these compounds were not produced directly from malate; rather, malate was first converted to fumarate (Weissmann et al. 2016). 
In addition, the label accumulation in other TCA cycle intermediates, including succinate and citrate, was elevated in the dct2 mutant, which also showed enhanced conversion of malate to aspartate (Weissmann et al. 2016). From these results, the TCA cycle in the bundle sheath cells is likely involved in the transformation of C4 shuttle intermediates, although its role in C4 photosynthesis is limited under regular conditions due to very little carbon leak from the malate pool. This strict pathway discrimination is essential to avoiding the excessive consumption of carbon by the respiratory pathway, but the underlying mechanism is still unclear (Bräutigam et al. 2014). This is at least partly due to the existence of a malate pool which is inactive to the carbon assimilation pathways. After 60 min of labeling, about 60% of malate still remains unlabeled, while almost 100% of aspartate molecules contain at least one 13C (Arrivault et al. 2017). This indicates that a large fraction of the malate pool, namely the fraction that remains unlabeled, is inactive in photosynthetic carbon assimilation. Malate is involved in multiple pathways and has functions in carbon storage, respiratory metabolism, redox regulation, pH regulation, and stomatal function (Fernie and Martinoia 2009). This metabolite accumulates in various subcellular compartments, which restricts its contact with enzymes of multiple pathways (Martinoia and Rentsch 1994) and subsequently results in separation between the C4 and the TCA cycles.

As discussed above, the results of recent stable isotope-based MFA suggest that the C4 shuttle is not a simple cyclic pathway (Fig. 1). Traditionally postulated pathway subtypes simultaneously operate in a single plant, and the metabolic flux through each pathway likely alters in response to metabolic cues. Additionally, the carbon transporting shuttle closely interacts with the CBB cycle and photorespiration, but little with the mitochondrial TCA cycle. These pathways need to be analyzed as a system for full understanding of C4 photosynthesis.

Application of COBRA methods on genome-scale models to explore C4 plant metabolism

Recent progress in plant systems biology, bioinformatics, and high-throughput genome sequencing and annotation has facilitated the creation and scale-up of predictive genome-scale plant metabolic models (Lee et al. 2011; de Oliveira Dal’Molin and Nielsen 2013; de Oliveira Dal’molin et al. 2014; Seaver et al. 2014). Development of single-cell, genome-scale metabolic models for plants has been a recent endeavor, starting in 2009 with the reconstruction of an Arabidopsis cell suspension culture (Poolman et al. 2009) and a barley seed model (Grafahrend-Belau et al. 2009). The first genome-scale metabolic reconstruction of a C4 plant was C4GEM in 2010 (de Oliveira Dal’Molin et al. 2010a, b), which investigated the flux distribution in mesophyll and bundle sheath cells during C4 photosynthesis in sorghum, maize and sugarcane. Although C4GEM is based on a C3 plant model, ARAGEM (de Oliveira Dal’Molin et al. 2010a, b), it was significantly expanded to represent three different C4 subtype pathways. To obtain biologically meaningful predictions, in silico simulation of C4 metabolism requires specific physiological and regulatory constraints to be imposed upon the C4 pathways under photosynthetic conditions, as well as the correct mesophyll (M) and bundle sheath (BS) tissue localization of genes, reactions and metabolites. The iRS1563 model by Saha et al. (2011) was the first attempt to globally characterize the metabolic functionalities of maize under different physiological conditions (i.e., photosynthesis, photorespiration and respiration) in a compartmentalized model. In this work, they also compared model predictions against experimental observations for two naturally occurring mutants (i.e., bm1 and bm3). 
A comprehensive pathway database for maize enzyme catalysts, proteins, carbohydrates, lipids, amino acids, secondary plant products, and other metabolites was created from annotated genes from the maize reference genome, which was sequenced from the B73 variety (Monaco et al. 2013). With increased availability of proteomic and transcriptomic data for different plant tissue types, tissue-specific metabolic models have started to emerge in recent years. The second-generation maize leaf model, an extension of the iRS1563 model, was expanded to an additional 4261 genes and 6540 reactions, and accounted for C4 fixation and N2 assimilation (Simons et al. 2014a). It used condition-specific biomass descriptions and regulatory constraints to simulate nitrogen-limited conditions and mutants that were deficient in glutamine synthetase, gln1–3 and gln1–4 with high (90%) accuracy. There were also attempts to construct separate evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm by utilizing biochemical database information (Seaver et al. 2015). However, those models demonstrated an inability to differentiate metabolic and physiological variations between cell types in a tissue. Bogart and Myers developed a spatially segmented multiscale maize leaf model by imposing nonlinear physiological constraints on reaction fluxes to achieve realistic predictions of the C4 pathway responses to environmental and biochemical perturbations (Bogart and Myers 2016). C4GEM framework was also utilized to reconstruct a genome-scale metabolic model for foxtail millet (Setaria italica), integrating tissue-specific omics and functional pathway analysis (de Oliveira Dal’Molin et al. 2016). Recently, maize leaf and kernel model reconstructions were utilized to systematically investigate the C4 metabolism in the leaves of maize ideotypes and to elucidate correlations with high grain yield potential (Cañas et al. 2017). 
Simons and coworkers from the Maranas Lab are working on integrating different tissue-specific models (kernel, leaf, stem, and root) into interactive whole-plant models (unpublished work, obtained via personal communication). The evolution of C4 from C3 plants would involve a substantial rearrangement of cellular structures within the leaves and more efficient expression of various enzymes related to the specific photosynthetic pathways (Leegood 2013; Karki et al. 2013; Danila et al. 2016; Lin et al. 2016). To understand how the structure of the C4 plant metabolic network may constrain gene expression, Robaina-Estévez and Nikoloski utilized flux coupling analysis with second-generation maize metabolic models (Simons et al. 2014b; Seaver et al. 2015) to investigate the correspondence between metabolic network structure and transcriptomic phenotypes along the maize leaf gradient (Robaina-Estévez and Nikoloski 2016). McQualter et al. (2016) also identified that for C4 photosynthesis to work, sufficient ATP has to be available in BS, which requires space for the light harvesting machinery and sufficient light to reach BS to drive photophosphorylation.

Integration of experimental and theoretical approaches toward the comprehensive understanding of C4 metabolism

As described above, isotopically nonstationary MFA has indicated complex interactions between C4 subtype pathways and other metabolic pathways. A computational approach using metabolic network models is desired to precisely interpret experimental MFA data in the context of metabolic flux distribution among the pathways (Fig. 4). In this section, we attempt to describe the use of metabolic network models in steady-state MFA and its limitations, the challenges in the application of model-based approaches to isotopically nonstationary MFA of C4 photosynthesis, the alternative approach of kinetic models, and the possible future directions of MFA and FBA. Metabolic network models are used to estimate metabolic flux distribution, especially for the analysis of steady-state labeling data. Computational algorithms have been explained in detail in recent reviews (Gopalakrishnan and Maranas 2015; Ma et al. 2017). Moreover, software packages that make use of these algorithms have also been developed both for steady-state MFA and isotopically nonstationary MFA (Young 2014; Guo et al. 2015). In order to estimate metabolic fluxes from isotopic labeling data, the genome-scale metabolic network described above is translated to an atom-mapping network (or isotopomer network) (Gopalakrishnan and Maranas 2015). Atom-mapping information can be obtained from biochemical literature or online databases (Gopalakrishnan and Maranas 2015; Ma et al. 2017). Moreover, for reactions with no available data, computational procedures such as maximum common substructure (MCS; Chen et al. 2013), minimum weighted edit-distance (MWED; Latendresse et al. 2012), or canonical labelling for clique approximation (CLCA; Kumar and Maranas 2014) can be used to predict atomic mappings. Next, the isotopomer network is decomposed using one of several available frameworks, to reduce the network’s complexity (Schmidt et al. 1997; Wiechert et al. 1999; Antoniewicz et al. 2007). 
The most prevalently used framework is the elementary metabolite units (EMU) method (Antoniewicz et al. 2007), which uses a distinct subset of a metabolite’s atoms (defined as EMUs) as its state variables (Young et al. 2008). This decomposition method significantly reduces computational demand, and results in mass isotopomer distributions (MIDs) of EMUs that can contribute to the available experimental measurements (Ma et al. 2017). Metabolic fluxes and pool sizes are estimated by minimizing the difference between experimentally measured and computationally simulated MIDs. Here, the experimentally obtained isotope accumulation pattern and the metabolite pool size data are used as an ‘answer’ to draw reasonable combinations of pathway fluxes. During each iteration, the MIDs are simulated by solving the isotopomer and metabolite balance equations (derived from the isotopomer network) for a set of flux parameters (starting with a random combination of fluxes) (Reed et al. 2010; Ma et al. 2017). The discrepancy between the measured and simulated MIDs is then calculated, and the parameters are updated to achieve an improved fit (Ma et al. 2017). Stoichiometric and thermodynamic constraints, similar to the constraints applied in FBA in the previous section, are imposed on the fluxes to reduce the solution space (Gopalakrishnan and Maranas 2015). Finally, a Chi-square statistical test is used to assess the goodness-of-fit of the results (Young et al. 2011), and 95% confidence intervals are computed by evaluating the sensitivity of the sum-of-squared residuals to parameter variations (Antoniewicz et al. 2006). Large confidence intervals indicate poor flux determination, while small confidence intervals indicate well-determined fluxes for the given network description (Ma et al. 2017). 
Moreover, sensitivity assessments can indicate which measurements were most crucial in establishing a given flux value, potentially inspiring further studies or better analysis of the results (Ma et al. 2017). Rigorous statistical analysis of the results is necessary due to the underdetermined nature of the metabolic network, as well as the inherent noise introduced by the isotope labeling measurements (Gopalakrishnan and Maranas 2015).
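The fitting procedure described above is commonly written as a variance-weighted least-squares problem. The formulation below is a generic sketch rather than the exact objective of any cited software package. The symbols are assumptions introduced here: $v$ is the flux vector, $c$ the vector of pool sizes, $x^{\mathrm{meas}}_{k,j}$ and $x^{\mathrm{sim}}_{k,j}$ the measured and simulated abundance of mass isotopomer $j$ of measurement $k$, $\sigma_{k,j}$ the measurement standard deviation, and $S$ the stoichiometric matrix.

```latex
\begin{aligned}
\min_{v,\,c}\quad & \Phi(v, c) = \sum_{k}\sum_{j}
  \left( \frac{x^{\mathrm{meas}}_{k,j} - x^{\mathrm{sim}}_{k,j}(v, c)}{\sigma_{k,j}} \right)^{2} \\
\text{subject to}\quad & S\,v = 0, \\
 & v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}, \qquad c \ge 0 .
\end{aligned}
```

Under the usual assumption of independent, normally distributed measurement errors, the minimized $\Phi$ is compared against a Chi-square distribution whose degrees of freedom equal the number of independent measurements minus the number of fitted free parameters, which is the goodness-of-fit test referred to above.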

Fig. 4

Overview of metabolic flux analysis procedures. Isotopically nonstationary (INST) and steady state metabolic flux analysis (MFA) labeling data are collected and analyzed using spectroscopic methods to generate mass isotopomer distribution (MID) data. Cell culture data and genome annotation information from biochemical knowledgebases are utilized to generate genome-scale metabolic models and flux distributions are simulated. Optimization algorithms are employed to optimize flux parameters in an iterative fashion, to find the best fit of flux mapping and to discover and redesign pathways

It is important to note that the scalability of genome-scale models comes at the cost of ignoring much of the system’s dynamic behavior (Smallbone et al. 2010). These models lack the ability to directly take into account enzyme levels, metabolite concentrations, and substrate-level regulatory barriers (Khodayari and Maranas 2016). This in turn makes it difficult to identify rate-limiting steps and to interrogate their regulatory behavior (Greene et al. 2017). This is a problem especially for the modeling of C4 photosynthesis as the transport of metabolites between the mesophyll and the bundle sheath cells in the C4 shuttle is considered to be non-enzymatically mediated by the concentration gradient of metabolites between these cell types (Arrivault et al. 2017). It might be useful to fill the gaps in the model by applying the metabolic flux rate of carbon transport that is experimentally determined by isotopically nonstationary MFA. This improves the accuracy of the model, and can be another way to integrate experimental and theoretical approaches.

Since the computational methods described above were developed for steady-state MFA, their application to photoautotrophic biological systems has been very limited due to the inapplicability of the isotopic steady state assumption (as there are significant diurnal changes in metabolic flux). Furthermore, the lack of a suitable labeled substrate with multiple carbon atoms further limited their use to heterotrophic or mixotrophic growth conditions, with five-/six-carbon sugars used as the major carbon source (Young et al. 2011). Although the study of nonstationary MFA has seen dramatic improvements over the past two decades, a number of challenges remain to be resolved before this method becomes a standard technique for metabolic flux elucidation in C4 plants. The first challenge is that commonly applied simplifications in network modeling, such as pool lumping or neglecting storage metabolism, are yet to be justified for nonstationary MFA (Wiechert and Nöh 2013). Indeed, significant amounts of malate and sedoheptulose 1,7-bisphosphate are not involved in C4 photosynthesis (Arrivault et al. 2017). However, there is no clear explanation of how these “inactive” pools are formed on the cellular and subcellular levels. This is crucial information for the metabolic model when it is extended to the context of the whole-cell metabolic network. Models that include separate metabolite pools are believed to have superior prediction performance, as demonstrated in the models of Arabidopsis’ photosynthetic metabolism (Szecowka et al. 2013). The second challenge is the lack of a standard workflow for pool size measurements, which are necessary for obtaining accurate flux estimations (Wiechert and Nöh 2013). Accurate flux estimations depend largely on the experimental systems. For example, Ma et al. (2014) used MIDs without pool size information to predict metabolic flux in Arabidopsis rosettes. 
On the other hand, the pool size is important when the determination of label accumulation in all isotopologues is impossible, such as in radioisotope-based MFA and in GC-MS analysis using a harsh ionization condition (Szecowka et al. 2013). Pool size information is also useful when the metabolic flux at a specific reaction is calculated. However, it requires additional experimental efforts to determine the pool size, and accurate determination is often difficult, especially for metabolites with a small pool size, including CBB cycle intermediates (Ma et al. 2014). As such, the available parameters for modeling are closely tied to the experimental systems and biological questions, and are hardly standardized. Furthermore, computational costs must be decreased further to allow the analysis of larger and more complex genome-scale models. This can be achieved by: (1) further reducing the total size of the isotopomer system (network decomposition), or (2) dividing the system into smaller subsystems that can cumulatively be solved more quickly (block decoupling) (Young et al. 2008). Computational time can also be reduced by using more efficient algorithms for solving the system of differential equations, implementing the EMU concept in isotopically nonstationary MFA, and applying the concept of block decoupling, allowing for the analysis of a whole new and more complicated set of metabolic networks that were previously unsolvable due to the computational complexity of isotopically nonstationary MFA (Young et al. 2008; Reed et al. 2010). This concept was later coupled to the isotopically nonstationary MFA approach (Wiechert and Nöh 2005) and resulted in more than a 5000-fold speedup relative to previous isotopically nonstationary MFA algorithms (Reed et al. 2010). This was a major milestone in the study of photoautotrophic systems, as it made the analysis of realistically sized autotrophic networks computationally feasible. 
As a proof of concept, the method was applied to produce a comprehensive photoautotrophic flux map of the model cyanobacterial strain Synechocystis sp. PCC6803 (Young et al. 2011). A recent work shows that a new algorithm based on the numerical solution of a system of ordinary differential equations (ODEs) has been able to reduce computational time by up to 48% (Gopalakrishnan et al. 2018). Reduction in computational costs will allow the development of algorithms capable of determining optimal tracers, based on the specific experiment (Antoniewicz 2013). It will also allow for exploration of the synergetic effects of parallel labeling experiments (PLEs; Antoniewicz 2013).
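The dependence of nonstationary labeling on pool size, discussed above, can be illustrated with the simplest possible case. The sketch below assumes a single well-mixed metabolite pool of size P fed at flux v by a fully labeled precursor, so that the enrichment x obeys dx/dt = (v/P)(x_in - x). The function names, parameter values, and the use of explicit Euler integration are illustrative assumptions and do not reproduce any of the cited algorithms.

```python
import math


def simulate_enrichment(flux, pool_size, t_end, dt=0.001, input_enrichment=1.0):
    """Explicit Euler integration of dx/dt = (flux / pool_size) * (x_in - x)."""
    x = 0.0  # the pool starts unlabeled
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x += dt * (flux / pool_size) * (input_enrichment - x)
    return x


def analytic_enrichment(flux, pool_size, t, input_enrichment=1.0):
    """Closed-form solution of the same single-pool model."""
    return input_enrichment * (1.0 - math.exp(-flux * t / pool_size))


# Same flux, different pool sizes: the small pool labels much faster.
small_pool = simulate_enrichment(flux=2.0, pool_size=10.0, t_end=5.0)
large_pool = simulate_enrichment(flux=2.0, pool_size=100.0, t_end=5.0)

# Different flux, same flux-to-pool ratio: the time courses coincide.
scaled = simulate_enrichment(flux=20.0, pool_size=100.0, t_end=5.0)

print(small_pool, large_pool, scaled)
```

In this toy model the enrichment time course alone constrains only the ratio v/P, which is one way to see why an independent pool size measurement is needed when the flux through a specific reaction is to be calculated.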

The metabolic models discussed above deal with metabolic steady state. In order to capture dynamics of metabolic networks, kinetic models have been proposed as a logical next step. Ideally, these models would use enzyme kinetics to characterize the mechanics of each reaction, and to quantify reaction rates as a function of metabolite concentrations. However, these mechanistic or approximate modeling approaches require a large amount of experimental data to estimate the required parameters (Smallbone et al. 2010). Furthermore, most of the rate laws used to characterize enzyme kinetics are nonlinear, making it computationally intractable to iterate over the system of equations in moderately large systems (Srinivasan et al. 2015). Another approach that attempts to capture the kinetics of the system without requiring kinetic parameters is the Monte Carlo sampling approach. The Monte Carlo approach circumvents the need for kinetic parameters by creating an ensemble of models through parameter sampling (Khodayari and Maranas 2016). Each of the models in the original ensemble is then perturbed, and the models that do not agree with experimental results are successively filtered out (Rizk and Liao 2009). The most popular Monte Carlo based kinetic model is known as Ensemble modeling (Tran et al. 2008), where the physiological space is constrained using steady state phenotypic data, such as flux or concentration (as opposed to evaluating kinetic parameters), to determine sampling bounds (Rizk and Liao 2009). However, it is important to stress that since kinetic models are affected by a number of variables, such as parameter identification and computational tractability, some major obstacles must be addressed before kinetic models become viable at the genome-scale (Rizk and Liao 2009; Srinivasan et al. 2015). Details of the different formulations used in kinetic modeling have been extensively reviewed elsewhere (Srinivasan et al. 2015).
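The sample-then-filter idea behind ensemble approaches can be shown with a deliberately small toy example. It is not the Ensemble Modeling formulation of Tran et al. (2008): the single Michaelis-Menten reaction, the log-uniform sampling range, the reference state, and the "observed" perturbation response are all assumptions made here for illustration.

```python
import random


def mm_rate(vmax, km, s):
    """Michaelis-Menten rate law."""
    return vmax * s / (km + s)


def build_ensemble(n_models, s_ref, v_ref, rng):
    """Sample Km values and pick Vmax so every model matches the reference flux."""
    ensemble = []
    for _ in range(n_models):
        km = 10 ** rng.uniform(-2.0, 2.0)    # log-uniform sampling (assumed range)
        vmax = v_ref * (km + s_ref) / s_ref  # anchors the model to the steady state
        ensemble.append((vmax, km))
    return ensemble


def screen(ensemble, v_ref, s_perturbed, observed_fold, tol):
    """Keep models whose predicted flux fold-change matches the observation."""
    kept = []
    for vmax, km in ensemble:
        predicted_fold = mm_rate(vmax, km, s_perturbed) / v_ref
        if abs(predicted_fold - observed_fold) <= tol:
            kept.append((vmax, km))
    return kept


rng = random.Random(0)  # fixed seed for reproducibility
ensemble = build_ensemble(2000, s_ref=1.0, v_ref=1.0, rng=rng)

# Hypothetical observation: doubling the substrate raises the flux 1.5-fold.
kept = screen(ensemble, v_ref=1.0, s_perturbed=2.0, observed_fold=1.5, tol=0.05)
print(len(ensemble), len(kept))
```

Every sampled model reproduces the reference steady state by construction, so only the perturbation data discriminate among them. Here the screen retains models with Km close to twice the reference substrate concentration, since that value yields exactly the assumed 1.5-fold response.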

One of the ultimate goals of metabolic flux analysis is to elucidate the roles of C4 metabolism in the whole plant system. This requires a whole plant metabolic model that includes the tissue and organ-level metabolic networks and an accurate description of metabolite transport among them. Genome-scale metabolic reconstruction of multi-tissue eukaryotic systems, especially C4 plants, is challenging due to their larger genomes and complexity of metabolic functionalities, multiple organelles performing diverse metabolic activities, different nutrient source-sink relationships, the complicated transcription, translation and regulatory processes, and the different growth environments they usually encounter. Despite that, there has been significant success in genome-scale metabolic reconstructions of plants, and several constraint-based modeling approaches (Varma and Palsson 1994b; Segre et al. 2002; Mahadevan and Schilling 2003) have been used to study different aspects of metabolism, including the prediction of optimal fitness and energy efficiencies (de Oliveira Dal’Molin et al. 2010a, b), flux distributions under different environmental and genetic conditions (Grafahrend-Belau et al. 2009), and the design of metabolic intervention strategies (Poolman et al. 2009; de Oliveira Dal’molin et al. 2014). The first attempt at a multi-tissue model was that of Arabidopsis thaliana, which explored the metabolic activities of the tissues and energy demands in different environmental conditions (de Oliveira Dal’Molin et al. 2015). The tissue-specific maize leaf model (Simons et al. 2014b), rice leaf and seed model (Lakshmanan et al. 2013), and the integrated whole-plant model of maize in development by the Maranas Lab (unpublished work, obtained via personal communication) promise a significant increase in multi-tissue whole-plant models of plant metabolism. 
Hence, plant metabolic network modeling provides an important complement to MFA, and an alternative to enzyme kinetics-based mechanistic models of relatively smaller scale (Sweetlove and Ratcliffe 2011), that can help characterize the metabolism of C4 plants. On the other hand, isotope-based MFA can contribute by providing parameters related to metabolite fluxes across different pathways and transport among organelles, cells, tissues and organs. Non-aqueous fractionation, a method that separates metabolites in each organelle under metabolically inert conditions, has the potential to be applied to the analysis of metabolite exchange among organelles (Klie et al. 2011; Szecowka et al. 2013). Arrivault et al. (2017) employed a differential grinding technique to analyze the metabolite concentration gradient and transport of C4 shuttle metabolites in combination with 13C-MFA. High spatial resolution MS imaging can also be applied to investigate metabolite transport between tissues, although its application in labeling experiments has not been reported so far (Dueñas et al. 2017). 13C-labeling was also applied to analyze carbon transport from leaves to the other organs, including kernels (Gao et al. 2017). The use of radioisotopes, including positron emission tomography (PET), offers the advantage of monitoring the movement of metabolites at the whole-plant scale in a non-invasive manner (Kiser et al. 2008; Hubeau and Steppe 2015; Karve et al. 2015; Tran et al. 2017).

Conclusions

MFA and COBRA have been playing significant roles in dissecting the C4 metabolic pathways. 14C-based MFA discovered and established the pathway structure, 13C-based MFA elucidated the interactions among pathways, and COBRA opened the way to analyze multiple metabolic pathways as a system. Future studies of C4 photosynthesis should address the roles of the C4 pathway in the context of the metabolic network. The integration of experimental and theoretical approaches is required for full understanding of C4 pathway operation. Essentially, experimental and theoretical approaches are complementary, since the former needs an accurate metabolic network model to interpret the results from labeling experiments and the latter requires experimental inputs to validate and improve reconstructed models (Fig. 4). However, they are technically distinct and quickly developing research fields. Collaborative studies are essential, and researchers should identify and share the specific questions to address. Remaining key questions to understand C4 photosynthesis would include: (1) Does the flux distribution among C4 subtype pathways alter depending on the environmental input? (2) How are the metabolite transport processes regulated? (3) How is the differential regulation of metabolic pathways achieved? (4) Are the regulatory mechanisms of C4 pathways different among subtype species? (5) How does the regulation of C4 photosynthesis affect plant performance? The specific tissue structure in C4 plants, the so-called Kranz anatomy, is a key feature of C4 plants (Fouracre et al. 2014), although it was not discussed in this article. There are some variations in the Kranz anatomy among species (Fouracre et al. 2014), and the interaction between tissue structure and C4 pathway flux would be another question to address. 
As the development of techniques continues both experimentally and theoretically, a more detailed picture of the C4 pathway will start to emerge, paving the way for more practical methods of engineering this pathway into C3 plants.