Introduction

Gram-negative proteobacteria of the genus Xanthomonas have a plant-associated and usually plant-pathogenic lifestyle (Swings and Civerolo 1993). They affect important crop plants like rice, but are also used for industrial production of the exopolysaccharide xanthan gum (García-Ochoa et al. 2000; Vorhölter et al. 2008), which also contributes to virulence (Rigano et al. 2007). Due to its dual role as an efficient xanthan producer and as a pathogen of Brassicaceae including Arabidopsis thaliana, Xanthomonas campestris has become a model organism with more than 1,000 publications listed in PubMed in June 2011. X. campestris pv. campestris is known to be prototrophic for proteinogenic amino acids (Bradbury 1984), but only a few biosynthetic pathways required to synthesize amino acids have been determined so far. An overview of the amino acid biosynthesis pathways would not only deepen our fundamental understanding of the metabolism of this important group of plant pathogens, but would also set the basis for applications in systems biology such as stoichiometric network modeling or metabolic flux analysis.

X. campestris pv. campestris utilizes imported sugars to produce energy and building blocks for cellular components like amino acids. Recently, a biosynthetic pathway via glutamate and N-acetyl-l-citrulline was reported for arginine in X. campestris pv. campestris, which deviates from the canonical ornithine transcarbamylase-dependent pathway (Morizono et al. 2006). This raised the question whether further amino acid biosynthesis routes in Xanthomonads might differ from those known from other organisms, and further motivated our comprehensive study of the amino acid biosynthesis pathways in X. campestris pv. campestris.

Amino acid biosynthesis pathways can be reconstructed from complete genome sequences (Bono et al. 1998). To date, such sequences are publicly available for three strains of X. campestris pv. campestris (da Silva et al. 2002; Qian et al. 2005; Vorhölter et al. 2008). In our study, the biosynthetic pathways for the proteinogenic amino acids have been reconstructed based on the genome annotation of X. campestris pv. campestris strain B100. Genome comparisons revealed the presence of orthologs for the identified genes also in other X. campestris pv. campestris genomes. For alanine, glycine, and isoleucine, where alternative biosynthetic pathways were reconstructed, the metabolically active pathways were determined by isotopologue profiling using [1-13C]-glucose as sole carbon source.

Materials and methods

Strains and plasmids

In addition to the X. campestris pv. campestris wild-type strain B100 (Vorhölter et al. 2008), we used the Escherichia coli wild-type strain DH5αMCR (Gibco-BRL) and the E. coli mutant strains NK5648 (Coli Genetic Stock Center, CGSC# 6172), deficient in the cystathionine gamma-synthase gene metB, and AT2699 (Coli Genetic Stock Center, CGSC# 4524), deficient in the cystathionine beta-lyase gene metC. Cosmid pXCB1002 and plasmid vector pFV7 (Vorhölter et al. 2001) were used to generate the gentamicin-resistant plasmids pXH1H2 and pXS4A04 carrying the X. campestris pv. campestris genes metB and metC, respectively.

Microbiological methods

X. campestris pv. campestris B100 was grown in VMX medium (Watt et al. 2009) containing 3 g of [1-13C] glucose (99%, Isotec, Miamisburg OH, USA) as a sole carbon source at 30°C. Cells were harvested by centrifugation at 20,000×g, washed with isotonic buffer saline (0.9% NaCl), frozen in liquid nitrogen and finally lyophilized with an Alpha 1–4 LD plus freeze dryer (Christ, Osterode, Germany). E. coli was grown in PA medium (Difco Antibiotic Medium No. 3, Becton–Dickinson, Heidelberg, Germany) at 37°C, supplemented with 10 μg/ml gentamicin or 10 μg/ml tetracycline where appropriate.

Metabolic reconstruction

The open source software Pathway Tools version 14.0 (Karp et al. 2010) was employed for the metabolic network reconstruction. As a first step, genome annotation data of X. campestris pv. campestris B100 was updated using the GenDB software (Meyer et al. 2003). Further software tools were employed for the functional annotation, in particular the Swiss-Prot database (UniProt Consortium 2010) and the PRIAM tool that predicts enzyme specificities (Claudel-Renard et al. 2003). Subsidiary information was provided by Pfam (Finn et al. 2010), InterPro (Hunter et al. 2009), CDD (Marchler-Bauer et al. 2011), TMHMM (Krogh et al. 2001), and SignalP (Emanuelsson et al. 2007). The nucleotide sequence was exported from GenDB in fasta format and the functional annotation data in GenBank format. This data was used by the Pathway Tools component PathoLogic to create a new Pathway/Genome Database according to the author’s instructions. The resulting new model organism database (MOD) was termed XccCyc. PathoLogic’s expert system TIP facilitated the inference of membrane transport proteins based on the provided annotation data. The Pathway/Genome Navigator was used to query and visualize XccCyc data. The XccCyc database was curated to exclude short pathways that were inferred by PathoLogic with no enzyme-encoding gene available for the majority of the essential reactions. Finally, if for more than 50% of the reactions no genes encoding the respective enzymes were identified, the respective amino acid biosynthetic pathways were removed from the database.
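The final curation rule described above (removing an amino acid biosynthetic pathway when more than 50% of its reactions lack an identified gene) can be sketched as a simple filter. This is only an illustrative sketch: the `Pathway` class, the function names, and the two example pathways are hypothetical and are not part of Pathway Tools or of the XccCyc database.

```python
from dataclasses import dataclass


@dataclass
class Pathway:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str
    reactions_total: int
    reactions_with_gene: int


def keep_pathway(p: Pathway, max_missing_fraction: float = 0.5) -> bool:
    """Return True if the pathway survives the 50% curation rule."""
    if p.reactions_total == 0:
        return False
    missing = p.reactions_total - p.reactions_with_gene
    # A pathway is removed only if MORE than 50% of its reactions lack a gene.
    return missing / p.reactions_total <= max_missing_fraction


def curate(pathways):
    """Keep only the pathways that pass the rule."""
    return [p for p in pathways if keep_pathway(p)]


# Made-up example pathways, not taken from XccCyc.
candidates = [
    Pathway("example pathway A", reactions_total=4, reactions_with_gene=4),
    Pathway("example pathway B", reactions_total=5, reactions_with_gene=2),
]
kept = [p.name for p in curate(candidates)]
print(kept)
```

In this toy input, pathway B lacks genes for three of five reactions (60%) and is therefore dropped, whereas a pathway missing exactly half of its genes would be retained.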

Recombinant DNA methods

Standard methods (Sambrook and Russel 2001) were used for agarose gel electrophoresis, DNA restriction, ligation, transformation of E. coli and the screening on solid media containing 5-bromo-4-chloro-3-indolyl-β-galactoside (X-gal). Restriction enzymes, ligase, and X-gal were supplied by Fermentas (St. Leon-Rot, Germany). Agarose was obtained from Peqlab (Erlangen, Germany). Plasmid DNA was extracted employing QIAprep Spin Miniprep kits (Qiagen, Hilden, Germany). E. coli clones carrying plasmids obtained by ligating DNA fragments into pFV7 plasmid vectors were selected for gentamicin resistance.

NMR-based 13C-isotopologue profiling

174 mg of X. campestris pv. campestris B100 cell pellet (dry weight) was hydrolyzed for 24 h in 6 M hydrochloric acid at 100°C. Hydrochloric acid was removed under reduced pressure and the residue was applied onto a column (23 cm) of DOWEX 50 WX8, which was subsequently washed with 300 ml of water and developed with a gradient of hydrochloric acid (0–3 M; total volume 2 l). Fractions of 18 ml were collected. Aliquots were analyzed by thin layer chromatography and monitored by reaction with ninhydrin. Fractions containing amino acids were collected. The solvent was removed under reduced pressure. The residue was dissolved in 0.5 ml of DCl (pH 1) and subjected to NMR analysis. 13C-enrichments were determined from the 13C-NMR signal intensities using the respective values of unlabelled samples (i.e. with 1.1% 13C-abundance) as a reference.
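The reference-based determination of 13C-enrichment can be illustrated with a minimal sketch. It assumes that the signal of a given carbon in the labeled sample and in the unlabeled reference was recorded under identical conditions and normalized to the same amount of compound, so that the intensity ratio scales the natural abundance of 1.1% directly. The function name and this normalization are assumptions for illustration, not the exact procedure used in the study.

```python
NATURAL_ABUNDANCE_PERCENT = 1.1  # 13C abundance of the unlabeled reference


def enrichment_percent(sample_intensity: float, reference_intensity: float) -> float:
    """Estimate the 13C enrichment (%) of one carbon position.

    Assumes both intensities are normalized to the same amount of compound
    and were acquired under the same NMR conditions (an assumption of this sketch).
    """
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return NATURAL_ABUNDANCE_PERCENT * sample_intensity / reference_intensity


# A signal ten times stronger than in the unlabeled reference
# corresponds to roughly 11% 13C at that position.
print(enrichment_percent(10.0, 1.0))
```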

Comparative genome analysis

The EDGAR software (Blom et al. 2009) was employed to compare the genome of the X. campestris pv. campestris strains B100 (Vorhölter et al. 2008), ATCC33913 (da Silva et al. 2002), and 8004 (Qian et al. 2005) with a focus on checking for the presence of orthologous genes involved in amino acid biosynthesis in all the strains.
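Once orthologs have been assigned, checking that every amino acid biosynthesis gene of the reference strain is present in the other strains reduces to a set comparison. The sketch below shows only this downstream check; it does not reproduce how EDGAR computes orthology. The function name is hypothetical, and the gene sets are toy data built from a few gene names mentioned in this article.

```python
REFERENCE_STRAIN = "B100"


def missing_orthologs(genes_by_strain, reference=REFERENCE_STRAIN):
    """For each non-reference strain, list reference genes without an ortholog."""
    reference_genes = genes_by_strain[reference]
    return {
        strain: sorted(reference_genes - genes)
        for strain, genes in genes_by_strain.items()
        if strain != reference
    }


# Toy input: a handful of gene names, not the full gene list of the study.
toy_genes = {
    "B100": {"metB", "metC", "ilvD", "ilvE"},
    "ATCC33913": {"metB", "metC", "ilvD", "ilvE"},
    "8004": {"metB", "metC", "ilvD", "ilvE"},
}
report = missing_orthologs(toy_genes)
print(report)
```

An empty list for a strain means that all reference genes in the input have an ortholog in that strain.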

Results

Genome-based reconstruction of the complete X. campestris pv. campestris metabolism

In prokaryotes, the biosynthesis of proteinogenic amino acids is closely integrated into cellular metabolism. Biosynthetic pathways for individual amino acids derive from different intermediates of the central metabolism. Thus, reconstructing amino acid biosynthesis pathways for X. campestris requires an overview of its entire metabolic network. As complete genomes of three strains of X. campestris pv. campestris are publicly available (da Silva et al. 2002; Qian et al. 2005; Vorhölter et al. 2008), genome data were employed as a basis to obtain a metabolic reconstruction. Initially, the most recent genome data of X. campestris pv. campestris strain B100 (Vorhölter et al. 2003, 2008), which had been analyzed with a focus on xanthan biosynthesis and the central metabolism, were used to re-annotate 98 genes putatively involved in amino acid biosynthesis by means of the GenDB software. The updated genome data were imported with the Pathway Tools software (Karp et al. 2010) to generate a Pathway/Genome Database (PGDB) termed XccCyc. Furthermore, the Pathway Tools component PathoLogic was used to reconstruct the metabolic pathways and transport reactions of X. campestris pv. campestris B100. A graphical cellular overview of the entire metabolism reconstructed in the XccCyc database is displayed in Supplementary Fig. 1.

The XccCyc PGDB includes 4,575 genes, among them 4,513 protein-coding genes, 62 RNA genes, and 46 pseudo-genes, encoded by a chromosome of 5,079,002 base-pairs. Based on the amino acid sequences and the functional annotation of the annotated protein-coding genes, the automated analysis with PathoLogic indicated 100 cross-membrane transport reactions. The metabolic capacity made available in XccCyc comprised 1,856 enzymatic reactions. From this reactome of X. campestris pv. campestris B100, 351 metabolic pathways could be derived. Names were assigned to the pathways in an automated way according to the corresponding pathway names of the MetaCyc database (Caspi et al. 2008). For compatibility reasons and to obtain unambiguous designations, the MetaCyc-derived pathway names are used throughout this manuscript. Among the reconstructed metabolic pathways, 227 were classified as biosynthetic pathways. A further 146 pathways were assigned to the degradation of metabolic compounds. Finally, 5 pathways were classified as being involved in detoxification, and 34 pathways were involved in the generation of energy and precursor metabolites. Moreover, the interconnection of the metabolic pathways was reflected by the identification of 53 super-pathways that integrate multiple basic pathways, which share common metabolites and when combined provide a common functionality to the bacterial cell.

Genome-based analysis of amino acid biosynthesis pathways

The analysis of amino acid biosynthesis reconstructed by PathoLogic for X. campestris pv. campestris included 41 pathways plus 4 super pathways in total. The pathways inferred from the genome annotation covered the biosynthesis of all proteinogenic L-amino acids. The reconstructed pathways involved in amino acid biosynthesis comprised 175 enzyme-catalyzed reactions. For 158 of these 175 reactions, genes were identified that encode the respective enzymes. Thus, some of the inferred pathways had gaps, meaning that for a few of the required reactions genes encoding the respective enzymes could not be identified. However, this incompleteness in reconstruction is to be expected as it reflects our incomplete knowledge on enzyme structures, which impedes assigning appropriate functions to their genes. As a consequence of such failures in the genome annotation process, Pathway Tools is assumed to currently miss approximately 20% of the enzymatic reactions for a typical microbe (Karp et al. 2010). A rate of 17 unassigned reactions among 175 reactions suggested for the amino acid biosynthesis is well in the range of this proportion.
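The comparison made above can be checked with a few lines of arithmetic. The variable names are illustrative; the counts (175 reactions, 158 with assigned genes) and the approximate 20% figure are taken from the text.

```python
total_reactions = 175       # reactions in the reconstructed pathways
reactions_with_gene = 158   # reactions with an identified enzyme-encoding gene

unassigned = total_reactions - reactions_with_gene
gap_fraction = unassigned / total_reactions

# Approximate share of enzymatic reactions that Pathway Tools is assumed
# to miss for a typical microbe (Karp et al. 2010).
expected_missing_fraction = 0.20
within_expected = gap_fraction <= expected_missing_fraction

print(unassigned, round(gap_fraction * 100, 1), within_expected)
```

The 17 unassigned reactions correspond to about 9.7% of the 175 reactions, which lies below the roughly 20% expected for a typical reconstruction.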

The automated metabolic reconstruction performed with PathoLogic on the basis of the curated genome annotation provided clear evidence for the pathways involved in the biosynthesis of asparagine, aspartate, cysteine, glutamate, glutamine, histidine, leucine, lysine, phenylalanine, serine, threonine, tryptophan, and valine. For each of these amino acids, the MetaCyc-compatible biosynthetic pathway that ultimately delivers the respective amino acid is depicted in Supplementary Fig. 2. For arginine, the genome annotation and the XccCyc database were updated to reflect the recent experimental data (Morizono et al. 2006) on its novel biosynthesis route in X. campestris pv. campestris.

Further pathways could be reconstructed for the amino acids proline, cysteine, and methionine. For each of these amino acids, the enzyme repertoire identified from the genome indicated deviations from the pathways available in Pathway Tools or the presence of two or more biosynthetic routes. Two possible pathways were suggested to generate proline, “proline biosynthesis I” that uses l-glutamate as metabolic precursor and “proline biosynthesis II (from arginine)”. While in “proline biosynthesis I” for each enzymatic step a gene encoding the respective enzyme was identified, two enzymes were missing in the reconstructed “proline biosynthesis II (from arginine)”. Consequently, the pathway “proline biosynthesis I” is assumed to be used in X. campestris pv. campestris as proline source.

For cysteine biosynthesis, the pathway “cysteine biosynthesis I” was inferred, which uses serine as a precursor molecule. While no enzyme could be identified that provides O-acetyl serine, there were indications that the X. campestris pv. campestris cysteine synthase (EC 2.5.1.47; encoded by cysK) may instead use the serine biosynthesis intermediate phosphoserine as substrate.

The analysis indicates the precursor molecules for the biosynthesis of the above mentioned amino acids in X. campestris pv. campestris (Fig. 1). We inferred that the biosynthesis of individual amino acids originated from the metabolites 3-phosphoglycerate, pyruvate, α-ketoglutarate, oxaloacetate, and ribose 5-phosphate. All biosynthetic pathways branch from intermediates of the central carbohydrate metabolism.

Fig. 1

Genome-based reconstruction of the central metabolism of X. campestris pv. campestris as origin of amino acid biosynthetic pathways. The central metabolism of X. campestris provides the precursor molecules required to synthesize amino acids. Arrows indicate the predominant direction of the reactions when glucose is utilized as carbon source for protein biosynthesis. Glucose is initially phosphorylated to glucose 6-phosphate (Glc-6P). When not used to build the lipopolysaccharide and xanthan precursors UDP-glucose (UDP-glc), UDP-glucuronate (UDP-GlcA), and GDP-mannose (GDP-man; Vorhölter et al. 2001), Glc-6P is oxidized to 6-phosphogluconate (Gla-6P), which is either metabolized to pyruvate (Pyr) and glyceraldehyde 3-phosphate (GAP) via the Entner–Doudoroff pathway, or converted to the intermediates ribulose-5-phosphate (Rul-5P), ribose-5-phosphate (Rib-5P), xylulose-5-phosphate (Xyl-5P), sedoheptulose-7-phosphate (Hep-7P), erythrose-4-phosphate (Ery-4P), and fructose-6-phosphate (Fru-6P) in the pentose-phosphate pathway. Metabolites can be further converted to the glycolysis intermediates 3-phosphoglycerate (GA-3P), phosphoenolpyruvate (PEP), and pyruvate (Pyr), to enter the citrate cycle via acetyl coenzyme A (Ac-CoA). This results in the formation of citrate (CitA) that is converted either via glyoxylate (GlyA) or via alpha-ketoglutarate (ak-GlA), succinate (SucA), fumarate (FumA), and malate (MalA) to oxaloacetate (OAA). Genes found to be involved in the reconstructed amino acid biosynthesis are indicated by giving their italicized names in the vicinity of the respective reaction arrows. Proteinogenic amino acids are symbolized by their three-letter abbreviations following IUPAC nomenclature

Complementation analysis to elucidate methionine biosynthesis

The genome-based elucidation of the methionine biosynthesis had been impaired by uncertainties in the underlying similarity-based annotation of pyridoxal phosphate (PLP)-dependent enzymes. Although PLP-dependent enzymes catalyze different reactions, their similar primary structure complicates the precise prediction of the enzyme specificities based on their amino acid sequence (Rückert et al. 2003). Thus, for the two key enzymes the catalytic activity was checked experimentally by heterologous complementation of well-characterized E. coli mutants. For this approach, the X. campestris pv. campestris genes metB (CDS xcc-b100_3752) and metC (xcc-b100_3753) were chosen since multiple enzyme specificities of the gene products were predicted at equal probability. A 1.8 kb HindIII fragment carrying the metB gene and a 1.6 kb SalI fragment with the metC gene, both derived from cosmid pXCB1002 (Vorhölter et al. 2001), were cloned into vector pFV7 to obtain the plasmids pXH1H2 and pXS4A04, respectively. Plasmid pXH1H2 complemented E. coli strain NK5648 deficient in cystathionine gamma-synthase (EC 2.5.1.48; encoded by metB), while pXS4A04 complemented E. coli strain AT2699 lacking the cystathionine beta-lyase activity (EC 4.4.1.8, metC). Despite the elucidation of the enzyme specificities of the metB and metC gene products, eight methionine biosynthetic pathways were automatically assigned by Pathway Tools PathoLogic. The “S-adenosyl-l-methionine cycle” was disregarded, since it is not a de novo biosynthetic pathway for methionine but provides methyl groups for various metabolic interconversions. Among the remaining seven pathways, some of which lacked assigned enzymes for up to five essential catalytic reactions, the “methionine biosynthesis I” pathway (Supplementary Fig. 2) was the most promising. For this pathway, a gene coding for the respective enzyme was predicted for each reaction step. In addition, an O-acetylhomoserine aminocarboxypropyltransferase (EC 2.5.1.49; metY) provides an alternative source for homocysteine, the penultimate intermediate of “methionine biosynthesis I” that is finally methylated to obtain methionine.

Elucidation of amino acid biosynthetic pathways by 13C-isotopologue profiling

For some of the amino acids, the genome-based reconstruction did not result in conclusive and unique pathways. In particular, the reconstructed metabolic network includes multiple biosynthetic routes for the amino acids alanine, glycine, and isoleucine (Fig. 2). It therefore remained unclear which of the suggested pathways was metabolically active. Even a careful re-annotation of the genes suggested by Pathway Tools to be involved in these pathways resulted in no further evidence. To support the identification of the relevant pathways by experimental data, 13C-NMR measurements were applied, since this approach has proven to be helpful in the recent elucidation of arginine biosynthesis in X. campestris pv. campestris (Morizono et al. 2006).

Fig. 2

Alternative biosynthesis pathways inferred from the genome data for alanine, glycine and isoleucine. The genome-based reconstruction of the biosynthesis pathways for the proteinogenic amino acids of X. campestris pv. campestris revealed two (alanine, isoleucine) or three (glycine) alternative biosynthesis routes for three amino acids. These alternative pathways are displayed as provided by the Pathway Tools software. A rounded box above each pathway gives its respective title as provided by Pathway Tools with reference to the MetaCyc database. Arrows indicate enzymatic reactions. The respective enzyme names and EC numbers of the enzymes involved are given beside the arrows. Although included in MetaCyc and Pathway Tools, the transaminase reaction providing alanine from serine (EC 2.6.1.51) is unnamed in MetaCyc. In this study it was tentatively termed “alanine biosynthesis Xcc”. As the genome-based metabolic reconstruction remained inconclusive for alanine, glycine, and isoleucine, their biosynthesis was further analyzed by isotopologue profiling (Figs. 3, 4, 5)

X. campestris pv. campestris B100 was grown in minimal medium supplemented with [1-13C]-labeled glucose as a sole carbon source. After growth to the late exponential phase, cells were harvested and hydrolyzed. The resulting amino acids were separated by ion exchange chromatography and analyzed by 13C-NMR spectroscopy. The positional 13C-enrichments were then used to identify the pathways leading to amino acids and their precursors (Eisenreich and Bacher 2007). Threonine, aspartate, glutamate, leucine, proline, and histidine were not significantly labeled (<0.5% 13C-enrichments; see Supplementary Table 1). On the other hand, C-1 of alanine acquired 13C at high rates (46% 13C-enrichment), whereas C-3 was labeled with only 2% 13C-enrichment. High 13C-enrichment (54%) was detected at C-1 of isoleucine. C-4, C-5 and C-6 of isoleucine showed moderate enrichment values (5–8%). Arginine was labeled at C-6 only (14% 13C-enrichment). Serine was labeled at C-1 and C-3 at approximately 5% 13C for each atom, and glycine acquired 13C at C-1 (2% 13C-enrichment). Lysine was labeled at C-1 (19%).

Pyruvate represents the precursor of alanine biosynthesis

Two short pathways were inferred by Pathway Tools as alternatives for the generation of alanine. “Alanine biosynthesis III” employs a cysteine desulfurase (EC 2.8.1.7) to convert cysteine into alanine (Fig. 2, alternative A). In the second route to alanine (Fig. 2, alternative B), a serine-pyruvate transaminase (EC 2.6.1.51) transfers an amino group from serine to pyruvate. The cysteine desulfurase (alternative A) is essential for the biosynthesis of iron-sulfur clusters, where it provides sulfur with cysteine as precursor metabolite, while the remaining molecule is recycled as alanine (Fontecave and Ollagnier-de-Choudens 2008). Well in line with a role in iron-sulfur cluster biosynthesis, the sufS gene that encodes the cysteine desulfurase of X. campestris pv. campestris B100 is located in an operon-like structure downstream of the sufRBCD genes that code for further functions required in iron–sulfur cluster biosynthesis. This indicates a specific role of the “alanine biosynthesis III” pathway, pointing to the alternative pathway via the serine-pyruvate transaminase as the main route for alanine biosynthesis.

The different labeling patterns of alanine (high label at C-1) and serine (minor label at C-1 and C-3) indicate that serine does not serve as a major precursor of alanine, excluding alternative B as a main route. Although cysteine could not be observed directly, alternative A also appears unlikely since aspartate (biogenetically closely related to cysteine) was unlabeled. Therefore, it can be concluded that alanine is derived from pyruvate 13C-enriched at C-1 (Fig. 3; corresponding to alternative B of the previous genome-based metabolic reconstruction).

Fig. 3

Metabolic flow of the [1-13C]-label towards alanine. A 13C-NMR analysis revealed how alanine is synthesized from [1-13C]-labeled glucose. Glucose is phosphorylated. The resulting glucose 6-phosphate can be converted to pyruvate either via the pentose phosphate pathway/Embden–Meyerhof pathway or via the Entner–Doudoroff pathway. The resulting pyruvate carries a 13C label at C3 or, when derived from the Entner–Doudoroff pathway, at C1. Alanine is then derived from the common pool of pyruvate, as reflected by its analogous labeling pattern. The scheme displays the metabolic intermediates that were detected with 13C labels. Carbon atoms carrying predominant 13C labels are indicated by shaded dots. The biosynthetic route deduced from the 13C-labeled metabolites corresponds to alternative B of the pathways suggested in the genome-based reconstruction (Fig. 2)
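The positional reasoning in this caption can be sketched as a carbon-atom mapping from glucose to pyruvate. The dictionaries below encode the textbook carbon transitions of the Entner–Doudoroff and Embden–Meyerhof routes; the pentose phosphate route mentioned in the caption is not modeled, and all names are illustrative assumptions.

```python
# pyruvate carbon -> glucose carbons that can end up at that position
ENTNER_DOUDOROFF = {1: (1, 4), 2: (2, 5), 3: (3, 6)}
EMBDEN_MEYERHOF = {1: (3, 4), 2: (2, 5), 3: (1, 6)}


def labeled_pyruvate_positions(route, labeled_glucose_carbons):
    """Return the pyruvate carbons that can receive label via the given route."""
    return sorted(
        pyruvate_carbon
        for pyruvate_carbon, sources in route.items()
        if any(carbon in labeled_glucose_carbons for carbon in sources)
    )


# [1-13C]glucose: only carbon 1 of glucose carries the label.
ed_positions = labeled_pyruvate_positions(ENTNER_DOUDOROFF, {1})
emp_positions = labeled_pyruvate_positions(EMBDEN_MEYERHOF, {1})
print(ed_positions, emp_positions)
```

Because transamination of pyruvate to alanine leaves the carbon skeleton unchanged, the same positions apply to alanine: label at C-1 points to the Entner–Doudoroff route and label at C-3 to the Embden–Meyerhof route.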

Determination of serine and glycine biosynthetic pathways

The analysis of the genome data resulted in the reconstruction of three distinct pathways for the biosynthesis of glycine. Based on MetaCyc, these metabolic routes received the designations “glycine biosynthesis I”, “glycine biosynthesis III”, and “glycine biosynthesis IV” (Fig. 2, alternatives A–C). The first pathway, “glycine biosynthesis I”, implicates the synthesis of serine from 3-phosphoglycerate. Serine is then converted to glycine by a serine hydroxymethyltransferase (EC 2.1.2.1) by the addition of tetrahydrofolate. The second alternative (“glycine biosynthesis III”) is to generate glycine via an alanine-glyoxylate transaminase (EC 2.6.1.44), which converts glyoxylate and alanine to pyruvate and glycine. In the final alternative “glycine biosynthesis IV”, glycine is derived from the metabolic precursor threonine via the threonine aldolase (EC 4.1.2.5).

The 13C-labeling patterns of glycine and serine (both labeled at C-1) suggest a close biosynthetic relationship, indicating that serine is converted to glycine by serine hydroxymethyltransferase (“glycine biosynthesis I”; Fig. 4). No indications were found for glycine production either from glyoxylate or from threonine (unlabeled).

Fig. 4

Elucidation of the biosynthesis pathway for serine and glycine. X. campestris pv. campestris metabolites obtained from bacteria that had been fed with [1-13C]-labeled glucose were analyzed by NMR. From the position of the 13C-label it becomes obvious that glycine was derived from serine. Serine was synthesized from 3-phosphoglycerate, a common intermediate of the central metabolism. Serine is the precursor of the pathway termed “glycine biosynthesis I” (Fig. 2, alternative A). C-atoms carrying predominant 13C labels are indicated by shaded dots
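The link between the serine and glycine labeling patterns can be illustrated with a short carbon-mapping sketch. Serine hydroxymethyltransferase retains C-1 and C-2 of serine in glycine and transfers C-3 to tetrahydrofolate. The function name is a hypothetical helper, and the input positions are the serine labels reported in the Results (C-1 and C-3).

```python
# Serine carbons retained in glycine by serine hydroxymethyltransferase;
# C-3 of serine is transferred to tetrahydrofolate as a one-carbon unit.
SERINE_TO_GLYCINE = {1: 1, 2: 2}


def glycine_label_from_serine(labeled_serine_carbons):
    """Map labeled serine positions onto the glycine positions they become."""
    return sorted(
        SERINE_TO_GLYCINE[carbon]
        for carbon in labeled_serine_carbons
        if carbon in SERINE_TO_GLYCINE
    )


# Serine was labeled at C-1 and C-3, so glycine is expected to carry label at C-1 only.
glycine_positions = glycine_label_from_serine({1, 3})
print(glycine_positions)
```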

Isoleucine is synthesized via a threonine-independent pathway

Pathway Tools inferred two alternative routes for the biosynthesis of isoleucine by X. campestris pv. campestris. The respective pathways are termed “isoleucine biosynthesis I” and “isoleucine biosynthesis II” (Fig. 2, alternatives A and B). In “isoleucine biosynthesis I”, the biosynthesis of isoleucine is linked to the biosynthesis of aspartate, threonine, and glutamate. Here, the biosynthesis originates from the precursor molecule oxaloacetate, which is converted by an aspartate transaminase (EC 2.6.1.1) to aspartate. Aspartate is then phosphorylated to aspartyl-4-phosphate by aspartate kinase (EC 2.7.2.4). Aspartyl-4-phosphate is converted to aspartate-semialdehyde and further processed to homoserine, catalyzed by aspartate-semialdehyde dehydrogenase (EC 1.2.1.11) and homoserine dehydrogenase (EC 1.1.1.3), respectively, in both reactions at the expense of oxidizing NAD(P)H. Subsequently, the homoserine kinase (EC 2.7.1.39) phosphorylates homoserine to O-phospho-L-homoserine, which is then converted to threonine by the threonine synthase (EC 4.2.3.1). Threonine is transformed to 2-oxobutanoate by threonine ammonia-lyase (EC 4.3.1.19; ilvA1, ilvA2). In the following reaction, 2-aceto-2-hydroxy-butanoate is generated by acetolactate synthase (EC 2.2.1.6; ilvB). A ketol-acid reductoisomerase (EC 1.1.1.86; ilvC) catalyzes the reaction to 2,3-dihydroxy-3-methylvalerate, which is dehydrated to 2-keto-3-methylvalerate by a dihydroxy-acid dehydratase (EC 4.2.1.9; ilvD). Isoleucine and α-ketoglutarate are finally generated from 2-keto-3-methylvalerate by transferring an amino group from glutamate, a reaction that depends on the branched-chain amino acid transaminase (EC 2.6.1.42; ilvE). For all reactions, genes were identified that encode the enzymes required for this pathway (Fig. 2).

The other route, “isoleucine biosynthesis II”, shares the last four reactions with the “isoleucine biosynthesis I” pathway. This pathway, which was originally identified in the spirochaete Leptospira interrogans (Xu et al. 2004), includes leucine biosynthetic enzymes with relaxed specificities that facilitate the conversion of analogous isoleucine biosynthesis metabolites (Risso et al. 2008; Fig. 5a). “Isoleucine biosynthesis II” starts from the precursor pyruvate. Pyruvate and acetyl-CoA are converted to (R)-2-methylmalate, also known as D-citramalate, and coenzyme A by the R-citramalate synthase (EC 2.3.1.182). The leuA gene product corresponding to this enzyme has previously been annotated as 2-isopropylmalate synthase (EC 2.3.3.13), but a closer analysis, in particular with PRIAM, indicates an additional full-length similarity to citramalate synthase (Fig. 5b). Furthermore, the leuA gene was found amidst a cluster of almost all genes involved in isoleucine biosynthesis at a single chromosomal locus (Fig. 5b). Only ilvD and ilvE, which encode the final reactions of the pathway and are also required for valine synthesis (Fig. 1; Supplementary Fig. 2), were found at different chromosomal positions. The initially provided (R)-2-methylmalate (D-citramalate) is subsequently converted via the intermediate 2-methylmaleate (synonym: citraconate) to erythro-3-methylmalate by isopropylmalate isomerase (EC 4.2.1.35). This enzyme activity may be encoded by the leuCD genes. Here again, an in silico analysis clearly suggests a high similarity of the gene products to the large and small subunits of a 3-isopropylmalate dehydratase (EC 4.2.1.33) in addition to the evidence for isopropylmalate isomerase (Fig. 5b). Erythro-3-methylmalate is oxidized to 2-oxobutanoate by 3-methylmalate dehydrogenase. For this enzyme, which has been characterized recently in the archaeon Methanocaldococcus jannaschii (Drevland et al. 2007), the preliminary EC number 1.1.1.n5 was assigned. 2-Oxobutanoate is subsequently converted to isoleucine by the products of the genes ilvB, ilvC, ilvD, and ilvE as described above for “isoleucine biosynthesis I”. Neither the isopropylmalate isomerase reaction (EC 4.2.1.35; leuCD) nor the isopropylmalate dehydrogenase reaction (EC 1.1.1.n5; leuB) was automatically assigned to the metabolic network represented in XccCyc.

Fig. 5

Biosynthesis pathway for isoleucine based on 13C-NMR measurements. Metabolic intermediates derived from the utilization of [1-13C]-labeled glucose were analyzed to identify the pathway employed for isoleucine biosynthesis (a). Arrows indicate the direction of the reactions. Dot-separated numbers in the vicinity of the reaction arrows give the EC numbers of the deduced enzymatic reactions. Carbon atoms carrying predominant 13C labels are indicated by shaded dots. The 13C labeling pattern is consistent with the generation of 2-oxobutanoate from pyruvate. This pathway, with the intermediates (R)-2-methylmalate (D-citramalate), 2-methylmaleate (citraconate), and D-erythro-3-methylmalate, corresponds to “isoleucine biosynthesis II” in the MetaCyc database. The subsequent reactions to generate isoleucine from 2-oxobutanoate are represented in both genome-suggested pathways “isoleucine biosynthesis I” and “isoleucine biosynthesis II”. The detection of the specific intermediates of the initial “isoleucine biosynthesis II” pathway is clear evidence for the use of this pathway (Fig. 2, alternative B) by X. campestris pv. campestris. As the threonine detected did not carry 13C-labeled carbon atoms, a participation of threonine could be ruled out for the biosynthesis of the labeled isoleucine isotopologues. All genes for isoleucine biosynthesis except for ilvD and ilvE, which code for the final two reactions and are both also involved in valine biosynthesis, were localized in a single chromosomal cluster (b). Arrows indicate the direction of transcription, the extents of the coding sequences (CDS), and the names of the genes. Shaded boxes underneath the gene symbols represent PRIAM profiles that indicate enzyme specificities, with the predicted specificities denoted by their EC numbers. Boxes shaded in light gray symbolize profiles with functions in isoleucine biosynthesis. Boxes shaded in dark gray symbolize profiles with functions in leucine biosynthesis. It becomes obvious that multiple genes are related to both isoleucine and leucine biosynthesis. While there is no PRIAM profile for the newly classified 3-methylmalate dehydrogenase (EC 1.1.1.n5), this function is reflected by similarity to the Swiss-Prot entry Q58130 from M. jannaschii. The capital letter M indicates the gene ilvM, which encodes the small subunit of the acetolactate synthase. An asterisk symbolizes a PRIAM profile with the EC number 2.2.1.6. The exact function of the putative methyltransferase encoded by the gene ubiE2 remained unclear

The high 13C-enrichment at C-1 of isoleucine from X. campestris pv. campestris grown with [1-13C]-labeled glucose clearly identified the pathway used. More specifically, a route involving oxaloacetate (unlabeled) as a precursor unit can be excluded. Rather, a pathway transferring label from [1-13C] pyruvate to C-1 of isoleucine appeared plausible. Indeed, “isoleucine biosynthesis II” (Fig. 2, alternative B) via 2-oxobutanoate, which is derived via (R)-2-methylmalate (citramalate) from pyruvate, is perfectly in line with the observations (Fig. 5a).

In summary, the data obtained by 13C-NMR facilitated the identification of biosynthetic pathways for isoleucine, glycine, and alanine. In conjunction with the results of the genome-based metabolic reconstruction, and taking into account the complementation analyses of two genes involved in methionine biosynthesis, this provided evidence for the biosynthesis of all proteinogenic amino acids. In addition to the strain B100 analyzed in this study, complete genome data are available for the X. campestris pv. campestris strains ATCC 33913 (da Silva et al. 2002) and 8004 (Qian et al. 2005). Genome comparison revealed orthologs for all amino acid biosynthesis genes identified in strain B100 also in the strains ATCC 33913 and 8004 (Supplementary Table 2). This indicates the presence of similar biosynthetic routes for the generation of proteinogenic amino acids in all three sequenced X. campestris pv. campestris strains.

Discussion

Based on a detailed analysis of the complete genome of X. campestris pv. campestris B100 and on in vivo 13C-isotopologue profiling (Eylert et al. 2010; Eisenreich et al. 2010), the biosynthetic routes to amino acids were determined (Table 1). As expected, most of these pathways corresponded to the classical routes known for many other microorganisms. However, the pathways leading to arginine (Morizono et al. 2006) and isoleucine were unusual and might reflect specific metabolic features of xanthomonads.

Table 1 Metabolic pathways for the biosynthesis of l-amino acids identified in X. campestris pv. campestris

More specifically, the analysis of isoleucine biosynthesis indicates the prevalence of a biosynthetic pathway that was identified only recently (Xu et al. 2004; Risso et al. 2008). Thus, besides the recently elucidated arginine biosynthesis pathway (Morizono et al. 2006), there are now indications for a second novel biosynthetic route involved in amino acid biosynthesis. In the isoleucine pathway inferred in this study, several reactions are catalyzed by enzymes that are also involved in leucine biosynthesis. The isoleucine biosynthesis pathways of L. interrogans (Xu et al. 2004) and Geobacter sulfurreducens (Risso et al. 2008) are similar to the pathway suggested for X. campestris pv. campestris, as in these organisms the leuB, leuC, and leuD gene products have relaxed enzyme specificities and are thus involved in both leucine and isoleucine biosynthesis. However, the L. interrogans and G. sulfurreducens pathways include a specific R-citramalate synthase (EC 2.3.1.182, cimA) in addition to the 2-isopropylmalate synthase (EC 2.3.3.13) encoded by the leuA gene. In X. campestris pv. campestris, the initiating reactions of both pathways are probably catalyzed by the leuA gene product. A deeper experimental analysis of the biochemical properties and the structure of this enzyme may shed more light on its function.

Another question concerns the roles of those biosynthetic pathways that were deduced from the genome annotation (Fig. 2), but were then found to be irrelevant for the biosynthesis of alanine, glycine, and isoleucine. What is the reason for the presence of genes encoding unused metabolic pathways? Concerning the utilization of threonine for isoleucine biosynthesis, the 13C-isotopologue profiling data clearly show that this pathway has no traceable influence on the biosynthesis of isoleucine when glucose is used as a sole carbon source. However, there are two isogenes encoding threonine ammonia-lyase present in the genome of X. campestris pv. campestris B100, ilvA1 and ilvA2, which are supposed to be involved in synthesizing isoleucine from threonine. It is tempting to suggest that these two genes are active when other carbon sources are available. Considering the phytopathogenic life-style of Xanthomonads, the enzymes encoded by the ilvA genes may be particularly useful when threonine becomes available in planta, either directly or via the availability of precursor molecules that can be converted to threonine. Unfortunately, so far little is known about the adaptation of the metabolism of X. campestris following the infection of its host plants. Somewhat analogously, the metabolic pathways “glycine biosynthesis III” and “glycine biosynthesis IV”, shown not to be involved in glycine biosynthesis in minimal medium supplemented with glucose, may play a role under different environmental conditions. Enzymes from other organisms employing these pathways were shown to be particularly heat tolerant (Liu et al. 1998) or to be required for the utilization of alternative carbon sources (Schlösser et al. 2004), which supports such a conditional function. A plausible example of such a complementary role of a reconstructed biosynthetic pathway that was found not to be important for protein biosynthesis may be “alanine biosynthesis III”, which probably provides alanine for the biosynthesis of iron–sulfur clusters (Fontecave and Ollagnier-de-Choudens 2008).

The broad metabolic survey of amino acid biosynthesis in minimal medium with glucose as a sole carbon source reported in this study may facilitate specific follow-up analyses. Such studies may either be targeted at X. campestris enzymes of special interest, or they may aim at the characterization of further biosynthetic pathways, in particular with relevance to plant pathogenicity. In any case, the data provided are directly relevant to the industrial utilization of X. campestris pv. campestris in the large-scale production of xanthan. The metabolic pathways outlined in this paper can be used directly to enhance flux balance analysis (Orth et al. 2010) or to deduce metabolic flux data from the amino acid composition of sample cultures (Wittmann 2002). Both techniques are well established for initiating metabolic analyses in systems biology. While useful applications exploiting the amino acid biosynthesis routes newly identified for X. campestris pv. campestris are described above, the combination of genome-based metabolic reconstruction and isotopologue profiling employed in this work has a much wider potential. To our knowledge, this combined approach is described here for the first time. A limitation of NMR-based profiling can be the need to provide the compounds to be analyzed in sufficient abundance for their identification. However, the methodological combination with genome-based metabolic reconstruction can be individually adapted and is flexible enough to be extended when appropriate. In this study, an additional complementation analysis was included to shed more light on the branched methionine biosynthesis and thereby complete the metabolic field investigated by covering the anabolism of all proteinogenic amino acids. Furthermore, this combination of methods is not only flexible, but it also addresses a problem that is becoming increasingly urgent. As a result of continuing advances in genome sequencing, more than 1,600 completed bacterial genomes and a growing number of sequenced eukaryotic genomes are currently available (Sayers et al. 2011). However, these outstanding data are frequently too inaccurate for metabolic analyses, as becomes obvious not only from errors but also from gaps in metabolic reconstructions (Karp et al. 2010). This study shows that in vivo isotopologue profiling based on stable isotopes can contribute to utilizing genome data efficiently when specifically applied to determine those metabolic pathways that cannot be resolved by careful genomic reconstruction. This combination is therefore a promising basis for answering open questions concerning metabolic pathways in organisms for which genome data are available.