Introduction

Plants are exposed to a large variety of environmental stresses including abiotic stresses such as drought or extreme temperatures, and biotic stresses caused by pathogens. To survive, plants have evolved mechanisms to perceive these stresses in order to respond to these adverse environmental conditions (Fujita et al. 2006). Concerning biotic stresses, biotroph pathogen perception is often conditioned by major resistance (R) genes following the classic gene-for-gene model (Flor 1955). Most plant R genes cloned to date belong to the Nucleotide Binding Site (NBS)-Leucine Rich Repeat (LRR) gene family (Martin et al. 2003; Belkhadir et al. 2004; Rairdan and Moffett 2007). R genes belonging to this class have been identified in various plant species (monocotyledons as well as dicotyledons) and correspond to R genes effective against all types of pathogens and pests including fungi, bacteria, viruses, nematodes, oomycetes, and insects (Dangl and Jones 2001; Hammond-Kosack and Parker 2003; McHale et al. 2006).

R proteins have two functions: they are ‘sensors’ that directly or indirectly detect pathogens, and they activate downstream signaling leading to pathogen resistance (Chisholm et al. 2006). Upon pathogen recognition, R proteins trigger defence responses that often result in the so-called hypersensitive response (HR), a rapid induction of host cell death localized at the site of invasion of the pathogen (Hammond-Kosack and Jones 1996). This plant response is associated with massive cellular ion fluxes, activation of mitogen-activated protein (MAP) kinase (K) cascades, generation of reactive oxygen species (ROS), and strengthening of cell walls as well as the induction of the expression of many defence genes, including pathogenesis-related (PR) genes (Hammond-Kosack and Jones 1996; van Loon et al. 2006; Desveaux et al. 2006; Grant et al. 2006). Furthermore, evidence of complex interplay between defence molecules such as salicylic acid (SA), ethylene (ET), jasmonic acid (JA) and nitric oxide is emerging (Hammond-Kosack and Jones 1996; Hammond-Kosack and Parker 2003; Robert-Seilaniantz et al. 2007). During the last decade, several studies have reported evidence for points of convergence in biotic and abiotic stress signalling networks (Sessa et al. 1995; Grant et al. 2003; Chini et al. 2004; Agarwal et al. 2006; Cao et al. 2006; Jiang et al. 2008). Hormone, ROS signalling pathways and activation of MAPK cascade are common plant responses to both types of stresses, suggesting that resistance against biotic and abiotic stresses at least partially share common pathways (Fujita et al. 2006). However, as the molecular mechanisms involved in these stresses were analyzed mostly independently, convergence points between biotic and abiotic stress signaling processes remain elusive (Fujita et al. 2006). Therefore, further studies are needed to understand the key factor(s) governing the connection between the signaling pathways involved in response to biotic and abiotic stresses.

Formate dehydrogenase (FDH) is a widespread enzyme found in various organisms such as bacteria, fungi, yeast, and plants. FDH catalyzes the oxidation of formate (HCOO−) into CO2. Different types of FDHs have been described; however, only the NAD-dependent type (EC 1.2.1.2) has been reported in plants. NAD-dependent FDHs have been shown to provide NADH to the respiratory chain (Popov and Lamzin 1994). Historically, the first plant FDH activity was described in common bean (Phaseolus vulgaris) seeds (Thunberg 1921). In the Arabidopsis thaliana genome, only one FDH encoding gene has been identified, suggesting that FDH is a low copy gene in plant genomes (Li et al. 2001). In plants, FDH is nuclear-encoded as a precursor containing an NH2-terminal extension signal peptide, which contains the organelle targeting information. Direct evidence for the targeting of Solanum tuberosum FDH to mitochondria, and for its activity there, has been provided (Colas des Francs-Small et al. 1993; Ambard-Bretteville et al. 2003a); however, FDH activity has also been detected in the chloroplasts of A. thaliana (Olson et al. 2000; Herman et al. 2002). FDH transcript accumulation has been reported under various abiotic stresses including hypoxia, chilling, drought, dark, and wounding in potato and Arabidopsis leaves (Hourton-Cabassa et al. 1998; Li et al. 2001) as well as iron deficiency and anaerobiosis in barley roots (Suzuki et al. 1998). At the protein level, FDH activity has been reported in various organs such as pea pods, developing seeds, mature seeds, embryo, seedling and root nodule (Davison 1949), and potato tuber, where it is one of the major mitochondrial proteins (Colas des Francs-Small et al. 1993). FDH protein accumulation and increased activity were observed in barley roots under iron deficiency (Suzuki et al. 1998) and in potato tuber, where an increase in FDH activity was correlated with a decrease in oxygen concentration (Bykova et al. 2003). However, no evidence for the implication of FDH in biotic stress has been reported so far.

In common bean, a complex R gene cluster referred to as the B4 R gene cluster has been identified at the end of linkage group (LG) B4 (Geffroy et al. 1999). Specific R genes against the fungus Colletotrichum lindemuthianum (Geffroy et al. 1999, 2008) and R quantitative trait loci (QTL) against C. lindemuthianum and Bean yellow mosaic virus (Geffroy et al. 2000; Lopez et al. 2003) as well as expressed Coiled-coil-NBS-LRR (CNL) encoding genes (Ferrier Cana et al. 2003; Ferrier Cana et al. 2005) were mapped to the B4 R gene cluster. In preparation for sequencing of the B4 R gene cluster in common bean, we generated a physical map based on bacterial artificial chromosome (BAC) clones from the Meso-American genotype BAT93 (David et al. 2008). This analysis revealed that the B4 R gene cluster is a very large cluster spanning several megabases and that CNL sequences were heterogeneously distributed across the B4 R gene cluster, with several regions extremely rich in CNL sequences separated by large intervening regions with few or no CNL sequences.

In the present study, we report the sequencing and gene annotation of one common bean BAC clone (105 kb) mapped to the B4 R gene cluster (David et al. 2008). We identify 17 putative genes, including four CNL genes and three full-length formate dehydrogenase (FDH) encoding genes (FYN24G.15, FYN24G.16 and FYN24G.17). Database searches confirm that FDH is a low copy gene in plant genomes, and prediction programs suggest that mitochondrial localization of FDH might be the rule in higher plants. Expression analyses are presented for two of the three common bean FDH genes (FYN24G.15 and FYN24G.16) under dark stress and under infection with the fungus Colletotrichum lindemuthianum. These analyses revealed that, although these two genes are very similar, they are differentially expressed and respond to distinct stresses, i.e., dark stress for FYN24G.15 and infection with an avirulent strain of C. lindemuthianum for FYN24G.16.

Materials and methods

Biological materials and plant treatments

Seeds of Phaseolus vulgaris genotype BAT93 were germinated and grown for 8 days as described previously (Geffroy et al. 1999). For the biotic stress, the two cotyledonary leaves of BAT93 seedlings were sprayed, 8 days after germination, with M126 (virulent) or C531 (avirulent) Colletotrichum lindemuthianum conidia according to Geffroy et al. (1999). Anthracnose symptoms appear 96 h after infection with the virulent strain. Control plants were sprayed with water and incubated in a control growth chamber under the same temperature (19°C) and humidity conditions (90% relative humidity) as the infected plants. For the dark treatment, 8-day-old BAT93 seedlings were placed in a dark growth chamber at 19°C, for 72 h. After 72 h in the dark, the plants were re-exposed to light for 2 days. Control plants were placed in a growth chamber under the same conditions, except that they were exposed to fluorescent tubes (OSRAM L, 58W/830) providing light at a fluence of 166 μmol m⁻² s⁻¹. One hundred mg of cotyledonary leaves from three different plants were harvested at 10, 24, 36, 48, 72, and 96 h after C. lindemuthianum infection and at 10, 24, 36, 48, 72, 96, and 120 h after the dark treatment initiation, as well as from the control plants, and immediately frozen in liquid N2 for mRNA isolation. To provide replicate RNA samples for real-time quantitative PCR assays, each time-course was repeated twice.

BAC library screening, sequencing, and sequence annotation

A BAT93 Phaseolus vulgaris BAC library (Kami et al. 2006) was screened with PRLJ1, an NBS probe previously shown to be specific for the B4 R gene cluster (Geffroy et al. 1999; Kami et al. 2006). The positive BAC clones were organized in contigs (David et al. 2008) and BAC clone FY-N24 was selected for sequencing. Lark Genaissance (UK) sequenced 768 random clones of P. vulgaris BAT93 BAC FY-N24 from a shotgun library with inserts of 1–3.5 kb, giving approximately eightfold sequence coverage. Since no gene prediction program specifically trained for common bean is yet available, the annotation was done manually, using a combination of gene-finding programs and sequence homology with known plant genes and proteins.

First, the FY-N24 sequence was analyzed for repetitive elements and transposon sequences using RepeatMasker (Smit and Green, http://repeatmasker.org) and The Institute for Genomic Research (TIGR) Arabidopsis repetitive element database. Second, gene models were predicted using the ab initio gene finders FGENESH (Burset and Guigo 1996) and GeneMark.hmm (Lukashin and Borodovsky 1998) with the Arabidopsis model. Additionally, the BLASTX algorithm (Altschul et al. 1997) was used to search for homologous protein sequences in the GenBank non-redundant (NR) database. Third, to predict exon/intron borders correctly, the genomic sequence of FY-N24 was used in BLASTN analyses against all the expressed sequence tags (ESTs) available at GenBank (25 August 2008) from P. vulgaris (83,348) (Ramirez et al. 2005), P. coccineus (20,120), P. acutifolius (739), and P. angustissimus (6,672). The resulting list of ESTs overlapping with the FY-N24 genomic sequence was passed to a custom script that automatically retrieves the sequences of all matches and performs an in-depth verification of each EST using Sim4 (Florea et al. 1998). Since the three common bean FDH sequences present a high level of nucleotide identity, EST assignment to each of them was deduced from careful manual inspection of multiple alignments established with CLUSTALX version 1.83 (Thompson et al. 1994) between ESTs and putative FDH coding sequences. The same procedure was used for EST assignment to CNL sequences. This information was compiled with ARTEMIS (Rutherford et al. 2000) to decide on the final gene structure. The resulting predicted protein sequences were used to query the PFAM database (Bateman et al. 2004) for domain prediction. The FY-N24 BAC sequence has been deposited in the NCBI database under accession number EU931620.

Predicted genes are named with the FY-N24 BAC name followed by the letter “G” for gene and X, where X is a number (from 1 to 17) corresponding to the order of the genes along the FY-N24 BAC sequence. Among the 17 annotated genes, 4 correspond to CC-NBS-LRR (CNL) encoding genes and are referred to as CNL-B8 to CNL-B11 to comply with a previous nomenclature. CNLs corresponding to pseudogenes are marked with an asterisk (*) after their name. These pseudogenes contain a start methionine but present frameshift(s) or premature stop codon(s) leading to truncated predicted proteins compared with “full length” encoding CNLs.

Database searches and sequence analyses

A set of nucleotide FDH sequences from other plants was identified with a TBLASTN search using the bean FDH FYN24G.16 as a query sequence against (1) the GenBank NR database (cut-off e-value 1e-50, 30 March 2007); (2) the soybean whole genome shotgun (WGS) sequence (cut-off e-value 1e-50, 7x draft assembly, 28 January 2008, http://phytozome.net); (3) the PUT (Putative unit transcript) of PlantGDB (cut-off e-value 1e-30, 30 March 2007; Dong et al. 2004); and (4) the GenBank database Viridiplantae EST division (cut-off e-value 1e-30, 30 March 2007). Matching ESTs of the same species were assembled using CAP3 (Huang and Madan 1999), and the resulting contigs were checked manually in order to prevent wrong assignments (Supplementary Table 1). The identified FDH DNA sequences were translated with the MacMolly program (version 3.5.1, Soft gene GmbH) and only full-length FDH sequences were retained for further analysis.
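The e-value cut-offs listed above can be illustrated with a minimal Python sketch. It assumes that hits were exported in the standard 12-column BLAST tabular format, in which the e-value is the eleventh column; the source does not state which output format was used. The function name and the two example hits (hit_A, hit_B) are hypothetical and are not data from this study.

```python
def filter_hits(tabular_lines, max_evalue):
    """Keep (subject id, e-value) pairs whose e-value is <= max_evalue.

    Assumes the standard 12-column BLAST tabular layout:
    query, subject, %identity, length, mismatches, gap opens,
    q.start, q.end, s.start, s.end, e-value, bit score.
    """
    kept = []
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])  # eleventh column holds the e-value
        if evalue <= max_evalue:
            kept.append((fields[1], evalue))
    return kept


# Hypothetical hits, for illustration only
example_hits = [
    "FYN24G.16\thit_A\t85.0\t380\t57\t0\t1\t380\t1\t1140\t1e-120\t430",
    "FYN24G.16\thit_B\t40.0\t200\t120\t3\t10\t209\t30\t629\t1e-35\t150",
]

strict = filter_hits(example_hits, 1e-50)   # cut-off used for NR and soybean WGS
relaxed = filter_hits(example_hits, 1e-30)  # cut-off used for PUT and EST searches
print([name for name, _ in strict])
print([name for name, _ in relaxed])
```

With the stricter cut-off only hit_A is retained, whereas the more permissive cut-off used for transcript databases retains both hits.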

Predictions for protein subcellular localizations were performed using Predotar (Small et al. 2004), TargetP (Emanuelsson et al. 2007) and Mitoprot (Claros and Vincens 1996) software.

Nucleotide identities and amino acid similarities were obtained using bl2seq (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi). For phylogenetic analyses, nucleotide sequences of the 33 plant FDH sequences were aligned with CLUSTALX version 1.83 (Thompson et al. 1994) and edited in GENEDOC for manual adjustments (http://www.psc.edu/biomed/genedoc). This alignment was analyzed using Bayesian phylogenetic methods as implemented in MrBayes V. 3.1.2 (Huelsenbeck and Ronquist 2001). MrBayes analyses were run with the nucleotide model 4by4, following a gamma distribution. We used the Metropolis-coupled Markov Chains Monte Carlo method to make two runs of four chains. We ran 10 million generations, sampling every 100 generations. A consensus tree with branch lengths and posterior estimates of branch probabilities was generated with the “sumt” command of MrBayes and “burnin” parameter of 50,000. The resulting phylogenetic tree is displayed with the MEGA4.1 software (Tamura et al. 2007).
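As a rough check on the sampling scheme described above, the following Python sketch computes how many samples each run yields and how many remain after burn-in. It assumes that the “burnin” value counts sampled trees rather than generations, and it ignores the initial state of the chain; both points are assumptions on our part and are not stated in the text.

```python
# Values taken from the text above
generations = 10_000_000
sample_every = 100
burnin_samples = 50_000  # assumption: counted in samples, not generations

samples_per_run = generations // sample_every      # samples written per run
retained_per_run = samples_per_run - burnin_samples
discarded_fraction = burnin_samples / samples_per_run

print(samples_per_run, retained_per_run, discarded_fraction)
```

Under these assumptions, half of the samples from each run are discarded before the consensus tree is built.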

RNA extraction, cDNA preparation and amplification, cloning and sequencing of RT-PCR products

Extractions of total RNA were carried out with the Nucleospin RNA plant Kit according to manufacturer’s instructions (Macherey–Nagel, Düren, Germany). cDNA preparation was performed as described in Langlois-Meurinne et al. (2005). To confirm the FDH gene structure, three primer pairs were designed upstream of the ATG (forward oligonucleotide) and downstream of the stop codon (reverse oligonucleotide) of each of the three bean FDH nucleotide sequences (Table 1). They were used to amplify fragments of 1,319; 1,370; and 1,313 bp of the first-strand cDNA product of FYN24G.17, FYN24G.16, and FYN24G.15, respectively, with the Advantage-HF 2 PCR Kit according to manufacturer’s instructions (BD Biosciences, Mountain View, CA, USA). PCR products were separated on a 1.5% agarose gel and visualized with ethidium bromide staining under ultraviolet light. Fragments of the expected size were excised from the gel, purified using the Nucleospin extract 2 kit according to manufacturer’s instructions (Macherey–Nagel, Düren, Germany), and cloned with the TOPO TA Cloning Kit (Invitrogen, Carlsbad, CA, USA). Plasmid DNA of positive clones was purified using the QIAprep Spin Miniprep Kit (Qiagen, Valencia, CA, USA) and sequenced using an automated 373A DNA sequencer and the Big Dye dideoxy chain Terminator cycle sequencing Kit (Perkin Elmer, Applied Biosystems, Roissy, France).

Table 1 Primer sequences used in RT-qPCR and FDH gene structure experiments

Expression analysis

All primers used in the real-time quantitative PCR (RT-qPCR) experiments are reported in Table 1. Two primer pairs specific for FYN24G.15 and FYN24G.16, respectively, were defined (Supplementary Fig. 1). However, no primer pair that both followed the manufacturer’s instructions and was specific to FYN24G.17 could be defined. RT-qPCR analysis of bean FDH was carried out using a LightCycler 480 instrument and LightCycler® 480 SYBR Green Master mix (Roche Diagnostics, Mannheim, Germany) with primers at a final concentration of 0.21 μM and 5 μl of cDNA (diluted 1:100) as template, in a final volume of 15 μl. PCR cycling comprised an initial polymerase activation step at 95°C for 10 min, followed by 45 cycles of 95°C for 15 s, 60°C for 20 s (except for the FYN24G.16 primer pair, for which this step was at 55°C), and 72°C for 15 s. After each run, a dissociation curve was acquired by heating the samples from 65 to 95°C to confirm the absence of nonspecific amplification products. Real-time DNA amplification was monitored and analyzed using the LightCycler® software release 1.5.0 (Roche Diagnostics, Mannheim, Germany). Two completely independent experiments were performed, i.e., based on two independent time courses. For each experiment, each sample was run in duplicate. We verified that the primer pairs designed on bean EF1-α, FYN24G.15, and FYN24G.16 did not amplify Colletotrichum lindemuthianum genomic DNA. Contamination by genomic DNA was checked using a primer pair designed on FYN24G.15 intron 1 (Table 1). Quantification of gene expression was performed using the comparative Ct method (Livak and Schmittgen 2001). EF1-α, whose expression remains constant in the different conditions tested, was used as internal reference. For each condition, the Ct value obtained for each gene of interest (Ct) was normalized by the internal reference Ct value (Ctr), according to the equation ΔCt = Ct − Ctr. The ΔCt value obtained for each condition was then normalized by the ΔCt value obtained for the calibrator (ΔCt(c)), which corresponds to untreated 8-day-old BAT93 seedlings, according to the equation ΔΔCt = ΔCt − ΔCt(c). Relative expression was then expressed as 2^(−ΔΔCt). Data are presented as fold induction in treated relative to control conditions.
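The comparative Ct calculation described above can be illustrated with a minimal Python sketch. The function name and the Ct values are hypothetical, chosen only to make the arithmetic easy to follow; they are not measurements from this study.

```python
def relative_expression(ct_gene, ct_ref, ct_gene_cal, ct_ref_cal):
    """Return 2^(-ddCt) for one sample relative to the calibrator.

    ct_gene, ct_ref         : Ct of the gene of interest and of the
                              internal reference (EF1-alpha) in the sample
    ct_gene_cal, ct_ref_cal : the same two Ct values in the calibrator
    """
    delta_ct = ct_gene - ct_ref                # dCt = Ct - Ctr
    delta_ct_cal = ct_gene_cal - ct_ref_cal    # dCt(c) for the calibrator
    delta_delta_ct = delta_ct - delta_ct_cal   # ddCt = dCt - dCt(c)
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles
# earlier in the sample than in the calibrator, reference unchanged.
fold = relative_expression(ct_gene=24.0, ct_ref=18.0,
                           ct_gene_cal=26.0, ct_ref_cal=18.0)
print(fold)  # 4.0
```

In this example ΔCt is 6 for the sample and 8 for the calibrator, so ΔΔCt is −2 and the relative expression is 2^2 = 4. The sketch assumes an amplification efficiency of 2 per cycle, as the 2^(−ΔΔCt) formula itself does.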

Results

Sequence analysis of the bean FY-N24 BAC clone

The BAC clone FY-N24 was completely sequenced, providing 105,981 bp of contiguous sequence with an approximate sixfold coverage. The overall GC content of the 105,981 bp is 34.45%. The FY-N24 BAC sequence was annotated using a combination of prediction and comparative tools (see “Materials and methods”). BLAST queries were performed against ESTs and NR peptide databases. Alignments were used to optimize manually the homology regions, taking into account characteristics of splice sites and features shown by ESTs or peptides.

As most of the ESTs originate from different Phaseolus species (ESTs are either from P. vulgaris, P. coccineus, P. acutifolius or P. angustissimus whereas the BAC comes from P. vulgaris) or from different P. vulgaris genotypes (P. vulgaris ESTs are from 13 different genotypes, including genotype BAT93, which provides the presently sequenced BAC), we did not expect complete identity between ESTs and coding sequences of the annotated genes. The final result of the manual gene prediction is presented in Table 2 and Fig. 1. In total, 17 genes, named FYN24G.1 to FYN24G.17 (Fig. 1; Table 2), were predicted, providing an average gene density of one gene every 6 kb, with a large central region of 18,386 bp with no predicted gene. Four genes presenting the same transcriptional orientation (FYN24G.1, FYN24G.3, FYN24G.4, and FYN24G.6) are similar to R genes of the CNL class (Fig. 1). Two of them, FYN24G.1 and FYN24G.6, were previously identified from sequencing of phages mapped at the B4 R gene cluster, where they were referred to as CNL-B8 and CNL-B11, respectively (Geffroy et al. 2009). According to this gene nomenclature, FYN24G.3 and FYN24G.4 correspond to CNL-B9 and CNL-B10, respectively (Fig. 1; Table 2). An EST hit (FE897117) was identified that matched the N-terminal part of CNL-B8 (Table 2). CNL-B8, CNL-B9, and CNL-B11 have no introns in their ORFs and correspond to full-length CNL encoding genes of 3,363, 3,384, and 3,381 bp, respectively. These three CNLs harbor 21 Leucine Rich Repeats (LRRs) while the truncated CNL-B10* gene (2,792 bp) is composed of only 13 LRRs.

Table 2 Annotation overview of the 105-kb sequence of the FY-N24 bacterial artificial chromosome (BAC) clone from Phaseolus vulgaris (BAT93)
Fig. 1

Schematic representation of the annotation of the sequence of the FY-N24 Phaseolus vulgaris BAT93 BAC clone (GenBank accession EU931620). The description of each putative gene is presented in Table 2. Arrowheads indicate the transcriptional orientation of the genes. Drawn to scale

Three regions containing similarities with known class I retroelements were found: FYN24G.2 is similar to a P. vulgaris TY-1/Copia like LTR retroelement AY533394, FYN24G.5 is similar to a Glycine max TY-3/Gypsy like retroelement polyprotein AAO23078, and FYN24G.11 is similar to a Medicago truncatula putative non-LTR retroelement RNA-directed DNA polymerase ABE85417. Apart from FYN24G.12, which is highly similar to the bean BA2 gene of unknown function (Melotto et al. 2004), and FYN24G.8, which is only a GeneMark.hmm prediction, the remaining eight predicted genes are similar to plant genes with known functions. FYN24G.13 and FYN24G.14 are predicted to be basic blue protein-like encoding genes; FYN24G.9 is similar to the N-terminal region of an NADH dehydrogenase. The P. vulgaris EST FD791375 could be aligned with the N-terminal region of the FYN24G.9 gene. FYN24G.7 and FYN24G.10 are similar to F-box proteins. Finally, a small cluster of three intact NAD-dependent formate dehydrogenase (FDH) genes (FYN24G.15, FYN24G.16 and FYN24G.17) was identified. FYN24G.15 and FYN24G.16 are separated by 1,287 bp and oriented in the same direction, while FYN24G.17, located 4,062 bp away from FYN24G.16, is oriented in the opposite direction (head to head). The structure of the three FDH genes was established by taking into account the 17, 3, and 5 Phaseolus ESTs assigned to genes FYN24G.15, FYN24G.16, and FYN24G.17, respectively (Table 2).

Formate dehydrogenase gene structure

To confirm common bean FDH gene structure predictions, near-full-length cDNA sequences of the BAT93 genotype were obtained by RT-PCR as described in the experimental procedure and sequenced. Three distinct transcripts corresponding to the three FDH genes (FYN24G.15, FYN24G.16 and FYN24G.17) were obtained. Nucleic acid alignments between the near-full-length cDNA and their corresponding genomic sequences confirm the predicted FDH gene structure. Six exons and five introns were identified, with introns occurring at the same positions for the three bean FDHs (Fig. 2). The most striking difference was the size of the first intron, which was 1,568, 133, and 327 bp for FYN24G.15, FYN24G.16, and FYN24G.17, respectively (Fig. 2). In all, the nucleic acid coding sequences were 1,161, 1,146, and 1,146 bp for FYN24G.15, FYN24G.16, and FYN24G.17, respectively. At the nucleotide level, the coding sequences of FYN24G.16 and FYN24G.17 are very similar, with 99% identity (only 10 bp distinguish FYN24G.16 and FYN24G.17 coding sequences). The nucleic acid coding sequence of FYN24G.15 shares 87% identity with either FYN24G.16 or FYN24G.17. The putative amino acid sequences encoded by FYN24G.16 and FYN24G.17 share 100% identity and FYN24G.15 shares 95% identity with either FYN24G.16 or FYN24G.17.
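The relation between the 10 differing positions and the reported 99% nucleotide identity can be checked with a short Python sketch. It assumes an ungapped alignment over the full 1,146-bp coding sequence, which is a simplification: bl2seq reports identity over the aligned region, so the published figure may have been computed slightly differently. The function name is hypothetical.

```python
def percent_identity(aligned_length, differences):
    """Percentage of identical positions in an ungapped alignment."""
    return 100.0 * (aligned_length - differences) / aligned_length


# FYN24G.16 versus FYN24G.17: 1,146-bp coding sequences, 10 bp differ
identity = percent_identity(1146, 10)
print(round(identity, 1))  # 99.1
```

Because all 10 differences leave the encoded protein unchanged (100% amino acid identity), they are necessarily synonymous substitutions.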

Fig. 2

Exon-intron structure of three formate dehydrogenase (FYN24G.15, FYN24G.16 and FYN24G.17) genes from Phaseolus vulgaris. Exons (E1–E6) and introns (I1–I5) are numbered sequentially from left to right and are represented by black boxes and black lines, respectively. Gray boxes indicate putative signal peptides. Size in base pairs of each exon and intron is indicated in brackets. Drawn to scale

To assess the FDH gene copy number in the P. vulgaris genome, a Southern-blot analysis was carried out (data not shown), demonstrating that only three FDH encoding genes are present in the P. vulgaris genome for both genotypes tested (BAT93: Mesoamerican; JaloEEP558: Andean). Together, these data show that three highly similar FDH sequences are present in the bean genome of BAT93 and that these three tightly linked FDH sequences are located near a cluster of CNL sequences at the end of linkage group B4.

Genome-wide exploration of FDH genes

In order to investigate the evolution of the NAD-dependent FDH gene family, we produced a comprehensive dataset of FDH sequences. FDH gene-coding sequences were searched in the GenBank (GB) non-redundant (NR) database, in the Glycine max Whole Genome Shotgun (WGS) sequence and in the Plant Genome database (PlantGDB) EST clusters. To complete this dataset with sequences from incompletely sequenced genomes, the full-length coding sequence of FDH genes was also deduced from manually established EST contigs (see “Materials and methods”) (Supplementary Table 1). Collectively, including the three present common bean FDH sequences, a total of 36 FDH sequences were identified: 33 plant FDH genes from angiosperms, gymnosperms and moss, indicating that FDHs are widespread throughout land plants, and FDH sequences from bacteria (Pseudomonas syringae), yeast (Saccharomyces cerevisiae), and fungi (Neurospora crassa). The identification of NAD-dependent FDH sequences in plants, but also in diverse organisms such as bacteria, yeast and fungi, indicates that NAD-dependent FDH arose early in evolution. In all the genomes of species fully sequenced or close to being completed, the number of FDH genes is low: A. thaliana (1), Medicago truncatula (1), Oryza sativa (2), Glycine max (3), Populus trichocarpa (1), Vitis vinifera (1), N. crassa (1), S. cerevisiae (2), demonstrating that FDH is a low copy gene family. In rice, the two FDH genes are separated by 24 kb on chromosome 6, and in soybean, the three FDHs identified belong to two distinct chromosomes: GmaFDH03 is located on chromosome 13 (27,112,435 to 27,116,473 bp), while GmaFDH01 (824,250 to 828,124 bp) and GmaFDH02 (829,852 to 833,359 bp) are separated by 1,728 bp on chromosome 19. At the amino acid level, alignment of the 36 FDH sequences from various taxa [bacteria (1, P. syringae), fungi (1, N. crassa), yeast (1, S. cerevisiae), moss (1), monocots (7), and dicots (25)] revealed a limited level of sequence diversity except for the putative signal peptide (Supplementary Fig. 2).
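The spacing between the two soybean FDH genes on chromosome 19 follows directly from the coordinates given above, as the following Python sketch shows. The function and variable names are hypothetical; the distance is taken as the difference between the start of the downstream gene and the end of the upstream gene, which is the convention that reproduces the value reported in the text.

```python
def gap_between(upstream, downstream):
    """Distance in bp from the end of one interval to the start of the next.

    Each interval is a (start, end) tuple on the same chromosome.
    """
    return downstream[0] - upstream[1]


# Coordinates on soybean chromosome 19, as given in the text
gma_fdh01 = (824_250, 828_124)
gma_fdh02 = (829_852, 833_359)

print(gap_between(gma_fdh01, gma_fdh02))  # 1728
```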

Formate dehydrogenase signal peptide analysis

In order to assess the subcellular localisation of FDHs, we used three prediction programs (Predotar, TargetP and MitoProt) on the 33 plant FDH protein sequences as well as on the S. cerevisiae and N. crassa protein sequences (Table 3).

Table 3 Predotar, TargetP, and MitoProt predictions for the ability of various presequences to target FDH of various species to mitochondria or chloroplast

For most plant sequences, all three programs gave consistent results, with high scores predicting plant FDH as mitochondrial and low scores for chloroplast localization (Predotar). In particular, the three bean presequences obtain high scores for a mitochondrial localization with all three programs (0.53 for FYN24G.16 and FYN24G.17, and 0.68 for FYN24G.15 with Predotar; 0.6, 0.6 and 0.439, respectively, with TargetP; and 0.974, 0.974, and 0.995, respectively, with MitoProt). It is worth noting that although Predotar gives extremely low scores for a chloroplast localization of FDHs from most species, TargetP gives quite high scores, but always lower than for a mitochondrial localization. In particular, the Predotar scores for chloroplast localization for the three bean FDHs are 0.04 for FYN24G.15 and 0.05 for FYN24G.16 and FYN24G.17, and the TargetP scores are 0.28 for FYN24G.15 and 0.43 for FYN24G.16 and FYN24G.17. This is probably due to the criteria used in the design of the neural networks underlying each program, particularly the number of serines present in the presequence. For the S. cerevisiae and N. crassa sequences, the scores associated with mitochondrial localization were low with the two programs tested, strongly suggesting a cytosolic localisation of FDH in these two organisms.

Phylogenetic analysis of plant FDH genes

Nucleotide coding sequences were used to conduct the phylogenetic analysis because of high conservation, and therefore uninformative alignments, at the amino acid level between several sequences. For example, bean FDH FYN24G.16 and FYN24G.17 presented 100% identity at the amino acid level. Bacterial, yeast, and fungal FDH sequences had to be left out of the analysis because they are too divergent to produce reliable nucleotide sequence alignments. As the basis of the phylogenetic tree construction, we chose the entire FDH gene coding sequence except the putative signal peptide region (75 bp) and 21 additional bp (taking StuFDH01 as a reference sequence), which are too divergent (Supplementary Fig. 2). We defined clades as subtrees containing at least two sequences with greater than 70% bootstrap support.

The phylogenetic tree shows five well-supported, distinct clades, each corresponding to a plant family (Fig. 3). Generally, the plant FDH phylogeny reflects species phylogeny. The Poaceae clade contains FDH sequences from Triticum aestivum, Hordeum vulgare, Brachypodium distachyon, Oryza sativa, Sorghum bicolor, and Zea mays. The Taraxacum officinale and Helianthus annuus FDH sequences are part of the Asteraceae clade. The Solanaceae clade is composed of FDH sequences from Lycopersicon esculentum, Solanum tuberosum, and Nicotiana tabacum. The Brassicaceae clade is composed of three FDH sequences from Arabidopsis thaliana, Brassica oleracea, and B. napus. Finally, the last clade supported by an elevated bootstrap value (100%) is the Legume clade with the P. vulgaris, G. max, and Medicago truncatula FDH sequences. In agreement with the phylogenetic relationships between these three Legume species (Doyle and Luckow 2003), bean and soybean FDH belong to a well-supported subclade (bootstrap value 100%), while the M. truncatula FDH sequence (MtrFDH01) is more distantly related. The fact that the three bean and the three G. max FDH sequences belong to the same subclade (bootstrap value 100%), while a single FDH has been identified in M. truncatula, suggests that the amplification of FDH sequences in the bean and soybean genomes occurred after divergence of Medicago and the bean/soybean common ancestor. Two G. max FDH (GmaFDH02, chromosome 19; GmaFDH03, chromosome 13) and one bean FDH (FYN24G.15) belong to the same subclade (bootstrap value 89%), while GmaFDH01 (chromosome 19) and the two other bean FDH (FYN24G.16 and FYN24G.17) belong to another subclade with a high bootstrap value (100%). FYN24G.16 and FYN24G.17 are sisters within a strongly supported subclade (bootstrap value 100%).

Fig. 3

Bayesian phylogenetic tree of 33 plant FDH nucleotide sequences. Numbers on the branches represent posterior probabilities after running a Markov chain Monte Carlo search for 10,000,000 generations. Sequences used in this tree are reported in Supplementary Table 1. The Glycine max chromosome (13 or 19) on which each soybean FDH has been identified is indicated

Expression analysis of bean FDH genes

To determine whether the common bean FDH encoding genes have distinct expression patterns, quantitative RT-PCR experiments were carried out. Two primer pairs specific for FYN24G.15 and FYN24G.16, respectively, were defined for quantitative RT-PCR experiments. Given the high nucleotide identity between FYN24G.16 and FYN24G.17 (99%), no specific primers could be designed for FYN24G.17. Time-course analyses during either abiotic stress (dark treatment) or biotic stress (inoculation with the fungus Colletotrichum lindemuthianum) were performed. Two types of pathogen infections were performed: either with the C. lindemuthianum C531 incompatible strain (bean BAT93 genotype resistant) or with the M126 compatible strain (bean BAT93 genotype susceptible).

Dark treatment resulted in a 4.5-fold increase in FYN24G.15 mRNA levels after 72 h, while no change in FYN24G.16 mRNA levels was detected (Fig. 4a). A dark-to-light shift after 72 h of dark exposure resulted in a decrease in FYN24G.15 transcript accumulation, confirming that the 4.5-fold increase in FYN24G.15 mRNA levels is due to the dark treatment. After inoculation with the incompatible C. lindemuthianum strain, FYN24G.16 transcript levels increased approximately fivefold (72 h), while no variation in FYN24G.15 mRNA levels was detected (Fig. 4b). For the other treatment tested (compatible interaction after inoculation with C. lindemuthianum), no significant change in FYN24G.15 or FYN24G.16 transcript levels was detected (Fig. 4c). In conclusion, these RT-qPCR experiments revealed that FYN24G.15 and FYN24G.16 are differentially up-regulated in common bean seedlings subjected to biotic (incompatible interaction with C. lindemuthianum) and abiotic (dark stress) stresses.

Fig. 4

Expression analysis by real-time qPCR of two common bean FDH genes (FYN24G.15 and FYN24G.16) after dark stress (a), incompatible interaction with the fungus Colletotrichum lindemuthianum (b) and compatible interaction with C. lindemuthianum (c). Quantification of gene expression was performed using the comparative Ct method (Livak and Schmittgen 2001). Data are presented as fold induction in treated relative to control conditions. The dark stress was performed for 5 days, followed by a dark-to-light shift for 2 days
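
As an illustration of how a fold induction is obtained with the comparative Ct method, the following minimal sketch computes 2^(-ΔΔCt) from four Ct values. It assumes approximately 100% amplification efficiency for both primer pairs and a single reference gene; the function name, the reference gene and the Ct values are hypothetical and are not data from this study.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Assumes ~100% amplification efficiency for target and reference genes.
    """
    # Normalise the target gene to the reference gene in each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare treated with control, then convert cycles to a fold change.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values for illustration only (not measured in this work).
example = fold_change(ct_target_treated=24.0, ct_ref_treated=18.0,
                      ct_target_control=26.0, ct_ref_control=18.0)
print(example)  # 4.0, i.e. a fourfold induction in the treated sample
```

A value of 1.0 corresponds to no change between treated and control conditions.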

Discussion

FDH is a low-copy gene family in plant genomes, and three common bean FDH genes are located in a cluster of CNL sequences

In this paper, we present the results of the sequence annotation of a 105-kb BAC clone from P. vulgaris encompassing a CNL-gene-rich region of the B4 R gene cluster located at the end of linkage group B4 (David et al. 2008; Geffroy et al. 2008; Geffroy et al. 2009). Only ten common bean BAC clones have been annotated so far (D’Ovidio et al. 2004; Melotto et al. 2004; Kami et al. 2006; Innes et al. 2008), and no gene prediction program specifically trained for P. vulgaris is yet available. In the present study, we developed a semi-automatic annotation procedure using a combination of prediction and comparative tools. In particular, we used the information derived from the growing number of ESTs available for Phaseolus [P. vulgaris (83,348), P. coccineus (20,120), P. acutifolius (739), and P. angustissimus (6,672)] to complement the ab initio gene-model predictions of FGENESH and GeneMark.hmm (Table 2). In addition to four CNL-encoding genes, we identified three FDH-encoding genes clustered in less than 13 kb (Fig. 1; Table 2). ESTs were very helpful in updating the gene model for these three FDH genes, each containing six exons and five introns (Fig. 2). Bioinformatic predictions were confirmed by near-full-length cDNA sequencing. We also conducted a comprehensive search of plant FDH sequences in databases. One important aspect of our work was the use of EST information to increase the number of complete FDH sequences, especially from poorly sampled genomes. Indeed, out of 36 complete FDH sequences, half came from EST clustering, either from PlantGDB or from manually established EST clusters. Consequently, as already demonstrated in other studies, EST databases provide an essential resource for both genome annotation and gene identification from non-sequenced genomes for phylogenetic or diversity analyses (Zhu et al. 2002; Rossi et al. 2003; Radwan et al. 2008).

The current database searches clearly demonstrated that FDH is a low-copy gene in plant genomes, since 33 FDH sequences were identified from 27 plant species. In the soybean genome, we identified two chromosomes (13 and 19) containing FDH genes. G. max diverged from P. vulgaris approximately 20 million years ago (Mya) (Lavin et al. 2005; Shoemaker et al. 2006), and has undergone a subsequent polyploidy event. Since FDH is a low-copy gene in plant genomes, these results suggest that the two FDH-containing regions are homoeologous regions of the soybean genome. They also suggest that these two chromosome regions correspond to the orthologous regions of the bean B4 R gene cluster. Concerning the two model legume species (Young et al. 2005; Cannon et al. 2006; Sato et al. 2008), a partial FDH sequence (partial because of its location at the end of a BAC clone) has been identified in L. japonicus contig “CM0201”, confirming that this contig corresponds to the putative orthologous region of the bean B4 R gene cluster (David et al. 2008). This L. japonicus FDH sequence was not included in the present study, which is based on complete plant FDH sequences. In M. truncatula, we identified a single FDH sequence derived from EST clustering. So far, no M. truncatula BAC clone containing an FDH gene is publicly available, suggesting that in M. truncatula the orthologous region of the B4 R gene cluster has not yet been completely sequenced.

Our phylogenetic results (Fig. 3) are consistent with two duplication events for bean FDH. A first duplication event, leading to FYN24G.15 and the common ancestor of FYN24G.16 and FYN24G.17, probably occurred after the divergence of M. truncatula from the common ancestor of G. max and P. vulgaris, estimated at ~54 Mya (Cronk et al. 2006). A second, more recent duplication event, leading to FYN24G.16 and FYN24G.17, probably occurred after the divergence of P. vulgaris and G. max (~20 Mya). This second event corresponds to an inverted duplication. As mentioned earlier, since G. max is a polyploid species, this scenario implies the loss of one FDH on soybean chromosome 13. Inverted duplication events from a founder gene result in either a tail-to-tail or a head-to-head orientation (the latter being the case for FYN24G.16 and FYN24G.17). Interestingly, this evolutionary mechanism has been found for gene families other than bean FDHs and has been demonstrated to be involved in microRNA biogenesis (Allen et al. 2004). For example, in Arabidopsis thaliana, two miRNA genes, MIR161 and MIR163, evolved relatively recently by inverted duplication events from their target genes (Allen et al. 2004).

Prediction programs are in favor of the ‘mitochondrial only’ localization of plant FDH

Sequence comparisons of the 36 FDH sequences identified in our database search revealed that they are well conserved (Supplementary Fig. 2). The conservation of Thr76 and Thr333 across the 36 FDH sequences is striking, the only exception being Thr333, which is absent from the Neurospora crassa FDH NcrFD01. In potato tubers, FDH has been shown to be phosphorylated on residues Thr76 and Thr333 (Bykova et al. 2003). This suggests that phosphorylation could be a general mechanism for the regulation of NAD-dependent FDH proteins.

In fact, sequence comparison of the 36 identified FDH sequences revealed that most of the nucleotide variation resides in the amino-terminal region of the coding sequence, which corresponds to the predicted signal peptide (Supplementary Fig. 2). Signal peptides, which target a protein to a particular subcellular compartment, are not well conserved among proteins and organisms, but share common characteristics (Hurt et al. 1985; von Heijne 1986; Hartl et al. 1989). For example, mitochondrial precursors generally contain their targeting information in a cleavable N-terminal presequence, which is rich in hydroxylated and positively charged residues and generally forms an N-terminal amphiphilic α-helix. In yeasts and fungi, FDH is localized in the cytosol (van Dijken et al. 1976; Yurimoto et al. 2003) and this localization is confirmed by the present in silico predictions (Table 3). In higher plants, FDH is encoded by nuclear genes and imported into the mitochondrion, as demonstrated in S. tuberosum by antibody recognition of the protein in mitochondrial fractions (Colas des Francs-Small et al. 1993) and GFP targeting (Ambard-Bretteville et al. 2003a). In A. thaliana only, a dual localization of FDH in the chloroplast as well as in mitochondria has been proposed (Olson et al. 2000; Herman et al. 2002), but has never been shown by GFP targeting, as seen in the Arabidopsis SUBcellular database [SUBA, (Heazlewood et al. 2007)]. Furthermore, two proteomic studies report identification of FDH as well as other very abundant mitochondrial proteins in chloroplastic fractions of A. thaliana, and stress that they can only be contaminants (Friso et al. 2004; Kleffmann et al. 2004). In the present study, we present the results obtained using three independent prediction programs (Predotar, TargetP and Mitoprot) on a large set of 33 plant FDHs. All the predictions are in favor of the ‘mitochondrial only’ localization of FDH, even in A. thaliana, strongly suggesting that the ‘mitochondrial only’ localization is the rule in higher plants. Ambard-Bretteville et al. (2003a) suggested that the first three residues of the presequence are very important for mitochondrial import, since the mutation of a single amino acid in the potato FDH presequence was sufficient to strongly inhibit import, and the deletion of the first two residues totally abolished it. They also observed that presequences bearing the “MAM” motif in positions 1–3 led to very high scores for mitochondrial import predictions. In the present work, the recurrence of the conserved “MAM” motif at the beginning of most of the plant FDH presequences is striking (found in 25 out of 33 plant FDH sequences).
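
A count of this kind can be sketched as follows. This is only an illustration of the tallying step, assuming the protein sequences are available as a mapping from identifier to amino acid sequence; the function name, identifiers and sequences below are invented for the example and are not the FDH sequences analysed here.

```python
def count_motif_prefix(sequences, motif="MAM"):
    """Count protein sequences whose first residues match the given motif."""
    return sum(
        1 for seq in sequences.values()
        if seq.upper().startswith(motif)
    )


# Hypothetical toy presequences (placeholders, not real FDH sequences).
toy_sequences = {
    "seq_a": "MAMRRAASSA",
    "seq_b": "MAMSRVASTA",
    "seq_c": "MASRRLLAAA",
}

n_with_motif = count_motif_prefix(toy_sequences)
print(f"{n_with_motif} of {len(toy_sequences)} sequences start with MAM")
```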

Why is plant FDH localized in mitochondria, since formate can be produced in all cell compartments (Hourton-Cabassa et al. 1998; Igamberdiev et al. 1999)? Although formate is beneficial as a C1-unit donor (folate synthesis), it is also known to be toxic to plants at concentrations from 4 mM added to the culture medium (Li et al. 2002). Although 4 mM seems to be a high concentration, it is not dissimilar to the Km of plant FDHs for formate, which has been reported to be 0.5–2 mM (Uotila and Koivusalo 1979; Farinelli et al. 1983; Li et al. 2000; Ambard-Bretteville et al. 2003b). This formate toxicity could be crucial to mitochondria, since formate has been reported to inhibit cytochrome-c oxidase and respiration (Nicholls 1975). Therefore, we propose that FDH might be localized inside mitochondria to regulate the intra-mitochondrial concentration of formate and protect the mitochondria from the deleterious effects of high concentrations of this compound, which are thought to arise under stress conditions (Ambard-Bretteville et al. 2003b).
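
To put the comparison between 4 mM formate and the reported Km range in perspective, the sketch below computes the fraction of maximal velocity expected under simple Michaelis–Menten kinetics, v/Vmax = [S]/(Km + [S]). This is an assumption made for illustration: it treats formate as the only limiting substrate and uses the concentration added to the culture medium, which is not necessarily the intra-mitochondrial concentration. The function name is hypothetical.

```python
def fractional_saturation(substrate_mM, km_mM):
    """Fraction of Vmax under Michaelis-Menten kinetics: S / (Km + S)."""
    return substrate_mM / (km_mM + substrate_mM)


# Reported Km range for plant FDHs (0.5-2 mM) evaluated at 4 mM formate.
for km in (0.5, 2.0):
    fraction = fractional_saturation(4.0, km)
    print(f"Km = {km} mM: v/Vmax = {fraction:.2f}")
```

Under these assumptions, 4 mM formate would correspond to roughly 67–89% of the maximal rate, that is, a concentration at which the enzyme would be operating close to saturation.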

Two nearly identical FDH genes are differentially up-regulated in common bean seedlings subjected to biotic and abiotic stresses

In most higher plants, FDH transcripts are scarce in unstressed leaves, but accumulate in response to numerous abiotic stresses. The first evidence for this was described in S. tuberosum leaves subjected to abiotic stresses such as drought, osmotic stress, cold, hypoxia, prolonged darkness and wounding, as well as foliar application of metabolites (formate, methanol, ethanol…) (Hourton-Cabassa et al. 1998). Similar observations were made in Arabidopsis thaliana, where FDH was shown to be up-regulated after dark, cold, drought and heat treatments, and after wounding (Li et al. 2001, 2002). In P. vulgaris, we detected an increase in FYN24G.15 transcript levels during prolonged darkness (72 h), and an accumulation of FYN24G.16 transcripts during incompatible interaction with C. lindemuthianum (72 h) (Fig. 4). Interestingly, this time point (72 h post inoculation) corresponds to the transition between the biotrophic and necrotrophic phases, a key step in the infection process of the hemibiotrophic fungus C. lindemuthianum (Dufresne et al. 2000). Although we know that all three bean FDH genes are transcriptionally active (cDNA analysis), we were not able to test FYN24G.17 in qPCR experiments since no specific primers could be designed for that gene. Our results constitute an example of gene duplication and subsequent divergence, because two genes, FYN24G.15 and FYN24G.16, with 87% nucleotide identity, are differentially up-regulated. During the duplication events, flanking sequences were duplicated together with the FDH-coding sequences (Supplementary Fig. 3). FYN24G.15 and FYN24G.16, which present distinct expression patterns, also show the highest nucleotide variation in their 5′ flanking regions. Our results not only confirm FDH transcript accumulation in response to abiotic stresses, but also constitute a first report in the literature of FDH transcript accumulation in response to a biotic stress (infection with a fungus).
This is backed up by transcriptome data in publicly available databases such as Genevestigator (https://www.genevestigator.ethz.ch), which reveal a high accumulation of FDH transcripts during infection of Arabidopsis thaliana by numerous pathogens (for example Botrytis cinerea, Phytophthora infestans, Pseudomonas syringae). The physiological role of FDH proteins in plants during biotic and abiotic stresses is not yet fully understood. Formate, the substrate of FDH, can arise from various pathways including photorespiration, cell wall degradation or synthesis, and glycolysis (Igamberdiev et al. 1999). In stressed plants, formate biosynthesis may result from enhancement of any of these pathways. As hypothesised by Hourton-Cabassa et al. (1998), formate could be the signal for the accumulation of FDH transcripts. Indeed, since formate accumulation is known to have deleterious effects in mitochondria at high concentrations (Nicholls 1975), FDH might protect mitochondria from these harmful effects. As we detected bean FDH transcript accumulation after both dark treatment and incompatible infection with C. lindemuthianum, our study constitutes an additional example of a connection between biotic and abiotic stresses (Fujita et al. 2006; Sudo et al. 2008). The mechanisms that underlie connections between signaling pathways remain unclear and further work is needed to clarify these connection points, but formate could be a good candidate for a signaling molecule involved in both pathways.