Introduction

The barley genome is diploid (2n=2x=14) and its estimated size is in a range between 4,800 and 5,400 Mb (Arumuganathan and Earle 1991). In this species, mutant map-based cloning is hindered by the low degree of polymorphism (reviewed in Castiglioni et al. 1998).

Comparative genetic studies across species based on mapping of RFLP markers and of traits such as plant height, flowering time and shattering (Gale and Devos 1998) reveal syntenous conservation in the order of genes and markers along grass chromosomes (Devos 2005). In addition, the availability of the genomic sequence of rice has stimulated comparative analyses with Triticeae genomes (Bennetzen and Ma 2003; Sorrells et al. 2003; Conley et al. 2004; Miftahudin et al. 2004; Peng et al. 2004). Micro-colinearity with rice has been confirmed for barley and wheat (Ramakrishna et al. 2002; Yan et al. 2003; Chantret et al. 2004), although DNA rearrangements have also been observed (Li and Gill 2002; Bennetzen and Ma 2003; Brunner et al. 2003).

Positional cloning is facilitated by the consideration of candidate genes (CGs), as well as markers linked to the target locus in syntenous species. This strategy recently led to the identification of the gene underlying the barren stalk1 mutant in maize (Gallavotti et al. 2004). Rice genome data have already assisted, for example, the investigation of synteny relationships and in silico mining of CGs for the Ph2 locus of wheat (Sutton et al. 2003), the sdw3 dwarfing mutation of barley (Gottwald et al. 2004) and the Z self-incompatibility locus of rye (Hackauf and Wehling 2005).

Developmental mutants have been extensively investigated in several plant species. Genes involved in leaf development (reviewed in Pozzi et al. 2001; Tsiantis and Hay 2003), shoot (reviewed in Ward and Leyser 2004) and inflorescence branching (Chuck et al. 2002; Komatsu et al. 2003; Gallavotti et al. 2004) have been cloned and their mutations associated with phenotypic effects. In barley, only a few genes have been isolated following this approach (Müller et al. 1995; Helliwell et al. 2001; Chandler et al. 2002; Chono et al. 2003), although the species has the advantage of a rich collection of monogenic mutants altering plant height and architecture (e.g. brachytic, brh1; brachytic dwarf, brh2; many noded dwarf 6, mnd6; slender dwarf 4, sld4; uniculm-2, -3, -5, -15, -16, cul-2, -3, -5, -15, -16), leaves and leaf-like organs (e.g. liguleless, lig; calcaroides-b19, -C15, -d4, cal-b19, -C15, -d4; short awn 5, lks5; triple awned lemma, trp; third outer glume, trd; Hooded, Kap; suppressor of Hooded D-25, E-74, F-76, suK D-25, E-74, F-76) and inflorescence morphology (e.g. absent lower laterals, als; branched 1, brc1; double seed 1, dub1; six-rowed spike 1, vrs1). These loci have been positioned in a genetic map based on molecular markers (Castiglioni et al. 1998; Pozzi et al. 2000, 2003; Roig et al. 2004). This paper presents a synteny-based strategy that exploits the rice genome sequence to identify CGs for them. The approach has made it possible to identify rice orthologous regions for 23 barley loci. In a subsequent step, the regions were scanned for CGs following the annotation of rice genomic sequences. For the liguleless (lig) and branched1 (brc1) mutants, the analysis has provided strong circumstantial evidence in favour of a specific candidate gene–mutant association.

Materials and methods

DNA sequences linked to the mutants of interest

Barley mutants assigned to linkage subgroups were considered (Castiglioni et al. 1998; Pozzi et al. 2000, 2003; Roig et al. 2004). Based on their location, relevant regions on barley linkage maps (summarized in GrainGenes http://wheat.pw.usda.gov/GG2/index.shtml) were identified. For each mutant, a list of markers located in the region of interest was compiled and the corresponding DNA sequences obtained from the GrainGenes database.

In silico mapping of DNA sequences on the rice genome and identification of barley-rice syntenous regions

Sequenced markers were mapped on the rice genome by BLASTn similarity searches (websites: Gramene http://www.gramene.org/; The Institute for Genomic Research, TIGR http://www.tigr.org/tdb/e2k1/osa1/, August 2003–August 2004). In some cases, throughput was increased by using Python scripts to run the National Center for Biotechnology Information (NCBI) standalone BLASTn package against rice pseudomolecules downloaded from the TIGR website on 14 May 2004 (release 2). BLASTn settings were those of the TIGR BLAST server (http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1). Results were organized in Excel spreadsheets. An E value threshold of 10^−10 was adopted to claim a significant match. In some cases, hits with 10^−10 < E value < 10^−4 were considered when consistent with synteny results. Markers with single BLAST hits were assigned to a chromosomal position. For multiple BLAST hits, chromosomal attribution was based on the most conservative interpretation of the data, assigning the markers to the rice chromosome with the best colinearity with the barley region. Some markers with multiple BLAST hits were discarded to simplify analyses and avoid identification of artefactual syntenous regions. Rice chromosomes hosting at least three markers mapping to positions consistent with those of the barley map were further evaluated for synteny. Syntenous regions were finally defined based on the number and relative order of markers supporting colinearity with the barley target region.
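The filtering rules described above can be illustrated with a minimal Python sketch. It is not the script used in this study: the function names, the tuple layout of a hit and the marker names in the usage note are hypothetical, and treating an E value exactly equal to 10^−10 as significant is an assumption. The sketch covers only the E value classes and the three-marker rule; checking marker order against the barley map is left out.

```python
from collections import defaultdict

STRICT_E = 1e-10   # significance threshold stated in the text
RELAXED_E = 1e-4   # upper bound for hits accepted only when they agree with synteny
MIN_MARKERS = 3    # minimum number of markers required per rice chromosome


def classify_hit(evalue):
    """Return 'significant', 'conditional' or 'rejected' for one BLASTn E value."""
    if evalue <= STRICT_E:      # assumption: the threshold itself is included
        return "significant"
    if evalue < RELAXED_E:
        return "conditional"    # usable only if consistent with synteny results
    return "rejected"


def candidate_chromosomes(hits):
    """Group significant hits by rice chromosome.

    hits: iterable of (marker, rice_chromosome, position_bp, evalue) tuples.
    Returns {chromosome: sorted list of (position_bp, marker)} restricted to
    chromosomes hit by at least MIN_MARKERS distinct markers.
    """
    by_chrom = defaultdict(dict)
    for marker, chrom, pos, evalue in hits:
        if classify_hit(evalue) == "significant":
            # keep the first significant hit seen for each marker and chromosome
            by_chrom[chrom].setdefault(marker, pos)
    return {
        chrom: sorted((pos, marker) for marker, pos in markers.items())
        for chrom, markers in by_chrom.items()
        if len(markers) >= MIN_MARKERS
    }
```

For example, with invented markers m1 to m3 giving significant hits on chromosome 7 and a single marker m4 hitting chromosome 3, only chromosome 7 is retained for further evaluation.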

Annotation and identification of CGs

Based on rice genome assemblies available through Gramene (http://www.gramene.org/) and TIGR (http://www.tigr.org/tdb/e2k1/osa1/pseudomolecules/info.shtml), a contig of BAC/PAC genomic clones covering the rice region of interest was identified and annotation of the genomic sequence of each clone was obtained. Where possible, priority was given to annotations released by the International Rice Genome Sequencing Project (IRGSP) through NCBI. In some cases, the annotation included a potential function for the predicted CDS, based on homology to characterized genes. When a predicted CDS had no functional annotation, batch BLAST searches were conducted at http://www.bio.ifom-firc.it/BLAST/index.shtml to reveal similarities. For unannotated clones, the full sequence was obtained and annotated through the Rice Genome Automated Annotation System (RiceGAAS, http://www.ricegaas.dna.affrc.go.jp/). In addition to NCBI and RiceGAAS, the rice genome automated annotation available from TIGR was considered (http://www.tigr.org/tdb/e2k1/osa1/irgsp.shtml).

In selecting CGs, we considered present knowledge of the genetic, biochemical and molecular bases of the traits studied in model species, where genes and signal molecules have been identified as key regulators. Gibberellins, brassinosteroids and related genes were considered for plant height. Leaf development and inflorescence branching are also under hormonal control as well as dependent on several classes of transcription factors (McSteen et al. 2000; Hay et al. 2004; Ward and Leyser 2004). In foliar development, relevance has been attributed to knox genes (reviewed in Pozzi et al. 2001; Tsiantis and Hay 2003) and their genetic interactors, i.e. genes from the LOB, MYB, homeobox, YABBY and NAC families, as well as genes involved in chromatin regulation and gene silencing (reviewed in Wagner 2003).

In summary, CGs were considered when, besides mapping in syntenous positions to mutants of interest, they had the following characteristics: (1) genes encoding regulatory proteins with a role in developmental processes/traits correlated with the phenotype of our mutants; (2) gene families organised in sub-classes, easily distinguished in molecular analyses; (3) genes encoding proteins involved in metabolism and signalling of plant hormones.

Phylogeny of SBP-box genes

Amino acid sequences were retrieved through BLASTp searches using the SBP domain from the maize LG1 protein (accession O04003) as a query against the non-redundant protein database (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR rice genome database (http://www.tigrblast.tigr.org/euk-blast/index.cgi?project=osa1). Amino acid sequences from six maize genes (including Lg1), 16 rice genes, five Antirrhinum majus genes, 19 Arabidopsis genes and one barley gene (see below) were aligned using the programme MUSCLE with default settings (Edgar 2004). Seventy-seven unambiguously aligned positions from the SBP domain were subjected to phylogenetic analysis with the programme MrBayes (Huelsenbeck 2000) under the JTT amino acid substitution model, with one invariable and four γ-distributed variable substitution rate categories. Four incrementally heated MCMC chains were run for two million generations with trees sampled every 50 generations. Consensus trees were generated after the first 1,000 trees recovered had been discarded as “burn-in”. For bootstrap analysis, the programme Seqboot (Felsenstein 2002) was used to generate 100 resampled datasets. The programme Phyml (Guindon and Gascuel 2003) was used to analyse the resampled datasets under the same substitution model adopted for Bayesian analysis. Bootstrap proportions were calculated using the programme Consense (Felsenstein 2002).
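The resampling step of the bootstrap analysis can be sketched in a few lines of Python. This is an illustration of the general technique (sampling alignment columns with replacement), not a reimplementation of Seqboot; the function names, the dictionary layout of the alignment and the fixed random seed are assumptions introduced for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng: a random.Random instance, so that results are reproducible.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_cols = lengths.pop()
    # draw as many column indices as the alignment has columns
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Generate n_replicates resampled datasets (100, as in the text, by default)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

Each replicate has the same number of positions as the input (77 in the analysis described above); a tree is then inferred from every replicate and the trees are summarised as bootstrap proportions.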

Barley Lg1-related genomic fragment

DNA was extracted from leaves using the CTAB protocol. All PCR amplifications were carried out in a PTC-100 thermal cycler (MJ Research), using Taq polymerase and PCR reaction buffer from Invitrogen (S. Giuliano Milanese, Italy). PCR products were sequenced directly (Applied Biosystems 377) or through PRIMM (Milan, Italy). DNA sequence analysis was performed with BioEdit (Hall 1999).

For the isolation of the barley SBP domain and surrounding region from the Lg1 ortholog, degenerate primers (SIGMA-GENOSYS, Milan, Italy) were designed on conserved regions of the maize (O04003) and rice (CAE03411) LG1 genes. Degenerate primer sequences were: (for the 3′ part) LIGdeg_F4 5′-GAYGARTTYGAYGAYGCIAA-3′, LIGdeg_R1 5′-TCRAARTCIARRTCRAACAT-3′; (for the SBP domain) LIGdeg_F1 5′-GCICCIGARTAYTAYTTYCC-3′ and SBPrev 5′-TTGTGGTCTGCGAGGCGCTTCC-3′. The degenerate primer PCR reaction (25 μl) contained 25 ng barley genomic DNA (genotype Nudinka), 1.25 units Taq polymerase, 1× PCR reaction buffer, 1.5 mM MgCl2, 200 μM each dNTP, 0.5 μM each primer. Thermal cycling was carried out as follows: 94°C for 5 min, followed by 35 cycles of 94°C for 1 min, 52°C for 1 min (LIGdeg_F4–LIGdeg_R1) or 60°C for 1 min (LIGdeg_F1–SBPrev), and 72°C for 3 min, followed by a final step at 72°C for 10 min.
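As an aid to reading the primer sequences above, the following Python sketch counts how many distinct oligonucleotides each degenerate primer represents. It relies on the standard IUPAC nucleotide ambiguity codes; counting inosine (I) as a single synthesized base that does not add to the degeneracy is an assumption of this sketch, and the function name is hypothetical.

```python
# IUPAC nucleotide ambiguity codes; inosine (I) is treated as one base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "I": "I",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])  # raises KeyError on an unknown symbol
    return total


primers = {
    "LIGdeg_F4": "GAYGARTTYGAYGAYGCIAA",
    "LIGdeg_R1": "TCRAARTCIARRTCRAACAT",
    "LIGdeg_F1": "GCICCIGARTAYTAYTTYCC",
    "SBPrev": "TTGTGGTCTGCGAGGCGCTTCC",
}

for name, seq in primers.items():
    print(name, len(seq), degeneracy(seq))
```

Under these assumptions LIGdeg_F4 and LIGdeg_R1 are each 32-fold degenerate, LIGdeg_F1 is 16-fold degenerate and SBPrev is a non-degenerate 22-mer.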

Thermal asymmetric interlaced-polymerase chain reaction (TAIL-PCR) was carried out as described by Liu et al. (1995) with minor modifications. Gene specific primers used in the primary, secondary and tertiary TAIL-PCR reactions were LG1fwd 5′-GGTCTTCACCTGCAGAGCC-3′, LG2fwd 5′-TTTCCCTTCGACCTCTGCA-3′, LG3fwd 5′-CAGCTTGGGGTTCCATCA-3′, respectively.

Amplified fragments of approximately 3,300 bp purified from agarose gel were directly sequenced. A total of 1,541 bp, comprising part of the introns, were assembled in a putative barley Lg1-3′ fragment (HvLg1). The sequence was verified by reamplification and resequencing and deposited in GenBank with accession number AM117950.

A partial sequence of the HvLg1 fragment was obtained from barley cultivars Steptoe and Morex, parental genotypes of a doubled-haploid (DH) mapping population (Kleinhofs et al. 1993), after amplification with primers HvLG3′fwd 5′-GCATCCGTCCATGTTTCTCT-3′, HvLG31′rev 5′-GAAGGATGTTGCTGTGCTGA-3′ (product size of 497 bp). The fragment was mapped via heteroduplex analysis using DHPLC (WAVE, Transgenomic Inc. USA) according to a previously described procedure (Kota et al. 2001). Map Manager QTX v0.27 (Manly et al. 2001) was employed for linkage analysis in 94 individuals of the above-mentioned DH population. Recombination frequencies were converted to genetic distances in centiMorgan by applying the Kosambi function (Kosambi 1944).
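The Kosambi conversion is d = 25 ln[(1 + 2r)/(1 − 2r)], with r the recombination frequency and d the distance in cM. A minimal Python sketch follows; the function names are hypothetical, and in this study the conversion was carried out within Map Manager QTX rather than with code of this kind.

```python
import math


def kosambi_cm(r):
    """Convert a recombination frequency r (0 <= r < 0.5) to centiMorgans."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination frequency must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Inverse conversion: map distance in cM back to recombination frequency."""
    return 0.5 * math.tanh(d_cm / 50.0)
```

For small r the distance is close to 100r (for example, r = 0.10 gives about 10.1 cM), while the function grows faster than 100r as r approaches 0.5.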

Results

Bioinformatics and molecular genetic approaches were used to identify putative CGs. The procedure comprised four phases. Phase 1: a list of RFLPs located in the region of interest was compiled and the corresponding DNA sequences retrieved from public databases. Phase 2: RFLP sequences were used in BLASTn searches against the rice genome and putative rice–barley syntenous genomic regions were identified. Phase 3: relevant rice genomic regions were scanned for CGs, exploiting existing annotation from public databases or ex novo annotation based on publicly available algorithms. A further step (phase 4) was introduced for some mutants to associate specific CGs with the mutant phenotype. This step is a test for reliability of the complete procedure carried out in phases 1–3. Putative rice CG sequences derived from annotation were used to isolate barley orthologues, which were positioned on existing barley genetic maps.

Assignment of CGs to specific chromosomal regions (phases 1–3) and steps towards the validation of the method (phase 4)

Unless otherwise stated, map positions of barley mutants were obtained from Pozzi et al. (2003) while mapped markers were as in Barley Consensus 2 Map (Qi et al. 1996). The tables report all markers tested, including those that, for various reasons, do not support synteny.

Table S1 (supplementary online materials) lists for each mutant sequenced RFLPs along with their chromosomal attribution in rice. In some cases, BLASTn hits pointing to different chromosomes were detected: only chromosomes where a significant number of markers gave hits in a confined region were kept for further analyses. In spite of this, in some cases multiple hits with comparable E values still mapped to more than one chromosome. For relevant loci, Table S4 (supplementary online materials) lists the chromosomes hit by at least three markers but discarded from further analyses because marker position/order was not consistent with the barley map. Table S2 (supplementary online materials) reports the rice chromosomal intervals exhibiting the best colinearity with barley marker order: relevant RFLPs are indicated along with BLASTn E values and positions on the rice chromosome. For the loci examined, Table 1 highlights the association with putative CGs derived from annotation of rice chromosomal regions syntenous to relevant barley regions. In supplementary online materials, Table S3 reports, for each CG, the putative function/homology, together with its accession number. Phase 4 represents a tentative validation test of the CG identification procedure and aims at obtaining circumstantial evidence in favour of CG–mutant association. This has been done based on several criteria: (a) presence of developmental mutants closely resembling phenotypes of the barley mutants under study in syntenous chromosome intervals (the case of mnd6, cul2, cul-3/-5/-15/-16, brh1, brh2, data not presented); (b) association already available for a specific gene and a given mutant of rice or of another species (brc1, cal b19, cal C15); (c) same as in b and, in addition, the candidate has been cloned and mapped in barley (lig).

Table 1 Putative candidate genes identified in the rice chromosome region exhibiting synteny with the barley intervals hosting the loci of interest

Absent lower laterals (als)

The mutant gene als maps on barley chromosome 3H, linkage subgroup 28, in a region containing markers CDO419b and WG110. Thirty-two sequenced barley RFLPs were tested by BLASTn, revealing ten hits on rice chromosome 1 spanning from bp 32154753 to bp 41561095. Based on mutant mapping in barley, we restricted further analyses to the telomeric region of rice chromosome 1 from 36 to 41 Mb. Putative CGs present in this region are mentioned along with other candidate genes for the cal d4 locus.

Branched 1 (brc1)

The brc1 mutant (previously brc-5, Franckowiak and Lundqvist 2002) maps on chromosome 2H, subgroup 17, near markers MWG2067 and CDO665 (Castiglioni et al. 1998). Twenty RFLP markers were tested for BLASTn hits. Chromosomes 3 and 4 were hit by four and three markers, respectively, but were discarded from further analyses because they did not comply with our marker order/position criteria. Nine markers were located on chromosome 7 (Fig. 1) covering a region from bp 20693643 to 29220338. Combining these data with barley mapping data, we restricted the annotation to a region of rice chromosome 7 from 27 to 29 Mb. In this region, six CGs and a cluster of 11 tasselseed2-like genes were localized (Table 1). Based on phenotypic similarities, the most promising candidate is FRIZZY PANICLE (FZP), a rice gene orthologous to maize Branched Silkless1 (BD1) located at position bp 28224493 (Komatsu et al. 2003).

Fig. 1

Synteny between barley chromosome 2H and rice chromosome 7. The approximate position of the brc1 locus in barley is indicated along with the localization of the FRIZZY PANICLE (FZP) gene in rice. Distances along the rice map are in Mb

Brachytic 1 (brh1)

The mutant maps on barley chromosome 7H, subgroups 1 and 2, near marker CDO475. Thirty-three sequenced RFLP markers were tested for BLAST hits. Following the most conservative interpretation of the data, chromosomes 3 and 6 were considered for analysis. Markers hitting chromosome 6 define a region between bp 1515878 (MWG799) and 8576270 (ABG616). Consistent with their positions in the Barley Consensus 2 map, ABC255, ABC465 and ABG603 were concentrated between bp 4108200 and 4769606 and analysis was restricted to rice chromosome 6 from 1 to 5 Mb (Fig. S1). In this region, 14 genes belonging to CG families were identified (Table 1).

Brachytic dwarf (brh2) and short awn 5 (lks5)

The brh2 and lks5 mutant loci map on barley chromosome 4H, linkage subgroups 38–40, south of CDO541 and linked markers. Out of 57 sequenced RFLP markers considered for the region, 23 showed similarity to sequences present on rice pseudomolecule 3, while five markers hit chromosome 12. The consideration of barley mapping data together with the analysis of the rice syntenic region allowed us to concentrate on rice pseudomolecule 3 from 1 to 11 Mb (Fig. S2). In this region, 47 putative genes belong to candidate families (Table 1).

Calcaroides b19 (cal b19) and calcaroides C15 (cal C15)

The cal b19 and cal C15 loci map in a cluster on barley chromosome 5H, subgroups 60–63 (Pozzi et al. 2000). Fourteen sequenced RFLPs were considered in BLASTn analyses and six showed similarity with sequences on rice pseudomolecules 12 and 3. The simultaneous consideration of barley mapping data and rice syntenic analysis suggested the need to concentrate on rice pseudomolecules 12 (from 14 to 25 Mb) and 3 (from bp 25144050 to 31769971) (Fig. S3). No CGs were identified in the region located on chromosome 3. On chromosome 12, seven putative genes belonging to candidate families were identified (Table 1). A candidate gene is present around bp 22954568 showing high similarity (E value 1.4×10^−116) with rough sheath2 (rs2), a maize gene belonging to the MYB family (Tsiantis and Hay 2003).

Calcaroides d4 (cal d4)

The mutant gene cal d4 maps on barley chromosome 3H, subgroups 27 and 29 (Pozzi et al. 2000). Twenty-five sequenced RFLPs were considered: 12 of them hit rice chromosome 1, in a region comprised between bp 21399956 and 39615119. We concentrated the analysis of the rice pseudomolecule 1 between Mb 27 and 41 (Fig. S4). Here 29 putative genes belonging to CG families were identified (Table 1).

Uniculm 2 (cul2)

The cul2 locus maps on barley chromosome 6H in subgroup 54, near RFLPs cMWG679 and ABG458 (Babb and Muehlbauer 2003). Fifteen sequenced RFLPs were tested against the rice genome sequence and nine of them gave hits on rice pseudomolecule 2 (Fig. S5). Markers PSR167 and cMWG679 located to positions bp 74659 and 75443, respectively. Two regions from 22 to 29 Mb and from 1 to 2 Mb on rice chromosome 2 were further considered. Putative genes belonging to candidate gene families were spotted only in the first region (Table 1).

Uniculm -3, -5, -15, -16 (cul -3, -5, -15, -16)

These mutant loci map on barley chromosome 3H, subgroup 32, near RFLP markers CDO394a, CDO105 and the telomere. Nineteen sequenced RFLP markers were tested: five hit rice chromosome 1 and four rice chromosome 5. The region on rice pseudomolecule 1 spans from bp 25642214 to 42097950 and from bp 17239944 to 27679955 on chromosome 5. Further analyses concentrated on chromosome 1 from 41 to 44 Mb and on chromosome 5 between 16 and 18 Mb. In these two intervals, six and three putative genes belonging to candidate gene families were identified, respectively (Table 1).

Many noded dwarf 6 (mnd6)

The mnd6 locus maps on barley chromosome 5H, linkage subgroup 65 near markers WG364, CDO675B, CDO771A and WG1026. Thirty-two sequenced RFLPs were subjected to BLASTn searches. Thirteen hit chromosome 9 and 12 of them concentrated from bp 14273261 to 20593147 (Fig. S6). Four markers gave hits on chromosome 8 in an order consistent with the barley map (Fig. S6). We focused on the telomeric regions of chromosome 9 (from bp 14040010 to 21007741) and 8 (from bp 19405021 to 28267238) (Table 1). The target interval on chromosome 9 contains 36 CGs (Table 1), while the one on chromosome 8 was analysed for CGs for both mnd6 and suKE-74 and 83 CGs were identified in total (Table 1).

Slender dwarf 4 (sld4), suppressor of K E-74 (suKE-74)

The sld4 mutant maps on barley chromosome 7H, subgroup 6, near RFLPs BCD421, CDO358, WG669 and MWG808. The suKE-74 locus was positioned in an adjacent position defined by loci CDO358, CDO673 and BCD351A (Roig et al. 2004). Thirty-four sequenced RFLP markers were tested for similarity with the rice genome, 11 of which hit rice pseudomolecule 8 between 1 and 27 Mb. Four markers mapped to chromosome 6 between bp 3615437 and 22113934. Integration of barley mapping data and rice synteny data restricted the analyses to two regions spanning from Mb 0 to 9 and from 20 to 27 Mb on chromosome 8. Within the first interval, 52 CGs were identified (Table 1). CG analyses of the second interval are presented along with the mnd6 data.

Suppressor of K D-25 (suKD-25), double seed 1 (dub1)

The dub1 locus maps to barley chromosome 5H, at the north end of subgroup 67, adjacent to the suKD-25 locus (subgroups 66 and 67; Roig et al. 2004), between markers CDO504 (north) and CDO457 (south). Seventeen out of the 36 sequenced RFLP markers tested for BLASTn analysis hit rice pseudomolecule 3 (Fig. S7). Of these, four markers mapped in a region spanning from bp 4078293 to 21684735 and 12 defined the bp interval 30674394–35159201 (Fig. S7). Six markers gave hits on chromosome 1 from bp 7213822 (MWG533) to 36166562 (CDO113). Considering barley mapping results and rice synteny data, a telomeric segment on rice chromosome 3 (bp 29624972 to 36118877, Table 1) and an interval on chromosome 1 from bp 29809961 to 36166562 (Table 1, see also the results for als and cal d4) were further considered. Analysis of the region of rice chromosome 3 identified 30 CGs. The analysis of the interval identified on chromosome 1 has been included together with the cal d4 results.

Suppressor of K F-76 (suKF-76)

The suKF-76 locus maps on barley chromosome 7H, north of subgroup 5 near markers CDO36, CDO348, CDO771B and MHVCMA (Roig et al. 2004). Twenty-one sequenced RFLPs were tested and seven gave hits on rice chromosome 6 defining the interval between bp 2948716 and 8008176 (Table S2, Fig. 2). Additional markers for rice chromosome 6 were derived from the linkage region harboring, in barley, sld4 and suKE-74, corresponding to chromosome 7H, subgroup 6. Based on rice synteny analysis, the focus was kept on bp 2553427 to 12124999 of chromosome 6, where 41 CGs were identified (Table 1).

Fig. 2

Syntenic relationships between barley chromosome 7H and rice chromosomes 6 and 8. The approximate position of the suKF-76, sld4 and suKE-74 loci in barley is indicated. Rice centromere = full oval

Third outer glume (trd)

The trd locus maps on barley chromosome 1H, in subgroup 52, north of BCD304 and CDO989. Fourteen sequenced RFLPs were tested and five hit rice pseudomolecule 5. This region spans from bp 24589696 to 28177915. Barley mapping data suggested we concentrate on pseudomolecule 5 from Mb 20 to 28. Twenty-two putative genes belonging to candidate families were detected (Table 1).

Triple awned lemma (trp) and liguleless (lig)

The trp locus maps on barley chromosome 2H, subgroup 22, linked to the RFLP marker DGF41. Thirteen sequenced RFLPs were tested for similarities to rice genome sequences. Seven of them hit rice chromosome 4 in a region between bp 31282128 and 34924639. This was the same location for the lig locus, which was mapped near RFLPs BCD266 and DGF41 (Fig. 3). Based on barley mapping data, the rice genomic sequence between Mb 31 and 34 was annotated in search of CGs (Table 1), leading to the identification of a putative candidate gene for the lig mutation (CAE03411) showing 71% amino acid identity (83% similarity) to Liguleless1 (Lg1), a maize gene encoding an SBP-domain transcription factor required for ligule development (Moreno et al. 1997). Phylogenetic analysis of SBP-domain amino acid sequences from five plant species, including all available complete sequences from rice, supports the orthology of LG1 to CAE03411 (OsLg1, Fig. 4a). Pairwise alignments identified a specifically conserved region at the C-terminus of the rice and maize proteins (Fig. 4b). PCR amplification with degenerate primers designed on this region and further 3′ TAIL-PCR yielded a barley genomic fragment of approx. 3,300 bp (putative barley HvLg1 fragment, sequence deposited in GenBank with accession number AM117950). The deduced coding sequence encodes 285 amino acids showing 66% identity with the corresponding maize and rice LG1 sequences (Fig. 4b). This suggested that the isolated genomic sequence encodes the C-terminal portion of the putative barley LG1 orthologue (HvLG1). Phylogenetic analysis supports orthology of this sequence to the maize and rice Lg1 genes (Fig. 4a). As a step to map this CG and to test its correlation with the barley lig phenotype, a single nucleotide polymorphism (SNP) at bp-position 307 (T/A, Morex/Steptoe) of HvLg1 was detected by comparing the mapping parental genotypes Steptoe and Morex. Segregation analysis positioned the HvLg1 fragment on the long arm of chromosome 2H at 6.5 cM distal to MWG520b and 3.9 cM proximal to ABC157, in a position corresponding to the lig locus (Fig. 3). Thus, based on this genetic correlation, it is highly likely that HvLG1 indeed represents the 3′ portion of the barley liguleless 1 gene.

Fig. 3

Synteny between barley chromosome 2H and rice chromosome 4. The approximate position of the trp and lig loci in barley is indicated, along with the localization of the rice and barley Liguleless1 putative orthologues (OsLg1 and HvLg1, respectively)

Fig. 4

a Reconstruction of the phylogenetic relationships of 6 maize (Zm), 16 rice (Os), 19 Arabidopsis (At), 5 snapdragon (Am) and one barley SBP-box genes based on their SBP-domains. For each sequence a GenBank or TIGR accession number is indicated. b Alignment of amino acid sequences from the LG1 maize gene (O04003), the orthologous rice gene (CAE03411) and the predicted peptide derived from conceptual translation of the putative HvLg1 barley genomic fragment. Black boxes highlight identical residues, grey boxes indicate similar residues. The amino acid sequence corresponding to the SBP domain is underlined and conserved intron positions are indicated. The specifically conserved C-terminal region starts from sequence KRLADH

Six-rowed spike 1 (vrs1)

The hexastichon-v3 and hexastichon-v4 (hex-v3, hex-v4) mutants are alleles of the vrs1 locus (Franckowiak and Lundqvist 2002). They map on barley chromosome 2H, between subgroups 19 and 21, in a southern position with respect to BCD355 and in proximity to ABG619, cMWG699, MWG865 and MWG2081 (Pozzi et al. 2003; Komatsuda and Tanno 2004). We tested 19 sequenced RFLPs. Thirteen of them hit chromosome 4 and six also chromosome 2. A hit on the latter chromosome was also found for marker BCD111. The rice region identified on pseudomolecule 4 spans from bp 20877372 (WG996) to 34267428 (MWG892). On rice pseudomolecule 2 the region identified covers bp 23619508 to 26761720. Regions between 20 and 26 Mb of rice chromosome 4 and between 23 and 26 Mb on rice chromosome 2 (Fig. S8) were further analyzed. On chromosome 4, 13 putative genes belonging to the candidate gene families were identified, as well as 17 putative CGs on chromosome 2 (Table 1).

Discussion

In large cereal genomes, identification of CGs based on comparative mapping can accelerate positional cloning (Pflieger et al. 2001). Based on the rice genome sequence, syntenous regions to 23 barley developmental loci were defined. CGs with reasonable although different probabilities to match expectations were identified for four loci (brc1, cal b19, cal C15, lig). Evidence has been provided that one CG has high probability to represent the lig locus.

Components of the tool: map positions of the mutants in barley and the rice genome sequence

The map localization of the target loci in barley is the starting point. In some cases, linkage analysis provides ambiguous positions, a situation complicated by inconsistencies among different genetic maps (Yap et al. 2003). Such a case is exemplified by the centromeric region of chromosome 7H hosting the sld4 and suKE-74 loci: the analysis of three barley linkage maps (Castiglioni et al. 1998; Barley consensus and Barley consensus 2 at http://www.graingenes.org/) reveals a weak consensus in the order of RFLP markers constituting the recombinational backbone of the region. Clustering of markers around poorly recombinogenic centromeric regions has been reported for several species (see Castiglioni et al. 1998 for a review). In silico mapping of probes from this region identified two syntenous blocks on rice chromosome 8 (Fig. 2) and marker order within these blocks was broadly consistent with the barley consensus map (Langridge et al. 1995). The regions located north and south of the centromere in barley corresponded, respectively, to the first and the second rice block, consistent with independent observations (Hossain et al. 2004).

The level of map resolution in barley is also critical: our mutant loci are positioned within a few cM of mapped AFLP markers (Castiglioni et al. 1998; Pozzi et al. 2001, 2003; Roig et al. 2004), while using more widely spaced RFLP markers led to a loss of resolution. As a result, several Mb of rice genomic sequence had to be annotated and a relatively high number of potential CGs were identified for each mutant. Higher resolution would help to restrict the region to be analysed in rice: Gottwald et al. (2004) mapped the sdw3 locus to an interval of 0.55 cM in barley, which corresponded to about 252 kb of rice genome where three CGs were detected.

A further factor affecting the search for CGs is quantity and quality of rice genomic sequence. For the present work, we referred to rice genome assemblies available from TIGR (release 2.0, 30 April 2004 and release 4.0, October 2005, for some TIGR gene annotations in Table S3) and GRAMENE: these datasets are dynamic over time and for specific genomic regions chromosome coordinates slightly differ between the two databases and may change in the future. Also, analysis of gene annotation is, to some extent, not yet conclusive and subject to updates.

Synteny: identification of syntenous regions and CGs

Recent comparative DNA sequence analyses indicate that differences among rice and Triticeae genomes may imply more rearrangements than previously suspected (La Rota and Sorrells 2004). Orthologous sh2/a1 regions are conserved in rice, sorghum and maize (Bennetzen and Ma 2003), but a major rearrangement is present in the Triticeae (Li and Gill 2002). Other cases are known where a specific gene is absent in rice (Brueggeman et al. 2002). Our results support syntenous blocks established by other authors (reviewed in Gale and Devos 1998), but a few intra- and inter-chromosomal rearrangements have also been observed (for example, the barley centromeric region of chromosome 7H is syntenous to three distinct regions of chromosomes 6 and 8 of rice; Fig. 2).

The availability of genomic tools has prompted various groups to carry out comparative mapping by “virtual Southern blot” based on similarity searches (Salse et al. 2004). In the present study, only rice chromosomes hosting a minimum of three linked syntenous barley markers were considered. Synteny analysis is complex because some marker sequences have no homolog in the rice genome, while others map to several regions of it. Cases of sequence deletion in one species are not surprising because rice and barley separated about 50 MYA (Dubcovsky et al. 2001). Probes with multiple BLASTn hits could correspond to repetitive elements or to members of gene families (Salse et al. 2004). Such markers were not considered, to avoid the identification of artefactual syntenous regions. In some cases, a potential ortholog was identified based on clear discrepancies among the E values of multiple hits. cMWG704, for example, produced hits on several genes of the Hsp20/α-crystallin family, which are distributed on seven different chromosomes (data not shown). A clear cutoff in E values suggested that we focus on a hit/gene on chromosome 6 as the likely ortholog. In addition, more than one syntenous region can be identified within the rice genome for specific sets of barley markers. This is the case for the RFLP markers surrounding the barley mnd6 mutant, for which two regions were identified on rice chromosomes 8 and 9. The finding is consistent with the occurrence of ancient duplication events in the genome of rice (Guyot and Keller 2004).
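The marker selection rules described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in this study: the E-value cutoff, the size of the gap required between the best and second-best hit, the marker names and all hit values are hypothetical and were chosen only for illustration. The sketch also assumes that the input markers are already known to be linked in barley; only the minimum of three markers per rice chromosome is taken from the text.

```python
import math
from collections import defaultdict

# Hypothetical thresholds: the text gives no numeric E-value cutoff or gap size.
MAX_E_VALUE = 1e-10    # assumed significance cutoff for a BLASTn hit
MIN_LOG10_GAP = 10.0   # assumed gap, in orders of magnitude, between the two best hits
MIN_MARKERS = 3        # from the text: at least three linked syntenous markers


def log10_e(e_value):
    """Return log10 of an E value, flooring the exact zeros that BLAST can report."""
    return math.log10(max(e_value, 1e-300))


def pick_ortholog(hits):
    """Return the likely orthologous hit, or None if there is no hit or the hits are ambiguous.

    hits: list of (rice_chromosome, start_bp, e_value) tuples for one barley marker.
    """
    significant = sorted((h for h in hits if h[2] <= MAX_E_VALUE), key=lambda h: h[2])
    if not significant:
        return None                    # no homolog detected in rice
    if len(significant) == 1:
        return significant[0]          # single unambiguous hit
    gap = log10_e(significant[1][2]) - log10_e(significant[0][2])
    return significant[0] if gap >= MIN_LOG10_GAP else None


def syntenous_chromosomes(marker_hits):
    """Group anchored markers by rice chromosome and keep chromosomes with enough markers."""
    by_chromosome = defaultdict(list)
    for marker, hits in marker_hits.items():
        best = pick_ortholog(hits)
        if best is not None:
            by_chromosome[best[0]].append((marker, best[1]))
    return {
        chrom: sorted(markers, key=lambda m: m[1])
        for chrom, markers in by_chromosome.items()
        if len(markers) >= MIN_MARKERS
    }


# Invented example: marker names, positions and E values are not real data.
example = {
    "marker_A": [(6, 1_200_000, 1e-80), (2, 500_000, 1e-12)],  # clear gap, chromosome 6 kept
    "marker_B": [(6, 1_900_000, 1e-50)],
    "marker_C": [(6, 2_400_000, 1e-40), (8, 300_000, 1e-38)],  # ambiguous, discarded
    "marker_D": [(6, 3_100_000, 1e-60)],
    "marker_E": [(8, 700_000, 1e-30)],                         # chromosome 8 has one marker only
    "marker_F": [],                                            # no homolog in rice
}
result = syntenous_chromosomes(example)
print(result)
```

In this invented example only rice chromosome 6 is retained, anchored by three markers ordered by their rice coordinates. A marker with several hits is kept only when its best hit is separated from the next one by the assumed gap in E values, which mirrors the reasoning applied to cMWG704.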

Despite the complications deriving from synteny analysis and breakdowns, our working hypothesis proved correct. Consistent with previous reports (Paterson et al. 2000), the barley chromosomal segment hosting the lig locus is colinear to a region of rice chromosome 4 where the rice liguleless mutation co-maps with the rice Lg1 ortholog (CAE03411) (Yoshimura et al. 1997). The barley region hosting the branched 1 locus is colinear with the rice genomic segment hosting frizzy panicle, a locus involved in inflorescence branching (Komatsu et al. 2003). Similar considerations apply to the rice syntenous blocks identified for the barley mutants mnd6, cul-2, cul-3/-5/-15/-16, brh1 and brh2 (data not shown). A gene related to maize terminal ear1 (Veit et al. 1998) may be a good candidate for the mnd6 locus. A robust CG was also identified for the cal-b19 and cal-C15 loci located on chromosome 5H. The rice syntenous region for these loci hosts a predicted gene exhibiting high sequence similarity to rs2, a MYB gene implicated in the negative regulation of knox activity in maize (Tsiantis and Hay 2003).

The approach we have adopted helps to associate barley phenotypes with specific gene sequences; it will not result in the de novo discovery of functions. CGs are selected from a list of predicted protein-coding genes obtained by annotation of rice genomic sequences. However, the traits under study may be controlled by non-protein regulators, such as microRNAs, which have recently been demonstrated to play a major role in various developmental pathways (reviewed by Mallory and Vaucheret 2004).

Data on map co-localization helped to support the association between mutant loci and CGs, as in the case of the lig mutant. The large segregating populations available in our laboratory for the mutants considered would be an important complement to the approach presented here for cloning genes involved in plant development and architecture.