Introduction

The MADS-box gene family encodes transcription factors that are widely distributed in eukaryotes (ranging from yeast and plants to insects and mammals), and many of these genes have important biological functions (Riechmann and Meyerowitz 1997; Becker et al. 2000; Messenguy and Dubois 2003). The family is defined by the presence of a conserved MADS-box domain of 58–60 amino acids, whose name is derived from four members: the Mini Chromosome Maintenance 1 (MCM1) gene in yeast, Agamous (AG) in Arabidopsis, Deficiens (DEF) in snapdragon and Serum response factor (SRF) in humans (Passmore et al. 1988; Norman et al. 1988; Yanofsky et al. 1990; Sommer et al. 1990). The MADS domain binds to DNA sequences known as CArG boxes, which are based on the consensus sequence CC(A/T)6GG (Pellegrini et al. 1995). Indeed, many genes are likely regulated by MADS-box gene family members through CArG boxes that are present in their promoter regions (Shore and Sharrocks 1995). In plants, increasing evidence suggests that MADS-box genes play crucial roles in growth and development, including flowering time (e.g. SOC1, FLC and SVP), floral meristem identity and floral organ formation (AP1, FUL and AG), fruit ripening (SHP1 and SHP2), and seed pigmentation and embryo development (TRANSPARENT TESTA16) (Alwee et al. 2006; Samach et al. 2000; Michaels and Amasino 1999; Hartmann et al. 2000; Mandel et al. 1992; Gu et al. 1998; Liljegren et al. 2000; Nesi et al. 2002; Parenicova et al. 2003; Arora et al. 2007).
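As an illustration of the CArG-box consensus CC(A/T)6GG mentioned above, the following minimal sketch scans a DNA string for matching sites. The promoter fragment and the function name are invented for illustration and are not taken from any maize or sorghum gene. Because the reverse complement of the consensus matches the same pattern, scanning one strand is sufficient for this simple consensus.

```python
import re

# CArG-box consensus from the text: CC(A/T)6GG, i.e. CC + six A/T bases + GG.
# The lookahead lets overlapping sites be reported as well.
CARG_BOX = re.compile(r"(?=(CC[AT]{6}GG))")


def find_carg_boxes(sequence):
    """Return (0-based start, matched site) for every CArG box in a DNA string."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CARG_BOX.finditer(seq)]


# Hypothetical promoter fragment, invented for illustration only.
promoter = "ttgaCCATATATGGacgtCCAAAAAAGGtt"
hits = find_carg_boxes(promoter)
for start, site in hits:
    print(start, site)
```

A real promoter survey would additionally need the promoter coordinates of each gene, which are outside the scope of this sketch.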

Based on phylogenetic analysis, the MADS-box gene family in eukaryotes was divided into SRF-type and MEF2-type genes (Theißen et al. 1996). As a result of an ancestral duplication that occurred before the divergence of plants from animals and fungi, the MADS-box family can be divided into two main groups: type I and type II (MIKC) genes. The type I group, which includes the SRF-like genes, has been found in animals, fungi and plants; the type II group includes the MEF2-like genes found in animals and yeast as well as the plant-specific MIKC-type MADS-box genes (Alvarez-Buylla et al. 2000; De Bodt et al. 2003). In plants, the type II MADS-box genes of known function are among the most extensively studied and described transcription factors (Purugganan 1997; Theissen et al. 2000; Kaufmann et al. 2005).

Apart from the MADS (M) domain, the MIKC-type genes contain three additional functional domains, arranged from the N- to the C-terminus: the Intervening (I), the keratin-like (K) and the C-terminal (C) domains (Norman et al. 1988; Theißen et al. 1996; Cho et al. 1999; Yang et al. 2003). The K domain is a second conserved region among MIKC-type MADS-box genes; it comprises ~70 amino acids and mainly contributes to dimerization through a coiled-coil structure (Ma et al. 1991; Davies et al. 1996). In contrast, the type I genes in plants lack the typical K domain that is found in the type II MADS-box genes (Parenicova et al. 2003; Díaz-Riquelme et al. 2009). The I domain is a less conserved region of ~30 amino acids, which may influence DNA-binding specificity and dimerization (Riechmann et al. 1996). According to the structural divergence of the I domain, the MIKC-type genes have been categorized into MIKCC- and MIKC*-types (Henschel et al. 2002). The C-terminal domain (C domain) is a highly diverse region, but it has been shown to have important functions in transcriptional activation and in interactions among MADS-box proteins (Kramer and Irish 1999; Honma and Goto 2001). Based on the features of the M domain sequences, the type I MADS-box genes have been further classified into Mα, Mβ, Mγ and Mδ groups. In Arabidopsis, the Mδ group consists of six genes (AGL30, 94, 65, 67, 104 and 66), which are considered to be MIKC*-type genes according to the phylogenetic relationship of the two groups (De Bodt et al. 2003; Kofuji et al. 2003).

Maize and sorghum are globally important agricultural crops for humans and animals. In previous studies, 34 members of the MADS-box gene family were identified in maize using specific probes designed from the conserved MADS-box domain (Mena et al. 1995), but a genome-wide analysis of the MADS-box gene family has not been undertaken in maize or in sorghum. The maize and sorghum genomes have recently been sequenced (Paterson et al. 2009; Schnable et al. 2009), which provided an opportunity for us to analyze the MADS-box gene family of these two plants. The whole genome sequences also provide a valuable resource for comparative analyses, which can be used to further understand the evolutionary history of the MADS-box family in the two plants. In this study, complete sets of 75 and 65 MADS-box genes were identified in the maize and sorghum genomes, respectively. Gene duplication events and expression patterns of MADS-box genes in maize and sorghum were also surveyed. Our results may be helpful for functional studies of the MADS-box genes, and these bioinformatics analyses also provide a reference for cloning MADS-box genes, which can contribute to genetic breeding in maize and sorghum.

Materials and methods

Identification and classification of MADS-box genes

Maize genome sequences were downloaded from http://www.maizesequence.org/. The complete genome sequences of sorghum were obtained from http://genome.jgi-psf.org/Sorbi1/Sorbi1.download.ftp.html. Similar methods were employed to identify the MADS-box genes in maize and sorghum as described in other plants (Zhou et al. 2004; Leseberg et al. 2006).

First, the Hidden Markov Model (HMM) profile of the MADS-box domain was employed as a query to identify all possible MADS-box genes in the maize and sorghum genomes using the BlastP program (P value = 0.001). In order to identify the maximum number of MADS-box domain-containing sequences, two different HMM profiles were adopted to search for the MADS-box genes. One was obtained from the Pfam database (http://Pfam.sanger.ac.uk/Software/Pfam) (Finn et al. 2006), and the other profile was generated by alignments of 107 Arabidopsis MADS-box genes (Parenicova et al. 2003). Second, the Pfam database was used to determine if each of the candidate MADS-box sequences was a member of the MADS-box family. Additionally, all of the candidate MADS-box proteins were divided into type I and type II groups according to whether they encoded the K domain. To exclude overlapping genes, all of the candidate MADS-box genes were aligned using ClustalW (Thompson et al. 1994) and checked manually. All of the nonoverlapping MADS-box genes were used for further analysis.
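The classification step described above (type II if the protein encodes a K domain, type I otherwise) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the input is assumed to be a mapping from candidate protein names to the set of domain labels reported by a Pfam search, and the labels "SRF-TF" (MADS-box domain) and "K-box" (K domain) are assumed Pfam family names that should be checked against the Pfam release in use. The candidate names are hypothetical.

```python
MADS_DOMAIN = "SRF-TF"  # assumed Pfam label for the MADS-box domain
K_DOMAIN = "K-box"      # assumed Pfam label for the K domain


def classify_mads(domains_by_protein):
    """Assign 'type I' or 'type II' to each candidate that carries a MADS domain.

    Candidates without a MADS domain are dropped, mirroring the family
    membership check against Pfam described in the text.
    """
    result = {}
    for protein, domains in domains_by_protein.items():
        if MADS_DOMAIN not in domains:
            continue  # not confirmed as a MADS-box family member
        result[protein] = "type II" if K_DOMAIN in domains else "type I"
    return result


# Hypothetical candidates, invented for illustration only.
candidates = {
    "candidate_A": {"SRF-TF", "K-box"},
    "candidate_B": {"SRF-TF"},
    "candidate_C": {"K-box"},
}
classified = classify_mads(candidates)
print(classified)
```

Note that a rule based only on the K domain is a first pass; the final group assignments reported in the Results rely on the phylogenetic trees.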

Phylogenetic analysis

To better understand the phylogenetic relationships and to further classify the type I and type II MADS-box genes, two separate phylogenetic trees (one for type I and one for type II genes) were constructed from the complete amino acid sequences of the MADS-box genes from the maize, sorghum and Arabidopsis genomes. The trees were generated using MEGA v4.0 (Tamura et al. 2007) with the neighbor-joining (NJ) method. The maize and sorghum MADS-box genes were then named according to their phylogenetic relationship with the corresponding Arabidopsis MADS-box genes.

Analysis of conserved motifs and gene duplication

To examine the structural divergence among the MADS-box genes in maize and sorghum, the conserved motifs in the encoded MADS-box proteins were investigated. Their complete amino acid sequences were subjected to MEME (Multiple Expectation Maximization for Motif Elicitation) (Bailey and Elkan 1995) analysis online (http://meme.sdsc.edu/meme4_3_0/intro.html). Because of the limitations of the software, and to analyze individual conserved motifs in more detail, the MIKCC, MIKC*, Mα, Mβ and Mγ groups were analyzed separately using the MEME search tool with the following parameters: (1) the motif width was set to ≥6 and ≤200; and (2) the maximum number of motifs was set to 20. The SMART program (http://smart.embl-heidelberg.de) and the Pfam database were used to annotate the MEME motifs.

Gene duplication events of MADS-box genes in maize and sorghum were also investigated. All of the relevant genes identified in the maize and sorghum genomes were aligned using ClustalW and calculated using MEGA v4.0 (Yang et al. 2008).

Chromosomal locations of MADS-box genes

To determine the physical locations of the MADS-box genes, the starting positions of all MADS-box genes on each chromosome were confirmed by BlastN searches against local databases of the complete nucleotide sequences of the maize and sorghum genomes. MapInspect software (http://www.plantbreeding.wur.nl/uk/software_mapinspect.html) was used to draw the location images of the maize and sorghum MADS-box genes.

In silico analysis of expression pattern of maize and sorghum MADS-box genes

We used the NCBI EST database (http://www.ncbi.nlm.nih.gov/dbEST/) and the megablast tool to detect the expression patterns of the maize and sorghum MADS-box genes. The following parameters were adopted when searching for expression data: maximum identity > 95%, length > 200 bp and E value < 10^-10.
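The three thresholds above can be applied to parsed megablast hits as in the following minimal sketch. The record layout (`EstHit`), the gene names and the EST identifiers are hypothetical and invented for illustration; only the cut-offs (identity > 95%, length > 200 bp, E value < 10^-10) come from the text.

```python
from dataclasses import dataclass


@dataclass
class EstHit:
    gene: str        # query MADS-box gene (hypothetical names below)
    est_id: str      # matched EST accession (hypothetical)
    identity: float  # percent identity of the alignment
    length: int      # alignment length in bp
    evalue: float    # E value of the hit


def passes_thresholds(hit, min_identity=95.0, min_length=200, max_evalue=1e-10):
    """Apply the three strict cut-offs stated in the text."""
    return (
        hit.identity > min_identity
        and hit.length > min_length
        and hit.evalue < max_evalue
    )


def expressed_genes(hits):
    """Return the sorted names of genes supported by at least one passing EST."""
    return sorted({h.gene for h in hits if passes_thresholds(h)})


# Invented hits for illustration only.
hits = [
    EstHit("geneX", "EST0001", 98.5, 350, 1e-50),  # passes all cut-offs
    EstHit("geneX", "EST0002", 93.0, 400, 1e-40),  # identity too low
    EstHit("geneY", "EST0003", 99.0, 150, 1e-30),  # alignment too short
    EstHit("geneZ", "EST0004", 96.2, 220, 1e-12),  # passes all cut-offs
    EstHit("geneW", "EST0005", 97.0, 260, 1e-8),   # E value too high
]
print(expressed_genes(hits))
```

Grouping the passing ESTs by their tissue of origin would then give the kind of presence/absence profile summarized in the expression figure.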

Results

Identification and classification of MADS-box genes in maize and sorghum

We used BlastP searches based on the conserved MADS-box domain HMM profile to identify the MADS-box genes in the maize and sorghum genomes. To identify the maximum number of MADS-box genes in the two plants, two different HMM profiles were used (see Materials and methods). Interestingly, the two profiles yielded the same total number of MADS-box genes in both maize and sorghum. By this approach, 75 and 65 MADS-box genes were identified in maize and sorghum, respectively. The total number of maize MADS-box genes was thus similar to the number present in sorghum. According to phylogenetic tree analysis, 43 of the 75 maize genes were determined to be type II MADS-box genes and 32 were confirmed as type I MADS-box genes (comprising the Mα, Mβ and Mγ groups). In sorghum, 35 type II and 30 type I (consisting of Mα, Mβ and Mγ groups) MADS-box genes were identified. More interestingly, the total numbers of both type I and type II group MADS-box genes in maize and rice are the same, and the total numbers of both type I and type II group genes in Arabidopsis are greater than those in maize, sorghum and rice (Table 1).

Table 1 The total number of MADS-box genes within each group in maize, sorghum, rice and Arabidopsis

According to the survey of intron numbers, maize and sorghum type II MADS-box genes usually contain multiple introns, whereas type I genes usually have 0–4 introns, with the exception of ZmMADS44, 61, 63 and 71. The same pattern was also found in rice and Arabidopsis (Parenicova et al. 2003; Arora et al. 2007).

Comparative phylogenetic analyses of MADS-box genes in maize, sorghum and Arabidopsis

To examine the phylogenetic relationships of MADS-box genes and investigate the evolutionary history of this gene family among maize, sorghum and Arabidopsis, two individual phylogenetic trees of type I and type II group genes were constructed, respectively. To better view the phylogenetic relationships, shortened gene names were used in the phylogenetic trees (Table 2). The type II genes phylogenetic tree was constructed from 123 MADS-box sequences (43 maize, 35 sorghum and 45 Arabidopsis type II genes) (Fig. 1).

Table 2 Characters of maize and sorghum MADS-box genes
Fig. 1

Phylogenetic tree of maize, sorghum and Arabidopsis type II MADS-box genes. The unrooted tree was generated using the MEGA v4.0 program with the neighbor-joining method. MIKC*-type MADS-box genes were used as an outgroup. See Table 2 for the maize and sorghum MADS-box gene names

Based on the phylogenetic tree analysis, we divided the tree into 11 subfamilies. All of the MADS-box genes of maize and sorghum can be grouped with their Arabidopsis counterparts except for ZmMADS2, 14 and 34 and SbMADS32 and 30. The tree shows that there are six maize homologs each for Arabidopsis SEP, TT16 and AP3/PI, three each for SOC1, SVP and AGL12-like, five for CAL/FUL/AP1, and four for AG/SHP. The CAL/FUL/AP1, SVP and AG/SHP subfamilies each contain four sorghum homologs. Three sorghum homologs each were confirmed for Arabidopsis SOC1, AP3/PI and AGL12-like, two for TT16 and seven for SEP. Interestingly, aside from one sorghum homolog grouped with Arabidopsis ANR1, no maize or sorghum MADS-box genes of the FLC and ANR1 subfamilies were found. The numbers of AG/SHP, SOC1 and AGL12-like homologs were the same in maize and sorghum. Apart from the AP3/PI and TT16 groups, the total number of plant-specific MIKC-type MADS-box genes grouped into each subfamily was similar between maize and sorghum. From this analysis, we conclude that the maize and sorghum MIKC-type genes may stem from a common ancestral grass species. In addition, four maize (ZmMADS40, 41, 42 and 43) and two sorghum (SbMADS35 and 34) MADS-box genes were grouped with the Arabidopsis MIKC*-type genes (Fig. 1).

In rice and Arabidopsis, the total number of genes in Mα, Mβ and Mγ groups was similar. However, unlike the rice and Arabidopsis type I MADS-box genes, the Mα group included the most members of type I genes in both maize and sorghum (Fig. 2).

Fig. 2

Phylogenetic tree of maize, sorghum and Arabidopsis type I MADS-box genes. Black (Mα), Red (Mβ), Blue (Mγ). The unrooted tree was generated using the MEGA v4.0 program with the neighbor-joining method. See Table 2 for the maize and sorghum MADS-box gene names

Analysis of conserved motif and gene duplication

A total of 140 MADS-box genes from maize and sorghum were subjected to MEME motif analysis to discover conserved motifs shared among related proteins. As previously described (Parenicova et al. 2003; Arora et al. 2007), conserved motifs encoding the MADS-box domain can be found in each maize and sorghum MADS-box gene (Fig. 3). In the maize and sorghum genomes, a total of 72 MADS-box sequences were identified as MIKCC-type genes. However, only 50 of these MIKCC-type genes contained all three conserved motifs of the K domain (motifs 5, 2 and 3), while only motif 5 was found in ZmMADS31 and 6 and only motif 3 in SbMADS23. This result is consistent with findings in other plants that the K domain is a weakly conserved region (Parenicova et al. 2003; Arora et al. 2007). Four maize and two sorghum MADS-box genes have predicted coiled-coil motifs. In addition, several motifs of unknown function were also identified by the MEME analysis.

Fig. 3

Twenty putative conserved motifs were identified in the maize and sorghum MADS-box family using the MEME search tool. Different motifs are indicated by different colors; the names of all members and the combined P values from the different groups are shown on the left side of the figure. The same number in different groups refers to a different motif. The length of each motif in each protein is drawn to scale, and motif sizes are indicated at the bottom of the figure

To investigate MADS-box gene duplication events in maize and sorghum, the following criteria were adopted (Yang et al. 2008; Gu et al. 2002): (1) the alignable sequence covered ≥80% of the longer gene; and (2) the similarity of the aligned regions was ≥70%. Additionally, a detailed comparison of gene duplication was performed using all of the MADS-box genes from the maize, sorghum, rice and Arabidopsis genomes. As a result, 28 MADS-box gene subfamilies (comprising 62 MADS-box genes) were confirmed among the 75 maize MADS-box genes, and 15 subfamilies (comprising 36 genes) were found among the 65 sorghum MADS-box genes. The ratio of multi-genes to single genes in maize (4.77) was markedly higher than in sorghum (1.24), rice (0.53) and Arabidopsis (1.74). The average number of members per MADS-box gene subfamily was 2.21, 2.4, 2.17 and 4.25 in maize, sorghum, rice and Arabidopsis, respectively (Table 3). These results suggest that MADS-box gene duplications in maize and sorghum were more prevalent than in rice.
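The two duplication criteria and the summary ratios quoted above can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "single genes" means the genes not assigned to any subfamily (total minus genes in subfamilies), a reading that recovers the reported maize and sorghum values.

```python
def is_duplicate_pair(aligned_len, len_a, len_b, similarity):
    """Criteria from the text: aligned region covers >= 80% of the longer gene
    and the similarity of the aligned region is >= 70% (given as a fraction)."""
    longer = max(len_a, len_b)
    return aligned_len / longer >= 0.80 and similarity >= 0.70


def duplication_summary(total_genes, subfamilies, genes_in_subfamilies):
    """Ratio of multi-genes to single genes and mean subfamily size.

    Assumption: single genes = total genes - genes placed in subfamilies.
    """
    single = total_genes - genes_in_subfamilies
    return {
        "multi_to_single": round(genes_in_subfamilies / single, 2),
        "mean_subfamily_size": round(genes_in_subfamilies / subfamilies, 2),
    }


# Counts taken from the text: 75 maize genes, 28 subfamilies, 62 genes in them;
# 65 sorghum genes, 15 subfamilies, 36 genes in them.
maize = duplication_summary(75, 28, 62)
sorghum = duplication_summary(65, 15, 36)
print(maize)    # multi_to_single 4.77, mean_subfamily_size 2.21
print(sorghum)  # multi_to_single 1.24, mean_subfamily_size 2.4
```

The rice and Arabidopsis values are not recomputed here because their underlying counts appear only in Table 3.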

Table 3 Comparison of MADS-box gene duplications in the four plant genomes

Chromosomal locations of MADS-box genes

The physical locations of the MADS-box genes on maize and sorghum chromosomes are depicted in Fig. 4 a and b, respectively. In maize, the largest number of MADS-box genes was located on chromosome 8 (ten). The same number of genes (nine) was located on chromosomes 5, 6 and 9, chromosomes 2 and 7 both contained seven genes, six were found on chromosomes 1 and 4, and eight and four genes were located on chromosomes 3 and 10, respectively (Fig. 4 a).

Fig. 4

Chromosomal locations of MADS-box genes on maize (a) and sorghum (b) chromosomes. The chromosome numbers are shown at the top of each chromosome (vertical gray bars). The names on the left side of each chromosome correspond to the approximate location of each MADS-box gene (zm: ZmMADS; sb: SbMADS). The markers next to the gene names represent the groups to which each MADS-box gene belongs. The unmapped maize and sorghum MADS-box genes are shown on Zm0 and Sb0, respectively. The thick black lines on the left side of the names of the MADS-box genes indicate the clusters of tandemly duplicated genes on each chromosome. The thin black lines connect the MADS-box paralogs lying on duplicated chromosomal segments. The images were generated using MapInspect software (http://www.plantbreeding.wur.nl/uk/software_mapinspect.html)

In sorghum, chromosome 2 had the largest number of genes (eleven), followed by chromosome 3 (nine). Seven genes were located on chromosome 4, eight each on chromosomes 1 and 10, and four each on chromosomes 8 and 9. Five and six genes were found on chromosomes 6 and 7, respectively, and only two genes were located on chromosome 5. Interestingly, only MIKCC-type genes were located on chromosome 1 (Fig. 4 b).

It should be noted that MADS-box genes were present on every chromosome in both maize and sorghum. This indicates that MADS-box genes are widely distributed in both genomes, which may be a general characteristic of the MADS-box gene family in grasses. However, the distribution of MADS-box genes across chromosomes in both maize and sorghum was non-random. Indeed, relatively high densities of MADS-box genes were observed on some chromosomes, such as maize chromosome 8 and sorghum chromosome 2.

Like the Arabidopsis and rice MADS-box genes (Parenicova et al. 2003; Arora et al. 2007), several of the MADS-box genes in the maize and sorghum genomes are physically clustered (Fig. 4). According to Holub (2001), a gene cluster is a chromosome region that contains two or more genes within 200 kb. In maize, we found that 23 MADS-box genes were located in nine clusters. Of these, two were located on chromosome 7 and one each on chromosomes 2, 4, 5, 6, 8, 9 and 10. No cluster was found on chromosomes 1 or 3. In sorghum, nine clusters comprising 22 genes were found. Chromosomes 1, 2, 3 and 7 each contained two clusters, and one cluster was found on chromosome 8. Chromosomes 4, 5, 6, 9 and 10 did not contain any cluster. The largest clusters comprised five and four MADS-box genes in maize and sorghum, respectively.
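The 200 kb cluster definition can be applied to gene start positions as in the minimal sketch below. It assumes one particular reading of the definition, namely that neighboring genes on the same chromosome are chained into a cluster when their start positions are no more than 200 kb apart; a stricter reading would require the whole cluster to fit in a 200 kb window. The gene names and coordinates are invented for illustration.

```python
from collections import defaultdict


def find_clusters(genes, max_gap=200_000):
    """Group genes into clusters of two or more members.

    genes: iterable of (name, chromosome, start_bp).
    Assumption: adjacent genes whose start positions differ by <= max_gap
    belong to the same cluster (chaining of neighbors).
    """
    by_chrom = defaultdict(list)
    for name, chrom, start in genes:
        by_chrom[chrom].append((start, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current = []
        prev_start = None
        for start, name in sorted(by_chrom[chrom]):
            if current and start - prev_start > max_gap:
                if len(current) >= 2:
                    clusters.append((chrom, current))
                current = []
            current.append(name)
            prev_start = start
        if len(current) >= 2:
            clusters.append((chrom, current))
    return clusters


# Hypothetical genes and coordinates, invented for illustration only.
genes = [
    ("geneA", 1, 1_000_000),
    ("geneB", 1, 1_150_000),
    ("geneC", 1, 1_300_000),
    ("geneD", 1, 5_000_000),
    ("geneE", 2, 100_000),
    ("geneF", 2, 900_000),
]
clusters = find_clusters(genes)
print(clusters)
```

In this toy input, geneA, geneB and geneC form one cluster on chromosome 1, while the remaining genes are too far from their neighbors to be clustered.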

Expression patterns analysis of maize and sorghum MADS-box genes

Since EST data can provide valuable information for gene expression research, we examined the expression patterns of maize and sorghum MADS-box genes in various tissues and organs using the NCBI EST database, which contains 2,019,105 maize and 209,828 sorghum ESTs. Based on tissue and organ types, we divided the EST data into 12 groups (Fig. 5). In maize, expression of 50 MADS-box genes, including 39 type II and 11 type I MADS-box genes, was detected in the NCBI EST database. Most of the maize MADS-box genes have a broad expression spectrum. Interestingly, three genes showed expression specific to a single tissue or organ: ZmMADS1 was detected specifically in root, and ZmMADS34 and 60 in endosperm. In addition, a total of nine genes (ZmMADS7, 41, 42, 43, 46, 51, 54, 52 and 64) were found in ESTs from mixed tissue.

Fig. 5

The expression profile for maize and sorghum MADS-box genes from the NCBI EST database. Black points indicate that expression data were found for a maize or sorghum MADS-box gene, and blanks indicate that no expression could be detected. The names of expressed genes are shown on the left side of the figure; genes with no expression data are not shown. The “Multiple” group indicates that the expression data come from mixed tissues and organs

Using the same method, we also examined the expression pattern of sorghum MADS-box genes in the NCBI EST database. Twenty of the 65 sorghum genes, including 18 type II and 2 type I genes, were found to be expressed in various tissues and organs. Eleven genes showed expression specific to a single tissue or organ. For example, SbMADS10, 16, 22, 24, 25 and 57 were expressed only in ovary, suggesting that they have specific functions in plant growth and development.

More interestingly, far fewer type I than type II MADS-box genes had expression data in both maize and sorghum. As mentioned earlier in this study and in other studies, most MADS-box genes of known function belong to the type II group, whose members are the most extensively studied and described transcription factors in plants (Parenicova et al. 2003; Liljegren et al. 2000).

Discussion

As key transcription factors in flowering plants, the MADS-box genes have been extensively analyzed in the model plants Arabidopsis and rice. The results of the present work confirmed many features described in other species. However, some novel characteristics were also found in maize and sorghum compared with Arabidopsis and rice.

On the basis of phylogenetic relationships with Arabidopsis type II genes (Fig. 1), maize and sorghum type II MADS-box genes can be further divided into 11 subfamilies. Interestingly, no FLC homologs were found in the maize or sorghum genomes, although such genes have been confirmed to play a major role in controlling flowering time through the vernalization and autonomous pathways (Scortecci et al. 2001). FLC homologs were also not found in the rice genome (Arora et al. 2007). In addition, we used the coding sequence of the Arabidopsis FLC genes to search the NCBI maize EST database, but the search was unsuccessful. Given that maize and sorghum do not require vernalization for flowering, we suggest that the FLC-type genes identified in Arabidopsis (Parenicova et al. 2003) have been lost from their genomes.

In addition, Arabidopsis ANR1 homologs, which play important roles in root development (Zhang and Forde 1998), were also not found in maize. This pattern in the maize and sorghum phylogeny suggests that MADS-box genes have experienced two rounds of gene loss events, an ancient one before the maize-sorghum divergence and a recent one after polarization. Previous studies have shown that type I genes have a faster birth and death rate in angiosperms (Nam et al. 2004). However, the number of Mα-type genes in both maize and sorghum was higher than in rice and Arabidopsis. We hypothesize that a fast birth rate combined with a lower death rate of Mα-type genes contributed to this phenomenon.

By analyzing the conserved motifs present in MADS-box proteins, we obtained evidence on the evolutionary relationship between maize and sorghum. The majority of the members of the same MADS-box subfamily in the phylogenetic trees had similar motifs, which again confirmed their high sequence similarity. For example, the CAL/FUL/AP1 subfamily contained five maize and four sorghum MADS-box genes: ZmMADS24, 25, 27, 29 and 13 and SbMADS25, 9, 33 and 7. All of them shared the conserved motifs 1, 4, 5, 2 and 3 except for ZmMADS25 and SbMADS25. In particular, ZmMADS13, ZmMADS29 and SbMADS7 contained the same motifs, as did ZmMADS27 and SbMADS9, and SbMADS33 and ZmMADS24. This suggests that these highly conserved motifs play crucial roles in subfamily-specific functions. Although the functions of some motifs identified by MEME analysis are unknown, most of them were also subfamily-specific, and we hypothesize that they may likewise be involved in subfamily-specific functions and further support the subfamily definition. In addition, the MEME analysis supports the close evolutionary relationship between maize and sorghum, because almost every maize or sorghum protein had one or more orthologs (with similar motifs) in the other genome. Exceptions to this observation were rare.

Although the maize genome is about threefold larger than the sorghum genome (2,300 Mb vs. 730 Mb), 75 MADS-box genes were identified in the maize genome in this study, which is similar to the number present in the sorghum genome (65). Moreover, the difference in the total number of genes was mainly due to the expansion of type II genes, 43 of which were found in maize and 35 in sorghum. Previous studies show that gene duplications, including tandem and segmental duplications, played a major role in the expansion of the MADS-box gene family (Becker and Theißen 2003; Irish 2003). To investigate the main cause of the possible increase of MADS-box gene number in sorghum, we analyzed both tandem and segmental duplications.

In maize, a total of seven segmental duplication events (14 genes) were found, and eight events (16 genes) were found in sorghum. These results were confirmed by shared phylogenetic clades and the distribution of these MADS-box genes on chromosomes. The share of MADS-box genes involved in segmental duplication is therefore larger in sorghum (24.6% of the total number of MADS-box genes) than in maize (18.7%). In addition, about 30.6 and 33.8% of MADS-box genes were clustered on the maize and sorghum chromosomes, respectively. Since MADS-box genes in the same cluster have high sequence similarity, gene clustering is an outcome of the birth and death process that arises from tandem duplications (Xiong et al. 2005). These results indicate that both tandem and segmental duplications played a major role in MADS-box gene expansion in the maize and sorghum genomes. However, both tandem and segmental duplications of MADS-box genes were more prevalent in sorghum than in maize.

In Arabidopsis, 107 MADS-box genes were identified. Before this work, we had expected the ratio of MADS-box gene duplications in both maize and sorghum to be smaller than in Arabidopsis. However, we found that the ratio of MADS-box gene duplications in Arabidopsis was far lower than that in maize. This pattern was also found when comparing maize, sorghum and rice. Since maize, sorghum and rice are important agricultural crops, the differences in MADS-box gene duplications among the four plant genomes may be due to artificial selection, which plays a major role in promoting evolution.

In maize, 49.3% of MADS-box genes were located on chromosomes 5, 6, 8 and 9. In sorghum, chromosomes 1, 2, 3 and 10 contained 55.4% of the MADS-box genes; in particular, the tip of chromosome 2 contained seven Mα and one Mγ MADS-box genes in a 6 Mb region. We considered this region a hot spot of MADS-box gene distribution. No comparably distinct hot spot was observed in maize. The genes in the sorghum hot spot, which comprised two gene clusters, might have originated from tandem duplications. In addition, we hypothesize that the fast birth and lower death rate of Mα-type genes led to the multiple copies of Mα-type genes present in the hot spot. Interestingly, while most clusters comprised genes of the same type, five clusters containing different types of genes were detected in maize, but only one was found in sorghum. Different types of MADS-box genes located in the same cluster may have originated from ancient duplications (Díaz-Riquelme et al. 2009). Although this phenomenon was rare, it provides information about the evolutionary rate and novelty of the maize genome, which has important implications for the evolution of the MADS-box gene family.
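The chromosome shares quoted above (49.3% and 55.4%) follow directly from the per-chromosome gene counts reported in the Results, as the short check below illustrates. The dictionaries simply transcribe those counts; the function name is hypothetical. The sorghum counts sum to 64 rather than 65, so one gene is presumably among the unmapped genes shown on Sb0, which is an inference rather than a statement from the text.

```python
# Genes per chromosome, transcribed from the Results section.
maize_counts = {1: 6, 2: 7, 3: 8, 4: 6, 5: 9, 6: 9, 7: 7, 8: 10, 9: 9, 10: 4}
sorghum_counts = {1: 8, 2: 11, 3: 9, 4: 7, 5: 2, 6: 5, 7: 6, 8: 4, 9: 4, 10: 8}


def share_on(counts, chromosomes, total_genes):
    """Percentage of all family members located on the given chromosomes."""
    on_selected = sum(counts[c] for c in chromosomes)
    return round(100 * on_selected / total_genes, 1)


maize_share = share_on(maize_counts, [5, 6, 8, 9], 75)       # 37 of 75 genes
sorghum_share = share_on(sorghum_counts, [1, 2, 3, 10], 65)  # 36 of 65 genes
print(maize_share, sorghum_share)
```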

Expression patterns of maize and sorghum MADS-box genes were examined using the NCBI EST database. The results provide important clues for gene function research. However, this bioinformatic approach is only a first step toward studying the functions of the maize and sorghum MADS-box gene family. In this study, no expression data were detected for 25 maize and 45 sorghum MADS-box genes, possibly for the following reasons. First, some may be pseudogenes: during evolution, many genes may lose their functions due to mutations, duplications, indels and other rearrangements (Gojobori et al. 1982; Caetano-Anollés 2001). Second, some genes show temporal and spatial expression patterns; that is, the expression of MADS-box genes may occur only at specific developmental stages or under special conditions. Many transcription factor families share such temporal and spatial expression patterns, for example the rice Aux/IAA genes (Jain and Khurana 2009). Third, because of the limited number of ESTs currently in the NCBI EST database, especially for sorghum, expression of some genes was not identified in any tissue or organ. MADS-box genes with specific expression, which may have specific functions in particular tissues and organs, can be considered first as candidate genes for cloning and further functional analysis.