Introduction

The plant hormone auxin, represented by indole-3-acetic acid (IAA), influences many complex plant processes including apical dominance, vascular elongation, embryogenesis, lateral root initiation, and flower and fruit development (Woodward and Bartel 2005; De Smet and Jürgens 2007). Auxin signaling involves early response genes such as Aux/IAA, GH3 (Gretchen Hagen3), and SAUR (small auxin up RNA) family members (Abel and Theologis 1996) and modulation of the interactions of the transcription factors with auxin response elements (AuxREs) of affected genes. AuxREs are found in promoter regions of primary/early auxin responsive genes (Ulmasov et al. 1999; Tiwari et al. 2003).

Auxin response factors (ARFs) are transcription factors in plants that play a vital role in auxin-mediated responses. ARFs can specifically bind to TGTCTC-containing AuxREs and mediate responses to the plant hormone auxin (Li et al. 1994; Hagen and Guilfoyle 2002). ARF1 was first isolated in Arabidopsis, using a yeast one-hybrid screen with the TGTCTC element as bait sequence (Ulmasov and Hagen 1997). The ARF1 protein contains 665 amino acids and has three components: a DNA-binding domain (DBD) in the N-terminal portion, a middle region, and a protein–protein interaction domain in the C-terminal portion (Ulmasov et al. 1997; Ouellet et al. 2001). In species with sequenced genomes, the structure and function of ARF genes have been characterized. Most ARF proteins contain a DBD in the N-terminal portion (classified as a plant-specific B3-type), a characteristic glutamine (Q)-rich middle region that functions as an activation or repression domain, and protein–protein interaction domains III and IV in the C-terminal portion that allow the homo- and hetero-dimerization of ARFs and the hetero-dimerization of ARF and Aux/IAA proteins. Domains III and IV are similar to those found in the C-terminus of Aux/IAAs and can increase in vitro binding (Ulmasov et al. 1999). Much remains to be determined regarding the mechanisms of ARF and Aux/IAA interaction and their regulation at the cellular and whole organism levels.

Sequences derived from large-scale sequencing projects are informative in functional genomics research, providing the opportunity to scan for gene families. Genomic analyses indicated that Arabidopsis, rice (Oryza sativa), poplar (Populus trichocarpa) and grapevine (Vitis vinifera) have 23, 25, 39 and 20 ARF protein family members, respectively (Hagen and Guilfoyle 2002; Udaya et al. 2007; Wang et al. 2007). Structural and phylogenetic comparative analyses of auxin responsive gene families have been carried out between Arabidopsis (AtARFs) and rice (OsARFs) (Terol et al. 2006; Wang et al. 2007). The B73 maize genome sequencing project was initiated in 2005 (Bennetzen et al. 2001) and the results were later published (Schnable et al. 2009), enabling us to analyze the maize ARF gene family. The complete maize genome sequence also provides a valuable resource for comparative analyses of ARF evolution in different species. A recent genome-wide analysis describes the primary auxin-responsive Aux/IAA gene family in maize (Wang Yijun et al. 2010).

In this study, we identified at least 35 putative members of the maize ARF gene family (ZmARFs) by searching the whole genome with an ARF domain Hidden Markov Model (HMM). We investigated the number of maize ARF genes, their genomic organization, expansion pattern, conserved motifs, phylogeny, and expression profiles. Our results may be helpful for functional studies of the ARF gene family and of the relationship between the ARF and Aux/IAA gene families.

Materials and methods

Identification of maize ARF gene family

To identify ZmARFs, maize genome sequences were downloaded from http://www.maizegenome.org/data_portal.html. The Hidden Markov Model (HMM) profile of ARF domains from the Pfam database (http://pfam.janelia.org/search/sequence) was then used to search for maize ARF genes using the BlastP program (P-value = 0.001). The Pfam database was used to confirm that each predicted ZmARF gene encoded the ARF domain and that the predicted sequence was an ARF protein. All confirmed ZmARFs were aligned using Clustal W (Thompson et al. 1994) in MEGA v4.0 (Tamura et al. 2007) to exclude overlapping ZmARF genes. Non-overlapping ZmARF genes were classified on the basis of their domain composition.
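
As an illustration only, the filtering and redundancy-removal logic described above could be sketched in Python as follows. The record layout, gene identifiers, and function names are hypothetical and are not part of the original pipeline; only the 0.001 cutoff and the domain labels (ARF, B3, Aux/IAA) come from the text.

```python
from typing import Dict, FrozenSet, Iterable, NamedTuple

CUTOFF = 1e-3  # search threshold quoted in the text


class Hit(NamedTuple):
    gene_id: str             # hypothetical identifier
    score: float             # significance value reported by the search
    domains: FrozenSet[str]  # domains confirmed for the protein


def confirm_candidates(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep, per gene, the most significant hit that passes the cutoff
    and carries an ARF domain."""
    kept: Dict[str, Hit] = {}
    for hit in hits:
        if hit.score > CUTOFF or "ARF" not in hit.domains:
            continue
        previous = kept.get(hit.gene_id)
        if previous is None or hit.score < previous.score:
            kept[hit.gene_id] = hit
    return kept


def domain_class(hit: Hit) -> str:
    """Label a confirmed gene by its domain combination, e.g. 'ARF+B3'."""
    return "+".join(sorted(hit.domains))


# Made-up example records, not data from the study.
example_hits = [
    Hit("geneA", 1e-40, frozenset({"B3", "ARF", "Aux/IAA"})),
    Hit("geneA", 1e-12, frozenset({"B3", "ARF", "Aux/IAA"})),  # redundant hit
    Hit("geneB", 5e-2, frozenset({"ARF"})),                    # fails the cutoff
    Hit("geneC", 1e-8, frozenset({"B3"})),                     # no ARF domain
    Hit("geneD", 1e-20, frozenset({"ARF", "B3"})),
]
confirmed = confirm_candidates(example_hits)
```

In this toy input only geneA and geneD survive, and each can then be grouped by its domain combination.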

Phylogenetic analysis of maize ARF genes

Complete protein sequences of maize ARFs were combined and multiple-sequence alignments were performed with Clustal X (version 1.83) (Thompson et al. 1997). Phylogenetic trees for all complete ZmARF protein sequences were constructed using MEGA v4.0 (Tamura et al. 2007) by the neighbor-joining (NJ) method. The same methods were applied to analyze the evolutionary relationships between maize and other plants such as rice, Arabidopsis, poplar and grapevine.
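
For readers unfamiliar with the neighbor-joining method, the following Python sketch shows the core idea on a toy distance matrix. It is a minimal illustration, not the MEGA implementation: it returns only the tree topology (no branch lengths), the p-distance function is one simple choice of distance that the source does not specify, and all names and the toy matrix are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over aligned columns without gaps.
    Assumes the two aligned sequences share at least one ungapped column."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Return the NJ topology as nested tuples.
    dist maps frozenset({a, b}) -> distance between a and b."""
    nodes, d = list(names), dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d[frozenset((a, b))] for b in nodes if b != a)
                 for a in nodes}
        # Join the pair that minimizes the NJ criterion Q.
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)]
                   - total[p[0]] - total[p[1]])
        merged = (a, b)
        for c in nodes:
            if c != a and c != b:
                d[frozenset((merged, c))] = 0.5 * (
                    d[frozenset((a, c))] + d[frozenset((b, c))]
                    - d[frozenset((a, b))])
        nodes = [c for c in nodes if c != a and c != b] + [merged]
    return tuple(nodes)


def sister_pairs(tree):
    """Collect pairs of leaves that are joined directly to each other."""
    if isinstance(tree, str):
        return []
    if all(isinstance(child, str) for child in tree):
        return [tuple(tree)]
    return [pair for child in tree for pair in sister_pairs(child)]


# Toy five-taxon distance matrix (not data from the study).
labels = ["a", "b", "c", "d", "e"]
matrix = [[0, 5, 9, 9, 8],
          [5, 0, 10, 10, 9],
          [9, 10, 0, 8, 7],
          [9, 10, 8, 0, 3],
          [8, 9, 7, 3, 0]]
dist = {frozenset((labels[i], labels[j])): matrix[i][j]
        for i, j in combinations(range(len(labels)), 2)}
tree = neighbor_joining(labels, dist)
pairs = sister_pairs(tree)
```

Leaves joined directly to each other in the resulting topology correspond to what is referred to as sister pairs in the Results.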

Sequence analysis and duplication patterns of ZmARF genes

Based on the Pfam results, the ZmARF amino acid sequences were obtained and complete open reading frames (ORFs) were identified using online protein analysis tools (http://www.expasy.ch/tools). Multiple alignments of motifs III and IV of ZmARF proteins were obtained with Clustal X. Gene duplication events of ARF genes in maize B73 were also investigated. All of the confirmed ARF genes from the maize genome were aligned using Clustal W and analyzed using MEGA v4.0 on the basis of the phylogenetic tree.

Mapping ZmARFs on maize chromosomes

Each non-overlapping ZmARF gene sequence was used as a query against the maize sequence (http://maizesequence.org/blast) with the TBlastN program and positioned on the 10 maize chromosomes. ZmARF names were assigned according to their position from the top to the bottom of maize chromosomes 1–10. The chromosome map showing the physical location of all ZmARF genes was generated with Genome Pixelizer software (http://www.niblrrs.ucdavis.edu/GenomePixelizer/GenomePixelizer_Welcome.html).
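
The position-based naming rule can be expressed compactly. The Python sketch below is illustrative only: the locus identifiers and coordinates are invented, and the function name is hypothetical.

```python
def name_by_position(genes, prefix="ZmARF"):
    """Assign sequential names by chromosome number, then by start position.

    genes: iterable of (gene_id, chromosome, start) tuples.
    Returns a dict mapping gene_id -> assigned name.
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    return {gene_id: f"{prefix}{rank}"
            for rank, (gene_id, _chrom, _start) in enumerate(ordered, start=1)}


# Invented loci for illustration; these are not coordinates from the study.
example_loci = [
    ("locus_c", 3, 1_200_000),
    ("locus_a", 1, 9_500_000),
    ("locus_b", 1, 2_000_000),
    ("locus_d", 10, 400_000),
]
names = name_by_position(example_loci)
```

Sorting on the numeric chromosome rather than on a text label keeps chromosome 10 after chromosome 9.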

Expression profiles of the ZmARF gene family

As transcription factors, all ZmARF genes were investigated at the transcriptional level. The maize EST database (http://www.maizesequence.org/blast) was acquired, and maize expression data were obtained through BLAST searches against this database using the DNATOOLS blast program. ZmARF genes were analyzed with the TBlastN program using the following parameters: maximum identity > 95%, length > 400 bp and E value < 10^-10. ZmARF genes with no expression in our loaded EST database were searched for in the NCBI EST database (Wang Yijun et al. 2010).
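
A minimal Python sketch of the three-part EST filter is given below, assuming each BLAST hit has already been reduced to a gene, a tissue label, a percent identity, an alignment length and an E value. The tuple layout, names and example values are hypothetical; only the three thresholds come from the text.

```python
from collections import defaultdict

MIN_IDENTITY_PCT = 95.0  # identity > 95%
MIN_LENGTH_BP = 400      # length > 400 bp
MAX_E_VALUE = 1e-10      # E value < 10^-10


def passes_est_filter(identity_pct, length_bp, e_value):
    """True when a hit satisfies all three thresholds (strict inequalities)."""
    return (identity_pct > MIN_IDENTITY_PCT
            and length_bp > MIN_LENGTH_BP
            and e_value < MAX_E_VALUE)


def tissues_by_gene(hits):
    """hits: iterable of (gene, tissue, identity_pct, length_bp, e_value).
    Returns {gene: set of tissues with at least one accepted EST hit}."""
    found = defaultdict(set)
    for gene, tissue, identity_pct, length_bp, e_value in hits:
        if passes_est_filter(identity_pct, length_bp, e_value):
            found[gene].add(tissue)
    return dict(found)


# Invented hits for illustration only.
example = [
    ("geneA", "root", 98.2, 650, 1e-50),
    ("geneA", "leaf", 96.0, 410, 1e-30),
    ("geneA", "ear", 94.9, 800, 1e-60),   # identity too low
    ("geneB", "root", 99.0, 380, 1e-40),  # alignment too short
]
expression = tissues_by_gene(example)
```

Genes absent from the returned dictionary are the ones that would be followed up in a second database.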

Results

Identification of ZmARFs

We used the multiple sequence alignment of Arabidopsis ARF protein domain sequences to build an ARF domain HMM profile. BLASTP searches based on the conserved ARF domain HMM were used to identify the ARF genes in the maize genome. Thirty-five potential ARF protein sequences were predicted with an E value threshold of 0.001 and identified as ZmARF genes using the BLAST program from the Pfam database. Twenty-two protein sequences contained the B3, ARF, and Aux/IAA domains, and 10 contained the ARF and B3 domains. The ZmARF24 protein sequence contained the ARF and Aux/IAA domains. The ZmARF26 protein sequence contained only the ARF domain, and ZmARF28 had two protein–protein domains. The overall analysis revealed 35 ZmARF gene family members in the complete maize genome. For each deduced polypeptide, three parameters are listed: number of amino acids (length), molecular weight, and isoelectric point (pI). ZmARF ORF lengths ranged from 1257 bp (ZmARF1) to 3459 bp (ZmARF21) and the molecular weights ranged from 46.79 kDa (ZmARF26) to 127.82 kDa (ZmARF21) (Table 1).
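
As a rough plausibility check on the reported ranges, ORF length can be converted to residue count and an approximate mass. The sketch below assumes that the ORF length includes the stop codon and uses an average residue mass of about 110 Da, a common rule of thumb; neither assumption is stated in the text, and the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def orf_to_residues(orf_length_bp):
    """Number of amino acids encoded, assuming the ORF includes a stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1


def approximate_mass_kda(residues):
    """Approximate molecular weight in kDa from the residue count."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


shortest = orf_to_residues(1257)  # 418 residues
longest = orf_to_residues(3459)   # 1152 residues
shortest_mass = approximate_mass_kda(shortest)
longest_mass = approximate_mass_kda(longest)
```

Under these assumptions the estimates come to roughly 46 kDa and 127 kDa, in line with the reported 46.79 kDa and 127.82 kDa.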

Table 1 List of ARF genes in maize

Phylogenetic analysis of ARF genes

The unrooted phylogenetic tree of all maize ARF full-length protein sequences was generated using the MEGA v4.0 program by the NJ method. All ZmARFs could be divided into 4 major classes: I, II, III, and IV (Fig. 1). Class I and Class IV contained 7 and 6 members, respectively. Class II (9 members) and Class III (13 members) were two subgroups from one branch. Of the 35 ZmARFs, 24 formed 12 sister pairs, while the remaining 11 ZmARFs were not paired.

Fig. 1 Phylogenetic tree of maize ARF genes. The unrooted tree was generated using the MEGA v4.0 program with the neighbor-joining method

Arabidopsis and rice were chosen to further investigate the phylogenetic relationship between dicot and monocot ARF genes. The phylogenetic relationship was examined in aligned full-length protein sequences of 35 ZmARFs and 23 AtARFs (Fig. 2). A total of 58 members were divided into 5 groups named Classes 1, 2, 3, 4, and 5. This classification is largely similar to that of the ZmARFs alone (Fig. 1). Classes I and II belong to the same branch, but the maize and Arabidopsis families do not have a high sequence homology based on the phylogenetic tree. In the joint phylogenetic tree, Classes 1, 2, 3, 4, and 5 contained 29, 7, 9, 6, and 7 members, respectively (Fig. 2). Class 1 is composed of two subgroups. In all, 18 homologous pairs were confirmed, comprising 13 ZmARF-ZmARF and 5 AtARF-AtARF pairs; no ZmARF-AtARF pairs were found. AtARFs and ZmARFs were both found in Classes 1, 2, 4, and 5, but Class 3 contained only AtARFs. It is noteworthy that each family contained a separate branch in addition to the paired ZmARFs described above.

Fig. 2 Phylogenetic tree of maize and Arabidopsis ARF genes. The unrooted tree was generated using the MEGA v4.0 program with the neighbor-joining method

To analyze the relationship among the ARF genes of two monocot plants, the 35 ZmARFs and 25 OsARFs were aligned and a phylogenetic tree was constructed using the same method. All 60 members were classified into 4 classes, named Classes A, B, C, and D, containing 22, 15, 9, and 14 members, respectively (Fig. 3). Classes C and D were two subfamilies further divided from the same branch. Nine ZmARF-ZmARF and 12 OsARF-ZmARF pairs constituted 21 sister pairs. Every group contained OsARF-ZmARF pairs.

Fig. 3 Phylogenetic tree of maize and rice ARF genes. The unrooted tree was generated using the MEGA v4.0 program with the neighbor-joining method

To better evaluate the phylogenetic relationship of the same gene family among different species, poplar and grapevine were also selected, and a comparative phylogenetic analysis of each with maize was carried out. An unrooted phylogenetic tree based on the alignments of 35 ZmARFs and 39 PoptrARFs was constructed by the NJ method (Fig. 4). The 74 ARFs fell broadly into three major classes, Classes X, Y and Z, containing 27, 30 and 17 members, respectively. The tree contained only one ZmARF-PoptrARF homologous pair, in Class X, together with 17 PoptrARF-PoptrARF and 11 ZmARF-ZmARF sister pairs.

Fig. 4 Phylogenetic tree of maize and poplar (Populus trichocarpa) ARF genes. The unrooted tree was generated using the MEGA v4.0 program with the neighbor-joining method

Although the grapevine genome has been released, the grapevine ARF genes have not been reported. Grapevine assembly and annotation V1.0 were downloaded from http://www.genoscope.cns.fr/externe/English/Projets/Projet_ML/index.html (Yang et al. 2008). Using the same method as for the ZmARFs, 20 grapevine ARF genes were identified. Based on the alignment of 35 ZmARFs and 20 GsARFs, a phylogenetic tree was constructed (Fig. 5). This phylogram distinguished 3 groups, namely, Classes α (23 members), β (18 members) and γ (14 members). The tree included 5 GsARF-GsARF, 12 ZmARF-ZmARF and 3 ZmARF-GsARF sister pairs.

Fig. 5 Phylogenetic tree of maize and grapevine (Vitis vinifera) ARF genes. The unrooted tree was generated using the MEGA v4.0 program with the neighbor-joining method

Sequence and conserved region analysis of the ZmARF proteins

According to the Pfam results, 33 putative ZmARF protein sequences had a typical DBD. ZmARF24 had no DBD in the N-terminal region. ZmARF26 had only an ARF domain, lacking both a DBD in the N-terminal region and a protein–protein interaction domain in the C-terminal portion. Using Clustal X to analyze the full protein sequences and conserved regions of all ZmARFs, we found that the DBDs of 33 ZmARFs were composed of about 460 amino acids in the N-terminal portion, and that 23 ZmARFs contained, in the C-terminal portion, motifs III and IV found in the Aux/IAA protein family. These core domains all had high similarity (Fig. 6a, b).

Fig. 6 a Alignment profile of maize ARF proteins obtained with the ClustalX program. The height of the bars indicates the number of identical residues per position. The arrows indicate the core region among DBD regions. Motifs III and IV are found in the Aux/IAA domain. b Alignment of Motifs III and IV of ZmARF proteins using Clustal X

Chromosomal locations of ZmARFs

Based on available information (http://www.maizegenome.org/data_portal.html), the ZmARF genes were positioned on maize chromosomes using Genome Pixelizer software. According to the maize phylogenetic tree, each ARF gene was assigned to one of four categories (Classes I, II, III, and IV), represented by different colors in Fig. 7. However, the maize genome has not been fully sequenced (about 95% coverage of the whole genome). In addition to the 10 sequenced chromosomes, there is an additional chromosome called unknown or chromosome 0. All 35 ZmARF genes could be placed on chromosomes 1–10. The largest number of ARF genes was located on chromosome 5 (6 genes), and five genes were located on chromosome 3 (Fig. 7). The same number of genes was located on chromosomes 6 and 10, with four genes each on chromosomes 1 and 4. These chromosomes all contained genes of different classes. Only one ZmARF gene was located on each of chromosomes 2, 7, 8, and 9 (Fig. 7).

Fig. 7 Genomic distribution of ARF genes on maize chromosomes. The four categories of genes correspond to those in Fig. 1. The boxes above and below the chromosomes (chr; represented as gray bars) designate the approximate locations of the four categories of ARF genes

In reference to the nomenclature previously used for AtARFs and OsARFs (Wang et al. 2007), genes were temporarily named from ZmARF1 to ZmARF35 to distinguish every ARF gene based on its position from the top to the bottom of maize chromosomes 1–10. This approach has been broadly applied in genome-wide studies for the ERF, GH3, and Aux/IAA gene families in Arabidopsis and rice (Jain et al. 2006; Nakano et al. 2006; Terol et al. 2006).

Analyses of ZmARF gene duplication

To investigate ZmARF gene duplication events, we drew on the phylogenetic trees of maize ARF genes. Gene duplications were defined by the following criteria (Gu et al. 1998; Yang et al. 2008): (1) the alignable sequence covers ≥80% of the longer gene, and (2) the similarity of the aligned regions is ≥80%. All four ARF gene categories showed gene expansion. Among the 35 ZmARF genes, 24 (12 pairs) were confirmed as duplicated and appear as sister pairs in the phylogenetic tree (Fig. 1). ZmARF gene duplication was prevalent and was the main contributor to the expansion of the ZmARF gene family.
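
The two duplication criteria translate directly into a simple test. The Python sketch below is a hedged illustration: the function name, argument layout and example values are hypothetical, and how the alignable length and similarity are measured is left to the alignment tool.

```python
def is_duplicate_pair(alignable_length, length_a, length_b, similarity_pct,
                      min_coverage=0.80, min_similarity_pct=80.0):
    """Apply both criteria from the text to one gene pair.

    (1) The alignable region covers at least 80% of the longer gene.
    (2) The similarity of the aligned region is at least 80%.
    All lengths must be in the same unit (for example, residues).
    """
    longer = max(length_a, length_b)
    coverage = alignable_length / longer
    return coverage >= min_coverage and similarity_pct >= min_similarity_pct


# Invented values for illustration only.
accepted = is_duplicate_pair(850, 1000, 900, 85.0)       # coverage 0.85
low_coverage = is_duplicate_pair(700, 1000, 900, 95.0)   # coverage 0.70
low_similarity = is_duplicate_pair(900, 1000, 1000, 79.9)
```

Coverage is computed against the longer gene, so a short gene aligned in full to part of a long one does not pass.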

Expression patterns of the ZmARF family

To study the expression of the ZmARF gene family, we used the maize EST database to predict ZmARF transcripts. The maize ESTs were divided into 8 groups (Table 2). According to the NCBI EST expression database, ZmARF1, 18, 19, 23, 24, 32, and 35 had no expression, 9 ZmARF genes (ZmARF4, 6, 11, 14, 15, 21, 30, 31, and 33) were found in only one tissue or organ, and the remaining ZmARF genes were identified in two or more tissues and organs (Table 2).

Table 2 Expression analysis of ZmARF genes in silico

Discussion

As key transcription factors, members of the ARF gene family play an important role in various aspects of plant growth and development. The structure of the ARF gene family has been extensively studied and described in some species. Arabidopsis, rice and P. trichocarpa are model plants with complete genomic sequences, and their ARF gene families have been analyzed genome-wide (Tiwari et al. 2003; Udaya et al. 2007; Wang et al. 2007). However, there are few studies of ARF genes in maize. In this study, we used the ARF genes of Arabidopsis as queries to determine the largest possible number of maize ARF gene sequences and applied various bioinformatics software to find ZmARF genes. A genome-wide comparative analysis of the evolutionary relationships between maize and other plants such as rice, Arabidopsis, poplar and grapevine showed that maize shared more homology with the monocot plant (rice).

In this study, 35 ARF genes were identified in the maize genome, which is more than in Arabidopsis (23), rice (25) and grapevine (20). Gene duplication can play an important role in a succession of genomic rearrangements and expansions (Vision et al. 2000). Gene duplication includes tandem and segmental duplication events. Tandem duplication, in which two or more duplicated genes are located on the same chromosome, is an on-going process in poplar genome evolution. Duplications between different chromosomes within the same clade are designated as segmental duplication events. This has been found in rice, and tandem duplication has been reported in Arabidopsis (Lynch and Conery 2000; Simillion et al. 2002; Raes et al. 2003; Wang et al. 2005). 60% of these duplications may help maize to evolve distinct properties from other angiosperms. However, the 35 ZmARF genes were distributed on the 10 maize chromosomes without obvious hot spots.

In maize, the 35 ZmARF genes were divided into four categories and formed 12 sister pairs. Only 3 of these pairs had both members on the same chromosome: chromosome 3 (ZmARF8 and 9), chromosome 5 (ZmARF20 and 21), and chromosome 6 (ZmARF25 and 26). The remaining nine sister pairs were located on different chromosomes: 1 and 5 (ZmARF4 and 17), 1 and 10 (ZmARF1 and 35), 3 and 6 (ZmARF12 and 27), 3 and 8 (ZmARF10 and 29), 4 and 5 (ZmARF14 and 22), 2 and 10 (ZmARF5 and 34, ZmARF6 and 33), and 5 and 6 (ZmARF18 and 24, ZmARF19 and 23). We found that segmental duplication was common in ZmARF genes. The 12 ZmARF sister pairs resulted from the combination of two kinds of duplication events that expanded the ZmARF number, a pattern significantly different from that in Arabidopsis and rice.
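
The pair counts above can be reproduced from the listed chromosome assignments. In the Python sketch below, the pairs and chromosomes are transcribed from this paragraph; the labels "same chromosome" and "different chromosomes" are used as a simple proxy for the two duplication types and are an assumption of this illustration, as are all variable names.

```python
# (gene number, gene number) -> (chromosome, chromosome), as listed in the text
SISTER_PAIRS = {
    (8, 9): (3, 3), (20, 21): (5, 5), (25, 26): (6, 6),
    (4, 17): (1, 5), (1, 35): (1, 10), (12, 27): (3, 6),
    (10, 29): (3, 8), (14, 22): (4, 5),
    (5, 34): (2, 10), (6, 33): (2, 10),
    (18, 24): (5, 6), (19, 23): (5, 6),
}


def split_by_location(pairs):
    """Separate sister pairs by whether both members share a chromosome."""
    same, different = [], []
    for genes, (chrom_a, chrom_b) in sorted(pairs.items()):
        (same if chrom_a == chrom_b else different).append(genes)
    return same, different


same_chromosome, different_chromosomes = split_by_location(SISTER_PAIRS)
genes_in_pairs = {g for pair in SISTER_PAIRS for g in pair}
```

The split gives 3 same-chromosome pairs and 9 pairs spanning different chromosomes, covering 24 distinct genes.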

Analysis of the conserved motifs indicated that 23 of 35 ZmARF proteins contain domains III and IV, which are also found in the C-terminus of Aux/IAAs. These domains have been shown to mediate both homo- and hetero-dimerization between members of the Aux/IAA and ARF families (Kim et al. 1997). Several interesting points arise from these phylogenies. Seven out of nine ZmARF sister pairs and 9 single ZmARFs contained domains III and IV of the Aux/IAA family (Fig. 6). The ARF1 carboxyl terminus has also been shown to interact with Aux/IAA proteins in a yeast two-hybrid screen, and these interactions probably occur through conserved domains III and IV in the carboxyl termini of these proteins. Both the ARF and Aux/IAA families are transcriptional regulators that specifically affect auxin signal transduction (Reed 2001). It has therefore been hypothesized that the Aux/IAA proteins regulate transcription by modifying ARF activity (Guilfoyle et al. 1998a, b).

Potential gene expression patterns were determined using our EST database and the NCBI EST database. Most ZmARF genes were expressed in specific tissues and organs, while 7 ZmARF genes had no expression in our EST database but were identified in mixed tissues through the NCBI EST database. Whether the expression of these genes is induced by external factors, for example light or exogenous hormone treatment, has become a research focus. Semi-quantitative RT–PCR has been used to analyze ARF genes in Arabidopsis and rice (Okushima et al. 2005; Overvoorde et al. 2005; Wang et al. 2007). Additionally, different growth environments may in theory also affect functional prediction. The comparative and phylogenetic analyses of the ZmARF gene family and the expression and structure analysis of ZmARF proteins will lay the foundation for further functional studies.