Introduction

Liriodendron tulipifera L., also called tulip tree, tulip-poplar, yellow-poplar, white-poplar, and whitewood, is one of the most attractive and tallest tree species of the eastern United States (Harlow and Harrar 1969). Liriodendron has a large natural range, covering much of eastern North America (Little 1979). The wood of yellow-poplar is commercially valuable and is used in a diverse range of products, such as in furniture and framing construction as a substitute for increasingly scarce softwoods (Hernandez et al. 1997; Moody et al. 1993; Williams and Feist 2004). Yellow-poplar is also valued as a nectar source for honey production, as a source of wildlife food (mast), and as a large shade tree in urban plantings (McCarthy 1933). Because of its rapid growth at preferred sites, yellow-poplar may have additional future potential as a source of fiber for biologically based products, biofuels, and chemicals (Nagle et al. 2002).

Yellow-poplar is a member of Magnoliaceae in the order Magnoliales, which is one of the early branching angiosperm lineages commonly known as “basal angiosperms.” Magnoliales and three other basal angiosperm orders (Laurales, Piperales, and Canellales) comprise an extremely diverse group that may be sister to the monocots, sister to the eudicots, or sister to both monocots and eudicots (Soltis et al. 2000). Thus, Magnoliales holds a phylogenetically critical position among the early angiosperms that ultimately gave rise to the largest modern angiosperm lineages (cf. Qiu et al. 2005). Floral and other structural features place yellow-poplar at a phylogenetic position that is ideal for comparative studies of the evolution of biological processes in land plants (Hunt 1998; Ronse de Craene et al. 2003; Wei and Wu 1993). In addition, the availability of a reliable transformation system and the ability to mass produce clonal trees from somatic embryos (Wilde et al. 1992) make the species a good model forest tree in which we can explore modern breeding techniques and population dynamics as well as functional genomics.

Although yellow-poplar has been used extensively as a benchmark species in studies on plant evolution (Parks and Wendel 1990; Wen 1999; Endress and Igersheim 2000; Zahn et al. 2005), very little is known about its nuclear genome. A PubMed search with “Liriodendron” returned only one publication related to gene structure and function (LaFayette et al. 1999) and none on genomics. Recently, the Floral Genome Project created a cDNA library from yellow-poplar floral tissues (Albert et al. 2005). Sequences from this cDNA library yielded an EST dataset with over 6,500 unigenes representing a wide variety of putative functions (http://pgn.cornell.edu/). These floral EST data are a major step forward for genomic-scale resources in yellow-poplar and are likely to stimulate even greater interest in the structure of genes and their regulatory regions in Liriodendron.

Large-insert genomic libraries are invaluable for genome sequencing, physical mapping, positional gene cloning, and the analysis of gene structure and function. Both yeast artificial chromosome (YAC) and bacterial artificial chromosome (BAC) vectors greatly facilitate the cloning of large genome fragments. BAC technology has substantial advantages over YAC libraries (Shizuya et al. 1992; Woo et al. 1994; Yu et al. 2000 and references therein); hence, more and more genome libraries employ BAC vectors. In this paper, we report the construction of a yellow-poplar BAC library, the first large-insert DNA library constructed for Liriodendron. Our primary goal is to develop the resources required to fully implement yellow-poplar as a model tree species for evolutionary and comparative genomics. An understanding of the Liriodendron genome will also support genetic improvement of yellow-poplar, including the application of biotechnology, and the characterization of traits important to the physiology and ecology of the species. In addition to characterization of the BAC library, we also provide new genome-size estimates and identification of three floral timing genes [Gigantea, Frigida, and LEAFY (LFY)] as well as three genes in the lignin biosynthesis pathway [cinnamyl alcohol dehydrogenase (CAD), 4-coumarate:CoA ligase (4CL), and phenylalanine ammonia-lyase (PAL)] from the library.

Materials and methods

Nuclear DNA content determination

Fresh leaves from nine yellow-poplar trees, including the plant from which the BAC library was constructed (clone 108 at the University of Tennessee), were sent to Benaroya Research Institute at Virginia Mason, Seattle, WA for flow cytometric estimation of nuclear DNA content. Nuclei from Nicotiana tabacum cv. SR-1 served as the size standard, and propidium iodide was used as the staining dye in the flow cytometric assays. The assays were conducted individually on one, two, or four leaves from each tree, with triplicate runs for each leaf. Preparation of samples and instrumentation techniques were performed as described previously (Arumuganathan and Earle 1991).
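The arithmetic behind a flow cytometric genome-size estimate of this kind can be sketched as follows. The sketch assumes the usual ratio method (sample peak mean divided by standard peak mean, multiplied by the known DNA content of the standard) and a conversion factor of 978 Mbp per pg; other conversion factors appear in the literature, and the one used in this study is not stated here. The function names, peak means, and standard 2C value are illustrative placeholders, not values from this study.

```python
# Hypothetical sketch of the ratio method for flow cytometric genome sizing.
# All numeric inputs below are placeholders, not measurements from the paper.

MBP_PER_PG = 978.0  # assumed conversion factor (1 pg of DNA taken as 978 Mbp)


def sample_2c_pg(sample_peak, standard_peak, standard_2c_pg):
    """2C DNA content (pg) of the sample, from G0/G1 fluorescence peak means."""
    if standard_peak <= 0:
        raise ValueError("standard peak mean must be positive")
    return sample_peak / standard_peak * standard_2c_pg


def haploid_mbp(two_c_pg, mbp_per_pg=MBP_PER_PG):
    """Haploid (1C) genome size in Mbp, assuming a diploid sample."""
    return two_c_pg / 2.0 * mbp_per_pg


# Illustrative numbers only: sample peak at 40% of a standard with 2C = 10 pg.
two_c = sample_2c_pg(sample_peak=40.0, standard_peak=100.0, standard_2c_pg=10.0)
one_c_mbp = haploid_mbp(two_c)
print(f"2C = {two_c:.2f} pg, 1C = {one_c_mbp:.0f} Mbp")
```

Keeping the conversion factor as a parameter makes it easy to see how much an estimate shifts when a different pg-to-Mbp convention is applied.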

Nuclear DNA characterization

To examine the complexity of the yellow-poplar genome and the density of single/low-copy protein-coding genes in the BAC library, a small shotgun library of yellow-poplar genomic DNA was constructed in the EcoRV site of plasmid vector pBluescript II SK(+). The Liriodendron shotgun library contained 3,072 clones with an average insert size of 3 kb (the LT_Sa library is available at http://genome.arizona.edu/orders/). DNA sequences were obtained from both ends of 96 randomly selected clones from the plasmid shotgun library, using the vector T7 and M13 reverse primer sites. High-quality sequences, defined as those having >100 high-quality bases (Q20) after trimming of vector sequence, were used as queries in BLASTX, BLASTN, and TBLASTX searches (version 2.2.11; Altschul et al. 1990). The BLAST searches of the NCBI databases were run using a Batch BLAST script (http://greengene.uml.edu/analysis/analysis.html). An E-value cutoff of 10⁻⁶ was used to assign putative identities to these sequences.
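The two filters described above (more than 100 bases at Q20, and an E-value cutoff of 10⁻⁶) can be sketched in a few lines. This is a minimal illustration, not the Batch BLAST script used in the study: the function names are hypothetical, and the parser assumes the standard 12-column tab-separated BLAST tabular layout (query in column 1, subject in column 2, E value in column 11), which the text does not specify.

```python
# Hypothetical sketch: read-quality filter and E-value cutoff for BLAST hits.

E_CUTOFF = 1e-6       # cutoff stated in the text
MIN_Q20_BASES = 100   # a read needs more than this many bases at Q20 or better


def is_high_quality(phred_scores, min_q=20, min_bases=MIN_Q20_BASES):
    """True if the trimmed read has more than `min_bases` bases with Phred >= min_q."""
    return sum(1 for q in phred_scores if q >= min_q) > min_bases


def best_hits(tabular_lines, e_cutoff=E_CUTOFF):
    """Return {query: (subject, evalue)} keeping the lowest-E-value hit per query.

    Hits with an E value above `e_cutoff` are discarded. Assumes 12-column
    tab-separated BLAST tabular output.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > e_cutoff:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Made-up example rows: read2 has no hit at or below the cutoff and is dropped.
example = [
    "read1\tsubjA\t80.0\t150\t30\t0\t1\t150\t1\t150\t1e-20\t90.0",
    "read1\tsubjB\t75.0\t150\t37\t0\t1\t150\t1\t150\t1e-08\t60.0",
    "read2\tsubjC\t60.0\t90\t36\t0\t1\t90\t1\t90\t0.5\t25.0",
]
hits = best_hits(example)
print(hits)
```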

BAC library construction

Young floral buds were obtained from clone 108 in the yellow-poplar breeding orchard of the University of Tennessee’s Tree Improvement Program based at Knoxville, Tennessee. Outer bud layers were removed, and the innermost layers including meristematic regions were used for high molecular weight DNA isolation. BAC library construction was essentially the same as that described previously by Tomkins et al. (2001, 2002). The first size selection of HindIII fragments via pulsed field gel electrophoresis used switch times of 1–40 s in a linear ramp for 18 h (12°C, 0.5× TBE buffer). Fractions between 100 and 300 kbp were cut from the gel, inserted into a second gel, and run with a 3- to 5-s switch time (18 h at 12°C, 0.5× TBE buffer) to remove small trapped DNA fragments. After removing appropriate fractions from the second size selection, DNA was removed from the agarose by electroelution (Model 422 Electro-Eluter, Bio–Rad). Size-selected DNA was ligated into the pCUGIBAC-1 vector (Luo et al. 2001) and electroporated (Cell-Porator, Gibco BRL) into Escherichia coli (strain DH10B) cells using 320–330 V. Transformed cells were plated on selective medium [Luria–Bertani (LB) medium] in 24 × 24 cm plates (Genetix) with 12.5 μg/ml chloramphenicol, 0.55 mM IPTG, and 80 μg/ml X-Gal. After a 20-h incubation at 37°C, the plates were placed at room temperature in the dark for an additional 20 h to allow stronger color development of nonrecombinant colonies. Plates were either stored at 4°C or used immediately for picking. Recombinant white colonies were picked robotically using the Genetix Q-BOT and arrayed as individual clones in 384-well microtiter plates (Genetix) containing 50 μl freezing broth. After incubation overnight, microtiter plates were stored at −80°C. Three copies of the library were made using the replicating function of the Genetix Q-BOT and stored in separate −80°C freezers. 
A total of 384 clones were selected randomly for characterization of average insert size, percentage of empty vectors, and the frequency of organellar DNA clones as previously described by Tomkins et al. (1999), Budiman et al. (2000), and Yu et al. (2000).

BAC end sequencing

Ninety-six BAC clones were randomly selected and sent to The Institute for Genomic Research for end sequencing. To determine the sequence composition of the BAC end sequences, sequences were quality-trimmed and then used as queries in BLAST searches as described above in “Nuclear DNA characterization”.

BAC library screening for specific genes

Four filters containing the entire BAC library were screened by hybridization for Gigantea, Frigida, LFY, CAD, 4CL, and PAL genes. Probe DNA for each gene was prepared either by restriction enzyme digest or PCR from yellow-poplar cDNA clones provided by the Floral Genome Project (http://www.floralgenome.org/fgp/). For the yellow-poplar LFY gene, however, no corresponding cDNA clone existed, so a genomic clone was generated by PCR amplification of Liriodendron genomic DNA using a degenerate-sequence primer pair (LFY737: 5′-CGAGGTGGCCCGGGSNA-3′ and RCLFY739: 5′-CCGTCACGTCCCGCATYGTYACNTG-3′) and confirmed by sequencing. The LFY probe was 491 bp in length, containing 105 bp of coding sequence and 386 bp of intron sequence. Detailed information about the probes and the best BLASTX alignments to the probes are available in electronic supplementary material S1. Approximately 50 ng of template DNA was used for preparing labeled probes. Probes were labeled with 32P-dATP (3,000 Ci/mmol, 10 mCi/ml) using the Strip-EZ™ DNA Random Primed StripAble DNA Probe Synthesis and Removal Kit according to the manufacturer’s instructions (Ambion). Probes were purified using Centri Spin-20 columns (Princeton Separations) before hybridization. After prehybridizing the filters at 65°C overnight, filters were hybridized with individual probes at 55°C overnight. Washing stringency was set at 55°C. The solutions used for hybridization and washing were based on those of Frank (1997). Filters were stripped between hybridizations following the procedure of Maniatis et al. (1982). Positive BAC clones (those with strong hybridization signals) were ordered from the Clemson University Genomics Institute (LT_Ba library clones; http://www.genome.clemson.edu/groups/bac/). Clones were confirmed by Southern hybridization of restriction enzyme-digested BAC DNA using the same probe resources as for library screening. The Gigantea cDNA probe was labeled with 32P-dATP, and hybridization and washing conditions were the same as those described above for library screening. For the Frigida, LFY, CAD, 4CL, and PAL genes, however, positive clones were confirmed by labeling probes with digoxigenin-dUTP and filter hybridization using the DIG-High Prime DNA Labeling and Detection Starter Kit I (as described by the manufacturer, Roche Diagnostics).
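The degenerate LFY primers contain IUPAC ambiguity codes (S, Y, and N), so each primer is really a pool of distinct oligonucleotides. The sketch below, using only the standard IUPAC code table, counts and enumerates the sequences in each pool; the helper names are hypothetical and are not part of the published protocol.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer.upper()))]


LFY737 = "CGAGGTGGCCCGGGSNA"            # primer sequences as given in the text
RCLFY739 = "CCGTCACGTCCCGCATYGTYACNTG"

print(degeneracy(LFY737))    # S (2 bases) x N (4 bases) = 8
print(degeneracy(RCLFY739))  # Y (2) x Y (2) x N (4) = 16
```

A low degeneracy such as this keeps the effective concentration of each individual oligonucleotide in the pool relatively high.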

Subcloning of genes from the BAC clones and DNA sequencing

One positive BAC clone was chosen for each gene. The BAC clone inserts were fragmented by restriction enzyme digestion (HindIII was used for Gigantea and 4CL, EcoRI for Frigida, CAD, and PAL, and PstI for LFY) and subcloned into the pBluescript II SK(−) plasmid vector (Stratagene, La Jolla, CA, USA). Subclones containing fragments of the relevant genes were identified by dot-blot hybridization (following the method of Maniatis et al. 1982) using the probes prepared as above for BAC clone verification (i.e., the Gigantea cDNA probe was labeled with 32P-dATP, and the Frigida, LFY, CAD, 4CL, and PAL probes were labeled with digoxigenin-dUTP). Positive subclones were sent to either the Nucleic Acid Facility at Pennsylvania State University or the Cornell BioResource Center for sequencing.

Genomic DNA hybridization for gene copy number estimates

Total genomic DNA was isolated from bark tissues of young yellow-poplar twigs (clone 108 at the University of Tennessee) using the Qiagen DNeasy Maxi Kit. Aliquots (10 μg) of genomic DNA were digested individually with five different restriction enzymes (EcoRI, BamHI, EcoRV, NcoI, or XhoI) and then size-fractionated via electrophoresis in a 0.8% agarose gel before being transferred to Hybond N membrane (Amersham). The genomic DNA was hybridized individually with Gigantea, Frigida, LFY, PAL, CAD, and 4CL probes (same probe resources as for library screening). The Frigida, LFY, CAD, 4CL, and PAL probes were labeled with 32P-dATP, and the Gigantea cDNA probe was labeled with digoxigenin-dUTP. Hybridization was performed as described above for BAC library screening and positive clone confirmation.

Results

Genome-size estimates

The approximate genome size of yellow-poplar was determined by flow cytometry using leaf nuclei from nine unrelated trees and tobacco as an internal standard. The genome sizes obtained ranged from 1,706.50 to 1,853.36 Mbp per haploid genome, with an average size of 1,802 Mbp per haploid genome (SD = 16) (Table 1). Yellow-poplar thus has a medium-sized genome when compared to plants known to be at the small (Arabidopsis at 157 Mbp per haploid genome) (Bennett et al. 2003) or large (wheat at 15,966 Mbp per haploid genome) (Arumuganathan and Earle 1991) ends of genome-size distribution in angiosperms.

Table 1 Flow cytometry determination of the nuclear genome sizes of several representative yellow-poplar individuals

Nuclear DNA characterization

To estimate the complexity of the Liriodendron genome, a small shotgun library of yellow-poplar genomic DNA was constructed in the plasmid vector pBluescript II SK(+). Ninety-six clones randomly selected from this shotgun library were end-sequenced, resulting in a total of 176 high-quality sequences. The average read length was 635 bp with a standard deviation of 130 bp. When searched against GenBank (August 8, 2005), TBLASTX (matching each translated end-sequence tag against the translated nucleotide database) identified 54 matches with E ≤ 10⁻⁶, whereas BLASTX (matching translated end sequences to the GenBank protein database) and BLASTN (matching nucleotide to nucleotide sequences) alignments resulted in 47 and 14 matches, respectively. Among the 54 sequences with E ≤ 10⁻⁶, one was homologous to a chloroplast protein (E ≤ 8 × 10⁻⁹), and three were identified as homologs of bacterial or fungal proteins (E ≤ 2 × 10⁻²⁷). Of the remaining 50 sequences, 30 (60%) matched various transposable elements (17% of the total number of shotgun end sequences), while 16 (32%) matched putative genes (9% of the total number of shotgun end sequences). The other four sequences with E ≤ 10⁻⁶ matched sequences that could not be classified into any informative category. Forty-one of these 50 shotgun end sequences (82%) had best alignments to eudicot sequences, whereas eight (16%) had best alignments to monocot sequences. Finally, 122 shotgun end sequences (69%) did not share significant sequence similarity with any DNA or protein sequences in GenBank. The AT content of the shotgun end sequences was 59%, whereas GC content was 41%. Detailed information about the BLAST results of the 54 shotgun end sequences can be found in supplemental Table S2.

Liriodendron BAC library construction and characterization

Our L. tulipifera BAC library, which was constructed from yellow-poplar clone 108 in the University of Tennessee’s Tree Improvement Program, is suitable for physical mapping, map-based cloning, and high-throughput sequencing of selected genomic regions. The library consists of 73,728 clones stored in 192 microtiter plates of 384 wells each at the Clemson University Genomics Institute. All of the clones in the BAC library were gridded onto 22.5 × 22.5 cm filters at high density in 4 × 4 patterns of doubled spots for ease of spot identification after hybridization.

A total of 384 randomly sampled clones were digested with NotI to release the insert. Digestion typically generated the vector band plus one or two insert bands per BAC clone (Fig. 1), probably because two NotI sites flank the multi-cloning site in pCUGI-1 and because NotI recognizes an 8-bp sequence composed only of G and C, which is expected to be rare in the relatively AT-rich yellow-poplar genome (59% AT according to our BAC end and shotgun end sequences). To determine the size distribution of BAC clones in the library, the 384 BACs analyzed with NotI digests were grouped by insert size, and the frequency of each size class was plotted (Fig. 2). This analysis revealed that 89% of the clones in the library had inserts equal to or larger than 100 kb. The average insert size of the BAC library was 117 kb, with a range of 40 to 280 kb. Empty vector clones were very rare (less than 1%). Based on the average insert size and a haploid genome size of 1,802 Mbp, the coverage of the library is 4.8× haploid genome equivalents, resulting in a 99.2% probability of recovering any specific sequence of interest (Clarke and Carbon 1976).
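The coverage and recovery probability quoted above can be reproduced from the clone count, the average insert size, and the haploid genome size. The sketch below uses the Clarke and Carbon (1976) relationship P = 1 − (1 − f)^N, where f is the fraction of the genome carried by one average insert and N is the number of clones. Function names are hypothetical, and no correction for chloroplast or empty clones is applied.

```python
def library_coverage(n_clones, mean_insert_kb, genome_mbp):
    """Library size expressed in haploid genome equivalents."""
    return n_clones * mean_insert_kb / (genome_mbp * 1000.0)


def recovery_probability(n_clones, mean_insert_kb, genome_mbp):
    """Clarke-Carbon probability that a given sequence is in at least one clone."""
    f = mean_insert_kb / (genome_mbp * 1000.0)  # genome fraction per insert
    return 1.0 - (1.0 - f) ** n_clones


N_CLONES = 73_728      # clones in the library
MEAN_INSERT_KB = 117   # average insert size (kb)
GENOME_MBP = 1_802     # haploid genome size (Mbp)

cov = library_coverage(N_CLONES, MEAN_INSERT_KB, GENOME_MBP)
p = recovery_probability(N_CLONES, MEAN_INSERT_KB, GENOME_MBP)
print(f"coverage = {cov:.1f}x, P(recovery) = {100 * p:.1f}%")
```

With the previously reported genome size of 790 Mbp in place of 1,802 Mbp, the same functions return a coverage of roughly 10.9×, which illustrates how strongly the coverage estimate depends on the genome-size input.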

Fig. 1

An analysis of 43 randomly selected Liriodendron BAC clones. Ethidium-bromide-stained CHEF gel (5–15 s switch time, 15 h) showing insert DNA above the common 7.5 kb pCUGI-1 vector band. Molecular weight marker in outside lanes is a 50-kb lambda ladder, and the molecular weight marker in the center of gel is a Midrange I (NEB) with bands at 15, 34, 49, 82, 97, 131, and 146 kb

Fig. 2

Insert size distribution of clones from the yellow-poplar BAC library. To estimate insert size range, BAC DNA inserts from 384 randomly selected clones were analyzed using NotI digestion and pulsed field gel electrophoresis. The average insert size was 117 kb, and 89% of the clones are ≥ 100 kb

Approximately 0.2% of library sequences were determined to be of chloroplast DNA origin by screening the high-density colony filter arrays with four highly conserved chloroplast genes spaced equidistantly around the 154-kbp soybean chloroplast genome. We attribute this exceptionally low level of chloroplast sequence contamination to use of the inner layers of young floral buds that were not photosynthetically mature and probably had a low plastid content.

BAC end sequencing results

As one quality control, 96 BAC clones were end-sequenced. A total of 192 sequencing reactions were performed, resulting in a set of 182 high-quality sequences, for a sequencing success rate of 95%. The average read length, after trimming of vector sequence, was 761 bp with a standard deviation of 142 bp.

TBLASTX searches against GenBank resulted in 45 BAC end-sequence tags (BESTs) with E ≤ 10⁻⁶, whereas BLASTX and BLASTN alignments resulted in 42 and nine sequences, respectively. Among the BESTs, only one contained chloroplast DNA. This value, 0.5% of the total high-quality BESTs, is consistent with the estimate of 0.2% chloroplast DNA inserts obtained by hybridization screening of the library with probes from the chloroplast genome. Of the remaining 44 BESTs with matches in GenBank, 30 were similar to transposable elements. This indicates that 68% of the BESTs with matches in GenBank, and a minimum of 16% of all Liriodendron BAC end sequences, may have been derived from transposable elements. In addition, 11 putative genes were detected by BLAST searches, a value that represents 25% of sequences with matches in GenBank and 6% of the total high-quality Liriodendron sequences. Two sequences with E ≤ 10⁻⁶ matched sequences that could not be classified into any informative category. The remaining BEST with a significant alignment score was similar to a Magnolia stellata microsatellite marker (E value = 7e−13). Finally, 136 high-quality BESTs (75%) lacked significant matches in GenBank. The putative assignments of the 45 BESTs by BLAST searches are available in electronic supplementary material S3. BLAST searches also revealed that 84% of the BESTs with E ≤ 10⁻⁶ had best hits to eudicot sequences, whereas 9% of the top alignments were to monocot sequences. The AT content of the BESTs was 59%, whereas GC content was 41%.
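The percentages reported for the BAC end sequences, and the corresponding figures for the shotgun end sequences, are simple ratios of the counts given in the text. The short sketch below recomputes them; the helper name and the dictionary layout are illustrative choices, and only counts stated in the text are used.

```python
def shares(count, n_matched, n_total):
    """Return (percent of reads with GenBank matches, percent of all high-quality reads)."""
    return round(100 * count / n_matched), round(100 * count / n_total)


# Counts quoted in the text: nuclear reads with GenBank matches ("matched")
# and all high-quality reads ("total") for each dataset.
datasets = {
    "BAC ends": {"matched": 44, "total": 182,
                 "transposable elements": 30, "putative genes": 11},
    "shotgun ends": {"matched": 50, "total": 176,
                     "transposable elements": 30, "putative genes": 16},
}

for name, d in datasets.items():
    for category in ("transposable elements", "putative genes"):
        of_matched, of_total = shares(d[category], d["matched"], d["total"])
        print(f"{name}: {category} = {of_matched}% of matched, {of_total}% of all reads")
```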

Screening of the BAC library for Gigantea, Frigida, LEAFY, CAD, 4CL, and PAL genes

To test whether our BAC library was useful for cloning homologues, we cloned three floral-timing genes, Gigantea, Frigida, and LFY, and three lignin biosynthesis pathway genes, PAL, CAD, and 4CL, from the library. These genes are known to have low copy numbers in other plant species (Park et al. 1999; Johanson et al. 2000; cf. Dixon et al. 2001; cf. Wada et al. 2002; Dunford et al. 2005; http://fgp.huck.psu.edu/tribe.php). As shown in Table 2, at least two positive BAC clones were found for each of these genes, with PAL having the most clones (eight) and LFY the fewest (two). The fact that only two positive clones were found for LFY in the BAC library may have resulted from the small size of the probe that was used (105 bp of coding sequence). Putative positive BAC clones were reconfirmed by Southern hybridization of restricted BAC DNA (Fig. 3), and then positive restriction fragments from the BAC inserts (those that hybridized in the Southern analysis) were cloned. Genomic DNA sequence was then obtained for each of the subclones to confirm that they contained the desired genes. BLASTX searches of GenBank with the sequences of these BAC subclones yielded strong matches (E < 5e−15) to homologues from other species for all of the genes (Table 2) except Frigida. When compared to the Frigida protein sequence from Arabidopsis, the Liriodendron Frigida sequence shared only 34% amino acid identity, with a BLASTX E value of only 7e−06. This is not too surprising because the Arabidopsis Frigida gene is not well conserved in other plant species, including cereals, pea, and other legumes (Izawa et al. 2003; Simpson 2003; Hecht et al. 2005).

Table 2 Positive clones identified from screening the Liriodendron BAC library with homologous probes
Fig. 3

Examples of the BAC library screening for specific genes of interest and confirmation by Southern hybridization. An autoradiogram of BAC-library filter (left) and an autoradiogram of Southern blot (HindIII digest) (right) are shown

Genomic Southern hybridization analysis

Southern hybridization was conducted to estimate the copy numbers of the genes used for screening the BAC library filter array. Five different restriction enzymes were employed individually to digest the genomic DNA. However, for each gene, only one or two of the restriction enzyme digests resulted in useful hybridization patterns with fragments smaller than 10 kb that could be accurately scored. As shown in Fig. 4, the Gigantea, Frigida, LFY, PAL, CAD, and 4CL probes detected one to two fragments in the nuclear genome of Liriodendron by Southern hybridization. The only exception was the CAD probe which revealed five genomic fragments on the EcoRV digest blot. Based on the sequences included in the probes and the number of clones retrieved from the BAC library, most of the hybridization fragments were as expected in terms of size and number, indicating that the Gigantea, Frigida, LFY, PAL, and 4CL genes appear to be either low or single copy genes in yellow-poplar, while CAD may be in a small, multigene family.

Fig. 4

Genomic Southern hybridization. Aliquots of yellow-poplar genomic DNA were digested with restriction enzymes, separated by electrophoresis on a 0.8% agarose gel, and transferred to Hybond N membrane (Amersham). The genomic DNA blots were hybridized individually with Gigantea, Frigida, LFY, PAL, CAD, and 4CL probes. L DNA Ladder I (GeneChoice)

Discussion

We describe in this study the development and characterization of the first large-insert DNA library for the genus Liriodendron. To our knowledge, these are the first published data on the complexity of a Liriodendron genome or that of any other basal angiosperm species. This BAC library contains 73,728 clones with an average insert size of 117 kb. Contamination is very low in the library (0.2% chloroplast DNA clones and less than 1% empty vector clones). Taking the 0.2% chloroplast DNA content into consideration along with a haploid genome size of approximately 1,802 Mbp, we estimate that the library represents 4.8× haploid genome equivalents and calculate a 99.2% probability of including any specific sequence of interest. These features will facilitate the use of this library in many high-throughput genomic applications.

The genome-size estimates that we obtained from nine different yellow-poplar trees were over twice as large as the previously reported genome size for Liriodendron (790 Mbp, Bennett et al. 2000). Because the BAC library was constructed based on the previous genome-size estimate, the library coverage is lower than originally intended. However, at 4.8× haploid genome equivalents, it still provides sufficient genome coverage to permit low-copy genes to be isolated, as demonstrated by the success of retrieving sequences for Gigantea, Frigida, LFY, CAD, 4CL, and PAL genes from the library.

We chose the Gigantea, Frigida, LFY, CAD, PAL, and 4CL genes to test the utility of the library in identifying important regions of the genome for two reasons. First, these six genes are known to play important roles in timing floral initiation and in lignin biosynthesis in other plants. Functional genomics studies of these genes in Liriodendron and comparative studies of these genes with their homologues in other plants should be informative for further studies of floral development and lignin biogenesis. Second, these genes are low-copy genes in other plant species (in most cases, one to two copies per haploid plant genome for Gigantea, Frigida, and LFY, and one to four copies for CAD, PAL, and 4CL; Park et al. 1999; Johanson et al. 2000; cf. Dixon et al. 2001; cf. Wada et al. 2002; Dunford et al. 2005). Our genomic Southern hybridization analysis resulted in simple hybridization patterns that indicate that these genes appear to be low copy in yellow-poplar as well. Assuming that Gigantea, Frigida, and LFY are present as single copy genes, and given the 4.8× depth of coverage for the library, we would expect to find four to five BAC clones in the library for each of these genes. Our primary screening of filter arrays of the library produced 4.3 clones per gene (on average; data not shown), which was very close to the expected number. After Southern hybridization analysis, used to confirm the clones, the average number of positive BAC clones was, however, reduced to 3.3 (ranging from two to five). The genomic sequences obtained from this study are the first to be reported for Liriodendron and among the first reported for any basal angiosperm species.
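The expectation of four to five clones per single-copy gene can be made explicit with a simple model. The sketch below assumes random, unbiased cloning, so that the number of clones covering a locus is approximately Poisson distributed with a mean equal to library coverage multiplied by copy number. This is an illustrative simplification, not a model fitted in this study, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k clones covering a locus when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def prob_at_most(k_max, lam):
    """Probability of k_max or fewer clones covering a locus."""
    return sum(poisson_pmf(k, lam) for k in range(k_max + 1))


coverage = 4.8   # haploid genome equivalents reported for the library
copies = 1       # assumed copy number per haploid genome

expected_clones = coverage * copies
p_none = poisson_pmf(0, expected_clones)
p_two_or_fewer = prob_at_most(2, expected_clones)

print(f"expected clones per gene: {expected_clones:.1f}")
print(f"P(no clone) = {p_none:.3f}")
print(f"P(two or fewer clones) = {p_two_or_fewer:.3f}")
```

Under this assumption, recovering as few as two clones for a single-copy gene is uncommon but not implausible, so a low clone count for one gene does not by itself indicate a problem with the library.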

The BLAST searches of both the Liriodendron BAC end sequences and the small-insert random genomic clone shotgun end sequences produced roughly two to three times as many matches in GenBank to transposons as to genes. Transposable elements are ubiquitous in the plant kingdom. They are present in high copy numbers in most plants, making them major constituents of plant genomes (Kumar and Bennetzen 1999). Plant species such as maize (San Miguel et al. 1996), Pinus lambertiana (Kossack and Kinlaw 1999), Lilium speciosum (Leeton and Smyth 1993), wheat (Moore et al. 1991), rice (Noma et al. 1997), Vicia species (Pearce et al. 1996a,b), Allium cepa (Pearce et al. 1996a,b), and tobacco (Yoshioka et al. 1993) contain more than 50% transposable elements in their genomes. Based on our genome-size estimate and BLAST results, we predict that over 50% of the Liriodendron genome is composed of transposable elements as well. The number of transposable elements in the Liriodendron genome may be even greater than this estimate if the majority do not share sequence homology with any previously sequenced elements. Perhaps the relatively large genome size that we found for Liriodendron relative to other magnoliids can be explained by a larger number of transposable elements and/or other repetitive elements in Liriodendron, as opposed to a recent genome duplication. This interpretation is supported by our Southern hybridization results for single copy genes, which would suggest that either a recent genome duplication event has not taken place or that the low copy genes in Liriodendron behave as single copy on Southern blots because the genes have not yet diverged in sequence. Analysis of genes expressed during flower development (Albert et al. 2005) does show evidence for two rounds of ancient polyploidy in Liriodendron (Cui et al. 2006). However, the Ks approach used in Cui et al. (2006) cannot detect very recent genome duplication events in which sequences in multiple copies of a gene are so similar that they assemble into the same unigenes.

It is quite interesting that the majority of the Liriodendron BAC end sequences and shotgun end sequences with matches in GenBank aligned best to eudicot sequences (84 and 82%, respectively) despite the fact that the full rice genome was available. This pattern may support a phylogeny that places the magnoliids closer to eudicots than to monocots, or may be indicative of more similar composition patterns at the DNA and protein levels between Liriodendron and the diverse eudicots that have been identified (Leebens-Mack et al. 2005; Qiu et al. 2005). Sequences from BAC and shotgun end sequencing also indicate that Liriodendron has a relatively AT-rich genome.

Approximately 75% of BESTs and 69% of the genomic shotgun sequences did not share significant sequence similarity with any DNA or protein entries in GenBank, indicating that a significant portion of the yellow-poplar genome remains uncharacterized and, when better characterized, may be more informative for understanding other basal angiosperms than sequences from the model species from which most genomic sequences in GenBank are derived. Because L. tulipifera occupies an important phylogenetic position as a basal angiosperm, genomic information from it will help provide important insights into the evolution of biological processes in angiosperms. When combined with the available EST data, the LT_Ba BAC library will be an important tool for gene targeting and unraveling gene functions in yellow-poplar. The yellow-poplar BAC library LT_Ba is available to the scientific community through the Clemson University Genomics Institute (for ordering information, visit the CUGI BAC/EST Resource Center site http://www.genome.clemson.edu/cgi-bin/orders).