Introduction

Transposable elements (TEs) are mobile DNA sequences that are a common constituent of every genome studied to date. Those which transpose via an RNA intermediate are termed Class I elements or retrotransposons, while those in which transposition occurs directly are termed Class II elements or DNA transposons (Flavell et al. 1994). Retrotransposons are closely related to retroviruses with respect to both their structure and life cycle (Boeke and Corces 1989). They have been classified on the basis of whether or not they possess a long terminal repeat (LTR) sequence (Kumar and Bennetzen 1999). Among the LTR retrotransposons, two types (Ty1-copia and Ty3-gypsy) have been distinguished, mainly on the basis of the linear arrangement of their coding domains. Retrotransposons are ubiquitous in the plant kingdom and constitute a large fraction of many plant genomes, ranging from ~5.5% of the Arabidopsis thaliana genome to >50% of that of Zea mays (Hirochika 1993; Pearce et al. 1996; SanMiguel et al. 1996).

Most retrotransposons are believed to be either transcriptionally inactive (Kumar and Bennetzen 1999), or active only during particular stages of development or in response to stress (Hirochika 1993), wounding or pathogen infection (Kimura et al. 2001). Transcription of tobacco Tto1, Tnt1 and Tnp2, rice Tos17, oat OARE-1 and orange CIRE1 has been demonstrated (Grandbastien et al. 1989; Vaucheret et al. 1992; Hirochika 1993; Hirochika et al. 1996; Kimura et al. 2001; Rico-Cabanas and Martínez-Izquierdo 2007) but, in all cases, transcriptional activity either is limited to stressful conditions or occurs only in a restricted number of tissues (Grandbastien 1998). Based on their abundance in all plant genomes, their stability during transposition, and their sensitivity to stress, retrotransposons are believed to play a number of significant biological roles (White et al. 1994; Kato et al. 1999; Lonnig and Saedler 2002).

The strawberry (Fragaria spp.) is an herbaceous perennial that propagates vegetatively. Micropropagation, a process which includes a period in tissue culture, is widely used to produce strawberry plants. In many plant species, including strawberry (Nehra et al. 1991), regenerated plants arising from tissue culture are phenotypically variable, an effect termed somaclonal variation. One of the mechanisms proposed to underlie somaclonal variation is retrotransposon activation (Tahara et al. 2004). Segments of both the Ty1-copia and Ty3-gypsy genes encoding reverse transcriptase (RT) were amplified from the genomic DNA of strawberry by Ma et al. (2008), who concluded that the contribution of the Ty1-copia type was greater than that of the Ty3-gypsy type. Here, we report the isolation of a complete, transcriptionally active Ty1-copia retrotransposon (FaRE1) from the cultivated strawberry genome using genome walking, and examine the extent to which FaRE1 expression is affected by the exogenous application of various phytohormones.

Materials and methods

Plant material

The strawberry cultivar “Allstar” was used throughout these experiments. To analyze the effect of phytohormones on retrotransposon expression, folded young leaves (~1.5 cm long) of greenhouse-cultivated plants were sprayed with 50 mM naphthalene acetic acid (NAA), 50 mM 2,4-dichlorophenoxyacetic acid (2,4-D) or 50 mM abscisic acid (ABA).

Nucleic acid extraction

Genomic DNA was extracted from fresh leaves following a modified CTAB method (Ma et al. 2008). Total RNA was isolated using the modified CTAB method, as described by Chang et al. (2007). All RNA samples were treated with DNase I (TaKaRa, Japan) at 37°C for 4 h before precipitating RNA. The concentration of nucleic acid in each preparation was estimated spectrophotometrically.

Genome walking

Isolation of the complete strawberry Ty1-copia retrotransposon was based on xp5-2, a 260 bp RT fragment isolated from strawberry leaves by RT-PCR (Ma 2008). The Universal Genome Walker kit (Clontech, Palo Alto, CA) was used to isolate the complete Ty1-copia retrotransposon sequence. Genomic DNA was blunt-end digested by DraI, EcoRV, PvuII or StuI (TaKaRa, Japan). The construction of Genome Walker libraries, and the purification and ligation of adaptors to restriction fragments were performed as described in the Universal Genome Walker kit manual. Amplification reactions were performed in a 20 μL volume based on the Advantage Genomic PCR Kit (Clontech), according to the manufacturer’s instructions. The first PCR consisted of 7 cycles of 94°C/10 s and 70°C/3 min, followed by 32 cycles of 94°C/10 s and 65°C/3 min. The second PCR consisted of 5 cycles of 94°C/10 s, and 70°C/3 min, followed by 20 cycles of 94°C/10 s, and 65°C/3 min. At the end of each PCR, an elongation step of 65°C/10 min was included.

Long-range PCR

Amplification was performed in 20 μL reactions containing 50 ng genomic DNA, 1 μM of each primer (see Table 1) targeted outside the FaRE1 sequence, and 0.5 U LA Taq DNA polymerase in a reaction buffer that included 0.2 mM dNTP (TaKaRa, Japan). The cycling conditions consisted of an initial denaturing step of 94°C/3 min, followed by 35 cycles of 94°C/1 min, 55°C/1 min and 72°C/5 min, and ending with an elongation step of 72°C/10 min.

Table 1 Primer sequences used for PCR

Dot blotting

TE copy number was estimated by hybridization between genomic DNA and a probe amplified from the strawberry genome in a 25 μL PCR comprising 25 ng genomic DNA, 200 μM dNTP, 1 μM of each primer directed at various TE domains (primer sequences listed in Table 1), and 1 U Ex Taq polymerase (TaKaRa, Japan). The amplification profile comprised an initial denaturation step of 94°C/3 min, followed by 35 cycles of 94°C/1 min, 55°C/1 min and 72°C/1 min, and a final elongation step of 72°C/10 min.

The amplified products were purified from 1.5% agarose gels, and labelled with digoxigenin-dUTP using the DIG High Prime DNA Labeling and Detection Starter Kit II (Roche, Mannheim, Germany), following the manufacturer’s instructions. Genomic DNA and the PCR products were denatured in 1 M NaOH, 0.2 M EDTA (pH 8.2) for 30 min, held at 100°C for 10 min, then quickly chilled. The denatured DNA was spotted onto an Immobilon-Ny+ membrane (Millipore, Bedford, MA) in various amounts (genomic DNA: 50, 100, 200, and 400 ng; PCR products: 25, 50, 100, and 200 pg). The membranes were incubated in DIG Easy hybridization buffer for 16 h at 62°C. After washing, the signal was visualized immunologically following the DIG High Prime DNA Labeling and Detection Starter Kit II protocol. A linear regression between the natural logarithm of probe sequence copy number and signal strength was used to infer copy number in the genome.
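The calibration step described above can be sketched in a few lines of Python. This is only an illustration, not the analysis actually run: the signal intensities are invented, the function names are hypothetical, and the sketch regresses signal on the natural logarithm of probe mass, which is assumed here to be proportional to probe copy number.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def probe_equivalent_pg(signal, intercept, slope):
    """Invert the calibration line: probe mass (pg) that gives this signal."""
    return math.exp((signal - intercept) / slope)


# Probe amounts spotted on the membrane (pg), as listed in the text.
standard_pg = [25, 50, 100, 200]
# HYPOTHETICAL signal intensities (arbitrary units), invented for illustration.
standard_signal = [10.0, 20.0, 30.0, 40.0]

intercept, slope = fit_line([math.log(m) for m in standard_pg], standard_signal)

# HYPOTHETICAL signal read from one genomic DNA dot.
equivalent = probe_equivalent_pg(20.0, intercept, slope)
print(f"genomic dot is equivalent to {equivalent:.1f} pg of probe")
```

The probe-equivalent mass obtained in this way, divided by the mass of genomic DNA in the dot, gives the proportion of the genome hybridizing to the probe.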

RT-PCR and northern blotting

cDNA was synthesized from total RNA using Reverse Transcriptase XL (AMV; TaKaRa) primed with dT18 and a random nonamer primer, following the manufacturer’s instructions. Amplification of cDNA was performed in a 20 μL reaction containing 1 μL cDNA, 1 μM of each primer (targeting the GAG, integrase, reverse transcriptase or RNase H domain; see Table 1), 0.5 U Taq DNA polymerase and buffer that included 0.2 mM dNTP (TaKaRa). The cycling conditions were an initial denaturation step (94°C/3 min), followed by 35 cycles of 94°C/1 min, 55°C/1 min and 72°C/1 min, and a final elongation step (72°C/10 min). Strawberry 18S ribosomal RNA was used as an internal control gene (Simovic et al. 1992).

For northern analysis, RNA (10 μg) was separated on a 1.2% formaldehyde agarose gel and then transferred to an Immobilon-Ny+ membrane (Millipore). The membrane was prehybridized and hybridized in DIG Easy hybridization buffer for 16 h at 42°C using a reverse transcriptase domain cDNA probe labelled with digoxigenin-dUTP, as described for dot blotting.

DNA sequence analysis

PCR products were purified and recovered using the Agarose Gel DNA Purification and Recovered Kit (TaKaRa), ligated into pBS-T (Tiangen, China) and transformed into Escherichia coli strain JM109. DNA sequencing reactions employed M13 primers, and were performed by Shanghai Biological Engineering and Technology and Service (Shanghai, China). Each clone sequence was subjected to either BLASTN or BLASTX analysis (http://www.ncbi.nlm.nih.gov) and aligned using ClustalX (Thompson et al. 1997). These alignments were then used to generate neighbour-joining trees using MEGA4.1 (Kumar et al. 2008) with a 1,000 replicate bootstrap test. The PlantCARE (http://www.oberon.rug.ac.be.8443/carebin/CallMat_IE55.htpl) and PLACE (http://www.dna.affrg.go.jp/htdocs/PLACE) databases were used for the analysis of the FaRE1 retrotransposon 5′-LTR domain.

Results

FaRE1, a novel Ty1-copia retrotransposon from strawberry

To isolate the complete sequence of a Ty1-copia retrotransposon from F. × ananassa, a 260 bp Ty1-copia RT sequence, named xp5-2, was used as the starting sequence for genome walking. The xp5-2 sequence was isolated from strawberry leaves treated with 50 mM ABA by RT-PCR with degenerate primers corresponding to reverse transcriptase domains I and III of Ty1-copia retrotransposons (Ma 2008). Two successive downstream and two successive upstream walking steps generated fragments that could be assembled into a 6,066 bp contig, which contained the complete LTR retrotransposon sequence with two almost perfect LTRs (490 bp), starting with TG and ending with CA. To ensure that the contig was not a mosaic of independent sequences, a primer pair was targeted outside the retrotransposon sequence, and used to amplify a ~5,200 bp fragment from the strawberry genome (Fig. 1). This sequence shared nearly 99% identity with the contig sequence. We have named the complete LTR retrotransposon sequence FaRE1 (Fragaria × ananassa retrotransposon), and deposited this sequence in GenBank (accession number FJ871121).

Fig. 1

Long-range polymerase chain reaction (PCR) of the genomic region including FaRE1. 1% agarose gel electrophoresis pattern of PCR products amplified with a pair of primers located outside FaRE1. M 500 bp ladder marker

FaRE1 is 5,104 bp in length, and consists of a single 3,891 bp open reading frame (ORF). Its structure is typical of the Ty1-copia type, containing a pair of LTRs (490 bp) starting with TG and ending with CA. The element is flanked by the 5 bp direct repeat 5′-CAAAT-3′ (Fig. 2a), which probably represents a duplication of the genomic target site produced by transposon insertion (Konieczny et al. 1991). Both LTRs start and end with the 4 bp inverted repeat 5′-TGGT…ACCA-3′, which includes the retroviral consensus 5′-TG…CA-3′ that is thought to be important for integration (Liubomirskaia et al. 2003). The alignment of FaRE1 with other plant retrotransposons (Fig. 2a) showed that the ORF includes the genes encoding group-specific antigen (GAG), protease (PR), integrase (IN), reverse transcriptase (RT) and RNaseH (RH), ordered as in all Ty1-copia retrotransposons. A particularly high degree of identity was obtained between the peptide sequences encoded by FaRE1 and CIRE1 of orange, namely 30, 35, 43, 58 and 50% for GAG, PR, IN, RT and RH, respectively. On the basis of peptide sequence, FaRE1 belongs within the class of Ty1-copia retrotransposons (Fig. 2b).
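The terminal features described above (TG…CA termini, a 4 bp inverted repeat and a 5 bp target site duplication) are simple to check computationally. The following is a minimal sketch only; the function names are hypothetical and the toy sequences are invented, with just the TGGT…ACCA termini and the CAAAT repeat taken from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_tg_ca_termini(ltr):
    """True if the LTR starts with TG and ends with CA."""
    ltr = ltr.upper()
    return ltr.startswith("TG") and ltr.endswith("CA")


def terminal_inverted_repeat(ltr, length=4):
    """Return (head, tail) if the two termini form an inverted repeat, else None."""
    ltr = ltr.upper()
    head, tail = ltr[:length], ltr[-length:]
    return (head, tail) if reverse_complement(head) == tail else None


def has_target_site_duplication(upstream, downstream, length=5):
    """True if the bases just before the element match those just after it."""
    return upstream[-length:].upper() == downstream[:length].upper()


# HYPOTHETICAL toy LTR: the internal bases are invented for illustration.
toy_ltr = "TGGT" + "AAGCTTGACC" + "ACCA"
# HYPOTHETICAL flanking sequences carrying the CAAAT repeat from the text.
toy_upstream, toy_downstream = "GGCAAAT", "CAAATCC"

print(has_tg_ca_termini(toy_ltr))
print(terminal_inverted_repeat(toy_ltr))
print(has_target_site_duplication(toy_upstream, toy_downstream))
```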

Fig. 2

FaRE1, a novel Ty1-copia retrotransposon isolated from strawberry. a The structure of the FaRE1 retrotransposon is shown schematically at the top of the figure. Alignments of the amino acid sequences of the Gag, protease, integrase, reverse transcriptase and RNase H domains of BARE1 of barley (Manninen and Schulman 1993), CIRE1 of sweet orange (Rico-Cabanas and Martínez-Izquierdo 2007), copia of Drosophila (Mount and Rubin 1985), Tlc1 of tomato (Yañez et al. 1998), Tnt1 of tobacco (Grandbastien et al. 1989) and Tto1 of tobacco (Hirochika 1993) are shown together with those encoded by the sequenced FaRE1 retrotransposon. b Dendrogram of conceptually translated amino acid sequences of the whole gag-pol region of FaRE1 and other retrotransposons. The tree was obtained using the neighbor-joining method (Saitou and Nei 1987). Horizontal distances are proportional to evolutionary distances according to the scale shown at the bottom. The tree was displayed with the MEGA4.1 program, showing bootstrap values (from 1,000 replicates) higher than 50%. In addition to the retrotransposons described in a, the amino acid sequences of Gypsy of Drosophila (Marlor et al. 1986), Maggy of Magnaporthe grisea (Farman et al. 1996), the fungal Skippy element (Anaya and Roncero 1995), Ted of baculovirus (Friesen and Nissen 1990) and Ty3 of yeast (Hansen et al. 1988) are also included in the comparison

A further feature typical of retroviruses and retrotransposons that is also present in FaRE1 is the primer binding site (PBS) adjacent to the 5′LTR. The FaRE1 PBS is 20 bp long and represents a potential tRNA-Met primer binding site for the initiation of minus strand synthesis (Fig. 2a). Typically, this sequence begins with TGG and is complementary to the 3′-end of the host-encoded tRNA-Met that is used to prime minus strand DNA synthesis (Sprinzl et al. 1996). Upstream of the 3′LTR, FaRE1 has a 15 bp polypurine tract (Fig. 2a), preceded by seven pyrimidines, an arrangement characteristic of the putative priming site for plus strand DNA synthesis in retroid elements (Perlman and Boeke 2004). The positions of both priming sequences coincide with those described in other plant retrotransposons (Kumar and Bennetzen 1999).

FaRE1 copy number

The dot blot hybridization assay indicated that 200 ng genomic DNA gave the same strength of signal as 50 pg of the LTR probe (Fig. 3). F. × ananassa has a 1C DNA value of 0.61 pg (598 Mb; Nehra et al. 1991) but, as it is an octoploid species whose haploid complement consists of four genomes, the size of a single strawberry genome is 149.5 Mb. Based on MacRae (1998), the FaRE1 copy number was calculated by multiplying the genome size by the average proportion of nuclear genomic DNA hybridizing to the probe, and dividing by the molecular weight of the probe sequence. On the basis of the LTR probe, this produced an estimate of 196 for the copy number of the LTR sequence. Applying the same calculation to GAG, IN, RT and RH resulted in estimates of 102, 101, 96 and 95, respectively (Table 2). Since the ratio between the LTR and the other sequence copy numbers is effectively 2, it appears that most, if not all, of the FaRE1 elements consist of a single ORF and two LTRs, which is characteristic of retrotransposons. RT is one of the regions encoding enzymes critical to the retrotransposon life cycle and would be expected to be well conserved among retrotransposons. Therefore, the copy number of the FaRE1 retrotransposon was estimated using the RT probe, giving ~96 copies per genome. Based on a copy number of 96/genome, the proportion of the strawberry genome occupied by FaRE1 is ~0.33%. As the number of Ty1-copia retrotransposons in the genome is of the order of 2,875 (Ma et al. 2008), FaRE1 represents around 3% of the population of Ty1-copia retrotransposons in the strawberry genome.
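The arithmetic behind these estimates can be retraced with a short Python sketch. All constants are taken from the text; the function `copy_number` is a hypothetical helper, and it assumes that the division by the probe is made in base pairs (probe length), which is one reading of the calculation described above. The probe lengths themselves are not given here, so none is filled in.

```python
OCTOPLOID_1C_MB = 598.0    # 1C value of F. x ananassa quoted in the text
GENOMES_PER_HAPLOID = 4    # octoploid: four genomes per haploid complement
FARE1_LENGTH_BP = 5104     # length of FaRE1
RT_COPIES = 96             # copy number estimated with the RT probe
LTR_COPIES = 196           # copy number estimated with the LTR probe
TY1_COPIA_TOTAL = 2875     # Ty1-copia elements per genome (Ma et al. 2008)

genome_mb = OCTOPLOID_1C_MB / GENOMES_PER_HAPLOID
genome_bp = genome_mb * 1e6


def copy_number(genome_bp, probe_pg, genomic_ng, probe_length_bp):
    """Copies = genome size x (probe mass / genomic DNA mass) / probe length.

    ASSUMPTION: the probe is accounted for by its length in base pairs.
    """
    hybridizing_fraction = probe_pg / (genomic_ng * 1000.0)  # ng -> pg
    return genome_bp * hybridizing_fraction / probe_length_bp


fraction_of_genome = RT_COPIES * FARE1_LENGTH_BP / genome_bp
share_of_ty1_copia = RT_COPIES / TY1_COPIA_TOTAL
ltr_to_internal_ratio = LTR_COPIES / RT_COPIES

print(f"genome size: {genome_mb} Mb")
print(f"FaRE1 fraction of genome: {100 * fraction_of_genome:.2f}%")
print(f"FaRE1 share of Ty1-copia elements: {100 * share_of_ty1_copia:.1f}%")
print(f"LTR : RT copy ratio: {ltr_to_internal_ratio:.2f}")
```

With the 50 pg to 200 ng equivalence reported for the LTR probe, the hybridizing fraction is 2.5 × 10^-4, so under this reading a probe of length L bp would correspond to 37,375 / L copies.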

Fig. 3

Dot blots used to estimate FaRE1 retrotransposon copy number in strawberry. Dots (LTR amplicons): 1 25 pg, 2 50 pg, 3 100 pg, 4 200 pg PCR product; Dots (genomic DNA): 1 50 ng, 2 100 ng, 3 200 ng, 4 400 ng

Table 2 FaRE1 retrotransposon copy number in the strawberry genome

Transcriptional analysis of FaRE1

A search for eukaryotic promoter motifs upstream of the FaRE1 5′LTR was carried out by scanning the PlantCARE and PLACE databases. This led to the identification of 14 possible TATA boxes and various other DNA motifs associated with the plant response to endogenous or environmental factors. The effect on FaRE1 transcription level of the application of the phytohormones NAA, 2,4-D or ABA to the leaf was investigated by RT-PCR and northern blotting. cDNA was generated from mRNA harvested 16 h after treatment, and used as a template for the amplification of the FaRE1 domains GAG, IN, RT and RH. RT-PCR showed that FaRE1 sequences were represented in the mRNA population, indicating that the retrotransposon was transcribed under the conditions applied. In contrast, no band was observed in mRNA from leaves that had not been treated with hormone (Fig. 4). Northern blotting analysis also indicated that transcription of FaRE1 was induced by treatment with phytohormones (Fig. 5). By comparison with the sizes of the 28S and 18S rRNAs, the size of the FaRE1 transcripts was estimated to be ~4.8 kb.

Fig. 4

RT-PCR demonstrates transcriptional activity of FaRE1. a–d Amplifications of the FaRE1 domains: a GAG, b IN, c RT, d RH; e amplification of 18S internal control gene. Lanes: 1 Untreated leaves; 2 leaves incubated with naphthalene acetic acid (NAA); 3 leaves incubated with 2,4-dichlorophenoxyacetic acid (2,4-D); 4 leaves incubated with abscisic acid (ABA); 5 water control; 6–9 non-reverse transcribed RNA as the PCR template; M 100 bp ladder marker

Fig. 5

Northern blot analysis of the transcriptional activity of FaRE1. Lanes: 1 Untreated leaves; 2 leaves incubated with NAA; 3 leaves incubated with 2,4-D; 4 leaves incubated with ABA

Discussion

LTR retrotransposons are present in many plant genomes in very high copy numbers, but the majority of them are probably inactive. Transcriptionally active LTR retrotransposons are important because of the role they play in gene and genome evolution. Despite the agronomic importance of strawberry, very little information about its genome has been reported. Recently, Pontaroli et al. (2009) identified 13 Ty1-copia retrotransposons, 14 Ty3-gypsy retrotransposons and 3 unclassified elements from a fosmid library of the wild diploid strawberry F. vesca subsp. americana; however, their transcriptional activity was unknown, and none of them showed regions of homology to known nuclear genes. Here, we have isolated FaRE1 from the cultivated strawberry genome, and shown that it is transcriptionally active, at least following phytohormone treatment. The FaRE1 retrotransposon has all the features of a Ty1-copia retrotransposon, including the predicted transcriptional and regulatory signals and conserved protein-coding domains. Its copy number was ~96/genome, which is equivalent to ~0.33% of the DNA present. The majority of the FaRE1 elements carry two LTRs per internal domain, similar to Grande in the maize genome (García-Martínez and Martínez-Izquierdo 2003) and CIRE1 in the orange genome (Rico-Cabanas and Martínez-Izquierdo 2007). We believe that this is the first report of a transcriptionally active LTR retrotransposon in Fragaria spp.

It is known that certain retrotransposons are activated by stress, and others by exogenously supplied phytohormones. Thus, for example, transcription of the tobacco element Tnt1A is induced by the provision of jasmonic acid (JA), and Tnt1C by salicylic acid and auxin (Beguiristain et al. 2001). Similarly, ABA and JA enhance the transcription of the barley element Bare1 and tobacco element Tto1 (Suoniemi et al. 1996; Takeda et al. 1998), while UV stress activates LTR retrotransposons in both Xenopus sp. (Shim et al. 2000) and oat (Kimura et al. 2001). During periods of high transpositional activity, some retrotransposon sequences are amplified to produce multi-copy, relatively homogeneous elements which then diverge, either by subsequent retrotransposition or by natural acquisition of mutations. It has been estimated that actively transposing retrotransposons incorporate mutations at a rate approximately a million-fold higher than that of nuclear genes (Gojobori et al. 1990). Here, we have shown that FaRE1 transcription occurs in leaves treated with NAA, 2,4-D or ABA. During tissue culture, strawberry plantlets grow in medium supplemented with lower concentrations of hormones. We are now developing sequence-specific amplification polymorphism (S-SAP) markers based on the LTR sequences of FaRE1 to analyze whether FaRE1 is activated and transposes when strawberry plants are micropropagated, especially in medium supplemented with NAA and/or 2,4-D. This will allow us to determine whether transposition of retrotransposons could be one of the main reasons for somaclonal variation in micropropagated strawberry plants.

LTRs are important for the initiation of transcription and retrotransposition. Transcription of retroviruses and retrotransposons starts at the 5′LTR (Kumar and Bennetzen 1999) since the 5′LTR represents the promoter, which directs transcription by host cell RNA polymerase II. A characteristic of active plant retrotransposons is that they contain cis-regulatory elements in the U3 region of their 5′LTR, generally associated with signal transduction pathways related to the plant’s defense response. The 5′LTR of the Bare1 retrotransposon contains ABRE elements that respond to ABA (Suoniemi et al. 1996), while a 13-bp motif associated with Tto1 has been identified as a cis-regulatory sequence required to drive its induction by JA (Takeda et al. 1999). The TLC1.1 retrotransposon of Lycopersicon chilense carries the 8 bp motif ATTTCAAA, which acts as an ethylene-responsive element box in the promoter region (Tapia et al. 2005). The FaRE1 LTR sequence includes some well-characterized promoter motifs, such as the TGTCTC box, the ACTTTA and GAGAC cores, and the YTGTCWC, WAACCA, CANNTG and AWTTCAAA motifs. The TGTCTC box (Hagen and Guilfoyle 2002; Goda et al. 2004; Nemhauser et al. 2004), the ACTTTA (Baumann et al. 1999) and GAGAC (Maruyama-Nakashita et al. 2005) cores, and the YTGTCWC motif (Boyle and Brisson 2001) have all been associated with the response to auxin, while the WAACCA and CANNTG motifs (Abe et al. 2003) respond to ABA. The AWTTCAAA motif is ethylene responsive (Tapia et al. 2005). We suggest that the FaRE1 5′LTR contains a functional promoter, and are presently constructing a FaRE1 LTR:GUS fusion for use as a functional tool in transgenic strawberry.
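Several of the motifs listed above are written with IUPAC ambiguity codes (Y = C or T, W = A or T, N = any base). The sketch below shows one way such motifs can be located in a sequence by translating them into regular expressions. It is not the method used by PlantCARE or PLACE; the function names are hypothetical, the toy sequence is invented, the hormone labels simply repeat the associations given in the text, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes needed for the motifs named in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]", "W": "[AT]", "N": "[ACGT]",
}

# Motif -> associated response, as stated in the text.
MOTIFS = {
    "TGTCTC": "auxin",
    "ACTTTA": "auxin",
    "GAGAC": "auxin",
    "YTGTCWC": "auxin",
    "WAACCA": "ABA",
    "CANNTG": "ABA",
    "AWTTCAAA": "ethylene",
}


def motif_regex(motif):
    """Compile an IUPAC motif; the lookahead lets overlapping hits be found."""
    pattern = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + pattern + "))")


def scan(sequence):
    """Return sorted (position, motif, matched bases, response) tuples."""
    sequence = sequence.upper()
    hits = []
    for motif, response in MOTIFS.items():
        for match in motif_regex(motif).finditer(sequence):
            hits.append((match.start(), motif, match.group(1), response))
    return sorted(hits)


# HYPOTHETICAL toy sequence, invented for illustration (not the FaRE1 LTR).
toy_sequence = "GGTGTCTCAAATTTCAAAGGCACGTGTT"

for position, motif, matched, response in scan(toy_sequence):
    print(position, motif, matched, response)
```

Positions are zero-based offsets on the forward strand; a fuller analysis would also scan the reverse complement.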

In conclusion, this is the first description of a complete active Ty1-copia retrotransposon in strawberry. We have shown that FaRE1 expression occurs in leaves supplied with exogenous phytohormones. FaRE1 may play a role in gene and genome evolution in the cultivated strawberry.