Introduction

Phytic acid (PA), known as myo-inositol 1,2,3,4,5,6-hexakisphosphate, is the major storage form of phosphorus (P) in cereal seeds and exists as mixed salts (phytates) of mineral cations, including minor amounts of Zn2+ and Fe3+ (Lott et al. 2000; Raboy et al. 2001). Phosphorus in the form of PA or phytate (PA-P) and divalent cation minerals in phytates are almost indigestible in monogastric animals. In addition, PA may interact with minerals encountered in the intestinal tract and thus reduce their absorption. As a result, PA is widely regarded as an anti-nutrient in food/feed. Undigested PA-P excreted in manure is also increasingly a source of phosphorus pollution in the environment (Raboy 2009). Thus, the development of low phytic acid (LPA) crops, in which the PA content is significantly reduced in seeds, has become an important task for crop breeders.

More than two dozen LPA lines have been reported in major crops, developed through forward and reverse genetic approaches (see reviews Raboy 2007, 2009). Mutations of genes involved in PA biosynthesis or compartmentalization have been identified in a number of LPA mutants, e.g., the myo-inositol phosphate synthase (MIPS) gene (Kuwano et al. 2009; Nunes et al. 2006), the myo-inositol kinase (MIK) gene (Kim et al. 2008b; Shi et al. 2005), the inositol polyphosphate kinase (IPK) gene (Shi et al. 2003; Stevenson-Paulik et al. 2005; Yuan et al. 2012) and the multi-drug resistance-associated protein (MRP) ATP-binding cassette transporter gene (Maroof et al. 2009; Nagy et al. 2009; Panzeri et al. 2011; Shi et al. 2007; Xu et al. 2009). To enhance the nutritional value of rice grains, a number of LPA rice lines have also been developed (Larson et al. 2000; Li et al. 2008; Liu et al. 2007). Mutations of three genes are known to result in PA reduction in these rice mutants, i.e., the Oslpa1 gene (LOC_Os02g57400) (Kim et al. 2008a; Zhao et al. 2008b), the OsMIK gene (LOC_Os03g52760) (Kim et al. 2008b) and the OsMRP5 gene (LOC_Os03g04920) (Xu et al. 2009).

Several LPA rice lines have been developed by our group through chemical and physical mutagenesis. Genetic analyses showed that mutations of at least four genes were involved in the LPA phenotypes of these lines, two of which have been identified, i.e., Oslpa1 (Kim et al. 2008a; Zhao et al. 2008b) and OsMRP5 (Xu et al. 2009). The lpa gene that controls the LPA phenotype of Os-lpa-XS110-1 (hereafter XS-lpa) was mapped to a region of chromosome 3 (Liu et al. 2007), but remained to be precisely identified. In the present study, the molecular basis of XS-lpa was further studied and a large insert derived from a LINE retrotransposon gene was identified in the OsMIK gene, which significantly reduced its transcription and translation.

Materials and methods

Plant materials and inorganic P test

The rice LPA mutant line Os-lpa-XS110-1 (XS-lpa) used in this study was developed through 60Co gamma irradiation followed by NaN3 treatment of a commercial japonica cultivar Xiushui 110 (XS110, Liu et al. 2007). XS-lpa has a ~45 % reduction of seed PA and a 4.3-fold increase in inorganic P (Pi) content compared to XS110 (Frank et al. 2007; Liu et al. 2007). XS-lpa was crossed with a wild-type (WT) cultivar Jiahe 218 to produce an F2 population (~800 plants), and F3 seeds of each F2 plant were harvested individually. All materials were grown in the Experimental Farm of Zhejiang University in Hangzhou, using conventional agronomic practices.

The Pi level of rice seeds was colorimetrically assayed according to Liu et al. (2007). Development of a blue color indicated an increased Pi level typical of the LPA mutant and its homozygous mutant progeny, whereas colorless samples indicated the WT Pi level of the parent cultivar.

DNA extraction, amplification and sequencing

For gene sequencing, genomic DNA was extracted from plant leaves using Biospin plant genomic extraction kit (Bioflux, China) and treated with RNase A (Fermentas, Canada) to remove RNA. For molecular marker analysis, genomic DNA was extracted from leaves according to a modified CTAB method as previously described (Liu et al. 2007). All DNA samples were adjusted to a final concentration of ~25 ng/μL after quantification using the Nanodrop 2000 (Thermo Scientific, USA).

PCR primers were designed using the Primer Premier 5 software according to the genome and transcript sequences of the japonica cultivar ‘Nipponbare’ (http://www.gramene.org/) and synthesized by Shanghai Sangon Biological Engineer Technology and Services Co., Ltd. (Shanghai, China). Sequences of all primers are given in Table 1 and Table S1.

Table 1 Primer sequences and expected size of amplicons

For amplification of OsMIK (LOC_Os03g52760) and OsIPK (LOC_Os03g51610), PCRs were performed in 25 μL volumes with 50 ng genomic DNA, 2.5 μL 10× PCR buffer for KOD-Plus-Neo, 1.5 mM MgSO4, 0.2 mM dNTPs, 0.5 U KOD-Plus-Neo (TOYOBO, Japan) and 0.3 μM of each primer. A touchdown program was used to increase the specificity of PCR amplification as follows: pre-denaturation at 94 °C for 2 min; 5 cycles of 98 °C for 10 s, 74 °C for 2 min; 5 cycles of 98 °C for 10 s, 72 °C for 2 min; 5 cycles of 98 °C for 10 s, 70 °C for 2 min; 20 cycles of 98 °C for 10 s, 68 °C for 2 min, with a final extension at 68 °C for 5 min.
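The touchdown program above can be written down as data, which makes it easy to check the cycle count and the programmed run time. The following is a minimal sketch only; the `Stage` class and function names are illustrative and not part of any thermocycler software, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple  # (temperature in deg C, hold time in seconds) pairs


# Touchdown program as described in the text (two-step cycles).
TOUCHDOWN = [
    Stage(1, ((94, 120),)),            # pre-denaturation
    Stage(5, ((98, 10), (74, 120))),
    Stage(5, ((98, 10), (72, 120))),
    Stage(5, ((98, 10), (70, 120))),
    Stage(20, ((98, 10), (68, 120))),
    Stage(1, ((68, 300),)),            # final extension
]


def amplification_cycles(program):
    """Count cycles in stages that pair denaturation with annealing/extension."""
    return sum(s.cycles for s in program if len(s.steps) > 1)


def hold_time_seconds(program):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return sum(s.cycles * sum(t for _, t in s.steps) for s in program)


print(amplification_cycles(TOUCHDOWN), "cycles")
print(round(hold_time_seconds(TOUCHDOWN) / 60, 1), "min of programmed holds")
```

The combined annealing/extension temperature steps down from 74 °C to 68 °C in 2 °C decrements before the 20 main cycles, which is what gives the program its touchdown character.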

PCR amplicons were separated on a 1 % agarose gel and target fragments were excised and purified using the Axy-Prep DNA Gel Extraction Kit (Vitagen, Hangzhou, China). The purified fragments were cloned into the PMD19-T vector (Takara, Japan) and sequenced at Nanjing Genscript Biotech Co., Ltd. (Nanjing, China).

Annotations of candidate genes in the mapped region were obtained from the TIGR database (http://rice.plantbiology.msu.edu/). Sequence searches were performed using the NCBI BLAST search program (http://blast.ncbi.nlm.nih.gov/). Multiple sequence alignments were generated using ClustalX (http://www.ebi.ac.uk/Tools/msa/clustalw2/) and the BioEdit 7.0 program.

Thermal asymmetric interlaced PCR (TAIL-PCR)

TAIL-PCR was performed using the Genome Walker Kit (Takara, Japan). Based on the known flanking region of OsMIK, three left (L1, L2, L3) and three right (R1, R2, R3) walking primers were designed (Table 1; Fig. 1c) and used for amplification of the unknown fragments together with shorter arbitrary degenerate primers (AP1, AP2, AP3 and AP4, provided in the kit) in 50 μL reactions according to the manufacturer’s protocol. Amplified PCR fragments were collected, purified and cloned into the PMD19-T vector (Takara, Japan) for sequencing.

Fig. 1

The rice XS-lpa mutant contains an insertion in the intron of OsMIK. Exons and introns of OsMIK (a) and the LINE gene LOC_Os03g56910 (b) are to scale and are depicted as boxes and solid lines; UTRs are shown in empty boxes, while coding sequences are filled in black (OsMIK) or in grey (LINE). Flanking sequences are shown with dotted lines for OsMIK and with broken lines for the LINE; the structure of OsMIK with the rearranged LINE insertion in XS-lpa is illustrated in panel c, with an un-sequenced fragment denoted with a question mark. Nucleotide positions are numbered from +1 at the ATG, prefixed with M for OsMIK and L for the LINE gene. The vertical arrow in the intron of OsMIK denotes the insertion site (M+1249) and the one in the third exon of the LINE indicates the point (L+2654) of rearrangement before or after inserting into OsMIK. The primers given above the genes correspond to those listed in Table 1. The EcoRI sites in OsMIK and the LINE gene are given together with their positions. The positions of probes (P1 and P2) used for Southern blot are indicated under the genes; only a part (P1a, P1b, P2a, P2b) may hybridize with genomic DNA in certain cases (see also Fig. 2). In a, the OsMIK sequence around the insertion site is shown with the “AAAAAT” deleted in XS-lpa and the underlined nucleotides that are homologous to a fragment of LOC_Os03g56910 adjacent to the insertion site

Southern blot analysis

Southern blot analysis was performed following Sambrook and Russell (2001) using the DIG High Prime DNA Labeling and Detection Starter Kit II (Roche, Switzerland) according to the manufacturer’s recommendations. Briefly, genomic DNAs (~15 μg) were digested with EcoRI and separated by electrophoresis on 0.8 % agarose gel, then transferred onto a nylon membrane (Hybond-N+, Roche) by capillary method. The nucleic acids were fixed to the membrane by baking for 30 min at +120 °C. The membrane with transferred DNA was hybridized to DIG labeled gene specific probes (P1 and P2, Fig. 1), which were amplified by the primers PP1 and PP2 (Table 1), and detected by CSPD (chloro-5-substituted adamantyl-1,2-dioxetane phosphate) substrate by means of autoradiography with X-ray films (Kodak, Japan).

RNA isolation and qRT-PCR

Developing seeds (14 days post-anthesis) were collected, and after dehulling, total RNA was isolated using the RNeasy Plant Mini Kit (QIAGEN, Germany) according to the manufacturer’s protocol. Genomic DNA contamination was removed by treatment with the RNase-Free DNase Set (QIAGEN, Germany). RNA quality was assessed on 1 % agarose gels and concentrations were adjusted to 250 ng/μL after quantification using the Nanodrop 2000 (Thermo Scientific, USA).

First-strand cDNA was prepared from 1 μg DNase-treated total RNA in a total volume of 20 μL using PrimeScript II 1st Strand cDNA Synthesis Kit according to the manufacturer’s protocol (Takara, Japan) with oligo(dT)18 primers and stored at −80 °C until further use. The full-length cDNA of OsMIK was amplified by RT-PCR using the RT1 primers (Table 1) and subsequently sequenced.

Quantitative real-time PCRs were performed for OsMIK with the Q1 primers (Table 1) using SYBR® Premix Ex Taq™ II (Takara, Japan) in a Stratagene Mx3005P instrument (Stratagene, USA) according to the manufacturer’s instructions. Each 20 μL reaction contained 3 μL diluted cDNA template, 10 μL 2× SYBR® Premix Ex Taq™ II buffer, 0.8 μL (10 μM) of each primer, 0.4 μL ROX Reference Dye II and 5 μL ddH2O. Amplifications were initiated with 30 s at 95 °C followed by 40 cycles of 95 °C for 5 s, 58 °C for 30 s and 72 °C for 30 s. The results were analyzed using MxPro™ QPCR Software and gene expression was quantified using the relative 2^−ΔΔCT method with Actin as the internal control. Expression experiments were performed in triplicate (each with seeds from 2 to 3 individual plants).
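For readers unfamiliar with the relative quantification step, the calculation can be sketched as below. This is an illustrative sketch, not the MxPro™ implementation: the function name is hypothetical and the CT values in the usage lines are invented placeholders, not measurements from this study. It assumes roughly 100 % amplification efficiency for both the target and the reference gene.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene in sample vs. control (2^-ddCT).

    Each argument is a list of replicate CT values. The reference gene
    (here Actin) normalizes for differences in cDNA input.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical CT values, for illustration only.
mutant_target = [25.0, 25.0, 25.0]
mutant_actin = [18.0, 18.0, 18.0]
wt_target = [22.0, 22.0, 22.0]
wt_actin = [18.0, 18.0, 18.0]

fold = relative_expression(mutant_target, mutant_actin, wt_target, wt_actin)
print(f"relative expression: {fold:.3f}")
```

With these placeholder values the ΔΔCT is 3, so the target transcript in the sample is 2^−3 = 12.5 % of the control level.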

Protein extraction and Western blot analyses

Protein extraction and Western blot analyses were performed according to Li et al. (2011) with minor modifications. Briefly, developing seeds were collected from at least five plants 14 days post-anthesis and served as stock seeds after manual dehulling. About 300 mg of seeds were randomly taken from the stock, ground into powder in liquid nitrogen and suspended in 800 μL extraction buffer [62.5 mM Tris–HCl pH 7.4, 10 % glycerol, 2 % SDS, 20 mM NaF, 2 mM EDTA, 1 mM PMSF, 5 % β-mercaptoethanol, complete protease inhibitor Cocktail (Sigma)]. The suspension was chilled on ice for 10 min (vortexed every 2 min) and cell debris was removed by centrifugation at 12,000 rpm for 20 min at 4 °C. The supernatants were transferred to a new 1.5 mL tube and protein concentration was determined using the Modified BCA Protein Assay kit (Sangon, China). The supernatants were either immediately subjected to protein blot analyses or stored at −70 °C until use.

To generate the OsMIK protein-specific antibodies, the specific epitope RSSMHDELHKSLQE was selected using PepDesign software (Cao et al. unpublished); its uniqueness in the rice whole proteome was verified by a BlastP search (Altschul et al. 1997) against rice database (http://rice.plantbiology.msu.edu/). The anti-OsMIK polyclonal antibodies were generated by immunizing healthy rabbits using KLH-conjugated synthetic peptides as antigens. The peptide conjugations, immunizations and antiserum purifications were carried out by BPI (Beijing Protein Innovation Co., Ltd., Beijing, China). Equal loading of proteins was confirmed by detection of heat shock protein (HSP, Li et al. 2011).
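The uniqueness check on the epitope can be approximated by an exact-match scan of a proteome FASTA file. The sketch below is a simplification of the BlastP search described above (it finds only perfect matches, not near-identical peptides), and the two protein records are invented placeholders rather than real rice sequences.

```python
EPITOPE = "RSSMHDELHKSLQE"  # epitope sequence given in the text


def parse_fasta(text):
    """Parse FASTA text into a dict of {record id: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def proteins_containing(peptide, proteome):
    """Return the ids of proteins whose sequence contains the peptide exactly."""
    return [pid for pid, seq in proteome.items() if peptide in seq]


# Hypothetical two-protein "proteome", for illustration only.
toy_fasta = """>protA hypothetical protein carrying the epitope
MKTAYRSSMHDELHKSLQEGG
>protB hypothetical unrelated protein
MAAAGLLKRSSMHDE
"""

hits = proteins_containing(EPITOPE, parse_fasta(toy_fasta))
print(hits)
```

An epitope would be considered unique under this simplified criterion when exactly one protein id is returned.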

For protein blot analyses, approximately 30 μg of protein per lane was separated using SDS-PAGE and electro-transferred to a PVDF membrane (Millipore, USA) for 1.5 h at 150 V. After blocking in 5 % non-fat dried milk in a TTBS solution (2 mM Tris–HCl, pH 7.6; 13.6 mM NaCl; 0.1 % Tween 20) for 1–2 h at room temperature, the membrane bound with proteins was incubated with the OsMIK antibody in 5 % non-fat milk in a TTBS solution for 3 h at room temperature and washed in TTBS solution three times, each for 5 min. Then the membrane was incubated with a secondary antibody [horseradish peroxidase-conjugated goat anti-rabbit antibody (Zhongshan Goldenbridge Biotechnology Co., Ltd., China)] for 1 h at room temperature and rinsed in TTBS solution three times, each for 5 min. The blot was developed with a SuperECL plus kit (Amersham Biosciences, USA) and the signal was checked by exposing on X-ray film. HSP was used as reference protein (Li et al. 2011).

Western blot analyses were repeated at least three times and results were quantified using the Quantity One Software (Bio-Rad, USA).

Bisulfite sequencing

One microgram of genomic DNA (from seeds of 14 days post-anthesis) was treated with sodium bisulfite using the EZ DNA Methylation-Gold™ kit (Zymo Research, USA) according to the manufacturer’s instructions. The bisulfite-treated DNA was diluted in 10 μL distilled water before PCR. Four pairs of primers (B1-B4, Fig. 1c; Table S1) were designed for bisulfite sequencing using Methyl Primer Express software v1.0 (Applied BioSystems, USA). PCRs were performed in a 25 μL volume using 3 μL of DNA templates and 0.5 units KOD-Plus-Neo DNA polymerase (TOYOBO, Japan) with the following program: 94 °C for 2 min; 40 cycles of 98 °C for 10 s, 50 °C for 45 s, 68 °C for 30 s; and 68 °C for 5 min. The PCR products were purified and cloned into PMD19-T vector (Takara, Japan). Twelve individual clones were sequenced for each sample. The sequencing data were analyzed using Kismeth software (Gruntman et al. 2008).
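The principle behind scoring the bisulfite clones can be sketched as follows: after conversion, a cytosine that still reads as C was methylated, whereas one that reads as T was not, and each cytosine is assigned to the CG, CHG or CHH context from the two downstream reference bases. This is a minimal illustration and not the Kismeth algorithm; it assumes top-strand clones already aligned to the reference without gaps, and the sequences used are short invented examples.

```python
from collections import defaultdict


def cytosine_context(ref, i):
    """Classify the cytosine at ref[i] as CG, CHG or CHH (None near the end)."""
    downstream = ref[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return None  # context cannot be determined
    return "CHG" if downstream[1] == "G" else "CHH"


def methylation_by_context(ref, clones):
    """Percent methylated cytosines per context across aligned clones."""
    methylated = defaultdict(int)
    total = defaultdict(int)
    for i, base in enumerate(ref):
        if base != "C":
            continue
        context = cytosine_context(ref, i)
        if context is None:
            continue
        for clone in clones:
            if clone[i] == "C":      # protected from conversion: methylated
                methylated[context] += 1
                total[context] += 1
            elif clone[i] == "T":    # converted: unmethylated
                total[context] += 1
    return {c: 100.0 * methylated[c] / total[c] for c in total}


# Invented reference and two invented clones, for illustration only.
reference = "ACGTCAGTCTTA"
clones = ["ACGTTAGTTTTA", "ATGTTAGTCTTA"]
print(methylation_by_context(reference, clones))
```

In practice twelve clones per sample, as sequenced here, would be passed in place of the two toy clones.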

Development of molecular markers

Two sets of primers were designed for PCR amplification of either an OsMIK fragment or a hybrid segment consisting of part of OsMIK and part of the insert (Table 1; Fig. 1). PCRs were carried out in a volume of 20 μL containing approximately 50 ng genomic DNA, 1× Taq Master Mix buffer (GeneSolution Co., Ltd. China) and 200 nM of each primer with the following program: 94 °C for 5 min followed by 35 cycles at 94 °C for 30 s; 58 °C for 30 s and 72 °C for 60 s, with a final extension for 5 min at 72 °C. Amplicons were separated on a 1 % agarose gel.

Results

Analyses of candidate genes

The XS-lpa mutation was previously mapped to a region on chromosome 3, tightly linked to microsatellite marker RM3199 (30.42 Mb) with a genetic distance of 1.198 cM (Liu et al. 2007). Although three genes on chromosome 3 are annotated to encode enzymes that function in PA biosynthesis, only two of them, i.e., OsMIK (LOC_Os03g52760, 30.24 Mb) and OsIPK1 (LOC_Os03g51610, 29.53 Mb), are reasonably close to RM3199 (Fig. S1). Therefore, both genes were subjected to PCR amplification. Sequencing and alignment of OsIPK1, including its promoter region, showed that there was no difference between XS110 and XS-lpa (data not shown), suggesting that a mutation in the OsIPK1 gene is unlikely to cause the LPA phenotype of XS-lpa.

Five pairs of primers were designed and used for PCR amplification of OsMIK (S1-S5, Fig. 1a). While five fragments of the expected size were amplified from XS110, only four fragments of the same size as in XS110 were produced from XS-lpa, and PCR with the primers S4F/R did not produce any fragment (Fig. S2). Sequencing of the four amplified fragments of OsMIK showed that there were no differences between XS-lpa and XS110. A few more primers were designed for PCRs of the fragment between S4F and S4R with different programs; however, no clear-cut results were obtained, indicating possible unknown changes within the region.

Identification and characterization of insert sequence

To uncover the sequence between S4F and S4R, TAIL-PCRs were performed using primers L1, L2 and L3 together with arbitrary primers AP1–AP4 for amplifying the 5′ flanking sequences (Fig. 1c). A discrete product of ~3.2 kb was produced after three rounds of TAIL-PCRs (Fig. 2a). Similarly, a 3′ flanking fragment of ~3.6 kb was acquired by TAIL-PCR using specific primers R1, R2, R3 combined with AP1–AP4 (Fig. 2b). No amplification was observed in parallel experiments with genomic DNA of XS110 as template (data not shown).

Fig. 2

TAIL-PCRs for amplifying the sequence of XS-lpa OsMIK between S4F and S4R using sequence-specific left (a; L1, L2, L3) and right (b, R1, R2, R3) primers together with arbitrary primers (APs). M: DNA molecular weight marker; lanes 1, 2, 3, products from the first, secondary and tertiary TAIL-PCR reactions with corresponding AP primers

The two TAIL-PCR amplified fragments were subsequently sequenced. BLAST searches against the rice genome revealed that both consisted of two parts, i.e., part of OsMIK and a fragment with high similarities (99.6 %) to the LINE retrotransposon LOC_Os03g56910.

Detailed analysis of the TAIL-PCR fragment indicated that the inserted fragment included not only the LINE gene, but also its 5′ and 3′ flanking sequences, i.e., a 258-bp 5′ flanking region and a 2,037-bp 3′ flanking region (the sequenced end nucleotides are at positions of L−258 and L+5966, respectively, Fig. 1c). Furthermore, some rearrangements of the LINE gene happened before or after inserting into the intron of OsMIK (Fig. 1c) in XS-lpa: the inserted LINE gene indeed was split at the third exon into two fragments at position L+2645 (Fig. 1b), which were rejoined inversely in the insertion (Fig. 1c). Analysis of the sequence around the insertion site revealed that it is an AT-rich region and a 6-bp segment (AAAAAT) of OsMIK was deleted in XS-lpa (Fig. 1a). Further analysis indicated that there was a small homologous fragment (TAAATCCACT) between OsMIK and LOC_Os03g56910 near the insertion site (Fig. 1a).

To confirm the sequence analysis, two pairs of primers, Z1F/Z1R and Z2F/Z2R, were designed to distinguish the native LOC_Os03g56910 locus from the insertion within the intron of OsMIK using PCR (Fig. 1c). Amplicons of the expected size (part of OsMIK sequence and part of the LINE gene) were produced for XS-lpa, but no bands were observed for XS110 (Fig. 3). These results confirmed that the rearranged insertion was as indicated by sequence analysis.

Fig. 3

Electrophoresis of DNA fragments amplified from XS110 and XS-lpa using the primers Z1F/R and Z2F/R. M: DNA molecular weight marker. Positions of Z1F/R and Z2F/R are shown in Fig. 1

The central part of the insert could not be amplified by TAIL-PCRs; hence, Southern blot analyses were performed by hybridizing probes P1 and P2 (Fig. 1) to genomic DNAs digested with EcoRI. Sequence analyses indicated that there were two EcoRI sites, both in OsMIK and the LINE gene LOC_Os03g56910 (Fig. 1a, b). In addition, there are two homologous genes of LOC_Os03g56910, i.e., LOC_Os04g44370 and LOC_Os07g43200, which have sequences with high identities (>96.6 %) to the probe 2 (P2b, Fig. S3). Therefore, a number of fragments are expected for XS110 and XS-lpa when probed with P1 and P2 (Fig. 4b). Furthermore, because only a part of the probe might be hybridized with a particular fragment, the hybridization temperatures also vary (Table S2). Consequently, when a fixed hybridization temperature is applied, bands may have different signal intensities in blotting.
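The expected EcoRI fragments, and the fragments a given probe can hybridize to, follow from the positions of the GAATTC sites. The sketch below performs such an in-silico digest on a linear sequence; the function names are illustrative, and the toy sequence and probe coordinates are invented rather than taken from OsMIK or the LINE gene.

```python
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # EcoRI cuts between G and AATTC on the top strand


def cut_positions(seq, site=ECORI_SITE, offset=ECORI_CUT_OFFSET):
    """Return top-strand cut coordinates for every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragments(seq):
    """Return (start, end) intervals of a complete digest of a linear molecule."""
    cuts = [0] + cut_positions(seq) + [len(seq)]
    return list(zip(cuts, cuts[1:]))


def fragments_hit_by_probe(seq, probe_start, probe_end):
    """Fragments that overlap the probe interval [probe_start, probe_end)."""
    return [(a, b) for a, b in fragments(seq) if a < probe_end and b > probe_start]


# Invented 26-bp sequence with two EcoRI sites, for illustration only.
toy = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
print([b - a for a, b in fragments(toy)])
print(fragments_hit_by_probe(toy, 3, 10))
```

A probe that spans a cut site overlaps two fragments, each with only part of its length, which is the situation described above for P1a/P1b and P2a/P2b.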

Fig. 4

Southern blot analyses of Xiushui 110 (XS110) and XS-lpa. Genomic DNAs were extracted from leaves, digested with EcoRI and fractionated in a 0.8 % agarose gel. Probes 1 and 2 were amplified from XS110 and XS-lpa using the PP1 and PP2 primers (Table 1), respectively, and labeled with DIG. a The bands detected and b the hybridized probes (P1 and P2) or part of them, i.e., P1a and P1b, or P2a and P2b (Fig. 1), the hybridized fragment or genes and their expected sizes

The results of the Southern blot analysis using probe P1 were as expected with one band X1.1 detected for XS110 and two bands for XS-lpa (L1.1 and L1.2, Fig. 4a). The probe P2 was amplified from XS-lpa containing part of OsMIK and part of LOC_Os03g56910 (Fig. 1c); hence when probed with P2, three bands were observed for both XS110 and XS-lpa, with bands X2.2 and L2.2, and X2.3 and L2.3 having similar length, while the other band of XS110 (X2.1) was shorter than that of XS-lpa (L2.1) (Fig. 4a). According to the expected sizes, band X2.1 and L2.1 correspond to the OsMIK fragment in XS110 and a hybrid fragment of OsMIK and the LINE gene, respectively (Fig. 4b). The bands X2.2 and L2.2 very probably represent the homologous LINE gene, LOC_Os04g44370, because they both are about 4.1 kb in size (Fig. 4a). The X2.3 band is likely to be mixed fragments of LOC_Os03g56910 (~7.8 kb) and LOC_Os07g43200 (~8.2 kb), because bands of such sizes could not be easily differentiated. The L2.3 might also be the same mixture or only the fragment of LOC_Os07g43200 if LOC_Os03g56910 no longer exists in XS-lpa.

To examine whether LOC_Os03g56910 still exists in XS-lpa, the E1 primers were designed around the split site of LOC_Os03g56910 in XS-lpa (Fig. 1c). Because of the sequence identity in the E1 primer regions, a fragment of 704 bp could be amplified from both LOC_Os03g56910 and LOC_Os04g44370, but the amplicons differed from each other by 20 nucleotides. Because of the split and rearrangement of LOC_Os03g56910 in XS-lpa, amplification would only be from LOC_Os04g44370 if LOC_Os03g56910 no longer exists. A single fragment of expected size was detected in both XS110 and XS-lpa (Fig. S4) and sequencing revealed amplicons of both LOC_Os03g56910 and LOC_Os04g44370 in XS110 and XS-lpa (data not shown), which confirmed that LOC_Os03g56910 remained in its original locus in XS-lpa.

Impact of the insertion on transcription and translation

To detect whether the XS-lpa mutation affected the expression of OsMIK, semi-quantitative RT-PCRs were first performed using primer pair RT2F/R. The results showed that XS-lpa had obviously lower expression than XS110 (Fig. 5a). Quantitative real-time PCR analyses confirmed that the expression level of OsMIK in XS-lpa was significantly decreased (only ~13 % that of XS110; Fig. 5b). To examine possible splicing and sequence changes of OsMIK transcripts, the full-length cDNAs were PCR amplified using primer pair RT1 (Fig. 1a). Only one fragment of the same size was observed for both XS110 and XS-lpa (data not shown). Sequencing of the transcripts revealed no differences between XS-lpa and XS110 (data not shown), indicating that the insertion did not change the splicing mode in XS-lpa.

Fig. 5

Analyses of OsMIK gene expression in seeds of 14-days post-anthesis of XS110 and XS-lpa. a: Semi-quantitative RT-PCRs of OsMIK and four other genes known to be involved in PA biosynthesis or compartmentalization, with rice Actin as control. b: qRT-PCR analysis of OsMIK in XS110 and XS-lpa. c: Western blot analysis of OsMIK protein with HSP as control for XS110 and XS-lpa. All data are means of three biological replicates with error bars indicating standard deviation

Moreover, the protein level of OsMIK was also examined for XS-lpa and XS110 using Western blot analyses. Results showed that the content of OsMIK protein was reduced substantially in XS-lpa compared with that of XS110 (Fig. 5c).

To examine whether reduced expression of OsMIK affected expression of four other rice genes (OsMIPS, OsMRP5, OsIPK, Oslpa1) known to be involved in PA biosynthesis or compartmentalization, semi-quantitative RT-PCRs were performed and results showed that the OsMIK mutation did not affect transcription of these genes (Fig. 5a).

DNA methylation of OsMIK

To investigate whether the insertion of rearranged retrotransposon changed the methylation status of OsMIK, a CpG island that consists of part of the promoter and the first exon of OsMIK was identified (from −437 bp to +1,010 bp) and subjected to bisulfite sequencing. Compared with XS110, elevated methylation levels of XS-lpa were observed in the promoter, but not in the first exon. In the promoter region 8.3 % CGs and 6.4 % CHHs were methylated in XS-lpa, in contrast to 3.7 % CGs and 1.8 % CHHs in XS110 (Fig. 7b). The methylation is most concentrated in the starting part of the CpG island in both XS110 and XS-lpa (Fig. 7a and data not shown).

Functional molecular marker for genotyping

For molecular genotyping and selection of the XS-lpa allele, a PCR-based marker system was developed using a three-primer approach: two primers (Z1F and Z2R) match the OsMIK sequences flanking the integration site at 5′- and 3′-sides; the other primer complements the sequence of LOC_Os03g56910 at the 5′ end (Z1R) or 3′ end (Z2F) (Fig. 1; Table 1).

By combining the use of Z1F, Z1R and Z2R, one small fragment (~350 bp) could be amplified from plants homozygous for the LPA mutant allele (XS-lpa), while a large fragment (~500 bp) was amplified from homozygous WT plants (Jiahe 218) (Fig. 6a). Genotyping of the 786 F2 plants derived from XS-lpa × Jiahe 218 revealed plants with three genotypes as expected (Fig. 6a, and data not shown). Assay of Pi levels of their corresponding F3 seeds indicated that the insert fragment perfectly co-segregated with the Pi phenotype. In addition to providing further evidence that the OsMIK mutation is highly likely to underlie the LPA phenotype of XS-lpa, these results demonstrated the effectiveness of this marker for marker-assisted selection of the XS-lpa allele in rice breeding.
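Scoring of the three-primer marker can be sketched as a simple band-size classifier. The default sizes are the approximate values quoted above for the Z1F/Z1R/Z2R combination; the function name and the ±50 bp tolerance are assumptions made for illustration and would need adjusting to the resolution of the gel actually used.

```python
def call_genotype(band_sizes, wt_size=500, lpa_size=350, tolerance=50):
    """Call the genotype at the insertion site from observed band sizes (bp).

    wt_size and lpa_size are the approximate amplicon sizes of the WT and
    mutant alleles; tolerance is an assumed sizing error on an agarose gel.
    """
    has_wt = any(abs(b - wt_size) <= tolerance for b in band_sizes)
    has_lpa = any(abs(b - lpa_size) <= tolerance for b in band_sizes)
    if has_wt and has_lpa:
        return "heterozygous"
    if has_wt:
        return "homozygous WT"
    if has_lpa:
        return "homozygous LPA"
    return "no call"


# Hypothetical lanes, for illustration only.
for lane in ([510], [345, 495], [360], []):
    print(lane, "->", call_genotype(lane))
```

For the second primer combination (Z1F, Z2F and Z1R), the same function could be called with `wt_size=500` and `lpa_size=750`.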

Fig. 6

Agarose gel electrophoresis of DNA fragments amplified using three-primer PCRs (a: primers Z1F/Z2F/Z1R; b: primers Z1F/Z1R/Z2R). M: DNA molecular weight marker; WT and LPA represent Jiahe 218 and LPA mutant XS-lpa, respectively. Lanes 1–5, 6–10 and 11–15 are F2 plants with homozygous WT allele, heterozygous for WT and mutant alleles, and homozygous mutant allele, respectively. For sequence information of the primers see Table 1 and positions in Fig. 1b

Similarly, PCRs using primers Z1F, Z2F and Z1R could also be efficiently used for genotyping plants for the XS-lpa mutant allele. In this case, one small fragment (~500 bp) could be amplified from WT plants, while a large fragment (~750 bp) was amplified from LPA plants with the XS-lpa allele (Fig. 6b). Identical results were obtained for the genotyping of F2 plants of Jiahe 218 × XS-lpa (Fig. 6b, and data not shown).

Discussion

Identification and characterization of gene(s) underlying valuable agronomic traits generated by mutagenesis, such as LPA, are important for advancing our understanding of their genetic control and efficient utilization in breeding. Characterization of the nature of the genetic lesions underlying mutant phenotype also contributes to a better understanding of the mechanism and characteristics of induced mutagenesis. In the present study, a novel LINE insertion mutation of OsMIK was identified in the LPA mutant XS-lpa. To our knowledge, this is the first report of such a mutation in plant mutants generated by physical/chemical mutagenesis. This finding may represent a novel mechanism by which genetic variation can be generated in plants. The allele-specific marker developed in the present study also provides a valuable means to improve the application of XS-lpa in LPA rice breeding.

Mutagenesis via retrotransposon insertion or homologous recombination?

Depending on the presence or absence of terminal repeats, retrotransposons are classified as long terminal repeat (LTR) or non-LTR. Non-LTR retrotransposons can be further subdivided into LINEs (long interspersed nuclear elements) and SINEs (short interspersed nuclear elements). A typical full-length, functional member of the LINE family contains two open reading frames (ORFs), ORF1 and ORF2. ORF1 encodes a protein with nucleic acid binding properties and nucleic acid chaperone activity; ORF2 encodes an endonuclease that introduces a nick in the AT-rich target site within genomic DNA (Dong et al. 2009; Kines and Belancio 2012).

LINEs are the most active mobile retrotransposons in the human genome and LINE proteins tend to act in cis or trans on their own encoded RNA to generate new retrotransposition events with newly inserted copy of full-length, or with 5′ truncation or inversions (Kazazian and Goodier 2002; Burns and Boeke 2012). Despite the fact that retrotransposons exist in high copy numbers, the great majority of them are inactive or defective. However, it has been reported that stress and irradiation activate retrotransposons and lead to mutation and genome restructuring (Farkash et al. 2006; Tanaka et al. 2012).

Retrotransposons account for >15 % of the rice genome (Zhao and Zhou 2012). Most belong to the LTR subclass and natural selection on gene function drives the evolution of LTR retrotransposon families in the rice genome (Baucom et al. 2009). In contrast, very little information is available concerning non-LTR retrotransposons (Komatsu 2003). To date, only one LINE-type retrotransposon, “Karma”, which can be activated in cultured cells and increase its copy number in later generations of plants regenerated from tissue culture, has been characterized (Komatsu 2003). Based on sequence identity, the insert identified in the intron of OsMIK is apparently derived from the LINE retrotransposon LOC_Os03g56910.

Detailed analysis of the insertion indicated the presence of features that were both typical and atypical for retrotransposition. The abundance of A/T bases around the insertion site in OsMIK (Fig. 1a) is a typical target site for L1 retrotransposon insertion (Farkash et al. 2006). Furthermore, six A/T bases (AAAAAT, Fig. 1a) around the insertion site were deleted in XS-lpa, indicating a possible retrotransposition event. On the other hand, retrotransposition usually produces insertions of full-length cDNA of LINE genes or with 5′ truncation or inversions (Kazazian and Goodier 2002). In the case of XS-lpa, the inserted LINE gene, together with its flanking sequences, experienced breaks and inversion before or after insertion (Fig. 1b), which are features that have not been observed for any retrotransposon, suggesting a non-retrotransposition origin for the insertion.

XS-lpa was selected from the progeny of XS110 that was subjected to mutagenic treatment with gamma rays and NaN3 (Liu et al. 2007). Therefore, the insertion could also result from DNA repair of a double-strand break (DSB), which is known to take place in radiation-treated cells (Osakabe et al. 2012). DSBs are repaired through non-homologous end joining (NHEJ), microhomology-mediated end joining (MMEJ) or homologous recombination (HR). Since NHEJ and MMEJ often result in short or long sequence deletions (McVey and Lee 2008), it is not likely that the retrotransposon insert was produced by either NHEJ or MMEJ in XS-lpa. HR requires a short homologous sequence as a target for ectopic recombination between the damaged sequence and template sequence (Jeggo et al. 2011). Indeed, a BLAST search showed there was such a short homologous sequence “ATTTTACCTCT” between XS110 and the LINE gene around the breakpoint (Fig. 1a), which supports the hypothesis that the LINE insert was a result of HR repair of a DSB caused by mutagenic treatment, although further studies are needed before a conclusion can be made.
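A short shared segment of this kind can also be located without BLAST by computing the longest common substring of the two sequences around the breakpoint. The sketch below uses a standard dynamic-programming approach; the function name is illustrative, and the flanking bases added around the motif in the usage example are invented, not the real OsMIK or LINE sequences.

```python
def longest_shared_segment(a, b):
    """Return the longest contiguous sequence present in both a and b."""
    best_len, best_end = 0, 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous base pair.
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len, best_end = current[j], i
        previous = current
    return a[best_end - best_len:best_end]


# The motif is the one quoted in the text; the flanks are hypothetical.
motif = "ATTTTACCTCT"
target_region = "GGA" + motif + "AAG"
line_region = "TTC" + motif + "GCA"
print(longest_shared_segment(target_region, line_region))
```

Running time grows with the product of the two sequence lengths, so this approach suits short windows around a breakpoint rather than whole-gene comparisons.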

The insert could also result from a translocation of chromosomal segments. The existence of LOC_Os03g56910 in its original locus in XS-lpa, however, excluded this possibility. Of note is that the remaining LOC_Os03g56910 is expected if the insert resulted from retrotransposition or HR.

Gene knockout and knockdown via mutagenesis

To systematically assign functions to all predicted genes in the rice genome, a large number of rice mutant lines, including those created by T-DNA insertion, Ds/dSpm tagging, Tos17 tagging and chemical/irradiation mutagenesis, have been generated by groups around the world (reviewed by Wang et al. 2013). Exonic insertions often result in knockout of inserted genes, while intronic ones are often silent. Although individual cases of intronic insertions that decreased transcription or caused alternative splicing of the inserted genes have also been reported (e.g., Okada et al. 2009; Matsui et al. 2010), no underlying mechanisms have so far been addressed.

Epigenomic modification and epigenetic regulation are common phenomena, and the genetic systems governing DNA methylation and demethylation are well characterized in plants such as rice (Zhao and Zhou 2012). However, reports of transposons that affect transcription of inserted or adjacent genes through changes in methylation status, and consequently cause phenotypic variation, have so far been limited in plants. The maize P1-wr-mum6 allele, a Mu-insertion allele, is one of the few such examples: the methylation of the inserted Mu transposon caused hypomethylation of a floral-specific enhancer that is 4.7 kb upstream of the ~Mu1 insertion site (Robbins et al. 2008).

In the present study, no alternative splicing was observed for OsMIK, but its transcript abundance was significantly reduced in XS-lpa (Fig. 5b). By bisulfite sequencing, greater CpG and CHH methylation levels were observed in the promoter region of OsMIK in XS-lpa than in XS110 (Fig. 7). Although the increase in methylation was not dramatic, bioinformatics analysis indicated a putative transcription start site right after the second cytosine (Fig. 7a), which may explain the large reduction in transcription of OsMIK in XS-lpa.

Fig. 7

DNA methylation in the promoter and first exon of OsMIK in XS110 and XS-lpa. a: DNA methylation of the CpG island in the promoter region; cytosines in the form of CG are marked in red, CHG in blue, CHH in green, where H = A, C, or T. Filled and empty circles denote methylated and unmethylated cytosines, respectively. The vertical arrow indicates a putative transcription start site predicted by the BDGP website (http://www.fruitfly.org/seq_tools/promoter.html). b and c represent overall methylation levels in the promoter and first exon, respectively
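The context-specific methylation levels summarized in Fig. 7 can be illustrated with a minimal sketch. It assumes bisulfite reads that are already aligned to the reference and have the same length as it, with an unconverted C counted as methylated and a T as unmethylated. The function names and the toy sequences are hypothetical and are not taken from the analysis pipeline used in this study.

```python
from collections import defaultdict


def classify_context(seq, i):
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at seq[i] (H = A, C or T).

    Returns None when the context cannot be determined, e.g. too close to
    the 3' end of the sequence or an ambiguous base such as N.
    """
    tri = seq[i:i + 3]
    if len(tri) >= 2 and tri[1] == "G":
        return "CG"
    if len(tri) < 3 or any(b not in "ACGT" for b in tri):
        return None
    return "CHG" if tri[2] == "G" else "CHH"


def methylation_levels(reference, reads):
    """Percent methylation per context on the forward strand.

    Assumption: every read is aligned to, and as long as, the reference.
    At a reference cytosine, a read base of C counts as methylated and a
    read base of T as unmethylated; any other base is ignored.
    """
    reference = reference.upper()
    methylated = defaultdict(int)
    total = defaultdict(int)
    for i, base in enumerate(reference):
        if base != "C":
            continue
        ctx = classify_context(reference, i)
        if ctx is None:
            continue
        for read in reads:
            call = read[i].upper()
            if call == "C":
                methylated[ctx] += 1
                total[ctx] += 1
            elif call == "T":
                total[ctx] += 1
    return {ctx: 100.0 * methylated[ctx] / total[ctx] for ctx in total}


# Toy data only: one CG, one CHG and one CHH site, and two reads.
toy_reference = "ACGTCAGTCTTA"
toy_reads = ["ACGTTAGTTTTA", "ATGTCAGTTTTA"]
levels = methylation_levels(toy_reference, toy_reads)
```

In practice each clone or read would be one row of circles in Fig. 7a, and the per-context percentages correspond to the kind of summary shown in panels b and c.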

Implications for genetics and breeding of LPA rice

It is well documented that mutations of the MIK gene can disrupt PA biosynthesis and result in reduced PA content in seeds. In maize, knockout of the MIK gene by a Mu insertion resulted in ~50 % PA reduction in the lpa3 mutant as compared to its corresponding WT cultivar (Shi et al. 2005). In rice, a single base pair change (C/G to T/A) in the first exon of OsMIK resulting in a nonsense mutation was reported to cause ~75 % seed PA reduction in the LPA mutant N15-186 (Kim et al. 2008b). XS-lpa had a ~46 % PA reduction compared with XS110 (Frank et al. 2007; Liu et al. 2007) and exhibited significantly and consistently increased contents of myo-inositol, raffinose, galactose and galactinol compared to its WT parental cultivar XS110 in multiple environments, suggesting that a mutation event affecting phosphorylation of myo-inositol underlies the LPA phenotype of XS-lpa, similar to the lpa-3-type mutation in maize (Frank et al. 2007). In the present study, a rearranged LINE insertion was identified in the intron of OsMIK via TAIL-PCR and Southern blot analysis, and the expression of OsMIK protein was demonstrated by Western blot for the first time. The insertion mutation resulted in substantial reductions of both transcript abundance (~87 %, Fig. 5b) and protein levels (~60 %, Fig. 5c) in XS-lpa, as compared to XS110.

With the availability of various LPA mutants, breeding yield-competitive varieties with significant seed PA reduction is becoming a reality. In rice, about a dozen LPA mutants have already been reported, such as KBNT lpa1-1 and N15-186 in the USA (Rutger et al. 2004; Kim et al. 2008b), Sang-gol in Korea (Li et al. 2008, 2012) and those developed in our laboratory (Liu et al. 2007). Although primary LPA mutants are often inferior to their respective WT cultivars for traits like yield and field emergence, further breeding may eliminate or minimize the negative effects and produce yield-competitive LPA varieties. For example, significant improvement was reported by Spear and Fehr (2007) for field emergence of LPA soybean. Because XS-lpa has also been shown to have inferior seed viability and yield traits (Zhao et al. 2008a), further breeding is needed before it can become a new variety.

The LPA phenotype is a seed trait, and phenotyping has to be performed on seeds of individual plants, which is time-consuming and arduous. Moreover, since LPA mutants usually have plant growth and grain shape similar to those of WT cultivars, it is very difficult to distinguish them visually from WT. Therefore, development of functional molecular markers could greatly increase the efficiency and speed of LPA rice breeding (Rosso et al. 2011; Tan et al. 2013). However, development of functional markers depends on the identification of causative mutations. In this study, with the disclosure of the causative mutation underlying the LPA phenotype of XS-lpa, a three-primer PCR-based method was developed to genotype plants derived from crosses between XS-lpa and WT cultivars (Fig. 6), which should facilitate the efficient selection of plants with the XS-lpa mutant allele. In future, this co-dominant marker will also enable selection of plants with two or three non-allelic LPA mutations, as demonstrated by Tan et al. (2013) for other mutant alleles. Development of double and triple mutants enabled by marker-assisted selection is important not only for LPA rice breeding, but also for genetic studies of genes and pathways in PA biosynthesis and regulation.
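The logic of scoring a co-dominant marker of this kind can be sketched as follows. This is only an illustration, assuming that the WT and mutant alleles each give one diagnostic band of a known size; the function name, the amplicon sizes (500 and 300 bp) and the size tolerance are hypothetical placeholders and are not the values of the marker shown in Fig. 6.

```python
def call_genotype(band_sizes_bp, wt_size_bp, mutant_size_bp, tolerance_bp=20):
    """Call a genotype from the bands observed in one gel lane.

    band_sizes_bp  : estimated sizes (bp) of the bands seen for one plant
    wt_size_bp     : expected size of the WT-allele amplicon (placeholder)
    mutant_size_bp : expected size of the mutant-allele amplicon (placeholder)
    tolerance_bp   : allowed deviation when matching a band to an expected size
    """
    has_wt = any(abs(b - wt_size_bp) <= tolerance_bp for b in band_sizes_bp)
    has_mut = any(abs(b - mutant_size_bp) <= tolerance_bp for b in band_sizes_bp)
    if has_wt and has_mut:
        return "heterozygous"
    if has_wt:
        return "homozygous wild type"
    if has_mut:
        return "homozygous mutant"
    return "no call"  # failed PCR or unexpected band pattern


# Hypothetical amplicon sizes, used for illustration only.
WT_BP, MUT_BP = 500, 300
calls = [
    call_genotype([495], WT_BP, MUT_BP),
    call_genotype([305], WT_BP, MUT_BP),
    call_genotype([498, 302], WT_BP, MUT_BP),
    call_genotype([], WT_BP, MUT_BP),
]
```

Because heterozygotes show both bands, a marker scored this way separates all three genotypes in a segregating population, which is what makes it usable for stacking non-allelic LPA mutations.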