Introduction

Many RNAs that do not encode proteins are now known to play important roles in transcription, degradation and translation of messenger RNA (mRNA). Based on their sizes, these RNAs can be divided into small and large non-coding RNAs. Large or long non-coding RNAs are exemplified by Xist and Malat-1 (metastasis associated lung adenocarcinoma transcript-1) [1]. Small non-coding RNAs are further categorized into several subgroups, including small nucleolar RNAs, microRNAs (mirRNA), small interfering RNAs (siRNA), and a few others [2]. Biogenesis of mirRNA is very similar and related to that of siRNA. mirRNA can be encoded in exons or introns of protein-coding genes or in non-protein-coding regions (regulatory or intragenic regions) of chromosomal DNA. mirRNA genes are first transcribed in the nucleus by RNA polymerase II as a precursor RNA, called pri-mirRNA, which is capped with 7-methylguanosine (m7GpppG) and tailed with poly A, just like mRNA. Part of the pri-mirRNA forms double strands (ds) with one or several stem-loop structures dubbed hairpins. Drosha, a member of the RNase III family, then cuts away the single-stranded parts, resulting in one or several double-stranded pre-mirRNAs that are hairpins (stem-loops) of about 70–90 bp in length [3]. The pre-mirRNA is transported to the cytoplasm, where the loop is cut away by Dicer, another member of the RNase III family, leaving only the ds-stem, which is the mature mirRNA of about 18–25 bp in length. The ds-mirRNA is unwound to single-stranded mirRNA to exert its gene-regulating effects. Many mature mirRNAs are very similar in sequence. Those that differ in only one or two nucleotides are assigned a lower-case letter suffix, such as mir-20a and mir-20b [4]. This small difference makes it difficult, if not impossible, to detect a specific mirRNA with hybridization-based techniques without noise, i.e., without picking up signals from other similar mirRNAs.

Several mechanisms have been shown for mirRNA to silence mRNA. One mechanism is similar to RNA interference (RNAi) by siRNA, in which mirRNA base-pairs with a complementary sequence within the targeted mRNA. The mirRNA is coupled to the RNA-induced silencing complex (RISC), which is also the effector of RNAi [5]. Within RISC, the targeted mRNA is cleaved into two fragments by argonaute proteins. The 3′ fragment is further destroyed in the cytoplasm by the exonuclease Xrn1, while the 5′ fragment is degraded by the exosome, a collection of exonucleases dedicated to 3′-to-5′ RNA degradation [6]. Moreover, a short polyuridine (poly-U) tail is subsequently added to the 3′-end of the 5′ fragment, which facilitates decapping of the 5′-end of the 5′ fragment and the ensuing 5′-to-3′ destruction of the 5′ fragment [6]. Also similar to siRNA, when a mirRNA only partially pairs with its targeted mRNA, it cannot trigger mRNA cleavage but will block translation of the mRNA into protein [6]. In this case, several mirRNAs are usually required to bind the same targeted mRNA to enforce the inhibition of protein translation [6]. These mirRNAs, after binding with argonaute proteins, block translational initiation and then drive the mRNA into a P-body, which is the cytoplasmic site of mRNA decapping and degradation. This sequestration of targeted mRNA in a P-body also excludes it from ribosomes, thus further blocking its translation into protein [7].

Expression of mirRNAs is considered highly specific for tissues and developmental stages [8], although little is known about how mirRNA expression is regulated. A few recent studies further show that it is also tumor specific and thus can be used for tumor characterization [9, 10], unlike many oncogenes or tumor suppressor genes whose aberrant expression occurs more universally among different tissues and tumors. Some mirRNAs are transcribed as multicistronic primary transcripts [11]. Human breast cancer samples manifested aberrant expression of many mirRNAs, which may constitute a molecular signature of breast cancer [12, 13]. However, because technical limitations meant that the comparison was made with normal breast tissue, which is dominated by adipose tissue, this concept needs to be confirmed by comparison with normal mammary epithelial cells, considering that mirRNA expression may be tissue specific.

C-myc is an immediate early growth response gene that is expressed during G0/G1 transit of the cell cycle progression upon exposure of cells to extracellular growth stimuli, such as estrogen in the case of breast cancer cells. C-Myc, the protein product of c-myc gene, exerts a variety of physiologic functions, the most prominent ones of which are promotion of both cell proliferation and apoptosis [14]. Virtually all types of human malignancies have been reported to have c-myc overexpression at high frequencies with gene amplification in certain cases. These alterations occur in roughly 50% of human breast cancer biopsies [15]. Numerous animal experiments have also causally linked aberrant expression of c-myc to the development of cancer at the mammary gland and other body sites [14, 16]. Virgin female MMTV-c-myc transgenic mice, in which c-myc expression is targeted to the mammary gland by mouse mammary tumor virus (MMTV) long terminal repeat, spontaneously develop mammary adenocarcinomas that are both highly proliferative and highly apoptotic [15]. Therefore, this transgenic line is a good animal model not only clearly demonstrating the carcinogenic role of c-myc in the breast, but also perfectly reflecting the dual function of c-Myc, i.e., promotion of both proliferation and apoptosis [15].

Mechanistically, c-Myc protein exerts most of its functions by acting as a transcription factor, although emerging evidence suggests that non-transcriptional mechanisms are also involved [14]. Approximately 10–15% of genes in the human genome are suggested to be c-Myc targets [17, 18]. However, very few genes have been shown to encompass a c-Myc recognition DNA sequence, the so-called E-box element, in their regulatory regions [15]. This discrepancy indicates that regulation of many c-Myc target genes may occur via other mechanisms, an important one of which may be mirRNA. Indeed, c-myc overexpression in cell lines of human B-cell lymphomas up-regulated the expression of mirRNAs from the mir-17-92 cluster that is localized to chromosome 13, the mir-106a cluster on chromosome X and the mir-106b cluster on chromosome 7 [19]. Studies with the chromatin immunoprecipitation approach also show the binding of c-Myc protein at these chromosomes, suggesting the transcriptional regulation of these clusters of mirRNAs by c-Myc [19].

Like protein-encoding genes, mirRNA-encoding genes also manifest high frequencies of genomic alteration in various human malignancies, including breast cancer [20]. This is not surprising since over one-half of the mirRNAs are localized to genomic fragile sites [21, 22] or cancer susceptibility loci [23]. Numerous genetic studies have suggested the connections of chromosomal regions 13q31, Xq26, and Xp11 with human cancers [24]. Gain of 13q31 has been reported for Wilms tumors, liposarcoma, lymphoma, and other types of malignancies [25–27], whereas deletion of this region is shown for gastric adenocarcinomas as well as squamous cell carcinomas in the larynx and head and neck [28–30]. The fact that both gain and loss of this region associate with cancer suggests that this region may contain both oncogenes and tumor suppressor genes, and its change may vary among different types of malignancy. Several cancer-related genes are known to be localized at this region, including GPC5 and C13orf25 [26, 31]. More importantly, at least some mirRNAs in the mir-17-92 cluster that are localized in this chromosomal region, such as mir-17 and mir-92-1, show increased expression in both human and rat breast cancers or promote breast cancer cell growth [32, 33]. Loss of heterozygosity of Xq26 has been associated with cancer of the breast and ovary [34–36]. Xq26 harbors GPC3, a putative tumor suppressor gene known to be silenced in breast cancer and many other cancers [37], as well as cancer-testis genes CT45 [38] and MAGE-C1 [39]. A nearby region (Xq27) also contains a gene susceptible for the development of prostate cancer and testicular cancer [40–42]. The Xp11 region contains several cancer-testis genes including members of SSX and XAGE [43, 44], in addition to UXT that is widely expressed in many cancers [45]. Xp11 is also the region where BRCA1 tumor suppressor gene exerts its suppression of selected X-linked genes [46] and shows frequent rearrangement in a variety of cancers [47, 48].

In this study, we show that some mirRNAs localized in the 13q31, Xq26, Xp11, and other chromosomal regions manifest altered expression in mammary tumors developed from MMTV-c-myc transgenic mice. We also show that a novel non-coding RNA transcribed from chromosome 19 has a markedly decreased expression in the tumors. These data suggest that the mirRNAs and the novel non-coding RNA identified herein may be mechanistically behind the contribution of these chromosomal regions to breast cancer formation or progression and behind the carcinogenic role of aberrant c-myc expression.

Materials and methods

RNA extraction

Total RNA was extracted from frozen tissues of mammary tumors (MT) and adjacent proliferating mammary glands of virgin female MMTV-c-myc transgenic mice. As controls, total RNA samples were also prepared from mammary glands (MG) of virgin wild type (Wt) female littermates or from lactating mammary glands (LMG) of Wt littermates at the 15th day of lactation. About 100 mg of frozen tissue was homogenized by Polytron in 1 ml TRIzol (Invitrogen, Cat. 15596–026) and then incubated at room temperature for 10 min. After centrifugation at 12,000 × g for 10 min, the supernatant was transferred into a new RNase-free tube along with 0.2 ml of chloroform. The tube was shaken vigorously and incubated on the bench for 5 min. After centrifugation at 12,000 × g for 20 min at 4°C, the supernatant was transferred into a new tube, and 2.5 volumes of 100% ethanol and 0.1 volume of 0.3 M sodium citrate were added. The total RNA was precipitated at −20°C for at least 1 h, followed by washing with 70% ethanol. The RNA pellet was resuspended in DEPC-treated ddH2O. RNA quality and quantity were determined by UV absorbance measurement at 260 nm (OD260) and agarose gel electrophoresis.

mirRNA microarray

Total RNA (20 μg) samples extracted from MT and LMG tissues were sent to LC Sciences LLC (Houston, TX) for mirRNA microarray assay. All probe sequences in the array were based on Sanger miRBase Release 8.1, which represents the updated validated mouse mirRNA sequences. LMG RNA was labeled with Cy3 dye (shown in green) and tumor RNA was labeled with Cy5 dye (shown in red), followed by hybridization with the probe-containing chip. The signals were presented after background subtraction, normalization and detection evaluation. Statistical significance was set at P < 0.01.

RT, PCR, and real-time PCR assays of mirRNAs

Because pre-mirRNAs and mature mirRNAs are short sequences that could not be analyzed with routine procedures of reverse transcription (RT) and PCR, we amplified these mirRNAs by using a method modified from the literature [49–51]. In brief, a poly A tail was added to mirRNA by using the A-Plus™ Poly(A) Polymerase Tailing Kit (Epicentre, WI). cDNA was then synthesized by using the MMLV Reverse Transcriptase 1st-Strand cDNA Synthesis Kit (Epicentre, WI), and polyA(3) (Table 1) was used as the specific RT primer.

Table 1 Primers used for real-time PCR

To amplify mir-20a, the mature mir-20a sequence was used as the forward primer and the poly A(3)B primer (Table 1) was used as the reverse primer in a PCR reaction, with Hot-Start PCR Mastermix (Denville, NJ) supplying the other reagents. The PCR program included an initial denaturation at 95°C for 5 min, followed by 35 cycles of denaturation at 95°C for 45 s, primer annealing at 57°C for 45 s, and extension at 72°C for 45 s. To amplify other mirRNAs, the TaqMan MicroRNA Reverse Transcription kit (Cat. 4366595) from Applied Biosystems (Foster City, CA) was used for the RT reaction to convert RNA to cDNA. TaqMan MicroRNA Kit-mmu-miR-9 (Cat. 4373371) was used for real-time PCR detection of mature mir-9 with Eppendorf HotMasterMix (Cat. 954140181). The target sequence of this assay is UCUUUGGUUAUCUAGCUGUAUG, which specifically detects mature mir-9. Each sample was analyzed in triplicate, together with an RT negative control and a PCR no-template control. PCR reactions were run on an Mx4000™ Multiplex Quantitative PCR System (Stratagene, CA) with the following program: 1 cycle of 95°C for 10 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. Signal was detected at the end of each cycle. The cycle number at which the reaction crossed an arbitrarily placed threshold (Ct) was determined for each sample. The relative expression level of mature mir-9 from each sample was described using the equation $2^{\Delta C_T}$, where $\Delta C_T = C_{T,\mathrm{tumor}} - C_{T,\mathrm{NMG}}$ or $\Delta C_T = C_{T,\mathrm{tumor}} - C_{T,\mathrm{LMG}}$. Pre-mirRNA-20a and pre-mirRNA-20b levels were determined using the Brilliant SYBR Green Master Mix system (Stratagene, CA) and the same real-time PCR equipment. U6 RNA was used as a loading control. The comparative quantification method was used for data analysis. Moreover, routine PCR was also conducted to verify pre-mir-221 and pre-mir-222 levels. All PCR products were visualized in 3–4% agarose gels or in 12% polyacrylamide gel after staining with ethidium bromide for 30 min.
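The Ct-based comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis actually run on the instrument. It assumes roughly 100% amplification efficiency (a doubling per cycle) and takes the fold change of a sample over a reference as 2 raised to the Ct difference, with the sign chosen so that a sample crossing the threshold earlier gives a value above 1. The function names and the Ct values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of replicate Ct values."""
    return sum(values) / len(values)


def fold_change(ct_sample, ct_reference):
    """Relative expression of sample over reference.

    Assumption: product doubles every cycle, so each cycle of difference
    in Ct corresponds to a 2-fold difference in starting template.
    A lower Ct means more template, hence reference minus sample.
    """
    delta_ct = ct_reference - ct_sample
    return 2.0 ** delta_ct


# Hypothetical triplicate Ct values, for illustration only.
tumor_ct = [20.1, 19.9, 20.0]
lmg_ct = [30.0, 30.2, 29.8]

ratio = fold_change(mean(tumor_ct), mean(lmg_ct))
print(f"tumor / LMG relative expression: about {ratio:.0f}-fold")
```

With these made-up numbers, a difference of about ten cycles corresponds to a change on the order of a thousand-fold, which shows how sensitive the measure is to small Ct shifts.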

RT-PCR of Malat-1

To amplify Malat-1 by PCR, total RNA samples were reverse transcribed using hexamer primers. PCR amplification of Malat-1 cDNA was conducted with forward primer mCh19F1 and reverse primers mCH19R1 and mCH19R2 in combination with mir-R20a and polyA(3)B primers (Table 1). PCR products were separated and visualized in 2% agarose gel.

DNA sequencing and in-silico PCR

Genomic DNA samples were prepared from four randomly selected mammary tumors from MMTV-c-myc transgenic mice by using phenol and chloroform. DNA sequencing of genomic DNA, cDNA and plasmid DNA was performed by Genewiz, Inc. (South Plainfield, NJ). Sequences were analyzed using ABI Prism 3730xl DNA analyzers. Mouse genome (updated in July 2007) based in-silico PCR was performed at UCSC Genome Bioinformatics Site (http://genome.ucsc.edu/cgi-bin/hgPcr).

Results

Microarray of mirRNAs

Our microarray results gave data on the expression of 358 mirRNAs in MT and LMG tissues. Comparison of expression profiles from these two tissues showed that 50 mirRNAs manifested a 2-fold or greater increase and 59 mirRNAs manifested a 2-fold or greater decrease in expression in the MT tissue. For the majority of these 109 mirRNA genes, the expression was changed more than 5-fold, as listed in Tables 2 and 3. Lactating mammary glands were used for comparison because they represent the fully differentiated status of mammary epithelial cells, as opposed to the cancer cells that are undifferentiated, whereas normal mammary tissue of virgin female mice consists mainly of adipose tissue and thus is not a valid control for mammary tumors of epithelial origin (Fig. 1).
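The 2-fold cut-off used to count increased and decreased mirRNAs can be illustrated with a short Python sketch. It is only a schematic of the thresholding step: the function name, the probe names and the signal values are hypothetical, and the vendor's normalization and P-value filtering are not reproduced here.

```python
def classify_by_fold(tumor_signal, control_signal, threshold=2.0):
    """Split probes into increased and decreased lists.

    tumor_signal and control_signal map probe name -> normalized signal.
    A probe counts as increased when tumor / control >= threshold and as
    decreased when tumor / control <= 1 / threshold.
    """
    increased, decreased = [], []
    for name, tumor in tumor_signal.items():
        control = control_signal[name]
        if tumor <= 0 or control <= 0:
            continue  # undetected in one channel, ratio undefined
        ratio = tumor / control
        if ratio >= threshold:
            increased.append(name)
        elif ratio <= 1.0 / threshold:
            decreased.append(name)
    return increased, decreased


# Hypothetical normalized signals for four probes.
tumor = {"probe_a": 800.0, "probe_b": 100.0, "probe_c": 300.0, "probe_d": 0.0}
control = {"probe_a": 100.0, "probe_b": 450.0, "probe_c": 250.0, "probe_d": 50.0}

up, down = classify_by_fold(tumor, control)
print("increased:", up)
print("decreased:", down)
```

Raising `threshold` to 5.0 would give the stricter subset that the tables of 5-fold changes are built from.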

Table 2 Mir-RNAs that show 5-fold or more increase in the expression in MMTV-c-myc mammary tumors than in lactating mammary glands
Fig. 1

Image of mir-microarray data. (a) Overlapping Cy3 (green) and Cy5 (red) chip images. Red or green colors indicate a higher expression in the tumor or LMG, respectively. Yellow color indicates no difference between LMG and the tumor. (b) Background-subtracted and normalized signals emitted from Cy3 (Sample A) and Cy5 (Sample B) labeled transcripts presented as log ratios. The red dots in the scatter plot represent those mirRNAs whose expression differs significantly between LMG and the tumor, while the blue dots represent those mirRNAs that did not show different expression. The black diagonal line indicates that the ratio of sample A to sample B is 1, i.e., no difference between LMG and the tumor. The green diagonal line indicates that the ratio of sample B to sample A is 1.5, and the red dots within the area between the black and green lines indicate more than 1.5-fold increases in the expression in the tumors, relative to LMG. The light-blue line indicates that the ratio of sample A to sample B is 1.5, and the dots in the area between the black and the light-blue lines indicate more than 1.5-fold decreases in the expression in the tumors, relative to LMG

Clusters of altered mirRNAs

Computational analysis of the chromosomal locations of the 109 mirRNAs manifesting altered expression in the microarray revealed that some of the mirRNAs, localized closely at the same chromosomal region, showed similar alterations, i.e., showing either increased or decreased expression in the tumors. This indicates that these mirRNAs may be processed from the same RNA transcript and thus may be clustered together. As illustrated in Fig. 2, we identified eight clusters of mirRNAs from the 109 mirRNAs, which are arbitrarily named by the first mirRNA, i.e., mir-466, mir-25, mir-17, mir-99b, mir-145, mir-501, mir-221, and mir-92-2, respectively. As shown in Fig. 2, mir-466, mir-669c, and mir-669a-2 are located within the region of nucleotides 14025771–14028113, a 2,342-bp fragment at 2qA1, and are thus grouped together and coined as cluster mir-466. The human counterparts of these mirRNAs are currently unclear. The fold changes of these three mirRNAs are indicated above each mirRNA, whereas the size of each pre-mirRNA is shown below each mirRNA. For instance, the pre-mirRNA of mir-466 is 73 bp in size and shows a 46.21-fold increase in the tumor. Similarly, mir-25, mir-93, and mir-106b were grouped as cluster mir-25 because they are collectively located within a 497-bp region at 5qG1 of the mouse chromosome or 7q22.1 of the human chromosome. Mir-17, mir-18, mir-20a, mir-19b-1, and mir-92-1 are grouped as cluster mir-17 because they are located in an 844-bp region at 14qE4 (13q31.3 in human). The mir-99b cluster is transcribed from a 692-bp fragment at 17qA3.2 (19q3.41 in human) and contains mir-99b, mir-7e, and mir-125a. The mir-145 cluster is transcribed from a 1,433-bp fragment at 18qD3 (5q33.1 in human) and contains mir-145 and mir-143. Mir-501 and mir-362 are located in an 804-bp region at XqA1.1 (Xp11.22 in human) and are thus grouped as cluster mir-501. Mir-221 and mir-222 are located in a 678-bp region of XqA1.3 (Xp11.3 in human) and are grouped as cluster mir-221. Mir-92-2, mir-19b-2, mir-20b, and mir-106a are located at XqA5 (Xq26.2 in human) and are thus grouped as cluster mir-92-2.
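The grouping logic described above (neighbouring mirRNAs on the same chromosome that change in the same direction are placed in one cluster) can be sketched as follows. This is an illustrative reconstruction, not the procedure actually used. The `max_gap` distance, the record layout, and all names and coordinates in the example are hypothetical. The sign of `fold` follows the convention of Fig. 2, with + for an increase and − for a decrease.

```python
from collections import namedtuple

# fold > 0 means increased in tumor, fold < 0 means decreased.
Mir = namedtuple("Mir", ["name", "chrom", "start", "end", "fold"])


def group_clusters(mirs, max_gap=3000):
    """Group adjacent mirRNAs that lie within max_gap bp of each other
    on the same chromosome and change in the same direction.

    max_gap is an assumed cut-off, chosen only for illustration.
    Singletons are dropped, so only clusters of two or more are returned.
    """
    clusters = []
    for mir in sorted(mirs, key=lambda m: (m.chrom, m.start)):
        if clusters:
            last = clusters[-1][-1]
            close = last.chrom == mir.chrom and mir.start - last.end <= max_gap
            same_direction = (last.fold > 0) == (mir.fold > 0)
            if close and same_direction:
                clusters[-1].append(mir)
                continue
        clusters.append([mir])
    return [c for c in clusters if len(c) >= 2]


# Hypothetical records, not taken from the microarray data.
example = [
    Mir("mir-x1", "chrA", 1000, 1080, +12.0),
    Mir("mir-x2", "chrA", 1500, 1575, +3.0),
    Mir("mir-x3", "chrA", 900000, 900070, +4.0),
    Mir("mir-y1", "chrB", 200, 280, -6.0),
    Mir("mir-y2", "chrB", 700, 790, -2.5),
]

for cluster in group_clusters(example):
    print([m.name for m in cluster])
```

A distance cut-off alone cannot show that neighbouring mirRNAs share a primary transcript, so a grouping of this kind remains a working hypothesis.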

Fig. 2

Illustration of mir-466, mir-25, mir-17, mir-99b, mir-145, mir-501, mir-92-2, and mir-221 clusters of mirRNAs. Chromosomal locations of these eight clusters and locations of their human counterparts are indicated above the corresponding chromosome, whereas the beginning and ending nucleotides of each cluster are given below the chromosome. The number above each mirRNA is the multiple of change (+ for increase, − for decrease) in its expression in the tumor relative to its expression in LMG. The size of each pre-mir-RNA and the interval between two mir-RNAs are given under each mir-RNA. The full length of each cluster is also indicated. Whether the cluster is located in an intergenic region or within a gene is also indicated at the left side of each cluster

As illustrated in Fig. 2, the pri-mirRNA transcript for the mir-466 cluster is located in intron 10 of Sfmbt2 gene, whereas the mir-25 and mir-501 clusters are located in the complementary strand of intron 13 of Mcm7 gene or introns 2 and 3 of Clcn5, respectively. On the other hand, the mir-17, mir-145, mir-221, and mir-92-2 clusters are located in intergenic regions (Fig. 2). To explore whether altered expression of mirRNAs was associated with genetic mutations, we used PCR method to amplify the genomic DNA of all eight clusters from four randomly selected MMTV-c-myc mammary tumors and sequenced the PCR products. No mutations were found in these chromosomal regions in these tumors (data not shown).

Verification of some mirRNAs by RT-PCR and cDNA sequencing

To verify the microarray data, we linked the 3′-end of pre-mirRNAs or mature mirRNAs to an artificial sequence that contains poly A at the 5′-end and a reverse PCR primer sequence at the 3′-end. Total RNA samples were reverse transcribed to cDNA with a poly T primer (poly A(3); Table 1). The product of the ensuing PCR amplification, with the mirRNA sequence as forward primer, should be about 60–70 bp for mature mirRNAs and about 100 bp for pre-mirRNAs, as described in detail in the literature [49]. As shown in Figs. 3 and 4a, we verified the increased expression of pre-mir-222 and pre-mir-20a in MMTV-c-myc mammary tumors. During these verifications, we realized that gel visualization of the pre-mirRNAs and mature mirRNAs was not sensitive enough to faithfully reflect the extent of changed expression. Because the pre- or mature mirRNAs are so small, a very large fold change in expression is required to create enough DNA to be discerned in the gel, as discussed later. We therefore used real-time PCR to quantify several pre-mirRNAs and mature mirRNAs. As shown in Fig. 5, pre-mir-20a and pre-mir-20b as well as mature mir-9 showed similar extents of change as seen in the microarray (Table 2, Fig. 2).

Fig. 3

RT-PCR detection of pre-mir-20a and pre-mir-222 with U6 RNA as a loading control from a virgin FVB mouse, lactating mammary glands (LMG) from a mouse at the 15th day of lactation, and mammary tumors from three MMTV-c-myc transgenic mice (T1, T2, and T3). The PCR products were separated in 12% polyacrylamide gel and visualized by staining with ethidium bromide

Fig. 4

Detection of mature mir-20a, chr19 RNA and Malat-1. (a) Visualization of mature mir-20a and a larger RNA in 12% polyacrylamide gel. In lactating (LMG) or virgin mammary glands (MG) from wild type (Wt) mice, the levels of mature mir-20a at about 65 bp are lower than the levels in mammary tumors (MT) and proliferating mammary glands (MMTV) from MMTV-c-myc transgenic mice. An additional band occurring at about 100 bp shows a decreased level in MMTV-c-myc mammary glands and decreases to undetectable levels in the tumors. (b) PCR amplification of hexamer primer-derived RT products with primers specific for Malat-1 does not show any difference between the tumors and normal mammary tissues. (c) Cloning the 65-bp band into TOPO vector followed by plasmid sequencing confirms that this band is mature mir-20a (lower-case letters at the 5′-end of the sequence). (d) Plasmid sequencing also shows that the 100-bp band is generated from mis-priming of the mir-20a sequence (used as the forward primer) with part of Malat-1 (the 36-bp sequence in lower-case letters), which is a large non-coding RNA transcribed from chr19

Fig. 5

Real-time RT-PCR detection of mature mir-9, pre-mir-20a, and pre-mir-20b in LMG and in tumors. (a) Fold-increases in pre-mir-20a and pre-mir-20b in MMTV-c-myc mammary tumors relative to LMG. (b) Amplification curves for mature mir-9 from six tumor and four LMG samples. The difference between the tumors and the LMG samples, but not within those groups, is statistically significant. After background subtraction and normalization, different levels of mature mir-9 are calculated by ∆Ct and the relative expression levels are converted into $2^{\Delta C_t}$ as shown in the chart. (c) The level of mature mir-9 is over 1,000 times higher in the tumors than in LMG samples

Many mirRNAs are similar in sequence, differing in only one or two nucleotides, such as mir-20a and mir-20b or mir-106a and mir-106b that all showed changes in MMTV-c-myc tumors (Tables 2 and 3, Fig. 2). Since all current methods for detection of mirRNAs are hybridization-based and therefore cannot distinguish a mismatch of one or two nucleotides, it is possible that microarray, PCR or real-time PCR experiments picked up signals from similar mirRNAs. Moreover, these PCR-based techniques involve only one gene-specific primer, unlike the routine PCR method that involves two specific primers, and thus dramatically increase the risk of picking up non-specific signals. To check whether we fell into these potential pitfalls, we purified pre-mir-222 and pre-mir-20a from 4% agarose gel and sequenced the cDNA. The results confirmed that the cDNA samples purified from the gel were indeed mir-222 and mir-20a (data not shown), indicating that at least the cDNAs we purified were dominated by the correct mirRNAs.
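The cross-detection concern can be made concrete by counting mismatches between aligned sequences of equal length. The sketch below uses the mature mir-9 target sequence quoted in the methods. The second sequence is a hypothetical single-base variant built only for illustration, and the cut-off of two mismatches is an assumption that mirrors the "one or two nucleotides" wording rather than a measured hybridization limit.

```python
def mismatches(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def possible_cross_targets(probe, panel, max_mismatch=2):
    """Names in panel whose sequence differs from probe by at most
    max_mismatch bases (same-length sequences only).

    max_mismatch = 2 is an assumed cut-off, not an experimental value.
    """
    return [
        name
        for name, seq in panel.items()
        if len(seq) == len(probe) and mismatches(probe, seq) <= max_mismatch
    ]


MIR9 = "UCUUUGGUUAUCUAGCUGUAUG"   # target sequence given in the methods
variant = MIR9[:-1] + "A"          # hypothetical one-base variant
unrelated = "A" * len(MIR9)        # hypothetical unrelated sequence

panel = {"mir-9": MIR9, "variant": variant, "unrelated": unrelated}
print(mismatches(MIR9, variant))
print(possible_cross_targets(MIR9, panel))
```

Any sequence flagged by such a comparison is one that a hybridization-based readout might not separate from the intended target, which is why sequencing of the products was used as a check.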

Identification of a non-coding RNA decreased in the tumors

During PCR amplification of the mature mir-20a, we fortuitously amplified an additional cDNA, which together with the linked poly A tail was about 100 bp in length, larger than the mature mir-20a detected at 65 bp (Fig. 4a). This RNA transcript was highly expressed in mammary glands from virgin and lactating mammary glands of wild type mice but is undetectable in MMTV-c-myc mammary tumors (Fig. 4a). It is intriguing, also, that its expression in the proliferating mammary glands from MMTV-c-myc mice was lower than that in normal (virgin) glands or lactating glands of wild type mice, suggesting a down-regulation by c-Myc. The further decrease in the tumors might be because the tumor cells expressed even higher levels of c-myc. In contrast, the mature mir-20a showed high levels in MMTV-c-myc mammary glands and tumors compared with normal or lactating glands, in line with the microarray data.

We purified and cloned this 100-bp cDNA into a TOPO vector and sequenced the plasmids. The results showed that it matched the mouse metastasis associated lung adenocarcinoma transcript 1 (Malat-1; also called Neat-2), which is a large, non-coding RNA transcribed from the 5,800,000-bp region of chromosome 19 (chr19). However, it is currently unclear whether this RNA, coined herein as chr19 RNA, is transcribed from the same strand as Malat-1 and thus is part of the Malat-1 RNA. Sequencing of this 100-bp cDNA suggested that it was amplified accidentally because part of its sequence is significantly homologous with the mir-20a sequence used as the forward primer (Fig. 4d), whereas sequencing of the 65-bp band confirmed that it was mature mir-20a (Fig. 4c). Analysis of 30,000 bp at the 5′-end and 50,000 bp at the 3′-end around the corresponding genomic region did not find any open reading frame, suggesting that this chr19 transcript is a non-coding RNA, similar to Malat-1. However, when we used routine hexamer primers to perform RT and then PCR-amplified the cDNA with primers flanking this chr19 transcript (mCH19F1, mCH19R1, mCH19R2, and mir-20a; Table 1), the level of the PCR products did not show any difference between normal mammary glands and mammary tumors (Fig. 4b). Therefore, the difference in the level of chr19 RNA seemed to occur only when the total RNA was first linked to a poly A tail, implying that the difference between normal mammary glands and tumors might occur only when the RNA was truncated or broken, regardless of whether it was part of Malat-1.

Discussion

Despite the fact that there has been a flood of studies on mirRNAs in various areas of biological and medical research in recent years, most of these studies do not sufficiently address the limitations of the techniques used for the detection of mirRNAs. Because many mature mirRNAs differ in only one or two nucleotides, no currently available method can really distinguish such a small difference. Northern blot and PCR are hybridization-based methods and thus cannot avoid the possibility of detecting similar mirRNAs. To determine whether we fell into this potential pitfall, we purified RT-PCR products from agarose gel and sequenced the cDNA. This method guarantees that the majority of the purified cDNA is specific but it cannot tell whether there still is some minor noise. The best corrective measure is to clone RT-PCR products into TA or TOPO vector and select a large number of single plasmid clones for sequencing, but this method is not used herein because it is time- and resource-consuming. Therefore, we cannot rule out the possibility that our data of mir-20a, mir-20b, mir-221, and mir-9 that are verified with PCR or real-time PCR still contain minor signals from similar mirRNAs.

The small size of each mirRNA, especially mature mirRNAs, makes it difficult to use routine PCR to visualize the difference in expression between samples. For a 1-kb RNA fragment, a 1-fold increase in the expression adds 1,000 more nucleotides per copy, which is discernible in the gel. However, for a mature mirRNA, a 1-fold increase in the expression adds only ~20 nucleotides, which is too little to be discerned in the gel. Therefore, we are not able to use the routine PCR method to verify more mirRNAs that show changed expression in the microarray, especially mature mirRNAs (data not shown). This likely explains the inconsistency between the real-time data and the routine PCR data visualized in gels. For instance, our real-time PCR results show that the pre-mir-20a and pre-mir-20b levels are more than 60-fold higher in MMTV-c-myc mammary tumors compared with lactating mammary glands, which is consistent with the microarray data (Table 2). However, such a dramatic difference is not reflected in the gel (Figs. 3 and 4), although on the gel the levels are still higher in the tumors (Table 3).
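The size argument above amounts to simple arithmetic if one assumes, as a simplification, that the stained signal in a gel scales with the total number of nucleotides in a band (copies multiplied by length). The Python sketch below only illustrates that assumption. The function names are hypothetical and no staining chemistry or detection limit is modelled.

```python
import math


def total_nucleotides(length_nt, copies):
    """Total nucleotides in a band, taken here as a proxy for stained signal."""
    return length_nt * copies


def copies_to_match(short_length_nt, long_length_nt):
    """Copies of the short product needed to equal the nucleotide content
    of a single copy of the long product."""
    return math.ceil(long_length_nt / short_length_nt)


# One extra copy of a 1-kb fragment versus one extra copy of a ~20-nt mirRNA.
print(total_nucleotides(1000, 1))
print(total_nucleotides(20, 1))
# Copies of the ~20-nt product needed to equal one copy of the 1-kb product.
print(copies_to_match(20, 1000))
```

Under this assumption, the same fold change in copy number produces a much smaller absolute change in stained material for a short product than for a long one.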

Table 3 Mir-RNAs that show 5-fold or more decrease in the expression in MMTV-c-myc mammary tumors than in lactating mammary glands

A recent study that maps global binding sites of c-Myc protein to the human genome suggests that c-Myc potentially occupies >4,000 genomic loci, with the majority near proximal promoter regions associated frequently with CpG islands [52]. This finding supports the estimation that about 10–15% of the known genes are regulated by c-Myc [17]. It is conceivable that some of these c-Myc binding sites may be regulatory regions of mirRNAs or other non-coding genes, which is supported by a recent study showing that c-Myc indeed regulates, and mainly suppresses, many mirRNAs [53]. Different individual mirRNAs have also been reported to be activated or repressed by c-Myc, as exemplified by mir-17-92 [54]. The mir-221 and mir-17 clusters as well as mir-106a have also been shown to be associated with c-myc gene amplification in human neuroblastomas [55]. In the present study, we used c-Myc transgenic tumors to identify the mirRNA genes that are possibly regulated by c-Myc in vivo. To our surprise, about one-third of the mirRNAs (109 of 358) show up- or down-regulation in c-Myc expressing mammary tissues compared with normal or lactating mammary glands. Some of the 109 mirRNAs may be culled out since their altered expression is not further confirmed by using other methods and, as aforementioned, since it is practically impossible to distinguish similar mirRNAs with currently available technology. Moreover, some of the deregulated mirRNAs may be regulated by c-Myc indirectly. On the other hand, considering tissue specificity, there may be many other mirRNAs that are c-Myc targets but not affected in the breast. As a net result after balancing these issues, the total number of mirRNAs as direct targets of c-Myc may still be large.

We arbitrarily cluster together the mirRNAs that are located in the same chromosomal region because they are likely to be generated from the same transcript. The mirRNAs in the same clusters are changed to the same direction, i.e., either increased or decreased together, which also suggests that they may be processed by Drosha from the same transcript. However, different mirRNAs in the same cluster manifest quite different expression levels. For instance, mir-466 shows 46-fold increase but mir-669c increases only 2.5 fold (Fig. 2). One of the possibilities is that the stability of each pre-mirRNA is different. The meaning of increases or decreases in the mirRNAs of the same cluster is unknown. Since inhibition of protein translation of mRNA usually requires collaboration of several mirRNAs collectively bound to the same mRNA, it is possible that the mirRNAs in the same cluster may work collaboratively to enhance the efficiency on inhibition of translation of a target gene.

Eight clusters, comprising 24 mirRNAs in total, are found to be aberrantly expressed in c-Myc induced mouse mammary tumors in our study. While the human counterpart of the mir-466 cluster, located at the 2qA1 mouse chromosomal region, has not yet been identified, the mir-25, mir-17, mir-99b, mir-145, mir-501, mir-221, and mir-92-2 clusters of mirRNAs are localized at 7q22.1, 13q31.3, 19q3.41, 5q33.1, Xp12.22, Xp11.3, and Xq26.2 of human chromosomes, respectively. While little is known about whether 19q3.41 is genetically altered in human cancers, the other six chromosomal regions are known to have genetic alterations at high frequencies in various human cancers [56–65]. Although the effects of these genetic changes may be attributable to the protein-encoding oncogenes or tumor suppressor genes localized in these regions, mirRNA genes may also contribute to the link between these chromosomal regions and cancer. In support of this idea, many of the mirRNAs in these eight clusters have been shown in multiple studies to be cancer-related. Therefore, some mirRNAs may be part of the mechanism of c-Myc induced mammary carcinogenesis.

A fortuitous but exciting finding in our study is an RNA whose level is decreased in c-myc expressing mammary glands compared with normal and lactating mammary glands. Its expression is further decreased to an undetectable level in the mammary tumors that express even higher levels of c-myc, which suggests that the level of this RNA may be reduced by c-Myc. Sequencing of this RNA shows that it is a transcript from chr19 and is homologous with part of Malat-1, a large non-coding RNA. However, when we used Malat-1 specific primers to amplify cDNAs that had been reverse transcribed with routine hexamer primers from the tumor and normal mammary tissues, we could not detect decreased expression of Malat-1 in the tumors. Because we do not know whether the chr19 RNA is transcribed from the same strand of DNA as Malat-1, we currently do not know whether it is really part of Malat-1 RNA. We are now performing a systematic study to determine whether this chr19 RNA is transcribed from the (−) or the (+) strand of DNA and to determine its initiation and termination site(s). A search of a few kb of genomic sequence in the vicinity of this transcript did not find an open reading frame, suggesting that it is likely a non-coding RNA. Moreover, this chr19 RNA was identified by adding a poly A tail to the 3′-end. Therefore, even if it is part of Malat-1, it is a fractured form, and the decrease in tumors may occur only in this form, not in the total or intact Malat-1. In other words, the level of the intact chr19 RNA may not differ between the normal and tumor tissues. Its decrease in c-myc expressing mammary tissue or tumors may result from c-Myc induced fracture or breakage of the pre-RNA, since we have shown that c-Myc indeed alters expression of many genes that regulate RNA metabolism and processing [66]. Whether this fractured form of chr19 RNA has any function is unknown. The observations that it is decreased to an undetectable level in MMTV-c-myc tumors and is probably inhibited by c-Myc seem to suggest that it is a tumor suppressor. However, since c-Myc has the dual function of promoting both cell proliferation and apoptosis, it is also possible that this RNA acts as an oncogene, the inhibition of which contributes to the well-known apoptotic feature of the MMTV-c-myc transgenic mammary tumors [15].

In summary, this study reports that many mirRNAs show altered expression in mouse mammary tumors expressing a c-myc transgene. Some of these mirRNAs, including mir-20a, mir-17, mir-92-2, mir-106a, mir-221, and mir-222, which have been shown to be cancer-related, may be grouped into eight clusters. A novel non-coding RNA transcribed from chr19 shows markedly decreased expression in c-myc expressing mammary glands and mammary tumors. These mirRNAs and the chr19 non-coding RNA may contribute to c-myc induced mouse mammary tumors.