Introduction

Expression cloning technology, a powerful tool in elucidating gene function, provides a direct correlation between specific genes and protein activities. This approach has been successfully utilized to identify genes involved in diverse cellular processes, as well as new markers and targets for the diagnosis and treatment of human diseases [1–3].

Expression cloning includes the construction of a cDNA library in a suitable expression vector, introduction and expression of the library in a particular cell type, and subsequent functional screening for the desired biological activity. Screening facilitates the isolation of genes expressed in association with a specific activity.

One of the problems commonly encountered when trying to discover all genes concerned with a certain function is the redundant identification of transcripts that have already been analyzed during functional screening of the cDNA expression library. The recurrence of these transcripts makes the analysis time-consuming and expensive. Use of normalized cDNA libraries, in which the concentrations of different transcripts are equalized, does not resolve this problem. Although cDNA normalization reduces the frequency of highly expressed cDNAs, library complexity remains nearly intact. Thus, recurrence of already known sequences complicates expression cloning just as it does without normalization. Consequently, selective removal (depletion) of such transcripts from the cDNA library is required to significantly accelerate the identification of novel genes during functional screening.

Most cDNA subtractive hybridization protocols developed to identify differences between two cDNA populations or to facilitate EST sequencing [4–6] are not well suited to expression libraries intended for functional screening. For example, methods focusing on amplified plasmid libraries [5, 7] cannot be utilized for viral expression systems, which are more efficient for expression cloning. The widely used suppression subtractive hybridization procedure [4] is only effective for short (restriction endonuclease-treated) cDNA fragments. Only the method of normalization and subtraction of Cap-Trapper-selected cDNA [8] may be used to eliminate analyzed transcripts from cDNA for further EST sequencing and expression cloning. With this method, cDNA normalization and/or subtraction is performed on full-length cDNA prior to library cloning and subsequent use for functional screening. However, the Cap-Trapper protocol comprises multiple steps involving the physical separation of the target cDNA fraction, and requires a large quantity of poly(A)+ RNA.

In this report, we describe a simple cDNA depletion method (designated “DSN depletion”) that ensures the elimination (depletion) of selected sequences from full-length-enriched cDNA. The method employs specific features of the recently characterized duplex-specific nuclease (DSN) from the Kamchatka crab, Paralithodes camtschaticus [9]. DSN has been utilized in a number of molecular biology applications, including cDNA normalization [10–12], single nucleotide polymorphism (SNP) detection [9, 13], and quantitative telomeric overhang determination [14]. The DSN-depletion procedure allows the efficient elimination of up to several hundred known sequences from cDNA prior to library cloning, removes the need for laborious physical separation, and requires minimal hands-on time.

Materials and Methods

Tracer Preparation

Placenta ds cDNA was prepared using commercially available total placenta RNA (Ambion) with a SMART™ PCR cDNA Synthesis Kit (Clontech), according to the manufacturer’s protocol. Zoanthus total RNA was purified as described in [15] and used for ds cDNA preparation with a Mint-Universal kit (Evrogen), following the manufacturer’s protocol II.

Driver Preparation

PCR Fragments

Each PCR fragment of the phosphodiesterase (PDE) gene to be eliminated was amplified in a 25 μl PCR reaction containing 1 ng placenta cDNA, 1 × Advantage 2 reaction buffer (Clontech), 200 μM dNTPs, 0.3 μM each specific PDE primer (Supplemental Table 1), and 1 × Advantage 2 Polymerase Mix (Clontech). Amplification was performed on an MJ Research PTC-200 DNA Thermal Cycler. Depending on the fragment, 18–25 PCR cycles were performed. The following conditions were applied: 95°C for 7 s, 55–65°C for 20 s, and 72°C for 15 s. PCR products were cloned into the pGEM-T Easy vector (Promega). Purified plasmids with the desired inserts were used as templates for amplification (5 ng of each plasmid DNA per 25 μl of the reaction) with the PDE gene-specific primers. In each case, 12 cycles of PCR were performed. Amplified products were purified using the QIAquick PCR Purification Kit (QIAGEN), and mixed together to obtain final concentrations of 10, 20, and 40 ng/μl.
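The driver is specified above by mass concentration, whereas the excess over the tracer is a molar quantity. The following is a minimal sketch of the conversion, not part of the original protocol. It assumes an average mass of 660 g/mol per base pair of ds DNA (a common approximation), a fragment length of about 100 bp, and nine fragments mixed in equal mass; the function name and constants are illustrative only.

```python
# Assumption: average molar mass of one base pair of ds DNA (g/mol).
AVG_BP_MASS = 660.0


def dsdna_ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of ds DNA (ng) into picomoles of molecules."""
    # ng -> g is 1e-9, mol -> pmol is 1e12, hence the factor 1000.
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


FRAGMENT_BP = 100   # approximate fragment length (assumption for illustration)
N_FRAGMENTS = 9     # assumes the mixture holds equal masses of nine fragments
DRIVER_UL = 1.0     # volume of driver added per hybridization

for conc_ng_per_ul in (10.0, 20.0, 40.0):
    total_ng = conc_ng_per_ul * DRIVER_UL
    per_fragment_pmol = dsdna_ng_to_pmol(total_ng / N_FRAGMENTS, FRAGMENT_BP)
    print(f"{conc_ng_per_ul:4.0f} ng/ul driver: "
          f"{per_fragment_pmol:.4f} pmol of each fragment per reaction")
```

Comparing these values with an estimate of the molar amount of each target transcript in the tracer gives a rough idea of the driver excess actually achieved.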

Synthetic Oligonucleotides

For the elimination of individual genes, a pair of synthetic oligonucleotides capable of forming perfect duplexes with both strands of the gene sequence was synthesized (zGFP: 5′-aca tgt gca tac cat tac gct gat gac aat gta gtt caa ttc aaa cc-3′ and 5′-ggt ttg aat tga act aca ttg tca tca gcg taa tgg tat gca cat gt-3′; zRFP: 5′-cgt tta tga cca tta acc tga taa gat tgt agt tct aac atg cta ttg cac gtt tat ga-3′ and 5′-tca taa acg tgc aat agc atg tta gaa cta caa tct tat cag gtt aat ggt cat aaa cg-3′), and purified by HPLC. Paired oligonucleotides were mixed at a final concentration of 1 pmol each.
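Because each driver pair must cover both cDNA strands, the two oligonucleotides of a pair should be exact reverse complements of each other. A short sketch of that check is given below; the helper names are ours, and the sequences are copied from the text above.

```python
# Translation table for lower-case DNA bases.
COMPLEMENT = str.maketrans("acgt", "tgca")


def clean(seq: str) -> str:
    """Remove the triplet spacing used in the text and lower-case the sequence."""
    return seq.replace(" ", "").lower()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_perfect_duplex(oligo_a: str, oligo_b: str) -> bool:
    """True if the two oligonucleotides anneal without mismatches or overhangs."""
    return clean(oligo_b) == reverse_complement(clean(oligo_a))


OLIGOS = {
    "zGFP": (
        "aca tgt gca tac cat tac gct gat gac aat gta gtt caa ttc aaa cc",
        "ggt ttg aat tga act aca ttg tca tca gcg taa tgg tat gca cat gt",
    ),
    "zRFP": (
        "cgt tta tga cca tta acc tga taa gat tgt agt tct aac atg cta ttg cac gtt tat ga",
        "tca taa acg tgc aat agc atg tta gaa cta caa tct tat cag gtt aat ggt cat aaa cg",
    ),
}

for name, (forward, reverse) in OLIGOS.items():
    print(name, "perfect duplex:", is_perfect_duplex(forward, reverse))
```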

Depletion

Tracer cDNA was purified using the QIAquick PCR Purification Kit (QIAGEN), precipitated with ethanol, and dissolved in sterile water to a final concentration of 75 ng/μl. For each experiment, a 2 μl aliquot of solution containing ~150 ng cDNA was mixed with 1 μl of 4 × hybridization buffer (200 mM HEPES-HCl, pH 8.0, 2 M NaCl) and 1 μl driver. Samples were overlaid with mineral oil, denatured at 98°C for 5 min, and allowed to renature. For drivers consisting of PCR fragments, hybridization was performed at 70°C for 1 h. When synthetic oligonucleotides were used as the driver, samples were incubated at 65°C for 40 min. At the end of hybridization, 5 μl of DSN solution comprising 1 Kunitz-unit of DSN (Evrogen) in 1 × DSN buffer (100 mM Tris-HCl, pH 8.0, 10 mM MgCl2, 2 mM DTT) was added to each sample, and the samples were incubated at 68°C for 30 min. Before addition, the DSN solution was preheated at 68°C for 1 min. After the completion of treatment, DSN was inactivated by heating at 95°C for 7 min. Samples were diluted with sterile water to a final volume of 40 μl, and 1 μl aliquots were used for PCR with the M1 primer (5′-aag cag tgg tat caa cgc aga gt-3′). A 25 μl PCR reaction mixture contained 1 × Advantage 2 Polymerase Mix (Advantage 2 PCR Kit, Clontech), 1 × reaction buffer (Advantage 2 PCR Kit, Clontech), 200 μM dNTPs, and 0.3 μM primer. In total, 20 PCR cycles (95°C for 7 s; 65°C for 20 s; 72°C for 3 min) were performed for human placenta cDNA and 17 for Zoanthus cDNA.
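The volumes given above fix the working concentrations during hybridization. The sketch below simply restates that arithmetic; the variable names are ours, and the per-PCR figure is an upper bound because it ignores the cDNA hydrolyzed by DSN.

```python
def diluted(stock_conc: float, stock_vol_ul: float, total_vol_ul: float) -> float:
    """Concentration of a component after dilution into the total volume."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Volumes and stock concentrations taken from the protocol text.
CDNA_UL, CDNA_NG_PER_UL = 2.0, 75.0
BUFFER_UL, DRIVER_UL = 1.0, 1.0
DSN_UL = 5.0
FINAL_UL = 40.0

hyb_ul = CDNA_UL + BUFFER_UL + DRIVER_UL          # hybridization volume
cdna_ng = CDNA_UL * CDNA_NG_PER_UL                # tracer mass per reaction
buffer_x = diluted(4.0, BUFFER_UL, hyb_ul)        # fold strength of the buffer
hepes_mM = diluted(200.0, BUFFER_UL, hyb_ul)      # HEPES-HCl during hybridization
nacl_M = diluted(2.0, BUFFER_UL, hyb_ul)          # NaCl during hybridization
dsn_step_ul = hyb_ul + DSN_UL                     # volume during DSN treatment
max_ng_per_ul = cdna_ng / FINAL_UL                # upper bound after dilution

print(f"hybridization volume: {hyb_ul} ul, tracer: {cdna_ng} ng")
print(f"buffer: {buffer_x}x, HEPES {hepes_mM} mM, NaCl {nacl_M} M")
print(f"DSN treatment volume: {dsn_step_ul} ul")
print(f"at most {max_ng_per_ul} ng input cDNA per 1 ul PCR aliquot")
```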

Quantitative PCR of Depleted Placenta cDNA Samples

Real-time quantitative PCR was performed on duplicate samples using commercially available primers and probes corresponding to the catalytic domains of different PDE genes (Applied Biosystems). Amplified control and depleted cDNA samples were quantitated, purified using the QIAquick PCR Purification Kit (QIAGEN), precipitated with ethanol, dissolved in 50 μl of sterile water, and re-quantified on a spectrophotometer. cDNA (12.5 ng) was analyzed in a total volume of 25 μl. The PCR machine default output was 45 cycles. Because alterations in reference gene concentrations during the normalization and subtraction procedures could not be excluded, the measurements were not normalized to reference genes, but simply represent relative concentrations of each gene using equivalent amounts of cDNA for each library.

Construction of Control and Depleted Zoanthus cDNA Libraries

Amplified cDNA samples were purified using the QIAquick PCR Purification Kit (QIAGEN). cDNA and the pTriplEx2 vector (Clontech) were digested with SfiI. Digested cDNA samples were purified using CHROMA SPIN™-400 columns (Clontech), ligated into the corresponding restriction sites of pTriplEx2, and used for Escherichia coli transformation with the Bio-Rad MicroPulser (Bio-Rad).

Results

DSN-Depletion Strategy

Figure 1 provides details of the molecular events occurring during selective depletion of cDNA using DSN. Double-stranded cDNA (tracer) is mixed with excess driver DNA (representing fragments of the genes to be removed), denatured, and allowed to re-anneal under hybridization conditions. During hybridization, driver molecules form hybrids with the sequences to be eliminated, allowing their removal from the ss fraction. Upon hybridization completion, ds cDNA is hydrolyzed by DSN, and the ss DNA fraction is amplified using long-distance PCR [16]. Amplified cDNA is size-fractionated, and cloned into the vector of choice.

Fig. 1 Scheme of DSN-based depletion of ds cDNA. The black line signifies the transcript of interest, while the gray line represents transcripts to be eliminated

Similar to the majority of cDNA subtractive hybridization and normalization methods developed to date, DSN depletion is based on the second-order solution hybridization kinetics [17]. Consequently, the depletion process is accompanied by cDNA normalization. In the course of hybridization, cDNA molecules of higher abundance have time to pass into the ds form, and are subsequently hydrolyzed during DSN treatment.
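The link between second-order kinetics and normalization can be illustrated numerically. The sketch below is an idealized model, not a description of the experiments reported here: it assumes the textbook rate law dC/dt = −kC² for self-reannealing, a pseudo-first-order approximation for a target hybridizing with driver in large excess, and arbitrary units for the rate constant, time, and concentrations.

```python
import math


def ss_fraction_self(c0: float, k: float, t: float) -> float:
    """Fraction of a sequence still single-stranded after self-reannealing.

    Ideal second-order kinetics: dC/dt = -k*C**2, so C(t)/C0 = 1/(1 + k*C0*t).
    """
    return 1.0 / (1.0 + k * c0 * t)


def ss_fraction_with_driver(driver: float, k: float, t: float) -> float:
    """Fraction of a targeted sequence left single-stranded when the driver
    is in large excess (pseudo-first-order approximation)."""
    return math.exp(-k * driver * t)


K, T = 1.0, 1.0  # arbitrary units, chosen only for illustration
abundances = {"abundant": 100.0, "medium": 10.0, "rare": 1.0}

for name, c0 in abundances.items():
    frac = ss_fraction_self(c0, K, T)
    print(f"{name:9s} start={c0:6.1f}  ss fraction={frac:.3f}  ss left={c0 * frac:.3f}")

# A rare transcript targeted by a 100-fold driver excess is removed
# far more completely than normalization alone would achieve.
print(f"rare + driver  ss fraction={ss_fraction_with_driver(100.0, K, T):.2e}")
```

In this toy model, a 100:10:1 starting ratio is compressed to roughly 2:1.8:1 in the single-stranded fraction, which is the normalization effect described above, while the driver-targeted sequence is driven almost entirely into the duplex form.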

DSN Treatment

The key step in DSN depletion is degradation of the ds fraction comprising tracer-driver hybrid molecules by DSN. The enzyme is strongly specific for ds DNA in both DNA–DNA and DNA–RNA hybrids, leaving ss DNA and RNA intact. Moreover, DSN is active over a wide range of temperatures. Thus, degradation of the ds DNA-containing fraction by this enzyme may occur at high temperatures, thereby avoiding loss of transcripts due to the formation of secondary structures and non-specific hybridization involving cDNA adapter sequences [10, 11].

Tracer Preparation

Most cDNA prepared using common methods can be subjected to DSN depletion. The only requirement is the presence of known cDNA flanking sequences for subsequent amplification. For example, such cDNA can be obtained using the well-known SMART approach [18, 19]. This method allows the synthesis of cDNA enriched with full-length sequences, and may be applied to both poly(A)+ and total RNA, even when only a limited amount of starting material is available. During cDNA synthesis, adapter sequences are introduced at both the 5′ and 3′ ends. Depending on the adapter structures, cDNA becomes flanked by the same or different adapters, and can be cloned in a non-directional or directional manner.

Driver Preparation

As demonstrated in SNP detection experiments, a 10 bp perfect DNA–DNA duplex is the minimal substrate for DSN [9, 14]. Thus, for elimination of individual sequences from a cDNA population, driver DNA of at least 10 bp in length must be provided in sufficient excess. In model experiments, we examined amplified fragments of certain genes (about 100 bp long), as well as synthetic 47- and 60-mer oligonucleotides. Similar efficiencies of depletion were obtained in both cases.

Hybridization and PCR

We applied standard hybridization conditions of suppression subtractive hybridization [4, 20], as well as a long and accurate PCR system [16] with subsequent size fractionation to retain long cDNA molecules in the resulting library.

Model Experiments

To examine DSN-depletion efficiency under various conditions, we eliminated several known genes comprising PDE catalytic domains from human placenta cDNA. Mammalian PDEs form a superfamily of enzymes comprising 10 families, which differ in terms of amino acid sequence, substrate specificity, inhibitor sensitivity, modes of regulation, and tissue distribution [21]. The carboxyl terminal portion of each PDE contains a highly conserved region of ~250 amino acids, which constitutes the catalytic domain [22].

Specific PCR fragments were utilized as the driver. PCR fragments within a catalytic domain sequence of nine PDE genes were amplified with gene-specific primers and mixed (Supplemental Table 1). The driver (about 100-times excess for each gene to be depleted) was added to SMART-amplified tracer cDNA, denatured, and hybridized for 1 h at 68°C. After hybridization, each reaction was treated with DSN to cleave the ds cDNA fraction. Residual ss cDNA was amplified by PCR with primers complementary to the flanked cDNA sequences. Tracer cDNA subjected to similar procedures in the absence of a driver was used as the control, reflecting the contribution of cDNA normalization to the depletion process.

Analysis of the depleted cDNA sample alongside control (non-depleted and non-normalized) and partially normalized (1 h of hybridization) cDNA on an agarose gel revealed that the bands corresponding to abundant transcripts disappeared owing to partial cDNA normalization, while the average cDNA length remained unchanged (not shown). To confirm that the average insert size in the depleted cDNA was preserved relative to the non-depleted cDNA, corresponding cDNA libraries were constructed, and inserts from 100 randomly picked clones from each library were amplified by PCR with standard plasmid primers. Insert sizes were calculated and plotted (Fig. 2a). We found that the insert size distribution is not appreciably altered during depletion.
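A size distribution of this kind can be tabulated by binning the measured insert lengths. The following is a minimal sketch; the bin width of 500 bp and the example sizes are hypothetical and are not the data behind Fig. 2a.

```python
from collections import Counter


def bin_insert_sizes(sizes_bp, bin_width_bp=500):
    """Count inserts per size bin; keys are the lower bound of each bin in bp."""
    counts = Counter((size // bin_width_bp) * bin_width_bp for size in sizes_bp)
    return dict(sorted(counts.items()))


# Hypothetical insert sizes (bp) for a handful of clones, for illustration only.
example_sizes = [400, 600, 900, 1500, 1700, 2300]

for lower, n in bin_insert_sizes(example_sizes).items():
    print(f"{lower:5d}-{lower + 499:5d} bp: {n} clone(s)")
```

Running the same function on the control, partially normalized, and depleted libraries gives directly comparable histograms.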

Fig. 2 Analysis of placenta cDNA with depleted PDE genes. a Diagram of the insert size distribution in the control (white columns), partially normalized (gray columns), and depleted (black columns) placenta cDNA libraries. b Quantitative PCR of these samples using probes for depleted (gray columns) and undepleted (white columns) genes. Raw data are available in Supplementary Table 2. C, control cDNA library; N, partially normalized cDNA library; D, depleted cDNA library

The efficiency of DSN depletion was evaluated by quantitative PCR with TaqMan probes for specific genes. To exclude PCR products from truncated cDNA sequences, which are often generated during cDNA synthesis but do not contain the PDE domain and consequently are not depleted, we applied probes specific for the gene region upstream of the sequence used for depletion. Typical results of quantitative PCR are depicted in Fig. 2b. In each case, at least a 100-fold reduction of the gene targeted for depletion was achieved. To test the integrity of homologous genes not subjected to elimination, PDE4D expression was analyzed. The PDE4D transcript level was practically invariable in all control and experimental samples (Fig. 2b).
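For readers relating fold reduction to threshold cycles, a minimal sketch follows. It assumes equal cDNA input for both libraries (as described in Materials and Methods) and an ideal amplification efficiency of 2 per cycle; the Ct values in the usage lines are hypothetical and are not measurements from this study.

```python
import math


def fold_reduction(ct_control: float, ct_depleted: float, efficiency: float = 2.0) -> float:
    """Fold reduction implied by a shift in threshold cycle (Ct).

    Assumes equal template input and a constant per-cycle amplification
    factor (2.0 corresponds to perfect doubling).
    """
    return efficiency ** (ct_depleted - ct_control)


def delta_ct_for_fold(fold: float, efficiency: float = 2.0) -> float:
    """Ct shift required to observe a given fold reduction."""
    return math.log(fold, efficiency)


# Hypothetical Ct values, for illustration only.
print(fold_reduction(ct_control=20.0, ct_depleted=27.0))
print(round(delta_ct_for_fold(100.0), 2))
```

Under these assumptions, a 100-fold reduction corresponds to a Ct shift of roughly 6.6 cycles.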

In another model experiment, we focused on the suitability of depleted cDNA for expression cloning. Previously, we found that the Zoanthus species expresses two homologous fluorescent proteins of different colors, which display 87% sequence identity at the nucleotide level. In our experiment, Zoanthus cDNA was divided into two samples; the first was used to eliminate transcripts of the green fluorescent protein (zGFP, GenBank ID 1080018), and the second for depletion of the red fluorescent protein (zRFP, GenBank ID 1080019). In this experiment, synthetic oligonucleotides complementary to both cDNA strands of the fluorescent protein genes were used as the driver. After the depletion procedure (see “Materials and Methods” section), two cDNA libraries were constructed and used for visual screening of fluorescent clones. Table 1 provides a summary of data obtained from the analysis of 30,000 colonies from each library. In both cases, the number of colonies expressing the undepleted fluorescent protein was similar to that in the initial library, while colonies expressing the fluorescent protein targeted for depletion were substantially reduced.

Table 1 Screening results of control and depleted Zoanthus cDNA libraries

Discussion

Use of standard expression libraries for functional screening often results in several hits representing a single protein. For instance, screening of an undepleted normalized placenta cDNA library for different PDE inhibitors in an attempt to identify other partners besides PDE yielded single PDE proteins as the majority of hits (data not shown). This may be attributed to strong interactions of inhibitors with PDE proteins and the consequent significant growth advantage induced by these interactions. As a result, other putative targets of the inhibitors were masked. cDNA depletion should increase the depth of analysis and reduce repetitive hits in the library.

A unique combination of properties, including high specificity for ds DNA and stability at elevated temperatures, allows the use of DSN for isolation of ss cDNA fractions enriched with full-length sequences from complex nucleic acids. Recently, DSN-based methods have been developed allowing effective normalization of full-length-enriched cDNA [10–12] and quantitative telomeric overhang determination [14]. The novel DSN-depletion protocol includes a step of driver addition to cDNA samples and allows specific depletion of selected genes from cDNA pools. Compared to classical subtractive hybridization, the use of DSN is simpler and allows preparation of full-length-enriched cDNA libraries suitable for functional screening.

The DSN-based depletion method is applicable when only a limited amount of starting material is available and demonstrates high efficacy for full-length-enriched cDNA prepared from total RNA. The protocol works well with both amplified fragments and synthetic oligonucleotides as driver types. The individual cDNA molecules remain unchanged in size during depletion. Genes homologous to those targeted for depletion usually remain in the depleted cDNA libraries, but undergo partial normalization. Expression cloning experiments confirm the suitability of depleted cDNA for the generation of cDNA libraries and functional screening of novel genes displaying specific activities of interest.

Recently, the focus of research in the pharmaceutical industry has shifted toward drug specificity. Since optimization of compounds is achieved by testing a few targets, knowledge of potential interactions with other members of the cell machinery is essential. Especially in the case of targeted drug development, the question of therapeutic specificity can be crucial for clinical development. Some known “specific” inhibitors are capable of interacting with other cell proteins. Functional libraries are useful for studying the selectivity of small molecules. Screening of such libraries should lead to the identification of other partners, even outside the expected field. While the validation of identified hits would extend preclinical timelines, this procedure is less costly than failures in the clinic due to adverse effects.

The duplex-specific nuclease depletion method should aid the screening of expression cDNA libraries for new targets underlying biological functions, because it eliminates hits from already known genes. The method provides a unique opportunity to eliminate such genes from full-length-enriched cDNA samples before library cloning, without time-consuming procedures and even when only limited amounts of total RNA are available for manipulation.

In addition to expression cloning applications, DSN depletion is useful as an improved step in the cDNA normalization procedure. In our model experiments, we performed only shallow normalization. However, it is generally easy to combine DSN depletion with deep normalization. We believe that this method should facilitate gene discovery during EST sequencing in cases where: (1) a number of EST sequences is always obtained and should be specifically removed from analysis; (2) high redundancy of a limited number of transcripts complicates library analysis; and (3) some sequences do not equalize during the standard normalization procedure due to complicated structures that block their effective re-hybridization.