1 Introduction

Knowledge of RNA secondary structure can provide a basis for insight into structure–function relationships and design of therapeutics [1–5]. Secondary structure determination is also a first step toward determination of 3D structure. X-ray crystallography provides definitive structures for RNA in crystals, but procedures for generating crystals suitable for X-ray analysis are not always successful. Chemical mapping provides insights into which nucleotides are not base paired, but interpretation can be ambiguous, especially for pseudoknots and RNAs that adopt multiple folds [6, 7]. Nuclear magnetic resonance (NMR) can identify nucleotides that are base paired and thus limit possible secondary structures. It can also reveal multiple conformations [8, 9]. Like X-ray diffraction, NMR can also provide full 3-dimensional structures of RNA, although this is limited to structures of about 100 nucleotides. Unlike X-ray diffraction, however, NMR analysis is carried out on biomolecules in solution, not in single crystals. Thus, NMR experiments can be initiated as soon as the biomolecule is expressed and purified. While this is a great advantage, a disadvantage is that acquisition and analysis of NMR data for a 3D structure requires greater time and effort than crystallography.

NMR structure determination is based primarily on detection of short-range magnetic interactions between hydrogen atoms, known as the nuclear Overhauser enhancement, or NOE [10]. As many NOEs as possible are detected, typically 8–15 per nucleotide, and used as restraints in constructing a molecular model. Other structural NMR measurements include scalar coupling constants, which provide estimates for dihedral angles, and residual dipolar couplings (RDCs), which provide information about the relative orientation of molecular bonds. Interesting approaches that compare chemical shift assignments with predicted 3D models are being developed [11]. For coverage of NMR methods for complete RNA chemical shift assignment and 3D structure determination, the reader is referred to the literature [12–16].

This chapter summarizes NMR methods that permit rapid identification of RNA secondary structure and other structural features. This information can supplement chemical mapping and/or serve as a preliminary step toward 3D structure determination. The aim is to provide guidelines that enable a researcher with minimal knowledge of NMR to quickly extract secondary structure information and recognize some common internal loop structures in basic datasets. Some details of optimal acquisition and processing parameters for these datasets will be discussed, but not the details of spectrometer operation. It is presumed that the researcher has this ability or has access to a collaborator at a local or national facility who can acquire such data.

2 Experimental Considerations

A number of factors should be considered to maximize the information that can be deduced from NMR data. The first consideration is the instrument itself. State-of-the-art instruments for biomolecular studies use magnetic field strengths typically between 11.7 and 21.1 Tesla (500–900 MHz proton Larmor frequency). A higher magnetic field provides greater NMR signal intensity and spectral resolution. Another important factor is whether the instrument has a standard room-temperature probe or a cryo-probe. A cryo-probe will typically yield two- to three-fold greater signal than a room-temperature probe for the same sample. Because the signal-to-noise ratio increases with the square root of the acquisition time, the time required to produce a given signal-to-noise ratio is reduced by 4- to 9-fold. The combination of highest field and a cryo-probe will give the best data. Most instruments are configured to accept NMR sample tubes with a 5 mm diameter that hold a liquid sample volume of 0.25–0.5 mL; the lower end of this range is only possible if “susceptibility matched” tubes are used.
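The two numerical relationships in this paragraph can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the proton gyromagnetic ratio (about 42.577 MHz per Tesla) is a standard physical constant rather than a value taken from this chapter, and the time saving assumes that signal-to-noise grows with the square root of the number of scans.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per Tesla (standard constant,
# not a value from this chapter).
GAMMA_1H_MHZ_PER_T = 42.577


def proton_larmor_mhz(field_tesla):
    """Proton Larmor frequency (MHz) at a given magnetic field strength (Tesla)."""
    return GAMMA_1H_MHZ_PER_T * field_tesla


def time_saving_factor(signal_gain):
    """Factor by which acquisition time drops at constant signal-to-noise ratio.

    Assumes signal-to-noise scales with the square root of the number of scans,
    so an n-fold gain in signal per scan saves a factor of n**2 in time.
    """
    return signal_gain ** 2


for field in (11.7, 21.1):
    print(f"{field} T -> about {proton_larmor_mhz(field):.0f} MHz")

for gain in (2, 3):
    print(f"{gain}-fold more signal -> {time_saving_factor(gain)}-fold less time")
```

The computed frequencies fall within a few MHz of the nominal 500 and 900 MHz instrument labels, which are rounded values.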

Sample amount is a critical factor. The RNA concentration required to achieve sufficient signal-to-noise ratio depends not only on the instrument, but also on the experiments to be performed. For instance, to monitor RNA interactions/changes during a titration, only 1D spectra of RNA imino protons (Fig. 1) are required and concentrations as low as 10 μM may be sufficient (~2.5 nmol) [17]. For 2D/3D NOESY experiments, required for secondary structure identification, a concentration of 0.5–1.0 mM is desirable, but 0.1–0.2 mM may be sufficient to answer many secondary structure questions if an 800–900 MHz spectrometer with cryo-probe is available.
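The amount of RNA implied by a given concentration follows directly from the sample volume. The helper below is a minimal sketch with a hypothetical function name; it relies only on the unit identity that 1 μM equals 1 nmol per mL.

```python
def rna_amount_nmol(concentration_um, volume_ml):
    """Amount of RNA in nmol for a concentration in micromolar and a volume in mL.

    Uses the identity 1 uM = 1 nmol/mL, so the product is already in nmol.
    """
    return concentration_um * volume_ml


# 1D imino titration: 10 uM in a 0.25 mL susceptibility-matched tube.
print(rna_amount_nmol(10, 0.25))

# 2D NOESY at the upper end of the desirable range: 1.0 mM (1000 uM) in 0.5 mL.
print(rna_amount_nmol(1000, 0.5))
```

The first case reproduces the ~2.5 nmol figure quoted above; the second shows that a 1.0 mM NOESY sample in a full-volume tube needs roughly 200 times more material.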

Fig. 1

The most common Watson–Crick base pairings found in helical stems of RNA shown with standard numbering of hydrogen atoms most relevant in identifying secondary structure by NMR. Imino protons, GH1 and UH3, have pink labels. The base pairs are shown above the imino proton region of the 1H NMR spectrum of 5S ribosomal RNA from Escherichia coli (119 nucleotides). Each base pair is positioned above the portion of the 1H NMR spectrum where the imino protons for that pair type most commonly resonate. Aromatic proton region of the 1H NMR spectrum of the same RNA sample is also included, demonstrating spectral crowding in this region

Ionic strength of the buffer must also be considered because small, mobile ions reduce the sensitivity of signal detection. Cryo-probes are particularly sensitive to ionic strength, so buffers with very low or no added salts are often employed. Phosphate buffer is most commonly used as it has no protons to interfere with the 1H NMR signal. Sample pH should be kept as low as possible without influencing the native conformation of the RNA, because exchange of imino hydrogens with solvent hydrogens results in line broadening and reduction of NOE cross-peak intensity. Hydrogen exchange is catalyzed by hydroxide ions, so a pH below 7.0 is desired; a pH below 6.5 will provide better detection of signals from loops that are not as stable as Watson–Crick stems. The experiments described here pertain to samples dissolved in H2O as solvent; in D2O solvent the imino protons exchange with deuterons and disappear from the proton spectrum.

3 Revealing Canonically Base Paired Stems

NMR can provide a rapid and early assessment of base pairing. The majority of this information comes from the region of a proton NMR spectrum where only imino protons of G and U residues are observed (Fig. 1). The imino protons of G and U resonate well downfield (at higher chemical shifts in parts per million, ppm) of all other protons in biological macromolecules, with the exception of tryptophan and histidine sidechain protons. These resonances exhibit fairly characteristic chemical shift and NOE patterns depending on whether they are in GC, AU, or GU pairs, or unpaired. Generally, G imino protons in GC pairs are found at 12.0–13.5 ppm and U iminos in AU pairs are found at 13.0–14.5 ppm (Fig. 1). Thus, when only one conformation is present, the 12.0–14.5 ppm region will have at most one resonance for each GC or AU pair. In contrast, the aromatic region of H8/H6/H2 protons (6.5–8.5 ppm) is more crowded because GC and AU pairs have two or three resonances, respectively. More importantly, aromatic protons lack structurally characteristic chemical shift or NOE patterns. In addition, this region of the spectrum overlaps with the amide and aromatic protons of proteins. There are two imino protons in GU wobble pairs, with the U imino primarily between 11 and 12.5 ppm and the G imino between 10 and 11.5 ppm (Fig. 1). G iminos that are not hydrogen bonded typically have chemical shifts lower than (upfield of) 11.5 ppm, are often broad, and exchange readily with water protons, rendering them invisible in NOESY spectra. Consequently, the 10–14.5 ppm region of the spectrum is relatively uncrowded even in fairly large RNAs or in the presence of protein. These aspects of the 1D imino 1H spectrum make it useful for conveniently monitoring changes in structural properties or intermolecular interactions when buffer conditions are changed (addition of Mg2+, for example) or proteins are added.
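The chemical shift ranges quoted above overlap, so a shift alone usually narrows an imino resonance to a set of candidates rather than a single assignment. The following sketch encodes only the ranges stated in this section; the table and function names are hypothetical, and the ranges are typical values, not hard limits.

```python
# Typical imino 1H chemical shift ranges (ppm) as quoted in the text.
# Unpaired G iminos (upfield of about 11.5 ppm) are omitted because no
# lower bound is given and they are often too broad to observe.
IMINO_RANGES_PPM = [
    ("G imino in GC pair", 12.0, 13.5),
    ("U imino in AU pair", 13.0, 14.5),
    ("U imino in GU wobble pair", 11.0, 12.5),
    ("G imino in GU wobble pair", 10.0, 11.5),
]


def candidate_imino_types(shift_ppm):
    """Return every pair type whose typical range contains the given shift."""
    return [label for label, low, high in IMINO_RANGES_PPM
            if low <= shift_ppm <= high]


for shift in (14.0, 13.2, 12.2, 10.5):
    print(shift, candidate_imino_types(shift))
```

A shift of 13.2 ppm, for example, is compatible with both a GC and an AU pair, which is why the NOE patterns and 15N shifts described later are needed to settle the pair type.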

The fundamental measurement for structure determination by NMR is the nuclear Overhauser effect (NOE), which is usually detected in 2D spectra (2D NOESY). In a 2D NOESY spectrum, a “cross-peak” is observed at the intersection of the frequencies of two protons that are within about 5 Å of each other. The cross-peak intensity varies as 1/r^6, where r is the distance separating the two protons. NOESY cross-peaks between the imino protons of adjacent base pairs in an A-form helix are readily observed, so one cross-peak between two imino protons represents adjacent base pairs (Fig. 2), with the common exception of the strong cross-peak between the two imino protons within a GU wobble pair. An imino resonance exhibiting cross-peaks to two different imino peaks identifies a “walk” representing three sequential base pairs. An additional cross-peak to the imino of one of the flanking base pairs indicates an even longer helical region. Thus, imino walks identify the helices important for secondary structure. Since the distance between imino protons of adjacent base pairs is typically 3.5–5.5 Å, resulting in medium to weak NOE cross-peak intensities, a relatively long NOESY mixing time (100–300 ms) is generally recommended for these spectra. Recommended mixing time and other parameters are discussed in more detail later and in Table 1.
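The imino walk can be viewed as chaining pairwise cross-peaks into linear paths. The sketch below is a simplified illustration, not a published assignment algorithm: the peak labels are hypothetical, intra-pair GU cross-peaks are assumed to have been removed beforehand, and any peak with more than two partners (for example an overlapped resonance) is rejected rather than resolved.

```python
from collections import defaultdict


def imino_walks(cross_peaks):
    """Chain imino-imino NOESY cross-peaks into linear walks.

    cross_peaks: iterable of (peak_a, peak_b) labels, one per observed
    imino-imino cross-peak between adjacent base pairs.
    Returns a list of walks, each a list of peak labels ordered end to end.
    Closed loops (no end peak) are skipped.
    """
    partners = defaultdict(set)
    for a, b in cross_peaks:
        partners[a].add(b)
        partners[b].add(a)

    if any(len(p) > 2 for p in partners.values()):
        raise ValueError("a peak has more than two partners; check for overlap")

    walks, visited = [], set()
    # Begin each walk at an end peak, i.e. one with exactly one partner.
    for start in sorted(p for p, n in partners.items() if len(n) == 1):
        if start in visited:
            continue
        walk, previous, current = [start], None, start
        visited.add(start)
        while True:
            onward = [p for p in partners[current] if p != previous]
            if not onward:
                break
            previous, current = current, onward[0]
            walk.append(current)
            visited.add(current)
        walks.append(walk)
    return walks


# Hypothetical peak labels: three cross-peaks linking four imino resonances.
print(imino_walks([("b", "c"), ("a", "b"), ("c", "d")]))
```

A walk containing n peaks corresponds to n sequential base pairs connected by n − 1 cross-peaks, so the four-peak walk in the example represents a three-step walk along four base pairs.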

Fig. 2

NMR spectra of the self-complementary RNA duplex (CGUGAUUACG)2 in 80 mM NaCl, 20 mM phosphate buffer at pH 6.5, and 95 % H2O/5 % D2O solvent. The horizontal axis of all spectra spans the imino proton region. The top panel is a 1D spectrum. The second and third panels are from a 2D NOESY spectrum acquired at 0 °C with a mixing time of 100 ms and a WATERGATE readout pulse [35]. The bottom panel is a 1H-15N HSQC spectrum (natural abundance 15N). The 15N chemical shifts indicate which iminos are G and which are U. An “imino-walk” of NOESY cross-peaks is indicated with blue lines in the third panel. The three-step walk indicates four sequential base pairs, represented in the diagram to the right. (Note that because the duplex is symmetric, nucleotide 1 is the same as 1*, etc.) In the diagram, shaded boxes represent base pairing and black dots represent imino protons. In the spectrum and diagram, respectively, the strong NOE between imino protons of the GU wobble pair is indicated with a green circle and line. The second panel (vertical axis region includes aromatic, amino, and H1′ protons) highlights strong cross-peaks that are characteristic of the different pair types. These include UH3-AH2 cross-peaks in Watson–Crick UA pairs and GH1-CH41 cross-peaks in Watson–Crick GC pairs. Also shown is the upfield shift and degeneracy of G amino protons that are not involved in hydrogen bonds, as for G4H2 in the G4-U7* wobble pair. Cross-peaks to G10H1 are weak as the terminal GC pair is exposed to solvent, resulting in rapid exchange with solvent protons

Table 1 NMR experiments most useful for identifying secondary structure in RNA

Identifying the type of base pair (GC, AU, GU) corresponding to each imino resonance further characterizes the secondary structure. AU, GC, and GU pairs can often be identified in unlabeled samples by a distinctive NOE pattern to their pairing partner. The imino protons in GC and AU pairs have strong NOEs to amino or aromatic proton peaks between 6.5 and 8.5 ppm (Fig. 2). These are best identified in short mixing time (25–75 ms) NOESY experiments.

GC pairs: The typical 1H-1H NOESY pattern in a short mixing time NOESY for a G imino (G-H1) in a GC pair includes two strong peaks to the amino protons of the paired C residue (C-H41 and C-H42). The peak to the downfield amino (C-H41) may be stronger than the peak to the upfield amino (C-H42) as the former is hydrogen bonded to G-O6 and, therefore, closer to G-H1. The peak from G-H1 to C-H42 is primarily due to spin-diffusion through C-H41 or flips of the amino group. Two strong cross-peaks between the G imino proton and the intrabase amino protons are also commonly observed, although these are usually broader than C amino signals. Spectra at elevated temperatures (20–30 °C) may distinguish C aminos from G aminos better than at low temperature (0–5 °C).

AU pairs: The typical NOE pattern in a short mixing time NOESY for a U imino (U-H3) in an AU pair includes one strong peak to the H2 proton of the paired A residue (A-H2). Peaks to the A amino protons may also be observed, but these signals are typically exchange broadened, so the cross-peaks are much less pronounced than the H2 cross-peak. Again, elevated temperatures sharpen the distinction between A-H2 and A amino protons. The C amino signals in GC pairs are also broader than A-H2 signals, but are typically narrower than A amino signals.

GU pairs: G and U imino protons in a GU wobble pair are identified by a very strong NOE between the two imino protons (Fig. 2), which are separated by only ~2.5 Å. In contrast to GC and AU pairs, neither of the iminos in a GU pair exhibits intense cross-peaks in the amino/aromatic region, although the G imino may show a broad cross-peak to its own amino protons below 6.5 ppm. The two amino protons show the same chemical shift because no hydrogen bonds restrict rotation of the NH2 group about the C-N bond, so the two protons experience the same averaged chemical shift environment. The G imino in a GU wobble pair is usually upfield of the U imino, although the G imino chemical shift is particularly dependent on the orientation of the flanking base pairs, and for some orientations the G and U iminos can be nearly overlapped [18]. The dependence of non-exchangeable proton chemical shifts on the orientation of the flanking base pairs has been closely examined [19].

As an alternative to identification of base pair type by NOESY pattern, imino protons can also be distinguished by identifying the chemical shift of the directly bonded imino nitrogen-15 (15N). The imino nitrogen (N3) of U residues resonates between 155 and 165 ppm, while the imino nitrogen (N1) of G residues resonates between 140 and 150 ppm. These 15N shifts are minimally influenced by hydrogen bonds or neighboring residues (Fig. 2). Thus, U iminos and G iminos are unambiguously identified. 1H-15N correlation experiments, HSQC or HMQC (Heteronuclear Single/Multiple Quantum Correlation), are used for this purpose. In these 2D experiments, magnetization is transferred “through-bond” between the 1H and 15N nuclei. The natural abundance of 15N nuclei is only about 0.37 %, so unless the sample is isotopically enriched, signal sensitivity is very low. It is possible to do the experiment at natural abundance if the sample concentration is greater than 1 mM and the molecule’s size is less than ~25 nucleotides. Through-bond magnetization transfer is inefficient for large molecules and for signals that are broad due to conformational or chemical exchange, such as imino protons that exchange with solvent protons when base pair hydrogen bonding is weak or absent. Generally, it is preferable to isotopically enrich the sample for heteronuclear experiments. Isotopic enrichment with 15N and/or 13C opens the possibility of other experiments that can further characterize base pairs. A 2D HNN-COSY experiment can correlate an imino proton not only with the covalently attached imino nitrogen detected in the HSQC, but also with the nitrogen of the hydrogen-bonded base (e.g., C or A for GC or UA, respectively) [20–22]. In this experiment, magnetization is transferred between nitrogens “through-bond” via weak scalar coupling in the N–H⋯N hydrogen bond. In other words, magnetization is transferred between nitrogen atoms that share electron density with one hydrogen atom. Because the transverse relaxation properties of 15N are favorable compared to 13C, and because the transfers in this experiment involve only 15N, this experiment can give surprisingly reasonable signals in large RNAs [23].
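Because the G and U imino 15N ranges do not overlap, assigning the base type from an HSQC peak reduces to a simple range test. The function below is a minimal sketch with a hypothetical name, using only the two ranges quoted above; shifts outside both ranges are deliberately left unassigned.

```python
def imino_base_from_15n(shift_15n_ppm):
    """Classify an imino HSQC peak as G or U from its 15N chemical shift (ppm).

    Uses the ranges quoted in the text: U N3 at 155-165 ppm and G N1 at
    140-150 ppm. Returns None for a shift outside both ranges.
    """
    if 155.0 <= shift_15n_ppm <= 165.0:
        return "U"
    if 140.0 <= shift_15n_ppm <= 150.0:
        return "G"
    return None


# Hypothetical peak list of (1H shift, 15N shift) pairs in ppm.
peaks = [(13.8, 162.0), (12.6, 147.5), (11.9, 158.3), (10.7, 143.1)]
for h_shift, n_shift in peaks:
    print(h_shift, n_shift, imino_base_from_15n(n_shift))
```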

A 3D or 2D 13C-edited HMQC-NOESY experiment can also aid in distinguishing base pair type in larger, labeled RNA. The HMQC-NOESY is a combination of through-bond correlation of 1H and 13C (HMQC) followed by 1H-1H NOESY. This experiment can identify whether an imino NOESY cross-peak in the aromatic/amino region (6.5–8.5 ppm) involves an adenine H2 proton or another aromatic (H8/H6) or amino proton. This is possible because the 13C chemical shift of adenine C2 is distinct from C8 and C6 in any nucleobase [16]. Amino groups do not pass through the HMQC edit. So, for example, the 2D/3D 13C-edited HMQC-NOESY can distinguish the UH3-AH2 cross-peak of a WC/WC UA pair from the UH3-AH8 cross-peak of a WC/Hoogsteen UA pair such as that found in a UAU triple [24]. The 2D HNN-COSY distinguishes the same two base pairs via UH3 cross-peaks to the characteristic AN1 (WC/WC) or AN7 (WC/Hoogsteen) 15N chemical shifts [25]. Another pair that can be similarly identified is the WC/WC GA pair, characterized by a strong GH1 to AH2 NOESY cross-peak (13C-edited HMQC-NOESY with 15N-1H HSQC) or a GH1 to AN1 cross-peak (HNN-COSY) [26].

For small to medium-sized constructs (defined here as ~12–50 nucleotides), a simple one-dimensional spectrum and a two-dimensional NOE spectrum (typically 12–36 h of data collection) can often provide a full assessment of secondary structure without the need for isotopic labels. At the very least, these simple initial spectra provide insight into the suitability of the construct and buffer conditions for a more complete study.

Despite the low density of peaks in the imino region, spectral overlaps will occur, especially in RNA larger than ~50–60 residues. Correlation with 15N nuclei in an HSQC spectrum, or in an HNN-COSY spectrum if the RNA is isotopically labeled, can identify many overlaps. In unlabeled samples, overlapped imino peaks can sometimes be identified in the aromatic/amino region of a NOESY spectrum if more than the expected number of cross-peaks to one imino chemical shift are observed. For instance, three or four strong cross-peaks in the aromatic/amino region to an imino proton may indicate an overlap. Chemical shifts are temperature dependent, so spectra at more than one temperature can often resolve overlaps. In general, 2D NOESY spectra are acquired at room temperature or slightly higher, and at 0–10 °C. A short and a long mixing time NOESY are acquired at each temperature (Table 1).

Missing imino–imino cross-peaks in a WC stem walk are not uncommon. Some imino protons, even in WC stems, exchange readily with water protons due to unstable hydrogen bonding. This occurs near helix ends, in short helices, and particularly often in AU pairs. NOESY cross-peaks are reduced by this exchange and the imino–imino NOE pathway along the helix may be broken. Hydrogen exchange can be slowed by low temperature and low pH. In some cases even subzero temperatures can recover rapidly exchanging imino protons. Buffer pH should generally not be above 6.5 unless necessary. In the case of an unstable UH3 in an AU pair, however, it is still usually possible to identify the strong UH3-AH2 cross-peak, and the NOE pathway along the stem can often be found through an NOE from the AH2 of the unstable AU pair to a stable imino of an adjacent base pair (GC or AU). Because this is not a strong NOE, 2D 13C-HMQC-NOESY of an A-labeled sample will differentiate it from the strong amino cross-peaks, especially for larger RNAs. Some imino–imino cross-peaks are weaker than others simply because the distance is longer. Imino-to-imino distances in WC stems range approximately from 3.5 to 5.5 Å [18]. NOEs for the longer distances are aided by “spin-diffusion” through a third proton (e.g., an NH2 or adenine H2 proton) that lies between the two imino protons. Spin-diffusion is often a problem in NMR as it causes NOE volumes that are not proportional to 1/r^6, but sometimes, as in the case of enhancing the longer distances of the imino walk, it has a desirable influence.
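The steep distance dependence explains why the longer imino–imino contacts are so much weaker. The sketch below evaluates the idealized 1/r^6 relationship for the distances quoted in this chapter; the function name is hypothetical, and the calculation assumes an isolated pair of protons, so it ignores the spin-diffusion and solvent-exchange effects described above.

```python
def relative_noe_intensity(distance, reference_distance):
    """Cross-peak intensity at `distance` relative to `reference_distance`.

    Assumes the idealized two-spin relationship, intensity proportional to
    1/r**6, with both distances in the same unit (here, angstroms).
    """
    return (reference_distance / distance) ** 6


# Longest versus shortest imino-imino distance in a WC stem (5.5 vs 3.5 A).
print(f"{relative_noe_intensity(5.5, 3.5):.3f}")

# Intra-pair GU imino contact (~2.5 A) versus a 3.5 A sequential contact.
print(f"{relative_noe_intensity(2.5, 3.5):.1f}")
```

Under this idealization, a 5.5 Å contact gives only a few percent of the intensity of a 3.5 Å contact, whereas the ~2.5 Å GU contact is several-fold stronger, consistent with the very strong GU cross-peak.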

4 Data Acquisition

Acquisition of 2D NOESY spectra of RNA is much the same as for proteins, but a few points are worth considering. RNA secondary structure characterization is primarily accomplished through observation of imino 1H signals at 10–14 ppm (Fig. 1). Water suppression pulses and the spectral carrier frequency are usually centered on the water resonance near 5 ppm. Thus, a spectral width of 20 ppm (±10 ppm from center) is required to cover the range −5 to +15 ppm. However, since no RNA protons are found further upfield than approximately 3.5 ppm, there is “empty space” from 3.5 ppm to the upfield edge of the spectrum at −5 ppm. This empty space can be used to “wrap” NOESY spectra in the indirect dimension. If the indirect dimension spectral-width is reduced from 20 to 12 ppm (covering the range −1 to 11 ppm), then imino peaks that were previously at 11–15 ppm are “aliased” to the upfield portion of the indirect dimension (now at −1 to 3 ppm) without overlapping other peaks. Reduction of the spectral width means fewer t1 time-increments are required to obtain the same resolution as in a full-width spectrum, resulting in reduced total time for data acquisition. Alternatively, the same number of t1 increments yields higher resolution than in a full-width spectrum. t1-wrapping is not useful in 1H-1H NOESY spectra of proteins because the 1H shifts are distributed approximately equally on either side of the water signal.
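The t1-wrapping arithmetic can be checked numerically. The sketch below assumes simple wrap-around aliasing, in which a peak outside the window reappears shifted by a whole number of spectral widths; on a real spectrometer the behavior depends on the quadrature detection mode used in the indirect dimension. Function names are hypothetical, and the increment count assumes resolution is set by the maximum t1 evolution time, so the number of increments scales with the spectral width.

```python
import math


def aliased_shift(shift_ppm, window_low_ppm, spectral_width_ppm):
    """Apparent position (ppm) of a peak after wrap-around aliasing.

    The observed window spans window_low_ppm to window_low_ppm + spectral_width_ppm.
    Peaks already inside the window are returned unchanged.
    """
    return window_low_ppm + (shift_ppm - window_low_ppm) % spectral_width_ppm


def increments_for_same_resolution(full_increments, full_width_ppm, reduced_width_ppm):
    """t1 increments needed at the reduced width to keep the same maximum t1."""
    return math.ceil(full_increments * reduced_width_ppm / full_width_ppm)


# Window of -1 to 11 ppm (12 ppm wide), as in the text.
for shift in (15.0, 13.2, 10.5, 5.0):
    print(shift, "->", round(aliased_shift(shift, -1.0, 12.0), 3))

# Hypothetical full-width experiment with 512 increments over 20 ppm.
print(increments_for_same_resolution(512, 20.0, 12.0))
```

Peaks between 10 and 11 ppm lie inside the reduced window and are not moved, while those between 11 and 15 ppm reappear between −1 and 3 ppm, in the region that contains no RNA signals.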

The configuration of the water-suppression pulse used to read out the 1H signal is also worth considering. Since imino 1H signals are far from the water signal, very narrow-band water-suppression pulses that would allow direct detection of protons close to the water signal (e.g., H1′ protons at 5–6 ppm) are not required. Narrow-band excitation pulses typically require a few milliseconds, during which time signals decay via transverse relaxation processes. RNA imino proton signals often decay rapidly due to solvent exchange and would suffer losses during millisecond pulses. The large chemical shift difference between water and imino signals, along with no need to directly detect protons that are spectrally near water, means that broad-band shorter duration (<0.5 ms) water-suppression pulses can be used. NOESY cross-peaks from H1′ to imino protons can, nonetheless, still be observed along the indirectly detected dimension of the 2D spectrum. The pulses surrounding the indirect evolution time do not need to be water-suppression pulses.

5 Secondary Structure Prediction

The stretches of base pairs identified by NMR are complementary to information provided by chemical mapping. Further, the NMR findings can be entered into secondary structure prediction programs that have been modified to use the data to limit folding space or to distinguish correct structures from a list of predicted structures. NAPSS (NMR-Assisted Prediction of Secondary Structure), discussed in the next chapter, and RNA-PAIRS (Probabilistic Assignment of Imino Resonance Shifts) are two examples currently being developed [7, 18, 27]. The combination of stretches of base pairs with algorithms for prediction of secondary structure allows assignment of resonances to individual nucleotides, a first step in determination of 3D structure.

6 3D Structure Determination

Global Structure. Identification of secondary structure elements as discussed here is important, but it is worth considering solution methods for rapidly characterizing the three-dimensional arrangements of these elements. Assignment of imino protons in elements of secondary structure as described above opens the possibility of using 1H-15N HSQC spectra of 15N-labeled RNA to measure 1H-15N residual dipolar couplings (RDCs) if the RNA is suspended in an appropriate alignment medium [28]. 1H-13C RDCs can also be measured for easily assigned 1H-13C HMQC peaks, such as for the adenine H2/C2 in an AU pair. However, the RDC data alone cannot distinguish between several possible orientational arrangements of the helices. While the degeneracy can be resolved if multiple alignment media are used, Wang et al. have described a protocol using SAXS data to break the degeneracy [29, 30]. They demonstrate the combined NMR/SAXS method in RNA of 100 nucleotides.

Complete 3D Structure. Solution of a full 3D RNA structure by NMR involves measurement of hundreds to thousands of NOE cross-peaks, scalar couplings, and RDCs. This requires assignment of not only imino protons but also all amino, aromatic, and sugar protons. Most of these experiments require that the solvent be changed from 95 % H2O/5 % D2O to 100 % D2O. Methods for making these unambiguous assignments and measurements are not discussed here, but the reader is referred to the books and reviews mentioned earlier [11–16]. In addition, novel isotopic labeling chemistry, including selective deuteration, is improving the assignment process and allowing studies of ever larger RNA molecules [31–34].