1 Introduction

Since the discovery of the double-helix structure of DNA, our understanding of the structural diversity of nucleic acids (RNA and DNA) and its link to biological function has grown remarkably. In addition to the well-known double helix, nucleic acids can be found in single-stranded form, as triplexes, and in a variety of branched structures that adopt multiple conformations in response to different biological needs. Many crucial biological processes such as replication, recombination and DNA repair require the opening of the double helix and the formation of transient structures, either to gain access to the encoded genetic information or to repair damaged DNA. For instance, DNA Holliday junctions, replication forks and DNA flaps arise as intermediate states during DNA processing by highly specialised protein machineries. More recently, the formation of stable guanine quadruplexes has been reported in guanine-rich RNA and DNA sequences. Although these structures were initially regarded as exotic forms of single-stranded DNA, the confirmation of their existence in vivo (Biffi et al. 2013) and their potential role in crucial processes such as gene regulation and telomere maintenance have triggered the need to develop novel technologies that can sense them with high specificity and affinity. Here, it is important not only to report the presence of the guanine quadruplex structure but also to discriminate between the different conformers that a given G-quadruplex (GQ) sequence can adopt. GQs and GQ–ligands have also shown considerable therapeutic promise in many areas related to human health. For instance, the role of GQ sequences as molecular targets to induce death in cancer cells has recently been reported (McLuckie et al. 2013).

In recent decades, the RNA world has also experienced its own revolution with the discovery that RNA function goes beyond acting as a passive carrier of genetic information. The discovery of non-coding RNA sequences with catalytic activity, the so-called ribozymes, in the early 1980s (Guerrier-Takada et al. 1983; Kruger et al. 1982) demolished a conceptual barrier, and the relevance of RNA as an active element participating in many crucial processes, including gene silencing, gene regulation, metabolite sensing and immune response, continues to expand. For instance, the discovery in 2002 of a small RNA sequence (riboswitch) that senses coenzyme B12 to regulate downstream expression (Nahvi et al. 2002) constituted the first example of a gene regulation mechanism performed purely by RNA and marked the start of a new field in which the formation of these regulatory RNA–ligand complexes could be explored as antibiotic targets.

The application of fluorescence-based techniques to diagnose the structure and presence of this huge diversity of nucleic acid structures has become very attractive because of their high sensitivity and the simplicity of the assays, which normally only require a relatively inexpensive fluorimeter that is available in most laboratories. Moreover, they enable real-time continuous probing, and because a fluorophore provides a range of observables (emission intensity, fluorescence lifetime and polarisation), assays in different formats can be tailored and optimised for each particular system. Lastly, the possibility of detecting each of these observables from a single fluorescent molecule, as opposed to the average fluorescence signal from an ensemble of many molecules, not only constitutes the ultimate limit of diagnostics and sensing but has also allowed an understanding of how RNA and DNA structures are dynamically processed (McCluskey et al. 2014). In particular, the field of nucleic acid research has greatly benefited in the last decade from the ability to detect and analyse the structure and dynamics of nucleic acids in real time. Since its first implementation, the single-molecule DNA/RNA field has rapidly expanded, and there is a continuously growing library of techniques based on single-molecule fluorescence (SMF) (Joo et al. 2008). The majority of single-molecule techniques are based on the detection of the light emitted by fluorophores acting as highly sensitive reporters of the surrounding medium. In this chapter, we will describe the application of SMF to diagnose the structure and function of nucleic acids, with special focus on surface-immobilised and freely diffusing techniques. The use of fluorescence resonance energy transfer (FRET) as a diagnostic tool based on changes in molecular distances across the nucleic acid structure has become widespread in in vitro and in vivo assays, and as such, it will be discussed in detail.
We will also report the most recent advances in the field, ranging from DNA sequencing and nucleic acid-based sensing to the analysis of functional RNAs involved in gene regulation and catalysis. The last section of the chapter is devoted to in vivo RNA imaging using RNA aptamers and to hybrid technologies, where the combination of FRET with force-based mechanical manipulation is emerging as the state-of-the-art technique to fully understand the function of nucleic acids.

2 Single-Molecule Techniques for Nucleic Acid Diagnostics

The detection of single fluorescent molecules was demonstrated for the first time using pentacene molecules hosted in a crystal matrix at liquid helium temperatures (Moerner and Kador 1989). Advances in single-molecule techniques, mostly due to significant improvements in fluorescence detection technology, allowed the first measurements of spatially isolated molecules at room temperature a few years later (Betzig and Chichester 1993). Shortly after, the use of total internal reflection (TIR) microscopy brought the first measurements in aqueous solution and provided a means for the detection of biological samples at the single-molecule level (Fig. 1a) (Funatsu et al. 1995). Subsequently, the combination of TIR with FRET made it possible to study conformational changes in nucleic acids using DNA strands labelled at specific positions with a single donor and a single acceptor (Ha et al. 1996). The continuous improvements that single-molecule techniques have experienced in the last decade have consolidated them as some of the most powerful methods for the study of biophysical processes. The main benefit that single-molecule techniques provide, when compared to ensemble techniques, is the possibility of observing isolated and non-synchronised molecules in real time, which represents the ultimate sensitivity limit in molecular diagnosis (Joo et al. 2008).

Fig. 1

Schematic diagram for single-molecule TIR and confocal microscopy. (a) Diagram for single-molecule confocal and TIR microscopy showing the prism-based (pTIR) and objective-based (oTIR) variants. (Inset) The intensity of the evanescent wave decays exponentially with the distance from the interface, illuminating a small volume of sample with a penetration depth of ~150 nm. (b) Excitation and emission pathways for TIR and confocal FRET assays. In pTIR, the prism directs the beam to the slide, whilst in oTIR and confocal, the excitation occurs through the objective. (Left) A dichroic mirror splits the fluorescence signal into two components, which are collected on a CCD camera. (Bottom) Schematics for the confocal detection, where the emissions of donor and acceptor are collected on different avalanche photodiodes

SMF can be performed either on freely diffusing molecules or on specimens immobilised on a glass or quartz microscope slide. Freely diffusing molecules can be observed in solution with the use of confocal microscopy, whilst surface-immobilised techniques use either confocal or wide-field excitation. Both confocal and wide-field techniques limit the excitation volume within the sample to reduce the background and provide a better signal-to-noise ratio (Fig. 1b). The reduction of the illuminated region to a small volume (~fL) is critical to detect the low number of photons emitted from a single dye above the background, which arises mostly from solvent Raman bands, scattered light and impurities in the medium. In this chapter, we will focus on how to apply two different but complementary single-molecule techniques for nucleic acid diagnosis: (1) single-molecule studies on freely diffusing molecules, which allow measurements of fast events on the sub-millisecond timescale, and (2) single-molecule studies of surface-immobilised molecules, which enable dynamics to be monitored for long periods of time (up to seconds) with millisecond resolution.

In addition to advances in instrumentation, the widespread use of SMF techniques has been made possible by parallel improvements in nucleic acid synthesis and site-specific labelling of DNA and RNA with fluorescent markers. Solid-phase synthesis and post-synthetic labelling are the two main methods used for this purpose. During solid-phase synthesis, the dye is incorporated in the nucleic acid sequence using fluorescently labelled phosphoramidites. Post-synthetic labelling normally uses an N-hydroxysuccinimidyl (NHS) ester derivative of the dye that reacts with a base modified with a primary amino group inserted at specific positions of the nucleic acid sequence (McCluskey et al. 2014). More recently, other reactive chemical groups have been developed to increase the labelling flexibility. These include the coupling of maleimide derivatives of the dye to thiol-modified nucleobases and the use of click chemistry reagents. The purification of the nucleic acid to separate labelled from non-labelled sequences and also from unreacted dye can be carried out using native or denaturing polyacrylamide gel electrophoresis (PAGE) or HPLC methods. The development of these labelling and purification methods has facilitated the wider use of SMF techniques and their application in FRET-based and multicolour assays. Another important factor contributing to the success of SMF in biological systems of increasing complexity has been the development of a variety of dye libraries covering the range from the near ultraviolet to the near infrared with improved photophysical properties (e.g. improved photostability, higher quantum yield). To further increase the observation time window, different oxygen scavenger systems (e.g. glucose oxidase/catalase) and triplet-state quenchers, such as β-mercaptoethanol and trolox, are now commonly used in single-molecule measurements (Aitken et al. 2008; Benesch and Benesch 1953; Zheng et al. 2014).
Oxygen scavenger systems slow down the photobleaching rate of the fluorophore, thus enabling the observation of dynamic events for up to several seconds. Triplet-state quenchers minimise blinking events, which cause a temporary loss of fluorescence emission and would otherwise compromise the interpretation of dynamic fluctuations in single-molecule measurements.

2.1 Surface-Immobilised Molecules: Total Internal Reflection Fluorescence Microscopy

2.1.1 Surface Immobilisation Methods

Surface immobilisation methods combined with wide-field TIR illumination and CCD detection offer the possibility of observing single molecules for long periods of time with millisecond time resolution. They also provide a way to control the density of molecules within the field of view, which is important when building statistical data that are representative of the underlying mechanism. Early single-molecule experiments based on surface-immobilised samples in aqueous solution used polyacrylamide or agarose gels to confine the sample in a certain volume. The pores formed within the gel trap the molecules and allow their study over a long period of time; however, the slow diffusion of the nucleic acids across the gel and the difficulty of changing solution conditions made it necessary to explore additional immobilisation methods. Subsequent experiments used the strong interaction between biotin and streptavidin to specifically attach the sample to the microscope slide. In this method, quartz slides (wide-field) or coverslips (confocal) are coated with a first layer of biotin-labelled bovine serum albumin (BSA); neutravidin is then added to form a second layer, to which a biotin-modified RNA or DNA sample can bind to form a stable complex (Fig. 2a). The interaction between neutravidin and biotin is one of the strongest non-covalent interactions known in nature, with an affinity in the picomolar range and, importantly, a very slow off rate, thus ensuring that the nucleic acid structure remains anchored to the slide for a long time. Moreover, the interaction is very stable to temperature, pH variations and moderate denaturing conditions, allowing a variety of biologically relevant conditions to be tested without affecting the immobilisation strategy (Kurzban et al. 1991).

Fig. 2

Surface immobilisation methods for single-molecule fluorescence techniques. (a) Surface immobilisation with the use of biotinylated bovine serum albumin (BSA) and neutravidin. Nucleic acid molecules labelled with biotin can bind one of the three free binding sites of the neutravidin. The resulting concentration of the sample is usually on the order of nM or pM. (b) The study of protein–DNA or protein–RNA interactions requires another passivation treatment with aminosilane and poly(ethylene glycol) (PEG) to prevent non-specific interactions between the protein and the slide surface. The slide is coated using 1–2 % biotinylated PEG and non-biotinylated PEG. Once the slide is prepared, first, neutravidin and then the sample can be flushed into the sample chamber. (c) Schematics of lipid encapsulation. The vesicles require a passivation either with a flat lipid bilayer or PEG to prevent its rupture on contact with the quartz surface. In the formation of the vesicle, only a small amount of lipids are biotinylated (<0.5 %), and those are the ones that can be tethered to the slide through the biotin–streptavidin interaction

In addition, the non-specific interactions with negatively charged glass and quartz surfaces that some avidin proteins exhibit can be minimised by using neutravidin, which has a lower pI (6.3). Because of its robustness and slow dissociation rate, the neutravidin/biotin system is becoming the standard tool for the non-covalent immobilisation of nucleic acids in single-molecule microscopy.

The study of protein–DNA or protein–RNA interactions requires an initial passivation of the surface with aminosilane and poly(ethylene glycol) (PEG) to prevent any non-specific protein–surface interaction. For these assays, the slide surface is coated with a mixture of biotinylated and non-biotinylated PEG to control the final density of molecules (Fig. 2b) (Chandradoss et al. 2014). Although BSA/neutravidin and PEG methods are by far the most commonly used immobilisation methods in the study of nucleic acid structure and dynamics, it is always crucial to confirm that the observed behaviour is not altered by the nearby surface. In recent years, encapsulation of the nucleic acid inside lipid vesicles has emerged as an alternative method that completely avoids surface artefacts (Fig. 2c) (Okumus et al. 2004). Encapsulation-based methods also offer several possibilities for real-time exchange of solution conditions. These include the insertion of nanopores in the lipid membrane, using the intrinsic ability of the bacterial toxin α-haemolysin (α-HL) to assemble into channels that connect both sides of the lipid bilayer, and the tailoring of the lipid composition of the vesicle so that transient pores form at room temperature [e.g. DMPC vesicles (Roy et al. 2008)]. Encapsulation is normally achieved by preparing the vesicles by the extrusion method (McCluskey et al. 2014) in the presence of a nucleic acid concentration chosen to maximise the fraction of singly occupied vesicles whilst minimising the trapping of multiple DNA or RNA structures within the confined volume. With the extrusion method, highly homogeneous vesicles with diameters ranging from 50 nm to 500 nm and carrying a small percentage of biotinylated lipids can be easily prepared and purified from non-encapsulated DNA or RNA using size exclusion chromatography.
It is important to note that lipid vesicles in contact with bare glass or quartz will burst open and release their contents. However, this well-known liposome property has been used to generate a passive 2-D lipid bilayer in direct contact with the glass/quartz from empty vesicles carrying a small percentage (<0.1 %) of biotinylated lipids. In single-molecule microscopy, the function of this layer is dual: (1) prevent the bursting of lipid vesicles carrying encapsulated molecules and (2) provide spatially separated biotinylated groups inserted in the 2-D bilayer to which neutravidin can bind.
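
The trade-off between singly and multiply occupied vesicles mentioned above can be illustrated with a short calculation. The sketch below is not taken from the cited protocols: it assumes that encapsulation is a random (Poisson) process, that the vesicle is a sphere whose full diameter is available to the solution, and the function names and the example values (200 nm vesicles, 400 nM nucleic acid) are purely illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def mean_occupancy(conc_molar, diameter_nm):
    """Average number of molecules per vesicle for a given bulk concentration.

    Assumes a spherical vesicle; the bilayer thickness is ignored.
    """
    radius_dm = (diameter_nm / 2.0) * 1e-8          # 1 nm = 1e-8 dm
    volume_litres = 4.0 / 3.0 * math.pi * radius_dm ** 3  # 1 dm^3 = 1 L
    return conc_molar * AVOGADRO * volume_litres


def occupancy_fractions(lam):
    """Poisson fractions of empty, singly and multiply occupied vesicles."""
    empty = math.exp(-lam)
    single = lam * math.exp(-lam)
    multiple = 1.0 - empty - single
    return empty, single, multiple


if __name__ == "__main__":
    lam = mean_occupancy(400e-9, 200.0)  # hypothetical conditions
    empty, single, multiple = occupancy_fractions(lam)
    print(f"mean occupancy: {lam:.2f}")
    print(f"empty: {empty:.2f}, single: {single:.2f}, multiple: {multiple:.2f}")
```

Under these assumptions, the fraction of singly occupied vesicles peaks when the mean occupancy is close to one, but a sizeable fraction of vesicles then contains more than one molecule, so lower loading concentrations may be preferred when multiple occupancy must be kept small.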

The surface immobilisation methods described above can be used with either wide-field or confocal techniques. In wide-field mode, TIR illumination is normally used to create an evanescent wave within the sample with a penetration depth limited to ~100 nm, thus reducing the excitation volume and providing a signal-to-noise ratio suitable for single-molecule detection. In Sect. 2.1.2, we detail the basic principles of how single-molecule TIR microscopy can be applied to investigate the function and structure of nucleic acids.

2.1.2 Total Internal Reflection Fluorescence

When light travels from a medium with a higher refractive index to another with a lower index, there is a critical angle of incidence at which the light is refracted at 90°. Above this angle, total internal reflection takes place: the light is completely reflected, and an evanescent wave is generated on the other side of the interface. The critical angle can be calculated from Snell’s law (Eq. 1):

$$ \frac{ \sin {\theta}_1}{ \sin {\theta}_2}=\frac{n_2}{n_1} $$
(1)

where \( {\theta}_1 \) and \( {\theta}_2 \) represent the angles of incidence and refraction, respectively, and \( {n}_1 \) and \( {n}_2 \) are the refractive indices of the two media. At the critical angle, \( {\theta}_2={90}^{\circ } \), so that \( \sin {\theta}_{\mathrm{c}}={n}_2/{n}_1 \). In a TIR-based setup, a laser beam is coupled into the sample chamber at an angle greater than the critical angle to create an evanescent wave within the sample. This technique uses the evanescent wave as a wide-field excitation method, where the intensity decays exponentially with the distance from the interface.
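
As a rough numerical illustration of Eq. (1), the following sketch computes the critical angle and the decay of the evanescent field. The expression used for the penetration depth, \( d={\lambda}_0/\left(4\pi \sqrt{n_1^2{ \sin}^2\theta -{n}_2^2}\right) \), is the standard textbook result for the intensity decay length and is not given in the text above; the refractive indices (1.46 for quartz, 1.33 for water), the 532 nm wavelength and the 70° incidence angle are assumed example values.

```python
import math


def critical_angle_deg(n1, n2):
    """Critical angle (degrees) for light travelling from index n1 into n2 < n1."""
    if n2 >= n1:
        raise ValueError("total internal reflection requires n1 > n2")
    return math.degrees(math.asin(n2 / n1))


def penetration_depth_nm(wavelength_nm, n1, n2, incidence_deg):
    """1/e decay length of the evanescent intensity beyond the interface."""
    term = (n1 * math.sin(math.radians(incidence_deg))) ** 2 - n2 ** 2
    if term <= 0:
        raise ValueError("incidence angle must exceed the critical angle")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(term))


def evanescent_intensity(z_nm, depth_nm, i0=1.0):
    """Intensity at a distance z from the interface, I(z) = I0 exp(-z/d)."""
    return i0 * math.exp(-z_nm / depth_nm)


if __name__ == "__main__":
    n_quartz, n_water = 1.46, 1.33  # assumed example values
    theta_c = critical_angle_deg(n_quartz, n_water)
    depth = penetration_depth_nm(532.0, n_quartz, n_water, 70.0)
    print(f"critical angle: {theta_c:.1f} deg")
    print(f"penetration depth at 70 deg: {depth:.0f} nm")
    print(f"relative intensity 200 nm from the surface: "
          f"{evanescent_intensity(200.0, depth):.2f}")
```

With these example values, the critical angle is close to 66° and the penetration depth is on the order of 100 nm, in line with the values quoted in this chapter; the depth shrinks as the incidence angle is increased further beyond the critical angle.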

Two main types of microscopes based on this technique can be distinguished depending on how the evanescent wave is generated: prism-based and objective-based (Fig. 1a). In prism-based TIR, a prism directs the beam to the slide and illuminates the side of the quartz slide in contact with the prism and opposite to the microscope objective. In objective-based total internal reflection fluorescence (TIRF), the excitation beam is focused onto the back focal plane of the objective and directed to the slide through one side of the objective to satisfy the total internal reflection condition. In this case, excitation and detection take place on the same side of the slide, and this normally increases the background levels, resulting in a lower signal-to-noise ratio.

In the first years of smTIR, nucleic acid structures such as ribozymes (Zhuang et al. 2000) served as testing platforms to explore the potential of smTIR to reveal the intricate mechanisms of folding and catalysis, and in many ways, both fields have benefited from each other. Since then, the application of single-molecule TIR microscopy in biology has grown continuously, and once again, nucleic acids have been used as test systems for the implementation of more complex single-molecule techniques such as those involving multicolour detection (Hohng et al. 2004) and mechanical manipulation (Hohng et al. 2007). As recent examples, smTIR techniques have greatly contributed to understanding RNA-based gene regulation mechanisms in bacteria, to dissecting ligand-binding processes in RNA aptamers (Heppell et al. 2011) and to deciphering the function and structure of guanine quadruplexes (Lee et al. 2005a, b). In the following sections of this chapter, we will describe in detail not only the techniques but also the insights that they have provided regarding the function, structure and dynamics of many of the above-mentioned nucleic acid systems.

2.2 Fluorescence Resonance Energy Transfer

Fluorescence resonance energy transfer (FRET) is a technique based on the non-radiative transfer of energy between two fluorescent dyes, a donor (D) and an acceptor (A), that are very close to one another (Fig. 3a) (Stryer 1978). FRET can be used as a powerful spectroscopic technique to measure distances between the two chromophores in the range from 10 Å to 100 Å, which is a relevant distance range in many biological processes involving nucleic acids. The transfer of resonant energy implies a weak Coulombic interaction between the oscillating dipole moments of both dyes and requires a significant overlap between the donor emission and the acceptor absorption spectra (Fig. 3b). The rate of energy transfer depends inversely on the sixth power of the distance between the fluorophores. Thus, the distance between donor and acceptor, r, can be estimated from the efficiency of the FRET process (\( {E}_{\mathrm{app}} \)). This is normally achieved by calculating the relative emission of the acceptor with respect to the total fluorescence emission (D + A) (Eq. 2):

Fig. 3

Single-molecule FRET assays for nucleic acid characterisation. (a) Jablonski diagram illustrating the FRET process. On the left, a donor molecule (D) absorbs a photon to reach its first excited electronic state, \( {S}_{\mathrm{D}1} \). Then, the molecule relaxes via fluorescence radiative emission (\( {k}_{\mathrm{f}} \)), via non-radiative processes (\( {k}_{\mathrm{nr}} \)), or undergoes resonance energy transfer (\( {k}_{\mathrm{FRET}} \)) to the acceptor (A). The acceptor, in its excited state \( {S}_{\mathrm{A}1} \), can also relax through radiative or non-radiative processes and return to its ground state, \( {S}_{\mathrm{A}0} \). (b) FRET efficiency as a function of distance, r, between the widely used FRET pair Cy3 (donor) and Cy5 (acceptor). \( {E}_{\mathrm{app}} \) depends inversely on the sixth power of the separation between both dyes (\( {r}^6 \)), making it useful for the study of the structure and dynamics of nucleic acids when the inter-dye distance varies between 1 and 10 nm. Right inset: Emission spectrum of Cy3 and absorption spectrum of Cy5. The pair of dyes needs to have significant overlap between both spectra and be close enough to show FRET

$$ {E}_{\mathrm{app}}=\frac{F_{\mathrm{A}}}{F_{\mathrm{A}}+{F}_{\mathrm{D}}}=\frac{R_0^6}{R_0^6+{r}^6} $$
(2)

\( {R}_0 \) is the Förster radius and represents the distance that corresponds to a 50 % transfer efficiency. This term can be calculated using Eq. (3):

$$ {R}_0^6=\frac{9000\, \ln 10\;{\kappa}^2{Q}_{\mathrm{D}}}{128\,{\pi}^5N{n}^4}{\int}_0^{\infty }{F}_{\mathrm{D}}\left(\lambda \right){\varepsilon}_{\mathrm{A}}\left(\lambda \right){\lambda}^4\,d\lambda $$
(3)

where \( {\kappa}^2 \) is the dipole orientation factor (assumed to be 2/3 for freely rotating dyes, i.e. isotropic orientational averaging), \( {Q}_{\mathrm{D}} \) is the quantum yield of the donor in the absence of acceptor, n is the refractive index of the medium, N is Avogadro’s number and the integral represents the overlap between the donor emission and acceptor absorption spectra. This integral includes the normalised donor fluorescence intensity, \( {F}_{\mathrm{D}} \), and the extinction coefficient of the acceptor at the same wavelength, \( {\varepsilon}_{\mathrm{A}} \). Thus, the FRET efficiency depends on the distance, spectral overlap and relative orientation between the molecular dipoles of both dyes.
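
Equation (2) can be inverted to obtain the inter-dye distance from a measured efficiency, \( r={R}_0{\left(1/{E}_{\mathrm{app}}-1\right)}^{1/6} \). The following minimal sketch illustrates both directions of this relation. The function names are hypothetical, the intensities are assumed to be already background-corrected, and the Förster radius of 5.4 nm used in the example is only an illustrative value for a Cy3–Cy5 pair; in practice, \( {R}_0 \) should be determined for the specific dyes and labelling conditions.

```python
def apparent_fret(f_donor, f_acceptor):
    """Apparent FRET efficiency from donor and acceptor intensities (Eq. 2)."""
    total = f_donor + f_acceptor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return f_acceptor / total


def fret_from_distance(r, r0):
    """FRET efficiency expected for an inter-dye distance r (same units as r0)."""
    return r0 ** 6 / (r0 ** 6 + r ** 6)


def distance_from_fret(e_app, r0):
    """Inter-dye distance obtained by inverting Eq. 2."""
    if not 0.0 < e_app < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / e_app - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0_nm = 5.4  # assumed, illustrative Forster radius
    for r_nm in (3.0, 5.4, 8.0):
        print(f"r = {r_nm} nm -> E = {fret_from_distance(r_nm, r0_nm):.2f}")
    e = apparent_fret(f_donor=300.0, f_acceptor=700.0)
    print(f"E_app = {e:.2f} -> r = {distance_from_fret(e, r0_nm):.2f} nm")
```

Because of the sixth-power dependence, the efficiency changes steeply only for distances close to \( {R}_0 \); far from this value, small errors in \( {E}_{\mathrm{app}} \) translate into large uncertainties in the recovered distance.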

Although ensemble-averaging FRET techniques have been used as a molecular ruler to investigate the structure of nucleic acids for more than 30 years, it was not until they were applied at the single-molecule level (smFRET) that their full potential was revealed. In common with single-molecule techniques in general, smFRET makes it possible to follow the interactions, dynamics and conformational changes of nucleic acids on a timescale that includes most of the biologically relevant processes, ranging from milliseconds to minutes. Moreover, it enables the quantification of the relative populations of different structural conformers that would otherwise be hidden in conventional ensemble FRET because of averaging over many molecules. The data obtained from smFRET measurements are commonly shown as time-dependent donor and acceptor intensity trajectories, together with the resulting FRET trace corresponding to the single D–A pair present in the DNA or RNA structure. In the absence of photophysical artefacts, anticorrelated variations in the intensity trajectories of the donor and acceptor molecules are interpreted as evidence for dynamic switching between different structural conformers (Fig. 4a). From each single-molecule FRET trajectory, two types of data can be extracted: (1) the number of different conformers present at equilibrium and their relative contributions and (2) the kinetic rates associated with the individual transitions from one conformer to another. By accumulating these values for many molecules, steady-state FRET histograms showing the relative structural populations present in solution can be generated (Fig. 4b). Using the same FRET trajectories, rate histograms representing the underlying interconversion dynamics can be built with enough statistics to accurately reveal intricate folding and catalytic mechanisms (Fig. 4c) (Zhuang et al. 2000).
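
The analysis steps outlined above can be sketched for the simplest case of a molecule switching between two conformers. This is an illustrative outline rather than the analysis pipeline used in the cited studies: it assumes background-corrected intensities, a fixed threshold of 0.5 to separate low-FRET from high-FRET states (real data are often analysed with hidden Markov models instead) and single-exponential kinetics, for which the mean dwell time is a simple estimate of the lifetime. All function names are hypothetical.

```python
from statistics import mean


def fret_trace(donor, acceptor):
    """Frame-by-frame apparent FRET efficiency (assumes donor + acceptor > 0)."""
    return [a / (a + d) for d, a in zip(donor, acceptor)]


def assign_states(e_trace, threshold=0.5):
    """Label each frame as low-FRET (0) or high-FRET (1) by simple thresholding."""
    return [1 if e >= threshold else 0 for e in e_trace]


def dwell_times(states, frame_time):
    """Durations spent in each state before a transition.

    The first and last segments are discarded because their true
    durations are truncated by the start and end of the recording.
    """
    segments = []
    start = 0
    for i in range(1, len(states) + 1):
        if i == len(states) or states[i] != states[start]:
            segments.append((states[start], (i - start) * frame_time))
            start = i
    dwells = {0: [], 1: []}
    for state, duration in segments[1:-1]:
        dwells[state].append(duration)
    return dwells


def fret_histogram(values, n_bins=10):
    """Counts of FRET values in equal-width bins spanning 0 to 1."""
    counts = [0] * n_bins
    for e in values:
        index = min(max(int(e * n_bins), 0), n_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Synthetic, noise-free example trace (arbitrary intensity units)
    donor = [800, 800, 200, 200, 200, 800, 800, 800, 800, 200, 200, 800]
    acceptor = [200, 200, 800, 800, 800, 200, 200, 200, 200, 800, 800, 200]
    e_trace = fret_trace(donor, acceptor)
    dwells = dwell_times(assign_states(e_trace), frame_time=0.1)
    print("FRET histogram:", fret_histogram(e_trace))
    print("mean high-FRET dwell (s):", mean(dwells[1]))
    print("mean low-FRET dwell (s):", mean(dwells[0]))
```

In a real experiment, the dwell times from many molecules would be pooled before fitting, and the reciprocal of each lifetime gives the rate constant for leaving the corresponding state.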

Fig. 4

Data analysis for single-molecule FRET assays. (a) Single-molecule FRET trace for a doubly labelled nucleic acid. The detected signal is represented as a function of time and shows the anticorrelated fluorescence intensities of the donor (green) and acceptor (red). (Bottom graph) FRET efficiency, \( {E}_{\mathrm{app}} \), obtained from the upper trace. \( {E}_{\mathrm{app}} \) can be calculated from the donor and acceptor intensities using Eq. (2) (see main text). The trace shows a dynamic switching between low-FRET (unfolded) and high-FRET (folded) conformations. (b) smFRET histograms representing FRET populations. The histogram is built from the average \( {E}_{\mathrm{app}} \) value of the first ten frames of each trace. (c) Dwell time histogram for the study of the interconversion dynamics. The distribution can be fitted to a mono- or multi-exponential decay \( I(t)={\displaystyle \sum_n{a}_n\cdot {e}^{-t/{\tau}_n}} \), where \( {\tau}_n \) represents the lifetime of the corresponding conformer

Over the last few years, a wide range of dyes has become commercially available for single-molecule techniques, covering the spectral range from UV to IR. Some of the most commonly used are the Alexa Fluor, cyanine and ATTO dye families. The photophysical properties of these chromophores need to meet some very specific criteria to be adequate for smFRET applications: they need to (1) have well-separated emission spectra (low or negligible crosstalk), (2) have a substantial spectral overlap between the donor emission and acceptor absorption spectra (which determines the \( {R}_0 \) value), (3) have similar quantum yields (so that the total fluorescence intensity does not change significantly) and, very importantly, (4) be commercially available with reactive groups for site-specific attachment to nucleic acids. In Table 1, we list the most widely used fluorophores for smFRET applications together with their photophysical properties.

Table 1 Spectroscopic properties for some of the most common fluorescent dyes used in single-molecule fluorescence techniques

2.3 Single-Molecule Fluorescence Detection of Freely Diffusing Nucleic Acids

2.3.1 Fluorescence Correlation Spectroscopy

Single-molecule fluorescence correlation spectroscopy (smFCS) is an elegant alternative to surface-immobilised methods to investigate the interconversion dynamics and distribution of structural populations of nucleic acids in solution. FCS was initially established in the 1970s (Magde et al. 1972), but it was not until the 1990s that significant technical improvements made possible its wider application in biologically relevant systems (Rigler et al. 1993). smFCS measures small fluorescence intensity fluctuations from molecules transiting through a confocal observation volume on the order of a femtolitre and on timescales ranging from \( {10}^{-7} \) to \( {10}^2 \) s. smFCS provides information about diffusion coefficients, variations in molecular brightness and concentration, intramolecular conformational dynamics and association rates in a variety of biomolecular processes such as host–ligand association and protein–protein and protein–nucleic acid interactions. In recent years, smFCS has been combined with a range of other techniques such as dual-colour cross-correlation methods, two-focus cross-correlation, laser scanning microscopy, TIRF, two-photon microscopy and stimulated emission depletion (STED), and the reader is referred to excellent reviews in the field for more specific information (Enderlein et al. 2004; Kim et al. 2007a, b). smFCS correlates the fluorescence emission signal from a single molecule diffusing across the excitation volume at a certain time t with the same signal at a later time (t + τ). This correlation, which is represented by the autocorrelation function (Eq. 4), can be described as the degree of self-similarity of the signal in time. The autocorrelation function can then be fitted to different models depending on the specific mechanism under study to obtain the values of diffusion times, brightness and molecular dynamics contributing to the observed fluctuations in the fluorescence signal.

$$ G\left(\tau \right)=\frac{\left\langle \delta F(t)\delta F\left(t+\tau \right)\right\rangle }{{\left\langle F(t)\right\rangle}^2} $$
(4)

In the autocorrelation function, \( \delta F(t)=F(t)-\left\langle F(t)\right\rangle \) represents the fluctuation around an average intensity. Although smFCS has provided a wealth of information regarding hybridisation rates and the mechanisms of nucleic acid self-assembly, for FCS to be useful in the study of conformational dynamics in nucleic acids, the fluctuations in intensity of the fluorescence probe have to depend on the conformation of the nucleic acid polymer. The most common approach is to label the nucleic acid with a fluorophore that is directly excited and also with either a quencher species or a FRET acceptor. Thus, fluctuations in the emission signal of the fluorophore are a direct consequence of distance changes between the fluorophore and the quencher or the acceptor tag. The analysis and interpretation of the autocorrelation (single-colour) or cross-correlation (two-colour) curves depend on the timescale of the conformational dynamics compared to the residence time of the molecule in the confocal volume, which is characterised by \( {\tau}_{\mathrm{D}}={r}_0^2/4D \), where \( {r}_0 \) is the lateral radius of the confocal volume and D is the diffusion coefficient. Experimentally, the simplest case is when the diffusion time and the timescale of the conformational fluctuations are well separated. For small biomolecules, D is ~\( {10}^{-10} \) to \( {10}^{-9} \) m\( {}^2 \) s\( {}^{-1} \), and the residence time in the confocal volume is in the region of 100 μs. Thus, fluctuations in the fluorescence signal due to conformational changes taking place between 1 and 10 μs can be easily separated from the diffusion component. For instance, relaxation times in the microsecond regime were monitored for DNA hairpins containing a poly(dT)5 loop with a short dC–dG stem (Kim et al. 2006). Biopolymers showing dynamics at longer timescales that overlap with the diffusion time have also been studied using different modifications to the basic smFCS technique.
The description of these techniques is beyond the scope of this chapter, but readers are referred to excellent reviews and book chapters in the field for further information (Gurunathan and Levitus 2008; Jung and Van Orden 2005; Wallace et al. 2000).
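
To make Eq. (4) and the expression for the diffusion time more concrete, the sketch below evaluates the normalised autocorrelation of a discretely sampled intensity trace and the residence time in the confocal volume. It is a direct, unoptimised implementation for illustration only (hardware correlators and multiple-tau algorithms are normally used in practice); the function names and the example values, a lateral radius of 0.2 μm and a diffusion coefficient of \( {10}^{-10} \) m\( {}^2 \) s\( {}^{-1} \), are assumptions.

```python
from statistics import mean


def autocorrelation(trace, lag):
    """Normalised autocorrelation G(lag) of Eq. 4 for an integer lag in samples."""
    if lag < 0 or lag >= len(trace):
        raise ValueError("lag must satisfy 0 <= lag < len(trace)")
    average = mean(trace)
    fluctuations = [f - average for f in trace]
    n_pairs = len(trace) - lag
    products = [fluctuations[i] * fluctuations[i + lag] for i in range(n_pairs)]
    return mean(products) / average ** 2


def diffusion_time(r0_m, d_m2_per_s):
    """Residence time tau_D = r0^2 / (4 D) in seconds."""
    return r0_m ** 2 / (4.0 * d_m2_per_s)


if __name__ == "__main__":
    # Synthetic trace alternating between two intensity levels every sample
    trace = [1.0, 3.0] * 50
    print("G(1):", autocorrelation(trace, 1))
    print("G(2):", autocorrelation(trace, 2))
    tau_d = diffusion_time(0.2e-6, 1e-10)  # assumed r0 and D
    print(f"tau_D: {tau_d * 1e6:.0f} microseconds")
```

With the assumed radius and diffusion coefficient, the residence time comes out at about 100 μs, consistent with the order of magnitude quoted above.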

2.3.2 Multiparameter Fluorescence Detection

Multiparameter fluorescence detection (smMFD) shares some common features with the smFCS techniques described previously in terms of targeting freely diffusing samples and using point detectors. However, smMFD methods use time-correlated single-photon counting (TCSPC) techniques to characterise the nucleic acid structure using simultaneously the complete set of fluorescence observables available (Sisamakis et al. 2010). For a nucleic acid structure labelled with a FRET pair, this entire set of fluorescence variables constitutes an 8-D parameter space that includes anisotropy, lifetime, intensity (stoichiometry), detection time, excitation and emission wavelength, fluorophore quantum yield and distance between fluorophores. Briefly, in smMFD, a pulsed laser excites the molecules labelled with donor and acceptor fluorophores when they pass through a small sample volume at the focal spot of the laser. This makes possible the simultaneous recording of various fluorescence parameters such as fluorescence lifetime, emission intensity and anisotropy of donor and acceptor. The detection of more than one parameter at the same time provides a means to represent and analyse the data using multidimensional histograms and allows to filter ‘true’ FRET events, induced by a change in molecular conformation, from those originating due to poor labelling, mostly lack of acceptor, and from variations in the quantum yield and rotational freedom of each dye. If uncorrected, these effects may lead to incorrect interpretations, particularly at low FRET efficiencies where the lack of acceptor dye may be interpreted as a very large distance. Thus, smMFD offers the possibility to test assumptions that are often considered to be valid a priori when using other single-molecule FRET techniques. 
Examples include local quenching of the donor and acceptor, which may affect the R0 value; differences in the mobility of the dyes, which compromise the assumption of a value of 2/3 for the orientation factor (κ²); and the influence of the linker on the behaviour of the dye (Kühnemuth and Seidel 2001; Sisamakis et al. 2010).
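The weight of the 2/3 assumption can be gauged from the standard Förster relation, in which R0 scales with the sixth root of κ² when all other terms are held fixed. The Python sketch below is illustrative only: the function name and the sampled κ² values are assumptions made here, not values from the cited studies.

```python
# Illustrative sketch: sensitivity of the Förster radius R0 to the orientation factor.
# Förster theory gives R0 proportional to (kappa^2)^(1/6), all other terms held fixed.

KAPPA2_ISOTROPIC = 2.0 / 3.0  # value usually assumed for freely rotating dyes


def r0_scale_factor(kappa2, kappa2_ref=KAPPA2_ISOTROPIC):
    """Return R0(kappa2) / R0(kappa2_ref)."""
    if kappa2 <= 0 or kappa2_ref <= 0:
        raise ValueError("kappa^2 must be positive")
    return (kappa2 / kappa2_ref) ** (1.0 / 6.0)


# Hypothetical kappa^2 values: half, equal to, and twice the isotropic value.
for kappa2 in (1.0 / 3.0, 2.0 / 3.0, 4.0 / 3.0):
    factor = r0_scale_factor(kappa2)
    print(f"kappa^2 = {kappa2:.3f} -> R0 scaled by {factor:.3f}")
```

Because of the sixth-root dependence, a two-fold error in κ² shifts R0, and hence any distance derived from it, by roughly 11 to 12%.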

It is beyond the aim of this chapter to describe the basis of smMFD and TCSPC techniques, but very detailed analyses of the methods and mathematical models have been reported by some of the pioneering groups in the field (Kudryavtsev et al. 2007; Sisamakis et al. 2010). The potential of this technique has been demonstrated in different MFD–FRET assays. Examples include the observation of the structure of the enzyme–substrate complex for the HIV-1 reverse transcriptase (Rothwell et al. 2003), the structural dynamics of the exocytosis-related protein syntaxin-1 (Margittai et al. 2003) and the accurate determination of intramolecular distances within various branched DNA structures (Sabir et al. 2011; Wozniak et al. 2008).

2.3.3 Alternating-Laser Excitation

Alternating-laser excitation (ALEX), also known as pulsed interleaved excitation (PIE), makes it possible to study simultaneously the structure and dynamics of nucleic acids labelled with a FRET pair and shares with MFD the ability to filter artefacts in the FRET reading caused by partial labelling. ALEX can be used either in combination with TIR for surface-immobilised nucleic acids or with confocal microscopy for freely diffusing molecules. Both cases need two different lasers to alternately excite donor and acceptor on a nanosecond, microsecond or millisecond timescale, each one with different technical characteristics. In addition, ALEX is suitable for multicolour analysis and can incorporate a third and a fourth laser. The observation of more than two fluorophores within the same molecule makes it possible to measure more inter-dye distances and create a structural model based on distance mapping. ALEX results are presented as two-dimensional histograms where it is possible to distinguish different populations according to their FRET efficiency and donor–acceptor stoichiometry. Thus, direct excitation of the donor and acceptor provides a way to isolate conformers and discriminate those species labelled only with donor or acceptor from those carrying both dyes (Lee et al. 2005a, b). The details of ALEX and its multiple variations (ns-ALEX, μs-ALEX, ms-ALEX) have been described by Hohlbein et al. (2014) and Kapanidis et al. (2005).
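The two axes of an ALEX histogram can be illustrated with the commonly used uncorrected ratios: an apparent FRET efficiency E computed from photons detected during donor excitation, and a stoichiometry S that also counts acceptor photons detected during acceptor excitation. The Python sketch below is a simplified illustration under stated assumptions: it omits the leakage, direct-excitation and detection-efficiency corrections described in the cited works, and the function names and the S thresholds (0.25 and 0.75) are hypothetical choices made here.

```python
# Illustrative sketch of uncorrected ALEX ratios for a single burst or time bin.
# f_dd: donor-channel photons during donor excitation
# f_da: acceptor-channel photons during donor excitation
# f_aa: acceptor-channel photons during acceptor excitation


def alex_e_s(f_dd, f_da, f_aa):
    """Return (apparent FRET efficiency E, stoichiometry S)."""
    donor_exc_total = f_dd + f_da
    total = donor_exc_total + f_aa
    if total <= 0:
        raise ValueError("burst contains no photons")
    s = donor_exc_total / total
    # E is undefined when nothing was detected during donor excitation.
    e = f_da / donor_exc_total if donor_exc_total > 0 else None
    return e, s


def classify_by_stoichiometry(s, low=0.25, high=0.75):
    """Assign a labelling class from S; the thresholds are hypothetical."""
    if s > high:
        return "donor-only"
    if s < low:
        return "acceptor-only"
    return "donor and acceptor"


# Hypothetical photon counts for three bursts.
for counts in [(100, 0, 0), (50, 50, 100), (0, 5, 100)]:
    e, s = alex_e_s(*counts)
    print(counts, "E =", e, "S =", round(s, 3), classify_by_stoichiometry(s))
```

In this simplified picture, donor-only species cluster near S = 1, acceptor-only species near S = 0 and doubly labelled molecules at intermediate S, where their E values can then be compared.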

3 Single-Molecule Sequencing of Nucleic Acids: Zero-Mode Waveguides

The requirement to work in the picomolar range (50–500 pM) to achieve a spatial separation good enough to identify isolated single molecules has long constituted a significant barrier to the wider application of these techniques. Indeed, many enzymatic processes and most biological interactions require higher concentrations of substrate and/or interacting partners. This barrier, known as the ‘concentration problem’, has recently been overcome by two different approaches: (1) encapsulation of reagents into a nanocontainer (e.g. a lipid vesicle) to increase the effective concentration (see Sect. 2.1.1) and (2) the use of zero-mode waveguides (ZMW), which is described in this section.

The development of single-molecule technologies for DNA sequencing exemplifies how ZMWs have allowed the field to access important biological systems well beyond the concentration limit. To perform its function, the DNA polymerase requires micromolar to millimolar concentrations of fluorescently labelled nucleotides. For this to be viable at single-molecule level, the excitation volume had to be reduced to the zeptolitre range in order to reach millimolar concentrations of the sample whilst still monitoring only one dye in that volume (Dulin et al. 2013). ZMWs consist of an array of nanoscale holes in a metal film deposited on a fused silica or quartz substrate. Hole dimensions can vary from a few tens to around 300 nm in diameter and 100 nm in depth. In the study of DNA polymerisation, each hole might contain a single polymerase immobilised at the bottom, whilst the majority of dye-labelled nucleotides remain outside the hole, with single nucleotides occasionally diffusing into it. Wide-field excitation can be used in the real-time sequencing of a DNA strand by following the addition of the four distinct deoxyribonucleoside triphosphates labelled with four different dyes (Eid et al. 2009). Attaching the dye to the end of the phosphate chain means that it is cleaved off during incorporation and does not hinder the addition of a new labelled nucleotide in the adjacent position. Hence, real-time sequencing requires the use of dyes placed on the third phosphate group of the nucleotide. The small illuminated volume allows the selective observation of single nucleotides whilst they are added to the growing strand. Upon incorporation, the cleaved dyes diffuse out of the ZMW, so they no longer contribute to the collected fluorescent signal. Since the detected signal consists of fluorescent pulses from the dyes, the different durations of these pulses can be related to the enzymatic mechanism. Therefore, in addition to the DNA sequence, ZMWs can also provide information about the kinetics of the process (Eid et al. 2009).
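The concentration problem reduces to simple arithmetic: the mean number of molecules in an observation volume is the product of concentration, volume and Avogadro's constant. The Python sketch below illustrates this; the volumes used (1 fL for a diffraction-limited confocal spot and 1 zL for a ZMW) are round, assumed numbers chosen for illustration rather than values taken from the cited works.

```python
# Illustrative sketch: mean number of molecules inside an observation volume.

AVOGADRO = 6.02214076e23  # molecules per mole


def mean_occupancy(conc_molar, volume_litres):
    """Mean number of molecules in the volume at the given molar concentration."""
    return conc_molar * volume_litres * AVOGADRO


CONFOCAL_VOLUME_L = 1e-15  # ~1 femtolitre, assumed round number
ZMW_VOLUME_L = 1e-21       # ~1 zeptolitre, assumed round number

print(f"confocal, 100 pM: {mean_occupancy(100e-12, CONFOCAL_VOLUME_L):.3g}")
print(f"confocal, 1 mM:   {mean_occupancy(1e-3, CONFOCAL_VOLUME_L):.3g}")
print(f"ZMW, 1 mM:        {mean_occupancy(1e-3, ZMW_VOLUME_L):.3g}")
```

Under these assumptions, a femtolitre volume holds well under one molecule on average at 100 pM but hundreds of thousands at 1 mM, whereas a zeptolitre volume brings the mean occupancy back below one even at millimolar concentration.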

4 Single-Molecule Studies of Branched DNA

4.1 Guanine Quadruplexes

Guanine-rich DNA or RNA sequences can adopt a wide variety of four-stranded structures called G-quadruplexes. Their formation can usually be predicted by the presence of the sequence motif G≥3 N1–7 G≥3 N1–7 G≥3 N1–7 G≥3, that is, four runs of at least three guanines separated by three loops of one to seven nucleotides, where N can be any base (Maizels and Gray 2013). The primary unit is the G-quartet, a square planar assembly of four guanine bases. G-quartets can form stacks that are mainly stabilised by monovalent cations, such as Na+ and K+. These cations interact with the oxygen atoms of the bases to counteract the negative electrostatic potential at the centre of the quartet (Vummidi et al. 2013).
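The sequence motif above translates directly into a regular expression. The Python sketch below is a minimal illustration, not a validated prediction tool: the function name is hypothetical, only one strand is scanned, and overlapping candidates are not reported because `re.finditer` returns non-overlapping matches.

```python
import re

# Four runs of >=3 guanines separated by three loops of 1-7 nucleotides of any base.
G4_MOTIF = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start index, matched text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in G4_MOTIF.finditer(sequence.upper())]


# Four human telomeric repeats satisfy the motif.
print(find_g4_motifs("TTAGGGTTAGGGTTAGGGTTAGGG"))
# A sequence without guanine runs does not.
print(find_g4_motifs("ATGCATGCATGCATGC"))
```

A match indicates only that the sequence has the potential to fold; whether a quadruplex actually forms, and in which conformation, still has to be established experimentally.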

G-quadruplexes have been investigated for a wide range of DNA sequences in vitro (Burge et al. 2006). One of those sequences is found in the human telomeres, which carry an overhang of 100–200 nucleotides formed by TTAGGG repeats that can adopt G-quadruplex structures, and they have attracted considerable interest since the early 1990s. Later studies revealed some other genomic sequences that adopt G-quadruplex architectures, such as the non-coding region of the gene C9orf72, involved in frontotemporal dementia and amyotrophic lateral sclerosis (Haeusler et al. 2014); the promoter of the bcl-2 oncogene, which acts as an apoptosis inhibitor (Dai et al. 2006); c-kit (Rankin et al. 2005), c-MYC (Siddiqui-Jain et al. 2002) and Ras family proto-oncogenes (HRAS, KRAS and NRAS), involved in cellular growth (Cogoi and Xodo 2006; Membrino et al. 2010); and the hypoxia-inducible factor 1 (HIF-1), which regulates the transcription of over 60 genes involved in cellular homeostasis (Armond et al. 2005; Brooks et al. 2010; Verma et al. 2008).

Some of the earliest single-molecule FRET studies on the structure of G-quadruplexes were performed on the human telomere (htelo) sequence (TTAGGG)n and revealed an extreme conformational diversity (Fig. 5a) (Ying et al. 2003). These smFRET experiments were performed at various temperatures and at different sodium and potassium salt concentrations for freely diffusing and surface-immobilised molecules (Lee et al. 2005a, b). Statistical analysis of the interconversion rates using single-molecule histograms revealed the presence of several distinct structural populations. Based on their FRET values, these populations were mostly associated with one unfolded and two folded states of the G-quadruplex. The two long-lived folded structures that correspond to the most relevant species at physiological conditions were assigned to parallel and antiparallel conformations of the G-quadruplex coexisting in equilibrium. smFRET analysis of the influence of monovalent ions revealed that lower concentrations of potassium salts than of sodium salts are required to stabilise the folded states. In this study, it was also demonstrated that K+ stabilises the antiparallel state, observed as an increase of the high-FRET population. The htelo G-quadruplex unfolds upon increasing the temperature at low K+ concentrations, whilst at physiological concentrations of K+ (100 mM), the folded state is stable even at 37 °C. The conformational heterogeneity observed for the htelo sequence by smFRET and the presence of additional mixed parallel and antiparallel conformations were recently confirmed by nuclear magnetic resonance (NMR) spectroscopy (Ambrus et al. 2006; Dai et al. 2007; Shirude and Balasubramanian 2008) and X-ray crystallography (Parkinson et al. 2002). More recently, two G-quadruplex sequences (kit-1 and kit-2) identified in the c-kit promoter region were studied by smFRET in freely diffusing conditions and encapsulated in lipid vesicles (Shirude et al. 2007). 
kit-1 is positioned between −87 and −109 bp upstream of the transcription start site and was shown to fold into a quadruplex in vitro (Rankin et al. 2005). kit-2 is positioned between −140 and −160 bp upstream of the transcription initiation site and was also shown to form a G-quadruplex structure by NMR and CD spectroscopy (Fernando et al. 2006). These studies confirmed that kit-1 and kit-2 can fold into non-duplex states within a natural extended DNA duplex (Shirude et al. 2007). Importantly, dynamic fluctuations in the c-kit quadruplexes were rare, even in single-stranded form, which is in contrast with the dynamics observed for the htelo sequence. Based on this difference, it was suggested that the dynamic behaviour of intramolecular DNA quadruplexes could be a property of each sequence and vary significantly between different quadruplexes. Moreover, the observation by smFRET of folded G-quadruplexes within the c-kit promoter not only challenges the general view that G- and C-rich sequences form very stable duplex sequences but also suggests a natural function for these motifs involving changes in DNA topology (Dai et al. 2007; Shirude and Balasubramanian 2008).

Fig. 5 Schematics of FRET assays on nucleic acid structures. Summary of the nucleic acid structures most widely studied by smFRET. Donor (light grey) and acceptor (dark grey) groups are shown. (a) Guanine quadruplex. (b) Holliday junction. (c–e) Three different ribozymes, small non-coding RNA fragments with catalytic activity: (c) Varkud satellite (VS) ribozyme, (d) hammerhead ribozyme and (e) natural form of the hairpin ribozyme. (f–k) Various structures of riboswitches that sense different metabolites: (f) cyclic di-GMP, (g) purine (the guanine riboswitch is represented with circled dyes and the adenine riboswitch with non-circled dyes), (h) lysine, (i) SAM-I, (j) SAM-II and (k) Pre-Q1

4.2 Holliday Junctions

DNA is not restricted to a static duplex or G-quadruplex organisation, and indeed, it can also adopt highly dynamic structures as exemplified by the branched architecture of the Holliday junction (HJ) (Holliday 1964). HJs are four-way DNA junctions that act as intermediates in the homologous recombination pathway, which involves the exchange of nucleotide sequences between two identical or very similar dsDNA molecules to repair DNA breaks in one of the duplexes, using the other duplex as a template. Movement of the crossover junction along the DNA is termed branch migration and, whether spontaneous or mediated by proteins, is a key step in various genetic processes. HJs can potentially adopt three structures: open, parallel and antiparallel. The four helical arms in the open form adopt a square shape, but only in the parallel form the two strands that have been exchanged are crossed (Liu and West 2004). smFRET studies on the HJ were carried out to investigate the formation and interconversion of these structures upon addition of Mg2+. It is well known that metal ions, and in particular Mg2+, can play a critical role in defining the most stable conformation of many DNA and RNA structures, but how they modulate HJ dynamics was poorly understood. Initial smFRET measurements in nonmigrating HJs with donor and acceptor molecules placed at the end of two arms of the Holliday junction (Fig. 5b) showed the presence of two conformers of the antiparallel structure and revealed that the open structure acts as an intermediate between these two conformers (McKinney et al. 2003). In contrast, the parallel conformations were not observed, and their presence was determined to be insignificant (Joo et al. 2004; McKinney et al. 2004). These studies demonstrated the power of smFRET to investigate in real time the structural dynamics of a complex system such as the HJ.

More recently, smFRET was applied to migratable HJs to investigate the branch migration mechanism (Karymov et al. 2005). It was found that branch migration takes place in a stepwise fashion with the overall kinetics depending on the concentration of Mg2+ ions. In contrast to early models suggesting a parallel orientation of exchanging strands (Sigal and Alberts 1972), these smFRET studies demonstrated that migratable HJs dynamically fluctuate between a conformation favourable to migration (open conformation) and a folded conformer where migration is hindered. Importantly, it was confirmed that Mg2+ induces folding of the HJ in one of the folded conformations, thus terminating the branch migration phase. Additional studies also revealed that stepwise switching between migration and folding phases can be affected by irregularities (i.e. mismatches, nicked HJs) (Palets et al. 2010) and how certain sequences can modulate the junction dynamics and thus influence the branch migration rate (Karymov et al. 2008).

Lastly, the presence of four different arms in the Holliday junction makes it an attractive scaffold for the study of nucleic acid dynamics using multicolour FRET techniques. Four-colour FRET measurements on surface-immobilised HJ molecules were first reported by Lee et al. (2010). Here, each dye was attached to one of the four helices that make up the Holliday junction. The excitation of four different dyes allowed the observation of six intramolecular distances, one for each pair of dyes, and their relative changes within a single nucleic acid structure, thus paving the way for the analysis of more sophisticated nucleic acid systems and protein–DNA/protein–RNA complexes.
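The count of six distances follows from the number of distinct pairs that can be drawn from four dyes. A short Python sketch makes this explicit; the dye labels are placeholders introduced here, not the fluorophores used by Lee et al. (2010).

```python
from itertools import combinations


def dye_pairs(labels):
    """Return every unordered pair of dye labels, one per inter-dye distance."""
    return list(combinations(labels, 2))


# Placeholder labels for the dyes on the four helical arms.
dyes = ["dye_1", "dye_2", "dye_3", "dye_4"]
pairs = dye_pairs(dyes)

for a, b in pairs:
    print(f"{a} <-> {b}")
print("number of inter-dye distances:", len(pairs))
```

The same function shows why multicolour schemes scale quickly: three dyes give three distances, whereas four give six.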

5 Single-Molecule Analysis of Catalytic RNA

For many years, RNA was considered as a rather passive carrier of the genetic information to the ribosome for the production of proteins. However, this widely accepted notion changed when Thomas Cech (Kruger et al. 1982) and Sidney Altman (Guerrier-Takada et al. 1983) reported that certain RNA sequences exhibit catalytic properties similar to those previously thought to be exclusive of the protein world. The discovery of these ribonucleic acid enzymes, so-called ribozymes, fuelled the ‘RNA world hypothesis’. If RNA molecules can carry genetic information and also perform catalysis, then protein-based processes could have evolved from this into more modern biological systems. Thus, it is possible that RNA catalysis could have played a critical role for the development of early life on Earth. Naturally occurring ribozymes are normally classified according to their size. Large ribozymes (>300 nt) include the self-splicing group I and group II introns (Lilley 2005; Tremblay et al. 2009) and RNAse P (Kazantsev and Pace 2006). Another mechanistically distinct class of ribozymes with a much smaller size includes the hammerhead, hairpin, hepatitis delta virus (HDV) and Varkud satellite (VS) ribozymes (reviewed in Lilley 2005). All these small ribozymes perform a self-cleaving transesterification reaction, generating a hydroxyl and cyclic phosphate termini that can be used to perform the reverse reaction. In the last few years, two additional ribozymes have been added to this group: the glmS ribozyme (Ferré-D’Amaré 2010), which also acts as a regulatory element, and the twister ribozyme (Roth et al. 2014). A common feature shared by all these catalytic RNAs is the requirement to form a very specific 3-D structure to support catalysis (Lilley 2005). 
This process, known as RNA folding, involves a very complex interplay between the self-recognition capabilities of RNA molecules and the effect of diffuse mono- and divalent metal ions shielding the electrostatic repulsion of the negatively charged phosphate groups. Thus, the catalytic activity of a particular ribozyme is strongly linked to its folding state and dynamics, which can be exceptionally well explored using smFRET techniques (Zhuang et al. 2002). Several excellent reviews on single-molecule ribozyme analysis have been published recently (Cochrane and Strobel 2008; Ditzler et al. 2007; Zhuang 2005). Moreover, it is important to note that although DNA does not appear to perform catalysis in nature, single-stranded DNA can be engineered to perform metal-dependent catalysis in a manner similar to metalloproteins, as has been shown at single-molecule level for the 8-17 DNAzyme (Kim et al. 2007a, b). Here, we will focus on a detailed description of the hairpin ribozyme folding and function because it is the most studied ribozyme at single-molecule level and exemplifies how the application of smFRET has helped to unravel the folding and catalytic steps of a small catalytic RNA. Recently, smFRET has also been applied to investigate the folding of the VS (Fig. 5c) (Pereira et al. 2008) and the hammerhead ribozymes (Fig. 5d) (McDowell et al. 2010).

5.1 Hairpin Ribozyme

Hairpin ribozymes are small catalytic RNA motifs found in the satellite RNAs of arabis mosaic virus (arMV), tobacco ringspot virus (TRsV) and chicory yellow mottle virus (CYMoV) that catalyse the self-cleavage and ligation of their own backbone (Bajaj and Hammann 2014). The hairpin ribozyme consists of a four-way RNA junction with two internal loops (loop A and loop B) on adjacent helices (Fig. 5e). The formation of the active site strongly depends on the docking interaction between these two loops, which induces a 10^5-fold increase in the cleavage rate. A minimal form lacking the two helices with no loops still has catalytic activity, although the transition between open and folded state requires much higher concentrations of Mg2+ (Tan et al. 2003). The folding and catalytic activities of both forms of the hairpin ribozyme have been extensively characterised using biochemical assays and also at single-molecule level. Here, the power of the technique to uncover short-lived transient states, minor subpopulations and molecular heterogeneity was exploited to provide a complete picture of the reaction pathway. By labelling the termini of the two stems carrying the interacting loops, three distinct FRET populations were found in the minimal form of the hairpin ribozyme using either a TIR or a scanning confocal microscope (Zhuang et al. 2002). These populations were assigned to the catalytically inactive undocked state, the docked conformation where both loops are brought into close contact, and an intermediate state where the substrate is not bound. As previously suggested by biochemical studies, cleavage was only observed in the docked state. Interestingly, for such a simple RNA enzyme, analysis of the rates for loop–loop docking and undocking revealed a very complex dynamic behaviour with a single docking rate (~0.008 s−1) but up to four undocking rates with values ranging from 0.005 s−1 to 3 s−1, leading to complex cleavage kinetics. Surprisingly, it was found that individual ribozyme molecules tend to remain in the docked state for a similar time on repeated docking events, providing some of the first evidence for ‘memory effects’ in individual molecules (Zhuang et al. 2002). It was reasoned that the presence of a broad range of undocking rates and the observed memory effects could arise from slightly different configurations of loops A and B in the docked state that interconvert only slowly. This explanation was further supported by NMR studies, which showed evidence for metastable conformations of loops A and B (Butcher et al. 1999).

In a later study, the folding and function of the natural form of the hairpin ribozyme containing a four-way RNA junction were compared with those of the minimal form using smFRET (Tan et al. 2003). In the absence of the interacting loops, the four-way junction exhibited Mg2+-dependent interconversion dynamics between a distal (UD) and a proximal state (UP) that corresponds to the antiparallel conformation of the junction. Importantly, these studies on the natural form of the hairpin ribozyme demonstrated that the ribozyme inherits the intrinsic dynamics of the four-way junction and rapidly fluctuates between the UD and UP states, and confirmed that the UP state is an obligatory intermediate in the folding pathway of the ribozyme towards the catalytically active form. These studies were critical to confirm that the dynamic heterogeneity of the hairpin ribozyme is linked to the docking process between the two loops, as the four-way junction lacking the loops exhibited a much lower level of heterogeneity. Thus, the intrinsic dynamics of the four-way junction acts as a ‘folding enhancer’ to promote the active encounter between both loops, allowing an increase of three orders of magnitude in the folding rate of the natural ribozyme. Similar ‘folding enhancers’ have also been observed for other small catalytic RNAs such as the hammerhead ribozyme, in which the interaction between the two loops at stems P1 and P2 was shown to be critical to support catalysis at physiological concentrations of Mg2+ ions (Penedo et al. 2004).

In two different studies, the equilibrium between cleavage and ligation was investigated using ribozyme constructs where the 2′-deoxynucleotide placed at the −1 position of the active site to prevent cleavage was replaced by the wild-type nucleotide (Tan et al. 2003). The two catalytically active hairpin structures were engineered in such a way that cleavage would lead to products of different lengths and therefore stability (Wilson et al. 2007). In both studies, the authors took advantage of the different dynamics exhibited by the intact ribozyme and the four-way junction resulting from cleavage to monitor the catalytic step. Natural ribozyme molecules were labelled with the FRET pair at the end of the arms A and B that carry the loops. Real-time addition of Mg2+ induced loop–loop docking characterised by a high and stable FRET value. Subsequent cleavage and rapid release into solution of a short 3 bp product was monitored by the appearance of rapid transitions between high- and low-FRET states typical of the four-way junction (Tan et al. 2003). Statistical analysis of a series of cleavage reactions gave a cleavage rate of ~1 min−1. To be able to record multiple cycles of cleavage and ligation, the length of the A arm was increased to 7 bp to favour the retention of the cleavage product, which can therefore serve as substrate for ligation (Nahas et al. 2004). Upon addition of Mg2+ ions, these ribozymes exhibited cyclic switching between two dynamic regimes: a stable high-FRET docked state (ligated form) and a second regime with fast fluctuations between high- and low-FRET states corresponding to the cleaved form. From these experiments it was determined that the cleavage/ligation equilibrium is significantly biased towards ligation (K = kL/kc = 34, where kL and kc are the ligation and cleavage rates). 
In the natural context of the (−) strand of the tobacco ringspot virus satellite RNA, such shift towards ligation might be important to maintain the integrity of the circular (−) strand whilst serving as a template for the synthesis of the (+) strand (Wilson et al. 2005).
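The size of the bias implied by K = kL/kc = 34 can be made concrete by converting the equilibrium constant into populations. The Python sketch below assumes a simple two-state equilibrium between the ligated form and the cleaved form with its product retained; the function name is an illustrative choice, not taken from the cited studies.

```python
# Illustrative sketch: two-state cleavage/ligation equilibrium with K = kL / kc.


def fraction_ligated(k_eq):
    """Equilibrium fraction of molecules in the ligated state for K = kL / kc."""
    if k_eq < 0:
        raise ValueError("equilibrium constant must be non-negative")
    return k_eq / (1.0 + k_eq)


K = 34.0  # value quoted in the text
ligated = fraction_ligated(K)
print(f"ligated: {ligated:.3f}, cleaved: {1.0 - ligated:.3f}")
```

Under this assumption, about 97% of the molecules are in the ligated form at equilibrium and only about 3% are cleaved.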

In another smFRET study, Okumus et al. (2004) introduced the use of lipid vesicles to investigate whether the dynamic heterogeneity observed for the hairpin ribozyme could be induced by non-specific interactions with the surface in which they were immobilised. The results using vesicle encapsulation were identical to those reported for ribozymes immobilised using biotin–streptavidin, thus confirming that the observed heterogeneity is intrinsic to the RNA.

6 Single-Molecule Analysis of Regulatory RNA

For a primitive RNA world to be viable, in addition to catalysis, RNA should also be able to perform regulatory functions. A decade ago, a second RNA revolution confirming this aspect took place when it was found that indeed certain non-coding messenger RNA sequences (mRNA) were able to use feedback regulatory mechanisms to control gene expression without the requirement of any helper protein (Mironov et al. 2002; Nahvi et al. 2002; Winkler et al. 2002a, b). These regulatory RNAs, so-called riboswitches, are found in the 5′-untranslated region (5′-UTR) of mRNA in bacteria and sense small metabolites to regulate the expression of genes normally associated to the transport, biosynthesis or degradation of the cognate metabolite (Blouin et al. 2009a, b). Riboswitches are composed of two domains: an aptamer domain and an expression platform. The aptamer is the most conserved part through evolution as it is involved in ligand detection, whilst the expression platform can vary in sequence and structure. The majority of aptamer domains investigated have revealed a tight association to the cognate ligand, where almost every functional group of the ligand interacts with a specific region of the RNA aptamer to ensure a very high ligand-binding specificity. Importantly, formation of the RNA–ligand complex has a direct influence on the structure of the downstream expression platform, which is used to regulate gene expression by folding into various secondary motifs that are mutually exclusive (Blouin et al. 2009a, b; Winkler and Breaker 2003).

To date, more than 20 classes of riboswitches have been reported for different types of ligands, including adenine (Mandal and Breaker 2004), guanine (Batey et al. 2004), lysine (Grundy et al. 2003), flavin mononucleotide (FMN) (Winkler et al. 2002a, b), adenosylcobalamin (AdCo) (Nahvi et al. 2002), thiamine pyrophosphate (TPP) (Winkler et al. 2002a, b), glycine (Mandal et al. 2004), S-adenosylmethionine (SAM) (Winkler et al. 2003), 9-glucosamine-6-phosphate (glmS) (Jansen et al. 2006) and pre-queuosine1 (Pre-Q1) (Roth et al. 2007). Riboswitches can regulate gene expression by different mechanisms such as modulating the formation of Rho-independent transcriptional terminators, controlling mRNA splicing and mRNA stability or sequestering the Shine–Dalgarno sequences required for translation initiation (Bastet et al. 2011; Lemay et al. 2011). Riboswitches have attracted considerable interest as targets for antibacterial drug development (Blount and Breaker 2006; Deigan and Ferré-D’Amaré 2011; Mulhbacher et al. 2010; Winkler and Breaker 2005). Several recent reviews have summarised the architectures and regulatory pathways for many riboswitch structures (Montange and Batey 2008; Serganov and Nudler 2013).

Since the first smFRET study reporting the folding pathway of the adenine riboswitch aptamer domain (Lemay et al. 2006), the application of these techniques to other metabolite-sensing mRNAs has experienced a stunning growth. smFRET studies have now been reported for the aptamer domains of S-adenosylmethionine (SAM-I) (Eschbach et al. 2012; Heppell et al. 2011), SAM-II (Haller et al. 2011a, b), Pre-Q1 (Suddala et al. 2013), c-di-GMP (Wood et al. 2012), lysine (Fiegland et al. 2012) and thiamine pyrophosphate (TPP) (Haller et al. 2013) riboswitches (Fig. 5f–k). The majority of these studies have focused on elucidating how the riboswitch harnesses the interplay between RNA folding and ligand recognition to control the gene regulation process. The emerging picture arising from the application of these fluorescence techniques to investigate riboswitch dynamics, both at ensemble and single-molecule level, has recently been the focus of several excellent reviews (Haller et al. 2011a, b; Heppell et al. 2009; Karunatilaka and Rueda 2009; Lemay et al. 2009; McCluskey et al. 2014; Savinov et al. 2014; St-Pierre et al. 2014). In this chapter, we will use the aptamer domain of the adenine-sensing riboswitch as a model system to describe how smFRET techniques can be used to unravel the complex dynamic behaviour of these regulatory mRNAs.

6.1 SmFRET Characterisation of Purine-Sensing Riboswitches

Adenine- and guanine-sensing aptamer domains share a common junction architecture comprising three helical domains (P1–P2–P3) (Serganov et al. 2004). Loops positioned at the end of stems P2 and P3 have been shown by X-ray crystallography and biochemical methods to be crucial for the function of the aptamer domain (Allnér et al. 2013; Delfosse et al. 2010; Serganov et al. 2004). Importantly, these crystal structures have shown a very compact ligand-bound state held together by tertiary interactions, with the ligand directly involved in a Watson–Crick base pair with the uracil nucleotide at position 65. The importance of this nucleotide in ligand specificity has been confirmed by replacing U65 with C, which converts the aptamer domain into a guanine-responsive motif (Lemay and Lafontaine 2007; Mandal and Breaker 2004; Mulhbacher and Lafontaine 2007; Tremblay et al. 2011).

The formation of the loop–loop interaction in the aptamer domain of the adenine riboswitch was studied by smFRET using a surface immobilisation approach (Lemay et al. 2006). Here, using the information provided by the crystal structure, donor and acceptor FRET labels were placed in the loops of P2 and P3 at locations that did not alter the ability of the aptamer to fold and bind the ligand. To perform this orthogonal labelling, the aptamer sequence was divided into two separate strands, each of them individually labelled with either the donor or the acceptor and subsequently ligated using T4 RNA ligase (Lemay et al. 2006, 2009). The folding of the aptamer domain was analysed as a function of the concentration of Mg2+ ions at 50 ms time resolution. The predominant low-FRET (E app ~ 0.25) population in the absence of ions (unfolded state, U) shifted to a very high-FRET (E app ~ 0.9) population (docked state, D) at saturating concentrations of Mg2+ ions (>5 mM Mg2+). This population was assigned to the formation of the loop–loop interaction between stems P2 and P3, which brings the FRET pair into close proximity and results in a high FRET efficiency. Strikingly, using a higher time resolution (16 ms EMCCD integration time), an intermediate state (I) with a FRET value between those of the U and D states was detected at subsaturating concentrations of Mg2+ ions. Analysis of >800 traces confirmed that most aptamer molecules transit from the U to the D state through this intermediate conformation. Although the exact nature of this I state has not yet been determined, it has been suggested that it may result from the stacking of the P1 and P3 helices as observed in the crystal structure (Gilbert and Batey 2006). Information regarding the role of the ligand in the folding process was also obtained from these smFRET studies. In the absence of ligand, the folding and unfolding rates showed a high degree of heterogeneity (~100-fold), reminiscent of that observed for the hairpin ribozyme (Tan et al. 2003). However, upon addition of the ligand, the dynamic heterogeneity for both processes, U → D and D → U, decreased significantly. A detailed analysis of the interconversion rates revealed a ~2-fold acceleration of the folding rate, suggesting that the ligand actively participates in the formation of the loop–loop interaction and in the folding process of the core of the riboswitch aptamer domain.
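To give a feel for what the shift from E app ~ 0.25 to E app ~ 0.9 means in terms of distance, the ideal Förster relation E = 1/(1 + (r/R0)^6) can be inverted to give r/R0. The Python sketch below is an idealised illustration only: it treats the apparent efficiencies as true FRET efficiencies and assumes a single fixed R0, neither of which is established in the text, and the function name is an assumption made here.

```python
# Illustrative sketch: inter-dye distance in units of R0 from a FRET efficiency,
# using the ideal Förster relation E = 1 / (1 + (r / R0)**6).


def relative_distance(efficiency):
    """Return r / R0 for a FRET efficiency strictly between 0 and 1."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


r_unfolded = relative_distance(0.25)  # U state, E app ~ 0.25
r_docked = relative_distance(0.90)    # D state, E app ~ 0.9

print(f"U state: r = {r_unfolded:.2f} R0")
print(f"D state: r = {r_docked:.2f} R0")
print(f"distance ratio U/D = {r_unfolded / r_docked:.2f}")
```

Under these idealised assumptions, docking brings the dyes from about 1.2 R0 to about 0.7 R0, a roughly 1.7-fold reduction in separation.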

6.2 SmFRET Combined with Chemical Denaturation

Although the use of chemical denaturants such as guanidinium chloride and urea is a common method in smFRET studies of protein folding (Schuler and Hofmann 2013), only a handful of single-molecule studies have reported their application to investigate RNA folding processes (Bokinsky et al. 2003; Holmstrom and Nesbitt 2014; Karunatilaka and Rueda 2009). This is a surprising difference if we take into account that the applicability of urea to provide new insights into the thermodynamics of RNA folding and the nature of the rate-limiting step has already been demonstrated at the ensemble level (Sosnick and Pan 2003; Treiber et al. 1998). A recent study used the aptamer domain of the adenine riboswitch to investigate the folding and ligand recognition mechanisms using the competing interplay between folding agents (i.e. Mg2+, ligand) and unfolding agents such as urea (Dalgarno et al. 2013). The authors separately analysed the influence of increasing concentrations of urea on the docking and undocking rates of surface-immobilised adenine aptamers at saturating (~2 mM) and subsaturating (~100 μM) concentrations of divalent metal ions in the presence and absence of adenine ligand.

From the analysis of the undocking rates, with and without ligand added, as a function of urea concentration, it was possible to (1) quantify the degree of aptamer stabilisation due to ligand binding and (2) differentiate ligand-bound docked states (DLB) from ligand-free docked states (DLF) even when both states exhibit identical values of FRET efficiency (E app ~ 0.9) (Dalgarno et al. 2013). The ability to differentiate these states was based on the fact that the formation of the aptamer–ligand complex strongly protects the DLB state against urea-induced denaturation. It was found that, in the presence of 5 M urea, ligand-bound aptamers exhibited on average a ~50-fold longer lifetime (k undock ~ 0.045 ± 0.003 s−1) than ligand-free states (k undock ~ 2.1 ± 0.1 s−1). This ~50-fold stabilisation of the docked state due to ligand binding agrees well with values reported by single-molecule force manipulation (Greenleaf et al. 2008). At lower concentrations of urea, the smFRET trajectories displayed a combination of fast and slow undocking events from an identical FRET value (E app ~ 0.9) to an E app ~ 0.3 corresponding to the undocked state. This work provided the first experimental evidence that denaturants can be used in single-molecule FRET studies to differentiate states that are otherwise identical in terms of FRET efficiency, purely from their different stability against urea-induced denaturation.
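
The fold-stabilisation quoted above follows directly from the two undocking rates, as the short calculation below illustrates. The rate constants are those given in the text; the temperature (298.15 K) and the conversion of the rate ratio into a difference in undocking barrier height, which assumes the same pre-exponential factor for both states, are assumptions made here for illustration and are not taken from the cited study.

```python
import math

# Undocking rates at 5 M urea, as quoted in the text (Dalgarno et al. 2013).
K_UNDOCK_BOUND = 0.045  # s^-1, ligand-bound docked state (DLB)
K_UNDOCK_FREE = 2.1     # s^-1, ligand-free docked state (DLF)

R_KCAL = 1.987204e-3    # gas constant, kcal mol^-1 K^-1
TEMPERATURE_K = 298.15  # ASSUMED temperature, not stated in the text


def lifetime(rate):
    """Mean lifetime (s) of a state that decays with first-order rate `rate`."""
    return 1.0 / rate


def fold_stabilisation(k_free, k_bound):
    """Ratio of docked-state lifetimes, ligand-bound over ligand-free."""
    return k_free / k_bound


def barrier_difference_kcal(k_free, k_bound, temperature=TEMPERATURE_K):
    """Difference in undocking barrier, RT ln(k_free / k_bound), in kcal/mol.

    Assumes an identical pre-exponential factor for both states.
    """
    return R_KCAL * temperature * math.log(k_free / k_bound)


if __name__ == "__main__":
    print(f"lifetime, ligand-bound: {lifetime(K_UNDOCK_BOUND):.1f} s")
    print(f"lifetime, ligand-free:  {lifetime(K_UNDOCK_FREE):.2f} s")
    print(f"fold stabilisation:     {fold_stabilisation(K_UNDOCK_FREE, K_UNDOCK_BOUND):.0f}")
    print(f"barrier difference:     {barrier_difference_kcal(K_UNDOCK_FREE, K_UNDOCK_BOUND):.1f} kcal/mol")
```

With these inputs the lifetimes are roughly 22 s and 0.5 s, a ratio of about 47, which is consistent with the ~50-fold figure once the quoted uncertainties are taken into account.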

Similarly, the analysis of the docking rates in the presence of urea, with and without adenine ligand, provided additional insights into the fine mechanistic details of the aptamer–ligand interaction. It was found that increasing concentrations of urea decreased the docking rate (k dock) of the aptamer in the absence of ligand, and this was taken as evidence for a trap-free rate-limiting step in the folding of the ligand-free aptamer. Moreover, the relative decrease in the folding rate in the presence of urea (~2- to 3-fold) was independent of the concentration of Mg2+ ions (Dalgarno et al. 2013). In contrast, in the presence of ligand, the influence of urea on the docking rate was strongly dependent on the concentration of Mg2+ ions and varied from 2-fold at saturating Mg2+ ions (>2 mM) to ~8-fold at concentrations <100 μM. These results were taken as evidence for a ligand-induced switching in the rate-limiting step for aptamer folding. At saturating concentrations of Mg2+ ions, the rate-limiting step for ligand-free and ligand-bound aptamers is similar and involves the formation of native tertiary contacts between both loops. This is similar to the rate-limiting step proposed for the docking process of the hairpin ribozyme (Bokinsky et al. 2003). In contrast, ligand binding to the aptamer at subsaturating concentrations of Mg2+ ions, in which the loop–loop interaction is not efficiently stabilised, requires the interaction with specifically trapped Mg2+ ions to further progress to the native state. Such a requirement for specific positioning of divalent metal ions along the aptamer structure agrees with the observation of up to five trapped Mg2+ ions in the crystal structures of the ligand-bound aptamer domain (Serganov et al. 2004).
In summary, this study exemplifies how balancing the competing effects of folding and unfolding agents constitutes a powerful approach to manipulate the RNA folding landscape and to uncover mechanistic details of RNA folding and RNA–ligand interactions that would otherwise remain hidden.

7 In Situ Generated Fluorescent RNA Aptamers for Live Cell Diagnostics

Fluorescence imaging in living cells has been mostly confined to the use of fluorescent proteins such as the green fluorescent protein (GFP) and its derivatives (Chalfie et al. 1994; Lippincott-Schwartz and Patterson 2003). GFP contains a small chromophore, 4-hydroxybenzylidene imidazolinone (HBI) (Fig. 6c), which resides inside the barrel-like structure of the native protein; this environment reduces the contribution of non-radiative processes and increases the overall brightness. However, in 2011, the Jaffrey lab (Paige et al. 2011) reported an RNA sequence, the Spinach aptamer, which mimics the fluorescent properties of GFP, thus opening new avenues for imaging tagged RNAs in living cells. The 98-nt Spinach aptamer (Fig. 6a) senses 3,5-difluoro-4-hydroxybenzylidene imidazolinone (DFHBI) (Fig. 6e), an analogue of the GFP fluorophore that exists as a nonfluorescent species in free solution and switches to an emissive form upon binding to the RNA aptamer.

Fig. 6

Schematics of the RNA Spinach aptamers as well as the HBI fluorophore and two of its derivatives. (a) Sequence of the Spinach RNA, an aptamer that senses an analogue, DMHBI or DFHBI, of the GFP fluorophore. (b) Sequence of the Spinach 2 aptamer, which has been designed to improve the thermostability and the folding efficiency. (c) GFP chromophore, HBI, formed after an autocatalytic cyclization and oxidation of the three-residue sequence Ser65–Tyr66–Gly67. This compound exists as a nonfluorescent species when free in solution and switches to an emissive form upon binding to the Spinach RNA. (d) DMHBI, derivative of the GFP chromophore. (e) DFHBI, derivative of HBI with improved emission. The fluorines reduce the pK a of the compound, and therefore, it is present only in its phenolate (anionic) form at neutral pH

The Spinach RNA aptamer was identified by systematic evolution of ligands by exponential enrichment (SELEX) (Ellington and Szostak 1990) using a library of 5 × 10^13 RNA molecules and searching for RNA sequences initially optimised to bind 3,5-dimethoxy-4-hydroxybenzylidene imidazolinone (DMHBI) (Fig. 6d). The aptamer–DMHBI complex showed an emission peak at 529 nm and an excitation maximum at 398 nm. The brightness of the complex was 12 % relative to GFP and the dissociation constant was 464 nM (Paige et al. 2011). In order to generate RNA–ligand complexes that mimic enhanced GFP (EGFP) rather than GFP, RNA sequences were optimised to bind DFHBI, which is exclusively in the phenolate form of a GFP-like fluorophore because the fluorine substituents reduce the pK a. The RNA–DFHBI complex (Spinach aptamer) showed a quantum yield of 0.72, which is 20 % higher than that of EGFP (Paige et al. 2011). The ability of the Spinach aptamer to be used as an in situ RNA tagging method was demonstrated by fusing it to the 3′ end of 5S, a small non-coding RNA that associates with the large ribosomal subunit, and transfecting it into human embryonic kidney (HEK) 293T cells. The 5S–Spinach complex was detected with a distribution similar to that of endogenous 5S, thus confirming its applicability for in vivo imaging of RNA sequences. Interestingly, in vitro and in vivo measurements of Spinach–DFHBI complexes showed that its brightness is only 80 % of that of GFP and 53 % of that of EGFP (Strack et al. 2013). Later studies confirmed that the RNA–DFHBI complex has a propensity to misfold in live cells, and a second version of the RNA sequence (Spinach 2) was engineered with improved thermostability and folding efficiency, particularly when fused to other RNA sequences (Fig. 6b) (Strack et al. 2013).

A recent single-molecule study on the Spinach aptamer has shed some light on the photophysical properties of the RNA–ligand complex. This study explains why its fluorescence under cellular and high-resolution imaging conditions is lower than would be expected based on its brightness (Han et al. 2013). Using surface-immobilised Spinach aptamers and a 5 μM concentration of DFHBI, it was found that the fluorescence of the Spinach aptamer rapidly decays to ~5 % of its initial value within 2 s. However, in contrast to GFP, which is known to undergo photoconversion to long-lived dark states and/or irreversible photobleaching (Dickson et al. 1997; Patterson et al. 1997), the emission of the Spinach aptamer spontaneously recovered to ~95 % of the initial value following a period without light. This suggested that the fast loss of emission is a light-induced process, most likely involving its reversible conversion to a nonfluorescent state. The reversibility of this process was independent of the excitation power, but it depended strongly on the concentration of DFHBI. The fluorescence recovery rate increased from 0.08 s−1 to 2 s−1 as the concentration of DFHBI varied from 1 μM to 200 μM. Insights into the molecular basis for the observed reversible transition of DFHBI to a dark state were obtained by monitoring the variations in the absorption spectrum following illumination at 405 nm. Under these conditions, a decrease in the absorption spectrum and a red shift were observed. These features are similar to those involved in the photoisomerization process of photoswitchable proteins (Andresen et al. 2007) and are spectroscopic hallmarks of molecular changes in the configuration of the chromophore. Based on this evidence, it was proposed that DFHBI might undergo a light-induced cis/trans isomerization process leading to a dark state, similar to that observed in other fluorescent probes (e.g. cyanine dyes) (Dempsey et al. 2009; Weiss 1999).
Although further studies are needed, a model was proposed to explain the behaviour of the ‘green RNA’ complex. According to this model, under illumination, the bound DFHBI dye undergoes a photoisomerization step that results in fast dissociation and subsequent binding of a new DFHBI molecule that restores the fluorescence. Although this model provides an explanation for the observed light-induced fluorescence decay of DFHBI, it is not possible at this stage to completely rule out that this effect may also relate, at least partially, to the previously mentioned propensity of the Spinach RNA sequence to misfold. Additional information on the photophysics of the Spinach 2 aptamer will help to clarify this aspect.
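
To give a feeling for the timescales implied by the recovery rates quoted above, the following sketch converts them into recovery half-times. The two rate constants and the ~5 % and ~95 % levels are taken from the text; the single-exponential form of the dark recovery is an assumption made here for illustration only, and the function names are hypothetical.

```python
import math

# Fluorescence recovery rates quoted in the text (Han et al. 2013),
# keyed by DFHBI concentration in micromolar.
RECOVERY_RATES = {1.0: 0.08, 200.0: 2.0}  # s^-1


def half_time(rate):
    """Half-time (s) of a first-order process with rate constant `rate`."""
    return math.log(2) / rate


def recovered_fraction(t, rate, start=0.05, plateau=0.95):
    """Fluorescence (fraction of the initial value) after `t` s in the dark.

    ASSUMES a single-exponential recovery from `start` (~5 % after
    illumination) towards `plateau` (~95 % of the initial value).
    """
    return plateau - (plateau - start) * math.exp(-rate * t)


if __name__ == "__main__":
    for conc, rate in RECOVERY_RATES.items():
        print(f"{conc:6.1f} uM DFHBI: half-time {half_time(rate):.2f} s")
```

Under this assumption the half-times are roughly 8.7 s at 1 μM and 0.35 s at 200 μM DFHBI. The 25-fold increase in rate over a 200-fold increase in concentration is less than proportional.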

Regardless of the exact functional mechanism of these ‘green RNAs’, it is clear that they fill a current gap in live cell imaging with far-reaching applications. For instance, the use of the Spinach aptamer in fusions with other small-molecule-sensing RNA aptamers to detect intracellular levels of adenosine 5′-diphosphate (ADP) and S-adenosylmethionine (SAM) in Escherichia coli has recently been reported (Paige et al. 2012). Because RNA aptamers can be engineered relatively easily against a broad variety of biomolecules (Cho et al. 2009), fused Spinach RNA constructs should enable the imaging of essentially any small-molecule target in living cells.

8 From Sensing to Manipulating Nucleic Acids: Single-Molecule Hybrid Technologies

The development of single-molecule techniques to investigate biomolecular processes has traditionally followed two almost independent routes: (1) single-molecule mechanical manipulation and (2) single-molecule fluorescence detection (Monico et al. 2013). In recent years, it has become clear that the combination of both into hybrid techniques will provide a much more powerful approach to investigate biological systems, making it possible not only to detect the nucleic acid sequence (fluorescence) but also to disrupt it (mechanical manipulation). In the context of nucleic acid diagnostics, the combination of single-molecule fluorescence with force-based manipulation methods using either atomic force microscopy (AFM) or optical (OT)/magnetic (MT) traps represents an emerging field that has already delivered insights into nucleic acid function with an unprecedented level of detail. It is beyond the scope of this chapter to describe in detail the technical aspects of combining both single-molecule fields, and the reader is referred to recent reports by the Block and Ha labs (Fazal and Block 2011; Ha 2014).

Early attempts in the development of these hybrid techniques combined optical traps and smFRET to investigate the hybridization and mechanical properties of double-stranded DNA (Lang et al. 2003). In order to separate the two strands, the optical trap applies piconewton forces, and the rupture event can be followed by the loss of FRET signal between two dyes positioned in each strand. A later study from Hohng et al. (2007) provided a tool to observe conformational changes in nucleic acids with the use of subpiconewton forces. The hybrid technique that combines confocal and force spectroscopy was applied to investigate the effect of weak forces on the conformational landscape of immobilised Holliday junctions (HJ). By merging smFRET and OT, it was possible to gently stretch the Holliday junction along different directions and monitor the influence of this mechanical manipulation along specific distance vectors using smFRET. By probing the dynamics of the HJ in response to stretching forces oriented along different directions, it was possible to map the structure and location of the transition states along the HJ conformational landscape. This work clearly demonstrated that unlike DNA or RNA hairpins, where high forces (~15 pN) are required to induce mechanical distortions, much lower forces are needed to probe by smFRET the conformational changes and the accompanying energy barriers in a variety of biologically relevant nucleic acid structures, on their own and in complex interactions with small molecules and proteins.

9 Future Perspectives

In the last decade, it has become clear that the application of single-molecule techniques to investigate the structure and function of nucleic acids provides information at an unprecedented level of detail. Many of these single-molecule diagnostic methods have now reached a mature state, and as a result, commercial equipment is now available in many biophysics and molecular biology laboratories. However, the field still faces many challenges, and further technical developments are needed in many areas. Additional methods are needed to overcome the concentration limit barrier so that weaker interactions can be measured without the costly and time-consuming fabrication of zero-mode waveguides. By opening the field to micromolar concentrations of fluorescent species, aptamer-based fluorescence sensing and high-throughput screening will become more accessible. Another important area in the context of nucleic acid research is the application of single-molecule fluorescence technologies to investigate co-transcriptional processes. Although this area has been explored using mechanical-based sensing of the nascent RNA, the application of smFRET methods is still clearly underdeveloped. Such techniques will be extremely valuable to understand the folding and function of non-coding regulatory mRNAs in a context closer to that found in vivo.