Introduction

One of the central issues in the life sciences is determining the structure–function relationship of proteins. Structural biology, which relies on crystal X-ray diffraction, electron microscopy (EM), and nuclear magnetic resonance (NMR), has been successful in revealing the ensemble-averaged static structure of proteins in detail. The number of solved structures is already enormous, and the number of proteins whose structure remains unsolved is steadily decreasing. Recent advances in cryo-EM (Wang et al. 2015) and NMR are accelerating this trend. Thus, the focus of the life sciences is shifting from solving the static structures of more proteins to understanding and using the acquired structural information. However, what we can learn from ensemble-averaged static structures may be limited, with the exception of detailed local structures of protein–protein and protein–ligand interfaces. In contrast to the static picture given by these structures, proteins are dynamic in nature. The molecules undergo structural fluctuations and transitions, bind to and dissociate from interaction partners, and traverse a range of energy and chemical states while functioning.

Single-molecule biophysics was developed three decades ago to overcome the limitations of ensemble-averaging methods, such as the above-mentioned structural biology techniques, fiber X-ray diffraction, various spectroscopy techniques, sedimentation analysis, and others. The dynamic action of many proteins has been observed by single-molecule techniques (fluorescence microscopy, optical spectroscopy, and optical and magnetic tweezers), which has enriched our understanding of how proteins function. Nevertheless, all of these techniques rely on optical detection and hence are limited to detecting the dynamic behavior of optical markers attached to the molecules. Although super-resolution fluorescence microscopy, which breaks the diffraction limit of light, has recently been added to the growing list of tools (Eggeling et al. 2015), even this new microscopy cannot eliminate this limitation.

High-speed atomic force microscopy (HS-AFM) was developed with the aim of breaking the limitations of structural biology and light-based single-molecule biophysics and thereby enabling the direct visualization of single protein molecules in action at high spatiotemporal resolution. Here, I first provide a brief history of HS-AFM, then describe its current state, and finally list some future HS-AFM-related techniques that will open a new horizon of protein studies.

Brief history of HS-AFM

Atomic force microscopy was invented to visualize atoms on solid surfaces (Binnig et al. 1986). However, contrary to this initial motivation, researchers soon started using AFM to observe a variety of materials, including biological specimens in a liquid environment. The use of AFM was promoted by the mass production of micro-fabricated cantilevers and the development of the optical beam deflection (OBD) method to detect cantilever deflection (Meyer and Amer 1988). Remarkably, only a few groups, represented primarily by the Hansma group, initially attempted to observe biomolecular processes to explore the potential of AFM; these processes included the clotting process by fibrin produced from fibrinogen (Drake et al. 1989), the assembly process of immunoglobulin G on a mica surface (Lin et al. 1990), the virus infection process in live cells (Häberle et al. 1992), and a few other targets. These imaging studies were performed by DC-mode, or contact-mode, AFM, where the deflection of the non-oscillating cantilever was measured to detect the tip–sample interaction, so that the sample tended to be dragged laterally by the cantilever tip. In 1993, the amplitude modulation (AM) mode was introduced to AFM, where the cantilever is oscillated in the Z-direction at (or near) its resonant frequency and its oscillation amplitude is measured to detect the tip–sample interaction (Zhong et al. 1993). This mode of imaging can minimize the tip-drag force and hence allows molecules weakly attached to a substrate surface to be imaged, thereby facilitating the observation of biomolecular processes. Using this mode, Hansma, Bustamante, and colleagues observed DNA-involved processes, such as DNA digestion by DNase (Bezanilla et al. 1994), the interaction of DNA with the repressor Cro protein (Erie et al. 1994), and the transcription process (Kasas et al. 1997). In these imaging experiments, it took at least 30 s to capture an image. Therefore, moving objects were imaged unclearly or could not be imaged at all. Other attempts were also made to achieve high-resolution imaging of membrane proteins embedded in membranes (led by the Engel group) (e.g., Schabert and Engel 1994) and to use AFM as a force sensor (led by the Gaub group) (e.g., Rief et al. 1997). Thus, the potential of AFM in biological research was widely explored during the first 10–12 years after the advent of AFM.

The speed performance of AFM was obviously too low to capture dynamic biomolecular processes. Attempts to increase the imaging rate of AFM were initiated around 1993 by the Hansma group and the Ando group independently. Some results from these studies gradually began to appear between 1996 and 2002. The resonant frequency, $f_c$, of conventional soft cantilevers for bio-imaging is about 1 kHz in water. Supposing that we can measure the pixel’s sample height at every cycle of cantilever oscillation, it takes a time $T = 2N^2/f_c$ to capture an image with $N \times N$ pixels (the prefactor “2” arises from the trace/retrace scanning in the X-direction). For $f_c$ = 1 kHz and $N$ = 100, $T$ = 20 s. Therefore, to increase the imaging rate of AFM, both groups developed small cantilevers with high resonant frequencies and small spring constants, as well as an OBD detector for small cantilevers (Viani et al. 1999; Ando et al. 2001). The Ando group additionally developed a fast scanner and a fast amplitude detector that could output the amplitude signal at every half cycle of cantilever oscillation (Ando et al. 2001). Using prototypic HS-AFM, the dynamic GroEL–GroES interaction and the diffusional motion of myosin V were imaged (Viani et al. 2000; Ando et al. 2001). However, the feedback control to keep the tip–sample interaction force constant was still so slow that fragile biological samples could not be imaged rapidly without being damaged. In the following years the Ando group developed other techniques: an active vibration damping technique for the Z-scanner based on a Q-control method (Kodera et al. 2005), a new feedback controller to make high-speed imaging compatible with low-invasive imaging (Kodera et al. 2006), a compensator for drift of cantilever excitation (Kodera et al. 2006), a low-noise/fast amplitude detector based on a Fourier method (Ando et al. 2008), and a fast phase detector (Uchihashi et al. 2006). Finally, in 2008 the HS-AFM technique was established (Ando et al. 2008).
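The imaging-time estimate $T = 2N^2/f_c$ can be checked numerically. The following is a minimal sketch under the same idealization as in the text (one pixel acquired per cantilever oscillation cycle, no feedback or scanner delay); the function names are illustrative and not taken from any HS-AFM software.

```python
def imaging_time_s(n_pixels: int, f_c_hz: float) -> float:
    """Time to capture an n x n image when one pixel is acquired per
    cantilever oscillation cycle; the factor 2 accounts for trace/retrace."""
    return 2 * n_pixels**2 / f_c_hz


def required_resonant_frequency_hz(n_pixels: int, frames_per_s: float) -> float:
    """Cantilever resonant frequency needed for a target frame rate
    under the same idealized assumption (hypothetical helper)."""
    return 2 * n_pixels**2 * frames_per_s


if __name__ == "__main__":
    # Conventional soft cantilever: about 1 kHz in water, 100 x 100 pixels.
    print(imaging_time_s(100, 1e3))
    # Resonant frequency needed for 10 frames per second at 100 x 100 pixels.
    print(required_resonant_frequency_hz(100, 10.0))
```

Because the pixel count enters quadratically, halving $N$ shortens the imaging time fourfold, whereas the imaging time scales only inversely with $f_c$. This is why a much higher resonant frequency was the first requirement for video-rate imaging.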

Current performance of HS-AFM system

The state-of-the-art HS-AFM that scans a small sample stage (typically, 2 mm in height and 2 mm in diameter) can image protein molecules at 10–20 fps, when the instrument is equipped with a small cantilever with resonant frequency of 0.8–1.2 MHz in water and spring constant of 0.1–0.2 N/m and a fast scanner with resonant frequency of 100–170 kHz in the Z-direction. Although the spatial resolution of HS-AFM largely depends on the cantilever tip, it is typically 2–3 nm in the lateral direction (sub-molecular resolution for proteins) and ~ 0.15 nm in the vertical direction. Since the oscillation amplitude of shorter cantilevers can be more precisely and sensitively detected than that of conventional long cantilevers, the vertical resolution of HS-AFM is higher than that of slow AFM, even at high bandwidth. The vertical resolution of ~ 0.15 nm is high enough to visualize unstructured polypeptides (0.45–0.5 nm in diameter) on a solid support (Miyagi et al. 2008).

When new HS-AFM imaging data of a protein are presented, the same question has always been (and still is) asked: is the molecular behavior observed by HS-AFM affected by tip–sample and surface–sample interactions? This is in striking contrast with other cases. For example, for a protein structure revealed by X-ray crystallography, no one asks whether or not the structure is affected by crystallization. The impact of tip–sample contact on the sample can be quantitatively estimated from the cantilever’s mechanical quantities (spring constant $k_c$ and quality factor $Q_c$ in water) and the imaging condition (cantilever’s free oscillation amplitude $A_0$ and set point amplitude $A_s$). The magnitude of the tip force exerted on the sample can be approximately estimated as $k_c (A_0^2 - A_s^2)^{1/2}/Q_c$, when the tip–sample interaction is weak so that the sinusoidal wave of the oscillating cantilever is not significantly deformed. However, the magnitude of force is not an appropriate indicator of the degree of impact of tip–sample contact on the sample. The amount of cantilever oscillation energy lost per cycle by the tip–sample contact, $E_c$, is probably the best indicator of the impact: $E_c = \frac{1}{2} k_c (A_0^2 - A_s^2)/Q_c$. Alternatively, the magnitude of impact, i.e., force × time during which the force acts on the sample, can be used as an indicator of the impact. For successful HS-AFM imaging of proteins in action, $E_c$ is usually set at a few $k_B T$ ($k_B$, Boltzmann constant; $T$, room temperature in Kelvin). This energy is considered to be transferred to the target molecule but partitioned among the many degrees of freedom that a single molecule possesses, so that the energy transferred to one degree of freedom is negligibly small. More importantly, the transferred energy dissipates quickly, probably over a time much shorter than 1 μs. This conclusion of fast dissipation is derived from HS-AFM imaging of the α3β3 subcomplex of F1-ATPase in the presence of ATP (Uchihashi et al. 2011). A single molecule of this complex was successively imaged for 40 s using a cantilever with a resonant frequency of ~ 1 MHz in water. During these 40 s, the molecule was tapped with the oscillating tip many millions of times. Nevertheless, the molecule repeatedly exhibited very similar conformational changes, with an unchanged rate. The impact of surface–sample interaction, in contrast, has to be judged from the observed behavior of the molecules.
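The force and energy estimates above can be evaluated for a concrete imaging condition. The sketch below is only illustrative: the spring constant of 0.1 N/m lies within the range quoted earlier for small cantilevers, whereas the quality factor (1.5), the free amplitude (1 nm), and the set point (90% of the free amplitude) are assumed values chosen for the example and are not taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def tip_force_n(k_c: float, a0: float, a_s: float, q_c: float) -> float:
    """Approximate tip force, k_c * sqrt(A_0^2 - A_s^2) / Q_c (SI units)."""
    return k_c * math.sqrt(a0**2 - a_s**2) / q_c


def energy_loss_per_cycle_j(k_c: float, a0: float, a_s: float, q_c: float) -> float:
    """Oscillation energy lost per cycle, E_c = (1/2) k_c (A_0^2 - A_s^2) / Q_c."""
    return 0.5 * k_c * (a0**2 - a_s**2) / q_c


def in_units_of_kbt(energy_j: float, temperature_k: float = 298.0) -> float:
    """Express an energy in units of k_B T (room temperature assumed: 298 K)."""
    return energy_j / (K_B * temperature_k)


def tap_count(duration_s: float, f_c_hz: float) -> float:
    """Number of tip taps during imaging at one tap per oscillation cycle."""
    return duration_s * f_c_hz


if __name__ == "__main__":
    k_c = 0.1        # N/m, within the 0.1-0.2 N/m range quoted in the text
    q_c = 1.5        # assumed quality factor in water
    a0 = 1.0e-9      # m, assumed free oscillation amplitude
    a_s = 0.9e-9     # m, assumed set point amplitude (90% of a0)

    print(tip_force_n(k_c, a0, a_s, q_c) * 1e12, "pN")
    print(in_units_of_kbt(energy_loss_per_cycle_j(k_c, a0, a_s, q_c)), "k_B T")
    print(tap_count(40.0, 1e6), "taps in 40 s at ~1 MHz")
```

With these assumed values, the force is a few tens of piconewtons and $E_c$ is between 1 and 2 $k_B T$, in line with the "a few $k_B T$" setting described above. The tap count for 40 s at ~ 1 MHz is 4 × 10^7, which is the basis of the "many millions of times" statement.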

In situ HS-AFM imaging

Most of the targets of HS-AFM studies conducted so far have been purified proteins or membrane patches containing purified membrane proteins placed on the substrate surface. These studies have been well documented in recent review articles (Ando et al. 2014; Ando 2017a), and therefore I do not repeat the discussion here. Rather, I focus on in situ HS-AFM imaging of proteins existing on the surface of high-order structures, such as live cells.

On bacterial cell surfaces

The first attempt at such imaging was carried out for live cells of the magnetotactic spirillum bacterium Magnetospirillum magneticum AMB-1 (Yamashita et al. 2012). The cell surface was not flat but slightly wavy and entirely covered with regularly arranged particles of similar size (Fig. 1a). The magnified image showed irregularly moving reticulate structures arranged in a hexagonal lattice (Fig. 1b, c). The analysis of trajectories of dents in the center of hexagons showed that their mean square displacements were approximately linear as a function of time, despite the highly crowded condition. From this linear relationship, the average diffusion constant (D) was estimated to be 3.2 ± 0.4 nm²/s. When the outer membrane was isolated, its HS-AFM images showed a reticulate structure very similar to that observed on live cell surfaces. The membrane was then disrupted with the cantilever tip (Fig. 1d), which resulted in the appearance of diffusing triangular-shaped molecules in the membrane patch (Fig. 1e). Most of the molecules showed a trimeric form. Sodium dodecyl sulfate–polyacrylamide gel electrophoresis of the membrane fraction showed that the most abundant protein was a 40-kDa protein (~ 80% of the total protein content in the outer membrane fraction) (Fig. 1f). This protein was identified as a porin homolog (amb0025: Msp1) by matrix-assisted laser desorption/ionization time-of-flight tandem mass spectrometry of the trypsin-digested gel slices. Thus, the reticulate structures covering the outer surface mostly consist of porin trimers that are densely packed but can diffuse rapidly in the membrane. This conclusion was also confirmed for the cell surfaces of other Gram-negative bacteria, Escherichia coli and Rhodobacter sphaeroides (Oestreicher et al. 2015).
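The step from a linear mean square displacement (MSD) to a diffusion constant can be sketched as follows. This is not the analysis code of the cited study. It assumes free two-dimensional diffusion in the membrane plane, for which MSD = 4Dt, and it uses a simulated track with a hypothetical frame time of 0.1 s in place of real HS-AFM trajectories; all function names are illustrative.

```python
import random


def mean_square_displacement(track, max_lag):
    """Time-averaged MSD (nm^2) of one 2D track [(x, y), ...]
    for lags of 1..max_lag frames."""
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def diffusion_constant(msd, frame_time_s):
    """Least-squares slope of MSD versus time, divided by 4
    (assumes free 2D diffusion, MSD = 4 D t)."""
    times = [(k + 1) * frame_time_s for k in range(len(msd))]
    t_mean = sum(times) / len(times)
    m_mean = sum(msd) / len(msd)
    num = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    den = sum((t - t_mean) ** 2 for t in times)
    return (num / den) / 4.0


def simulate_track(d_nm2_per_s, frame_time_s, n_frames, seed=0):
    """Synthetic 2D Brownian track (stand-in for measured positions)."""
    rng = random.Random(seed)
    sigma = (2.0 * d_nm2_per_s * frame_time_s) ** 0.5  # step std per axis
    x = y = 0.0
    track = [(x, y)]
    for _ in range(n_frames):
        x += rng.gauss(0.0, sigma)
        y += rng.gauss(0.0, sigma)
        track.append((x, y))
    return track


if __name__ == "__main__":
    frame_time = 0.1  # s, hypothetical frame time
    track = simulate_track(3.2, frame_time, 20000)
    msd = mean_square_displacement(track, max_lag=10)
    print(diffusion_constant(msd, frame_time), "nm^2/s")
```

Fitting a line with a free intercept, rather than forcing it through the origin, keeps a constant offset in the MSD (for example from localization noise) from biasing the slope.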

Fig. 1

High-speed atomic force microscopy (HS-AFM) imaging of the outer surface of the live magnetotactic spirillum bacterium Magnetospirillum magneticum AMB-1. a Low-magnification image of the outer surface. b High-magnification image of the outer surface. c Trajectories of the center position of four indents surrounded by hexagons. These trajectories were drawn by tracing the movement of each indent. d HS-AFM images showing dissociation process of a reticulate structure in an outer membrane patch. Numbers indicate frame number. e Highly magnified AFM image of molecules observed after dissociation. f Sodium dodecyl sulfate–polyacrylamide gel electrophoresis gel profile of extracted proteins from the isolated outer membrane (lane 2). The protein band indicated by the arrow was identified as porin. The molecular masses of the standards are indicated on the left sides of the lanes

On mammalian membrane surfaces

Compared to the outer membranes of live bacterial cells, the plasma membranes of live mammalian cells are generally very soft. Therefore, it is difficult to obtain high-resolution images of protein molecules in situ. Nevertheless, the surface of lens fiber cells was successfully imaged by HS-AFM, where the top cell layer was removed, exposing the gap junctions (junctional microdomains) consisting of connexon and aquaporin 0 (AQP0) (Colom et al. 2012). AQP0 was observed to be densely packed in square arrays with unit cell dimensions of approximately 6.5 nm. Although moving only very slowly (D = 0.5 nm²/s), the arrays diffused freely in the lens cell membrane. This motional freedom probably plays a role in the maintenance of adhesion between lens cells while the lens tissue undergoes elastic deformation when focusing to different distances. This success in imaging is probably due to the relatively rigid structure of junctional microdomains with closely packed AQP0 arrays. When proteins are sparsely embedded in mammalian membranes, their membrane vesicles have to be small enough for in situ HS-AFM imaging. In this regard, protein molecules on neuronal synapses are good candidates for imaging by HS-AFM.

The nuclear envelope (NE) comprises concentric outer and inner lipid membranes containing various proteins, including proteins that connect the two membranes and maintain their gap distance. Therefore, the NE is considered to be rigid compared to the plasma membranes. Recently, the surfaces of nuclei isolated from Xenopus laevis oocytes were observed by HS-AFM (Sakiyama et al. 2016). Nuclear pore complexes (NPCs) act as a central regulator of transport between the nucleus and cytoplasm. NPCs consist of ~ 30 proteins, termed nucleoporins. One third of nucleoporins are intrinsically disordered phenylalanine-glycine strings (FG Nups), which are tethered inside each pore and may play a role in selective barrier and transport mechanism, but this mechanism remains elusive. HS-AFM was used to visualize the spatiotemporal dynamics of nucleoporins inside NPCs. The cytoplasmic orifice is circumscribed by highly flexible, dynamically fluctuating FG Nups that rapidly elongate and retract. This transient entanglement in the NPC channel manifests as a central plug when averaged in space and time.

In contrast to these successful cases, HS-AFM imaging of protein molecules on soft and fragile membranes has been a challenge. HS-AFM imaging of voltage-dependent anion channels (VDAC) on the surface of live mitochondria was attempted in my laboratory (Ando 2017b). Although assembled oligomers of VDAC and their diffusion on the membrane were visualized, the images were blurred, possibly due to membrane deformation (Fig. 2a). During imaging, the membrane was occasionally ruptured (Fig. 2b, c).

Fig. 2

HS-AFM images of mitochondria. a Magnified image showing assembled voltage-dependent anion channels. b, c HS-AFM images before (b) and after (c) rupture of a mitochondrion

What’s NEXT

Technical developments are often motivated by the desire to observe new aspects of objects that cannot be observed with conventional techniques. Therefore, the question “what is next?” can perhaps be answered by surveying what researchers are eager to observe.

Faster HS-AFM

Even for purified protein systems, there are still molecular processes that cannot be observed with the current HS-AFM system. One reason is that the current temporal resolution is insufficient for the molecular processes being studied. Among all components of the system, the Z-scanner and cantilevers represent the main limitations to the temporal resolution. For narrow area scanning sufficient for imaging purified protein systems (100 × 100 × 100 nm³ for X, Y, and Z), the speed performance of the scanner still has room for improvement. In typical stack piezoactuators, the tradeoff relationship between the maximum displacement $d_p$ and the resonant frequency $f_p$ is approximately given as $f_p \times d_p$ = 300–400 kHz·μm. Therefore, $f_p$ = 3–4 MHz is attainable when we limit $d_p$ to 100 nm. This resonant frequency is ~ 50-fold and ~ 20-fold higher than those of the fastest X- and Z-scanners developed so far, respectively. Therefore, it may not be too difficult to increase the scanner’s response speed by at least tenfold. The highest resonant frequency of the small cantilevers developed so far with a spring constant of ≤ 0.2 N/m is 1.2 MHz in water. They are approximately 7 μm long, 2 μm wide, and 90 nm thick. With this small cantilever size we can achieve the shortest imaging time of $2N^2/f_c$ = 17 ms for 100 × 100 pixels (imaging rate 60 fps), if we can immediately complete a feedback Z-scan after obtaining the pixel’s height information at every cycle of cantilever oscillation. The fabrication of smaller cantilevers is technically possible. However, OBD detection limits how small they can be made to about half of the current size and, therefore, the highest possible $f_c$ is ~ 3.5 MHz in water, corresponding to $2N^2/f_c$ = 6 ms for 100 × 100 pixels (imaging rate 160 fps). Taking into account the time delay of the Z-scanner and feedback controller, the upper rate limit is likely to be 100 fps for imaging with 100 × 100 pixels. To further increase the imaging rate, we have to abandon the OBD method and create an alternative method for high-sensitivity deflection detection of a tiny cantilever or a nanowire (Sanii and Ashby 2010).
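The scanner and cantilever limits quoted above follow from two simple relations, which the sketch below evaluates. It assumes that the piezoactuator tradeoff is an exact constant product and that one pixel is acquired per oscillation cycle with no feedback delay; both are idealizations, and the function names are illustrative only.

```python
def piezo_resonance_hz(product_khz_um: float, max_displacement_um: float) -> float:
    """Resonant frequency from the tradeoff f_p * d_p = constant,
    with the constant given in kHz*um and d_p in um."""
    return product_khz_um * 1e3 / max_displacement_um


def frame_rate_fps(n_pixels: int, f_c_hz: float) -> float:
    """Ideal frame rate f_c / (2 N^2): one pixel per oscillation cycle,
    trace and retrace, no Z-scanner or feedback delay."""
    return f_c_hz / (2 * n_pixels**2)


if __name__ == "__main__":
    # Stack piezoactuator limited to 100 nm (0.1 um) of displacement.
    for product in (300.0, 400.0):
        print(product, "kHz*um ->", piezo_resonance_hz(product, 0.1) / 1e6, "MHz")

    # Current (1.2 MHz) and projected OBD-limited (3.5 MHz) cantilevers.
    for f_c in (1.2e6, 3.5e6):
        rate = frame_rate_fps(100, f_c)
        print(f_c / 1e6, "MHz ->", rate, "fps,", 1e3 / rate, "ms per frame")
```

For $f_c$ = 3.5 MHz the unrounded result is about 5.7 ms per frame (roughly 175 fps), which is in the same range as the rounded values of 6 ms and 160 fps given in the text. The estimate of about 100 fps as a practical upper limit reflects the additional scanner and feedback delays, which this idealized calculation leaves out.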

Hybrid HS-AFM/optical microscopy system

Conventional AFM and even HS-AFM cannot visualize small ligand molecules. This invisibility sometimes makes it difficult to judge whether or not an observed structural change of a protein molecule is really caused by the binding/release of a ligand. Even protein molecules are invisible when they are encapsulated in the cavity of a target protein, such as in the case of a substrate protein bound to the chaperonin GroEL. To eliminate this limitation, HS-AFM combined with fluorescence microscopy is required. Such a combined system has already been developed. However, its purpose is not the simultaneous observation of HS-AFM/fluorescence images but the positioning of the cantilever tip on a region of interest within a large object (e.g., eukaryotic cells) visualized with fluorescence microscopy (Colom et al. 2013; Shibata et al. 2015). In this hybrid system, the sample stage is scanned, and therefore the fluorescence image oscillates during HS-AFM imaging. More seriously, the cantilever blocks the optical path of fluorescence microscopy. To overcome this problem, tip-scan HS-AFM has been developed, in which the laser beam of the OBD detector is steered with a two-dimensional mirror tilter to track the lateral motion of the cantilever (Fukuda et al. 2013). This system has been combined with total internal reflection fluorescence microscopy. Nonetheless, it is quite difficult to examine correlations between the two types of images because the fields of view and spatial resolutions of the two microscopy methods differ greatly. To obtain a clear correlation, it is probably best to use near-field scanning optical microscopy (NSOM) for the hybrid system. NSOM breaks the diffraction limit of light, and a (metal-coated) cantilever tip can also be used as an aperture or aperture-less probe for NSOM (Hecht et al. 2000). The simultaneous, pixel-by-pixel acquisition of the fluorescence and AFM signals at nearly identical sample positions facilitates finding correlations between the two overlaid images. The spatial resolution of NSOM achieved so far ranges widely from a few nanometers to 100 nm, but the highest resolution is not attained routinely. However, high-resolution NSOM will be achievable in the future through the establishment of plasmonic aperture NSOM.

Hybrid HS-AFM/optical tweezers system

Atomic force microscopy has been used not only for imaging but also for measuring force to estimate the strength of intra- and intermolecular bonds at the single-molecule level and the elasticity of biological surfaces (Dufrene et al. 2011). However, these two capabilities cannot be used simultaneously. Although visualizing molecules under external force is not yet feasible, such visualization will provide a new opportunity to study proteins. As tip-scan HS-AFM can be combined with optical tweezers, this direction of experimental study can be explored in the future. For example, the processes in which a single protein molecule is being unfolded or refolded can be observed. The refolding process in particular can be observed in detail because it can be slowed down by controlling the magnitude of the applied force. Moreover, we will be able to observe a protein molecule responding to an external force applied to a specific locus in a given direction. Such responses have been studied in silico by elastic network modeling of proteins (Togashi and Mikhailov 2007; Düttmann et al. 2012). According to these studies, proteins are well designed so that a functionally important locus of a protein undergoes a conformational change only when a force is applied to another functionally relevant site in a specific direction.

Nano-endoscopy

Thus far, the observation of dynamic events occurring in the interior of live cells has only been possible with optical microscopy, especially fluorescence microscopy. However, fluorescence microscopy only visualizes marker molecules, and the marker-labeled entities themselves are completely invisible. There is a strong desire among researchers to directly observe molecular processes occurring in the cell without using optical markers, raising the question of whether it is possible to look at the cell interior with HS-AFM. It has already been demonstrated that a long tip (even a thin pipette) can be inserted into a cell without killing it (Obataya et al. 2005; Guillaume-Gentil et al. 2014). However, no one has tried to carry out AFM imaging of the cell interior. Since the plasma membranes are very flexible, tip scanning relative to the cell may not be disturbed by the membranes as long as the scan range is narrow. Do floating molecules crowded in the cytoplasm disturb the AFM imaging? Floating molecules are expelled easily by the inserted tip and therefore will not disturb AFM imaging and will also not be imaged. Taking these issues into account, HS-AFM imaging of the cell interior is likely possible for molecules attached to or supported by relatively rigid organelles (nuclei and mitochondria) and by plasma membranes whose outer face is attached to the substrate surface.

Non-contact imaging

To visualize protein molecules on extremely soft membranes, we need non-contact AFM. However, non-contact AFM has only been achieved for objects in vacuum using the frequency modulation mode, because the small resonant frequency shift of the cantilever can be sensitively detected in vacuum thanks to the sharp resonance spectrum (i.e., a high quality factor of the cantilever). Alternatively, we can use scanning ion conductance microscopy (SICM) for non-contact imaging (Hansma et al. 1989). The operation principle of SICM for capturing a topographic image differs completely from that of AFM. SICM uses an electrolyte-filled glass pipette (nanopipette) as a probe and relies on an ion current flowing between an electrode inside the nanopipette and another in an external bath solution. The ion current passing through the apex opening of the nanopipette is sensitive to the tip–sample surface separation. Therefore, SICM can capture topographic images without any tip–sample contact. Although high spatial resolution (~ 6 nm) has been reported for SICM (Shevchuk et al. 2006), it cannot be attained routinely. The spatial resolution is approximately equal to the diameter of the pipette opening, when the tip end–sample separation is similar to the diameter of the pipette opening. Fabricating a nanopipette with a very small opening is not impossible. However, it is difficult to have a very thin wall surrounding the opening. When the wall is thick, a protein molecule under the wall cannot be sensitively detected and hence the tip end is apt to make contact with and break the molecule. Moreover, the ion current resistance increases with decreasing opening diameter, resulting in a slower response of the ion current due to resistance–capacitance coupling. Even with a nanopipette with an opening of ~ 20 nm, it takes 10 min or longer to take an image. Thus, the development of HS-SICM with high spatial resolution is desired.

A short carbon nanotube (CNT) could potentially be used as an SICM probe. CNTs can be inserted into lipid membranes and transport ions across the membranes (Geng et al. 2014). The inner diameter of CNTs is very small. However, their ion current resistance can be kept to a minimum by using short CNTs. Lipid bilayers can be introduced to the opening of a nanopipette. It is probably possible to insert a short CNT into the bilayer (Lopez et al. 2004). Since lipids containing UV-polymerizable diene moieties are commercially available, the bilayer can be polymerized after the insertion of a CNT. In this way, we will be able to fabricate a glass nanopipette with a membrane-inserted CNT, which will make HS-SICM with high spatial resolution a reality.

Is it possible to use this CNT nanopipette as a probe for nano-endoscopy? Macromolecules diffusing in the cell interior will be sensed by the nanopipette. Therefore, unlike HS-AFM, HS-SICM cannot be used in nano-endoscopy. HS-SICM will be most useful for observing the dynamic action of protein molecules on the outer surface of plasma membranes of live cells and on isolated intracellular organelles (Golgi apparatus, endoplasmic reticulum, lysosomes, and others). It is also probably possible to use HS-SICM to observe the dynamic action of proteins in demembranated cells, including muscle fibers.