Introduction

Large proteins often consist of several domains which can be expressed and investigated separately. The structure of these subunits can be used as a starting point for the structure determination of the entire full-length protein. A similar approach can be applied to describe complexes of proteins with known monomer structures. This divide-and-conquer approach has been successfully applied in many cases (Gaudet 2009).

When individual domain structures are determined separately and then assembled as building blocks to describe the entire structure, either by modelling or by placing them into the experimental electron density obtained from electron microscopy or X-ray diffraction at lower-than-atomic resolution, it typically has to be assumed that the individual domains conserve their structure in the context of the assembly. This assumption may be questionable and has been proven entirely incorrect in some cases (Loquet et al. 2009; Wasmer et al. 2009). It is therefore important to verify this assumption in each specific case.

Here, solid-state NMR can provide information even if the molecule is too complex for NMR structure determination. Indeed, solid-state NMR can compare the structural features of proteins in a wide variety of contexts and at different levels: from spectral fingerprints, which reveal conformational agreement or differences, also between different preparations (Carlon et al. 2016; Gardiennet et al. 2012; Meier and Böckmann 2015; Ravera et al. 2016; van den Bedem and Fraser 2015), to secondary-structure information and on to domain or monomer 3D structures.

In the following, we explore the potential and the limitations of such an approach using fingerprints and secondary-structure elements for the example of the 12 × 56 kDa DnaB helicase from Helicobacter pylori (Hp). DnaB helicases are bacterial ATP-driven enzymes which unwind double-stranded DNA in the 5′–3′ direction during DNA replication (LeBowitz and McMacken 1986). Structurally, DnaB is a two-domain motor protein with an amino-terminal domain and a carboxy-terminal domain separated by a flexible linker region (Bailey et al. 2007a; Stelter et al. 2012). The full-length protein forms ring-shaped dodecameric assemblies which encircle single-stranded DNA (LeBowitz and McMacken 1986). The DnaB hexamer binds the DnaG primase and couples DNA unwinding to RNA primer synthesis (Abdul Rehman et al. 2013; Bailey et al. 2007b; Kashav et al. 2009). Solid-state NMR spectra from sedimented samples of HpDnaB, as well as of the HpDnaB:HpDnaG complex, have been reported (Gardiennet et al. 2012, 2016), demonstrating that relevant solid-state NMR samples can be produced by simple ultracentrifugation (Bertini et al. 2012; Gardiennet et al. 2012). Crystal structures of a truncated N-terminal domain (NTD, residues 1–121, pdb 3GXV) (Kashav et al. 2009) and of the C-terminal domain (CTD, residues 152–488, pdb 4A1F) (Stelter et al. 2012) were determined, and the NMR chemical shifts of a microcrystalline sample of the full NTD (residues 1–153) were recently sequentially assigned (Wiegand et al. 2016).

A structural model of the full-length (FL) protein at 6.7 Å resolution was obtained by manually positioning the high-resolution X-ray structures of the individual domains into a low-resolution X-ray map, followed by refinement (Bazin et al. 2015). Residues for which no high-resolution information was available were modeled as polyalanine, notably an electron density representing a helix and spanning residues 123–147 in chain A of the structure described in reference (Bazin et al. 2015).

Experimental section

Protein expression and purification, sample preparation

The NTD, the CTD and the full-length (FL) protein of the H. pylori helicase DnaB were cloned, as previously described, into the plasmids pET151 (Wiegand et al. 2016), pET101 (Stelter et al. 2012) and pACYC-duet1 (Bazin et al. 2015), respectively, and expressed in the E. coli strain BL21 Star (DE3) (One Shot® BL21 Star™ (DE3) Chemically Competent E. coli, Invitrogen™). The three proteins were overexpressed in M9 minimal medium (Studier 2005) using 13C-enriched glucose (2 g L−1; Cambridge Isotope Laboratories, Inc. CLM-1396-PK) and 15N-enriched ammonium chloride (2 g L−1; Sigma-Aldrich® 299251) as sole carbon and nitrogen sources. The cells were lysed by microfluidization.

NTD of HpDnaB

The expression and crystallization of the NTD were described earlier (Wiegand et al. 2016). Briefly, the uniformly 13C–15N isotope-labelled NTD of HpDnaB was purified by gravity-flow Ni2+-agarose affinity chromatography (Qiagen™). The hexahistidine tag was cleaved with Tobacco Etch Virus protease during a 3 h dialysis using a membrane with a 6–8 kDa cut-off (Spectra/Por® Dialysis Membrane). The protein was concentrated to 23.5 mg mL−1 by centrifugation using a 10 kDa cut-off concentrator (Vivaspin® 20 VS2001 Sartorius) in the final buffer (50 mM Tris–HCl pH 6.5, 100 mM NaCl). The protein was crystallized by vapor diffusion in a nine-well glass plate using a reservoir solution of 2.3 M NaCl. Equal volumes of protein and crystallization buffer (100 mM HEPES pH 7.0, 0.1 % (m/v) sodium azide, 10 % (v/v) polyethylene glycol 10,000) were mixed. Microcrystals were obtained after one week at 20 °C and centrifuged into the NMR rotor (25,000×g for one hour at 4 °C) using a home-made filling device (Gardiennet et al. 2012).

CTD of HpDnaB

The uniformly 13C–15N isotope-labelled CTD of HpDnaB was purified by Ni2+-agarose affinity chromatography using a 5 mL HisTrap HP column (GE Healthcare Life Sciences). The protein was concentrated to 5–10 mg mL−1 by centrifugation using a Millipore concentrator (30 kDa cut-off) and subjected to size-exclusion chromatography in 10 mM Tris–HCl pH 7.5, 200 mM NaCl. The protein was then concentrated to 5 mg mL−1 using a centrifugal concentrator in the same buffer. Equal volumes of protein and crystallization buffer (2.75 % (v/v) polyethylene glycol 6000 and 55 mM citric acid pH 5) were mixed. The crystals were grown by vapor diffusion at 20 °C in a nine-well glass plate using a reservoir solution of 0.1 M citric acid pH 5, 17 % (v/v) polyethylene glycol 6000. The resulting microcrystals were harvested and centrifuged into the NMR rotor.

FL HpDnaB

Untagged, 13C–15N labeled FL HpDnaB was purified by heparin-agarose affinity chromatography using a 5 mL HiTrap Heparin HP column (GE Healthcare Life Sciences) followed by anion exchange chromatography using a 5 mL HiTrap Q HP column (GE Healthcare Life Sciences). The purified protein was concentrated up to 30 mg mL−1 by centrifugation in 10 mM sodium phosphate buffer pH 7.5, 150 mM NaCl, 1 mM DTT and 0.05 % (m/v) sodium azide.

The protein was sedimented directly into the NMR rotor using a centrifuge rotor-filling tool (Gardiennet et al. 2012) at 200,000×g and 4 °C for 16 h.

Solid-state NMR spectroscopy

Solid-state NMR spectra were acquired at static magnetic field strengths of 18.8 and 20.0 T using 3.2 mm Bruker Biospin “E-free” probes (Gor’kov et al. 2007). The MAS frequency was set to 17.5 and 17.0 kHz for the two fields, respectively. The 2D and 3D spectra were processed with the software TOPSPIN (version 3.2, Bruker Biospin) with a shifted (2.0–2.8) squared cosine apodization function and automated baseline correction in the direct dimension. The sample temperature was set to 278 K (Böckmann et al. 2009). All spectra were analyzed with the software CcpNmr (Fogh et al. 2002; Stevens et al. 2011; Vranken et al. 2005) and referenced to 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS). 13C and 15N resonance assignments for the NTD were taken from reference (Wiegand et al. 2016). The assignments of the FL protein are mainly based on a comparison of the 3D NCACB spectrum with the one from the NTD. These assignments were further cross-checked by also comparing CANCO, NCACX, NCOCA, CCC and NcoCACB spectra (Schuetz et al. 2010).
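
For orientation, the static fields quoted above can be converted into Larmor frequencies, and a line-width given in ppm into Hz. The following is a minimal sketch and not part of the original analysis: the gyromagnetic ratios are rounded literature values (magnitudes only), and the function names are chosen here purely for illustration.

```python
# Rounded gyromagnetic ratios gamma/(2*pi) in MHz per tesla (magnitudes only;
# the sign for 15N is negative but irrelevant for this estimate).
GAMMA_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": 4.316}


def larmor_mhz(field_tesla, nucleus):
    """Larmor frequency in MHz for a nucleus at the given static field."""
    return GAMMA_MHZ_PER_T[nucleus] * field_tesla


def ppm_to_hz(ppm, field_tesla, nucleus):
    """Convert a width or shift difference in ppm to Hz.

    1 ppm is one millionth of the Larmor frequency, so a frequency of
    X MHz corresponds to X Hz per ppm.
    """
    return ppm * larmor_mhz(field_tesla, nucleus)


for b0 in (18.8, 20.0):
    print(f"{b0:.1f} T: 1H {larmor_mhz(b0, '1H'):.1f} MHz, "
          f"13C {larmor_mhz(b0, '13C'):.1f} MHz, "
          f"0.5 ppm 13C = {ppm_to_hz(0.5, b0, '13C'):.0f} Hz")
```

Under these assumptions, a 13C line-width of 0.5 ppm corresponds to roughly 100 Hz at 18.8 T.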

Results and discussion

The overall structure of the two domains is conserved in multimeric full-length HpDnaB

A good estimate for the global validity of a divide-and-conquer approach can be obtained by comparing the 2D NMR fingerprints of the full-length protein with those of the isolated domains. For HpDnaB, the vast majority of NMR signals in the 2D spectra of the full-length protein are indeed accounted for by the signals from the isolated NTD and CTD. This is illustrated in Fig. 1a, which shows the Cα-Cβ regions of 2D 13C,13C DARR dipolar correlation spectra (Takegoshi et al. 2001, 2003) (mixing time of 20 ms; for the complete aliphatic region see Figure S1) of a sedimented FL HpDnaB sample (grey contours), as well as of microcrystalline preparations of the individual NTD (blue contours) and CTD (red contours). The spectra of all samples show good signal-to-noise ratios, and the 13C resonances are narrow, with a full line-width at half maximum on the order of 0.5 ppm, pointing to well-ordered structures in all samples studied herein. Although significant spectral overlap is observed, in particular in the 2D spectrum of the FL protein, well-resolved resonances are detected in less crowded spectral regions, as shown for example in Fig. 1b, c for the alanine and isoleucine/valine regions. In both extracts, most peak positions in the spectra of the FL protein (grey contours) coincide within the line-width with resonances from the NTD (blue contours) or the CTD (red contours), such that the corresponding peaks can easily be identified (see Figure S2 for separate extracts and peak positions shown on the FL spectra). This can be evaluated in a site-specific manner for the alanine residues shown in Fig. 1b. Alanine assignments could be obtained even for the large CTD due to the outlier chemical shifts of its β-sheet alanine residues (see Fig. 1b; for assignments of the NTD see below).
Similarly, in the 15N,13C dipolar correlation NCA spectra of the three proteins (Figure S3), most isolated peak positions in the spectra of the FL protein (grey contours) coincide within the line-width with resonances from the NTD (blue contours) or the CTD (red contours). At this stage, the comparable fingerprints in the 2D dipolar correlation spectra clearly hint at overall similar structures of the individual domains of HpDnaB in the context of the FL protein and exclude major structural rearrangements such as those observed, for example, for the HET-s prion (Wasmer et al. 2009).
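
The fingerprint comparison described above amounts to asking, for each FL peak, whether a peak of one of the isolated domains lies within about one line-width in both dimensions. A minimal sketch of such a check is given below. The peak lists are hypothetical values chosen for illustration, the 0.5 ppm tolerance is taken from the line-width quoted above, and `match_peaks` is a name introduced here, not a function of any NMR software package. In this work, the comparison is presented as overlays of contour plots.

```python
import math


def match_peaks(fl_peaks, domain_peaks, tol_ppm=0.5):
    """For each FL peak, return the index of the nearest domain peak that
    lies within tol_ppm in both dimensions, or None if there is none."""
    matches = []
    for f1, f2 in fl_peaks:
        best_index, best_dist = None, None
        for i, (d1, d2) in enumerate(domain_peaks):
            if abs(f1 - d1) <= tol_ppm and abs(f2 - d2) <= tol_ppm:
                dist = math.hypot(f1 - d1, f2 - d2)
                if best_dist is None or dist < best_dist:
                    best_index, best_dist = i, dist
        matches.append(best_index)
    return matches


# Hypothetical 2D 13C-13C peak positions (ppm), for illustration only.
fl_peaks = [(52.1, 18.9), (55.0, 18.0), (60.3, 38.2)]
ntd_peaks = [(52.0, 19.0)]
ctd_peaks = [(60.4, 38.0), (50.0, 22.0)]

domain_peaks = ntd_peaks + ctd_peaks
matches = match_peaks(fl_peaks, domain_peaks)
fraction_explained = sum(m is not None for m in matches) / len(fl_peaks)
print(matches, f"{fraction_explained:.2f}")
```

An FL peak without a partner (`None`) is a candidate for a site whose environment differs between the isolated domain and the assembly, subject to the caveat that overlap in crowded regions can produce accidental matches.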

Fig. 1

The FL spectrum is largely the sum of the spectra of the isolated domains. a Overlay of (13C–13C) 20 ms DARR dipolar correlation spectra of FL HpDnaB (grey), as well as of the corresponding NTD (blue) and CTD (red). b, c Extracts of the (13C–13C) 20 ms DARR spectra showing the alanine region (b; the peak assignments shown for the FL protein were mainly obtained from sequential assignments) and the valine/isoleucine region (c). The relevant spectral regions are highlighted to guide the eye

The conservation of the structure of the globular part of the NTD can be validated in a residue-specific manner

A site-specific investigation of chemical-shift changes between the isolated domains and the FL protein is possible for the NTD because its sequential resonance assignments are available (Wiegand et al. 2016). The comparison of a set of 3D spectra (NCACB, NCACX, NCOCX, CANCO, NcoCACB and CCC) of both the NTD and the FL protein allowed the identification of complete spin systems, as well as their connections, for the majority of NTD resonances in the FL protein spectra (for a representative example see Figure S4). Using this approach, about 65 % of the backbone assignments of the NTD could be transferred to the FL protein, the limit being set by peak overlap in the full-length 3D spectra. A residue-by-residue comparison of resonance frequencies for 15N, 13Cα and 13Cβ is shown in Fig. 2 (for 13C’ see Figure S5) for residues 16–134. The root-mean-square (RMS) values of the 13Cα and 13Cβ chemical-shift differences amount to 0.16 and 0.15 ppm, respectively, whereas the RMS value for 15N is larger (0.43 ppm). For 13C, all values are below 0.5 ppm, indicating no significant conformational changes. An example of a highly conserved stretch is illustrated in Fig. 3, where 2D planes from 3D NCACB spectra of the NTD and the FL protein are shown for the loop comprising His32-Glu36. In such conserved stretches, the side-chain atoms also remain at similar resonance frequencies, and the corresponding assignments can be transferred from the NTD to the FL protein (see Fig. 4, Figure S4). Somewhat larger differences are observed in helix α1, before α3, after α5 and at the end of α6 (for the numbering of α-helices in the NTD see Fig. 2), but they remain smaller than 0.5 ppm. The largest deviations in the 15N chemical shifts occur at the same positions, e.g. before α3 and at the C-terminus of α6.
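
The RMS values quoted above summarize per-residue shift differences between two assignment lists. The sketch below shows one way such a number could be computed, assuming each list is available as a mapping from residue number to chemical shift. The residue numbers and shifts are hypothetical, and only residues assigned in both samples enter the statistic.

```python
import math


def shift_differences(ntd_shifts, fl_shifts):
    """Return {residue: FL shift - NTD shift} for residues assigned in both."""
    common = sorted(ntd_shifts.keys() & fl_shifts.keys())
    return {res: fl_shifts[res] - ntd_shifts[res] for res in common}


def rms(values):
    """Root mean square of a non-empty sequence of numbers."""
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical 13C-alpha shifts (ppm); residue 4 is not assigned in the FL
# protein and is therefore excluded from the comparison.
ntd_ca = {1: 56.0, 2: 58.2, 3: 62.5, 4: 54.1}
fl_ca = {1: 56.1, 2: 58.0, 3: 62.5}

diffs = shift_differences(ntd_ca, fl_ca)
rms_ca = rms(diffs.values())
largest = max(abs(d) for d in diffs.values())
print(f"RMS = {rms_ca:.2f} ppm, largest |difference| = {largest:.2f} ppm")
```

Because unassigned residues drop out of the sum, a small RMS value describes only the transferred assignments; residues whose peaks could not be found must be considered separately.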

Fig. 2

Chemical-shift differences identify the location of structural changes: Absolute values of 15N, 13Cα and 13Cβ chemical shift differences between FL HpDnaB and the isolated NTD. Residues marked by circles could not be assigned in the NTD (Wiegand et al. 2016) and residues shown as purple bars experience significant chemical-shift changes and could thus not be assigned in the FL protein (see text). The secondary structure elements shown correspond to the NTD and are taken from reference (Wiegand et al. 2016)

Fig. 3

Comparison of the resonances belonging to the His32-Glu36 stretch in 2D strips of 3D NCACB spectra of NTD HpDnaB (top row, blue) and FL HpDnaB protein (bottom row, grey)

Fig. 4

Negligible chemical shift differences between NTD and FL protein allow the identification of spin systems: Representative example for Val34 as identified in 2D (13C–13C) DARR (20 ms mixing time) spectra of the FL (grey contours) and NTD (blue contours). The 1D trace (bottom part) is given to highlight the complete coincidence of peaks

The small chemical-shift changes (<0.5 ppm) quantified in the N-terminal portion (residues 16–89) of the NTD indicate that the conformations of these residues in the full-length HpDnaB are almost identical to those in the crystals of the isolated NTD. Interestingly, this conclusion also holds for most residues located in loops, as can be seen in Fig. 5a, where the average 13Cα and 13Cβ chemical-shift changes are plotted on the crystal structure of the NTD. For many residues in loop positions, the resonances of side-chain atoms could be identified as well and were found at similar chemical-shift positions (for examples see Fig. 4, Figure S4). An exception is the region around Phe40, where a cluster of slightly larger chemical-shift differences is observed; these still do not exceed the 0.5 ppm threshold that we consider indicative of structural rearrangements.
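
The per-residue quantity plotted in Fig. 5a combines the Cα and Cβ differences as the square root of half the sum of their squares (see the caption of Fig. 5). A minimal sketch of this measure, together with the 0.5 ppm threshold used in the text, is given below. The residue labels and shift differences are hypothetical, and the function names are introduced here for illustration.

```python
import math

THRESHOLD_PPM = 0.5  # threshold for structural rearrangement quoted in the text


def combined_13c_difference(d_ca, d_cb):
    """sqrt(0.5 * (d_ca**2 + d_cb**2)), as defined in the caption of Fig. 5."""
    return math.sqrt(0.5 * (d_ca ** 2 + d_cb ** 2))


def flag_residues(differences, threshold=THRESHOLD_PPM):
    """differences maps residue -> (d_ca, d_cb) in ppm.
    Return the sorted residues whose combined difference exceeds threshold."""
    return sorted(
        res
        for res, (d_ca, d_cb) in differences.items()
        if combined_13c_difference(d_ca, d_cb) > threshold
    )


# Hypothetical residues 1-3 with (delta C-alpha, delta C-beta) in ppm.
example = {1: (0.10, -0.05), 2: (0.30, 0.40), 3: (0.90, -0.70)}
flagged = flag_residues(example)
print(flagged)
```

With this definition, a residue whose Cα and Cβ differences are both equal to x also has a combined value of x, so the combined measure can be compared directly with the single-nucleus threshold.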

Fig. 5

a 13C chemical shift differences (determined according to \(\sqrt{\tfrac{1}{2}\left[\left(\Delta\delta({}^{13}\mathrm{C}\alpha)\right)^{2} + \left(\Delta\delta({}^{13}\mathrm{C}\beta)\right)^{2}\right]}\)) between the isolated NTD and FL HpDnaB plotted on the crystal structure of the NTD [residues 1–121, pdb file 3GXV (Kashav et al. 2009)]. Residues colored in purple experience strong differences between the NTD and FL protein. The helix shown in grey is deduced from the NMR data for the complete NTD (residues 1–153) (Wiegand et al. 2016). b Final 2Fo-Fc electron density map of the FL His-HpDnaB protein contoured at 1.5 σ at a resolution of 6.7 Å with the NTD of HpDnaB (blue, pdb 3GXV) and the CTD (red, pdb 4A1F). The inserted polyalanine chain is shown in purple

The chemical-shift data thus clearly point to a full conservation of the globular region of the NTD (residues 16–89) in the context of the full-length protein. This correlates well with the fact that the helices of the globular domain of the NTD could be fitted with ease into the low-resolution electron density map of the full-length HpDnaB protein (Bazin et al. 2015). Our data complement this information and confirm the conservation of the loop regions, which are invisible in the low-resolution map. We can conclude that, for this subdomain, using the high-resolution X-ray structure as a rigid building block in hybrid approaches is a valid and well-justified procedure.

The structure of the C-terminal part of the NTD is flagged as being different

Interestingly, several residues located in the C-terminal portion of the NTD are clearly missing from their expected positions in the spectra of the FL protein: residues 107 and 109 in the 2D DARR spectra (Figures S6, S7) and residues 107, 111, 113, 120, 121, 125, 126, 128 and 129 in the 3D NCACB spectra (Fig. 6). Chemical-shift changes cannot be quantified for these resonances, because the corresponding peaks could not be identified in the FL spectra; the changes must, however, be large enough to move the peaks away from their positions in the NTD spectra. Alternatively, these residues might show dynamic behavior in the FL protein. These large changes point to “hot spots” where the structure in the FL protein is not identical to that at the same position in the isolated domain.

Fig. 6

Details highlighting the signals showing chemical-shift changes: Representative 2D strips from 3D NCACB spectra and corresponding 1D traces highlighting examples for resonances of the NTD which experience significant chemical-shift changes in the context of the FL protein (blue NTD of HpDnaB, grey FL HpDnaB)

Unfortunately, the C-terminal portion of the NTD is not present in the crystal structure of the NTD (Kashav et al. 2009). NMR secondary chemical shifts, however, clearly indicate a long α-helix in the C-terminal portion of the isolated NTD (spanning residues 95–135) (Wiegand et al. 2016), possibly interrupted only by a short stretch at residues 118 and 119, which could not be assigned. In contrast, the low-resolution X-ray data of the FL protein indicate the presence of an α-helical hairpin (see Fig. 5b for the 2Fo-Fc electron density), which is consistent only with a turn around residues 124–127 (Bazin et al. 2015). In the isolated NTD, these residues are clearly located in an α-helix according to the NMR data. The hot spot identified by means of spectral changes thus indeed points to a region which is poorly conserved between the isolated domain and the FL protein. This part remains a white spot in the FL structure of the protein to date and is so far represented only by a polyalanine chain that mimics the second α-helix of the hairpin, as indicated in Fig. 5b (Bazin et al. 2015). Even if a full 3D structure could be calculated from the NTD NMR data, the structure of these hot spots could not be transferred to the FL structure.
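
The secondary-chemical-shift argument used above can be sketched as follows. Secondary shifts are the differences between observed shifts and random-coil reference values, and a run of consecutive residues with positive ΔδCα − ΔδCβ is commonly read as a signature of α-helix. The code below is an illustration under stated assumptions, not the procedure of the cited assignment work: random-coil values have to be taken from a published table and are passed in as arguments, the minimum run length of four residues is an assumed convention, and the example numbers are hypothetical.

```python
def secondary_shift(obs_ca, obs_cb, rc_ca, rc_cb):
    """Return (obs_ca - rc_ca) - (obs_cb - rc_cb) in ppm, where rc_* are
    random-coil reference shifts supplied by the caller."""
    return (obs_ca - rc_ca) - (obs_cb - rc_cb)


def helical_stretches(values, min_len=4, cutoff=0.0):
    """values: list of (residue_number, secondary shift), sorted by residue.
    Return (first, last) residue numbers of runs of consecutive residues
    whose value exceeds cutoff and that are at least min_len residues long.
    A gap in the residue numbering (an unassigned residue) ends a run."""
    runs, current = [], []
    for res, val in values:
        if val > cutoff and (not current or res == current[-1] + 1):
            current.append(res)
        else:
            if len(current) >= min_len:
                runs.append((current[0], current[-1]))
            current = [res] if val > cutoff else []
    if len(current) >= min_len:
        runs.append((current[0], current[-1]))
    return runs


# Hypothetical secondary shifts (ppm) for residues labelled 1-7.
example = [(1, 1.2), (2, 2.0), (3, 1.5), (4, 0.8), (5, -0.5), (6, 1.0), (7, 1.1)]
stretches = helical_stretches(example)
print(stretches)
```

In this picture, a turn inside a stretch that is helical in the isolated domain would appear as a break in the run of positive values, provided the residues involved can be assigned in the FL protein.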

We would also like to point out that the sequential assignments of the isolated NTD already hinted at a larger conformational space for the C-terminal part of the NTD, since peak doubling was observed for residues Thr116, Leu123, Ser124, Ala125 and Lys127 (Wiegand et al. 2016). This can be attributed to two crystallographically distinct HpDnaB-NTD monomers in the asymmetric unit with slightly different local environments for the residues within the C-terminal portion of the NTD, whereas no such behavior was observed for the globular part of the NTD (Wiegand et al. 2016). In the full-length protein, no peak doubling could be observed, as shown in Figure S8 for the example of Thr116. We thus conclude that the C-terminal portion of the NTD indeed adopts a different conformation in the crystals of the isolated NTD than in the FL protein and cannot be used as such for building a FL protein structure from the individual domains.

Conclusions

In this work, we have investigated how and to what extent solid-state NMR can be used to assess the structural conservation of individual domains in large, multimeric protein assemblies. We have applied this approach to the HpDnaB helicase, where we could show that, according to the spectral fingerprints, the global fold of the individual domains is conserved in the context of the FL supramolecular assembly. A site-specific assessment of the NTD structure in the FL protein allowed us to characterize in detail conserved regions, e.g. the globular region of the NTD, as well as regions which undergo conformational changes between the isolated domain and the FL protein, e.g. residues 107–129 of the NTD. This region can thus attain different conformations in different contexts, and the conformation observed in the isolated NTD crystals appears to be artefactual. While the structure of the globular region of the NTD is well conserved in the context of the full-length multimer, that of its flexible C-terminal hairpin is not.

NMR can thus distinguish domains that are structurally conserved and show only minor changes in chemical shifts from domains that change structure in the context of the assembly as a consequence of interactions between individual domains in the full-length protein. In the case of the dodecameric helicase DnaB from H. pylori, residues of the former show an RMS shift difference of only ∼0.15 ppm for the 13Cα and 13Cβ chemical shifts, while residues of the latter change so much (≳1 ppm) that it is difficult to find the corresponding peak pairs. To specifically identify the spots of non-conservation in an overall conserved setting, sequential assignment of these regions is necessary. Solid-state NMR can then indeed provide key information when high-resolution structures of domains are combined with low-resolution X-ray or EM envelopes of large multimeric proteins.