1 Introduction

The characterization of recombinant proteins provides information about their structure, folding, and stability that is invaluable for fundamental research and the biopharmaceutical industry. Many techniques are routinely used to monitor the conformational changes and stability of proteins in solution, including differential scanning calorimetry (DSC), fluorescence spectroscopy, circular dichroism (CD) spectroscopy, nuclear magnetic resonance (NMR) spectroscopy, and X-ray crystallography [1,2,3,4,5,6,7,8,9,10,11]. Although widely used, each of these techniques has limitations, and hence a single method is seldom enough for a detailed characterization of a complex protein. DSC requires high protein concentrations, which are challenging to achieve for aggregation-prone proteins and at bulk manufacturing scale [12]. Fluorescence spectroscopy relies on intrinsic fluorescence, which makes it ineffective for proteins in which a prosthetic group (covalently or non-covalently bound to the protein) close to the fluorophore quenches that fluorescence. Moreover, it reports on local conformational changes around the fluorophores rather than on the global fold of the protein [13, 14]. NMR and X-ray crystallography also have limitations in the structural determination of proteins. In contrast, the ease of CD measurements and the low protein concentrations they require make CD a sought-after technique in structural biology [7, 8]. Table 1 compares CD, NMR, and X-ray crystallography. This chapter discusses the uses of CD spectroscopy to obtain insights into the secondary/tertiary structures and the stability/conformational dynamics of proteins. We also describe in detail the methodological approaches for performing CD measurements and the tools available for data analysis.

Table 1 A comparison of CD spectroscopy, NMR spectroscopy, and X-ray crystallography

2 Determination of Secondary and Tertiary Structures of Proteins Using CD Spectroscopy

Far-UV CD and UV/Vis CD are absorption spectroscopy techniques that investigate the secondary structures of proteins and charge-transfer transitions in metal–protein complexes, respectively. The near-infrared CD is used to study geometric and electronic structures by probing metal d → d transitions, while the vibrational CD is used for structural studies of small organic molecules, proteins, and DNA [15]. CD utilizes the differential absorption of the right-handed and left-handed components of the circularly polarized light by the chiral molecules to study their structural aspects. The difference in the absorption of the left-handed and right-handed circularly polarized light is measured and quantified in CD experiments [16, 17]. The homochirality of amino acids imparts chirality to proteins [18]. All amino acids (except glycine) carry at least one chiral center at Cα; threonine and isoleucine have an additional chiral center at Cβ [19, 20]. The CD signal is observed when a chromophore is optically active (chiral) either (1) intrinsically by its structure, (2) by being covalently linked to a chiral center, or (3) by being placed in an asymmetric environment [8]. CD is widely used to rapidly determine the secondary and tertiary structure of proteins. The CD spectrum is divided into three wavelength regions based on the electronic transitions that predominate in the given wavelength range (Fig. 1). These include:

  1. The far-UV range (190–250 nm), where the contribution from peptide bonds dominates, is used to determine the secondary structure of proteins. A weak but broad n → π* transition region is present around 220 nm, and a stronger and sharper π → π* transition is centered at around 190 nm.

  2. The near-UV range (250–300 nm), where the aromatic side chains contribute significantly, gives details about the tertiary structure of proteins.

  3. The near UV-visible range (300–700 nm), where the extrinsic chromophores contribute, is used to monitor metal ion–protein interactions [8, 21, 22].

Fig. 1
A diagram marks the wavelengths in nanometers from 150 to 600. The contributing chromophores are given as peptide bonds below 250 nm, aromatic side chains from 250 to 300 nm, and extrinsic chromophores above 300 nm.

CD spectral regions in proteins with their respective contributing chromophores

When the chromophores of the polypeptide chain are aligned in arrays, their optical transitions are split into multiple transitions through “exciton” interactions. This gives rise to characteristic CD spectra for the different structural elements in the protein [6, 23]. For instance, α-helix-rich proteins show two negative bands of comparable magnitude at 222 and 208 nm and a strong positive band close to 193 nm. Proteins dominated by antiparallel β-pleated sheet structure have a negative band at 218 nm and a positive band at 195 nm. Disordered proteins rich in random coil structures show a strong negative band near 195 nm [6, 24, 25]. The far-UV CD spectra of various secondary structural elements in proteins are shown in Fig. 2a.

Fig. 2
(a) A line graph of CD (× 10⁻³) versus wavelength in nanometers. Curves are plotted for the α-helix, β-sheet, and random coil, which meet at (250, 0). (b) A plot of molar CD versus wavelength in nanometers. The curve is increasing, with points labeled Phe, Tyr, and Trp.

(a) Standard far-UV CD spectra associated with different secondary structures in proteins. Adapted with permission from Corrêa et al. [24]. (b) The near-UV CD spectrum of an IgG monoclonal antibody. The characteristic peaks corresponding to Trp, Tyr, and Phe signals are shown. Source: https://www.chiralabsxl.com/Circular_Dichroism/CD_App_Protein_NUV.html

Phenylalanine (Phe), tyrosine (Tyr), and tryptophan (Trp) residues, together with disulfide bonds [28, 29], contribute to the near-UV CD of proteins in the wavelength region 250–300 nm, which is the region in which these chromophores absorb. A denatured protein has a weak near-UV CD signal. However, when these chromophores are buried in the folded native protein, they give strong CD signals. The intensity of the CD signal of each chromophore depends on how tightly it is held in the asymmetric environment. Unlike the far-UV CD spectrum, which can be interpreted in terms of secondary structure, the near-UV CD spectrum cannot be directly interpreted in terms of protein structure. However, detailed studies can decompose the spectrum into bands attributed to different chromophores. For instance, Trp exhibits a fine-structured peak between 290 and 305 nm, Tyr displays a peak in the range of 275–285 nm, and Phe shows weak but sharp bands at 255–270 nm. These characteristic peaks arise from vibronic transitions to different vibrational levels of the excited state [8, 16, 26, 27]. Because the near-UV CD spectrum reports on the local tertiary structure of the protein, it can be used for quality control, as it often reveals subtle batch-to-batch changes that are not reflected in the far-UV region.

Simple proteins (i.e., proteins devoid of any prosthetic group) do not absorb above 300 nm, and hence they do not exhibit CD signals in wavelengths above 300 nm. However, many prosthetic groups (non-protein chromophores or extrinsic chromophores), including flavins, pyridoxal, and heme moieties, absorb above 300 nm. In the free state, extrinsic chromophores are either achiral or present as enantiomeric mixtures, so they do not show any optical activity. However, upon interaction with the chiral environment of the protein, they generate optical activity [7, 8]. The heme group is a classic example that shows no CD signal alone but exhibits a strong positive band (Soret band) with a wavelength maximum of 412 nm when incorporated in the apoprotein of hemoglobin and myoglobin. The interaction between heme moiety and aromatic residues of the protein is thought to be the reason for heme chirality and hence the CD signal in the Soret region [30].

2.1 Servers to Estimate the Secondary Structure of Proteins from CD Data

Various web servers are available to estimate the secondary structure of a protein from CD spectroscopic data, including DichroWeb [31], BeStSel [32], and K2D3 [33]. They usually employ reference datasets consisting of proteins with known structures to calculate the secondary structure content that best matches the experimental (query) spectrum. The CD contribution at each wavelength is weighted, and an estimate of the secondary structure content of the protein is returned as the output. These servers use a range of deconvolution methods, from the simple least-squares method to the more complicated singular value decomposition and ridge regression methods. Generally, the more diverse the components in the reference database are, the more accurate the estimation of secondary structure elements in the query spectrum [34]. Note that integral membrane proteins should be analyzed with the specialized reference datasets designed for them, as their transitions tend to occur at somewhat different wavelengths compared to soluble proteins [35].
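As a rough illustration of the simplest of these approaches, the least-squares method, a query spectrum can be modeled as a linear combination of reference spectra for pure secondary structures, with the fitted coefficients read as fractional contents. The Python sketch below is a minimal example under stated assumptions: the three basis spectra are illustrative placeholder values (not taken from any reference dataset), the function names are hypothetical, and the fit is unconstrained, whereas the servers discussed here use larger reference sets and more elaborate algorithms.

```python
# Illustrative placeholder basis spectra (arbitrary units) at five
# wavelengths; these numbers are NOT from any published reference set.
WAVELENGTHS_NM = [195, 200, 208, 218, 222]
HELIX = [60.0, 30.0, -33.0, -32.0, -35.0]
SHEET = [30.0, 15.0, -5.0, -18.0, -15.0]
COIL = [-40.0, -30.0, -10.0, 0.0, 2.0]


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares_fractions(basis, query):
    """Unconstrained least-squares coefficients of `query` on `basis`.

    Solves the normal equations (B B^T) f = B q, where each row of B is
    one reference spectrum sampled at the same wavelengths as the query.
    """
    k = len(basis)
    gram = [[sum(a * b for a, b in zip(basis[i], basis[j]))
             for j in range(k)] for i in range(k)]
    proj = [sum(a * q for a, q in zip(basis[i], query)) for i in range(k)]
    return solve_linear(gram, proj)


if __name__ == "__main__":
    # Synthetic query built from known fractions, then recovered by the fit.
    query = [0.5 * h + 0.3 * s + 0.2 * c
             for h, s, c in zip(HELIX, SHEET, COIL)]
    print(least_squares_fractions([HELIX, SHEET, COIL], query))
```

In practice the coefficients are usually constrained to be non-negative and to sum to one, which this sketch deliberately omits for brevity.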

2.1.1 DichroWeb

DichroWeb (http://dichroweb.cryst.bbk.ac.uk) is a freely available web server for determining the secondary structure of a protein based on CD and synchrotron radiation CD (SRCD) spectra. The server facilitates analyses utilizing five different algorithms: CDSSTR, SELCON3, and VARSLC (SVD methods with variable selection functions), CONTINLL (a regression restraint method), and K2D (a neural network method now replaced by the stand-alone K2D3 method). The server accepts data in a wide range of formats, including those output from both CD and SRCD instruments, and uses seven reference databases for structure prediction depending upon the protein to be analyzed. It generates an output file containing calculated secondary structures, a tabular and graphical display of experimental, calculated, and difference spectra, and a goodness-of-fit parameter (normalized root mean squared deviation or NRMSD) for the analyses [34].

NRMSD is an important parameter that measures the correspondence between the experimental and calculated spectra and is thus used to judge the quality of the results. It is important to note that a low NRMSD value is necessary but not sufficient to conclude that the result of the analysis is correct [34, 36]. DichroWeb highlights the importance of accurate protein concentration and path length, and for accurate analysis it requires that the input data extend down to at least 190 nm and that baselines be properly subtracted before submission. It also emphasizes that the lowest NRMSD does not always correspond to the most accurate solution and that the reference databases do not work well for peptides and membrane proteins [36].
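The NRMSD is commonly defined as the square root of Σ(θexp − θcal)² / Σ(θexp)², summed over all wavelengths. The short Python sketch below computes this quantity for two spectra sampled at the same wavelengths; the function name is hypothetical and the sketch is meant only to make the definition concrete, not to reproduce the server's implementation.

```python
import math


def nrmsd(experimental, calculated):
    """Normalized root mean squared deviation between two spectra.

    Both inputs are sequences of ellipticity values sampled at the same
    wavelengths and expressed in the same units.
    """
    if len(experimental) != len(calculated):
        raise ValueError("spectra must have the same number of points")
    squared_residuals = sum((e - c) ** 2
                            for e, c in zip(experimental, calculated))
    squared_signal = sum(e ** 2 for e in experimental)
    if squared_signal == 0:
        raise ValueError("experimental spectrum is identically zero")
    return math.sqrt(squared_residuals / squared_signal)
```

A value of zero indicates that the calculated spectrum reproduces the experimental one exactly; as noted above, a small value by itself does not guarantee that the estimated secondary structure is correct.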

2.1.2 BeStSel

The BeStSel server (https://bestsel.elte.hu/index.php) is explicitly designed to analyze β-sheet-rich proteins. However, it can be used for the structural analysis of any protein class, including membrane proteins, amyloid fibrils, and protein aggregates. The server performs a comprehensive analysis of different secondary structure elements that distinguishes parallel from antiparallel β-sheets and, for the latter, three types of twist, viz. left-handed, relaxed, and right-handed twisted sheets. Based on this structural analysis, it predicts the protein fold down to the topology level of the CATH protein fold database [32, 34]. Although BeStSel provides precise secondary structure estimation for a wide range of proteins, the server is unsuitable for analyzing some special structure types. Such structures include the polyproline-II helix (a characteristic structure of collagen-like fibrillar proteins), 3₁₀-helices (present in high amounts in some globular proteins), and various types of turns that are the major structural components of short peptides [32]. Also, BeStSel produces large RMSD values for intrinsically disordered proteins (IDPs) and is not a useful tool for studying this class of proteins [37].

2.1.3 K2D3

The K2D3 server (http://cbdm-01.zdv.uni-mainz.de/~andrade/k2d3/) is based on a neural network approach and is a successor to the K2D method. The theoretical CD spectra of a non-redundant set of structures representing most proteins in the PDB are calculated using DichroCalc (https://comp.chem.nottingham.ac.uk/dichrocalc/). These theoretical CD spectra then serve as a reference dataset, which is directly applied to predict protein secondary structure. A predicted CD spectrum is generated from the most similar CD spectra in the reference database, weighted by their distances from the query spectrum. The output contains the query spectrum overlaid on the back-calculated spectrum along with the estimated values of α-helix and β-sheet content. The K2D3 server does not report any parameter that describes the quality of the fit. However, if the distance between the query spectrum and the most similar spectrum in the database is greater than a threshold value, a warning is displayed [33].

3 Determination of Conformational Changes in the Protein Using CD Spectroscopy

The conformation and structural stability of proteins are key determinants of their physiological functions. Structural perturbation of proteins is one of the main reasons for the onset and progression of several diseases, including neurodegenerative disorders and cancer. For example, misfolding of α-synuclein and amyloid β leads to protein oligomerization and fibrillation, resulting in Parkinson’s and Alzheimer’s diseases, respectively [38, 39]. The structural alteration of the prion protein in a cell membrane environment, with subsequent deposition of amyloid plaques, is known to cause prion disease [40]. Genomic instability that results from mutations in crucial genes is a hallmark of almost all cancers [41]. The effects of mutations on proteins include changes in activity, binding mode and interactions, complex stability, and turnover rate.

Proteins bind to their specific targets in a precise manner, and the specificity of these interactions is predominantly defined by the structural and physicochemical properties of the binding interfaces [42,43,44]. A structural alteration in the protein due to genetic mutation can disrupt binding to the intracellular target and thereby impair the function of the protein. For instance, missense mutations in the BRCT domain inhibit the ability of BRCA1 to recognize its substrates. Consequently, the functional role of BRCA1 in the DNA damage repair pathway is hindered, and this loss of function is responsible for most hereditary breast and ovarian cancer cases [45]. Protein misfolding/unfolding and degradation also play crucial roles in the development of lung diseases, particularly chronic obstructive pulmonary disease (COPD) and idiopathic pulmonary fibrosis, and their associated clinical complications [46, 47].

Protein denaturation and aggregation are major problems during manufacturing, storage, and transport in the biotechnological and pharmaceutical industries [48]. For instance, therapeutic proteins like antibodies and insulin can denature in the bulk solution or at different interfaces during mass production. The functionality of a protein in the physiological environment or in industrial applications is highly dependent on its native conformation. Thus, it is imperative to monitor conformational changes in the protein due to mutations, pH fluctuations, heat, denaturants, or binding interactions with ligands and to analyze their functional consequences. CD is a reliable and convenient spectroscopic technique to detect conformational changes in the protein. Moreover, the time dependence of protein structural changes can be determined using time-resolved CD measurements. CD is also valuable for studying peptides whose structures cannot feasibly be determined by X-ray crystallography [7]. A classic example of such a study is the switching between α-helix and β-sheet structures in prion peptides [49].

4 Analyzing the Conformational Changes in a Polypeptide Sequence upon Mutations Using CD

CD measurements can easily detect structural alterations in the protein upon mutations. Figure 3a shows the far-UV CD spectra of wild-type CopR protein and its mutants (Dim1-7). The wild-type protein shows an α-helical structure as depicted by a positive band at around 192 nm and two negative bands near 208 and 222 nm. The far-UV CD spectra of mutant proteins (Dim1, Dim3, Dim4, and Dim5) show minor deviations in the ellipticity pattern compared to the wild-type protein. This indicates that mutations did not drastically perturb the secondary structure of the proteins. In contrast, the CD spectrum of Dim6 shows a drastic reduction in α-helical content, pointing toward the significant structural perturbations upon single point mutation. The CD spectrum of Dim7 also shows conformational changes, but they are less pronounced than those in Dim6 [50].

Fig. 3
(a) A line graph of θ versus wavelength in nanometers, with 7 labeled curves that fall and converge at (250, 0). (b, c) Two line graphs of θ in millidegrees versus λ in nanometers. Each plots two lines, which initially fall and converge at (250, 0) in (b) and fluctuate in (c).

A transcriptional activator protein. (a) Far-UV CD spectra of wild-type and mutant forms (Dim 1–7) of CopR. Adapted with permission from Steinmetzer et al. [50]. (b) Far-UV CD and (c) near-UV CD spectra of wild-type and mutant γS-crystallin (V42M). Adapted with permission from Vendra et al. [51]

Another example of monitoring mutation-induced structural changes in the protein is γS-crystallin. Figure 3b, c compares the far-UV and near-UV CD spectra of the wild-type and mutant γS-crystallin (V42M). The far-UV CD spectrum of the wild-type γS-crystallin displays two spectral bands, a negative band near 218 nm and a positive band at 195 nm, signifying the β-sheet secondary structural fold of the polypeptide chain. It should be noted that far-UV CD spectral profiles of wild-type and mutant protein were almost identical, indicating that the V42M mutation does not significantly affect the secondary structure of the protein. Strikingly, the near-UV CD spectra reveal that the tertiary structure around the aromatic residues is moderately altered in the mutant protein [51].

5 Analysis of Protein–Ligand Interactions

The binding of proteins with specific ligands such as cofactors, substrates, or regulatory molecules leads to structural changes vital for their physiological function. Such conformational changes may be monitored by the far-UV CD, near-UV CD, or both [52]. However, if the binding occurs near the aromatic amino acid residues, then small structural changes are easier to detect in the near-UV region since the CD contributions of the aromatic residues are highly sensitive to their environment. In contrast, the major structural changes in the protein’s backbone are usually reflected in the far-UV CD spectrum [53, 54].

The binding of a ligand to a macromolecule can be studied by CD if the signal from the complex differs from the sum of the signals from the components. Protein–protein interaction is a widespread and important biological regulatory phenomenon within cells. CD is a valuable technique for studying protein–protein interactions when changes occur in the secondary or tertiary structure of one or both components. The binding of small-molecule ligands such as metal ions or drugs is often accompanied by changes in the CD signal due to changes in the secondary or tertiary structure of the protein or in the conformation of the ligand. Small-molecule ligands usually have no or only a weak CD signal when free in solution. Still, they can show notable ellipticity when bound in the asymmetric environment of the binding pocket on the protein. Apart from protein–protein and protein–small-molecule interactions, CD is particularly well suited to investigating protein–nucleic acid interactions, as nucleic acids have strong signals in the near-UV region (250–300 nm), where proteins usually absorb weakly [55].

Human polynucleotide kinase (hPNK) acts by transferring the γ-phosphate of ATP to the 5′ ends of nucleic acids. Hence, the binding of ATP to hPNK is crucial for its proper functioning. Figure 4 shows the conformational changes occurring in hPNK upon binding to ATP. Two negative CD bands at 218 and 209 nm are observed in the far-UV CD spectrum of hPNK, a characteristic of a mixed α/β structure, with the band near 218 nm being attributed to the presence of β-structure in the protein. The binding of ATP to hPNK induces a substantial conformational change, as suggested by a decreased ellipticity value. The analysis of the CD data indicated an increase in β-sheet structure and a decrease in α-helical content in hPNK upon ATP binding [56]. The near-UV CD spectra also showed that the environment of aromatic amino acid residues is perturbed upon adding ATP. A substantial increase in the CD signal at the Trp peak near 291 nm is observed in the presence of ATP. Strikingly, two well-defined peaks attributed to tyrosine residues are observed at 284 and 278 nm instead of the broad shoulder around 279 nm seen in hPNK alone. The CD bands corresponding to Phe residues also show reduced ellipticity values in the apoprotein [56].

Fig. 4
Two graphs plot molar ellipticity versus wavelength in nanometers. In (a), the lines near 220 nm are marked hPNK + ATP and hPNK alone; the curves fall and then increase. In (b), the peak at 280 nm is marked hPNK alone and the peak at 290 nm is marked hPNK + ATP; the curves fluctuate. All values are estimated.

Far-UV (left panel) and near-UV (right panel) CD spectra of hPNK alone (●) and in the presence of ATP (▲). Adapted with permission from Mani et al. [56]

Another example in which CD spectroscopy was used to monitor structural changes upon ligand binding is SbmA, an inner membrane protein of a Gram-negative bacterium. SbmA is required for the direct uptake of eukaryotic glycopeptides and antimicrobial peptides. Far-UV CD spectroscopy showed that SbmA interacts with the proline-rich peptide Bac7 and that this interaction induces conformational changes, as revealed by the decrease in the CD signal in the wavelength range of 190–250 nm (Fig. 5). The dissociation constant (Kd) obtained by fitting the CD data was 0.26 μM, indicating a high binding affinity of Bac7 for SbmA. In contrast to the far-UV CD spectra, no significant changes are observed in the near-UV CD spectral range, indicating that no aromatic residues are present in close vicinity of the binding interface between protein and ligand [54].
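A dissociation constant of this kind is typically extracted by fitting a binding model to the CD signal recorded at increasing ligand concentrations. The Python sketch below illustrates the idea for a simple 1:1 binding model that accounts for ligand depletion; the choice of model, the coarse grid search, the assumption that the free and bound signals are known, and all function names are assumptions made for illustration and are not taken from the cited study.

```python
import math


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein in complex for 1:1 binding.

    All concentrations share one unit (e.g. micromolar). The complex
    concentration is the physical root of the mass-action quadratic.
    """
    if p_total <= 0:
        raise ValueError("p_total must be positive")
    b = p_total + l_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0
    return complex_conc / p_total


def cd_signal(p_total, l_total, kd, s_free, s_bound):
    """Observed CD signal as a population-weighted average of two states."""
    fb = fraction_bound(p_total, l_total, kd)
    return s_free + (s_bound - s_free) * fb


def fit_kd(p_total, l_totals, signals, s_free, s_bound):
    """Coarse least-squares grid search for Kd between 1e-3 and 1e3.

    Candidates are spaced by 0.01 log10 units; s_free and s_bound are
    assumed known (a simplification; they are often fitted as well).
    """
    best_kd, best_sse = None, float("inf")
    for i in range(601):
        kd = 10.0 ** (-3.0 + 0.01 * i)
        sse = sum((cd_signal(p_total, l, kd, s_free, s_bound) - s) ** 2
                  for l, s in zip(l_totals, signals))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd
```

For real data, a continuous optimizer and an estimate of the parameter uncertainty would normally replace the grid search used here.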

Fig. 5
(a) A graph of ΔA versus wavelength in nanometers. The values for SbmA and Bac7 are plotted; the curve dips at 220 nm toward negative values. (b) A graph of ellipticity versus wavelength. The values are plotted for SbmA, SbmA + 2× Bac7, and 2× Bac7, with a peak at 3.8 for the former two.

Far-UV (left panel) and near-UV (right panel) CD spectra of SbmA with and without Bac7. Adapted with permission from Hussain et al. [54]. ΔA (absorption unit) = θ (ellipticity in millidegree)/3298.2

6 Determination of Protein Folding Pathways

The proper folding of a polypeptide chain into its biologically functional native structure is one of the fundamental processes of biology. Misfolding of proteins in the cellular milieu often leads to fatal human and animal diseases. In the industrial context, overexpression of recombinant proteins can lead to misfolding and aggregation, causing significant loss of the final product. CD is one of the many biophysical techniques routinely used to understand various aspects of the protein folding process, including the kinetic and thermodynamic properties of folding intermediates. Investigating protein folding mechanisms in vitro also provides valuable information about cellular processes such as protein trafficking and degradation.

A CD spectrophotometer coupled with a stopped-flow system is regarded as one of the best tools to study the mechanism of unfolding and refolding of proteins. This system provides structural data on the protein during the refolding process at a sub-millisecond time scale, which can be used to explore the mechanism of the protein folding pathway [57, 58]. Figure 6 shows the refolding of denatured cytochrome c measured in the far-UV and near-UV regions using a stopped-flow system attached to the CD spectrophotometer. The changes in the CD signal at 222 nm are fast and occur within a time scale of 200 ms, suggesting fast refolding of the secondary structure of the protein. However, the ellipticity changes at 289 nm, which reflect the environment around aromatic residues, are relatively slow. Overall, the refolding kinetics measurements indicate the brief existence of an intermediate state with a folded secondary structure but flexible aromatic side chains in the tertiary structure during the early stages of the refolding process [57].
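Kinetic traces of this kind are often described by a sum of exponential phases. As a hedged illustration, the Python sketch below defines a double-exponential model and converts a rate constant into a half-life, t1/2 = ln 2/k. The amplitudes, the offset, and the function names are hypothetical and serve only to make the example runnable; the double-exponential form itself is an assumption suggested by the two rate constants reported per panel in Fig. 6.

```python
import math


def half_life(rate_constant):
    """Half-life (s) of a first-order phase with rate constant in s^-1."""
    if rate_constant <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2.0) / rate_constant


def biexponential(t, a1, k1, a2, k2, offset=0.0):
    """CD signal at time t (s) for two independent first-order phases.

    a1 and a2 are the (hypothetical) amplitudes of the fast and slow
    phases; offset is the signal reached at long times.
    """
    return a1 * math.exp(-k1 * t) + a2 * math.exp(-k2 * t) + offset


if __name__ == "__main__":
    # Rate constants as listed for the two panels of Fig. 6 (s^-1).
    for label, k in [("panel A, k1", 13.72), ("panel A, k2", 1.22),
                     ("panel B, k1", 1.52), ("panel B, k2", 0.24)]:
        print(label, round(half_life(k), 3), "s")
```

Assuming panel A corresponds to 222 nm and panel B to 289 nm, the fast phase has a half-life of about 0.05 s in the far-UV trace and about 0.46 s in the near-UV trace, in line with the faster recovery of secondary structure described above.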

Fig. 6
Two line graphs plot CD/DC versus time in seconds. Both plots have a declining trend. In (a), the values of k1 and k2 are 13.72 and 1.22 s⁻¹. In (b), the values of k1 and k2 are 1.52 and 0.24 s⁻¹.

Refolding kinetics of cytochrome c monitored at 222 nm (left panel) and 289 nm (right panel) using a stopped-flow CD system. Adapted with permission from [57]

7 Determination of Protein Stability

The stability of a protein is a fundamental property defined by the physicochemical conditions under which the protein is optimally functional. Hence, it is important to identify conditions that maximize the structural stability of the protein, not only for basic protein research but also to obtain a good yield of therapeutic proteins and other protein-based formulations in the biotechnology industry. By improving the structural stability of the protein, off-pathway processes such as denaturation and aggregation can be prevented. However, it is important to note that the conditions (e.g., ionic strength, pH) that optimize the protein’s physical stability might have deleterious effects on its chemical stability (e.g., deamidation, oxidation). Therefore, the most stable protein formulation is achieved by considering all aspects of product quality, even if this requires some compromise between the physical and chemical stability of the protein [59]. The various methods that can be employed to measure the stability of a protein require the disruption of the native protein structure by either chemical or physical means. The conformational stability is essentially proportional to the resistance of the protein toward perturbation. The physical denaturation tools used to assess protein stability include temperature, high pressure, mechanical agitation, ultrasound, and ultraviolet radiation [60,61,62,63,64]. In comparison, the chemical denaturation of a protein can be achieved by strong acids and bases, high concentrations of inorganic salts, salinity, organic solvents, and heavy metal salts [65,66,67,68,69].

7.1 Thermal Denaturation

Temperature is the most widely used tool for the physical denaturation of proteins. Ideally, the thermal denaturation of a protein should be studied at its isoelectric point. However, the native protein is the least soluble at this pH, and the denatured protein is more prone to aggregation. Another disadvantage is that proteins usually denature far above the physiological or storage temperature. This requires long extrapolations of data to lower temperatures during thermodynamic analysis, which are often error-prone [70]. Furthermore, the thermal denaturation of proteins near their isoelectric points is usually an irreversible process, which makes the calculation of stability parameters from the analysis of thermodynamic data highly unreliable and ambiguous. In such a case, physical stability rankings are based only on Tm values, which represent only a small part of the protein conformational stability curve as a function of temperature [71]. Irreversible protein denaturation is mostly followed by aggregation, affecting the accuracy of the measured Tm values [72,73,74]. Additionally, the Tm value depends on the rate at which the temperature is increased during the measurement, further complicating the extrapolation of stability to lower temperatures.

The changes in the CD signal at a specific wavelength and as a function of temperature provide information about the thermodynamics of the protein unfolding process. The parameters retrieved from the CD thermodynamic data include the melting temperature or the midpoint temperature of the unfolding transition (Tm), the free energy of unfolding (ΔG), the van’t Hoff enthalpy (ΔH) and entropy (ΔS) of unfolding, and the heat capacity changes (ΔCP) of the unfolding transition [75]. Additionally, the analysis of CD spectra of protein acquired as a function of temperature provides information about the presence of intermediates in the folding pathway [7]. It should be noted that stability parameters obtained from the thermal denaturation curve of a protein must be validated by other experiments, such as the differential scanning calorimetry (DSC) measurements. In addition, an accurate determination of stability parameters from the analysis of optical denaturation curves depends on the temperature dependence of the pre- and post-transition baselines [76,77,78].
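For a reversible two-state transition, these parameters are commonly linked through the Gibbs–Helmholtz relation, ΔG(T) = ΔHm(1 − T/Tm) − ΔCp[(Tm − T) + T ln(T/Tm)], where ΔHm is the van’t Hoff enthalpy at Tm. The following Python sketch evaluates this relation and the corresponding fraction unfolded; the two-state assumption, the function names, and any numerical values passed to the functions are illustrative assumptions rather than results from the cited studies.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def delta_g_unfolding(temp_k, tm_k, delta_h_m, delta_cp=0.0):
    """Free energy of unfolding (kJ/mol) from the Gibbs-Helmholtz relation.

    temp_k, tm_k: temperature and melting temperature in kelvin.
    delta_h_m: van't Hoff enthalpy at Tm in kJ/mol.
    delta_cp: heat capacity change in kJ mol^-1 K^-1 (0 if neglected).
    """
    return (delta_h_m * (1.0 - temp_k / tm_k)
            - delta_cp * ((tm_k - temp_k)
                          + temp_k * math.log(temp_k / tm_k)))


def fraction_unfolded_two_state(temp_k, tm_k, delta_h_m, delta_cp=0.0):
    """Fraction unfolded for N <-> U, using K = exp(-dG/RT) = fu/(1 - fu)."""
    dg = delta_g_unfolding(temp_k, tm_k, delta_h_m, delta_cp)
    return 1.0 / (1.0 + math.exp(dg / (R * temp_k)))
```

By construction ΔG is zero and the fraction unfolded is 0.5 at T = Tm; fitting these two functions to a measured melting curve, together with sloping pre- and post-transition baselines, is one common way to estimate Tm and ΔHm.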

Figure 7 shows the thermal unfolding transition of MTH1880, a thermophilic protein from Methanobacterium thermoautotrophicum, probed by CD spectroscopy. Far-UV CD spectra were acquired with increasing temperature from 25 °C to 105 °C at intervals of 10 °C (Fig. 7a). MTH1880 shows an α-helical secondary structure in the temperature range of 25–45 °C, suggesting that the protein retains its native structural fold up to 45 °C. The CD signal declines continuously from 55 °C to 95 °C, at which point the protein is completely denatured and shows no further change in ellipticity with a further increase in temperature. The thermal unfolding of MTH1880 follows a two-state unfolding pathway [79]. The raw CD data were converted into fu, i.e., the fraction unfolded [77], which was used to generate a plot of fu versus T (Fig. 7b). The normalized denaturation curve was fitted to a sigmoidal curve to derive thermodynamic parameters. The midpoint of the unfolding transition, or melting temperature (Tm), of MTH1880 was 76 °C, as defined by fu = 0.5.
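The conversion of raw ellipticity into fraction unfolded is usually written as fu = (yN − y)/(yN − yU), where yN and yU are the signals of the native and unfolded states. The Python sketch below applies this normalization and locates Tm by linear interpolation at fu = 0.5. It is a simplified sketch that assumes temperature-independent baselines and uses hypothetical function names; the ellipticity values in the example are invented for illustration and are not the MTH1880 data.

```python
def fraction_unfolded(signal, native_signal, unfolded_signal):
    """Normalize an observed signal to the fraction unfolded (0 to 1)."""
    return (native_signal - signal) / (native_signal - unfolded_signal)


def melting_temperature(temps_c, fu_values, level=0.5):
    """Temperature at which fu crosses `level`, by linear interpolation.

    temps_c must be sorted in increasing order and paired with fu_values.
    """
    for i in range(len(temps_c) - 1):
        f0, f1 = fu_values[i], fu_values[i + 1]
        if f0 <= level <= f1 and f1 > f0:
            t0, t1 = temps_c[i], temps_c[i + 1]
            return t0 + (level - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("fu never crosses the requested level")


if __name__ == "__main__":
    # Invented ellipticity values at 222 nm (arbitrary units).
    temps = [25, 45, 55, 65, 75, 85, 95, 105]
    theta = [-30.0, -30.0, -28.0, -24.0, -20.5, -14.0, -10.0, -10.0]
    fu = [fraction_unfolded(y, theta[0], theta[-1]) for y in theta]
    print(melting_temperature(temps, fu))
```

Interpolation gives only a quick estimate; fitting the whole curve to a two-state model makes better use of the data and allows sloping baselines to be included.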

Fig. 7
(a) A graph of θ (× 10⁻³) versus wavelength in nanometers. The curves increase, and two temperatures, 105 and 25 °C, are marked. (b) A plot of fu, the fraction unfolded, versus temperature in degrees Celsius. The curve increases and reaches a maximum of 1.

Thermal unfolding of MTH1880. (a) Far-UV CD spectra measured at 10 °C intervals during the heating cycle (25–105 °C); 25 °C (black triangle), 45 °C (red square), 65 °C (blue circle), 75 °C (yellow square), 85 °C (pink triangle), 95 °C (green square), and 105 °C (pink circle). (b) A plot of fraction unfolded (fu) derived from molar ellipticity measured at 222 nm as a function of temperature. The midpoint temperature of the unfolding transition (Tm) is 76 °C. Adapted with permission from Kim et al. [79]

Folding intermediates provide crucial information regarding the folding and assembly pathways. Thermal denaturation studies of proteins often detect such folding/unfolding intermediates. Figure 8 shows the melting curve of the Escherichia coli histone-like HU protein (EcHUa2) obtained by plotting the CD signals at 200 and 222 nm as a function of temperature. The thermal transition curve at 200 nm is biphasic, indicating a three-state denaturation mechanism (N → I → U) for EcHUa2 unfolding. An intermediate state is populated between the N and U states at around 48 °C. The denaturation process is marked by two melting temperatures, 37.8 °C and 54.8 °C, corresponding to the N → I and I → U transitions, respectively [80].

Fig. 8
A line graph plots theta multiplied by 10 raised to minus 4 versus temperature in degrees Celsius. The lines are 200 and 222 nanometers. The lines intersect at an intermediate point at (50, minus 0.5). 2 vertical lines are marked from the plotted lines and are labeled T m 1 and T m 2.

Thermal unfolding of EcHUa2 followed by CD spectroscopy at 200 and 222 nm. An intermediate state is populated at around 48 °C in the melting curve. Adapted with permission from Ramstein et al. [80]

7.2 Chemical Denaturation

As discussed above, thermal denaturation studies of proteins are often complicated and yield unreliable thermodynamic parameters if the unfolding process is irreversible. In such cases, protein stability is measured by alternative approaches that employ chemical denaturants to unfold the protein near physiological temperature. Commonly used denaturants in isothermal chemical denaturation studies are urea and guanidine hydrochloride (GdnHCl) [81,82,83]. These chemical denaturants can prevent aggregation by keeping the unfolded protein species in a stable, solubilized form, thus making the unfolding reaction reversible. However, there are a few exceptions; for instance, low concentrations of GdnHCl fail to keep the denatured protein in soluble form [84, 85].

Measurement of a typical urea (or GdnHCl)-induced denaturation curve monitored by CD involves (i) preparation of samples in which increasing concentrations of denaturant are added to the protein solution, followed by incubation at room temperature until equilibrium is reached, so that the unfolding reaction is complete at each denaturant concentration [59], and (ii) plotting of the CD signal at a given wavelength (θ, the raw ellipticity, or [θ], the mean residue ellipticity) as a function of [denaturant], the molar concentration of the denaturant. The denaturation curve is analyzed for stability parameters, namely ΔGD0 (the Gibbs free energy change (ΔGD) associated with the N ↔ D process in the absence of the denaturant), m (the dependence of ΔGD on [denaturant]), and Cm (the midpoint of the denaturation curve). Analysis of the GdnHCl-induced and urea-induced denaturation curves is discussed elsewhere [86,87,88,89,90,91,92,93]. This analysis assumes that the protein denaturation is reversible. The preparation of protein samples for checking the reversibility of the denaturation by urea (or GdnHCl) is described elsewhere [86]. It should be noted that estimates of stability parameters depend on the mechanism of denaturation. Therefore, it has to be validated whether the denaturation is a two-state process [94].
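
As a rough illustration of how ΔGD0, m, and Cm relate to one another, the following Python sketch applies the widely used linear extrapolation model, ΔGD = ΔGD0 − m[denaturant], to a reversible two-state transition. This is one common choice and not necessarily the analysis used in the cited works. The parameter values, the noise-free synthetic data, and the function names are assumptions made for the example, and m is taken here as a positive number (the magnitude of the slope).

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def fu_two_state(conc, dg0, m, temp_k=298.15):
    """Fraction unfolded for N <-> D with dG_D = dG_D0 - m * [denaturant]."""
    dg = dg0 - m * conc
    k = math.exp(-dg / (R * temp_k))  # equilibrium constant [D]/[N]
    return k / (1.0 + k)


def fit_linear_extrapolation(conc, fu, temp_k=298.15, lo=0.1, hi=0.9):
    """Least-squares line through dG_D versus [denaturant], using only
    transition-region points (lo <= fu <= hi), where dG_D is well defined."""
    pts = [(c, -R * temp_k * math.log(f / (1.0 - f)))
           for c, f in zip(conc, fu) if lo <= f <= hi]
    n = len(pts)
    mean_c = sum(c for c, _ in pts) / n
    mean_g = sum(g for _, g in pts) / n
    slope = (sum((c - mean_c) * (g - mean_g) for c, g in pts)
             / sum((c - mean_c) ** 2 for c, _ in pts))
    dg0 = mean_g - slope * mean_c   # extrapolation to 0 M denaturant
    m = -slope                      # reported as a positive value
    return dg0, m, dg0 / m          # Cm is where dG_D = 0


# Hypothetical parameters: dG_D0 = 25 kJ/mol, m = 8 kJ/mol/M (Cm = 3.125 M).
conc = [0.25 * i for i in range(25)]  # 0 to 6 M denaturant
fu = [fu_two_state(c, dg0=25.0, m=8.0) for c in conc]
dg0_fit, m_fit, cm_fit = fit_linear_extrapolation(conc, fu)
print(f"dG_D0 = {dg0_fit:.2f} kJ/mol, m = {m_fit:.2f} kJ/mol/M, Cm = {cm_fit:.3f} M")
```

Because ΔGD is extrapolated from the narrow transition region back to 0 M denaturant, small errors in m translate into large errors in ΔGD0, which is one reason the two-state assumption should be checked before the parameters are interpreted.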

Figure 9 shows the stability studies of human carbonic anhydrase II (HCAII) employing the chemical denaturation method. Here, far-UV CD spectra of HCAII were collected at different urea concentrations (Fig. 9a). The α-helical content increased with the addition of low concentrations of urea (0–2 M). A further increase in the urea concentration shifts the peak toward 218 nm, indicating the transformation of α-helix into β-sheet structure [95]. To obtain the denaturation curve, the molar ellipticity at 222 nm was plotted as a function of [Urea] (Fig. 9b). HCAII shows a cooperative triphasic unfolding profile with two distinct intermediate species (XI and XII) populated at around 2 and 4 M [Urea] on the denaturation pathway N ↔ XI ↔ XII ↔ D. From 0 to 2 M urea, a continuous gain in secondary structure was observed, which was then lost progressively with further increase in [Urea] until the protein was completely denatured. Values of the midpoint urea concentration (Cm) for the transitions N ↔ XI, XI ↔ XII, and XII ↔ D were obtained by analyzing the denaturation curve, assuming that each transition follows a two-state mechanism (CmI = 1.33 M, CmII = 3.25 M, CmIII = 5.78 M) [95].

Fig. 9
Graph A plots theta lambda cross 10 raised to minus 3 versus wavelength in nanometers. The lines are plotted for native, 1 to 8 M. The values are below the origin in the negative quadrant. B plots theta 222 cross 10 raised to minus 3 versus urea. The increasing curve is marked at points N, X 1, X 2, and D.

Chemical denaturation of HCAII followed by CD spectroscopy. (a) Far-UV CD spectra of HCAII at various [Urea]. (b) Urea-induced denaturation curve obtained by plotting the change in [θ] at 222 nm as a function of [Urea]. The inset shows the dependence of the optical properties of intermediates, yXI and yXII, on [Urea]. Adapted with permission from Wahiduzzaman et al. [95]

Chemical denaturation studies provide valuable information about destabilizing mutations that make the native protein either non-functional or prone to aggregation, leading to devastating diseases [96,97,98,99]. Figure 10 shows the destabilizing effects of two single point mutations on wild-type SOD1, whose misfolding and aggregation have been implicated in amyotrophic lateral sclerosis (ALS). The GdnHCl-induced unfolding of SOD1pWT and its mutants, SOD1E100G and SOD1V14G, was monitored using far-UV CD. The denaturation curves of both wild-type and mutant proteins follow a two-state unfolding transition. The Cm values corresponding to the transition midpoints were 4.2, 3.7, and 3.0 M for SOD1pWT, SOD1E100G, and SOD1V14G, respectively. The lower Cm values of the mutants indicate that both mutations destabilize the wild-type SOD1 protein. Although both mutations are positioned far away from the dimer interface and metal-binding site, they perturb metal loading at the active site. The partially metallated SOD1 is prone to misfolding and aggregation, which is associated with the neurodegenerative disorder [97].

Fig. 10
A line graph plots the fraction unfolded versus the concentration of G d n H c l in M. The lines are plotted for S O D 1 superscript V 1 4 G, S O D 1 superscript E 100 G, and S O D 1 superscript p W T. The lines are increasing.

GdnHCl-induced chemical denaturation curves of SOD1 and its mutants monitored by CD spectroscopy. SOD1pWT (blue), SOD1E100G (red), and SOD1V14G (black). Adapted with permission from Tompa et al. [97]

8 Time-Resolved CD Measurements

CD spectroscopy has long been known as a reliable technique for determining the structural elements of proteins. In the past, it was used only to investigate static structural properties. However, it can now be employed to study protein dynamics through kinetic measurements using time-resolved CD spectroscopy. CD spectroscopy can be coupled with stopped-flow kinetic techniques to determine time-dependent structural changes in the protein. Time-resolved CD measurements can detect events occurring on the millisecond timescale at a single wavelength [100]. The information from such studies provides mechanistic details of the protein folding phenomenon. Various excellent examples of time-resolved CD measurements are available in the literature [101,102,103]. Based on them, it seems that small proteins fold rapidly following a two-state transition without any detectable intermediate state(s). In contrast, larger proteins (more than 100 amino acids) usually fold via multi-state transition pathways. For larger proteins, native-like secondary structure is formed at the early folding stages, followed by the acquisition of tertiary structure interactions. The early intermediates formed often possess “molten globule”-type characteristics.

In many experimental cases, a rapid burst phase in protein folding kinetics is reported using stopped-flow CD spectroscopy. The burst phase occurs during the dead time of the instrument and produces an initial CD signal that differs from that expected for the unfolded protein. This difference in CD signal indicates that a substantial structural change on the way from the D to the N state occurred during the initial burst phase [58]. An example of protein folding kinetics investigated using stopped-flow CD spectroscopy is shown in Fig. 11. The C-terminal domain (CTD) of spidroin 1 from the Ma gland (MaSp1) of the nursery web spider Euprosthenops australis was chemically denatured and refolded by rapid mixing into the buffer solution. A rapid burst phase of approximately 10 ms occurred within the dead time and was therefore beyond the time resolution of the instrument. This was followed by a slow, single-exponential relaxation phase on the time scale of seconds [104]. To obtain the molecular details of the slow phase, the folding kinetics were measured at different protein concentrations (Fig. 11). Exponential fits to the data at the various protein concentrations gave time constants that were almost identical within the error limits. This shows that the slow phase is independent of protein concentration and therefore reflects a monomolecular folding event. It is also speculated that a dimeric intermediate is formed by the association of unfolded monomers during the rapid, unresolved burst phase. This bimolecular dimerization event occurs too rapidly to be observed at the protein concentrations needed for sufficient signal in CD spectroscopy.

Fig. 11
A line graph plots the C D signal at 222 nanometers versus time in seconds. The lines are plotted for 24, 38, and 100 micromolar. The values fluctuate and decline.

Folding kinetics of CTD of spidroin 1 from the Ma gland of Euprosthenops australis. The kinetic transients were measured by far-UV CD stopped-flow spectroscopy. Chemically denatured wild-type CTD samples at different protein concentrations (24, 38, and 100 μM) were refolded by rapid mixing into the buffer solution. The data were fitted to a mono-exponential decay function, and the fits are depicted as black lines. Adapted with permission from Rat et al. [104]
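
The concentration-independence test described above can be sketched in Python as follows. The traces are noise-free synthetic data generated with a hypothetical time constant of 2 s; only the three protein concentrations are taken from the text. The plateau is treated as known, and the log-linear regression is a simplification of the nonlinear least-squares fitting that is normally applied to noisy stopped-flow transients. All function names are illustrative.

```python
import math


def fit_single_exponential(times, signal, plateau):
    """Estimate A and tau in y = plateau + A * exp(-t / tau) by linear
    regression of ln|y - plateau| on t (plateau assumed to be known)."""
    pts = [(t, math.log(abs(y - plateau)))
           for t, y in zip(times, signal) if y != plateau]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    intercept = mean_l - slope * mean_t
    sign = 1.0 if signal[0] > plateau else -1.0
    return sign * math.exp(intercept), -1.0 / slope  # (A, tau)


def synthetic_trace(times, conc_um, tau_s=2.0):
    """Hypothetical refolding transient: the signal scales with protein
    concentration, while tau is the same for every concentration."""
    scale = conc_um / 100.0
    plateau = -30.0 * scale
    return [plateau + 8.0 * scale * math.exp(-t / tau_s) for t in times], plateau


times = [0.1 * i for i in range(1, 60)]  # seconds after mixing
taus = {}
for conc in (24, 38, 100):               # protein concentration in micromolar
    trace, plateau = synthetic_trace(times, conc)
    _, taus[conc] = fit_single_exponential(times, trace, plateau)
    print(f"{conc} uM: tau = {taus[conc]:.3f} s")
```

For a unimolecular step, the fitted time constants agree across concentrations, as in this sketch; a bimolecular rate-limiting step would instead give apparent time constants that shorten as the protein concentration increases.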

9 Concluding Remarks

The protein folding phenomenon is of fundamental and practical importance, making biophysical studies of protein folding and stability crucial. CD spectroscopy is an invaluable tool that monitors structural changes at the secondary and tertiary levels with millisecond time resolution. It is a fast, reliable, and inexpensive technique for the initial investigation of recombinant proteins or those purified from tissues. Compared with X-ray crystallography and NMR spectroscopy, the two primary structure determination techniques, CD spectroscopy is easy to perform and requires only low protein concentrations, which makes it a desirable and widely sought-after method. This chapter provides a comprehensive overview of the CD spectroscopy technique, its principle, and its applications in protein structural biology. Although CD can monitor fine structural details, more advanced and sophisticated instrumentation must be developed to detect events that occur too fast to be observed by currently available stopped-flow CD instruments. With the development of synchrotron radiation circular dichroism (SRCD), which uses high-intensity light sources, it has become feasible to measure data at lower wavelengths, where more electronic transitions occur and more structural detail is therefore available. The high signal-to-noise ratio conferred by SRCD enables CD measurements in the presence of detergents, lipids, and other absorbing buffers.