
1 Activity-Stability-Flexibility in Psychrophilic Enzymes

It is crucial to unravel the structural mechanisms that govern the relationship between thermal stability, activity and dynamics in psychrophilic enzymes, both for fundamental research and for industrial applications (Gerday et al. 2000). Indeed, enzymes isolated from cold-adapted organisms are often of interest for their high activity at very low temperatures, thermolability and unusual specificity, which make them suitable for a large spectrum of industrial applications. The study of enzyme cold adaptation also has a broader and more general relevance. Cold-adapted organisms are able to grow and proliferate successfully in habitats that are very challenging and restrictive for life. It is thus remarkable that these organisms evolved metabolic fluxes comparable to those exhibited by their mesophilic counterparts at their own optimal growth temperature (van den Burg 2003; Struvay and Feller 2012).

Different adaptation mechanisms have been exploited by cold-adapted organisms. Among them, the optimization of their enzymatic repertory is one of the most exciting. Indeed, if we consider that temperature is one of the fundamental environmental factors for life and that reaction rates are dramatically reduced when the temperature of the medium decreases from 37 °C to 10 °C (Feller and Gerday 2003; Siddiqui and Cavicchioli 2006), it is remarkable that biological activities can be recorded even at temperatures as low as −20 °C (Cary et al. 2010). Thus, psychrophilic organisms had to evolve and optimize their enzymatic repertory to survive in extremely cold environments.

In particular, psychrophilic enzymes are generally characterized by higher thermal lability and higher catalytic efficiency (kcat/Km) at low temperatures with respect to their warm-adapted counterparts (Siddiqui and Cavicchioli 2006; Struvay and Feller 2012). Their maximal activity is shifted to lower temperatures with respect to the mesophilic enzymes, reflecting the weak stability of psychrophilic enzymes, which are generally prone to inactivation and unfolding at moderate temperatures. It has also been shown that enzyme cold adaptation is usually “incomplete”, since the activity of most psychrophilic enzymes around 0 °C, although high, is generally lower than that of the mesophilic homologs at 37 °C (Siddiqui and Cavicchioli 2006).

In different cold-adapted enzymes, kcat/Km is generally optimized either by increasing kcat at the expense of Km or by optimizing both kinetic parameters (Siddiqui and Cavicchioli 2006; Struvay and Feller 2012). The increase in kcat is also related to a decrease in the activation free energy of the catalysed reaction, and in particular to a decrease in activation enthalpy, which was speculated to be structurally achieved by a decrease in the number of enthalpy-driven interactions that need to be broken to reach the reaction transition state. These aspects have also been considered as an indication of the low stability of cold-adapted enzymes and of enhanced structural flexibility, at least in the proximity of the catalytic site. This was the first suggestion of a relationship between activity and stability in thermal adaptation, which would be structurally reflected in a higher flexibility of the three-dimensional (3D) architecture of the protein (Fields 2001; Somero 2004).
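The link between activation enthalpy and low-temperature activity can be illustrated with a short numerical sketch. It assumes that the rate constant follows the Eyring equation of transition state theory, which the text above does not name explicitly; the function names and the activation parameters are hypothetical values chosen only for illustration, not measurements for any real enzyme.

```python
import math

KB = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 8.314462618      # gas constant, J/(mol*K)

def eyring_rate(dH, dS, T):
    """Rate constant (1/s) from activation enthalpy dH (J/mol),
    activation entropy dS (J/(mol*K)) and temperature T (K)."""
    dG = dH - T * dS  # activation free energy
    return (KB * T / H) * math.exp(-dG / (R * T))

def rate_ratio(dH, dS, t_low=283.15, t_high=310.15):
    """Fraction of the rate at t_high that is retained at t_low."""
    return eyring_rate(dH, dS, t_low) / eyring_rate(dH, dS, t_high)

# Hypothetical activation parameters (illustrative assumptions only)
cold_like = rate_ratio(dH=45e3, dS=-80.0)   # lower activation enthalpy
warm_like = rate_ratio(dH=70e3, dS=0.0)     # higher activation enthalpy
print(f"rate retained at 10 C, low-enthalpy case:  {cold_like:.3f}")
print(f"rate retained at 10 C, high-enthalpy case: {warm_like:.3f}")
```

Under these assumptions the reaction with the lower activation enthalpy retains a larger fraction of its rate when the temperature drops from 37 °C to 10 °C, because the temperature dependence of the Eyring rate is dominated by the enthalpic term.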

Nevertheless, the link between activity, flexibility and thermal stability in cold-adapted enzymes is still debated. The intrinsic thermal lability of cold-adapted enzymes, together with their enhanced low-temperature activity, suggests a direct connection between them. Indeed, activity at low temperatures may require a weakening of intramolecular interactions that, in turn, results in reduced stability. On the other hand, the relationship does not seem to be so clear and straightforward, since low thermal stability could also be related to random genetic drift. It might indeed be just a consequence of the lack of evolutionary pressure for stable enzymes in low-temperature habitats (Wintrode and Arnold 2000; Leiros et al. 2007; Fedøy et al. 2007). Moreover, the existence of unusual cold-adapted enzymes, which feature both high thermostability and high catalytic efficiency, has been reported (Leiros et al. 2007; Fedøy et al. 2007), along with evidence that stability and activity can be decoupled in in vitro studies (Wintrode and Arnold 2000; Jónsdóttir et al. 2014). All these observations make the definition of the activity-stability-flexibility relationship even more challenging.

The hypotheses of an intimate connection between structural rearrangements, protein flexibility, catalytic activity and thermal stability in cold-adapted enzymes have stimulated the scientific community to define the mechanisms of enzyme cold adaptation in atomic detail, with particular attention to the effects mediated by residue substitutions, investigated for instance through mutational studies.

2 Localized Flexibility and Enzyme Cold Adaptation

It has been shown that the enhanced flexibility of cold-adapted enzymes is not necessarily spread over the whole 3D structure (global flexibility), but can be localized in specific regions that affect the surroundings of the catalytic site, even from distal sites (Olufsen et al. 2005; Papaleo et al. 2006, 2008, 2011a, b; Chiuri et al. 2009; Pasi et al. 2009; Mereghetti et al. 2010; Tiberti and Papaleo 2011; Martinez et al. 2011; Isaksen et al. 2014).

The low thermal stability of psychrophilic enzymes supports the flexibility hypothesis, but its verification is not straightforward due to the intrinsic difficulties in defining and measuring flexibility. Indeed, we can describe flexibility in terms of dynamic motions and thus relate this to a specific timescale in which certain dynamics occur or we can consider it as related to the degree of the deformation of the structure at a certain temperature.

The study of protein flexibility in solution at the atomic level is a challenging task that only a few experimental techniques can address, such as NMR, EPR, neutron scattering and, very recently, the sampling of electron density from X-ray crystallography (Fraser and Jackson 2011). We will not treat them in this chapter, since our focus is the use of molecular dynamics simulations, i.e. a computational technique, to tackle the study of cold-adapted enzymes. Nevertheless, the aforementioned experimental techniques (Tehei et al. 2005; Heidarsson et al. 2009; van den Bedem and Fraser 2015), and NMR especially, are the most promising approaches to integrate with simulations, on which we cannot completely rely, as we will discover in the next pages.

3 An Ensemble Description of Protein Structures and the Importance of Protein Dynamics

In the last decades, while researchers in the cold-adaptation field started to take advantage of experimental and computational structural studies to solve the enigma of enzyme cold adaptation, an increasing amount of evidence supported the strict relationship not only between protein structure and function, but also between enzyme dynamics and activity (Henzler-Wildman et al. 2007; Henzler-Wildman and Kern 2007; Nashine et al. 2010). Moreover, the notion that proteins are dynamic rather than static entities, and are thus better described as an ensemble of conformations in solution, is becoming widely accepted by the protein science community (Hilser et al. 2006; Acuner Ozbabacan et al. 2010; Woldeyes et al. 2014).

Indeed, proteins are not static molecules and they experience conformational changes across a number of different sub-states over different timescales. Some of those motions are also called ‘breathing motions’ of the protein structure. The transitions between the different states often depend on concerted motions of groups of residues involving hinge and rocking motions in timescales from 10⁻⁸ to 10³ s. It has been also suggested that many fluctuations, involving side-chain or main-chain motions and that originate from rotations, stretching or torsional motions (10⁻¹² s) can underlie the large structural motions (Eisenmesser et al. 2005).

It thus becomes critical to take into account the timescales that are accessible to the technique that one wants to employ for the structural and computational studies of cold-adapted enzymes. Conventional MD simulations, thanks to progress in the field, can routinely reach the microsecond (10⁻⁶ s) or, in exceptional cases, the millisecond timescale (10⁻³ s), but they still suffer from limitations when the simulated timescale is compared with the dynamics that the same protein would experience in solution in experiments. In addition, MD simulations often risk trapping the protein in local minima for too long. Solutions are available to overcome these problems: enhanced sampling methods coupled to atomistic MD physical models have been proposed (Spiwok et al. 2014; Barducci et al. 2015), such as temperature or Hamiltonian replica exchange, metadynamics approaches, or simulations restrained using ensemble-averaged NMR chemical shifts (Camilloni et al. 2012, 2013; Camilloni and Vendruscolo 2014). The latter two especially hold promise for describing, with high accuracy with respect to the experimental data, protein dynamics occurring on the millisecond timescale and even beyond (Camilloni et al. 2012; Sutto and Gervasio 2013; Palazzesi et al. 2013; Papaleo et al. 2014b).

As stated above, enzyme catalysis involves the ‘breathing’ of particular sites of the enzyme structure, which enables, for example, the accommodation of the substrate or the adoption of states that resemble the transition state of the enzymatic reaction. The ease of such conformational changes might be an important determinant of catalytic efficiency. This observation is even more relevant if we consider that there are now many examples of proteins in which, even in the free (substrate-unbound) state, we can observe conformations that resemble functionally important states, such as substrate- or ligand-bound or transition state conformations (Boehr et al. 2009; Ma and Nussinov 2010; Kosugi and Hayashi 2012). These states are very often just a minor population (even lower than 10 %) of the conformational ensemble of the free enzyme in solution and are difficult to characterize in detail (Baldwin and Kay 2009; Mittermaier and Kay 2009).

The discoveries in the field of protein dynamics also impact on the way in which we study the structural determinants of cold-adaptation. An ensemble view of protein structures has to be applied to the investigation of cold-adapted enzymes more widely and as a complementary technique to the biochemical characterization of these enzymes. MD simulations of psychrophilic enzymes have been pioneers in this.

4 Molecular Dynamics Simulations of Proteins: An Overview, Limitations and Advantages

If it is clear that we need to rely on an ensemble description of protein structures and that dynamics over different timescales are important for protein function, and thus also for the understanding of cold-adapted enzymes at the atomic level, it should also be clear that molecular simulations are suitable computational approaches for these studies. Among the different techniques available in this context, atomistic explicit-solvent molecular dynamics is one of the most promising. Indeed, it allows us to retain an atomistic description of both the protein and the solvent and to sample dynamics over different timescales (from the picosecond to the millisecond timescale, as stated in Paragraph 3). It is beyond the scope of this chapter to summarize the principles behind this methodology, which was applied to proteins for the first time many years ago (McCammon et al. 1977) and for which three eminent scientists in the field of protein simulations have recently been recognized by a Nobel Prize (Smith and Roux 2013; Nussinov 2014). Several review articles and book chapters are already available to become familiar with this technique (Rapaport 1998; Dror et al. 2012; Leach 2001; van Gunsteren and Berendsen 1990). Nevertheless, we want to recall here some practical aspects that can be crucial for a researcher who wants to approach these techniques to study cold-adapted enzymes.

A brief scheme of how MD simulations work is reported and discussed in Fig. 24.1.

Fig. 24.1

Schematic representation of the workflow for MD simulations. Every simulation starts from two essential pieces of information. The first one is the detailed structure of the protein that needs to be simulated (1), in which the three-dimensional (3D) coordinates of all the atoms or all the heavy atoms have to be specified. Once this starting structure has been selected, it is further processed by adding solvent (usually water) and counter-ions (usually Na⁺ and Cl⁻) to make the simulation as close to reality as possible. The second requirement is a physical model that specifies the interaction between the atoms in the simulation (2). In the molecular mechanics framework, interactions between atoms are approximated by means of classical physics, so that electrons do not need to be explicitly taken into account. For instance, the potential energy that defines the covalent bond has the form of Hooke’s law, which is used in the description of classical springs. Several potential terms build up the total potential energy function V, which depends exclusively on a set of parameters (which collectively make up the so-called “force field”) and the position of atoms. Once both these requirements are satisfied, several preparation steps can be carried out to bring the system to the desired thermodynamic conditions (not shown here). Finally, the productive MD simulation can run (3), and a number of time-consecutive conformations are written as output (the MD trajectory) (4). Once this is done, average properties of the system can be computed from the trajectory, as if it were a collection of independent structures sampled according to the Boltzmann distribution (5). This is possible under the assumption of the ergodic hypothesis, which states that, if sampling is carried out for enough time, the average properties extracted from the trajectory will converge to the properties of the system in the given thermodynamic conditions (ensemble)

Simulations are based on a model of the system under study, i.e. on a representation of the system behaviour that is as close as possible to nature. In principle, we could resort to first-principle physics to model atomic motions with the computer. Nonetheless, quantum mechanics (QM) theory is not easily applicable to the thousands of atoms that make up a protein or to the study of long timescales. Therefore, other methods need to be invoked. Indeed, the widely accepted Born-Oppenheimer approximation allows us to decouple the nuclear and electronic behaviour of the system under investigation, since the electron cloud equilibrates quickly for each instantaneous configuration of the heavy nuclei. This makes it unnecessary to take the electrons explicitly into account, and the motion of the nuclei can then be described on a nuclear potential energy surface (PES). Given the PES, we can use classical mechanics to follow the dynamics of the nuclei and, in turn, the dynamics of the protein or biomolecule of interest.

Since we have discarded the first-principle representation, however, we need to define how the nuclei interact with each other. We thus need a potential energy function V(r;p) that determines the energy of the system from a set of parameters (p) and the 3D coordinates of the atoms (r). Such potential functions are called force fields. Force fields are designed to adequately represent the physics of the system of interest, and this is usually attained by building the force field so that it approximates the relevant regions of the ab initio Born-Oppenheimer surface or correctly reproduces the experimental data. The functional form of a typical force field used in MD simulations of biomolecules is the sum of many contributions, which are meant to represent the behaviour of the molecule. Bond and angle vibrations around the equilibrium values are represented using Hooke’s law, and a periodic (cosine) term is used for torsions. Non-bonding electrostatic interactions are taken into account through the Coulomb potential, while van der Waals interactions between atoms are usually modeled through the so-called Lennard-Jones potential.

In a simulation, one of the goals is to calculate emerging average properties from the motions of protein atoms, given certain selected thermodynamic conditions. This is possible because of the ergodic hypothesis, which states that simulating a protein for a long enough time will eventually allow the calculation of properties that are representative of the average behaviour of the protein in the given thermodynamic conditions. In MD simulations the time evolution of the particles, whose interactions are described by a force field, is studied by iteratively solving Newton’s equations of motion. This involves first the calculation of the forces acting on the atoms from the potential energy function. Then, a new conformation of the protein is calculated by numerically integrating Newton’s equations of motion over a small time step, usually a few femtoseconds at most. This process is repeated until a sufficient timescale has been sampled in the simulation to study the properties of interest. Note that, before the productive MD simulation is collected, several steps are needed to ensure the proper physical behaviour of the system. Indeed, the protein under study has to be soaked in a box of water molecules, ions might need to be added, and several preparatory steps (which are short MD simulations themselves) are needed to bring the system to the selected thermodynamic conditions.
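The force-then-integrate cycle described above can be sketched for the simplest possible case. The example below is only a toy illustration under stated assumptions: a single particle attached to a harmonic (Hooke's law) spring in one dimension, in reduced units, propagated with the velocity Verlet scheme. Velocity Verlet is one commonly used integrator and is chosen here for illustration; the text does not specify which integrator a given MD package uses, and all function names and parameter values are hypothetical.

```python
def harmonic_force(x, k=1.0, x0=0.0):
    """Force from the Hooke's law potential V(x) = 0.5*k*(x - x0)**2."""
    return -k * (x - x0)

def total_energy(x, v, k=1.0, x0=0.0, m=1.0):
    """Kinetic plus potential energy of the particle."""
    return 0.5 * m * v * v + 0.5 * k * (x - x0) ** 2

def velocity_verlet(x, v, n_steps, dt=0.01, m=1.0, force=harmonic_force):
    """Propagate one particle; return the trajectory as a list of (x, v)."""
    trajectory = [(x, v)]
    f = force(x)                           # forces from the potential
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / m      # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # forces at the new position
        v = v_half + 0.5 * dt * f / m      # complete the velocity update
        trajectory.append((x, v))
    return trajectory

trajectory = velocity_verlet(x=1.0, v=0.0, n_steps=1000)
drift = total_energy(*trajectory[-1]) - total_energy(*trajectory[0])
print(f"frames: {len(trajectory)}, energy drift: {drift:.2e}")
```

The list of (position, velocity) pairs plays the role of the MD trajectory. In a real protein simulation the force routine would evaluate the full force field for all atoms, and the time step would be of the order of femtoseconds.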

The main outcome of an MD simulation is the MD trajectory, which is the collection of structures generated over the simulation time. From this ensemble of structures, following the ergodic hypothesis, the properties of interest can be calculated.

In brief, two main aspects are important when approaching MD simulations: the physical model selected to describe the protein and the solvent (i.e. the force field), and the coverage, with the correct Boltzmann distribution, of the conformational space accessible to the molecule that is achieved in the simulation (i.e. the sampling).

The efforts made by the community in the last years pointed out a major issue in MD simulations, i.e. the sensitivity of protein dynamics to the force field used to carry out the simulations. Current MD force fields have been extensively evaluated and validated by comparison with experimental data that are probes of dynamics over different timescales, such as parameters that can be measured by NMR or other biophysical spectroscopies (Lindorff-Larsen et al. 2012; Best et al. 2012). It turned out that the description of the same protein with different force fields can provide different information on protein dynamics, and some of the new-generation force fields (such as CHARMM22* or AMBER99SB*-ILDN), in which backbone or side-chain corrections have been added, seem to describe dynamics in better agreement with the NMR data than some older force fields (such as CHARMM22 and old OPLS versions). The selection of the force field is thus crucial and, when possible, a cross-validation against experimental biophysical data on the system under investigation is preferable before trusting the simulation results too much.

On the one hand, the high accuracy that some of those force fields have achieved is very encouraging for the use of MD simulations to study protein dynamics in detail and to compare, for example, different variants of an enzyme or different homologs adapted to different conditions, such as extremophilic enzymes. Indeed, the most accurate force fields have been shown to be able to unfold and refold small fast-folding domains to their native structure (Lindorff-Larsen et al. 2011; Piana et al. 2013), to sample the conformational space around the native state of folded proteins (Lindorff-Larsen et al. 2012; Martín-García et al. 2015) and to quantitatively estimate the population of minor states in enzymes and the effects of mutations on these states (Papaleo et al. 2014b). On the other hand, atomistic classical force fields are not perfect and there are still many limitations, such as the risk of overestimating the strength of salt bridges in solution (Debiec et al. 2014; Jónsdóttir et al. 2014), especially for charged residues that are solvent-exposed or not involved in electrostatic networks (Jónsdóttir et al. 2014), or difficulties in treating metal ions bound to metallo-proteins or -enzymes (Calimet and Simonson 2006; Zhu et al. 2013; Li et al. 2013) and in explicitly accounting for polarization effects (Halgren and Damm 2001). Improvements in these directions are thus needed, especially in the study of cold-adapted enzymes, where many targets are metal-binding enzymes or where the role of salt bridges can (or cannot) be important for temperature adaptation.

5 The Folding Funnel Model of Cold-Adapted Enzymes

D’Amico and coworkers (2003) have proposed the integration of biochemical and biophysical data on psychrophilic enzymes in a folding funnel model which describes the folding-unfolding reactions of cold-adapted enzymes. They proposed that differentially temperature-adapted enzymes also possess differently shaped funnels (Fig. 24.2). The height of the funnel represents the folding free energy, which corresponds to the conformational stability, whereas the unfolded state occupies the upper region of the funnel. In their model, the edge of the funnel for the cold-adapted proteins is larger than that of the warm-adapted counterparts, corresponding to a broader distribution of the unfolded states. During the folding process, the free energy decreases, as does the breadth of the conformational ensemble accessible to the protein structure, in agreement with the general funnel theory applied to protein folding (Onuchic and Wolynes 2004). The walls of the funnel have been suggested to present a different amount of roughness in psychrophilic and thermophilic enzymes (Fig. 24.2). Thermophilic proteins have overall a rather corrugated funnel. Since structures of cold-adapted enzymes generally unfold cooperatively without intermediates, due to their few intramolecular interactions, the funnel slopes are steeper and smoother. The bottom of the folding funnel of a warm-adapted protein is represented as a single global minimum, separated by large energy barriers from other, less populated minima, with little conformational freedom overall. The bottom of the funnel of a psychrophilic protein is wider and rugged, as it represents a collection of many conformers separated by low energy barriers, resulting in a more labile and flexible protein.

Fig. 24.2

Folding funnel model for thermophilic and psychrophilic enzymes. The folding free energy is shown as a function of reaction coordinates that account for the conformation of thermophilic (red) and psychrophilic (blue) enzyme. Here the funnels have been cut vertically to expose the internal structure. The top of the funnels is occupied by unfolded conformations, while native conformations are found at the bottom. The different heights of the funnels and ruggedness of the bottom exemplify the energy barriers for the interconversion between different substates, as well as the different degree of structural fluctuations between the two extremophiles

According to a conformational selection scenario (Ma and Nussinov 2010), the substrate will bind to the enzyme populations competent for the interaction with it and, as a consequence, a population shift toward these binding-prone conformations is observed, leading to an ‘active’ structural ensemble. In the case of cold-adapted enzymes where the bottom of the folding funnel is rugged, the aforementioned population shift requires only modest free energy changes for the interconversion of the different conformational states, thus explaining the role of increased flexibility, for example, in facilitating the binding of the substrate.

The computational study of the cold-adapted funnel model is a challenging task, since it requires the proper sampling and representation of the free energy surface accessible to the protein under investigation in both its folded and unfolded states. Moreover, the free energy landscape (FEL) of a protein is a complex multidimensional landscape, and in simulations we thus need to reduce it to two or three coordinates (reaction coordinates). The reaction coordinates used to describe the FEL have to be collective variables, i.e. they need to capture the main features of the conformational ensemble under study. The FEL description achieved is strictly dependent on the choice of the reaction coordinates. Suitable descriptors to be employed as reaction coordinates can be, for example, the root mean square deviation (rmsd) of specific groups of residues (e.g. the ones in a loop that is expected to have different conformational preferences in cold- and warm-adapted homologs), the radius of gyration, the first principal components from Principal Component Analysis (PCA) or contact maps. In an equilibrated thermodynamic system, the free energy can then be estimated from the probability density function of one or more of these reaction coordinates (Poland 2001).
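The last step, estimating the free energy from the probability density of the reaction coordinates, can be sketched as follows. This is a minimal illustration that assumes the standard Boltzmann inversion F = −kB T ln P applied to a two-dimensional histogram; the function name, the binning and the synthetic two-basin data are hypothetical and stand in for reaction coordinates extracted from a real, well-equilibrated trajectory.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ/(mol*K)

def free_energy_surface(cv1, cv2, temperature, bins=25):
    """Estimate a 2D free energy landscape (kJ/mol) from two reaction
    coordinates. The most populated bin is set to zero; bins that were
    never visited are assigned +inf."""
    hist, edges1, edges2 = np.histogram2d(cv1, cv2, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):     # log(0) -> -inf for empty bins
        fel = -KB * temperature * np.log(prob)
    fel -= fel.min()
    return fel, edges1, edges2

# Synthetic data (assumption): a major basin near -1 and a minor one near +1
rng = np.random.default_rng(0)
cv1 = np.concatenate([rng.normal(-1.0, 0.3, 4000), rng.normal(1.0, 0.3, 1000)])
cv2 = rng.normal(0.0, 0.3, 5000)

fel, edges1, edges2 = free_energy_surface(cv1, cv2, temperature=283.0)
i, j = np.unravel_index(np.argmin(fel), fel.shape)
print("global minimum near cv1 =", 0.5 * (edges1[i] + edges1[i + 1]))
```

Such an estimate is only meaningful for regions that the simulation has actually sampled, which is why the choice of reaction coordinates and the quality of the sampling both condition the resulting landscape.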

In the case of cold-adapted enzymes, we have provided a qualitative representation of the FEL from multi-replicate all-atom MD simulations (Papaleo et al. 2009) of serine proteases and uracil-DNA glycosylases (Mereghetti et al. 2010). This study was limited to the description of the bottom of the folding funnel of cold- and warm-adapted counterparts, i.e. the region related to conformational fluctuations around the native state. We hope that this study can stimulate further investigations aimed at describing also the unfolded state of differently temperature-adapted enzymes. In our study we employed different properties as reaction coordinates, such as principal components from PCA of the MD ensemble (Amadei et al. 1993; Garcia et al. 1992), the rmsd of loops surrounding the catalytic site, and the protein radius of gyration (Fig. 24.3). These computational studies allowed us to test the folding funnel model of cold-adapted enzymes mentioned above with regard to the bottom of the funnel. Indeed, the FEL from the MD ensemble of cold- and warm-adapted enzymes showed an intrinsic tendency of cold-adapted variants to explore more structural basins and a flatter, more rugged bottom of the landscape, which favours the interconversion among several metastable states.

Fig. 24.3

Two-dimensional FEL profiles of cold- and warm-adapted enzymes using as reaction coordinates the two first principal components from PCA analysis of MD trajectories of a mesophilic (a, b) and a psychrophilic (c, d) serine protease at 283 (a, c) and 310 K (b, d) (Reprinted adapted with permission from (Mereghetti et al. 2010). Copyright 2010 American Chemical Society)

Recently, many methods for enhanced sampling of the conformational space accessible to proteins have been proposed (Spiwok et al. 2014; Barducci et al. 2015) and can be combined with atomistic MD force fields. Thus, one intriguing direction for the future would be to extend the MD investigation mentioned above to the whole FEL, including the study of folding and unfolding in cold- and warm-adapted homologs, provided that we are also able to improve the solvent models for MD simulations at temperatures higher than the ones at which the current solvent models have been parametrized. Until good solvent models over a wide range of temperatures are developed, we cannot rely on MD simulations alone in studying extremophilic enzymes at different temperatures, and a good strategy is to always integrate this kind of calculation with experiments (Invernizzi et al. 2009; Ganjalikhany et al. 2012).

6 The Need for Structures

The relatively low availability of structural data on cold-adapted enzymes is a major constraint for comparative studies. High-resolution atomic structures of several homologous enzymes adapted to different temperatures must be available to allow a suitable comparison. Most of the MD studies carried out so far only compared mesophilic and psychrophilic enzymes, since few examples are known for which the 3D structure of the thermophilic counterpart is also available (Tiberti and Papaleo 2011; Sigtryggsdóttir et al. 2014).

To overcome the lack of 3D structures of psychrophilic enzymes and their homologs, homology modelling or other modelling techniques (Eswar et al. 2008) are in principle useful to increase the set of available structures for comparison. Nevertheless, a lot of caution is needed in this case. Indeed, homology models are not as accurate as experimental structures, and their accuracy decreases dramatically as the sequence identity and similarity between targets and templates decrease. If we consider that fine details often make the difference between cold- and heat-adapted enzymes, it becomes clear that homology models by themselves, in the absence of experimental validation, are not the best strategy. This issue is even more critical for MD applications, since it has been shown that simulations started from models suffer from many limitations and often risk sampling portions of the conformational space with limited functional and structural relevance (Fan and Mark 2004; Raval et al. 2012). Therefore, a tight cross-talk with experiments becomes a necessary requirement when we employ models to describe the structure and dynamics of cold-adapted enzymes (Parravicini et al. 2013; Papaleo et al. 2014a).

We should also remember that models are just a way to overcome the lack of experimental structures but that more and more efforts are also needed experimentally to cover the gap of unknown structures from psychrophilic organisms.

7 A Recap on MD Simulations of Cold-Adapted Enzymes (1999–2015)

Table 24.1 reports a summary of all the classical MD simulation studies published so far for cold-adapted enzymes, alone or in comparison with their warm-adapted counterparts. Simulation length, force field used, number of replicates, temperature and methods of analysis are briefly summarized, along with the reference to the original publication.

Table 24.1 Summary of MD simulations studies of cold-adapted enzymes published since 1999

Pioneering MD studies of cold-adapted enzymes date back to 1999 and cover timescales of approximately 1–2 ns, using single MD runs for cold- and warm-adapted enzymes.

In many of the more recent MD works, the use of multi-replicate MD simulations provides a wider conformational sampling and the possibility to identify differences in flexibility between cold- and warm-adapted enzymes. The reproducibility of the results is a critical point in MD simulations, and the use of multiple replicates not only can compensate for the lack of sampling but also allows us to verify whether the differences that we see between two different systems are statistically significant or occur in just one simulation.
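One simple way to check whether a difference between two systems is supported by the replicates is to treat each replicate as one independent observation and compare the two groups. The sketch below uses Welch's t statistic as an example of such a comparison; this is an assumption made for illustration, not the procedure used in the studies cited here, and the per-replicate values are invented placeholders rather than published data.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two
    samples with possibly unequal variances."""
    va = variance(a) / len(a)   # squared standard error of group a
    vb = variance(b) / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof

# Hypothetical per-replicate averages of a flexibility metric (nm),
# one value per independent replicate
cold_adapted = [0.21, 0.24, 0.22, 0.25, 0.23]
warm_adapted = [0.17, 0.19, 0.18, 0.20, 0.18]

t_value, dof = welch_t(cold_adapted, warm_adapted)
print(f"t = {t_value:.2f}, degrees of freedom = {dof:.1f}")
```

The t value can then be compared against a Student t distribution with the reported degrees of freedom. With a single replicate per system no such estimate of the spread is possible, which is the practical argument for running several replicates.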

MD simulations of cold-adapted enzymes published so far have been analyzed with many different tools to characterize not only flexibility patterns but also long-range communication, changes in the networks of intra- and intermolecular interactions, structural changes and different conformational substates.

Among the techniques applied so far to analyze MD data of cold-adapted enzymes, we have already mentioned PCA, also known as Essential Dynamics (ED), which aims at extracting informative directions of motion in a multidimensional space by reducing the overall complexity of the trajectories and isolating the largest-amplitude motions.
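The core of PCA on a trajectory can be sketched in a few lines: the covariance matrix of the atomic displacements is diagonalized and the eigenvectors are sorted by decreasing eigenvalue. The sketch below assumes that the frames have already been superposed on a common reference and are supplied as a NumPy array. The function names are hypothetical and do not correspond to any specific MD package.

```python
import numpy as np


def essential_dynamics(coords):
    """PCA of a trajectory.

    coords: array of shape (n_frames, n_atoms, 3), assumed to be already
    superposed so that overall rotation and translation are removed.
    Returns the eigenvalues in decreasing order and the eigenvectors as
    columns, in the same order.
    """
    x = np.asarray(coords, dtype=float)
    x = x.reshape(x.shape[0], -1)        # (n_frames, 3 * n_atoms)
    x = x - x.mean(axis=0)               # displacements from the average structure
    cov = x.T @ x / (x.shape[0] - 1)     # covariance matrix of the displacements
    evals, evecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]


def project(coords, evecs, n_components=2):
    """Project each frame onto the first n_components eigenvectors."""
    x = np.asarray(coords, dtype=float)
    x = x.reshape(x.shape[0], -1)
    x = x - x.mean(axis=0)
    return x @ evecs[:, :n_components]


# Toy trajectory: atom 0 oscillates widely along x, atom 1 slightly along y.
time = np.linspace(0.0, 2.0 * np.pi, 50)
traj = np.zeros((50, 2, 3))
traj[:, 0, 0] = 2.0 * np.sin(time)
traj[:, 1, 1] = 0.5 * np.cos(time)
evals, evecs = essential_dynamics(traj)
projection = project(traj, evecs)
```

In this toy case the first eigenvector coincides, up to its sign, with the x displacement of atom 0, which is the largest-amplitude motion in the trajectory.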

Metrics commonly employed to evaluate protein flexibility in MD simulations of extremophilic enzymes are the per-residue root mean square fluctuations (rmsf) (Fig. 24.4) or anisotropic temperature factors (B-factors), both calculated from the MD trajectories using the average structure from the simulation as a reference, or even back-calculated S² generalized order parameters. Caution is needed when rmsf or B-factors are used to estimate protein flexibility from an MD trajectory. They depend strongly on the average structure used as a reference, so a better way to proceed is to calculate these metrics over time windows shorter than the full simulation length and then to test whether there are statistically significant differences between the flexibility patterns of the two proteins. Indeed, since cold- and warm-adapted homologs very often share the same 3D fold and differ by few amino acid substitutions, the differences in rmsf or B-factors are more likely to be differences in the intensity of the fluctuations than in the presence or absence of certain peaks. It thus becomes crucial to establish whether the differences observed in rmsf intensity are genuine or merely ascribable to noise in the analysis.
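The dependence of the rmsf on the reference structure can be made concrete with a small sketch. It computes the rmsf over the whole trajectory and over consecutive, non-overlapping windows, each with its own average structure. The input is assumed to be an array of already superposed coordinates, and the function names and window handling are choices made for this example only.

```python
import numpy as np


def rmsf(coords):
    """Per-atom rmsf with respect to the average structure of `coords`.

    coords: array of shape (n_frames, n_atoms, 3), assumed already superposed.
    """
    x = np.asarray(coords, dtype=float)
    disp = x - x.mean(axis=0)
    return np.sqrt((disp ** 2).sum(axis=2).mean(axis=0))


def windowed_rmsf(coords, window):
    """rmsf computed separately in consecutive, non-overlapping windows.

    Each window uses its own average structure as a reference. Trailing
    frames that do not fill a complete window are discarded.
    Returns an array of shape (n_windows, n_atoms).
    """
    x = np.asarray(coords, dtype=float)
    n_windows = x.shape[0] // window
    if n_windows == 0:
        raise ValueError("window is longer than the trajectory")
    return np.array(
        [rmsf(x[i * window:(i + 1) * window]) for i in range(n_windows)]
    )


# Toy trajectory: one atom that is rigid but jumps between two positions.
traj = np.zeros((20, 1, 3))
traj[10:, 0, 0] = 1.0
full = rmsf(traj)                      # dominated by the single jump
per_window = windowed_rmsf(traj, 10)   # no fluctuation within each window
```

In this toy example the full-trajectory rmsf is 0.5 although the atom never fluctuates around either of its two positions, whereas the windowed values are zero. The per-window profiles can then be compared between two proteins with a statistical test of choice.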

Fig. 24.4 Flexibility profiles from MD simulations calculated as rmsf of Cα atoms in the cold-adapted (AHA) and warm-adapted (PPA) counterparts. The rmsf profiles of the two proteins have been aligned according to a structural alignment between the two experimental structures and are shown as thick lines that are broken at alignment gaps. (a) The rmsf profile of the whole concatenated trajectories (i.e., merging all the replicates together). (b) The rmsf profile over 100 ps intervals. (c) The most flexible loops are highlighted on the 3D structure. The distances are ensemble averages, calculated between the centers of mass of the groups and expressed in nm (Reprinted and adapted with permission from Pasi et al. 2009. Copyright 2009 American Chemical Society)

Other analyses aim to define quantitative indexes of similarity between the conformational ensembles of the cold- and warm-adapted variants, i.e., to quantify the overlap between the populations of the two ensembles. This has long been done, for example, by evaluating the root mean square inner product (rmsip) or other overlap metrics after PCA. The rmsip is often calculated on the first 20 principal components and ranges from 0 to 1: a value of 1 is obtained if the two MD simulations sample identical essential subspaces, and 0 if the two subspaces are completely orthogonal.
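In its usual definition, the rmsip between two sets of D eigenvectors is the square root of the sum of all squared inner products between the two sets, divided by D. A minimal sketch follows, assuming that the eigenvectors of each simulation are orthonormal, stored as the columns of an array, sorted by decreasing eigenvalue and defined on the same set of aligned atoms. The function name and signature are hypothetical.

```python
import numpy as np


def rmsip(evecs_a, evecs_b, n_components=20):
    """Root mean square inner product between two essential subspaces.

    evecs_a, evecs_b: arrays of shape (n_dof, n_modes) whose columns are
    orthonormal eigenvectors sorted by decreasing eigenvalue (assumed to
    refer to the same, aligned degrees of freedom in both systems).
    """
    a = np.asarray(evecs_a, dtype=float)[:, :n_components]
    b = np.asarray(evecs_b, dtype=float)[:, :n_components]
    if a.shape != b.shape:
        raise ValueError("the two eigenvector sets must have the same shape")
    overlaps = a.T @ b                 # matrix of inner products u_i . v_j
    return float(np.sqrt((overlaps ** 2).sum() / a.shape[1]))


# Toy check with the canonical basis of a 6-dimensional space.
basis = np.eye(6)
same = rmsip(basis[:, :3], basis[:, :3], n_components=3)        # identical subspaces
orthogonal = rmsip(basis[:, :3], basis[:, 3:], n_components=3)  # orthogonal subspaces
```

The two toy cases reproduce the limits described above, i.e. 1 for identical subspaces and 0 for orthogonal ones.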

Other methods estimate cross-correlated motions, using metrics that range from the Pearson correlation to measures based on mutual information. Once a map of residues featuring correlated motions has been obtained, it can also be represented on the 3D structure to better highlight the different dynamical patterns of different proteins. These methods also suffer from convergence issues, and it is thus very important to calculate the correlations over timescales much shorter than the whole simulation length and then to average them.
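For the Pearson-like case, the correlation between atoms i and j is commonly computed as the time average of the dot product of their displacement vectors, normalized by the root mean square displacements of the two atoms. The sketch below computes such a map and its average over non-overlapping windows. It assumes superposed coordinates and atoms that all move, and the function names are illustrative.

```python
import numpy as np


def cross_correlation(coords):
    """Normalized cross-correlation map of atomic displacements.

    coords: array of shape (n_frames, n_atoms, 3), assumed already superposed.
    Returns an (n_atoms, n_atoms) matrix with values between -1 and 1.
    Atoms that never move would give a division by zero and are assumed
    to be absent.
    """
    x = np.asarray(coords, dtype=float)
    disp = x - x.mean(axis=0)
    # Time average of the dot products between displacement vectors.
    cov = np.einsum("tik,tjk->ij", disp, disp) / x.shape[0]
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)


def windowed_cross_correlation(coords, window):
    """Average of the maps computed on consecutive, non-overlapping windows."""
    x = np.asarray(coords, dtype=float)
    n_windows = x.shape[0] // window
    if n_windows == 0:
        raise ValueError("window is longer than the trajectory")
    maps = [
        cross_correlation(x[i * window:(i + 1) * window])
        for i in range(n_windows)
    ]
    return np.mean(maps, axis=0)


# Toy trajectory: atoms 0 and 1 move in phase, atom 2 in anti-phase.
time = np.linspace(0.0, 4.0 * np.pi, 40, endpoint=False)
traj = np.zeros((40, 3, 3))
traj[:, 0, 0] = np.sin(time)
traj[:, 1, 0] = np.sin(time)
traj[:, 2, 0] = -np.sin(time)
corr = cross_correlation(traj)
corr_windowed = windowed_cross_correlation(traj, 20)
```

This measure only captures linear correlations between displacements, which is one of the reasons why mutual-information approaches are also used.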

Another more recent approach is inspired by graph theory and describes the intra- and intermolecular interactions as networks in which each protein residue is a node and a link is included if two residues interact with each other. This makes it possible, for example, to identify hubs (i.e., highly interconnected residues that might have a structural role) and paths of long-range communication between distal residues.
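A simplified sketch of this network representation is given below. Here two residues are considered to interact if one representative point per residue (for example the side-chain center of mass) lies within a distance cutoff in a sufficient fraction of the frames. The interaction criterion, the cutoff, the persistence and degree thresholds and all function names are assumptions made for illustration. Published protocols differ in how interactions are defined and weighted.

```python
from collections import deque

import numpy as np


def contact_network(coords, cutoff=0.5, min_persistence=0.5):
    """Boolean adjacency matrix of a residue interaction network.

    coords: array of shape (n_frames, n_residues, 3) with one representative
    point per residue, in nm (hypothetical input layout). An edge links two
    residues whose distance is below `cutoff` in at least `min_persistence`
    of the frames. All pairwise distances are held in memory at once.
    """
    x = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(x[:, :, None, :] - x[:, None, :, :], axis=-1)
    persistence = (dist < cutoff).mean(axis=0)
    adjacency = persistence >= min_persistence
    np.fill_diagonal(adjacency, False)   # no self-interactions
    return adjacency


def hubs(adjacency, min_degree=3):
    """Indices of the nodes connected to at least `min_degree` other nodes."""
    return np.flatnonzero(adjacency.sum(axis=1) >= min_degree)


def shortest_path(adjacency, source, target):
    """Shortest path between two nodes by breadth-first search, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in np.flatnonzero(adjacency[node]):
            neighbour = int(neighbour)
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Toy system: five residues on a line, 0.4 nm apart, in two identical frames.
frame = np.zeros((5, 3))
frame[:, 0] = 0.4 * np.arange(5)
network = contact_network(np.array([frame, frame]))
hub_nodes = hubs(network, min_degree=2)
path = shortest_path(network, 0, 4)
```

In the toy chain only consecutive residues are linked, so the three internal residues are the nodes with the highest degree and the only path between the two ends runs through all of them.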

It is, however, also time to move to other state-of-the-art approaches integrated with MD simulations, such as the enhanced sampling methods mentioned above, which can provide more accurate and comprehensive results.

8 A Family-Centred Point of View in Comparative Structural and Dynamics Studies of Cold- and Warm-Adapted Enzymes

It is well known that psychrophilic enzymes adopt diverse structural strategies to increase structural flexibility, such as weakening intramolecular hydrogen bonds, optimizing protein-solvent interactions, decreasing the packing of the hydrophobic core, enhancing the solvent accessibility of hydrophobic side chains and reducing the number of ion-pair networks (Russell 2000; Gianese et al. 2002; Feller and Gerday 2003; Siddiqui and Cavicchioli 2006; Adekoya et al. 2006; Tronelli et al. 2007).

Some general features can still be identified, for example the tendency of cold-adapted enzymes to have enhanced localized flexibility in specific regions that can affect the catalytic site either locally (Papaleo et al. 2006, 2008) or through long-range effects (Papaleo et al. 2011a, 2012, 2013; Fraccalvieri et al. 2012). Moreover, three different simulation studies (Pasi et al. 2009; Papaleo et al. 2011a; Isaksen et al. 2014), carried out on two different enzymes, suggested that one of the main differences between cold- and warm-adapted enzymes is located on the protein surface: cold-adapted enzymes have evolved toward ‘softer’ surface regions, i.e. regions that are more flexible and have fewer intramolecular interactions than those of their warm-adapted counterparts.

Tempting as it may be, however, a general theory of enzyme cold adaptation cannot be formulated yet. Similarities in the strategies employed at the molecular level can be identified in proteins sharing the same 3D fold and the same functional residues, i.e. proteins belonging to the same family and, to some extent, superfamily (Papaleo et al. 2011b). Moreover, the networks of intra- and intermolecular interactions and the correlated motions related to these mechanisms can change across enzyme families and are strictly connected to specific features of the 3D fold (Papaleo et al. 2011a, b).

The analysis of many proteomes also confirmed the lack of a unique theory of cold-adaptation mechanisms. Indeed, Gu and Hilser (2009) reported a homogeneous modulation of structural flexibility and stability across the components of the proteome of organisms adapted to different habitats, but also that the molecular mechanisms of temperature adaptation can still vary significantly among proteins.

The evidence collected so far suggests that each enzyme exploits diverse structural strategies to adapt to low temperatures, and that these strategies are often difficult to identify precisely.

In this context, we should keep in mind that in comparative studies, such as those carried out with MD simulations, the sequence similarity between psychrophilic and mesophilic counterparts should be relatively high for the comparison to be meaningful. Subtle structural effects are hard to estimate if there are pronounced differences in the protein architecture. This is even more important if we consider that we are very often searching for very subtle modifications that are sufficient to adapt a protein to cold temperatures. To overcome these problems, comparative MD studies of differently temperature-adapted homologous enzymes with high sequence identity and similarity (higher than 60–70 %) are strongly encouraged and have already been carried out in some cases (Papaleo et al. 2008; Sigtryggsdóttir et al. 2014). We encourage making this a more routine and general approach in the study of cold adaptation at the atomic level.

For example, MD investigations of cold-adapted serine proteases clarify how distinct members of this superfamily have addressed the detrimental effects of low temperature on protein activity and stability (Papaleo et al. 2006, 2007, 2008; Isaksen et al. 2014). In particular, the finding that the separation between psychrophilic and mesophilic trypsins or elastases occurred after the separation of trypsins from elastases indicates that cold-adapted trypsins and elastases independently evolved identical strategies to optimize flexibility at low temperatures, a striking case of molecular evolutionary convergence (Papaleo et al. 2008).

The many structural and computational studies mentioned in this section thus suggest that, even if a general theory of enzyme cold adaptation cannot be postulated, enzymes sharing function and 3D architecture are expected to adopt similar solutions to tune their flexibility and stability so that they can carry out their function under extreme low-temperature conditions. This points to an evolutionary convergence of the structural and dynamical properties of homologous cold-adapted enzymes. A family-centered point of view will therefore be extremely useful in comparative analyses of cold- and warm-adapted enzymes.

9 Future Perspectives

MD simulations have demonstrated their potential in the study of the structural and dynamical properties of cold-adapted enzymes over the last two decades. It is now time to make further progress in the field by taking advantage of new and more accurate force fields and of new methods that enhance the sampling in the simulations or account for experimental restraints derived, for example, from NMR data such as chemical shifts, which are probes of dynamics over different timescales. In the application of conventional MD, a step forward is needed both in sampling beyond the microsecond timescale and in assessing the reproducibility of the results. New methods can also be applied to the analysis of MD data to identify, for example, differences in the conformational transitions that the proteins undergo in solution and differences in the populations of major and minor states of the conformational ensemble. MD simulations also have great potential in the study of cold-adapted enzymes for applied purposes, since they provide a computationally affordable technique to screen and design mutant variants with increased local flexibility or enhanced protein stability.