Introduction

The pandemic spread of COronaVIrus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has devastated economies and health systems worldwide. The WHO reported 5,324,969 deaths and 271,376,643 confirmed cases as of December 15th, 2021 (https://covid19.who.int/). The pathogenesis of COVID-19, including the different host responses mediated by SARS-CoV-2, is still not clearly understood (Flerlage et al. 2021; Kang et al. 2021; Prasad et al. 2021; Khatoon et al. 2020). SARS-CoV-2 is an enveloped, positive-sense, single-stranded RNA virus that utilizes its envelope spike protein to enter target host cells. The SARS-CoV-2 genome is among the largest of all RNA viruses; it is packed into a helical capsid, formed by the nucleocapsid protein and RNA, which is surrounded by an envelope. Three structural proteins, namely the membrane protein (M), the envelope protein (E), and the spike protein (S), are associated with the envelope and are mainly involved in replication, assembly and cell fusion (Murugan et al. 2021; Arya et al. 2021). The S protein facilitates the entry of the virus into the host cells with the help of the angiotensin-converting enzyme 2 (ACE2) receptor (Fig. 1). Recently, several SARS-CoV-2 variants that seem to exhibit increased transmissibility and infectivity have been reported globally (Burki 2021; Abdool Karim and de Oliveira 2021). The SARS-CoV-2 variant B.1.1.7 carries 17 mutations in its genome, of which 8 are in the spike protein (Leung et al. 2021). The B.1.351 (Mwenda et al. 2021) and P.1 (Francisco et al. 2021) variants carry 9 and 11 spike protein mutations, respectively, including 3 mutations in the receptor-binding domain (RBD): K417N/T, E484K, and N501Y. All three variants have the N501Y mutation, and its presence has been associated with increased transmissibility (Leung et al. 2021; Zhao et al. 2021).
Very recently, a “double mutant” L452R-E484Q variant of SARS-CoV-2, B.1.617, has been reported (Cherian et al. 2021); it caused a surge in transmission with a vast number of deaths and active cases. These two mutations have been previously described in other SARS-CoV-2 variants and are known to confer immune escape and increase transmissibility and infectivity. The L452R mutation has been reported in the B.1.429 variant (Zhang et al. 2021), while the E484Q mutation is reported in the B.1.1.7, B.1.351, P.1, and B.1.526 variants (Tang et al. 2021; Bhattarai et al. 2021; Antony and Vijayan 2021). Besides these mutations, B.1.617 carries 11 additional mutations, including P681R outside of the RBD, which is located adjacent to the furin cleavage site of the spike, suggesting that it might influence the way the spike is processed during infection (Cherian et al. 2021). All these SARS-CoV-2 variants carry mutations in the viral RBD region that increase viral transmission, infection rates and disease severity and compromise the immune response, thus raising concerns about vaccine effectiveness against the COVID-19 pandemic (Altmann et al. 2021; Kuzmina et al. 2021; Kavitha et al. 2020).

Fig. 1

Schematic representation of the SARS-CoV-2 envelope protein and receptor-binding domain (RBD) region showing the three mutations, together with ACE2 receptor binding and virus entry into the host cell, followed by the step-wise in silico workflow and the tools used in the current study

In this regard, a mechanistic study based on structural and molecular dynamics analysis of the RBD variants will be beneficial to understand the molecular mechanism underlying the enhanced transmissibility and infectivity of the RBD variants (Hou et al. 2021; Sixto-López et al. 2021). We have studied the impact of mutations such as L452R, E484Q and L452R-E484Q, which are associated with the host cell attachment and the virus entry into the host cell (Fig. 1). We used various prediction tools to understand the dynamic changes occurring due to the mutations. Further, molecular dynamics (MD) simulations were conducted to understand the potential consequence of the mutations L452R, E484Q and L452R-E484Q. Various parameters like Root Mean Square Deviation (RMSD), Root Mean Square Fluctuation (RMSF), Radius of gyration (Rg), Solvent Accessible Surface Area (SASA), Principal Component Analysis (PCA), Free Energy Landscape (FEL) and density distribution analysis were calculated for the MD trajectories to evaluate the structure and dynamics of SARS-CoV-2 RBD (Fig. 1). Substitution of amino acids may alter surface charges that can potentially affect the host receptor binding and significantly impact the structure and function of RBD of spike glycoprotein. Such mutations can even be one of the major causes of resistance in various viral diseases. We believe the study outcomes will help to understand the effect of the emerging SARS-CoV-2 variants on its binding to ACE2 and help correlate the higher transmissibility and infectivity associated with the virus cell entry mechanisms.

Materials and methods

Protein models of WT and mutants

We used the SARS-CoV-2 spike protein RBD crystal structure (PDB ID: 6M0J) (Lan et al. 2020). The structure coordinates were retrieved from the PDB and used as the wild-type (WT) structure for the study. The structure contains several chains, and the RBD chain E was used to create the mutations in the protein. We mapped the three mutations, namely L452R, E484Q, and L452R-E484Q, at the corresponding positions, and the mutant structures were constructed using the BuildModel module of FoldX (Schymkowitz et al. 2005).

Structural stability analysis

The online tools Polymorphism Phenotyping (POLYPHEN) (Adzhubei et al. 2010), I-mutant (Capriotti et al. 2005), Protein Variation Effect Analyzer (PROVEAN) (Choi et al. 2012), SNP&GO (Calabrese et al. 2009) and Pmut (López-Ferrando et al. 2017) were used to predict the functional impact of the selected mutations on the protein. The effect of the mutations on structural stability and flexibility was analyzed using the DynaMut web server (Rodrigues et al. 2018) through the DUET (Pires et al. 2014a), mCSM (Pires et al. 2014b), ENCoM (Frappier et al. 2015) and SDM (Worth et al. 2011) scores to obtain the Gibbs free energy (∆∆G) destabilizing values (Rodrigues et al. 2018). Further, the Mutpred (Pejaver et al. 2017), Fathmm (Shihab et al. 2013) and MuPro (Cheng et al. 2006) web servers were also used to obtain the threshold scores of the mutants. Consensus predictions from these algorithms indicate a destabilizing role of the mutants on the native structure of the receptor.

MD simulations

We carried out MD simulations using the GROMACS 5.18.3 software package (Pronk et al. 2013; Hess et al. 2008). The topology and parameter files for the WT RBD and all the mutants were generated using the Gromos96a43 force field. The intermolecular (non-bonded) potential was represented as the sum of a Lennard–Jones (LJ) term with force-based switching over a cut-off distance range of 8–10 Å and a pairwise Coulomb interaction, and the long-range electrostatic forces were determined by the particle mesh Ewald (PME) approach (Lee et al. 2016). The real-space cutoff was set to 1.2 nm for the PME calculations (Wang et al. 2016). The velocity Verlet algorithm was applied for the numerical integrations and the initial atomic velocities were generated with a Maxwellian distribution at the given absolute temperature. The system was then solvated with the default SPC/E water model, and the protein was placed at the center of the cubic grid box (1.0 nm) (Zielkiewicz 2005). The system was neutralized and brought to a concentration of 0.15 M. The neutralized system was then subjected to energy minimization using the Steepest Descent (SD) and Conjugate Gradient (CG) algorithms with a convergence criterion of 0.005 kcal mol⁻¹ Å⁻¹. The two-step equilibration was carried out separately under NVT (constant number of atoms, volume and temperature) and NPT (constant number of atoms, pressure and temperature) ensemble conditions for each protein structure, on a similar time scale. We applied the Berendsen thermostat and the Parrinello–Rahman barostat to maintain the system's temperature and pressure, respectively. The system pressure and temperature were kept constant at 1 bar and 300 K, with coupling times of τP = 2 ps and τT = 1 ps. Periodic boundary conditions (PBC) were used, and the equations of motion were integrated with the leap-frog algorithm using a 2-fs time step.
Finally, to prepare the system for the production run, the constraints were fixed by relaxing the grid box with water and counterions. The WT and the three mutants were each subjected to 100 ns simulations in the current study. Various parameters were analyzed for the trajectories using GROMACS utilities and Python scripts with MDTraj (McGibbon et al. 2015).
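The settings above mix Å/kcal and nm/ps units, so a short conversion can help when comparing them with GROMACS-style inputs, which are usually expressed in nm and kJ. The following is only a minimal sketch: the variable and function names are illustrative and are not taken from the original scripts.

```python
# Unit conversions for the simulation settings quoted in the text.
# All names below are illustrative; none come from the authors' input files.

KCAL_TO_KJ = 4.184        # 1 kcal = 4.184 kJ (thermochemical calorie)
ANGSTROM_PER_NM = 10.0    # 1 nm = 10 Angstrom


def force_kcal_per_angstrom_to_kj_per_nm(value):
    """Convert a force tolerance from kcal mol^-1 A^-1 to kJ mol^-1 nm^-1."""
    return value * KCAL_TO_KJ * ANGSTROM_PER_NM


# Energy-minimization convergence criterion: 0.005 kcal mol^-1 A^-1
emtol_kj_nm = force_kcal_per_angstrom_to_kj_per_nm(0.005)

# LJ switching range: 8-10 Angstrom expressed in nm
lj_switch_nm = (8.0 / ANGSTROM_PER_NM, 10.0 / ANGSTROM_PER_NM)

# Number of integration steps for 100 ns at a 2 fs time step (integer arithmetic)
total_time_fs = 100 * 1_000_000   # 100 ns in fs
time_step_fs = 2
n_steps = total_time_fs // time_step_fs

print(f"emtol  = {emtol_kj_nm:.4f} kJ mol^-1 nm^-1")
print(f"switch = {lj_switch_nm[0]:.1f}-{lj_switch_nm[1]:.1f} nm")
print(f"steps  = {n_steps}")
```

Under these conversions, the convergence criterion corresponds to roughly 0.21 kJ mol⁻¹ nm⁻¹, and each 100 ns trajectory corresponds to 5 × 10⁷ integration steps.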

Principal Component Analysis (PCA)

PCA is a well-known statistical technique that reduces the complexity of the data and extracts the concerted motions in simulations that are essentially correlated with biological function. The eigenvectors and eigenvalues were obtained by diagonalizing the covariance matrix, and the projections onto the first two principal components were calculated using the Essential Dynamics (ED) method. Then, the principal components were selected as reaction coordinates and the free energy of the state, Gα, was calculated using the following equation:

$$G_{\alpha} = - kT\ln \left[ \frac{P(q_{\alpha})}{P_{\max}(q)} \right],$$

where k is the Boltzmann constant and T is the simulation temperature. P(qα) is the estimate of the probability density function obtained from the histogram of the MD data, constructed using the joint probability distribution (PC1 and PC2 as reaction coordinates). Pmax(q) represents the probability of the most probable state. The protein movements and the dynamic motions of the WT and the L452R, E484Q and double mutant L452R-E484Q structures throughout the trajectories in the subspace were further identified by projecting the Cartesian coordinates onto the most important eigenvectors from the complete analysis (Ahamad et al. 2021b).
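The PCA and free-energy steps described above can be sketched with NumPy alone. This is a minimal illustration and not the GROMACS-based procedure used in the study: the function names, the bin count and the synthetic coordinates are assumptions, and real input would be superposed Cα coordinates from the trajectories.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1


def pca_projection(coords, n_components=2):
    """PCA of a superposed trajectory with shape (n_frames, n_atoms, 3).

    Returns the eigenvalues in descending order and the projection of every
    frame onto the first n_components eigenvectors.
    """
    x = coords.reshape(coords.shape[0], -1)
    x = x - x.mean(axis=0)                    # remove the average structure
    cov = np.cov(x, rowvar=False)             # (3N, 3N) covariance matrix
    evals, evecs = np.linalg.eigh(cov)        # eigh returns ascending order
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    return evals, x @ evecs[:, :n_components]


def free_energy_landscape(pc1, pc2, temperature=300.0, bins=30):
    """G = -kT ln[P(q) / Pmax(q)] on a 2D histogram of (PC1, PC2)."""
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins, density=True)
    with np.errstate(divide="ignore"):        # empty bins give G = +inf
        g = -K_B * temperature * np.log(hist / hist.max())
    return g, xedges, yedges


# Synthetic stand-in for a trajectory (500 frames, 20 atoms); illustrative only.
rng = np.random.default_rng(0)
coords = rng.normal(scale=0.05, size=(500, 20, 3))

evals, proj = pca_projection(coords)
g, xedges, yedges = free_energy_landscape(proj[:, 0], proj[:, 1])
print("fraction of variance in PC1+PC2:", evals[:2].sum() / evals.sum())
print("lowest free energy (kcal/mol):", g.min())
```

By construction, the most populated bin has G = 0 and all other bins are positive, so a single narrow basin and a broad multi-minimum surface are directly comparable on the same scale.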

Results

This study was designed to understand the underlying mechanism and molecular basis of the effect of the SARS-CoV-2 mutations L452R, E484Q and L452R-E484Q on receptor stability. The analyses presented in this section, including the prediction algorithms and MD simulation results, indicate a destabilizing role of the mutants on the protein.

Structural stability analysis

The POLYPHEN scores for L452R and E484Q were high, reflecting deleterious effects on the native protein and a deviation from its WT activity. The results revealed that the mutations L452R and E484Q are predicted to have a functional impact through a destabilizing effect on the spike glycoprotein. The deviating role of the mutants was also confirmed by the results of the SNP&GO and Pmut tools (Table 1). The impact of the amino acid substitutions on the biological function of the RBD region was predicted by PROVEAN and I-mutant to be highly deleterious and destabilizing, implying conformational changes in the protein. Further, the impact of the mutations on the conformational stability of the native structure was evaluated using the Gibbs free energy (∆∆G) from DynaMut. The DUET, mCSM, DynaMut, Mutpred, Fathmm and MuPro scores also predict an impact of the amino acid substitutions in the mutants. From the consensus of the results from the different methods, it is evident that the selected mutants show deleterious effects upon the amino acid alteration and are found to destabilize the spike glycoprotein RBD, altering its structural stability and function.

Table 1 Predicted scores of mutations for L452R and E484Q, using various methods

Structural dynamics and stability of SARS-CoV-2 variants

The superimposed structures of the WT RBD and the mutants studied here are shown in Fig. 2A. We studied the effect of the mutations on the WT-RBD structure of SARS-CoV-2 in detail using parameters derived from the MD simulations. The RMSD analysis revealed that the mutants L452R, E484Q and L452R-E484Q were unstable, with an RMSD range of ∼0.2 to 0.5 nm, compared with the WT, which was stable at ∼0.15 nm (Fig. 2B and Table S1). The MD simulation trajectories were further analyzed using the Probability Distribution Function (PDF), which revealed that the WT was within the threshold average PDF-RMSD values. Notably, the mutant L452R showed a drift of 0.27 nm and E484Q a drift of 0.26 nm, followed by the maximum drift of 0.41 nm in L452R-E484Q (Figure S1A). The instability caused by the mutations is confirmed by the RBD deviation analysis, which suggests a deformation leading to conformational changes and instability of the RBD.
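The RMSD and PDF-RMSD quantities discussed above can be illustrated with a short NumPy sketch. It assumes the Cα coordinates of each frame are already available as arrays (for example, exported from the trajectories); the function names and the synthetic data are hypothetical and stand in for the GROMACS and MDTraj tooling used in the study.

```python
import numpy as np


def kabsch_rmsd(mobile, reference):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    p = mobile - mobile.mean(axis=0)          # remove translation
    q = reference - reference.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ q)         # SVD of the 3x3 cross-covariance
    d = np.sign(np.linalg.det(u @ vt))        # guard against improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt     # transposed optimal rotation
    diff = p @ rot - q
    return np.sqrt((diff ** 2).sum() / len(p))


def most_probable_value(values, bins=50):
    """Centre of the most populated histogram bin (peak of the estimated PDF)."""
    density, edges = np.histogram(values, bins=bins, density=True)
    i = np.argmax(density)
    return 0.5 * (edges[i] + edges[i + 1])


# Synthetic stand-in for a Ca trajectory: 200 noisy copies of one structure.
rng = np.random.default_rng(1)
reference = rng.normal(size=(50, 3))
frames = reference + rng.normal(scale=0.02, size=(200, 50, 3))

rmsd_series = np.array([kabsch_rmsd(f, reference) for f in frames])
peak = most_probable_value(rmsd_series)
print(f"mean RMSD = {rmsd_series.mean():.3f}, PDF peak = {peak:.3f}")
```

The peak of the histogram is one simple way to summarize a trajectory by a single most probable RMSD value; whether this matches the exact PDF estimate used for Figure S1A is not stated in the text.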

Fig. 2

MD simulation of SARS-CoV-2 WT RBD and RBD variants. (A) The superimposition of WT and mutant structures of SARS-CoV-2 RBD generated using PyMOL. The protein structure is rendered in cartoon and mutants displayed as sticks (B) C-alpha (C-α) RMSD conformation of WT and the respective mutations, and (C) RMS-Fluctuations depicting the maximum drift in mutations compared to WT noted at 10 ns of simulation time in all the trajectories

We next assessed the overall flexibility of the WT and mutant structures by computing the Cα-RMSF for each residue. The WT showed minimal fluctuations, with a value of ∼0.1 nm, whereas the mutants L452R, E484Q and L452R-E484Q showed maximum Cα-RMSF fluctuations of ∼0.2 nm, ∼0.4 nm, and ∼0.3 nm, respectively (Fig. 2C). Thus, the comparative analysis of Cα-RMSF confirmed that the mutations increase residue-level fluctuations in the SARS-CoV-2 RBD compared to the WT.
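A per-residue RMSF of the kind reported above can be computed as the root mean square displacement of each Cα atom from its time-averaged position. The sketch below assumes the frames have already been superposed on a common reference; the function name and the synthetic trajectory are illustrative only.

```python
import numpy as np


def rmsf(coords):
    """Per-atom RMSF for superposed coordinates of shape (n_frames, n_atoms, 3)."""
    mean_positions = coords.mean(axis=0)                  # time-averaged structure
    sq_disp = ((coords - mean_positions) ** 2).sum(axis=2)
    return np.sqrt(sq_disp.mean(axis=0))                  # average over frames


# Synthetic example: 10 atoms, one of which (index 5) is made far more mobile.
rng = np.random.default_rng(2)
n_frames, n_atoms = 300, 10
scales = np.full(n_atoms, 0.05)   # per-coordinate noise (nm) for most atoms
scales[5] = 0.30                  # deliberately flexible site
mean_structure = rng.normal(size=(n_atoms, 3))
noise = rng.normal(size=(n_frames, n_atoms, 3)) * scales[None, :, None]
coords = mean_structure + noise

per_atom_rmsf = rmsf(coords)
print("most flexible atom index:", int(np.argmax(per_atom_rmsf)))
```

Plotting this array against residue number gives a profile comparable in form to Fig. 2C, where higher values mark more mobile regions.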

The Rg analysis allowed a comparison of the compactness of the WT and the mapped mutant structures during the MD simulations. The results revealed that the mutants L452R, E484Q and L452R-E484Q showed a decrease in Rg, with the maximum reduction in E484Q, compared with the WT (Fig. 3A). However, the average PDF-Rg analysis indicates a drift in the mutants, with 1.77 nm for L452R, 1.77 nm for E484Q and 1.78 nm for L452R-E484Q, whereas the WT showed 1.84 nm (Figure S1B and Table S1). Thus, the results from the above analysis suggested that the mutations had a highly destructive impact on the native SARS-CoV-2 structure, inducing structural destabilization, which could be responsible for the loss of compactness, suggesting a cumulative effect on the core structure.
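For reference, the radius of gyration used above is the mass-weighted root mean square distance of the atoms from their centre of mass. The following minimal sketch uses hypothetical function names and toy coordinates; equal masses are assumed when none are supplied.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one frame with shape (n_atoms, 3)."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))         # assumption: equal masses
    masses = np.asarray(masses, dtype=float)
    com = (masses[:, None] * coords).sum(axis=0) / masses.sum()
    sq_dist = ((coords - com) ** 2).sum(axis=1)
    return np.sqrt((masses * sq_dist).sum() / masses.sum())


def rg_time_series(trajectory, masses=None):
    """Rg of every frame of a trajectory with shape (n_frames, n_atoms, 3)."""
    return np.array([radius_of_gyration(frame, masses) for frame in trajectory])


# Toy check: two equal masses at x = +1 and x = -1 have Rg = 1.
two_points = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
rg_demo = radius_of_gyration(two_points)
print("Rg of the two-point example:", rg_demo)
```

A lower Rg corresponds to atoms lying closer to the centre of mass on average, which is the sense in which the time series in Fig. 3A is used to compare the WT and mutant structures.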

Fig. 3

Time-evolution of (A) Rg consistency plot compared between the WT-RBD and the mutant structures, (B) SASA analysis and fluctuations per residue for WT-RBD and mutants, and (C) H–bonds monitoring of WT-RBD and RBD mutants of SARS-CoV-2

The WT and mutant structures were further subjected to SASA analysis. The system’s core hydrophobic regions were examined, and we found a significant decrease in SASA upon mutation (Fig. 3B). The mutants showed high SASA fluctuation values, ranging from 109 nm² to 110 nm², in contrast to the WT (117 nm²). The PDF analysis of SASA for the mutants revealed an unsteady change, with values of 117.18 nm² for L452R, 109.98 nm² for E484Q, and 109.43 nm² for L452R-E484Q (Figure S1C and Table S1). The fluctuations in the SASA and PDF-SASA results indicated a substantial loss of hydrophobic contacts between the residues of the SARS-CoV-2 RBD due to the amino acid substitutions.

The change in the number of intramolecular hydrogen bonds formed in the WT and mutant structures was investigated. The overall analysis showed an average of 118.84 H-bonds for L452R, 125.32 for E484Q and 123.93 for L452R-E484Q, whereas the WT showed 119.59 H-bonds (Fig. 3C and Table S2). The investigation revealed that the deviations in the hydrogen bonding pattern of the mutants have a significant impact on the structural dynamics and instability of SARS-CoV-2.

PCA results

The analysis was further extended to the essential subspace, where the protein dynamics are described by the eigenvectors and their associated eigenvalues. The dynamic behavior of the WT and mutants was compared using the clustering parameters, and the results revealed that the clusters are well defined in the WT structure and cover a minimal region, whereas the L452R, E484Q and L452R-E484Q mutants occupied larger regions with wider clusters (Fig. 4A). Furthermore, the fluctuations along eigenvector 1 and eigenvector 2 were analyzed and plotted to understand the atomic fluctuations and intramolecular collective motions. The results revealed that the L452R, E484Q and L452R-E484Q mutations showed greater fluctuations than the WT SARS-CoV-2 RBD (Fig. 4B–C). The maximum fluctuations were mainly observed in the C-terminal ACE2 binding sites present in the RBD (Figure S3). The comparison of the WT and mutant SARS-CoV-2 indicates significant conformational changes in the mutants, leading to disturbances in internal atomic motion and a loss of protein stability.

Fig. 4

The MD simulation conformational landscape exhibited by the RBD variants. (A) 2D projection plot showing the conformational space of the Cα-atoms and atomic number monitoring of the RBD WT and mutant structures in the essential subspace along eigenvectors 1 and 2. Average eigen RMSF values for the WT-RBD and the L452R, E484Q and L452R-E484Q mutant systems along (B) PC1 and (C) PC2

The WT and mutant structures were further examined using the diagonalized covariance matrix to assess the positional fluctuations of the SARS-CoV-2 receptor Cα-atoms. The atomic behavior is identified by the residual motions in the WT and mutant structures. The analysis uses a red and blue color representation, where red implies small fluctuations between the atoms and blue denotes large fluctuations (Figure S2). The amplitude and intensity for the WT had a value of 0.080 nm² (Figure S2A), whereas much larger values were observed in the mutants L452R, E484Q and L452R-E484Q, with 0.258 nm², 0.239 nm² and 0.239 nm², respectively (Figure S2B–D). The comparison of residual displacement in the MD trajectories of the WT and mutants confirms the atomic deformation of the native SARS-CoV-2 structure.

FEL results

The free energy changes of the WT and mutants were explored using FEL analysis through eigenvector plots. The comparison indicated a stable global free energy minimum confined within one basin in the WT (Fig. 5A). However, the mutants L452R, E484Q and L452R-E484Q had more expansive basins and numerous meta-stable conformations associated with multiple energy minima (Fig. 5B–D). From these observations, it is evident that the amino acid alterations induce structural destabilization of the SARS-CoV-2 RBD.

Fig. 5

FEL analysis plot displaying the direction of motion and magnitude analysis for (A) WT-RBD, (B) L452R-RBD, (C) E484Q-RBD, and (D) L452R-E484Q-RBD variants. The color is scaled according to kcal/mol

Moreover, the atomic density distribution analysis, plotted using the densmap script, also helps in understanding the changes in atomic orientation. The partial density area of the WT was stable with a minimum value of 3.95 nm⁻³ (Fig. 6A), whereas lower densities were observed for the mutations L452R, E484Q and L452R-E484Q, with values of 3.14 nm⁻³, 2.74 nm⁻³ and 2.94 nm⁻³, respectively (Fig. 6B–D). Thus, the comparison of density distributions affirmed high-impact structural transitions leading to a loss of stability upon amino acid substitution.

Fig. 6

Density distribution analysis, plotted using the densmap script, to understand the atomic orientation of the WT and mutants throughout the MD simulations: (A) WT-RBD, (B) L452R-RBD, (C) E484Q-RBD, and (D) L452R-E484Q-RBD

Discussion

Various studies have revealed a strong involvement of mutations in the dynamics of the spike glycoprotein (Ahamad et al. 2020, 2021a; Rezaei et al. 2020). Specifically, many researchers have stated that replacing amino acids results in structural changes in the spike glycoprotein of SARS-CoV-2, increasing its susceptibility to proteolysis. However, the mechanistic and functional consequences of amino acid substitutions remain unresolved (Mishra et al. 2021; Teng et al. 2021). Khalid and Naveed (2020) reported an in-depth study on spike glycoprotein-ACE2 interactions, exploring both sequence and structural features using a bioinformatics approach. The authors labeled various single nucleotide polymorphisms (SNPs) as damaging/deleterious/destabilizing that block the interaction, which might halt the virus from entering the host cell (Khalid and Naveed 2020). Ahmed et al. reported that mutations occurring in proteins can affect their conformation, folding and stability, which can eventually affect protein–protein interactions and thermodynamics. Recent research has reported several mutations in the SARS-CoV-2 spike glycoprotein RBD that enhance the infectivity and the binding interactions of the virus with the host ACE2 receptor. It is therefore necessary to study the conformational, stability and dynamic effects of mutations on the native RBD structure. Most of the mutations occur in the RBD region that binds directly to ACE2 and is able to form a stable complex with good binding affinity (Alaofi and Shahid 2021).

Our study highlighted the significance of amino acid substitutions on the native spike glycoprotein, which showed a significant destabilizing effect. The free energy (∆∆G) scores from the computational prediction algorithms for the mapped RBD mutations also indicated a strong destabilizing impact on the native spike glycoprotein, affecting receptor binding between the complexes. Herein, we studied three critical RBD mutants, L452R, E484Q and L452R-E484Q, using all-atom MD simulations to examine the molecular alterations of the spike glycoprotein associated with RBD mutations. The RMSD calculated for all the Cα-atoms starting from the central origin showed that the mutant structures deviated in a divergent manner throughout the simulations compared to RBD-WT. RMSF analysis also showed increased fluctuations in the mutant structures relative to the WT structure. Rg results showed large oscillations and significant fluctuations in the mutants compared to the WT. The SASA analysis showed decreased values for the mutants in contrast to the WT structure over an extended period, indicating a major structural transition. The mutations present in the RBD region showed high flexibility, solvent accessibility and gyration when compared to WT-RBD, suggesting the mutation did not alter the conformational compactness. The PCA, FEL and hydrogen bond analysis results also suggest a destabilizing effect for the selected mutations, which show significant free energy changes when compared to the WT structure. The H-bond analysis indicated stable behavior in the WT, whereas the mutants demonstrated major changes. Moreover, the energy spectrum, energy wells and atomic motion analyses provided further detail on stability: significant changes in the energy levels of the mutants were observed, indicating a loss of RBD stability in the mutants compared to the WT.
To summarize, the study provided a comprehensive view of destabilizing mechanisms on L452R, E484Q and L452R-E484Q of RBD associated with SARS-CoV-2 spike glycoprotein. The prediction tools, MD simulations and the free energy calculations indicate a loss in the stability of the spike glycoprotein due to the RBD mutations.

Conclusions

The RBD mutants, namely L452R, E484Q and L452R-E484Q, are predicted to have a significant deleterious impact on the SARS-CoV-2 spike protein. The comparative results of the prediction algorithms and MD simulation studies made it possible to study the effect of the mutations on the WT-RBD. The analyses of the MD simulation trajectories, namely atomic movements, residue flexibility, FEL, hydrogen bonds, Rg and SASA, indicate unstable mutant structures. In summary, the RBD mutants L452R, E484Q and L452R-E484Q displayed a destabilizing and deleterious impact on the native RBD structure.