Introduction

Viruses consist of hereditary material encased in a protein coat. They can cause diseases as severe as HIV/AIDS and COVID-19 (coronavirus disease 2019) [1]. Viral infections are currently regarded as a major public health issue. The current COVID-19 pandemic is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Coronaviruses (CoVs) are members of the Coronaviridae family in the order Nidovirales [2, 3]. They are classified into four subgroups known as alpha, beta, gamma, and delta [4, 5], and they take their name from the crown-like spikes on their surface. SARS-CoV-2, one of the seven coronaviruses that infect humans, is a member of the beta subgroup [6].

The first coronavirus-based disease was identified in 1931, and the first human coronavirus (HCoV-229E) was isolated in 1965. Until the emergence of SARS-CoV, the only known human coronaviruses were HCoV-229E and HCoV-OC43 [7]. Severe acute respiratory syndrome (SARS) is a severe and frequently fatal respiratory illness that was first recorded in November 2002 in Guangdong Province, China, with 11% mortality [8, 9]. Then, in 2012, an epidemic of Middle East respiratory syndrome (MERS), caused by a β-coronavirus (MERS-CoV) belonging to the Merbecovirus subgenus, was reported in Saudi Arabia, with a fatality rate of 34% [4, 10]. According to the World Health Organization (WHO), COVID-19 first emerged in Wuhan, China, in December 2019. It is a contagious illness that mostly affects the respiratory system and is caused by SARS-CoV-2, a recently identified coronavirus [11, 12].

Coronaviruses are large, enveloped, positive-sense, single-stranded RNA viruses that cause disease by infecting the respiratory systems of animals, including humans [13,14,15]. The progression of COVID-19 is characterized by three phases: the asymptomatic phase, the non-severe symptomatic phase, and the severe infection stage [16]. The spike (S), nucleocapsid (N), envelope (E), and membrane (M) proteins are the main structural proteins of SARS-CoV-2 [17, 18]. The S protein interacts strongly with the host cell via the ACE2 receptor [19]. After the S protein attaches to the ACE2 receptor, the viral envelope fuses with the host cell membrane and the virus enters the cell [17, 20, 21].

Due to their medicinal properties, plants have long been important for human wellness [22]. The WHO estimates that about 80% of people worldwide rely on medicinal plants or herbs for their medical needs [23, 24]. To tackle this global pandemic, scientists are assessing antiviral plant secondary metabolites (PSMs) as sources of therapeutic drugs and looking for novel drugs derived from medicinal plants [25]. Studies have demonstrated that plant metabolites can disrupt signaling pathways within cells, prevent the coronavirus S protein from binding to the host’s ACE2, and decrease the activity of enzymes involved in the coronavirus replication cycle, such as 3CL protease and papain-like protease [26, 27]. Therefore, the goal of the current work was to use computer-assisted analysis to identify phytochemicals from medicinal plant sources with high binding affinities against COVID-19 targets.

Calotropis procera, the milkweed shrub or Sodom apple, is a flowering plant native to tropical and sub-tropical areas of the world. It has long been known in traditional medicine and is considered an important plant for treating ailments such as asthma, diarrhea, dysentery, leprosy, malaria, skin diseases, and snake bites. There are also concerns regarding its toxicity, since ingesting any part of the plant may be fatal. However, the phytochemicals in this plant have been shown to have inhibitory effects against several viral and bacterial diseases [28].

Since a large number of compounds are under investigation, drug screening using in vitro and in vivo analysis has become more complex, time-consuming, and expensive. The in silico approach, which relies on computational techniques, aids the drug discovery process by making the investigation economical and resource-efficient. In this in silico analysis, phytochemicals derived from the medicinal plant Calotropis procera, which is found locally in South Asia, are used to analyze their interaction with the main protease (3CLp) of SARS-CoV-2. The goal of the present study is to examine the interaction potential of phytochemicals obtained from Calotropis procera with the main protease of SARS-CoV-2.

Materials and Methods

Phytochemicals from the plant Calotropis procera were tested for antiviral activity against SARS-CoV-2.

GC–MS Analysis

The composition of methanolic and water extracts of Calotropis procera was determined using GC–MS. Helium was employed as the carrier gas at constant pressure. The methanolic and water extracts (1 μL) were injected into the GC at a 20:1 split ratio. The injection temperature was set at 250 °C. The peak area in the chromatogram was used to measure the concentration of each analyte.
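The text does not say how peak areas were converted to concentrations. One common convention is area normalization, in which each analyte is reported as a percentage of the total integrated peak area. The sketch below illustrates that convention only; the function name and the peak areas are hypothetical and are not taken from the study.

```python
def relative_abundance(peak_areas):
    """Return each compound's share of the total integrated peak area, in percent.

    Assumes simple area normalization (no internal standard or response factors).
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated peak areas (arbitrary detector units), not measured data.
example_areas = {"compound_A": 1200.0, "compound_B": 300.0, "compound_C": 500.0}
percentages = relative_abundance(example_areas)
```

Area normalization assumes a similar detector response for all analytes, so it is best read as a semi-quantitative estimate.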

Selection of Phytochemicals

Dr. Duke’s Phytochemical and Ethnobotanical Databases were used to search for phytochemicals in the Calotropis procera plant. PubChem and SwissADME were used to retrieve the canonical SMILES of the selected phytochemicals, which were required for docking and ADMET reports.

Assessment of Phytochemicals (ADMET and Drug-likeness Prediction)

The phytochemicals were screened based on ADMET properties and drug-interaction prediction using the SwissADME, PreADMET, and cbligand.org online servers. The phytochemicals’ pharmacological properties and pharmacokinetics, such as solubility (ESOL), gastrointestinal (GI) absorption, blood–brain barrier (BBB) penetration, and violations of Lipinski’s rules, were investigated [29]. The criteria for screening phytochemicals are listed in Supplementary Table 1.

Table 1 Identification of the compounds in Calotropis procera

Retrieval and Assessment of Protein Structure

We obtained the main protease protein of SARS-CoV-2 (PDB ID: 6XA4, resolution 1.65 Å) from the Protein Data Bank (PDB; www.rcsb.org). Chimera 1.14 was used to remove the inhibitor attached to the protein’s 3D structure and to find the active site of the protein. PyMOL, Chimera 1.14, and Discovery Studio Visualizer were used to check the active site. The reliability of the protein structure was assessed with a Ramachandran plot generated by the PROCHECK V6.0 online server [30]. The ExPASy ProtParam server was used to perform a physicochemical analysis of the viral protein. The following variables were calculated: aliphatic index (AI), isoelectric point (pI), instability index (II), number of positively charged residues (+R), number of negatively charged residues (−R), extinction coefficient (EC) at 280 nm, and grand average of hydropathicity (GRAVY). The GOR4 online service was used to predict and evaluate the secondary structure [31].
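Two of the variables listed above have simple closed-form definitions, and the sketch below shows how they can be computed from a sequence. It is an illustration rather than the server's own code: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's definition of the aliphatic index, AI = X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. The function names and the test peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = sequence.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical four-residue peptides, used only to exercise the functions.
example_gravy = gravy("AAAA")
example_ai = aliphatic_index("AVIL")
```

Applied to the full 306-residue main protease sequence, the same functions should give values comparable to those reported by the server, although small differences in rounding are possible.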

Protein–Protein Interaction (PPI)

We examined how viral proteins interact with human proteins, as well as their roles, using the VIRUSES online interface, which is part of STRING version 10.5 [32]. This server was used to predict the relationship between the SARS-CoV-2 protein (NCBI taxonomy ID: 694009) and Homo sapiens proteins (NCBI taxonomy ID: 9606).

Molecular Docking

AutoDock Vina incorporated in Chimera 1.14 was used to assess molecular docking of the SARS-CoV-2 protein 3CLp with certain phytochemicals [33, 34].

Dock Preparation and Visualization

Structures of the selected phytochemicals (ligands) were built from their canonical SMILES, energy-minimized, and saved as mol files. Minimization used 100 steepest descent steps with a step size of 0.02 Å, followed by 10 conjugate gradient steps. Hydrogens were added first, taking histidine protonation states and hydrogen bonds (the slower option) into account, followed by charges. The protein (receptor) structure was prepared with the Dock Prep command: the inhibitor was removed, incomplete side chains were replaced using the Dunbrack 2010 rotamer library, hydrogens and charges were added, and the finished file was saved in mol format. Discovery Studio Visualizer was used to view the binding pockets, binding residues, and H-bonds in the 2D and 3D structures of the resulting complexes, with the ligand-receptor complex file supplied in pdb format.

Density Functional Theory (DFT) Analysis

Gaussian and GaussView were used for DFT calculations [35, 36]. Transition energies of the selected phytochemicals used against the main protease were calculated in the ground state by DFT on optimized structures using the B3LYP functional (Becke, three-parameter, Lee-Yang-Parr hybrid functional) to evaluate their reactivity and effectiveness [37]. The HOMO and LUMO energies were used in the study. The selected basis set was 6-311++G(d,p).

GC–MS Assessment

GC–MS assessment was conducted to evaluate the qualitative and quantitative composition of Calotropis procera, and the findings are shown in Table 1 and Fig. 1. A total of 20 compounds were identified. The main constituents are 1-octadecene, phytol, 5-cholestene-3-ol, stigmasterol, gamma sitosterol, beta amyrin, and alpha amyrin. The phytochemicals that showed the best results in molecular docking and DFT analysis were not detected in the GC–MS analysis, possibly because they do not dissolve in methanol and water. Furthermore, the GC–MS analysis was performed on leaf samples, whereas Dr. Duke’s Phytochemical and Ethnobotanical Databases indicate that these compounds occur at high levels in the floral parts of the plant.

Fig. 1 GC–MS chromatogram of Calotropis procera sample

Drug-likeness Prediction

The ADMET properties and drug-likeness of all phytochemicals were assessed. The criteria listed in Supplementary Table 1 were used to screen these phytochemicals. They were also screened using Lipinski’s rule of five [38], which describes the molecular characteristics that matter for a compound’s pharmacokinetic potential [29]. These rules are critical when a phytochemical is intended for use as an oral medicine [39]. Phytochemicals that met these criteria were subjected to further docking studies. A total of 14 phytochemicals out of 50 met the requirements for being drug-like and possessing appropriate ADMET profiles. Supplementary Table 2 lists the phytochemicals that were screened, and Supplementary Table 3 lists their general properties.
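As an illustration of the rule-of-five screen described above, the sketch below counts violations of the four classical Lipinski criteria (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The `Descriptors` class, the function names, the tolerance of one violation, and the descriptor values are assumptions made for the example; the study's actual cut-offs are those in Supplementary Table 1.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float      # molecular weight in Da
    logp: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d, max_violations=1):
    """Assumed convention: a compound passes with at most one violation."""
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor values, not taken from the supplementary tables.
small_ligand = Descriptors(mol_weight=390.5, logp=1.9, h_bond_donors=3, h_bond_acceptors=6)
bulky_ligand = Descriptors(mol_weight=720.0, logp=6.2, h_bond_donors=7, h_bond_acceptors=12)
```

A stricter screen can be obtained by calling `passes_rule_of_five(d, max_violations=0)`.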

Quality Assessment of Protein Structure

The primary physicochemical analysis was conducted, and the amino acid composition was determined. The main protease was observed to be made up of twenty-two amino acids in various proportions (data not shown). Leucine was the most abundant of these amino acids (9.5%), which points to the protein’s hydrophobic character. The sequence is 306 residues long and the molecular weight is 33.8 kDa. The protein’s isoelectric point is 5.95, indicating that it is slightly acidic. This is consistent with its charged residues: 22 are positively charged (Arg + Lys), while 26 are negatively charged (Asp + Glu). At 280 nm, the extinction coefficient of this protein was 33,640. This value is useful for quantifying the protein when studying protein–ligand or protein–protein interactions. The instability index, used to evaluate the protein’s stability, is 27.65. Because values below 40 are predicted to be stable, this indicates that the protein is stable in a test tube. The aliphatic index of a protein indicates the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine, and leucine). This protein has an aliphatic index of 82.12. The grand average of hydropathicity (GRAVY) measures the hydrophobicity of the amino acid residues in a protein. The GRAVY value of this protein is −0.019 (data not shown). The protein contains five different elements (C, H, N, O, and S) and 4686 atoms in total, and its molecular formula is C1499H2318N402O445S22.
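The atom count and molecular weight quoted above can be cross-checked directly from the molecular formula. The short sketch below does so using standard average atomic masses; the helper name `parse_formula` is introduced only for this illustration.

```python
import re

# Standard average atomic masses (Da) for the elements in the formula.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula):
    """Parse a flat formula such as 'C6H12O6' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


counts = parse_formula("C1499H2318N402O445S22")
total_atoms = sum(counts.values())  # 4686, matching the total stated in the text
mass_da = sum(ATOMIC_MASS[el] * n for el, n in counts.items())
mass_kda = mass_da / 1000.0         # about 33.8 kDa
```

Both values agree with the reported figures, which suggests that the formula, atom count, and molecular weight are internally consistent.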

The local conformation of a protein’s polypeptide backbone is referred to as its secondary structure. β-strand (E) and α-helix (H) are the regular secondary structure states, and the coil region (C) is the irregular secondary structure state. Kabsch and Sander developed the DSSP (Dictionary of Secondary Structure of Proteins) secondary structure assignment system [41]. Based on hydrogen-bonding patterns, it automatically assigns the secondary structure to one of 8 states (H, E, B, T, S, L, G, and I). These eight states are often grouped into three categories: helix, sheet, and coil. Under the most widely used convention, G, H, and I are classified as helices, B and E as sheets, and all other states as coils.
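The eight-to-three state reduction described above amounts to a simple lookup. The sketch below implements the convention stated in the text (G, H, I as helix; B, E as sheet; everything else as coil); the function name and the example string are hypothetical.

```python
# Helix and sheet states; every other DSSP symbol is treated as coil (C).
EIGHT_TO_THREE = {"G": "H", "H": "H", "I": "H", "B": "E", "E": "E"}


def reduce_to_three_states(dssp_string):
    """Map an 8-state DSSP assignment string onto helix (H), sheet (E), coil (C)."""
    return "".join(EIGHT_TO_THREE.get(state, "C") for state in dssp_string.upper())


# Hypothetical 8-state assignment, used only to exercise the mapping.
three_state = reduce_to_three_states("HHGIEEBTSL")
```

Other reduction conventions exist (for example, treating G or B as coil), so the mapping table should be adjusted if a different convention is needed.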

A Ramachandran plot [40] was generated to evaluate the quality of the structure of the SARS-CoV-2 main protease. Analysis of the backbone angles and conformations of each amino acid residue in the main protease model showed that 89.0% of residues were in the most favored region, 9.8% were in the additionally allowed region, 0.8% were in the generously allowed region, and only 0.4% were in the disallowed region. The total number of residues was 304, and the residue properties revealed a maximum deviation of 5.6 and a bond length value of 4.8.

Protein–Protein Interaction (PPI)

The protein–protein interactions of the viral protein 3CLp within the host cell underline its importance as a target protein. According to an analysis of the virus-host PPI network (Fig. 2), 3CLp is directly linked with the human proteins CXCL10, EIF2C1, EIF2C2, EIF2C3, and EIF2C4. 3CLp and CXCL10 had a combined interaction score of 0.652, while 3CLp and each of EIF2C1, EIF2C2, EIF2C3, and EIF2C4 had a combined interaction score of 0.584 (Table 2), suggesting a close relationship. Table 3 summarizes the functions of the human proteins that interact with the viral protein.

Fig. 2 A virus-human protein–protein interaction (PPI) network showing the connections between the human SARS coronavirus and Homo sapiens proteins. The viral proteins are represented by red circles and the human proteins by gray circles. The color of the lines (edges) indicates the type of interaction, the thickness of the edges reflects the strength of the supporting data, and the number of interactions is shown by multiple edges of different colors

Table 2 Virus-Host protein–protein interaction score
Table 3 Functions of interacting human proteins with virus protein

Molecular Docking Studies

3CLpro has already been explored extensively in recent years [42]; however, further research is necessary to explore its full inhibitory potential. In our study, the binding scores of the 14 chosen phytochemicals to the target protein were used to calculate binding energy and inhibition constant (Ki) values. The ligand-receptor complexes were chosen using a −4.3 kcal/mol threshold. A total of 11 phytochemicals had a binding affinity of −4.3 kcal/mol or stronger (that is, ≤ −4.3 kcal/mol), indicating their potential as drugs against 3CLp.

Uscharin has the strongest binding affinity, − 6.7 kcal/mol, and forms two hydrogen bonds. Voruscharin forms two hydrogen bonds and has a binding affinity of − 6.5 kcal/mol. Coroglaucigenin has a binding affinity of − 5.6 kcal/mol with a single hydrogen bond. Frugoside forms two hydrogen bonds and has a binding affinity of − 5.4 kcal/mol. Benzoylisolineolone has a binding affinity of − 5.1 kcal/mol and forms two hydrogen bonds. Uzarigenin has a binding affinity of − 5.0 kcal/mol and forms one hydrogen bond. Corotoxigenin has a binding affinity of − 4.9 kcal/mol with two hydrogen bonds. Isolineolone has a binding affinity of − 4.9 kcal/mol with one hydrogen bond. Calotropagenin has a binding affinity of − 4.8 kcal/mol and forms two hydrogen bonds. Syriogenin has a binding affinity of − 4.8 kcal/mol and forms one hydrogen bond. Lineolone has a binding affinity of − 4.4 kcal/mol with two hydrogen bonds. A three-dimensional representation of the phytochemicals in the binding pocket of the main protease crystal structure is shown in Fig. 3. The docking results indicate that uscharin forms the strongest interactions. Table 4 summarizes 2D representations of the ligand-receptor complexes, along with their interacting amino acids, binding scores, and Ki values.
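The text does not state how Ki was derived from the docking scores. A common approach uses the thermodynamic relation Ki = exp(ΔG / RT), and the sketch below applies it to the binding affinities listed above, together with the − 4.3 kcal/mol cut-off. The temperature of 298.15 K and the function name are assumptions, so the resulting values are illustrative and need not match Table 4.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def inhibition_constant(delta_g_kcal, temperature_k=298.15):
    """Estimate Ki (mol/L) from a binding free energy via Ki = exp(dG / RT).

    The temperature is an assumption; the study does not report one.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Binding affinities (kcal/mol) as reported in the text above.
AFFINITIES = {
    "uscharin": -6.7, "voruscharin": -6.5, "coroglaucigenin": -5.6,
    "frugoside": -5.4, "benzoylisolineolone": -5.1, "uzarigenin": -5.0,
    "corotoxigenin": -4.9, "isolineolone": -4.9, "calotropagenin": -4.8,
    "syriogenin": -4.8, "lineolone": -4.4,
}
THRESHOLD = -4.3  # complexes at or below this value are retained

hits = {name: dg for name, dg in AFFINITIES.items() if dg <= THRESHOLD}
ki_micromolar = {name: inhibition_constant(dg) * 1e6 for name, dg in hits.items()}
```

Because Ki depends exponentially on ΔG, a difference of about 1.4 kcal/mol at room temperature corresponds to roughly a tenfold change in Ki.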

Fig. 3 Interaction of (a) uscharin, (b) voruscharin, (c) frugoside, (d) coroglaucigenin, (e) benzoylisolineolone, (f) corotoxigenin, (g) uzarigenin, (h) calotropagenin, (i) isolineolone, (j) syriogenin, (k) lineolone, (l) benzoyllineolone, (m) melissyl alcohol, and (n) beta-sitosterol with the binding sites of the main protease (3CLp) of SARS-CoV-2. H-bond donors are shown in purple and H-bond acceptors in green

Table 4 Docking results of phytochemicals with SARS-CoV-2 main protease (3CLp) (PDB: 6XA4)

Density Functional Theory (DFT) Analysis

The best five ligand-receptor complexes were selected for DFT analysis. The band gap, i.e., the difference between ELUMO and EHOMO, ranged from 0.022 to 0.196 kcal/mol for the five selected phytochemicals. Comparing the band gaps of the selected phytochemicals identified the one with the highest potential reactivity against the target protein: a smaller gap indicates higher reactivity. Uscharin had the highest reactivity against the SARS-CoV-2 main protease among the five phytochemicals, with a band energy gap of 0.022 kcal/mol. A summary of the DFT analysis is presented in Table 5.
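The ranking logic described above can be sketched in a few lines: compute the gap as ELUMO − EHOMO for each ligand and pick the smallest. The orbital energies below are hypothetical placeholders in arbitrary units (the individual HOMO and LUMO values are given in Table 5, not in the text), and the hardness helper assumes the usual approximation η = (ELUMO − EHOMO) / 2.

```python
def homo_lumo_gap(e_homo, e_lumo):
    """Band gap as the difference between LUMO and HOMO energies."""
    return e_lumo - e_homo


def chemical_hardness(e_homo, e_lumo):
    """Chemical hardness, assuming eta = (E_LUMO - E_HOMO) / 2."""
    return homo_lumo_gap(e_homo, e_lumo) / 2.0


def most_reactive(orbital_energies):
    """Return the ligand with the smallest gap, taken as the most reactive."""
    return min(orbital_energies, key=lambda name: homo_lumo_gap(*orbital_energies[name]))


# Hypothetical (E_HOMO, E_LUMO) pairs in arbitrary energy units, not study data.
example_orbitals = {
    "ligand_A": (-0.25, -0.05),
    "ligand_B": (-0.20, -0.10),
}
best = most_reactive(example_orbitals)
```

With real data, the same functions would take the HOMO and LUMO energies from the Gaussian output for each optimized ligand.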

Table 5 Analysis of phytochemicals against the major protease protein of SARS-CoV-2 based on density functional theory

Other Considerations

Although important, the delivery of the phytochemicals to the targeted sites, for example through nanoparticle formulation [43, 44], is beyond the scope of this study. The study also lacks in vivo analysis.

Conclusion

SARS-CoV-2 has drastically affected life worldwide. This study aimed to target the virus using computer-assisted drug discovery. Calotropis procera, an important medicinal plant in the context of viral infection, deserves greater attention. In the present study, out of 52 phytochemicals from Calotropis procera, 14 were found to have drug-like potential. The docking results revealed that 11 of the 14 phytochemicals had a high binding affinity for the main protease protein. Among these, uscharin, voruscharin, frugoside, coroglaucigenin, and benzoylisolineolone may be considered the top 5 drug-like candidates against 3CLp. DFT analysis of the best 5 phytochemicals revealed that uscharin exhibited the highest reactivity, as its band energy gap was the smallest (0.022 kcal/mol) among the five selected candidates. These phytochemicals can be investigated further for their effectiveness and safety as potential anti-SARS-CoV-2 drug candidates, specifically through MD simulation, cell line assays, and animal model-based assays.