Introduction

Since the onset of the COVID-19 pandemic in 2020, the global community has been grappling with the enduring impact of this highly burdensome disease. Health organizations such as the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO) have worked continuously to mitigate its consequences. SARS-CoV-2, initially identified as 2019-nCoV, emerged in Wuhan, China, as a virus of immense global significance. The virus takes a particularly severe toll on vulnerable populations, including the elderly and children, and documented symptomatic cases present with fever, dry cough, dyspnea, and distinct bilateral ground-glass opacities on chest CT scans (Chandra 2012; Nandakumar 2020; Sun et al. 2020).

However, when the first strain of SARS-CoV-2 emerged, there was no medically validated immunoprophylactic agent to combat the infection (Chandra 2012). In China, where the causative agent originated, various interventions were tried without success, including antimalarial drugs such as chloroquine and hydroxychloroquine, antivirals such as lopinavir and ritonavir, and convalescent patient serum. These interventions aimed to eliminate SARS-CoV-2 from infected individuals and confer a certain degree of immunity during the nascent stage of the pandemic (Nandakumar 2020). Therefore, as part of the concerted global effort to curtail the widespread transmission of SARS-CoV-2, vaccines have been developed that can elicit adaptive immunity in humans and thereby facilitate herd immunity within the population (Resolution & Others 2015).

Unfortunately, even the vaccines currently manufactured to confer acquired active immunity against the virus face limitations because of the many strains and variants of this RNA virus that have evolved through mutation. Because the viral genome is RNA, its mutation rate is very high, and new strains and variants are constantly emerging (Sanjua 2016). Despite this pronounced diversity of SARS-CoV-2, about 6.7 billion people had been vaccinated globally as of October 2021 with one of the available vaccines, chosen on medical recommendation according to age and the likelihood of allergenicity (Myers 2021). Nevertheless, no universal vaccine is available that works against all strains of SARS-CoV-2 and their individual variants. For example, the interim analysis of the phase 3 ENSEMBLE trial of Janssen's COVID-19 vaccine candidate showed 72% efficacy in the United States but lower efficacy in other countries, especially those with a higher prevalence of SARS-CoV-2 variants (pharmacypracticenews.com). A universally acting immunoprophylactic agent is therefore needed to confer the required acquired immunity on people of all ages in all parts of the world.

This research presents the design of a universal vaccine intended to provide immunity against every strain and variant of SARS-CoV-2. The spike glycoprotein (S), the site of the primary mutations that produced these diverse SARS-CoV-2 variants, was used as the basis for the design, which was carried out with a variety of immunoinformatic tools.

Methodology

Data assembly and retrieval of sequence

In order to reflect the circulating variants on the various continents of the world, representative SARS-CoV-2 whole-genome sequences were acquired from the Global Initiative on Sharing all Influenza Data (GISAID) databanks. We considered all of the World Health Organization’s Variants of Concern (VOC) and Variants of Interest (VOI) (Oladipo et al. 2021).

An average of twenty (20) sequences were retrieved per variant from at least five (5) of six (6) continents. Sequences of the variants of concern and interest, which include B.1.214.2, B.1.1.1.7, B.1.525, B.1.526, B.1.351, B.1.1318, B.1.1328, B.1.1.519, B.1.621, R.1, B.1.4662, C.37, P.1, C.1.2, B.1.617.1, B.1.427, C.36.3, B.1.427, B.1.620, B.1.617.2, B.1.619, and B.1.529, from various countries of Oceania, Africa, North America, South America, Europe, and Asia, were used for the analysis and vaccine design (Oladipo et al. 2020).

Annotation of protein sequences

The Transeq tool was used to convert the SARS-CoV-2 nucleotide sequences obtained from GISAID from 5 continents to protein sequences. Each complete protein sequence was then annotated using the ClustalW tool in MEGA-X software to locate the region corresponding to the previously established antigenic spike glycoprotein of SARS-CoV-2 (from the first SARS-CoV-2 sequence from Wuhan) (Kumar et al. 2018).

CTL prediction

CTL epitopes were predicted using the NetCTL 1.2 server (Madlala et al. 2021). The input was the set of retrieved SARS-CoV-2 sequences that had been annotated against the reference spike glycoprotein sequence and had passed the antigenicity prediction test on the Vaxijen server. The threshold value for epitope identification was 0.75, the weight on C-terminal cleavage was 0.15, and the weight on TAP transport efficiency was 0.05. In version 1.2 of the server, the MHC class I binding prediction is expanded to include 12 MHC supertypes, including the supertypes A26 and B39. The prediction rests on three main factors: MHC-I binding of peptides, proteasomal C-terminal cleavage, and Transporter Associated with Antigen Processing (TAP) transport efficiency (Rapin and Wiernsperger 2010).
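The way the three factors and the stated weights could combine into a single score can be sketched as follows. This is a minimal illustration, not the NetCTL implementation: the additive form of the combined score, the `Candidate` type, the function names, and the example values are assumptions; only the weights (0.15, 0.05) and the threshold (0.75) come from the text.

```python
from dataclasses import dataclass

# Values stated in the text.
W_CLEAVAGE = 0.15   # weight on proteasomal C-terminal cleavage
W_TAP = 0.05        # weight on TAP transport efficiency
THRESHOLD = 0.75    # epitope identification threshold


@dataclass
class Candidate:
    """Hypothetical container for the per-peptide component scores."""
    peptide: str
    mhc_score: float
    cleavage_score: float
    tap_score: float


def combined_score(c: Candidate) -> float:
    # Assumed additive form: MHC-I binding plus weighted cleavage and TAP terms.
    return c.mhc_score + W_CLEAVAGE * c.cleavage_score + W_TAP * c.tap_score


def select_epitopes(candidates):
    # Keep only candidates whose combined score exceeds the threshold.
    return [c for c in candidates if combined_score(c) > THRESHOLD]
```

With this sketch, a peptide scoring 0.70 for MHC binding, 0.90 for cleavage, and 1.00 for TAP would be retained, while one scoring 0.50 on all three components would be dropped.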

Prediction of helper T cell (HTL) epitopes

Helper T lymphocyte (HTL) epitopes of the SARS-CoV-2 variants were predicted using the Immune Epitope Database (IEDB) (Vita et al. 2015). Seven human alleles from the IEDB-recommended set were selected for Major Histocompatibility Complex II (MHC II) binding prediction: HLA-DRB5*01:01, HLA-DRB1*03:01, HLA-DRB1*15:01, HLA-DRB1*07:01, HLA-DRB3*01:01, HLA-DRB3*02:02, and HLA-DRB4*01:01. The predicted epitopes were ranked by percentile, and those with the smallest percentile ranks, which correspond to the strongest predicted MHC II binding, were selected (Zhang et al. 2008). Seven epitopes were selected per human allele, giving forty-nine for each variant.
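The selection rule described above (lowest percentile ranks, seven per allele) can be illustrated with a short sketch. The input layout, a list of `(allele, peptide, percentile_rank)` tuples, and the function name are assumptions made for illustration; they do not reflect the IEDB output format.

```python
from collections import defaultdict


def select_lowest_percentile(predictions, per_allele=7):
    """Return, for each allele, the peptides with the lowest percentile ranks.

    `predictions` is an iterable of (allele, peptide, percentile_rank) tuples.
    A lower percentile rank indicates stronger predicted MHC II binding.
    """
    by_allele = defaultdict(list)
    for allele, peptide, rank in predictions:
        by_allele[allele].append((rank, peptide))
    return {
        allele: [peptide for _, peptide in sorted(rows)[:per_allele]]
        for allele, rows in by_allele.items()
    }
```

Applied to seven alleles with at least seven predictions each, this yields seven epitopes per allele and forty-nine in total, matching the counts given in the text.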

Prediction of B-cell epitope

The linear B-cell epitopes of the SARS-CoV-2 spike glycoprotein were predicted, and epitopes that exceeded a threshold of 0.5 were chosen and subjected to additional analysis (Oladipo et al. 2020). For the prediction of linear B-cell epitopes, complemented by conformational B-cell epitope prediction, we used BCPred and ABCPred. BCPred applies a support vector machine with a kernel method and offers the amino acid pair (AAP) antigenicity method, while ABCPred, which covers a variety of pathogen groups, was used to confirm BCPred's predictions. Residues with higher values are reported to have better solvent accessibility (Ali et al. 2022).

Multi-epitope vaccine construction

Several B-cell epitopes were discarded because of their poor performance in antigenicity, allergenicity, and toxicity predictions, and only four (4) were used for the candidate vaccine construction. Of the 116 HTL epitopes initially predicted, two were finally used for the vaccine construct; the others were rejected based on prediction parameters such as IL-4, IL-10, and IFN. The CTL region was constructed from six (6) of forty-four (44) epitopes, chosen for their better performance in the CTL epitope antigenicity prediction. To augment the immune response (Yang et al. 2021), the final vaccine candidate contains an adjuvant, human beta defensin-1 (accession no. P60022) (Ryan and Diamond 2017), whose sequence was retrieved from UniProt. The adjuvant was placed at the amino (N) terminus of the multi-subunit sequence and joined to the first B-cell epitope through an EAAAK linker (Rehman et al. 2021). GPGPG linkers were used to fuse the subsequent B-cell epitopes, to join the last B-cell epitope to the first HTL epitope, and to join the subsequent HTL epitope (Shey et al. 2019). The CTL epitopes were linked using AAY linkers (Oladipo et al. 2021). The positioning of the epitopes, adjuvant, and linkers that make up the vaccine construct is depicted in Fig. 1.

Fig. 1 Construction of the multi-epitope vaccine
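The linker arrangement described for the construct can be sketched as a simple string assembly. This is an illustrative sketch only: the function name and the placeholder inputs are hypothetical, no real epitope sequences are reproduced, and the linker joining the last HTL epitope to the first CTL epitope is not stated in the text, so the use of AAY at that junction is an assumption.

```python
EAAAK = "EAAAK"  # rigid linker between the adjuvant and the first B-cell epitope
GPGPG = "GPGPG"  # linker between B-cell epitopes and between HTL epitopes
AAY = "AAY"      # linker between CTL epitopes


def assemble_construct(adjuvant, b_cell_epitopes, htl_epitopes, ctl_epitopes):
    """Join adjuvant and epitopes in the order adjuvant, B-cell, HTL, CTL."""
    # GPGPG joins the B-cell epitopes, the B-cell/HTL junction, and the HTL epitopes.
    b_and_htl = GPGPG.join(list(b_cell_epitopes) + list(htl_epitopes))
    # AAY joins the CTL epitopes.
    ctl_block = AAY.join(ctl_epitopes)
    # Assumption: the HTL/CTL junction also uses AAY (not specified in the text).
    return adjuvant + EAAAK + b_and_htl + AAY + ctl_block
```

Calling `assemble_construct` with the adjuvant sequence, four B-cell epitopes, two HTL epitopes, and six CTL epitopes would give a single amino acid string laid out in the order shown in Fig. 1.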

Toxicity, allergenicity, antigenicity, and solubility prediction

The goal of vaccination is to elicit an immunological response in the recipient. The potential vaccine should therefore be stable, highly antigenic, non-allergenic, non-toxic, and suitably soluble (Pyasi et al. 2021). The physicochemical characteristics and solubility of vaccine candidates help determine the vaccine's efficiency and effectiveness (Id et al. 2021a, b). The toxicity of the final multi-epitope vaccine sequence was evaluated with ToxinPred (Chen et al. 2020a, b), and the allergenicity of the final vaccine construct was evaluated using AllerTOP v.2.0 (Madlala et al. 2021). The antigenicity of the vaccine candidate was predicted using the Vaxijen 2.0 online server at a threshold of 0.5 (Ebrahimi et al. 2019) and AntigenPro (Yang et al. 2021), with an antigenic probability of ≥ 0.8 taken to indicate an antigenic construct (Onile et al. 2020). The Vaxijen server estimates the antigenicity probability of a protein with an alignment-free method that focuses on the physicochemical characteristics of the prototype vaccine (Oladipo et al. 2021). Finally, the solubility of the final vaccine was evaluated with SolPro (Yang et al. 2021) and the Protein-sol server (Rawal et al. 2021), both of which rely on machine learning for their predictions. All parameters were retained at their default values during prediction (Sarkar et al. 2020).

Physicochemical property

The physicochemical properties are crucial in determining how the vaccine interacts with its physiological environment (Yang et al. 2021). The ExPASy ProtParam tool was used to predict physicochemical parameters such as hydropathicity, charge, half-life, instability index, pI (theoretical isoelectric point), and molecular weight (Doytchinova and Flower 2008).
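Two of the sequence-derived parameters reported later in this paper, the grand average of hydropathicity (GRAVY) and the aliphatic index, follow simple published formulas and can be sketched directly. The sketch below is an independent illustration and not ProtParam code; it assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy per residue; positive values indicate hydrophobicity."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue in the sequence."""
    n = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

Both functions expect an uppercase one-letter amino acid string containing only the 20 standard residues.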

Secondary structure prediction

Using the primary amino acid sequence as input, the web-based server PSIPRED was used to predict the protein secondary structure (Shey et al. 2019). PSIPRED analyzes PSI-BLAST (Position-Specific Iterated BLAST) output using two feed-forward neural networks. It recognizes secondary structure reliably, as shown by its average Q3 score of 81.6% (Yang et al. 2021). The secondary structure of the vaccine construct was also predicted using the Self-Optimized Prediction Method (SOPMA), with parameters set at default (Singh et al. 2020). The states predicted by SOPMA include helices, sheets, turns, and coils (Oladipo et al. 2020). The method makes use of a customized BLAST search to identify and select sequences that clearly resemble the intended vaccine (Oladipo et al. 2021).

3D structure prediction

The 3D structure of the target protein was modeled using 3Dpro, a modeling tool of the Scratch Protein Predictor. This tool predicts the three-dimensional structure of the protein based on multiple-threading alignments and iterative prototype fragment assembly simulations (Rehman et al. 2021).

Structure refinement and validation

Computationally generated 3D structures may not represent the real or native structures of proteins. 3D structure refinement is therefore used to improve the quality of computationally predicted models so that they resemble native protein structures as closely as possible (Sarkar et al. 2020). The 3D structure of the chimeric vaccine construct was refined with the GalaxyRefine tool of GalaxyWeb, which output a total of five refined chimeric vaccine models (Naveed et al. 2021). This server, which can improve both local and global protein structure quality, rebuilds and repacks the protein side chains using a refinement approach that was successfully tested in the community-wide CASP10 experiment (Onile et al. 2020). Subsequently, the best model (model 1) from GalaxyRefine was validated using ProSA-web (Adhikari et al. 2018), which provides a z-score. A z-score that falls inside the range observed for the experimentally determined protein chains in the PDB database indicates that a query protein is of good quality (Sarkar et al. 2020). ProSA-web can detect potential inaccuracies in the predicted tertiary structure: the server displays the overall quality score as well as any problematic sections of the protein structure (Onile et al. 2020). Finally, the overall structural quality was validated by Ramachandran plot analysis (Mahmud et al. 2021). A Ramachandran plot is a graphic representation of an amino acid's main-chain conformational tendencies (Anderson et al. 2005).

Conformational B cell epitope prediction

B-cell epitopes fall into two basic types: continuous (linear) epitopes and discontinuous (conformational) epitopes (Adhikari et al. 2018). ElliPro at IEDB was used to identify the conformational B-cell epitopes. The vaccine protein's PDB structure was used as the input for predicting conformational epitopes in the vaccine construct's tertiary structure (Id et al. 2021a, b). ElliPro is a web-based tool that accepts two types of input: protein sequence and protein structure. In the first case, the user supplies a SwissProt/UniProt ID or a sequence in FASTA format or single-letter codes, together with threshold values for the BLAST e-value and the number of structural templates from PDB, and a 3D structure is generated from the provided sequence (Ponomarenko et al. 2008). On the vaccine's three-dimensional structure, ElliPro identified eleven (11) epitopes.

Codon optimization of designed vaccine peptide for expression analysis

To enable maximal protein production, codons were optimized with the Java Codon Adaptation Tool (JCat) server (Naveed et al. 2021; Shey et al. 2019). The GC content of the sequence was calculated as part of codon optimization. A nucleotide sequence with a higher GC content is considered more likely to be expressed successfully. Codon optimization increases protein and RNA levels significantly, implying that codon use is an essential determinant of gene expression (Zhou et al. 2016). JCat presents its results graphically and reports Codon Adaptation Index (CAI) values for both the pasted and the newly adapted sequences (No et al. 2005). The GC content of a sequence should be between 30 and 70 percent; outside this range, GC content has negative implications for translational and transcriptional efficiency (Shey et al. 2019).
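The two quantities discussed above can be illustrated with a short sketch. It is not JCat code: the function names are hypothetical, and the CAI calculation assumes the standard definition as the geometric mean of per-codon relative adaptiveness weights, which must be supplied by the caller from a reference codon usage table for the expression host.

```python
import math


def gc_content(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_in_optimal_range(sequence, low=30.0, high=70.0):
    """Check the 30-70 percent GC range stated in the text."""
    return low <= gc_content(sequence) <= high


def codon_adaptation_index(sequence, weights):
    """Geometric mean of relative adaptiveness weights over the codons.

    `weights` maps each codon to a value in (0, 1]; codons missing from
    the table are skipped. The table itself is an input, not part of this sketch.
    """
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))
```

A sequence in which every codon is the most frequently used synonym in the reference table has a CAI of 1.0, which is the upper bound of the index.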

Molecular docking of vaccine construct with cleaned TLR 2, 3, and 4 receptors

TLRs play a critical role in the innate recognition of invading microbes and their subsequent clearance. Toll-like receptors 2, 3, and 4 (TLR 2, 3, and 4) were chosen as the receptors and retrieved from the RCSB PDB database (PDB IDs: 5d3i, 7c76, and 2z64, respectively), while the vaccine model was used as the ligand. Because TLR structures retrieved from the PDB database contain other ligands and water molecules that could interfere with the binding of the vaccine construct to the receptor's pocket, Discovery Studio software was used to clean each protein and remove any extra ligands before docking. The binding affinity between the multi-epitope vaccine and the TLR 2, 3, and 4 receptors was assessed by molecular docking using ClusPro 2.0 (Naveed et al. 2021). ClusPro 2.0 works in three successive stages: rigid-body docking (Desta et al. 2020), clustering of the lowest-energy structures (Kozakov et al. 2013), and structural refinement by energy minimization. PIPER, a docking program based on the Fast Fourier Transform (FFT) correlation technique, is used in the rigid-body docking step (Kozakov et al. 2017). The complex with the lowest energy score and the most effective docking was selected (Hossain et al. 2021).

Immune response simulation

C-ImmSim, an agent-based simulator, was used to estimate the impact of the antigen and foreign particles on immune function and to characterize the immune response against them (Oladipo et al. 2021). Three (3) injections were analyzed over 1050 simulation steps, with other parameters left at default. The input file, in FASTA format, was the primary sequence of the vaccine construct. The C-ImmSim model represents a portion of (i) the primary lymphoid organs where lymphocytes are formed and mature (most notably the red bone marrow and the thymus gland), (ii) a secondary lymphoid organ (for example, a lymph node), which filters lymph and is where naive B and T cells are exposed to antigens, and (iii) a peripheral tissue that depends on the pathogen under study (Castiglione et al. 2021). The tool relies on machine learning methods for subunit vaccination and binding interactions, as well as a position-specific scoring matrix (Fadaka et al. 2021).

Molecular dynamics simulation

Molecular dynamics simulation is a well-known tool for analyzing biological systems at the molecular level. It has previously been used to test the stability of various protein complexes (Id et al. 2021a, b). The CABS-flex tool (Jamroz et al. 2013) was used to predict the protein–ligand stability. CABS-flex is a rapid, integrated technique based on a single-protein model for predicting protein structural variations and flexibility (Kuriata et al. 2018). The simulation length for CABS-flex has been tuned to give the best feasible convergence with 10-ns molecular dynamics simulations (Jamroz et al. 2014). The PDB file of the vaccine-TLR complex was used as the input file.

Disulfide engineering

Disulfide engineering of the vaccine protein was performed with Disulfide by Design 2 v12.2 (Sarkar et al. 2020). DbD2 is a platform-independent web-based application that re-implements the original design method, makes a number of additions, and substantially improves accessibility, visualization, and analytical capabilities through its web interface (Hossain et al. 2021; Craig and Dombkowski 2013). The server identifies the regions within a protein structure that are most likely to form disulfide bonds. The intra-chain, inter-chain, and Cβ for glycine residue options were selected during the prediction. The χ3 angle was kept at −87° or +97° ± 5° and the Cα–Cβ–Sγ angle at 114.6° ± 10° (Sarkar et al. 2020). Fig. 2 shows an overview of the entire procedure.

Fig. 2 The flow chart of the methodology

Results

Sequence retrieval and antigenicity prediction

Representative SARS-CoV-2 sequences were retrieved from the Global Initiative on Sharing All Influenza Data (GISAID) database, with a minimum of 12 sequences downloaded for each of the known variants from across the continents of the world. Of the total number of downloaded sequences, 22.87% (113) were from Europe, 19.84% (98) from North America, 19.03% (94) from Asia, 13.56% (67) from Africa, 13.16% (65) from South America, and 11.54% (57) from Oceania.

When these sequences were subjected to an antigenicity test using the Vaxijen 2.0 server at a threshold of 0.5, only 60.73% (300) were found to be antigenic. The antigenic sequences were then subjected to further predictions and analyses.
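The reported percentages can be cross-checked from the counts given above. The short calculation below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Sequence counts per continent, as reported in the text.
counts = {
    "Europe": 113,
    "North America": 98,
    "Asia": 94,
    "Africa": 67,
    "South America": 65,
    "Oceania": 57,
}

total = sum(counts.values())  # total number of downloaded sequences

# Share of each continent, in percent, rounded to two decimals.
shares = {region: round(100 * n / total, 2) for region, n in counts.items()}

# Share of sequences predicted to be antigenic (300 reported in the text).
antigenic_share = round(100 * 300 / total, 2)
```

The counts sum to 494 sequences, and the computed shares reproduce the percentages reported for each continent and for the antigenic fraction.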

Prediction and assessment of cytotoxic T lymphocytes (CTL) epitopes

From the sequences, more than 15,000 epitopes were predicted using the NetCTL 1.2 server. Out of this large number of epitopes, only 125 were selected for further assessment based on the significance of their binding affinities towards MHC-I. Using the ToxinPred and AllerTOP 2.0 servers, respectively, 45 epitopes were found to be non-toxic and 19 were found to be non-allergenic (Zhao et al. 2017). When these 19 immunogenic, non-toxic, and non-allergenic epitopes were subjected to an antigenicity test using the Vaxijen 2.0 server at a threshold of 0.5, only 8 epitopes passed and were employed for vaccine construction. These 8 epitopes were additionally predicted to be interferon-gamma (IFN-γ) inducing epitopes by the IFN-γ epitope server. IFN-γ is a cytokine known to help defend against intracellular pathogens (Bibi et al. 2021).

Prediction and assessment of helper T lymphocytes epitopes

HTL epitopes are high-binding MHC-II epitopes for the human HLA-DR alleles, predicted with the IEDB MHC-II web server. The predicted epitopes were subjected to toxicity, allergenicity, and antigenicity tests using the appropriate servers, and a total of 2 epitopes were found to be suitable and were employed in the construction of the vaccine. These epitopes were chosen on the basis that they are non-toxic, non-allergenic, and antigenic.

Prediction and assessment of B cell epitopes

The ABCPred server was used to predict the B-cell epitopes. Predicted B-cell epitopes with a binding score > 0.8 that were antigenic, non-allergenic, and non-toxic were selected for the vaccine construction. Five epitopes were selected in total.

Toxicity, allergenicity, antigenicity, and solubility prediction

The Vaxijen 2.0 server predicted an antigenic score of 0.6568 for the construct, whereas the AntigenPro tool of the Scratch Protein server yielded 0.939710, indicating that the construct is antigenic (Hossain et al. 2021). According to AllerTOP's allergenicity assessment, the vaccine design is non-allergenic. The ToxinPred server also predicted that the vaccine design would be non-toxic. The construct's solubility, as determined by Protein-Sol, is 0.378, and its predicted pI is 9.900, which is close to the value predicted by ExPASy ProtParam (9.60).

Physicochemical properties

According to the ExPASy ProtParam tool used for physicochemical prediction, the target protein construct employed in this investigation has an instability index of 29.67. This value indicates that the protein is stable (a score of > 40 indicates instability) (Hossain et al. 2021). The predicted aliphatic index is 65.58. A protein with a high aliphatic index is thermostable over a wide temperature range, implying that the construct is stable at high temperatures (Pyasi et al. 2021). The predicted grand average of hydropathicity (GRAVY) was 0.120, a positive value. This score indicates that the construct is hydrophobic, meaning that it has a low tendency to interact with the surrounding water environment (a negative value suggests high hydrophilicity) (Adhikari et al. 2018). The 326 amino acids in the intended design have a molecular weight of 35,667.15 Da (about 35.67 kDa). The projected half-life was 30 h in mammalian reticulocytes, > 20 h in yeast, and > 10 h in E. coli.

Secondary structure prediction

The vaccine construct’s secondary structure was predicted using two servers. Figure 3 shows the secondary structure of the construct as predicted by the SOPMA server. The alpha helix, beta-turn, extended strand, and random coil values for the design were 26.69 percent, 4.60 percent, 28.53 percent, and 40.18 percent, respectively. The alpha helix, beta-turn, signal peptide, and random coil are likewise predicted by PSIPRED V4.0, as illustrated in Fig. 4. Based on the results from the two servers, the largest share of the vaccine's amino acids is in the coil conformation. In addition, PSIPRED classifies residues as small nonpolar, hydrophobic, polar, or aromatic plus cysteine. Of these categories, small nonpolar amino acids make up the highest percentage of the construct (Fig. 5).

Fig. 3 Secondary structure of the vaccine construct as predicted by SOPMA

Fig. 4 Secondary structure of the protein construct as predicted by PSIPRED

Fig. 5 Other parameters of the protein construct as predicted by PSIPRED

3D structure prediction

The Scratch Protein Predictor was used to predict the vaccine construct's 3D structure. Figure 6 shows the final tertiary structure. PyMOL was subsequently used to examine the structure.

Fig. 6 3D structure of the vaccine construct viewed with PyMOL

3D-structure refinement and validation

The 3D structure was refined further with the 3Drefine and GalaxyRefine tools, yielding 5 refined structures (Fig. 7A). Model 2 was chosen because it had a higher Rama favored score (96.3) and generally acceptable GDT-HA (0.9095), RMSD (0.522), and MolProbity (1.945) scores (Table 1). In addition, the ProSA-web z-score plot gave the 3D structure of the construct a Z-score of -1.71 (Figs. 7B and C). These findings suggest that the 3D-modeled structure is trustworthy. The Ramachandran plot of the vaccine subunit showed that 99.630 percent of the amino acid residues fall in the highly preferred region (Fig. 7D), only 0.370 percent fall in the preferred region, and none (0.000 percent) fall in the questionable region.

Fig. 7 3D structure refinement and validation. A Galaxy refined structure, B Overall model quality, C Local model quality, D Ramachandran plot

Table 1 Structure information of the five (5) proposed models

Codon optimization of designed vaccine peptide for expression analysis

The optimized codon sequence was 157 nucleotides long. For the pasted sequence, the codon adaptation index (CAI) was 0.117 and the GC content 45.79 percent before adaptation; after adaptation, the CAI was 1.0 and the GC content 50.47 percent. Because a GC content between 30 and 70% is considered optimal and CAI values above 0.8 are considered good for expression in the target organism, these values suggest stable expression of the construct in E. coli.

Conformational B-cell epitope prediction

The discontinuous (conformational) B-cell epitopes were identified using ElliPro at IEDB. The vaccine protein’s PDB structure was used as the input for predicting conformational epitopes in the vaccine construct’s tertiary structure. On the three-dimensional structure of the vaccine construct, ElliPro found eleven epitopes, the highest-scoring of which had a predicted score of 0.872 and spanned residue positions 301–326 (Table 2). The structure of the preferred model is illustrated in Fig. 8.

Table 2 Conformational B-cell epitope prediction of the vaccine construct

Fig. 8 Structure showing the position of the preferred conformational B-cell epitope

Molecular docking of vaccine construct with cleaned TLR 2, 3, and 4 receptors

Using each TLR as the receptor and the vaccine 3D construct as the ligand, the interaction of the novel vaccine design with different human TLRs (TLR2, TLR3, and TLR4) was investigated. ClusPro's output comprised 30 possible models for each receptor-ligand interaction. The best model for the balanced coefficient was chosen based on binding energy weight. Figure 9 depicts the models that were chosen. To better understand how the vaccine candidate binds in the pocket of each TLR, the amino acid interactions between the toll-like receptor and the vaccine were extracted and visualized using the visualization tool LigPlot, as shown in Fig. 10.

Fig. 9 Molecular docking of the vaccine construct with TLR showing: A Docked complex of vaccine-TLR-2, B Docked complex of vaccine-TLR-3, and C Docked complex of vaccine-TLR-4

Fig. 10 Vaccine-TLR interaction visualization showing A Amino acid interaction between the vaccine and TLR-2, B Amino acid interaction between the vaccine and TLR-3, and C Amino acid interaction between the vaccine and TLR-4. In each panel, the vaccine amino acids are labeled green and the receptor amino acids are labeled blue

Two (2) amino acids from the vaccine design (Trp314 and Lys32) were found to bind in the pocket of TLR-2 to the amino acids Asp263, Cys287, and Ser259. Eight (8) amino acids (Asn310, Tyr307, Pro97, Lys159, Pro95, Pro146, Glu166, and Thr157) bound in the pocket of TLR-3 to Asp692, Tyr683, Ser695, Arg635, Tyr556, Asn608, Lys628, and Asp523. Finally, sixteen (16) amino acids bound in the pocket of TLR-4 to the amino acids Gln188, Arg234, Lys230, Glu89, Glu42, Asp50, Arg87, Gln39, Thr37, Lys47, Val32, and Lys130.

Immune response simulation

With three (3) injections, the C-ImmSim server was used to analyze the immunological response to the novel vaccine candidate as well as the increased half-life. Fig. 11 shows the antigen and immunoglobulin responses, B-lymphocyte count, CD4 T-helper lymphocyte count, macrophage total count, dendritic cell count, and cytokine response.

Fig. 11 Immune simulation response as performed by C-ImmSim for the projected vaccine candidate. A Immunoglobulin production in response to antigen, with IgM+IgG (orange) showing the highest production and the antigen labeled black. B B-lymphocyte production showing its active state per day, colored purple. C B-lymphocyte total count showing the production of memory cells per day of response, colored green. D CD4 T-helper lymphocyte count showing the most active state at the peak, colored purple. E Macrophage total count. F Dendritic cell count. G Cytokine response

Molecular dynamics simulation

The fluctuation and stability of the vaccine-receptor complexes were investigated using molecular dynamics modeling. Figure 12 depicts the complexes’ spin direction models as returned by the server. The server also provided the fluctuation plots (Fig. 13) and contact maps (Fig. 14).

Fig. 12 The spin direction of the vaccine-receptor complex. A Vaccine-TLR-2 complex interaction, B Vaccine-TLR-3 complex interaction, C Vaccine-TLR-4 complex interaction

Fig. 13 Molecular dynamics fluctuation plot. A Fluctuation plot of vaccine-TLR-2 complex, B Fluctuation plot of vaccine-TLR-3 complex, C Fluctuation plot of vaccine-TLR-4 complex

Fig. 14 Molecular dynamics contact map result. A Contact map of vaccine-TLR-2 complex, B Contact map of vaccine-TLR-3 complex, C Contact map of vaccine-TLR-4 complex

Disulfide engineering

Following de novo disulfide engineering of the vaccine construct, we examined the returned results against two criteria: an energy score of less than 2.2 and a χ3 value between −87 and +97. Six (6) amino acids, at positions 55, 89, 116, 191, 266, and 267, have χ3 values in this range, while just one amino acid has an energy score of less than 2.2. Table 3 shows the results in greater detail.

Table 3 Possible disulfide bond between the vaccine construct amino acids
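The two screening criteria applied to the disulfide engineering output can be written as a small filter. This is a hedged sketch: the `CandidatePair` type, its field names, and the function names are hypothetical and do not reflect the Disulfide by Design 2 output format; only the cut-offs (energy below 2.2 and χ3 between −87 and +97) come from the text.

```python
from dataclasses import dataclass


@dataclass
class CandidatePair:
    """Hypothetical record for one predicted residue pair."""
    residue_1: int
    residue_2: int
    chi3: float    # chi3 torsion angle, in degrees
    energy: float  # bond energy score reported by the server


def chi3_in_range(pair, low=-87.0, high=97.0):
    # Chi3 criterion stated in the text.
    return low <= pair.chi3 <= high


def energy_below_cutoff(pair, cutoff=2.2):
    # Energy criterion stated in the text.
    return pair.energy < cutoff


def passes_both(pair):
    # A pair is retained only if it meets both criteria.
    return chi3_in_range(pair) and energy_below_cutoff(pair)
```

Keeping the two checks separate makes it possible to report, as in the text, how many candidates meet the χ3 criterion and how many meet the energy criterion.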

Discussion

The pressing global need for a SARS-CoV-2 vaccine to combat the COVID-19 pandemic stems from the absence of an effective treatment for this highly transmissible and severe respiratory syndrome (Anasir and Poh 2016). Vaccination is universally acknowledged as the most potent defense against infectious diseases, and COVID-19 is no exception. The concept behind peptide vaccines revolves around the identification and synthetic production of immunodominant B-cell and T-cell epitopes capable of eliciting specific immune responses (Patronov and Doytchinova 2013). However, with the emergence of various SARS-CoV-2 variants, currently available vaccines have proven inadequate against these new strains, necessitating the development of a universal vaccine capable of addressing all identified variants of concern (VOC) and variants of interest (VOI) as classified by the World Health Organization (WHO). Consequently, this study focused on designing a comprehensive multi-epitope peptide subunit vaccine, employing the spike glycoprotein as the primary reference point (Oladipo et al. 2021).

Bioinformatics analysis enables the efficient and accelerated prediction of potent epitopes, revolutionizing vaccine design compared to traditional approaches. The spike protein, primarily located on the outer surface of the virion, emerges as a promising target for B-cell epitope exploration due to its significant exposure. Previous investigations have revealed the ability of spike proteins from MERS-CoV and SARS-CoV to elicit robust immune responses. Baruah and Bose’s study further corroborated the potential of spike protein epitopes as favorable candidates for the development of a SARS-CoV-2 vaccine (Chen et al. 2020a, b). We retrieved 494 sequences as representatives from six continents from the Global Initiative on Sharing all Influenza Data (GISAID) and took into account all Variants of Concern (VOC) and Variants of Interest (VOI) as captured by WHO (Tracking the SARS-CoV-2 variants) (Oladipo et al. 2022a; Oude et al. 2023). The antigenic properties of the sequences were predicted at a viral threshold of 0.5; sequences that fell below this threshold were considered non-antigenic and were discarded. The B-cell, CTL, and HTL epitopes were then predicted (Singh et al. 2020).

The prediction of B-cell and T-cell epitopes holds significant importance in combatting various infectious diseases, including SARS-CoV-2, because these cells play a crucial role in mediating the adaptive immune response. Part of this process involves the differentiation of B cells into plasma cells, which are responsible for synthesizing antibodies that actively combat foreign substances within the body (Fadaka et al. 2021). The HLA alleles used for helper T-lymphocyte (HTL) epitope selection followed the recommendations of the Immune Epitope Database (IEDB) and comprised HLA-DRB1*03:01, HLA-DRB1*07:01, HLA-DRB1*15:01, HLA-DRB3*01:01, HLA-DRB3*02:02, HLA-DRB4*01:01, and HLA-DRB5*01:01 (Vita et al. 2015).

The predicted T-lymphocyte epitopes, encompassing both helper T-lymphocyte (HTL) and cytotoxic T-lymphocyte (CTL) epitopes, were evaluated to determine their coverage across a diverse global population. The CTL epitopes exhibited an extensive reach, covering 68.87% of the entire world population, with notable coverage exceeding 70% in regions such as West Africa, West Indies, South Asia, and Europe (Oladipo et al. 2022b). Moreover, significant coverage exceeding 60% was observed in areas including North Africa, North America, Northeast Asia, South Africa, and Southwest Asia. Similarly, the predicted HTL epitopes exhibited broad coverage, encompassing 69.46% of the world population, with over 70% coverage in Europe and North America, as well as 60% coverage in East Asia, North Africa, South Asia, and the West Indies. Comparatively, the obtained results (Samad et al. 2022) exceeded the threshold value, highlighting the potential of these epitopes as strong vaccine candidates capable of inducing immune responses across multiple human leukocyte antigen (HLA) alleles.
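To illustrate how a population coverage figure of this kind can arise, the sketch below uses a simplified model: it assumes Hardy-Weinberg proportions within each HLA locus and independence between loci, so an individual is counted as covered when at least one of their alleles is restricted by some epitope in the set. This is not necessarily the exact algorithm of the IEDB tool; the function name and the allele frequencies are hypothetical and chosen only for the example.

```python
from typing import Dict, Set


def population_coverage(allele_freqs: Dict[str, Dict[str, float]],
                        covered_alleles: Set[str]) -> float:
    """Fraction of a population carrying at least one covered allele.

    allele_freqs maps locus -> {allele: frequency in the population}.
    covered_alleles holds the alleles bound by at least one epitope.
    Simplifying assumptions: diploid individuals, Hardy-Weinberg
    proportions per locus, and no linkage between loci.
    """
    prob_uncovered = 1.0
    for freqs in allele_freqs.values():
        # Combined frequency of covered alleles at this locus.
        f = min(sum(freq for allele, freq in freqs.items()
                    if allele in covered_alleles), 1.0)
        # Both chromosomes must carry an uncovered allele.
        prob_uncovered *= (1.0 - f) ** 2
    return 1.0 - prob_uncovered


# Hypothetical frequencies, for illustration only.
example_freqs = {
    "DRB1": {"HLA-DRB1*03:01": 0.20, "HLA-DRB1*07:01": 0.10},
    "DRB3": {"HLA-DRB3*01:01": 0.50},
}
example_covered = {"HLA-DRB1*03:01", "HLA-DRB3*01:01"}

coverage = population_coverage(example_freqs, example_covered)
```

With these invented numbers, the uncovered probability is 0.8² × 0.5² = 0.16, giving a coverage of 0.84. Adding epitopes restricted by further alleles can only raise the figure, which is why broad allele sets are favored in multi-epitope designs.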

The final construct was meticulously crafted by integrating the cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL) and B-cell epitopes, along with their requisite linkers. To enhance the vaccine’s stability and longevity, the adjuvant (β-Defensin) was conjugated to the CTL epitope via the EAAAK linker. This strategic design facilitates the generation of robust cellular and humoral immune responses specific to the target antigens, ensuring a comprehensive immune defense (Samad et al. 2022). Additionally, the construct incorporates other essential linkers such as GPGPGP and AAY (Ala-Ala-Tyr), serving as intra HTL, B-cell, and CTL epitope connectors. These linkers not only enhance the immune response against the pathogen but also prevent any potential immunogenicity arising from junctional regions, thereby optimizing the vaccine’s efficacy.
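The assembly described above can be sketched as plain string concatenation. This is an illustrative sketch only: the function name and the placeholder sequences are hypothetical, the block order (adjuvant, CTL, HTL, B-cell) and the assignment of AAY to CTL epitopes and GPGPGP to HTL and B-cell epitopes are assumptions, and each block is assumed to be joined to the previous one with its own linker.

```python
from typing import Sequence


def build_construct(adjuvant: str,
                    ctl_epitopes: Sequence[str],
                    htl_epitopes: Sequence[str],
                    bcell_epitopes: Sequence[str],
                    adjuvant_linker: str = "EAAAK",
                    ctl_linker: str = "AAY",        # assumed CTL linker
                    htl_linker: str = "GPGPGP",     # assumed HTL linker
                    bcell_linker: str = "GPGPGP"    # assumed B-cell linker
                    ) -> str:
    """Concatenate adjuvant and epitope blocks into one peptide sequence."""
    if not ctl_epitopes:
        raise ValueError("at least one CTL epitope is required")

    # Adjuvant is attached to the first CTL epitope through EAAAK.
    construct = adjuvant + adjuvant_linker + ctl_linker.join(ctl_epitopes)

    # Remaining blocks: a linker precedes the block and separates its epitopes.
    for epitopes, linker in ((htl_epitopes, htl_linker),
                             (bcell_epitopes, bcell_linker)):
        if epitopes:
            construct += linker + linker.join(epitopes)
    return construct


# Placeholder sequences, not the epitopes selected in this study.
example = build_construct(
    adjuvant="MKT",
    ctl_epitopes=["WWW", "CCC"],
    htl_epitopes=["DDD"],
    bcell_epitopes=["EEE", "FFF"],
)
```

Exposing the linkers as parameters makes it straightforward to compare alternative linker assignments before the construct is passed on to the downstream prediction servers.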

The physicochemical properties predicted for the universal vaccine candidate point to several useful characteristics of a prospective vaccine. The peptide comprises 326 amino acid residues, a length suited to a vaccine construct, and has a molecular weight of 35,667.15 Da. The theoretical isoelectric point (pI) indicates that the construct is slightly basic. According to the predicted instability index, the protein is expected to be stable after synthesis. In addition, the GRAVY value (29.67) and aliphatic index (65.58) showed that the vaccine was both thermostable and hydrophobic. These favorable physicochemical properties, together with the scores on the other metrics, greatly increase the likelihood that the construct is a viable vaccine candidate (Samad et al. 2022).
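For readers unfamiliar with the two indices, the sketch below shows how they are conventionally defined: GRAVY is the mean Kyte-Doolittle hydropathy of the residues, and the aliphatic index is X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. The function names are hypothetical and the test sequences are arbitrary; this is not the code of the server used in the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Arbitrary short sequences, used only to exercise the functions.
gravy_example = gravy("AR")                # (1.8 - 4.5) / 2
aliphatic_example = aliphatic_index("AVIL")
```

Because every per-residue value on the Kyte-Doolittle scale lies between -4.5 and 4.5, a GRAVY score computed this way always falls in that interval, with positive values indicating a hydrophobic sequence and negative values a hydrophilic one.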

The half-life of the peptide sequence was estimated to gauge its stability under various conditions. The analysis revealed that the potential universal vaccine candidate exhibited a half-life of approximately 30 h in mammalian reticulocytes (in vitro), while demonstrating shorter durations of less than 20 h in yeast and less than 10 h in Escherichia coli (in vivo). These estimates were derived using the ProtParam tool and are consistent with the findings of previous studies (Samad et al. 2022). Furthermore, solubility prediction indicated that the vaccine candidate has low solubility, with a value of 0.378 that falls below the threshold of 0.45.

The designed peptide was shown to be antigenic, with a VaxiJen score of 0.6568, and this was validated with ANTIGENpro at a value of 0.939710, which also confirms its antigenic character. Toxicity analysis with ToxinPred indicated that the peptide is non-toxic. Because a potent candidate must not be allergenic, the construct was also analyzed for allergenicity; it was predicted to be a non-allergen by AllerTOP 2.0, and this was confirmed by AllergenFP. The secondary structure was predicted using the SOPMA server and verified with PSIPRED; both tools indicate that the structure contains alpha helix, beta turn, extended strand, and random coil elements. Following this result, the 3D structure was predicted with the SCRATCH protein predictor and was then refined and validated.

Furthermore, to assess its compatibility with human Toll-like receptors in silico, we docked the potential vaccine candidate against TLR-2, TLR-3, and TLR-4 (Singh et al. 2020), which also enabled us to visualize the interactions between these molecules using LigPlot. Human Toll-like receptor 4 showed better compatibility, in line with similar work by Oladipo et al. (2021), Oladipo et al. (2020), and Samad et al. (2022). In this interaction, sixteen amino acid residues are probably involved in binding, which increases the suitability of the construct.

Immune simulation was carried out to indicate the possible effect of the vaccine on the immune cells of the host, with administration of the first (initial) dose, a second dose after 28 days, and a booster dose after a few months. The resulting immune titre curves support its use and suggest a proper immune response (Bassaganya-Riera and Hontecillas 2016). Molecular dynamics simulation of the vaccine and TLR-4 was performed in order to investigate the interactions and stability between these biomolecules. The results, obtained with CABS-flex, are shown in the fluctuation plots above; the procedure was run for 10 ns, which agrees with Jain et al. (2014). Disulfide engineering was performed to assess the bond energy between amino acid residues, and this conforms with Oladipo et al. (2021).

Applying immunoinformatics techniques to SARS-CoV-2 variants yielded vaccine constructs with promising potential as a universal COVID-19 vaccine candidate. However, this claim requires validation through subsequent wet-lab assessments.

Conclusion

The study examined the genomic sequences of a representative sample of circulating SARS-CoV-2 variants in an effort to identify a possible vaccine candidate for COVID-19. The peptide constructs show promising properties in in silico assessments as a potential universal vaccine and may proceed to wet-lab assessment for the development of a targeted vaccine against COVID-19 infection.