Introduction

Development of modern medicine in the past century has largely relied on the use of antibiotics to drastically reduce the morbidity and mortality due to various bacterial pathogens such as Staphylococcus aureus and Klebsiella pneumoniae. At the same time, their widespread use has resulted in the emergence of drug-resistant bacteria that render these molecules inactive, leading to a higher rate of treatment failure. Therefore, there is a pressing need to develop novel alternatives to antibiotics for the treatment of bacterial infections (Ventola 2015; Levy and Marshall 2004).

In recent times, anti-microbial peptides (AMPs) from natural or synthetic sources are being researched as a novel class of therapeutic agents showing promising activity against bacterial infections (Mahlapuu et al. 2016). These are small and rapidly active peptide sequences that act as essential protective components of the immune systems of all organisms. One of the major advantages of AMP candidates is their ability to target multiple proteins in the pathogen. The diverse modes of action exhibited by such multi-targeting drugs make AMP-based therapies an interesting solution to the problem of drug resistance (Baltzer and Brown 2011; Hincapié et al. 2018; Cardoso et al. 2020).

The design and development of novel AMP sequences through various approaches have hence become particularly attractive in therapeutic applications as a replacement for conventional antibiotics. The basic principles of AMP design techniques overlap and are based on specific characteristics of AMPs such as size, charge and hydrophobicity. Typically, short cationic AMPs with fewer than 15 amino acids, an overall positive charge and a hydrophobicity ratio of 30–50% are found to have good potential for development as active peptides. Primarily, natural AMPs isolated from various sources are analyzed and used for re-designing novel sequences with optimal therapeutic potential, overcoming limitations such as chemical instability and cytotoxicity. The majority of such studies were carried out to design AMPs acting specifically against common bacterial pathogens such as S. aureus, Mycobacterium tuberculosis and Streptococcus pneumoniae (Pearson et al. 2016).

Rational approaches that generate and test analogue sequences from known AMPs with the required specificity and reduced side effects have been widely used to design AMP variants (Porto et al. 2012). Previously identified AMPs have been used as templates to re-design novel ones using rational principles (Mishra et al. 2017). Deslouches et al. utilized rational techniques based on residue properties to design sequences that showed specific activity against Pseudomonas aeruginosa and S. aureus (Deslouches et al. 2013). In a different approach, de novo techniques generate multiple sequences from scratch with desirable physicochemical properties such as the length and charge of the sequences, which are then tested for efficacy (Deslouches et al. 2005).

Thus different approaches are being utilized on a case-to-case basis for the identification of novel peptide-based drug candidates. Concurrently, computational tools and techniques are also used to assist such discoveries. With the progress of genomic and proteomic techniques accelerating the discovery of natural AMPs, several databases containing AMP annotations have been built such as the APD3 (The Antimicrobial Peptide Database), CAMPR3 (Collection of Anti-Microbial Peptides) and DBAASP (Database of Antimicrobial Activity and Structure of Peptides) databases (Wang et al. 2016; Waghu et al. 2016; Pirtskhalava et al. 2016). Besides information retrieval, these databases have also been used to promote drug development as they were found to be a good resource for prediction or design of novel sequences and optimization of existing sequences (Wang et al. 2009; Vishnepolsky and Pirtskhalava 2014; Waghu et al. 2016). In this regard, empirical, stochastic or machine learning based approaches have been utilized for development of tools for the optimal design of AMPs (Porto et al. 2012).

Consequently, an increasing number of computational techniques relying on these databases and tools are being exploited for optimal AMP design and tested for their efficacy. Mishra and Wang utilized a database filtering method to design ab-initio peptides against methicillin-resistant S. aureus (MRSA) based on the characteristics of the peptides present in an AMP database (Mishra and Wang 2012). Hincapié et al. generated new candidate sequences based on desirable physicochemical properties that were then modified using available design and prediction tools (Hincapié et al. 2018). Such prediction tools have also been found useful when implemented in conjunction with advanced transcriptomic analyses to detect potential candidates (Kim et al. 2016). Thus, different bioinformatic approaches relying on AMP databases and prediction tools have been utilized alone or in combination to design novel potential AMP candidate sequences. The odds of discovering promising AMP candidates are found to be higher when different techniques are combined with each other and with further confirmation studies conducted in-silico, in-vitro or in-vivo (Cardoso et al. 2020).

In the current work, we aim to design novel broad-spectrum AMP sequences de novo using a database-guided approach followed by consecutive refinement with available design and modification tools. First, the APD3 database was used to derive desirable properties, which were then used to filter randomly generated peptide sequences for those with AMP potential. The potential sequences were then modified for improved activity using the prediction and design tools of CAMPR3 and DBAASP. Finally, the inhibition efficacies of the designed sequences were tested using in-vitro dilution assays as a means of preliminary validation.

Materials and Methods

In-Silico Design of Novel AMPs

The current paper utilizes a consecutive refinement approach for computational prediction and design of potential broad-spectrum AMP sequences by taking into account various desirable molecular characteristics along with the results from databases and tools for AMP prediction (Wang et al. 2009; Mishra and Wang 2012; Vishnepolsky and Pirtskhalava 2014; Hincapié et al. 2018).

Generation of Random Peptide Sequences Based on Database Filtering Technique

Sequences of antimicrobial peptides with the desirable characteristics mentioned above were downloaded from the Antimicrobial Peptide Database (APD3) at http://www.unmc.edu/AP/ (Wang et al. 2016). The database contains AMPs from various sources along with information such as sequence, length, charge, structure and activity spectrum. In order to identify the characteristic length and probable amino-acid residues for novel short broad-spectrum AMPs, the sequences from APD3 with the following desired properties were retrieved: (i) length less than 16 amino acids, (ii) activity against both Gram-positive and Gram-negative bacteria, (iii) hydrophobicity between 30 and 40%, (iv) cationic charge and (v) no toxicity to mammalian cells. From the retrieved set of peptide sequences, the average statistic length and the pattern of amino-acid frequencies were analyzed to set a guideline for filtering randomly generated peptides. Based on the identified length, 20,000 unique peptide sequences were generated by random sampling of amino-acid residues. The sequences were then filtered by the identified amino-acid frequency pattern, charge and hydrophobicity ratio as above.
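The generate-and-filter step described above can be illustrated with a minimal sketch. It is not the authors' script: uniform sampling over the 20 standard residues, the hydrophobic residue set, the simple K/R versus D/E charge count, the default length of 11 (the value reported later in the Results) and all function names are assumptions made for illustration. The amino-acid frequency filter is omitted because its exact rule is not specified in the text.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumed hydrophobic residue set; the database's own definition may differ.
HYDROPHOBIC = set("AILMFVWC")

def net_charge(seq):
    """Simplified net charge: +1 per K/R, -1 per D/E (histidine ignored)."""
    return sum(seq.count(r) for r in "KR") - sum(seq.count(r) for r in "DE")

def hydrophobic_ratio(seq):
    """Percentage of residues that belong to the assumed hydrophobic set."""
    return 100.0 * sum(r in HYDROPHOBIC for r in seq) / len(seq)

def passes_filter(seq, min_h=30.0, max_h=40.0):
    """Keep cationic peptides whose hydrophobic ratio lies in [min_h, max_h]."""
    return net_charge(seq) > 0 and min_h <= hydrophobic_ratio(seq) <= max_h

def generate_unique_peptides(n, length=11, seed=0):
    """Draw residues uniformly at random until n distinct peptides exist."""
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n:
        seen.add("".join(rng.choice(AMINO_ACIDS) for _ in range(length)))
    return sorted(seen)

peptides = generate_unique_peptides(20000)
candidates = [p for p in peptides if passes_filter(p)]
print(len(peptides), "generated;", len(candidates), "pass charge/hydrophobicity")
```

The number of sequences that pass depends on the sampling scheme and on the residue definitions chosen here, so it should not be expected to reproduce the counts reported in this paper.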

Identification of Seed Sequences with AMP Potential

The sequences were then subjected to prediction of their potential anti-microbial activity using the prediction classifiers of the CAMPR3 database at http://www.camp.bicnirrh.res.in/ (Waghu et al. 2016). Prediction algorithms based on Support Vector Machines (SVM), Random Forests (RF), Artificial Neural Network (ANN) and Discriminant Analysis (DA) are incorporated in the database for AMPs. The RF, SVM and ANN classifiers give a probability score ranging from 0 to 1 for AMP prediction, whereas the DA classifier outputs the prediction as AMP or Non-AMP (NAMP). The random sequences generated with desirable properties were submitted to all four classifiers in the CAMPR3 server for AMP prediction. Those sequences that showed probable AMP potential according to at least two of the four classifiers with a mean score greater than 0.5 were selected as the seed sequences for further modification. The classifiers and score limits were finalized based on the results obtained by submitting original AMP sequences with the same properties retrieved from the APD3 database.
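The selection rule can be written down as a small consensus check. The sketch below shows one possible reading of the criterion: a classifier "votes" AMP when its probability exceeds 0.5 (or, for DA, when its label is AMP), and the mean is taken over the three probability scores only. The function name, the input layout and the example scores are hypothetical, and the CAMPR3 server itself is not called.

```python
from statistics import mean

def is_seed(prob_scores, da_label, threshold=0.5, min_votes=2):
    """Consensus rule for seed selection (one reading of the text).

    prob_scores: dict of probability outputs, e.g. {"SVM": .., "RF": .., "ANN": ..}
    da_label:    "AMP" or "NAMP" as reported by the DA classifier
    """
    scores = list(prob_scores.values())
    votes = sum(s > threshold for s in scores) + (da_label == "AMP")
    return votes >= min_votes and mean(scores) > threshold

# Hypothetical classifier outputs, for illustration only.
print(is_seed({"SVM": 0.8, "RF": 0.7, "ANN": 0.3}, "NAMP"))  # two votes, mean 0.6
print(is_seed({"SVM": 0.9, "RF": 0.2, "ANN": 0.1}, "AMP"))   # two votes, mean 0.4
```

Under this reading, the second example is rejected even though two classifiers agree, because the mean probability falls below the threshold.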

Rational Modification of Seed Sequences for Derivative Sequences

The identified seed sequences were then subjected to position-wise substitution for rational modification using the design tool from the CAMPR3 database (Waghu et al. 2016). The tool uses SVM, RF and DA algorithms for the rational design of derivative AMPs by predicting and scoring possible substitutions at each position. Single or multiple position-wise substitutions of specific residues suggested by at least two of the three classifiers were used to design derivative sequences from the seed sequences.
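As an illustration of the two-of-three agreement rule, the following sketch counts how many classifiers propose each (position, residue) substitution and applies the agreed ones to a seed. The seed string, the suggested substitutions and the function names are all hypothetical; the actual suggestions come from the CAMPR3 design tool, which is not reproduced here.

```python
from collections import Counter

def consensus_substitutions(suggestions, min_votes=2):
    """Return (position, residue) pairs proposed by at least min_votes classifiers.

    suggestions: dict mapping classifier name -> set of (0-based position, residue)
    """
    votes = Counter(sub for subs in suggestions.values() for sub in set(subs))
    return sorted(sub for sub, n in votes.items() if n >= min_votes)

def apply_substitutions(seed, subs):
    """Build a derivative sequence by replacing residues at the given positions."""
    residues = list(seed)
    for pos, aa in subs:
        residues[pos] = aa
    return "".join(residues)

# Hypothetical seed and classifier suggestions, for illustration only.
seed = "GLKSAVKKLGS"
suggestions = {
    "SVM": {(0, "K"), (3, "L")},
    "RF": {(0, "K")},
    "DA": {(3, "L"), (5, "W")},
}
agreed = consensus_substitutions(suggestions)
print(agreed, apply_substitutions(seed, agreed))
```

If two different residues at the same position both reach the vote threshold, this sketch applies them in sorted order so the later one wins; a real workflow would emit a separate derivative for each.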

Prediction and Design of Final Sequences for Validation

Besides the general principles of charge and hydrophobic ratio that were utilized in the CAMPR3 servers, charge density and hydrophobic moment (HM) have also been identified as important characteristics that determine the activity of linear cationic peptides through their interaction with anionic membranes (Vishnepolsky and Pirtskhalava 2014). Therefore, in order to identify the best sequences based on charge density and HM, the prediction tool of the Database of Antimicrobial Activity and Structure of Peptides (DBAASP) available at https://dbaasp.org/ was utilized (Pirtskhalava et al. 2016). The predicted sequences were also cross-checked for their 3-dimensional HM vector in a lipid bilayer environment using the 3D-HM calculator available at http://www.ibg.kit.edu/HM/ (Reißer et al. 2014).

Prediction of Penetration Power, Cytotoxicity and Haemolytic Activity

The action of any anti-microbial agent depends on its ability to get in proximity to its target protein. Hence it is important for therapeutic peptides to be able to pass through the bacterial cell membrane. It is equally important to ascertain that these AMPs are non-toxic to human/mammalian cells. To this end, many computational tools have been developed to predict the cell-penetrating power and toxicity of peptide sequences (Kardani and Bolhassani 2020; Gupta et al. 2015).

Five different tools based on various algorithms were used to analyze the cell-penetrative power of the lead sequences namely, CPPPred (Holton et al. 2013), CPPPred-RF (Wei et al. 2017a), SkipCPP-Pred (Wei et al. 2017b), CellPPD (Gautam et al. 2013, 2015) and MLCPP (Manavalan et al. 2018). Further to penetration power, peptide action on human erythrocytes was tested with the tools HemoPI (Chaudhary et al. 2016) and HAPPENN (Timmons and Hewage 2020). The prediction interface of DBAASP against species set to “human erythrocytes” was also used for validation. Additionally, ToxinPred server (Gupta et al. 2013) was used to analyze the cytotoxicity nature of the given sequences.

Peptide Synthesis

All the peptides were chemically synthesized by the peptide manufacturer SBioChem Pvt. Ltd. (Thrissur, Kerala). The quality of the peptides was validated using High Performance Liquid Chromatography (HPLC) and Mass Spectrometry. The peptides for in-vitro testing were synthesized as white powder to > 95% purity. TRIS buffer was used to dilute the peptides for in-vitro activity assessment.

Bacterial Strains

Two Gram-positive and two Gram-negative bacterial strains, namely S. aureus MTCC96, Bacillus cereus MTCC8733, Klebsiella aerogenes MTCC8100 and Klebsiella pneumoniae MTCC618, were procured from MTCC (Microbial Type Culture Collection and Gene Bank) to test for broad-spectrum antibacterial activity.

Minimum Inhibitory Concentrations of AMPs

The minimum inhibitory concentrations (MICs) of the peptides against the isolates were determined by the broth microdilution protocol as indicated by the Clinical and Laboratory Standards Institute (CLSI) guidelines. Briefly, bacterial strains were grown for 18–24 h at 37 °C. A direct suspension of the colonies was made in Mueller-Hinton broth and adjusted to an OD625 of 0.08–0.1, which corresponds to ~1–2 × 10^8 CFU/ml, followed by serial ten-fold dilutions to give 1 × 10^6 CFU/ml. 50 μl of each bacterial suspension was added to 96-well round-bottom microtiter plates containing an equal volume of peptide at different concentrations (62.5, 125, 250, 500, 1000, 2000 μg/ml), and the plates were incubated for 20–24 h at 37 °C.
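The dilution arithmetic in this protocol can be checked with a short sketch. It assumes a starting inoculum at the lower bound of 1 × 10^8 CFU/ml and treats the listed peptide concentrations as a two-fold series starting from 2000 μg/ml; the function and variable names are illustrative. The text does not state whether the listed peptide concentrations refer to the stock added to the well or to the final in-well value, so only the inoculum is halved here.

```python
def serial_dilution(start, fold, steps):
    """Values after 0..steps successive dilutions by the given fold factor."""
    return [start / fold ** k for k in range(steps + 1)]

# Two ten-fold steps take 1e8 CFU/ml down to 1e6 CFU/ml.
inoculum = serial_dilution(1e8, 10, 2)

# Five two-fold steps from 2000 ug/ml reproduce the tested peptide series.
peptide_series = sorted(serial_dilution(2000.0, 2, 5))

# Mixing 50 ul of inoculum with 50 ul of peptide halves the cell density.
in_well_cfu = inoculum[-1] / 2

print(inoculum)
print(peptide_series)
print(in_well_cfu)
```

Under these assumptions, the density in each well after mixing equal volumes is 5 × 10^5 CFU/ml.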

Results and Discussion

Figure 1 presents a bird's-eye view of the principles and results of the in-silico design process utilized in this work.

Fig. 1 Step-by-step selection processes and results of streamlined refining used for the identification of desirable AMP sequences

Design of AMPs

Database Filtering for Computation of Desirable Properties

The APD3 database was searched for broad-spectrum AMPs with fewer than 16 amino acids, a charge greater than zero, a hydrophobicity ratio of 30–40% and no toxicity to mammalian cells. A total of 1671 sequences with broad-spectrum activity were retrieved at the time of our design, among which there were 32 non-toxic sequences within the pre-decided thresholds of charge, hydrophobicity ratio and length (Table 1). The sequences of these natural AMPs were retrieved to calculate the average statistic length and amino-acid frequency. Based on the analysis of the retrieved sequences with desirable properties, the statistic length for designing a novel AMP was set to 11, with the residues K, L, G and S being the most frequently occurring amino acids in their respective amino-acid groups (Table 2).

Table 1 32 non-toxic, broad spectrum short AMPs retrieved from APD3 database
Table 2 Amino-acid frequencies identified

Generation of Random Peptide Sequences

20,000 random peptide sequences of 11 amino-acid residues were generated among which 5427 sequences with the desired amino-acid frequencies, charge and hydrophobicity ratios were selected for further analysis.

Identification of Potential Seed Sequences

Among the 5427 random sequences, 9 sequences were selected as seed sequences with AMP activity by the CAMPR3 prediction server. As explained in Sect. “Materials and Methods”, results from four different learning algorithms were used to score probable AMP activity. Table 3 presents the selected seed sequences that showed probable AMP potential according to at least two of the four classifiers with a mean score greater than 0.5.

Table 3 Seed sequences and their scores selected for rational modification

Rational Modification of Seed Sequences

The selected seed sequences were then subjected to position-wise substitution for rational modification using the CAMPR3 design tool to generate probable AMP sequences. A total of 421 sequences were derived from the 9 seed sequences, among which 13 sequences with a charge greater than zero and hydrophobicity between 30 and 40% were selected for further filtering. Table 4 presents the results of AMP activity prediction for these 13 sequences from the iAMPPred server (http://cabgrid.res.in:8080/amppred/) and DBAASP (https://dbaasp.org/home), along with the calculated values of various molecular characteristics determining the inhibition potential. The 2-dimensional hydrophobic moment (2D HM) represents a qualitative description of the amphiphilicity of the peptide helices, whereas the 3-dimensional HM (3D HM) extends this concept as a vector in three dimensions by taking into account all structural characteristics (Reißer et al. 2014). Similarly, GRAVY is the grand average of hydropathy of the peptide, wherein positive GRAVY scores indicate high hydrophobicity and low water solubility (Kyte and Doolittle 1982). The Boman index computes the binding potential of a peptide, with a negative index representing higher hydrophobicity (Boman 2003).
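Two of the descriptors above are simple enough to sketch directly. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and a helical (2D) hydrophobic moment can be computed as the length of the vector sum of per-residue hydrophobicities placed 100° apart around the helix axis. The sketch below is illustrative only: it reuses the Kyte-Doolittle scale for the moment as a stand-in, whereas published HM values typically rely on a different hydrophobicity scale, so its numbers should not be expected to match the tabulated ones. Function names are assumptions.

```python
import math

# Kyte-Doolittle hydropathy scale (Kyte and Doolittle 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KD[r] for r in seq) / len(seq)

def helical_moment(seq, angle_deg=100.0, scale=KD):
    """Mean 2D hydrophobic moment for an ideal helix (100 degrees per residue).

    The scale defaults to Kyte-Doolittle purely as an assumption.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[r] * math.sin(i * delta) for i, r in enumerate(seq))
    cos_sum = sum(scale[r] * math.cos(i * delta) for i, r in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)

for peptide in ("GKIMYILTKKS", "FGIKLRSVWKR"):
    print(peptide, round(gravy(peptide), 3), round(helical_moment(peptide), 3))
```

A uniformly hydrophobic helix has a moment near zero because its contributions cancel around the axis, which is why the moment measures amphiphilicity rather than overall hydrophobicity.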

Table 4 Sequences derived from rational modification

Final Selection of Desirable AMP Sequences

Even though the iAMPPred server predicted desirable AMP scores greater than 0.75 for all the selected sequences, the DBAASP tool, which also considers HM, identified only five sequences with high 2D HM values as AMPs (Table 4). Moreover, computation of 3D HM values in a lipid bilayer environment indicated that one of these sequences, IKLNVKGMKQW, had an excessively large 3D HM value, which is undesirable for proper functioning as a broad-spectrum anti-microbial agent (Dathe et al. 1997; Zhang et al. 2016; Pathak et al. 1995). Hence this sequence was removed from further analysis.

Thus, the four best sequences were selected for further validation studies. The identified sequences were GKIMYILTKKS, FGIKLRSVWKK, FGIKLRSVWKR and FGIKLRKVWKD, which are hereafter designated as PEP01–PEP04 respectively. The important molecular characteristics that determine the inhibition potential of the designed sequences are presented in Table 5.

Table 5 Identified AMP sequences for validation and their molecular characteristics

Prediction of Penetration Power, Cytotoxicity and Haemolytic activity

Sequence-based prediction of cell penetration indicated that the designed sequences possess comparable cell-penetration power (Table 6). PEP03 was predicted as the highest-scoring cell-penetrating peptide in a comprehensive analysis of results from five different tools (CPPPred, CPPPred-RF, SkipCPP-Pred, CellPPD and MLCPP). In addition to penetration power, toxicity to human erythrocytes and to mammalian/animal cells in general was tested with four different tools. HemoPI and HAPPENN predicted all four sequences as non-hemolytic, whereas DBAASP predicted PEP01 as hemolytic and the other three as non-hemolytic. Additionally, the ToxinPred server predicted the given sequences to be non-cytotoxic, further indicating their safety.

Table 6 Prediction of cell-penetrative power, hemolytic activity and cytotoxicity

In-vitro Validation and Minimum Inhibitory Concentrations of Antimicrobial Peptides

Chemically synthesized peptides were validated using High Performance Liquid Chromatography (HPLC) and Mass Spectrometry (Supplementary Data S1). The peptides, synthesized as white powder to > 95% purity, were used for in-vitro activity assessment. Their antimicrobial activities against Gram-positive and Gram-negative bacteria were evaluated by the broth microdilution assay. The results are tabulated in Table 7. All four selected peptides showed inhibition of all the tested Gram-positive and Gram-negative pathogens. However, peptides with an MIC less than 250 μg/ml can be considered particularly promising for further development.

The concentration that showed a 50% reduction in OD value compared to the control was considered as the MIC (CLSI 2015).
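The 50% criterion can be expressed as a small lookup over the plate readings. The sketch below returns the lowest tested concentration whose OD is at most half of the growth-control OD; the readings are invented for illustration, blank subtraction is not modelled, and the function name is an assumption rather than part of any standard tool.

```python
def mic_from_od(od_by_conc, control_od, reduction=0.5):
    """Lowest tested concentration with at least the given OD reduction.

    od_by_conc: dict mapping concentration (ug/ml) -> OD of the treated well
    control_od: OD of the untreated growth control
    Returns None when no tested concentration reaches the cutoff.
    """
    cutoff = control_od * (1.0 - reduction)
    inhibitory = [conc for conc, od in od_by_conc.items() if od <= cutoff]
    return min(inhibitory) if inhibitory else None

# Hypothetical OD readings, for illustration only.
readings = {62.5: 0.80, 125: 0.35, 250: 0.20, 500: 0.10, 1000: 0.06, 2000: 0.05}
print(mic_from_od(readings, control_od=0.90))
```

When the result is reported as a range, as in the MIC table of this work, the returned value would be the upper bound and the next lower tested concentration the lower bound.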

PEP01 and PEP03 were highly effective against all four pathogens tested, with an MIC of 62.5–125 µg/ml. PEP02 inhibited K. pneumoniae, K. aerogenes and B. cereus at an MIC of 62.5–125 µg/ml, while its MIC against S. aureus was 250–500 µg/ml. PEP04 had a low MIC (62.5–125 µg/ml) against both Gram-negative organisms and S. aureus, while its MIC against B. cereus was between 500 and 1000 µg/ml. The kanamycin control showed an MIC of 3.90 to 7.81 µg/ml, the lowest concentration range tested, for all four isolates (Table 7).

Table 7 MIC distributions of the designed peptides with selected bacterial pathogens

In summary, the inhibitory concentrations indicate a higher potential for PEP01 and PEP03 for development as broad-spectrum AMPs, closely followed by PEP02. These results, in conjunction with the cell-penetration and toxicity predictions, project PEP03 as the most desirable lead AMP sequence. The comparatively low cell-penetration power of PEP01 and its positive hemolytic activity prediction by the DBAASP tool raise concerns about its capacity as an efficient AMP sequence.

Compared to the other three peptides, the higher MIC of PEP04 against the Gram-positive B. cereus indicates a lower suitability for development as a broad-spectrum agent. Nevertheless, it can be projected as a specific AMP candidate against Gram-negative bacteria, which is also supported by its desirable cell-penetration and non-toxicity predictions.

Although similar studies have been carried out previously, they were aimed at inhibiting specific pathogens (Deslouches et al. 2013; Porto et al. 2018). The uniqueness and advantage of this study lie in the fact that the designed sequences are aimed at all major bacterial pathogens regardless of their Gram nature. The work leaves room for further exploration in terms of in-silico screening for identification of bacterial targets and in-vitro testing of hemolytic activity and cytotoxicity. Further improvement in activity through analysis of side chain length, terminal modification or oligomerization of peptides can also be investigated starting from these lead sequences.

Conclusion

With the wide emergence of antibiotic resistance among common pathogens such as S. aureus and K. pneumoniae, research trends point towards the development of alternative treatment strategies such as natural or synthesized AMP sequences. The current work presents and validates a database-derived computational protocol based on consecutive refinement for the design of novel AMPs that inhibit major drug-resistant pathogens regardless of their Gram nature. Based on the characteristics of existing natural sequences with known anti-microbial activity, four lead sequences were designed and tested for the development of broad-spectrum AMPs, among which PEP03 (FGIKLRSVWKR) was found to be particularly promising. To the best of our knowledge, parallel studies aimed specifically at developing broad-spectrum antimicrobials are not common. In this context, even though the study is only preliminary in nature, it forms a strong basis for the design and discovery of other novel antibacterial/antimicrobial agents.