Introduction

The regulation of cell signalling, which is essential for a wide range of cellular functions, is mediated by the reversible phosphorylation of proteins, controlled by kinases and phosphatases. Dysregulated cell signalling is the root cause of many significant diseases (Day et al. 2016). Phosphorylation brings about a structural change that affects the enzymatic and scaffolding activity of a protein (Doughty-Shenton et al. 2010). This change influences signal propagation, ultimately leading to the augmentation or suppression of cellular events. Protein tyrosine phosphatases form the largest class of phosphatases. The member localized in the mitochondria, the ‘powerhouse of the cell’ in eukaryotes, is protein tyrosine phosphatase mitochondrial 1 (PTPMT1), a dual-specificity phosphatase that can catalyse the dephosphorylation of phosphothreonine and phosphoserine in addition to phosphotyrosine residues (Guan et al. 1991; Patterson et al. 2009). The gene responsible for its expression is located on chromosome 11 in humans. Mitochondria are a reservoir for several enzymes, such as phosphatases and kinases, that are essential for adenosine triphosphate (ATP) synthesis. PTPMT1 is anchored in the inner mitochondrial membrane so that the phosphatase domain is proximal to the mitochondrial matrix (Pagliarini et al. 2005), the compartment predominantly involved in energy production and metabolism. The protein has a high affinity for lipid substrates and influences the mitochondrial lipid compartment by taking part in cardiolipin synthesis and lipid metabolism inside the matrix (Niemi et al. 2013; Zhang et al. 2011a, b). It is the first protein tyrosine phosphatase to be discovered inside the mitochondria.

Recent studies show that a disturbance in cardiolipin biosynthesis causes apoptosis (‘programmed cell death’). PTPMT1 is implicated in maintaining homeostasis in various cancers and in lifestyle diseases such as type II diabetes (Guan and Xiong 2011). The downregulation of this tyrosine phosphatase has been reported to result in the apoptosis of cancer cell lines under in-vitro conditions (Niemi et al. 2013). Cardiolipin, which is present in the inner mitochondrial membrane, has been shown to bind cytochrome c, and cardiolipin oxidation causes the release of cytochrome c from mitochondria (Kagan et al. 2005). Targeting PTPMT1 would therefore induce apoptosis in cancer cells by sensitizing them to chemotherapeutics; however, the mechanism is unknown. Removal of the PTPMT1 gene from pancreatic beta islet cells has been shown to increase ATP and insulin levels under in-vitro conditions (Pagliarini et al. 2005). PTPMT1 closely shares its catalytic site with its homolog, phosphatase and tensin homolog (PTEN) (Steck et al. 1997). Inhibiting the activity of this dual-specificity phosphatase could promote the apoptosis of cancer cells and increase cellular ATP and insulin levels in type II diabetes. This could pave the way to developing new lead-like compounds that combat both cancer and diabetes by interfering with the enzyme’s biology and treating the dysregulation (Lai et al. 2009). The three-dimensional structure of rat PTPMT1 is not yet available in the databases; however, the protein is closely similar to PTPMT1 of Mus musculus. The PTPMT1 proteins of Mus musculus and Homo sapiens are the least explored. Recent studies in cancer and diabetes have focussed on the dysregulation of PTPMT1 of Rattus norvegicus, whose structure is the main focus of this research.

Flavonoids are polyphenolic secondary metabolites that contain a benzo-γ-pyrone structure and occur naturally in every part of a plant. Flavonoids help plants survive by providing a defence mechanism against browsing animals and insects. They are classified into six subclasses (flavonol, flavone, isoflavone, flavanone, anthocyanin, and flavan-3-ol) and are predominantly found in berries, grapes, turmeric, cocoa, apple, tea, onion, broccoli, etc. (Kumar and Pandey 2013). In addition, flavonoids are natural antibacterial, antifungal, antiviral, anticancer, antiparasitic, antioxidative, antidiabetic, and anti-inflammatory agents (Horáková 2011; Kamaraj et al. 2009; Li and Xu 2008; Wu et al. 2008; Yadav et al. 2020). Advances in scientific research have contributed new technologies to isolate, characterize, and purify flavonoids and to analyse their bioactive properties. The use of natural medicines dates back at least 60,000 years, and they are still used as remedies today (Shi et al. 2010), notably in countries such as China and India. Over several thousand years, natural products have evolved in their properties, biological activities, and distinct mechanisms of action, which has ultimately led to several drug-like molecules (Yuan et al. 2016). However, only a small fraction of known plant species has been studied widely for bioactivity against known targets (Ngo et al. 2013). The development of modern biotechnology, natural products chemistry, and bioinformatics has enabled computational simulations, the annotation of mechanisms and pathways, and the high-throughput screening of new drugs from natural sources.

Homology modelling, also known as comparative modelling, is a valuable tool for predicting the three-dimensional structure of a protein when only its sequence is available (Pitman et al. 2006). It allows biotechnologists to predict and study the functions of unexplored proteins whose three-dimensional structure has not been determined (Abagyan et al. 1994). A typical homology modelling workflow involves the following steps: (i) using the BLAST search engine to identify a template sequence for the target sequence of interest, (ii) sequence alignment, (iii) backbone generation, (iv) loop modelling, (v) sidechain modelling, (vi) model optimization, and (vii) model validation (Vyas et al. 2012). Software such as Modeller and PRIMO, and free automated online modelling servers such as SWISS-MODEL, are designed for homology modelling with high efficiency and reliability (Schwede et al. 2003). Modelling the three-dimensional structure helps researchers elucidate the essential amino acids in the binding site and the mechanism by which they interact with an inhibitory ligand.

Molecular docking is a rapid, cost-effective, computer-based simulation technique used to analyse the interaction between a protein and a ligand using specialized tools and algorithms. Docking provides insights into this interaction by elucidating the binding energy, hydrogen bonds, Van der Waals interactions, and 2-dimensional and 3-dimensional interaction patterns, and it ranks the protein–ligand poses in order of binding energy. Molecular docking helps generate hypotheses about how a ligand or lead compound inhibits its target (Hendlich 1998; Pozzan 2006). AutoDock, developed by the Scripps Research Institute, USA, is based on the Lamarckian genetic algorithm and is considered an efficient tool for molecular docking (Li et al. 2014). AutoDock consists of the AutoGrid and AutoDock programs and is the most widely used of 30 well-known computational docking tools. The grid is pre-calculated and set to fit the site of the protein where the ligand is intended to bind (Park et al. 2006), and molecular docking is carried out after fixing the grid.

Molecular dynamics (MD) simulations use force fields for the in-depth atomic study of macromolecules such as proteins, nucleic acids, docked complexes, ribosomes, etc. (Brandman et al. 2012; Roccatano et al. 2007). In simple terms, MD follows the potential energy of every atom in a molecule as the atoms move with kinetic energy, which mimics the actual process happening in a biological system (Hospital et al. 2015). Modern computers can simulate systems of up to 500,000 atoms over nanosecond timescales. Sampling such short timescales is essential for studying stability and conformational changes, since the properties and functions of biomolecules play out on the order of nanoseconds to microseconds (Hollingsworth and Dror 2018).

The Prime molecular mechanics/generalized Born surface area (MM-GBSA) continuum solvation method is a versatile tool to quantify the binding free energies of ligands and small molecules bound to a biomacromolecule (Genheden and Ryde 2015). This technique is widely used to strengthen, validate, and support the methods and results of molecular docking of ligand–macromolecule complexes. The more negative the value, the lower the energy and thus the greater the binding affinity.

Thus, PTPMT1 plays a significant role in metabolism and cell signalling and has emerged as a novel drug target (He et al. 2014; Zhao et al. 2018) for cancer and type II diabetes. Hence, this research models rat PTPMT1 and performs in-silico molecular docking of common flavonoids against the modelled protein to identify compounds that may inhibit its activity. Moreover, molecular dynamics simulations are performed to validate the conformations and stability of the template, the modelled apo-protein, and the docked complexes. The pharmacokinetic and Lipinski properties of the flavonoids are also reported. Rats are model organisms for testing the efficacy and toxicity of a drug, which is why the rat protein was chosen for modelling and docking. This research could serve as a road map for developing known natural lead compounds against PTPMT1-mediated cancer and diabetes. Our future work aims to carry the best results obtained here into in-vitro and then in-vivo studies. A detailed workflow of the research is depicted in Fig. 1.

Fig. 1 Workflow of the research work

Materials and methods

The present study used the following software: (i) ArgusLab 4.0.1; (ii) AutoDock 4.2.6; (iii) MGL Tools 1.5.4; (iv) Python 3.8.2; (v) Discovery Studio visualizer 3.5; (vi) PyMOL 2.3; (vii) UCSF Chimera 1.14; (viii) LigPlot + v.2.2; (ix) Java Platform SE binary version 8; (x) Swiss-PDBViewer; (xi) Desmond, Schrodinger.

Several webservers were used in the study viz., (i) ChemDraw JS sample page, an online tool was used to draw the two-dimensional structure of compounds; (ii) SWISS-MODEL for homology modelling of proteins with unknown three-dimensional structure (Waterhouse et al. 2018); (iii) Geno3D for homology modelling of proteins with unknown three-dimensional structure (Combet et al. 2002); (iv) ProtParam to study the properties of modelled protein (Gasteiger et al. 2005); (v) PROCHECK and ERRAT to generate Ramachandran plot and frequencies of non-covalent structural bonding elucidation between diverse atoms in the protein (Colovos and Yeates 1993; Laskowski et al. 1996); (vi) Clustal Omega for multiple sequence alignment (Sievers et al. 2011); (vii) Modloop server for automated loop modelling (Fiser and Sali 2003); (viii) MetaPocket 2.0 was used to predict the ligand-binding site (Zhang et al. 2011a; b); (ix) Prediction of activity spectra for substances (PASS webserver) to predict the pharmacological effects of compounds (Filimonov et al. 2018); (x) admetSAR version 1 and 2 to predict the pharmacokinetic properties of natural compounds (Cheng et al. 2019); (xi) SwissADME to predict the pharmacokinetic properties of ligands (Daina et al. 2017); (xii) PubChem database to retrieve flavonoid ligands and predict their Lipinski’s properties.

Computational specifications to perform molecular docking using AutoDock 4.2.6

The following parameters were required for the smooth and error-free functioning of the software. They include: (i) Physical memory: 4.00 GB RAM; (ii) Operating System: Windows 10 Home Single Language, 64-bit Operating System; (iii) Processor: Intel(R) Core(TM) i3-7100U CPU @ 2.40 GHz, 2401 MHz, 2 Core(s), 4 Logical Processor(s).

Computational specifications to perform molecular dynamics simulations using Desmond, Schrodinger

The molecular dynamics simulations were performed on the Ubuntu 18 operating system using an HP z238 microtower workstation with 8.00 GB RAM and an Intel(R) Core i7 processor. These were the minimum system requirements to perform the simulations as directed.

Ligand preparation

The ligands were prepared using the ArgusLab 4.0.1 software. The flavonoids listed by Panche et al. (2016) were downloaded from the PubChem database. The ligands were optimized using the Quantum Mechanics – Molecular Mechanics AM1 semi-empirical method with the Hartree–Fock self-consistent field algorithm for 200 iterations (Hafeez et al. 2013; Stewart 1989).

Homology modelling and protein preparation

The 3-dimensional crystal structure of rat PTPMT1 is not available in the Protein Data Bank. Hence, we modelled the protein by homology (comparative) modelling using two servers, (i) SWISS-MODEL and (ii) Geno3D, after a BLASTp search against the Protein Data Bank (Altschul et al. 1990). The amino acid sequence of the protein from Rattus norvegicus (UniProtKB: P0C089) was obtained in FASTA format from the UniProtKB database, and structural homologs were searched using BLASTp against the Protein Data Bank to find the proteins most closely related to the query sequence in terms of coverage and similarity (Pagliarini et al. 2005). Multiple sequence alignment of the protein sequences was done using the Clustal Omega webserver to analyse the conserved regions.

SWISS-MODEL is a reliable webserver for homology modelling that predicts the 3-dimensional structures of proteins (Hari and Akilashree 2019) so that their structure, properties, mutagenesis, and functions can be studied. Furthermore, the modelled protein paves the way for the development of lead-like drugs against the protein. The FASTA sequence of the target was submitted to each webserver; templates were searched and aligned, and the template with the highest similarity to the target sequence and the best QMEAN value was chosen for model building.

The other web tool used in the study was Geno3D, which is similar in operation to SWISS-MODEL and was used to compare the quality of the built models. Ten models were built using Geno3D, and the lowest-energy model conformation was chosen.

The best model generated by either server was further subjected to energy minimization using the GROMOS_96 force field of Swiss-PDBViewer and to loop modelling using the Modloop server. The model was validated using the PROCHECK webserver to identify the amino acid residues in the allowed and disallowed regions, and model quality was assessed using the ERRAT server.

Predicting the ligand-binding site using MetaPocket 2.0

Predicting the active/binding site is the starting step of the drug discovery process. MetaPocket 2.0, a high-efficiency web server that analyses the sizes of the binding pockets on the protein's surface, was used to predict the ligand-binding site of the modelled protein (Huang 2009; Jendele et al. 2019), and the largest pocket was chosen as the binding site. Identifying the ligand-binding site is essential for directing ligands and small molecules to bind to the amino acids at that site.

Molecular Docking using AutoDock 4.2.6

AutoDock 4.2.6 is a preferred, high-efficiency software package widely used for protein–ligand docking (Morris et al. 2009). Molecular docking was carried out by individual docking, that is, ‘docking one ligand to one protein at a time’, for 10 Genetic Algorithm runs, with the initial protein and ligand preparation done according to Shivanika et al. (2020). The grid spacing was set to its default value (0.375 Å). The grid parameters were set as follows: (i) Number of points in dimensions: 50 × 66 × 48, (ii) Grid box centre: x center = 29.506 Å; y center = − 8.744 Å; z center = 18.708 Å, and (iii) Corresponding offset values: x = 0.556; y = 0.806; z = 0.444. Molecular docking was executed after setting the grid parameters, and the results were tabulated.
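To make the grid parameters above easier to picture, the following minimal sketch converts the reported point counts, spacing, and centre into box edge lengths and corner coordinates. It assumes that each edge length equals the number of points multiplied by the spacing, which is our reading of the AutoGrid convention; the helper `grid_box` is purely illustrative and is not part of AutoDock or MGL Tools.

```python
# Illustrative helper (hypothetical, not an AutoDock API): derive the
# physical size of a docking grid box from the parameters in the text.

def grid_box(npts, center, spacing=0.375):
    """Return (edge_lengths, min_corner, max_corner) in angstroms.

    Assumption: edge length = number of points x spacing.
    """
    lengths = tuple(n * spacing for n in npts)
    min_corner = tuple(c - l / 2 for c, l in zip(center, lengths))
    max_corner = tuple(c + l / 2 for c, l in zip(center, lengths))
    return lengths, min_corner, max_corner


# Values reported in the Methods section.
lengths, lo, hi = grid_box((50, 66, 48), (29.506, -8.744, 18.708))
print("edge lengths (A):", lengths)
print("min corner (A):", lo)
print("max corner (A):", hi)
```

Under this assumption the box measures 18.75 × 24.75 × 18.0 Å, which can be compared against the extent of the predicted binding pocket.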

Interaction analyses

PyMOL 2.3; UCSF Chimera 1.14; Discovery Studio visualizer 3.5; LigPlot + v.2.2 (Wallace et al. 1995) were used to elucidate the bonded and non-bonded supramolecular interactions between protein and ligands.

Pharmacokinetics and Lipinski’s rule of 5

Lipinski's rule of five is an essential criterion for assessing a compound’s oral drug-likeness. As applied here, the rule comprises five criteria: (i) Molecular weight < 500 Daltons, (ii) Number of hydrogen acceptors < 10, (iii) Number of hydrogen donors < 5, (iv) Partition coefficient Log P < 5, and (v) Molar refractivity < 140. Lead compounds obeying these criteria could, after clinical trials, be considered as oral drugs (Lipinski 2000, 2004). The Lipinski properties were predicted for the top 10 ligands that showed low binding energies. The pharmacokinetic properties denote the absorption, distribution, metabolism, excretion, and toxicity of the ligands, which are essential criteria for taking a lead compound into further studies. The pharmacokinetic properties were predicted using the admetSAR (versions 1 and 2) and SwissADME webservers.
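The five criteria listed above can be expressed as a simple filter. The sketch below is illustrative only: the thresholds are copied from the text, while the `Descriptors` class, the `rule_violations` function, and the example descriptor values are hypothetical and do not describe any ligand from this study.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for the five descriptors named in the text."""
    mol_weight: float          # Daltons
    h_acceptors: int
    h_donors: int
    log_p: float
    molar_refractivity: float


def rule_violations(d):
    """Return the names of the criteria (as stated in the text) that fail."""
    checks = {
        "molecular weight < 500 Da": d.mol_weight < 500,
        "hydrogen acceptors < 10": d.h_acceptors < 10,
        "hydrogen donors < 5": d.h_donors < 5,
        "log P < 5": d.log_p < 5,
        "molar refractivity < 140": d.molar_refractivity < 140,
    }
    return [name for name, passed in checks.items() if not passed]


# Made-up descriptor values, used only to exercise the filter.
example = Descriptors(450.0, 8, 6, 2.1, 110.0)
print(rule_violations(example))
```

In practice the descriptor values would come from a source such as PubChem or SwissADME rather than being typed in by hand.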

Molecular Dynamics simulations

Molecular dynamics (MD) simulations were performed for 150 ns with Schrodinger’s Desmond package, according to Pant et al. (2020). The simulations were performed to study the RMSD, RMSF, and stability of the modelled apo-protein and the protein–ligand complexes. The simulation uses force fields to evaluate the potential energy of, and the forces between, the atoms in a molecule. The following systems were simulated: (i) template apo-protein; (ii) modelled protein; (iii) the three ligand–protein complexes with the lowest binding energies from docking; (iv) the complex of the protein with alexidine dihydrochloride, the known inhibitor. The apo-proteins and docked protein–ligand complexes were first solvated in a single-point charge TIP3P water box of size 10 Å with periodic boundary conditions (Shivanika et al. 2020). The Optimized Potentials for Liquid Simulations_2005 (OPLS_2005) force field (Harder et al. 2016) was used throughout the study. Na+ and Cl− ions at 0.15 M were used to neutralize the simulation system. The energy minimization of the complexes was carried out for 2000 steps. After energy minimization, the complexes were subjected to a production run of 150 ns in the isothermal-isobaric (NPT) ensemble at constant pressure and temperature. The Nose–Hoover thermostat algorithm and the Martyna–Tobias–Klein method were used to gradually heat the entire MD system to 300 Kelvin. The Particle-Mesh Ewald method was used to compute the electrostatic interactions between the atoms in the complexes, with a spacing of 0.8 Å.

MM-GBSA energy calculation

The MM-GBSA of modeled protein in complex with the top three low binding energy conformations and the control drug alexidine dihydrochloride was calculated using OPLS_2005 forcefield and VSGB 2.0 solvation model. The prime MM-GBSA ΔG binding energy was calculated using the equation described by Massova and Kollman (2000).

$$\Delta G_{\text{bind}} = G_{\text{complex}} - \left(G_{\text{ligand}} + G_{\text{modelled protein}}\right),$$
(1)
$$G = E_{\text{Van der Waals}} + E_{\text{electrostatic}} + G_{\text{generalized Born solvation}} + G_{\text{surface area}}.$$
(2)
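As a worked illustration of Eqs. (1) and (2), the short sketch below sums the four energy terms for each species and takes the difference. All numbers are arbitrary placeholders chosen for easy arithmetic, not values computed in this study, and the function names are hypothetical; the actual calculation was carried out with Prime MM-GBSA.

```python
def free_energy(e_vdw, e_elec, g_gb, g_sa):
    """Eq. (2): sum of Van der Waals, electrostatic, generalized Born
    solvation, and surface area terms (all in kcal/mol)."""
    return e_vdw + e_elec + g_gb + g_sa


def delta_g_bind(g_complex, g_ligand, g_protein):
    """Eq. (1): free energy of the complex minus that of its parts."""
    return g_complex - (g_ligand + g_protein)


# Placeholder energies (kcal/mol), for illustration only.
g_complex = free_energy(-50.0, -20.0, 30.0, -5.0)
g_ligand = free_energy(-5.0, -2.0, 4.0, -1.0)
g_protein = free_energy(-10.0, -8.0, 12.0, -2.0)

dg = delta_g_bind(g_complex, g_ligand, g_protein)
print(f"dG_bind = {dg:.1f} kcal/mol")
```

A more negative result corresponds to more favourable binding, which is how the MM-GBSA values are read alongside the docking energies.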

Results and discussion

Homology modelling of PTPMT1 using SWISS-MODEL

Many similar proteins were retrieved by the BLAST (Camacho et al. 2009) and HHblits (Remmert et al. 2012) searches of the SWISS-MODEL template library, which assess sequence and structural similarity. The three-dimensional structures of two PTPMT1 proteins from Mus musculus, PDB Id: 3RGQ and PDB Id: 3RGO (Xiao et al. 2011), showed the highest similarity (above 90%). PDB Id: 3RGQ, with a resolution of 2.0 Å, showed 94.23% similarity and 0.81 query coverage, and PDB Id: 3RGO, with a resolution of 1.9 Å, showed 94.90% similarity and 0.81 query coverage to the target amino acid sequence. Of the two, PDB Id: 3RGQ had the higher Global Model Quality Estimation value (GMQE = 0.79); the greater the GMQE, the higher the reliability. The GMQE is determined by combining properties of the target-template alignment and the template crystal structure and is an essential tool to estimate the quality of protein models. Moreover, SWISS-MODEL ranked PDB Id: 3RGQ at the top, followed by PDB Id: 3RGO. Hence, PDB Id: 3RGQ was chosen as the final template for model building. The model was built using ProMod3 and PROMOD-II based on the target-template alignment (Guex et al. 2009). The conserved regions between the sequences were carried forward, and the insertions and deletions were modelled using the fragment library and loop modelling, followed by sidechain modelling. The modelled protein was 156 amino acids in length. Finally, the geometry was optimized using a force field to build a stable model. The built model had a QMEAN score of 0.10, which is an acceptable value. A QMEAN Z-score close to zero indicates a good model and denotes a higher similarity between the modelled and experimental proteins (Benkert et al. 2011). Residues with a local similarity value lower than 0.6 are considered to be of low quality. No residue scored lower than 0.6; thus, the modelled protein was of high quality (Fig. 2a). Figure 2b compares the quality scores of the modelled protein with those of experimental structures. The modelled protein is denoted by a red star; the black and grey dots are the experimental proteins with |Z-score| in the range 0–2. The modelled protein was then subjected to energy minimization using Swiss-PDBViewer to correct any distorted amino acids or modified loops, adjust bond constraints, etc. Energy minimization with the GROMOS_96 force field enhances the stability of the modelled protein (Scott et al. 1999; Schmid et al. 2012). The SWISS-MODEL protein was superimposed onto the template structure (PDB Id: 3RGQ) using PyMOL 2.3, and an RMSD value of 0.072 Å (Fig. 3) was obtained (1105 atoms aligned).

Fig. 2 Local quality estimate and comparison of the modelled PTPMT1 using SWISS-MODEL

Fig. 3 PTPMT1 (green colour) superimposed onto the template PDB Id: 3RGQ (red colour)

Homology modelling of PTPMT1 using Geno3D

In the similarity search of the Geno3D template library, the proteins PDB Id: 3RGQ and PDB Id: 3RGO showed the highest similarity, at 94%. Both templates are very closely related, but Geno3D ranked PDB Id: 3RGO first; hence, this template was chosen for model building. A total of ten models were generated, and the model with the lowest energy (− 7132 kcal/mol) was downloaded in PDB format and subjected to energy minimization using Swiss-PDBViewer as described above. Finally, the Geno3D modelled protein was superimposed onto the template structure (PDB Id: 3RGO) using PyMOL 2.3, and an RMSD value of 0.746 (Fig. S1) was obtained, about ten times higher than the value obtained for the SWISS-MODEL protein.
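For readers unfamiliar with the metric, the sketch below shows how an RMSD is computed from paired atomic coordinates and how the two reported values compare. It assumes the two structures have already been superimposed and that atoms are matched one-to-one; PyMOL performs the fitting and atom pairing itself, so this is not a reimplementation of its alignment. The `rmsd` function and the toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) tuples. Assumes the structures are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates: every atom displaced by 1 unit along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, template))

# Ratio of the RMSD values reported for Geno3D and SWISS-MODEL.
ratio = 0.746 / 0.072
print(f"Geno3D / SWISS-MODEL RMSD ratio: {ratio:.1f}")
```

The ratio of roughly ten is what underlies the statement that the Geno3D model deviates about ten times more from its template than the SWISS-MODEL model does.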

Model validation

Validating the built model is an essential step that establishes its quality and reliability. A Ramachandran plot was used to check whether any amino acids fall in the disallowed regions of the phi (φ) and psi (ψ) backbone torsion angles, where steric hindrance arises between sidechain and main-chain atoms of the polypeptide (Ramachandran et al. 1963). The models built using SWISS-MODEL and Geno3D were submitted in PDB format to the PROCHECK webserver. The model developed by SWISS-MODEL showed the best results, with 91.6% (131 amino acids) of residues in the most favoured regions of the plot; 7.7% (11 amino acids) in additional allowed regions; 0% in generously allowed regions; and 0.7% (1 amino acid, PHE-79) in the disallowed region(s) (Fig. 4a). The overall G-factor of the modelled protein was − 0.03, which shows that the model is acceptable; a G-factor lower than − 0.5 is considered unusual. The model developed by Geno3D had 81.2% (117 amino acids) of residues in the most favoured regions; 16.7% (24 amino acids) in additional allowed regions; 1.4% (2 amino acids) in generously allowed regions; and 0.7% (1 amino acid, MET-66) in the disallowed region(s), with a G-factor of 0.31 (Fig. 4b). Thus, the model developed by SWISS-MODEL was preferred because a higher proportion of its amino acids lay in the most favoured and allowed regions of the plot. The SWISS-MODEL protein model was submitted to the ERRAT server, and the model quality factor was 94.595 (Fig. S2). The amino acid PHE-79, found in the disallowed region of the plot with a distorted conformation, was submitted to the Modloop server; the loops were remodelled, and the result was saved in PDB format for further use. After correcting PHE-79, the protein structure had 91.6% of amino acids in the most favoured region, 8.4% in the additional allowed region, and 0% in both the generously allowed and disallowed regions (Fig. S3). Thus, the protein modelled using SWISS-MODEL was of high quality and was used for the research. The properties of the SWISS-MODEL protein are reported in Table 1.
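The percentages quoted above follow directly from the residue counts. The minimal sketch below recomputes them from the counts given in the text; the function name `region_percentages` is illustrative, and the totals (143 and 144 residues) are simply the sums of the reported counts, which we assume correspond to the residues PROCHECK includes in the plot statistics.

```python
def region_percentages(counts):
    """Convert residue counts per Ramachandran region into percentages
    of the total number of counted residues, rounded to one decimal."""
    total = sum(counts.values())
    return {region: round(100 * n / total, 1) for region, n in counts.items()}


# Residue counts as reported in the text.
swiss_model = {
    "most favoured": 131,
    "additional allowed": 11,
    "generously allowed": 0,
    "disallowed": 1,
}
geno3d = {
    "most favoured": 117,
    "additional allowed": 24,
    "generously allowed": 2,
    "disallowed": 1,
}

swiss_pct = region_percentages(swiss_model)
geno_pct = region_percentages(geno3d)
print("SWISS-MODEL:", swiss_pct)
print("Geno3D:", geno_pct)
```

Comparing the two dictionaries side by side makes the basis for preferring the SWISS-MODEL structure explicit.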

Fig. 4 Ramachandran plot of the modelled proteins: a SWISS-MODEL; b Geno3D

Table 1 Properties of protein predicted using ProtParam

Ligand-binding site prediction

The modelled protein was submitted to MetaPocket 2.0 webserver before docking, and the top-ranked site (based on z-score) out of 3 predicted binding sites was chosen for docking. It was the biggest cleft on the protein with 21 amino acids, accounting for 13.4% of total amino acids in the protein (Table 2; Fig. 5).

Table 2 Predicted binding site of protein using MetaPocket 2.0
Fig. 5 Sequence of PTPMT1 denoting the secondary structures

A multiple sequence alignment was used to analyse the conserved regions of the three closely related PTPMT1 amino acid sequences from rats, mice, and humans. The rat and mouse sequences were very similar, while the human sequence showed mild variations compared with them. The amino acids in the ligand-binding site were the same in the rat, mouse, and human sequences, except for the presence of HIS-170 instead of SER-170 in humans alone (Fig. 6). The sequence alignment thus depicts the similarity between PTPMT1 in different species. The predicted ligand-binding sites of the modelled protein and PDB Id: 3RGQ were overlaid and found to be in agreement (Fig. S4), supporting the validity of the built model.

Fig. 6 Multiple sequence alignment of rat, mouse, and human PTPMT1

Molecular docking using AutoDock 4.2.6

The main aim of the work was to develop natural compounds as inhibitors of the modelled cancer and diabetes target. Flavonoids are organic bioactive secondary metabolites synthesized by plants, fungi, and bacteria, in which they contribute to development, metabolism, maintenance, stress tolerance, defence against predators, etc. (Isah 2019). Molecular docking of 210 common flavonoid ligands showed effective binding with the homology modelled PTPMT1, with binding energies in the range of − 8.66 to 0.61 kcal/mol (Table S1). The details of the top ten flavonoids with low binding energy are listed in Table 3.

Table 3 Details of top 10 flavonoids with low binding energies

In the present work, flavonoids have shown the potential to inhibit the protein’s activity. In molecular docking, inhibition and affinity are related as follows: the lower the binding energy of a docked complex, the greater the ligand's binding affinity for the target and hence the stronger the expected inhibition. The flavonoid prunin had the lowest binding energy, − 8.66 kcal/mol, and thus showed the highest binding affinity for the protein. Prunin was followed by oroxindin, luteolin 7-rutinoside, 3(2H)-isoflavenes, nicotiflorin, and chalcone, with binding energies of − 8.56, − 8.47, − 8.36, − 8.29, and − 8.15 kcal/mol, respectively (Table 4). These were some of the top ligands interacting with the protein, forming bonded and non-bonded non-covalent supramolecular interactions.
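The link between a lower binding energy and a higher affinity can be made concrete with the standard thermodynamic relation ΔG = RT ln Ki. The sketch below applies it to the binding energies listed above, assuming a temperature of 298.15 K. The function name is hypothetical, and the resulting constants are theoretical estimates derived from docking energies, not measured inhibition constants.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def estimated_ki(delta_g, temperature=298.15):
    """Estimate an inhibition constant (mol/L) from a binding free energy
    (kcal/mol) via dG = RT ln(Ki). The temperature is an assumption."""
    return math.exp(delta_g / (R_KCAL * temperature))


# Docking energies (kcal/mol) as listed in the text.
binding_energies = {
    "prunin": -8.66,
    "oroxindin": -8.56,
    "luteolin 7-rutinoside": -8.47,
    "3(2H)-isoflavenes": -8.36,
    "nicotiflorin": -8.29,
    "chalcone": -8.15,
}

for name, dg in binding_energies.items():
    # Convert from mol/L to micromolar for readability.
    print(f"{name}: {estimated_ki(dg) * 1e6:.2f} uM")
```

Because the relation is exponential, a difference of a few tenths of a kcal/mol between ligands already changes the estimated constant noticeably, which is why small gaps in the ranking can still be meaningful.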

Table 4 Binding energies and interactions of top 10 flavonoids

Prunin is a polyphenolic flavanone abundantly present in Prunus davidiana, a Chinese wild peach, and is also found in immature citrus fruits such as lemon and lime, and in tomatoes. The flavonoid interacted with ARG-136, ARG-138, ARG-172, ASP-101, GLN-175, GLU-73, GLU-76, GLY-104, LYS-133, MET-102, SER-137, THR-99, THR-103, and TYR-75 in the binding site, forming six hydrogen bonds with ARG-136, ARG-172 (3 bonds), LYS-133, and SER-137, each within a bond length of 2.5 Å. Hydrogen bonds with a donor–acceptor distance of 2–2.5 Å are the strongest; longer distances denote moderate or weak bonding (Jeffrey 1997). The 2-dimensional and 3-dimensional interaction diagrams are shown in Fig. 7. The greater the number of hydrogen bonds, the higher the binding efficiency and the more potent the inhibitor (Azam and Abbasi 2013). Hydrogen bonds also strongly influence the specificity, metabolism, and absorption of a drug. Several non-covalent interactions, such as Van der Waals, carbon-hydrogen, π-Cation, π-Anion, π-Sigma, π-Alkyl, and π-Sulphur bonds, were also observed during docking. These supramolecular interactions were responsible for inhibition and were observed within 2.0 Å between the ligand and protein. The pharmacophoric features of prunin responsible for the interactions were the aromatic rings and the oxygen atoms (hydrogen acceptors). Prunin is a secondary metabolite widely reported as an anti-microbial (Celiz et al. 2010), anti-cancer, anti-viral (Zhang et al. 2008), anti-inflammatory (Na and Ryu 2018), anti-oxidant (Céliz et al. 2013), anti-hyperlipidemic (Choi et al. 1991), and anti-diabetic agent. It is also an effective inhibitor of protein tyrosine phosphatase 1B, which is closely related to the PTPMT1 used in the present study, and has been reported to stimulate glucose uptake in insulin-resistant HepG2 cell lines (Jung et al. 2017). Jung et al. reported a binding energy of − 9.0 kcal/mol (inhibition constant = 8.66 µM), which is comparable to the − 8.66 kcal/mol obtained for prunin against PTPMT1 in the present study and supports its potential against cancer and diabetes.

Fig. 7 3D and 2D interactions of the prunin-protein complex

Oroxindin, a polyphenolic flavone also known as wogonin glucuronide and used in Chinese medicine, is a bioactive compound extracted from the plants Holmskioldia sanguinea, Oroxylum indicum, and Bacopa monnieri. It has been reported to be effective against diabetes, tumours, cancer, inflammation (Cho et al. 2013), and free radicals (Liu et al. 2020). Figure 8 shows the 3D and 2D interactions of the oroxindin-protein complex. In the current research, oroxindin bound to PTPMT1 by interacting with ARG-136, ARG-138, ARG-172, ASP-101, GLU-73, GLU-76, GLY-104, LYS-133, MET-102, SER-137, THR-99, TYR-75, and VAL-100, and it formed seven strong hydrogen bonds, each shorter than 2.5 Å, with ARG-136, ARG-172 (3 bonds), LYS-133 (2 bonds), and THR-99 in the ligand-binding site. A binding energy of − 8.56 kcal/mol was obtained from docking. Van der Waals, carbon-hydrogen, π-Cation, π-Anion, π-Alkyl, π-Sigma, and amide-π stacked interactions were formed between ligand and protein within 2.0 Å. The pharmacophoric features responsible for these interactions were the aromatic rings and the hydrogen acceptors (oxygen atoms).

Fig. 8. 3D and 2D interactions of the oroxindin-protein complex

Luteolin 7-rutinoside is an abundant flavonoid isolated from Cyclopia subternata, a plant commonly found in South Africa, and from Caucalis platycarpos (Plazonić et al. 2009). Luteolin 7-rutinoside ranked third out of the 210 flavonoids screened against the protein, with a binding energy of − 8.47 kcal/mol. It interacted with ARG-136, ARG-138, ARG-172, ASP-101, GLU-73, GLY-104, LYS-133, MET-102, SER-137, THR-99, TYR-75, and VAL-100 and formed six hydrogen bonds with ARG-138, ARG-172 (3 bonds), LYS-133, and THR-99 in the ligand-binding site (Fig. 9). The other supramolecular bonded and non-bonded contacts were the same as those reported for prunin and oroxindin. The extent of inhibition depends on the binding energy and the hydrogen bonds. Luteolin and its derivatives have shown anti-diabetic potential in KK-Ay mice and inhibited lipid synthesis (Zang et al. 2016). Luteolin and its derivatives, along with other flavonoids like amentoflavone and daidzein, inhibited alpha-glucosidase by 36% at a concentration of 0.5 mg/mL and also inhibited alpha-amylase when compared with acarbose; these are the two predominant enzymes responsible for the breakdown of sugars, and their inhibition was linked to inhibiting the onset of type 1 diabetes (Kim et al. 2000). Goodarzin et al. (2020) isolated derivatives of luteolin from Cuminum cyminum, tested their cytotoxic effects on breast cancer cell lines (MCF-7 and MDA-MB-231), and found that they selectively inhibited the cancer cell lines at an IC50 = 3.98 µg/ml with a selectivity index of 8.0.

Fig. 9. 3D and 2D interactions of the luteolin 7-rutinoside-protein complex

Figure 10 denotes the surface image of the protein along with the top ten flavonoids with low binding energies.

Fig. 10. Surface image of the top 10 flavonoids in the binding site of PTPMT1

3(2H)-isoflavenes, a dichloroflavan, showed binding energy of − 8.36 kcal/mol by interacting with ARG-138, ARG-172, ASP-101, GLU-73, GLU-76, GLY-104, LYS-133, MET-102, SER-137, THR-99, THR-103, and TYR-75 residues within 2.0 Å in the binding site of the protein. No significant intermolecular conventional hydrogen bonds within 2.5 Å were formed during the interaction. However, one bond with TYR-75 was found with a bond length of 3.5 Å (weak hydrogen bond). The aromatic rings were the main pharmacophoric features in the ligand for electronic and supramolecular interactions. This flavonoid has widely reported anti-viral properties against the type 2 poliovirus by inhibiting the RNA and protein synthesis (Salvati et al. 2004), thereby affecting the viral replication.

Nicotiflorin, a kaempferol O-glycoside, is reported to be an active component of Solanum campaniforme (Torres et al. 2011), Clitoria ternatea (Kazuma et al. 2003), Acalypha indica (Nahrstedt et al. 2006), Carthamus tinctorius (Huang et al. 2007), Ampelopsis heterophylla (Chen et al. 2013), Amaranthus tricolor (Kraujalis et al. 2013), etc. This ligand bound the tyrosine phosphatase with a binding energy of − 8.29 kcal/mol by interacting with ARG-136, ARG-138, ARG-172, GLU-76, LYS-133, SER-137, SER-170, and TYR-75 in the binding site. In addition, six hydrogen bonds were formed between the ligand and protein within 2.5 Å. The pharmacophoric features of the ligand were the oxygen atoms acting as electron acceptors. Apart from these top five ligands, other flavonoids, namely chalcone (− 8.15 kcal/mol); lanceolatin A (− 8.13 kcal/mol); flavan (− 8.12 kcal/mol); yinyanghuo A (− 8.06 kcal/mol), and anthocyanin (− 7.99 kcal/mol) were ranked successively.

The present research also focussed on docking the drug alexidine dihydrochloride, the known standard control drug for PTPMT1, using the same procedure described. This drug was earlier used as an inhibitor of the bacterial membrane (Gilbert and Moore 2005) and fungal lipases (Ganendren et al. 2004). Alexidine dihydrochloride is an uncompetitive and selective inhibitor of this protein isolated from rat pancreatic islet and beta cells with an IC50 = 1.08 µM under in-vitro conditions, thus phosphorylating it (Doughty-Shenton et al. 2010). They also reported that metformin and phenformin did not inhibit the protein.

Docking alexidine dihydrochloride yielded a binding energy of − 2.60 kcal/mol with an inhibition constant (Ki) of 12.35 mM; the ligand formed three hydrogen bonds with GLU-73, THR-99, and VAL-100 within a bond length of 3.0 Å. Generally, conformations with binding energies lower than − 5.45 kcal/mol are considered the most likely to produce an inhibitory response, since their predicted inhibition constants remain below 100 µM and they are regarded as potent. Inhibitors with predicted inhibition constants higher than 100 µM are considered non-potent (Zheng and Polli 2010). The hydrogen donors (-NH) in alexidine dihydrochloride were the pharmacophoric features responsible for these hydrogen bonds. The other interacting amino acids were ALA-134, ARG-96, ARG-136, ARG-138, ARG-172, ASP-101, CYS-132, GLU-73, GLU-74, GLU-76, GLY-104, LYS-133, MET-102, SER-98, SER-137, THR-99, TYR-75, and VAL-100 (Fig. 11). The control drug thus had a poor binding energy of − 2.60 kcal/mol, and 193 of the 210 flavonoids docked outperformed it with lower (more negative) binding energies, suggesting that natural compounds could also be tested for their potency in vitro and then in vivo. Thus, natural compounds showed better docking results than the control. The hydrogen bonds were predominantly formed with ARG-136, ARG-138, ARG-172, LYS-133, THR-99, and TYR-75, which effectively constitute the binding site. The ligand-binding site might not be the active catalytic site. Even if the ligands bound to an allosteric site, this would distort the active site, thereby preventing the release of the substrate from the enzyme–substrate complex (Doughty-Shenton et al. 2010).

Fig. 11. 3D and 2D interactions of the alexidine dihydrochloride-protein complex
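The correspondence between the docking binding energy and the predicted inhibition constant quoted above can be reproduced with the standard thermodynamic relation ΔG = RT ln(Ki). The sketch below assumes a temperature of 298.15 K and the gas constant in kcal/(mol·K); the function names are hypothetical, and the exact constants used by the docking software may differ slightly.

```python
import math

R_KCAL = 1.98719e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15     # assumed temperature


def ki_from_binding_energy(dg_kcal_per_mol):
    """Predicted inhibition constant (mol/L) from dG = RT * ln(Ki)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * T_KELVIN))


def is_potent(dg_kcal_per_mol, threshold_molar=100e-6):
    """True when the predicted Ki lies below the 100 micromolar cut-off."""
    return ki_from_binding_energy(dg_kcal_per_mol) < threshold_molar


# Binding energies reported in the text for the control and for prunin
for name, dg in (("alexidine dihydrochloride", -2.60), ("prunin", -8.66)):
    ki = ki_from_binding_energy(dg)
    print(f"{name}: Ki = {ki:.3e} M, potent = {is_potent(dg)}")
```

With these assumptions, − 2.60 kcal/mol corresponds to a Ki in the low millimolar range, in line with the reported 12.35 mM, and − 5.45 kcal/mol corresponds to roughly 100 µM.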

Park et al. (2012) performed a virtual screening of inhibitors against human PTPMT1 using AutoDock. They reported that the chemical leads inhibited the protein with IC50 values of 0.7–17.3 µM. Interestingly, the interacting amino acid residues they reported, ALA-134, ARG-138, ARG-172, ASP-101, MET-102, LYS-133, SER-137, and VAL-100, were also found in the present research work. Thus, the PTPMT1 of humans and rats (present research) share similar ligand-binding sites, and the agreement of the binding site and interacting residues with the previously reported work supports the present research.

It can be inferred from the present study that inhibiting PTPMT1 leads to conformational changes in the protein’s structure and to its dysregulation, thereby increasing the insulin levels in cells. Doughty-Shenton et al. (2010) attenuated the protein’s expression and reported a significant augmentation in the insulin levels in pancreatic beta-cell lines in vitro. Interestingly, targeting and inhibiting the PTPMT1 of the liver and pancreatic beta-cells could help increase insulin levels and possibly treat the onset of type II diabetes (Arthur et al. 2020). Niemi et al. (2013) studied the role of rat PTPMT1 in the apoptosis of cancer cells. They reported that downregulating and knocking out the gene responsible for PTPMT1 using RNAi resulted in downregulation of the protein, thereby promoting mitochondria-dependent apoptosis of cancer cells due to the release of cytochrome c. This also caused metabolic changes, lowering cardiolipin levels and increasing insulin and adenosine triphosphate levels when the cells were cultured in glucose-containing media, thus combating cancer and type II diabetes. The flavonoids from the present research posed well with low binding energies, enabling them to be taken forward to in-vitro and in-vivo studies after assessment of their in-silico pharmacokinetic and toxicity characteristics.

Biological and pharmaceutical properties of top 10 flavonoids

Table S2 lists the biological, pharmacological, biochemical mechanism, and toxicity properties of the flavonoids predicted using the PASS webserver. These compounds are predicted to be anti-cancer, anti-neoplastic, and anti-diabetic agents and inhibitors of dual-specificity phosphatase 1. Hence, these compounds can be further studied in vitro as potential inhibitors of PTPMT1.

Pharmacokinetic properties and Lipinski’s rule of 5

The absorption, distribution, metabolism, and excretion properties of lead compounds must be studied before entering clinical trials (Sirikantaramas et al. 2008). From the predicted pharmacokinetic properties (Table S3), all ten flavonoids possessed very high gastro-intestinal absorption, penetrating the epithelial cell membrane into the circulatory system, which predominantly occurs through passive diffusion (Barthe et al. 1999). Prunin, oroxindin, luteolin 7-rutinoside, nicotiflorin, and yinyanghuo A were not predicted to cross the blood–brain barrier or Caco-2 cells due to their high total polar surface area (greater than 90 Å²) and high molecular weight. In comparison, the rest of the compounds crossed the blood–brain barrier and Caco-2 cells due to their low total polar surface area (less than 90 Å²) (Hitchcock and Pennington 2006). Luteolin 7-rutinoside, nicotiflorin, and yinyanghuo A were predicted to be P-glycoprotein substrates, and the rest of the compounds were not. Lanceolatin A and yinyanghuo A were effective P-glycoprotein inhibitors. A drug must inhibit P-glycoprotein to be an effective lead compound (Amin 2013). Prunin, oroxindin, luteolin 7-rutinoside, and nicotiflorin did not inhibit the cytochromes, whereas each of the other flavonoids inhibited at least one of the cytochromes. The drug-likeness of the compounds was also studied by applying the Lipinski, Ghose, Veber, Egan, and Muegge filters, as these filters help chemists design compounds for the next phase of trials (Loureiro et al. 2019). Only 3(2H)-isoflavenes, lanceolatin A, and yinyanghuo A satisfied all the drug-likeness filters (Table S4). Chalcone, flavan, and anthocyanin violated the Muegge filter. Prunin and oroxindin satisfied only the Lipinski and Ghose filters. Luteolin 7-rutinoside and nicotiflorin violated all the filters. The toxicity profiles of the compounds (Table S5) showed that luteolin 7-rutinoside, 3(2H)-isoflavenes, and nicotiflorin were AMES toxic. None of the compounds was predicted to be carcinogenic. All the compounds except flavan were found to be hepatotoxic.

Lipinski’s drug-likeness is an essential criterion for a compound to be an oral drug. Out of the top 10 compounds, luteolin 7-rutinoside and nicotiflorin each violated three properties (molecular weight; H-bond donor; H-bond acceptor) (Table S6). Prunin and oroxindin violated the number of H-bond donors and H-bond acceptors, respectively.
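The rule-of-five check described above can be sketched as follows, using the commonly stated limits (molecular weight ≤ 500 g/mol, H-bond donors ≤ 5, H-bond acceptors ≤ 10, logP ≤ 5). The class and function names are hypothetical, and the descriptor values in the example are illustrative placeholders rather than entries from Table S6.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    mol_weight: float       # g/mol
    h_bond_donors: int
    h_bond_acceptors: int
    log_p: float


def lipinski_violations(d: Descriptors) -> List[str]:
    """Return the names of the rule-of-five properties that are violated."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors")
    if d.log_p > 5:
        violations.append("logP")
    return violations


# Illustrative glycoside-like descriptors (assumed values, not from Table S6)
glycoside = Descriptors(mol_weight=594.5, h_bond_donors=9,
                        h_bond_acceptors=15, log_p=-0.5)
print(lipinski_violations(glycoside))
```

A compound is usually regarded as orally drug-like under this rule when it violates no more than one of the four properties, which is why a single donor or acceptor violation does not by itself fail the Lipinski filter.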

Na and Ryu studied the toxicity of prunin on HaCaT cell lines using the tetrazolium salt assay-1 at concentrations of 1, 5, 10, 20, and 40 µM. The cell viability was 100% at 1, 5, and 10 µM; however, the viability was 93% at 20 µM and decreased further to 75% above 40 µM. They also studied the cytoprotective effects of prunin on HaCaT cells irradiated with UVB at an intensity of 40 mJ/cm². The survival rate of the cells increased to 71% at 5 µM and up to 96% at 10 µM, compared with the control (53% of cells viable after irradiation) (Na and Ryu 2018). Nishina et al. studied the cytotoxic effects of nicotiflorin extracted from Eleutherococcus sieboldianus on 3T3-L1 mouse fibroblast cell lines at 1, 3, 10, 30, and 100 µM. They reported that the compound was not toxic even at 100 µM, and the cells were 100% viable (Nishina et al. 2017). Shyaula et al. reported that nicotiflorin was not toxic even above 100 µM when tested on 3T3 fibroblast cell lines using the MTT assay (Shyaula et al. 2012). Pirvu et al. reported that nicotiflorin, a kaempferol derivative, enhanced the viability of L-929 fibroblast cell lines under in-vitro conditions. They also reported that the change in cell viability was not significant at lower concentrations (5 μg GAE/mL sample), while at higher concentrations (10–100 μg GAE/mL), the cell viability increased in a dose-dependent manner (Pirvu et al. 2018). de Vasconcelos et al. reported that chalcone and its derivatives exert cytotoxic effects on human adenocarcinoma cell lines by altering the expression of the pro-apoptotic Bax and anti-apoptotic Bcl-2 genes. The pro-apoptotic BAX mRNA expression in cells was high (at 40 µM) compared to the control, with no effect of the incubation period, and the anti-apoptotic Bcl-2 expression decreased in a time-dependent manner (de Vasconcelos et al. 2013). The expression of survivin was also inhibited at 40 μM. Sinha et al. reported that chalcone and its derivatives were non-toxic and not haemolytic when tested on erythrocytes (Sinha et al. 2019). There has been no evidence of anthocyanin toxicity in human and animal models (Cladis et al. 2020). Anthocyanins from various fruits like elderberries, blueberries, etc. were administered to rats and rabbits at 9 g/kg of body weight per day at various time intervals and studied for three successive generations (Pourrat et al. 1967). No trace of toxicity was found in these models, indicating that the anthocyanins are non-toxic. Polyphenol anthocyanins from apples (Shoji et al. 2004), grapes (Bentivegna and Whitney 2002), purple corn (Nabae et al. 2008), and green tea (Morita et al. 2009) were also found to be non-toxic to higher mammals.

Molecular Dynamics simulations

The MD simulations showed that the apo-proteins and the flavonoid-protein complexes were stable throughout the simulations. In contrast, the control drug alexidine dihydrochloride-protein complex fluctuated highly. The molecular dynamics simulations were performed for six systems, viz., (i) template apo-protein (PDB Id: 3RGQ); (ii) modelled protein; (iii) prunin-protein complex; (iv) oroxindin-protein complex; (v) luteolin 7-rutinoside-protein complex; (vi) alexidine dihydrochloride-protein complex.

The RMSD is the average displacement of a set of atoms, here the α-carbons of the protein, in a particular frame with respect to the reference frame. It is a measure of the stability of the conformation and the quality of the docked complexes. The RMSD of the template apo-protein (PDB Id: 3RGQ) was within 2–2.25 Å throughout the simulation (Fig. 12a). It rose linearly at the beginning, did not fluctuate markedly, and converged to 1.70 Å at 150 ns. For small, globular proteins, RMSD values within 1–3 Å are considered acceptable. Thus, the lower the RMSD of the protein, the greater the stability and the lower the deviations. The RMSF denotes the local fluctuations of specific atoms or amino acids with respect to the reference structure. The RMSF of the template apo-protein is shown in Fig. 12b. The α-helices and β-strands are denoted in red and blue, respectively, and the loop regions in white. The secondary structure elements are rigid and thus fluctuate less than the loop regions of the protein. The RMSF of the protein was stable throughout the simulation. The highest RMSF value of 2.25 Å was found in the loop region between amino acid residues 61 and 70. The template apo-protein was stable throughout the 150 ns simulation. The contribution of each residue to the secondary structure of the template protein and its assignment were monitored throughout the 150 ns trajectory (Fig. 12c, d).

Fig. 12. MD simulation of the template apo-protein (PDB Id: 3RGQ)
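The RMSD and RMSF definitions given above can be written out explicitly. The sketch below is a minimal pure-Python illustration, assuming that every frame has already been superimposed on the reference structure and that atoms are listed in the same order; the function names and the toy coordinates are hypothetical and do not come from the trajectories analysed here.

```python
import math


def rmsd(frame, reference):
    """RMSD of one frame: root of the mean squared displacement over atoms."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    total = sum((a - b) ** 2
                for atom, ref in zip(frame, reference)
                for a, b in zip(atom, ref))
    return math.sqrt(total / len(frame))


def rmsf(trajectory, reference):
    """Per-atom RMSF: root of the time-averaged squared displacement."""
    n_frames = len(trajectory)
    values = []
    for i, ref in enumerate(reference):
        total = sum(sum((a - b) ** 2 for a, b in zip(frame[i], ref))
                    for frame in trajectory)
        values.append(math.sqrt(total / n_frames))
    return values


# Toy system of two atoms and two frames (coordinates in angstrom)
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
frame_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

print(rmsd(frame_1, reference))
print(rmsf([frame_1, frame_2], reference))
```

The RMSD gives one value per frame, averaged over atoms, whereas the RMSF gives one value per atom, averaged over frames; this is why the former is plotted against time and the latter against residue or atom index.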

The RMSD of the modelled protein was stable throughout the simulation. The RMSD values showed minor fluctuations, increasing initially and stabilizing towards the end of the simulation at about 1.80 Å at 150 ns. The maximum RMSD value of 2.45 Å was obtained at 48 ns (Fig. 13a). The RMSF of the modelled protein remained below 1.7 Å, with the maximum at residues 48–52 (Fig. 13b); hence, no high fluctuations were found. This indicates that the modelled protein was stable and that the built model was valid. The contribution of each residue to the secondary structure of the modelled protein and its assignment were monitored throughout the 150 ns trajectory (Fig. 13c, d).

Fig. 13. MD simulation of the modelled apo-protein

The protein–ligand RMSD of the prunin-protein docked conformation was studied (Fig. 14a–f). The RMSD of the docked protein fluctuated during the simulation and lay within 2.2 Å (Fig. 14a). The ligand RMSD is a measure of the stability of the ligand with respect to its binding site on the protein. The protein–ligand complex was first aligned on the protein backbone of the reference, and then the deviations of the heavy atoms of the ligand were measured. The ligand RMSD fluctuated initially, then reduced to 7.5 Å, peaked at ~ 11 Å at 80 ns, and finally stabilized at 3.5 Å at 150 ns. The RMSF of the protein in the prunin-protein complex was very low, i.e., within 1.4 Å, and did not fluctuate during the simulation, indicating the stability of the complex (Fig. 14b). The ligand RMSF provides insights into the interactions and fluctuations of the ligand with the protein (Fig. 14c). The ligand–protein complex was first aligned onto the protein backbone, and the RMSF of the heavy atoms of the ligand was measured with respect to the protein. The RMSF of the heavy atoms of the ligand was within 4.1 Å. The heavy atoms 13 and 24 were tail atoms that fluctuated the most as they were exposed to solvent, while atoms 8–12, 25, 26, and 27 did not fluctuate as they were bound deep inside the binding site with minimal spatial movement (Fig. S5). The protein–ligand contact plot is a timeline that annotates the interactions between them. Four types of interaction are predominantly found, viz., hydrogen bonds, hydrophobic contacts, ionic bonds, and water bridges (Fig. 14d). From the ligand–protein contact plot, most interactions were water bridges, followed by conventional hydrogen bonds. Water bridges are hydrogen bonds between the ligand and protein mediated by a water molecule at a distance of 2.8 Å, while the distance of conventional hydrogen bonds is 2.5 Å. The amino acids MET-102, SER-137, ASP-101, THR-99, and ARG-172, in that order, interacted with the ligand at the highest fractions. Conventional hydrogen bonds were formed between prunin and ARG-136, ARG-138, ARG-172, GLY-104, LYS-133, MET-102, THR-99, and TYR-75. Hydrophobic contacts were formed with MET-102 and TYR-75, since these amino acids possess hydrophobic sidechains that interacted with an aromatic or aliphatic group on the ligand. The number of contacts made between the protein and the ligand in each trajectory frame over the course of the 150 ns simulation is shown in Fig. 14e. Most contacts between the amino acid residues and the ligand were made during 60–90 ns. A darker colour shade indicates that a residue formed more than one specific contact with the ligand. Most contacts were made between prunin and THR-99, followed by GLU-73, MET-102, LYS-133, ARG-172, SER-137, and so on. The ligand properties chart in Fig. S6 depicts properties such as the ligand RMSD, radius of gyration, molecular surface, intramolecular hydrogen bonds, polar surface area, and solvent-accessible surface area of prunin over the entire 150 ns trajectory.

Fig. 14. MD simulation of the prunin-protein complex
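The interaction fractions read from the protein–ligand contact plots express how often a given contact is present during the trajectory. A minimal sketch of that idea is given below, assuming a purely distance-based criterion with the 2.5 Å cut-off quoted above for conventional hydrogen bonds; the simulation software may apply additional geometric criteria that are omitted here, and the function name and per-frame distances are hypothetical.

```python
def contact_fraction(distances, cutoff=2.5):
    """Fraction of frames in which the distance is within the cut-off."""
    if not distances:
        raise ValueError("at least one frame is required")
    hits = sum(1 for d in distances if d <= cutoff)
    return hits / len(distances)


# Hypothetical per-frame distances (angstrom) for one residue-ligand pair
frames = [2.1, 2.4, 3.0, 2.6, 2.3]
print(contact_fraction(frames))               # hydrogen-bond cut-off
print(contact_fraction(frames, cutoff=2.8))   # water-bridge distance from the text
```

A fraction of 0.6 in this toy example would be read as a contact maintained for 60% of the simulation time.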

For the oroxindin-protein complex, the protein RMSD continued to increase even after 100 ns and finally converged and stabilized at the end of 150 ns (Fig. 15a); the maximum RMSD of 2.7 Å was observed at 68 ns. The RMSD of oroxindin at the binding site of the protein increased initially, converged, increased again, and stabilized at 9 Å at the end of the simulation. This could be due to the deviation of the ligand from its initial binding site. The RMSF of the protein was stable throughout the simulation; a maximum of 2.25 Å was observed between amino acid residues 60 and 70 at the ligand contact site (Fig. 15b). The ligand RMSF with respect to the protein was within 6.0 Å (Fig. 15c). The RMSF was highest for the tail-end atoms (7, 10–17), which were exposed to solvent and able to rotate, while the movement of atoms 19–31 was restrained as they were deep inside the binding pocket, hence their lower fluctuations (Fig. S7). The protein–ligand contact plot showed most contacts with ARG-138, LYS-133, MET-102, TYR-75, ARG-136, ASP-101, THR-99, and so on (Fig. 15d). Water bridges were the most frequent contacts across the interacting amino acids. Conventional hydrogen bonds were formed between the ligand and the ARG-136, ARG-138, GLN-175, LYS-133, SER-137, THR-99, and TYR-75 residues within 2.5 Å. Hydrophobic contacts were formed with ALA-134, ARG-138, LYS-133, MET-102, and TYR-75, including π-Cation interactions with the aromatic group on the ligand (Fig. 15d). Polar ionic bonds were formed between the NH3+ group in the sidechain of LYS-133 and two oxygen atoms in the ligand at a distance of 2 Å. ARG-136 and ARG-172 also formed ionic bonds with the ligand. ARG-138 and LYS-133 contacted the ligand the most, followed by MET-102 and TYR-75. The dark-coloured shades denote more than one contact established within the complex (Fig. 15e). The ligand–protein contact plot showed that ARG-138 formed two π-Cation contacts, 30% and 46% of the time, with two aromatic groups of the ligand and also acted as a hydrogen-bond donor through its sidechain to an oxygen acceptor of the ligand 33% of the time (Fig. 15f). The various ligand properties, such as the ligand RMSD, radius of gyration, molecular surface, intramolecular hydrogen bonds, polar surface area, and solvent-accessible surface area of oroxindin during the 150 ns trajectory, are shown in Fig. S8. Thus, the two amino acids ARG-138 and LYS-133 played a vital role in inhibiting the protein’s activity.

Fig. 15. MD simulation of the oroxindin-protein complex

From the MD analysis of luteolin 7-rutinoside with the protein, the protein RMSD in the protein–ligand plot lay within 2.25 Å initially and then converged to less than 2.00 Å for the rest of the simulation (Fig. 16a). The RMSD of the ligand, luteolin 7-rutinoside, was quite unstable and showed random fluctuations between 50 and 80 ns of the simulation, reaching up to 17 Å at the end of the trajectory. The fluctuations could be due to the deviation of the ligand from the binding site of the protein. The RMSF of the α-carbons of the protein was, however, stable; most fluctuations were found in the ligand contact regions of the protein and stayed within 2.00 Å. The unstructured parts, that is, the loops and turns (residue indices 40–45, 64–70, 97–103, and 133–138), fluctuated more than the other secondary structures (Fig. 16b). The highest fluctuation (RMSF = 1.9 Å) was observed between residues 64 and 70. The ligand RMSF of the luteolin 7-rutinoside fragments with respect to the protein was slightly higher than 10 Å (Fig. 16c). Atoms 23–34 fluctuated the most, above 8 Å, which could be due to the solvent-exposed tail end with high rotatability. In contrast, the RMSF of the rest of the atoms was comparatively low, as their spatial movements were restricted deeper inside the binding cleft (Fig. S9). Conventional hydrogen bonds and water bridges contributed to the interactions between the protein and ligand (Fig. 16d). ALA-134, ARG-78, ARG-136, ARG-138, ARG-172, CYS-132, ILE-171, LYS-133, MET-102, SER-137, SER-170, SER-174, and THR-103 formed conventional hydrogen bonds at a distance of 2–2.5 Å. Water bridges were formed by ARG-78, ARG-136, ARG-138, ARG-172, ASP-101, LYS-133, GLN-175, GLU-73, GLU-76, ILE-171, MET-102, SER-137, SER-170, SER-174, THR-103, and TYR-75, typically within 2.8–3 Å. Hydrophobic contacts were formed between the aliphatic or aromatic groups of the ligand and the ALA-134, ARG-136, ARG-138, ARG-172, LYS-133, MET-102, PRO-173, and TYR-75 residues. Polar ionic bonds were formed by the ARG-138, LYS-133, ARG-172, and ARG-136 residues within 3 Å. Total contacts were highest at the beginning of the simulation. ARG-172 formed most contacts with the ligand, followed by LYS-133, ARG-138, CYS-132, ARG-136, SER-137, and so on. ARG-136, ARG-138, ARG-172, CYS-132, LYS-133, SER-137, etc., interacted with the ligand with more than one specific contact, shown in dark-coloured shades (Fig. 16e). From the ligand–protein contact plot (Fig. 16f), the two donor OH groups on luteolin 7-rutinoside interacted with CYS-132 for 40% and 72% of the time, and the positively charged sidechain NH3+ of LYS-133 formed a π-Cation interaction with the aromatic group on the ligand 32% of the time and interacted with a water molecule 31% of the time. Apart from these, water molecules donated to the oxygen acceptor atom on the ligand 31% of the time. An intramolecular donor–acceptor interaction within the ligand was present 96% of the time. The various ligand properties, such as the ligand RMSD, radius of gyration, molecular surface, intramolecular hydrogen bonds, polar surface area, and solvent-accessible surface area of luteolin 7-rutinoside during the 150 ns trajectory, are shown in Fig. S10.

Fig. 16. MD simulation of the luteolin 7-rutinoside-protein complex

The MD analysis of the alexidine dihydrochloride-protein docked complex was studied (Fig. 17a–f). The protein RMSD of the alexidine dihydrochloride-protein complex remained within 2.5 Å throughout the simulation, whereas the RMSD of the ligand fluctuated highly and was not stable (Fig. 17a). The ligand RMSD peaked at up to 64 Å during the simulation; however, it reduced to 30 Å at the end. The local fluctuations of the protein were low (RMSF = 1.90 Å) in the loop (residues 64–70) and turn regions during its interaction with the ligand (Fig. 17b). The fluctuations at the N and C termini of the protein are often the highest and are thus ignored. The ligand-only RMSF (Fig. 17c) of alexidine dihydrochloride fluctuated above 30 Å, indicating that the control drug docked with the modelled protein was not stable. Irrespective of light or heavy atoms (Fig. S11), the entire alexidine dihydrochloride molecule was unstable and showed high fluctuations. The ligand–protein contact plot shows the different interactions involved during the simulation. The amino acids GLN-122, GLU-185, SER-123, LEU-119, ALA-188, ARG-189, VAL-150, etc., interacted with the ligand through conventional hydrogen bonds, hydrophobic contacts, water bridges, and ionic bonds (Fig. 17d). Hydrogen bonds were contributed by ALA-187, ALA-188, ALA-190, ALA-191, ARG-189, ASN-152, GLN-116, GLN-122, GLN-149, GLU-73, GLU-76, GLU-185, HIS-43, LEU-119, LYS-184, SER-123, TYR-75, and VAL-150. Water bridges were formed with residues ALA-187, ALA-188, ALA-190, ALA-191, ARG-189, ASN-152, ASP-42, GLN-116, GLN-122, GLN-149, GLU-73, GLU-74, GLU-76, GLU-185, HIS-43, HIS-183, LEU-119, LEU-124, LYS-184, SER-123, THR-44, THR-99, TYR-75, VAL-100, and VAL-150. Hydrophobic contacts were formed between the hydrophobic residues ALA-187, ALA-188, ALA-190, ALA-191, LEU-119, LEU-124, PHE-79, TYR-75, VAL-115, and VAL-150 and the aliphatic or aromatic groups of the ligand. Ionic bonds were formed with the residues ALA-187, ALA-190, ARG-78, ARG-189, ASP-42, GLU-73, GLU-177, GLU-185, and LYS-184. GLN-122 formed most contacts with the ligand, followed by GLU-185, SER-123, LEU-119, ALA-188, ARG-189, VAL-150, and so on; these residues interacted with the ligand with more than one specific contact and are shown in dark-coloured shades (Fig. 17e).

Fig. 17. MD simulation of the alexidine dihydrochloride-protein complex

The various ligand properties, such as the ligand RMSD, radius of gyration, molecular surface, intramolecular hydrogen bonds, polar surface area, and solvent-accessible surface area of alexidine dihydrochloride during the 150 ns trajectory, are shown in Fig. S12. From the MD simulations, the template, the modelled apo-protein, and the flavonoid-docked conformations were stable throughout the simulation, supporting the docking procedure. The protein RMSD and RMSF of the top three docked complexes were stable during the trajectory; however, some fluctuations and deviations were observed in the ligand RMSD and RMSF. The control drug alexidine dihydrochloride-protein docked complex was unstable during the trajectory.

MM-GBSA free binding energy calculations

The Prime MM-GBSA energies were calculated for the following complexes: modelled protein-prunin; modelled protein-oroxindin; modelled protein-luteolin 7-rutinoside; and modelled protein-alexidine dihydrochloride. The average MM-GBSA ΔG binding energies of prunin, oroxindin, luteolin 7-rutinoside, and alexidine dihydrochloride were − 36.16, − 40.71, − 48.05, and − 28.47 kcal/mol, respectively. The lower the ΔG binding energy, the greater the binding affinity. The ΔG of each of the top three complexes was lower than that of the control drug alexidine dihydrochloride, which indicates that the flavonoids possessed greater affinity and inhibition potential and may emerge as a possible remedy against PTPMT1. Table S7 depicts the calculated MM-GBSA ΔG binding energy and the ΔG of the bonds formed between the ligands and the modelled protein.
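The comparison with the control can be made explicit by ranking the reported average ΔG values and expressing each one relative to alexidine dihydrochloride. The short sketch below uses only the four averages quoted above; the function and variable names are hypothetical, and the difference computed here is a simple subtraction for illustration, not an additional free-energy calculation.

```python
# Average MM-GBSA binding energies (kcal/mol) as reported in the text
MMGBSA = {
    "prunin": -36.16,
    "oroxindin": -40.71,
    "luteolin 7-rutinoside": -48.05,
    "alexidine dihydrochloride": -28.47,
}
CONTROL = "alexidine dihydrochloride"


def rank_by_affinity(energies):
    """Ligand names ordered from lowest (most favourable) energy upwards."""
    return sorted(energies, key=energies.get)


def relative_to_control(energies, control):
    """Difference between each ligand's energy and the control's energy."""
    return {name: round(value - energies[control], 2)
            for name, value in energies.items() if name != control}


print(rank_by_affinity(MMGBSA))
print(relative_to_control(MMGBSA, CONTROL))
```

Negative differences indicate ligands predicted to bind more favourably than the control; all three flavonoids fall into this group, with luteolin 7-rutinoside showing the largest difference.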

Conclusion

PTPMT1, a dual-specificity phosphatase in the mitochondria, is responsible for cardiolipin synthesis, homeostasis, and protein phosphorylation, and its dysregulation has been linked to the onset of cancer and type II diabetes. We performed homology modelling of this protein using the SWISS-MODEL and Geno3D servers with appropriate templates to study its properties, functions, and inhibition. The protein model developed using SWISS-MODEL was better than the one from Geno3D. Molecular docking of flavonoids and alexidine dihydrochloride against the modelled PTPMT1 was done using AutoDock 4.2.6, and the results were promising. Prunin (− 8.66 kcal/mol), oroxindin (− 8.56 kcal/mol), luteolin 7-rutinoside (− 8.47 kcal/mol), 3(2H)-isoflavenes (− 8.36 kcal/mol), and nicotiflorin (− 8.29 kcal/mol) possessed the lowest free binding energies out of the 210 flavonoids docked. Pharmacokinetics, toxicity profiles, and Lipinski properties of the flavonoids were also studied and tabulated. Molecular dynamics simulations for 150 ns performed using Desmond (Schrödinger) revealed that the docked flavonoid complexes were stable. The MM-GBSA ΔG binding energy studies showed that the flavonoids possessed lower energies than the control drug alexidine dihydrochloride; hence, they could emerge as promising candidates for PTPMT1 inhibition. From the study, it can be interpreted that plant-based natural compounds could counter the dysregulation of the novel drug target PTPMT1. Our future work aims to involve other classes of bioactive compounds in in-vitro and in-vivo studies.