Introduction

Johne’s disease is of foremost concern to the dairy industry worldwide (Nielsen and Toft 2009). Mycobacterium avium subspecies paratuberculosis (MAP), the causal agent of this disease of ruminants, belongs to the Mycobacterium avium complex (MAC). MAC includes benign environmental bacteria as well as opportunistic pathogens of humans (Turenne et al. 2007). Like other mycobacterial species, MAP blocks phagosome-lysosome fusion, thereby sustaining its survival (Kuehnel et al. 2001). Furthermore, it provides anti-apoptotic signals that lengthen the life span of its host cell. The interaction of MAP with macrophages is mediated through a variety of receptors, including Toll-like receptors (TLRs) (Ferwerda et al. 2007; Koets et al. 2010). MAP can amplify interleukin (IL)-10, an immunomodulatory cytokine that represses killing of MAP by macrophages. It also suppresses Th1-type immune responses, which are the main defense against intracellular infections, by inhibiting IL-12 (Weiss et al. 2002, 2005). Macrophages are among the important cells of the innate immune system, and they also provide signals that induce adaptive immune responses. MAP can interfere with antigen presentation and hence with the consequent adaptive responses (Weiss and Souza 2008). Correspondingly, the host immune response to mycobacterial species, notably Mycobacterium tuberculosis (Mtb), is generally mediated by TLR-2 in macrophages (Yoshida et al. 2009; Jo et al. 2007; Stenger and Modlin 2002). The Mtb non-acylated lipoprotein LprG (Rv1411c) has a TLR-2 agonist action that depends on its association with triacylated glycolipids, which bind specifically within the hydrophobic pocket of LprG. The detection of a glycolipid carrier function has important implications for the role of LprG in mycobacterial physiology (Kuehnel et al. 2001; Gehring et al. 2004).
Continued exposure (>16 h) of human macrophages to LprG (acylated agonist of TLR2) results in noticeable inhibition of MHC-II antigen (Ag) processing, and this inhibition depends on TLR-2. Inhibition of MHC-II Ag processing by mycobacterial lipoproteins may permit M. tuberculosis within infected macrophages to evade the immune response of the host, resulting in virulence and latency (Weiss et al. 2002; Drage et al. 2010). In contrast, we know little about how MAP evades host defenses and survives within host macrophages, a process common to all mycobacterial infections. Recently, Hassan et al. (2014) proposed that MAP1138c, a putative protein, shares structural and functional homology with the Rv1411c lipoprotein. Therefore, a proper functional and structural characterization of the putative antigenic proteins will be useful in understanding the evasion and survival mechanism of MAP within host macrophages.

In this regard, the present comparative proteomic study of MAP1138c and Rv1411c lipoproteins shows that the putative protein MAP1138c shares both structural and functional homology with Rv1411c (LprG) lipoprotein. The structure-based interaction studies of the theoretical 3D structure of MAP1138c protein and triacylated glycolipid show that Ac1PIM2 (a triacylated glycolipid) binds specifically to the hydrophobic pocket of MAP1138c protein in a similar way as it binds to the hydrophobic pocket of the crystal structure of Rv1411c (LprG) lipoprotein. Thus, we propose that the MAP1138c-Ac1PIM2 binary complex acts as an agonist for the TLR-2 receptor, leading to TLR-2 dependent inhibition of antigen processing within host macrophages and, eventually, to reduced recognition by CD4+ T cells. This mechanism of action may lead to immune evasion and latency of MAP within host macrophages.

Materials and Methods

Sequence retrieval

The accession numbers for MAP1138c and Rv1411c proteins were acquired from UniProtKB (http://www.uniprot.org/). FASTA sequences of both MAP1138c and Rv1411c (LprG) proteins were obtained from the protein database (www.ncbi.nih.gov/protein). The protein sequences in FASTA format were used to compare the physicochemical, structural and functional properties of MAP1138c and Rv1411c (LprG) proteins.
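
As an illustration of how FASTA-formatted sequences can be read before downstream comparison, the following is a minimal sketch rather than the procedure used in this study. The function name parse_fasta and the two toy records are hypothetical; real records would be downloaded from the databases named above.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record ID to sequence.

    The record ID is taken as the first whitespace-delimited token
    after the '>' character; sequence lines are joined and upper-cased.
    """
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


if __name__ == "__main__":
    # Toy input with made-up IDs and short made-up sequences.
    demo = ">toyA example record\nMLGM\nQTRR\n>toyB\nACDE\n"
    for name, seq in parse_fasta(demo).items():
        print(name, len(seq), seq)
```

Keeping the parser independent of any download step makes it easy to reuse the same function for files saved from either database.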

Sequence analysis

The gapped Basic Local Alignment Search Tool (BLAST) proposed by Altschul and Madden (1997) was used by the SWISS-MODEL server to identify homologs of the query protein (MAP1138c) in the ExPDB template library of the SWISS-MODEL server.

Phosphorylation sites evaluation

Comparative evaluation of the serine and threonine phosphorylation sites in the orthologous proteins (MAP1138c and Rv1411c (LprG)) was performed using the NetPhosBac 1.0 server (Miller et al. 2009) (http://www.cbs.dtu.dk/services/NetPhosBac 1.0).

Secretory nature profiling

The presence of a signal sequence and a signal peptide cleavage site in the orthologous proteins (MAP1138c and Rv1411c (LprG)) was evaluated and compared using the SignalP 4.1 server (Petersen et al. 2011) (http://www.cbs.dtu.dk/services/SignalP).

Globular/order and disorder regions prediction

The globular (ordered) and disordered regions of MAP1138c and Rv1411c (LprG) proteins were compared using GlobPlot (Linding et al. 2003) (http://globplot.embl.de/cgiDict.py).

Hydropathy analysis

The hydropathy plot of DNASTAR was used to predict and compare the flexible regions (Karplus and Schulz 1985), surface probability (Emini et al. 1985), hydrophilicity (Kyte and Doolittle 1982) and antigenic index (Jameson and Wolf 1988) of MAP1138c and Rv1411c (LprG) proteins directly from their primary amino acid sequences.
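
For readers unfamiliar with sliding-window hydropathy profiles, the sketch below computes a Kyte and Doolittle (1982) profile from a primary sequence. It is an illustration only and does not reproduce the DNASTAR implementation; the function name hydropathy_profile and the default window of 9 residues are assumptions. On this scale, positive values indicate hydrophobic stretches and negative values hydrophilic ones.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence.

    The result has len(sequence) - window + 1 values; entry i is the
    average over residues i .. i + window - 1 (0-based).
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Short made-up peptide, used only to show the call.
    for i, value in enumerate(hydropathy_profile("MKTLLVAGSDE", window=5), start=1):
        print(i, round(value, 2))
```

A wider window smooths the profile and highlights long hydrophobic stretches, whereas a narrower window preserves local detail.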

Sequence feature annotation

Domain assignment

The member databases of InterPro (Zdobnov and Apweiler 2001) help in the identification of protein domains as well as the determination of protein function. The InterProScan sequence analysis and classification tool was used to detect and compare protein domains and functional sites of the orthologous proteins (MAP1138c and Rv1411c (LprG)).

Secondary, disorder and transmembrane prediction

PSIPRED (Jones 1999), a secondary structure prediction server based on two feed-forward neural networks, was used to compare the secondary structure, disorder regions and transmembrane segments of the orthologous proteins (MAP1138c and Rv1411c (LprG)).

Homology modeling and model quality estimation

Model building

A raw model structure of MAP1138c protein was built by means of the manual protein modeling server “SWISS-MODEL Workspace” (Biasini et al. 2014; Arnold et al. 2006; Bordoli et al. 2009; Kiefer et al. 2009; Kopp and Schwede 2006; Guex et al. 2009). The N-terminal secretion signal sequence of MAP1138c protein was removed before the sequence was submitted to the server for model building. The hypothetical model of MAP1138c protein was built from the target-template sequence alignment using Promod-II (Guex and Peitsch 1997) and Modeller (Sali and Blundell 1993).

Model quality estimation

The goodness and reliability of the model MAP1138c protein structure were evaluated by means of the QMEAN score (Benkert et al. 2008; Benkert et al. 2009a, b, 2011). The estimated model reliability ranges between 0 and 1. The absolute model quality was measured by the QMEAN Z-score. The QMEAN score obtained by the theoretical model was evaluated against the scores of high-resolution crystal structures of the same size, and a Z-score of the contributing components of QMEAN was determined. The mean Z-score of the high-resolution reference structures is zero. Finally, the obtained model was validated at the SWISS-MODEL server using the PROCHECK validation tool. PROCHECK evaluates how closely the geometry of the amino acid residues in the predicted 3D structure of MAP1138c protein matches stereochemical parameters obtained from well-refined, high-resolution structures (Laskowski et al. 1993). The analysis of the super-secondary structural motifs in the model MAP1138c protein structure was performed using PROMOTIF (Hutchinson and Thornton 1996).
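
The Z-score used for absolute quality estimation follows the usual standard-score definition, z = (score - mean) / SD, where the mean and SD are taken over the reference set of high-resolution structures. The sketch below only illustrates this arithmetic; the function name z_score and the reference values in the example are hypothetical placeholders and are not the QMEAN reference statistics.

```python
def z_score(model_score, reference_mean, reference_sd):
    """Standard score of a model relative to a reference distribution.

    A value of 0 means the model scores like an average reference
    structure; the magnitude counts standard deviations from that mean.
    """
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (model_score - reference_mean) / reference_sd


if __name__ == "__main__":
    # Placeholder numbers chosen only to demonstrate the calculation.
    print(z_score(model_score=1.0, reference_mean=0.5, reference_sd=0.25))
```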

Ligand-binding domain (LBD) analysis

The prediction of ligand-binding sites/domains of MAP1138c protein structure was performed using Fpocket 1.0 (Guilloux et al. 2009) at http://fpocket.sourceforge.net/run_online.html.

Ligand preparation

The structure of the triacylated glycolipid (Ac1PIM2) was drawn using Marvin Sketch, and energy minimization of the ligand was carried out using the MMFF94 force field. Energy minimization was performed to help the docking program identify the bioactive conformer among the local minima.

Molecular docking

The PDB file of the theoretical 3D structure of MAP1138c protein generated by the SWISS-MODEL server was used for docking with AutoDock Vina 1.1.2 (Trott and Olson 2010). The Fpocket web server was employed to detect the binding pockets in the MAP1138c protein structure. The docking of MAP1138c (protein) and Ac1PIM2 (ligand) was run under rigid conditions, and the docking grid was set to enclose the ligand-binding domain (LBD) of MAP1138c protein. Exhaustiveness was set to 20, with all other parameters left at their defaults. The docking and simulation studies were performed on a system running the Windows® 7 operating system (OS) with 8 GB of RAM and an Intel® Core™ i7 processor. The binary complexes (Ac1PIM2-MAP1138c), ranked by their binding energy values, were examined for geometry and docking. After docking, the best binary complex model was selected on the basis of the lowest binding energy, whereas the most appropriate complex (Ac1PIM2-MAP1138c) conformation was chosen on the basis of hydrogen bond interactions between the protein and the ligand near the ligand-binding site. The lowest-energy poses indicate the maximum binding affinity, since high energy produces unstable conformations. For visualization, the docking poses generated by AutoDock Vina were loaded directly into PyMol (Seelier and Bert 2010). PyMol was then used to produce the images of the protein–ligand complex models.
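
A docking setup of this kind can be captured in an AutoDock Vina configuration file. The sketch below is an assumption-laden illustration, not the configuration used in this study: it derives a search box that encloses a set of pocket coordinates and writes the corresponding key = value lines with exhaustiveness set to 20. The helper names (box_from_points, vina_config), the file names, the padding and the coordinates are all hypothetical, and Vina expects receptor and ligand in PDBQT format, so a prior conversion step is assumed.

```python
def box_from_points(points, padding=5.0):
    """Return (center, size) of an axis-aligned box enclosing the points.

    points: iterable of (x, y, z) coordinates in angstroms.
    padding: extra margin in angstroms added on every side of the box.
    """
    xs, ys, zs = zip(*points)
    center = tuple((max(v) + min(v)) / 2 for v in (xs, ys, zs))
    size = tuple((max(v) - min(v)) + 2 * padding for v in (xs, ys, zs))
    return center, size


def vina_config(receptor, ligand, center, size, exhaustiveness=20,
                out="docked.pdbqt"):
    """Build the text of an AutoDock Vina configuration file."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center[0]:.3f}",
        f"center_y = {center[1]:.3f}",
        f"center_z = {center[2]:.3f}",
        f"size_x = {size[0]:.3f}",
        f"size_y = {size[1]:.3f}",
        f"size_z = {size[2]:.3f}",
        f"exhaustiveness = {exhaustiveness}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical pocket coordinates; real values would come from the
    # pocket detected in the modeled structure.
    pocket = [(10.0, 12.0, 8.0), (18.0, 20.0, 14.0), (14.0, 15.0, 11.0)]
    center, size = box_from_points(pocket, padding=5.0)
    print(vina_config("receptor.pdbqt", "ligand.pdbqt", center, size))
```

Saved to a file, such a configuration would typically be passed to the vina executable through its --config option.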

Results

Gapped-BLAST analysis

A gapped-BLAST search was performed with the MAP1138c protein sequence to obtain homologs that provide significant functional and structural motifs for the query protein. The gapped-BLAST analysis of the query sequence led to the identification of a possible template with a known X-ray structure (PDB ID: 3MH8A) having 70 % identity and 83 % positive matches (Supplemental Fig. 1). The template was found to be the crystal structure of LprG from the H37Rv strain of Mycobacterium tuberculosis.
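
To make the identity figure concrete, the sketch below shows one common way of computing percent identity from a pairwise alignment: the number of identical, non-gap columns divided by the total number of alignment columns. This is an illustration under that stated convention; the function name percent_identity and the toy alignments are hypothetical, and the percentage of positives would additionally require a substitution matrix, which is omitted here.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two aligned sequences of equal length.

    Gaps are written as '-'. The denominator is the full alignment
    length, so gap columns count against the identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignments, not the MAP1138c/Rv1411c alignment.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(percent_identity("AC-EF", "ACDEF"))
```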

Analysis of phosphorylation potential of orthologous proteins

The phosphorylation of serine and threonine residues plays a key role in the regulation of host-pathogen interactions and cell signaling. The potency of the phosphorylation reaction is based upon the phosphorylation potential. The phosphorylation potential of serine and threonine residues was calculated using the NetPhosBac 1.0 server. Amino acids with a phosphorylation potential greater than a threshold value are considered effective sites of phosphorylation (Miller et al. 2009). A gain of phosphorylation sites was observed for MAP1138c protein, in which eight additional potent phosphorylation sites (one threonine and seven serine residues) were found when compared to Rv1411c (LprG) protein. In contrast, Rv1411c (LprG) protein shows only four serine residues with a phosphorylation potential greater than the threshold value (0.5). The predicted phosphorylation sites and their positions are tabulated in Table 1.
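
The thresholding step can be sketched as a simple filter over per-residue scores. This is only an illustration of the rule that scores above 0.5 are reported as putative sites; the function name sites_above_threshold and the example scores are hypothetical and are not NetPhosBac output for either protein.

```python
def sites_above_threshold(predictions, threshold=0.5):
    """Keep serine/threonine positions whose score exceeds the threshold.

    predictions: iterable of (residue, position, score) tuples.
    Returns a list of (residue, position) pairs in input order.
    """
    return [
        (residue, position)
        for residue, position, score in predictions
        if residue in ("S", "T") and score > threshold
    ]


if __name__ == "__main__":
    # Made-up scores for demonstration only.
    demo = [("S", 12, 0.61), ("T", 40, 0.49), ("S", 77, 0.50), ("T", 90, 0.93)]
    print(sites_above_threshold(demo))
```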

Table 1 Putative phosphorylation site in the query proteins [MAP1138c and H37Rv1411c (LprG)]

Analysis of signal sequences of orthologous proteins

Identification of the signal sequence in a protein is important to categorize the protein as secretory or non-secretory (Petersen et al. 2011). In this study, MAP1138c and Rv1411c (LprG) proteins were used as query proteins to detect an appropriate signal sequence. The results of the SignalP 4.1 server for each of the orthologous proteins (Rv1411c (LprG) and MAP1138c) showed that these are preproteins and are secretory by nature owing to the presence of secretory signal sequences in the N-terminal region of the preprotein (Supplemental Fig. 2a, b). The length of the signal peptide for both Rv1411c (LprG) and MAP1138c proteins was found to be 30 amino acids. The presence of twin arginines in the N-terminal region of the signal peptide of both Rv1411c (LprG) and MAP1138c proteins relates to their translocation across the membrane via Tat translocases. It can also be stated that after translocation there is a possibility of cleavage of the preprotein by SPase I (Paetzel et al. 2002). The mature peptides of MAP1138c and Rv1411c (LprG) proteins were found to span residues 31–238 and 31–235, respectively. Thus, we conclude that MAP1138c and Rv1411c (LprG) proteins are preproteins and are secretory by nature.

Analysis of the globular and disorder domains in orthologous proteins

Globular and disordered domains in a protein play significant roles in protein function (Linding et al. 2003). Thus, a change in the profile of ordered and disordered regions in a protein may lead to a change in the functionality of the protein. MAP1138c and Rv1411c (LprG) proteins of MAP and H37Rv exhibit a similar profile of the globular domain. Figure 1a, b graphically represents the comparative GlobPlot profiles of both MAP1138c and Rv1411c (LprG) proteins. The GlobPlot profiles of both MAP1138c and Rv1411c (LprG) show an absence of low-complexity, transmembrane and coiled-coil regions, thereby highlighting their role as secretory globular proteins.

Fig. 1

a–b Disorder propensity of a MAP1138c and b H37Rv1411c (LprG) predicted by GlobPlot to detect the low complexity region (yellow), disorder region (blue), globular domain (displayed in green) and transmembrane region (striped column-white)

Analysis of the hydropathy plot of orthologous proteins

Figure 2a, b shows the result of the hydropathy analysis of both MAP1138c and Rv1411c (LprG) proteins using DNASTAR software (Madison, WI, USA), which reveals regions of high antigenic index (potential antigenic determinants). In silico analysis of MAP1138c protein revealed the presence of high antigenic index regions (above zero-reticle) at residues 31–60, 78–85, 90–95, 100–106, 120–134, 140–146, 155–180, 200–207, 210–220 and 230–238. Similarly, for Rv1411c (LprG) protein, amino acid stretches with high antigenic index were observed at residues 31–60, 78–92, 98–105, 118–134, 138–144, 152–175, 198–206, 210–218 and 222–236. The comparative antigenic profiling of MAP1138c and Rv1411c (LprG) proteins was promising, as the antigenic contours of both proteins were nearly similar all along the sequence. Moreover, the flexible regions, surface probability and hydrophilicity plots of both proteins were also identical.

Fig. 2

a–b A comparative graphical representation of Hydropathy plot of a MAP1138c and b H37Rv1411c (LprG) proteins

ProtScan domain and functional analysis of orthologous proteins

ProtScan domain analysis showed that both MAP1138c and Rv1411c (LprG) proteins share a common domain, DUF1396, spanning amino acids 40–231. This domain comprises various lipoproteins that are known as a group as the LppX/LprAFG family from Mycobacterium species. The members of this family are involved in virulence and in the localization of complex lipids to the outer membrane of MTB (Bigi et al. 1997). According to the MEMSAT3 tool (Jones et al. 1994), both proteins share a similar profile of a transmembrane segment at their N-terminal region. Figure 3a, b shows the presence of a transmembrane segment at the N-terminal region of both MAP1138c and Rv1411c (LprG) proteins. The transmembrane segment of MAP1138c protein consists of an outside loop (1–9 a.a), inside helix cap (10–13 a.a), central transmembrane helix segment (14–22 a.a) and outside helix cap (23–26 a.a). Similarly, the transmembrane segment of Rv1411c (LprG) protein consists of an outside loop (1–10 a.a), inside helix cap (11–14 a.a), central transmembrane helix segment (15–23 a.a) and outside helix cap (24–27 a.a). The N-terminal region of MAP1138c protein corresponds to the signal sequence as predicted by SignalP 4.1. Therefore, the presence of a central transmembrane helix at the C-terminal region of the signal sequence of the protein is important for anchoring the protein to the outer membrane of the bacteria during secretion across the membrane.

Fig. 3

a–b Graphical representation of the ProProt Scan analysis of a MAP1138c and b Rv1411c (LprG) proteins

Template search and homology modeling

A theoretical model of MAP1138c protein was generated using the automated protein-modelling SWISS-MODEL server. The coordinates of the crystal structure of Rv1411c (LprG) protein from MTB (PDB ID: 3MH8A) were used as a template. Figure 4 shows the resultant 3D model of MAP1138c protein with its secondary structure conformations. The superimposition of the crystal structure of 3MH8A on the resultant 3D model of MAP1138c protein depicts secondary structural homology between the predicted 3D model of MAP1138c protein and the crystal structure of Rv1411c (LprG) (Supplemental Fig. 3). Pairwise alignment of MAP1138c (query) and Rv1411c (target) protein sequences displayed 70 % sequence identity and 83 % sequence similarity (Supplemental Fig. 1).

Fig. 4

Theoretical three-dimensional structure of MAP1138c protein prepared by SWISS-MODEL server showing different secondary conformations. The helix is represented in red, the strand in purple and loops in gray

Model quality evaluation

The goodness and reliability of the model were evaluated by measuring the QMEAN score (Qualitative Model Energy Analysis). Furthermore, a Z-score (standard score) relative to the scores attained for high-resolution experimental structures of similar size is specified. The absolute quality of the model is measured with the help of the Z-score. The Z-scores of the individual components of QMEAN and the QMEAN Z-score of the predicted model are also illustrated (Fig. 5a, b). The global QMEAN score of 0.9 of the predicted model reflects the reliability of the model (Fig. 5a).

Fig. 5

a Graphical representation of the Z-Score of the individual component of QMEAN for the 3D model of MAP1138c protein. b Normalized QMEAN score of theoretical 3D structure for MAP1138c protein model created with SWISS-MODEL server

The QMEAN Z-score for the predicted structure of MAP1138c protein is 1.33, as the scores of all components except the torsion energy are close to the ideal value. The overall QMEAN score deviates by approximately one standard deviation from the ideal value of zero (Fig. 5b). This deviation is considered fair, and the structure can be accepted for further studies. The local error of each residue of the modeled protein was estimated and represented using a color scale ranging from red (unreliable region) to blue (reliable region) (Supplemental Fig. 4). The generated model can be regarded as reliable, since the blue region covers a significant portion of the protein sequence and no part falls in the potentially unreliable (red) region.

Model structure assessment

As expected, the predicted structure of MAP1138c protein shares secondary structural features with Rv1411c (LprG) protein. The PROMOTIF analysis shows that the MAP1138c protein consists of a single domain with an alpha/beta fold along with a β-sheet. This region consists of eleven anti-parallel strands on one side and six helices (three α-helices and three 3–10 helices) on the opposite side. Additionally, nine hairpins, seventeen β-turns, one γ-turn and five bulges were identified in the PROMOTIF analysis of the model protein. Between the β-sheet and the α-helices lies a large cavity. The entrance to the cavity lies near the beta strands (β 3 to β 6) of the modeled MAP1138c protein. The lower part of the molecule has a narrow cavity between a single α/β fold and β 10 and β 11. A similar profile of secondary, super-secondary and cavity structure was demonstrated for Rv1411c (LprG) protein in the crystal structure of LprG lipoprotein by Drage et al. (2010). The amino acids lining the central cavity and portal lie in the hydrophobic stretches of MAP1138c (LprG) protein, as predicted and represented by the hydrophobicity surface plot (Supplemental Fig. 5). The percentages of strand, alpha helix and 3–10 helices in the predicted model were 13.7, 47.2 and 5.1, respectively. This result confirms the globular nature of MAP1138c protein owing to its higher strand percentage.

The geometry of the predicted model was evaluated using PROCHECK. The Ramachandran plot of the generated model structure, as evaluated by PROCHECK, showed overall good quality. For the MAP1138c protein structure, 85.8 % of residues fell in the core regions, 12.4 % in the generally allowed regions, 0.6 % in the additionally tolerable region and 1.2 % in the forbidden regions (only 9 residues labeled out of 195) of the Ramachandran plot (Fig. 6).

Fig. 6

Ramachandran plot assessment of the predicted model of MAP1138c protein structure using PROCHECK Server

The stereochemical parameters, such as the chi1-chi2, main-chain and side-chain parameters, were evaluated and found fit for further investigation. Correspondingly, 99.8, 98.8 and 89.5 % of bond lengths (B.L), bond angles (B.A) and planar groups, respectively, were within limits, whereas 0.2, 1.2 and 10.5 % of B.L, B.A and planar groups, respectively, were highlighted as above limits. The summary of the PROCHECK analysis by the SWISS-MODEL server for the theoretical 3D model of MAP1138c protein is pictorially represented (Supplemental Fig. 6).

Analysis of ligand binding sites

Fpocket software identified hydrophobic binding pockets, represented by green dots, in the cavity between the beta sheet and the helical region of MAP1138c protein (Supplemental Fig. 7). Table 2 lists the pockets identified by fpocket in MAP1138c protein, ranked according to the score obtained.

Table 2 List of pockets identified by fpocket in MAP1138c protein, ranked according to the score obtained

Protein–ligand docking studies

Figure 7 displays the docked state of the Ac1PIM2-MAP1138c complex with the minimum binding energy (−9.08e+006 kcal/mol). The ligand molecule Ac1PIM2 (triacylated glycolipid) is anchored within the hydrophobic pocket, the cavity lined by hydrophobic amino acid residues between the alpha helices and the beta strands of the MAP1138c protein structure. The docked structure of the binary complex (Ac1PIM2-MAP1138c) is stabilized by two intermolecular hydrogen bonds (Supplemental Fig. 8). The first is formed between the oxygen (O6) of the mannose attached to the 6-OH position of inositol in Ac1PIM2 and the NH of the VAL58 residue of MAP1138c protein, at a distance of 3.214 Å. The second is formed between the oxygen (O1) of the mannose attached to the 6-OH of inositol in PIM2 and the NH of VAL133 of MAP1138c protein, at a distance of 3.322 Å. Four distinct conformational clusters were found using an RMSD (root mean square deviation) tolerance of 2.0 Å. Table 3 shows the clustering histogram with the lowest and mean binding energy of the binary complex (MAP1138c-Ac1PIM2) and the number of conformations in each cluster with their RMSD values.
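
The two geometric quantities used above, interatomic distance and RMSD-based clustering of poses, can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of the docking software: poses are assumed to be given as equally ordered lists of atomic coordinates in the fixed receptor frame (so no superposition is applied) and sorted by increasing energy, and each pose joins the first cluster whose lowest-energy member lies within the tolerance. All function names and coordinates are hypothetical.

```python
import math


def atom_distance(a, b):
    """Euclidean distance between two (x, y, z) points in angstroms."""
    return math.dist(a, b)


def rmsd(pose_a, pose_b):
    """Root mean square deviation between two poses with matched atoms."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))


def cluster_by_rmsd(poses, tolerance=2.0):
    """Greedy clustering of energy-sorted poses.

    Returns a list of clusters, each a list of pose indices; the first
    index in every cluster is its lowest-energy representative.
    """
    clusters = []
    for index, pose in enumerate(poses):
        for cluster in clusters:
            if rmsd(poses[cluster[0]], pose) <= tolerance:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


if __name__ == "__main__":
    # Three made-up two-atom poses, assumed sorted by energy.
    poses = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
        [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)],
    ]
    print(cluster_by_rmsd(poses, tolerance=2.0))
    print(atom_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
```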

Fig. 7

Pictorial representation of the binary complex of Ac1PIM2 in the hydrophobic cavity (green color) of MAP1138c protein

Table 3 Clustering histogram of the conformations obtained during the docking studies of Ac1PIM2 and MAP1138c protein

Discussion

We carried out a comprehensive comparative proteomic, structural and functional analysis of MAP1138c and Rv1411c (LprG) proteins. From our gapped-BLAST analysis, it appears that MAP1138c protein shares 70 % sequence identity with the crystal structure of Rv1411c (LprG) lipoprotein (Supplemental Fig. 1). A comparative analysis of the ProtParam parameters, namely GRAVY, instability index and aliphatic index, by Hassan et al. (2014) shows that both MAP1138c and Rv1411c (LprG) proteins share similar physicochemical properties. Both MAP1138c and Rv1411c (LprG) proteins share a common profile of ordered and disordered regions as predicted by the GlobPlot server. These comparisons show that the high percentage of sequence identity between the orthologous proteins (MAP1138c and Rv1411c) leads to similarities in their physicochemical and structural properties.

In addition, as evident from the comparative NetPhosBac analysis, MAP1138c protein exhibits more phosphorylation sites than Rv1411c (LprG) protein. As specified before, phosphorylation plays a significant role in the signal transduction process (Paetzel et al. 2002). Therefore, it can be inferred that MAP1138c protein could also interact actively with other proteins, leading to signal transduction and cell signaling, owing to the presence of potential phosphorylation sites in the protein sequence. We also predicted the presence of a signal peptide processing sequence (MLGMQTRRRLSAVFASLTLATALIAGCSSG) with twin arginines in the N-terminal region of the signal peptide of MAP1138c protein. The presence of twin arginines suggests that this protein is a preprotein and that the mature protein is secreted via the Tat pathway and cleaved by SPase I peptidases. The presence of a secretory signal sequence could indeed be essential for the virulence and pathogenesis of MAP. In our SignalP 4.1 prediction studies, we also found that Rv1411c (LprG) protein has a secretory property via the Tat pathway. We can, therefore, hypothesize that MAP1138c is a preprotein and that the mature protein, upon cleavage by specific peptidases, is secreted into the host, as predicted by SignalP 4.1.
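
The signal peptide quoted above can be inspected with a few lines of code. The sketch below is a simple illustration: it locates consecutive arginines in the 30-residue signal peptide and derives the residue range of the mature protein from the signal length. It checks only for adjacent arginines and does not test for a full Tat consensus motif; the function names twin_arginine_positions and mature_range are hypothetical.

```python
# Signal peptide of MAP1138c as quoted in the text.
SIGNAL_PEPTIDE = "MLGMQTRRRLSAVFASLTLATALIAGCSSG"


def twin_arginine_positions(sequence):
    """Return 1-based start positions of every 'RR' pair in the sequence."""
    return [
        i + 1
        for i in range(len(sequence) - 1)
        if sequence[i:i + 2] == "RR"
    ]


def mature_range(preprotein_length, signal_length):
    """Return the 1-based (start, end) residues of the mature protein."""
    if not 0 <= signal_length < preprotein_length:
        raise ValueError("signal must be shorter than the preprotein")
    return signal_length + 1, preprotein_length


if __name__ == "__main__":
    print(len(SIGNAL_PEPTIDE))
    print(twin_arginine_positions(SIGNAL_PEPTIDE))
    # 238 residues is the preprotein length implied by the reported
    # mature range of 31-238 for MAP1138c.
    print(mature_range(238, len(SIGNAL_PEPTIDE)))
```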

The comparative antigenic profile of MAP1138c and Rv1411c (LprG) proteins is promising, since the antigenic contours, flexible regions, surface probability and hydrophilicity of both proteins were nearly similar all along the sequence. The profile created by the antigenic index of DNASTAR (Madison, WI, USA) software for MAP1138c and Rv1411c (LprG) proteins can be used to map the antigenic domains of these proteins and to assess the antigenic potential of these peptides in eliciting an innate and differential immune response.

The ProtScan domain analysis for both MAP1138c and Rv1411c (LprG) proteins shows that they share a common domain, DUF1396, spanning amino acids 40–231. DUF1396 consists of several lipoproteins belonging to the LppX/LprAFG family from Mycobacterium species. The best known of these is LprG from MTB (Bigi et al. 1997). Functional characterization of LprG revealed that it halted MHC-II antigen processing, a mechanism evolved by MTB to avoid the host MHC-II-restricted CD4+ T cell response within host macrophages (Gehring et al. 2004). LppX is a secreted antigen, which might be a potent target for the design of a vaccine against MTB (Al-Attiyah and Mustafa 2004), while LprF is a membrane-associated lipoprotein responsible for signal transduction leading to a reaction to osmotic stress (Steyn et al. 2003). In this context, we can assume that MAP1138c is a lipoprotein and possesses a signal sequence of 30 amino acids at the N-terminal region. The signal sequence directs the preprotein to the cell membrane for secretion via the Tat pathway, after which the protein is cleaved and secreted into the host environment. Therefore, we hypothesize that the MAP1138c protein may play a role in the pathogenesis of paratuberculosis infection by inhibiting MHC-II mediated antigen processing within the host macrophages, thus evading the host MHC-II-restricted CD4+ T cell response, which leads to virulence of M. paratuberculosis infection in ruminants.

Based on the high pairwise sequence identity between the query and target proteins, a template structure (protein from Mycobacterium tuberculosis; PDB ID: 3MH8A) was selected. A theoretical model of the MAP1138c protein (query protein) was generated using the alignment protein-modelling mode of the SWISS-MODEL server. Once the model structure of MAP1138c protein was built, the quality assessment tools of the SWISS-MODEL server were used to estimate the reliability of the model. Model quality estimation is essential in homology modeling. Hence, the mean force potential and empirical force-field methods provided by the SWISS-MODEL server were used to validate the quality of the protein model structure.

QMEAN provides scoring functions to estimate the quality of a protein structure model (Benkert et al. 2008, 2009a, b). The QMEAN score reflects the reliability of the whole model, and its value ranges from zero (unreliable) to one (reliable). The QMEAN score for the MAP1138c protein model was found to be 0.92, which means that the predicted 3D model of the protein is reliable. The QMEAN Z-score evaluates the absolute quality of the query model, and the score is standardized to a mean of 0 and a standard deviation (SD) of 1. The Z-score also indicates by how many standard deviations the model structure deviates from the values expected for reference structures. QMEAN Z-scores are calculated for all four structural descriptors (all-atom pairwise, C β interaction, solvation and torsion angle energy) that are part of the QMEAN score. Investigating the Z-scores of the individual structural descriptors can help in detecting the structural features accountable for an observed strongly negative (red region) QMEAN Z-score. Models of good quality are expected to have the Z-score values of the individual structural descriptors in the favorable area (light red to blue) (Benkert et al. 2011). The QMEAN Z-score for MAP1138c protein was observed to be 1.33, i.e., <2. The result signifies that the predicted 3D structure of MAP1138c protein deviates by approximately one standard deviation from the values expected for experimental X-ray structures. This variation is acceptable and allows us to use the predicted structure for further structural and functional analysis of MAP1138c protein. The predicted 3D structure of MAP1138c protein falls in the “Good Structure” category, since the individual scores of the pseudo energies and the QMEAN have values in the light red to blue region.

Similarly, the predicted three-dimensional structure of MAP1138c protein was evaluated for per-residue error in the protein chain. Residue error was estimated and visualized using a color gradient extending from blue (reliable regions) to red (possibly unreliable regions). The local error estimate per residue for MAP1138c protein lies in the more reliable regions (blue color) (Supplemental Fig. 4). The three-dimensional structure of MAP1138c protein shares a similar profile of secondary structures with Rv1411c (LprG) protein, as demonstrated using the PROMOTIF program. Both the MAP1138c protein structure and the crystal structure of Rv1411c (LprG) protein share a common profile of hydrophobic amino acids lining the central cavity (ligand-binding domain). The presence of hydrophobic amino acids in the ligand-binding domain of the MAP1138c protein structure suggests an analogous mode of protein–ligand (triacylated glycolipid) interaction in both proteins. The Ramachandran plot from the PROCHECK analysis was used to evaluate the backbone geometry of the MAP1138c protein model. The Ramachandran plot showed overall good quality, since 98.8 % of the residues of the predicted 3D structure of MAP1138c protein lie in favorable regions of the plot, while only 1.2 % of the total residues lie in the disallowed regions. In the same way, the stereochemical properties obtained and evaluated for the MAP1138c protein model by PROCHECK further show that the theoretical structure generated by the SWISS-MODEL server is worth investigating. The hydrophobic ligand-binding site predicted by Fpocket in the MAP1138c protein structure agreed well with the results of the docking procedure. The docking and simulation studies revealed that the binary complex of MAP1138c protein and the triacylated glycolipid was stabilized by two intermolecular hydrogen bonds.
The successful docking procedure showed that Ac1PIM2 (triacylated glycolipid) could bind to the hydrophobic binding pocket of MAP1138c protein of MAP. The binding of the TLR-2 agonist (Ac1PIM2) in the pocket of MAP1138c protein resembles the crystal structure of the complex of Rv1411c (LprG) lipoprotein and triacylated glycolipid (Ac1PIM2) demonstrated by Drage et al. (2010). This ligand–protein interaction study indicates that the MAP1138c protein may contribute to cell wall assembly and facilitate recognition of triacylated glycolipids by TLR-2, leading to TLR-2 mediated immune evasion within host macrophages. The current study also provides an opportunity to design drugs targeting the TLR-2 agonist activity of the binary complex (Ac1PIM2-MAP1138c), which apparently contributes to immune evasion and virulence. Additional functional and biochemical experimental studies are necessary to establish the role of MAP1138c protein in the evasion of the cell-mediated response within host macrophages. Such studies will improve our current knowledge of the mechanisms of pathogenesis and latency of MAP and will also contribute to developing new therapeutics for Johne’s disease.