1 Introduction

Genital tuberculosis (GTB) in women is an important clinical problem often associated with infertility (Bose 2011; Dam et al. 2006). An estimated 5 % of women seeking infertility treatment worldwide have GTB (Malik 2003). The incidence of GTB in developing societies such as India is reported to be as high as 18–19 % (Malik 2003; Das et al. 2008). It is well recognized that the active form of GTB causes irreversible tubal and endometrial damage (Jindal et al. 2010) and that even the dormant (sub-clinical/latent) form is associated with repeated implantation failure (Bose 2011; Subramani et al. 2016b).

Diagnosis of GTB in the sub-clinical stage is necessary to prevent or minimize damage to the female pelvic organs (Jindal et al. 2012). Conventional diagnosis of GTB includes microscopy and culture, for which the low detection rate remains a matter of concern (Rozati et al. 2006; Jindal 2006). Diagnosis of dormant GTB is a bigger challenge owing to the paucibacillary nature of the bacilli (Thangappah et al. 2011). Polymerase chain reaction (PCR) has shown promising results for the diagnosis of dormant GTB with a considerable increase in sensitivity and specificity (Subramani et al. 2016b; Thangappah et al. 2011; Kulshrestha et al. 2011). However, false-positive and false-negative results are important limitations of this technique (Shrivastava et al. 2014). Therefore, there is a demand for the identification of reproducible diagnostic markers of dormant GTB, which could help identify patients early and thereby assist clinicians in improved disease management.

Over the past decade, metabonomics has been extensively used in the development of diagnostic metabolic markers of diseases (Fernie et al. 2004; Bujak et al. 2014; Zhao et al. 2014). Proton nuclear magnetic resonance spectroscopy (1H NMR) is one of the established metabolomics strategies for the quantitative profiling of low molecular weight endogenous metabolites (Wevers et al. 1994; De Graaf and Behar 2003; Kostara et al. 2014). The combined use of 1H NMR with multivariate analysis offers the potential for identification of reproducible metabolite markers that are altered under pathological conditions. Various groups have used NMR based metabolomics to study tissue and biofluids of animal TB models (Shin et al. 2011; Somashekar et al. 2011, 2012) and lung TB patients (Weiner et al. 2012; Zhou et al. 2013; Frediani et al. 2014). These studies have reported significant changes in the host metabolism during infection.

Recently, we used proton NMR metabonomics to study the metabolic influence of tuberculosis on the endometrium of women with dormant GTB (Subramani et al. 2016a). We found that metabolites related to energy metabolism and amino acid biosynthesis were altered in dormant GTB women during the implantation window, implicating their association with implantation failure (Subramani et al. 2016a). We hypothesize that clinically relevant metabolites which can effectively discriminate between women with and without dormant GTB may be found in systemic circulation. The present study aims to assess metabolic perturbations in the serum of dormant GTB women and to identify disease-related metabolite markers using 1H NMR metabonomics for diagnosis of the disease at a subclinical stage.

2 Materials and methods

2.1 Subject selection

The human ethics committees of the Institute of Reproductive Medicine (IRM), Kolkata and Indian Institute of Technology Kharagpur, India approved this study. Written informed consent was obtained from all couples who volunteered to participate in this study. The detailed inclusion and exclusion criteria are discussed elsewhere (Subramani et al. 2016a, b). Briefly, unexplained infertile women with repeated in vitro fertilization (IVF) failure (>3 embryo transfers) were included in this study. Endometrial tissue samples obtained by dilation and curettage (D&C) from these women were screened for confirmation of GTB by PCR, BACTEC-460 culture, hematoxylin and eosin (H&E) and Ziehl–Neelsen (ZN) staining. Women testing positive for GTB were included as the study group (n = 26) and those testing negative for GTB were considered as controls (n = 26). Of the 26 endometrial GTB-positive cases, PCR and BACTEC-460 culture positive cases were observed to be 19 and 7, respectively. All the culture positive cases were also PCR positive. All samples were tested in duplicate using single tube nested PCR to minimize the effect of false positivity. None of the samples were found to be positive with H&E and ZN staining. Proven fertile women without any pathology undergoing voluntary sterilization (n = 25) and women with recurrent spontaneous miscarriage (RSM) due to poor endometrial receptivity (n = 27) were included for comparison purposes. Women with any other gynaecological disorder or bacterial or viral infection, or who had received medication in the past 3 months, were excluded.

Women were monitored for ovulation confirmation by daily ovarian ultrasonography and urinary luteinizing hormone assay (Banerjee et al. 2013). On days 7–10 post ovulation, blood samples were collected from both study and control groups after a fast of at least 10 h. Serum was collected by centrifuging the blood at 1500×g for 10 min and stored at −80 °C within 2–3 h of sample collection until analysis. Serum samples of both groups were collected and prepared following the same standard operating procedures. Radioimmunoassay was used to measure the levels of serum estradiol and progesterone during the implantation window.

3 1H NMR analysis

One-dimensional NMR spectra of all serum samples were acquired using the Carr-Purcell-Meiboom-Gill (CPMG) pulse sequence. The experimental procedure is described elsewhere (Dutta et al. 2012; Banerjee et al. 2014). Briefly, 1 mM sodium salt of trimethylsilyl propionic acid (TSP) in 400 µl of D2O was added to 200 µl of thawed serum sample. The mixture was centrifuged and transferred into 5 mm NMR tubes. The instrument and parameters were as follows: 700 MHz Bruker Avance AV III spectrometer; number of scans: 256 transients; spectral width: 14,005.6 Hz; data points: 16 K; relaxation delay: 4.0 s; acquisition time: 0.58 s.
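The reported acquisition parameters can be cross-checked with a short calculation. The sketch below assumes that "16 K" means 16,384 total (real plus imaginary) points and that the acquisition time follows the common convention AQ = TD / (2 × SW). Both are assumptions for illustration, not statements from the original methods.

```python
# Acquisition parameters as reported in the text
spectrometer_mhz = 700.0       # 1H observation frequency (MHz)
spectral_width_hz = 14005.6    # spectral width (Hz)
data_points = 16 * 1024        # "16 K", assumed to be 16384 total points (TD)

# Spectral width in ppm: width in Hz divided by the spectrometer frequency in MHz
spectral_width_ppm = spectral_width_hz / spectrometer_mhz

# Acquisition time under the assumed convention AQ = TD / (2 * SW)
acquisition_time_s = data_points / (2.0 * spectral_width_hz)

print(f"spectral width: {spectral_width_ppm:.2f} ppm")
print(f"acquisition time: {acquisition_time_s:.2f} s")
```

Under these assumptions the computed acquisition time agrees with the reported 0.58 s, and the spectral width corresponds to roughly 20 ppm at 700 MHz.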

3.1 Spectra pre-processing

All spectra were subjected to manual phase and baseline correction, and referenced to TSP (δ = 0.0 ppm) in MestReNova version 7.1.0 (Mestrelab Research, Santiago de Compostela, Spain). Multivariate analysis was applied to the spectral region of δ 0.5–4.5 ppm (excluding alcohol signals: δ 3.60–3.69 ppm and δ 1.12–1.22 ppm, and the residual water signal: δ 4.5–5.10 ppm). Due to the poor signal to noise ratio in the aromatic region (δ 5.10–9.0 ppm), univariate analysis (i.e. Mann–Whitney U test) was applied to this region. To minimize chemical shift variations, recursive segment-wise peak alignment (RSPA) was applied using the R/Bioconductor package mQTL.NMR (Hedjazi et al. 2015). Following constant sum normalization, equal weight was given to all variables by applying unit variance scaling (SIMCA 13.0.2, Umetrics, Sweden).
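The region selection, constant sum normalization and unit variance scaling described above can be illustrated with a minimal sketch. It is not the MestReNova/SIMCA workflow used in the study: the function names, the toy intensities and the choice to exclude regions before normalizing are all assumptions made for illustration.

```python
import math

# Regions excluded from multivariate analysis, as (low, high) bounds in ppm (from the text)
EXCLUDED_PPM = [(1.12, 1.22), (3.60, 3.69), (4.5, 5.10)]


def keep_point(ppm, low=0.5, high=4.5):
    """True if a chemical shift lies in the analysed window and outside excluded regions."""
    if not (low <= ppm <= high):
        return False
    return not any(a <= ppm <= b for a, b in EXCLUDED_PPM)


def constant_sum_normalize(spectrum, total=1.0):
    """Rescale one spectrum so that its intensities sum to `total`."""
    s = sum(spectrum)
    return [total * v / s for v in spectrum]


def unit_variance_scale(matrix):
    """Mean-centre each variable (column) and divide by its sample standard deviation."""
    n, n_vars = len(matrix), len(matrix[0])
    scaled = [[0.0] * n_vars for _ in range(n)]
    for j in range(n_vars):
        col = [row[j] for row in matrix]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        for i in range(n):
            scaled[i][j] = (col[i] - mean) / sd if sd > 0 else 0.0
    return scaled


# Made-up toy data: 3 spectra sampled at 5 chemical shifts (ppm)
ppm_axis = [0.9, 1.15, 2.0, 3.65, 4.0]
spectra = [
    [2.0, 9.0, 1.0, 7.0, 1.0],
    [4.0, 8.0, 4.0, 6.0, 2.0],
    [3.0, 7.0, 6.0, 5.0, 6.0],
]

kept = [j for j, ppm in enumerate(ppm_axis) if keep_point(ppm)]
normalized = [constant_sum_normalize([row[j] for j in kept]) for row in spectra]
scaled = unit_variance_scale(normalized)
print("kept column indices:", kept)
```

After unit variance scaling every retained variable has zero mean and unit sample variance, so intense and weak signals contribute equally to the multivariate model.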

3.2 Multivariate analysis and validation of statistical models

Principal component analysis (PCA) was performed to determine the distribution of samples and to detect outliers. Partial least squares discriminant analysis (PLS-DA) and orthogonal partial least squares discriminant analysis (OPLS-DA), the supervised classification models, were generated using SIMCA 13.0.2 (Umetrics, Sweden). Goodness of fit (R2), goodness of prediction (Q2), permutation test statistics (n = 200) and the analysis of variance testing of cross validated predictive residuals (CV-ANOVA) score were used to assess the robustness and validity of the OPLS-DA model. An S-line plot was then generated to extract the statistically significant metabolites based on their loading and modulus of correlation coefficient values. The statistically significant variables extracted were subjected to spectral integration (MestReNova version 7.1.0, Mestrelab Research, Santiago de Compostela, Spain). The robustness and predictive performance of the constructed models in discriminating groups were also validated by receiver operating characteristic (ROC) curve analysis.
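The logic of validating a model by cross-validated Q2 and a label-permutation test can be sketched as follows. This is an illustration only: a one-variable least squares fit stands in for the OPLS-DA model built in SIMCA, the data are invented, and the function names and fold assignment are hypothetical choices.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def q2_cross_validated(x, y, folds=7):
    """Q2 = 1 - PRESS / TSS, with PRESS accumulated over held-out folds."""
    n = len(x)
    press = 0.0
    for k in range(folds):
        test_idx = set(range(k, n, folds))
        train = [i for i in range(n) if i not in test_idx]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        press += sum((y[i] - (a + b * x[i])) ** 2 for i in test_idx)
    my = sum(y) / n
    tss = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / tss


def permutation_test(x, y, n_perm=200, seed=0):
    """Compare the observed Q2 with Q2 values obtained after shuffling class labels."""
    rng = random.Random(seed)
    observed = q2_cross_validated(x, y)
    permuted = []
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        permuted.append(q2_cross_validated(x, shuffled))
    p_value = (1 + sum(q >= observed for q in permuted)) / (n_perm + 1)
    return observed, permuted, p_value


# Invented intensities of one variable: 10 controls (label 0) and 10 cases (label 1)
controls = [0.8, 0.9, 1.0, 1.1, 1.2, 0.85, 0.95, 1.05, 1.15, 1.0]
cases = [v + 1.0 for v in controls]
x = controls + cases
y = [0.0] * len(controls) + [1.0] * len(cases)

observed_q2, permuted_q2, p_value = permutation_test(x, y)
print(f"observed Q2 = {observed_q2:.3f}, permutation p = {p_value:.4f}")
```

A model is considered valid in this sense when the Q2 of the true labelling clearly exceeds the Q2 values obtained from the permuted labellings.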

3.3 Univariate statistical analysis

Prior to univariate statistical analysis, the integral values of the selected signals were normalized with constant sum normalization. The statistical significance of differences in the mean integral values of the corresponding metabolites between the groups was assessed using Student’s t test (GraphPad Prism version 5.00 for Windows, GraphPad Software, San Diego, CA, USA). The Mann–Whitney U test was applied to integral values of the aromatic region (GraphPad Prism version 5.00 for Windows, GraphPad Software, San Diego, CA, USA). The level of statistical significance was 5 %. ROC curve analysis was applied to all the significantly altered metabolites, and sensitivity, specificity, and accuracy were calculated (MedCalc for Windows, version 15.0, MedCalc Software, Ostend, Belgium). Multiple Pearson’s correlation analysis was performed using R statistical packages version 3.2.2 (R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org/). Compound networks were generated and visualized using the Cytoscape based Metscape plugin (http://www.metscape.ncibi.org/tryplugin.html; Karnovsky et al. 2012). Markov clustering based module analysis was performed to analyze the complex biological network (Zhu et al. 2015).
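The quantities used in the univariate and ROC analyses are closely related: the area under the ROC curve equals the Mann–Whitney U statistic divided by the number of case-control pairs. The sketch below illustrates this with invented integral values and a hypothetical cutoff. It does not reproduce the GraphPad Prism or MedCalc calculations and does not compute p values.

```python
def mann_whitney_u(cases, controls):
    """U statistic for cases: pairs with case > control count 1, ties count 0.5."""
    u = 0.0
    for a in cases:
        for b in controls:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def auroc(cases, controls):
    """Area under the ROC curve, computed as U / (n_cases * n_controls)."""
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


def sensitivity_specificity(cases, controls, cutoff):
    """Classify a sample as positive when its value is >= cutoff."""
    true_pos = sum(v >= cutoff for v in cases)
    true_neg = sum(v < cutoff for v in controls)
    return true_pos / len(cases), true_neg / len(controls)


# Invented normalized integrals of one metabolite (not study data)
case_values = [1.8, 2.1, 2.4, 1.5, 2.0]
control_values = [1.0, 1.6, 1.2, 1.4, 0.9]

u_stat = mann_whitney_u(case_values, control_values)
auc = auroc(case_values, control_values)
sens, spec = sensitivity_specificity(case_values, control_values, cutoff=1.55)
print(f"U = {u_stat}, AUROC = {auc:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy example 24 of the 25 case-control pairs are correctly ordered, so the AUROC is 0.96, and the chosen cutoff yields a sensitivity and specificity of 0.80 each.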

4 Results

Clinical characteristics of patients including age, body mass index (BMI), and serum estrogen and progesterone levels were found to be comparable and are summarized in Supplementary Table 1. T2-edited CPMG NMR spectra helped in the identification of several small molecule metabolites in the serum of women with dormant GTB and controls. NMR peaks were assigned to the corresponding metabolites based on the literature (Cao et al. 2012; Bertini et al. 2012; Wishart et al. 2009) and the Human Metabolome Database (HMDB). A representative CPMG-NMR spectrum of serum from women with dormant GTB is shown in Supplementary Figure 1.

PCA, an unsupervised statistical approach, was applied to the unit variance scaled NMR spectra of both groups for dimensionality reduction of the data. The PCA score scatter plot of principal component 1 (PC1) vs. PC2 showed clustering of dormant GTB women and controls; however, noticeable overlap was observed (Supplementary Figure 2A; R2X = 0.889, Q2 = 0.767). To improve separation between the groups, the supervised classification models PLS-DA and OPLS-DA were generated. The maximum separation between the groups was achieved in PLS-DA models (Supplementary Figure 2B; R2X = 0.607, R2Y = 0.736, and Q2 = 0.612).

Validation of the PLS-DA model with permutation test statistics demonstrated that the original model has better predictive ability than the 200 permuted models. This was evidenced by all permuted R2 and Q2 values being lower than the original values, with the regression lines of Q2 and R2 intercepting at −0.394 and 0.156, respectively (Supplementary Figure 2C). The OPLS-DA model showed optimized classification of the dormant GTB and control groups (Fig. 1a). The OPLS-DA model also differentiated the dormant GTB group from proven fertile women and women with RSM (Supplementary Figure 4A–C). We found the OPLS-DA models to fit the training data set well, with high R2 and Q2 values [dormant GTB vs controls (R2X = 0.496, R2Y = 0.905 and Q2 = 0.632); dormant GTB vs proven fertile women (R2X = 0.472, R2Y = 0.937 and Q2 = 0.732); RSM vs proven fertile women (R2X = 0.507, R2Y = 0.870 and Q2 = 0.506) and dormant GTB vs RSM women (R2X = 0.595, R2Y = 0.975 and Q2 = 0.912)], indicating that the models could predict the classes better than chance. Significant variables contributing towards class separation of disease and control groups were identified based on the S-plot (r > 0.60; Fig. 1b) and VIP scores >1 (Supplementary Figure 3). S-plots (Supplementary Figure 4A–C) and VIP scores (Supplementary Figure 5A–C) were also obtained from the OPLS-DA models of dormant GTB vs proven fertile women, RSM vs proven fertile women, and dormant GTB vs RSM cases. Cross-validation of the OPLS-DA models with a CV-ANOVA score showed highly significant differences between the groups [dormant GTB vs controls (p = 2.8 × 10⁻⁷); dormant GTB vs proven fertile women (p = 1.09 × 10⁻⁴); RSM vs proven fertile women (p = 3.17 × 10⁻²¹) and dormant GTB vs RSM cases (p = 1.87 × 10⁻²⁶)]. Further, an area under the ROC curve (AUROC) of 0.99 supported the predictive ability and robustness of the model.

Fig. 1

a Scatter plot of OPLS-DA obtained from NMR spectra of serum from dormant GTB and controls. b Color map representing the coefficient loading plots corresponding to the OPLS-DA model (MATLAB R2009a, The MathWorks, Inc., USA). Modulus of correlation is represented as a color bar. Altered metabolites are indicated: 1. Lipids, 2. l-Lysine, 3. Acetate, 4. l-Glutamine, 5. 3-hydroxy butyrate, 6. l-Glutamate, 7. Succinate, 8. Citrate, 9. l-Threonine (Color figure online)

On the basis of the significant regions identified in the S-plot and VIP scores >1, metabolites including 3-hydroxy butyrate, succinate, acetate, citrate, l-glutamine, l-glutamate, l-lysine, and l-threonine were assigned. Next, the integral values of these metabolites were subjected to Student’s t test. All metabolites were found to be upregulated in dormant GTB women compared to controls (Table 1). The peaks identified in the chemical shift region 5.1–9.0 ppm (aromatic region) were subjected to the Mann–Whitney U test, and 1-methyl histidine was found to be significantly upregulated in women with dormant GTB (Table 1). Sensitivity and specificity of the significantly dysregulated metabolites are presented in Table 1. A significant increase in l-leucine, l-isoleucine, l-valine, succinate, acetate, citrate, lactate, l-glutamine, l-glutamate, l-lysine, l-alanine and l-threonine and a decrease in d-glucose were observed in dormant GTB women as compared with proven fertile women (Supplementary Table 2). Levels of l-lysine, lactate, citrate, l-glutamine, l-histidine, l-threonine, l-phenylalanine, and l-tyrosine were found to be increased in RSM cases compared to proven fertile women (Supplementary Table 2). The significantly altered metabolites discriminating dormant GTB and RSM cases are presented in Supplementary Table 2.

Table 1 Chemical shift, fold change, sensitivity, specificity and p values of the potential metabolite markers discriminating dormant GTB women from controls

In the present study, we used Pearson’s correlation analysis to identify the association of altered serum metabolites with the dysregulated endometrial tissue metabolites in dormant GTB women reported earlier (Subramani et al. 2016a). This analysis shows significant positive correlation among serum metabolites (Fig. 2). Further, serum metabolites were found to be significantly correlated with endometrial tissue metabolites (Fig. 2), implying that serum metabolite levels in dormant GTB cases are dependent on the metabolism of the endometrium in these women. Next, dormant GTB related metabolic network models of the potential markers were constructed for diagnosis and disease classification (Supplementary Figure 6). The generated model shows the compounds and their reactions related to the identified metabolites (Supplementary Table 3). These network models were reconstructed using Markov clustering and represented as modules to reduce the complexity of the data (Fig. 3).
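A pairwise correlation map of this kind can be computed from paired measurements, that is, serum and tissue values taken from the same women. The sketch below is a minimal illustration with invented values and hypothetical variable names. The study itself used R for this step, and significance testing is omitted here.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long, paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(table):
    """Pearson's r for every ordered pair of named variables in `table`."""
    names = list(table)
    return {(a, b): pearson_r(table[a], table[b]) for a in names for b in names}


# Invented paired measurements for five women (hypothetical names, not study data)
measurements = {
    "serum_citrate": [1.0, 2.0, 3.0, 4.0, 5.0],
    "tissue_citrate": [2.0, 4.0, 6.0, 8.0, 10.0],
    "tissue_glucose": [5.0, 4.0, 3.0, 2.0, 1.0],
}

corr = correlation_matrix(measurements)
for (a, b), r in corr.items():
    if a < b:  # print each unordered pair once
        print(f"{a} vs {b}: r = {r:+.2f}")
```

Positive coefficients correspond to the blue squares and negative coefficients to the red squares in a map such as Fig. 2.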

Fig. 2

Pearson’s correlation map representing the association between dysregulated serum metabolites and endometrial tissue metabolites. Blue squares: positive correlation (0.297–0.895, p < 0.05); red squares: negative correlation (−0.298 to −0.442, p < 0.05). Size of squares indicates the magnitude of the correlation coefficient (Color figure online)

Fig. 3

Dormant GTB-related metabolic networks based on the identified metabolites. The central map represents the relationship between the metabolites. Size of gray lines between the metabolites indicates the level of correlation between them. Markov clustering based modules for each metabolite are represented outside the central map (Color figure online)

5 Discussion

The high prevalence of latent TB infection in India makes the diagnosis and treatment of sub-clinical TB very challenging. Even dormant tubercle bacilli may cause functional damage to the endometrium and affect embryo implantation (Subramani et al. 2016b). Due to the paucibacillary nature of TB in the genital organs, conventional techniques including histology, acid-fast bacilli staining and culture tests show a low rate of positivity. PCR is a highly sensitive tool which has been successfully used for diagnosing GTB, even in the dormant form (Jindal 2006; Roy and Roy 2003; Bhanu et al. 2005). Studies by Jindal et al. (2012) and our group (Subramani et al. 2016b) have shown that GTB-PCR can effectively diagnose dormant GTB in women when other conventional techniques fail to detect GTB in the endometrium. However, PCR cannot yet be used reliably for the diagnosis of tuberculosis owing to its tendency to give false-positive or false-negative results. This motivated us to work towards the identification of reproducible markers for dormant GTB which could complement the PCR technique and help identify women with GTB in the dormant form.

Metabonomics is a powerful tool for the global profiling of metabolites under diseased and non-diseased states and for understanding the onset and progression of disease (Bujak et al. 2014; Zhao et al. 2014). Our earlier study on endometrial tissue NMR metabonomics revealed 21 significantly dysregulated metabolites in women with dormant GTB as compared to controls (Subramani et al. 2016a). Interestingly, the majority of the metabolites expressed in serum in the present study follow a trend similar to that observed in the endometrium of these women. Metabolites mostly associated with energy metabolism, protein biosynthesis, lipid and fatty acid metabolism and nucleic acid metabolism were found to be upregulated (Supplementary Table 4). It was interesting to find that a different set of metabolites was dysregulated when a group of RSM women with poor endometrial receptivity was included. These findings suggest that the identified metabolites are specific to dormant GTB per se and hold promise as candidate markers of the disease. The pathway and modular analysis of the identified potential markers represents pathway-mediated chemical reactions and their role in dormant GTB (Fig. 3). Our findings are in good agreement with NMR based metabolomic studies on active lung TB which report alterations in the metabolic pathways (Zhou et al. 2013).

Based on these identified metabolites, multivariate analysis of the NMR spectral data could differentiate between women with dormant GTB and controls. An improved class separation was obtained using the supervised models, PLS-DA and OPLS-DA. Although supervised models can provide good discrimination, rigorous validation is necessary to confirm their predictive ability because of variations in samples, data collection and processing. Sevenfold internal cross-validation was used to extract the parameters R2 and Q2. Our values of R2 (0.905) and Q2 (0.632) indicate that the model can predict the dormant GTB group. The CV-ANOVA score confirms that our model is statistically valid, and the permutation test shows that it is more efficient than the permuted models in predicting the classes. Moreover, ROC analysis of the OPLS-DA model also supported the accuracy of the model, and the identified metabolite markers could segregate the two groups with high sensitivity and specificity (Table 1).

3-hydroxybutyrate and succinate are important byproducts of fatty acid oxidation. Increased levels of these metabolites suggest a possible metabolic switch from active glycolysis to fatty acid oxidation increasing the demand for ATP and cell energy supply. The adverse effect of 3-hydroxybutyrate on endometrial epithelial and stromal cells of ketonuric genetically diabetic Chinese hamsters is documented (Garris and Smith 1983). It also induces a significant reduction in the maturation of oocytes and development of blastocysts in cows with sub-clinical ketosis and aggravates the toxic effect of low glucose concentrations (Garris and Smith 1983; Leroy et al. 2006). Furthermore, studies indicate that 3-hydroxy butyrate not only affects the endometrium but also the quality of oocytes and embryos, leading to reduced embryo-endometrial interaction. The increased levels of citrate and acetate in women with dormant GTB provide evidence of association between systemic metabolism and tubercle infection in the endometrium. Our findings are indicative of tubercle bacilli mediated dysregulation of host lipid metabolism in serum of women with dormant GTB. This is in good agreement with the work of Zhou et al. (2013) where correlation between lung TB and perturbed lipid metabolism is reported.

Amino acid metabolism, a complex process associated with proteolysis and gluconeogenesis, supplies the demand for amino acids during infection (Paton et al. 2004; Zhou et al. 2013). The increase in free amino acid levels in dormant GTB cases may be attributed to an impaired anabolic response, which represents a deficiency in the mechanism of protein synthesis. l-lysine plays an important role in the pathogenesis of tuberculosis (Nambi et al. 2013; Liu et al. 2014). Also, increased levels of l-glutamine, l-glutamate and l-threonine in dormant GTB women indicate the demand for energy during infection. Further, upregulation of 1-methyl-histidine in the serum of lung TB patients has been linked to muscle protein degradation (Zhou et al. 2013). It is also suggested that dormant tubercle bacilli alter the endometrium at a molecular level (Subramani et al. 2016b). We, therefore, presume that the increased level of 1-methyl-histidine in dormant GTB women is associated with alterations in the endometrium, which are known to be responsible for implantation failure.

It is important to mention that the end-point of this study, namely the measurement of a set of metabolite markers in serum for diagnosing dormant GTB, is to be viewed with caution. Identification of clinically relevant biomarkers requires strategies to minimize confounding factors including intra- and inter-individual variability, impact of diet and medications, population stratification, etc. The identified markers should be interpreted and validated with the pre-test probability of GTB (Lim et al. 2000). Due to ethical constraints, we could not explore the status of these metabolites after anti-tubercular treatment in these women.

6 Conclusions

The present study demonstrates the application of 1H NMR based serum metabolomics and multivariate analysis for discriminating the metabolic profile of women with dormant GTB from that of controls. A significant increase was observed in metabolites largely related to lipid metabolism, energy metabolism and amino acid biosynthesis in dormant GTB cases. Owing to a similar trend in the expression pattern of most of the serum metabolites as in the endometrial tissue, it is tempting to speculate that these metabolites may be explored as potential biomarkers for the detection of sub-clinical GTB in women. This may assist clinicians in treating these women at an early stage, thereby preventing damage to the genital organs.