Introduction

Easy and early diagnosis of tuberculosis (TB) is key to achieving the sustainable development goals developed by WHO. Early diagnosis of TB has made reasonable progress in the past few years with the advent of molecular tools such as Xpert MTB/RIF and Xpert MTB/RIF Ultra, but the complexity of these methods prevents their use as diagnostic tools in peripheral health care settings. WHO has therefore highlighted the need for a non-sputum-based biomarker assay [1], particularly a point-of-care test [2] that can be easily used in peripheral health care settings. To meet this need, a number of new tests are in various phases of development, including tests based on urine, breath and blood biomarkers [3,4,5,6,7,8]. Serodiagnostic assays based on the detection of serum antigens, or of antibodies raised in response to Mycobacterium tuberculosis (Mtb) infection, may also play an important role in the development of a non-sputum-based biomarker assay. Many antibody-based serodiagnostic tests were previously in clinical use. These tests were, however, banned by WHO in 2010 because of inaccurate estimates of sensitivity and specificity, although the organization simultaneously encouraged further research in this area [9].

Antibody-based assays have been developed for various other diseases, where they have provided easy solutions for diagnosis in peripheral health care settings. Because they can be developed into dipstick or lateral flow-based formats, antibody-based assays could help in the rapid diagnosis of TB. Several research groups have been working on the development of such assays for rapid and easy diagnosis of TB in peripheral health care settings [10,11,12,13,14]. The antibody profile is known to depend on the antigen signatures expressed during TB disease [15]. An antigen that is highly specific to mycobacteria and is expressed during disease may therefore elicit a strong and specific antibody response and hence act as a better diagnostic candidate for antibody-based serodiagnostic assays. With regard to specificity, an important group of antigens is the Region of Difference (RD) antigens. These antigens are specifically expressed by most of the human-adapted virulent strains of mycobacteria and are deleted from most of the BCG vaccine strains [16]. Antigens of this group are immunogenic, which is why they have been an important component of various vaccine-related studies [17,18,19]. Past studies from various groups have also shown the potential role of RD antigens in immunodiagnosis [20,21,22]. Considering the specificity of RD antigens and the dependence of the antibody profile on antigen expression during disease, we looked for in vivo expressed RD antigens based on the findings of a previous study from our group [23]. In that study, transcriptomic analysis of mycobacteria in sputum samples of smear-positive pulmonary tuberculosis (PTB) patients identified several in vivo expressed transcripts, suggesting that the proteins encoded by these genes could be possible targets of the antibody response [23]. Antibodies recognize only specific regions of proteins, termed B-cell epitopes. Using these specific regions in a diagnostic assay eliminates cross-reactivity, lowers cost and improves quality [24]. Thus, the immunodominant epitopes of the in vivo expressed RD antigens, along with the epitopes of two antigens encoded by the most abundantly expressed mycobacterial genes in the transcriptomic analysis, were explored in the current study for their potential use in a non-sputum-based serodiagnostic assay.

Methodology

Study participants

A total of 300 study participants were recruited from a tertiary care hospital in India. These included 200 chest-symptomatic patients with TB or other respiratory diseases from the Pulmonary Medicine department of PGIMER, Chandigarh, India, and 100 healthy volunteers recruited from the non-clinical staff of PGIMER with no history of contact with TB patients. Patients with HIV infection and those who had already started anti-tubercular treatment (ATT) were excluded. The Institutional Ethics Committee of PGIMER approved the study (vide no. 8818/PG11-1TRG/168235), and written informed consent was obtained from all study subjects.

The recruited patients were categorized into the following groups: (a) Smear-positive (smear+) PTB (n=100): patients diagnosed with PTB based on the detection of acid-fast bacilli by direct sputum smear microscopy. (b) Smear-negative (smear−) PTB (n=25): patients who had a negative direct sputum smear but were diagnosed with PTB based on culture or Xpert MTB/RIF positivity of sputum, or on smear, culture or Xpert MTB/RIF positivity of broncho-alveolar lavage (BAL) samples. (c) Lung cancer (n=25): patients diagnosed with lung cancer on the basis of histopathological evidence. (d) Treated TB (n=25): patients who had completed their treatment course for PTB and were certified as cured by the treating staff. (e) Sarcoidosis (n=25): patients with clinical features of sarcoidosis with pulmonary (dyspnoea, dry cough, chest pain, fever, fatigue or crackles) or extrapulmonary organ (lymph nodes, liver, spleen, skin, eyes, heart, etc.) involvement, consistent radiological findings and histological evidence of non-caseating granulomas. (f) Healthy controls (n=100): healthy BCG-vaccinated volunteers with no clinical symptoms or previous history suggestive of tuberculosis.

B-cell epitope prediction using bioinformatic software

B-cell epitopes of the proteins encoded by selected mycobacterial genes upregulated in the sputum samples of PTB patients were predicted using four different software tools: SVMTriP, COBEpro, LBtope and LEPS [25,26,27,28]. Peptides corresponding to epitopes predicted by all four tools, or by at least three of them, were then used in the enzyme-linked immunosorbent assay (ELISA).
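The consensus step described above can be illustrated with a short Python sketch. The paper does not state how agreement between tools was scored, so this sketch assumes per-residue voting: each tool contributes a list of predicted (start, end) spans, and a region is kept when at least three tools cover every residue in it. The function name, the `min_length` filter and the example spans are all hypothetical and are not taken from the study.

```python
from collections import Counter

def consensus_regions(predictions, min_votes=3, min_length=6):
    """Return contiguous (start, end) regions covered by >= min_votes tools.

    predictions: dict mapping a tool name to a list of (start, end) spans,
    1-based and inclusive. Per-residue voting is an assumption of this
    sketch, not the documented procedure of the study.
    """
    votes = Counter()
    for spans in predictions.values():
        covered = set()
        for start, end in spans:
            covered.update(range(start, end + 1))
        # Each tool votes at most once per residue.
        for position in covered:
            votes[position] += 1

    supported = sorted(p for p, v in votes.items() if v >= min_votes)
    regions = []
    run_start = previous = None
    for position in supported:
        if run_start is None:
            run_start = previous = position
        elif position == previous + 1:
            previous = position
        else:
            if previous - run_start + 1 >= min_length:
                regions.append((run_start, previous))
            run_start = previous = position
    if run_start is not None and previous - run_start + 1 >= min_length:
        regions.append((run_start, previous))
    return regions

# Hypothetical spans for one protein (illustrative only).
example_predictions = {
    "SVMTriP": [(10, 29)],
    "COBEpro": [(12, 25), (60, 70)],
    "LBtope": [(15, 34)],
    "LEPS": [(62, 68)],
}
print(consensus_regions(example_predictions))
```

In this invented example only residues 15 to 25 are supported by three tools, so a single candidate region is returned; lowering `min_votes` to 2 would also admit regions backed by only two tools.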

B-cell epitope selection by peptide array

In addition to bioinformatics analysis, we carried out peptide arrays for four proteins encoded by RD genes upregulated in sputum, which were identified in our previous study [23]. Microarray slides containing overlapping peptides of these proteins were synthesized commercially by JPT Peptide Technologies GmbH and processed as per the protocol used by Sakamuri et al. [29], using sera from 7 PTB patients and 7 healthy controls. Each glass slide was designed with identical sub-arrays in triplicate with appropriate controls. Briefly, each serum sample, diluted 1:50 in phosphate-buffered saline containing FBS (3%) and Tween 20 (0.5%), was added to a microarray slide, followed by overnight incubation at 4 °C under moist conditions. The next day, after washes with TBST and MilliQ water, slides were incubated with Cy5-labelled goat anti-human IgG [Jackson Immuno Research, CN] at 1:1500 dilution for 1 h. After further washing steps, the slides were scanned on an Axon 4000B scanner. Images in TIFF format were analyzed using GenePix Pro software with the GenePix Array List (GAL) file provided by JPT for spot alignment. The spots were aligned, and the median fluorescent intensity (MFI) of individual peptide spots was extracted from the results file (GPR). The ratio of the MFI of each peptide spot in a block to the MFI of the background spots (negative controls) for that block was calculated (referred to as FI). Thereafter, the normalized FI (NFI) for each peptide was calculated as the median of its three FI values across the 3 sub-arrays. NFI values of TB patients were compared with those of healthy controls using Student's t test. Peptides with p ≤ 0.05 whose NFI values in all TB patient samples (7/7) were greater than the mean + 3 × SD of the NFI values in healthy subjects were selected for ELISA.
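The array normalization and cutoff rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the background MFI of a block is taken as the median of its negative-control spots, and the SD is the sample standard deviation; neither choice is specified in the text. The function names and all numbers are hypothetical. The t test is not reproduced here; in practice it could be run with a statistics package such as SciPy.

```python
from statistics import mean, median, stdev

def fold_intensity(spot_mfi, background_mfis):
    """FI: peptide spot MFI divided by the block's background MFI.

    Summarizing the negative-control spots by their median is an
    assumption of this sketch.
    """
    return spot_mfi / median(background_mfis)

def normalized_fi(fi_triplicate):
    """NFI: median of the FI values from the three sub-arrays."""
    return median(fi_triplicate)

def exceeds_control_cutoff(patient_nfis, control_nfis, n_sd=3.0):
    """True if every patient NFI is above mean + n_sd * SD of the controls."""
    cutoff = mean(control_nfis) + n_sd * stdev(control_nfis)
    return all(nfi > cutoff for nfi in patient_nfis)

# Hypothetical readings for one peptide in one serum sample (three sub-arrays).
fi_values = [
    fold_intensity(1200.0, [100.0, 120.0, 110.0]),
    fold_intensity(1000.0, [100.0, 100.0, 100.0]),
    fold_intensity(1500.0, [100.0, 150.0, 125.0]),
]
sample_nfi = normalized_fi(fi_values)

# Hypothetical NFI values for 7 patients and 7 healthy controls.
patients = [2.0, 1.9, 3.1, 2.5, 1.6, 2.2, 4.0]
controls = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.0]
print(sample_nfi, exceeds_control_cutoff(patients, controls))
```

Requiring all seven patient values to exceed the cutoff makes the filter strict, which fits a screening step meant to pass only the strongest candidates on to ELISA.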

Synthesis of biotinylated peptides

Peptides corresponding to B-cell epitopes selected either through peptide array or bioinformatics software were synthesized commercially in N-terminal biotinylated form with >95% purity by GL Biochem Pvt. Ltd., Shanghai, China. The sequence and purity were confirmed by mass spectrometry and analytical HPLC by the commercial source (Supplementary Data). Lyophilized peptides were dissolved in dimethyl sulfoxide (DMSO) at a concentration of 20 mg/ml, further diluted with distilled water to a stock concentration of 2 mg/ml for each peptide, and then stored at −20 °C in small aliquots until use.
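The dilution from 20 mg/ml to 2 mg/ml follows the usual C1·V1 = C2·V2 relation. The small Python helper below is only a worked illustration of that arithmetic; the function name and the 1000 µl final volume are hypothetical and do not come from the protocol.

```python
def dilution_volumes(stock_conc, final_conc, final_volume):
    """Return (volume of stock, volume of diluent) from C1*V1 = C2*V2.

    Concentrations must share one unit, and the returned volumes are in
    the unit of final_volume.
    """
    if stock_conc <= 0 or final_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    stock_volume = final_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume

# Example: 20 mg/ml peptide in DMSO diluted to 2 mg/ml in a hypothetical 1000 µl.
stock_ul, water_ul = dilution_volumes(20.0, 2.0, 1000.0)
dmso_percent = stock_ul / (stock_ul + water_ul) * 100
print(stock_ul, water_ul, dmso_percent)
```

A tenfold dilution of this kind corresponds to one part DMSO solution in nine parts water, so the 2 mg/ml stock would contain roughly 10% DMSO by volume.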

ELISA

Antibody detection in serum

Antibody detection in serum against the selected peptides was done using the method described by Shen et al. [30]. Briefly, 96-well plates [Greiner, high binding flat bottom] were coated with avidin (0.0625 mg/ml) [Sigma-Aldrich, USA, A9275] diluted in 1X PBS, incubated at 4 °C overnight and then for a further 1 h at 37 °C. After washing the avidin-coated plates twice with 1X PBS, biotin-labelled peptides diluted to 10 μg/ml in 1X blocking buffer (1X PBS containing 7.5% FBS [GIBCO] and 2.5% BSA [Sigma-Aldrich, USA, A7030]) were added to each well and incubated for 1 h at 37 °C. The plates were then washed twice with 0.05% PBST, and diluted serum samples (1:20 in 0.1X blocking buffer) were added. After 4 washes, Protein A alkaline phosphatase conjugate (1:2000) [Sigma-Aldrich, USA, P7488] and alkaline phosphatase-labelled anti-human IgA (1:1000) [Sigma-Aldrich, USA, A3400], diluted in 0.1X blocking buffer, were added to each well and incubated for 1 h at 37 °C. After 6 washes, pNPP substrate [Sigma-Aldrich, USA, S0942] prepared in 10% diethanolamine buffer [Pierce™, Thermo Fisher Scientific, 34064], pH 9.8, was added to each well, and the plate was incubated for 1 h at 37 °C before reading at 405 nm. Each experiment was repeated three times, and a sample showing a positive result in at least 2 of the 3 experiments was considered positive. The mean optical density (OD) of sera from the healthy control group plus two times the standard deviation (SD) was taken as the cutoff for positive results. The results thus obtained were then assessed against bacteriological diagnostic tests for TB.
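The positivity rule (OD above the healthy-control mean + 2 SD, in at least two of three repeat experiments) can be sketched in Python as below. The sketch assumes that a separate cutoff is computed for each experiment from that run's healthy controls and that the SD is the sample standard deviation; the text does not specify either point. Function names and OD values are hypothetical.

```python
from statistics import mean, stdev

def od_cutoff(healthy_ods, n_sd=2.0):
    """Cutoff = mean OD of healthy controls + n_sd * SD (sample SD assumed)."""
    return mean(healthy_ods) + n_sd * stdev(healthy_ods)

def is_positive(sample_ods, cutoffs, min_positive_runs=2):
    """Call a sample positive if its OD exceeds the cutoff in enough runs.

    sample_ods and cutoffs hold one value per repeat experiment, in the
    same order.
    """
    if len(sample_ods) != len(cutoffs):
        raise ValueError("need one cutoff per experiment")
    hits = sum(od > cutoff for od, cutoff in zip(sample_ods, cutoffs))
    return hits >= min_positive_runs

# Hypothetical healthy-control ODs, reused for all three runs for brevity.
healthy = [0.20, 0.25, 0.30, 0.22, 0.28]
cutoff = od_cutoff(healthy)
print(round(cutoff, 3))
print(is_positive([0.40, 0.30, 0.45], [cutoff] * 3))
print(is_positive([0.40, 0.30, 0.31], [cutoff] * 3))
```

Requiring agreement in two of three runs guards against a single noisy plate turning a borderline sample positive.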

Statistical analysis

The mean absorbance of the different groups was compared using the Kruskal-Wallis test with Dunn's multiple comparison test to compare all the groups with each other. GraphPad Prism software was used for the statistical analysis. The cutoff of the assay was calculated as the mean OD + 2 × SD of the healthy controls, and any study subject with an OD above the cutoff was considered positive. Sensitivity and specificity were calculated using the formulas: sensitivity = TP / (TP + FN), specificity = TN / (FP + TN), where TP: true positive, TN: true negative, FN: false negative, FP: false positive.
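As a worked illustration of the two formulas, the following Python sketch counts positives in a diseased group and a control group against a given cutoff and derives sensitivity and specificity. The function names and the OD values are hypothetical and are not study data.

```python
def sensitivity(tp, fn):
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Specificity = TN / (FP + TN)."""
    return tn / (fp + tn)

def diagnostic_accuracy(case_ods, control_ods, cutoff):
    """Classify each OD against the cutoff and return (sensitivity, specificity)."""
    tp = sum(od > cutoff for od in case_ods)
    fn = len(case_ods) - tp
    fp = sum(od > cutoff for od in control_ods)
    tn = len(control_ods) - fp
    return sensitivity(tp, fn), specificity(tn, fp)

# Hypothetical ODs: 3 of 10 cases and 1 of 20 controls lie above the cutoff.
cases = [0.50, 0.45, 0.41, 0.30, 0.25, 0.20, 0.22, 0.31, 0.28, 0.19]
controls = [0.42] + [0.20] * 19
sens, spec = diagnostic_accuracy(cases, controls, cutoff=0.35)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

Because the cutoff is derived from the healthy controls themselves, specificity against that group is high by construction, which is why specificity against disease controls carries more information.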

Results

To assess the utility, in a serodiagnostic assay, of proteins encoded by in vivo expressed transcripts identified in PTB sputum in a previous study [23], a total of 8 proteins were selected: two proteins encoded by the most highly expressed transcripts (Rv0986, Rv0971) and 6 encoded by RD transcripts (Rv3121, Rv1965, Rv1971, Rv2351c, Rv2657c, Rv2674). B-cell epitopes of these proteins were first identified by bioinformatics, using different epitope prediction software, and, for four of the selected RD proteins, by peptide arrays available in our laboratory. The identified B-cell epitopes were then screened for their diagnostic utility using sera from the study population. The demographic and clinical details of the study subjects are given in Table 1.

Table 1 Demographic and clinical details of study subjects

Bioinformatics analysis for B-cell epitope prediction

Four linear B-cell epitope prediction programs, namely COBEpro, SVMTriP, LEPS and LBtope, were used for the prediction of B-cell epitopes. All four programs are primarily based on support vector machines (SVM), and each additionally relies on other properties, which include physicochemical properties, mathematical morphology, tripeptide similarity and propensity, binary profile, dipeptide composition and amino acid propensity (AAP). Based on the results from these programs, the peptides (n=8) with the highest scores that were common to all four programs, or to at least three of them, were selected for further use in ELISA (Table 2).

Table 2 Predicted B-cell epitopes of selected proteins common to three or four of the prediction programs used (COBEpro, SVMTriP, LEPS and LBtope)

Peptide arrays for B-cell epitope selection

Overlapping peptides (n=201) of the four RD proteins (Rv1965, Rv1971, Rv2351c and Rv3121) were screened for antibody response in 7 smear-positive PTB patients in comparison with healthy controls. Some of the peptides of these proteins were recognized significantly more strongly by sera from all the smear+ patients, using the mean plus three times the standard deviation of healthy controls as the cutoff (Supplementary Table S1). For each protein, the one peptide that had the lowest p value and was recognized by sera from all the TB patients was selected for use in the ELISA-based immunological assay (Supplementary Table S1).

ELISA based on selected epitopes offers poor diagnostic value for PTB diagnosis

Overall, 12 peptides were identified as B-cell epitopes from the eight selected proteins using bioinformatics software and peptide array technology, and were designated P1 to P12 (Table 3). These peptides were commercially synthesized in N-terminal biotinylated form for the development of an antibody-based diagnostic test. The diagnostic potential of the selected peptides was assessed in two phases: a screening phase using a small training set of samples (50 smear+, 25 smear− and 21 healthy controls) to select the best peptides for further work, followed by a more detailed study of the selected peptides in phase II with additional samples.

Table 3 List of selected peptides based on peptide array and bioinformatics analysis for antibody detection by ELISA

In the screening phase, all the selected peptides were screened for the combined presence of IgG and IgA antibodies in the serum samples of 96 study subjects to assess their diagnostic potential. Sensitivities obtained with the 12 peptides ranged from 14 to 46% for smear+ PTB patients (Fig. 1a) and from 8 to 24% for smear− PTB patients (Fig. 1b). Specificities of all peptides ranged from 95 to 100%. Of all the peptides, Peptide 12 (P12) showed the highest sensitivity for both smear+ (46%) and smear− (24%) PTB patients.

Fig. 1

Percentage sensitivity of detection for PTB patients using all the selected peptides in screening cohort. Bar graphs showing percentage sensitivity of the selected peptides (P1–P12) (a) in smear+ PTB patients and (b) smear− PTB patients calculated as: {number of patients positive by ELISA (OD >mean OD of healthy controls + 2SD as cutoff) / total number of patients} * 100

As P12 significantly differentiated TB patients from healthy controls and showed the highest sensitivity in PTB patients, its ability to correctly distinguish TB patients from disease controls was further assayed using a larger test cohort of 300 study subjects, comprising 100 smear+ PTB patients, 25 smear− PTB patients, 25 sarcoidosis patients, 25 lung cancer patients, 25 treated TB patients (cured on DOTS treatment) and 100 healthy volunteers.

P12 is an immunogenic peptide selected from protein Rv1971, an mce family protein of unknown function [31] that is considered to be involved in host cell invasion as per TubercuList [32]. The mean absorbance values for the antibody response to peptide P12 were significantly higher (p<0.001) in PTB patients, both smear+ (p<0.001) and smear− (p<0.001), than in healthy controls. The mean absorbance was also significantly higher (p<0.05) in smear+ PTB patients than in lung cancer patients (Fig. 2). However, the sensitivity of the peptide was only 31% for the detection of smear+ and 20% for the detection of smear− PTB patients (Fig. 2). The overall specificity of the peptide with respect to (w.r.t.) all control groups, including sarcoidosis, lung cancer, treated TB and healthy subjects, was 92.8%. The specificity of P12 w.r.t. the sarcoidosis, lung cancer, treated TB and healthy groups was 84%, 92%, 84% and 97.8%, respectively.

Fig. 2

Antibody response to P12 in developmental cohort using IgG and IgA combination. (a) Dot plot showing reactivities of individual serum samples from different patient groups. Values represent absorbance from one representative experiment. The number in parenthesis shows the percentage sensitivity of P12 for detection of smear+ and smear− PTB patients. (b) Mean plot showing mean reactivities of different patient categories to peptide 12 as mean absorbance±SD. Values represent absorbance from one representative experiment. (***p<0.001, **p<0.01 as compared to healthy controls, ##p<0.01 as compared to lung cancer patients using Kruskal-Wallis with Dunn’s multiple comparison)

Thus, although the B-cell epitopes of proteins encoded by in vivo expressed genes are able to induce IgG and IgA responses, they are not suitable candidates for a rapid diagnostic test for tuberculosis.

Discussion

Effective global disease control and optimal treatment of TB mandate superior diagnostic tests. TB poses a significant health challenge in developing nations, and the lack of efficient diagnostic tests for TB affects timely and accurate diagnosis. This in turn delays therapeutic intervention and also skews disease burden assessment. Current diagnostic tests in clinical use, such as culture and GeneXpert, are either time-consuming or costly and require technical expertise. Serodiagnostic tests, on the other hand, have the potential to be deployed in resource-limited clinical settings, where they can be performed with minimal resources and reduce the need for technical expertise. Serodiagnosis is especially beneficial for diagnosing patients who do not produce an adequate amount of sputum. The development of a new biomarker-based serodiagnostic assay is therefore an unmet need for the diagnosis of tuberculosis. We need to explore beyond the conventional sputum-based approach utilizing conserved antigens and focus on identifying and utilizing new TB-specific biomarkers, or a combination of biomarkers, for sensitive, specific and timely detection of TB.

The proteins encoded by in vivo expressed, mycobacteria-specific RD genes were tested in the current study in a biomarker-based non-sputum assay using an immunodiagnostic approach. Serodiagnostic assays based on different antigens could serve as promising diagnostic tools to be developed into point-of-care tests. For a sensitive serodiagnostic assay, a biomarker should be abundantly expressed during disease, and for a specific test it needs to be disease specific. Therefore, 6 RD proteins encoded by genes upregulated in the sputum samples of TB patients were selected for the serodiagnostic test. In addition, proteins encoded by the two most highly upregulated genes, Rv0986 and Rv0971, were evaluated, as their abundant expression could improve the sensitivity of the assay. As antibodies recognize only specific regions of proteins, termed B-cell epitopes, we selected the B-cell epitopes of the candidate proteins using two approaches: peptide arrays and bioinformatics. Moreover, the use of peptides for serodiagnosis is a more efficient approach than the use of recombinant proteins, as peptides eliminate the potential cross-reactivity contributed by cross-reactive epitopes present on the full-length protein and hence increase specificity [33].

Peptide array, which was used as one of the techniques for B-cell epitope selection, is a resourceful platform for screening antibody responses and identifying regions of a protein specifically recognized by antibodies [34]. A similar approach was used earlier by different research groups to identify B-cell epitopes [35, 36]. Multiple regions of the four proteins were recognized by antibodies in sera from PTB patients, and collectively 10 peptides from the four proteins were recognized by 100% of patients and not by healthy controls. Of these, the one peptide of each protein with the lowest p value was selected for validation in the ELISA-based serodiagnostic assay.

Peptide arrays, though promising for identifying B-cell epitopes in the wet lab, are expensive, and therefore arrays could not be done for all eight selected proteins. Bioinformatics software was therefore also used to screen B-cell epitopes of all 8 proteins. Many authors have successfully used bioinformatically predicted linear B-cell epitopes for various experimental analyses, including the development of serodiagnostic assays [30, 37, 38]. In the current study, four different linear B-cell epitope prediction programs were used, and the one epitope from each protein that was identified by 4/4 or 3/4 programs with the maximum score was selected for further work. Because different prediction methods have their individual strengths, using multiple prediction programs and analyzing the results through a voting mechanism can yield more accurate predictions [39]. Thus, overall, 12 peptides were selected through peptide arrays and bioinformatics software for subsequent screening in serodiagnostic ELISA.

Serodiagnostic tests are of great importance for TB-endemic, low- and middle-income countries, as they have the potential to be developed into low-cost, user-friendly tests that can be easily used by first-contact health care providers for diagnosing TB [40], even in remote areas with very few resources [30]. Moreover, serodiagnosis is also beneficial for diagnosing patients who are not able to produce an adequate amount of sputum. The 12 antigenic peptides selected using peptide arrays and bioinformatics software were therefore evaluated for use in a serodiagnostic antibody detection assay, where they yielded sensitivities ranging from 14 to 46% for smear+ PTB patients and from 8 to 24% for smear− PTB patients. The peptide P12 from protein Rv1971, which showed the highest detection rates of 46% in smear+ patients and 24% in smear− patients, performed poorly when assessed with a larger number of study subjects. Similar sensitivities for the antibody response to different individual antigens have been reported earlier [12, 13, 41]. Heterogeneity of the antibody response [41] could be one of the factors behind the low sensitivity observed for the different antigens, including P12, in the screening and developmental phases of the current study. Another possible reason could be the difference between the expression patterns of mycobacteria in sputum and those of mycobacteria in the lung tissues of TB patients [42]. As the genes expressed in the lung and the proteins encoded by them are more likely to be presented to the immune system, antibodies may be generated more abundantly against those antigens than against the antigens expressed in sputum.

Thus, in conclusion, the proteins encoded by in vivo expressed mycobacterial transcripts in sputum samples of pulmonary TB patients are immunogenic and elicit an antibody response. However, the response generated is not sufficient to develop a biomarker-based non-sputum assay. A possible limitation of the study is that the mycobacterial proteins selected for the development of the peptide ELISA were chosen on the basis of their in vivo gene expression. Studies are currently ongoing in our lab to identify mycobacterial proteins that are abundantly and specifically expressed in tuberculosis, with the goal of finding better diagnostic markers. This is an important research objective for meeting WHO's requirement of a non-sputum-based biomarker assay.