Introduction

Drug resistance emerges in a cell as a natural survival mechanism when the cell is exposed to the hostile environment created by a drug. Drug resistance has been observed with many different drugs, both in human cells and in bacterial or parasitic cells (Barradell and Fitton 1995; Gottesman et al. 2002). In fact, drug resistance is one of the main problems plaguing continued usage of otherwise good drugs (Alekshun and Levy 2007). Tuberculosis (TB) is the largest killer among all infectious diseases, causing about 2 million deaths a year. The problem is worsened significantly by the emergence of resistance to clinically used drugs. Different stages of resistance are termed DR (drug resistant), MDR (multidrug resistant) and XDR (extensively drug resistant). The first category refers to cases where resistance to a single drug such as isoniazid is seen, while the second refers to simultaneous resistance to both isoniazid and rifampicin. The third and most dangerous category, XDR-TB, refers to resistance to isoniazid and rifampicin (i.e. MDR-TB) as well as to any fluoroquinolone and any of the second-line anti-TB injectable drugs such as amikacin, kanamycin or capreomycin. In 2008 alone, an estimated 390,000–510,000 cases of multidrug resistant TB (MDR-TB) emerged (WHO 2010). These forms of TB do not respond to the standard 6-month treatment with first-line anti-TB drugs and can take 2 years or more to treat with drugs that are less potent, more toxic and much more expensive. These rather unnerving statistics underline the need for new drugs to combat tuberculosis. They also serve as stimuli to probe the mechanisms by which such resistance can emerge and to ask whether those mechanisms can be targeted to improve the prospects of therapy.

At present, the methods employed to tackle the problem of drug resistance are rather arbitrary and involve clinical decisions such as rotation of antibiotics, wherein a given antibiotic is administered for a short period of time and the treatment regimen is then switched completely to a different class of antibiotic (Nathanson et al. 2004). Another common approach is combination therapy, where multiple drugs are given simultaneously. This is in fact the method followed in several clinical regimens for infectious diseases. A third widely used strategy is to ensure proper drug administration and continued usage for the required term through close supervision (Ma et al. 2010). In some cases, as a further measure to tackle resistance, access to antibiotics is restricted so that self-medication, and hence indiscriminate use, is avoided (Shenoi and Friedland 2009).

While these strategies play an important role in containing the problem of resistance, they do not provide a comprehensive solution, nor do they provide clues about the underlying mechanisms. Because the phenomenon is a complex biological process, simple correlations with individual molecules or individual residues in them rarely provide the required level of insight. A systems approach becomes essential to get a global view of the problem. We have recently reported a protein–protein interactome study in which a network of protein–protein interactions was constructed, the target and the resistance machinery were defined, and the shortest pathways of information transfer from the target to the resistance machinery were identified (Raman and Chandra 2008). The study provided a list of the most probable pathways leading to the emergence of drug resistance for the chosen drug and hence the chosen target. Encouraged by the insights obtained through this approach, here we seek to construct an enriched network from the data presently available and predict potential pathways for different classes of drugs. The network is further enriched with DNA microarray data indicating variation in gene expression upon exposure to different drugs. The usefulness of the analysis in drug target identification is also presented. The availability of genome-scale protein–protein interaction data, some from experiments and many from bioinformatics methods, as well as 75 different microarray data sets for M. tuberculosis H37Rv upon exposure to different drugs in different conditions, makes this approach feasible (Boshoff et al. 2004).

Methods

Protein–Protein interaction network

The interactome for M. tuberculosis H37Rv was derived from the STRING 8.0 database (Jensen et al. 2009). This database includes interactions from published literature describing experimentally studied interactions as well as those from computational predictions from genome analysis. The latter is based on well established bioinformatics concepts and methods such as domain fusion, phylogenetic profiling and gene neighbourhood. Thus the network captures interactions due to both structural and functional linkages. The different types of such linkages are (a) structural complex formation between two proteins required to form a biological unit, (b) genes belonging to a single operon or common neighbourhood, (c) pairs of proteins which share metabolites between them and (d) suggested associations between pairs of proteins based on co-expression, domain fusion or predominant co-existence in a number of species. Analogous to the road map in a city, the network represents a first comprehensive view of the connectivity among the various interacting proteins.

Resistome identification

Proteins involved in the emergence of drug resistance have been broadly classified into four known mechanisms: SOS, cytochromes, pumps and horizontal gene transfer (Nguyen and Thompson 2006; Wright 2007). The proteins classified under the SOS category are believed to mediate homologous recombination and DNA repair, and hence the emergence of mutations in the target protein that give rise to drug resistance (O’Sullivan et al. 2008). The cytochromes in the resistome are known to modify the drug structure, thereby affecting the concentration of the active drug species at the site of action. The pump proteins achieve the same goal by directly effluxing the drug molecules out of the site of action (Gupta et al. 2009; McKeegan et al. 2002). The proteins in the last category, horizontal gene transfer, are those that aid in recruiting other genes, and hence proteins, from the environment that reduce the physiological burden of the drug, typically by degrading it in a suitable manner (Kurland et al. 2003; Coros et al. 2008; Alekshun and Levy 2007). The RecA molecule serves as an example of the first category, while CcdA (Rv0527), IniA (Rv0342) and SecA1 (Rv3240c) are examples of the second, third and fourth categories respectively. Previously, we had curated a set of 74 genes to form the resistome. A further search of the literature led to the identification of 8 more proteins linked to one or another of the resistance mechanisms. Seven of these proteins (Rv0262, Rv1819c, Rv2209, Rv2477c, Rv2688c, Rv2938 and Rv3361c) (Gupta et al. 2009; Card et al. 2005; Hegde et al. 2005) belong to the category of pumps, while the eighth (Rv0818), a transcriptional regulatory protein, is linked to the SOS mechanism (Liu and Pop 2009).

Drug specific networks

The M. tuberculosis H37Rv interactome obtained captures interactions of all proteins in the cell, and hence automatically includes the interactions of the drug targets as well as those of the resistome proteins. Specific sub-networks capturing routes through which information could flow from the ‘source’ to the ‘sink’ nodes can therefore be easily derived from the entire interactome. For the analysis presented here, sub-networks comprising shortest paths from drug targets as source nodes to the resistome as sink nodes were derived, and were subsequently weighted based on their drug-specific expression levels.

The 8 clinically used drugs in the current common regimen for the treatment of tuberculosis are known to interact with 36 different proteins in the mycobacterial proteome. These 36 proteins, which in many cases are known to be responsible for the observed pharmacological action of the drugs, are considered as drug targets (Chopra and Brennan 1997; Zhang 2005) and hence as ‘source’ nodes for this analysis; they are listed in Table 1A. The resistome, identified as described earlier, constitutes the ‘sink’ nodes for the analysis and is listed in Table 1B.

Table 1 Source and sink nodes used in this study. a Known (in bold) and probable targets of M. tuberculosis H37Rv and b Curated Resistome of M. tuberculosis H37Rv

Shortest paths from source (target) to the curated set of resistance proteins (sink) were computed using Dijkstra’s algorithm implemented in the MATLAB-Boost Graph Library (David Gleich; http://www.stanford.edu/~dgleich/programs/matlab_bgl/). The algorithms mentioned above were used to construct the drug specific network from the shortest paths, in which each path begins from a corresponding drug target which leads to a sink node through interacting proteins, if any.
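As an illustration of this step, the following is a minimal sketch in Python rather than the MATLAB-Boost Graph Library actually used in the study. It runs Dijkstra's algorithm from each source node and recovers one shortest path to each sink. The toy edge list and the node names (`T1`, `T2`, `R1`, `R2`, `A`, `B`, `C`) are hypothetical, and unit edge lengths are assumed, matching the unweighted stage of the analysis.

```python
import heapq

def dijkstra(adj, source):
    """Return (dist, prev) for every node reachable from source.
    adj maps node -> {neighbour: non-negative edge length}."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def shortest_path(adj, source, sink):
    """One shortest path from source to sink as a node list, or None."""
    dist, prev = dijkstra(adj, source)
    if sink not in dist:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def undirected(edges):
    """Build an undirected adjacency map with unit edge lengths."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, {})[v] = 1.0
        adj.setdefault(v, {})[u] = 1.0
    return adj

# Hypothetical toy interactome: T* are drug targets, R* are resistome proteins.
edges = [("T1", "A"), ("A", "R1"), ("T1", "B"),
         ("B", "C"), ("C", "R2"), ("T2", "C")]
adj = undirected(edges)
sources = ["T1", "T2"]
sinks = ["R1", "R2"]

# The drug-specific network is the union of these source-to-sink paths.
paths = {(s, t): shortest_path(adj, s, t) for s in sources for t in sinks}
for pair, path in paths.items():
    print(pair, path)
```

This sketch returns a single shortest path per source-sink pair; when several paths tie in length, an implementation that keeps all of them would need to store multiple predecessors per node.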

Incorporating expression data, scoring the shortest paths and Ranking

The microarray expression data of the mycobacterial genome upon individual exposure to the clinically used first- and second-line drugs (Table 1A) were obtained from the NCBI Gene Expression Omnibus (GEO) under accession number GSE1642 (http://www.ncbi.nlm.nih.gov/geo/). The expression profiles were obtained by Boshoff and colleagues (Boshoff et al. 2004), who report that bacterial cultures in the exponential phase were exposed to the drugs individually at sub-lethal concentrations for different periods of time. We have mainly considered the expression profiles obtained at the highest dose and longest exposure period for each drug.

Node and edge weights

The fold change in the expression of a gene in the particular microarray experiment upon exposure to the drug is taken as the weight of the corresponding node. An edge between two nodes is then weighted by considering the expression profiles of both nodes as well as the betweenness of the given edge. Betweenness is a measure of centrality, which here captures the number of times the edge is traversed among all shortest paths from the set of source nodes to the set of sink nodes (Freeman 1977). For a graph G(V, E) with n vertices, the betweenness of a vertex v is defined as

$$ C_{B}(v) = \sum\limits_{s \ne v \ne t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}} $$

where \( \sigma_{st} \) is the number of shortest paths from s to t, and \( \sigma_{st}(v) \) is the number of shortest paths from s to t that pass through the vertex v. A similar definition was given by Girvan and Newman for edge betweenness (Girvan and Newman 2002), which can be obtained by

$$ C_{B}(e) = \sum\limits_{s \ne t \in V} \frac{\sigma_{st}(e)}{\sigma_{st}} $$

where \( \sigma_{st} \) is the number of shortest paths from s to t, and \( \sigma_{st}(e) \) is the number of shortest paths from s to t that pass through the edge e.
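The edge betweenness above can be sketched in Python for the case described in the text, where s ranges over the source nodes and t over the sink nodes. This is a minimal illustration under stated assumptions, not the implementation used in the study: the graph is undirected and unweighted, and the function names and the toy graph (`S`, `A`, `B`, `T`, `X`) are hypothetical.

```python
from collections import deque

def bfs_counts(adj, root):
    """Distance from root and number of shortest paths to each reachable node."""
    dist = {root: 0}
    count = {root: 1}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count

def edge_betweenness(adj, sources, sinks):
    """For every edge, sum over (s, t) pairs the fraction of shortest
    s-t paths that traverse it. Edges are keyed by frozenset({u, v})."""
    cb = {}
    for s in sources:
        ds, ns = bfs_counts(adj, s)
        for t in sinks:
            if t == s or t not in ds:
                continue
            dt, nt = bfs_counts(adj, t)
            for u in adj:
                for v in adj[u]:
                    # Edge u -> v lies on a shortest s-t path exactly when
                    # dist(s, u) + 1 + dist(v, t) equals dist(s, t).
                    if u in ds and v in dt and ds[u] + 1 + dt[v] == ds[t]:
                        key = frozenset((u, v))
                        cb[key] = cb.get(key, 0.0) + ns[u] * nt[v] / ns[t]
    return cb

# Hypothetical toy graph: two equally short routes from S to T, plus a
# pendant node X that lies on no shortest S-T path.
adj = {
    "S": {"A", "B"},
    "A": {"S", "T"},
    "B": {"S", "T"},
    "T": {"A", "B", "X"},
    "X": {"T"},
}
cb = edge_betweenness(adj, sources=["S"], sinks=["T"])
for edge, value in cb.items():
    print(sorted(edge), value)
```

In this toy graph each of the four edges on the two S-to-T routes carries half of the shortest paths, so each receives a betweenness of 0.5, while the edge to X receives no entry.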

The edge weight \( W_{st} \) of an edge e joining nodes s and t is thus given by \( W_{st} = \frac{1}{\sqrt[3]{C_{B}(e) \times |E_{s}| \times |E_{t}|}} \), where \( E_{s} \) and \( E_{t} \) are the node weights of nodes s and t respectively. The path score is then obtained by summing the weights of all edges in the path.
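A short worked example of the edge weight and path score may help. The sketch below is in Python, with hypothetical node names, fold changes and betweenness values chosen only to make the arithmetic easy to follow. It assumes that the betweenness and both fold changes are non-zero; the text does not say how zero values are handled.

```python
def edge_weight(cb_e, fold_s, fold_t):
    """1 / cube root of (edge betweenness x |node weight s| x |node weight t|).
    Assumes all three quantities are non-zero (an assumption of this sketch)."""
    return 1.0 / (cb_e * abs(fold_s) * abs(fold_t)) ** (1.0 / 3.0)

def path_score(path, cb, fold):
    """Sum of the edge weights along a path given as a list of nodes."""
    return sum(
        edge_weight(cb[frozenset((u, v))], fold[u], fold[v])
        for u, v in zip(path, path[1:])
    )

# Hypothetical values: fold changes per node and betweenness per edge.
fold = {"T": 2.0, "A": 4.0, "R": 1.0}
cb = {frozenset(("T", "A")): 8.0, frozenset(("A", "R")): 2.0}

# Edge T-A: 1 / cbrt(8 * 2 * 4) = 1 / cbrt(64) = 0.25
# Edge A-R: 1 / cbrt(2 * 4 * 1) = 1 / cbrt(8)  = 0.5
score = path_score(["T", "A", "R"], cb, fold)
print(round(score, 6))
```

Edges that are traversed often and join strongly regulated nodes receive small weights, so a path made of such edges has a low score.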

Data analysis, visualization

Cytoscape was used for the visualization of the networks (Shannon et al. 2003). Node and edge weights are represented by the diameter of the circles representing the nodes and the thickness of the lines representing the edges respectively. NetworkAnalyzer, a plugin available for Cytoscape (Assenov et al. 2008), was used to compute various network parameters such as diameter and clustering coefficient.

Results and discussion

Approach

A number of studies investigating the emergence of drug resistance have provided critical insights about the molecular mechanisms by which organisms become resistant (Wright 2007; Alekshun and Levy 2007). Studies in this regard on M. tuberculosis itself have revealed that individual strains of M. tuberculosis will have little opportunity (Gillespie 2002) to interact and exchange genetic information with other strains, unlike in the case of bacteria that co-colonize gastrointestinal tract for example. It is thus argued that resistance in M. tuberculosis would occur mainly through chromosomal mutations (Sandgren et al. 2009; Gillespie 2001). A significant increase in the mutation rates upon exposure to sub-lethal concentrations of different drugs has also been reported. Genome-wide gene expression profiling studies using microarray techniques have shown that a number of genes are in fact over-expressed upon exposure to drugs and a different set of genes are up- or down-regulated for different drugs (Simon et al. 2004).

All these data clearly point to the fact that ‘response’ networks are at play, upon stress due to exposure to drugs. The objective of this study is to utilize the comprehensive interactome network in the cell to identify possible routes through which information required for triggering drug resistance may be passed on in the cell. The interactome consists of both structural and functional linkages among various protein molecules, including indirect linkages due to association with common ligands for example. Hence it would be appropriate for the problem being studied.

The approach used here is illustrated in a flowchart in Fig. 1. The different steps are: to construct an interactome network, identify the intended target(s) of the drug (source nodes), identify those nodes that may be responsible for one or the other resistance mechanisms (sink nodes), incorporate drug induced changes in gene expression profiles into the network through appropriate weighting of the nodes and edges and then compute the most feasible paths between the sets of source and the sink nodes and finally to score and rank the identified pathways. Such networks are constructed and studied for different clinically used drugs, enabling comparison of the possible pathways between different drugs. The use of this information in choosing or prioritizing drug targets is also discussed.

Fig. 1

Flow chart showing the methodology followed to derive the network of paths from individual targets to resistance proteins: curation of the drug targets for each drug in the TB regimen and of the proteins known to be involved in the resistance machinery, construction of the network of shortest paths within the interactome, incorporation of the drug-specific microarray data to score the paths, and finally ranking of the paths

Network description

The interactome network obtained from STRING, which includes all high and medium confidence undirected edges, consists of 3,936 nodes (proteins) and 67,507 edges (protein–protein interactions), covering the genome quite comprehensively. The average degree of this network was found to be 5.79. Of the 3,936 nodes, different numbers of targets have been identified for different drugs, as listed in Table 1. The set of sink nodes has been kept constant to include all possible sinks, so as to identify routes to any resistance mechanism that may be possible in the cell. At this stage, the networks are all still unweighted. In each case, shortest paths are computed from the set of source nodes to the set of sink nodes, and all such shortest paths constitute the drug-specific network.

Significance of incorporating expression data

The networks become much more biologically insightful if expression data are incorporated into them. To do this, the expression profile for each drug is taken and used for computing node and edge weights. The expression profiles depend upon the mechanism of action of the drug as well as the changes it induces in the cell as a consequence of drug action, which may be to overcome the drug stress through metabolic and regulatory adjustments in the cell or through triggering drug resistance more explicitly (Simon et al. 2004). Gene expression profiles vary significantly among the drug-specific networks, and hence a node that is a hub in one network need not be a hub in the network of another drug, even if it is present in both (Boshoff et al. 2004). The weighted networks obtained capture the experimentally observed changes in expression patterns (Cabusora et al. 2005). The profiles with the highest drug concentrations are used in the networks studied here (Fig. 2). The fold changes observed are reported to be statistically significant in that microarray data set.

Fig. 2

Drug specific networks: (a) Isoniazid, with all possible targets of isoniazid as source (see Table 1); inset, InhA alone as the source, (b) Ofloxacin, (c) Ethambutol, (d) Rifampicin, (e) Streptomycin and (f) Amikacin. Nodes represent the proteins while the edges indicate interactions between them. Fold-changes in gene expression upon drug exposure, from genome scale microarray data, are incorporated as weights to the network and are indicated in the figure through node colours (green: down-regulated, yellow: no fold-change, red: up-regulated, grey: not expressed or not studied in the data considered). The size of the node is proportional to the expression level. Different classes of the resistance proteins are shown in different shapes: SOS (hexagonal), Pumps (parallelogram), HGT (triangle), Cytochromes (rounded-rectangle); drug targets are shown as arrow heads. The edge thickness is proportional to the edge weight

Shortest paths and probable pathways for information flow have thus been computed and then analyzed for their propensity to trigger resistance based on their path score (Table 2). The higher the path score, the better the target, since a high score indicates a lower propensity to trigger the resistance mechanism when that particular drug is administered. When the path score approaches zero, it suggests that the particular shortest path has the highest proximity to a node in the resistome and hence a very high propensity for triggering resistance through the specific resistance mechanism laid out in that path.
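The ranking convention can be illustrated with a few lines of Python. The sketch below sorts paths in ascending order of score, so that rank 1 is the path with the highest predicted propensity to trigger resistance. The three scores are the InhA-to-pump values quoted in this paper (0.05, 0.31 and 0.78); the tuple layout and the function name `rank_paths` are assumptions made for illustration.

```python
def rank_paths(scored_paths):
    """Sort (target, sink, score) tuples so the lowest score comes first.
    A low score means a high propensity to trigger resistance."""
    return sorted(scored_paths, key=lambda item: item[2])

# Scores quoted in the text for paths from InhA to three pumps.
inha_to_pumps = [
    ("InhA", "Rv0783", 0.78),
    ("InhA", "Rv0933", 0.05),
    ("InhA", "Rv1258c", 0.31),
]

for rank, (target, sink, score) in enumerate(rank_paths(inha_to_pumps), start=1):
    print(rank, target, sink, score)
```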

Table 2 Network parameters derived from the individual networks for different drugs

Shortest paths identified for each drug and support from literature

The drug specific networks constructed for the six clinically used drugs are shown in Fig. 2, in which the top ranked paths are indicated as thick lines. The different parameters describing the networks are listed in Table 2. Network parameters such as the degree distribution, clustering coefficient (cliquishness) and average path length are suggestive of the nature of the networks. Each drug target exhibits different properties. Nevertheless, as expected from protein–protein interaction networks of this type (Verkhedkar et al. 2007; Barabasi and Oltvai 2004), the clustering coefficient is very low, implying that the interactions that are present could be biologically significant. The diameter and average path length for the 7 networks are in the ranges 4–10 and 2.06–4.89 respectively. In this context, the top ranking resistance paths, with path lengths of 2 and 3, again seem significant.

INH has been reported to bind to 18 different proteins in the mycobacterial proteome (Argyrou et al. 2006). However, it is believed that the pharmacological action of this drug is mediated through the inhibition of InhA (Rv1484), which is therefore considered the primary target (Rawat et al. 2003). Figure 2a depicts the network with all the proteins INH is known to bind as the ‘source’ nodes and the resistome as the ‘sink’. Once weighted by gene expression data and betweenness centrality as described in the Methods section, the network is seen to contain some paths that are clearly more probable (i.e., with higher ranks) than most other paths. The top ranked paths, which appear as thicker lines in the network, are therefore much more likely to reflect routes of communication than the remaining theoretically possible paths. Incidentally, KatG turns out to be a hub node in this network, with multiple paths to the SOS machinery with scores in the range 0.2–0.4. It is indeed remarkable that several drug resistant mutations have been observed in KatG, correlating with the hypothesis derived from network analysis that the position of KatG in the network, or in other words its proximity to the SOS machinery, makes it much more mutable than several other proteins. The network comprising the top ranked paths with only InhA as the source is also shown as an inset of this figure.

InhA has 3 different top ranking shortest paths to the pumps (Rv0933, Rv1258c and Rv0783) in the resistome, with path lengths of 2 (score = 0.05), 3 (score = 0.31) and 3 (score = 0.78) respectively (Table 2). Among the top ranking paths, routes to other mechanisms are also seen, such as a path of length 3 with a score of 0.36 involving Rv0629c and Rv0630c, both belonging to the SOS mechanism. Here again, a correlation with the known mechanisms of drug resistance from the literature can be observed, since the pumps (such as the iniBAC operon) are known to be involved in INH resistance (Colangeli et al. 2007; Gupta et al. 2010; Gupta et al. 2009). The paths from InhA to SOS, although present, are not the top ranked ones, as inferred from the network. Indeed, it is gratifying to observe, from an article based on a time-kill kinetics study of M. tuberculosis that was published as we were writing this manuscript, that efflux pumps are the predominant mechanism of drug resistance for isoniazid (de Steenwinkel et al. 2010; Ouellet et al. 2008).

For RpoB, the target of rifampicin, all four mechanisms are seen in the top ranked paths to resistance, making rifampicin one of the drugs most susceptible to resistance. Indeed, resistance to rifampicin, which is believed to occur through both mutations and pumps, is one of the most worrying causes of MDR-TB (Nguyen and Thompson 2006; Gupta et al. 2009; Abbadi et al. 2009).

The target of streptomycin, RpsL (Rv0682), has paths to resistance involving the pumps (drug transporter, Rv1877) with scores close to zero, implying that efflux of the drug is one of the predominant mechanisms for the emergence of drug resistance (Silva et al. 2001). Top ranking paths of length 2 (scores 0.24–0.77) to the cytochromes CydC (Rv1620c) and CtaB (Rv1451) are also seen. Spies et al. indeed report that streptomycin is effluxed out of the cell (Spies et al. 2008). The network analysis presented here, besides identifying the propensity of the target to trigger resistance, also provides significant clues to understanding which molecule(s) are involved in mediating such mechanisms (Table 3).

Table 3 Top ranking (Least score) shortest paths for different drugs

The ethambutol target, EmbB, shows high scoring paths to pumps such as Rv0191 and to proteins such as SecA2 (Rv1821). The path EmbB (Rv3795) -> RpoB (Rv0667) -> SecA2 (Rv1821) seems to be common to both the rifampicin and ethambutol targets. Interestingly, reports in the literature indicate that rifampicin and ethambutol resistance emerge together in many cases of MDR (Zhang et al. 2007). In addition to paths to pumps, the two drugs also have top ranking paths to the SOS mechanism, which again correlates with a number of mutations observed in both Rv3795 (EmbB) and Rv0667 (RpoB).

For the gyrases, the targets of the fluoroquinolones, the significant paths to resistance are all through different SOS proteins, for example Rv0005 -> Rv0001 -> Rv1629 (PolA), Rv0006 -> Rv2720 (LexA) and Rv0006 -> Rv2737c (RecA) (O’Sullivan et al. 2008). In addition to these, the pumps (Rv0005 -> Rv0001 -> Rv2992c -> Rv2994) also seem to play some role in introducing resistance to these second-line drugs (Nguyen and Thompson 2006).

The propensity to trigger resistance for the target of amikacin, TlyA (Rv1694), is comparatively much lower than that for the other drugs (de Steenwinkel et al. 2010), with path scores of 3.35 and 3.67 for the pump and HGT mechanisms respectively. The corresponding paths are Rv1694 -> Rv1464 -> Rv3065, where Rv3065 (Mmr) is involved in the transport of multiple drugs, and Rv1691 -> Rv1694 -> Rv3248c -> Rv3240c, where Rv3240c (SecA1) is a probable protein translocase subunit implicated in HGT.

Hub nodes in multiple networks

Of the several networks studied here, proteins comprising the top ranking paths to the resistome were analyzed with the goal of identifying important proteins mediating drug resistance. Multiple instances of a few proteins, such as the possible fatty-acid-CoA ligase (fatty-acid-CoA synthetase) Rv1750c (FadD1), were observed in these paths. Such proteins can be treated as hub proteins among all top ranking paths leading to resistance. A few examples are listed in Table 4.
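Counting how often each protein occurs across the top ranking paths is straightforward to sketch in Python. The example below uses five paths quoted elsewhere in this paper (for the gyrases and EmbB) purely as input data. The choice to count every node on a path, including its source and sink, and the threshold of two occurrences are assumptions of this sketch rather than the criteria used for Table 4.

```python
from collections import Counter

def node_frequencies(paths):
    """Count, for each protein, the number of paths it appears in."""
    counts = Counter()
    for path in paths:
        counts.update(set(path))  # count a node at most once per path
    return counts

# Paths quoted in the text, written as lists of locus identifiers.
top_paths = [
    ["Rv0005", "Rv0001", "Rv1629"],
    ["Rv0006", "Rv2720"],
    ["Rv0006", "Rv2737c"],
    ["Rv0005", "Rv0001", "Rv2992c", "Rv2994"],
    ["Rv3795", "Rv0667", "Rv1821"],
]

counts = node_frequencies(top_paths)
# Hypothetical threshold: call a node a candidate hub if it occurs in >= 2 paths.
candidate_hubs = sorted(node for node, n in counts.items() if n >= 2)
print(candidate_hubs)
```

With the full set of top ranking paths from all drug-specific networks as input, the same count would surface proteins that recur across drugs.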

Table 4 Examples of most frequently occurring nodes in the resistance paths.

We recently suggested the concept of target/co-target pairs, wherein the co-target could be a key protein in mediating drug resistance for a particular drug and hence for its particular target protein (Raman and Chandra 2008).

For example, the resistance path Rv0005 -> Rv0001 -> Rv1629 is part of the SOS mechanism, and Rv1629 (PolA) is present in the resistance paths of many drugs, such as isoniazid, ethambutol and ofloxacin. It is also seen in the resistance paths of diamine analog 109, making it a probable co-target. This protein also happens to feature as a high confidence primary target in our recent comprehensive analysis, targetTB (Raman et al. 2008). RecA (Rv2737c) and a monooxygenase, Rv0892, can be identified as important co-targets; they were also identified in our previous study with mycolic acid pathway inhibitors (Raman and Chandra 2008), in agreement with their predicted role in triggering resistance for multiple drugs. Similarly, Rv2992c (probable glutamyl-tRNA synthetase, GltS), which emerges as a hub protein in this analysis leading to the ‘pumps’ class in the resistome, is also a primary target in the targetTB pipeline and can potentially be considered as a co-target of the gyrases.

Biological insights gained and implications to identify the targets with least propensity to trigger resistance

Shortest paths computed to the resistome from the individual drug target(s), using the gene expression enriched protein–protein interactome network, indicate that many of the targets have pathways, spanning a path-length of 1–3 to one or the other proteins in the resistome, as shown in Fig. 2. When multiple proteins to which a given drug can bind, are considered, such as in the case of isoniazid, several pathways through which information could traverse from the source to the sink, appear to be present. However the weighting scheme used here helps in ranking the pathways, of which the ones shown in thick lines indicate most feasible pathways.

It is interesting to observe that different drugs have shortest paths to different proteins in the resistome. Moreover, some targets have even shorter paths to the resistome than other drug targets. It seems reasonable to assume that the shorter the pathway, the more feasible it is as a route through which biochemical or biophysical signals could be passed on from the source to the sink.

Put together, these imply that (a) different drugs may be prone to triggering different mechanisms of drug resistance, (b) the propensities for triggering resistance, using either the same mechanism or different resistance mechanisms also appear to be different and (c) different drug targets may follow different pathways to trigger the same resistance mechanism, albeit with some common hub nodes.

At present, there are some limitations to be noted in the analysis reported here. First, the network may be incomplete, since several proteins of this bacterium have not been studied experimentally, particularly with respect to protein–protein interactions. Second, the network does not contain explicit information about linkages due to interactions of proteins with nucleic acids, lipids or carbohydrates. Third, the network does not capture the direct action of the drug itself in triggering its own resistance (for example, drugs binding to cytochromes). Nevertheless, the network seems comprehensive enough, since it covers most of the proteins in the proteome and since multiple methods are used for predicting the interactions, increasing the chances of identifying a linkage through one method or another. The network thus contains thousands of specific linkages, enabling the study of paths of information flow to trigger drug resistance. Several correlations between the predicted paths and the experimental data available in the literature are indeed observed, suggesting that information flow through the identified routes is probable and biologically significant.

A new drug candidate currently under clinical trials (Ma et al. 2010), diamine analog 109, is also studied here. This drug belongs to the same class as ethambutol. Since the mechanism of action of the new drug is not yet clear, the same target as that of ethambutol (EmbB) is considered. Yet, after incorporation of gene expression data, it appears that the diamine analog has a lower propensity to trigger resistance than ethambutol, and thus perhaps the potential to be a better drug. Reports in the literature show that instances of drug resistance are indeed lower with the diamine analog than with frontline drugs such as rifampicin and isoniazid (Protopopova et al. 2005; Chen et al. 2006).

When we compare across the classes of drugs, the propensity to trigger resistance is lowest for amikacin and its target Rv1694 (TlyA) among all the TB drugs studied (Fig. 3). The proteins Rv0343, Rv1772, Rv1908c, Rv2846c and Rv3139 for isoniazid, and Rv3795 and Rv0343 for ethambutol, show the highest propensity to trigger resistance.

Fig. 3

Distribution of path scores (a) for the set of targets of each drug shown, when all resistance mechanisms are considered, and (b) for the SOS mechanism alone. The X-axis shows the individual drug and the Y-axis represents the path score. Red bars in both (a) and (b) correspond to top ranks (or least path scores), while blue bars in (a) correspond to the average score of the top hundred ranks

The gyrases (Rv0005 and Rv0006), considered as drug targets of fluoroquinolones, RpsL (Rv0682), the target of streptomycin, RpoB (Rv0667), the target of rifampicin, and EmbB (Rv3795), the target of ethambutol, exhibit higher ranges of propensity to trigger resistance (Fig. 4). Higher incidences of strains resistant to these drugs are indeed reported in the literature (Chopra and Brennan 1997; Nguyen and Thompson 2006; Shenoi and Friedland 2009).

Fig. 4

Number of top ranking unique resistance paths in different score ranges (as indicated in the colour key), for different drugs to different resistance classes

The pathogen is known to overcome susceptibility to fluoroquinolones predominantly through mutations (O’Sullivan et al. 2008), which is again evident from the top ranked paths to SOS in our analysis. Rifapentine, which is under advanced clinical trials to replace rifampicin as a first-line drug, is considered to be a more potent analog than rifampicin (Chopra and Brennan 1997; Ma et al. 2010). Rifapentine scores better in this study too, suggesting that it may be better from the resistance perspective as well.

Hub nodes among multiple top ranking paths are indeed likely to be more useful as co-targets since they are predicted to be involved in the emergence of resistance for multiple drugs. Inhibition of the primary target and the co-target(s) simultaneously seems to be a feasible approach, given that drug combinations are in fact used in the clinic quite widely, to overcome the problem of drug resistance for tuberculosis and also many other diseases (Shenoi and Friedland 2009).

Protein–protein interaction networks enriched by corresponding gene expression profiles in general provide an opportunity to identify and analyze systems level responses to exposure to specific drugs. Analyses of such networks provide insights about possible routes of communication in the cell and strategies to inhibit such communication to retard emergence of drug resistance. The method employed here has sufficient scope to incorporate any new genome-scale expression or proteomics data that may become available in the future. The network can also be easily refined, when necessary, to incorporate new interactions among the proteins that get identified. Finally, the method can be easily adapted to studying drug resistance in other species as well.