Trade-Offs Between Efficiency and Robustness in Bacterial Metabolic Networks Are Associated with Niche Breadth

Morine, Melissa J.; Gu, Hong; Myers, Ransom A.; Bielawski, Joseph P.

doi:10.1007/s00239-009-9226-5

Published: 14 April 2009

Volume 68, pages 506–515, (2009)

Journal of Molecular Evolution

Abstract

The relation between structure and function in biologic networks is a central point of systems biology research. Key functional features—notably, efficiency and robustness—are linked to the topologic structure of a network, and there appears to be a degree of trade-off between these features, i.e., simulation studies indicate that more efficient networks tend to be less robust. Here, we investigate this issue in metabolic networks from 105 lineages of bacteria having a wide range of ecologies. We take quantitative measurements on each network and integrate this network data with ecologic data using a phylogenetic comparative model. In this setting, we find that biologic conclusions obtained with classical phylogenetic comparative methods are sensitive to correlations between model covariates and phylogenetic branch length. To avoid this problem, we propose a revised statistical framework—hierarchical mixed-effect regression—to accommodate phylogenetic nonindependence. Using this approach, we show that the cartography of metabolic networks does indeed reflect a trade-off between efficiency and robustness. Furthermore, ecologic characteristics related to niche breadth are strong predictors of network shape. Given the broad variation in niche breadth seen among species, we predict that there is no universally optimal balance between efficiency and robustness in bacterial metabolic networks and, thus, no universally optimal network structure. These results highlight the biologic relevance of variation in network structure and the potential role of niche breadth in shaping metabolic strategies of efficiency and robustness.

Introduction

Cellular metabolism, being the sum of all chemical processes that support growth and reproduction, is a fundamental property of all living organisms. These processes are extensively interconnected by way of a complex system of chemical reactions that are often modeled as a network, where nodes symbolize metabolites and links represent metabolic enzymes (Jeong et al. 2000). Systems biologists and network theorists work under the assumption that analyzing metabolic networks—in particular, examining the connection between network shape and function—will advance our understanding of microevolution, disease, and biologic complexity (Alves et al. 2002; Oltvai and Barabasi 2002; Becker et al. 2006). Already, metabolic network topology has provided a novel context for mapping transcription regulation in Saccharomyces cerevisiae (Patil and Nielsen 2005) and predicting viability of mutant strains in Escherichia coli and S. cerevisiae (Wunderlich and Mirny 2006). There is clear functional relevance in network topology; however, it is difficult to generalize without knowledge of the extent and, equally important, the ecologic basis of interspecific variation.

Optimization of network performance results, at least in part, from a trade-off between two specific qualities: efficiency and robustness (Stelling et al. 2002). Efficient networks are most effective at carrying out a function with minimal input of resources; in this case “resources” refers to “metabolic enzymes.” Robust networks are less efficient but able to maintain functionality despite system perturbations. Although there is ample evidence that a trade-off between these qualities mediates optimal performance in human-engineered networks (Konsynski and Tiwana 2004; Meepetchdee and Shah 2007) and in networks constructed in silico (Venkatasubramanian et al. 2006; Meepetchdee and Shah 2007), whether these constraints are an important feature of biologic networks is a much more difficult question to answer. If we use a cartographic view of a biologic network, we can take a variety of topologic measurements that are theoretically related to efficiency and robustness. In this context, a maximally efficient network has lower average node degree, shorter average path length, and lower ratio of hub-to-nonhub nodes; a maximally robust network displays converse topologic features (Venkatasubramanian et al. 2006). Average node clustering coefficient, which indirectly measures link redundancy, and modularity have also been shown to have a strong positive relation with network robustness (Hartwell et al. 1999; Variano et al. 2004; Zhao et al. 2006; Holmgren 2006).

A recent large-scale survey of modularity showed that bacterial metabolic networks were more modular for bacteria from more variable environments (Parter et al. 2007). The results from Parter et al. (2007) are important for two reasons. First, they demonstrate the existence of substantial interspecific variation in modularity, an important feature of network cartography. Second, they illustrate the potential ecologic relevance of such variation. However, modularity is just one aspect of metabolic network cartography and is better viewed in conjunction with other measures of network topology to provide a complete picture of network efficiency and robustness. Furthermore, Parter et al. (2007) did not explicitly accommodate the influence of phylogeny on their statistical analyses, which is a critical component of any cross-species comparative analysis. Here, we report an expanded analysis based on a suite of different measures of network cartography (hereafter referred to as “network shape profile”) relevant to the trade-off between network efficiency and robustness. Furthermore, we employ a hierarchical mixed-effect statistical model to explicitly accommodate the influence of phylogenetic nonindependence among species in our data set.

Network Theory

Topologic measurement in network biology is dominated by a small set of shape indices: (1) average path length, (2) average node degree, (3) degree distribution (λ value), (4) clustering coefficient, and (5) modularity. Each of these indices conveys critical information about efficiency and robustness in network function. Average path length (APL) is measured as the average shortest distance between all possible pairs of nodes in a network, where the distance between two nodes is the number of edges on the shortest path between them. In a random network, i.e., a network that is produced by adding edges to randomly chosen pairs of nodes,

$$ APL_{\rm rand} \cong \frac{\ln n}{{\ln \overline{k} }}, $$

(1)

where n refers to the number of nodes in the network, and $ \bar{k} $ is the average node degree (Fronczak et al. 2004). Node degree is the number of connections (k) between a given node and other nodes in the network. In metabolic networks, smaller APL indicates that metabolic end products can be produced in a shorter average number of steps and, thus, with lower energetic input. Similarly, a low average node degree suggests that a smaller amount of cellular resources are dedicated to each network node (i.e., metabolite).
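As an illustration of Eq. 1, the following minimal Python sketch computes APL by breadth-first search and compares it with the random-network expectation. It assumes an undirected, unweighted, connected network stored as an adjacency dictionary; the function names and the toy network are hypothetical and are not taken from Pajek or from the analysis reported here.

```python
import math
from collections import deque

def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs

def average_degree(adj):
    """Average number of connections per node (k-bar)."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def apl_random_expectation(n, k_bar):
    """Eq. 1: expected APL of a random network; requires k_bar > 1."""
    return math.log(n) / math.log(k_bar)

# Hypothetical toy network: a four-node chain a-b-c-d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
apl = average_path_length(toy)
apl_rand = apl_random_expectation(len(toy), average_degree(toy))
normalized_apl = apl / apl_rand  # observed APL relative to the random expectation
```

Dividing the observed value by the random expectation is one simple way to express APL relative to a random network with the same number of nodes and average degree.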

The degree distribution of a network is based on the pattern of variation in node degree (Barabasi and Albert 1999) and is defined as the probability that a randomly selected node will have k connections within the network. This distribution has been shown in a wide range of network types to decay as a power law, $ P(k) \sim k^{-\lambda} $, where λ represents the slope of the decay. High values of λ indicate a steeper slope, i.e., a smaller proportion of high-degree nodes. Networks with high λ are more centered on a small number of hub nodes and are more likely to be fractioned if nodes are removed (Venkatasubramanian et al. 2006).
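To make the meaning of λ concrete, the sketch below estimates the exponent from a list of node degrees. It uses the continuous maximum-likelihood approximation, λ ≈ 1 + n / Σ ln(k_i / k_min), which is an assumption made for illustration and not necessarily the estimator used in the analysis reported here; the function name and both degree lists are hypothetical.

```python
import math

def estimate_lambda(degrees, k_min=1):
    """Continuous maximum-likelihood approximation of the power-law exponent.

    Only degrees >= k_min are used. This is an illustrative estimator,
    not the procedure of the source analysis.
    """
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / k_min) for k in tail)
    if log_sum == 0:
        raise ValueError("all degrees equal k_min; the exponent is undefined")
    return 1.0 + len(tail) / log_sum

# Hypothetical degree lists: few versus many high-degree nodes.
hub_poor = [1, 1, 1, 1, 1, 1, 2, 2]
hub_rich = [1, 1, 2, 2, 3, 5, 8, 13]
lambda_poor = estimate_lambda(hub_poor)
lambda_rich = estimate_lambda(hub_rich)
```

In this toy comparison the list with a smaller proportion of high-degree nodes yields the larger estimate of λ, in line with the interpretation given above.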

Clustering coefficient is measured as the average fraction of pairs of a node’s neighbours that are themselves connected (Watts and Strogatz 1998). Equation 2 defines clustering coefficient for a single node,

$$ CC_{i} = \frac{{2|E\left( {G\left( {v_{i} } \right)} \right)|}}{{k_{i} \left( {k}_{i} - 1 \right)}}, $$

(2)

where $ |E(G(v_{i}))| $ denotes the number of links among all neighbours of node i, and $ k_{i} $ is the degree of node i. If node i has a clustering coefficient of 1, each of its neighbour nodes is directly connected to every other neighbour. Node i therefore increases redundancy in the network by providing an alternate path between each of its neighbours. Average clustering coefficient for the network, $ \overline{CC} $, is defined by Eq. 3,

$$ \overline{CC} = \frac{1}{n}\sum\limits_{i = 1}^{n} {CC_{i} } $$

(3)

Only in a globally coupled network (where every node connects with every other node) would $ \overline{CC} $ = 1. In a random network,

$$ \overline{CC}_{\rm rand} \cong \frac{{\overline{k} - 1}}{n}, $$

(4)

where $ \overline{k} $ is the average node degree (Light and Kraulis 2004).
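Equations 2 through 4 can be sketched in a few lines of Python. The example assumes an undirected network stored as a dictionary of neighbour sets and assigns a coefficient of 0 to nodes with fewer than two neighbours, a convention chosen here for illustration; the function names and the toy network are hypothetical.

```python
def clustering_coefficient(adj, node):
    """Eq. 2: fraction of pairs of a node's neighbours that are linked."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention: undefined cases count as 0
    links = sum(
        1
        for i, u in enumerate(nbrs)
        for v in nbrs[i + 1:]
        if v in adj[u]
    )
    return 2 * links / (k * (k - 1))

def average_clustering(adj):
    """Eq. 3: mean of the node-level clustering coefficients."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

def clustering_random_expectation(adj):
    """Eq. 4 as stated in the text: (k_bar - 1) / n."""
    n = len(adj)
    k_bar = sum(len(nbrs) for nbrs in adj.values()) / n
    return (k_bar - 1) / n

# Hypothetical toy network: triangle a-b-c with a pendant node d attached to a.
toy = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
cc_mean = average_clustering(toy)
cc_rand = clustering_random_expectation(toy)
```

Here node a has three neighbours, of which only one pair (b, c) is linked, so its coefficient is 1/3, whereas b and c each have a coefficient of 1.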

The last measure, modularity, is unlike the other network indices because it is an optimality criterion that is maximized for a given network. Here, a module is defined as a semiautonomous group of nodes wherein the number of connections within the module outnumbers connections to other modules (Guimera and Nunes-Amaral 2005), and the degree to which a network can be subdivided in this way defines its modularity. For a given partition of a network, modularity = M,

$$ M \equiv \sum\limits_{s = 1}^{{N_{M} }} {\left[ {\frac{{l_{s} }}{L} - \left( {\frac{{d_{s} }}{2L}} \right)^{2} } \right]} , $$

(5)

where $ N_{M} $ is the number of modules, L is the number of links in the network, $ l_{s} $ is the number of links between nodes in module s, and $ d_{s} $ is the sum of the degrees of the nodes in module s (Raff and Raff 2000; Variano et al. 2004). The goal of module-determining algorithms is to uncover the configuration that maximizes modularity in the network. Networks with high modularity have well-defined modules with a low proportion of between-module links. In such networks, any “damage” incurred through random node failure is more likely to occur within a module (rather than between modules) and to be contained within that module rather than fractioning the entire network (Raff and Raff 2000; Variano et al. 2004; Griswold 2006).
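For a fixed partition, Eq. 5 can be evaluated directly. The sketch below does only that; it does not search for the partition that maximizes M, which is the task of the module-determining algorithms mentioned above. The function name, the edge list, and the module labels are hypothetical.

```python
def modularity(edges, module_of):
    """Eq. 5: M for a given assignment of nodes to modules.

    edges     : list of undirected links (u, v), each listed once
    module_of : dict mapping every node to its module label
    """
    total_links = len(edges)          # L
    links_within = {}                 # l_s for each module s
    degree_sum = {}                   # d_s for each module s
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        degree_sum[mu] = degree_sum.get(mu, 0) + 1
        degree_sum[mv] = degree_sum.get(mv, 0) + 1
        if mu == mv:
            links_within[mu] = links_within.get(mu, 0) + 1
    return sum(
        links_within.get(s, 0) / total_links - (d / (2 * total_links)) ** 2
        for s, d in degree_sum.items()
    )

# Hypothetical toy network: two triangles joined by a single bridge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_modules = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_module = {node: 1 for node in two_modules}
m_two = modularity(edges, two_modules)
m_one = modularity(edges, one_module)
```

Splitting the toy network at the bridge gives M = 5/14, whereas placing all nodes in a single module gives M = 0, which shows why the partition with few between-module links is preferred.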

The relation between network shape and function is supported in simulated networks and applied in human-engineered networks (Konsynski and Tiwana 2004; Meepetchdee and Shah 2007). Highly efficient networks tend to have short APL, small average node degree and degree distribution (low ratio of hubs to nonhubs), and low level of clustering coefficient and modularity. Finding balance between efficiency and robustness is a matter of practical concern in man-made networks because highly efficient networks are more cost-effective but exhibit decreased error tolerance. In this article we address the question of whether real metabolic networks exhibit such trade-offs, and if they are associated with the ecology of the organism.

Materials and Methods

Metabolomic Data

Metabolic connectivity lists were extracted from the metabolomic data compiled by Ma and Zeng (2003). This database contains manually curated metabolic data from the Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/). Metabolic-reaction lists for 105 species of bacteria were extracted from the Ma and Zeng (2003) database and visualized by using Pajek network analysis software (Batagelj and Mrvar 2003).

Ecologic Data

Ecologic characteristics for each species were defined in terms of five indices: niche breadth, obligate endosymbiosis, host association, host restriction, and pathogenicity. Status of host association was obtained from the National Center for Biotechnology Information (NCBI) microbial genome database (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi). The rest were obtained from the Genome Bank database (http://www.genomics.ceh.ac.uk/cgi-bin/gmine/gminemenu.cgi). Niche breadth is a composite characteristic that can be viewed as a proxy for both the diversity and fluctuation of metabolites under which an organism’s metabolism must function. The index ranges from 1 to 5; a species that inhabits a narrow and stable environment has a niche breadth score of 1, whereas a score of 5 is ascribed to one found in a highly complex and dynamic environment (Supplementary Table S1) (http://www.genomics.ceh.ac.uk/gmine/genomebankbacterialinfo.html; see Web site entry on “environmental breadth”). For comparison, we also employ the number of input metabolites for a given metabolic network as an alternative measure of diversity and fluctuation of metabolic substrates that a species is capable of using. This is a quantitative rather than ordinal measure of niche breadth, and it is computed by determining the number of metabolite nodes consumed (i.e., as enzyme substrates) but not produced (as metabolic reaction products). The remaining ecologic characteristics are categoric variables, with each species having a score of 0 or 1. Note that host association, host restriction, and obligate endosymbiosis have logical associations with niche breadth because each is related to some aspect of habitat complexity.
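The count of input metabolites described above amounts to a set difference. The following sketch assumes that each reaction is available as a pair of substrate and product lists; this data layout, the function name, and the placeholder metabolite labels are hypothetical and do not reflect the format of the Ma and Zeng (2003) database.

```python
def input_metabolites(reactions):
    """Metabolites consumed as substrates but never produced by any reaction."""
    consumed, produced = set(), set()
    for substrates, products in reactions:
        consumed.update(substrates)
        produced.update(products)
    return consumed - produced

# Hypothetical toy reaction list with placeholder metabolite labels.
toy_reactions = [
    (["A"], ["B"]),
    (["B"], ["C"]),
    (["C", "X"], ["D"]),
]
inputs = input_metabolites(toy_reactions)
n_inputs = len(inputs)
```

In the toy list, A and X are consumed but never produced, so they are the two input metabolites; B and C are intermediates and D is an end product.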

Construction and Topologic Assessment of Metabolic Networks

Metabolic networks are reconstructed from the metabolite connectivity lists extracted from the Ma and Zeng (2003) database. We compute the APL, clustering coefficient, degree distribution, and average node degree for all 105 networks by using the Pajek (Batagelj and Mrvar 2003) software package. Modularity is determined using a program provided by Guimera et al. that employs simulated annealing as a heuristic algorithm to determine the network conformation that maximizes modularity (Guimera et al. 2004; Guimera and Nunes Amaral 2005). We note that simulated annealing is a stochastic optimization method (Kirkpatrick et al. 1983) that appears to be most successful in maximizing modular structure for a biologic network (Guimera and Nunes Amaral 2005). Using this method, it has also been shown that modularity increases with increasing average node degree (Guimera et al. 2004). Therefore, when comparing modularity between networks, we calculate modularity for multiple randomizations of a network and normalize modularity by averaging these values.

Hierarchical Mixed-Effect Regression Model to Accommodate the Influence of Phylogenetic Nonindependence

We use a regression framework to explore how ecologic characteristics of a lineage of bacteria are related to the shape of its metabolic network. However, because the bacteria are related by evolutionary history, their traits cannot be treated as independent observations. Statistical accommodation of character correlations caused by shared evolutionary history is traditionally handled by using a set of techniques that are collectively referred to as “phylogenetic comparative methods” (PCMs). Under these models, it is assumed that longer branch length between a pair of species implies greater phylogenetic and thus greater statistical independence. Such species are given greater weight in a PCM, whereas closely related species are downweighted in the model (see Supplementary Material for further discussion of these methods). The problem with applying a typical PCM to the present data set is that there is a degree of correlation between phylogenetic branch length and niche breadth. This correlation is thought to arise from two sources. First, species with narrow habitats tend to have smaller effective population sizes, thereby increasing the rate of fixation of mutations (Woolfit and Bromham 2003). Second, prokaryotic species in narrow environments tend to undergo genome reduction and lose genes that code for DNA repair enzymes, which further accelerates rate of nucleotide substitution (Dale et al. 2003). This correlation between branch length and niche breadth introduces a systematic bias wherein species with narrow niche breadth are consistently upweighted in the PCM, which can lead to inflated type I errors. However, we cannot ignore phylogeny either because treating lineages as completely independent also can lead to inflated errors.

Here, we propose a new PCM for cases where phylogenetic branch length and model covariates are not expected to be independent. The new method is based on linear mixed-effect (LME) regression models (Pinheiro and Bates 2000), which capture the hierarchical structure of phylogenetic history in a set of nested random effects but do not require use of a highly resolved phylogeny with branch length information. This random component of our LME model allows random variation in intercept between the nested phylogenetic groups, and correlation structure for errors within these groupings. The random effects structure of the LME is determined by the data in hand by way of a hierarchical set of likelihood ratio tests (LRTs). The LRT testing follows a backward-elimination procedure based on the topology of the original phylogeny.

We employed a phylogenetic tree that represented a majority-rule consensus topology from the literature and then calculated branch lengths with a maximum-likelihood–based analysis of 4 highly conserved ribosomal protein sequences (Yang 1997; Mollet et al. 1998; Daubin et al. 2002; Wolf et al. 2002, 2004; Lerat et al. 2003; Brown and Volker 2004; Canback et al. 2004; Santos and Ochman 2004; Belda et al. 2005; Bern and Goldberg 2005; Henz et al. 2005; Kunin et al. 2005; Zhao et al. 2005; Chan et al. 2006; Ciccarelli et al. 2006; Fitzpatrick et al. 2006). Random-effects structure was determined for each network index by dividing this tree into 21 phylogenetic groups and then performing a backward-elimination procedure to collapse all groups that did not have significantly distinct random intercepts. For each group, likelihood was calculated for an LME model with and without the group included, and a likelihood ratio test was carried out (on a 50:50 mixture of $ \chi_{0}^{2} $ and $ \chi_{1}^{2} $) to determine if the grouping significantly improved the model fit (α = 0.05) (Self and Liang 1989). Once a random-effects structure was defined for each network statistic, ecologic features were assessed individually for significant associations with each measure of network shape. Additional details of the procedure, as well as a graphic representation of the phylogenetic groups, are provided as Supplementary Material (Supplementary Fig. S1; Supplementary Notes).
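The p value of a likelihood ratio test against a 50:50 mixture of $ \chi_{0}^{2} $ and $ \chi_{1}^{2} $ can be sketched as follows. Because $ \chi_{0}^{2} $ is a point mass at zero, the p value for a positive test statistic is half the upper-tail probability of $ \chi_{1}^{2} $. The function names and the log-likelihood values are hypothetical, and the sketch does not fit the LME models themselves.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def mixture_lrt_pvalue(loglik_with_group, loglik_without_group):
    """P value under a 50:50 mixture of chi-square(0) and chi-square(1)."""
    stat = max(0.0, 2.0 * (loglik_with_group - loglik_without_group))
    if stat == 0.0:
        return 1.0  # both mixture components place all mass at or above 0
    return 0.5 * chi2_sf_1df(stat)

# Hypothetical log-likelihoods for one candidate phylogenetic grouping.
loglik_with, loglik_without = -120.0, -123.0
p_value = mixture_lrt_pvalue(loglik_with, loglik_without)
keep_group = p_value < 0.05  # retain the grouping if it improves the fit
```

Under the mixture, the critical value at α = 0.05 is approximately 2.71 rather than the 3.84 of an ordinary $ \chi_{1}^{2} $ test, so the test is less conservative when a variance component lies on the boundary of its parameter space.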

For comparison, we also analyze the same data under a classical PCM using a generalized estimating equation (GEE) framework, which is among the most commonly used statistical approaches in phylogenetic comparative modelling (Paradis and Claude 2002). As with typical PCMs, the phylogenetic tree structure is used to define a variance–covariance matrix, which is then applied to determine the weight given to each observation (i.e., species). Further explanation of this analytic framework is presented in the Supplementary Material.

Results and Discussion

Cartography of Metabolic Networks Reflects a Trade-Off Between Efficiency and Robustness

Using publicly available metabolomic data, we reconstructed metabolic networks of 105 bacterial lineages from 8 distinct phyla and a wide range of ecologic lifestyles. Based on these networks, we created a network shape profile for each lineage that includes measurements of APL, average node degree, exponent of power-law distribution of degrees (λ; approximately measures the ratio of hub-to-nonhub nodes), clustering coefficient, and modularity. Figure 1 shows the substantial natural diversity of each network shape index. Note that the indices in Fig. 1 are standardized to aid comparison (the original scores are presented in the Supplementary Material). Variation in normalised APL (mean 0.9292; SD 0.2738), node degree (mean 1.851; SD 0.129), and normalised clustering coefficient (mean 19.987; SD 7.291) is particularly dramatic. For example, clustering coefficient ranges from a maximum of 34.363 in E. coli (strain O157 EDL933), to a minimum of 1.243 in Ureaplasma urealyticum. Because these values have been normalized for network size, they indicate that metabolic network redundancy (in terms of alternate paths between neighbouring nodes) can be quite variable between species.

To illustrate this variation on a system level, Fig. 2a and b show the striking topologic differences in the metabolic networks of two bacterial species, Buchnera aphidicola and Pseudomonas aeruginosa. Based on their shape profiles, the P. aeruginosa network should have higher robustness and lower efficiency compared with B. aphidicola. With short APL and low node degree, the B. aphidicola network has a more efficient topology. However, the low redundancy of links and rarity of hubs yields a network expected to be more subject to fragmentation in the face of system perturbations, i.e., it will be less robust. P. aeruginosa has a more densely connected network with higher average node degree, longer APL, and a higher ratio of hubs to nonhubs; also, this network is more clustered and modular (see Supplementary Table S2). Both are species of γ-proteobacteria; however, B. aphidicola is a nonpathogenic obligate endosymbiont, whereas P. aeruginosa is a pathogen that can be found in a variety of terrestrial and aquatic habitats. Although these examples are consistent with an association between niche breadth and network topology, a more systematic approach is required to determine if such an association holds over our broad sample of bacterial diversity.

To explore the relation between efficiency and robustness of metabolic networks, we plot all possible comparisons of the network statistics included in our shape profile (Fig. 2c). The data for APL, clustering coefficient, and modularity are normalized by the expected values for a randomly connected network with the same number and average degree of nodes. This normalization is important because these indices are, to an extent, mathematically dependent on network density; i.e., if the number of edges in a given network was doubled without changing the number of nodes, APL and modularity would decrease, whereas average clustering coefficient would increase. The schematic plots along the diagonal of Fig. 2c show the relation between each network statistic, the properties of efficiency and robustness, and the Pearson correlation coefficient for each pair. Figure 2c shows that after normalization for network size and node degree, all measures of metabolic network shape exhibit evidence of positive correlation. We note, however, that the measure of degree distribution (λ) exhibits a generally weaker correlation with the rest of the indices. This indicates that the proportion of hubs to nonhubs does not strongly covary with other measures of network shape. Nonetheless, there is a positive correlation across all measures, and this is expected if the network topographies tend to reflect a consistent trade-off between efficiency and robustness (Venkatasubramanian et al. 2006). Indeed, for many of the network statistics the correlation is strong (i.e., >0.8 correlation coefficient). Based on these results, it appears that bacterial metabolic networks with short APL also tend to have small average node degree and low clustering and modularity, all of which are expected to promote efficiency over robustness. Although it is theoretically possible for a network to be highly efficient in terms of some indices and highly robust for others (Albert and Barabasi 2002), this is not the case in the bacterial metabolic networks examined here.

Biologic Conclusions Are Sensitive to the Statistical Framework of a Phylogenetic Comparative Method

Simulation-based in silico studies of the origin of nonrandom structure in networks suggest that trade-offs between efficiency and robustness can reflect environmental factors (Venkatasubramanian et al. 2006), e.g., networks that must function in the face of frequent environmental perturbation, or over a wider variety of conditions, will tend to have a more robust structure. We use PCMs to explore the potential for such a relation in biologic networks by assessing whether ecologic characteristics are strong predictors of metabolic network topography. Under the LME framework, we incorporate the underlying phylogenetic structure of the data into our model by way of the random component. Under the GEE framework, the topology and branch lengths of the phylogeny are used to determine a correlation matrix for the regression model. The fixed component of each model comprises the number of input metabolites for each network as well as the five ecologic characteristics for each of the sampled lineages in our data set: niche breadth, host association, host restriction, obligate endosymbiosis, and pathogenicity.

The LME regressions indicate that certain ecologic characteristics of an organism are predictors of the shape of its metabolic network. We observe highly significant relations between network shape and niche breadth, number of metabolite inputs, obligate endosymbiosis, host restriction, and host association but not pathogenicity (Table 1). The GEE-based regression results, however, indicate different relations. GEE results suggest a significant relation between network shape and pathogenicity and significant correlation between some measures of network shape and obligate endosymbiosis, host restriction, and host association. In addition, for some regressions between network shape indices and niche breadth and number of metabolite inputs, GEE results indicate significant relations in the opposite direction of those indicated by the LME results (Table 1). Using APL and niche breadth as an example, the LME regression indicates a positive relation (p < 1e−16), whereas GEE regression indicates a negative relation (p < 1e−14; Fig. 3). Because both results are significant, the difference cannot be attributed to sampling errors; at least one case is an analytic artifact. Regardless of the origin of the discrepancy, biologic conclusions for these data are sensitive to the assumptions of the PCM.

Table 1 Fixed-effect coefficients for linear mixed effect and generalized estimating equation regressions

The GEE regression uses a Brownian motion model whereby branch lengths determine the degree of independence among observations (Martins and Hansen 1997). Specifically, branch lengths determine the values of a variance–covariance matrix that are used to weight the observations from each species. A relation between branch lengths and the response variables would lead to an inappropriately formulated matrix and thereby introduce a systematic error into the weighting scheme. Figure 3 illustrates that the difference between the LME and GEE regression is indeed a consequence of the weighting scheme; a subset of observations are strongly weighted (in blue), and it appears that the result of the GEE regression is strongly influenced by these observations. We conducted a comprehensive sensitivity analysis of the two regression methods and found that the GEE results were sensitive to the weighting scheme of the variance–covariance matrix (see Supplementary Material; Supplementary Fig. S2). Given the sensitivity of the results under the GEE approach to the structure of the variance–covariance matrix, and a clear gap between the assumptions of the Brownian motion model and the branch length data, we conclude that the GEE-based regression results are unreliable in this setting.
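The dependence of the variance–covariance matrix on branch lengths can be illustrated with a short sketch. Under Brownian motion, the covariance between two species is proportional to the branch length they share between the root and their most recent common ancestor, and the variance of a species is its total root-to-tip path length. The tree encoding, the function names, and the toy tree are hypothetical and do not reproduce the GEE implementation used here.

```python
def root_path(parent, node):
    """Branches (named by their child node) on the path from node to the root."""
    path = []
    while node in parent:
        path.append(node)
        node = parent[node][0]
    return path

def bm_covariance(parent, tips):
    """Brownian-motion covariance: summed length of branches shared by two tips."""
    cov = {}
    for a in tips:
        branches_a = set(root_path(parent, a))
        for b in tips:
            shared = branches_a & set(root_path(parent, b))
            cov[(a, b)] = sum(parent[n][1] for n in shared)
    return cov

# Hypothetical toy tree: child -> (parent, branch length).
# A and B are sister species; C sits alone on a long branch.
parent = {
    "X": ("root", 1.0),
    "A": ("X", 1.0),
    "B": ("X", 1.0),
    "C": ("root", 3.0),
}
cov = bm_covariance(parent, ["A", "B", "C"])
```

In the toy tree, A and B share one unit of branch length and therefore covary, whereas C shares none with either and is treated as an independent observation. If long branches are themselves associated with a covariate such as niche breadth, this is the route by which the weighting can become systematically biased.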

The LME regression model accommodates phylogenetic structure as a set of nested random effects. Moreover, because it does not rely on branch length data, it is appropriate for modeling data with correlation between model covariates and branch length and also in cases where a phylogenetic tree cannot be confidently resolved. This is particularly important in studies of prokaryotes, where frequent lateral gene transfer (LGT) causes significant heterogeneity in phylogenetic signal among genes. The LME approach only requires that the structure of the nested random effects is correct, whereas the GEE approach requires that a fully resolved organism history, without errors, is known a priori. The LME regression is clearly more appropriate for these data and indicates that the shape of a metabolic network, as it relates to the trade-off between efficiency and robustness, is associated with habitat complexity.

Trade-offs Between Efficiency and Robustness in Bacterial Metabolic Networks Are Associated with Niche Breadth

We have shown that the structure of a metabolic network varies greatly among prokaryotes and that it is associated with several measures of habitat complexity. To investigate the robustness of this finding, we reanalyzed network cartography with respect to the number of input metabolites. This provides an alternative measure of the way an organism uses the metabolic complexity of its environment because it is a function of the diversity of metabolic substrates that a species is capable of metabolizing. An LME regression using number of input metabolites (rather than niche breadth) yielded the same qualitative results: a highly significant positive relation between network shape and number of input metabolites (Table 1). Thus, the relation is robust to these measurements of habitat complexity. We propose that niche breadth is the most useful predictor of network shape because it provides a simple yet encompassing measure of environmental complexity.

Figure 4 illustrates how network cartography changes as a function of niche breadth. In species with low niche breadth, topologic features conferring efficiency are prominent, whereas those that impart robustness are weaker. Short APL in such networks indicates that metabolic end products can generally be produced in a smaller number of steps and, thus, with less energetic input (i.e., cellular resources). This has been previously reported for genomes of intracellular species that have lost genes associated with processing metabolites that can be consistently acquired from the host (Batagelj and Mrvar 2003). Estimates of λ indicate that these networks tend to be centered on just a small number of hub nodes. Furthermore, lower clustering coefficients indicate lower redundancy within the network. Although highly efficient, such networks are susceptible to fragmentation if a hub node (i.e., metabolite) becomes unavailable. Network robustness, by way of redundancy, is more costly to maintain and may be an energetic liability in habitats with a narrow and stable collection of metabolites, such as the intracellular or host-restricted environment.

Species inhabiting highly complex and dynamic environments, i.e., large niche breadth, have larger average node degree, indicating greater energetic input into cellular metabolism (see Fig. 4 and the Supplementary Material). Although less efficient, the networks of such species are more robust through greater edge redundancy, a feature that is considered critical for survival in conditions with fluctuating metabolite availability. Indeed, species with wide environmental breadth have been shown to respond to environmental changes in metabolite availability through activation of alternate metabolic pathways (Almaas et al. 2005). Even in cases where alternative pathways are not available, changes in metabolite availability will have less impact on overall metabolism, compared with species with low niche breadth, because their networks tend to be more modular, i.e., they tend to have greater independence between clusters of nodes.

Variation in network shape is greatest among niche breadth scores 1 through 3 and more subtle among scores 3 through 5. This may reflect more dramatic lifestyle variation between species with narrow habitats, or it may indicate that the definitions of niche breadth need to be refined. Future efforts to characterize habitat complexity may benefit from a semiquantitative approach that incorporates both ordinal and quantitative indices. The index describing the number of metabolite inputs provides a useful new perspective on species ecology because it describes metabolic plasticity in quantitative terms.

Despite recent developments, science is a long way from a full understanding of the origins of biologic complexity. The tools of network analysis seem to offer great promise for advancing such studies; however, in the rush to apply these tools to questions of complexity, there has been little critical assessment of the scope of network shape diversity or its potential ecologic relevance (Lynch 2007). With this work we have taken some first steps in that direction. We have established that the natural diversity of metabolic cartography is extensive and covaries with features of ecology that have clear metabolic significance. The positive association between network robustness and niche breadth suggests that network cartography can be viewed as a complex phenotype with potentially adaptive qualities. The next challenge is to develop a rigid null model for the evolution of metabolic structure by neutral processes. Such a model could serve as the basis for an explicit test of the role of natural selection in the origins of network robustness and efficiency.

References

Albert R, Barabasi A (2002) Statistical mechanics of complex networks. Rev Mod Phys 74:47–97
Almaas E, Oltvai ZN, Barabasi AL (2005) The activity reaction core and plasticity of metabolic networks. PLoS Comput Biol 1:e68
Alves R, Chaleil RA, Sternberg MJ (2002) Evolution of enzymes in metabolism: a network perspective. J Mol Biol 320:751–770
Batagelj V, Mrvar A (2003) Pajek: analysis and visualization of large networks. In: Jünger M, Mutzel P (eds) Graph drawing software. Springer, Berlin, Germany, pp 77–103
Becker D et al (2006) Robust Salmonella metabolism limits possibilities for new antimicrobials. Nature 440:303–307
Belda E, Moya A, Silva FJ (2005) Genome rearrangement distances and gene order phylogeny in gamma-proteobacteria. Mol Biol Evol 22:1456–1467
Bern M, Goldberg D (2005) Automatic selection of representative proteins for bacterial phylogeny. BMC Evol Biol 5:34
Brown JR, Volker C (2004) Phylogeny of gamma-proteobacteria: resolution of one branch of the universal tree? BioEssays 26:463–468
Canback B, Tamas I, Andersson SG (2004) A phylogenomic study of endosymbiotic bacteria. Mol Biol Evol 21:1110–1122
Chan PY, Lam TW, Yiu SM (2006) A more accurate and efficient whole genome phylogeny. Proceedings of the 4th Asia-Pacific bioinformatics conference, pp 337–352
Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, Bork P (2006) Toward automatic reconstruction of a highly resolved tree of life. Science 311:1283–1287
Dale C, Wang B, Moran N, Ochman H (2003) Loss of DNA recombinational repair enzymes in the initial stages of genome degeneration. Mol Biol Evol 20:1188–1194
Daubin V, Gouy M, Perriere G (2002) A phylogenomic approach to bacterial phylogeny: evidence of a core of genes sharing a common history. Genome Res 12:1080–1090
Fitzpatrick DA, Creevey CJ, McInerney JO (2006) Genome phylogenies indicate a meaningful alpha-proteobacterial phylogeny and support a grouping of the mitochondria with the rickettsiales. Mol Biol Evol 23:74–85
Fronczak A, Fronczak P, Hołyst JA (2004) Average path length in random networks. Phys Rev E Stat Nonlin Soft Matter Phys 70:1–7
Griswold CK (2006) Pleiotropic mutation, modularity, evolvability. Evol Dev 8:81–93
Guimera R, Nunes Amaral LA (2005) Functional cartography of complex metabolic networks. Nature 433:895–900
Guimera R, Sales-Pardo M, Amaral LAN (2004) Modularity from fluctuations in random graphs and complex networks. Phys Rev E Stat Nonlin Soft Matter Phys 70:025101 [epub]
Hartwell LH, Hopfield JJ, Leibler S, Murray AW (1999) From molecular to modular cell biology. Nature 402:C47–C52
Henz SR, Huson DH, Auch AF, Nieselt-Struwe K, Schuster SC (2005) Whole-genome prokaryotic phylogeny. Bioinformatics 21:2329–2335
Holmgren AJ (2006) Using graph models to analyze the vulnerability of electric power networks. Risk Anal 26:955–969
Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi A (2000) The large-scale organization of metabolic networks. Nature 407:651–654
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680
Konsynski B, Tiwana A (2004) The improvisation-efficiency paradox in inter-firm electronic networks: governance and architecture considerations. Inform Technol 19:234–243
Kunin V, Goldovsky L, Darzentas N, Ouzounis CA (2005) The net of life: reconstructing the microbial phylogenetic network. Genome Res 15:954–959
Lerat E, Daubin V, Moran NA (2003) From gene trees to organismal phylogeny in prokaryotes: the case of the gamma-proteobacteria. PLoS Biol 1:E19
Lynch M (2007) The frailty of adaptive hypotheses for the origins of organismal complexity. Proc Natl Acad Sci USA 104:8597–8604
Ma H, Zeng AP (2003) Reconstruction of metabolic networks from genome data and analysis of their global structure for various organisms. Bioinformatics 19:270–277
Martins EP, Hansen EF (1997) Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspecific data. Am Nat 149:646–667
Meepetchdee Y, Shah S (2007) Logistical network design with robustness and complexity considerations. Int J Phys Distrib Log Manage 37:201–222
Mollet C, Drancourt M, Raoult D (1998) Determination of Coxiella burnetii rpoB sequence and its use for phylogenetic analysis. Gene 207:97–103
Oltvai ZN, Barabasi AL (2002) Systems biology. Life’s complexity pyramid. Science 298:763–767
Paradis E, Claude J (2002) Analysis of comparative data using generalized estimating equations. J Theor Biol 218:175–185
Parter M, Kashtan N, Alon U (2007) Environmental variability and modularity of bacterial metabolic networks. BMC Evol Biol 7:69
Patil KR, Nielsen J (2005) Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc Natl Acad Sci USA 102:2685–2689
Pinheiro J, Bates D (2000) Theory and computational methods for linear mixed-effects models. In: Chambers J, Eddy W, Härdle W, Sheather S, Tierny L (eds) Mixed-effects models in S and S-plus. Springer, New York, NY, pp 57–96
Raff EC, Raff RA (2000) Dissociability, modularity, evolvability. Evol Dev 2:235–237
Santos SR, Ochman H (2004) Identification and phylogenetic sorting of bacterial lineages with universally conserved genes and proteins. Environ Microbiol 6:754–759
Self SG, Liang KY (1987) Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J Am Stat Assoc 82:605–610
Stelling J, Klamt S, Bettenbrock K, Schuster S, Gilles ED (2002) Metabolic network structure determines key aspects of functionality and regulation. Nature 420:190–193
Variano EA, McCoy JH, Lipson H (2004) Networks, dynamics, and modularity. Phys Rev Lett 92:188701 [epub]
Venkatasubramanian V, Politis DN, Patkar PR (2006) Entropy maximization as a holistic design principle for complex optimal networks. AIChE J 52:1004–1009
Watts DJ, Strogatz SH (1998) Collective dynamics of 'small-world' networks. Nature 393(6684):440–442
Wolf YI, Rogozin IB, Grishin NV, Koonin EV (2002) Genome trees and the tree of life. Trends Genet 18:472–479
Wolf M, Muller T, Dandekar T, Pollack JD (2004) Phylogeny of firmicutes with special reference to mycoplasma (mollicutes) as inferred from phosphoglycerate kinase amino acid sequence data. Int J Syst Evol Microbiol 54(Pt 3):871–875
Woolfit M, Bromham L (2003) Increased rates of sequence evolution in endosymbiotic bacteria and fungi with small effective population sizes. Mol Biol Evol 20:1545–1555
Wunderlich Z, Mirny LA (2006) Using the topology of metabolic networks to predict viability of mutant strains. Biophys J 91:2304–2311
Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13:555–556
Zhao J, Yu H, Luo JH, Cao ZW, Li YX (2006) Hierarchical modularity of nested bow-ties in metabolic networks. BMC Bioinformatics 7:386
Zhao Y, Davis RE, Lee IM (2005) Phylogenetic positions of Candidatus Phytoplasma asteris and Spiroplasma kunkelii as inferred from multiple sets of concatenated core housekeeping proteins. Int J Syst Evol Microbiol 55:2131–2141

Acknowledgments

We thank Hongwu Ma and An-Ping Zeng for providing their metabolic database and Robert Guimera for use of the program for determining modularity as well as useful suggestions for optimizing the implementation of the program. We thank the anonymous referees for constructive comments. This work was supported by grants from the Natural Sciences and Engineering Research Council of Canada to J. P. B. and H. G.; a grant from the Canadian Foundation for Innovation to J. P. B.; and a grant from the Nova Scotia Health Research Foundation to M. J. M.

Author information

Authors and Affiliations

Department of Biology, Dalhousie University, Halifax, NS, B3H-4J1, Canada
Melissa J. Morine, Ransom A. Myers & Joseph P. Bielawski
Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, B3H-4J1, Canada
Hong Gu & Joseph P. Bielawski

Corresponding author

Correspondence to Melissa J. Morine.

Additional information

Ransom A. Myers died March 27th, 2007. He will be missed.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 1044 kb)

About this article

Morine, M.J., Gu, H., Myers, R.A. et al. Trade-Offs Between Efficiency and Robustness in Bacterial Metabolic Networks Are Associated with Niche Breadth. J Mol Evol 68, 506–515 (2009). https://doi.org/10.1007/s00239-009-9226-5

Received: 07 November 2008
Revised: 04 March 2009
Accepted: 17 March 2009
Published: 14 April 2009
Issue Date: May 2009
DOI: https://doi.org/10.1007/s00239-009-9226-5
