Introduction

Although the concept of the metabolic syndrome (MS), also known as syndrome X, insulin resistance syndrome, and, more recently, cardiometabolic syndrome, may be considered clinically irrelevant [1, 2], the MS is recognized as a major and prevalent cardiovascular risk factor [3–5]. Five screening variables with different cutoffs are used to identify individuals with the MS: waist circumference, circulating levels of triglycerides and of HDL cholesterol, fasting glucose, and blood pressure. Therefore, the MS includes a constellation of complex diseases such as type 2 diabetes, dyslipidemias, central obesity, arterial hypertension, prothrombotic and proinflammatory states, ovarian polycystosis, and fatty liver disease.

On the basis of family studies and studies of twins reared together or apart, a heritable contribution to the key components of the MS has been demonstrated. Moreover, more than 50% of the variance of arterial blood pressure values, levels of lipids, and body mass index (BMI) is attributable to genetic influences, showing a strong genetic basis for each main component of the MS. Current evidence from both the rare monogenic forms of the MS and the quantitative and qualitative common traits found using various approaches such as candidate gene association, genetic linkage analysis, and genome-wide association studies (GWAS) indicates that the genetics of each of these diseases is complex in itself, and that the diseases vary along a spectrum from monogenic and syndromic forms (usually rare) to the most common polygenic and multifactorial forms, including some with mitochondrial defects. Though a comprehensive analysis of the genetics of each MS component is beyond the scope of this review, the interested reader can find specific reviews about the genetics of each component [6].

On one hand, advances in genotyping technology and information generated by GWAS have enormously expanded our knowledge about gene variants associated with each of the MS-associated phenotypes. Nevertheless, even with the comprehensive survey of the entire genome made by GWAS, many important aspects remain inconclusive and many important questions remain unanswered. First, most of the single nucleotide polymorphisms (SNPs) identified by GWAS are not placed in either coding regions or regulatory elements. On the contrary, most of them are intergenic SNPs. As a result, we may assume that the intergenic disease-associated SNPs identified by GWAS are in strong linkage disequilibrium (LD) with the true causal SNPs, or they are truly involved in the disease biology, in which case they should modulate gene expression in some way. Second, are the disease-associated SNPs identified by GWAS related to pathways suspected to play a role in the etiology of the observed phenotype? Third, can the enormous progress made with GWAS be rapidly translated into improved diagnostic strategies and therapies? Finally, can the complex constellation of diseases integrated into the MS be connected beyond a mere clinical observation? If so, can we hypothesize that common regulatory pathways or common physiologic processes link the MS-associated traits?

On the other hand, genetic factors other than DNA variation are likely to play an important role in the etiology of the MS, and the regulation of transcriptional and post-transcriptional gene expression and translation of mRNAs to proteins by microRNAs (miRNAs) is gaining acceptance. In particular, they can exert a dramatic influence on their target gene expression by acting not only on mRNA stability and translation but also on chromatin structure modifications [7]. A related and remarkable topic that few have explored is the potential influence of epigenetic changes on the development and progression of the MS-associated phenotypes. By definition, epigenetic factors—the most important of which are DNA methylation and covalent histone modifications—are modifiers of gene expression that are capable of self-replication through cell mitosis without altering the DNA sequence itself, and they can even be transmitted to the next generation of organisms. The most interesting points about epigenetic modifications are that they are crucial during development and they can be modified and disrupted by environmental influences such as dietary and behavioral habits and therapeutic interventions [8].

In this review, we use systems biology approaches to integrate genomic, molecular, and physiological data to decipher putative pathways suspected to play a role in the etiology of the MS-associated phenotypes. This approach is designed to analyze and integrate genomic, transcriptomic, and/or proteomic data to infer disease-related pathways from genetic signals. In addition, we update the current knowledge about the impact of epigenetic changes on the course of MS-associated traits and summarize some data about the role of polymorphisms at microRNA target sites in MS pathogenesis.

Systems Approach and Mechanistic Insights in the Etiology of MS-Associated Traits

As already mentioned, current evidence from candidate-gene association studies, genetic-linkage analysis, and GWAS indicates that the genetic component of each part of the MS is complex, and they vary from rare monogenic and syndromic forms to common polygenic and multifactorial forms. So far, hundreds of SNPs in different genes or loci have been proposed as potential modifiers (albeit modest) of the genetic risk of the intermediate phenotypes of the MS, even though the mechanism behind the association between a gene and the associated disease may be unknown.

Interestingly, from a clinical point of view, there is consensus that the components of the MS constitute a particular combination of underlying risk factors for a single adverse outcome: cardiovascular disease. It is also agreed that the MS carries a greater risk for adverse clinical outcomes than each single risk factor. Thus, the clinical behavior of the MS shows that this phenotype can be regarded as a whole, as its clinical effect is greater than the sum of its parts. Nevertheless, from a pathogenic point of view, the understanding of the underlying mechanisms of each disease is far more advanced than the understanding of all of them as an integrative process.

Systems biology introduces a new concept for revealing the pathogenesis of human disorders and suggests the presence of common physiologic processes and molecular networks influencing the risk of a disease. Rather than compartmentalizing individual risk factors (eg, insulin resistance, blood pressure, or lipid concentrations) and treating them as if they were separate and independent, systems biology examines their interactions. Here we show a model of this concept to explain the genetic determinants of MS-associated phenotypes.

Methods and Study Design

Based on the hypothesis that common physiologic processes and molecular networks may influence the risk of MS disease components, we proposed two systems-biology approaches: a gene enrichment analysis and the use of a protein-protein interaction network. Both approaches were evaluated by the bioinformatic resource ToppGene Suite (http://toppgene.cchmc.org) [9•]. A similar concept was recently used to find new candidate genes for the MS [10], rendering new loci whose associations with MS components were finally replicated in independent studies: HNF4α with type 2 diabetes in more than 49,000 individuals by a meta-analysis [11], and IGF1R with arterial hypertension in the Argentinean population [12].

To build the candidate gene list, we included 537 genes selected from the Genetic Association Database (GAD) on the basis of the following broad phenotype terms: hypertension, arterial blood pressure, obesity, insulin resistance, diabetes, body mass, waist, HDL, cholesterolemia, hypertriglyceridemia, fasting plasma insulin levels, lipid metabolism disorders, cholesterol, HDL, triglycerides, fat, and adiposity. The GAD (http://geneticassociationdb.nih.gov/) is an archive of human genetic association studies of complex diseases and disorders. This database includes summary data (molecular, clinical, and study parameters) extracted from published papers in peer-reviewed journals on candidate-gene and GWAS studies. The data were enriched with the list of genes described in the Catalog of Published GWAS, available at www.genome.gov/gwastudies (accessed in August 2010) [13].

We first performed a gene enrichment analysis using the ToppFun application (http://toppgene.cchmc.org/enrichment.jsp), which detects functional enrichment of the candidate gene list based on transcriptome, proteome, regulome (transcription factor binding sites and miRNA), ontologies (gene ontology, pathway), phenotype (human disease and mouse phenotype), pharmacome (drug-gene associations), literature co-citation, and other features. The system uses 14 similarity scores that are combined into an overall score using statistical meta-analysis, assigning a P-value [9•].
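As an illustration of what functional enrichment means in practice, the following minimal sketch tests whether a candidate gene list overlaps a pathway more than expected by chance, using a one-sided hypergeometric test with a Bonferroni correction. This is a common way to score enrichment and is shown here only as an assumption about the general technique; it is not a reproduction of the ToppFun scoring described above. The function names, the genome size, the pathway size, and the overlap are hypothetical; only the list size of 537 genes and the 58 pathways are taken from the text.

```python
from math import comb


def hypergeom_enrichment_p(genome_size, pathway_size, list_size, overlap):
    """One-sided P(X >= overlap) for a hypergeometric draw.

    list_size genes are drawn without replacement from genome_size genes,
    of which pathway_size are annotated to the pathway of interest.
    """
    max_overlap = min(pathway_size, list_size)
    total = comb(genome_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(genome_size - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical numbers: a 20,000-gene background, a 50-gene pathway,
# and 10 pathway genes found among the 537 candidates.
p_raw = hypergeom_enrichment_p(20000, 50, 537, 10)
p_adj = bonferroni(p_raw, n_tests=58)
print(f"raw P = {p_raw:.3g}, Bonferroni-adjusted P = {p_adj:.3g}")
```

Exact integer binomial coefficients are used so that the tail probability does not suffer from floating-point underflow for large backgrounds.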

As a second approach, we used the ToppGenet application (http://toppgene.cchmc.org/network_prioritization.jsp) to perform a gene prioritization of neighboring genes based on functional annotations and protein-protein interaction networks [9•]. Briefly, the protein-protein interaction network–based disease candidate gene prioritization uses social and web networks analysis algorithms (extended versions of the PageRank and HITS algorithms, and the K-Step Markov method) [9•]; training and test set genes are mapped to a protein-protein interaction network. Scoring and ranking of test-set genes are based on the location relative to all of the training-set genes using global network-distance measures in the protein-protein interaction networks.
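The idea of ranking test-set genes by their network position relative to the training set can be sketched with a personalized PageRank (random walk with restart) on an undirected interaction graph. This is a simplified stand-in for the extended PageRank, HITS, and K-Step Markov methods cited above, not their actual implementation. The node names, the edges, and the damping factor of 0.85 are hypothetical and chosen only for illustration.

```python
def prioritize(edges, training, damping=0.85, iterations=100):
    """Rank non-training nodes by personalized PageRank seeded on training genes."""
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)

    seeds = [n for n in nodes if n in training]
    if not seeds:
        raise ValueError("no training gene is present in the network")

    # The walker restarts uniformly on the training (seed) genes.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(restart)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            inflow = sum(score[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = (1 - damping) * restart[n] + damping * inflow
        score = updated

    candidates = [n for n in nodes if n not in training]
    ranking = sorted(candidates, key=lambda n: score[n], reverse=True)
    return ranking, score


# Toy network: CAND_A touches both seeds, CAND_C is the most distant node.
toy_edges = [
    ("SEED1", "CAND_A"),
    ("SEED2", "CAND_A"),
    ("CAND_A", "CAND_B"),
    ("CAND_B", "CAND_C"),
]
ranking, score = prioritize(toy_edges, training={"SEED1", "SEED2"})
print(ranking)
```

In this toy graph, candidates closer to the seed genes accumulate more of the walker's probability mass, which is the intuition behind global network-distance measures.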

Results of Gene Enrichment Analysis and Protein-Protein Interaction Network Analysis

We observed that among the 58 molecular pathways (including 1,815 genes from the genome) scored with a significant P-value, 15 were ranked with high significance (Table 1). Among the top-ranked molecular disease pathways we found the statin pathway (http://www.pharmgkb.org/search/pathway/statin/statin.jsp), reinforcing the concept of the biologic effect of statins and their effects on hepatic cholesterol metabolism and consequent effects on plasma lipoprotein transport, which not only have a direct impact on lowering lipid levels but also have a strong impact on reducing vascular complications. Interestingly, tryptophan metabolism was also significantly ranked, strengthening the participation of serotonin in the development of MS components [14, 15].

Table 1 Top-ranked molecular disease pathways related to the metabolic syndrome, determined by gene enrichment analysis using the ToppFun application

The top 50 genes, ranked by their location relative to all the training-set genes using global network distance measures in the proteome-wide protein-protein interaction network, are shown in Table 2. Figure 1 shows that they can be predicted to share an abundance of physical interactions, co-expression, co-localization, and pathways, and a series of new candidates (open circles) also emerges. In general, the list of genes shows a strong enrichment of nuclear receptors, including the farnesoid X-activated receptor, retinoid X receptors, pregnane X receptor, and thyroid hormone receptors. Interestingly, the role of nuclear receptors in the physiopathology of the MS is gaining acceptance because of their pleiotropic actions and their involvement in many physiologic processes, including cell growth and differentiation, embryonic development, and metabolic homeostasis; it is also possible that they can be pharmacologically modulated. In addition, nuclear receptors are regarded as key regulators of circadian rhythm [16]. For example, the disruption of CLOCK, a master gene in circadian rhythm regulation, predisposes to MS-related phenotypes such as obesity and nonalcoholic fatty liver disease (NAFLD) [17, 18]. Another interesting example is cyclooxygenase 2 (COX2), an inducible enzyme that plays an important role in inflammation, angiogenesis, and tumorigenesis, as well as in cardiomyocytes, where it influences cardiac function.

Table 2 The top prioritized 50 genes influencing the risk of metabolic syndrome components, based on relative location to all the training set genes, using global network distance measures in the proteome-wide protein-protein interaction network
Fig. 1 Graph obtained from the Genemania application (genemania.org [37]) for 48 of the 50 genes (gray circles) described in Table 2 (RXRB and LRP1 were not recognized by the software) and 20 predicted related genes taken as an example for the sake of simplicity (white circles); the size of these circles corresponds to their probability of belonging to the networks assigned by the program. Of these genes, 49.74% of their products are predicted to have physical interactions (blue lines), 34.73% are co-expressed (purple lines), 12.13% are co-localized (bordeaux lines), 2.33% belong to the same pathways (teal lines), and 1.07% are predicted to have direct interactions by small-scale datasets (tan lines)

We have mentioned the contribution of gene-gene interaction, a relatively poorly explored but remarkable mechanism by which gene variants can modulate the risk of common diseases. The effect of isolated genes is generally small, with odds ratios (ORs) smaller than 2, at best. Even considering the sum of the effect of several risk variants combined, the total variance of the phenotype explained is commonly less than 5%. For example, for seven hypothetical variants with a minor allele frequency of 0.5 and an OR of 1.5, the overall composite effect is smaller than 10 and the probability of finding individuals carrying such haplotypes would be less than 5% in the total population. A dramatic example can be found in the study by Li et al. [19], which showed that individuals who carried more than 16 risk alleles for obesity had higher BMI than those who carried fewer than 7 risk alleles, but by only 1.53 BMI units, and all SNPs added only 3% to the predictive value of obesity in addition to age and sex.
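The rarity of carrying every risk allele in the seven-variant example can be checked with a short calculation. The sketch below assumes that the seven loci are independent (no linkage disequilibrium) and, for the genotype case, that each locus is in Hardy-Weinberg equilibrium; both assumptions and the function names are ours, not the source's, and the sketch does not address how the per-variant ORs combine.

```python
def joint_haplotype_frequency(allele_freq, n_variants):
    """Frequency of a haplotype carrying the risk allele at every locus,
    assuming independent loci."""
    return allele_freq ** n_variants


def all_homozygous_frequency(allele_freq, n_variants):
    """Frequency of individuals homozygous for the risk allele at every locus,
    assuming independent loci in Hardy-Weinberg equilibrium."""
    return (allele_freq ** 2) ** n_variants


haplotype_freq = joint_haplotype_frequency(0.5, 7)
homozygous_freq = all_homozygous_frequency(0.5, 7)
print(f"haplotype with all 7 risk alleles: {haplotype_freq:.4%}")
print(f"homozygous at all 7 loci: {homozygous_freq:.4%}")
```

Under these assumptions, both frequencies fall well below the 5% threshold mentioned in the text.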

Thus the possibility should be considered that risk variants are not acting independently but rather by a synergistic effect, a phenomenon known as epistasis. Owing to the difficulties of study design, few authors have reported the role of epistasis in the risk of MS-associated phenotypes such as hypertension [20], myocardial infarction and coronary artery disease [21], cholesterol levels [22], and triglyceridemia [23]. In this vein, we have shown that CLOCK and serotonin transporter (SLC6A4) variants interacting with environmental factors such as rotating shift work strongly affect the overall development of the MS or its isolated components, such as blood pressure or plasma triglycerides [24].

In summary, though the MS is a complex constellation of diseases, it is possible to infer that common physiologic processes and molecular networks influence the risk of each intermediate phenotype. The susceptibility genes identified so far as genetic determinants of each MS disease component can be regarded as a network of complex interactions working either synergistically or in an integrated system.

The Impact of Epigenetic Changes on the Course of MS-Associated Traits

Recent advances in epigenomic approaches have placed epigenetic gene regulation as a key factor in the pathogenesis of complex disorders—mainly cancer, but also other complex diseases such as the MS. In particular, epigenetic modifications can explain the mechanisms involved in the gene-environment interaction, the sexual dimorphism observed in some phenotypes, and the role of developmental programming. The most attractive aspect of the hypothesis of the impact of epigenetic changes in the etiology of MS-associated phenotypes is given by the nature of epigenetic regulation, which is dynamic and is subject to both internal and external influences. Hence, though epigenetic marks can be propagated during cell division, permanently modifying the phenotype, therapeutic modifications are also plausible.

The most studied mechanism of epigenetic modifications influencing an MS-related trait is DNA methylation. For instance, DNA methylation of the PPARGC1A promoter in pancreatic islets from patients with type 2 diabetes was associated with alterations in insulin secretion [25]. In this vein, there is increasing evidence that prenatal environment can modify the epigenetic regulation of specific genes. We recently reported a positive correlation between maternal BMI and PPARGC1A promoter methylation in the umbilical cord of their offspring, suggesting a potential role of promoter PPARGC1A methylation in the metabolic programming of the fetus [26]. In addition, we explored whether DNA methylation of the mitochondrial transcription factor A (TFAM) promoter is associated with insulin resistance in adolescents with features of the MS, and we observed a potential role of epigenetic modifications in this transcription factor in association with insulin resistance as measured by the homeostasis model assessment index (HOMA-IR) and fasting plasma insulin levels [27]. Interestingly, because both PPARGC1A and TFAM are regulators of mitochondrial biogenesis, DNA methylation at their promoters may be one cause of the mitochondrial DNA decrease observed in newborns who are small or large for gestational age [28], as well as in adolescents with insulin resistance [29]. A thorough overview of the participation of epigenetic regulation in type 2 diabetes was recently published [30].

In conclusion, epigenetic gene regulation seems to play a role in the pathogenesis of many complex disorders, including the MS. Some candidate genes, such as PPARGC1A, represent excellent targets for epigenetic modification (eg, DNA methylation) and also can explain the hypothesis of the impact of metabolic reprogramming during embryogenesis on the onset of the MS in adult life.

The Role of Polymorphisms at MicroRNA Target Sites in the Pathogenesis of the Metabolic Syndrome

MiRNAs are small, noncoding RNAs (about 21 nucleotides) involved in post-transcriptional gene regulation. MiRNAs exert their biologic functions after binding (in a sequence-specific manner) to the 3′ untranslated region (UTR) of mRNA targets, facilitating mRNA degradation or translational inhibition.

Interestingly, some DNA variations occur naturally in putative miRNA target sites, and they are often regarded as polymorphisms in microRNA target sites (polymiRTS). In fact, approximately 20,000 of them are catalogued in datasets such as the polymiRTS Dataset (http://compbio.uthsc.edu/miRSNP/home.php) and Patrocles (http://www.patrocles.org/Patrocles.htm).

In the miRNA target site, SNPs may affect the base-pairing between the miRNA and its target site. Therefore, they can affect the miRNA-mediated gene repression. Hence, polymiRTS and SNPs in the miRNA itself (a less frequent finding, probably owing to evolutionary constraints) may lead to heritable variations in gene expression.
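How a single-nucleotide change can abolish miRNA base-pairing can be sketched with a simple seed-match check. The example assumes canonical Watson-Crick pairing between miRNA positions 2 to 8 (the seed) and the 3′ UTR, and ignores G:U wobble pairs and any pairing outside the seed. The miRNA and UTR sequences are invented for illustration and do not correspond to any real miRNA or gene.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Return the mRNA sequence (5'->3') that pairs with miRNA positions 2-8."""
    seed = mirna[1:8]  # 0-based slice covering positions 2..8
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def has_seed_match(utr, mirna):
    """True if the UTR contains a perfect match to the miRNA seed."""
    return seed_site(mirna) in utr


# Invented sequences: the two UTR alleles differ by one C>U substitution
# located inside the seed-match site.
mirna = "UACGGAUCCGAUUAGCCGUAA"
utr_reference = "AAUUGAUCCGUCCAA"
utr_variant = "AAUUGAUCUGUCCAA"

print(seed_site(mirna))
print(has_seed_match(utr_reference, mirna), has_seed_match(utr_variant, mirna))
```

A real target-prediction tool would also weigh site context, conservation, and binding energy; this sketch isolates only the allele-dependent loss of seed complementarity.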

There are isolated but interesting reports about the association between SNPs located in the 3′ UTR of human genes and components of the MS, including arterial hypertension. For instance, Sethupathy and coworkers [31] observed that a target site for hsa-miR-155, within the 3′ UTR of the human AGTR1 gene that contains the SNP rs5186 (also known as A1166C), leads to allele-specific underexpression of AGTR1, suggesting that the abrogation of miR-155 binding elevates AGTR1 mRNA levels. Interestingly, abundant data provide strong evidence of an association between the A1166C variant and cardiovascular disease, including hypertension, and therapeutic response to losartan [32].

Another disease-associated 3′ UTR polymorphism has been reported to be linked to miR-122: the C/T SNP (rs41318021) in the 3′ UTR of SLC7A1, which encodes the principal arginine transporter and is important in the generation of nitric oxide by the endothelium [33]. The minor T allele significantly attenuates reporter gene expression, lowering SLC7A1 levels and presumably contributing to the endothelial dysfunction seen in hypertensive individuals [34].

In a hospital-based case-control study, we evaluated the role of the rs41318021 C/T variant in the genetic susceptibility to MS-associated phenotypes and observed a significant association with arterial hypertension (OR, 2.002; 95% CI, 1.255–3.195; P = 0.003), after adjusting for age and BMI. In addition, the rs41318021 genotypes were significantly associated with diastolic blood pressure elevation (OR, 1.955; 95% CI, 1.164–3.285; P = 0.01), after adjusting for age and BMI.
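For readers who wish to see how an OR and its 95% CI relate, the sketch below computes an unadjusted OR with a Wald interval from a 2×2 table and shows that, because the interval is symmetric on the log scale, the point estimate equals the geometric mean of the two bounds. The counts are hypothetical, and the ORs reported above are adjusted for age and BMI, so they come from a regression model rather than from a raw table; the symmetry check is nevertheless expected to hold for Wald intervals in general.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted OR and Wald 95% CI from a 2x2 table.

    a: carrier cases, b: carrier controls,
    c: non-carrier cases, d: non-carrier controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def point_estimate_from_ci(lower, upper):
    """Geometric mean of the bounds, which recovers the OR of a Wald interval."""
    return sqrt(lower * upper)


# Hypothetical counts, chosen only to give an OR of 2.
odds_ratio, lower, upper = odds_ratio_wald_ci(40, 20, 60, 60)
print(f"OR = {odds_ratio:.3f}, 95% CI {lower:.3f} to {upper:.3f}")

# Consistency check on the diastolic blood pressure result quoted in the text.
recovered = point_estimate_from_ci(1.164, 3.285)
print(f"geometric mean of 1.164 and 3.285 = {recovered:.3f}")
```

The same check is a quick way to spot transcription errors in reported intervals, since a point estimate that lies far from the geometric mean of its bounds cannot come from a Wald interval.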

On the other hand, a recent study showed that miR-33 has a central role in the control of HDL cholesterol through coordinate regulation with SREBF2 [35•]. Though the authors showed that miR-33 has binding sites in the 3′ UTR of several genes, including the ABC transporters ABCA1 and ABCG1 and the endolysosomal transport protein NPC1, there are still no data from association studies.

Unfortunately, few genetic association studies have focused on miRNA-associated 3′ UTR or 5′ UTR variants and MS-related phenotypes. The emerging data suggest that such variants may explain some critical aspects still unresolved, such as the fine-tuning of gene expression and the gene-environment influence. Moreover, despite the tremendous advances made by GWAS in the search for gene variants and the risk of common diseases, some limitations remain. For example, gene variants in regulatory regions are poorly represented, as are polymorphisms in miRNA target sites.

The patterns of associations between miRNAs and disease remain largely unclear, but they may serve as a novel explanation of dysregulation of multiple pathways in concert (upregulation or downregulation simultaneously in several tissues) in complex diseases such as the MS [7], particularly considering that one miRNA may regulate many genes and, conversely, one gene may be regulated by many miRNAs [36].

Conclusions

In this article, we have used a combination of genome-wide association data and gene enrichment analysis, along with gene prioritization of neighboring genes based on functional annotations and protein interaction networks to answer a central question about the genetic components of MS-associated phenotypes: Are the disease-associated SNPs identified by GWAS related to common pathways? The results show that a network driven by many members of the nuclear receptor superfamily of proteins, including retinoid X receptor and farnesoid X receptor (FXR), may be implicated in the pathogenesis of the MS by their interactions (at multiple levels of complexity) with genes associated with metabolism, cell differentiation, and oxidative stress. This knowledge can be rapidly translated into improved therapies because nuclear receptors are prime candidates as drug targets, as they are involved in many physiological processes, including glucose homeostasis and the synthesis of cholesterol, lipids, and bile acids. Finally, the pleiotropic functions and tissue-specific expression patterns of nuclear receptors may functionally connect all MS-associated traits. New knowledge may soon emerge regarding environmentally modifiable epigenetic factors such as DNA methylation of specific genes as well as whole new arrays of miRNAs as biomarkers and therapeutic targets not only for the MS but also for many other complex diseases.