4.1 Introduction

Biological trace elements refer to those dietary elements which are required in very small amounts (less than 100 mg/day) for the proper growth, development, and physiology of an organism [1]. These micronutrients include iron (Fe), zinc (Zn), copper (Cu), molybdenum (Mo), cobalt (Co), nickel (Ni), manganese (Mn), chromium (Cr), vanadium (V), selenium (Se), iodine (I), and probably other elements. The majority of trace elements are metals. They provide proteins with unique coordination, catalytic, and electron transfer properties and are involved in critical enzymatic activities, immunological reactions, and physiological mechanisms [2, 3]. Due to the important roles these trace elements play in cells, efficient and specific mechanisms are needed to maintain and regulate their concentration, utilization, and storage, especially for those elements whose soluble forms are present in trace amounts in natural environments.

Trace element deficiencies are life-threatening health problems in some regions and populations of the world and are responsible for a variety of clinical disorders, such as anemia caused by Fe deficiency [4]. Some groups of individuals, such as children, pregnant women, and the elderly, are more likely to develop trace element deficiency. On the other hand, accumulation of excessive amounts of certain metals (such as Cu) may result in overload disorders because of the high toxicity of these metals [4]. Some trace elements may also interact and interfere with each other's essential functions. For example, large doses of Zn supplementation can disrupt Cu uptake and lead to neurological problems [5]. In addition, trace element status can be altered in some clinical conditions and may interfere with the efficacy of treatment [6]. Therefore, homeostasis of trace elements within the body should be carefully maintained so that their levels are adequate for biological processes but not toxic.

Research during the past 20 years has provided substantial evidence of how trace elements are utilized by humans. Marginal or severe trace element imbalances can be considered risk factors for a variety of diseases, but establishing the mechanisms behind such cause-and-effect relationships requires a more complete understanding of the basic metabolism, regulation, and function of these micronutrients. Previous studies of trace elements and of genes involved in trace element metabolism have revealed the complexity of trace element utilization and function in nature. In the recent decade, with the rapid increase in the amount of biological data available (such as genomes, transcriptomes, and proteomes) and a corresponding increase in computational approaches, omics-based and/or bioinformatics analysis of the relationship between trace elements and health or disease has become increasingly important. Genome-wide analyses based on high-throughput sequencing techniques have been attempted, which could improve our understanding of the utilization of trace elements under normal physiological conditions and their variations or dyshomeostasis in disease [7,8,9,10,11,12]. Very recently, the term ionome has been introduced, defined as all mineral nutrients and trace elements found in an organism. Several ionome-based studies have identified new features of elemental networks in complex diseases such as diabetes and neurodegenerative diseases [13,14,15]. These contributions may not only provide important mechanistic insights into the metabolism and homeostasis of trace elements but also facilitate the development of new drugs and therapeutic strategies that target imbalanced elements.

This chapter focuses on the metabolism and function of several important trace elements in human health, as well as their association with the onset and development of diseases, mainly from the perspective of bioinformatics and systems biology, including comparative genomics, genome-wide association studies (GWAS), and population and/or cohort studies. Such information may provide a more integrated, system-level picture of the critical roles these elements play in both physiological and pathological conditions. Recent developments in the study of the ionome in diseases (disease ionomics) are also discussed.

4.2 Computational Resource for Trace Elements

4.2.1 Databases

Integrating genes/proteins that bind one or more trace elements for their biological function (e.g., metalloproteins) with those involved in trace element metabolism, drawing on multiple resources (such as large nucleotide/protein databases and the literature), is the basis for understanding trace element utilization and function in different organisms. In recent years, several trace element-specific web databases have been built, including MDB, MeRNA, MINAS, dbTEU, MetalPDB, and others.

MDB (the Metalloprotein Database and Browser) is the first web-accessible resource for metalloprotein research, which offers quantitative information on metal-binding sites in protein structures available from the Protein Data Bank (PDB) [16]. MDB also provides tools for analysis of patterns in the metal-binding sites and for prediction of potential metal-binding sites from new protein structures.

MeRNA (metals in RNA) is a database of metal-binding sites identified in RNA structures. It focuses on eight known binding motifs and is used to aid in the study of the roles of metals in RNA biology such as RNA folding and catalysis [17]. Recently, another database of metal ions in nucleic acids (MINAS) has been developed to list all nucleic acid-bound metal ions contained in the PDB, which will be useful to identify new possible metal-binding motifs in nucleic acids [18].

Metal-MACiE is a web-based database that aims to collect the known information on the properties and the roles of metals in catalytic mechanisms of metalloenzymes [19]. This database can be used to advance our understanding of the chemistry underlying metal-dependent catalysis.

dbTEU (DataBase of Trace Element Utilization) is a large protein database of trace element utilization [20]. This manually curated database contains ~16,500 known transporters and user proteins for five trace elements (Cu, Mo, Co, Ni, and Se) in more than 700 organisms from the three domains of life. It also offers interactive tools for searching and browsing trace elements, proteins, organisms, and sequences.

Mespeus is a newly developed database of metal interactions with proteins [21]. It lists metal and protein interactions whose geometry has been experimentally determined and could be further visualized.

MetalPDB is a novel resource of metal sites in biological macromolecular structures [22]. This database is achieved through the systematic and automated representation of metal-binding sites in proteins and nucleic acids by way of minimal functional sites (MFSs). The web interface allows access to a comprehensive overview of metal-containing structures, providing a basis to investigate the basic principles governing the properties of these systems.

SelenoDB provides full annotations of Se-containing protein (or selenoprotein) genes in at least 58 animal genomes, which is a valuable resource for addressing medical and evolutionary questions in Se biology [23].

4.2.2 Computational Tools for Trace Element Utilization

Identification of trace element-dependent proteins is not only useful for the inference of protein function but also important for understanding the roles of trace elements. To date, several bioinformatics algorithms and tools have been developed for identification of genes encoding metalloproteins (particularly for Zn and Fe) or selenoproteins in different organisms including humans. Unfortunately, considering that metal-binding properties still remain difficult to predict at the whole-proteome level, it is currently not possible to identify complete sets of metalloproteins in most organisms. Further efforts are needed to identify additional and reliable common features.

An early study reported a software named Zincfinder for the prediction of the Zn-binding proteins based on support vector machine (SVM) learning method [24]. This predictor identified some unprecedented Zn-binding sites and proteins which were further validated through structural modeling. Another SVM and homology-based algorithm was reported to provide higher precision at different levels compared to Zincfinder [25].

TEMSP (3D TEmplate-based Metal Site Prediction) is a structure-based method to predict Zn-binding sites in proteins [26]. This tool improves previously reported methods in predicting Zn-binding proteins with minimum overpredictions. In addition, TEMSP can also predict the Zn-bound local structures, which is helpful for functional analysis.

Zincidentifier software integrates multiple sequence and structural properties and graph-theoretic network features, followed by an efficient feature selection using random forest to improve prediction of Zn-binding sites and proteins [27]. This method can not only be applied to large-scale prediction of Zn-binding sites using structural information but also give valuable insights into new features for characterizing the Zn-binding sites.

ZincExplorer is a new hybrid method for the prediction of Zn-binding sites from protein sequences, which combines the outputs of different types of predictors [28]. It could also identify the interdependent relationships of the predicted Zn-binding sites bound to the same Zn ion.

SIREs (search for iron-responsive elements) is a user-friendly web-based tool for the prediction of iron-responsive elements (IREs) in query genome [29]. This web server provides structure analysis, predicted RNA folds, and an overall quality flag based on properties of well-characterized IREs.
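To make the idea of motif-based IRE prediction concrete, the following minimal sketch scans an RNA sequence for a simplified IRE core. It is not the SIREs rule set: the signature used here (an unpaired C, five upper-stem nucleotides, then the apical loop CAGWGH, with W = A/U and H = A/C/U) is an assumption based on the commonly described canonical IRE, and the function name and example sequence are hypothetical. Real predictors also check stem base pairing and folding energy.

```python
import re

# Assumed simplified IRE core (not the SIREs rules): bulged C, five
# upper-stem nucleotides, then the 6-nt apical loop CAGWGH.
# The lookahead lets overlapping candidates be reported.
IRE_CORE = re.compile(r"(?=(C[ACGU]{5}CAG[AU]G[ACU]))")


def find_ire_candidates(rna):
    """Return (0-based start, matched core) for every candidate IRE core."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in IRE_CORE.finditer(rna)]


# Hypothetical toy sequence for illustration only.
toy_utr = "GGAUCCUGCUUCAACAGUGCUUGGACGGAUCC"
for start, core in find_ire_candidates(toy_utr):
    print(start, core)
```

A sequence-only scan of this kind will overpredict, which is one reason tools such as SIREs add structural analysis and a quality flag.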

HemeBIND is the first algorithm for heme (an Fe-porphyrin complex)-binding residue prediction in proteins by integrating structural and sequence information such as evolutionary conservation, solvent accessibility, depth, and protrusion [30]. Better performance has been reported for the integrated predictor than for any individual classifier alone.

SCMHBP is a tool for the prediction and analysis of heme-binding proteins from sequences, which uses propensity scores of dipeptides within a scoring card method [31]. SCMHBP performs well in comparison with methods such as SVM, decision tree, and Bayes classifiers and, beyond improving prediction accuracy, improves our understanding of heme-binding proteins.
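The scoring card idea can be illustrated with a short sketch: a sequence is scored as the composition-weighted average of per-dipeptide propensity scores and then compared with a threshold. The propensity values, the neutral default score, and the threshold below are hypothetical placeholders, not the values used by SCMHBP; in practice the scores would be derived and optimized from training data.

```python
from collections import Counter

# Hypothetical propensity scores for illustration only (not from SCMHBP).
TOY_PROPENSITY = {"CH": 900, "HC": 850, "CP": 700, "AA": 300, "GG": 250}
DEFAULT_SCORE = 500  # assumed neutral score for dipeptides not listed above


def scm_score(sequence, propensity=TOY_PROPENSITY, default=DEFAULT_SCORE):
    """Average dipeptide propensity score, weighted by dipeptide composition."""
    seq = sequence.upper()
    dipeptides = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not dipeptides:
        raise ValueError("sequence must contain at least two residues")
    counts = Counter(dipeptides)
    weighted = sum(propensity.get(dp, default) * n for dp, n in counts.items())
    return weighted / len(dipeptides)


def predict_heme_binding(sequence, threshold=500.0):
    """Call a sequence heme-binding if its score exceeds the threshold."""
    return scm_score(sequence) > threshold


print(scm_score("CHC"), predict_heme_binding("CHC"))
```

Because every dipeptide carries an explicit score, the card itself can be inspected to see which dipeptides drive a prediction, which is the interpretability advantage the method emphasizes.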

FINDSITE-metal is a threading-based method which is specifically used to detect metal-binding sites in protein structures [32]. It integrates evolutionary information and machine learning for structure-based metal-binding site prediction at the proteome level. An accuracy of 70–90% could be achieved for Fe, Cu, Zn, and some other metal ions. This algorithm was applied to quantify the metal-binding proteins of the human proteome.

In contrast to metalloproteins, selenoproteins and the complete sets of selenoproteins (selenoproteomes) of different organisms, including humans, have been identified computationally. Several programs have been widely used for selenoprotein prediction in different kingdoms, such as the SECISearch and bSECISearch tools for prediction of selenoprotein genes in eukaryotes and bacteria, respectively [33, 34]. In addition, a method named Seblastian was developed to predict new selenoprotein genes in eukaryotes [35].

4.3 Metabolism and Homeostasis of Trace Elements and Their Association with Disease

Trace elements play important roles in all types of cells; as a consequence, the ability of the cell to tightly manage their homeostasis is very important. In eukaryotes, the major processes related to the metabolism of trace elements (especially transition metals) are similar, which include uptake, compartmentalization, storage, and export [36]. High-affinity transport systems have been identified for several metals [37]. Some metal ions could also be transported via unspecific cation influx systems [38]. Excessive uptake of certain elements can be toxic to cell growth. Thus, storage of these elements in inactive sites or forms and export systems are needed to prevent their overload in the cell. It is clear that homeostasis of trace elements should be carefully maintained to provide sufficient levels while preventing accumulation to toxic levels.

The majority of trace elements are directly incorporated into target proteins, whereas some have to form trace element-containing cofactors or complexes (e.g., molybdopterin for Mo, vitamin B12 for Co, and selenocysteine for Se) prior to their insertion into user proteins. A general scheme of metal utilization in eukaryotes is shown in Fig. 4.1. The following sections will focus on several essential trace elements and discuss recent progress on bioinformatics research of their metabolism, physiological roles, and correlation with diseases.

Fig. 4.1 A general scheme of metal metabolism in eukaryotes. The major components involved in metal metabolism and homeostasis include transporters (importers and/or exporters), metal-binding chaperones, and user proteins (metalloproteins). Some metals (such as Mo and Co) have to form metal-containing cofactors prior to their insertion into user proteins.

4.3.1 Iron

4.3.1.1 Iron Metabolism and Iron-Binding Proteins

Fe is the second most abundant metal (after aluminum) in the Earth’s crust and is an absolute requirement for all living organisms. This metal is needed for the function of a wide range of enzymes and pathways related to its rich coordination chemistry and redox properties [39]. Besides Fe ions, proteins can bind different forms of Fe-containing cofactors, such as heme or Fe-S clusters. In mammals, Fe is essential for cellular respiration, oxygen transport, energy production, xenobiotic detoxification, and DNA synthesis. On the other hand, redox properties of Fe contribute to its toxicity, which produces reactive oxygen species (ROS) that are harmful to biological molecules. To maintain Fe homeostasis at both the systemic and the cellular levels, mammals have developed a complex machinery to control its intake, utilization, and recycling. In the past several decades, several key findings have shaped our current understanding of Fe metabolism, including the identification of the transferrin (Tf) receptor (TfR), the iron-responsive element/iron-regulatory protein (IRE/IRP), the Fe-regulatory hormone hepcidin, and its target ferroportin (Fpn) [40,41,42,43]. Nevertheless, our current knowledge of Fe biology remains incomplete.

In general, inorganic Fe is initially reduced to Fe2+ by ferrireductase and transported across the cellular membrane by the divalent metal transporter 1 (DMT1) [44]. Organic heme Fe is transported into the cytosol, where Fe is released by heme oxygenase 1. Excess intracellular Fe is stored in the storage protein ferritin [45]. Cytosolic Fe is exported into the plasma by the basolateral Fe exporter Fpn [43], which is believed to be the only ferrous Fe exporter. Export of Fe from enterocytes into the blood also requires the ferroxidase hephaestin, a multicopper oxidase that oxidizes Fe2+ to Fe3+ [46]. In the plasma, Fe3+ is bound to Tf, which delivers Fe to different cells. Most cells acquire Fe via TfR1 (a high-affinity, ubiquitously expressed receptor) and TfR2 (restricted to certain cell types, with much lower affinity than TfR1). Fe is imported into mitochondria (the major site of Fe utilization) for heme biosynthesis by the transporter protein mitoferrin [47]. However, the mechanism by which heme passes through the mitochondria is poorly understood. In addition, hepcidin, the key circulating hormone mainly produced by the liver, modulates Fe homeostasis systemically: it regulates Fe efflux from different cells by binding to Fpn and inducing its internalization and degradation [42, 48].

Recent advances in the study of Fe metabolism have revealed multiple intricate pathways that are essential for maintaining Fe homeostasis. Thus, bioinformatics and systems biology approaches may represent a new strategy for understanding Fe metabolism and its function in proteins. Several computational and dynamic models have been developed to describe the Fe metabolic network and its homeostasis based on microarrays, high-throughput sequencing, and proteomics data, which may shed light on the mechanistic foundations of Fe regulation [49,50,51]. However, key parts of the system remain poorly understood.
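The published models cited above are not reproduced here. As a purely illustrative toy, the sketch below captures only the hepcidin/Fpn negative feedback described earlier: plasma Fe stimulates hepcidin, and hepcidin reduces Fpn-mediated Fe export into the plasma. The two-variable form, the saturating inhibition term, and every parameter value are assumptions chosen for simplicity, not fitted quantities.

```python
def simulate_iron_feedback(influx, hep_synthesis=1.0, hep_decay=1.0,
                           k_half=1.0, use_rate=1.0, dt=0.01, steps=5000):
    """Integrate a two-variable toy model with the forward Euler method.

    Returns (plasma_fe, hepcidin) after `steps` steps of size `dt`.
    All parameters are hypothetical and in arbitrary units.
    """
    plasma_fe, hepcidin = 0.0, 0.0
    for _ in range(steps):
        # Hepcidin triggers Fpn internalization, so export capacity falls
        # as hepcidin rises (assumed simple saturating inhibition).
        fpn_activity = 1.0 / (1.0 + hepcidin / k_half)
        d_fe = influx * fpn_activity - use_rate * plasma_fe
        # Assumed: hepcidin synthesis rises linearly with plasma Fe.
        d_hepcidin = hep_synthesis * plasma_fe - hep_decay * hepcidin
        plasma_fe += dt * d_fe
        hepcidin += dt * d_hepcidin
    return plasma_fe, hepcidin


low = simulate_iron_feedback(influx=2.0)
high = simulate_iron_feedback(influx=6.0)
print(low, high)
```

With the default parameters, the steady state satisfies P^2 + P = influx for plasma Fe P, so tripling the influx from 2 to 6 only doubles plasma Fe (from 1 to 2). This buffering is the qualitative behavior expected from the feedback loop; realistic models add tissue compartments, Tf saturation, and IRE/IRP regulation.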

To understand the function of Fe in various processes, a more important issue is to identify all Fe-binding proteins. So far, it is very difficult to identify the complete Fe-dependent metalloproteomes by computational approaches. Several bioinformatics studies have been conducted for understanding ferroproteomes, at least partially, in some organisms including humans. One comparative study investigated the occurrence of nonheme Fe-binding proteins based on Fe-binding pattern recognition in a selected number of organisms and found that 90% of Fe-binding proteins have homologs in all three domains of life [52]. The majority of Fe-binding proteins were involved in electron transfer or enzymes performing oxidoreductase functions, suggesting that Fe is mostly used in redox catalysis. Fe-S clusters were the cofactors in about 40% of nonheme Fe proteins retrieved. Another structural analysis of the protein environment around Fe-binding sites revealed that similar sites could be found in unrelated proteins [53]. Very recently, a new algorithm named MetalPredator has been developed for the prediction of the Fe-S proteome [54].

Heme constitutes 95% of functional Fe in the human body. It is a prosthetic group, comprising a ferrous Fe and protoporphyrin IX, and is an essential cofactor in various biological processes [55]. Heme-binding proteins (or hemoproteins) carry out a variety of important functions, such as oxygen transport, electron transfer, and enzyme catalysis. The utilization of heme requires a complex machinery for its biosynthesis, insertion into hemoproteins, and uptake from external sources [55, 56]. Several bioinformatics tools, such as SCMHBP and HemeBIND, have been developed to predict hemoproteins [30, 31]. A genome-wide study investigated the processes of heme biosynthesis and uptake in several hundred prokaryotic organisms, which allowed the identification of organisms capable of performing neither, one, or both of these processes [57]. Many Gram-positive pathogens support heme uptake from the host, implying that this process can be a potential target for a wide range of antibiotics. Another bioinformatics analysis of all known genes in the heme biosynthesis pathway in animals revealed that these genes are under strong purifying selection from cnidarians to mammals and that multiple-level controls on the activity of this pathway depend on the linear depth of these genes [58]. Further studies on hemoproteins in higher eukaryotes such as mammals are needed.

4.3.1.2 Iron Homeostasis and Diseases

Our knowledge of diseases associated with Fe mainly depends on our understanding of Fe homeostasis. Levels of Fe can be affected by many factors, such as genetic variations, a diet that contains insufficient or excessive Fe, reduced intake or absorption of Fe, and hemolysis. Therefore, many epidemiological and omics-based studies have focused on examining the association of dietary Fe intake and gene mutations with the risks of common diseases. Many Fe-associated diseases or disorders are attributable to genetic malfunctions that influence the hepcidin/Fpn axis.

Fe overload may lead to Fe deposition in the liver, heart, brain, and some other organs, which promotes the formation of hydroxyl radicals and causes damage to DNA and protein or even cell death. Long-term Fe overload increases the risk of cancer, diabetes mellitus, liver cirrhosis, arthritis, cardiac arrhythmia, heart failure, retinal degeneration, neurodegenerative diseases, and premature death [59, 60]. The main treatments for Fe overload include phlebotomy and Fe chelation therapy [61].

Hemochromatosis is the most common genetic Fe overload disorder and results from mutations in several genes including hemochromatosis protein HFE (involved in transcriptional regulation of hepcidin), TfR2, hemojuvelin, Fpn, and hepcidin, all of which affect the hepcidin/Fpn regulatory axis [62,63,64,65]. The main characteristic of hemochromatosis is the Fe accumulation in vital organs where it may cause cell injury and organ dysfunction.

Aceruloplasminemia is an autosomal recessive disease caused by mutations in the gene encoding ceruloplasmin, a ferroxidase involved in the oxidation of Fe2+ into Fe3+, therefore assisting in Fe transport across the cell membrane [66]. This disease is characterized by a total absence of ceruloplasmin in the blood and accumulation of Fe in hepatocytes, neurons in the brain, and pancreatic islet cells, which in turn leads to diabetes mellitus, neurologic diseases, dementia, and some other diseases.

Fe deficiency is the major cause of anemia. Because the majority of total body Fe is used in hemoglobin synthesis, Fe deficiency may impair the production of healthy red blood cells. In addition, deficiency in this metal can result in premature birth, poor growth and development, and weak cognitive skills and also affects the nervous system. Changes in diet and Fe supplements can treat minor Fe deficiency, while patients with severe deficiency may require transfusion of red blood cells or intravenous Fe. It has been reported that a rare mutation in the gene TMPRSS6, which encodes matriptase-2, may lead to Fe-refractory Fe-deficiency anemia [67]. Loss of matriptase-2 function results in inappropriately high hepcidin levels; as a result, Fe absorption from the intestine and Fe release from macrophages are inhibited, causing severe Fe deficiency [63].

Fe dyshomeostasis in cancer has long been recognized. The relationship between elevated Fe levels and increased cancer incidence has been debated for years [68]. Nevertheless, dietary Fe deprivation and Fe chelators have been suggested for cancer therapy, which implies a strong link between an Fe-rich environment and cancer [69, 70]. Moreover, levels of TfR1 were observed to be elevated in cancer, making it a potential anticancer drug target [71]. It was also found that levels of hepcidin were increased and levels of Fpn were decreased in both breast cancer cell lines and patients, implying a direct relationship between intracellular Fe homeostasis and tumor growth [72, 73]. A recent computational study defined an Fe regulatory gene signature, which includes TfR1, HFE, and some other genes, for outcome prediction in breast cancer patients [74]. Although cancer is certainly more than an Fe disorder, these findings indicate a clear relationship between Fe metabolism and cancer development.

4.3.2 Zinc

Zn plays a pivotal role as a structural, catalytic, and signaling component that can be found in numerous proteins, including enzymes, structural proteins, transcription factors, cytokines, and ribosomal proteins. A global bioinformatics search of the human genome showed that about 2800 proteins (10% of the human proteome) contain potential Zn-binding sites [75]. In addition, Zn was suggested to be a fundamental element in the origin of life, and its bioavailability may have been a limiting factor in eukaryotic evolution [76]. Thus, it is expected that Zn metabolism and homeostasis in an organism are tightly controlled. Imbalance in Zn homeostasis has been found to be associated with a variety of human diseases.

4.3.2.1 Zinc Metabolism and Zinc-Dependent Proteome

The biological functions of Zn-binding proteins are maintained through cellular Zn levels, which are mainly regulated by Zn transporters, Zn-sensing molecules (such as metallothioneins), and metal-response element-binding transcription factor-1 (MTF-1) [77,78,79].

In mammals, two groups of Zn transporters have been identified on the basis of their structural and functional features: solute carrier family 39A (SLC39A) that includes mammalian ZRT/IRT-related proteins (ZIPs) [80] and solute carrier family 30A (SLC30A) that comprises mammalian ZnTs [81]. Fourteen ZIP homologs (Zip1 to Zip14) have been identified in the human genome, all of which mediate Zn uptake from the extracellular environment or intracellular vesicles into the cytoplasm. Ten members of the ZnT family (ZnT1 to ZnT10) are responsible for Zn efflux from the cytoplasm toward intracellular vesicles or the extracellular space. The expression of specific ZIPs and ZnTs is tissue dependent and related to specific cellular functions. A recent bioinformatics analysis of the distribution and evolutionary trends of all ZIP family members in eukaryotes suggested that Zip11 is the most ancient Zn transporter that might have originated in early eukaryotic ancestors [82].

Metallothioneins (MTs) are a class of small cytosolic metal-binding proteins that contain one-third cysteine (Cys) residues [83]. These proteins can bind Zn and some other metal ions with high affinity. MTs are thought to be involved in the intracellular regulation of Zn concentration and detoxification of nonessential heavy metals [84]. Four isoforms have been identified so far, the most widely expressed isoforms in mammals being MT1 and MT2. Synthesis of MT is strongly induced by metals, mediated by MTF-1, an important transcription factor for liver development and cell stress response [85]. Under pathological conditions, MTF-1 seems to be involved in tumor angiogenesis and drug resistance. It thus seems generally advisable to monitor MTF-1 activity in stress-related processes including aging and carcinogenesis.

Identification of Zn-binding proteins is important for understanding how Zn is used by different organisms. Thousands of genes encoding Zn-binding proteins have been reported, especially after the completion of genome projects, implying that a great number of biological processes are Zn dependent. In recent years, several comparative genomic studies have been carried out for prediction of Zn-dependent metalloproteomes in the human genome and in genomes of other organisms. Based on known Zn-binding domains and patterns, an early study investigated Zn proteomes in several organisms from the three domains of life [86]. The number of Zn-binding proteins correlated with the proteome size in an organism. In eukaryotes, the majority of Zn proteins are involved in performing enzymatic catalysis and in regulating DNA transcription, especially Zn-finger-containing transcription factors that are almost exclusively a privilege of eukaryotes.
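Pattern-based searches of this kind can be sketched as a regular-expression scan over protein sequences. The example below uses the classical C2H2 zinc-finger signature, written here from the commonly quoted PROSITE-style pattern C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H; treat the exact pattern as an assumption to be checked against PROSITE, and note that the function names and the toy sequences are hypothetical. It is not the procedure used in [86], which combined many domains and patterns.

```python
import re

# Assumed C2H2 zinc-finger signature (PROSITE-style pattern translated to a
# regular expression). The lookahead reports overlapping matches as well.
C2H2_PATTERN = re.compile(r"(?=(C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H))")


def scan_c2h2(protein):
    """Return (0-based start, matched segment) for each candidate C2H2 site."""
    return [(m.start(), m.group(1))
            for m in C2H2_PATTERN.finditer(protein.upper())]


def count_hits(proteome):
    """Map each protein name to its number of candidate C2H2 sites."""
    return {name: len(scan_c2h2(seq)) for name, seq in proteome.items()}


# Hypothetical toy sequences for illustration only.
toy_proteome = {
    "toy_finger": "MKCAACAAAFAAAAAAAAHAAAHGS",
    "toy_other": "MKLVGSTAAE",
}
print(count_hits(toy_proteome))
```

Sequence patterns alone produce both false positives and false negatives, so proteome-scale surveys typically combine them with domain profiles and structural evidence.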

The Zn-binding motifs also vary with the functions of Zn-dependent proteins. Four-ligand motifs are often observed in structural sites where Zn contributes to the stability of protein structure, whereas three-ligand motifs (the fourth ligand is often water) are associated with catalytic sites where Zn participates in enzymatic reactions [87]. Moreover, the conserved residues in these motifs are quite different. For example, among all predicted Zn-binding proteins in humans, 97% have a structural Zn site with at least one Cys ligand and 40% have four Cys ligands [75]. On the other hand, one-third of human Zn proteins containing a three-ligand motif have three histidine (His) residues. The conservation of Zn-finger-binding sites could be associated with their more recent origin, whereas the differentiation of the catalytic Zn-binding sites could be the result of evolutionary processes that led to the development of different enzymatic reactions targeting different physiological substrates [88].
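The ligand-count rule of thumb described above can be written as a small heuristic. The function name and labels are hypothetical, and the rule is only a tendency ("often"), so the output should be read as a first guess rather than a functional assignment.

```python
def classify_zn_site(protein_ligands):
    """Guess the role of a Zn site from the number of protein ligands.

    `protein_ligands` lists the coordinating residues, e.g.
    ["Cys", "Cys", "His", "His"]; water molecules are not included.
    """
    n = len(protein_ligands)
    if n == 4:
        return "structural"  # four protein ligands, e.g. Zn fingers
    if n == 3:
        return "catalytic"   # fourth coordination position often water
    return "unclassified"    # the heuristic makes no call otherwise


print(classify_zn_site(["Cys", "Cys", "His", "His"]))
print(classify_zn_site(["His", "His", "His"]))
```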

4.3.2.2 Zinc Homeostasis and Diseases

The importance of Zn in human metabolism is illustrated by increasing evidence that points to Zn metabolism as a critical player in the onset or progression of a growing number of multifactorial diseases such as diabetes, Alzheimer’s disease (AD), and asthma. While Zn deficiency is commonly caused by dietary factors, several genetic causes of Zn deficiency have been reported. Therefore, in order to evaluate the influence of Zn on disease risk, it is important to adopt population-based approaches that take into consideration the Zn status and/or genetic variations in the genes encoding proteins that regulate Zn metabolism.

Diabetes is a common metabolic disease characterized by impaired glucose homeostasis and long-term damage, dysfunction, and failure of various organs. It comprises several forms, all of which are associated with varying degrees of hyperglycemia. Type 1 diabetes (T1D) is caused by an autoimmune destruction of islet beta cells leading to little or no insulin production, whereas type 2 diabetes (T2D) is characterized by hyperinsulinemia caused by a failure in the insulin signaling pathway triggered by the activation of the insulin receptor [89]. Zn ions are essential for the normal processing and storage of insulin. Several epidemiological studies have demonstrated that whole-body Zn status is associated with diabetes, including significantly decreased serum Zn levels and increased urinary Zn excretion [90,91,92,93]. Zn supplementation could improve T2D symptoms, both in mice and in diabetic patients [94]. However, there is no clear evidence as to whether Zn supplementation helps to prevent T2D [95]. On the other hand, Zn imbalance can result not only from insufficient dietary intake but also from impaired function of proteins that regulate Zn metabolism. The first comprehensive GWAS for T2D demonstrated a link between T2D development and a single nucleotide polymorphism (SNP), rs13266634, in the SLC30A8 gene encoding the ZnT8 transporter in diabetic patients [96]. In recent years, several other SNPs in this gene have been reported [97, 98]. It is unclear whether some of these SNPs affect glucose homeostasis. Other genetic variations were also identified, such as a SNP in the promoter region of the MT2A gene that is closely associated with hyperglycemia in elderly patients with T2D [99]. These human genetic studies highlight the relationship between Zn, glucose homeostasis, and diabetes, and further investigation in this direction is very important.

AD is an age-related neurodegenerative disease, characterized by progressive impairment of memory and cognitive abilities. It is commonly thought that AD is caused by the abnormal accumulation and deposition of extracellular senile plaques composed of Cu-Zn aggregates of the amyloid β-peptide (Aβ) [100]. Excess Zn was found to be associated with amyloid plaques, and several studies have indicated that Aβ deposition could be inhibited by Zn chelation [101, 102]. Expression levels of several Zn transporters, such as ZnT1, ZnT4, and ZnT6, were increased in early- and late-stage AD subjects [103, 104]. To date, studies on Zn homeostasis and related genes in neurodegenerative diseases are still few and preliminary. Some of the proteins that maintain brain Zn homeostasis are thought to be important in the onset and progression of AD. In the future, identifying genetic variations in genes controlling Zn homeostasis will be essential to establish possible functional links between Zn metabolism and AD.

Asthma is a common chronic inflammatory disease of the airways caused by a combination of genetic and environmental factors. Zn deficiency may play a role in the pathogenesis, control, and severity of asthma [105, 106]. In addition, hair Zn levels were found to be significantly lower in wheezy infants than in healthy controls, implying that Zn deficiency may influence the risk of wheezing in early childhood [107].

Mutations in genes responsible for Zn metabolism have been reported to be associated with additional inherited disorders of Zn deficiency. For example, variations in the two Zn transporters Zip4 and ZnT4 are linked to the Zn deficiency disorders acrodermatitis enteropathica in humans and lethal milk syndrome in mice, respectively [108, 109]. A point mutation in ZnT2 is associated with transient neonatal Zn deficiency [108]. Another population-based study on the genetics of Zn metabolism suggested that a SNP in the MTF-1 gene was associated with lymphoma susceptibility [110]. Mutations in some other human Zn transporters were also observed to be related to a range of diseases, including heart disease and mental illnesses [111]. To date, the functional consequences of these mutations and their interactions with dietary Zn are not known. It remains unclear whether some variations increase the risk for diseases only if dietary Zn levels are inadequate or excessive.

4.3.3 Copper

Cu is essential for approximately a dozen proteins and enzymes that carry out fundamental biological functions required for growth and development, such as mitochondrial oxidative phosphorylation, free-radical detoxification, pigmentation, neurotransmitter synthesis, and Fe metabolism [112]. As free cuprous ions react readily with hydrogen peroxide to yield the deleterious hydroxyl radical, it is important for Cu-dependent organisms to have a complex Cu regulatory network to prevent its deficiency and to limit its toxicity. Disrupted Cu homeostasis may lead to excess or toxicity of Cu, which is associated with the pathogenesis of hepatic disorder, neurodegenerative changes, and other disease conditions [113].

4.3.3.1 Copper Metabolism and Cuproproteins

In eukaryotes, cellular Cu trafficking and homeostasis are tightly regulated via a complex system which contains Cu transporters, chaperones, and other components. Cu is acquired by the high-affinity Cu transporter (Ctr) family [114]. Different organisms may possess multiple Ctr proteins located in different biological membranes. In humans, two Ctr proteins (hCtr1 and hCtr2) have been identified. hCtr1 is the main Cu importer, which is located predominantly at the plasma membrane, but may also be present in intracellular vesicular compartments [115]. hCtr2 is localized in late endosomes and lysosomes and may be involved in the delivery of Cu ions to the cytosol [116]. Cu export is mediated by an important category of ATP-dependent transporters, the ATP7 family [117]. Mammals have two isoforms: ATP7A and ATP7B [118]. ATP7A is expressed in most tissues (such as the intestinal epithelium, heart, and brain) except the liver; it is required for the transport of Cu into the trans-Golgi network for biosynthesis of several secreted cuproproteins and for basolateral efflux of Cu in the intestine and other cells [119]. ATP7B is predominantly detected in the liver and is required for Cu metalation of ceruloplasmin and biliary Cu excretion [119].

Within the cell, Cu is delivered to specific compartments or cuproproteins by different metallochaperones, including CCS, COX17, and Atox1 [120]. CCS is the Cu chaperone for Cu-Zn superoxide dismutase (Cu-Zn SOD), to which it delivers Cu in the cytoplasm and the mitochondrial intermembrane space. COX17 delivers Cu to cytochrome c oxidase (COX) in mitochondria via the additional chaperones COX11, SCO1, and SCO2. Atox1 (antioxidant protein 1) is responsible for shuttling Cu from the cytosol to the exporters ATP7A and ATP7B. Other proteins involved in Cu homeostasis may exist and might include COMMD1 (Cu metabolism MURR1 domain), metallothionein, and amyloid precursor protein [119, 120]. Plasma proteins that transport Cu from the intestine to the liver and in the systemic circulation probably include albumin and alpha-2-macroglobulin. Changes in the expression of some of these proteins may help to monitor the Cu status of humans.

To date, a number of cuproproteins have been characterized in various organisms. The Cu sites in these proteins could be divided into three types based on spectroscopic and structural properties, and some cuproproteins (such as multicopper oxidases, MCOs) may contain multiple types of Cu centers [121, 122]. Type 1 Cu proteins play important roles in electron transfer in the respiratory and photosynthetic chains of bacteria and plants, including plastocyanin, plantacyanin, and several other proteins [121]. Type 1 Cu centers are also found in some larger cuproproteins, such as COX I and COX II, nitrite reductase, and a variety of MCOs (ascorbate oxidase, hephaestin, ceruloplasmin, etc.). Type 2 cuproproteins include Cu-Zn SOD, Cu amine oxidase (CuAO), peptidylglycine α-hydroxylating monooxygenase (PHM), and dopamine β-monooxygenase (DBM) [122]. Other cuproproteins include tyrosinase, hemocyanin, galactose oxidase, and Cnx1G. A list of known cuproprotein families in eukaryotes is shown in Table 4.1.

Table 4.1 Known user protein families for selected trace elements in eukaryotes

In recent years, several bioinformatics studies have been carried out to characterize important features of Cu utilization and cuproproteomes (the whole set of cuproproteins) in a variety of organisms [123,124,125,126]. Based on a set of Cu-binding motifs derived from known cuproprotein sequences and domain recognition methods, a computational strategy was developed for examining the occurrence of cuproproteins in several sequenced prokaryotes and eukaryotes [124, 125]. The size of the cuproproteome is generally less than 1% of the total proteome. Recently, several comparative genomic studies examined the Cu utilization trait and cuproproteomes in hundreds of sequenced organisms, which revealed a clearer view of Cu utilization, especially in eukaryotes [123, 127]. Almost all sequenced eukaryotes could utilize Cu. Among all examined cuproprotein families, MCOs, COX I, COX II, and Cu-Zn SOD were the most abundant cuproproteins. The largest cuproproteomes in eukaryotes were found in land plants (e.g., 62 and 78 cuproproteins in Arabidopsis thaliana and Oryza sativa, respectively). Mammals may have approximately 20 known cuproproteins [127].
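The motif-based step of such a strategy can be illustrated with a minimal sketch. It is not the published pipeline, which combined a curated motif set with domain recognition. The M-x-C-x-x-C pattern (the metal-binding motif described for Atox1-like chaperones and the N-terminal domains of Cu-transporting ATPases) serves only as an example, and the function name and toy sequences are hypothetical.

```python
import re

# Illustrative pattern only: M-x-C-x-x-C. A lookahead is used so that
# overlapping occurrences are also reported.
CU_MOTIF = re.compile(r"(?=M.C..C)")

def scan_for_cu_motif(proteins):
    """Return {protein_id: [1-based motif start positions]} for sequences with a hit."""
    hits = {}
    for prot_id, seq in proteins.items():
        positions = [m.start() + 1 for m in CU_MOTIF.finditer(seq)]
        if positions:
            hits[prot_id] = positions
    return hits

# Hypothetical toy sequences, not real proteins.
toy_proteome = {
    "protA": "MKHEFSVDMTCGGCAEAVSRVLNK",
    "protB": "MSLLKKAAGGEELLKK",
}

candidates = scan_for_cu_motif(toy_proteome)
# Fraction of the toy proteome flagged as candidate cuproproteins.
fraction = len(candidates) / len(toy_proteome)
```

A motif hit alone is weak evidence; in practice each candidate would still need to be checked against known cuproprotein domains and homologs before being counted in a cuproproteome.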

4.3.3.2 Copper Status and Human Diseases

There are few reports of Cu excess or deficiency in the general population except for formerly obese patients after Roux-en-Y gastric bypass surgery, in whom Cu deficiency was reported with an incidence of 18.8% [128]. On the other hand, several studies have indicated the relationship between dietary Cu and health issues.

High Cu level in the serum has been considered as a potential risk factor for cardiovascular disease in several case-control and population-based studies [129,130,131]. However, dietary Cu intake was not predictive of cardiovascular mortality in a cohort study of older British people [132]. In a cross-sectional study, a negative relationship between dietary or serum Cu and total and LDL (low-density lipoprotein) cholesterol was observed, implying that a high Cu intake and status are associated with a better lipoprotein profile [133]. A second cross-sectional study showed that serum Cu was positively associated with HDL (high-density lipoprotein) cholesterol [134]. Limited evidence also suggests that low Cu diet may lead to premature ventricular discharge and cardiac arrhythmia [135].

The hypothesis that Cu intake might be linked to cognitive decline (such as AD) is based on the well-recognized age-dependent accumulation of some metals (including Cu, Zn, and Fe) in key sites of the brain [136]. An inverse linear association between serum Cu concentrations and cognitive performance was observed in a large cohort of elderly healthy women [137]. A recent study showed a significant inverse correlation of the serum levels of free Cu with both Mini-Mental State Examination (MMSE) and attention-related neuropsychological test scores, suggesting that free Cu may contribute to cognitive decline [138, 139]. However, a conflicting observation has also been reported: free serum Cu may increase even when total body Cu decreases, which calls into question the relevance of free Cu as a marker of Cu exposure [140]. In addition, Cu may promote Aβ aggregation in the brain, and unusually high concentrations of Cu have been observed in AD senile plaques [141]. Cu-induced oxidative stress is another mechanism that may lead to profound neurodegenerative processes in AD [142].

The relationship between Cu and cancer has been investigated by some groups. For example, two cohort studies examined the link between Cu intake and lung cancer and lymphoma. In one large cohort study (482,875 subjects), there was no significant association between total Cu intake and lung cancer risk [143]. In another cohort study, no link could be identified between total or dietary Cu and the risk of non-Hodgkin’s lymphoma, diffuse large B-cell lymphoma, or follicular lymphoma [144]. In a study of diet and early breast cancer, the ceruloplasmin/total blood Cu ratio was found to be significantly related to the disease [145]. For other types of cancer, no cohort studies have assessed Cu intake. Thus, no conclusion can be drawn regarding Cu intake and cancers so far.

One cohort study reported that there was no relationship between total (diet and supplements) or dietary Cu intake and risk of rheumatoid arthritis [146]. Although the use of Cu supplements showed a weak but significant inverse association with rheumatoid arthritis, this association did not persist after further adjustment for confounders. Thus, no conclusion can be drawn regarding Cu intake and rheumatoid arthritis.

It has been suggested that Cu may influence the immune system. Animals with severe Cu deficiency have reduced populations of neutrophils and T cells, impaired proliferation of T lymphocytes in response to mitogens, and decreased activity of B lymphocytes, phagocytes, and natural killer cells. However, in humans, the impact of Cu supplementation on immune function is less well documented. The effect of low-Cu diets on immune function has been examined in healthy men, in whom such diets significantly inhibited the proliferation of peripheral blood mononuclear cells and increased the fraction of circulating B cells [147]. It seems that the impact of Cu on the immune system can only be observed in specific situations where Cu malabsorption may be combined with low Cu intakes, such as in patients after bariatric gastric bypass surgery [128].

Besides dietary Cu amounts, mutations in genes involved in Cu homeostasis and in cuproprotein genes are also associated with severe pathology. It is well known that genetic variations in ATP7A and ATP7B underlie Menkes disease and Wilson’s disease, respectively [148]. Menkes disease is an X-linked inherited disorder caused by mutations in the ATP7A gene, which lead to hypothermia, neuronal degeneration, mental retardation, abnormalities in hair, bone fractures, and aortic aneurysms. Wilson’s disease is an autosomal recessive genetic disorder that is caused by disabling mutations in both copies of the ATP7B gene and manifests clinically as liver disease and neurological damage. In addition, mutations of Cu-Zn SOD have been connected with amyotrophic lateral sclerosis, where a gain of function is responsible for the underlying neurological symptomatology [149]. As mentioned above, mutations in the ceruloplasmin gene may lead to aceruloplasminemia [66].

4.3.4 Molybdenum

Mo is an essential transition metal for many living organisms as it is a key component of the active site of molybdoenzymes catalyzing key redox reactions in the metabolism of carbon-, nitrogen-, and sulfur-containing compounds [150,151,152]. With the exception of bacterial nitrogenase, all known molybdoenzymes use the pterin-based Mo cofactor (Moco) [150].

4.3.4.1 Molybdenum Uptake, Molybdenum Cofactor Biosynthesis, and Molybdoproteins

Studies on Mo uptake in eukaryotes are quite limited. Only two types of eukaryotic Mo transporters have been characterized, MOT1 and MOT2 [153, 154]. Mammals have only the MOT2 protein. The function of MOT2 in Mo transport or homeostasis is not clear and needs to be examined in the future.

Moco is synthesized by a conserved multistep pathway which includes (i) conversion of GTP into cyclic pyranopterin monophosphate (cPMP), (ii) transformation of cPMP into molybdopterin, and (iii) adenylylation of molybdopterin and subsequent Mo insertion. At least seven proteins (MOCS1A, MOCS1B, MOCS2A, MOCS2B, MOCS3, GEPH-G, and GEPH-E as named in humans) are involved in Moco biosynthesis [155]. Details of these processes have been described in many review articles [150,151,152, 155]. In addition, a Moco sulfurase, catalyzing the generation of the sulfurylated form of Moco that is essential for activation of the xanthine oxidase family of proteins such as xanthine dehydrogenase (XDH) and aldehyde oxidase (AO), has been identified in plants and humans [156].

To date, more than 50 different molybdoenzymes have been found in bacteria. In contrast, only a limited number of molybdoenzymes are present in eukaryotes and can be divided into three classes: the xanthine oxidase (XO) family which is represented by XDH and AO, the sulfite oxidase (SO) family which includes SO and nitrate reductase (NR), and the mitochondrial amidoxime-reducing component mARC family (Table 4.1) [155, 157]. There are five different molybdoenzymes known in humans: XDH, AO, SO, and two isoforms of mARC (mARC1 and mARC2). SO and XDH catalyze catabolic reactions in Cys and purine metabolism, and their structures and reaction mechanisms have been studied intensively. In contrast, functions of AO and mARC enzymes remain unclear, both of which have been suggested to function in drug metabolism [157, 158].

In the past decade, several bioinformatics studies have focused on the identification of genes involved in Mo uptake and Moco biosynthesis and genes encoding molybdoenzymes in a wide range of sequenced organisms [127, 159, 160]. In eukaryotes, the Mo utilization pathway was mainly observed in animals, plants, algae, and certain fungi, whereas parasites and yeasts lack the Mo utilization trait. Essentially all Mo-utilizing organisms have members of the SO and XO families. Plants appeared to have the largest number of molybdoproteins among eukaryotes (10–11 molybdoproteins) [127].

4.3.4.2 Molybdenum Cofactor and Molybdoenzyme Deficiencies

Moco deficiency (MoCD) is a rare inborn error of metabolism causing the loss of all molybdoenzyme activities. The clinical manifestations of MoCD involve intractable neonatal seizures, severe developmental delay, progressive microcephaly with brain atrophy, and even early childhood death [161]. Most of the symptoms of MoCD are very similar to those of isolated SO deficiency, which is caused by mutations in the SUOX gene (the gene encoding SO) leading to the accumulation of sulfite. Therefore, SO is considered the most important Moco-dependent enzyme, and sulfite accumulation represents the primary cause of neurological impairment in both disorders [162].

XDH deficiency results in the excessive excretion of xanthine in urine leading to a disease called xanthinuria, which includes type 1 and type 2 [163]. Type 1 xanthinuria is caused by the loss of activity of XDH resulting in an accumulation of xanthine. In contrast, type 2 xanthinuria is caused by the simultaneous loss of activities of XDH and AO due to mutations in the MCSU gene, whose protein product is essential for the sulfuration of Moco in enzymes of the XO family [164]. A very low level of plasma uric acid and high levels of xanthine are hallmarks of both types of xanthinuria. Patients of both groups have similar clinical presentation, mostly due to increased xanthine deposition; however, the mechanism involved in the disease is less clear.

MoCD is mainly caused by mutations affecting any step of the Moco biosynthetic pathway. Previous studies have identified two types of MoCD: type A and type B. It has been found that MoCD type A patients carry mutations in the MOCS1 gene, while type B patients are defective in MOCS2 [165]. Mutations in the gephyrin gene cause very severe forms of MoCD due to impaired synaptic inhibition.

4.3.5 Selenium

Se is an important metalloid in many organisms from bacteria to humans. This micronutrient is known primarily for its functions in redox homeostasis and is recognized as one of the promising cancer chemopreventive agents [166]. It also has a role in antiviral activity, in anti-inflammatory activity, in preventing heart disease and other cardiovascular and muscle disorders, and in delaying the progression of AIDS [167,168,169]. In addition, Se is required for mammalian development, male reproduction, and immune function.

4.3.5.1 Selenocysteine Biosynthesis and Selenoproteins

Se exerts its functions in the form of selenocysteine (Sec), which is co-translationally incorporated into selenoproteins [170]. The biosynthesis of Sec and its incorporation into selenoproteins, which have been reviewed in many other articles, require a complex molecular machinery that recodes UGA codons from stop signals to Sec function [170,171,172]. In eukaryotes, this process needs a cis-acting Sec insertion sequence (SECIS) element which is located in the 3′-untranslated region (3′-UTR) of selenoprotein mRNAs, tRNA[Ser]Sec, and several trans-acting factors dedicated to Sec incorporation. In mammals, proteins and enzymes that are involved in Sec biosynthesis include selenophosphate synthetase 2 (SPS2), Sec synthase (SecS), O-phosphoseryl-tRNA[Ser]Sec kinase, eukaryotic Sec-specific elongation factor (eEFSec), Secp43, SECIS-binding protein 2, and ribosomal protein L30 [173]. Moreover, Sec is usually present in the active site of selenoproteins, being essential for their catalytic activity.

In the past several years, remarkable progress in genome sequencing projects provided an opportunity and resources for selenoprotein identification. Several bioinformatics algorithms have been developed to predict selenoprotein genes in a variety of prokaryotic and eukaryotic genomes [33,34,35]. The general strategy of these approaches is to find candidate SECIS elements and then analyze upstream regions to identify selenoprotein genes. Besides, additional SECIS-independent approaches were developed, which employ Cys-containing proteins and comprehensive protein databases to search nucleotide sequence databases for selenoprotein genes [174]. Based on these tools, a number of novel selenoproteins have been discovered in various organisms [23, 175,176,177]. A complete list of known eukaryotic selenoproteins is shown in Table 4.1. In mammals, a total of 25 and 24 selenoproteins were identified in human and mouse, respectively [175]. The main selenoprotein families include glutathione peroxidases (GPxs) that have oxidoreductase functions and also regulate immune response, thioredoxin reductases (TRs) which modulate transcription and signal transduction functions, iodothyronine deiodinases (Dios) that participate in thyroid hormone metabolism, selenoprotein P (SelP), 15-kDa selenoprotein (Sep15), SPS2, and methionine-R-sulfoxide reductase 1. However, the functions of many eukaryotic selenoproteins are unknown.
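The idea behind the SECIS-independent approach can be sketched as follows, under strong simplifying assumptions: an in-frame TGA that is not the final codon is read as a candidate Sec (U), and the candidate is kept if it lines up with a Cys in a homologous protein. Real tools use gapped alignments against large databases and additional filters; the function names and toy sequences here are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_with_sec(cds):
    """Translate a CDS, reading internal in-frame TGA codons as Sec (U)."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    protein = []
    for idx, codon in enumerate(codons):
        aa = CODON_TABLE[codon]
        if codon == "TGA" and idx < len(codons) - 1:
            aa = "U"  # candidate Sec: a TGA that is not the last codon
        if aa == "*":
            break  # genuine stop codon (TAA, TAG, or terminal TGA)
        protein.append(aa)
    return "".join(protein)

def sec_cys_pairs(candidate, cys_homolog):
    """1-based positions where a candidate U faces a C (ungapped comparison)."""
    return [i + 1 for i, (a, b) in enumerate(zip(candidate, cys_homolog))
            if a == "U" and b == "C"]

# Hypothetical toy data: ATG GGT TGA GGT AAA TAA
toy_cds = "ATGGGTTGAGGTAAATAA"
toy_protein = translate_with_sec(toy_cds)
pairs = sec_cys_pairs(toy_protein, "MGCGK")
```

A Sec/Cys pair found this way is only a lead; confirming a selenoprotein gene would still require, for example, locating a SECIS element in the 3′-UTR.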

Recent comparative analyses of eukaryotic selenoproteomes revealed that significant differences in the composition of selenoproteomes could be seen even among related organisms [176, 178]. The number of selenoproteins varied from 0 (plants, fungi, and some protists) to 56 (Aureococcus anophagefferens) [179]. Among all selenoproteins, selenoprotein K (SelK) and selenoprotein W (SelW) were the two most widespread selenoproteins which are present in most eukaryotes that utilize Sec. The origin of many selenoproteins in mammals can be traced back to the ancestral, unicellular eukaryotes [176]. Many of these selenoproteins were preserved during evolution and remain in mammals and green algae, whereas many other organisms, including land plants, fungi, nematodes, insects, and some protists, manifested massive, independent selenoprotein gene losses. It seems that large selenoproteomes mainly occur in aquatic organisms, whereas the organisms that lack or have few selenoproteins are mostly terrestrial (with the exception of mammals) [176].

4.3.5.2 Selenium Metabolism and Human Disease

As an essential micronutrient, Se has a narrow range of intake compatible with human health: low Se intake is associated with developmental defects and disease states, whereas high Se intake results in toxicity. Recent Se supplementation trials have found that moderately higher Se intake may increase the risk of T2D, possibly by influencing redox status through selenoprotein synthesis [180, 181]. Thus, Se homeostasis needs to be tightly regulated in humans.

Several diseases have been reported to be associated with severe Se deficiency, such as Keshan disease, Kashin-Beck disease, and myxedematous endemic cretinism. Keshan disease was first described as endemic cardiomyopathy with multiple foci of necrosis in the early 1930s in northeastern China, with higher incidence in women and children [182]. It was suggested that Se deficiency in combination with coxsackie virus infection might be required for the development of Keshan disease [182]. Kashin-Beck disease is a chronic, endemic osteochondropathy accompanied by joint necrosis, which affects individuals in Se-deficient areas of China, Siberia, and North Korea [183]. A polymorphism in the GPx1 gene was reported as a potential genetic risk factor in the development of this disease [184]. Myxedematous endemic cretinism is induced by thyroid atrophy and results in mental retardation, which has been observed in those areas of the world with both severe I and Se deficiencies [185].

Se toxicity (selenosis, blood Se level > 100 μg/dL) can be acute or chronic. The symptoms include vomiting, abdominal pain, diarrhea, hair loss, fatigue, irritability, and neurological impairment [186]. Selenosis in humans is a rare event except in very high-Se areas. The famous Se and Vitamin E Cancer Prevention Trial (SELECT) that involved more than 35,000 men revealed the potential risk of T2D, alopecia, and dermatitis due to Se supplements [180].

Se supply is prioritized for brain development and function as almost all selenoproteins are expressed in neurons [187]. Recently, mutations of the SecS gene, which disrupt the biosynthesis of Sec and thus the production of selenoproteins, were reported to cause autosomal-recessive progressive cerebellocerebral atrophy (PCCA) in Jews of Iraqi and Moroccan ancestry [188]. This disease represents the first clinical syndrome related to Sec biosynthesis in humans.

As mentioned above, previous Se supplementation trials for cancer prevention revealed an over two-fold increase in T2D incidence in the Se-supplemented compared to the placebo group [181, 189, 190]. The SELECT project revealed a similar trend [180]. A recent study reported that SelP is associated with the development of T2D, which may induce insulin resistance in the liver and muscle, resulting in hyperglycemia [191]. Overproduction of GPx1 in mice also resulted in a T2D-like phenotype [192]. In addition, some selenoproteins that are related to ER stress, such as selenoprotein S (SelS) and Dio2, have been found to be involved in the development of T2D. Increased expression of SelS mRNA was observed in human subcutaneous adipocytes from T2D patients [193]. A SNP of human Dio2 (A/G at codon 92) has been identified, which is associated with greater insulin resistance in T2D patients [194].

The association between Se and other diseases, such as cancer, cardiovascular disease, neurodegenerative disease, thyroid disease, and reproductive system disease, has been reported in numerous studies [166, 195, 196]. Among different types of evidence, identification of genetic variants in selenoprotein genes or Se-related genes has shed light on the relationship between Se and disease risk, especially for cancer. Although the mechanistic links between Se levels, selenoproteins, and carcinogenesis are not clear, many GWAS have shown that a small number of SNPs in several selenoprotein genes may influence the risk of several cancers, including colorectal, prostate, lung, breast, and bladder cancers [197]. For example, mutations in the coding regions or UTRs of GPx1, GPx4, and SelP genes have functional consequences and could be associated with breast cancer [197, 198]. More SNPs in the promoter, coding region, and UTRs of GPx1, GPx4, SelP, Sep15, SelK, SelS, TR1, and TR2 genes were considered risk factors for prostate cancer and/or colorectal cancer [197, 198]. Furthermore, mutations in SPS2 and SecS genes were significantly associated with Crohn’s disease [199]. Thus, continued research on the effects of these mutations on selenoprotein synthesis and Se homeostasis could help to understand the relationship between Se metabolism and different diseases.

4.3.6 Other Trace Elements

Co is mainly used in the form of cobalamin (vitamin B12), a water-soluble cofactor involved in methyl group transfer and rearrangement reactions [200]. A recent comparative genomic analysis revealed that most B12-utilizing eukaryotic organisms are animals (except insects) [201]. Mammals have a unique absorption, delivery, and activation system for vitamin B12. In humans, only two enzymes bind vitamin B12: methionine synthase (MetH) and methylmalonyl coenzyme A mutase (MCM), both of which are important for health. MetH is essential in folate-mediated one-carbon metabolism, including DNA synthesis and chromatin methylation, whereas MCM catabolizes branched-chain and odd-chain fatty acids. Vitamin B12 is required for erythropoiesis, and the classic presentation of vitamin B12 deficiency is hematologic: megaloblastic anemia [202]. Vitamin B12 deficiency also leads to neurologic manifestations which may be irreversible. Cobalamin C disease (CblC) with methylmalonic aciduria and homocystinuria is the most frequent genetic disorder associated with vitamin B12 metabolism, and is caused by an inability of the cell to convert vitamin B12 to its active forms. The typical symptoms may include intrauterine growth retardation, microcephaly, failure to thrive, hypotonia, hydrocephalus, neurological deterioration, hematological abnormalities, and hemolytic uremic syndrome. CblC is caused by mutations in the MMACHC gene, whose product may act both as an intracellular vitamin B12 trafficking chaperone and as a decyanase catalyzing the reductive decyanation of cyanocobalamin [203]. To date, more than 75 mutations have been reported in this gene [204].

I is a chemical element required for thyroid hormone production. Deficiency of this element early in life impairs cognition and growth, but its status is also a primary determinant of benign thyroid disorders in adults, such as goiter, nodules, and hyper- and hypothyroidism [205]. In contrast, the role of I intake in thyroid cancer remains unclear, despite decades of studies and debates. To date, studies of thyroid cancer epidemiology in different populations are very challenging because it is still a relatively rare and, in most cases, indolent cancer. The available evidence from several case-control studies implies that I deficiency is a risk factor for thyroid cancer and that it particularly increases risk for follicular thyroid cancer and, possibly, anaplastic thyroid cancer [206].

With regard to other trace elements, a great number of experimental studies have been conducted to understand their metabolism and function. However, bioinformatics analysis of their utilization in human health and disease is almost entirely lacking. Therefore, more research efforts are needed in this area.

4.4 Ionomics and Human Health

4.4.1 An Overview of Ionome and Ionomics

In the recent decade, a new term, ionome, has been introduced, which is defined as the mineral nutrients and trace elements of an organism [207]. Ionomics, the study of the ionome, involves quantitative analyses of elemental composition in living systems using high-throughput elemental analysis technologies and their integration with bioinformatics tools [208]. Such an approach has been widely applied in plants in response to physiological stimuli, developmental state, and genetic modifications. It has been shown that ionomics has the ability to help identify genes and gene networks that directly control the ionome. In addition, it may provide a powerful tool to investigate more complex gene networks that control developmental and physiological processes and influence the ionome indirectly [209].

The main experimental techniques for elemental analysis include inductively coupled plasma mass spectrometry (ICP-MS), inductively coupled plasma optical emission spectroscopy (ICP-OES), and X-ray fluorescence. Among them, ICP-MS is the most frequently used approach, which is capable of detecting metals and several nonmetals at very low concentrations (such as parts per trillion). Compared to ICP-OES, ICP-MS allows for a smaller sample size owing to its greater sensitivity and has the ability to detect different isotopes of the same element. Currently, ICP-MS has been successfully used for large-scale ionomic studies in yeast, plants, and mammals, which illustrate the power of ionomics to identify new aspects of trace element metabolism and homeostasis and how such information can be used to develop hypotheses regarding the functions of previously uncharacterized genes [210,211,212,213].

As large-scale ionomic studies may produce large amounts of data due to the analysis of hundreds or thousands of samples over a period of time, it is important to develop appropriate information management systems and tools for genome-scale data acquisition, validation, storage, and analysis. The Purdue Ionomics Information Management System (PiiMS) is an example of such a workflow control system, which provides an open-access platform for data processing, mining, and discovery [214]. This system (http://www.ionomicshub.org/home/PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. To promote rapid knowledge generation about the ionome and related genes/networks, it is also important that such information be correctly annotated for further discovery. However, systems that allow researcher-driven annotation of genes involved in trace element metabolism and homeostasis are very limited. With the increase in the number of novel trace element-related genes and their functions, new approaches allowing for such systematized annotation are needed.

Very recently, mechanisms that regulate different trace elements in human HeLa cells were characterized by a genome-wide high-throughput siRNA/ionomics screen [213]. A computational strategy was developed for data processing and advanced analysis. Based on the primary screen data and gene network analysis, a secondary screen was performed, which revealed additional candidate genes involved in the homeostasis of Cu, Se, Fe, and some other trace elements. This ionomic dataset should be useful for further studies on trace element metabolism and homeostasis in humans.
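The data-processing steps of that screen are not described here, so the following is only a generic sketch of how hits are often called in high-throughput screens: per-element measurements are converted to robust z-scores (median and median absolute deviation), and knockdowns beyond a cutoff are flagged. The cutoff of 3, the gene labels, and the values are all hypothetical.

```python
from statistics import median

def robust_z_scores(values):
    """Robust z-scores using the median and the scaled MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [(v - med) / scale for v in values]

# Hypothetical cellular Cu readings after knockdown of each gene.
cu_levels = {
    "geneA": 10.0, "geneB": 10.5, "geneC": 9.5,
    "geneD": 10.2, "geneE": 9.8, "geneF": 15.0,
}

genes = list(cu_levels)
z = robust_z_scores([cu_levels[g] for g in genes])

# Assumed cutoff: |z| > 3 marks a candidate for a secondary screen.
hits = [g for g, score in zip(genes, z) if abs(score) > 3]
```

Median-based scores are chosen here because a few strong hits would otherwise inflate the mean and standard deviation and mask themselves.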

4.4.2 Recent Application of Disease Ionomics

Before the birth of ionomics, ICP-MS had been applied for years to quantify the levels of multiple trace elements in samples from patients with different diseases. For example, an early ICP-MS-based study examined 20 elements in brain tissue, cerebrospinal fluid, serum, and aqueous humor from AD patients and matched control subjects [215]. Another study which determined concentrations of 14 trace elements in blood samples of patients with coronary heart disease (CHD) showed that patients had elevated plasma Co as well as diminished blood Cu concentrations [216]. In recent years, ICP-MS-based elemental distribution analysis has been reported for some other diseases, such as T2D, Parkinson’s disease, viral infection, autism, atherosclerosis, and cancer [217,218,219,220,221]. Although elements related to each of these diseases were reported, the associations and interactions among these elements remain unknown because of methodological limitations.

With the rapid development of systems biology and statistical approaches, advanced computational strategies have recently been used for systematic analysis of the ionome in several diseases such as T2D, which improves our understanding of the complex interactions among different elements.

Sun et al. measured fasting plasma elemental concentrations to investigate associations of ion modules/networks with overweight/obesity, metabolic syndrome, and T2D in 976 middle-aged Chinese men and women [13]. Based on mutual information analysis, they constructed disease-related ion networks and found that Cu and phosphorus consistently ranked as the top two elements in the three specific ion networks associated with the above conditions. In addition, three ionome patterns were observed, which provide new clues for studying the relationship between plasma ionome and metabolic disorders. Very recently, another population-based study which analyzed the urine ionome of 2115 Chinese aged 55–76 years revealed that increased urinary Ni concentration is associated with elevated prevalence of T2D [14].
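To make the notion of mutual information concrete, the sketch below scores how much a discretized element concentration tells us about disease status. It is a toy illustration, not the procedure of the cited study: the median split, the variable names, and all values are assumptions.

```python
from collections import Counter
from math import log2
from statistics import median

def binarize(values):
    """Split continuous values into low (0) and high (1) at the median."""
    med = median(values)
    return [1 if v > med else 0 for v in values]

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

# Hypothetical plasma concentrations for eight subjects and their status.
status = [0, 0, 0, 0, 1, 1, 1, 1]          # 1 = disease, 0 = control
cu = [0.8, 0.9, 1.0, 1.1, 1.4, 1.5, 1.6, 1.7]
mg = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]

mi_cu = mutual_information(binarize(cu), status)
mi_mg = mutual_information(binarize(mg), status)

# Rank elements by how informative they are about disease status.
ranking = sorted({"Cu": mi_cu, "Mg": mi_mg}.items(), key=lambda kv: -kv[1])
```

In this toy data Cu separates the groups perfectly while Mg carries no information, so Cu ranks first; the same score computed between pairs of elements could supply the edge weights of an ion network.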

Considering that disturbed metal homeostasis is one of the many factors that contribute to the development of malignancy, one study investigated the relationship between cancer risk and element status in order to support cancer diagnosis [222]. The authors analyzed both essential elements (such as Ca, Mg, Zn, Cu, Mn, and Fe) and toxic metals (such as cadmium and lead) in hair and nail samples obtained from patients with laryngeal cancer and from healthy subjects. Levels of the majority of the examined essential elements were significantly decreased in patients, while the opposite trend was observed for the heavy metals. In addition, a variety of statistical data mining approaches were used to predict cancer probability, and the best results were obtained using logistic regression, artificial neural networks, and canonical discriminant analysis. The resulting classifiers may be useful for estimating cancer risk and for early screening of the disease.
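To make the idea of an element-based risk classifier more concrete, the sketch below fits a logistic regression by plain gradient descent on two standardized features, one essential element and one toxic metal. It is only an illustration under stated assumptions: the feature values are invented, the function names and hyperparameters are hypothetical, and the model in [222] was built on real hair and nail measurements with its own procedure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of belonging to the patient class."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy z-scores: [essential element level, toxic metal level] (assumed data).
# Label 1 = patient (low essential, high toxic), 0 = healthy control.
X = [[-1.2, 1.0], [-0.8, 1.3], [-1.0, 0.7], [-0.5, 0.9],
     [1.1, -0.9], [0.7, -1.2], [0.9, -0.6], [1.3, -1.1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
print(predict_proba(w, b, [-1.0, 1.0]))  # high probability for a patient-like profile
print(predict_proba(w, b, [1.0, -1.0]))  # low probability for a control-like profile
```

With the toy data the fitted weight is negative for the essential element and positive for the toxic metal, mirroring the direction of the differences described above. Any real classifier would need cross-validation on independent samples before being considered for screening.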

Ionomic techniques have also been applied to other diseases. For example, one very recent study examined the concentrations of metals in saliva and blood in periodontal disease [223]. The authors used clustering approaches to classify saliva samples based on the concentrations of selected metals. The cluster analysis suggested that the salivary metal profiles of individuals with periodontal disease differ from those of controls, which may become a basis for the future development of diagnostic and prognostic biomarkers for periodontal disease. In another study, researchers quantified the concentrations of multiple trace elements in plasma from 238 patients with Parkinson’s disease and 302 controls, which is so far the largest cohort in which plasma levels of these elements have been measured [15]. Lower plasma Se and Fe levels might reduce the risk of this disease, whereas lower plasma Zn was probably a risk factor. Finally, a support vector machine (SVM) model was built to predict patient status from the plasma concentrations of several trace elements together with other features such as sex and age, and it achieved good performance. In the future, new computational strategies and algorithms should be developed to improve ionomic studies.
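The clustering of saliva samples by metal profile mentioned above can be sketched as follows. The text does not state which clustering algorithm was used in [223], so k-means is substituted here purely as an example; the two-metal profiles are invented, and the function name and the deterministic initialization are assumptions of this sketch.

```python
import math


def kmeans(points, k=2, iters=50):
    """Simple k-means with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point that is farthest from the centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each sample joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Toy saliva profiles: [metal A, metal B] in arbitrary units (assumed data).
# The first three samples mimic controls, the last three mimic patients.
saliva = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0],
          [3.1, 0.6], [2.9, 0.4], [3.0, 0.5]]

labels, centroids = kmeans(saliva)
print(labels)
```

On these well-separated toy profiles the two groups fall into different clusters. Real salivary data are noisier, so the number of clusters and the stability of the grouping would have to be assessed before any biomarker interpretation.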

4.5 Conclusions

Bioinformatics and systems-level approaches have provided powerful support for studying the metabolism, homeostasis, and function of trace elements as well as their relationship with a variety of diseases. This chapter has described recent studies that used bioinformatics and related methods to better understand the general principles underlying the utilization of several essential trace elements. In addition, recent case-control and population-based studies of individual elements in different diseases, together with disease ionomics, have significantly advanced the discovery of new relationships between trace element homeostasis and disease onset and progression. Nevertheless, it should be acknowledged that the use of bioinformatics in trace element research is still limited. In the future, with the increased availability of genome, transcriptome, and proteome data and improved techniques for ionomics, bioinformatics and computational systems biology will play a significant role in studying the functions of trace elements in human health and disease.