Abstract
The Toll-like receptor (TLR) gene family consists of type 1 transmembrane receptors, which play essential roles in both innate immunity and the adaptive immune response through ligand recognition and signal transduction. Using all available vertebrate TLR protein sequences, we inferred a phylogenetic tree and then identified amino acid residues critical for functional divergence by detecting altered functional constraints after gene duplications. We found that the extracellular domain of TLR genes showed higher functional divergence than the cytoplasmic domain, particularly in the region between leucine-rich repeat (LRR) 10 and LRR15 of TLR4. Our finding supports the concept that sequence evolution in the extracellular domain may be responsible for the broad diversity of TLR ligand-binding affinity, and it provides a testable hypothesis about potential targets that could be verified by further experimentation.
Introduction
The Toll gene was first discovered in the 1980s as an important component of a pathway that establishes the dorsoventral axis in the early embryo of Drosophila (Anderson and Nusslein-Volhard 1986; Schupbach and Wieschaus 1989). It was later found to play an essential part in antifungal defense as well (Lemaitre et al. 1996). Sequencing of the Drosophila genome revealed nine proteins belonging to the Toll family (Tauszig et al. 2000). The first mammalian proteins structurally related to Drosophila Toll were then identified and named Toll-like receptor (TLR) 1 and 4 (Nomura et al. 1994; Medzhitov et al. 1997). The Toll and TLR family proteins are characterized as transmembrane receptors with an extracellular domain containing leucine-rich repeats (LRR), which participate in ligand recognition, and an intracytoplasmic domain containing a Toll/interleukin-1 receptor homology (TIR) domain, which is critical to both Drosophila Toll and mammalian TLR signaling (Werling and Jungi 2003). There are 10 human and 9 murine transmembrane proteins belonging to the mammalian TLR family (Akira et al. 2001; Zarember and Godowski 2002). The TLR family members are crucial in the early phase of infection, when innate immunity is important, and in linking innate and adaptive immunity throughout the entire course of the host defense response (Takeuchi and Akira 2001; Werling and Jungi 2003). The TLR family apparently expanded early in vertebrate evolution, probably through large-scale gene duplications (Gu et al. 2003).
These broad functional roles of TLRs likely reflect the substantial diversity of their ligand-binding affinity. Although it is generally believed that sequence evolution in the extracellular domain after gene duplications may be the key to understanding how ligand-binding affinity evolves (Smirnova et al. 2000), this evolution remains largely uncharacterized. In this brief communication, we address this issue by conducting a phylogeny-based functional divergence analysis to formulate a testable hypothesis that may help direct further experimentation.
Materials and Methods
A comprehensive search using Gapped BLAST and PSI-BLAST was performed in several major protein databases with the human TLR2 gene as the query sequence. After partial and redundant sequences were removed, the final data set included 40 complete vertebrate TLR sequences and one homologous Drosophila gene. These 41 sequences came from human (10), non-human primates (5), rodents (10), other mammals that are neither primates nor rodents (11), non-mammalian vertebrates (4), and Drosophila (1). The multiple alignment of the 41 TLR amino acid sequences was obtained using the software CLUSTAL X (Thompson et al. 1997).
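As a quick consistency check on the data set composition, the per-group counts listed above can be tallied. This is only an arithmetic sketch; the dictionary keys are shorthand labels chosen here, not names used in the original data.

```python
# Sequence counts per taxonomic group, as listed in the text.
# The key names are shorthand labels introduced for this sketch.
counts = {
    "human": 10,
    "non-human primate": 5,
    "rodent": 10,
    "other mammal": 11,
    "non-mammalian vertebrate": 4,
    "Drosophila": 1,
}

# Total aligned sequences and the vertebrate-only subset.
total = sum(counts.values())
vertebrate_total = total - counts["Drosophila"]

print(total, vertebrate_total)
```

The totals agree with the 41 aligned sequences and 40 vertebrate sequences stated in the text.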
Based on this multiple alignment, we inferred the phylogenetic tree using the neighbor-joining method (Saitou and Nei 1987) implemented in the software MEGA2.0 (http://www.megasoftware.net/); other methods (parsimony in PAUP and likelihood in PHYLIP) gave almost the same results.
Gu (1999) developed a statistical method to detect critical amino acid residues that might be responsible for functional divergence by investigating whether the evolutionary conservation of these residues has changed, in our case between pairs of TLR clades; that is, an amino acid residue can be highly variable in one clade but highly conserved in the other. Statistically, this functional divergence between two clades is measured by the coefficient of functional divergence, θ, which ranges from 0 to 1. The null hypothesis θ = 0 indicates that the evolutionary rate is virtually the same between two duplicate genes at each site (Gu 1999, 2001). If the null hypothesis is rejected, a site-specific profile is then used to predict the critical amino acid residues that are most likely responsible for the detected functional divergence. This method is implemented in the software DIVERGE (http://www.xgu.zool.iastate.edu) (Gu and Vander Velden 2002).
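The intuition behind altered functional constraint (a site conserved in one clade but variable in the other) can be illustrated with a deliberately simplified heuristic. The sketch below is not the θ estimator or the posterior-probability profile of Gu (1999) and does not reproduce DIVERGE; the function names, the `min_variable` threshold, and the toy sequences are all assumptions made for illustration.

```python
def distinct_residues(column):
    """Number of distinct amino acids in an alignment column, ignoring gaps."""
    return len({aa for aa in column if aa != "-"})


def contrast_sites(clade_a, clade_b, min_variable=3):
    """Return 0-based alignment positions that are fully conserved in one
    clade and show at least `min_variable` distinct residues in the other.

    Crude stand-in for a site-specific profile: it ignores the phylogeny
    and uses raw residue counts instead of rates or posterior probabilities.
    """
    flagged = []
    columns = zip(zip(*clade_a), zip(*clade_b))  # pair up columns of both clades
    for pos, (col_a, col_b) in enumerate(columns):
        n_a, n_b = distinct_residues(col_a), distinct_residues(col_b)
        if (n_a == 1 and n_b >= min_variable) or (n_b == 1 and n_a >= min_variable):
            flagged.append(pos)
    return flagged


# Toy, made-up aligned sequences (not real TLR data).
clade_a = ["MKLV", "MKIV", "MKAV", "MKGV"]
clade_b = ["MRLV", "MSLV", "MTLV", "MKLV"]

flagged = contrast_sites(clade_a, clade_b)
print(flagged)
```

In this toy alignment, the second column is conserved in the first clade but variable in the second, and the third column shows the reverse pattern, so both are flagged. The actual method instead weighs such patterns statistically along the tree.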
Results and Discussion
There are two important domains in the TLR gene family: the extracellular domain containing LRR and the cytoplasmic domain with the TIR. The LRR domain, consisting of many tandem LRR, plays key roles in binding ligands for defense against pathogens. The LRR motifs of TLR2, 4, and 9 were defined by the Interpro program from the SWISSPROT database. Only TLR2, 4, and 9 were sufficiently conserved and had enough amino acid sequences available to yield reliable multiple alignments. The LRR motif alignments among the human TLR2, 4, and 9 genes were constructed based on the sequence alignments of the three genes. (See the online supplementary materials for the accession numbers of the sequences and the multiple alignment.)
The inferred phylogeny of the TLR gene family (Fig. 1) shows that there are three major clades, supported by high bootstrap values: clade A includes TLR1, 2, 6, and 10; clade B includes TLR4; and clade C includes TLR3, 5, 7, 8, and 9. Interestingly, these three major clades correspond to diversified ligand properties: clade A recognizes Gram-positive bacteria, except for TLR10, whose ligand is unknown; clade B recognizes Gram-negative bacteria; and the ligands of clade C are mixed (Gram-positive, Gram-negative, virus, CpG-DNA, antiviral compound). Based on the ligand property of clade A, one might infer that the ligand of TLR10 is Gram-positive. The evolutionary closeness of TLR2, TLR1, and TLR6 implies that they may have a similar function, which is supported by the fact that there are functional interactions between TLR2 and TLR1 or TLR6 in response to phenol-soluble modulin (Hajjar et al. 2001).
We used the method of Gu (1999) to explore the connection between TLR protein sequence evolution and the distinct ligand properties of the three clades. Several case studies, for example of the caspase gene family (Wang and Gu 2001) and the Jak protein kinase family (Gu et al. 2002), have shown the promise of this methodology in functional genomics. The results are presented in Table 1. Interestingly, although clades A and B are evolutionarily more closely related, the coefficient of functional divergence between them is relatively higher than that between clades A and C or clades B and C. This pattern is consistent with the ligand properties of the clades. The ligands of clade A (Gram-positive) and clade B (Gram-negative) are different, so more functional divergence would be expected between them, whereas clade C shares part of its ligands with clade A or B. A similar pattern is obtained when the TLR2, 4, and 9 genes, which have relatively large sets of sequences available, are used to represent clades A, B, and C.
To investigate the connection between sequence evolution and the different ligand properties of clades A and B, we predicted the critical amino acid residues that are responsible for the functional divergence by calculating the site-specific profile (posterior probability) for each pair-wise comparison of TLR2, 4, and 9. We observed that the number of sites with higher values in the extracellular region is much greater than in the cytoplasmic region for all three gene comparisons. This pattern is not affected by the cut-off value, implying a potential connection with the diversity of ligand properties. Given an appropriate cut-off value (see the footnote of Table 2), the number of predicted critical residues within each domain of TLR genes between pairs of gene clusters is presented in Table 2. These predicted sites were clearly conserved in one cluster but variable in the other. The χ2 test shows significant differences between extracellular and cytoplasmic regions in all comparisons; the signal and transmembrane regions were not included because of their short lengths.
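The kind of χ2 comparison described above can be sketched as a test on a 2 × 2 table of region (extracellular or cytoplasmic) against site status (predicted critical or not). The counts below are hypothetical placeholders, not the values in Table 2, and the text does not state whether a continuity correction was applied; this sketch uses the uncorrected Pearson statistic with one degree of freedom.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]], with its p-value for 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (NOT taken from Table 2):
# rows are regions, columns are (critical sites, remaining sites).
extracellular = (60, 540)
cytoplasmic = (5, 145)

chi2, p_value = chi2_2x2(*extracellular, *cytoplasmic)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With real data, the four cells would be filled from the number of predicted critical residues and the domain lengths for each pair-wise comparison.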
It is well known that duplication in the TLR gene family gives the host opportunities to recognize a variety of pathogens. Our finding that the extracellular domain had significantly higher functional divergence than the cytoplasmic (TIR) domain supports the concept that the extracellular domain is biologically critical for host-pathogen interactions. Indeed, the site-specific analysis detected the highest level of functional divergence in the region between the LRR9 and LRR13 motifs of TLR4, which may have potential RNA-binding domain function (Kirschning and Schumann 2003). Fig. 2 shows the distributions of the number of predicted critical sites in LRRs among four pairs of cluster comparisons. The numbers of predicted critical amino acid residues between the LRR10 and LRR15 motifs were generally higher than in the rest of the motifs of the extracellular domain. This implies that the region between LRR10 and LRR15 might contain potential targets responsible for ligand binding, which is testable by further biological experimentation. On the other hand, the conserved cytoplasmic domain may not need to cope with a variety of ligands or with ligands of variable structure. Indeed, all TLRs share a common adaptor molecule, MyD88, that interacts with the TIR domain for signal transduction (Means et al. 2000); a more conserved structure of the cytoplasmic domain can therefore ensure specificity or affinity of binding.
In summary, the site-specific posterior profile approach was applied to predict only Type I functional divergence among homologous genes within a gene family. There are many other approaches to identifying functional divergence from an evolutionary perspective (Casari et al. 1995; Livingstone and Barton 1996; Pollock et al. 1999; Gaucher et al. 2001, 2002). As more sequence data accumulate, multi-species sequence analysis using the current approach will yield more accurate and reliable predictions of functional divergence. Further study will incorporate microarray data to evaluate the relative importance of expression divergence and protein function divergence after vertebrate gene duplications (Gu 2004; Gu et al. 2005).
References
Akira S, Takeda K, Kaisho T (2001) Toll-like receptors: critical proteins linking innate and acquired immunity. Nat Immunol 2:675–680
Anderson K, Nusslein-Volhard C (1986) Dorsal-group genes of Drosophila. In: Gall J, ed. Gametogenesis and the early embryo. New York: Alan R. Liss, 177–194
Casari G, Sander C, Valencia A (1995) A method to predict functional residues in proteins. Nat Struct Biol 2:171–178
Gaucher EA, Gu X, Miyamoto MM, Benner SA (2002) Predicting functional divergence in protein evolution by site-specific rate shifts. Trends Biochem Sci 27:315–321
Gaucher EA, Miyamoto MM, Benner SA (2001) Function-structure analysis of proteins using covarion-based evolutionary approaches: Elongation factors. Proc Natl Acad Sci USA 98:548–552
Gu J, Wang Y, Gu X (2002) Evolutionary analysis for functional divergence of Jak protein kinase domains and tissue-specific genes. J Mol Evol 54:725–733
Gu X (1999) Statistical methods for testing functional divergence after gene duplication. Mol Biol Evol 16:1664–1674
Gu X (2001) Maximum likelihood approach for gene family evolution under functional divergence. Mol Biol Evol 18:453–464
Gu X, Vander Velden K (2002) DIVERGE: Phylogeny-based Analysis for Functional-Structural Divergence of a Protein Family. Bioinformatics 18:500–501
Gu X (2004) Statistical framework for phylogenetic analysis of expression profiles. Genetics 167:531–542
Gu X, Zhang Z, Huang W (2005) Rapid evolution of expression and regulatory divergences after yeast gene duplications. Proc Natl Acad Sci USA 102:707–712
Hajjar AM, O’Mahony DS, Ozinsky A, Underhill DM, Aderem A, Klebanoff SJ, Wilson CB (2001) Cutting edge: functional interactions between toll-like receptor (TLR) 2 and TLR1 or TLR6 in response to phenol-soluble modulin. J Immunol 166:15–19
Lemaitre B, Nicolas E, Michaut L, Reichhart JM, Hoffmann JA (1996) The dorsoventral regulatory gene cassette spatzle/Toll/cactus controls the potent antifungal response in Drosophila adults. Cell 86:973–983
Livingstone CD, Barton GJ (1996) Identification of functional residues and secondary structure from protein multiple sequence alignment. Methods Enzymol 266:497–512
Means TK, Golenbock DT, Fenton MJ (2000) Structure and function of Toll-like receptor proteins. Life Sci 68:241–258
Medzhitov R, Preston-Hurlburt P, Janeway CA Jr (1997) A human homologue of the Drosophila Toll protein signals activation of adaptive immunity. Nature 388:394–397
Nomura N, Nagase T, Miyajima N, Sazuka T, Tanaka A, Sato S, Seki N, Kawarabayasi Y, Ishikawa K, Tabata S (1994) Prediction of the coding sequences of unidentified human genes. II. The coding sequences of 40 new genes (KIAA0041-KIAA0080) deduced by analysis of cDNA clones from human cell line KG-1 (supplement). DNA Res 1:251–262
Pollock DD, Taylor WR, Goldman N (1999) Coevolving protein residues: maximum likelihood identification and relationship to structure. J Mol Biol 287:187–198
Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425
Schupbach T, Wieschaus E (1989) Female sterile mutations on the second chromosome of Drosophila melanogaster. I. Maternal effect mutations. Genetics 121:101–117
Smirnova I, Poltorak A, Chan EK, McBride C, Beutler B (2000) Phylogenetic variation and polymorphism at the toll-like receptor 4 locus (TLR4). Genome Biol http://www.genomebiology.com/2000/1/1/research/002.1.
Takeuchi O, Akira S (2001) Toll-like receptors; their physiological role and signal transduction system. Int Immunopharmacol 1:625–635
Tauszig S, Jouanguy E, Hoffmann JA, Imler JL (2000) Toll-related receptors and the control of antimicrobial peptide expression in Drosophila. Proc Natl Acad Sci USA 97:10520–10525
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882
Werling D, Jungi TW (2003) TOLL-like receptors linking innate and adaptive immune response. Vet Immunol Immunopathol 91:1–12
Zarember KA, Godowski PJ (2002) Tissue expression of human Toll-like receptors and differential regulation of Toll-like receptor mRNAs in leukocytes in response to microbes, their products, and cytokines. J Immunol 168:554–561
Cite this article
Zhou, H., Gu, J., Lamont, S.J. et al. Evolutionary Analysis for Functional Divergence of the Toll-Like Receptor Gene Family and Altered Functional Constraints. J Mol Evol 65, 119–123 (2007). https://doi.org/10.1007/s00239-005-0008-4
DOI: https://doi.org/10.1007/s00239-005-0008-4