1 Conserved Structure of Carbohydrate Recognition Domains

In mammals, C-type lectin(-like) receptors (CLRs) play crucial roles in the immune response to pathogen invasion and in physiological homeostasis. The CLRs are type II membrane proteins comprising an N-terminal intracellular domain, a transmembrane region, and extracellular neck and lectin domains (Fig. 12.1a). They are classified into 17 groups (groups I–XVII) on the basis of their domain organization and phylogeny (Zelensky and Gready 2005). Most of the CLRs involved in immune responses belong to either group II or group V, in which the extracellular domain consists of two parts: a neck domain on the N-terminal side and a C-terminal carbohydrate recognition domain (CRD) of 110–130 amino acid residues that is responsible for ligand binding. Various structural studies of CRDs (X-ray crystallography and NMR analyses) have been performed. The structure of DC-SIGN, as a representative CRD, is shown in Fig. 12.1b. The main body of the CRD is composed of two α-helices and two antiparallel β-sheets. Six cysteines (C0–C0′ and C1–C4), which are the most conserved CRD residues, form disulfide bonds between the loops in DC-SIGN. The C1–C4 disulfide bond links the β5-strand and the α1-helix, and the C2–C3 bond links β3 and β5. The C0–C0′ disulfide bond is conserved in almost all of the CLRs described in this chapter. The N-terminus and the C-terminus of the CRD approach each other through the pairing of the antiparallel β-strands β1 and β5. Since the N-terminus is connected to the neck domain located near the cell surface, the membrane-distal face of the CRD is exposed outward and available for ligand binding. Four Ca2+ ion-binding sites (sites 1–4) are often found in CRDs (Fig. 12.1b, c). Site 1, at the membrane-distal face of the CRD, is the most conserved and is primarily responsible for Ca2+-mediated sugar binding. Sites 2 and 3 mainly contribute to additional recognition of sugar moieties (details in Sect. 12.3). In contrast, the Ca2+ ion at site 4, which is coordinated by the α2-helix and the β1/β5-sheet, is not involved in ligand binding, but probably contributes to protein stability. Some CLRs lack Ca2+ ion-binding sites and instead use the “top” face to bind lipid or protein ligands rather than sugars. Interestingly, CLRs often form homodimers, as observed in crystal structures and biochemical analyses (gel filtration, ultracentrifugation, etc.). Some CLRs also form heterodimers, which achieve efficient ligand binding by using both membrane-distal surfaces. Moreover, various CLRs form trimers via the neck domain. CLRs thus adopt monomeric, dimeric, trimeric, or higher oligomeric states to achieve appropriate signaling, but the underlying mechanisms are not yet fully understood.

Fig. 12.1
(a) Schematic representation of the domain organization and surface expression of CLRs. (b) Crystal structure of a representative CLR, DC-SIGN (PDB ID: 1K9I). (c) Another example of a CLR structure, Mincle (PDB ID: 3WH2). Intramolecular disulfide bonds (C0–C0′, C1–C4, and C2–C3) are depicted by sticks. (d) Mechanism of mannose recognition by DC-SIGN. The DC-SIGN residues are represented by lines, and the complex sugar is depicted by a stick model. The calcium ion (Ca-1) bound to the sugar is shown as a sphere. Dotted lines indicate interactions between the protein and the sugar

In the following sections, we take a closer look at the structural characteristics of individual CLRs and discuss how their amino acid sequences and structural differences determine their ligand specificities. The structural properties of the CLRs described in this chapter are summarized in Table 12.1.

Table 12.1 Summary of the features of the CLRs described in this chapter

2 Mono-/Oligosaccharide Recognition

The CRD of rat mannose-binding protein-A (MBP-A) complexed with Man6-GlcNAc2-Asn was the first crystal structure determined for a complex between a CLR and a carbohydrate (Weis et al. 1992). As mentioned above, CRDs have a conserved Ca2+ ion (site 1), and in the MBP-A complex, this ion interacts electrostatically with the 3- and 4-position hydroxyl groups of the mannose. CLRs also have the “Glu-Pro-Asn (EPN)” or “Gln-Pro-Asp (QPD)” sugar-binding motif. Like MBP-A, CRDs with the EPN motif engage sugars in which the 3-OH and 4-OH groups are in the equatorial/equatorial arrangement, such as glucose, N-acetylglucosamine, and mannose (Fig. 12.1d). In contrast, CRDs with the QPD motif engage sugars with the equatorial/axial arrangement of 3-OH/4-OH, such as galactose and N-acetylgalactosamine. The “Trp-Asn-Asp (WND)” motif is also conserved in CLRs and is involved in coordinating the Ca2+ ion at site 1.
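The motif rule described above can be written as a simple lookup. The following is a minimal sketch in Python, not an established prediction tool: the function name `find_sugar_motifs` and the toy fragment are hypothetical, and the rule is only a first-pass heuristic, since a CRD with the EPN motif can still accommodate a sugar with an axial 4-OH group in some complexes (for example, fucose bound to DC-SIGN).

```python
import re

# Motif -> (3-OH/4-OH arrangement, example sugars), as stated in the text.
MOTIF_PREFERENCE = {
    "EPN": ("equatorial/equatorial", ["glucose", "N-acetylglucosamine", "mannose"]),
    "QPD": ("equatorial/axial", ["galactose", "N-acetylgalactosamine"]),
}

def find_sugar_motifs(sequence):
    """Return (1-based position, motif, arrangement, example sugars) per hit."""
    seq = sequence.upper()
    hits = []
    # The lookahead lets every occurrence be reported, including adjacent ones.
    for match in re.finditer(r"(?=(EPN|QPD))", seq):
        motif = match.group(1)
        arrangement, sugars = MOTIF_PREFERENCE[motif]
        hits.append((match.start() + 1, motif, arrangement, sugars))
    return hits

# Hypothetical fragment for illustration only, not a real CRD sequence.
toy_fragment = "GSCYWFEPNNSGNEDCAEFWNDKC"
for pos, motif, arrangement, sugars in find_sugar_motifs(toy_fragment):
    print(pos, motif, arrangement, ", ".join(sugars))
```

A hit only indicates which hydroxyl arrangement the site 1 Ca2+ ion is expected to favor; the surrounding residues and secondary sites still shape the actual ligand specificity.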

DC-SIGN and DC-SIGNR are closely related CLRs expressed on dendritic cell (DC) and endothelial cell surfaces, respectively. Both receptors recognize high-mannose N-linked glycans on the glycoproteins of viruses such as HIV, but only DC-SIGN can bind fucosylated oligosaccharides such as blood group antigens (Appelmelk et al. 2003; Feinberg et al. 2001, 2007; Guo et al. 2004). In the complex structure of human DC-SIGN with GlcNAc2-Man3, a secondary mannose binds DC-SIGN via the Ca2+ ion at site 1 (PDB ID: 1K9I) (Figs. 12.1d and 12.2a). The saccharides adjacent to this mannose also interact along the groove of the CRD surface. Despite the axial orientation of the 4-OH group in fucose, the α1-3-linked fucose of the Lewis x trisaccharide also binds to the Ca2+ ion of DC-SIGN, with the galactose recognized separately (PDB ID: 1SL5 and Fig. 12.2b). Recently, Silva-Martín et al. (2014) reported the complex structures of the macrophage receptor SIGN-R1/CD209b, one of the mouse homologs of human DC-SIGN, with α2-6 sialic acid or microbial dextran sulfate (DexS). SIGN-R1 bound the glucose of DexS at the conventional calcium-binding site (site 1), regardless of the presence or absence of sulfate (PDB ID: 4C9F). Interestingly, sialic acid bound to the SIGN-R1 CRD on the outside of the EPN loop surrounding site 1 (PDB ID: 4CAJ). These characteristics suggest that, in the innate immune response to pathogen invasion, SIGN-R1 may bind microbial polysaccharides and, at the same time, sialic acids on antibodies and on immune glycoproteins such as the complement factor C1q.

Fig. 12.2
CLRs with the “EPN motif” have versatile oligosaccharide recognition mechanisms. Human DC-SIGN with GlcNAc2–Man3 (1K9I) (a), DC-SIGN with Lewis x trisaccharide (1SL5) (b), Langerin with 6SO4–Galβ1–4GlcNAc (3P5I) (c), and mouse DCIR2 with bisecting N-acetylglucosamine (3VYK) (d) are shown. The sphere indicates the Ca-1 ion. The dotted lines in (c) represent the unique interactions between the lysines of Langerin and the negatively charged moiety, mainly the sulfate group, of 6SO4–Galβ1–4GlcNAc. The “EPN motif” in each CLR is colored differently and depicted by sticks

Langerin (CD207/CLEC4K) is expressed on Langerhans cells (LCs), as well as on some dermal and splenic DCs, and mediates immune responses (Valladeau et al. 2000). Langerin has a neck region for trimerization and a CRD that binds various mono- and oligosaccharides of endogenous and pathogenic glycans in a calcium-dependent manner (Chatwell et al. 2008; Stambach and Taylor 2003). Feinberg et al. determined the crystal structures of the human Langerin CRD complexed with oligomannose, the blood group B antigen, and a representative β-glucan (Feinberg et al. 2011). In these complex structures, the Langerin CRD recognizes only a single sugar via the site 1 Ca2+ ion (PDB IDs: 3P5D, 3P5E, 3P5F, and 3P5I). The complex structure of Langerin with the 6SO4–Galβ1–4GlcNAc unit from keratan sulfate revealed a unique interaction. Although Langerin has the EPN motif, which is suited to mannose binding, the 3-OH and axial 4-OH groups of the galactose in this unit were recognized via the site 1 Ca2+ ion. Salt bridges between the sulfate group and Lys299 and Lys313 are essential for holding this ligand (Fig. 12.2c). These structures reveal the diversity of the interactions and illustrate the difficulty of predicting CRD ligands.

In 2013, Nagae et al. reported the complex structure of the mouse dendritic cell immunoreceptor 2 (DCIR2)/CLEC4a CRD with a biantennary complex-type glycan containing bisecting N-acetylglucosamine (GlcNAc) (Nagae et al. 2013). DCIR2 is one of the mouse homologs of human DCIR. Both human DCIR and mouse DCIR2 are expressed on DCs as inhibitory receptors, with an intracellular immunoreceptor tyrosine-based inhibitory motif (ITIM). While human DCIR was shown to bind fucose and mannose preferentially (Lee et al. 2011), Nagae et al. found that mouse DCIR2 specifically recognizes bisecting GlcNAc. In the co-crystal structure of mouse DCIR2 with a hexasaccharide bearing a bisecting GlcNAc, the calcium ion at site 1 was coordinated by the side chains of the EPN motif residues (PDB ID: 3VYK and Fig. 12.2d). The hydroxyl groups of the primary-binding mannose and the branched GlcNAc interact with the DCIR2 residues, either directly or in a water-mediated fashion. The position of the primary-binding mannose overlapped well with that of the mannose on Langerin (PDB ID: 3P5D), but the sugar faced in the opposite direction to the mannose in the DC-SIGN complex (PDB ID: 1K9I). While the precise ligand for human DCIR is still unknown, a different ligand recognition pattern is predicted because human DCIR has the EPS motif instead of EPN, together with longer α3–β3 and β3–β4 loops than those of DCIR2.

3 Glycolipid Recognition (Mincle and MCL)

Macrophage inducible C-type lectin (Mincle) and macrophage C-type lectin (MCL) reportedly bind a glycolipid, trehalose dimycolate (TDM), from mycobacteria. Mincle can also bind other glycolipids, such as gentiobiosyl diacyl glycerides from Malassezia pachydermatis and glycerol monomycolate. To understand the substrate specificity, structural analyses of Mincle were performed independently by two groups. Feinberg et al. reported the crystal structure of bovine Mincle (PDB ID: 4KZW), whereas we solved the crystal structures of human MCL (PDB ID: 3WHD) and human Mincle (PDB ID: 3WH2) (Feinberg et al. 2013; Furukawa et al. 2013). The crystals of bovine and human Mincle were grown under similar low-pH conditions with citrate buffer. In both structures, citrate binds the Ca2+ ion at site 1, which was predicted to be the primary sugar-binding site, but the binding modes of the citrate molecules in the two complexes are slightly different. We also reported the citrate-unbound structure (PDB ID: 3WH3). These crystals grew at neutral pH, and the structure superimposed well on the citrate-bound form (RMSD = 0.12 Å). These results indicated that citrate might be a (weak) ligand of Mincle and that its binding does not cause a large structural change.
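The RMSD quoted for the two Mincle forms is the root-mean-square distance between paired atoms after superposition. The sketch below shows only that final step, assuming the two coordinate sets have already been superimposed and matched atom by atom; the function name `rmsd` and the coordinates are made up for illustration and are not taken from the deposited structures.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points in Å."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two copies of the same atom.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Made-up coordinates (Å): each atom is displaced by 0.1 Å along one axis.
form_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
form_b = [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0), (3.0, 0.0, 0.1)]
print(round(rmsd(form_a, form_b), 3))
```

In practice the superposition itself (for example, a least-squares fit over Cα atoms) is done by structural biology software before this value is computed, and the result depends on which atoms are included.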

Feinberg et al. successfully crystallized bovine Mincle complexed with trehalose (PDB ID: 4KZV) (Fig. 12.3c) under conditions virtually identical to those used for the citrate-bound form of bovine Mincle. In trehalose-bound Mincle, a loop composed of Leu172 to Asp177 is located closer to the site 1 Ca2+ ion. This structural change is caused by the formation of an electrostatic network among Asp193, Glu176, the site 1 Ca2+ ion, and the site 3 Na+ ion. They found that the surface of bovine Mincle near the primary sugar-binding site possesses a hydrophobic channel, running between Phe197/Phe198 on one side and Leu172/Val173 on the other. The entrance of this channel lies adjacent to the 6-OH group of the glucose residue at the primary binding site. In TDM, the acyl chain extends from this 6-OH group and would interact with the hydrophobic channel. Structural modeling suggested a binding mode for an octanoic acid attached to the 6-OH group of the primary glucose. The decreased affinity for acyl trehalose caused by mutation of either Val173 or Phe197/Phe198 supported the validity of this model. In addition, we proposed a glycolipid-binding model of human Mincle by superimposing the human Mincle–citrate complex on the DC-SIGNR–mannose complex. The trehalose moiety of TDM was placed on the sugar-binding site of Mincle in a manner similar to that of mannose binding to DC-SIGNR. This model suggested that a hydrophobic region adjacent to the sugar-binding site could interact with the mycolic acid attached to the glucose 6-OH of TDM. Mutagenesis within this region of Mincle reduced the activity in reporter cell assays, supporting the importance of the hydrophobic region for the interaction with TDM. The predicted lipid-binding sites of the two models overlapped well. Taken together, these structures suggest that the hydrophobic residues in the vicinity of the sugar-binding site are important for recognition of the lipid part of TDM.

Fig. 12.3
Comparison of the ligand recognition patterns of CLRs. (a–c) Complex structures of DC-SIGN with oligomannose (a), DCIR2 with bisecting GlcNAc (b), and Mincle with trehalose (c). (d) Crystal structure of Lox-1. Predicted lipid-binding positions are shown by dotted lines in (c) and (d). (e, h) CLEC-2 in complex with podoplanin (e) or rhodocytin (h). (f) The NKG2A heterodimer with CD94 interacting with HLA-E. (g) KACL recognizes NKp65 as a homodimer. The CLRs are shown in the same orientation, as gray cartoon models. Sugar and protein ligands are represented by black sticks and cartoons with surfaces, respectively. Calcium ions bound for ligand interactions are shown as spheres

The crystal structure of another glycolipid-binding CLR, MCL, a close relative of Mincle, was also reported (Furukawa et al. 2013). The structure indicated that MCL also has a shallow hydrophobic region near the canonical sugar-binding site (site 1), but this region is less pronounced than that in Mincle. This difference may account for the weaker binding affinity of MCL for TDM, as compared to Mincle.

4 Lipopeptide Recognition

Lectin-like oxidized low-density lipoprotein (LDL) receptor 1 (Lox-1) is the major receptor for oxidized LDL (OxLDL) in endothelial cells. The crystal structures of human Lox-1 were reported independently by two groups in 2005 (Ohki et al. 2005; Park et al. 2005). As predicted from the amino acid sequence, Lox-1 adopts an atypical CRD fold, like that of Ly49A, consisting of two antiparallel β-sheets and two α-helices (Fig. 12.3d). Lox-1 has three intramolecular disulfide bonds, Cys144–Cys155, Cys172–Cys264, and Cys243–Cys256. The latter two are commonly observed in CRDs, while the first stabilizes the short antiparallel β-sheet between β0 and β1. The Cys140 residue, which is unique to human Lox-1, forms the intermolecular disulfide bond of the homodimer, as often observed in the C-type lectin-like NK receptors (Natarajan et al. 2002), and is necessary for the proper dimerization of the Lox-1 CRDs (Ohki et al. 2005). The Lox-1 ligand-binding surface, which corresponds to the Ca2+-binding site of the sugar-binding CLRs, exhibits a unique distribution of charged amino acids. The basic residues Arg229 and Arg231, in the long-loop region, and Arg248 are aligned linearly across the homodimer surface, forming the so-called basic spine (Ohki et al. 2005). Lox-1 preferentially associates with negatively charged ligand molecules, including OxLDL. Mutation of all three residues abolished ligand recognition, whereas single mutations of the basic residues Arg229 or Arg248 did not. Furthermore, a remarkable reduction of the LDL binding activity was observed when Trp150 at the dimer interface was mutated. Taken together, proper homodimer formation by Lox-1 is crucial for ligand recognition, because it allows the key elements at the ligand-binding surface to align linearly (basic spine alignment) and provides a sufficient surface width for ligand association.

5 Glycopeptide and Protein Recognition

NKG2 family members are expressed on NK cells and T cells. The NKG2 family includes five isoforms (NKG2A, C, D, E, and F); NKG2B and NKG2H are splice variants derived from NKG2A and NKG2E, respectively. Five NKG2 molecules (NKG2A, B, C, E, and H) form disulfide-linked heterodimers with another CLR, CD94 (Borrego et al. 2006), which is encoded by a single gene and has low polymorphism. The NKG2A/CD94 heterodimer recognizes HLA-E, a major histocompatibility complex (MHC) class I molecule. HLA-E presents the relatively conserved leader sequence of the MHC proteins. The structure of NKG2A/CD94 in complex with HLA-E (loaded with an HLA-G-derived peptide) was reported (PDB IDs: 3CII and 3CDG) (Fig. 12.3f) (Kaiser et al. 2008; Petrie et al. 2008). The antigen-presenting regions of HLA-E, the α1 and α2 domains, with the loaded peptide were recognized by the top CRD faces of both NKG2A and CD94. The Arg(P5) residue of the peptide is recognized by both NKG2A and CD94, whereas Phe(P8) is recognized only by CD94. These residues are basically conserved in the leader sequence and are indispensable for the interaction with the NKG2A/CD94 receptor. The structure clearly explained the significance of the P5 and P8 residues in the HLA-E-loaded peptide.

Keratinocyte-specific C-type lectin-like receptor (KACL) is expressed in the skin and modulates the activities of natural killer (NK) cells through its receptor, NKp65 (Spreu et al. 2010). Recently, the crystal structure of the KACL–NKp65 complex was reported, as the first structure of a CLR–CLR complex (PDB ID: 4IOP) (Fig. 12.3g) (Li et al. 2013). KACL and NKp65 recognize each other through the top faces of their CRDs. The KACL dimer binds two NKp65 monomers independently. KACL forms a dimer in solution and in crystals, while NKp65 exists as a monomer in the crystal. Thus, given that NKp65 is normally monomeric, the KACL–NKp65 complex structure suggests that the dimerization of NKp65 upon KACL binding facilitates signal transduction.

CLEC9A is expressed on the surface of dendritic cells. CLEC9A recognizes a filamentous form of actin (F-actin), which dead cells and virally infected cells commonly display (Zhang et al. 2012). The crystal structure of CLEC9A was reported in the apo form (PDB ID: 3VPP) (Zhang et al. 2012). The putative ligand recognition site of CLEC9A is similar to those of other CLRs. Two mutations located at this face, W131A and W227A, disrupt the ability to bind F-actin, supporting the idea that this face is a ligand recognition site.

KLRG1 is expressed on NK cells, CD4+ T cells, and CD8+ T cells, and recognizes cadherin, a molecule that mediates cell–cell adhesion. Cadherin is composed of five Ig-fold domains (EC1–5). The structures of human KLRG1 (hKLRG1) with human cadherin EC1 (hEC1) and of mouse KLRG1 (mKLRG1) with or without hEC1 were reported (Li et al. 2009): the hKLRG1–hEC1 complex (PDB ID: 3FF8), mKLRG1 (PDB ID: 3FF9), and the mKLRG1–hEC1 complex (PDB ID: 3FF7). Cadherin (EC1) is recognized by the top of the CRD in KLRG1. The KLRG1 monomer binds to the cadherin monomer. The region of cadherin recognized by KLRG1 overlaps with the homodimer interface of cadherin, suggesting that KLRG1 detects the cadherin monomer exposed by the disruption of cell–cell adhesion.

CLEC-2 is expressed in the liver and in blood cells. In platelets, CLEC-2 activation blocks lymphaticovenous connections, leading to separate blood and lymphatic vascular systems. CLEC-2 is unique in that it binds two different molecules, rhodocytin and podoplanin (Suzuki-Inoue et al. 2006, 2007). Rhodocytin, a snake venom toxin, is a heterodimeric protein composed of one α subunit and one β subunit. In contrast, podoplanin is a type I sialomucin-like glycoprotein. The binding of O-glycosylated podoplanin to CLEC-2 triggers the activation of cell spreading, via the downregulation of RhoA activity and myosin light-chain phosphorylation. The structures of CLEC-2 in complex with rhodocytin (PDB ID: 3WWK) (Fig. 12.3h) or podoplanin (PDB ID: 3WSR) (Fig. 12.3e) revealed that CLEC-2 uses a unique side face for ligand binding, which is distinct from the conventional “top” ligand recognition site of other CLRs (Nagae et al. 2014). The binding face of CLEC-2 has positively charged patches composed of basic residues (Arg107, Arg118, Arg152, and Arg157), which recognize the acidic residues of either rhodocytin or podoplanin. The sugar chain of podoplanin is recognized by Asn105, Arg118, and Tyr129, which are adjacent to the positive patches. The unique binding mode of CLEC-2 suggests that CLRs may use all of their surfaces as potential sites for the recognition of various ligands, including proteins, lipids, and sugars.
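The protein complexes discussed in this section can be collected into a small lookup keyed by the binding face, which makes the contrast between the conventional top face and the CLEC-2 side face easy to query. This is only an illustrative sketch: the `Complex` record and the `by_face` helper are hypothetical names, the entries are restricted to the ligand-bound structures and PDB IDs cited in the text above, and the apo structures (CLEC9A, mKLRG1) are deliberately left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Complex:
    receptor: str
    ligand: str
    pdb_ids: tuple
    face: str  # CRD face used for ligand binding, as described in the text

# Ligand-bound structures cited in this section.
COMPLEXES = [
    Complex("NKG2A/CD94", "HLA-E", ("3CII", "3CDG"), "top"),
    Complex("KACL", "NKp65", ("4IOP",), "top"),
    Complex("hKLRG1", "cadherin EC1", ("3FF8",), "top"),
    Complex("mKLRG1", "cadherin EC1", ("3FF7",), "top"),
    Complex("CLEC-2", "rhodocytin", ("3WWK",), "side"),
    Complex("CLEC-2", "podoplanin", ("3WSR",), "side"),
]

def by_face(face):
    """Return the complexes whose ligand is bound on the given CRD face."""
    return [c for c in COMPLEXES if c.face == face]

for c in by_face("side"):
    print(c.receptor, c.ligand, ", ".join(c.pdb_ids))
```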