Introduction

The complexity of a cell and its functions is mediated by a large number of molecules and their interactions. Such interactions, which contribute to various functions, are of great significance not only in plants but also in the animal kingdom. In that context, this review deals with animal lectins, whose binding to carbohydrates underlies their role in diverse biological functions. This binding is primarily the interaction of a non-enzymatic protein with the glycan moieties protruding from the cell surface. Such interactions and their complexity contribute to many aspects of the cell and its functions. Early research focused mostly on plant lectins, even though some proteins now known as animal lectins had been found earlier (Tsaneva and Van Damme 2020). Numerous studies now show that animal lectins have astonishingly diverse functions, from cell–cell interactions to the immune response (Loh et al. 2017). This diversity arises from their ability to recognize and discriminate among various glycan molecules. The carbohydrates they bind may be endogenous or may be brought in by invading microorganisms. The primary function of lectins is to bind carbohydrates and glycoconjugates, decode the information they carry, and mediate functions including intercellular interactions, cell signaling, and immune responses (Lam and Ng 2011). The presence of lectins can be detected by their interaction with immobilized sugar molecules. This specific binding also enables the isolation and structural analysis of carbohydrate-containing polymers. The first animal lectin model, reported by Ashwell and Harford (1982) with the asialoglycoprotein receptor in mammalian cells, helped to analyze how animal lectins differ in glycan binding. As this binding occurs at the cell membrane, it can also be exploited in cell membrane studies.
Lectins specific to animal glycoconjugates can be identified by various biochemical separation techniques that exploit their binding properties. One example is the isolation of a natural lectin from the hemolymph of the shrimp Penaeus japonicus by affinity chromatography on fetuin–Sepharose (Yang et al. 2007). Earlier research on lectin function focused on plant lectins, especially their agglutinating property. The first animal lectin was reported in snakes in 1902 by Flexner and Noguchi, who observed agglutination and lysis of erythrocytes and leukocytes by snake venom (Hirabayashi et al. 2015). Later, lectins with similar properties were found in invertebrates, especially snails and horseshoe crabs (Fig. 1), and in lower vertebrates such as fish (Shimizu et al. 1977).

Fig. 1 Crystal structure (1TL2) of the animal lectin Tachylectin-2 from Tachypleus tridentatus (Japanese horseshoe crab), bound to its ligand GalNAc, determined by X-ray diffraction

Lectins that agglutinate mammalian red blood cells were also identified. This property assists in the isolation and purification of animal lectins, because they bind in an epitope-specific manner to carbohydrate groups on the red blood cell surface. Such specific interactions can also be used in blood typing and in studying its chemical basis (Flexner and Noguchi 1902). In 1960, Nowell highlighted the significance of lectins in immunology (Vasta et al. 2001); for example, lectins from chicken tissues exhibit mitogenicity, which aids cytogenetic studies and chromosomal analysis (Kilpatrick 1999). Moreover, animal lectins are excellent biomarkers for detecting cancer cells; for example, galectins, a major type of animal lectin, are expressed in melanomas, astrocytomas, and bladder and ovarian tumors. Animal lectins are also effective targets in apoptosis, thereby regulating cancer signaling pathways, which could be exploited to develop potent antitumor drugs (Mody et al. 1995). Animal lectins also act as antimicrobial agents in host–pathogen defense through mechanisms that involve membrane permeabilization, autophagy, and vacuole lysis (Dias et al. 2015). Recent research suggests that the animal lectin galectin-3 could be targeted in the treatment of COVID-19, using galectin-3 inhibitors for their anti-inflammatory effect (Caniglia et al. 2020).

Chronological overview of animal lectins

The occurrence in nature of erythrocyte-agglutinating proteins has been known since the turn of the nineteenth century. Around 1970, it was found that such proteins could bind other biologically significant macromolecules, particularly carbohydrates, and they were termed lectins. Despite their widespread distribution in plants and, to a lesser degree, invertebrates, relatively few lectins had been isolated by the early 1970s, and they received little attention (Kilpatrick 2002). However, it was quickly demonstrated that lectins are extremely useful tools for studying carbohydrates on cell surfaces, especially for their isolation, purification, and characterization and for tracking changes in malignancy. In the ensuing years, more investigators became interested in lectins, and over 100 lectin structures with their characteristics were revealed. Even though earlier research focused mainly on plant lectins, several molecules of animal origin that bind specific carbohydrates were found before plant lectins were discovered. The reaggregation of dissociated marine sponge cells, a form of species-specific recognition, was the first indication that lectins exist in animals (Müller et al. 1979). In 1853, Jean-Martin Charcot reported the Charcot–Leyden crystals (CLCs) in the cardiac blood and spleen of a patient who died of leukemia (Charcot 1853). Later, in 1872, Leyden found that these crystals were associated with eosinophil-mediated inflammation in the sputum of asthma patients. The crystal-forming protein was later identified as a member of the carbohydrate-binding galectin family, owing to its specific carbohydrate-binding domain for simple saccharides such as N-acetylglucosamine and lactose (Leonidas et al. 1995). In 1973, another agglutinin for sheep erythrocytes was found in the slime mold Dictyostelium discoideum and named discoidin-1, which is now a model for cell–cell adhesion analysis (Rosen and Bertozzi 1994).
Discoidin-1 was more specific in its binding and was inhibited by lactose. Similarly, erythrocyte-agglutinating compounds have been discovered in crustacean hemolymph and snake venom. Flexner and Noguchi reported the first animal lectin in snakes in 1902, demonstrating that a variety of snake venoms can agglutinate blood cells and induce their lysis (Flexner and Noguchi 1902). Many years later, lectins in snake venoms were re-examined, and in 1980 thrombolectin from Bothrops atrox became the first to be isolated in pure form (Gartner et al. 1980). A similar lectin had been identified in 1975 in electric eel tissue (Teichberg et al. 1975). In the following years, other similar lectins were found that differed in their carbohydrate-binding specificities (Liener 2012). These findings piqued researchers' interest in looking for lectins in different animal species, which led to the discovery of β-galactoside-binding lectins in vertebrates. With regard to mammalian lectins, a mammalian protein able to conglutinate complement-activated red blood cells was identified in 1906 and was later called ‘collectin conglutinin’ (Bordet and Gay 2022). Subsequently, plant agglutinins were replaced with eel agglutinin for serum-based blood analysis (Watkins 2001). Mammalian serum lectins were first identified by Robinson et al. (1975), followed by the isolation of lectins from rabbit liver by Kawasaki et al. (1978).

The first solid evidence for a mammalian lectin appeared fortuitously, as later reviewed by Ashwell and Harford (1982). The work examined the mechanism that mediates the turnover of glycoproteins in the blood circulation and revealed a substantial difference between the circulating half-lives of glycoproteins that terminate in sialic acid and those that terminate in galactose residues. The end products were found to accumulate in the liver, and this led to the discovery of a specific asialoglycoprotein receptor, a hepatic membrane protein complex that recognizes circulating glycoproteins with terminal β-galactose or N-acetylgalactosamine (Kawasaki and Ashwell 1977). Following that, an avian hepatic lectin specific for terminal N-acetylglucosamine was also found (Leffler 2018). Later, lectins found in slime molds and complex animal tissues proved to differ from the first discovered mammalian lectin in features such as solubility and molecular weight, and these were collectively termed ‘galectins’ (Taylor et al. 2022). Around 1970, Elizabeth Neufeld reported another lectin-related finding: a glycoconjugate-dependent system that mediates the uptake of lysosomal enzymes by fibroblasts. In 1977, William Sly reported that this uptake could be inhibited by mannose-6-phosphate. Further research led to the discovery of mannose-6-phosphate receptors, which target enzymes to lysosomes, and this system was proposed for treating lysosomal enzyme-related diseases. These receptors are now classified as P-type lectins (Taylor et al. 2022). Another lectin, C-reactive protein, a pentraxin, was first reported in 1930 as a precipitin of pneumococcal C polysaccharide (Tillett and Francis 1930). Most of the cell adhesion molecules discovered earlier are now known to be lectins classified into several families, for example, the CD11b/CD18 integrin (Kilpatrick 2000).
In the late 1980s, it was discovered that the primary amino acid sequence of a protein could be used to predict its glycan-recognition properties. In that respect, collectins are a specific type of lectin, defined by their primary structure, with a C-type carbohydrate-binding domain and a collagen domain (Hansen et al. 2016). Mannan-binding lectin was the first reported and is the most significant human collectin (Bordet and Gay 2022). Later, other collectin-like proteins were cloned from human liver (Ohtani et al. 1999) and placenta, such as CL-P1 or SRCL, which have a role in the phagocytosis of bacteria (Nakamura et al. 2001). These findings served as a springboard for the discovery of numerous lectins, identified primarily by their unique glycan affinity. More than 100 lectins have now been identified, and their structural properties have been analyzed for wide application in the health sciences.

Structure and classification of animal lectins

Animal lectins are classified on the basis of several interdependent factors: the complexity of their carbohydrate ligands, the biological processes in which they participate, their expression levels, and their reliance on divalent cations. A significant number of lectins have been identified in animals over the years, along with their primary sequences, and it is necessary to sort them according to their structural and functional characteristics (Fig. 2).

Fig. 2 Different types of animal lectins and their cellular location

Drickamer (1988) classified lectin molecules into two classes based on their carbohydrate-recognition domain (CRD). He explained that, although the overall structure of lectins is highly variable, the CRD is highly conserved in its amino acid residues and can be used to identify them. On that basis, animal lectins were classified into C-type lectins, named for their dependence on Ca2+ ions for binding, and S-type lectins, which are sulfhydryl-dependent and are now known as galectins (Drickamer 1988). Apart from the sequence similarity of the conserved glycan-binding domain, these molecules showed differences in their non-lectin structures as lectins evolved and as more of them were characterized. As a result, lectins were divided into more families and subfamilies, as listed in Table 1.

Table 1 Classification of animal lectins and their ligands (Wiederschain 2009)

C-type lectins

As various animal lectins were discovered and the structural and biochemical characteristics of their domains were analyzed, it was found that a particular group of lectins possesses a carbohydrate-recognition domain (CRD) that plays a vital role in glycan binding. This domain was found exclusively in Ca2+-dependent lectins (Drickamer 1989). More such proteins were discovered by comparing CRD sequences from various C-type lectins, which showed conserved sequence motifs distinctive to the domain (Drickamer 1993). C-type lectins have a modular CRD that in most cases binds sugars through Ca2+, making the sugar-binding activity Ca2+-dependent (Fig. 3). As more proteins were identified, it became evident that not every protein with a C-type CRD binds glycans and Ca2+. To address this discrepancy, the term ‘C-type lectin-like domain’ (CTLD) was coined for such domains (Weis et al. 1998). The term CRD is often used for the short amino acid motifs in CTLDs that interact specifically with Ca2+ and carbohydrates (Rivkin et al. 2000). In general, therefore, C-type lectins are Ca2+-dependent lectins present in the extracellular matrix, serum, and membranes that possess a conserved domain called the ‘carbohydrate-recognition domain’. The direct interaction of Ca2+ with the bound sugar through coordination bonds is a unique feature specific to this domain (Drickamer 1993). It is also worth noting that the overall architecture of a lectin depends on how the CRD is arranged with other domains, which underlies the multivalent binding of lectins (Drickamer and Taylor 2015; Mitchell and Gibson 2015).

Fig. 3 Different types of C-type lectins including Dectin-1, Dectin-2, Mannose receptor, DC-SIGN and their structural and ligand characterization

Even though the majority of C-type lectins (CTLs) bind glycan and glycoconjugate ligands in a Ca2+-dependent fashion, certain non-carbohydrate ligands, such as proteins, lipids, inorganic molecules, and ice crystals, also bind to these lectins at different domains (Drickamer and Taylor 2015; Mitchell and Gibson 2015). As a result of these diverse interactions, the C-type lectin family has become a highly significant group in which about 17 classes of proteins have been identified through structural and genomic analysis (Drickamer and Fadden 2002). The most relevant classes of CTLs are discussed below.

Endocytic receptors

Asialoglycoprotein receptors

The majority of structurally similar C-type lectins act as membrane-bound receptors. These proteins have a short NH2-terminal cytoplasmic tail, an uncleaved signal sequence, an extracellular segment (neck region), and a COOH-terminal part holding a C-type lectin-like domain (CTLD), an arrangement that resembles type II transmembrane proteins (Fig. 4) (Zelensky and Gready 2005).

Fig. 4 The structure of a C-type lectin, the asialoglycoprotein receptor, and the crystal structure (1DV8) of the carbohydrate-recognition domain of its H1 subunit, determined by X-ray diffraction

The asialoglycoprotein receptors (ASGPRs) contribute specifically to the regulation of serum glycoprotein levels, thus maintaining homeostasis (Cummings 2009). This mechanism was analyzed with human ASGPR, which has two subunits, H1 and H2, each resembling a type II transmembrane protein. The receptor facilitates the clearance of serum glycoproteins by interacting with oligosaccharides whose terminal sialic acid residues have been stripped, thereby exposing terminal galactose residues (Manning et al. 2017). ASGPR is exclusively expressed by hepatocytes, which makes it an appealing target for hepatocyte-specific drug delivery. Another important C-type lectin receptor is the chicken hepatic lectin (CHL), which is unique in binding N-acetylglucosamine in preference to other monosaccharides. This receptor is a trimer made up of a single type of subunit that is similar to the polypeptides of the asialoglycoprotein receptor (Kawasaki and Ashwell 1977).

Macrophage mannose receptor

Pathogens and potentially toxic glycans are cleared by the macrophage mannose receptor, which resembles type I transmembrane receptor proteins. These receptors clear glycans with terminal mannose residues by binding to them selectively. Mannose-specific binding and internalization lead to lysosomal destruction, resulting in the elimination of foreign pathogens by the innate immune system. Macrophage mannose receptors bind a diverse range of mannose-bearing ligands, including lysosomal enzymes and tissue plasminogen activator, and their specificity relies on their primary structure (Fig. 5) (Stahl 1990).

Fig. 5 The structure of a macrophage mannose receptor, with 8 C-type lectin domains that recognize and bind to mannose ligands on microbes, a fibronectin type II region, and a cysteine-rich domain followed by a cytoplasmic tail region (Created with BioRender.com)

The recognition of N-acetylgalactosamine, which is found on ligands such as pituitary hormones, is mediated by the N-terminal cysteine-rich domain. The structure comprises an extracellular region with an amino-terminal cysteine-rich domain, a fibronectin type II repeat domain, and eight CTLDs, followed by a transmembrane region and a short cytoplasmic tail (Fiete et al. 1998). Although multivalent binding, a characteristic of lectins (Weis and Drickamer 1996), requires every domain, ligand-binding and NMR studies showed that CRD-4 carries the sugar-binding activity (Hitchen et al. 1998). Structural studies of this kind across all domains help in understanding their individual interactions and roles in different functions. Most of these receptors have been shown to mediate or control uptake processes, for example, the phagocytosis of apoptotic cells in COPD (Hodge et al. 2003), and this uptake is affected by various factors.

Natural killer cell receptor

Natural killer (NK) cells are potent effector cells of the immune system that act against foreign invaders and cancer by preventing their spread. NK cells act through their C-type lectin surface receptors, such as NKG2D, Ly49 or KIR, and CD94-NKG2 heterodimers (Mayer et al. 2017), which enable both activation and inhibition in line with their immunological role. These receptors are expressed on NK cells and bind major histocompatibility complex (MHC) class I and II molecules and other cellular stress ligands rather than glycans, because most NK cell receptors lack Ca2+-binding sites and certain CRD regions. The integration of signals from activating and inhibitory NK cell receptors regulates NK cell activation, including target cell killing and cytokine production (Pegram et al. 2011). One of the major types of NK cell receptors is Dectin-1, which is primarily a receptor for the recognition of fungi (Penicillium, Pneumocystis, Aspergillus, Candida, and Saccharomyces). It binds β-glucan carbohydrate residues (Fig. 3) and is mostly expressed by myeloid cells (Yan et al. 2020; Plato et al. 2013).

Kupffer cell receptors

Kupffer cells are the largest macrophage population in the liver and are involved in maintaining stable liver function (Klein et al. 2007). These resident macrophages have surface receptors that are specific for fucose and galactose residues (Yang et al. 2013).

Structural analysis of endogenous Kupffer cell receptors has revealed repeated sequences between the CRDs that form alpha helices (Bevilacqua and Nelson 1993). One such lectin is CLEC4F (Fig. 6), a fucose-binding lectin that was first purified from rat liver on an L-fucose-bovine serum albumin (BSA)-Sepharose column. It was later found to bind other glycan moieties, such as galactose and N-acetyl-D-galactosamine, which adds to its specific lectin interactions alongside those of hepatocytes (Kolb-Bachofen et al. 1982; Yang et al. 2013).

Fig. 6 The trimeric structure of the Kupffer cell C-type lectin receptor Clec4f on mouse Kupffer cells, determined by X-ray diffraction (Created with BioRender.com)

Selectin-cell adhesion molecules

Selectins are a Ca2+-dependent receptor family that mediates important cell–cell interactions in a variety of processes, such as leukocyte trafficking, inflammation, thrombosis, and tissue injury (Rosen and Bertozzi 1994). According to structural reviews, these proteins are type I transmembrane proteins that possess an epidermal growth factor-like domain, about 10–20 short consensus repeats, a transmembrane domain, and a cytoplasmic tail. Based on the cells that express them and their purpose, selectins are divided into three classes: L-selectins (leukocytes), P-selectins (platelets), and E-selectins (activated endothelial cells) (Fig. 7) (Rosen and Bertozzi 1994). Ligand analysis showed that selectins recognize the tetrasaccharide sialyl-Lewis X and its isomer sialyl-Lewis A, which are terminal groups of N- and O-glycans. A ligand for E-selectin has also been identified as a related structure in which sialic acid is replaced by a sulfate group. These findings indicate that a negatively charged ligand must occupy a portion of the selectin-binding site and that the CRD lies at the amino-terminal end (Rosen and Bertozzi 1994). During inflammation, immune cells migrate to the affected tissue, and selectins interact with glycosylated ligands to mediate rolling, which enables leukocytes to encounter chemokines that activate leukocyte integrins and so allows them to arrest and crawl (McEver and Zhu 2010). Selectins have been shown to act as adhesion receptors in models of inflammation, thrombosis, and immune responses, especially P-selectin in the activation of β2 integrins (Yago et al. 2018).

Fig. 7 Three types of selectins, E-selectin, P-selectin, and L-selectin, each with a carbohydrate-recognition domain (CRD) and an epidermal growth factor-like domain, involved in binding specific ligands such as CD34 and PSGL-1

Selectins also activate leukocytes by interacting with ligands including PSGL-1 and CD44 to elicit signals for immune responses, such as superoxide generation, cytokine secretion, and tissue factor synthesis. In several preclinical disease models, including atherosclerosis and thrombosis, selectins, especially P-selectin, have been shown to contribute to inflammation and thrombosis, which is significant for drug development (McEver 2015).

Collectin

Collectins are soluble oligomeric proteins with a carboxy-terminal carbohydrate-recognition domain (CRD) and a collagen-like domain with a short cysteine-rich N-terminus. The CRD is involved in ligand recognition and binding, whereas the N-terminal domain determines the structural characteristics (Fig. 8). Collectins are divided into two structural groups: bouquet-like structures with shorter collagenous domains and cruciform structures with more Gly-X-Y repeats (McEver 2015). The first collectin to be discovered was conglutinin, a bovine serum collectin capable of agglutinating cells that are opsonized through antibodies or complement pathways.

Fig. 8 Structure of a Collectin with Carbohydrate-Recognition Domain (CRD), a collagen triple helix, and a cysteine-rich domain forming a helical subunit

A mannose-binding lectin (MBL) is formed by five such helical subunits, whose binding to glycans is mediated by the CRD. Several types of collectins have been identified over the years, with significance in the various functions listed in Table 2. The major role of collectins is host defense: they detect a variety of microorganisms and engage effector complement proteins (Garred et al. 2016). Collectins also stimulate the humoral component of innate immunity (Dobó et al. 2016). Mannose-binding lectin is the most significant collectin and has a major role in the immune response (Foo et al. 2015). Two types of MBL have been identified and sequenced in rats: mannose-binding protein A (MBP-A, in serum) and mannose-binding protein C (MBP-C, in the liver). Despite their structural similarities, MBP-A and MBP-C have different carbohydrate ligand affinities: MBP-A recognizes terminal sugar residues of oligosaccharides, whereas MBP-C recognizes the trimannosyl core of N-linked oligosaccharides (Childs et al. 1990). Mannose-binding proteins and complement component C1q have a similar overall structural organization, which enables MBP to act in a similar fashion and fix complement via the classical pathway. Through this property, binding to a microbial surface glycan ligand allows antibody-independent complement activation (Ikeda et al. 1987). Mannose-binding proteins also play a major role in the chemotaxis of a variety of cell types, including neutrophils, macrophages, and dendritic cells.

Table 2 The diverse cellular applications of various animal lectin families (Vasta et al. 2007)

S-type lectins (galectins)

S-type lectins are soluble β-galactoside-binding lectins that bind in a Ca2+-independent manner. Their function rests on a conserved group of amino acid residues that make up the carbohydrate-recognition domain (CRD) (Ikeda et al. 1987). These proteins were initially termed S-type to denote their need for sulfhydryl groups, but the name was later replaced by galectins, as site-directed mutagenesis revealed that certain soluble proteins of this group function without sulfhydryl groups (Hirabayashi 1996). Studies have also revealed a conserved interaction between the non-polar face of galactose and an aromatic side chain of the receptor (Iobst et al. 1994). About 15 galectins have been characterized by primary structural analysis to date, and they vary in their cellular location, binding affinities, carbohydrate-binding domain, and expression. Galectins are further classified into three groups: proto-type, chimera-type, and tandem repeat-type galectins (Fig. 9). The CRDs of galectins are built from two antiparallel β-sheets arranged back-to-back, formed by the F (F1-FX) and S (S1-SY) strands, with the carbohydrate-binding site (CBS) held in a groove on the S-sheet side (Iobst et al. 1994). Proto-type galectins, such as galectin-1, -2, -7, -10, -13, and -14, have only one CRD and exist as dimers, whereas the chimera-type galectin, galectin-3, has a CRD at the COOH terminus and a non-CRD region at the NH2 terminus. Tandem repeat-type galectins, such as galectin-4, -8, -9, and -12, have two CRDs joined by a short linker peptide. Galectin-1 seems to have a strong affinity for complex-type N-glycans, while galectin-3 has a strong affinity for LacNAc repeats (Iobst et al. 1994).
The affinities of galectins for ligand glycoconjugates tend to vary with the cellular location where they act, so differences in the glycoconjugates on ligand molecules can influence galectin activities (Thiemann and Baum 2016). One report shows that dimeric Gal-1 can induce phosphatidylserine exposure and promote phagocytic recognition of leukocytes, demonstrating how a ligand and its structural characteristics affect function (Dias-Baruffi et al. 2003). Galectins play a role in varied physiological processes, including the immune response, cell motility, early growth processes, inflammation, cell–cell communication, and pathological responses. Understanding the structural properties of galectins enables their use in therapeutic applications, such as drug design with glycan ligands. Galectins, especially Gal-1 and Gal-3, are highly expressed in immune organs and cells, implying a significant role in the immune response (Levi et al. 1983).

Fig. 9 Three major types of galectins: proto-type galectins (galectin-1, -2, -7, -10, -13, and -14), the chimera-type galectin (galectin-3), and tandem repeat-type galectins (galectin-4, -8, -9, and -12). All galectins are involved in specific ligand binding that mediates cell–cell adhesion, cell–ECM adhesion, intracellular signaling, and galectin–glycoprotein lattice formation
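
The three-group classification of galectins described above can be written down as a small lookup table. The following Python sketch is illustrative only: the names GALECTIN_GROUPS, CRD_COUNT, and galectin_group are hypothetical and do not come from any database or library, and the table covers only the galectin numbers listed in this section.

```python
# Illustrative lookup built only from the galectin numbers named in the text.
# All names here are hypothetical; this is not an authoritative nomenclature.
GALECTIN_GROUPS = {
    "proto-type": {1, 2, 7, 10, 13, 14},
    "chimera-type": {3},
    "tandem repeat-type": {4, 8, 9, 12},
}

# Number of carbohydrate-recognition domains (CRDs) per polypeptide,
# as described in the text for each group.
CRD_COUNT = {
    "proto-type": 1,
    "chimera-type": 1,
    "tandem repeat-type": 2,
}


def galectin_group(number: int) -> str:
    """Return the structural group of galectin-<number>.

    Raises KeyError for galectins that this section does not assign to a group.
    """
    for group, members in GALECTIN_GROUPS.items():
        if number in members:
            return group
    raise KeyError(f"galectin-{number} is not assigned to a group in the text")


if __name__ == "__main__":
    for n in (1, 3, 9):
        group = galectin_group(n)
        print(f"galectin-{n}: {group}, {CRD_COUNT[group]} CRD(s)")
```

Galectins that the text does not assign to a group (for example, galectin-5) raise a KeyError rather than being guessed.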

The expression of galectin-9 as a major immunomodulatory factor in rheumatoid arthritis was recently identified, and this may provide a platform for developing biomarkers that correlate with disease diagnosis and treatment (Sun et al. 2021). The role of animal galectins in modulating neuronal targets of the central nervous system (CNS) may be highly relevant to understanding neurological diseases, as reported by various studies (Araújo et al. 2020). In the context of the global COVID-19 pandemic, other research reported that targeting Gal-3, given the anti-inflammatory activity of its inhibitors and its structural resemblance to the viral spike protein, could be useful in the treatment of COVID-19 (Caniglia et al. 2020).

M-type lectins

M-type lectins are another type of animal lectin. They are highly homologous to the mannosidases of the endoplasmic reticulum but lack mannosidase activity owing to the absence of certain essential residues and disulfide bonds (Hirao et al. 2006). M-type lectins resemble type II transmembrane proteins with short cytoplasmic tails. Potential M-type lectins are composed of amino acid residues (cysteine or glutamic acid residues) that are essential for mannosidase activity, and approximately 135 such residues have been identified so far (Mast et al. 2005). One significant M-type lectin is EDEM (ER degradation-enhancing α-mannosidase-like protein), which is involved in endoplasmic reticulum-associated degradation (ERAD) and occurs in three types: EDEM1, EDEM2, and EDEM3 (Fig. 10).

Fig. 10 The protein being translocated is sorted, and misfolded proteins are directed toward the proteasome via endoplasmic reticulum-associated degradation (ERAD) through the action of the EDEM1, EDEM2, and EDEM3 proteins

These proteins assist in the identification and removal of misfolded proteins, allowing stable proteins to be retained and cellular processes to run efficiently. One such protein, EDEM3, is reported in mouse as a soluble EDEM homolog that aids endoplasmic reticulum-associated degradation and the mannose trimming of glycoproteins, converting Man8GlcNAc2 oligosaccharides to Man6-7GlcNAc2. Another study found that EDEM2, a Homo sapiens EDEM homolog, accelerated the breakdown of misfolded alpha1-antitrypsin, implying that the protein is involved in ERAD (Mast et al. 2005).

L-type lectins

L-type lectins are named after the seeds of leguminous plants, from which they were first discovered (Sharon and Lis 1990; Itin et al. 1996). Subsequent research, however, discovered a variety of animal L-lectins that vary structurally and functionally. The carbohydrate-recognition domain (CRD) of L-lectins is a beta-sandwich composed of a concave sheet of seven beta-strands and a convex sheet of five, with the ligand binding to a negatively charged cleft through specific residues (Fiedler and Simons 1994). Four L-lectins have been identified in mammals, ERGIC-53, ERGL, VIP36, and VIPL, and they have been shown to be involved in the cell's protein-sorting mechanism. ERGIC-53, which has a single CRD and a coiled-coil domain, transports proteins across the endoplasmic reticulum–Golgi intermediate compartment by binding high-mannose glycans in a Ca2+-dependent manner (Fig. 11). Another related mammalian L-lectin, ERGL (ERGIC-53-like), lacks some basic residues required for glycan binding but, like ERGIC-53, plays an important role in the secretion of various glycoproteins in specific tissues (Yerushalmi et al. 2001).

Fig. 11

ERGIC-53, a major lectin in protein sorting, enables protein trafficking from the ER to the Golgi by binding high-mannose residues and recycles between the two compartments

VIP36 (vesicular integral membrane protein) and VIPL (VIP36-like) lectins are two other groups, found in vertebrates and invertebrates respectively, that have a major role in protein trafficking and regulation (Nufer et al. 2003). Pentraxins, another significant family of L-type lectins, function as pattern-recognition receptors (PRRs) and play a role in both vertebrate and invertebrate innate immunity (Du Clos 2013).

P-type lectins

The 'P-type' lectin family denotes proteins whose carbohydrate-recognition domain (CRD) binds mannose-6-phosphate (M6P) ligands. The ability to recognize phosphorylated mannose residues distinguishes the P-type lectin family from other groups. Its main members are the cation-dependent mannose-6-phosphate receptor (CD-MPR) and the insulin-like growth factor II/mannose-6-phosphate receptor (IGF-II/MPR); hence, these proteins are usually termed mannose-6-phosphate receptors (MPRs) (Du Clos 2013). In vertebrate organisms, MPRs play a major role in targeting the lysosomal enzymes (Fig. 12).

Fig. 12

The lysosomal enzyme precursor receives a mannose-6-phosphate tag at the cis-Golgi; the tag is recognized by the mannose-6-phosphate receptor (2RL9), and the precursor is directed toward the lysosomes through clathrin-coated vesicular trafficking

Targeting begins with the generation of an M6P tag on the enzymes in the cis-Golgi. This signal is produced by the addition of GlcNAc phosphate to terminal mannose residues in N-linked glycans, followed by the removal of the GlcNAc moieties. The tagged enzymes are then transported from the trans-Golgi in vesicles toward lysosomes (Du Clos 2013). Certain P-type CRD-like domains in proteins differ structurally from those of MPRs and are referred to as MPR homology (MRH) domains. Binding affinity analysis showed that CD-MPR, which contains a single MRH domain, exists as dimers that bind diverse glycans (Munro 2001). This ability to bind diverse ligands, including non-glycans, lets P-type lectins take part in various physiological functions such as cell signaling and makes them useful as biomarkers.

I-type lectins

I-type lectins are glycan-binding proteins that are members of the immunoglobulin (Ig) superfamily, and these lectins vary in type based on the conserved amino acid residues in the CRD region (Crocker and Varki 2001). Through their conserved CRD, most I-type lectins bind sialic acid on the cell surface; these are termed ‘Siglecs’ and are the best-characterized I-type lectins.

Siglecs are single-pass type 1 transmembrane proteins with highly homologous extracellular domains (Fig. 13). They consist of an N-terminal variable-set Ig domain containing the sialic acid-binding site, followed by constant-region Ig domains, and they possess a conserved arginine residue essential for ligand binding (Hanasaki et al. 1995). Siglec expression patterns vary, reflecting complex roles that are difficult to analyze because the sialic acid-binding sites are often masked by other ligands. In humans, 11 major Siglecs and one Siglec-like molecule have been identified. The CD33-related Siglecs have four C-set domains and cytoplasmic tyrosine residues that are implicated in signaling and endocytosis (Crocker and Varki 2001). Other Siglec subgroups include sialoadhesin (expressed on macrophages), CD22 (a B-cell adhesion receptor), and myelin-associated glycoprotein (MAG), which are highly conserved and have been identified in both humans and mice (Crocker et al. 1998). A recent study also reported the use of Siglec-binding specificity to produce cell-based glycan arrays that could be useful in therapeutic targeting against a variety of diseases, such as autoimmune diseases and cancer (Crocker and Varki 2001).

Fig. 13

Thirteen characterized I-type lectins (Siglecs), showing the V-set domain that binds sialic acid residues, the C-set domains, the immunoreceptor tyrosine-based inhibitory motif (ITIM), and the GRB2-binding motif

R-type lectins

R-type lectins are another part of the lectin superfamily and have a carbohydrate-recognition domain (CRD) similar in structure to the CRD of ricin, a toxic, soluble plant lectin. The R-type CRD has a beta-trefoil fold, with three lobes organized around a threefold axis. At a ligand-binding site, an aromatic residue stacks against the galactose ring, while hydrophilic residues from the same lobe form hydrogen bonds with the hydroxyl groups. Each lobe possesses a ligand-binding site, so the CRD can interact in diverse ways with a single glycan (Cummings 2009). R-type CRDs are found in members of the macrophage mannose receptor (MR) family, which differ in structure. The MR family in humans includes the MR, the PLA2 receptor, DEC-205/MR6-gp200 (Fig. 14), and the Endo180/urokinase plasminogen activator receptor-associated protein.

Fig. 14

The most important R-type lectins, PLA2R (phospholipase A2 receptor) and DEC-205/MR, with a ricin-type domain that binds glycans, a fibronectin type II domain, and C-type lectin domain repeats

R-type lectins of the MR family are class I transmembrane glycoproteins with a common fibronectin type II domain, multiple C-type lectin domains (CTLDs), and an amino-terminal cysteine-rich domain (Fiete et al. 1998). Mannose receptors are highly expressed on macrophages, hepatic endothelial cells, Kupffer cells, and dendritic cells, where they play a significant role in the innate immune system by promoting the phagocytosis of mannose-rich pathogens (Woodworth and Baenziger 2001).

Other significant animal lectins

Apart from these major lectins, mammals have other lectin molecules that are significant in their physiological processes, including calnexin, F-box lectins, intelectins, chitinase-like lectins, and F-lectins (Cummings 2009). Calnexin serves as the prototype for a small group of ER-resident chaperone proteins distributed throughout the eukaryotes. In humans, there are four calnexin family proteins, namely calnexin, calreticulin, calmegin, and calreticulin 2. Calnexin maintains protein quality control by facilitating the proper folding of proteins that enter the secretory pathway and by directing misfolded proteins to degradation in collaboration with other proteins (Trombetta and Helenius 1998). The murine F-box protein Fbs1 was identified as an F-box lectin, with a domain that functions similarly to a carbohydrate-recognition domain (CRD). These molecules are essential components of the cytoplasmic enzyme system that conjugates ubiquitin, a small polypeptide, to amino acid residues of the target protein that is to be degraded (Mizushima et al. 2004). An F-box protein is found in the E3 complex of the ubiquitin system and controls substrate selection, which varies with the substrate specificity of the protein (Yoshida et al. 2003). For example, proteins in the Fbw and Fbl families contain WD-40 and leucine-rich repeat regions, respectively, and recognize phosphorylated amino acid residues to identify ubiquitination targets (Cenciarelli et al. 1999). The intelectin group of lectins has a basic domain structure made up of a carbohydrate-recognition domain (CRD) of variable length and a non-conserved N-terminal region with a protein signal. Intelectins form disulfide-bonded oligomers, which increase ligand-binding affinity, and show Ca2+-dependent sugar-binding activity throughout the family. In the innate immune system, intelectins function as pathogen-recognition molecules, facilitating phagocytosis and stimulating defense responses (Wrackmeyer et al. 2006).

Chitinase-like lectins are soluble intracellular mammalian proteins that are members of the glycoside hydrolase family 18. These molecules have a triose-phosphate isomerase (TIM) barrel structure with binding specificity for chitooligosaccharides. YKL-40, Ym1, and oviductin are chitinase-like lectins found in mammals (Chang et al. 2001). Ficolins are soluble oligomeric proteins composed of trimeric collagen-like domains connected to fibrinogen-related domains (FReDs), which enable the detection of molecular structures on pathogen and apoptotic cell surfaces and stimulate the complement system (Chang et al. 2001).

Future perspectives of animal lectins

Owing to the structural and functional characteristics of a diverse range of animal lectins, it is evident that these proteins are useful across broad areas of research. They could be used to study everything from the recognition of glycoconjugates to their separation and classification, and even the essential cell signal transduction pathways in which they take part.

Several animal lectin structures have been characterized over the past decades, and these structures shed light on the diversity of their functions as receptors, cell adhesion molecules, immune regulators, and so on (Table 2). Animal lectins that function as receptors in different signaling pathways trigger a range of cellular activities whose activation and inactivation mechanisms can be investigated. These findings suggest that animal lectins, as cell surface transducers, can provide a novel or additional signaling ability to cells. Through protein interactions, lectins can mediate various immune responses, which makes them attractive for immune research. As most host–pathogen interactions involve carbohydrate–protein binding, significant effort must be directed toward understanding and emulating the recognition processes of these lectins. Such understanding aids in finding effective agents for carbohydrate-based therapeutics and for drug targeting through carbohydrate–lectin interactions against various diseases. In conclusion, as with plant lectins, a critical understanding of the molecular basis of animal lectin–carbohydrate interactions and the processes involved is needed. This is essential for the development of diagnostic tools, vaccines, and novel drugs for the treatment of infectious, inflammatory, and malignant diseases.