3.1 Introduction

Lectins are proteins, often glycoproteins, abundant in nature that bind to monosaccharide and oligosaccharide residues of specific structure and configuration. In general, lectin molecules possess two or more carbohydrate-binding sites, which are required for them to agglutinate cells or glycoconjugates (Goldstein and Hayes 1978). Lectins are a structurally diverse class of proteins distinguished by their ability to specifically bind the carbohydrate moieties of cell surface glycoproteins. Lectins can be obtained from plants, microbes, or animals and can be soluble or membrane-bound. In 1888, the observation that castor bean (Ricinus communis L.) seed extract agglutinates animal red blood cells led to the discovery of lectins, first in plants. Following that, similar “agglutinins” were discovered in the seeds of several plants, most commonly leguminous plants, and were renamed lectins because they could differentiate human ABO blood types, which is significant for blood transfusions (Tsaneva and Van Damme 2020). The asialoglycoprotein receptor (ASGPR), discovered by Anatol Morell and Gilbert Ashwell in the late 1960s while they were analyzing the turnover of a serum glycoprotein called ceruloplasmin, was the first animal lectin found (Hudgin et al. 1974). Bacterial surface lectins were discovered in the 1970s by Nathan Sharon and colleagues. Besides their haemagglutinating action, the major role of microbial lectins is to provide adhesion to host cells, which is essential for colonization and pathogenicity (Sharon and Lis 2004).

Lectins attach to simple or complex carbohydrate conjugates in a reversible and noncovalent manner, whether these are free in solution or on cell surfaces. Only surfaces that carry glycoconjugates can function as lectin receptors. The specificity of a lectin is typically determined using a hapten inhibition test, in which different sugars are evaluated for their ability to block erythrocyte hemagglutination. Lectin molecules generally have two or more carbohydrate-binding sites, which are required for them to agglutinate cells or react with complex carbohydrates. Binding is mainly hydrophobic, and electrostatic forces are rarely involved. Lectins have been isolated from crude aqueous or saline-buffer extracts of diverse tissues using conventional protein extraction techniques. These include ammonium sulfate or ethanol precipitation, as well as affinity chromatography (Helliwell 1998). Their structural diversity, together with their specific carbohydrate-binding properties, gives lectins a wide range of biological functions in plants, animals, and microbes. Numerous investigations have demonstrated such biological effects. One example is the in vitro analysis of the action of lectins on lymphocyte mitogenesis, both activation and inhibition, in which lymphocytes from the gastrointestinal tract (GIT) were the most sensitive. It was also found that lectins can agglutinate immunoglobulins, activate the alternative complement pathway, limit fungal development, and stimulate histamine release from immune cells.
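The logic of the hapten inhibition test described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each sugar is summarized by the lowest concentration at which it blocks hemagglutination; the function name and the numerical values are hypothetical and are not measured data.

```python
from typing import Dict, Optional


def most_potent_inhibitor(min_inhibitory_conc_mM: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the sugar with the lowest minimum inhibitory concentration.

    A value of None means the sugar did not block hemagglutination at any
    concentration tested. Returns None if no sugar inhibited at all.
    """
    inhibitors = {sugar: conc
                  for sugar, conc in min_inhibitory_conc_mM.items()
                  if conc is not None}
    if not inhibitors:
        return None
    # The lower the concentration needed to block agglutination,
    # the better the sugar fits the lectin's binding site.
    return min(inhibitors, key=inhibitors.get)


# Hypothetical, illustrative values only (mM); not taken from any study.
example = {"mannose": 3.1, "glucose": 12.5, "galactose": None, "fucose": None}
best = most_potent_inhibitor(example)
```

In this sketch the sugar that inhibits at the lowest concentration is taken as the best indicator of the lectin's monosaccharide specificity; real assays usually compare serial dilutions and report relative inhibitory potency.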

Lectins are resistant to both heat (at 70 °C for more than 30 min) and digestion. Some lectins are extremely resistant to gastrointestinal acids and enzymes (Shah and Rocca 2004). Lectins are also essential components of biological processes such as cell signal transduction, cell–cell communication, and host–pathogen recognition. Host cell receptors that attach to pathogens are frequently glycan-recognizing molecules such as lectins, and the interactions between these lectins and carbohydrate epitopes on the pathogens initiate infection. Identifying and classifying the diverse types of lectins that take part in these processes is therefore critical for developing effective diagnostics and treatments. For example, lectins can act as potential biomarkers in cancer tissues, assisting in the early diagnosis of cancer (Wu et al. 2012). Lectins can be classified into various groups based on their molecular structure, their carbohydrate specificity, variation in the carbohydrate-binding domain, and their source (plants, animals, or microbes), each of which relates to their specific biological functions. Lectins are often categorized using carbohydrates as probes, with identification relying on the strength of the interaction between a carbohydrate ligand and the lectin that binds it. Lectins are generally classified into five categories based on their affinity for the monosaccharides mannose, galactose/N-acetylgalactosamine, fucose, N-acetylglucosamine, and N-acetylneuraminic acid. The abundance of these monosaccharides in the typical glycans found on the surface of eukaryotic cells underlies the biological activities of lectins. Hundreds of lectins have been identified and categorized using affinity chromatography on immobilized carbohydrates and recombinant DNA methods (Hamid et al. 2013).

The characterization of the amino acid sequences of dozens of lectins, along with the determination of around 30 three-dimensional structures, has permitted a classification based on shared molecular and structural characteristics. On that basis, the majority of lectins are classified into three types: (1) simple, (2) mosaic (or multidomain), and (3) macromolecular complexes (Van Damme et al. 1998). Lectin proteins possess at least one carbohydrate-binding domain that is involved in specific glycoconjugate binding, and on that basis lectins can also be classified as merolectins (a single carbohydrate-binding domain; monovalent), hololectins (at least two binding domains that bind structurally similar sugars), chimerolectins (one or more carbohydrate-binding domains together with a specific enzymatic activity), and superlectins (binding domains that bind structurally unrelated sugars) (Damme et al. 1998). In terms of classification by species, structure, and binding characteristics, the vast majority of known plant lectins can be divided into seven structurally and evolutionarily related protein families: the amaranthin family (amaranthins), the chitin-binding lectins with hevein domains, phloem lectins from Cucurbitaceae, jacalin-related lectins, legume lectins, monocot mannose-binding lectins (MMBL), and ribosome-inactivating proteins of type 2 (type 2 RIP) (Peumans et al. 2001). Animal lectins are classified into several families, depending on their cellular localization and the binding specificities of their carbohydrate recognition domain (CRD) modules. Earlier characterization classified animal lectins into two principal structural families, the C-type (Ca2+-dependent binding) and the S-type or galectins (sulfhydryl-dependent binding). The C-type lectin family has become a highly significant group in which about 17 classes of proteins have been identified through structural and genomic analysis (Drickamer and Fadden 2002).

In the case of microbes, lectin interactions are used to identify bacteria, fungi, and protozoa. Bacterial lectins resemble plant lectins in their carbohydrate-binding characteristics and relative stability (Chesterton 1987). The first lectins on the bacterial surface were reported in the 1970s by Nathan Sharon and his colleagues. The main function of these lectins is to promote adherence of the bacteria to host cells. Bacterial lectins are hence commonly referred to as adhesins, which attach to the corresponding glycan receptors on the host cell surface via carbohydrate recognition domains (CRDs). The ability of these microbial lectins to aggregate red blood cells (hemagglutination) was the basis for their discovery, which is why they are also termed haemagglutinins. Bacteria may also synthesize soluble toxins, which rely on glycoconjugate-binding subunits to interact with membrane glycoconjugates and transport the functionally active toxic component through the membrane. Many microbial agglutinins, adhesins, and toxins have been identified, cloned, and characterized in the last 30 years (Nizet et al. 2015). Classifying these diverse lectins along several axes makes it easier to assess their wide range of potential applications, which depend on their molecular and evolutionary characteristics.

3.2 Classification of Lectins Based on Molecular Structure

Lectins can be categorized into structurally and evolutionarily related protein families based on their amino acid sequence and molecular structure. Most, but not all, members of a given lectin family are composed of monomers with a homologous basic structure and overall three-dimensional fold. The structure of a lectin is influenced not only by the structure of its monomers but also by the degree of polymerization and, in certain cases, by post-translational modification (Damme et al. 1998). The three-dimensional topologies of known lectin structures vary widely. The combined effects of subunit location and multivalency, however, provide a paradigm for understanding the structural basis of lectin–carbohydrate interactions (Rini 1995). On the basis of their structure, the majority of lectins fall into one of three categories: simple, mosaic (or multidomain), and macromolecular complexes.

3.2.1 Simple Lectins

Simple lectins are composed of a limited number of subunits, which are not necessarily identical, and have a molecular weight of less than 40 kDa. A carbohydrate-binding site is present in each monomeric unit. Almost all known plant lectins and most members of the galectin family, a group of β-galactoside-specific animal lectins, fall into this category. The major types classified as simple lectins include legume, cereal, Amaryllidaceae and related families, Moraceae, Euphorbiaceae, galectins, and pentraxins (Lis and Sharon 1998).

3.2.1.1 Legume

Leguminous lectins are a large group of carbohydrate-binding proteins that are primarily found in legumes. The discovery by Landsteiner and Raubitschek of lectins in Phaseolus vulgaris (bean), Lens culinaris (lentil), Vicia sativa (vetch), and Pisum sativum (pea) showed that non-toxic lectins exist alongside the toxic lectins known at the time (Landsteiner and Raubitschek 1907). Seed lectins make up a large portion of legume lectins. Some are also present in vegetative tissues such as leaves and bark. Such seeds or vegetative tissues can contain two or more distinct lectins. Legume lectins are composed of protomers of about 30 kDa derived from homologous primary translation products of about 250 amino acid residues; most protomers consist of a single polypeptide chain of around 250–300 residues. Legume lectins have been widely used in plant lectin biochemistry, physiology, and molecular biology research. They can interact with both simple and complex carbohydrates, and the carbohydrate-binding specificity within the family is extremely diverse. Hence, legume lectins cover a far larger spectrum of binding specificities than any other lectin family (Young and Oomen 1992). Concanavalin A was the first plant lectin to be isolated, crystallized, and subjected to X-ray diffraction analysis. The soybean seed lectin, also a legume lectin, was the first plant lectin to be cloned (Hardman and Ainsworth 1972).

3.2.1.2 Amaranthin

The lectins known as “amaranthins” are found in the seeds of Amaranthus species. The family is named after its first member, amaranthin, which was isolated from Amaranthus caudatus seeds. Species with this lectin include A. spinosus, A. caudatus, A. leucocarpus, and A. cruentus. Amaranthins are defined by their small protomers of about 12–33 kDa, three-fold internal repeats built upon 36 amino acids, a lack of metal dependence, and low affinity for the carbohydrate ligand (Chervenak and Toone 1995). The agglutinin protomers are made up of two domains (the N- and C-domains) connected by a short helix (Transue et al. 1997). Amaranthin is often thought of as a GalNAc-specific lectin, although it has a far greater affinity for the T-antigen disaccharide Galβ(1,3)GalNAc. This selectivity implies that amaranthins can also recognize common animal glycoconjugates (Rinderle et al. 1989). Recent research also points to potential applications of amaranthin based on its antiproliferative activity, which it exerts through a cytotoxic effect that promotes apoptosis (Quiroga et al. 2015).

3.2.1.3 Cereal–Wheat Germ Agglutinin

A variety of lectins can be found in high quantities in dietary staples such as cereal grains. Lectin activity has been reported in wheat, rice, barley, oats, and corn, but wheat germ agglutinin (WGA), a lectin found in wheat germ, is the cereal grain lectin that has received the most attention (Cordain 1999). This lectin is a homodimer. Each protomer of WGA is composed of four structurally similar domains that share a high degree of amino acid sequence identity. Each domain contains four interconnecting disulfide linkages, which give the protein a compact structure (Goldstein et al. 1997). WGA binds the sialic acid present mostly in humans, allowing it to adhere to cell surfaces such as the gut epithelial layer (Shaw et al. 1991). The binding of WGA to Neu5Ac in the glycocalyx of human cells, and of pathogens that produce Neu5Ac, may further lead to cell invasion and perhaps disrupt immunological tolerance by eliciting pro-inflammatory immune stimuli (Varki 2009).

3.2.1.4 Moraceae–Jacalin

Jacalin lectins were originally identified solely in the seeds and some vegetative tissues of Artocarpus species and Maclura pomifera (de Azevedo Moreira and Ainouz 1981). On the basis of specificity, jacalin-related lectins are divided into the small group of GalNAc-specific Moraceae lectins and the larger group of mannose-specific jacalin lectins. The latter, in contrast to the former, are widely distributed among higher plants. Each galactose-specific Moraceae lectin has four identical protomers, each composed of a large α-chain and a short β-chain and carrying a single sugar-binding site. All known mannose-specific jacalin-related lectins are made up of highly similar protomers of around 120–150 amino acid residues (Sankaranarayanan et al. 1996). The significance of these novel lectins became obvious when it was discovered that they also selectively bind Galβ(1,3)GalNAc residues. Jacalin came to be widely used as a potent immunological tool once its specific IgA-binding activity and mitogenicity were reported (Skea et al. 1988).

3.2.1.5 Euphorbiaceae–Chitin-Binding Lectins

The chitin-binding lectins are a large and diverse family that includes all proteins with at least one hevein domain (a 43-amino acid polypeptide that contains a highly integrated chitin-binding site). Chitin-binding lectins with hevein domains are quite common in plants. For example, single hevein domain proteins were purified from Hevea brasiliensis (Euphorbiaceae) (Walujono et al. 1975). Other lectins of the family are found in the beans of the castor tree (Ricinus communis), which contain two closely related lectins, ricin and Ricinus communis agglutinin (RCA). Ricin is a 60 kDa heterodimeric protein composed of two chains, A and B, linked by a disulfide bond. The B chain carries the galactose-specific carbohydrate-binding sites, whereas the cytotoxic action resides in the A chain (Macholz 1988). An N-acetylgalactosamine-specific lectin was isolated from Euphorbia heterophylla seeds by affinity chromatography on cross-linked arabinogalactan. Its distribution within the seed is typical in that it is mostly confined to the main axes (Nsimba-Lubaki et al. 1983).

3.2.1.6 Galectins

Galectins, or S-type lectins, are soluble β-galactoside-binding lectins that bind β-galactosides independently of Ca2+. Their binding depends on conserved amino acid residues in the carbohydrate recognition domain (CRD) (Barondes et al. 1994). These proteins were formerly known as S-type lectins because they were thought to require sulfhydryl groups, but they were renamed galectins after site-directed mutagenesis revealed soluble members that function without sulfhydryl groups (Hirabayashi 1996). About 15 galectins have been characterized by primary structure analysis to date; they vary in their cellular location, binding affinities, carbohydrate-binding domain, and expression. Galectins have been further classified into three groups: prototype, chimera, and tandem-repeat type galectins (Hirabayashi and Kasai 1993). Prototype galectins, such as galectin-1, -2, -7, -10, -13, and -14, have a single CRD and exist as dimers. The chimera type, which includes galectin-3, has a CRD at the COOH terminus and a non-CRD region at the NH2 terminus. Tandem-repeat type galectins, such as galectin-4, -8, -9, and -12, have two CRDs joined by a short linker peptide. Galectin-1 seems to have a strong affinity for complex-type N-glycans, while galectin-3 has a strong affinity for LacNAc repeats (Nio-Kobayashi 2017).
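The three galectin groups named above lend themselves to a simple lookup. The following Python sketch records only the group memberships listed in this section; the names GALECTIN_GROUPS and galectin_group are illustrative choices, and galectins not mentioned in the text are reported as "unassigned" rather than guessed.

```python
# Group membership as listed in the text (galectin numbers).
GALECTIN_GROUPS = {
    "prototype": {1, 2, 7, 10, 13, 14},   # one CRD, exist as dimers
    "chimera": {3},                        # CRD at COOH terminus, non-CRD region at NH2 terminus
    "tandem-repeat": {4, 8, 9, 12},        # two CRDs joined by a short linker peptide
}


def galectin_group(number: int) -> str:
    """Return the structural group of galectin-<number>, or 'unassigned'."""
    for group, members in GALECTIN_GROUPS.items():
        if number in members:
            return group
    # Galectins not listed in the text are deliberately left unclassified.
    return "unassigned"
```

For example, galectin_group(3) returns "chimera" and galectin_group(9) returns "tandem-repeat".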

3.2.1.7 Pentraxins

The pentraxins are a group of simple plasma proteins that play a role in invertebrate and vertebrate innate immunity. They have L-type lectin structures, and their binding to glycoconjugate ligands requires Ca2+ ions. C-reactive protein (CRP), serum amyloid protein (SAP), and female protein (FP) are three of the most important members of the pentraxin family. Pentraxins are composed of five identical, noncovalently bound subunits arranged in a circular pentameric disc. They are classified into two categories based on the primary structure of the subunit: short pentraxins and long pentraxins (Gupta 2012a). The pentraxins CRP, SAP (serum amyloid P component), and long pentraxin 3 (PTX3) are involved in complement activation and amplification through association with other complement factors (Ma and Garred 2018).

3.2.2 Mosaic Lectin

Mosaic lectins are multidomain lectins that span a wide range of molecular weights and are made up of multiple diverse protein modules or domains, only one of which is a carbohydrate-binding domain (CBD). Viral haemagglutinins and some animal lectins, such as the C-, P-, and I-types, fall under mosaic lectins. This structural diversity underlies their diverse functions and applications.

3.2.2.1 Viral Haemagglutinin

Viruses express a huge range of glycan-binding proteins that resemble lectins. Many microbial lectins were initially recognized by their capacity to aggregate red blood cells (erythrocytes), that is, to cause hemagglutination. In 1950, Alfred Gottschalk reported that the first microbial haemagglutinin discovered, that of the influenza virus, binds erythrocytes and other cells via the sialic acid component of host cell glycoconjugates. This binding promotes infection of the host and hence contributes to viral pathogenicity (Wiley and Skehel 1987). Viral haemagglutinin comprises two polypeptides, HA1 and HA2, of molecular mass 36 kDa and 26 kDa, respectively, which are covalently connected by a single disulfide bond (Nizet et al. 2015). Another such haemagglutinin is found in murine polyoma virus, a non-enveloped virus with icosahedral symmetry and a circular, double-stranded DNA genome. The capsid of the virion contains around 360 copies of the 42 kDa viral protein VP1 (with two antiparallel β sheets), arranged as pentamers (Stehle and Harrison 1997).

3.2.2.2 C-Lectin

C-type lectins possess a carbohydrate recognition domain (CRD) that in most cases binds sugars through Ca2+, making the sugar-binding activity Ca2+-dependent (Weis et al. 1998). As more proteins were identified, it became evident that not every protein with a C-type CRD binds glycans and Ca2+. To address this discrepancy, the term “C-type lectin-like domain” (CTLD) was coined for such domains. CRD is often used to refer to the short amino acid motifs found in CTLDs that interact specifically with Ca2+ and carbohydrates (Rivkin et al. 2000). The C-type lectin family has become a highly significant group in which about 17 classes of proteins have been identified through structural and genomic analysis. It is also worth noting that the overall layout of a lectin depends on how a CRD interacts with other domains, reflecting the multivalent binding of lectins (Drickamer and Taylor 2015). The C-type lectins include various endocytic receptors, collectins, and selectins, which have diverse glycoconjugate specificities. Selectins are a family of Ca2+-dependent receptors that mediate important cell–cell interactions in a variety of processes such as leukocyte trafficking, inflammation, thrombosis, and tissue injury (Rosen and Bertozzi 1994). Collectins are soluble oligomeric proteins with a C-terminal carbohydrate recognition domain (CRD), a collagen-like domain, and a short cysteine-rich N-terminus, which together enable effective functioning (Drickamer et al. 1986).

3.2.2.3 P-Lectin

The carbohydrate recognition domain (CRD) of these proteins has a high affinity for mannose 6-phosphate (M6P), hence the name “P-type” lectin family. The cation-dependent mannose 6-phosphate receptor (CD-MPR) and the insulin-like growth factor II/mannose 6-phosphate receptor (IGF-II/MPR) are the two members that distinguish the P-type lectin family from others by their capacity to recognize phosphorylated mannose residues (Dahms 2002). They are most commonly referred to as mannose 6-phosphate receptors (MPRs). MPRs serve an important role in the targeting of lysosomal enzymes in vertebrates. Binding affinity analysis showed that CD-MPR, which has a single mannose 6-phosphate receptor homology (MRH) domain, exists as a dimer that binds diverse glycans. The ability to bind diverse ligands, including non-glycans, makes P-type lectins versatile proteins involved in various physiological functions such as cell signaling, and useful as biomarkers (Munro 2001).

3.2.2.4 I-Lectin

I-type lectins are glycan-binding proteins that belong to the immunoglobulin (Ig) superfamily and are classified according to the conserved amino acid residues in the CRD region (Powell and Varki 1995). Through their conserved CRD, I-type lectins mostly bind sialic acid on the cell surface; these are termed “Siglecs” and are the best-characterized I-type lectins. Siglecs consist of an N-terminal variable-set (V-set) Ig domain with a sialic acid binding site, followed by constant-region Ig domains (Crocker and Varki 2001). A conserved arginine residue on the F-strand of the V-set domain is required for ligand binding. In humans, 11 main Siglecs and one Siglec-like molecule have been identified. The CD33-related Siglecs have 4 C-set domains and cytoplasmic tyrosine residues that are implicated in signaling and endocytosis. It was also demonstrated that Siglec binding specificity may be used to create cell-based glycan arrays that could be beneficial in therapeutic targeting against autoimmune disorders and cancer (Crocker 2002).

3.2.3 Macromolecular Complex

In macromolecular assemblies, lectins are organized into larger structures with multivalent binding, which gives them significant functions. Bacteria possess many such assemblies in the form of filamentous organelles that are made up of helical subunits (pilins) assembled in a defined order (Ofek and Doyle 1994a). These heteropolymeric filamentous organelles aid bacterial adhesion, which is followed by invasion and infection. Most of the assembly consists of a structural polymer, and only a small portion contains the carbohydrate-binding site (Ting et al. 2010). Furthermore, because of their large size and complexity, polysaccharide–lectin complexes may be used as model systems to study polysaccharide–polysaccharide and protein–polysaccharide interactions. One study described the macromolecular assemblies of complex polysaccharides with galectin-3, a major lectin, and their synergistic effects on function (Zhang et al. 2017).

3.3 Classification of Lectins Based on Glycoconjugate Specificity

Lectins are proteins that bind carbohydrates in a specific and reversible manner (Table 3.1). Their wide array of applications results from this specific monosaccharide binding, which varies between different types of lectins (Sharon and Lis 2007). Lectins can be categorized into groups based on the monosaccharide for which they are specific, including mannose, galactose/N-acetylgalactosamine, fucose, and sialic acid. These monosaccharides are typical glycan components found on the surfaces of eukaryotic cells. Lectins with the same monosaccharide specificity can differ in their affinity for oligosaccharides, and some show affinity only for oligosaccharide derivatives of the corresponding monosaccharide (Sharon and Lis 2013).

Table 3.1 Various lectins and their specific sugar residues

3.3.1 Mannose-Specific Lectins

Mannose-specific lectins are widely distributed throughout higher plants, algae, and fungi and are thought to have a role in the detection of microorganisms and plant predators via their high-mannose glycans (Barre et al. 2019). Structural analysis shows that the mannose-binding specificity of lectins is mediated by different structural scaffolds. These scaffolds contain one or more carbohydrate-binding sites and are important in the recognition of mannose-containing glycans. The mannose-binding site consists of a small carbohydrate-binding region responsible for the broad sugar-binding specificity towards mannose, surrounded by a larger binding area responsible for the specific recognition of larger mannose-containing N-glycan chains (Barre et al. 2001). In animals, pathogenic species and potentially toxic glycoconjugates are cleared by the macrophage mannose receptor, which resembles exogenous type I transmembrane receptor proteins. These receptors clear glycoconjugates with terminal mannose residues by selectively binding to them. Mannose-specific binding and internalization lead to lysosomal destruction, resulting in the elimination of foreign pathogens by the innate immune system (Stahl 1990).

3.3.2 Galactose/N-acetylgalactosamine Specific Lectins

Many C-type animal lectins recognize galactose- or N-acetylgalactosamine-containing oligosaccharides. The best-known Gal-binding C-type lectins are the mammalian hepatocyte asialoglycoprotein receptors, which play a role in serum glycoprotein homeostasis. C-type lectins with a high affinity for glycoconjugates containing terminal galactose and N-acetylgalactosamine residues have also been discovered on the surfaces of macrophages and Kupffer cells and seem to mediate tumor cell recognition. N-acetyl-D-galactosamine-specific lectins are important for determining the sugar moieties of blood groups in animals (Kolatkar and Weis 1996). In plants, Liener and Pallansch were the first to purify soybean lectin (SBA), which is specific for galactose and N-acetyl-D-galactosamine. Later, Sharon and colleagues isolated the same lectin using affinity chromatography on a column of 6-aminocaproyl-D-galactosylamine linked to Sepharose. Subsequently, various plant lectins, such as red kidney bean lectin, horseshoe lectin, and the legume lectin concanavalin A, were identified with galactose and N-glycan specificity (Yosizawa and Miki 1963).

3.3.3 Fucose Specific Lectins

F-type lectins (FTLs) are fucose-binding proteins found in a wide range of taxonomic groups, from viruses and plants to vertebrates. They possess a fucose recognition domain, the F-type lectin domain (FTLD), with a novel fold termed the F-type fold, consisting of a barrel structure with specific fucose- and calcium-binding motifs. FTLs can have a single FTLD, which is often combined with one or more other domains in a single polypeptide, but members of this lectin family can also have a variable number of tandemly arranged FTLDs that mediate binding (Vasta et al. 2017).

3.3.4 Sialic Acid Specific Lectins

Sialic acid binding lectins bind selectively to sialic acids and are potentially useful in the detection, purification, quantification, and characterization of numerous biomolecules containing sialic acid residues, such as glycoconjugates, gangliosides, and polysaccharides. Such lectins can be used as probes for specific sialic acid derivatives that act as molecular markers in numerous physiological and biochemical processes (Schauer 1983). Limulin is a significant sialic acid binding lectin isolated from the American horseshoe crab Limulus polyphemus. Agglutinins that bind to sialic acid are also found in the viral families Orthomyxoviridae (which includes the influenza viruses), Papovaviridae, Reoviridae, and Adenoviridae (Weis et al. 1988).

3.4 Classification Based on Source (Plants, Animal, Microbes)

3.4.1 Plant Lectins

Plants contain lectins that bind specifically to mono- or oligosaccharides with particular properties. These carbohydrate-binding plant proteins, also known as lectins, agglutinins, or haemagglutinins, are a large collection of proteins with diverse applications. Approximately 500 distinct plant lectins have been identified and described in some detail (Van Damme et al. 1998). The applications of all these lectins stem from their specific carbohydrate binding through the carbohydrate recognition domain. Plant lectins are classified into merolectins, hololectins, chimerolectins, and superlectins (Table 3.2) based on their carbohydrate recognition domain (Fig. 3.1). Merolectins are proteins that are made up entirely of a single carbohydrate-binding domain, e.g., small chitin-binding lectins. Merolectins are monovalent by definition and so cannot precipitate glycoconjugates or agglutinate cells.

Table 3.2 Plant lectins classification

Hololectins are also made up entirely of carbohydrate-binding domains, at least two of which are identical or highly similar. Hololectins can agglutinate because they are divalent or multivalent. Lectins that fall into a special category are called superlectins. They are made up of at least two carbohydrate-binding domains that are not identical or comparable and that recognize structurally unrelated sugars (Damme et al. 1998).

Fig. 3.1

Classification of plant lectins based on carbohydrate-binding domain. (Created with biorender.com)
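The domain-based scheme (merolectin, hololectin, chimerolectin, superlectin) amounts to a short decision rule, which the following Python sketch makes explicit. It is a simplified illustration under stated assumptions: the class and field names are hypothetical, a lectin is reduced to three features, and the presence of an additional non-lectin (for example enzymatic) domain is checked first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LectinDomains:
    """Simplified, hypothetical description of a lectin's domain make-up."""
    n_binding_domains: int                  # number of carbohydrate-binding domains
    binds_unrelated_sugars: bool = False    # domains recognize structurally unrelated sugars
    has_other_domain: bool = False          # e.g. a domain with enzymatic activity


def classify_lectin(d: LectinDomains) -> str:
    """Assign one of the four domain-based classes described in the text."""
    if d.n_binding_domains < 1:
        raise ValueError("a lectin needs at least one carbohydrate-binding domain")
    if d.has_other_domain:
        # Binding domain(s) combined with an unrelated, e.g. enzymatic, domain.
        return "chimerolectin"
    if d.n_binding_domains == 1:
        # Monovalent: cannot agglutinate cells or precipitate glycoconjugates.
        return "merolectin"
    if d.binds_unrelated_sugars:
        return "superlectin"
    # Two or more identical or highly similar binding domains.
    return "hololectin"
```

For instance, a type 2 ribosome-inactivating protein such as ricin, with a carbohydrate-binding B chain and an enzymatic A chain, would be described as LectinDomains(1, has_other_domain=True) and classified as a chimerolectin.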

Plant lectins are also classified into families based on shared features: legume lectins, type 2 ribosome-inactivating proteins, monocot mannose-binding lectins, and other lectins (Lam and Ng 2011). The best-known kind of lectin is the legume lectin. The seeds of leguminous plants have a greater lectin content than their bark, leaves, roots, and stems. Lectins have also been found in the Gramineae (cereals, such as wheat germ) and Solanaceae (potatoes and tomatoes) plant families. Monocot mannose-binding lectins are made up of one to four subunits of 12 kDa with a specific affinity for mannose, whereas chitin-binding lectins are built up of hevein domains (Damme et al. 1998). Type 2 ribosome-inactivating proteins are chimerolectins composed of a polynucleotide:adenosine glycosidase domain (the so-called A chain) arranged in tandem with a carbohydrate-binding domain (the so-called B chain). Both chains are produced from the same precursor molecule, which is processed post-translationally by removal of a linker between the A and B chains (Barbieri et al. 1993). Major plant lectins include concanavalin A, wheat germ agglutinin, soybean agglutinin, lima bean agglutinin, wax bean agglutinin, and red bean agglutinin.

3.4.2 Animal Lectins

The first animal lectin discovered was the asialoglycoprotein receptor of mammalian cells, which helped show how animal lectins differ in glycoconjugate binding. Animal lectins are divided into numerous groups depending on their cellular location and the binding specificities of their carbohydrate recognition domain (CRD) modules (Fig. 3.2). They were formerly divided into two structural families: C-type (Ca²⁺-dependent binding) and S-type, or galectins (sulfhydryl-dependent binding) (Drickamer 1988). The most important animal lectins, such as endocytic receptors, mannose receptors, selectins, and collectins, belong to the C-type lectin family.

Fig. 3.2 Different types of animal lectins and their cellular location. (Created with biorender.com)

Recent research has identified more than 100 animal lectins and classified them into families based on the complexity of their carbohydrate ligands, the metabolic processes they perform, their expression levels, and their reliance on divalent cations. These families include calnexin, F-lectin, intelectin, chitinase-like lectin, F-box lectin (Table 3.3), and others (Cummings and McEver 2009). C-type lectins are Ca²⁺-dependent lectins that are found in the extracellular matrix, in serum, and on membranes, and they share a conserved domain known as the carbohydrate recognition domain (CRD). The distinctive feature of this CRD is the direct interaction of Ca²⁺ with the bound sugar via coordination bonds (Drickamer 1993).

Table 3.3 Classification of animal lectins and their ligands

C-type lectins include endocytic receptors such as asialoglycoprotein receptors, macrophage mannose receptors, natural killer cell receptors, and Kupffer cell receptors, along with other molecules such as collectins and selectins (the latter being cell adhesion molecules) (Cummings and McEver 2009). S-type lectins are soluble β-galactoside-binding lectins that bind glycoconjugates in a Ca²⁺-independent manner. Their binding depends on a conserved group of amino acid residues that make up the carbohydrate recognition domain (CRD) (Barondes et al. 1994). L-type lectins are named after the lectins first discovered in the seeds of leguminous plants. For example, ERGL lacks some of the basic residues needed for glycan binding but, like ERGIC, plays an important role in the secretion of various glycoproteins in specific tissues (Yerushalmi et al. 2001). The term "P-type" lectin denotes the binding affinity of the CRD in these proteins for mannose 6-phosphate (M6P). These M6P receptors (MPRs) play a major role in targeting lysosomal enzymes (Dahms 2002). Other significant lectins include calnexin, calreticulin, calmegin, and calreticulin 2, which belong to a small group of ER-resident chaperone proteins (Thomsen et al. 2011). F-box lectins have been identified through the murine F-box protein Fbs1, whose sugar-binding region functions similarly to a CRD (Yoshida et al. 2003). Ficolins are soluble oligomeric proteins composed of trimeric collagen-like domains that stimulate the complement system (Thomsen et al. 2011).

3.4.3 Microbial Lectins

Microbial lectins comprise lectins derived from fungi, bacteria, protozoa, and viruses. Only a few microbial lectins had been identified by the 1970s. Since then, ongoing study has established their relevance and has led to substantial investigations of lectins from microbes (Slifkin and Doyle 1990). Alfred Gottschalk was the first to discover a microbial lectin, in the early 1950s. This lectin was isolated from the influenza virus and was shown to be of viral origin. In the 1970s, Sharon et al. were the first to investigate bacterial lectins (Wiederschain 2009). The major function of microbial lectins is to bind to host cells, which is required for infection to occur. Because these interactions help microbes attach to the cell surface and grow, they must be analyzed critically in order to prevent pathogenic infections and diseases in humans (Ofek and Doyle 1994b). Major microbial lectins include haemagglutinins, adhesins, and bacterial toxins. The influenza virus haemagglutinin, which binds to sialic acid–containing glycans, is a significant viral glycan-binding protein. The affinity of this interaction, like that of other glycan-binding proteins with their glycosyl ligands, is modest. However, because the haemagglutinin oligomerizes into trimers and the host cell carries a high density of glycan receptors, the overall binding strength (avidity) for cell membranes rises (Rott et al. 1996). Bacterial lectins occur mostly in the form of fimbriae (hairs) or pili (threads), which are elongated, submicroscopic protein structures that bind to glycoconjugate receptors on host cells. The mannose-specific fimbriae, the galactose-specific fimbriae, and the N-acetyl galactose-specific fimbriae have been identified as effective cell adhesion molecules (Ofek and Doyle 1994b).

Other forms of microbial lectins include bacterial toxins. Bacteria produce these lectins to prevent other bacteria from colonizing, which gives them an advantage in the struggle for resources and space. For example, a lectin-like bacteriocin with two B-lectin domains, produced by Gram-negative proteobacteria, was reported to eliminate other bacteria through contact-dependent inhibition (Ghequire et al. 2018). A variety of parasites, in addition to viruses and bacteria, employ glycans as adhesion receptors. Entamoeba histolytica produces a heterodimeric lectin that specifically binds to galactose/N-acetylgalactosamine residues on the host cell (Wiederschain 2009).

3.5 Conclusion

Carbohydrates are found on the surface of all living cells as components of glycoconjugates, and the lectins that recognize them are relevant to diverse functions and applications such as cellular communication, antimicrobial activity, mitogenesis, tumor biomarker detection, and therapeutic strategies. These characteristics make lectins an attractive area of research and motivate the search for more lectins of natural origin with beneficial effects. Newly discovered lectins are most useful when they are classified according to their characteristics, so that they can be easily accessed for further analysis. As described above, lectins can be classified in several ways: by source, by glycoconjugate specificity, and by binding pattern. Recent research also focuses on classifying lectins by evolutionary relationship and gene analysis using bioinformatics tools. One such approach uses glycobioinformatics databases and tools that help classify lectins by functional differences and versatility, in addition to origin and specificity. For example, UniLectin3D, a portal dedicated to lectin 3D structures and lectin/glycan complexes, was launched in 2018 with 1740 structures encompassing 428 distinct lectins and 765 references. UniLectin3D has been validated as the primary source of data on lectin 3D structures and their glycoconjugate interactions (Bonnardel et al. 2021).