Introduction

Small leucine rich repeat proteoglycans (SLRPs) are a group of active components of the extracellular matrix (ECM) in all tissues. Leucine rich repeats (LRRs) occur tandemly in SLRPs. SLRPs bind to various types of collagens (I, II, III, V, VI, IX, XII, XIV, and XVIII) and regulate collagen fibril growth and fibril organization (collagen fibrillogenesis) (Chen and Birk 2013; Hindson et al. 2019). SLRPs also interact with various cytokines including transforming growth factor-beta (TGF-β) and von Willebrand factor (vWF), and with extracellular compounds such as Toll-like receptors (TLR2 and TLR4) and epidermal growth factor receptor (EGFR) (Pietraszek-Gremplewicz et al. 2018). These interactions lead to various biological functions, including cell adhesion and signaling, proliferation, and differentiation. Mutations in SLRP genes are associated with human diseases (Matsushima et al. 2019).

SLRPs contain LRR domains flanked by clusters of cysteine residues at the N- and C-termini, which are called LRRNT and LRRCT (or LRRCE), respectively; LRRCE is a capping motif containing the “ear” LRR (Park et al. 2008). In addition to these domains, some SLRPs have low complexity sequences at the extreme ends, such as the poly-Asp stretch in asporin. The human genome encodes eighteen SLRPs, which are divided into five classes (Fig. 1) (Iozzo and Schaefer 2015). SLRPs undergo post-translational modifications: glycosylation with glycosaminoglycans (GAGs) including chondroitin sulfate, dermatan sulfate, or keratan sulfate, N- or O-linked oligosaccharides, and tyrosine sulphation (Zappia et al. 2020). Their core proteins have molecular weights between 36 and 77 kDa. Class I comprises five members: biglycan (BGN), decorin (DCN), asporin (ASPN), extracellular matrix 2 (ECM2), and extracellular matrix X (ECMX). Class II comprises five members divided into three subgroups. Subgroup A consists of fibromodulin (FMOD) and lumican (LUM), subgroup B includes keratocan (KERA) and proline and arginine rich end leucine rich protein (PRELP), and subgroup C is composed of osteomodulin/osteoadherin (OMD). Class III is composed of osteoglycin/mimecan (OGN), epiphycan (EPN), and opticin (OPTC). Class IV SLRPs include chondroadherin (CHAD), chondroadherin-like protein (CHADL), nyctalopin (NYX), and tsukushi (TSK). Class V comprises podocan (PODN) and podocan like 1 (PODN1). Several SLRPs (asporin, PRELP, opticin, nyctalopin, and tsukushi) do not carry GAG side chains; no data on GAG attachment or glycosylation are available for ECM2 and ECMX (Zappia et al. 2020). CHADL has two LRR domains, both highly similar to the LRR domain of CHAD, suggesting that CHADL arose by a duplication of CHAD.

Fig. 1

The class and the repeating unit length (RUL) of individual LRRs in each human SLRP. Each box represents one structural element. Shaded areas indicate the LRR class: light yellow indicates Bacterial LRR and light blue indicates Typical LRR. The numerical values indicate the RUL of individual LRRs. N-box, conserved N-terminal disulfide-bonded cap (LRRNT); E-box, ear repeat; C-box, C-terminal disulfide-bonded cap (LRRCT). This figure is a modification of the one by McEwan et al. (2006).

LRRs occur in tandem (Kobe and Deisenhofer 1994; Bella et al. 2008; Matsushima and Kretsinger 2016). Typical LRRs are 20–29 amino acids long. Some LRRs are longer or, conversely, shorter (Matsushima et al. 2021). Each LRR is divided into a highly conserved segment (HCS), the hallmark motif with the consensus LxxLxLxxNxL, LxxLxLxxNxxL, or LxxLxLxxxL, and a variable segment (VS). In LRRs, short β-strands (at positions 3–5) in the HCS form a parallel β-sheet, which produces a super helix arrangement (called a solenoid structure). The VS part forms various secondary structures such as α-helix, 3(10)-helix, polyproline II, and β-turns. At least eleven LRR classes including Bacterial, Typical, and SDS22-like have been recognized (Kobe and Kajava 2001; Matsushima et al. 2010; Kajava et al. 2008). Known structures indicate that all highly conserved hydrophobic residues such as Leu, Ile, Val, and Phe within LRR domains contribute to the hydrophobic core (Matsushima and Kretsinger 2016). The side chain of the conserved Asn at position 9 in the HCS forms a hydrogen bond network in LRRs (called the Asn ladder) (Kobe and Deisenhofer 1994). The LRR solenoid structure is divided into four regions: a concave (inside) surface, ascending loops, a convex (outside) surface, and descending loops (Matsushima et al. 2019). Protein–ligand interactions in LRR domains occur on the concave surface, the ascending surface, the descending surface, the capping regions, and combinations thereof.
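As an illustration only, the three HCS consensus patterns quoted above can be written as regular expressions and scanned against a sequence. This is a minimal sketch and not the procedure used in this review: it accepts only Leu at "L" positions and Asn at "N" positions, whereas real LRRs tolerate other hydrophobic residues, and the test sequence is hypothetical.

```python
import re

# HCS consensus patterns quoted in the text; "x" stands for any residue.
HCS_CONSENSUS = ["LxxLxLxxNxL", "LxxLxLxxNxxL", "LxxLxLxxxL"]

def consensus_to_regex(consensus):
    """Translate a consensus such as 'LxxLxLxxNxL' into a regex string."""
    return "".join("." if c == "x" else c for c in consensus)

def find_hcs(sequence):
    """Return sorted (1-based start, consensus, matched text) for every match."""
    hits = []
    for consensus in HCS_CONSENSUS:
        # A lookahead lets overlapping candidates be reported as well.
        pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
        for m in pattern.finditer(sequence):
            hits.append((m.start() + 1, consensus, m.group(1)))
    return sorted(hits)

# Hypothetical sequence built for this example, not a real SLRP fragment.
toy_sequence = "AAALKELHLDNNKLAAA"
hits = find_hcs(toy_sequence)
```

A strict pattern like this will miss repeats in which Ile, Val, or Phe replaces a consensus Leu; widening each "L" to a character class such as `[LIVF]` would be the obvious relaxation.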

Several excellent reviews on SLRPs have been published (McEwan et al. 2006; Chen and Birk 2013; Iozzo and Schaefer 2015; Zappia et al. 2020; Pang et al. 2020; Zeng-Brouwers et al. 2020). To avoid repeating them, we first describe some features of the amino acid sequences and structures of SLRPs. Next, we review ligand interactions and then discuss protein–ligand interfaces. Finally, we map all mutations associated with human diseases and discuss their possible effects on LRR structures.

Sequence features

Types and super motif of LRRs

Crystal structures of LRR domains in biglycan and decorin from Bos taurus, and fibromodulin, osteomodulin, and chondroadherin from humans have been determined (Scott et al. 2004, 2006; Ramisch et al. 2017; Paracuellos et al. 2017; Tashima et al. 2018). The differences between the consensus sequences of the respective LRR classes give rise to various secondary structures in the VS parts (Batkhishig et al. 2018, 2020, 2021). To get a deeper understanding of sequence–secondary structure correlations in LRRs, we performed the following procedure. First, we assigned the secondary structures of the five known structures by four programs (Chebrek et al. 2014). Next, we identified a total of 237 LRR units in the 18 human SLRPs. Third, we grouped the LRRs by repeating unit length (RUL) and estimated the consensus sequence of each group. Finally, we could divide the 237 LRRs into three groups: Bacterial LRR, Typical LRR, and others. Bacterial LRRs were first identified in bacterial proteins such as the YopM protein from Yersinia pestis and the Ipa4 and Ipa7 proteins from Shigella flexneri, while Typical LRR with RUL = 24 was the most abundant class at that time (Kajava 1998).
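The grouping and consensus steps of this procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis actually performed: the function names, the 50% identity threshold, and the four toy LRR units are all hypothetical.

```python
from collections import Counter, defaultdict

def group_by_rul(lrr_units):
    """Group LRR unit sequences by their length (the RUL)."""
    groups = defaultdict(list)
    for unit in lrr_units:
        groups[len(unit)].append(unit)
    return dict(groups)

def consensus(units, threshold=0.5):
    """Column-wise consensus of equal-length units; 'x' marks variable columns."""
    result = []
    for column in zip(*units):
        residue, count = Counter(column).most_common(1)[0]
        # Keep the residue only if it reaches the (assumed) identity threshold.
        result.append(residue if count / len(units) >= threshold else "x")
    return "".join(result)

# Hypothetical LRR units written for this example (three of 20 residues, one of 24).
toy_units = [
    "LTELDLSNNKITSLPDLPAS",
    "LQHLYLEHNQIRALPGLPRT",
    "LKVLRLGKNAIEKLPELPVD",
    "LRELHLDHNQISRVPNNAFSGLRK",
]
groups = group_by_rul(toy_units)
bacterial_like = consensus(groups[20])
```

With these toy units the RUL = 20 group reduces to a consensus of the same form as the Bacterial N-subtype pattern; real data would need a similarity-aware threshold (for example, treating Leu, Ile, and Val as equivalent).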

Sixty-one LRRs, with RUL = 20 (17 LRRs) or RUL = 21 (44 LRRs), are Bacterial type: the former (N-subtype) has the consensus LxxLxLxxNxI xxLPxLPxx and the latter (T-subtype) has the consensus LxxLxLxxNxI xxLPxxLPxx (Batkhishig et al. 2018). The N-subtype VS adopts polyproline II (PPII) consisting of three to five residues, while the T-subtype VS prefers the PPII conformation. The C-terminal side of the VS adopts β-turns, as it does in the other LRRs (Fig. 2). Bacterial N-subtype LRR is characterized by a super secondary structure consisting of PPII with four, five, or six residues and a type I β-turn (Batkhishig et al. 2018).

Fig. 2

Secondary structures of the variable segment (VS) part in individual LRRs in the known structures of five SLRPs. (a) The position number of LRR units. (b) The repeating unit length of LRRs. (c) Shaded areas indicate the secondary structure elements: light blue, polyproline II helix conformations; yellow, β-turns, for which the first and last residues shown are residues i + 1 and i + 2, respectively; green, α-helices, for which the first and last residues shown are residues Nc + 1 and Cc − 1, respectively; Nc is the first helical residue and Cc is the last helical residue. (d) “S” is Bacterial LRR and “T” is Typical LRR. Bos taurus biglycan (PDB ID: 2FT3); Bos taurus decorin (PDB ID: 1XKU); Human fibromodulin (PDB ID: 5MX0); Human osteomodulin (PDB ID: 5YQ5); Human chondroadherin (PDB ID: 5LFN)

Eighty-seven LRRs with RUL = 24 are Typical type with the consensus sequence LxxLxLxxNxL xxLxxxxFxxLxx. Thirty-four LRRs with RUL = 26 are a Typical variant with the consensus LxxLxLxxNxL xxxxLxxxxFxxLxx. In both, the central part of the VS adopts tandem β-turns or a 3(10)-helix (Fig. 2). The insertion of two residues in the VS part of the RUL = 26 LRR produces a short 3(10)-helix. Three LRRs with RUL = 25 and an HCS of 12 residues (LRR4 in nyctalopin and LRR9 and LRR10 in tsukushi) are also Typical type, because the VS consensus is completely consistent with the Typical one. The remaining fourteen LRRs with RUL = 25 likely belong to the Typical type, because (L/V)xxxxF, which is highly conserved in the Typical LRR VS, is conserved. Their secondary structure elements are similar to those of Typical LRR with RUL = 26. Eight LRRs with RUL = 23 may belong to the Typical type, because the VS prefers β-turns or a 3(10)-helix.

The others are four LRRs with RUL = 22 and twenty with longer RULs. Two repeats with RUL = 22 (LRR5 in lumican and LRR11 in the second LRR domain of chondroadherin-like protein) appear to be SDS22-like type with the consensus LxxLxLxxNxL xxLxxLxxLxx. The eleventh repeat (LRR11) in classes I and II and the eighth one (LRR8) in class III are longer, with RUL = 31–37, and are referred to as the “ear repeat” (Scott et al. 2004). The last LRR in class IV, with a long RUL, constitutes an LRRCT. The last two LRRs in podocan-like protein 1 likely form a C-terminal capping structure through a disulfide bond.

In summary, one of the sequence features is the presence of super motifs consisting of Bacterial LRR (S) and Typical LRR (T) in classes I, II, III and V (Fig. 1), as seen in the FLRT family (Matsushima et al. 2000; Matsushima and Kamiya 2000). The first nine LRRs in class I and II SLRPs adopt almost three tandem repeats of the STT super motif. Podocan contains seven complete tandem repeats of STT, while podocan-like protein 1 contains five STT repeats. Class III contains (STT)(ST). Class IV has no super motif.
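Counting tandem STT repeats in a string of per-LRR class labels is straightforward, as the sketch below shows. The helper name is hypothetical and the two label strings are constructed directly from the descriptions above ("seven tandem repeats of STT" and "(STT)(ST)"); they are not read from Fig. 1.

```python
import re

def split_super_motif(lrr_types, motif="STT"):
    """Count leading tandem copies of `motif`; return (count, remainder)."""
    match = re.match("(?:" + re.escape(motif) + ")*", lrr_types)
    repeats = len(match.group(0)) // len(motif)
    return repeats, lrr_types[match.end():]

# Label strings assumed from the text, one letter per LRR (S = Bacterial, T = Typical).
podocan_like_labels = "STT" * 7
class_iii_like_labels = "STTST"

podocan_result = split_super_motif(podocan_like_labels)
class_iii_result = split_super_motif(class_iii_like_labels)
```

The remainder returned for the class III string is the trailing "ST", matching the (STT)(ST) description.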

Cysteine capping motifs

LRRNT typically contains four cysteines in a CxmCxCxnC pattern, with m and n being variable numbers of residues (Fig. 3). There are also two cysteines in ECM2, five in chondroadherin-like protein (in the second LRR domain), and seven in nyctalopin. The known structures of five SLRPs indicate that the typical LRRNT forms a disulfide knot between the first LRR repeat (LRR1) and a β-hairpin, as first found in the decorin structure (Scott et al. 2004). This capping motif does not form a separate domain, but integrates seamlessly into the LRR architecture (McEwan et al. 2006). Two disulfide bonds are formed between the first and third cysteines, and between the second and fourth cysteines; they are C64–C70 and C68–C77 in biglycan; C55–C61 and C59–C68 in decorin; C76–C82 and C80–C92 in fibromodulin; C62–C68 and C66–C78 in osteomodulin; and C23–C29 and C27–C38 in chondroadherin. A single β-strand (β0) consists of four residues (three residues only in fibromodulin) and is the only strand antiparallel to the rest. A β-turn is mostly formed between the first and second cysteines. The β0 strand contains the second and third cysteines, and the strand of LRR1 contains the fourth cysteine, which is located at position 6 of the LRR1 HCS; in the remaining repeats, position 6 is occupied by a conserved hydrophobic residue such as leucine. Significantly, the internal spacing between the third and fourth cysteines provides a difference in the β0–β1 loop. The β0–β1 loop contains only one β-turn in class I SLRPs (biglycan and decorin), which was also observed in the FLRT3 structure. In contrast, it contains two β-turns in class II SLRPs (fibromodulin and osteomodulin) and chondroadherin. In nyctalopin and tsukushi, there are more residues between the conserved cysteines, which should produce a longer β0–β1 loop. The seven-cysteine cluster in nyctalopin is expected to form one additional disulfide bond (McEwan et al. 2006). LRRNT shields the hydrophobic core of the first LRR, which prevents aggregation.
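The spacings m and n in the CxmCxCxnC pattern follow directly from the disulfide residue numbers quoted above. The short sketch below does that arithmetic; the dictionary and function names are illustrative, and the residue numbers are taken as given in the text (C1–C3 and C2–C4 pairs).

```python
# Disulfide pairs quoted in the text: (C1, C3) and (C2, C4) residue numbers.
LRRNT_DISULFIDES = {
    "biglycan": ((64, 70), (68, 77)),
    "decorin": ((55, 61), (59, 68)),
    "fibromodulin": ((76, 82), (80, 92)),
    "osteomodulin": ((62, 68), (66, 78)),
    "chondroadherin": ((23, 29), (27, 38)),
}

def cysteine_spacing(pairs):
    """Number of residues between consecutive cysteines: (m, middle, n)."""
    (c1, c3), (c2, c4) = pairs
    return c2 - c1 - 1, c3 - c2 - 1, c4 - c3 - 1

def lrrnt_pattern(pairs):
    """Write the spacing in the compact CxmCxCxnC notation."""
    m, middle, n = cysteine_spacing(pairs)
    return "Cx" + str(m) + "C" + "x" * middle + "Cx" + str(n) + "C"

patterns = {name: lrrnt_pattern(p) for name, p in LRRNT_DISULFIDES.items()}
```

On these numbers the first spacing is constant while the third-to-fourth cysteine spacing differs between the class I proteins, the class II proteins, and chondroadherin, which is the difference in the β0–β1 loop described above.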

Fig. 3

Amino acid sequence alignment of the LRRNT motif and the secondary structures identified from known structures of five SLRPs and FLRT3. (a) Chondroadherin-like protein (1) and (2) indicate the first and the second LRR domains, respectively. Shaded areas indicate the secondary structure elements: light blue, polyproline II helix conformations; yellow, β-turns, for which the first and last residues shown are residues i + 1 and i + 2, respectively; pink, β-strands. Cysteine residues are shown in white text over a light blue, pink, or black background. Underlined residues “A”, “M”, and “E” are “S”, “L”, and “D” in human biglycan and decorin, respectively. Bos taurus biglycan (PDB ID: 2FT3); Bos taurus decorin (PDB ID: 1XKU); Human fibromodulin (PDB ID: 5MX0); Human osteomodulin (PDB ID: 5YQ5); Human chondroadherin (PDB ID: 5LFN); FLRT3 (PDB ID: 6JBU)

Class I, II, and III SLRPs have a specific C-terminal capping motif with two cysteines that contains the ear repeat (named LRRCE) (Park et al. 2008). This LRRCE motif encompasses the ear repeat, which is extended laterally, the last LRR following it, and the final β-strand closing the domain (Park et al. 2008). The last two LRRs are connected by a single disulfide bond; the crystal structures confirm the presence of this disulfide bond (C285–C318 in biglycan, C284–C317 in decorin, C334–C367 in fibromodulin, and C321–C353 in osteomodulin). The cap structure forms a parallel stack of three β-strands (β11, β12, and β13) which integrates seamlessly into the LRR fold, as in the case of LRRNT. Conserved hydrophobic residues (I325 in biglycan, V324 in decorin, I374 in fibromodulin, and I229 in osteomodulin) in the final β-strand (β13) participate in the hydrophobic core of the domain. Aromatic/methionine–aromatic interactions also occur in F284–F316 (biglycan), F283–F315 (decorin), F328–F333 (fibromodulin), and M320–F354 (osteomodulin) (Burley and Petsko 1985; Pal and Chakrabarti 2001).

LRRCT normally contains four cysteines in a CxCxmCxnC pattern, as in chondroadherin, chondroadherin-like protein (the first and second LRR domains), and nyctalopin (which has one additional cysteine), as well as in many other LRR proteins such as FLRT3, nogo receptor, and GPIbβ; podocan does not have any C-terminal capping motif. The crystal structure of chondroadherin indicates that the disulfide bond connectivity is the same as that of the four-cysteine LRRNT; two disulfide bonds (C304–C326 and C306–C346) are formed. The last LRR contains a common sequence of NPWxCxCx4Lx2WL. Characteristic of this LRRCT motif are a short 3(10)-helix and a contiguous 10–13 residue α-helix on its convex side. The helical secondary structures cover the left-over hydrophobic surface of the LRR domain to prevent aggregation. Finally, both LRRNT and LRRCT would contribute to the stability of the entire domain.
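The compact notation used for the common LRRCT sequence, in which "x4" means four arbitrary residues, can be expanded mechanically into a regular expression. The sketch below is illustrative only; the helper name and the test peptide are hypothetical, and fixed counts such as x4 are handled but variable counts such as xm are not.

```python
import re

def compact_consensus_to_regex(consensus):
    """Expand 'x' and 'xN' in a compact consensus into regex wildcards."""
    def expand(m):
        count = m.group(1)
        return "." if count == "" else ".{" + count + "}"
    return re.sub(r"x(\d*)", expand, consensus)

# Common sequence of the last LRR in LRRCT, as quoted in the text.
LRRCT_REGEX = compact_consensus_to_regex("NPWxCxCx4Lx2WL")

# Hypothetical peptide written to fit the pattern, flanked by two Gly on each side.
toy_peptide = "GGNPWACDCRLAALKQWLGG"
hit = re.search(LRRCT_REGEX, toy_peptide)
```
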

Low complexity sequences

Several SLRPs have low complexity sequences flanking the LRR domains on the extreme N- and/or C-terminal sides (Kalamajski and Oldberg 2010). In the N-terminal regions, they are poly-Asp in asporin, Arg/Pro-rich sequences in PRELP, Glu/Pro-rich sequences in epiphycan, and Thr/Ser-rich sequences in opticin; in the C-terminal regions, they are Lys/Arg-rich sequences in chondroadherin-like protein and poly-Glu in podocan. In chondroadherin-like protein, the linker connecting the two LRR domains consists of Glu/Pro-rich sequence. In ECM2, a Glu-rich region lies between the vWFC domain and the LRR domain. These low complexity regions may be regarded as intrinsically disordered regions (van der Lee et al. 2014). The N-terminal poly-Asp region in asporin binds calcium and regulates hydroxyapatite formation (Kalamajski et al. 2009). The basic N-terminal Arg/Pro-rich region in PRELP binds heparin and heparan sulfate (Bengtsson et al. 2000).

Tyrosine clusters

Class II SLRPs (fibromodulin, lumican, and osteomodulin) undergo sulphation of tyrosine clusters in the N-terminal regions; in addition, in osteomodulin, two adjacent tyrosines in the C-terminal region are sulphated. The remaining class II and III SLRPs also harbor sulfotyrosines in the N-terminal regions (Jensen and Karring 2020). Residues 73–130 in fibromodulin and 54–124 in osteomodulin on the extreme N-terminal sides are not visible in the electron density maps of the crystal structures, indicating that these regions are disordered; all nine tyrosines in fibromodulin were substituted by serine in the molecule used for the crystal analysis (Paracuellos et al. 2017; Tashima et al. 2018). An LRR protein, platelet glycoprotein Ibα (GPIbα), binds α-thrombin. The interaction is enhanced by sulphation of Tyr278 at the C-terminus of GPIbα (Zarpellon et al. 2011). The tyrosine sulfate-rich domains of both fibromodulin and osteomodulin bind basic cluster motifs that are shared by various heparin-binding proteins and growth factors (Tillgren et al. 2009). The sulfotyrosines of fibromodulin and osteomodulin might be directly involved in the binding to these proteins.

Solenoid structures of five SLRPs

HELFIT analysis

Most solenoid structures of LRRs fold into a right-handed or left-handed super helix. The number of LRRs in SLRPs ranges from 8 to 22 (Fig. 1). To obtain geometrical features of the LRR solenoid structures, we calculated the helix parameters by HELFIT analysis (Enkhbayar et al. 2014). A helix may be characterized by its helix axis, helix pitch (P), helix radius (R), number of residues per turn (N), and handedness. HELFIT computes these parameters, which also yield the rise per repeat unit/residue (Δz = P/N) and the rotation per repeat unit/residue in the helix (ΔΦ = 360°/N). The HELFIT analysis requires as few as four data points, each being the coordinates of the α-carbon (Cα) of a residue. In LRRs, the Cα coordinates of the consensus leucine residue at position 4 in the HCS, which is located in the center of individual β-strands, are used. All five SLRPs are represented by a right-handed helix. The helix parameters range over P = 24.2–80.6 Å, N = 29.2–37.7 units/turn, R = 24.1–28.7 Å, Δz = 0.97–2.10 Å, and ΔΦ = 9.4°–12.4°; p = 0.04–0.17 Å. A plot of 2⋅R⋅sin(ΔΦ/2) versus Δz is well fit by a circle with radius D.

$$\left\{ {2R\sin \left( {\frac{\Delta \Phi }{2}} \right)} \right\}^{2} + (\Delta z)^{2} = D^{2}$$
(1)

Equation (1) is the equation of a circle and yields D = 5.09 ± 0.03 Å, which corresponds to the inter-strand distance that allows the formation of hydrogen bonds. This relation holds in all known structures with very few exceptions (Enkhbayar et al. 2014). The helical parameters of the five SLRPs seem to be more comparable to those of Typical LRR domains (for example, Drosophila Toll) than to those of Bacterial LRR domains (IpaH9.8).
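Equation (1), together with Δz = P/N and ΔΦ = 360°/N, can be evaluated numerically as in the sketch below. The function names are ours, and the input values (R = 25 Å, P = 50 Å, N = 32 units/turn) are illustrative numbers chosen inside the reported ranges rather than the fitted parameters of any particular SLRP.

```python
import math

def rise_and_rotation(pitch, units_per_turn):
    """Return (delta_z in Angstrom, delta_phi in degrees) per repeat unit."""
    delta_z = pitch / units_per_turn
    delta_phi = 360.0 / units_per_turn
    return delta_z, delta_phi

def inter_strand_distance(radius, pitch, units_per_turn):
    """Evaluate D from Eq. (1): {2R sin(dPhi/2)}^2 + dz^2 = D^2."""
    delta_z, delta_phi = rise_and_rotation(pitch, units_per_turn)
    chord = 2.0 * radius * math.sin(math.radians(delta_phi) / 2.0)
    return math.hypot(chord, delta_z)

# Illustrative helix parameters (assumed, not taken from a fitted structure).
dz, dphi = rise_and_rotation(pitch=50.0, units_per_turn=32.0)
D = inter_strand_distance(radius=25.0, pitch=50.0, units_per_turn=32.0)
```

For these assumed values D comes out a little above 5.1 Å, close to the inter-strand distance of about 5.09 Å reported above; P, R, and N cannot vary independently if neighboring β-strands are to stay within hydrogen-bonding distance.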

Multiple aromatic/methionine–aromatic interactions

The five known structures indicate that aromatic–aromatic and methionine (Met)–aromatic interactions occur frequently. These are noncovalent interactions that stabilize protein structure (Burley and Petsko 1985; Pal and Chakrabarti 2001). Examples are F74–F98 in biglycan, F72–F96 and F141–F167 in decorin, F125–F151 and F111–F137 in osteomodulin, and F71–F95–F119 in chondroadherin. These phenylalanines occupy the consensus hydrophobic position 19 in the RUL = 24 Typical LRR or position 21 in the RUL = 25 Typical LRR. The benzyl groups of the Phe side chains form stacks. The Phe residues of consecutive units are called the Phe spine, which was first recognized in nogo receptor (He et al. 2003). A large aromatic–aromatic cluster, F242–F239–F364–F287–F289–W314, is observed in chondroadherin. Others are F284–F316 in biglycan, F264–F290 in fibromodulin, Y274–F277–F296 in osteomodulin, and F259–F264 in chondroadherin. Met–aromatic interactions occur in F72–F96–M119–F141–F167 and F141–M170–F167 in decorin and F156–M182–F208 in osteomodulin, in which more than one aromatic residue interacts with one Met in a "bridging" motif of the general form aromatic-Met-aromatic (Weber and Warren 2019). Others are F328–F333–M360 in fibromodulin, and M227–Y252–F253 and M320–F354 in osteomodulin. In summary, aromatic/Met–aromatic interactions are a significant factor in stabilizing the entire LRR domains of SLRPs.

Ligand interactions

Collagen binding

Decorin binds collagen I via the sequence SYIRIADTNIT in LRR7, which supports binding on the concave surface (Kalamajski et al. 2007). The collagen-binding sites are mainly located in LRR4 and LRR5 (Svensson et al. 1995). The deletion of M176–K201 (LRR6) and the mutation E180K (in LRR6) drastically interfered with the binding to reconstituted collagen I fibrils, indicating that an electrostatic interaction is a critical factor for the binding (Kresse et al. 1997). Decorin and asporin compete for binding to collagen via LRR10 to LRR12 (Kalamajski et al. 2009). Fibromodulin and lumican bind to the same region on collagen I (Svensson et al. 2000). Fibromodulin interacts with collagen cross-linking sites (Kalamajski et al. 2016). Fibromodulin binds collagen I via residues E354 and K356 in LRR12 (Kalamajski and Oldberg 2007). Fibromodulin and lumican bind to collagen via a more proximal region located between LRR6 and LRR8 (Kalamajski and Oldberg 2009). In osteomodulin, residues E284 and E303 (in LRR12 and LRR13, respectively) mediate collagen I binding (Tashima et al. 2018). Taken together, collagen may span 7–8 LRRs in these SLRPs. The collagen binding site is presumably mapped on the concave surface, the ascending loops, or a combination thereof in the LRR domains (Islam et al. 2013).

Chondroadherin binds to the sequence GAOGPSGFQGLOGPOGPO (O is hydroxyproline) in collagen (Paracuellos et al. 2017). A GFx triplet among the six triplets is shared with the binding sites of several other collagen-binding proteins including OSCAR (Zhou et al. 2016). Histidines and aromatic residues are markedly concentrated on the concave surface of chondroadherin, as well as of fibromodulin (Paracuellos et al. 2017). Taken together, the phenylalanine in the GFx motif might participate in the collagen-binding interactions with chondroadherin through histidine–aromatic and/or aromatic–aromatic interactions, while the interaction with osteomodulin is dominated by weak electrostatic forces (in particular by E284 and E303) and controlled by entropic factors (Tashima et al. 2018).

TGF-β binding

Biglycan, decorin, asporin, fibromodulin, and tsukushi bind TGF-β (Yamaguchi and Ruoslahti 1988; Yamaguchi et al. 1990; Nakajima et al. 2007; Hildebrand et al. 1994; Niimori et al. 2012), as does an LRR protein, GARP (glycoprotein A repetitions predominant), with 22 LRRs (Lienart et al. 2018). The decorin fragment Leu155-Val260 (LRR5 to LRR9) interacts with TGF-β (Schonherr et al. 1998), while the asporin fragment His159-Asn205 (LRR4 to LRR6) mediates its interaction with TGF-β1 (Kou et al. 2010). The binding footprint of TGF-β1 on GARP comprises the ascending and convex surfaces (LRR4 to LRR11). The five SLRPs might interact directly with TGF-β via the ascending and convex surfaces. In contrast, a modeling study indicates that TGF-β interacts with biglycan on its concave surface (Cho et al. 2016).

LRR protein binding

Some SLRPs bind other LRR proteins. Biglycan binds to TLR2, TLR4, and CD14 (Schaefer et al. 2005; Roedig et al. 2019). Lumican interacts with TLR4 (Wu et al. 2007). Decorin binds EGFR (Santra et al. 2002). EGFR and IGF-1R (insulin-like growth factor 1 receptor) contain five non-canonical LRRs in the L1 and L2 domains (Miyashita et al. 2014). A part of the LRR2 VS (H394-I402) in the L2 domain is essential for decorin and EGF binding. Decorin also binds and suppresses IGF-1R (Morcavallo et al. 2014; Schonherr et al. 2005; Iozzo et al. 2011).

Other ligands

In addition to the above ligands, SLRPs belonging to classes I, II, and III interact with many other ligands including vWF (decorin) and matrix metalloproteinase-14 (MMP-14) (lumican) (Guidetti et al. 2004; Pietraszek-Gremplewicz et al. 2018). The glycosaminoglycan side chain of decorin mediates the interaction with vWF, and its degree of sulfation regulates it. The concave face of the LRR β-sheet in the GPIbα-vWF complex is involved in protein–ligand interactions (Sadler 2002). Similarly, the catalytic domain of MMP-14 may interact directly with the concave surface of the lumican LRR domain.

Class IV and V SLRPs also interact with various ligands. Nyctalopin is located on the surface of the photoreceptor-to-ON bipolar cell synapse in the retina (Gregg et al. 2007). Nyctalopin interacts directly with transient receptor potential cation channel subfamily M member 1 (TRPM1) (Pearring et al. 2011). Tsukushi is expressed in the primitive streak and Hensen’s node (Ahmad et al. 2018; Ohta et al. 2006). Tsukushi (TSK) binds to nodal/Vg1/TGF-β1, BMP4/chordin, Delta, FGF8, Frizzled4, and CCN2/CTGF (Ohta et al. 2004, 2011, 2019). Structural studies of these complexes remain to be done.

Human diseases associated with mutations

Mutations in genes encoding SLRPs are associated with human diseases (Matsushima et al. 2019). Exonic mutations (including missense mutations, nonsense mutations, insertions, deletions, frameshift mutations, and silent mutations) and intronic mutations have been identified in the SLRP genes. Here we focus on the exonic mutations in the LRR domains.

Spondyloepimetaphyseal Dysplasia, X-linked (SEMDX), and Meester-Loeys Syndrome (MRLS) associated with the BGN mutations

SEMDX is an X-linked recessive bone disease that impairs bone growth; it occurs only in males. It is characterized by anomalies of the spine and of the epiphyses and metaphyses of the long bones. SEMDX is caused by BGN mutations (K147E and G259V) (Cho et al. 2016). MRLS is a connective tissue disorder that is characterized by thoracic aortic aneurysms and dissections. The BGN mutations G80S, Q303P, and Trp2* cause MRLS (Meester et al. 2017). The four missense mutations occur at residues in the HCS part, on the concave or ascending surfaces of the LRR domains.

Congenital Stromal Corneal Dystrophy (CSCD) and Congenital Hereditary Stromal Dystrophy (CHSD) associated with the DCN mutations

CSCD is an inherited eye disorder that is characterized by numerous opaque flaky or feathery areas of clouding in the stroma (Bredrup et al. 2005). Mutations in DCN have been identified in families with CSCD (Bredrup et al. 2005; Rodahl et al. 2006; Kim et al. 2011; Chen et al. 2011; Jing et al. 2014). Five mutations result in the loss of the C-terminal 33 to 45 amino acids, which are part of the LRRCE motif. A truncated form lacking the 33 amino acids is retained in an unfolded protein response (Chen et al. 2013).

A DCN mutation may also be linked to CHSD. CHSD is a rare autosomal dominant disorder that manifests as bilateral neonatal corneal opacification. Minute stromal opacities of the cornea gradually decrease vision. The C346G mutation was proposed to cause a mild form of CHSD (Lee et al. 2012). This mutation eliminates interlocking disulfide bonds in the LRRCE motif.

Autosomal recessive cornea plana (CNA2) associated with the KERA mutations

CNA2 is characterized by a flattened corneal surface leading to a decrease in refraction, reduced visual acuity, strong hyperopia, a widened limbus zone, opacities in the corneal parenchyma, and marked arcus senilis (often detected at an early age) (Forsius et al. 1998). Mutations in KERA encoding keratocan cause CNA2 (Lehmann et al. 2001; Pellegata et al. 2000; Ebenezer et al. 2005; Liskova et al. 2007; Roos et al. 2015; Kumari et al. 2016; Dudakova et al. 2018; Khan 2018; Khan et al. 2006; Khan and Kambouris 2004; Dudakova et al. 2014).

Position 9 in the HCS consensus is usually occupied by Asn, whose side chain forms a network of H-bonds with the backbone carbonyl and amide groups of the neighboring repeats (asparagine ladder). Residues N31, N131, and N247 correspond to position 9. Thus, the missense mutations N81A, N31D, and N247S likely break the H-bonds. Positions 1 and 11 in the HCS consensus are mostly occupied by hydrophobic residues. T215 and I225 are located at these positions, as the sequence of the LRR8 HCS is 215-TMQLFLDNNSI-225. Threonine is clearly less hydrophobic than Leu, Ile, or Val. However, the crystal structure of the lamprey variable lymphocyte receptor C (VLRC) with seven LRRs reveals that Thr69 (located at position 1 in the LRR1 HCS) is completely buried in the inner side of the LRR domain (Kanda et al. 2014). The charged-residue substitution T215K disrupts the hydrophobic core, which probably induces large structural changes and subsequent misfolding and/or aberrant aggregation. The I225T mutation perturbs the overall folding of the LRR domain and may then induce misfolding. P70 (in LRR1) and P208 (in LRR7) are located in the Bacterial VS part. The substitutions to Leu and Arg (P70L and P208R) perturb the preference for the polyproline II conformation. Three nonsense mutations (Q174*, R279*, and R313*) cannot preserve the solenoid structure. F125del will break a short β-strand in LRR4. The frameshift mutation C343fs*26 may bring about a large change in the LRRCE structure.
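The mapping between residue numbers and HCS positions is simple offset arithmetic. The sketch below assumes only what is stated above, namely that the LRR8 HCS of keratocan starts at T215 and has the 11-residue sequence TMQLFLDNNSI; the function name is hypothetical.

```python
def hcs_positions(first_residue_number, hcs_sequence):
    """Map residue number -> (HCS position, amino acid), positions counted from 1."""
    return {
        first_residue_number + offset: (offset + 1, residue)
        for offset, residue in enumerate(hcs_sequence)
    }

# LRR8 HCS of keratocan as quoted in the text.
kera_lrr8 = hcs_positions(215, "TMQLFLDNNSI")

t215 = kera_lrr8[215]        # site of the T215K mutation
last_residue = max(kera_lrr8)  # residue number of the final HCS position
```

An 11-residue segment starting at residue 215 ends at residue 225, so the Ile at position 11 is I225, the site of the I225T mutation, and the Asn at position 9 of this repeat is residue 223.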

Night Blindness, Congenital Stationary (CSNB) associated with the NYX mutations

CSNB is a non-progressive retinal disorder characterized by impaired night vision and predominantly by abnormal function of the rod system. It includes type 1A (CSNB1A) and type 1F (CSNB1F) (Tsang and Sharma 2018, review). CSNB1A is caused by mutations in nyctalopin (Bech-Hansen et al. 2000; Pusch et al. 2000; Zhou et al. 2015; Leroy et al. 2009; Dai et al. 2015; Pradhan et al. 2011; Wang et al. 2012; Xiao et al. 2006; Zeitz et al. 2005; Sui et al. 2008; Ivanova et al. 2019), as is CSNB1F by mutations in LRIT3 (Zeitz et al. 2013, 2015; Dan et al. 2017), CACNA1F, TRPM1 (Abdelkader et al. 2018), RPM1 (AlTalbishi et al. 2019), and GNAT1 (Marmor and Zeitz 2018).

Nyctalopin contains thirteen LRRs, most of which are Typical type, flanked by LRRNT and LRRCT. Many mutations have been mapped to the LRR domain (Matsushima et al. 2019). Eighteen mutations are located at highly conserved residues (at positions 1, 4, 6, and 11) in the HCS part (V46G, A64E, L91Q, L98P, L117P, L117Q, L123Q, L142P, L142R, L161R, L184P, A187K, L213Q, L232P, L235P, L280F, L285P, and L307P), and two at the “F” and “L” positions in the “Typical” VS part (I101T and F298S). Among these mutations, the charged-residue substitutions (A64E, L142R, and L142R) disrupt the hydrophobic core of the LRR domain, as seen in the KERA mutation. The mutations V46G, L91Q, I101T, L117Q, L123Q, L213Q, and F298S largely disturb the hydrophobic core, while the I101T and L280F mutations induce a smaller disturbance. The remaining mutations are substitutions to proline (L98P, L112P, L117P, L142P, L184P, L232P, L235P, L285P, and L307P).

Four mutations at Asn ladder positions (N72H, N216S, N264K, and N312S) disturb the network of H-bonds. Nine mutations (C31S, C31Y, V46G, R50P, P57P, W346C, L347P, G370V, and C386R) in the LRRNT and LRRCT structures disturb disulfide bridges, folding, and stability. The A143P (at position 6 in the HCS) and T258P (at position 3) mutations disturb H-bonds between the β-strands that form the convex surface. The P57T, P151L, and P175R mutations in the VS part change the polyproline II conformation. The A177M mutation changes the conformation of plausible tandem β-turns in the VS region. The five mutations of Arg (R50P, R94P, R162P, R173P, and R257P) on the convex surface or on the ascending loop may impair interaction with a ligand such as TRPM1.

Indels (insertions and deletions) are also observed. The insertions are L155LSVPERLL (an addition of seven residues), and R207RLLR and R209RCLR (each an addition of three residues). These insertions probably form a bulge but keep the solenoid structure. The deletions are R29-A36del, E114-A118Adel, and A243-P246del, which induce destructive structural changes. A single conserved hydrophobic residue is deleted in I101del, which destabilizes the LRR domain and then induces a structural change. Nonsense mutations (Q299x) and frameshifts (H95fs and D286Tfs) are also observed.
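The number of residues added by each insertion can be read directly from the notation, in which the original residue and its position are followed by the sequence that replaces it. The parser below is a minimal sketch for this one notation style; the function name is hypothetical and it does not handle the deletion or frameshift notations.

```python
import re

def inserted_residue_count(notation):
    """Parse e.g. 'R207RLLR' and return (position, number of residues added)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z]+)", notation)
    if m is None:
        raise ValueError("unsupported notation: " + notation)
    original, position, replacement = m.group(1), int(m.group(2)), m.group(3)
    # One residue is replaced by len(replacement) residues.
    return position, len(replacement) - len(original)

# Insertions listed in the text.
insertions = ["L155LSVPERLL", "R207RLLR", "R209RCLR"]
added = {name: inserted_residue_count(name) for name in insertions}
```
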

High Myopia associated with the mutations of LUM, FMOD, PRELP, OPTC, and NYX

If an eye requires −6.0 diopters or more of lens correction, it is usually considered to have high myopia (Cai et al. 2019). Mutations in the genes encoding opticin, lumican, fibromodulin, PRELP, and nyctalopin are associated with high myopia (Wang et al. 2009; Majava et al. 2007; Acharya et al. 2007; Yip et al. 2013; Zhang et al. 2007; Zhou et al. 2015).

The OPTC mutations G329S and R330H occur at the C-terminus of the protein. The crystal structures of biglycan, decorin, and fibromodulin reveal that I325 (biglycan), V324 (decorin), and I374 (fibromodulin) participate in the hydrophobic cores (Scott et al. 2004, 2006; Paracuellos et al. 2017). Sequence alignment indicates that F331 in opticin corresponds to these residues. Thus, the G329S and R330H mutations disrupt the LRRCE structure. The NYX mutation C48W disrupts a disulfide bond in the LRRNT structure. The mutations of Arg (R191Q and R209P in NYX, R324P in FMOD, and R146L, R229H, and R325W in OPTC) may impair interaction with a ligand, as seen in the NYX mutations associated with CSNB. Similarly, the mutations P46L in the N-terminal Tyr-rich region of fibromodulin, G33R in the N-terminal Arg/Pro-rich region of PRELP, and N348H in the LRRCT motif of PRELP might also impair interaction with a ligand. The mutations L112P (NYX), M157V (PRELP), and L199P (LUM) disturb the hydrophobic core of the LRR domain. The mutations A177T (NYX), G147D (FMOD), and T177R, P267L, and L268P (OPTC) change the β-turn conformation in the VS region. Furthermore, nonsense mutations (L134x and L270x in OPTC) and a frameshift mutation (E41fs*100) are also observed.

Cancer or Carcinoma associated with BGN

Cancer is a group of diseases that involve abnormal increases in the number of cells, with the potential to invade or spread to other parts of the body. Carcinomas are cancers derived from epithelial cells. Mutations in genes encoding various LRR proteins are associated with cancer or carcinoma. They include breast cancer in biglycan, as seen in NLRP14, densin-180, colorectal cancer in FBXL2, LRRC47, Slitrk4, and ISLR, papillary cancer in TSHR, lung cancer in MXRA5, multiple self-healing palmoplantar carcinoma (MSPC) in NLRP1, renal cell carcinoma in FBXL12, ovarian serous carcinoma in trk-A, lung adenocarcinoma in trk-B, gastric adenocarcinoma in trk-C and insulin receptor, and colorectal adenocarcinoma in ErbB4 (Matsushima et al. 2019); MSPC is an autosomal dominant disease that is characterized by recurrent keratoacanthomas in palmoplantar skin as well as in conjunctival and corneal epithelia (Zhong et al. 2016).

The BGN mutations R266T and K288N are found in breast cancer. Residues R266 and K288 on the ascending loops are exposed to solvent (Sjoblom et al. 2006). We infer that these mutations are gain-of-function.

Concluding remarks

We described some features of the amino acid sequences and structures of SLRPs. The secondary structure assignment supports the presence of super motifs consisting of Bacterial and Typical LRRs. The five known structures indicate that aromatic/methionine–aromatic interactions, which stabilize the entire LRR domains, occur frequently. We reviewed the ligand interactions of SLRPs. The concave surface of LRR domains is in contact with some ligands. In addition, other surfaces presumably participate in the interactions. We mapped all mutations associated with human diseases and discussed their possible effects on LRR structures.