Introduction

Fibrillar collagens, including collagen types I, II, III, and V, collectively account for more than 90% of all collagens in the body (Bächinger et al. 2010), and they form the essential molecular scaffold in all tissues. Central to the crucial biological functions of collagen are its supramolecular structures in the extracellular matrix (ECM). The macromolecular assemblies of collagen are known for their multiplicity and complexity in terms of composition, conformation, and molecular properties (Bella and Hulmes 2017; Brodsky and Ramshaw 2017; Birk and Bruckner 2005). Accumulating observations point to the structural polymorphism of collagen as a key factor in the varied functions of collagen in tissue genesis and pathogenesis (Eyden and Tzaphlidou 2001). Polymorphism of collagen fibrils has been known since the dawn of collagen research (Bruns 1976; Kühn 1982; Gross et al. 1954). Among the alternative structures of fibrillar collagens, SLS (segment-long-spacing) is unique in its form and in its direct link to the primary structure of collagen. While the SLS of fibrillar collagens are the best studied and will be the focus here, many other types of collagens can also form SLS-like aggregates (Bentz et al. 1983; Schwartz and Veis 1978, 1980). In fact, the structure of SLS is a manifestation of the conformational properties of the collagen triple helix. Expanded research on how SLS and other structures affect the properties of collagen will lead to a better understanding of the range of critical roles of fibrillar collagen in tissue development and function.

SLS: The Effective Molecular Tool for Collagen Research

To understand the significance of the SLS structure in collagen research, one has to take a step back in time, to the early days of protein chemistry, when advances in collagen research took off in lockstep with the emerging technologies in molecular and structural biology: electron microscopy, X-ray diffraction, and analytical ultracentrifugation, to name a few. The large and regular structure of fibrillar collagen is particularly suitable for studies using a transmission electron microscope (TEM). By the early 1950s, it was understood that collagen is a fibrous protein having an axial periodicity of about 67 nm, the D-period. This periodicity was observed in collagen fibrils isolated from tissues and in the reconstituted collagen fibrils formed from acid extracts (Gross and Schmitt 1948). The 67 nm periodicity was characterized as regular dark-light bands on negatively stained electron micrographs (Fig. 16.1a) or as clear ridges and gaps after chromium shadowing (Fig. 16.1c). This directly visible repeating structure was also supported by the X-ray diffraction of collagen fibrils (Wright 1948; Begbie 1947; Gross and Schmitt 1948). The basic structure of collagen, the collagen triple helix, was known to have three polypeptide chains wrapped around each other in an extended conformation with a uniform backbone. The Gly residues of the characteristic (Gly-X-Y) repeating amino acid sequences are at the center of the helix, while the side chains of the residues in the X and Y positions are displayed linearly on the surface of the helix with an N-to-C directionality (Ramachandran and Kartha 1955; Rich and Crick 1955). The puzzling question was how the specific D-periodic fibrous structure came about from the arrangement of a defined structural unit. The D-period was attributed to the spacing between two adjacent monomers, although neither the size or shape of the “monomer” nor their specific arrangement associated with this regular spacing was understood.

Fig. 16.1

The structure of collagen fibrils. (a) Negatively stained TEM image showing the D-period as a pair of light-and-dark bands. (b) Positively stained TEM images showing the D-period related placement of clusters of charged residues. (c) TEM image after metal shadowing showing the dark and light bands of a D-period as the ridges and gaps, respectively. a and b are adapted from (Eyden and Tzaphlidou 2001), c from (Schmitt et al. 1953)

The breakthrough came with the discovery of a different structure of collagen: the segment-long-spacing crystallite, also known as the SLS. SLS is not a fibrous structure but a “crystallite-like” structure having one dimension about 200–300 nm in length, a segmental length related to the long collagen fibril D-period (Fig. 16.2). SLS was first discovered during routine collagen preparations and was later reproduced by dialyzing fibril-forming collagen in the cold against 0.05% acetic acid (pH 3.5) with the addition of ATP to a final concentration of 0.1–0.25% (Schmitt et al. 1953). In their original work on SLS, Schmitt et al. reported that upon dialysis, the D-period fibrils transformed into a new kind of aggregate “in the form of segments having a characteristic pattern of internal band structure” (Fig. 16.2a). Unlike the regular striation of the D-period in fibrils, the closely spaced, barcode-like bands of SLS are spaced asymmetrically along the axis of the segment, showing a clear polarity from one end to the other. The banding pattern and the length of the segment are highly reproducible, although the width of the segments varies. The chromium-shadowing technique revealed that the SLS are not flat ribbons but crystallites with considerable thickness, indicating that these are likely structures formed in solution and deposited as crystallites on the grid (Fig. 16.2b).

Fig. 16.2

The structure and banding pattern of SLS. (a) Positively stained TEM images of SLS having the N and C termini of the triple helix marked by the letter N and C, respectively. (b) SLS crystallites after metal shadowing. (c) Matching the bands with the clusters of charged residues. (d) The 58 bands of the “molecular fingerprint” of collagen type I. a is adapted from (Hodge and Schmitt 1960), b from (Schmitt et al. 1953), c from (Fietzek and Kühn 1976), d from (Bruns and Gross 1973)

At about the same time SLS was discovered, several classic hydrodynamic studies of collagen revealed that collagen molecules dissolved in solution behave as a “rigid rod” about 280 nm × 1.4 nm in size, having a molecular weight close to 360 kDa (Boedtker and Doty 1956; Hall and Doty 1958). This was a spot-on description of a triple helix prior to any knowledge of its primary structure! Combining the hydrodynamic results with the X-ray diffraction data on the extended conformation of the triple helix, it became clear that the SLS is formed by self-assembled collagen triple helices, also known as tropocollagens or TC (Gross et al. 1954). In the SLS crystallite, all TC molecules are arranged parallel to each other and in perfect transverse register, the so-called end-on-end stacking. The SLS crystallites have the same length as the collagen molecule. The banding patterns of SLS are due to the effects of the positive staining: the metal ions tungstate and uranate in the staining solution interact, respectively, with the positively charged and negatively charged residues on the surface of the molecule (Kühn 1982). The bands of SLS can thus be related to the primary structure of the molecule given the rod-like, linear conformation of the triple helix. In a remarkable demonstration, Kühn et al. showed a near-perfect alignment of the banding pattern of a section of the SLS with the resolved amino acid sequence of collagen from cyanogen bromide (CNB) fragments (Fig. 16.2c) (Fietzek and Kühn 1976; von der Mark et al. 1970). The SLS of collagens prepared from a wide variety of tissues and animals showed a high level of similarity in their banding patterns. By averaging more than 100 published observations on positively stained SLS, a series of 58 bands having a polarized localization from the N- to the C-terminus was recognized as a molecular fingerprint for type I collagen (Hodge and Schmitt 1960; Bruns and Gross 1973, 1974) (Fig. 16.2d).
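
The link between banding and primary structure can be illustrated with a minimal sketch. It assumes that position in the sequence maps linearly onto position along the rod, and that windows rich in charged residues correspond to stained bands. The function name `charge_profile`, the window size, and the toy Gly-X-Y sequence are all hypothetical choices made for illustration; they are not taken from the cited studies and the sequence is not a real collagen sequence.

```python
def charge_profile(sequence, window=9):
    """Count charged residues in consecutive, non-overlapping windows.

    For a straight, rod-like molecule, each window corresponds to a
    fixed-length slice along the axis, so a high count suggests a
    position where a positively stained band could appear.
    """
    charged = set("KRDE")  # Lys, Arg, Asp, Glu (illustrative choice)
    counts = []
    for start in range(0, len(sequence), window):
        segment = sequence[start:start + window]
        counts.append(sum(1 for residue in segment if residue in charged))
    return counts


# Hypothetical toy chain: alternating charge-poor and charge-rich stretches.
toy_chain = "GPPGPAGPP" + "GERGDKGEK" + "GPAGPPGAP" + "GKDGERGPR"
profile = charge_profile(toy_chain, window=9)
print(profile)  # [0, 6, 0, 5]
```

The alternation of low and high counts in this toy profile mimics, in a very reduced form, the alternating unstained and stained zones of a barcode-like pattern.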

The realization that the SLS is an “electron optical image” of a 2-D display of the tropocollagen has transformed the SLS into an effective molecular tool for collagen research. The following are three major findings of collagen attributed to the use of SLS.

The Molecular Packing of the Fibrils

The D-period of native collagen fibrils is usually characterized using TEM images obtained under negative staining conditions, which depict the surface relief of the fibril: the dark bands reflect the gaps on the surface where the metal ions of the staining solution accumulate, while the light bands are the overlap regions in the form of ridges (Fig. 16.1a and c). Under positive staining conditions, however, the images of the native fibrils revealed a different banding pattern: one that reflects regions of charged residues on the surface of the fibrils (Fig. 16.1b). The positively stained banding patterns of fibrils also bear strong similarities to the SLS fingerprint. In fact, by successively overlaying the banding pattern of an SLS with a longitudinal displacement of one-fourth of the length of an SLS, Schmitt et al. were able to create a banding pattern that corresponded closely to that of the native fibrils (Fig. 16.3a) (Hodge and Schmitt 1960). In another remarkable experiment, they prepared the SLS in the presence of the native fibrils. The individual collagen heterotrimers on the fibril surface appeared to act as the “nuclei” for the SLS aggregates; the SLS were appended to the surface of the fibrils “as outgrowths from the native fibrils” (Fig. 16.3b and c). From the characteristic orientations of the banding pattern of the SLS segments with respect to the native fibrils, this legendary “dimorphic” image revealed two important features of the collagen fibrils: (1) the tropocollagen (TC) molecules are all arranged in parallel in the fibrils and (2) the transverse clusters of charged residues seen on SLS are conserved in the fibrils; as shown in the “optical synthesis,” the unique banding pattern of the native fibrils is the combined effect of TC molecules arranged with a regular mutual stagger. Using different methods, Kühn et al. reached a similar conclusion about the conserved SLS bands and the quarter-staggered arrangement of the fibrils (Kühn 1982). Later studies revealed that the D-period is 67 nm, not the 70 nm used in the work by Schmitt et al., and that a TC spans 4.4 D-periods. While the staggering of the TC molecules in the fibrils is not exactly a quarter-stagger as had been proposed by Schmitt et al., the findings using SLS nonetheless captured the fundamental aspects of the fibril assembly of collagen and have served as a bedrock of knowledge for studies of collagen fibrils ever since.
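
The “optical synthesis” can be mimicked numerically as a rough sketch: a one-dimensional staining profile is copied repeatedly, each copy displaced by one quarter of its length, and the copies are summed. The profile values below are hypothetical numbers chosen only to be asymmetric (polar), and the function name `optical_synthesis` is an illustrative label rather than anything used in the original work.

```python
def optical_synthesis(profile, shift, copies):
    """Sum `copies` displaced copies of a 1-D staining profile.

    Copy k is displaced by k * shift samples along the fibril axis,
    imitating repeated printing of the same SLS image.
    """
    length = len(profile) + shift * (copies - 1)
    total = [0.0] * length
    for k in range(copies):
        for i, value in enumerate(profile):
            total[k * shift + i] += value
    return total


# Hypothetical polar band pattern, 40 samples long (not measured data).
sls = [((7 * i * i + 3 * i) % 11) / 10.0 for i in range(40)]
quarter = len(sls) // 4  # quarter-length displacement
fibril = optical_synthesis(sls, quarter, copies=12)

# Away from the two ends, four copies overlap at every position and the
# summed pattern repeats with a period equal to the displacement.
interior = range(3 * quarter, (12 - 1) * quarter)
is_periodic = all(abs(fibril[x] - fibril[x + quarter]) < 1e-9 for x in interior)
print(is_periodic)  # True
```

In this sketch the molecule length is an exact multiple of the displacement, as in the original quarter-stagger proposal. If the length were a non-integer multiple of the displacement, as in the later 4.4 D description, the number of overlapping copies would vary along each period.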

Fig. 16.3

The recapitulation of the banding patterns of SLS in the positively stained collagen fibrils. (a) Shown in (1) is the typical banding pattern of an SLS, where N and C indicate the N- and C-termini, respectively, of the tropocollagen. The banding pattern in (2) is generated by multiple printing of the SLS image in (1) with a longitudinal displacement of ¼ of the length of the segment (indicated by the positions δ1, δ2, δ3, and δ4). The summation of band densities resulting from the “optical synthesis” corresponds closely to the band pattern of the native fibrils shown in (3). (b) The dimorphic ordered aggregates of tropocollagen produced by exposing the reconstituted native fibrils to a solution containing tropocollagen and ATP at a pH value favoring the formation of SLS. The SLS form as an “outgrowth” from the native fibrils and exhibit a characteristic orientation and polarity with respect to the polarized band pattern of the fibrils. The arrows indicate the C-to-N orientation of the tropocollagen in the SLS. (c) An expanded view of an SLS “outgrowth” on a native fibril. The letters C and N and the arrow indicate the C-to-N directionality of the SLS. (Adapted from Hodge and Schmitt 1960)

The Fingerprints of Collagens

The close correlation between the banding patterns of SLS and the primary structure of collagen led to the discovery of new types of collagens and even opened the door to phylogenetic studies of collagen long before the amino acid sequences of those collagens were elucidated. Collagens prepared from animal skins are predominantly collagen type I, a heterotrimer consisting of two α1 chains and one α2 chain, which is also the most abundant collagen in the animal kingdom. Collagens extracted from other tissues, however, are often dominated by genetically distinct molecules differing from each other in amino acid composition and chain composition. With chemical sequencing of proteins laborious and technically challenging, and genomic information still a few decades away, SLS was used to analyze the structural and functional significance of newly discovered collagens. It was immediately clear that the SLS images of different fibrillar collagens share a similar appearance: barcode-like bands in a segment about 300 nm long (Fig. 16.4). Yet the resolution of SLS is high enough to reveal distinctive differences between collagens in both the location and the intensity of some of the bands (Fig. 16.4h) (Wiedemann et al. 1975). One interesting case is the markedly different cross-striation pattern of type IV collagen extracted from basement membrane (Fig. 16.4f and g). Type IV collagen is known to have “breaks” in its triple helical domain and to form a chicken-wire-like molecular network rather than fibrils (Schwartz and Veis 1978; Chung et al. 1976; Kefalides 1968). The different banding pattern of type IV also highlights a common feature of fibrillar collagens: long stretches of uninterrupted banding. This banding pattern also appeared to be highly conserved and was found in collagen preparations from human skin, calf skin, rat tail tendon, fish, and invertebrates. This finding led to the conclusion that the regular alternation of the polar and apolar regions is one of the characteristics that “are not subjects to variations by mutations” throughout collagen’s evolution (Nordwig and Hayduk 1969). These conserved features were considered necessary for collagen’s configuration and for fibril formation, an idea that was later elaborated on by the discovery of periodic spacings of charged and hydrophobic residues in fibrillar collagens in relation to fibril structure (Hulmes et al. 1973; Hulmes et al. 1977; Doyle et al. 1974; Kaur et al. 2015) and was incorporated in the design of fibril-forming triple helical peptides (Kaur et al. 2015) (for details see below).

Fig. 16.4

The SLS fingerprints of different collagens. (a) Type I collagen α1 chain homotrimer. (b) Type I collagen α2 chain homotrimer. (c) Type II collagen. (d) Type III collagen. (e) Type V collagen [α1(V)]2α2(V) heterotrimer. (f) Type IV collagen from placenta. (g) Type IV collagen from mouse. (h) The differences (highlighted in yellow dots) between type I heterotrimer and type III collagen. Adapted from (Kühn 1982), h from (Wiedemann et al. 1975)

The Functions of Collagenase

In another historical development of collagen research, SLS was used to “visualize” the action of collagenases. It had long been recognized that collagenolysis plays an essential role in tissue remodeling. Upon the successful isolation of the first animal collagenase from tadpole tissues, the catalytic activity of this enzyme was studied using collagen preparations from calf skin and from the skin of normal or lathyritic guinea pigs and rats (Gross and Lapiere 1962). Incubation of type I collagen with tadpole collagenase resulted in two discrete fragments that could be observed by electrophoresis as two separate bands below the normal α1 and α2 chains. The SLS aggregates prepared from the reaction mixture revealed fragments that were three-quarters the length of the SLS of intact tropocollagens (Gross and Nagai 1965) (Fig. 16.5). Remarkably, the shorter SLS retained a banding pattern that was in perfect alignment with the N-terminal three-fourths of the established fingerprint of type I collagen (Fig. 16.5c). Furthermore, using the 58 bands as the fingerprint of type I collagen and the amino acid sequence of the CNB fragment α1-CB7, an analysis of the banding pattern of the SLS helped to identify the Gly-Ile bond between positions 775 and 776 of the α1 chain (i.e., residues 221 and 222 of α1-CB7) as the severed bond (Bruns and Gross 1973). Similar studies using SLS have established both the high specificity of the collagenase and the highly conserved action of the enzyme in different tissues and across species (Stark and Kühn 1968; Nordwig and Bretschneider 1971; Heidemann and Heinrich 1970; Nagase and Woessner 1999; Nagase et al. 2006; Harris and Krane 1974).
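
A short arithmetic sketch shows why a cut at the 775–776 bond is consistent with a three-quarter-length fragment. Two values are assumptions not stated in the text above: a triple-helical domain of 1014 residues for the α1(I) chain and an axial rise of 0.286 nm per residue. The sketch also assumes that residue numbering starts at the N-terminal end of the triple helix; the names used are illustrative only.

```python
TRIPLE_HELIX_RESIDUES = 1014  # assumed length of the alpha1(I) triple-helical domain
RISE_PER_RESIDUE_NM = 0.286   # assumed axial rise per residue in the triple helix


def cleavage_fraction(residue_index, total_residues=TRIPLE_HELIX_RESIDUES):
    """Fractional position of a residue along the rod, measured from residue 1."""
    return residue_index / total_residues


fraction = cleavage_fraction(775)
distance_nm = 775 * RISE_PER_RESIDUE_NM
print(round(fraction, 2), round(distance_nm))  # 0.76 222
```

Under these assumptions the cleavage site lies about 76% of the way along the rod, roughly 222 nm from the end where numbering starts, which is close to three-quarters of a molecule about 290 nm long.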

Fig. 16.5

The ¾ fragment of the collagenase digest. (a) The SLS of calf skin collagen. (b) The SLS of the ¾-length fragment from the reaction mixture. (c) A comparison of the SLS of collagen and of the ¾ fragment led to the identification of the region of the enzymatic cleavage; the “A” and “B” mark the C- and N-termini of the SLS. From (Gross and Nagai 1965)

Fast forward half a century: more than 28 genetically distinct types of collagens have been identified based on the amino acid sequences and the structure of the genes (Ricard-Blum 2011). There is no need to use SLS to study the primary structure in the postgenomic era, but SLS remains a useful tool in molecular studies of collagens. The formation of SLS aggregates appears to be common among all collagens under suitable conditions in vitro and in vivo. The reproducible and uniquely identifiable banding patterns have frequently been used to identify the corresponding collagens in TEM images of tissues (Bentz et al. 1983; Schwartz and Veis 1978), to study the chain register in heterotrimeric type I collagen (Bender et al. 1982), and to study the effects of disease-causing mutations (Timpl et al. 1978; Kobayashi et al. 1990; Stanescu et al. 1982).

SLS and the Structural Polymorphism of Fibrillar Collagen

Even in its heyday, SLS was frequently referred to as “artificial aggregates” because it has a nonfibrous structure and, most of all, because it was produced in the laboratory under nonphysiological conditions: low pH and with the addition of ATP. On the other hand, the highly reproducible nature of SLS indicates this unique structure is a manifestation of certain innate properties of the collagen molecule.

The Formation of SLS under Different Buffer Conditions

Following its discovery, research emerged to characterize the roles of pH and ATP in the formation of SLS. The SLS under in vitro conditions are often prepared from reconstituted collagen fibrils. The low pH is necessary to weaken interactions involving acidic residues (Glu and Asp) and cause the fibrils to dissociate (Kühn 1982). The low pH is also necessary to keep the TC in a monomer form avoiding other nonspecific aggregates that TC is prone to (Kühn 1982; Bowden et al. 1968). TC monomers that had been dissolved in solutions with a pH between 2.5 and 3.5 precipitated as SLS upon the addition of ATP (Paige and Goh 2001; Harris and Lewis 2016). At pH 5, however, the TC molecule formed fibril-like structures instead of SLS upon the addition of ATP; the fibrils at pH 5 did not show any discernable banding pattern (Harris and Lewis 2016).

The precise interaction between the polyanions and TC in the formation of SLS appears to be more complex, affected by both the structure of the polyanions and the distribution of the charged residues on the surface of TC. The polyanion ATP was thought to cross-bridge positively charged collagen molecules so that they aggregate laterally with their ends in register (Kühn 1982; Doyle et al. 1975). Yet not all polyanions can induce the precipitation of SLS (Bowden et al. 1968; Harris and Lewis 2016; Paige and Goh 2001). In one study, the presence of three adjacent charged groups was deemed to be the necessary condition, based on the finding that ATP, ATP-γ-S, and GTP were able to effectively induce SLS aggregation while ADP could not (Paige and Goh 2001). In another study, however, both triphosphates and polyanions having two negatively charged groups, such as fructose 1,6-diphosphate and perdisulphuric acid, were able to cause TC molecules to precipitate as SLS having the same banding patterns as those precipitated using ATP (Bowden et al. 1968). The same study also reported that when a dicarboxylic acid, such as succinic acid, glutaric acid, or adipic acid, was used as the polyanion in solution, no SLS were found. It was subsequently concluded that “a cloud of strongly negative charge” separating the two negatively charged groups was the important feature. Polyanions like ATP and GTP are naturally present in the cytoplasm and the ECM; whether their concentrations can reach the critical level needed to induce SLS formation in vivo remains to be seen.

In another interesting study, SLS were produced by adding diazo dye Sirius red or Evans blue to preformed D-periodic fibrils at neutral pH (Harris and Lewis 2016). These are large dye molecules that have a size about two to three times that of ATP and contain multiple charged groups separated by a diazo moiety. The dye molecules appeared to take on dual roles in this reaction: weakening the charge-based interaction of the fibrils and promoting the in-register precipitation of SLS. A higher than usual ionic strength appeared to be necessary for precipitation of SLS by the dye molecules. The ionic strength is generally an important factor for both in vitro fibrillogenesis and the SLS formation (Harris and Reiber 2007). The functions of salt in a solution of biomolecules range from shielding of charged groups, altering the dielectric constant and the activity of water to, in the cases of RNA folding and binding of DNA, directly affecting the structures and molecular recognition. How salt affects the self-assembly of collagen triple helices is only understood in general terms for now.

The cross striation arising from the in-register alignment of charged and nonpolar residues in SLS was further examined using atomic force microscopy (AFM), a more recently adopted technology that does not require staining of materials. Studies using AFM revealed the in-register nature of the SLS as “fine structures” arising from residues having large hydrophobic side chains (Fujita et al. 1997; Paige and Goh 2001); the SLS appeared as rectangular aggregates with rounded tops on the mica. At the optimum concentrations of collagen and ATP, AFM studies revealed that the SLS crystallites coexisted with different, smaller aggregates, which were interpreted as oligomers of the intermediate stages (Paige and Goh 2001). The earliest oligomers consisted of a few TC molecules and had the length of a TC molecule but were ~2.5 nm in diameter. These oligomers appeared to be stable intermediates, present in large amounts during the early stages of self-assembly; they further congregated into increasingly larger intermediates leading to the assembly of the SLS crystallites. Interestingly, a similar observation was also reported using designed, fibril-forming peptides (see below) (Kaur et al. 2015).

SLS under Physiological Conditions and their Potential Involvement in Fibrillogenesis

SLS can occur naturally under physiological conditions. SLS crystallites and pro-SLS (SLS aggregates of collagens before the N- and/or C-propeptides are removed) were frequently found in cell cultures without the addition of ATP or the need to lower the pH (Bruns et al. 1979; Hulmes et al. 1983). SLS were observed in whole-mount preparations of fibroblasts, in tissue homogenates, and in thin sections of embedded cells. These SLS were “slender” compared with those observed under in vitro conditions: they retain the critical 300 nm length but are smaller in both width and height (Fig. 16.6). The signature polar cross striations are clearly visible, albeit at lower resolution because of the smaller size. The pro-SLS, especially those having only the C-propeptide attached (pC-SLS), take on the distinctive shape of a flower bouquet, with the triple helical domain forming the “stems” bearing the recognizable cross striation and the larger, globular propeptides spread out as the flower heads, linked to the “stem” via a flexible linker (part of the linker becomes the C-telopeptide after enzymatic processing and remains at the terminal end of the TC) (Fig. 16.6c). The large globular domain may also have prevented the pro-SLS from growing larger. The formation of the pro-SLS is driven by the same molecular interactions that stabilize the SLS, since neither the propeptides nor the linker regions (or the telopeptides) are in contact with each other in the assembly. Because of the large globular propeptides, the procollagen monomers cannot assemble into staggered fibrils, and the SLS becomes a stable alternative conformation. Some TC having the smaller N-propeptide attached, however, were found to be incorporated on the surface of the fibrils and to play a key role in controlling the diameter of the fibrils by limiting the further assembly of TC molecules (Hulmes 1983; Hulmes et al. 1989). SLS and pro-SLS were also reported in thin sections of tissues, although it is often difficult to identify which type of collagen is in the aggregates because of the reduced resolution of the banding pattern (Weinstock 1977; Warshawsky 1972; Pérez-Tamayo 1972). A high concentration of TC appears to be a necessary condition for SLS formation (Hulmes et al. 1983). The average molecular weight of the pro-peptides in cell culture without further concentrating was estimated by velocity sedimentation to be close to that of the TC monomer, although the wide peaks in the Schlieren optics photographs during sedimentation indicated a more heterogeneous population than that of pure TC monomers. After the same cell culture was concentrated by repeated centrifugation, both SLS and pro-SLS were observed by TEM (Fig. 16.6a).

Fig. 16.6

Naturally formed SLS and pro-SLS in cell medium. (a) Aggregates from culture medium of chick embryo tendon cells. (b) Homogenate of sternal cartilage from a 2-week-old lathyritic chick showing numerous SLS-like structures. The scale bars are 300 nm. (c) Negative staining of pro-SLS from 24 h culture medium of chick embryo tendon fibroblasts. No ATP or other substances were added. a and b from (Bruns et al. 1979), c from (Kühn 1982)

Bruns et al. were the first to suggest a physiological function of SLS as the precursor of in vivo fibrillogenesis (Bruns et al. 1979; Hulmes 1983). They reported bundles of SLS in vacuoles associated with the cell membrane. In the same whole-mount fibroblast preparation, they also observed pro-SLS, SLS, native fibrils, and slender SLS crystallites scattered around the tips of the growing fibrils. In their proposed mechanism of in vivo fibrillogenesis, cells secrete procollagen at high local concentration packaged in the SLS form. These pro-SLS are transported in vacuoles, secreted into the ECM, and remain stable there. They further suggested that the SLS in the ECM are more stable than those prepared in solution at low pH with the addition of ATP, presumably because of interactions with other “cementing substances” such as fibronectin or proteoglycan. This added stability can potentially shift the equilibrium between the TC monomer and the SLS and make the SLS the more favorable conformation at moderate concentrations. The N- and C-propeptides on the pro-SLS are subsequently removed in the ECM, and the “slender SLS crystallites” (those having a diameter of ~10 nm, in which the TC are in 0D register with one another) become incorporated directly into the fibrils. In a way, the D-periodic fibrils “may be considered as an n × 67 nm staggered array of SLS crystallites” (Bruns et al. 1979).

Collagen fibrils acquire additional suprafibrillar structures in the extracellular matrix that establish the eventual architecture of different tissues and give the tissues their unique forms and properties. The high concentrations of collagen in the ECM resemble condensed matter rather than macromolecules in solution. Using the liquid crystalline model of condensed matter, several works by Giraud-Guille et al. using polarized light microscopy and TEM demonstrated that the structure of the ECM shares several similarities with the gelation process of collagen at very high concentrations (Besseau and Giraud-Guille 1995). The suprafibrillar structure of collagen in the ECM emerges from a liquid crystalline order of “soluble precursors” that pass through an intermediate stage involving molecular aggregates of unit triple helices. The mature suprafibrillar architecture of connective tissues maintains the liquid crystalline ordering of the “soluble precursors” (Besseau and Giraud-Guille 1995; Giraud-Guille 1996; Hulmes 2002). In subsequent studies, the soluble precursors were determined to be procollagens, which can reach concentrations in the range of 50–80 mg/mL without losing the liquid form; in comparison, TC monomers readily form gels or nonspecific precipitates at concentrations of a few mg/mL at neutral pH (Martin et al. 2000). It was thus proposed that liquid crystalline ordering in connective tissue takes place prior to enzymatic procollagen processing; fibril formation then proceeds by “compacting and sliding” of the molecular aggregates of the procollagens after the propeptides are removed, while retaining the liquid crystalline organization that existed prior to fibril assembly. Although well suited to supramolecular order in the sub-micrometer to nearly 1000 μm range, polarized light microscopy revealed little structural detail of the molecular aggregates other than their diameter of ~20 nm, much thicker than an individual procollagen triple helix. The pro-SLS appears to be the candidate for the intermediary molecular aggregates, considering that any staggered arrangement is unlikely given the presence of the propeptides. A precursor role of SLS and/or pro-SLS during in vivo fibrillogenesis would thrust this unique molecular structure right back to the center of research on the structures of the ECM during the function and development of tissues. On the other hand, the precursor nature of SLS highlights the challenge of characterizing the involvement of SLS in physiological processes: like any intermediate in a reaction pathway, the SLS will be present only transiently and in low abundance.

The Other Structures of Fibrillar Collagen

It has been established since the early 1950s that collagen is truly polymorphic (Gross et al. 1954; Schmitt et al. 1953). In addition to the D-periodic fibrils and SLS, there are two other frequently observed structures: the fibrous long-spacing assemblies, or FLS, and the obliquely striated periodic fibrils (Fig. 16.7) (Bruns 1976; Highberger et al. 1950; Schmitt et al. 1953). The FLS, which include up to four different subtypes categorized as FLS I–IV, are formed by staggered TC, like D-periodic fibrils, but have a different periodicity ranging from 90 to 250 nm (Doyle et al. 1975). The fine banding structures of FLS in positively stained micrographs are nonpolar, with a symmetric appearance. Since the banding patterns of TC and SLS are polar, a symmetric, nonpolar banding pattern points to an antiparallel arrangement of TC and/or slender SLS a few nm in diameter. The formation of FLS requires the presence of other flexible, often negatively charged biomacromolecules such as chondroitin sulphate, serum glycoprotein, various mucopolysaccharides, and proteoglycans (for a summary, see Doyle et al. 1975). The staggered arrangement was related to the bridging effects of the negatively charged molecules. In some cases, these molecules are located at fairly specific sites in the structure. The possible arrangements of TC in FLS are not unique, giving rise to different periodicities; some FLS show a combination of different staggering arrangements (Doyle et al. 1975) (Fig. 16.7a and b).
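
The reasoning that a symmetric band pattern implies an antiparallel arrangement can be checked with a minimal sketch: adding a polar staining profile to its own mirror image always yields a pattern that reads the same in both directions. The profile values are hypothetical, and the sketch ignores the stagger between molecules that is present in real FLS.

```python
def antiparallel_sum(profile):
    """Superimpose a staining profile on its mirror image.

    This mimics molecules packed side by side in opposite
    orientations, with no stagger (a simplifying assumption).
    """
    reversed_profile = profile[::-1]
    return [a + b for a, b in zip(profile, reversed_profile)]


# Hypothetical polar profile: it does not read the same in both directions.
polar = [0.9, 0.1, 0.4, 0.0, 0.7, 0.2]
combined = antiparallel_sum(polar)
print(polar == polar[::-1], combined == combined[::-1])  # False True
```

A parallel superposition of the same profile would simply scale it and keep its polarity, which is why a nonpolar pattern points away from an all-parallel packing.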

Fig. 16.7

FLS and oblique fibrils. (a) Positively stained micrograph showing the transition from FLS I (having an overlap of 30–50 nm) to FLS IV (having an overlap of ~127 nm). The diagram shows the axial arrangement of molecules, including the antiparallel arrangement of TC molecules. (b) FLS I showing a nonpolar banding pattern and a periodicity of 200 nm. (c) Oblique fibrils reconstituted from cartilage collagen showing about 15 narrow sub-fibrils, each having the regular 67 nm D-period. The arrow indicates the accumulation of dye molecules between the sub-fibrils. The scale bar is 50 nm. a adapted from (Doyle et al. 1975), b from (Schmitt et al. 1953), c from (Bruns 1976)

The obliquely striated fibrils were observed in reconstituted fibrils from cartilage and in subsections of rat tail tendon (Bruns 1976; Doyle et al. 1975). The molecular arrangement was explained in terms of narrow sub-fibrils having the native D-striation and being ~14 nm (13.8 ± 1.5 nm) in diameter. The oblique pattern emerges from the assembly of the sub-fibrils: they all have the same diameter and orientation but have a regular mutual axial displacement of about 90 Å relative to the adjacent sub-fibrils (Fig. 16.7c). As in the case of FLS, the sub-fibrils can have several different arrangements, including two adjacent sub-fibrils running in opposite directions, giving rise to varied staining patterns; some looked like a “checkerboard,” others had “chevrons.” The observation of such varied assemblies of sub-fibrils highlighted that (1) fibrillogenesis proceeds with the formation of sub-fibrils having similar diameters and the invariant D-striation and (2) the further assembly of the sub-fibrils follows a less specific manner. Whereas the D-stagger is a unique, well-defined structure determined by the specific spacings of the interacting residues (more details in the next section), the interaction sites on the surface of the sub-fibrils (and also on the native collagen fibrils) occur in various patterns that are sensitive to the environment during fibril formation. The arrangement of the sub-fibrils can, thus, be a key factor in the polymorphism of collagen fibrils.

The Determinant Factors for the Self-Assembly of Fibrillar Collagen

One of the key findings about the different structures of collagen is that they are interconvertible depending on the buffer conditions (Highberger et al. 1950). These are, therefore, spontaneously formed structures determined by the properties of collagen triple helices. Why are there different structures of collagen? Is polymorphism a genetic phenotype of fibrillar collagen? The key to answering these questions is to understand the principles that govern the self-assembly of triple helices (TC) into ordered aggregates.

The Origin of the Structures Is in the Amino Acid Sequences

The fact that polymorphic structures all exhibit a particular axial structure indicates that the “functional” groups are not scattered at random on the surface of the helix but acquire a specific, periodic placement along the axis of the helix, such that only in a regular mutual arrangement of neighboring helices will these groups be optimally aligned for interaction and cumulatively contribute to the stability of the assembly (Hulmes et al. 1973; Kaur et al. 2015). The most profound work on identifying the unique placements of interacting residues of collagen in its primary structure was accomplished by Hulmes et al. in the early 1970s. Taking advantage of the newly resolved amino acid sequence of the α1 chain of type I collagen, they developed a computational analysis of the amino acid sequence to study the “complementary relationship” that would explain staggers in multiples of 67 nm. Modeling the triple helix as a one-dimensional linear display of the side chains of the X and Y residues with a uniform helical rise of 2.86 Å per residue, they calculated the interaction curve based on all possible interactions between two associating TC molecules in a particular “chain staggering.” According to the interaction curve, the interactions of both hydrophobic groups and oppositely charged groups between two associating triple helices reach a maximum if the two helices are staggered by 0D, 1D, 2D, 3D, and 4D, where a D is equivalent to a section of the triple helix composed of 234 residues (per single polypeptide chain). In particular, the distribution of the large hydrophobic residues is found to have a regular D-spacing; thus, in a D-staggered arrangement, these residues will be brought into close proximity for interaction. 
The interactions of oppositely charged residues are also maximal in a D-staggered arrangement, although the locations of the charged residues per se do not appear to follow any obvious distribution complementary to the D-period (Hulmes et al. 1973).
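
The idea of scoring every possible stagger between two parallel helices can be illustrated with a minimal sketch. It is a toy version only: the residue classes, the one-point-per-pair scoring, and the short hypothetical sequence below are assumptions made for illustration and are not the scoring scheme or the sequence used by Hulmes et al.

```python
# Toy sketch of a stagger-dependent interaction curve.
# Assumption: this simplified scoring is illustrative only and is not the
# scoring scheme of the published analysis.

HYDROPHOBIC = set("FLIMVY")  # assumption: residues counted as large hydrophobic
POSITIVE = set("KR")
NEGATIVE = set("DE")
RISE_NM = 0.286              # axial rise per residue (2.86 Å, from the text)


def pair_score(a, b):
    """Return 1 for a hydrophobic pair or an oppositely charged pair, else 0."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 1
    if (a in POSITIVE and b in NEGATIVE) or (a in NEGATIVE and b in POSITIVE):
        return 1
    return 0


def interaction_curve(seq, max_stagger):
    """Score each axial stagger (in residues) between two parallel copies of seq."""
    curve = []
    for s in range(max_stagger + 1):
        # Residue i of the shifted helix faces residue i + s of the reference helix.
        score = sum(pair_score(seq[i + s], seq[i]) for i in range(len(seq) - s))
        curve.append(score)
    return curve


# Hypothetical 9-residue Gly-X-Y unit carrying one Leu, repeated four times.
toy = "GPLGPAGPA" * 4
curve = interaction_curve(toy, 30)
peaks = [s for s, score in enumerate(curve) if score > 0]
print("staggers with a non-zero score (residues):", peaks)

# Length of one D expressed in nm: 234 residues at 2.86 Å per residue.
print("1D in nm:", round(234 * RISE_NM, 1))
```

In this toy sequence the only non-zero scores fall at staggers of 0, 9, 18, and 27 residues, that is, at whole multiples of the built-in repeat, and the score decreases as the overlap between the two copies shortens. The last line simply checks that 234 residues at 2.86 Å per residue correspond to about 67 nm.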

This periodic placement of interacting residues provides a framework to understand the D-staggered fibrils as the most stable conformation emerging from the interactions of associating triple helices. Certain aspects of the periodic placements of the residues also provide a basic understanding of the antiparallel arrangement and the interactions with the polyanions in FLS and other self-assemblies (Doyle et al. 1975). What remains unclear, however, is the structure of SLS and especially SLS as a potential competing conformation with the D-staggered assemblies. Any molecular interactions that stabilize the D-staggered triple helices during the self-assembly must also be present in the SLS structure. In fact, those interactions should reach the maximum in the 0D staggered SLS (Hulmes et al. 1973). Clearly, there are other molecular interactions and/or structural constraints that discriminatorily favor D-period fibrils over SLS under physiological conditions that were not included in the calculations of the interaction curve.

Interestingly, direct measurement of the forces between the triple helices in the D-periodic collagen fibrils using the osmotic stress method revealed that the fibrils are predominantly stabilized by hydrophilic interactions (Leikin et al. 1993, 1994, 1995). These hydrophilic interactions were interpreted as a water-mediated H-bond network. Indeed, the collagen triple helix is a highly hydrated conformation. A thick shell of water is a common feature in the crystal structures of triple helical peptides (Bella et al. 1994; Kramer et al. 1999), including the structure of peptide (PPG)10, which does not have any polar and/or charged residues (Bella et al. 2006; Vitagliano et al. 2001; Berisio et al. 2002). A large number of water molecules were observed bound to the carbonyl groups on the peptide backbone, forming water bridges that link peptide chains in a triple helix and between different triple helices in the crystal lattice (Kramer et al. 2001; Bella et al. 1994). The water-covered surface may also contribute to the shielding of the charged side chains and minimize the repulsion of like charges in the fibrils (Hulmes et al. 1973). While the H-bonds may account for the explicit attractive force stabilizing the fibrils, how they contribute to the structural specificity of the fibrils remains unclear. Since hydrophilic-to-hydrophobic contacts are energetically very unfavorable, the water-covered clusters of polar and/or charged residues may orient with each other to avoid the clusters of hydrophobic residues, and thus inadvertently bring the regularly spaced hydrophobic residues into register.

The self-assembly of the TC is essentially a protein folding problem, since the triple helix “wears” the amino acid sequence of its constituent polypeptide chains on the surface of the molecular rod. Yet, the predominant role of hydrophilic interactions in the stability of the fibrils appears to set fibrillogenesis apart from the folding of globular proteins, which is nearly always driven by hydrophobic residues being pushed together in a process known as hydrophobic collapse (Leikin et al. 1995). One of the major impacts of the hydrophobic collapse in globular proteins is to reduce the conformational space for folding and to funnel the interactions between different parts of the peptide chain into a steep energetic descent to reach the native conformation (Dill and Chan 1997; Dobson et al. 1998; Dobson 2003; Onuchic and Wolynes 2004; Rollins and Dill 2014). The extended, rather rigid conformation of the triple helix suggests a rather restricted conformational space during the self-assembly and, consequently, a different folding pathway: one along which the specific D-staggering structure emerges from the cumulative effects of optimized interactions of residues in a D-period related regular, periodic placement. At the same time, it could also imply that multiple conformations may have comparable stability and coexist to a certain extent.

A Design Strategy Based on the Periodic Spacing of the Interacting Residues

The ultimate test of the effectiveness of the periodically spaced residues to direct the self-assembly of the triple helices into a unique structure comes from the studies using designed triple helical peptides. The major findings of the work by Hulmes et al. were incorporated into the design of triple helical peptides that can further self-associate to form fibrils having a D-period-like axial repeating structure (Fig. 16.8) (Chen et al. 2019; Strawn et al. 2018; Kaur et al. 2015; Xu and Kirchner 2021). These designed, self-assembled mini-fibrils exhibit a clear axially repeating structure of 35 nm when examined using TEM and AFM. This 35 nm repeating structure, which is designated as a d-period, consists of a 0.3d overlap region and a 0.7d gap (Fig. 16.8a–c, and g). The d-periodicity of the mini-fibrils comes from a built-in d-spacing of both the hydrophobic and hydrophilic residues in the primary structure of the peptides that is created by using multiple pseudo identical amino acid sequence units placed in tandem (Fig. 16.9d). Each sequence unit contains 123 residues forming a section of triple helix about 35 nm in length. Thus, by mutually staggering one sequence unit, the hydrophobic and hydrophilic residues in the associating helices will be in register in a manner similar to those in the D-staggered native collagen fibrils. The same interaction curve calculated according to Hulmes et al. also revealed a set of interaction peaks when the triple helices are staggered by 0d, 1d, 2d, and 3d (only 0d to 2d staggering are possible in peptides containing only two repeating sequence units) (Fig. 16.9a). Thus, the hydrophobic and charge-based interactions are all optimized in a unit-staggered arrangement. In addition to the triple helix domain consisting of repeating sequence units, the designed peptides also have an “overhang” about 0.3d in size which accounts for the overlap region in the staggered fibril assembly. 
The “overhang” region consists of a foldon domain, included as the nucleation domain for the triple helix folding, and a short stretch of linker peptide at the N- and C-termini, which are included to increase the stability of the triple helix. Thus, by staggering one sequence unit (equivalent to 1d or 123 residues), the self-assembly of triple helices gives rise to a regular d-periodicity.
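
The relation between the size of a sequence unit and the observed d-period can be checked with a short calculation. The sketch below assumes that the designed peptides have the same axial rise per residue as the native triple helix (2.86 Å); the function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the d-period of the designed mini-fibrils.
# Assumption: the rise per residue of the designed triple helix equals the
# 2.86 Å value used for native collagen.

RISE_NM = 0.286  # axial rise per residue in nm


def axial_length_nm(n_residues, rise_nm=RISE_NM):
    """Axial length of a triple-helix segment of n_residues per chain."""
    return n_residues * rise_nm


d_nm = axial_length_nm(123)   # one sequence unit of 123 residues
overlap_nm = 0.3 * d_nm       # overlap region reported as 0.3d
gap_nm = 0.7 * d_nm           # gap region reported as 0.7d

print(f"d-period: {d_nm:.1f} nm")
print(f"overlap (0.3d): {overlap_nm:.1f} nm")
print(f"gap (0.7d): {gap_nm:.1f} nm")
```

Under this assumption, 123 residues span roughly 35 nm, in line with the repeat observed by TEM and AFM, with the overlap and gap taking up about 11 nm and 25 nm of each period.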

Fig. 16.8

The designed mini-fibrils. (a–c) Negative staining of the mini-fibrils showing the d-period as alternating dark and light bands every 35 nm. The scale bars are 100 nm in a and c, 200 nm in b. (d–f) Positive staining of the mini-fibrils showing the three pairs of thick-thin bands every 125 nm. The scale bars are 200 nm (100 nm in the insert of f). (g) AFM image of the mini-fibrils; the scale bar is 300 nm. (h) Positive staining of seed-oligomers in the preparation of the mini-fibrils under fibril-forming conditions. The yellow bars are 125 nm in length, marking the size of one triple helix; the red bars are 200 nm in length, equivalent in size to triple helices in a 2d-stagger. a–g from (Kaur et al. 2015); h: Kaur, personal communication

Fig. 16.9

The unit-staggered assembly of the mini-fibrils. (a) The unit-staggered arrangement of peptides having 3 or 2 identical sequence units (U1-U3); a schematic drawing of the negatively stained mini-fibrils is shown with alternating black (gap) and white (overlap) regions. (b) The interactions of charged residues in register (red: Glu or Asp, blue: Arg or Lys, green: Thr). (c) Purported packing of the foldon domain in the mini-fibril. (d) The sequence architecture of a designed peptide having three identical sequence units (GPP4-(Col-domain)) shown as blue rectangles; grey circle: foldon domain, orange bars: the linker regions. a–c adapted from (Strawn et al. 2018)

The mini-fibrils appear to capture the other major feature of a collagen fibril: the SLS-like alignment of the charged residues within the sequence units. The positively stained mini-fibrils revealed a different set of finer banding patterns, which coincide with the distribution of the charged residues in the sequence unit (Fig. 16.8d, e and f). The finer banding of the mini-fibrils reflects the in-register nature of charged residues in the neighboring triple helices in a unit-staggered arrangement. The positively stained banding pattern is resolved into a pair of dark thick-thin bands marking the two clusters of charged residues in one sequence unit. For the peptide containing three sequence units in its full length of ~120 nm, the clusters of the charged residues can be observed clearly as three pairs of thick-thin bands every 100–120 nm or so on the micrographs. Because of the lower TEM resolution associated with the thinner fibrillar structures, the thick-thin bands appear less well resolved than the SLS fingerprint of native collagen. Nevertheless, they are the manifestation of an in-register alignment of the charged and the hydrophobic residues and are easily identifiable. The fact that both the negatively stained and the positively stained structures of the mini-fibrils share the same features as those of native collagen fibrils implies that the same molecular mechanism directs the self-assembly of both the native collagen fibrils and the mini-fibrils.

In a few cases, we also observed SLS-like assemblies of the designed peptides under the conditions of fibril formation (Fig. 16.8h). These slender SLS-like assemblies are ~125 nm long but much thicker than an individual helix, which has a diameter of ~1.0–1.5 nm. The three pairs of positively stained thick-thin bands are visible as black dots. There are other, larger positively stained aggregates on the same micrograph. These larger aggregates all have the discernible three pairs of thick-thin bands in every ~120 nm. Their sizes fell into two groups, ~150 nm and ~200 nm, which correspond well with triple helices in a 1d or a 2d staggered arrangement, respectively. We therefore consider them the “seed” intermediates in the early stages of fibril assembly (they are termed seed-oligos). The presence of the slender SLS-like aggregates together with the seed-oligos is consistent with the “precursor” role of the SLS during fibrillogenesis. As was the case for the native slender SLS reported by Bruns et al., the SLS-like structures are observed at neutral pH with no addition of ATP or other polyanions. Also similar to the native slender SLS, the SLS aggregates did not accumulate but gave way to the mini-fibrils. For the designed peptides, the conformational bias for d-staggered mini-fibrils was understood in terms of the steric hindrance associated with the foldon domain (Chen et al. 2019). The diameter of the globular foldon domain is twice the diameter of a triple helix, and it is tethered to the triple helix via a rather stiff linker: a heptapeptide GSGPCCG in which the two Cys are expected to form one or two interchain disulfide bonds (Fig. 16.9c and d). Being so bulky and stiff, the foldon domains prohibit the bouquet-like structure of the PC-SLS (having the C-propeptide attached).
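
The assignment of the two size groups to 1d and 2d staggers can be made concrete with a simple estimate: a set of parallel rods of equal length that are mutually displaced along the axis spans one rod length plus the total displacement. The sketch below uses the ~125 nm molecule length and the 35 nm d-period given above; the function name is illustrative, and the estimate ignores any measurement uncertainty in the micrographs.

```python
# Rough estimate of the axial span of staggered seed-oligomers.
# Assumptions: rigid rods of equal length, parallel, displaced by whole d units.

MOLECULE_NM = 125  # approximate length of one designed triple helix (from the text)
D_PERIOD_NM = 35   # d-period of the mini-fibrils (from the text)


def staggered_span_nm(molecule_nm, period_nm, total_stagger_units):
    """Axial span of molecules displaced by total_stagger_units periods overall."""
    return molecule_nm + total_stagger_units * period_nm


span_0d = staggered_span_nm(MOLECULE_NM, D_PERIOD_NM, 0)  # SLS-like, in register
span_1d = staggered_span_nm(MOLECULE_NM, D_PERIOD_NM, 1)
span_2d = staggered_span_nm(MOLECULE_NM, D_PERIOD_NM, 2)

print(span_0d, span_1d, span_2d)
```

With these inputs the estimate gives 125 nm for an in-register, SLS-like aggregate, about 160 nm for a 1d stagger, and about 195 nm for a 2d stagger, which is in the neighborhood of the ~150 nm and ~200 nm groups seen on the micrograph.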

Both the interaction curve and the in vitro studies using designed peptides further indicate that the SLS can potentially be a stable assembly of triple helices. There is nothing “artificial” about the molecular interactions giving rise to the SLS structure. One possible destabilizing factor of SLS at neutral pH is perhaps the repulsion of the like charges which are maximally aligned in the D-staggered arrangement. It remains to be fully evaluated how such repulsion forces are mediated in the D-periodic fibrils. The “artificial” conditions under which the SLS were precipitated in large sizes only further stabilized this structure and perhaps at the same time discouraged the D-staggered fibrils as a competing conformation. The slender size of SLS under physiological conditions makes them difficult to observe in vivo in thin sections or other tissue preparations. A full understanding of their physiological roles will require more precise insights into why and how the SLS eventually gets incorporated into the fibrils in the ECM.

The Recent Works on SLS and FLS in Tissues

Structures other than D-periodic fibrils have been reported in many normal and atypical tissue samples. The identification of the SLS is often based on two features: (1) a dimension close to the length of a TC monomer and (2) a fine, closely spaced, polar striation. Other fibrillar structures having noncanonical banding patterns were also found and were presumed to be FLS. With the limited resolution, however, it is often difficult to discern which collagen(s) are involved in the SLS or FLS formation. The various observed collagen-banding periodicities could also be affected by the incorporation of different glycosaminoglycans or by over-glycosylation (Rehany et al. 2000; Waltimo 1996; Kobayasi et al. 1985; Wen and Goh 2006). SLS and FLS formation may be due to an over-secretion of collagen, or it may be a by-product of collagen degradation (Kobayasi et al. 1985; Pérez-Tamayo 1972; Kamiyama 1982; Dingemans and Teeling 1994; Slavin et al. 1985). Type VI collagen was found to form banded fibrils with a periodicity of around 110 nm, and some FLS observed by others may be type VI collagen (Miki et al. 1993; Bruns et al. 1986); whether FLS formation is a by-product of tissue remodeling is still debatable. FLS has been observed in arteries bound to lipids, in malignant mesothelioma where a high content of hyaluronic acid was detected, around arteries in polyp biopsy samples, in intramuscular nerves, and in fatty tissues, to name a few (Kang et al. 2009; Morris et al. 1978; Maeda et al. 1996; Kagoura et al. 2001). It seems that in tissues, collagen may assume various structures; they may provide some unknown function or just be a by-product of their environment. Only future research will enlighten us. The clinical involvement of abnormal collagens was reviewed by Eyden and Tzaphlidou (2001). The following is a list of some more recent publications on the subject that were not included in that review (Table 16.1). 
The frequently reported abnormal collagens are mostly FLS, which is larger, fibrous-like, and easier to identify; direct observation of SLS remains rare and limited.

Table 16.1 FLS and SLS in tissues

Concluding Remarks

In our quest to understand the involvement of collagen in the function and development of tissues, the great achievements of recent years in the study of the structures of collagen fibrils, of the collagen triple helix, and of collagen receptors, as well as in the understanding of the interactions between collagen and other macromolecules, are only the tip of the iceberg. The complexity related to the crucial roles of collagen is manifested at many levels: the polymorphism of the supramolecular structure, the heterotypic nature of collagen fibrils, the extensive interaction with a broad range of other macromolecules, and the progressive changes in the structure and composition of the ECM during normal development and tissue repair. SLS can potentially be a critical intermediate between procollagen and the molecular organization of collagen in the ECM. Understanding how structural differences between SLS and the D-periodic fibrils affect their interactions with collagen receptors and/or other macromolecules such as collagenase will provide insight into the innumerable and often contradictory findings about the complex roles of collagen in tissues. Such knowledge may also lead to new treatments for a wide range of connective tissue conditions, ranging from aging to cancer metastasis. Studies of SLS in vivo will require experiments that can capture a structure that is much smaller in size than the canonical D-periodic collagen fibrils and is transiently present in low abundance. Designed peptides that can form higher order molecular assemblies of the triple helix can be a useful tool in such an effort. SLS is only one of the variations of the possible macromolecular assemblies of fibrillar collagen in the ECM. The multiplicity of the molecular organizations of collagen makes it a very complex system to study, but also a fascinating one.