
1 Introduction

The extracellular matrix (ECM) is a complex protein network which not only provides structural support for adherent and migrating cells, but also supplies important mechanical and biochemical cues for cell phenotypes and functions including stem cell fate [114]. The ECM is classified into two major structural compartments. (1) Basement membranes (BM) are a specialized form of ECM that appear as thin and dense acellular sheets underneath epithelial and endothelial cell layers, providing structural integrity and mechanical support to these layers and conferring cell polarity [56, 114]. They are mainly composed of the network-forming type IV collagen, laminins, nidogens and the proteoglycan perlecan. (2) In contrast, the interstitial ECM typically surrounds cells completely, e.g., fibroblasts residing in the lung interstitium, and its main components are the fibrillar type I and type III collagens, fibronectin, decorin, and hyaluronan [87] (Fig. 11.1a). Both ECM compartments undergo fundamental changes in lung disease which directly cause loss of lung function, increase susceptibility to inhaled toxicants and respiratory pathogens, and may strongly affect therapeutic efficacy [17, 123].

Fig. 11.1
Schematic diagram of the ECM in the lung. Panel (a) shows stacked layers of endothelium, BM, IM with fibroblasts, BM, and epithelium with ciliated, goblet, club, and basal cells. Panel (b) is a pie chart of ECM composition in which collagens make up the largest share and secreted factors the smallest.

Types and composition of the extracellular matrix (ECM) in the lung. (a) Schematic overview of types of ECM in the lung in relation to the epithelial and endothelial cell layers, focusing on the conducting airways. Epi, bronchial epithelium displaying all major cell types; BM, basement membrane; IM, interstitial matrix harboring scarce fibroblasts; Endo, endothelial cell layer; V, vessel lumen. Smooth muscle cells and cartilage are omitted for increased clarity. (Figure was created with BioRender.com). (b) ECM composition in the lung according to the matrisome categorization established by Naba et al. [77]. (Adapted from Beachley et al. [9])

Collagen is a large protein superfamily and the main ECM protein component in almost every tissue type including the lung [9] (Fig. 11.1b). The unifying feature of collagens is the triple-helical collagenous domain, which is assembled in the endoplasmic reticulum (ER) from three α-chains (Fig. 11.2) consisting of regular amino acid repeats of (Gly-X-Y)n, where Y often is 4-hydroxyproline. There are 28 different human collagen types which form homo- or heterotrimeric triple helices during folding in the ER and which are categorized in seven different classes, based on their final extracellular supramolecular assembly: (1) Fibril-forming collagens (I, II, III, V, XI, XXIV, XXVII), (2) fibril-associated collagens with interrupted triple helices (FACITs, IX, XII, XIV, XVI, XIX, XX, XXI, XXII), (3) network-forming collagens (IV, VIII, X), (4) transmembrane collagens (XIII, XVII, XXIII, XXV), (5) endostatin-producing collagens (also termed multiplexins, XV, XVIII), (6) anchoring fibrils (VII), and (7) beaded-filament-forming collagen (VI). The collagen types XXVI and XXVIII do not fit well in any of the above-listed categories [83].
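The (Gly-X-Y)n repeat described above can be located in a one-letter amino acid sequence with a simple pattern search. The following is only a minimal sketch: the function name, the `min_triplets` threshold, and the toy sequence are assumptions made for illustration, and the pattern does not distinguish hydroxylated from unmodified proline.

```python
import re

# Two or more consecutive Gly-X-Y triplets (X and Y can be any residue).
GLY_XY_RUN = re.compile(r"(?:G[A-Z]{2}){2,}")

def collagenous_stretches(sequence, min_triplets=5):
    """Return (start, end, n_triplets) for each uninterrupted (Gly-X-Y)n run.

    min_triplets is an arbitrary illustrative cut-off, not a published value.
    """
    hits = []
    for match in GLY_XY_RUN.finditer(sequence.upper()):
        n_triplets = len(match.group()) // 3
        if n_triplets >= min_triplets:
            hits.append((match.start(), match.end(), n_triplets))
    return hits

# Toy sequence (invented): a short non-collagenous flank around six triplets.
toy_sequence = "MKT" + "GPPGERGAP" * 2 + "LLW"
stretches = collagenous_stretches(toy_sequence)
```

Real collagen sequences contain interruptions of the repeat (for example in FACITs), so such a scan would report several separate stretches per chain.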

Fig. 11.2
Schematic diagram. The collagen gene is transcribed into mRNA in the nucleus. The mRNA is translated into protein at the ER. The terminal propeptides are cleaved. The resulting fibrils are cross-linked in the extracellular space, and fragments are degraded in the phagolysosome.

Overview of the most important intracellular and extracellular collagen biosynthesis, maturation, and degradation pathways. Using the example of an activated myofibroblast and biosynthesis of fibrillar collagen, the figure depicts the following major steps of this pathway: (1) Collagen gene transcription to mRNA in the nucleus; (2) Translation of mRNA to protein, co- and post-translational modifications and triple helix formation in the rough endoplasmic reticulum (rER); after secretion to the extracellular space via trafficking through the Golgi network, (3) propeptide cleavage by specific N- and C-terminal collagen proteinases, yielding (4) propeptides which may serve as peripheral markers for collagen formation. Propeptide cleavage triggers (5) fibril formation, followed by (6) extracellular crosslinking and stabilization and further maturation of the resulting fibers. (7) Extracellular degradation is performed by collagenolytic matrix metalloproteases (MMPs) and generated fragments, but also larger fibrils, can be internalized and (8) degraded within the cell in the phagolysosome. (Figure was created with BioRender.com)

Because of the central role of the ECM and collagen in lung disease, quantification, determination of molecular properties, and visualization of the three-dimensional structure of collagen are important for the development and characterization of translational models in lung research. For in vitro, ex vivo, and in vivo models of lung fibrosis, collagen quantity and crosslinking remain the most important readouts for evaluation of protection from disease [1, 43, 74]. In this chapter, we aim to give a comprehensive overview of the various methodologies currently available for quantification and characterization of collagen, including their advantages and disadvantages.

2 Quantification of Collagen

Methods for the identification and quantification of collagen and its subtypes have been continuously developed for more than a century. These methods differ considerably in specificity and sensitivity [12, 24, 81]. Many current methods for quantification of collagen take advantage of collagen-specific properties such as hydroxyproline content and affinity of dyes to the triple helical domains, and therefore determine total collagen content. Quantification of specific collagen types is currently only possible by transcript analysis, immuno-based approaches, or mass spectrometry-based approaches. Considering the numerous post-transcriptional regulation events involved in collagen biosynthesis and maturation (Fig. 11.2), transcript analysis, although informative on the regulatory level of collagen expression changes, can never suffice as a readout on its own. Immuno-based methods for the specific detection of collagen types, on the other hand, often suffer from insufficient specificity, and few reliable specific antibodies for such applications exist. Finally, mass spectrometry-based proteomic assessment of the ECM allows for a comprehensive assessment of all collagen types and chains in the same sample and requires only small sample amounts when state-of-the-art tandem mass spectrometry instruments are used. Unfortunately, this methodology is expensive and not available to all laboratories.

Solubility of collagen needs particular consideration for quantification of collagen. While intracellular, immature, and newly generated collagen are neutral-salt soluble, mature collagen fibers with many intramolecular cross-links are insoluble in conventional protein extraction buffers, require special solubilization protocols, and are typically dissolved in acetic acid for molecular analysis or usage as culture scaffold [54, 89]. Following protein extraction, chemical or enzymatic digestion of the insoluble pellet will increase collagen coverage [39, 75]. Pepsin is the most widely used protease in that context, as it cleaves fibrillar collagens in the telopeptide regions, hence removing the lysyl oxidase-mediated crosslinks but leaving the triple-helical stretch untouched [28]. Notably, efforts should always be made to include this insoluble part and/or assess secreted or deposited extracellular collagen, none of which is captured by conventional protein extraction protocols. It is important to acknowledge that an increase in intracellular collagen does not necessarily reflect an increase in extracellular collagen—actually, the opposite may be the case if collagen secretion is impaired, and collagen accumulates in the ER.

2.1 The Sircol Assay

The Sircol assay is a fast and simple colorimetric method based on binding of Sirius Red F3B to collagen [59]. The binding specificity of Sirius Red relies on its elongated structure which associates with triple helical collagen along the linear axis and exposes numerous acid sulfonate groups which interact with basic residues in the collagen sequence [60]. Following appropriate solubilization, this method can in principle be used to determine all pools of collagen in complex protein solution in the context of in vitro and in vivo experiments. We find it particularly suitable for determination of newly synthesized collagen content in the cell culture supernatant as a readout for in vitro experiments [102]. However, caution must be taken to use serum-starved culture settings, as serum components are known to interfere with the assay [21, 59]. Application of Sirius Red for collagen visualization in situ will be described in Sect. 11.4.2.
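As with other colorimetric assays, absorbance readings are typically converted to collagen amounts through a standard curve. The sketch below assumes a linear response over the standard range and uses invented absorbance values; the function names are hypothetical and the numbers are not taken from any kit protocol.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def collagen_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate collagen (same unit as standards)."""
    return (absorbance - intercept) / slope

# Invented standards: micrograms of collagen and corresponding absorbance.
standards_ug = [0.0, 5.0, 10.0, 20.0, 40.0]
standards_abs = [0.05, 0.15, 0.25, 0.45, 0.85]

slope, intercept = fit_line(standards_ug, standards_abs)
sample_ug = collagen_from_absorbance(0.35, slope, intercept)
```

Samples reading outside the range of the standards should be diluted and re-measured rather than extrapolated, and the blank should match the sample matrix (e.g., serum-free medium for culture supernatants).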

2.2 Hydroxyproline Quantification

The amino acid 4-hydroxyproline (4-Hyp) occurs in high abundance in triple-helical collagenous domains where it frequently occupies Y positions of the above mentioned (Gly-X-Y)n repeats. 4-Hydroxylation of proline is catalyzed in a co- or post-translational fashion by ER-resident prolyl-4-hydroxylases which act on the unfolded polypeptide chain [83]. Presence of 4-Hyp is known to increase the thermodynamic stability of collagen [90]. Although widely considered specific for collagen, it should be mentioned that 4-Hyp also occurs in other proteins. For instance, it has been estimated that up to 33% of the about 90 proline residues in elastin can be hydroxylated [96]. Furthermore, a single 4-Hyp in the hypoxia-inducible factor (HIF) α subunit (HIF-1α) acts as oxygen sensor and plays an important role in the regulation of gene expression by hypoxia [42]. More such examples may exist; nevertheless, considering that collagen is much more abundant than other 4-Hyp containing proteins and the exceptionally high abundance of 4-Hyp in collagen, 4-Hyp quantity can still be considered a reasonable measure for total collagen content. In addition, 3-Hyp is also present in collagens but it is found in much less abundance than 4-Hyp [41].

Collagen quantification by the hydroxyproline assay is particularly suitable for insoluble and solid samples such as mature collagen in tissue or insoluble pellets following protein extraction from complex samples. Samples are completely hydrolyzed by boiling in 6 M hydrochloric acid for several hours and subsequently subjected to amino acid analysis, including the quantification of 4-hydroxyproline, using high-performance liquid chromatography (HPLC) [54]. This method is still considered the gold standard for measuring hydroxyproline content, although it is not particularly sensitive, requires large sample sizes, and depends on an HPLC set-up that is not available in every laboratory [24]. LC-MS/MS methods for quantitation of 4-Hyp are more sensitive and, given the popularity of LC-MS instrumentation, are becoming more accessible in research centers. Notably, colorimetric alternatives are increasingly being offered by suppliers and appear to yield similar results [88].
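Measured hydroxyproline is usually converted to an estimate of total collagen by dividing by the assumed mass fraction of hydroxyproline in collagen. The sketch below uses 13.5% (w/w) as a default; this figure is an assumption for illustration only, since the actual fraction varies with collagen type, tissue, and species, and should be taken from the literature relevant to the sample.

```python
def collagen_mass_from_hyp(hyp_ug, hyp_mass_fraction=0.135):
    """Estimate collagen mass (ug) from hydroxyproline mass (ug).

    hyp_mass_fraction is an ASSUMED w/w fraction of hydroxyproline in
    collagen; adjust it for the tissue and species under study.
    """
    if not 0.0 < hyp_mass_fraction < 1.0:
        raise ValueError("hyp_mass_fraction must be between 0 and 1")
    return hyp_ug / hyp_mass_fraction

def collagen_per_dry_weight(hyp_ug, dry_weight_mg, hyp_mass_fraction=0.135):
    """Normalize the collagen estimate to tissue dry weight (ug per mg)."""
    return collagen_mass_from_hyp(hyp_ug, hyp_mass_fraction) / dry_weight_mg

estimate_ug = collagen_mass_from_hyp(13.5)          # 13.5 ug Hyp in hydrolysate
estimate_per_mg = collagen_per_dry_weight(13.5, 4)  # from 4 mg dry tissue
```

Because other proteins such as elastin also contain 4-Hyp, the result is best read as an upper-bound estimate of total collagen.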

2.3 Immuno-Based Methods

Immuno-based methods such as enzyme-linked immunosorbent assays (ELISAs) and Western blotting have the potential to allow for specific detection of collagen types and their absolute or relative quantification. Given that it is becoming increasingly clear that collagens fulfill very different roles in health and disease, collagen-type-specific approaches must be considered much more often. For instance, a mass spectrometry-based analysis of TGF-β-induced changes in lung fibroblast-deposited ECM has shown the expected upregulation of the fibrillar type I collagen, but at the same time decreases in type VI collagen, and more variably altered or unchanged levels of other collagen types. Hence, even in such a simple in vitro model of lung fibrosis, changes in collagen go far beyond a simple overall increase in levels and affect collagen types differently [74]. Indeed, while fibrillar collagen types I and III are consistently increased in lung fibrosis, levels of other collagen types remain unchanged or are downregulated during fibrogenesis; others again are variably regulated depending on disease stage or anatomic location [52]. These observations undoubtedly call for more collagen-type-specific assessments to increase our understanding of the distinct roles of different collagen types in disease. However, immuno-based approaches are limited by the availability of specific antibodies and, unfortunately, inadequate validation of antibodies and poor specificity remain a major issue in the biomedical research community [30]. Only well-validated antibodies should be used, and proper controls should be included to ensure specificity. Special attention must be paid to the immunogen used for raising the antibody: some antibodies are specifically raised to target propeptide sequences and hence will only detect intracellular and immature collagen. Others are directed against specific chains of a collagen type or against three-dimensional epitopes of fully assembled extracellular mature collagen types. Hence, some antibodies will require sample denaturation while others will most specifically detect the native fold, and sample selection and processing must be adjusted accordingly.

Finally, tandem mass spectrometry (MS/MS) allows for simultaneous assessment of all collagen chains and types in a single run with comparatively little sample, provided that a state-of-the-art instrument is used. In addition, the data generated also enable determination of collagen chain stoichiometries and site-specific identification of post-translational modifications in distinct collagen types [8, 74]. Clearly, it is an expensive technology, but at the same time it is by far the most powerful technique, offering information about collagen in unprecedented molecular detail. Therefore, Sect. 11.3 provides guidance on the use of LC-MS/MS-based proteomics approaches for the assessment of collagen (Table 11.1).

Table 11.1 Methodologies for the quantification of collagen in crude samples from in vitro, ex vivo, and in vivo models of lung research

3 Mass Spectrometry Characterization of Collagen

3.1 Assessment of Collagens in Proteomics Analyses of Pulmonary ECM

The study of ECM proteins may provide important insights about the molecular mechanisms underlying disease including lung fibrosis. The advent of proteomics has propelled the study of the so-called matrisome, a term coined by Naba et al., which catalogued both “core” and “associated” proteins present in the ECM of many tissues [77]. Although in the beginning proteomics used other biochemical techniques such as two-dimensional electrophoresis, nowadays it relies almost exclusively on liquid chromatography tandem mass spectrometry (LC-MS/MS), an analytical technique that combines the power of liquid chromatography to separate complex peptide mixtures and the high sensitivity of modern high-resolution mass spectrometers. Most ECM proteomics studies have used a “bottom-up” approach in which proteins are first digested into peptides before LC-MS/MS analyses. The peptide’s amino acid sequences are identified by database search engines where tandem mass spectra predicted from a protein database are compared to MS/MS spectra obtained experimentally. Identified peptide sequences are assembled into proteins using bioinformatics tools. Thus, unlike collagen-specific methods mentioned above, mass spectrometry-based proteomics is a powerful technological platform that allows not only the assessment of hundreds of proteins, including the different collagen types, at once, but also the characterization of detailed biochemical features such as posttranslational modifications and/or chemical changes of proteins.
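The first step of the bottom-up approach described above, digestion of proteins into peptides, is mirrored computationally when a search engine predicts peptides from a protein database. A minimal sketch of an in-silico tryptic digest is given below, assuming the common rule that trypsin cleaves C-terminal to lysine or arginine unless the next residue is proline; the function name and parameters are illustrative and do not correspond to any particular search engine.

```python
import re

def tryptic_digest(protein, missed_cleavages=0, min_length=1):
    """Return tryptic peptides of a one-letter protein sequence.

    Cleavage rule (a simplification): after K or R, but not before P.
    """
    # Zero-width split at every allowed cleavage site; drop empty pieces.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein.upper()) if p]
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides

# Toy collagen-like sequence (invented for illustration).
fully_cleaved = tryptic_digest("GPKGERGPPGK")
with_missed = tryptic_digest("GPKGERGPPGK", missed_cleavages=1)
```

Allowing missed cleavages is often useful for collagen, because modified lysine residues can hinder cleavage at those sites.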

The original matrisome studies involved the sequential extraction of several cellular fractions until a fraction enriched in insoluble ECM proteins was obtained and analyzed by LC-MS/MS. An example of a general workflow is given in Fig. 11.3. Notably, these studies also developed an in-silico human matrisome database composed of “core” and “associated” matrisome genes that facilitated the identification and classification of matrisome proteins present in different tissues. The “core matrisome” comprises 278 genes (274 in the mouse) encoding ECM glycoproteins, collagens, and proteoglycans. In addition, 778 genes are cataloged as “matrisome-associated proteins,” comprising ECM regulators and modifiers and secreted factors [77]. In these studies, the protein composition of the lung ECM was investigated using LC-MS/MS. Following sequential extraction, insoluble ECM-rich samples were obtained and solubilized in high urea followed by reduction, alkylation, and deglycosylation. ECM proteins were digested with trypsin and the resulting peptides were further separated and purified by off-gel electrophoresis. ECM samples were analyzed by LC-MS/MS and the resulting spectra were searched against the mouse database. The study revealed that the murine lung ECM comprises 143 total matrisome proteins: 92 core matrisome proteins and 51 matrisome-associated proteins [77]. Notably, 43 collagen gene products were identified in the lung samples.
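Classifying identified proteins into matrisome divisions and categories amounts to a lookup against the in-silico matrisome list. The sketch below uses a tiny hand-written subset of gene symbols as a stand-in for the full database; the dictionary and function names are hypothetical, and a real analysis would load the complete published annotation instead.

```python
from collections import Counter

# Tiny illustrative subset; NOT the full matrisome database.
MATRISOME_SUBSET = {
    "COL1A1": ("core", "collagens"),
    "COL4A1": ("core", "collagens"),
    "FN1": ("core", "ECM glycoproteins"),
    "DCN": ("core", "proteoglycans"),
    "LOX": ("associated", "ECM regulators"),
    "TGFB1": ("associated", "secreted factors"),
}

def summarize_matrisome(identified_genes, annotation=MATRISOME_SUBSET):
    """Count identified genes per matrisome division and category."""
    divisions = Counter()
    categories = Counter()
    for gene in identified_genes:
        entry = annotation.get(gene.upper())  # also accepts mouse-style symbols
        if entry is None:
            continue  # not annotated as matrisome in this subset
        division, category = entry
        divisions[division] += 1
        categories[category] += 1
    return divisions, categories

divisions, categories = summarize_matrisome(["Col1a1", "Fn1", "Lox", "Actb"])
```

Upper-casing the symbols is a convenience for this toy example only; proper cross-species mapping requires an orthology table.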

Fig. 11.3
Workflow of sample preparation. ECM-enriched samples are extracted from lung tissue or cultured cells and subjected to denaturation, reduction, and alkylation to generate ECM peptides. The peptides undergo LC-MS/MS analysis, and the resulting data are evaluated by bioinformatics and shown as graphs.

General workflow of sample preparation for extracellular matrix proteomics. Decellularization of the tissues or cultured cells enhances detection of ECM proteins. Many strategies have been developed to take advantage of the insoluble nature and enrich for ECM proteins using detergent mixtures that dissolve lipid membranes and allow the removal of soluble proteins. The enriched ECM proteins are denatured, reduced and alkylated to facilitate digestion into peptides by enzymatic or chemical protocols. The peptide mixture is fractionated by different means, typically liquid chromatography, before detection and quantification by high-resolution mass spectrometry. (Figure is created with BioRender.com)

In addition to learning about the composition of normal tissue ECM, LC-MS/MS proteomics studies provide an effective platform for the characterization of ECM proteins from diseased lungs, which may give cues into molecular pathways underlying fibrotic disease. For instance, a comprehensive dynamic proteomics effort to characterize changes in ECM protein biosynthesis during bleomycin-induced lung fibrosis was undertaken [27]. Lung tissues were collected from bleomycin- and vehicle-treated mice labelled with deuterated water prior to injury. Labeled tissue samples were subjected to sequential extraction with salt, detergent, and guanidine. Analysis of soluble and insoluble fractions revealed that while ECM proteins were not significantly solubilized with salt and detergent, they were highly enriched in guanidine-soluble and insoluble fractions. Furthermore, unlike traditional “static” proteomics methods, the use of stable isotope labelling combined with the sequential extractions allowed estimation of fractional synthesis rate (FSR) of single ECM proteins. The study revealed the dynamic changes in synthesis of certain collagen types along with elevated levels of fibrillar collagen, confirming that bleomycin injury induced deposition of insoluble matrix that eventually leads to fibrosis [27].
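To make the notion of a fractional synthesis rate concrete, the sketch below derives a rate constant from the fraction of a protein that is newly synthesized after a given labeling time. It assumes a single protein pool with first-order (exponential) turnover, f(t) = 1 - exp(-k t); this model and the function names are assumptions for illustration and are not necessarily the calculation used in the cited study.

```python
import math

def fractional_synthesis_rate(fraction_new, days):
    """Rate constant k (per day) from f = 1 - exp(-k * t).

    fraction_new: fraction of the protein pool that is newly synthesized,
    e.g. derived from deuterium enrichment relative to the precursor pool.
    """
    if not 0.0 <= fraction_new < 1.0:
        raise ValueError("fraction_new must be in [0, 1)")
    if days <= 0:
        raise ValueError("days must be positive")
    return -math.log(1.0 - fraction_new) / days

def fraction_new_after(k_per_day, days):
    """Forward model: expected newly synthesized fraction after `days`."""
    return 1.0 - math.exp(-k_per_day * days)

# Invented example: 30% of a collagen chain is labeled after 7 days.
k = fractional_synthesis_rate(0.30, 7)
```

Comparing k for the same protein between extraction fractions or between injured and control lungs is then a matter of comparing these rate constants.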

In a subsequent study, quantitative detergent solubility profiling (QDSP) of lung tissue homogenates was used to evaluate tissue composition from the onset of inflammation and fibrosis to full recovery [98]. Unlike in the previous study, the authors separated whole-tissue proteins based on their differential solubility under increasing detergent stringency, which allowed monitoring of interactions of secreted proteins with the insoluble ECM. A label-free mass spectrometry-based approach was used to estimate the relative levels of proteins in the different fractions at all time points from fibrosis to recovery. The results revealed that some core matrisome proteins, including collagens, were altered upon bleomycin injury, moving from the soluble to the insoluble fraction. Many of these fell back to baseline during repair and remodeling of the lungs after injury. The QDSP method has also been adopted to develop a proteomics workflow for human lung fibrosis biopsies and to study proteins involved in lung aging [2, 99].

The studies described above identified peptides with few posttranslational modifications from the most abundant collagen types, leaving unidentified many collagen types that were detected in a later study concentrating on pulmonary ECM [74]. While the MS methods used in these earlier, general studies included hydroxylation of proline in the database search parameters, allowing the identification of collagen peptides, they were not optimized to exhaustively identify highly modified collagen peptides. This limitation is most relevant for basement membrane collagens such as collagen IV, which are highly hydroxylated and glycosylated. In Sect. 11.3.2 we provide more details of database search strategies that have been designed to overcome these limitations.

3.2 Analysis of Posttranslational Modifications of Collagen

During biosynthesis, a number of co- and post-translational modifications (PTMs) are added to the collagen molecule, most notably hydroxylation and glycosylation, both of which are essential for maintaining the architecture and function of tissues [8, 90]. For instance, 4-hydroxylation of proline is important for folding and stabilization of triple-helical molecules and for fibril assembly [117]. Similarly, hydroxylation and glycosylation of lysine are important for crosslinking and stabilization of collagen fibrils as well as regulation of cell-matrix interactions [61, 122]. In addition, N-glycosylation of asparagine is thought to play an important role in collagen degradation [47]. Mutations in collagens and/or in the enzymes that modify them can have a detrimental effect on collagen structure, alter tissue function, and thus result in disease. Perhaps the best characterized collagen-related disease resulting from genetic alterations is osteogenesis imperfecta [71]. Although the extent of modification may vary, quantitative efforts to map modifications along the collagen molecule are very few. However, recent improvements in mass spectrometry technologies may allow a better characterization of collagen PTM changes [74].

Although several protocols have been developed for the generation of suitable samples for analysis of these highly modified molecules, the majority of them take advantage of the insoluble nature of collagens for their enrichment. As can be observed in Fig. 11.3, after collagen peptides are generated by chemical degradation or enzymatic digestion, they may be fractionated by different means, but typically liquid chromatography, followed by tandem mass spectrometry analysis.

Because collagen peptides are highly hydroxylated and glycosylated, they pose a challenge for characterization by conventional mass spectrometry-based proteomics approaches. To generate tandem mass spectra, peptides are subjected to gas-phase fragmentation techniques such as collision-induced dissociation (CID), which breaks peptide bonds, generating so-called b- and y-fragment ions from which the peptide sequence can be deduced. Hydroxylation of proline and lysine residues is identified by adding 16 mass units (one oxygen atom, +15.995 Da monoisotopic) to these amino acids during the database search. For this, it is key that the hydroxylation is sufficiently stable and is not cleaved off under CID conditions. Although O-glycosylation was thought to be labile, Perdivara et al. showed that O-glycosidic bonds between glucose, galactose and hydroxylysine are unusually stable to CID in collagen tryptic peptides [85]. Because of this characteristic of collagen peptides, CID and higher-energy C-trap dissociation (HCD), a technology available in high-resolution Orbitrap instruments, have successfully been used alone or in conjunction with PTM-friendly fragmentation technologies such as electron-transfer dissociation (ETD) for the MS analysis of highly hydroxylated and glycosylated collagen peptides [8].
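The mass shifts a search engine must consider for modified collagen peptides can be illustrated with a short calculation. The sketch below computes the monoisotopic m/z of a peptide carrying a given number of hydroxylations (+15.99491 Da each) and hexose units (+162.05282 Da each, e.g., one for galactosyl- and two for glucosylgalactosyl-hydroxylysine, in addition to the hydroxylation itself). The residue table is deliberately limited to a few amino acids, and the function name is hypothetical.

```python
# Monoisotopic residue masses (Da) for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "P": 97.05276, "K": 128.09496,
    "R": 156.10111, "E": 129.04259, "Q": 128.05858, "S": 87.03203,
    "D": 115.02694, "L": 113.08406, "V": 99.06841, "T": 101.04768,
}
WATER = 18.01056      # added once per peptide (free N- and C-termini)
PROTON = 1.00728
HYDROXYLATION = 15.99491  # one oxygen atom
HEXOSE = 162.05282        # glucose or galactose residue (C6H10O5)

def peptide_mz(sequence, n_hydroxy=0, n_hexose=0, charge=1):
    """Monoisotopic m/z of a peptide with the given modification counts."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    neutral += n_hydroxy * HYDROXYLATION + n_hexose * HEXOSE
    return (neutral + charge * PROTON) / charge

unmodified = peptide_mz("GPK")
hydroxylated = peptide_mz("GPK", n_hydroxy=1)
glc_gal_hyl = peptide_mz("GPK", n_hydroxy=1, n_hexose=2)
```

Note that hydroxylation of proline is nearly isobaric with oxidation of methionine, which is one reason why site localization from fragment ions matters.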

For instance, a label-free mass spectrometry approach was used to understand hydroxylation and glycosylation of collagen in response to the profibrotic cytokine transforming growth factor β1 (TGF-β1). As a proof of concept, fibroblasts derived from human IPF samples were treated with TGF-β1 to induce changes to their ECM within an environment that mimics fibrosis [74]. After decellularization, the enriched ECM was digested into peptides with a trypsin/LysC mix and the peptides were analyzed by LC-MS/MS. MS-based label-free quantification revealed that, upon exposure to TGF-β1 in vitro, lung fibroblast ECM underwent changes commonly associated with lung fibrosis, such as increased expression of the fibrillar collagens I, II, III, and V. Furthermore, a new bioinformatic platform was developed to allow for the comprehensive mapping and site-specific quantitation of collagen PTMs in these crude ECM preparations. For the identification of PTMs on collagen peptides, mass spectrometry data files were searched using the unique motif search feature of MyriMatch, which allowed identification of sites of hydroxylation and glycosylation [8, 70, 106]. The analyses yielded a comprehensive map of prolyl and lysyl hydroxylations as well as lysyl glycosylations for 15 collagen chains. PTM analysis revealed novel sites of prolyl-3-hydroxylation and lysyl glycosylation in type I collagen. In addition, the same data-dependent acquisition MS data were subjected to an MS1 analysis using Skyline software to assess changes in collagen PTMs [74]. Skyline MS1 is a label-free quantification technique in which the areas of each peptide chromatographic peak (a.k.a. extracted ion chromatogram, XIC) are recorded, averaged, and compared between the different sample groups [https://pubmed.ncbi.nlm.nih.gov/20147306/]. The Skyline MS1 workflow was able to identify significant changes in prolyl-3-hydroxylation and O-glycosylation at specific sites within type I collagen molecules present in ECM samples taken from human lung fibroblasts stimulated with TGF-β1 concentrations mimicking a fibrotic environment.
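The principle of MS1-based label-free quantification, integrating each extracted ion chromatogram and comparing average peak areas between groups, can be sketched in a few lines. This is a simplified illustration with invented data and hypothetical function names; it is not Skyline's implementation or API, and it omits peak picking, background subtraction, normalization, and statistics.

```python
def xic_area(times, intensities):
    """Trapezoidal area under an extracted ion chromatogram peak."""
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area

def mean(values):
    return sum(values) / len(values)

def fold_change(treated_areas, control_areas):
    """Ratio of mean peak areas (treated over control) for one peptide."""
    return mean(treated_areas) / mean(control_areas)

# Invented XIC of one modified peptide in a single run.
area = xic_area([10.0, 10.1, 10.2, 10.3], [0.0, 5.0e5, 4.0e5, 0.0])

# Invented peak areas of the same peptide across replicate runs.
change = fold_change([2.0e6, 4.0e6], [1.0e6, 2.0e6])
```

For site-specific PTM changes, the same comparison is made separately for the modified and unmodified forms of each peptide.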

3.3 Assessment of Enzymatic Crosslinks in Collagen

Crosslinks play an important role in maintaining and strengthening the intricate structure of collagens in the ECM. In fibrotic diseases and cancer, abnormal ECM dynamics and crosslinks disturb the homeostatic state of cells and promote organ failure. Collagen-crosslinking lysyl oxidases (LOX) are upregulated in many forms of lung fibrotic disorders such as idiopathic pulmonary fibrosis (IPF) [16, 45]. Increased crosslinking leads to pathologic deposition of collagen which may disrupt elasticity, promote stiffness, and reduce lung function. Thus, when evaluating collagen in translational models of lung disease, it is important to quantify collagen crosslinks.

The precursors to the crosslinks are formed by the oxidative deamination of lysine/hydroxylysine in collagen by members of the lysyl oxidase (LOX) family, resulting in lysine aldehydes also known as “allysines.” The latter may spontaneously condense with a lysine or hydroxylysine in a neighboring collagen α-chain to form the immature divalent crosslinks dehydro-dihydroxylysinonorleucine (deH-DHLNL) and dehydro-hydroxylysinonorleucine (deH-HLNL). These immature crosslinks can undergo Amadori rearrangement to form the keto forms hydroxylysinoketonorleucine and lysinoketonorleucine, which can condense with a hydroxylysine in a third collagen α-chain to form the trivalent crosslinks lysylpyridinoline, also called deoxypyridinoline (dePyr), and hydroxylysylpyridinoline, also called pyridinoline (Pyr) [12]. Apart from these, there are crosslinks involving the condensation of immature crosslinks with a histidine residue, forming histidinohydroxylysinonorleucine (HHL) and dehydrohistidinohydroxymerodesmosine (HHMD). Pyrrole is another kind of mature trivalent crosslink present in collagen I, formed by condensation of a hydroxylysinoketonorleucine with hydroxylysinonorleucine [34]. Glycosylation is observed on certain helical lysine residues involved in crosslinking and is thus present on the immature divalent crosslinks DHLNL and HLNL as well as on mature trivalent crosslinks. Collagen crosslink biosynthesis has been extensively reviewed elsewhere [34]. In addition to LOX-mediated crosslinks, the non-enzymatic action of reducing sugars on amino groups of proteins (via the Maillard reaction) leads to the formation of advanced glycation end products (AGEs). Pentosidine is one such AGE that has been well characterized in aged lungs [11]. The presence of mature crosslinks is thought to be related to matrix stiffness, which increases in age-related lung diseases such as pulmonary fibrosis [72].

With such structural diversity and complexity, selection of methods for the detection and quantification of collagen crosslinks has historically required careful experimental consideration. A possible workflow is outlined in Fig. 11.4. For instance, because the immature crosslinks are unstable in the strong acids used for hydrolysis, collagen samples must first be reduced with sodium borohydride. HHMD crosslinks are stable in acid but can also be detected in the same NaBH4-reduced samples. A small number of laboratories still use tritiated sodium borohydride (NaB3H4) to radiolabel immature crosslinks because this method has high sensitivity, allowing the use of smaller sample sizes. If NaB3H4 is used for reduction, separation of immature crosslinks can be achieved on either a cation-exchange column or a C18 reverse-phase column, with detection on a liquid scintillation counter [66, 101]. Although radioactive methods present many advantages, they are not readily available to most collagen researchers around the world, who are obliged to look for alternative methods.

Fig. 11.4
Schematic workflow. The biological sample is reduced with tritiated NaB3H4 or with NaBH4 and spiked with internal standards, followed by hydrolysis. After hydrolysis, collagen content is determined, and crosslinks are analyzed by ion-exchange HPLC or after enrichment on cellulose to obtain the moles of collagen crosslinks.

Schematic workflow for detection of collagen crosslinks in biological samples. The biological sample is crushed in liquid nitrogen to generate an insoluble powder that is lyophilized before determination of its dry weight. Stabilization of acid-labile immature crosslinks can be achieved with either radiolabeled (left) or non-radiolabeled sodium borohydride (right). After hydrolysis, amino acids and lysyl-derived crosslinks can be separated on different HPLC columns. Immature crosslinks that have been radiolabeled are detected with a liquid scintillation counter, whereas non-labeled crosslinks are detected by mass spectrometry. For mature crosslinks, quantitation can be achieved by mass spectrometry using MS1 extracted-ion chromatograms (MS1), multiple reaction monitoring (MRM), or parallel reaction monitoring (PRM). Pyridinolines can additionally be quantified using their intrinsic fluorescence or an ELISA detection kit. A portion of each hydrolyzed sample may also be used to quantify hydroxyproline, which is used to determine the number of crosslinks per molecule of collagen in each sample. The addition of a known amount of an internal standard at the beginning of the procedure accounts for sample loss and greatly improves results. The crosslinks are expressed in moles per mole of collagen in the sample. Commonly analyzed collagen crosslinks are shown at the bottom. (Figure is created with BioRender.com)
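The normalization of crosslinks to collagen content described in the workflow can be sketched as follows. The calculation assumes 300 hydroxyproline residues per collagen molecule, a convention frequently used for fibrillar collagen but stated here as an assumption; the `recovery` correction based on an internal standard and all names are hypothetical.

```python
def collagen_pmol_from_hyp(hyp_pmol, hyp_per_collagen=300):
    """Moles of collagen from moles of hydroxyproline.

    hyp_per_collagen is an ASSUMED number of Hyp residues per collagen
    triple helix; adjust it for the collagen type being studied.
    """
    return hyp_pmol / hyp_per_collagen

def crosslinks_per_collagen(crosslink_pmol, hyp_pmol,
                            hyp_per_collagen=300, recovery=1.0):
    """Crosslink content in mol per mol collagen.

    recovery: fraction of the spiked internal standard that was recovered
    (1.0 means no loss); the measured crosslink amount is divided by it.
    """
    if not 0.0 < recovery <= 1.0:
        raise ValueError("recovery must be in (0, 1]")
    corrected = crosslink_pmol / recovery
    return corrected / collagen_pmol_from_hyp(hyp_pmol, hyp_per_collagen)

# Invented example: 50 pmol crosslink and 30,000 pmol Hyp in the hydrolysate.
mol_per_mol = crosslinks_per_collagen(50, 30000)
```

Because the crosslink and hydroxyproline measurements come from the same hydrolysate, the ratio is independent of the amount of tissue that was hydrolyzed.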

More accessible non-radioactive methods to quantify immature reducible crosslinks have been developed. In particular, ultra-performance liquid chromatography coupled to ESI-MS/MS has been established and applied to a variety of tissues. Since immature crosslinks are small and polar, heptafluorobutyric acid (HFBA) is added to the mobile phase of reverse-phase C18 columns, which improves retention and fractionation of these crosslinks on the column. However, because HFBA shortens column lifetime and is difficult to remove from the mass spectrometer source, other columns and mobile phase additives have been implemented. One column type that has emerged in the field of crosslink quantification is the hydrophilic interaction chromatography (HILIC) column coupled with mass spectrometry detection. Unlike C18 reverse-phase columns, HILIC columns have a hydrophilic stationary phase, and polar compounds such as immature collagen crosslinks are typically fractionated by a gradient that starts with a high concentration of organic solvent (e.g., acetonitrile) and increases the polarity of the mobile phase with water. Notably, not only immature crosslinks but also mixtures of immature and mature crosslinks from different tissue samples have been fractionated on HILIC columns [3, 78, 79, 107, 108]. This is particularly convenient when sample amount is limited, as it may allow the quantitation of a panel of collagen crosslinks on a single column.

For reliable quantitation by mass spectrometry (MS), the gold standard is multiple reaction monitoring (MRM) performed on a triple quadrupole instrument, where the crosslink molecules are selected in the first quadrupole (Q1) and fragmented in the collision cell (Q2), and the intensities of the resulting fragments are registered in the third quadrupole (Q3). MRM-MS methods are very selective because they rely on predetermined precursor-product transitions that are established for every specific analyte to be quantified. In the case of collagen crosslinks, these transitions have been determined and are available in the literature for implementing a quantitation method using HILIC or reverse-phase columns. Although MRM-MS methods offer many advantages such as high sensitivity, selectivity, wide dynamic range, high precision, and reproducibility, even when analyzing complex samples, they require expertise in mass spectrometry. For instance, optimizing the fragmentation conditions (collision energy, etc.) for each analyte to be quantified is recommended to achieve maximal sensitivity.

More recently, the popularity of Q-Exactive high-resolution MS instruments has propelled the development of parallel-reaction monitoring (PRM) methods for the quantitation of collagen crosslinks. Although PRM and MRM are similar in sensitivity, dynamic range, etc., the rapid scanning rate of these instruments and their acquisition of high-resolution MS/MS spectra make PRM potentially more specific than MRM. In addition, unlike MRM methods, PRM does not require pre-established precursor-product transitions, which greatly facilitates method development.

Because isotopically labelled standards are not commercially available for all collagen crosslinks, a typical quantitation method relies on an external calibration curve constructed with crosslink standards. The inclusion of a related molecule such as pyridoxamine as an internal standard [4] could help account for losses during sample preparation and thus significantly improve accuracy.
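The calculation behind such a workflow can be sketched in a few lines of Python. This is a minimal illustration, not a validated method: it assumes a linear detector response, uses the analyte-to-internal-standard peak-area ratio as the response, and normalizes to collagen via hydroxyproline using an assumed value of about 300 hydroxyproline residues per collagen molecule (a commonly used approximation that should be adapted to the tissue and collagen type). All function names and numbers are hypothetical.

```python
import numpy as np

def fit_calibration(std_pmol, std_ratio):
    """Fit ratio = slope * amount + intercept to the external standards."""
    slope, intercept = np.polyfit(std_pmol, std_ratio, 1)
    return slope, intercept

def quantify(analyte_area, istd_area, slope, intercept):
    """Convert an analyte / internal-standard peak-area ratio into pmol."""
    ratio = analyte_area / istd_area
    return (ratio - intercept) / slope

def crosslinks_per_collagen(crosslink_pmol, hyp_pmol, hyp_per_collagen=300.0):
    """Express a crosslink as mol per mol collagen.

    hyp_per_collagen is an ASSUMED average number of hydroxyproline
    residues per collagen molecule, not a value taken from the text.
    """
    collagen_pmol = hyp_pmol / hyp_per_collagen
    return crosslink_pmol / collagen_pmol

# Hypothetical calibration standards (pmol injected vs. area ratio).
std_pmol = np.array([1.0, 5.0, 10.0, 50.0, 100.0])
std_ratio = 0.02 * std_pmol + 0.01

slope, intercept = fit_calibration(std_pmol, std_ratio)

# Hypothetical sample: peak areas of the crosslink and the internal standard.
amount_pmol = quantify(analyte_area=4.1e5, istd_area=1.0e6,
                       slope=slope, intercept=intercept)
mol_per_mol = crosslinks_per_collagen(amount_pmol, hyp_pmol=6000.0)
print(f"{amount_pmol:.2f} pmol crosslink, {mol_per_mol:.2f} mol/mol collagen")
```

Dividing by the internal-standard area before applying the calibration is what lets the ratio compensate for sample loss, provided the internal standard was added at the start of the procedure and to the calibration standards alike.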

Because pyridinolines are naturally fluorescent, HPLC with fluorescence detection can be used to detect these trivalent mature crosslinks. Notably, fluorescence detection of pyridinolines in human urine is a well-characterized method, as it has been used to study collagen turnover and as a biomarker in many forms of bone disease [31,32,33, 35] and in lung fibrosis [116]. Although pyrrole crosslinks are also likely to be present in tissue samples, they are not typically quantified because, unlike pyridinolines, they are not fluorescent and require Ehrlich’s chromogen for detection (Table 11.2).

Table 11.2 Examples of collagen crosslink quantification in lung diseases

4 Assessment of Collagen Architecture In Situ

Collagen architecture, including fibril length, density, and alignment, directly and profoundly influences adherent cell behavior and function [58, 86, 100, 120]. Visualization of collagen architecture in translational models of lung disease may therefore provide important clues for underlying disease-driving mechanisms beyond the mere increase in deposition of fibrillar collagen, which can be quantified using the methodology outlined above. Several staining and microscopy techniques allow for the assessment of collagen or ECM architecture and are not only used to visualize collagen in tissue sections, but also in more complex three-dimensional samples. These methods range from traditional histological staining and immunohistochemistry coupled with light or fluorescence microscopy to polarization-based microscopy methods, second-harmonic generation (SHG) microscopy, and scanning and transmission electron microscopy (Fig. 11.5). While most of these methods allow for quantification of collagen content and alignment in situ, they differ considerably in specificity and resolution [53, 60, 68, 115]. Their advantages and disadvantages will be outlined below.

Fig. 11.5
A flow diagram. In-situ visualization of collagen architecture using light or fluorescent microscopy of fixed tissues based on Masson's trichrome staining, picrosirius red-polarization, immunohistochemistry, T E M, and second harmonic generation microscopy with examples of micrographs.

Examples for in situ visualization of collagen architecture. (a) Masson Trichrome staining of an IPF lung tissue section. (Reproduced from Harris WT et al. [124] (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0070196)). (b) Picrosirius Red staining of a mouse lung with bleomycin-induced lung fibrosis, visualized using conventional light microscopy, in comparison to (c) the same section visualized using polarized light microscopy. ((b) and (c) are reproduced from Egger C et al. [125] (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0063432)). (d) Immunofluorescent staining of a normal human lung section for type IV collagen (green). (e) Immunofluorescent staining of a mouse lung section depicting bleomycin-induced lung fibrosis for type I collagen (green). Staab-Weijnitz CA, unpublished results. (f) Transmission electron microscopy of idiopathic pulmonary fibrosis (IPF) acellular lung matrix. (Reproduced from Booth AJ et al. [126], Figure 6, right-hand panel, with permission of the American Thoracic Society. Copyright © 2021 American Thoracic Society. All rights reserved. The American Journal of Respiratory and Critical Care Medicine is an official journal of the American Thoracic Society. Readers are encouraged to read the entire article for the correct context at https://www.atsjournals.org/doi/full/10.1164/rccm.201204-0754OC. The authors, editors, and The American Thoracic Society are not responsible for errors or omissions in adaptations). (g) Second Harmonic Generation image of lung parenchyma from an IPF patient’s lung section. (Reproduced from Tjin G et al. [127] (https://doi.org/10.1242/dmm.030114), with permission of Disease Models & Mechanisms (https://journals.biologists.com/dmm))

4.1 Masson’s Trichrome Staining

Masson’s trichrome staining is a widely used method for the visualization of collagen in the context of tissue structure. It allows for the detection of morphological alterations and for the qualitative and quantitative assessment of the extent of collagen deposition [110]. The latter makes it particularly useful for routine pathological diagnosis of fibrotic diseases. Numerous variants of this method and combinations with other staining protocols exist, but in principle they all rely on the sequential use of three dyes with different molecular weights and acid-base chemistry in a precisely timed sequence. Weigert’s iron hematoxylin is used first to stain the nuclei. This dye is resistant to subsequent acid decolorization procedures. Then, a red acid dye (the so-called plasma stain, e.g., Biebrich scarlet or acid fuchsine) is applied, which binds all acidophilic structures including cytoplasm, muscle, and collagen. A solution containing large acid heteropolymetalates such as phosphomolybdic or phosphotungstic acid removes the plasma stain from collagen, but not from muscle fibers or cytoplasm. Finally, a green or blue dye (e.g., light green SF or aniline blue) is used to stain collagen fibers (Fig. 11.5).

4.2 Picrosirius Red Staining

The underlying principle for the specific binding of Sirius Red to collagen has been described above (Sect. 11.2.1). When used for tissue staining, Sirius Red is often called Picrosirius Red (PSR), as the stain is dissolved in an aqueous solution of picric acid in the protocol. In bright-field microscopy, collagen appears red on a pale-yellow background. Fluorescence microscopy has also been used to visualize PSR-stained sections with excitation/emission settings for rhodamine. Stained fibers then also appear red, and the signal can be combined with the green autofluorescence of elastic fibers or live cells [14, 29, 112]. Importantly, specificity and sensitivity for visualization of collagen fibers can be considerably enhanced when polarized light microscopy is used for analysis [119]. Parallel alignment of collagen fibers causes a strong natural birefringence, which is further enhanced by the association of elongated Sirius Red molecules along the linear fiber axis. Thus, collagen can be visualized under linearly polarized light and then presents as red, orange, yellow, or green fibers [46, 60, 67, 115] (Fig. 11.5). The different colors were initially thought to reflect distinct collagen types [46, 67], but are more likely to be a measure of fiber thickness and/or degree of parallel orientation [60, 69].

4.3 Second Harmonic Generation Microscopy

Second harmonic generation (SHG) is a non-linear optical process where photons from a strong laser pass through a so-called non-centrosymmetric environment, i.e., an environment without an inversion center as symmetry element, and interact with aligned harmonophores that possess a permanent dipole moment. This leads to emission of second-harmonic light at half the wavelength of the light that originally entered the material. Only very few biological materials/proteins are such harmonophores, i.e., meet the physical requirements for efficient second harmonic generation. These materials include type I and II collagens and myosin in actin-myosin complexes [22]. Hence, specificity for fibrillar type I and II collagens in biological samples is excellent. In addition, this technique allows for assessment of the three-dimensional fibrillar collagen network in high resolution [5]. Another advantage is that, as SHG microscopy entirely relies on the above-described intrinsic physical properties of fibrillar collagen, it does not require staining procedures or any type of labelling [53, 105, 121].
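As a brief worked relation for the statement above: two photons of the excitation frequency are converted into one photon of twice that frequency, so the emitted wavelength is half the excitation wavelength. The 800 nm excitation wavelength below is only an illustrative assumption, not a setting taken from the cited studies.

```latex
\[
  \omega_{\mathrm{SHG}} = 2\,\omega_{\mathrm{ex}}
  \quad\Longleftrightarrow\quad
  \lambda_{\mathrm{SHG}} = \frac{\lambda_{\mathrm{ex}}}{2},
  \qquad \text{e.g. } \lambda_{\mathrm{ex}} = 800\ \mathrm{nm}
  \;\Rightarrow\; \lambda_{\mathrm{SHG}} = 400\ \mathrm{nm}.
\]
```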

4.4 Immunohistochemistry

While all techniques described above are best suited for or even restricted to the visualization of fibrillar collagens, immunohistochemistry (IHC) provides the opportunity to stain for distinct collagen types, given that specific antibodies are available. Hence, the same limitations as already outlined above for immune-based methods for quantification of collagen types (Sect. 11.2.3) apply. Notably, Rickelt and Hynes have published a highly useful summary on IHC methods and suitable antibodies for the detection of ECM proteins including numerous collagens [93]. Immunohistochemical techniques are typically highly sensitive. Dependent on the antibodies used, efficient detection of collagen may require the unmasking of epitopes by limited predigestion of tissue sections prior to the staining procedure [26].

4.5 Transmission Electron Microscopy

In transmission electron microscopy (TEM), electrons instead of light are sent through the specimen, which increases magnification by a factor of about 1000 and provides much higher resolving power. This facilitates the visualization of subcellular compartments and single collagen fibrils, the diameters of which may range from 12 to 500 nm, far below or at best at the limit of resolution of standard light microscopy [103]. This property makes TEM the only currently available technique to quantify fibril diameter and length and to directly visualize collagen fibrillogenesis at the interface between the fibroblast and the extracellular space [104]. Methods for 3D reconstruction have been developed [76, 104]. Exemplifying the power of this technique, the reconstruction of 3D images from serial TEM images ultimately led to the discovery that, in tendon fibroblasts, Golgi to plasma membrane carriers (GPCs) carry collagen fibril cargo and target it to special plasma membrane protrusions, fibripositors, for secretion [18] (Table 11.3).

Table 11.3 Methodologies for the visualization of collagen architecture in tissue sections and three-dimensional samples (for tissue staining examples, refer to Fig. 11.5)

4.6 Selected Complementary and Emerging Techniques

4.6.1 Confocal Reflection Microscopy (CRM)

In confocal reflection microscopy (CRM), a confocal microscope is used to take reflection images of a sample specimen at sequential focal planes along the z axis followed by three-dimensional reconstruction. While reflection as an intrinsic optical property is not particularly specific for collagen, in contrast to SHG as described above (Sect. 11.4.3), CRM is a comparatively simple alternative for the detailed assessment of collagen microarchitecture in artificially prepared scaffolds from purified collagen or collagen-enriched material [15, 100].

4.6.2 Atomic Force Microscopy (AFM)

Collagen architecture and the extent of collagen crosslinking affect tissue stiffness which, in turn, regulates adherent cell behavior via biomechanical signaling [37, 45]. Therefore, for some research questions, it may be of interest to assess mechanical properties of a biological sample with altered collagen properties. Atomic force microscopy (AFM) is a type of scanning probe microscopy, where interactions between a sharp tip at the end of a cantilever and the sample surface are recorded, allowing for the visualization of surface topography at nanoscale [82]. Importantly, alongside topographic analysis, AFM can be used for nanoindentation, where specified forces are applied to the surface of the sample and the extent of indentation is recorded by the instrument. This yields a measure of the sample’s stiffness/elasticity and allows for the generation of Young’s modulus maps of the sample’s surface [37, 45, 82].
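How a Young’s modulus is derived from a force-indentation curve depends on the contact model chosen. The sketch below uses the Hertz model for a spherical tip, F = (4/3) · E/(1 − ν²) · √R · δ^(3/2), which is one common choice but is not prescribed by the studies cited above. The tip radius, the Poisson ratio of 0.5 (incompressible sample), and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def hertz_youngs_modulus(indentation_m, force_n, tip_radius_m, poisson_ratio=0.5):
    """Estimate Young's modulus (Pa) from one force-indentation curve.

    Assumes a spherical tip and the Hertz model, which is linear in
    indentation**1.5, so a least-squares line through the origin suffices.
    """
    x = np.asarray(indentation_m, dtype=float) ** 1.5
    f = np.asarray(force_n, dtype=float)
    k = np.dot(x, f) / np.dot(x, x)  # slope of F versus delta^(3/2)
    return 0.75 * k * (1.0 - poisson_ratio ** 2) / np.sqrt(tip_radius_m)

# Synthetic, noise-free curve for a hypothetical 2 kPa sample.
E_true = 2000.0                       # Pa (assumed)
radius = 2.5e-6                       # m, assumed tip radius
nu = 0.5                              # assumed Poisson ratio
delta = np.linspace(0.0, 500e-9, 50)  # m, indentation depth
force = (4.0 / 3.0) * E_true / (1.0 - nu ** 2) * np.sqrt(radius) * delta ** 1.5

E_fit = hertz_youngs_modulus(delta, force, radius, nu)
print(f"Fitted Young's modulus: {E_fit:.1f} Pa")
```

Applying such a fit to every pixel of a force map is what yields a Young’s modulus map. With real data, the contact point must be determined first and the fit restricted to shallow indentations where the model assumptions hold.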

4.6.3 Imaging Probes for Magnetic Resonance Imaging (MRI)

Magnetic resonance imaging (MRI) is an imaging modality routinely used in the clinic when diagnosis requires high-resolution images of soft tissues or when ionizing radiation should be avoided. MRI relies on strong magnetic fields, magnetic field gradients, and radiofrequency pulses to generate images of anatomical structures. Molecular MRI is an emerging field where targeted probes are designed and, for instance, coupled to MRI contrast agents such as clinically approved gadolinium-based structures [95]. ECM-targeting moieties have been explored extensively in the context of cardiovascular disease [91]. For in vivo models of lung fibrosis, successful targeting of type I collagen by a gadolinium-coupled probe has been reported by Caravan et al. [19] and, more recently, the same group demonstrated that an allysine-binding gadolinium chelate can be used to monitor fibrogenesis in the mouse model of bleomycin-induced lung fibrosis [113]. The latter probe also mostly targets collagen, as fibrogenesis is typically associated with increased LOX-mediated oxidation of lysine to the allysine aldehyde intermediate. Clearly, ECM-targeted MRI may provide the unique opportunity for non-invasive and non-destructive analysis of ECM changes in lung disease, ultimately allowing for sequential assessment of disease development in the same animal.

5 Monitoring Fibril Formation in Real Time Using Purified Collagen

Fibril formation and crosslinking stabilize fibrillar collagen and largely protect it from proteolytic degradation. Inhibition of fibril formation can therefore be viewed as a promising therapeutic strategy for all types of pulmonary disease where excessive secretion and deposition of ECM is a pathological feature. Notably, this is not only the case for fibrotic disease, but also for cancer, where the formation of tumor-encapsulating ECM shields the tumor cells from therapy and is typically associated with poor prognosis [38].

A well-established and straightforward assay for the assessment of collagen fibril formation in vitro relies on the principle that purified and acid-dissolved type I collagen spontaneously forms fibrils upon neutralization in an entropy-driven process. Fibrillogenesis can be monitored in real time by light scattering and the resulting fibrils, which are similar to those formed in vivo, further examined by electron microscopy [48, 118]. We have previously used this assay to show that two approved therapeutics for the treatment of idiopathic pulmonary fibrosis, nintedanib and pirfenidone, both delay collagen fibril formation, identifying a potential novel mechanism of action [55].
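To compare conditions in such an assay, the recorded curve is usually reduced to a few kinetic parameters. The sketch below is one simple way to do this, under several assumptions that are not taken from the cited studies: the light-scattering signal is available as a turbidity (absorbance) time course, the trace starts at baseline, ends at plateau, and increases monotonically (real traces typically need smoothing first), and the lag time is defined by the tangent at the point of maximal slope. The function name and the synthetic sigmoidal data are hypothetical.

```python
import numpy as np

def fibrillogenesis_kinetics(time_min, signal):
    """Return lag time, half-time and maximal rate of a turbidity curve.

    Assumes the trace starts at baseline, ends at plateau, and rises
    monotonically in between.
    """
    t = np.asarray(time_min, dtype=float)
    s = np.asarray(signal, dtype=float)
    norm = (s - s[0]) / (s[-1] - s[0])   # scale to 0 (baseline) .. 1 (plateau)
    t_half = float(np.interp(0.5, norm, t))
    rate = np.gradient(norm, t)          # numerical first derivative
    i = int(np.argmax(rate))
    # Tangent at the steepest point, extrapolated back to the baseline.
    lag_time = float(t[i] - norm[i] / rate[i])
    return {"lag_time": lag_time, "t_half": t_half, "max_rate": float(rate[i])}

# Synthetic sigmoidal trace: midpoint at 30 min, hypothetical absorbance units.
t = np.arange(0.0, 90.5, 0.5)
trace = 0.05 + 0.40 / (1.0 + np.exp(-(t - 30.0) / 4.0))

kinetics = fibrillogenesis_kinetics(t, trace)
print(kinetics)
```

A compound that delays fibril formation would be expected to increase the lag time or half-time, or to lower the maximal rate, relative to a vehicle control measured under identical buffer, temperature, and pH conditions.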

As simple and attractive as this assay appears, several things need to be considered. First, it should be acknowledged that it reflects a highly artificial environment and that assay conditions must be strictly controlled as temperature, buffer, salt composition, and pH strongly affect fibril formation [118]. Second, the source of the collagen used must be carefully chosen. In many studies, pepsin-digested collagen is used which is devoid of telopeptides and therefore harbors few LOX-crosslinking sites if any [34, 97]. However, absence of telopeptides has been shown to slow fibril formation and to alter the morphology of resulting collagen fibrils [97]. Also, with telopeptides present, this assay will be better suited to directly study the influence of enzymatic and non-enzymatic collagen crosslinking on fibril formation and on resulting fibril architecture. Therefore, collagen preparation methods should be considered that leave telopeptides intact [20, 97, 118].

6 Assessment of Collagen Turnover by Peripheral Markers

Normal collagen turnover maintains a healthy balance between collagen synthesis and degradation. Impairment of this balance and uncontrolled ECM remodeling are major pathological hallmarks of many chronic lung diseases [13, 17, 123]. Importantly, these processes generate cleaved-off collagen propeptides as well as MMP-dependent cleavage products, which are detectable in peripheral blood. Although this is a relatively new concept, several studies already support the idea that assessment of peripheral collagen propeptides and matrikines may be beneficial for prognosis and diagnosis of chronic lung disease [10, 44, 49,50,51, 65, 84, 109]. Markers of collagen formation can also be used as a read-out when the efficacy of novel therapeutic strategies is assessed in preclinical models [73]. In addition, these cleavage products allow for inferences on the protease responsible for collagen turnover, provide information about the tissue of origin, and thus contribute to our understanding of disease pathogenesis and comorbidities [25, 44, 49,50,51, 109]. Table 11.4 lists the peripheral markers of collagen turnover that are currently used most frequently.

Table 11.4 Peripheral markers of collagen turnover with potential applicability as biomarkers for lung disease

7 Conclusion

In most mammalian tissues, collagens represent the main ECM component and play a major role in the maintenance of tissue integrity and function. Hence, it is not surprising that alterations of collagen quantity and molecular properties contribute considerably to chronic lung disease development and thus need to be considered in translational models of lung disease. The high structural and functional diversity of collagens, however, entails a number of challenges for their characterization.

While methods for the detection and visualization of fibrillar collagens in situ are well-established, the analysis of collagen types in molecular detail, including PTMs and extracellular crosslinks, requires very specific approaches, many of which are still under development. Proteomics studies, in combination with detailed functional analysis of observed changes, may reveal novel molecular mechanisms underlying fibrosis and thus help identify potential future targets for treatment of chronic lung disease. Identification and quantification of collagen PTMs, crosslinks, collagen-derived propeptides and matrikines may help elucidate their functions in lung health and disease, and also serve as a platform for the development of future diagnostic and therapeutic strategies.

In conclusion, we have provided a summary of the current state of the art in collagen quantification, detection, and assessment of properties in molecular detail. The methodology reviewed includes traditional and well-established approaches that take advantage of well-known intrinsic and unique properties of collagen fibers, but also cutting-edge methodologies such as tandem mass spectrometry, which allows not only for assessment of collagen composition but also for elucidation of site-specific PTMs and crosslinks.