Introduction

Skin fibrosis is the pathognomonic feature of several conditions such as hypertrophic scars, chronic graft-versus-host disease, nephrogenic systemic fibrosis, localized scleroderma (i.e. morphea), and systemic sclerosis (SSc). It is related to an excessive dermal deposition of collagenous and non-collagenous extracellular matrix (ECM) components as a consequence of aberrant production and altered remodeling by tissue fibroblasts and myofibroblasts [1].

In all the above-mentioned conditions, quantitating skin fibrosis remains the main goal to assess disease activity and severity as well as response to therapy. In SSc, this becomes even more critical since severity of skin involvement inversely correlates with survival and prognosis [2, 3]. Additionally, in SSc, the extent of skin disease is currently the major criterion to define the two different clinical subsets, limited cutaneous and diffuse cutaneous [4], and skin involvement is often used as the primary outcome in clinical trials.

This review will focus on SSc as the prototype of fibrotic skin disease. Clinically, in SSc, skin involvement evolves through three stages: edematous, indurative, and atrophic. In the first, or edematous, phase, there is painless pitting edema of the hands and fingers, which may also involve the feet and legs as well as the forearms. This swelling is then slowly, or at times rapidly, replaced by thickening and tightening of the skin, which loses its normal pliability. This second, or indurative, phase, which persists variably for many years, is characterized by hard skin that also becomes shiny, taut and adherent to the subcutis. Later in the course of the disease, the skin may revert to normal thickness or may atrophy, looking thin and tethered to the underlying tissue [5, 6].

Here, we review the techniques currently available to quantify skin fibrosis mainly focusing on the most innovative and recently developed strategies.

Quantitating Skin Fibrosis by Physical Examination and Mechanical Devices

The current gold standard, widely used in randomized clinical trials [7, 8] to measure skin involvement in SSc, is the modified Rodnan skin score (mRSS) [9]. The original score was developed in 1979 by Rodnan et al. [5] and then modified into a summation of ratings obtained from clinical palpation of 17 body areas. Each area is scored based on the examiner's judgement of skin thickness on a 0–3 ordinal scale (0 = normal; 1 = mild thickness; 2 = moderate thickness; 3 = severe thickness with inability to pinch the skin into a fold). The total score ranges from 0 (normal skin thickness all over the body) to 51 (severe skin thickness in all 17 areas).
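As a minimal sketch of the arithmetic described above, the total mRSS can be computed as the sum of 17 per-area ratings, each on the 0–3 ordinal scale. The function name total_mrss and the constants are hypothetical, introduced only for illustration; the snippet assumes that the examiner supplies one integer rating per area and does not model the palpation itself.

```python
# Illustrative sketch only: names are hypothetical, not part of any published tool.
MRSS_AREAS = 17        # number of body areas palpated
MAX_AREA_SCORE = 3     # 0 = normal, 1 = mild, 2 = moderate, 3 = severe


def total_mrss(area_scores):
    """Return the total mRSS as the sum of 17 per-area ratings (each 0-3)."""
    scores = list(area_scores)
    if len(scores) != MRSS_AREAS:
        raise ValueError(f"expected {MRSS_AREAS} area scores, got {len(scores)}")
    for score in scores:
        # Reject booleans and non-integers: the scale is ordinal with integer steps.
        if isinstance(score, bool) or not isinstance(score, int):
            raise ValueError(f"area score must be an integer, got {score!r}")
        if not 0 <= score <= MAX_AREA_SCORE:
            raise ValueError(f"area score must be 0-{MAX_AREA_SCORE}, got {score}")
    return sum(scores)


# Severe thickening in every area gives the maximum total score.
print(total_mrss([3] * MRSS_AREAS))  # 51
```

Rejecting missing or out-of-range ratings is a design choice of this sketch, so that an incomplete examination is not silently reported as a low total.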

Although it is a fully validated outcome or response measure ready for use in clinical trials [9], the mRSS has several shortcomings: it is extremely dependent on the examiner's skills, requiring specific training and experience [10]; intra- and inter-observer variability of 12% and 25%, respectively, has been observed [11, 12]; it may not be sensitive enough to measure small but clinically meaningful changes in skin thickening [6]; there is often heterogeneity of skin involvement within each of the areas accounted for, forcing the examiner to “decide” a score for the given area; and the sensitivity to change over time remains uncertain [7].

Several other clinical tools have been proposed to clinically quantify skin fibrosis, some of them assessing different mechanical properties of skin such as hardness and elasticity.

Durometry is a well-known and fully validated technique using a hand-held device able to measure skin hardness. It was proposed and first validated in single-center studies [13, 14] as a reliable tool correlating well with mRSS, ultrasound-assessed skin thickness [14], skin hyalinized collagen content, and skin myofibroblast score [15]. More recently, a multicenter study confirmed that durometry is reliable, simple, and accurate, with a good sensitivity to change when compared with mRSS [16]. However, there are some concerns regarding its ability to measure skin fibrosis or to discern between skin thickening and skin tethering in SSc [14]. In addition, although durometry has been validated in a multicenter intervention trial targeting patients with early diffuse cutaneous SSc (dcSSc), its ability to distinguish skin changes on small body areas (e.g., fingers), and in general on all areas with bony prominences, is poorly understood.

The plicometer, a medical device with two arms, used to measure the subcutaneous plica in obese individuals, has also been proposed as a tool to assess skin involvement in SSc [17]. The measurements were made trying to pinch only skin and to avoid capturing subcutaneous fat. It is feasible and reliable [17]; however, it has not been validated in multicenter studies and needs further evaluation to understand what aspect of scleroderma skin changes it really measures.

The vesmeter is a sensing device able to assess simultaneously several physical properties of the skin, such as hardness, elasticity, viscosity, viscoelastic ratio, and relaxation time. Its reliability, accuracy, and correlation of hardness and elasticity with mRSS led Kuwahara et al. to propose the vesmeter as a quantitative outcome measure of skin involvement in SSc [18]. However, measurements can be affected by subcutaneous tissue, and whether or not they correspond to fibrotic skin activity should be further investigated.

The Cutometer, a skin elasticity meter which lifts the skin into a measurement chamber, imitating pinching skin into a fold, has also been found to be reliable and to correlate significantly with mRSS [19]. All these tools seem to distinguish between involved and non-involved skin in scleroderma, and therefore merit further validation, although it is usually not clear to what extent skin thickness, hardness, and tethering are being measured [6]. Furthermore, the sensitivity of these tools on small body parts such as fingers has scarcely been evaluated and is predictably poor.

Quantitating Skin Fibrosis by Skin Biopsy

Skin fibrosis can be assessed by histology. Both dermal thickness and collagen content can be evaluated by several methodologies with the advantage of providing a direct quantification of skin fibrosis. While histopathology is mainly qualitative, or at best semiquantitative, the quantification of the hydroxyproline (HYP) content of the skin is an absolute measure reflecting the amount of collagen incorporated in the ECM [20]. Indeed, this method, which relies on the amount of HYP incorporated within the collagen fibers, has been used to validate other potential outcome measures. Despite its value as an absolute measure, the methodology is burdened by invasiveness and consequent low feasibility: it cannot be performed more than once at the same site, and, with the exception of clinical studies where serial biopsies can help to validate new outcome measures of skin fibrosis, it cannot be used as a systematic method to assess skin fibrosis in all patients over time. Most importantly, the quantitative assessment of HYP content entails an inevitable site bias, and no studies have been performed to validate the HYP content at one site as a measure of overall skin fibrosis in SSc.

Soluble Indicators of Skin Fibrosis

A surrogate method to quantify skin fibrosis is the measurement of soluble biomarkers (i.e. in serum or urine) that correlate with the severity and extent of skin involvement. Many candidates have been proposed in recent decades and have already been extensively reviewed [21, 22, 23••, 24]. Cytokines, chemokines, growth factors, circulating collagen fragments, non-collagenous ECM constituents, matrix metalloproteinases, and their inhibitors have been reported to correlate with the extent and amount of skin fibrosis. While referring the reader to the cited reviews, we will focus on a few recently developed and novel biomarkers of skin fibrosis.

One of the recently studied and most promising biomarkers, closely correlated with the severity of skin fibrosis, is the cartilage oligomeric matrix protein (COMP), a protein shown to be produced by skin fibroblasts in patients with SSc. It can be detected in serum samples and its levels correlate with skin involvement as measured by the mRSS, directly with skin thickness and inversely with skin echogenicity as measured by ultrasound. Moreover, its levels change over time according to changes in mRSS [25], and, early in disease, they are a predictor of mortality in SSc patients [26•]. Indeed, increased expression of COMP was previously demonstrated in SSc skin samples and in fibroblasts cultured from these samples [27, 28]. Moreover, gene expression of COMP, as part of the four-gene biomarker in SSc skin biopsies, significantly correlated with skin fibrosis [29]. The four-gene biomarker, including two transforming growth factor- and two interferon-inducible genes, namely COMP, thrombospondin-1, interferon-inducible 44, and sialoadhesin, was validated against mRSS for absolute score and for its sensitivity to detect change in mRSS over time. In this respect, this signature remains one of the very few examples of a measure with obvious face validity and a validated sensitivity to change [22]. However, the invasiveness of skin sampling represents an obstacle to the applicability of this composite biomarker of skin fibrosis in clinical practice.

A novel composite marker of overall fibrotic activity in SSc, mainly correlating with skin fibrosis as assessed by mRSS, is the Enhanced Liver Fibrosis (ELF) test [30••]. It is an algorithm of three serum biomarkers, namely amino-terminal propeptide of procollagen type III (PIIINP), tissue inhibitor of matrix metalloproteinase-1 (TIMP-1), and hyaluronic acid (HA), previously shown to predict liver-related outcomes in patients with chronic liver diseases, and it has recently been implemented in a CE-marked quality controlled test for use in patients. Our group recently demonstrated that the ELF score is significantly higher in SSc patients with evidence of skin fibrosis and that it correlates with the degree of skin involvement as assessed by mRSS and with the severity of skin involvement as assessed by Medsger’s scale. Sub-analysis of ELF components indicated that all three biomarkers showed a significant correlation with mRSS [30••]. However, the ELF score reflects overall fibrotic activity in SSc and is therefore influenced by internal organ fibrosis. Further studies are needed to determine the sensitivity of the ELF score to change over time, to assess the predictive value of the ELF score in the development of fibrotic involvement, and, ideally, to develop an SSc-specific algorithm by combining multiple biomarkers of fibrosis.

Novel Imaging Techniques to Quantify Skin Fibrosis

More recently, interest has been focused towards imaging techniques that can directly visualize the skin. These include magnetic resonance imaging (MRI), high frequency ultrasound (HFUS) of skin, elastosonography (ES) and, more recently, optical coherence tomography (OCT).

Magnetic Resonance Imaging

MRI could, in theory, allow skin fibrosis to be quantified by measuring the “thickened” signal of skin [31]. However, this technique has not been used or developed for skin assessment in SSc because, besides its low resolution (of the order of 100 μm), it lacks feasibility: it is time consuming and expensive in clinical practice and, as a consequence, not readily available in all centers.

Ultrasound and Elastosonography

More feasible is high frequency ultrasound (HFUS) of skin, which has been demonstrated to be a reliable tool to measure dermal thickness [32, 33], allowing visualization of the epidermis and dermis at a resolution of up to 30–40 μm. A recent systematic review analyzed 17 papers published between 1955 and 2010 on the use of this technique as an outcome measure of skin involvement in SSc [34•]. As reported by the authors, the majority of articles, using 10–30 MHz frequencies, studied skin thickness only, with just five also measuring echogenicity. With the exception of Ihn et al., who validated HFUS findings against histopathological findings [35], other authors compared HFUS with the current gold standard, the mRSS. The majority of them did not show any correlation between local mRSS and US findings, but some showed a correlation with global mRSS [34•]. The ability of US to distinguish between the three phases of skin involvement (edematous, fibrotic, atrophic) has also been investigated at the digital level, and the authors found a significant correlation between dermal thickness and the clinical phase of skin involvement [33]. HFUS could also identify the edematous phase preceding palpable skin involvement in early disease, thus helping to diagnose very early diffuse skin involvement [36]. A few studies analyzed the sensitivity to change of HFUS [32], but the limited data available need to be supplemented by further studies [34•].

Ch’ng et al. pointed out the need to reach standardization and agreement on the acquisition of images, including machine settings, regions of interest, number and location of sites to be imaged and measured, and definition of skin thickness, and to further investigate the various stages of skin involvement before undertaking future work to assess responsiveness and change [34•].

More recent and less investigated is elastosonography (ES). ES allows the examination of the elastic properties of skin with a color scale superimposed on the gray scale image produced by the conventional US [37, 38, 39••]. The principle behind this technique is that the excessive dermal deposition of collagenous and non-collagenous ECM causing fibrosis reduces skin elasticity. Specific color patterns of dermis have been identified in SSc patients compared with healthy subjects, but several aspects need to be further confirmed and studied [37]. ES could also improve the reliability of conventional HFUS [38]. Of interest, new generation ES (shear wave) may provide a quantitative and operator independent assessment of dermal properties and surely deserves further investigations [39••].

It is envisaged that imaging by HFUS of all the 17 sites used to assess the mRSS could be very time-consuming in clinical practice, questioning the feasibility of this technique. Future studies should clearly analyze whether a few sites could be representative of the total severity of skin fibrosis or whether assessing more sites could add any benefit.

Furthermore, HFUS does not overcome the obstacle of the specific training required for the current gold standard, since it also requires a certain degree of experience. On the other hand, HFUS seems to be more reliable than mRSS, although this needs to be confirmed in large studies. An additional limitation of HFUS is that conventional transducers, whilst allowing skin thickness to be measured, do not operate at a high enough frequency, and therefore do not provide the image resolution needed, to depict the finer superficial structures of the skin.

Optical Coherence Tomography

Over the last 20 years, OCT has been one of the most innovative developments in medical imaging [40]. It was first introduced in ophthalmology in 1991 [41], and it appeared to be a promising imaging technique in several fields of medicine, particularly for endoscopic applications [42]. Nowadays, it is a clinical standard in the diagnosis and follow-up of many eye diseases, and it is expected to become a clinical standard in cardiology in the near future [43].

The first application in dermatology occurred in 1997 [44]. OCT is currently used for research purposes to assess psoriasis, contact dermatitis, cutaneous lupus erythematosus, blistering diseases, vascular skin lesions, infections, melanoma and non-melanoma skin cancer [42]. In the rheumatology field, it has historically been used to investigate early articular cartilage degeneration [45] and more recently to assess nail disease in psoriatic arthritis [46, 47]. Furthermore, the use of OCT technology for quantification of skin fibrosis is in the formative stages and a tremendous growth potential has been foreseen, similar to the ultrasound development paradigm that has evolved over the past 30 years [48••].

OCT works analogously to an ultrasound scanner; however, it measures echo delays and the intensity of back-reflected infrared light rather than acoustic waves. The common depth resolution is in the order of 5–10 μm, although systems with ultrahigh resolution of about 1 μm have been developed [42]. OCT has appeared to be a promising tool in studying skin diseases in vivo [42]. OCT identifies characteristic microscopic features in the skin, which would otherwise have only been obtained from histological sections [49]. This characteristic has suggested the potential usefulness for skin assessment in SSc [50••].
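The pulse-echo analogy can be made concrete with a small calculation: for a reflector at a given depth, the round-trip delay is twice the depth divided by the wave speed in tissue. The sketch below is illustrative only. The skin refractive index (1.4) and the soft-tissue speed of sound (1540 m/s) are assumed round values, not figures taken from the studies cited here, and the function names are hypothetical.

```python
# Illustrative sketch only: constants marked "assumed" are not from the cited studies.
C_VACUUM_M_PER_S = 299_792_458.0      # speed of light in vacuum (exact by definition)
SKIN_REFRACTIVE_INDEX = 1.4           # assumed round value for skin
SOUND_SOFT_TISSUE_M_PER_S = 1540.0    # assumed conventional value for soft tissue


def round_trip_delay_s(depth_um, wave_speed_m_per_s):
    """Round-trip echo delay (s) for a reflector at depth_um micrometres."""
    if depth_um < 0 or wave_speed_m_per_s <= 0:
        raise ValueError("depth must be >= 0 and wave speed must be > 0")
    return 2.0 * depth_um * 1e-6 / wave_speed_m_per_s


def echo_depth_um(delay_s, wave_speed_m_per_s):
    """Reflector depth (micrometres) recovered from a round-trip echo delay."""
    if delay_s < 0 or wave_speed_m_per_s <= 0:
        raise ValueError("delay must be >= 0 and wave speed must be > 0")
    return wave_speed_m_per_s * delay_s / 2.0 * 1e6


# Light travels more slowly in tissue than in vacuum by the refractive index.
light_speed_in_skin = C_VACUUM_M_PER_S / SKIN_REFRACTIVE_INDEX

# Same reflector, 1 mm deep, probed with light and with sound.
optical_delay = round_trip_delay_s(1000.0, light_speed_in_skin)
acoustic_delay = round_trip_delay_s(1000.0, SOUND_SOFT_TISSUE_M_PER_S)
print(f"optical: {optical_delay:.3e} s, acoustic: {acoustic_delay:.3e} s")
```

Under these assumptions, a reflector 1 mm deep returns light after a delay on the order of ten picoseconds but returns sound after about a microsecond, which is why OCT infers echo delays by interferometry rather than timing them directly as an ultrasound scanner does.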

While this technique is able to provide a higher resolution, it has a lower penetration depth (2 mm) than HFUS. Nevertheless, it is a suitable tool to study diseases affecting the superficial skin layers.

The capability of OCT to perform “optical biopsy” in situ in real time with unprecedented resolution makes it a promising non-invasive imaging modality for the visualization and interpretation of microstructural information of different types of tissues [51].

Previous studies have looked into structural details of normal skin using OCT [44, 49, 52, 53]. It has also been demonstrated that OCT images correspond to histology [54, 55]. The normal skin appears as a layered structure. The first distinguishable layer is the stratum corneum, only visible in the skin of the palms and soles. It appears as a dense, homogenous low-scattering band. The epidermis (ED) is usually less signal intense than the dermis. Dermis shows signal-poor cavities corresponding to blood vessels [42].

In 2008, Mogensen et al. [49] described the qualitative morphology of normal skin in various locations of the body using an OCT system able to record polarization-sensitive (PS)-OCT images in parallel with standard OCT images. PS-OCT is able to represent birefringent tissue in skin, such as collagen. The study indicated that OCT can be used for both the qualitative and quantitative assessment of skin. The authors showed that normal skin has a layered structure, less pronounced in adults than in children, as assessed by OCT. The PS-OCT images showed a well-demarcated difference between ED, papillary dermis (PD), and reticular dermis (RD). The actual dermo-epidermal junction (DEJ) could not be as easily identified in PS-OCT images as it can in regular OCT images. This was explained by the architecture of the collagen in the PD, which, due to its loose structure, is less birefringent than that in the RD. In addition, they found that the epidermal thickness decreases with age and is not gender- or skin-type-related.

Our group was the first to study SSc skin imaged by OCT [50••]. No studies had previously been published on the use of OCT in this condition, though some data are available on localized scleroderma [52]. The authors of that study described the morphology of skin in localized scleroderma, normal skin, and other skin pathologies, performing parallel histological and tomographical qualitative studies. In the OCT images of the edematous stage of localized scleroderma, the DEJ was poorly differentiated and large poorly backscattering regions with indistinct and uneven borders were visible in the dermis. The corresponding histology showed diffuse inflammatory infiltrates in the upper dermis, considered responsible for the aspect of the DEJ and edema and morphologically corresponding to the poorly backscattering areas [52].

Our group studied the appearance of SSc skin in 21 patients with different degrees of skin involvement, 1 morphea patient, and 22 healthy controls (HC), using a Swept-Source OCT system [50••]. We compared the findings with histology from 3 skin biopsies, correlated them with the mRSS, and assessed intra- and inter-observer reliability.

In healthy skin, the ED appeared as a hypo-reflective layer compared to the underlying PD. The different reflective properties allowed the easy identification of the DEJ. The RD presented as a hypo-reflective region below the PD. Blood vessels were visible in the PD and RD as signal-poor cavities. In severely involved SSc skin, the ED was visualized as a homogeneous textured layer and appeared less hypo-reflective than in normal skin. Visualization of the DEJ was difficult. There was no clear distinction between PD and RD. Only rare vessels were visualized compared with normal skin (Fig. 1) [50••]. Comparison of OCT images with corresponding skin histology indicated a progressive loss of visualization of the DEJ associated with dermal fibrosis. Furthermore, SSc-affected skin showed a consistent decrease of optical density (OD) in the PD, which was progressively more pronounced in patients with higher mRSS. Additionally, clinically unaffected skin was also distinguishable from healthy skin by its specific pattern of OD decrease in the RD. OCT analysis of affected and unaffected skin in a patient with plaque morphea showed patterns similar to those of severe SSc and HC, respectively. In addition, the technique showed excellent intra- and inter-observer reliability [50••].

Fig. 1

Virtual biopsy of forearm skin by optical coherence tomography. Representative 3D reconstruction from the tomography of healthy and systemic sclerosis (SSc) (site-modified Rodnan skin score = 3) skin scans. The keratin of the skin appears as a white line on the surface (k). The epidermis (ED) is quite visible in the healthy skin by the contrast with the increased optical density of the papillary dermis (PD). The dermal–epidermal junction (DEJ) is quite visible in the healthy skin between the ED and PD. In contrast, neither clear distinction of ED and PD nor DEJ is appreciable in the SSc skin. The vessels (*) are numerous and very well recognizable in healthy skin, whereas they appear less numerous and less distinct in the OCT image of SSc skin. Total depth of 3D reconstruction = 1.2 mm. Scale bars are calculated by ImageJ. (Reproduced from Ref. [50••], copyright 2013, with permission from BMJ Publishing Group Ltd)

However, these preliminary findings need to be confirmed in longitudinal studies including a larger number of patients with different degrees of fibrosis. Because of the novelty of OCT use in SSc and of the very recent application in a single center study, there are currently several limitations to its applicability ranging from the cost of the machine, the lack of standardization of number and sites to study, to the lack of evidence of sensitivity to change over time.

Although such a system is not yet available for daily use, if future studies show the usefulness and the validity of this technique to assess SSc skin involvement, many advantages can be expected: (1) the resolution, up to 50 times higher than US, allows for more detailed structural information; (2) the maximum imaging depth provided by OCT is sufficient to examine the skin; (3) OCT complements other imaging techniques, covering, in resolution and penetration, the gap between high resolution optical microscopy techniques (e.g., confocal microscopy) and techniques with long penetration depth (e.g., ultrasound imaging) [43]; (4) it is capable of in situ 2D and 3D skin imaging; (5) it provides clear, real-time in vivo images of skin microstructure for instant visual feedback; (6) OCT devices are compatible with computer systems and the captured video data can be instantly replayed for review or stored for further or centralized operator-independent analysis, limiting the ‘hands-on’ time in the clinic office and allowing a centralized, blinded assessment of results in clinical trials [50••]; (7) it provides a high rate of data acquisition, minimizing errors caused by involuntary movements of the patient and the operator; (8) it is a non-invasive technique and does not cause trauma; (9) the OCT system is safe since it does not produce harmful ionizing radiation; (10) the scanning can be repeated at the same site and performed at different anatomical sites; (11) the latest OCT generation uses a hand-held OCT probe that is easy to handle and is applied directly to the skin, without use of gel; (12) the scanning is very fast, lasting a few seconds; (13) the technique requires minimal operator training; (14) OCT devices are relatively inexpensive; and (15) it is able to provide information on the functional state of tissues through extensions such as Doppler flow measurement, suitable to study and monitor changes in blood flow dynamics and vessel structure, and polarization-sensitive OCT, based on measuring the polarization properties of light collected from birefringent samples, in particular those with a significant collagen content such as the skin [42, 49].

In the near future, the improvement of such systems with a higher resolution may allow for skin study at the cellular level, making OCT a required complementary tool for assessing and monitoring skin fibrosis in SSc, especially in the course of treatment.

Conclusions

The research efforts devoted to quantifying skin fibrosis in SSc are clear indirect proof of the unmet need in the field. The skin is a complex, massive organ, extremely hard to characterize in its entirety but, at the same time, extremely accessible. The developments in this field in the past few years are certainly promising. We envisage that an integrated composite outcome measure, comprising clinical and instrumental evaluation, may be a plausible direction to pursue with the aim of achieving a reliable and quantitative tool to measure skin fibrosis.