Introduction

Histopathological analysis of the skin can be used to assess and diagnose various diseases.12 Further, histology images of tissue samples can provide information on the tissue environment.8 Such images contain information about tissue layer boundaries (e.g., dermis, subcutaneous), tissue features (e.g., capillaries, venules), and matrix composition (e.g., collagen, hyaluronic acid). Combining the structural and mechanical properties of skin tissue samples can provide useful information for various applications, including transdermal drug delivery techniques,8 cancer diagnosis, evaluating skin lesions, and wound healing and scarring. However, such quantification requires accurate annotation of histology images. Manual annotations, measurements, and segmentations are time-consuming and lead to high inter- and intra-user variability.6 Automated methods can enable objective and rapid evaluation of tissue samples, yet few such methods have been reported.5,6

Identification of tissue layer boundaries can provide details on the anatomical structure of skin samples to develop structural and computational models.8 Previous methods have focused on the segmentation of tissue layers. Typically, layer segmentation methods use either an edge detection method or a histogram-based approach.6 Edge or contour detection algorithms aim to identify discontinuities in the image and often use Otsu thresholding.6,13,14 For example, Osman used an active contours-based method where images are converted from RGB to the HSV color space, and Otsu thresholding is used to create a binary image to identify the epidermis and dermis layers as well as adipocytes.13 Similarly, Hussein et al. and Haggerty et al. used a color pre-processing step such as adaptive color deconvolution coupled with Otsu thresholding to segment the skin layers.6,7 Meanwhile, histogram-based algorithms aim to create sets of region-based information to inform the segmentation. For example, Chen et al. proposed using neighborhood pixel information coupled with a support vector machine classifier for each pixel to yield a generalized segmentation approach.3 Kleczek et al. developed an automated method for epidermis segmentation using shape information and distribution of transparent regions.10 Diamond et al. extracted texture properties from 100 × 100 pixel neighborhood regions and classified regions using training data obtained from manual segmentations.4 Because tissue boundaries can be subtle in the image, without significant edges, local histograms often outperform edge-based feature detection.11,12

While these methods are useful for their specific purposes, many challenges have hindered the development of generalized automated tissue layer analysis methods. Using human skin samples would provide the most relevant information for these models, but such samples are difficult to obtain.9 Porcine skin provides a high fidelity alternative, but limited studies have reported pig skin automated layer segmentation. Further, different stains of the tissue samples can produce varying color distributions within the image, presenting a significant challenge for methods that rely on color-space information or thresholding. Automated algorithms can also fail because tissue samples can tear apart in processing, resulting in blank and overlapping regions within the image.12 Furthermore, only small numbers of samples are often available, essentially limiting robust training datasets for machine learning-based methods.

In this work, we present an unsupervised tissue layer segmentation algorithm for use with histology images. Our method was designed for use with porcine tissue samples but does not use pig-specific information so that it is adaptable to other species samples. Further, it does not require any training data and the only user input is the number of layers to be identified. The algorithm combines neighborhood-based histograms of pixel intensity, K-means clustering, and Dijkstra’s algorithm to identify the dermis, subcutaneous, and muscle layers in porcine tissue samples. Subsequently, a model-fit step is used to refine the identified boundaries and segment the epidermis. We test this method using eight H&E stained histology images of tissue samples taken from different anatomical locations of the pig. Finally, we demonstrate the utility of the method at assessing the tissue environment by using the identified layer boundaries to evaluate layer thicknesses as a function of anatomical location.

Material and Methods

Proposed Algorithm

A primary challenge of tissue layer identification methods in histology images is that each distinct layer spans similar pixel intensity ranges. Thus, single pixel intensity information alone cannot be used to determine the layer to which the pixel belongs. However, if a local neighborhood of pixels (e.g., a 100 × 100 pixel window) is considered, the pixel intensity distribution can differ depending on which layer the neighborhood was extracted from. Thus, the neighborhood histograms can readily be used to inform a clustering-based classifier. The drawback of this approach is that it is computationally expensive if the image is not subsampled, but subsampling yields coarser boundaries. In addition, the neighborhoods with the highest classification uncertainty will be those that span two different tissue layers, which are the most critical for determining the layer boundaries.

For this reason, the proposed algorithm incorporates two primary steps: a coarse boundary identification step using neighborhood histogram clustering, followed by a detailed boundary refinement step. The boundary refinement step also identifies the epidermis. The algorithm requires the user to input the histology image to analyze as well as the number of layer boundaries to be identified. The boundary refinement step and the identification of the epidermis add computational cost and complexity; thus, we also provide the option to use the ‘cluster-change boundary lines’ in cases where less detailed boundaries are acceptable. Figure 1 shows a schematic of the coarse boundary identification step, including the cluster-change boundary result. Figure 2 shows a schematic of the boundary refinement step and Fig. 3 shows the final boundary with refinement. The method was implemented in MATLAB.

Figure 1

Schematic of the proposed tissue layer segmentation method. (a) Raw histology image. (b) Pre-processed image. (c) Sample extracted sub-tiles from the dermis, subcutaneous, and muscle layers and (d) the ten-bin histograms of the sample sub-tiles. (e) Initial K-means clustering labels and (f) connected components refined labels. (g) Layer boundary line identified using Dijkstra’s algorithm. (h) Cluster-change boundary lines before (blue) and after (red) Dijkstra’s algorithm.

Figure 2

Schematic of the refinement step of the proposed algorithm. (a) Evaluation of perpendicular lines at each boundary point to limit the image to a more specific region of interest and (b) K-means clustering of the pixel intensity within the region of interest. (c) Boundary identified between the two main clusters identified. Epidermis identification step including (d) the pixel intensity extracted along the perpendicular line and (e) the model fit to identify the end of the epidermis. (f) Final epidermis boundary line. Hair follicle identification step including (g) a coarse and fine boundary around the clusters from (b), and (h) the identified hair follicle.

Figure 3

The final refined layer boundaries for the dermis, subcutaneous, and muscle layers (blue) and epidermis (red).

Histology Image Pre-Processing

Figure 1a shows a sample histology image and demonstrates the typically low contrast of histology images. Thus, a contrast enhancement step is used to pre-process the images and is performed according to:

$${I}_{\mathrm{CE}}=255*\frac{\left(I- {T}_{1}\right)}{\left({T}_{2}-{T}_{1}\right)},$$
(1)

where \({I}_{\mathrm{CE}}\) is the contrast enhanced image, \(I\) is the raw image, and \({T}_{1}\) and \({T}_{2}\) are constant values. After applying Eq. 1, any pixel values less than zero are set to zero, and any pixel values greater than 255 are set to 255. \({T}_{1}\) and \({T}_{2}\) are set as the 1st and 99th percentile pixel intensity of the raw image, respectively. The images are then smoothed with a 9 × 9 pixel median filter. Each color channel is pre-processed individually and independently of the other channels. To reduce the computational cost, the images are also sub-sampled by a factor of 2. Figure 1b shows a sample pre-processed image.
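As an illustration of Eq. 1, the following is a minimal Python sketch of the contrast stretch and sub-sampling for a single color channel. It is not the MATLAB implementation used in this work. The nearest-rank percentile definition and the function names are assumptions made for the example, and the 9 × 9 median filter is omitted for brevity.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100); an assumed definition, others exist."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def contrast_enhance(channel, low_q=1, high_q=99):
    """Apply Eq. 1 to one channel (list of rows), clipping the result to [0, 255]."""
    flat = [p for row in channel for p in row]
    t1, t2 = percentile(flat, low_q), percentile(flat, high_q)
    if t2 == t1:  # constant image: nothing to stretch
        return [list(row) for row in channel]

    def stretch(p):
        return min(255.0, max(0.0, 255.0 * (p - t1) / (t2 - t1)))

    return [[stretch(p) for p in row] for row in channel]


def subsample(channel, factor=2):
    """Keep every `factor`-th pixel along both image axes."""
    return [row[::factor] for row in channel[::factor]]


# Small synthetic channel with intensities 0..99
channel = [[10 * r + c for c in range(10)] for r in range(10)]
enhanced = contrast_enhance(channel)
small = subsample(enhanced)
```

In this sketch each color channel would be passed through `contrast_enhance` separately, matching the per-channel processing described above.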

Coarse Boundary Identification

The coarse boundary identification is done using histograms of the pixel intensity of windowed sub-tiles of the image. K-means clustering is then used to optimally classify each tile such that tiles with similar histograms belong in the same cluster while tiles of differing histograms belong in different clusters. To compute the neighborhood histograms, the image must first be segmented into \(N\) sub-tiles. Herein, we use sub-tiles of size 125 × 125 pixels for pre-processed images with an average size of approximately 5,000 × 2,500 pixels. Other window sizes were tested, and it was found that windows of size less than 125 × 125 pixels could yield erroneous boundaries at times. Boundary results were steady with increasing window size, but larger windows lead to coarser boundaries, so a smaller window is desirable. No window overlap was used to reduce computational cost. A ten-bin histogram of the pixel intensity of the blue-color channel within each tile is then computed. The blue channel was used because it contained slightly higher contrast across the different stains used herein (i.e., Alcian Blue stain and Masson’s trichrome), as compared to other color channels. In general, for the proposed method, only one color channel is needed because the method does not depend on color information. Indeed, for the images tested herein, similar boundaries were obtained whether using the blue, red, or green color channel. For other applications, the use of other color channels, a magnitude image, or all color channels should be explored. Figure 1c demonstrates sample image tiles and Fig. 1d shows the corresponding ten-bin histograms from the dermis, subcutaneous, and muscle layers. Figure 1d highlights that the histograms differ distinctly depending on the layer the tile was extracted from.
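The tiling and histogram computation can be sketched as follows in Python. This is an illustrative sketch only. Equal-width bins over 0 to 255, normalization of each histogram to sum to one, and discarding partial tiles at the image edges are assumptions that the text does not specify.

```python
def tile_histograms(channel, tile=125, bins=10, max_value=255):
    """Return {(tile_row, tile_col): normalized histogram} for non-overlapping tiles.

    `channel` is a single color channel given as a list of rows.
    Partial tiles at the right and bottom edges are skipped (assumption).
    """
    n_rows, n_cols = len(channel), len(channel[0])
    hists = {}
    for r0 in range(0, n_rows - tile + 1, tile):
        for c0 in range(0, n_cols - tile + 1, tile):
            counts = [0] * bins
            for row in channel[r0:r0 + tile]:
                for p in row[c0:c0 + tile]:
                    # Equal-width bins over [0, max_value]
                    b = min(bins - 1, int(p * bins / (max_value + 1)))
                    counts[b] += 1
            total = tile * tile
            hists[(r0 // tile, c0 // tile)] = [c / total for c in counts]
    return hists


# Tiny example with 2 x 2 tiles instead of 125 x 125
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 30, 30],
    [100, 100, 30, 30],
]
hists = tile_histograms(image, tile=2)
```

Each histogram would then serve as the feature vector of its tile in the K-means clustering step.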

K-means clustering is used to classify the tile histograms. The optimal number of clusters is not known a priori; thus, K-means is run using 2, 3, 4, and 5 clusters. The optimal number of clusters is autonomously determined for each image according to:

$$\mathrm{max}\left\{\frac{1}{NC}*\frac{{nSil}^{4}}{{nSSE}^{0.3}}\right\} \;\mathrm{for} \;2\le NC\le 5,$$
(2)

where \(NC\) is the number of clusters, \(nSil\) is the normalized Silhouette value, and \(nSSE\) is the normalized sum-squared error (SSE) of the final cluster labels. \(nSil\) and \(nSSE\) are each normalized by their maximum value across clusters. The Silhouette value is computed according to:

$$Sil= \frac{({dist}_{IC}-{dist}_{NC})}{\mathrm{max}\left\{{dist}_{IC}, {dist}_{NC}\right\}},$$
(3)

where \({dist}_{IC}\) is the average distance between the current point being evaluated and all other points in the same cluster and \({dist}_{NC}\) is the average distance between the current point and the nearest-neighboring cluster center. Because the Silhouette value calculation is computationally expensive, it is only computed for fifty randomly sampled sub-tiles. Testing was done to confirm that this limited Silhouette score sample size did not affect the optimal number of cluster determination. SSE was computed by:

$$SSE= \sum_{i=1}^{N}\sqrt{{\left({X}_{i}- {X}_{i}^{cent}\right)}^{2}+{\left({Y}_{i}- {Y}_{i}^{cent}\right)}^{2}},$$
(4)

where \(({X}_{i}, {Y}_{i})\) is the center point of the ith sub-tile and \(({X}_{i}^{cent}, {Y}_{i}^{cent})\) is the location of the ith sub-tile’s cluster centroid. Figure 1e illustrates the initial cluster labels for the sample image. For this image, three clusters were determined to be optimal. It can be observed in Fig. 1e that the initial labeling contains some outliers, since it is known that each segmented label should be a single connected region of sub-tiles. Thus, connected components are used to adjust tile labels, as demonstrated in Fig. 1f.
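Equations 2–4 can be sketched in Python as shown below. This is a hedged illustration rather than the in-house MATLAB code. The function names are hypothetical, \({dist}_{NC}\) is taken as the distance to the closest centroid of any other cluster, and the normalization in Eq. 2 is assumed to divide by the maximum value over the tested cluster counts. The averaging of Silhouette values over the fifty sampled sub-tiles is left out.

```python
import math


def silhouette_eq3(point, same_cluster, other_centroids):
    """Eq. 3 as written in the text, for one point in feature space."""
    dist_ic = sum(math.dist(point, q) for q in same_cluster) / len(same_cluster)
    dist_nc = min(math.dist(point, c) for c in other_centroids)
    denom = max(dist_ic, dist_nc)
    return 0.0 if denom == 0 else (dist_ic - dist_nc) / denom


def sse_eq4(tile_centers, labels):
    """Eq. 4: summed distance from each tile center to its cluster's spatial centroid."""
    groups = {}
    for xy, lab in zip(tile_centers, labels):
        groups.setdefault(lab, []).append(xy)
    centroids = {
        lab: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for lab, pts in groups.items()
    }
    return sum(math.dist(xy, centroids[lab]) for xy, lab in zip(tile_centers, labels))


def best_cluster_count(sil_by_nc, sse_by_nc):
    """Eq. 2: choose NC maximizing (1/NC) * nSil**4 / nSSE**0.3.

    Assumes positive SSE values and normalization by the maximum over tested NC.
    """
    max_sil = max(abs(s) for s in sil_by_nc.values())
    max_sse = max(sse_by_nc.values())

    def score(nc):
        n_sil = sil_by_nc[nc] / max_sil
        n_sse = sse_by_nc[nc] / max_sse
        return (n_sil ** 4) / (n_sse ** 0.3) / nc

    return max(sil_by_nc, key=score)


# Made-up Silhouette and SSE values for NC = 2..5, for illustration only
sil = {2: 0.5, 3: 0.8, 4: 0.6, 5: 0.55}
sse = {2: 100.0, 3: 60.0, 4: 50.0, 5: 45.0}
chosen = best_cluster_count(sil, sse)
```

The \(1/NC\) factor and the small exponent on \(nSSE\) penalize larger cluster counts, since the SSE alone tends to decrease as more clusters are added.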

The number of boundary lines to be identified is known from the user-input. The boundary lines of each layer are known to exist near the change points between different cluster labels. However, the sub-tile evaluation yields a coarse ‘cluster-change’ boundary line, as seen by the blue lines in both Figs. 1g and 1h. Thus, Dijkstra’s algorithm is used to produce a finer boundary (red lines in Figs. 1g and 1h). Dijkstra’s algorithm is a graph-theory based method that determines the lowest cost path between a start and end node. In this application, the tissue layers are expected to be primarily horizontal in the image such that no image transform is needed to use Dijkstra’s algorithm here. Only points within one sub-tile length (i.e., 125 pixels) away from the cluster change line are included as nodes for Dijkstra’s algorithm.

The cost assigned to each node is the image gradient magnitude at that point. The image gradient was calculated using the built-in MATLAB function (imgradient). The node costs are normalized by the average cost across all nodes, and then squared. To reduce the computational cost, only the lowest cost node within a 4-pixel neighborhood is kept and used for Dijkstra’s algorithm. Further, the node costs are sorted in ascending order, the elbow of the sorted cost curve is identified using a two-line fit method,2 and all nodes with a cost beyond the elbow are discarded. Because the nodes for Dijkstra’s algorithm are not on a structured grid, a distance limit is needed to define the neighbors of each node, but the optimal distance limit value is not known a priori. A low distance limit can cause Dijkstra’s algorithm to fail if large jumps in the boundary exist, while a high distance limit can lead to the algorithm jumping over and clipping parts of the boundary. Thus, a distance limit of 50 pixels is initially used, and this limit is increased in increments of 5 pixels until Dijkstra’s algorithm successfully finds a path. Figure 1g illustrates the cost matrix of nodes and the final Dijkstra’s algorithm path identified for the upper edge of the tissue. Figure 1h demonstrates the boundaries produced by Dijkstra’s algorithm for the dermis, subcutaneous, and muscle layers of the sample image. The boundaries shown in Fig. 1h are the cluster-change tissue boundaries. In-house written functions (as opposed to built-in MATLAB ones) were used for both K-means clustering and Dijkstra’s algorithm.
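The path search over unstructured nodes with an adaptive distance limit can be sketched as follows. This Python sketch makes several assumptions not stated in the text: the path cost is the sum of the costs of the visited nodes, the start and end nodes are the leftmost and rightmost nodes, and `max_limit` is a hypothetical safeguard so that the loop terminates. Node filtering (the 4-pixel neighborhood and elbow steps) is not shown.

```python
import heapq
import math


def dijkstra_nodes(nodes, start, goal, limit):
    """Lowest summed-node-cost path over nodes [(x, y, cost), ...].

    Two nodes are neighbors when they lie within `limit` pixels of each other.
    Returns a list of node indices, or None if the goal cannot be reached.
    """
    best = {start: nodes[start][2]}
    prev = {}
    heap = [(best[start], start)]
    done = set()
    while heap:
        cost, i = heapq.heappop(heap)
        if i in done:
            continue
        done.add(i)
        if i == goal:
            path = [i]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        for j, (x, y, c) in enumerate(nodes):
            if j in done or math.dist(nodes[i][:2], (x, y)) > limit:
                continue
            if cost + c < best.get(j, math.inf):
                best[j] = cost + c
                prev[j] = i
                heapq.heappush(heap, (best[j], j))
    return None


def boundary_path(nodes, limit=50, step=5, max_limit=500):
    """Grow the neighbor distance limit in `step`-pixel increments until a path exists."""
    start = min(range(len(nodes)), key=lambda i: nodes[i][0])
    goal = max(range(len(nodes)), key=lambda i: nodes[i][0])
    while limit <= max_limit:
        path = dijkstra_nodes(nodes, start, goal, limit)
        if path is not None:
            return path, limit
        limit += step
    return None, None


# Node 2 is expensive, so the path detours through node 3
nodes = [(0, 0, 1.0), (40, 0, 1.0), (80, 0, 5.0), (80, 30, 1.0), (120, 0, 1.0)]
path, used_limit = boundary_path(nodes)
```

Starting with a small limit and growing it only on failure favors paths that follow the boundary closely, which reflects the trade-off between low and high distance limits described above.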

Boundary Refinement and Epidermis Identification

While the use of Dijkstra’s algorithm mitigates the coarseness of the boundary, some undesired boundary jumps can persist. For example, in Fig. 1h the boundary identified between the dermis and the upper tissue edge jumps and skips part of the tissue near the left-hand side, as denoted by the black arrow. In addition, the boundary between the dermis and subcutaneous layers contains errors due to the presence of a hair follicle, as denoted by the orange arrow. Thus, a boundary refinement step (Fig. 2) is incorporated to further improve the boundary accuracy and identify the epidermis layer.

For each point along the boundary, a perpendicular line of a total length of 80 pixels (approximately 2/3 the sub-tile size) is drawn and defined between \(-{L}_{p}\) to \(+{L}_{p}\), as shown in Fig. 2a. For the refinement step, the region of interest (ROI) is limited to only pixels that fall on at least one perpendicular boundary line. The pixels within the ROI are then clustered using K-means with three clusters, which represent: (1) the area outside the ROI, (2) the upper region (ROIupper), and (3) the lower region (ROIlower). Figure 2b shows the labeled ROI. For the upper boundary line that separates the dermis and region above the tissue (\({B}_{\mathrm{out},\mathrm{derm}}\)), the algorithm follows the steps shown in Figs. 2c–2f. For the boundary separating the dermis and subcutaneous layers (\({B}_{\mathrm{derm},\mathrm{subq}}\)), the algorithm proceeds to the steps shown in Figs. 2g and 2h. For the boundary line between the subcutaneous and muscle layers (\({B}_{\mathrm{subq},\mathrm{muscle}}\)), no refinement is done.
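The construction of a perpendicular line at a boundary point can be sketched as below. This is an illustrative Python sketch under assumptions: the local tangent is estimated from the neighboring boundary points by finite differences, \({L}_{p}\) is taken as 40 pixels (half of the 80-pixel total length), and the sign convention for which side is \(-{L}_{p}\) is arbitrary here. The boundary must contain at least two distinct points.

```python
import math


def perpendicular_line(boundary, i, half_length=40):
    """Pixel coordinates along the normal at boundary[i], from -L_p to +L_p.

    `boundary` is an ordered list of (x, y) points.
    """
    x0, y0 = boundary[max(i - 1, 0)]
    x1, y1 = boundary[min(i + 1, len(boundary) - 1)]
    tx, ty = x1 - x0, y1 - y0
    norm = math.hypot(tx, ty)
    nx, ny = -ty / norm, tx / norm  # tangent rotated by 90 degrees
    cx, cy = boundary[i]
    return [
        (round(cx + s * nx), round(cy + s * ny))
        for s in range(-half_length, half_length + 1)
    ]


def region_of_interest(boundary, half_length=40):
    """Union of all pixels that fall on at least one perpendicular line."""
    roi = set()
    for i in range(len(boundary)):
        roi.update(perpendicular_line(boundary, i, half_length))
    return roi


flat_boundary = [(0, 10), (1, 10), (2, 10)]
line = perpendicular_line(flat_boundary, 1)
roi = region_of_interest(flat_boundary)
```

For a horizontal boundary the perpendicular lines are vertical, so the ROI reduces to a band of 81 rows centered on the boundary.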

For \({B}_{\mathrm{out},\mathrm{derm}}\), a line separating the upper and lower region clusters is drawn, as shown by the red line in Fig. 2c. For each point along the red line, perpendicular boundary lines are again evaluated from \(-{L}_{p}\) to \(+{L}_{p}\), and the pixel intensity extracted, as shown in Fig. 2d. As illustrated in Fig. 2e, the best fit of the pixel intensity from the perpendicular line half located inside the dermis (from \(-{L}_{p}\) to 0) is then computed. Two distinct parts define the fit: (1) a parabolic best fit from 0 to some change point between 0 and \(-{L}_{p}\), and (2) a constant linear fit from the change point to \(-{L}_{p}\). Every possible change point is evaluated, and the optimal change point is selected to minimize the error between the image pixel intensity, \({P}_{\mathrm{im}}\), and the fit pixel intensity, \({P}_{\mathrm{fit}}\). The error is computed as:

$$\mathrm{fit}\; \mathrm{error}\,=\,2*PkErr+10*LocErr +\frac{1}{N}\sum_{i=1}^{N}{\left({P}_{\mathrm{im},i}- {P}_{\mathrm{fit},i}\right)}^{2},$$
(5)

where \(N\) is the number of points on the line, \(PkErr\) is the difference in pixel intensity at the minimum parabolic peak point between \({P}_{\mathrm{im}}\) and \({P}_{\mathrm{fit}}\), and \(LocErr\) is the difference in the location along the line of the parabolic peak point between \({P}_{\mathrm{im}}\) and \({P}_{\mathrm{fit}}\). The optimal change point denotes the ‘fit epidermis’ edge point in Fig. 2f. If the fit error is sufficiently high (> 500), the fitting process is considered to have failed, and no ‘fit epidermis’ edge point is placed for the corresponding boundary point. The failure error threshold is set high because further steps will remove possible epidermis outlier points. From the initial set of ‘fit epidermis’ edge points, an average epidermis thickness, \({t}_{\mathrm{epi\text{-}fit},\mathrm{avg}}\), can be computed. For boundary points where the fit failed, ‘expected epidermis’ edge points can subsequently be defined as the point along the perpendicular line that is a distance \({t}_{\mathrm{epi\text{-}fit},\mathrm{avg}}\) away from the boundary. Finally, Dijkstra’s algorithm is used to determine the final epidermis line and remove any outlier epidermis boundary points. The ‘fit’ and ‘expected’ epidermis points are used as nodes for Dijkstra’s algorithm, and a 50-pixel distance tolerance between neighboring nodes is allowed. Figure 2f demonstrates the epidermis boundary identified for the sample image region.
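The evaluation of Eq. 5 and the selection of the change point can be sketched as follows. This Python sketch is one possible reading of the text, not the authors' implementation. It assumes that the ‘peak’ is the minimum of each intensity profile, that \(PkErr\) and \(LocErr\) are absolute differences, and that the candidate fitted profiles (parabola plus constant segment) have already been computed elsewhere. `best_change_point` and its `candidate_fits` argument are hypothetical names.

```python
def fit_error(p_im, p_fit, peak_weight=2.0, loc_weight=10.0):
    """Eq. 5: weighted peak-value error, peak-location error, and mean squared error."""
    n = len(p_im)
    i_im = min(range(n), key=p_im.__getitem__)    # location of the image minimum
    i_fit = min(range(n), key=p_fit.__getitem__)  # location of the fit minimum
    pk_err = abs(p_im[i_im] - p_fit[i_fit])
    loc_err = abs(i_im - i_fit)
    mse = sum((a - b) ** 2 for a, b in zip(p_im, p_fit)) / n
    return peak_weight * pk_err + loc_weight * loc_err + mse


def best_change_point(p_im, candidate_fits, fail_threshold=500.0):
    """Pick the change point whose fitted profile minimizes Eq. 5.

    `candidate_fits` maps a change-point index to its fitted profile.
    Returns (change_point, error); change_point is None when the fit fails.
    """
    cp, err = min(
        ((cp, fit_error(p_im, fit)) for cp, fit in candidate_fits.items()),
        key=lambda item: item[1],
    )
    return (cp if err <= fail_threshold else None), err


profile = [10, 4, 2, 4, 10, 10]
shifted_fit = [10, 10, 4, 2, 4, 10]
exact_fit = [10, 4, 2, 4, 10, 10]
cp, err = best_change_point(profile, {3: shifted_fit, 4: exact_fit})
failed_cp, failed_err = best_change_point(profile, {0: [300] * 6})
```

The large weight on the location error penalizes fits whose minimum is displaced along the line, even when the overall squared error is small.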

For \({B}_{\mathrm{derm},\mathrm{subq}}\), a ‘fine’ and ‘coarse’ boundary is drawn around ROIlower, as illustrated in Fig. 2g. A circle is fitted to each individual area between the ‘fine’ and ‘coarse’ boundaries, denoted in yellow in Fig. 2g. The error between the fitted circle mask and the area mask is computed and normalized by the size of the area mask. Any area with a normalized error of less than 0.5 is denoted as a hair follicle, and the area within the hair follicle is added to ROIlower (i.e., added to the subcutaneous layer). Figure 2h shows the identified hair follicle region, marked in blue. A new \({B}_{\mathrm{derm},\mathrm{subq}}\) boundary is then computed as the minimum area boundary encompassing the entire hair follicle adjusted ROIlower. To remove any residual outliers from the boundary, Dijkstra’s algorithm is then used to determine the final \({B}_{\mathrm{derm},\mathrm{subq}}\) boundary. Figure 3 shows the final, refined boundaries segmenting the epidermis, dermis, subcutaneous, and muscle layers for the given tissue sample.
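The circle-based hair follicle test can be illustrated with the Python sketch below. The text does not specify how the circle is fitted or how the mask error is measured, so this sketch assumes a circle centered at the area centroid with the same pixel area as the region, and defines the error as the number of pixels where the two masks disagree, divided by the area size. The 0.5 threshold follows the text.

```python
import math


def circle_fit_error(area):
    """Mismatch between a pixel region and an equal-area circle at its centroid.

    `area` is a set of (x, y) pixel coordinates. The result is the number of
    pixels where the two masks disagree, normalized by the region size.
    """
    n = len(area)
    cx = sum(x for x, _ in area) / n
    cy = sum(y for _, y in area) / n
    radius = math.sqrt(n / math.pi)  # circle with the same pixel area
    r = math.ceil(radius)
    circle = {
        (x, y)
        for x in range(math.floor(cx) - r, math.ceil(cx) + r + 1)
        for y in range(math.floor(cy) - r, math.ceil(cy) + r + 1)
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
    }
    return len(area ^ circle) / n


def is_hair_follicle(area, threshold=0.5):
    """Regions that are close to circular are flagged as hair follicles."""
    return circle_fit_error(area) < threshold


# A round region passes the test; a thin horizontal strip does not
disk = {(x, y) for x in range(-10, 11) for y in range(-10, 11) if x * x + y * y <= 100}
strip = {(x, 0) for x in range(100)}
```

Under these assumptions, elongated regions such as collagen bands produce a large mismatch and are not added to the lower region.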

Porcine Tissue Samples

Eight porcine tissue samples, shown in Fig. 4, were used to test the proposed algorithm. The samples were collected from a single castrated male Yucatan minipig that weighed 33 kg. The tissue was collected from the belly immediately (within 30 min) after euthanasia, following a Purdue University approved IACUC protocol. First, the extraction area was shaved, and a sterile scrub was performed using alternating washes of chlorhexidine and saline. Next, the tissue was extracted using a scalpel and chilled to 4 °C to allow the fat to solidify for sectioning. As shown in the diagram in Fig. 4, samples were collected starting at the nipple (most anterior, medial position) and spaced every 10 cm moving posteriorly and every 9 cm moving laterally (towards the belly). Four samples, denoted as circles with a red outline, were damaged during the extraction process and could not be used for processing. The samples were then fixed using 4% paraformaldehyde (PFA) for 24 h. After fixation, the samples were rinsed with PBS three times for 5 min. The tissues were then paraffin embedded, sectioned, and stained with Masson’s trichrome (MTC) and Alcian blue (AB). All histology images were taken within 1 week after fixation. Serial sections were used to determine the distribution of hyaluronic acid in the tissues. One section was stained with AB only, and the second section was treated with hyaluronidase for 30 min before AB staining. For the remainder of the manuscript, we refer to the AB-only stained section as ‘AB-only’, the section treated with hyaluronidase as ‘Hase + AB’, and the section stained with MTC simply as ‘MTC’.

Figure 4

Histology images of eight excised porcine tissue samples used for algorithm testing. The schematic of the pig shows approximately where the samples were extracted from. Circles with red outlines denote samples that were damaged during extraction and could not be used for processing. The location labels are A: anterior, P: Posterior, M: Medial (breast), L: Lateral (belly).

Results

Figure 5 shows the tissue boundary layers identified by the proposed algorithm across all eight tissue samples. In general, the initial results demonstrate the method’s accuracy across all skin samples and all anatomical locations. Figures 5a, 5d, and 5h demonstrate that the algorithm was robust to hair follicles (denoted by black arrows) in the image. It is also observed that the algorithm primarily struggles in regions where high-density collagen exists near the boundaries, particularly if the collagen is oriented parallel to the layer segmentation. For example, in Fig. 5e, the boundary identified between the dermis and subcutaneous layers on the right side of the tissue sample contains large high-density collagen. The algorithm includes these regions in the dermis, though it is more likely that they should be segmented into the subcutaneous layer.

Figure 5

Refined boundaries identified for the dermis, subcutaneous, and muscle layers (blue) and epidermis (red) for each of the eight tissue samples denoted by (a)–(h). Black arrows denote locations of hair follicles in the tissue sample. The schematic of the pig shows approximately where the samples were extracted from. Circles with red outlines denote samples that were damaged during extraction and could not be used for processing. The location labels are A: anterior, P: Posterior, M: Medial (breast), L: Lateral (belly).

Additionally, in Figs. 5f and 5g, an extended horizontal collagen region closely located to the muscle and subcutaneous layer boundary causes the algorithm to jump and denote the collagen as the boundary rather than the muscle. In Fig. 5b, a horizontal collagen region is observed near the dermis-subcutaneous boundary but slightly farther away. In this case, the algorithm correctly marks the boundary at the dermis edge without including the thick collagen fiber. Figure 5c demonstrates that the algorithm is robust to high-density isolated collagen regions connected to the dermis. Specifically, near the right-hand side of the tissue, the boundary skips past the connected high-density collagen region, marking this in the subcutaneous layer. In general, the boundary between the dermis and outside the tissue sample is the most accurate and robust. For this boundary, no visible collagen is present nearby. The boundary is typically smooth but contains increased fluctuations in samples, such as in Fig. 5d, where high pixel intensity fluctuations exist throughout the dermis.

Figure 6 illustrates the cluster change point boundary lines for three tissue samples (denoted as -i, -ii, and -iii) for images of the tissue samples with (a) AB-only, (b) Hase + AB, and (c) MTC staining treatments. The boundary lines are drawn using the cluster change method along each column. Boundary lines are consistent across the H&E, AB, and MTC stains for all samples, demonstrating that the K-means clustering and cluster change boundary line methods are agnostic to the histology stain and thus the image color information. Further, the method is robust to tears and image dropout, as observed in Figs. 6a-ii and 6c-ii. The boundary identification struggles to segment the muscle layer in Fig. 6c-ii, where only a thin region is present on the left-hand side. Further, Figs. 6b-i and 6b-iii demonstrate that the cluster change boundary method is susceptible to errors caused by hair follicles or other image artifacts.

Figure 6

Cluster-change boundaries (red line) identified for three tissue samples (i, ii, and iii) imaged with (a) ‘AB-Only’, (b) ‘Hase + AB’, and (c) ‘MTC’ stains. Black triangles indicate cluster change points. Open triangles are identified as outliers and not used in the boundary.

A primary utility of this automated tissue layer identification method is that it enables the extraction of layer-specific information. For example, when studying the efficacy of subcutaneous drug delivery techniques, variables describing the tissue environment, such as tissue layer thickness, are of great interest. Thus, we now demonstrate that such variables can be objectively evaluated using our novel method. Figure 7 illustrates the thicknesses of the dermis and subcutaneous layers across all tissue samples, which span different anatomical locations of the pig. Figure 7 also compares the thickness distribution of the tissue layers for both the detailed Dijkstra and basic cluster change boundary identification methods. The thickness of the dermis layer is observed to generally increase from about 1.5 to 2.5 mm, moving anatomically from breast to belly. The dermis layer thickness showed minimal dependence on the anterior/posterior location of the tissue sample. Meanwhile, the subcutaneous layer decreased in thickness from about 5 mm down to about 3.5 mm from breast to belly. Towards the belly, the subcutaneous layer tended to be thicker in the most anterior locations (toward the neck of the pig), then got thinner for more posterior locations, a finding in agreement with a prior study.1 However, towards the breast, the subcutaneous region became thicker moving from anterior to posterior. Considering Figs. 5 and 7 together, broader PDF distributions are observed when the layer boundaries contain more deviations from horizontal, as expected. In general, these trends highlight that tissue layer thickness is influenced by anatomical location. Similar distributions of thickness are observed between the two boundary identification methods. The difference in mean thickness across the two boundary identification methods was on average 6.9% for the dermis and 2.2% for the subcutaneous layers.
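The layer thickness evaluation can be sketched in Python as below. This sketch assumes that thickness is measured vertically, column by column, between two boundaries, which is reasonable only when the layers are close to horizontal. The pixel scale `mm_per_pixel` is a hypothetical parameter, since the image resolution is not stated here.

```python
from statistics import mean, median


def layer_thickness(upper, lower, mm_per_pixel):
    """Column-wise thickness (mm) between two boundaries.

    `upper` and `lower` map column index x to boundary row y (y grows downward).
    Only columns present in both boundaries are used.
    """
    shared = sorted(set(upper) & set(lower))
    return [(lower[x] - upper[x]) * mm_per_pixel for x in shared]


# Illustrative boundaries and an assumed scale of 0.01 mm per pixel
upper_boundary = {0: 10, 1: 12, 2: 11}
lower_boundary = {0: 110, 1: 112, 2: 131, 3: 140}
thickness = layer_thickness(upper_boundary, lower_boundary, mm_per_pixel=0.01)
summary = {"median": median(thickness), "mean": mean(thickness)}
```

The list of per-column values gives the thickness distribution for one layer, from which summary statistics such as the median can be taken.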

Figure 7

Layer thickness distribution computed for the dermis (D) and subcutaneous (SQ) layers for each of the eight tissue samples denoted by (a)–(h). ‘Dij’ indicates boundaries using the refinement step. ‘CC’ indicates boundaries using the cluster-change method. The schematic of the pig shows approximately where the samples were extracted from. Circles with red outlines denote samples that were damaged during extraction and could not be used for processing. The location labels are A: anterior, P: Posterior, M: Medial (breast), L: Lateral (belly).

Figure 8 illustrates the median layer thicknesses of the epidermis, dermis, and subcutaneous layers for all eight tissue samples. The muscle layer was not included in any thickness calculations since the entire layer is not represented in the images. The average epidermis thickness across all eight samples was 0.035 mm. Minimal trends were observed between the epidermis thickness and the anatomical location. The average dermis thickness across the eight samples used here was 1.73 mm. Figure 8b confirms previous observations, demonstrating that the dermis thickness increased from an average of 1.37 mm to 2.25 mm, a 64% increase, for samples moving from breast to belly. Near the breast, the median dermis thickness showed no dependence on the anterior/posterior location. The average subcutaneous thickness across all samples was 3.96 mm. Figure 8c demonstrates that near the breast, the subcutaneous layer thickness increased 47%, from 3.72 to 5.47 mm, moving from anterior to posterior. Anteriorly, the subcutaneous layer increased in thickness from breast to belly, while posteriorly, it decreased in thickness moving from breast to belly. Figure 8d provides the relative thickness of the epidermis, dermis, and subcutaneous layers, as a percentage of the total tissue sample thickness, for each of the eight tissue samples. The relative thickness of the dermis layer increased from an average of 24% to 37% from breast to belly, while the relative thickness of the subcutaneous layer decreased from 76% to 62%. Again, minimal trends in relative thicknesses were observed between anterior and posterior locations. Overall, the layer thickness analysis demonstrates that the dermis and subcutaneous layer thicknesses depend primarily on the breast to belly anatomical location and can vary significantly between locations.

Figure 8

Heatmap of the median thickness of the (a) epidermis (ED), (b) dermis (D), and (c) subcutaneous (SQ) layers. (d) Relative thicknesses of each layer within each tissue sample. The schematic of the pig shows approximately where the samples were extracted from. Circles with red outlines denote samples that were damaged during extraction and could not be used for processing. The location labels are A: anterior, P: Posterior, M: Medial (breast), L: Lateral (belly).

Discussion

In this study, we present a novel automated algorithm to segment tissue boundary layers in histological images of porcine skin samples. Subsequently, we demonstrated the utility of our method for extracting layer-specific tissue information by assessing the thickness of the different layers as a function of anatomical location. Our method requires one user input, namely the number of layers to be segmented. The proposed method combines a coarse identification step that implements neighborhood histograms of image sub-tiles, K-means clustering, and Dijkstra’s algorithm, followed by a boundary refinement step. The algorithm can segment the epidermis, dermis, subcutaneous, and muscle layers. In cases where only general (less detailed) layer boundaries are needed, we show a cluster change line boundary identification (omitting the refinement step) can be used to reduce computational cost.

Extracting quantitative information on the tissue environment is important across a range of applications. For example, such details can be used to inform computational models evaluating transdermal drug delivery. Herein, we demonstrated that the thickness of the dermis and subcutaneous tissue layers depended on the anatomical location. We note that only one pig was used, so concrete conclusions on trends of tissue layer thicknesses as a function of anatomical location cannot be drawn from these results alone. Nonetheless, this highlights the importance of assessing variables such as the tissue layer thicknesses and underscores the need for automated layer identification methods. The layer segmentations provided herein are a first step in enabling higher fidelity models and informing injectable biologics and other histopathological diagnostic techniques. For further advancements to this end, additional studies should aim to develop automated methods for quantifying and annotating tissue features such as collagen and vessels.

In its present form, the method uses only the blue color channel and thus does not rely on specific color information. This is a major contribution of our algorithm as compared to those previously reported in the literature.6,7,13 By not relying on color information, our method is robust across varying histology stains as demonstrated in Fig. 6. Ultimately, this further expands the tissue environment information that can be extracted using our algorithm. For example, by comparing the change in blue color (i.e., change in AB stain) between the AB-Only and Hase + AB images, we can evaluate the distribution of hyaluronic acid within the tissue layers. Such analysis is the subject of future work.

Collagen fibers in the subcutaneous layer were a primary source of error in the algorithm. The collagen generally did not affect the K-means clustering step as the neighborhood size was larger than the collagen fibers. However, collagen regions near or connected to the dermis or muscle layers could cause the boundary line to be incorrectly drawn around the collagen. Even with the use of robust path finding methods such as Dijkstra’s algorithm and the refinement step, the collagen fibers still caused errors in the boundary lines drawn in certain samples, such as Fig. 5e. Conversely, Dijkstra’s algorithm was critical in avoiding errors due to isolated collagen regions (i.e., spanning a small horizontal width), such as those present between the dermis and subcutaneous layers in Figs. 5c and 5g. In these cases, the K-means clustering typically labels the collagen region as belonging to the same cluster as the dermis layer. Further, because these sub-tiles are connected to the dermis, the connected components-based label refinement cannot correct this. Instead, Dijkstra’s algorithm skips over these regions, since routing the path around them adds more points and therefore yields a higher path cost.
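
The skipping behavior can be reproduced on a toy cost grid. The following is a minimal sketch, assuming a hypothetical cost map in which sub-tiles on the cluster change line cost 1 to enter and all other sub-tiles cost 3; these values, the 4-connected moves, and the function name `cheapest_path` are illustrative assumptions and not the cost function used in this study.

```python
import heapq

def cheapest_path(cost):
    """Dijkstra's algorithm from any left-column cell to any right-column cell."""
    rows, cols = len(cost), len(cost[0])
    dist = {(r, 0): cost[r][0] for r in range(rows)}  # every left cell is a source
    prev = {}
    heap = [(d, cell) for cell, d in dist.items()]
    heapq.heapify(heap)
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry
        if c == cols - 1:
            # Reached the right edge: walk the predecessors back to a source.
            path = [(r, c)]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for dr, dc in ((0, 1), (1, 0), (-1, 0), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return float("inf"), []

# Cluster change line along row 2, with a one-tile-wide detour down to row 4
# at column 3 (standing in for an isolated collagen region labeled as dermis).
LOW, HIGH = 1, 3
change_line = {(2, 0), (2, 1), (2, 2), (3, 2), (4, 2), (4, 3),
               (4, 4), (3, 4), (2, 4), (2, 5), (2, 6)}
cost = [[LOW if (r, c) in change_line else HIGH for c in range(7)]
        for r in range(6)]

total, path = cheapest_path(cost)
print(total, path)
```

In this toy case, following the detour visits 11 low-cost sub-tiles (cost 11), whereas cutting straight across visits 7 sub-tiles, one of them high-cost (cost 9), so the straight path is selected. A wider region would require more high-cost sub-tiles to cross, which is consistent with the errors observed for larger connected collagen regions.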

While the boundary refinement step and the identification of the epidermis layer add robustness to the algorithm, they also increase the computational complexity and cost. With the refinement step and the epidermis identification, the algorithm takes 9–11 min to process one image (run non-compiled using a MacBook Pro with the Apple M1 chip, 8 cores, 16 GB RAM, on a histology image of size 5,220 × 2,550 pixels). This includes about 60–90 s of added time per boundary caused by the refinement and about 7–8 min of added time as a result of the epidermis identification. For this reason, for applications in which average layer thicknesses are of primary interest rather than highly detailed boundary lines, and the epidermis is not of interest, we tested the case where the boundary line is drawn simply using the cluster change point lines. This reduces the computational cost by more than an order of magnitude, to 10–15 s. A comparison of Figs. 5 and 6 shows that the cluster change boundary line method does not provide the same level of detail in the layer segmentation. However, Fig. 7 highlights that the cluster change boundary line method is robust in yielding similar thickness distributions. Thus, using the cluster change boundary lines yields a robust method for varying histological stains. Nonetheless, a detailed boundary line identification method that is computationally inexpensive while still being accurate across various tissue stains should be explored in future work.
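
As a minimal sketch of how layer thicknesses could be read directly from cluster change lines, the example below takes a small map of sub-tile cluster labels, locates the label changes in each column, and reports the median thickness of each layer. The label map, the sub-tile size (`tile_px`), the pixel size (`um_per_px`), and the function name are hypothetical values chosen for illustration, and the sketch assumes every column contains the same number of label changes.

```python
from statistics import median

def layer_thicknesses(labels, tile_px, um_per_px):
    """Median thickness of each layer, measured column by column (hypothetical helper)."""
    n_rows = len(labels)
    per_column = []
    for column in zip(*labels):
        # Rows at which the cluster label differs from the row above.
        changes = [r for r in range(1, n_rows) if column[r] != column[r - 1]]
        edges = [0] + changes + [n_rows]
        per_column.append([b - a for a, b in zip(edges, edges[1:])])
    scale = tile_px * um_per_px  # micrometers spanned by one sub-tile
    return [median(layer) * scale for layer in zip(*per_column)]

# Sub-tile cluster labels (rows run from the skin surface downward).
labels = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
]

thickness_um = layer_thicknesses(labels, tile_px=50, um_per_px=0.5)
print(thickness_um)  # [50.0, 62.5, 25.0]
```

Because the boundaries are resolved only to the nearest sub-tile, this approach trades boundary detail for speed, in line with the comparison of thickness distributions discussed above.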

Several notable limitations of both the proposed method and this study exist. The proposed method identifies tissue layer boundaries in porcine skin samples with no dependence on color-specific information. While it is an automated method, it still requires the user to input the number of layers to be identified. The algorithm also maintains parameters, such as the window pixel size, that could, in principle, be user-adjusted. This could lead to a cumbersome user experience for certain applications not explicitly tested herein. Additional studies should aim to make the selection of such values more universal and robust. Further, the tissue layers must span the entire horizontal length of the image, or the boundary identification may fail. The method, in principle, can segment any number of boundaries, but it has only been rigorously tested for samples that include the dermis, subcutaneous, and muscle layers. While the algorithm is designed to be robust, it does not contain any specific way to handle regions where the tissue tore, and such regions could cause the algorithm to fail or give erroneous boundaries. For this study, no artificial or “ground-truth” boundaries were available. Thus, the accuracy of the proposed method could not be rigorously quantified. In future work, the use of other species (e.g., human skin samples) where experts can manually segment the layers and provide a “ground truth” should be explored. This study was also limited to only a single pig, largely due to the limited and unpredictable availability of the animals. As a result, only eight tissue samples were available for testing the algorithm, limiting the validation and statistical analysis that could be performed. Although these samples were sufficient to demonstrate the algorithm, as was the objective of this paper, future studies testing and advancing the utility of the method across a larger cohort of pigs should be considered.