Introduction

The Global Initiative for Chronic Obstructive Lung Disease (GOLD) has defined chronic obstructive pulmonary disease (COPD) as a common preventable and treatable disease characterized by persistent airflow limitation, which is usually progressive and associated with an enhanced chronic inflammatory response in the airways and the lung to noxious particles or gases.31 Nowadays, COPD is considered a serious global health problem. Its prevalence is estimated at 9–10% for adults aged 40 years or older, and it is expected to be the third most common cause of death in 2020.10,26 The chronic airflow limitation characteristic of COPD is caused by a mixture of small airway disease (obstructive bronchiolitis) and parenchymal destruction (emphysema).26 Moreover, it has been shown that COPD patients with emphysema (confirmed through high resolution computed tomography) have more severe airflow limitation.4 Emphysema is defined as the abnormal permanent enlargement of the airspaces distal to the terminal bronchioles, accompanied by destruction of their walls and without obvious fibrosis.1,29 Histological ex vivo analysis of lungs is performed in order to elucidate the pathophysiology of COPD. In this context, quantitative assessment of emphysematous lesions is required to determine the stage of COPD as well as to evaluate the benefit derived from experimental therapeutic approaches.

Several metrics have been previously defined for emphysema characterization in histological lung section images. Stereological techniques are of special interest since they enable the evaluation of 3D lung attributes such as the number of alveoli, the average thickness of the septa or the alveolar surface from a series of sections.18 These techniques consist in assessing the interaction between the lung section image and a geometric probe (points, lines, planes or volumes).11,20 The mean linear intercept \(\left( L_{\rm m}\right) \), which provides a measure of alveolar airspace size, has been widely employed in stereological analysis of the lung.8 For instance, it is used in combination with further stereological measures to calculate the alveolar surface area. The computation of \(L_{\rm m}\) is based on sampling air segments delimited by alveolar tissue from the image of the lung lobe. Hence, \(L_{\rm m}\) is calculated as the mean length of the extracted air segments.14 The value of \(L_{\rm m}\) has been employed for the quantitative evaluation of airspace enlargement in emphysema analysis. Recent studies pointed out two main drawbacks of such an application of \(L_{\rm m}\).12,22 First, \(L_{\rm m}\) depends on the shape of the airspaces. As a result, even if similar sized airspaces are found in distinct lung tissue images, the value of \(L_{\rm m}\) could vary because of their different shapes. In addition, as a measure of the central tendency of a distribution, \(L_{\rm m}\) has been shown to underestimate airspace enlargement in mild emphysematous regions characterized by a heterogeneous distribution of the airspaces (i.e., regions with a few enlarged airspaces surrounded by smaller ones). To overcome these limitations, Parameswaran et al.22 suggested the use of the equivalent diameter variable \((d) \) of a given airspace. The variable \(d\) is defined as the diameter of the circle with an area equal to that of the underlying airspace, which circumvents the dependence on airspace shape. A family of indexes derived from the statistical moments of \(d\) was proposed as measures of the airspace enlargement associated with emphysematous lesions. Among them, the index \(D_{2}\), which takes into account the skewness of \(d\), has been shown to be more reliable than the others since it enables the identification of mild emphysema.12 Indeed, \(D_{2}\) has been used as the gold-standard metric for airspace enlargement quantification in the study of gas flow inside the lung tissue15 or in the development of novel in vivo techniques of emphysema analysis based on microcomputed tomography (micro-CT).2,19

It has been reported that larger values of \(D_{2}\) are expected for areas of parenchymal tissue showing an increased enlargement of the airspaces as a consequence of emphysematous lesions.22 However, while the value of \(D_{2}\) provides a measure of airspace enlargement, it does not reflect emphysema severity in a given lung area. Recent results demonstrated the utility of \(D_{2}\) in the classification of healthy and emphysematous areas of lung tissue observed through a microscope.12,22 According to this approach, a binary identification of parenchymal tissue can be obtained. Nevertheless, finer resolution in the assessment of emphysema severity would be desirable.

In our study, we propose a novel methodology to objectively rank the degree of emphysema severity in a lung area from its corresponding \(D_{2}\) value. To achieve this, a severity index variable \(\left( SI\right) \), which represents the probability for the lung area under analysis to be emphysematous, is defined. None of the metrics conventionally employed in lung tissue analysis focuses on the evaluation of such a measure of probability, which enables a straightforward assessment of emphysematous lesions. Unlike \(L_{\rm m}\) and \(D_{2}\), which characterize the airspace size, the proposed SI provides a novel means to quantify emphysema severity in histological lung sections. As a probability measure, SI ranges from zero to one, so that lung areas showing more relevant lesions are associated with values near one, areas of normal tissue yield probabilities close to zero and areas including mild lesions reflect intermediate probability values. A Bayesian approach was adopted to define the expression of the function that maps the variable \(D_{2}\) onto the target SI. For this purpose, the statistical distribution of \(D_{2}\) for normal and emphysematous samples of lung tissue was approximated using kernel density estimation techniques. Furthermore, a procedure for the segmentation of the lung section image was designed in order to implement a computer assisted diagnosis tool for histological analysis of these images. The final result is the emphysema severity map for a given lung section, which is obtained from the computation of SI for every pixel in the parenchyma. Such a map is intended to help specialists in the quantification and localization of airspace enlargement resulting from emphysema.

Materials and Methods

Animal Preparation and Image Acquisition

All experimental protocols involving animal manipulation were approved by the University of Navarra experimentation ethics committee. Sixty A/J male mice, 11 weeks old, were equally distributed into control and treatment groups. Treated mice were intratracheally instilled with 6 units per 30 g of porcine pancreatic elastase (PPE, EC134GI, EPC, MI, USA), as described in a previously published protocol.16 Control animals were instilled with a saline solution. Five treated mice and five control mice were then sampled 1, 6, 12, 24 h, 7 and 17 days after elastase administration. For the present study, a subset of them composed of four control mice, three mice induced with mild emphysematous lesions (sacrificed 24 h after treatment) and four mice induced with severe emphysematous lesions (sacrificed 17 days after treatment) were randomly selected. Their lungs were collected for histological analysis.

All mice were euthanized by exsanguination. To ensure optimal expansion of the airspaces, the trachea was exposed by dissection and ligated during maximal inspiration. The larynx, trachea and bronchi were removed en bloc and fixed at a constant pressure of 20 cm H2O by immersion in 4% formaldehyde in phosphate buffer for 24 h. Lung lobes were then sectioned along their main bronchial axis, collected and embedded in three paraffin blocks. The first block contained the two halves of the left lung lobe. The second block contained the superior and middle lobes. The third block contained the inferior and post-caval lobes. Three 4-micron sections were then cut from each block, spaced 20 to 30 microns apart. Sections were stained with hematoxylin and eosin (H&E). In this study, we used six randomly selected sections per mouse, for a total of 66 lung sections.

Whole-slide views of the sections were captured using an automated Axioplan 2ie Zeiss microscope (Carl Zeiss, Jena, Germany). Each slide was initially acquired with a Plan-Neofluar objective (numerical aperture NA = 0.035, magnification 1.25×, pixel resolution 3.546 \(\mu \)m/pixel). The automatic threshold method proposed by Otsu21 was then applied to detect all tissue areas. The size of the objects was measured and only objects with a reasonable size to represent entire sections of lung lobes were considered for further processing. For each object, a bounding box was created and the coordinates of its four vertices were sent to the microscope. Then, tissue areas were automatically scanned at higher magnification with a Plan-Neofluar objective (NA = 0.3, 10×, 0.725 \(\mu \)m/pixel). Some overlap was allowed between image fields to facilitate the creation of large image mosaics using the Stitcher ImageJ plugin.25 The resulting mosaics were stored in a server for offline quantitative analysis. Figure 1 shows a section from a treated mouse in which the regions including emphysematous lesions have been manually annotated. Additionally, one healthy lung area and two areas with emphysema are shown at high magnification for comparison.

Figure 1

Lung section from a treated mouse with manually annotated emphysematous lesions. Three different areas are highlighted: (a) a lung area including healthy tissue and (b, c) two lung areas from the damaged region

For the development and validation of the index for emphysema severity quantification proposed in our study, the initial group of mice was divided into training and test sets. The former was composed of the 12 lung section images corresponding to one control mouse and one mouse induced with severe emphysematous lesions (6 sections from each of them). This training set was allocated for design and optimization purposes. The lung section images from the nine remaining mice were used for testing the behavior of our emphysema severity index. The test set was then composed of 54 lung section images, with 6 sections per mouse.

Emphysema Severity Index

To quantify the degree of emphysema severity, we defined the severity index \(\left( SI\right) \) of a specific lung area as the probability that it contains emphysematous lesions. It is worth noting that, according to this definition, emphysema severity and airspace enlargement represent different properties. Nevertheless, they are related to each other since emphysematous lesions are characterized by enlarged airspaces resulting from the destruction of parenchymal tissue.1 Thus, the emphysema severity index SI presented in our study was expressed in terms of the well-known parameter \(D_{2}\), which has been shown to be a robust descriptor of pulmonary airspace enlargement.12,22 Therefore, given an area of lung tissue, the first step to rank emphysema severity by means of SI is the computation of \(D_{2}\). For this purpose, an automated procedure similar to that described by Muñoz-Barrutia et al.19 was applied. It encompasses the following operations:

  1. The 8-bit grayscale green channel is extracted from the 24-bit RGB image of the lung area since it provides the greatest contrast between the background and the red-blue H&E stained tissue.12,19

  2. The image is binarized using Otsu’s method in order to separate the parenchymal tissue from the airspaces.21

  3. Tissue structures are enhanced by eroding the binary image.

  4. The white regions enclosed by the lung tissue represent the airspaces. The area \(\left( A\right) \) of each of these regions is approximated as its number of pixels. Area values are transformed to equivalent diameter \(d\) samples by means of the operation \(d = 2\sqrt{\frac{A}{\pi }}\).

  5. From the set of samples of the variable \(d\), an estimation of the parameter \(D_{2}\) is obtained by following the definition provided by Parameswaran et al.22:

    $$D_{2} = \mu _{d}\left[ 1+\frac{\sigma _{d}^2}{\mu _{d}^2 + \sigma _{d}^2}\left( 2+\frac{\sigma _{d}\gamma _d}{\mu _{d}}\right) \right]$$
    (1)

    where \(\mu _{d}\), \(\sigma _{d}^2\) and \(\gamma _{d}\) are the mean, variance and skewness of the variable \(d\), respectively.

Steps 1-3 address the preparation of the lung area image for the computation of \(D_{2}\), which is performed in subsequent steps 4 and 5. Image preprocessing aims to enhance the definition of tissue structures in order to obtain a clear representation of the airspaces of interest. It enables automatic analysis of the whole lung section image.24,27 Operators such as thresholding and erosion/dilation involve the modification of the pixels in the original image.9 However, it must be appreciated that preprocessing is equally applied to every image to ensure an unbiased comparison of the obtained measurements.
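
Steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: it relies on NumPy and SciPy, applies a single erosion with SciPy's default structuring element, uses population (biased) moments for the variance and skewness, and all function names are hypothetical. Diameters are returned in pixels; multiplying by the pixel size (0.725 μm/pixel at 10×) converts them to micrometres.

```python
import numpy as np
from scipy import ndimage


def otsu_threshold(gray):
    """Otsu's threshold for an 8-bit image (maximises between-class variance)."""
    prob = np.bincount(gray.ravel(), minlength=256) / gray.size
    omega = np.cumsum(prob)                    # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean intensity
    denom = omega * (1.0 - omega)
    between = np.zeros(256)
    valid = denom > 0
    between[valid] = (mu[-1] * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))


def equivalent_diameters(rgb, erosion_iterations=1):
    """Steps 1-4: RGB lung area -> equivalent diameters (pixels) of its airspaces."""
    green = rgb[..., 1]                                    # step 1: green channel
    air = green > otsu_threshold(green)                    # step 2: bright = airspace
    if erosion_iterations > 0:                             # step 3: eroding the white
        air = ndimage.binary_erosion(air, iterations=erosion_iterations)  # phase thickens tissue
    labels, _ = ndimage.label(air)                         # step 4: one label per airspace
    areas = np.bincount(labels.ravel())[1:]                # drop label 0 (tissue)
    return 2.0 * np.sqrt(areas / np.pi)


def d2_index(d):
    """Step 5, Eq. (1). Assumption: population moments; returns the mean if variance is 0."""
    d = np.asarray(d, dtype=float)
    mu, var = d.mean(), d.var()
    if var == 0:
        return float(mu)
    sigma = np.sqrt(var)
    gamma = np.mean((d - mu) ** 3) / sigma ** 3            # skewness
    return float(mu * (1 + var / (mu ** 2 + var) * (2 + sigma * gamma / mu)))
```

In this sketch, airspaces cut by the edge of the analysed area are counted like any other region; whether they should be excluded is not specified here and is left as a design choice.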

Once \(D_{2}\) has been computed, the aim is to find the mapping function \({SI} = {SI}\left( D_{2}\right) \) that provides the corresponding SI. As previously mentioned, the value of SI was modelled as the probability for the lung area associated with the observed \(D_{2}\) to be emphysematous. Hence, the target function should meet the three following attributes:

  1. To represent a probability, SI must range between zero and one.

  2. Given two lung areas characterized by \(D_{2}^{(1)}\) and \(D_{2}^{(2) }\) with \(D_{2}^{(1)} \ge D_{2}^{(2) }\), the values of emphysema severity must reflect this condition by assigning SI values such that \({SI}^{(1) } \ge {SI}^{(2)}\). In other words, the function must show a monotonically growing behavior as increased severity must be assigned to larger \(D_{2}\) values.

  3. The function \({SI} = {SI}(D_{2})\) must present a smooth continuous evolution along its domain of definition. This requirement prevents the occurrence of abrupt changes of the function profile, which would be associated with large differences in the SI assigned to similar \(D_{2}\) samples.

A stochastic approach was adopted for the definition of the function relating \(D_{2}\) to SI. Hence, both indexes were treated as random variables. The cumulative distribution function (CDF) of \(D_{2}\), which is denoted by \(F_{D_{2}}\), is suggested as the target mapping function between both variables. According to the inherent attributes of a CDF, \(F_{D_{2}}\) satisfies the first two properties initially imposed, as it grows monotonically between zero and one.3 The third required property is given by the continuous nature of the variable \(D_{2}\), as only discrete variables would be associated with discontinuities in the CDF. Therefore, we define SI as:

$${SI} = F_{D_{2}}\left( d_{2}\right) = P\left( D_{2} \le d_{2}\right)$$
(2)

where \(d_{2}\) represents a specific realization of the variable \(D_{2}\) and \(P\left( \cdot \right) \) denotes the probability of an event. The Bayesian framework was adopted to infer the expression of \(F_{D_{2}}\) from the prior knowledge about the problem.

Initially, it is assumed that this knowledge is given by regions of healthy and emphysematous tissue delimited by a specialist in a lung section image, as depicted in Fig. 1. It is worth noting that the specialist is only asked to determine regions with emphysematous lesions but no information on their severity is required. From these regions, a set of control \(\left( C\right) \) and emphysematous \(\left( E\right) \) tissue areas of size \(W \times W\) pixels are extracted. The value of \(D_{2}\) for these areas is then obtained following the procedure described before. As a result, a set of \(D_{2}\) samples grouped into \(C\) and \(E\) categories is available. It describes the statistical properties of the proposed problem and will be used to estimate the function \(F_{D_{2}}\). To this end, according to the total probability theorem, the probability density function (PDF) of the variable \(D_{2}\), which is the derivative of \(F_{D_{2}}\) and is termed as \(f_{D_{2}}\), can be expressed as:3

$$f_{D_{2}}\left( d_{2}\right) = P\left( C\right) f_{D_{2} \vert C}\left( d_{2}\right) + P\left( E\right) f_{D_{2} \vert E}\left( d_{2}\right)$$
(3)

where \(f_{D_{2} \vert C}\) and \(f_{D_{2} \vert E}\) denote the conditional PDF of the variable \(D_{2}\) in \(C\) and \(E\) groups, respectively. In addition, \(P\left( C\right) \) and \(P\left( E\right) \) are the prior probability of observing a control and an emphysematous lung area, respectively. The value of SI can be directly obtained from the PDF of \(D_{2}\):

$$SI = P\left( D_{2} \le d_{2}\right) = \int _{-\infty }^{d_{2}}f_{D_{2}}\left( \tau \right) d\tau$$
(4)

In practice, the PDF of \(D_{2}\) is unknown and must be approximated from the available sample set. For this purpose, we used the conventional kernel (Parzen) density estimation procedure.3 Consider a univariate variable \(X\), for which a finite set of \(N\) samples is observed. Using Parzen’s method, an approximation to its PDF, which is denoted by \(f_{X}\), is obtained as:

$$f_{X}\left( x\right) \approx \frac{1}{N}\sum _{i=1}^{N}K\left( x-x_{i}, h\right)$$
(5)

where \(K\left( \cdot \right) \) denotes the kernel function and \(h\) is its width parameter. Usually, the Gaussian kernel is adopted since it provides smooth approximations to the objective PDF.3 In this case, the kernel would be given by:

$$K\left( x-x_{i}, h\right) = \frac{1}{h\sqrt{2\pi }}\exp \left( -\frac{\left( x - x_{i}\right) ^{2}}{2h^{2}}\right)$$
(6)

The rule proposed by Silverman enables the adjustment of the width parameter \(h\) (the standard deviation of the Gaussian), which would be obtained as:28

$$h = \hat{\sigma _{X}}\left( \frac{4}{3N}\right) ^{\left( \frac{1}{5}\right) }$$
(7)

where \(\hat{\sigma _{X}}\) is the estimate of the standard deviation of \(X\) computed from the initial set of \(N\) samples.

This procedure can be applied to the estimation of \(f_{D_{2} \vert C}\) and \(f_{D_{2} \vert E}\), which, according to  Eqs. (3) and  (4), lead to a manageable expression of the target mapping function \({SI} = {SI}\left( D_{2}\right) \).
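
Because the integral of a Gaussian kernel is the Gaussian CDF, substituting Eqs. (5) and (6) into Eqs. (3) and (4) gives SI as a prior-weighted average of error functions. The sketch below illustrates this using only the Python standard library. It is not the authors' code: the function names are hypothetical, the sample standard deviation (n − 1 denominator) is assumed for Eq. (7), equal priors are used as defaults, and the \(D_{2}\) values in the usage lines are made-up numbers for illustration only.

```python
import math
import statistics


def silverman_bandwidth(samples):
    """Eq. (7): h = sigma_hat * (4 / (3N))^(1/5). Assumes the sample std (n - 1)."""
    return statistics.stdev(samples) * (4.0 / (3.0 * len(samples))) ** 0.2


def kde_pdf(x, samples, h):
    """Eqs. (5)-(6): mean of Gaussian kernels of width h centred on the samples."""
    norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
    return sum(norm * math.exp(-((x - xi) ** 2) / (2.0 * h * h)) for xi in samples) / len(samples)


def kde_cdf(x, samples, h):
    """Integral of kde_pdf from -inf to x: mean of Gaussian CDFs."""
    return sum(0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0)))) for xi in samples) / len(samples)


def make_si(control_d2, emphysema_d2, p_c=0.5, p_e=0.5):
    """Return SI(d2) = P(C) F_{D2|C}(d2) + P(E) F_{D2|E}(d2), i.e. Eqs. (3)-(4)."""
    h_c = silverman_bandwidth(control_d2)
    h_e = silverman_bandwidth(emphysema_d2)

    def si(d2):
        return p_c * kde_cdf(d2, control_d2, h_c) + p_e * kde_cdf(d2, emphysema_d2, h_e)

    return si


# Usage with hypothetical D2 samples (in micrometres)
si = make_si([80, 85, 90, 95, 100], [120, 140, 160, 180, 200])
print(round(si(110.0), 3))
```

By construction the returned function is bounded by zero and one, non-decreasing and smooth, which are the three attributes required of the mapping.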

Calculation of the Emphysema Map

The defined SI function enables the derivation of an emphysema severity map from a given lung lobe section. This map is intended as a tool to facilitate the identification of highly probable emphysematous lung regions. The estimation of an emphysema map involves two main tasks: (1) the segmentation of the lung image in order to identify parenchyma pixels and (2) the computation of the SI on each of these pixels.

Image segmentation requires the identification of the lung border along with artifacts due to airways and blood vessels crossing the lung parenchyma, which could be confused with airspaces and must therefore be excluded from further analysis. The key feature exploited for the detection of the lung border was the much larger area of the region surrounding the lung compared with the airspaces of interest. To perform border detection, the original lung image is initially converted into a binary image by retaining the green channel and applying Otsu’s method, as previously described for the estimation of \(D_{2}\) from a lung area. Subsequently, the following operations are performed (see Fig. 2a):

  1. The binary image is eroded in order to enhance the tissue structures in the lung.

  2. The resulting holes inside the lung are filled.

  3. The biggest foreground (black) area in the image is retained. This operation discards small foreground objects outside the lung. Subsequently, the image is inverted to assign parenchyma pixels a non-zero value.

  4. Morphological closing is then applied to smooth out the lung border and recover its original size.
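
The four border-detection operations can be sketched with SciPy's morphology routines. This is a minimal illustration, not the study's implementation: the function name, the iteration counts and the default structuring elements are assumptions. The input is a boolean image in which `True` marks white pixels (airspaces and background) and `False` marks tissue, so the inversion in step 3 is implicit in working with the tissue mask directly.

```python
import numpy as np
from scipy import ndimage


def lung_mask(air, erosion_iterations=2, closing_iterations=2):
    """Return a boolean mask that is True inside the lung border."""
    # Step 1: eroding the white phase thickens the tissue. border_value=1 keeps
    # the background at the image edge white, so no artificial frame is created.
    eroded_air = ndimage.binary_erosion(air, iterations=erosion_iterations, border_value=1)
    tissue = ~eroded_air
    # Step 2: fill the airspaces enclosed by tissue.
    solid = ndimage.binary_fill_holes(tissue)
    # Step 3: keep only the largest connected tissue object (the lung lobe).
    labels, count = ndimage.label(solid)
    if count == 0:
        return np.zeros(air.shape, dtype=bool)
    sizes = np.bincount(labels.ravel())[1:]
    largest = labels == (1 + int(np.argmax(sizes)))
    # Step 4: closing smooths the border.
    return ndimage.binary_closing(largest, iterations=closing_iterations)
```

The iteration counts would need tuning to the image resolution; the values above are placeholders.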

On the other hand, artifacts corresponding to airways and blood vessels are usually surrounded by a border that is thicker than the parenchymal tissue surrounding the airspaces. Therefore, artifact rejection was based on the identification of abnormally thick lines in the image.19 A detected thick border may or may not define a closed region, so a specific procedure based on the identification of concavities was designed to fill open regions. The algorithm for airway and vessel rejection takes as the starting point the binary version of the original lung image and is composed of the following steps (see Figs. 2b and 2c):

  1. Thin tissue structures are removed by dilating the binary image.

  2. Small foreground objects (area lower than 30 × 30 pixels) are removed.

  3. Morphological opening is performed to remove irregularities in the definition of the detected thick lines.

  4. At this point, each of the remaining foreground objects represents a candidate artifact and is separately processed. This analysis aims to fill the region, if needed, that is defined by the detected line:

    4.1. The distance transform is computed. As a result, each pixel is assigned the distance from it to the nearest foreground pixel in the image.

    4.2. Local maxima are obtained and identified with concavities.

    4.3. A local maximum is assumed to be in the interior of the region defined by the line when, in the original image, no tissue is found in a radius equal to the value of the distance transform at such local maximum.

    4.4. For maxima identified as interior points, a disk of radius equal to their corresponding value of the distance transform is placed on each of them to partially fill the region.

    4.5. Any remaining hole is filled.
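
Steps 4.1 to 4.5 can be sketched as follows for a single candidate object. This is an illustrative reading of the procedure, not the authors' code: the function name, the 3 × 3 neighbourhood used to detect local maxima and the strict "inside the disk" test are assumptions. Here `candidate` is a boolean mask of one thick line and `tissue` is a boolean mask of all tissue in the original binary image (`True` = tissue).

```python
import numpy as np
from scipy import ndimage


def fill_candidate(candidate, tissue, neighbourhood=3):
    """Fill the (possibly open) region delimited by one thick-line object."""
    # 4.1: distance from every pixel to the nearest pixel of the thick line
    dist = ndimage.distance_transform_edt(~candidate)
    # 4.2: local maxima of the distance map (plateau pixels are all kept)
    peaks = (dist == ndimage.maximum_filter(dist, size=neighbourhood)) & (dist > 0)
    rows, cols = np.indices(candidate.shape)
    filled = candidate.copy()
    for r, c in zip(*np.nonzero(peaks)):
        # Pixels strictly closer than dist[r, c]; squared pixel distances are
        # integers, so a 0.5 margin avoids floating-point ties on the rim.
        disk = (rows - r) ** 2 + (cols - c) ** 2 <= dist[r, c] ** 2 - 0.5
        # 4.3: interior point if the original image has no tissue in the disk
        if not tissue[disk].any():
            filled |= disk            # 4.4: partially fill with the disk
    return ndimage.binary_fill_holes(filled)   # 4.5: fill what remains
```

The loop builds a full-size disk mask for every maximum, which is simple but slow on large mosaics; restricting the computation to the bounding box of each candidate would be a natural refinement.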

Figure 2

Segmentation procedure for the identification of lung parenchyma pixels. (a) The detection of the lung border is based on the larger area of the background when compared to the airspaces of interest. (b) Artifacts given by airways and blood vessels are found by detecting thick lines in the region limited by the lung border. (c) If the structure identified as a potential artifact is not perfectly closed, a filling strategy is carried out by detecting the points in the interior of the region

We found that the method designed for the detection of airways and blood vessels in the lung parenchyma did not produce false positive cases since abnormally thick lines do not correspond to the border of the airspaces. However, our method was capable of identifying most, but not all, of the artifacts in the image. The reason was that, in some cases, the surrounding border line was not as thick as expected. To remove these undetected objects, the user was required to manually mark one point inside them. These points were stored and a region growing approach was used to segment out every undetected artifact.

As a result of the image segmentation process, a binary mask indicating those pixels in the parenchyma is obtained. This mask is used to perform the second task in the computation of the emphysema severity map. It involves the following operations from the binary version of the whole lung image:

  1. The mask is applied to the binary image of the lung section by means of pixel-by-pixel product.

  2. Parenchymal tissue is enhanced by erosion.

  3. A \(W \times W\) window is centered on each pixel in the parenchyma and the corresponding value of \(D_{2}\) is obtained. It must be noted that the mask indicates those pixels where the map must be computed and prevents the inclusion of artifacts as airspaces for the computation of \(D_{2}\). Finally, the mapping function is used to assign a SI value to each pixel.
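
A brute-force sketch of the three map operations is given below. It is an illustration under stated assumptions rather than the implementation used in the study: function names are hypothetical, windows are clipped at the image edge, a window without airspaces is assigned \(D_{2} = 0\), population moments are used, and `si` is any callable mapping a \(D_{2}\) value to a probability.

```python
import numpy as np
from scipy import ndimage


def local_d2(air_window):
    """D2 (Eq. 1) of the airspaces found in one window, in pixels."""
    labels, count = ndimage.label(air_window)
    if count == 0:
        return 0.0                                   # assumption: no airspace -> 0
    d = 2.0 * np.sqrt(np.bincount(labels.ravel())[1:] / np.pi)
    mu, var = d.mean(), d.var()
    if var == 0:
        return float(mu)
    sigma = np.sqrt(var)
    gamma = np.mean((d - mu) ** 3) / sigma ** 3
    return float(mu * (1 + var / (mu ** 2 + var) * (2 + sigma * gamma / mu)))


def severity_map(air, mask, window, si):
    """air: True = white pixel; mask: True = parenchyma pixel; window: odd W."""
    half = window // 2
    masked_air = air & mask                          # step 1: pixel-by-pixel product
    masked_air = ndimage.binary_erosion(masked_air)  # step 2: thicken the tissue
    out = np.full(air.shape, np.nan)                 # NaN outside the parenchyma
    for r, c in zip(*np.nonzero(mask)):              # step 3: one window per pixel
        r0, c0 = max(r - half, 0), max(c - half, 0)
        win = masked_air[r0:r + half + 1, c0:c + half + 1]
        out[r, c] = si(local_d2(win))
    return out
```

Evaluating a \(751 \times 751\) window at every parenchyma pixel this way is very slow; computing the map on a coarser grid of pixels and interpolating is one possible shortcut, at the cost of spatial detail.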

Results

Optimum Window Size

The optimum window size must be initially determined for the estimation of emphysema maps. The value of \(W\) defines the extent of the local region considered around the pixel for which SI is computed. The definition of \(W\) reveals a trade-off between the statistical reliability of the obtained \(D_{2}\)/SI and the spatial resolution of the severity map. For an increased value of \(W\), a higher number of samples of the variable \(d\) will be obtained, as it is expected that more airspaces will be included in the analysis, leading to a more reliable estimation of \(D_{2}\)/SI. However, a large \(W\) is associated with lower spatial accuracy in the assignment of SI to the lung pixels. Since a larger area of the lung is considered, the unique correspondence between the central pixel of the \(W \times W\) window and the obtained \(D_{2}\)/SI value is lost.

To determine the optimum window size, we evaluated the ability of \(D_{2}\) to characterize lung tissue samples as a function of \(W\). To this end, we selected pixels from regions of healthy and emphysematous tissue initially annotated by a specialist on the original lung section image. A \(W \times W\) window was then centered on each of these pixels to define control \(\left( C\right) \) and emphysematous \(\left( E\right) \) tissue samples for which \(D_{2}\) was computed. To assess the reliability of the estimated \(D_{2}\), its performance in the identification of samples from both \(C\) and \(E\) categories was analyzed. The performance measures adopted for this purpose were the area under the receiver operating characteristic curve (AUC)5 and the \(p\)-value derived from the non-parametric Kruskal–Wallis test.13 Both measures were computed for different values of \(W\), which was varied between \(W_{\rm min} = 11\) and \(W_{\rm max} = 1001\) (an odd value was used for the window size to unambiguously identify the central pixel). In our experiment, a total of 283 pixels (123 from healthy tissue regions and 160 from emphysematous ones) were selected on the 12 lung sections in the training set. The evolution of the AUC and the \(p\)-value achieved by the variable \(D_{2}\) as a function of \(W\) is depicted in Fig. 3. As expected, the capability of \(D_{2}\) to characterize emphysema grows as \(W\) is increased. Nevertheless, it must be noted that beyond a certain value of \(W\) (\(W = 751\) pixels) the classification utility of \(D_{2}\) does not improve further. Therefore, the optimum value of \(W\) was set to 751 pixels.
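
The AUC used in this analysis can be computed directly from the two groups of \(D_{2}\) samples, since it equals the probability that a randomly chosen emphysematous sample has a larger \(D_{2}\) than a randomly chosen control sample. The sketch below shows this together with one possible saturation criterion for choosing \(W\). Both function names and the tolerance rule are assumptions introduced for illustration; the text above does not state a numerical criterion. The Kruskal–Wallis \(p\)-value could be obtained separately, for example with `scipy.stats.kruskal`.

```python
def auc(control, emphysema):
    """P(emphysema sample > control sample); ties count one half."""
    wins = 0.0
    for e in emphysema:
        for c in control:
            if e > c:
                wins += 1.0
            elif e == c:
                wins += 0.5
    return wins / (len(control) * len(emphysema))


def smallest_saturating_window(auc_by_window, tolerance=0.005):
    """Smallest W whose AUC is within `tolerance` of the best AUC (hypothetical rule)."""
    best = max(auc_by_window.values())
    return min(w for w, a in auc_by_window.items() if a >= best - tolerance)
```

Preferring the smallest saturating window reflects the trade-off described above: once the AUC stops improving, a larger \(W\) only costs spatial resolution.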

Figure 3

Influence of the window size on the classification ability of \(D_{2}\). Both the AUC and the \(p\)-value are used to assess the differences between \(C\) and \(E\) lung areas

Mapping Function

According to the optimum window size, a new set of 399 tissue samples of size \(751 \times 751\) pixels (190 control and 209 emphysematous samples) was extracted from the training lung sections. As detailed in the proposed methodology, this training set was used to infer the expression of the conditional PDF of \(D_{2}\) in \(C\) and \(E\) categories, which led to the definition of the function \({SI} = SI\left( D_{2}\right) \).

The Parzen window method was applied to estimate \(f_{D_{2} \vert C}\) and \(f_{D_{2} \vert E}\) as expressed in Eq. (5). Equal prior probability was assumed for healthy and emphysematous samples of lung tissue, i.e., \(P\left( C\right) = P\left( E\right) = 0.5\). The obtained functions are depicted in Fig. 4 together with the SI function, which is given by Eqs. (3) and (4). The result shows the differences between the distributions of \(D_{2}\) for \(C\) and \(E\) categories. Areas of emphysematous tissue are associated with higher values of \(D_{2}\) (airspace enlargement), which are spread over a wider range. Furthermore, the approximated density functions \(f_{D_{2} \vert C}\) and \(f_{D_{2} \vert E}\) can be used to estimate the probability of error in the identification of lung areas from both categories according to their value of \(D_{2}\). Using a Bayesian approach, this probability is obtained as \(P(C)P(D_{2}\ge u \vert C) + P(E)P(D_{2}\le u \vert E)\), where \(u\) is the optimum decision threshold given by the intersection point between both curves.3 From our data, the value of \(u\) was found to be 114.38 \(\mu \)m and the probability of error was estimated as 2.93%. This result shows the utility of \(D_{2}\) to perform binary classification of lung areas as healthy or emphysematous, as reported in previous studies.12 Finally, it must be appreciated that the obtained SI function given by \(F_{D_{2}}\) satisfies the three properties initially required, since it maps the whole domain of \(D_{2}\) onto the 0-1 interval, increases monotonically and has a smooth profile.
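
The decision threshold \(u\) and the associated probability of error can be sketched as below, using only the Python standard library. This is an illustration, not the computation performed in the study: function names are hypothetical, the sample standard deviation is assumed in Silverman's rule, and the weighted densities are assumed to cross exactly once between the two group means, so that bisection applies.

```python
import math
import statistics


def _bandwidth(samples):
    # Silverman's rule, Eq. (7); sample standard deviation assumed
    return statistics.stdev(samples) * (4.0 / (3.0 * len(samples))) ** 0.2


def _pdf(x, samples, h):
    # Gaussian kernel density estimate, Eqs. (5)-(6)
    norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
    return sum(norm * math.exp(-((x - s) ** 2) / (2.0 * h * h)) for s in samples) / len(samples)


def _cdf(x, samples, h):
    # Integral of the kernel density estimate up to x
    return sum(0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0)))) for s in samples) / len(samples)


def bayes_threshold(control, emphysema, p_c=0.5, p_e=0.5, iterations=60):
    """Return (u, error): crossing point of the weighted PDFs and the error at u."""
    h_c, h_e = _bandwidth(control), _bandwidth(emphysema)
    lo, hi = statistics.mean(control), statistics.mean(emphysema)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # Positive difference: the control density still dominates, move right
        if p_c * _pdf(mid, control, h_c) - p_e * _pdf(mid, emphysema, h_e) > 0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    # P(C) P(D2 >= u | C) + P(E) P(D2 <= u | E)
    error = p_c * (1.0 - _cdf(u, control, h_c)) + p_e * _cdf(u, emphysema, h_e)
    return u, error
```

With equal priors and two sample sets that mirror each other, the threshold falls at the midpoint between the groups, which offers a quick sanity check of the sketch.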

Figure 4

Estimation of the statistical distribution of \(D_{2}\) from training samples using the kernel density approach: (a) approximated PDF and (b) the corresponding CDF. The latter defines the mapping function from \(D_{2}\) to SI

Emphysema Maps

The segmentation procedure described before was used to obtain the masks of the 54 lung sections in the test set. Subsequently, the resulting SI function was applied to each of them for the computation of their respective emphysema maps. Severity maps provide the probability for each pixel in the lung parenchyma to fall in an emphysematous region. Once a map is obtained, the next step consists in determining whether a lung area corresponds to normal tissue, mild emphysema or severe emphysema from its associated SI. This task can be modelled as a classification problem based on the value of SI. The Bayes decision rule can be applied to find the optimum decision thresholds on SI between the three severity groups.3 For this purpose, we used the images in the test set to evaluate the statistical distribution of SI in sections from normal untreated mice, mice induced with mild emphysema (sacrificed 24 h after treatment) and mice induced with severe emphysema (sacrificed 17 days after treatment), respectively. Figure 5 depicts the normalized histogram of SI in each severity group.

Figure 5

Statistical analysis of the SI variable from lung section images in the test set. The distribution of this variable in lung sections from normal mice, mice induced with mild emphysema and mice induced with severe emphysema is studied. A characteristic band can be identified for each of these severity groups (dotted lines): normal (\(0 \le SI < 0.5\)), mild (\(0.5 \le SI \le 0.7\)) and severe (\(0.7 < SI \le 1\))

It must be appreciated that the obtained histograms approximate the conditional PDF of SI in the three categories of interest. Assuming equal prior probabilities for the observation of tissue areas in these categories, the Bayes rule states that a lung area characterized by SI will be assigned to the severity group for which the corresponding conditional PDF achieves the highest value.3 The results show that each of the three histograms predominates over the others for distinct intervals of SI values. Approximately, these correspond to \(0{-}0.5\), \(0.5{-}0.7\) and \(0.7{-}1\) for normal, mild emphysema and severe emphysema groups, respectively. From this analysis, the domain of the variable SI can thus be divided into three bands related to the severity of the emphysematous lesions:

  1. Normal band \(\left( 0 \le {SI} < 0.5\right) \). SI values in this band indicate normal tissue. They are more probable in sections from normal mice than in those from mice induced with mild and severe emphysema.

  2. Mild band \(\left( 0.5 \le {SI} \le 0.7\right) \). SI values in this band indicate mild emphysematous lesions. They are more probable in sections from mice induced with mild emphysema than in those from normal mice and mice induced with severe emphysema.

  3. Severe band \(\left( 0.7 < {SI} \le 1\right) \). SI values in this band indicate severe emphysematous lesions. They are more probable in sections from mice induced with severe emphysema than in those from normal mice and mice induced with mild emphysema.

We define the parameters \(A_{\rm norm}\), \(A_{\rm mild}\) and \(A_{\rm sev}\) as a set of quantitative descriptors of emphysema in a lung section. They account for the percentage of SI points in the emphysema map that fall in the normal, mild and severe bands, respectively. We evaluated the utility of these descriptors for the characterization of emphysema severity in a lung section. For comparison purposes, the conventional metrics \(L_{\rm m}\) and \(D_{2}\) were also assessed. For a given section, \(L_{\rm m}\) and \(D_{2}\) were estimated by randomly selecting a total of 25 areas of lung tissue (\(751 \times 751\) pixels) in which airways and blood vessels were avoided, as described by Jacob et al. 12 The samples of the intercept length \(\left( L\right) \) and the equivalent diameter \((d) \) collected from all the areas were assembled into separate datasets to compute \(L_{\rm m}\) and \(D_{2}\), respectively.12 Figure 6 shows the boxplots of the five evaluated lung parameters in each of the three severity groups: normal, mild emphysema and severe emphysema. We found that each of the parameters can distinguish normal sections from those with mild and severe emphysematous lesions. Nevertheless, as reflected by the boxplots, the most difficult task is the discrimination between the two latter groups. The Kruskal–Wallis method was used to test the statistical significance of the differences exhibited by each parameter between lung sections from normal mice, mice induced with mild emphysema and mice induced with severe emphysema.13 The \(p\)-values for each pair of severity groups are summarized in Table 1. As can be observed, \(A_{\rm mild}\) is the most effective descriptor of the degree of emphysema severity in a lung section, since it achieved statistically significant differences \(\left( p {\mathrm{-value}} < 0.001\right) \) among the three groups. Additionally, although other parameters such as \(L_{\rm m}\), \(D_{2}\) and \(A_{\rm sev}\) did not yield significant differences between mild and severe emphysematous sections, they showed a distinct trend in the values assigned to sections from these two groups. This is reflected by the gap between their mean values in the two categories. Hence, higher values of these parameters are expected for lung sections from mice induced with severe lesions.
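As an illustration of the two computations described above, the following Python sketch derives \(A_{\rm norm}\), \(A_{\rm mild}\) and \(A_{\rm sev}\) from a list of SI values and runs a pairwise Kruskal–Wallis test. The function names and all numeric inputs are hypothetical, and the \(p\)-value relies on the usual chi-square approximation with one degree of freedom, which is an assumption: the text does not state how the \(p\)-values were obtained.

```python
import math


def band_percentages(si_values):
    """Percentage of SI points in the normal, mild and severe bands."""
    n = len(si_values)
    a_norm = 100.0 * sum(1 for s in si_values if 0.0 <= s < 0.5) / n
    a_mild = 100.0 * sum(1 for s in si_values if 0.5 <= s <= 0.7) / n
    a_sev = 100.0 * sum(1 for s in si_values if 0.7 < s <= 1.0) / n
    return a_norm, a_mild, a_sev


def kruskal_wallis_pair(x, y):
    """Kruskal-Wallis H and approximate p-value for two groups.

    Assumes the pooled values are not all identical.
    """
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average rank for tied values
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sums = [0.0, 0.0]
    for r, (_, g) in zip(ranks, pooled):
        rank_sums[g] += r
    h = 12.0 / (n * (n + 1)) * sum(
        rs ** 2 / len(grp) for rs, grp in zip(rank_sums, (x, y))
    ) - 3 * (n + 1)
    h = max(h / (1 - tie_term / (n ** 3 - n)), 0.0)  # tie correction
    p = math.erfc(math.sqrt(h / 2))  # chi-square survival, 1 d.o.f.
    return h, p


# Hypothetical SI values from one emphysema map.
toy_map = [0.1, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.3]
print(band_percentages(toy_map))

# Hypothetical per-section values of one parameter in two groups.
h_stat, p_val = kruskal_wallis_pair([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(round(h_stat, 3), round(p_val, 4))
```

The band limits mirror the three bands defined above, so each SI value is counted exactly once and the three percentages sum to 100.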

Table 1 Assessment of the statistical significance of the differences between severity groups for each of the lung parameters using the non-parametric Kruskal–Wallis test
Figure 6

Boxplots showing the statistical distribution of the five lung parameters in the three severity groups: normal, mild emphysema and severe emphysema

The test indicates that the evaluated parameters have distinct statistical properties. Thus, they may contain complementary information on the severity of the emphysematous lesions in a lung section. To study this hypothesis, a multivariate analysis based on the combination of the five parameters was conducted. We used Fisher’s linear discriminant (FLD) method to visualize the multivariate patterns extracted from each lung section image in the normal, mild and severe groups.7 A transformation matrix \({\mathrm{T}} _{\mathrm{FLD}}\), which maps input patterns onto a two-dimensional space, is defined by maximizing the ratio of the inter-class variability to the intra-class variability of the transformed samples. Figure 7 depicts the projection of the patterns on the two FLD components, which have been denoted as \({FLD}_{1}\) and \({FLD}_{2}\). A different trend can be observed for patterns in each of the three severity groups, revealing the diagnostic utility of the combination of uncorrelated variables. On the one hand, the component \({FLD}_{1}\) allows us to discriminate normal sections from those corresponding to treated mice induced with mild and severe emphysema. On the other hand, \({FLD}_{2}\) captures the differences between sections in the mild and severe groups.
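For reference, the FLD criterion described above can be written in its standard multi-class form. The symbols \(S_{W}\), \(S_{B}\), \(\mu_{c}\), \(\mu\) and \(N_{c}\) are notation introduced here for illustration and are not taken from the original analysis. Let \(x_{i}\) be the five-dimensional pattern of section \(i\), \(\mu_{c}\) the mean pattern of severity group \(c\) (containing \(N_{c}\) sections) and \(\mu\) the overall mean. Then

\[
S_{W} = \sum_{c=1}^{3} \sum_{i \in c} \left( x_{i} - \mu_{c}\right) \left( x_{i} - \mu_{c}\right) ^{\rm T}, \qquad
S_{B} = \sum_{c=1}^{3} N_{c} \left( \mu_{c} - \mu\right) \left( \mu_{c} - \mu\right) ^{\rm T},
\]

\[
{\mathrm{T}} _{\mathrm{FLD}} = \arg\max_{\mathrm{T}} \frac{\left| {\mathrm{T}}^{\rm T} S_{B} {\mathrm{T}}\right| }{\left| {\mathrm{T}}^{\rm T} S_{W} {\mathrm{T}}\right| }.
\]

Under this formulation, and assuming \(S_{W}\) is invertible, the columns of \({\mathrm{T}} _{\mathrm{FLD}}\) are the eigenvectors of \(S_{W}^{-1} S_{B}\) with the largest eigenvalues. Since \(S_{B}\) is built from three group means, its rank is at most two, which is consistent with a projection onto two components.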

Figure 7

Multivariate characterization of lung section images. Projection of the multivariate patterns using the transformation matrix obtained from Fisher’s linear discriminant (FLD) analysis

Finally, as an example, Fig. 8 shows the maps obtained for three sections corresponding to a normal mouse, a mouse induced with mild emphysema and a mouse induced with severe emphysema, respectively. Lung regions identified as normal, mild and severe were automatically obtained by selecting map pixels with an SI in the corresponding band. As can be observed, most of the lung section from the normal mouse was assigned small SI values, indicating a lower probability of emphysema. An interesting point here is the difference between the maps of the two treated mice. For the mouse induced with mild emphysema, a higher proportion of the lung section is assigned intermediate SI values, reflecting the predominance of mild emphysematous lesions. For the mouse exposed to treatment during a longer period, a relevant percentage of the lung section has been given increased values of SI. This reflects that, in addition to mild emphysema, severe lesions have developed in most of the parenchyma. Therefore, the obtained maps show a coherent behavior of the proposed index SI.

Figure 8

Emphysema maps for three different lung sections from the test set: (a) lung section from a normal untreated mouse, (b) lung section from a mouse induced with mild emphysema (sacrificed 24 h after treatment) and (c) lung section from a mouse induced with severe emphysema (sacrificed 17 days after treatment). For each of the sections, three different representations are shown: the original lung section image (left), the obtained emphysema severity map (middle) and lung areas identified as normal (\(0 \le {SI} \,<\, 0.5\)), mild (\(0.5 \le {SI} \le 0.7\)) or severe (\(0.7 < {SI} \le 1\)) emphysema (right)

Discussion and Conclusions

Emphysema commonly manifests as a component of COPD. Animal models provide a useful means to study this disease and test the efficacy of drugs. In this research, we address the problem of quantifying emphysema severity in histological lung sections. For this purpose, we defined the emphysema severity index (SI), which provides the probability for an observed lung area to be emphysematous. The variable SI was expressed as a function of \(D_{2}\), which effectively quantifies airspace enlargement in the lung, by means of the CDF of the latter. A kernel-based approach was applied to estimate the statistical distribution of \(D_{2}\) for both control and emphysematous tissue samples in order to obtain the expression of the \({SI} = {SI}\left( D_{2}\right) \) function. Finally, we evaluated the utility of this function by calculating the emphysema severity map of a lung section, enabling the localization of areas affected by emphysematous lesions.
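To make the kernel-based step more concrete, the sketch below estimates a smooth CDF of \(D_{2}\) from a set of samples. The Gaussian kernel, the hand-picked bandwidth, the function name and the sample values are all assumptions made for illustration. The sketch covers only the CDF estimate and does not reproduce the expression that maps \(D_{2}\) to SI.

```python
import math


def gaussian_kernel_cdf(samples, bandwidth):
    """Return F(x) = mean over samples of Phi((x - x_i) / bandwidth)."""
    scale = bandwidth * math.sqrt(2.0)

    def cdf(x):
        total = sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples)
        return total / len(samples)

    return cdf


# Hypothetical D2 samples (arbitrary units) and an assumed bandwidth.
toy_d2 = [40.0, 50.0, 60.0]
cdf_d2 = gaussian_kernel_cdf(toy_d2, bandwidth=5.0)

for value in (30.0, 50.0, 70.0):
    print(value, round(cdf_d2(value), 4))
```

Each sample contributes one Gaussian CDF centered on its value, so the estimate is continuous and non-decreasing. In practice the bandwidth would be chosen by a data-driven rule rather than fixed by hand.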

We present two main contributions to the ex vivo histological analysis of lung sections in the context of emphysema and COPD research: the definition of the SI variable and the computation of the emphysema map for a lung section. Our SI is a step forward over existing metrics such as \(L_{\rm m}\) and \(D_{2}\) for the quantification of airspace enlargement due to emphysema. These metrics enable a relative evaluation of the extent of emphysema by assuming increased severity for lung areas with larger values of \(L_{\rm m}\)/\(D_{2}\) with respect to those with smaller ones. Hence, \(L_{\rm m}\) and \(D_{2}\) have been traditionally adopted as a reference in previous studies on emphysema and COPD involving animal models.15,24,27 However, a scale of severity had not been established for these metrics. This gap between airspace enlargement quantification and the degree of emphysema severity is bridged by the definition of the SI function, which provides a novel analytical expression for the interpretation of \(D_{2}\).

On the other hand, emphysema maps provide spatial information on the progression of the lesions. Usually, the characterization of emphysema in lung section images has been carried out by means of stereological techniques such as the intercept length or the disector. These techniques typically yield a number that quantifies a specific lung attribute related to the alveolar wall destruction or the airspace enlargement.18 For example, the latter has been widely evaluated using \(L_{\rm m}\), which is estimated from a set of tissue samples extracted from the underlying lung section image. Peces-Barba et al. 24 used a minimum of 27 samples of lung tissue from each section while Rangasamy et al. 27 extracted 15 samples. In contrast, for the computation of emphysema maps, every pixel in the lung image is assigned its corresponding SI value. The information contained in the map can thus be employed for different purposes. For instance, it enables the spatial localization of lung areas characterized by a higher emphysema severity, as shown in the example in Fig. 8. In addition, new emphysema descriptors such as \(A_{\rm norm}\), \(A_{\rm mild}\) and \(A_{\rm sev}\) can be defined.

As reflected by our results, \(A_{\rm mild}\) summarizes the information in a lung section image by providing an objective quantification of the damage caused by emphysema. It achieved statistically significant differences between normal, mild emphysematous and severe emphysematous sections, outperforming conventional metrics such as \(L_{\rm m}\) and \(D_{2}\). Moreover, our statistical analysis reveals an acceptable capability of \(L_{\rm m}\) to characterize emphysema severity in a whole lung section, yielding similar or smaller \(p\)-values than \(D_{2}\). Previously, other researchers have shown that \(L_{\rm m}\) fails in the identification of emphysematous lung tissue with a heterogeneous distribution of the airspaces.12,22 Areas with such a heterogeneous distribution are localized in small regions of the lung corresponding to the transition from normal to emphysematous tissue. Therefore, as the computation of \(L_{\rm m}\) and \(D_{2}\) for a section was based on the random selection of a set of lung tissue samples, the probability of sampling areas with a high variability of the airspace size is small, and such areas have no relevant influence on the final estimate of these parameters.

It is worth noting that the proposed emphysema maps are not presented as an alternative to the existing metrics for emphysema quantification used in histology. These maps are intended to provide additional information for the specialist in the field. As indicated by Mühlfeld and Ochs,18 a combination of several parameters may be the most effective approach to characterize parenchyma destruction in emphysema. Our experiments show that the suggested map features \(A_{\rm norm}\), \(A_{\rm mild}\) and \(A_{\rm sev}\) can be used together with \(L_{\rm m}\) and \(D_{2}\) towards a more efficient description of emphysematous lesions. In this vein, morphometric features such as geometric attributes of the alveolar septa (volume, surface area and thickness), the volume of alveoli or the number of alveoli could also be considered.18 Furthermore, emphysema maps enable the spatial localization of the lesions. This information could be used for the assessment of patterns of tissue destruction in order to explore the relationship between emphysema and the physiological functions of the lung.6,23

To our knowledge, no preceding studies on the computation of emphysema severity maps from histological lung sections have been reported. However, similar approaches have been recently developed for CT scans. Castaldi et al. 6 classified small lung areas from the CT image into one of six different emphysema categories: no emphysema, mild, moderate or severe centrilobular emphysema, panlobular emphysema and pleural-based emphysema. They used a grid to divide the whole lung image into square areas. Once these are labeled, an emphysema map of the CT scan is obtained. Sorensen et al. 30 suggested a similar methodology to process CT images. A simpler approach was considered by assuming a single category of centrilobular emphysema. The main difference between our study and these previous ones lies in the method employed to capture images of lung tissue. Histological and CT analyses are intended for distinct purposes. For instance, histological assessment of lung tissue enables the evaluation of treatments and could be used as a reference for quantification techniques based on non-invasive procedures such as CT.17,19 The main advantage of CT is its non-invasive nature, which makes possible the assessment of living individuals. Yet, the radiation exposure is a clear drawback of this technique. Regarding the resulting emphysema maps, those reported by the cited studies are characterized by a reduced resolution in both the spatial coordinate and the emphysema severity scale. This is due to the area-by-area approach adopted to process the original lung image and the limited number of labels (categories) considered to rank severity. In contrast, our approach performs a pixel-by-pixel analysis of the lung tissue, providing a continuous probability value in the 0–1 range to quantify emphysema severity. As a future research line, the application of our methodology to CT lung images should be explored, since it would enable the computation of emphysema maps in humans. These maps could provide a powerful diagnostic tool in the context of emphysema and COPD.

Several limitations can be found in our study. First, standard sampling approaches recommended in stereology for unbiased analysis of the lung were not considered.11 The presented results would therefore be biased if used to draw conclusions about the 3D lung structure, since larger airspaces have a higher probability of being sampled.32 Therefore, the appropriate sampling rules described by Hsia et al. 11 should be taken into account for an unbiased representation of the whole lung by means of the extracted sections. On the other hand, future work will also focus on the segmentation step. It enables the identification of parenchyma pixels, which is a key step in obtaining map representations. The segmentation procedure designed here, based on morphological operators, is semi-automatic, since the user was required to indicate artifacts that were not detected by the algorithm. From our experiments, we estimated that the average time for this manual annotation was approximately 2 minutes. However, segmentation should be independent of the user to achieve fully automatic analysis of a lung section. Finally, the mapping function relating \(D_{2}\) and SI is influenced by the adopted animal model. Hence, in order to apply our method to other models, a new training set with normal and emphysematous tissue samples would be required to infer the statistical properties of \(D_{2}\).

In summary, accurate assessment of emphysema severity in histological analysis of lung sections is key for a successful characterization of COPD and treatment strategies in pre-clinical trials. A novel parameter for ranking emphysema severity, the so-called emphysema severity index \(\left( SI\right) \), has been proposed. It has been defined as a function of the airspace enlargement observed in lung tissue, which is quantified by \(D_{2}\). Additionally, a semi-automatic procedure has been designed to evaluate the SI for a whole lung section image, resulting in an emphysema severity map. This approach is proposed as a software tool to assist pathologists and researchers involved in the study of emphysema and COPD.