
1 Introduction

COVID-19 is a respiratory disease caused by the SARS-CoV-2 virus. It is primarily transmitted through respiratory droplets and can cause symptoms such as fever, cough, and difficulty breathing. In addition to laboratory tests, chest radiographs are used as a screening tool to evaluate potential signs of pulmonary infection caused by the virus. These images can exhibit characteristic patterns, such as opacities or infiltrates, which can assist doctors in diagnosing and monitoring the disease [6]. Furthermore, various studies have demonstrated the effectiveness of chest radiographs in detecting COVID-19 [1, 2, 23, 27, 38].

Currently, there are datasets available that contain labeled radiographs, which can be used to train various machine learning algorithms [2]. The construction of these image banks has been a collaborative effort involving institutions and medical experts in the field [26, 30, 33]. However, a challenge lies in the lack of uniformity of the region of interest (the lungs) within these images. Some radiographs contain unwanted or irrelevant information for classification, such as additional body parts or objects covering the chest, which can adversely affect the precision metrics of classification algorithms [5, 10].

In this work, we aim to test the hypothesis that aligning the region of interest in both the training images and the test image, so that the anatomical structures within the lungs are positionally consistent across all images, enables simple and conventional classification methods such as K-NN or MLP to achieve better accuracy, provided that a reliable feature reduction method such as PCA is employed in conjunction with a feature selection process based on the discriminatory capability of the features.

To this end, we propose applying two consecutive processes. The first involves the detection and normalization of the lung region, ensuring as far as possible that the extracted lung regions share the same alignment, location, and scale, and have improved contrast. In the second process, the “Eigenfaces” method (PCA) is applied to the aligned regions to obtain a reduced set of statistically independent features. Finally, based on the Fisher criterion [35], we propose selecting the features that best discriminate between classes. Using this set of optimal features and a traditional classifier such as K-NN or MLP, the classification accuracy is measured.

This work comprises seven essential parts. Section 1 introduces the research context, discussing related work and the database. Section 2 outlines the “Lung Finder Algorithm” (LFA) utilized for normalization. Section 3 presents the “Eigenfaces” and Fisher linear discriminant theory, along with our feature weighting approach for normalized image features. Section 4 describes the experimental setup, including the dataset and parameter settings. Section 5 reveals and analyzes the results, demonstrating the performance of our methodology using weighted K-Nearest Neighbors (K-NN) and the Multilayer Perceptron (MLP) [9]. Section 6 delves into result discussion, exploring trends and implications. Lastly, Sect. 7 offers conclusions drawn from the findings and outlines possible future research directions.

1.1 Related Work

Currently, various methodologies have been developed for the classification of chest radiographs, as evidenced in previous studies [4, 11, 13, 17, 28, 31, 37]. These methodologies make use of deep learning algorithms or traditional machine learning classifiers [7, 8], and have reported high levels of classification accuracy, greater than 96%. However, the architectures employed in these algorithms still face challenges in achieving a reliable classification of COVID-19 [32], as their accuracy decreases when tested on datasets different from those used for training. This raises the need to explore new proposals for normalizing and aligning the lung region before classification, instead of merely addressing the problem by training classifiers such as CNNs on a large number of different datasets to cope with the bias imposed by any particular one. Efficient non-CNN-based works have also been proposed, as in [3], where a Multilayer Perceptron (MLP) and an architecture based on image involution were used; involution proposes kernels similar to those of CNNs but shares their weights dynamically across all dimensions, thus reducing the number of multiplications necessary for the calculations. That work obtained a maximum classification accuracy of 98.31%. Feature selection has proven effective in increasing classification accuracy in other works, as observed in a study [20] that used support vector machines to recognize the orbit axis of sensors, and in another study [29] where frequencies of an electroencephalogram were classified. Furthermore, in a work carried out by Chengzhe et al. [21], the K-NN algorithm was applied successfully. Several studies have shown that image normalization improves classification results. In a study on kidney radiographs [10], the best results were obtained using CNNs and image normalization techniques. Also, in another work [19], different normalization techniques were used on different types of radiographs to improve image classification. It is important to highlight that the results of our work are not intended to devalue CNNs for image classification, but rather to present an alternative option, and to demonstrate that image alignment and a proper feature selection technique can produce results comparable to those of the most commonly used algorithms in the state of the art. Table 1 compares the different preprocessing methods used in some published works [3, 10, 19, 20, 21, 29].

Table 1. Comparison of the different preprocessing methods from related works.

1.2 Data Set of Radiographic Images

The database used for this work was the “COVID-19 Radiography Database” [4, 31] from Kaggle. This data set was selected because it has been used in other similar works [15, 25]. It contains 6012 images labeled as pulmonary opacity (other lung diseases), 1345 as viral pneumonia, 10192 as normal, and 3616 as COVID-19.

2 Overview of the Lung Finder Algorithm (LFA)

The goal of this algorithm is to locate the lungs in the radiographs, and it consists of a training and a testing stage, as shown in Fig. 1. During the training stage, 400 images from the Pneumonia, COVID-19, and Normal classes were randomly selected from the data set. Histogram equalization (HE) [12, 24] was applied to all images, and the regions of interest were manually labeled by placing 4 provisional landmarks that are easy for a human user to locate. It was agreed that two of them would be placed on the spine: one in the middle of the cervical vertebrae, at the upper limit of the lungs, and the other below the point where the lung region ends. The user is constrained to place the other two provisional landmarks on an imaginary straight line perpendicular to the spine that intersects it at the midpoint between the two previous landmarks; these last two landmarks are located on the left and right sides of the lung region. Finally, using these 4 provisional positions, we compute 4 final and permanent landmarks at the corners of the rectangular lung region. Then, ten new images, randomly rotated and displaced, were generated from each labeled image to obtain an augmented dataset. Next, dimensionality reduction was applied to this set of 4400 images using the “Eigenfaces” method based on Principal Component Analysis (PCA) [18, 39].

During the test stage, and after contrast improvement (HE), a new image is projected onto the “Eigenfaces” linear subspace in order to convert it into a compact low-dimensional vector, which is compared via Euclidean distance with each of the 4400 examples contained in the augmented dataset to find its k nearest neighbors (k-NN). The landmarks associated with the k most similar images from the augmented dataset are used to estimate the 4 landmarks of the test image by interpolation. These predicted landmarks are the coordinates of the corners of the lung ROI, which are used to warp the enclosed region onto a standard template of fixed size.

Fig. 1. Lung Finder Algorithm description. During the training phase, 400 images were tagged with their landmark coordinates and PCA was applied to reduce the dimensionality of the images. During the test phase, an example radiograph is provided as input and compared with its nearest neighbors to interpolate its coordinates; finally, the algorithm outputs the extracted region of interest as a new image.

2.1 Coordinates Labeling for the LFA Training Stage

Each of the images selected for this stage requires manual labeling, in which the region of interest of the lungs is delimited by a set of coordinates. These points, or landmarks, become the labels used by a weighted-regression K-NN to predict the corner coordinates of a novel image. The coordinates of the lung region are shown in Fig. 2 and consist of four points: Q1(x1,y1), Q2(x2,y2), Q3(x3,y3), and Q4(x4,y4). Q1 and Q2 represent the length of the lungs, while Q3 and Q4 represent their width. In total, 400 images were labeled manually.

Fig. 2. Example of an array of coordinates Q1, Q2, Q3, and Q4 on a radiograph.

The labeling process is shown in Fig. 3. First, the Q1 point at the top of the lungs is manually located, using the spine as a reference. The Q2 point is then placed at the bottom of the lungs. Once Q1 and Q2 are placed, a straight line connecting them automatically appears, and a perpendicular line containing the points Q3 and Q4 is drawn through its midpoint. These last two points are constrained to be placed by the user only along the perpendicular line, and may lie at different distances from the midpoint of the line Q1Q2, since the lungs are not symmetrical to each other.

Fig. 3. Sequential placement of the points Q. First Q1 is placed, then Q2, so that Q3 and Q4 appear on the perpendicular line that crosses the midpoint of the line Q1Q2. Finally, Q3 and Q4 are adjusted.
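The geometric constraint on Q3 and Q4 can be made concrete with a short computation. The sketch below (Python/NumPy; the helper name, the example coordinates, and the distances are purely illustrative and not taken from the original labeling tool) derives the midpoint of Q1Q2 and the perpendicular direction along which Q3 and Q4 are placed.

```python
import numpy as np

def perpendicular_axis(q1, q2):
    """Return the midpoint of segment Q1Q2 and a unit vector perpendicular to it.

    Q3 and Q4 are constrained to lie on the line  mid + t * perp,
    where t may differ for each side because the lungs are not symmetric.
    """
    q1, q2 = np.asarray(q1, dtype=float), np.asarray(q2, dtype=float)
    mid = (q1 + q2) / 2.0
    direction = q2 - q1
    perp = np.array([-direction[1], direction[0]])   # rotate the direction by 90 degrees
    perp /= np.linalg.norm(perp)
    return mid, perp

# Hypothetical example: Q3 and Q4 at user-chosen (possibly different) distances.
mid, perp = perpendicular_axis((128, 40), (130, 210))
q3 = mid - 70 * perp   # one side of the lung region
q4 = mid + 65 * perp   # the other side
```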

2.2 Data Augmentation

Data augmentation is used in various machine learning tasks, such as image classification, to expand a limited database and avoid overfitting [19, 22, 34]. In the case of our algorithm, we already use a large dataset [4, 31]. However, in order to obtain a set with sufficiently varied ROI coordinates, we decided to generate artificial examples from a randomly selected subset of 400 images extracted from the original set. The additional artificial images were generated by applying random translations and rotations to the original images. Ten artificial images were created from each of the original 400, resulting in a total of 4400 images. First, it was necessary to define the range of the operations on the images. For rotation, we set a range of −10 to 10\(^\circ \), as suggested by [31], and for translation a range from −5 to 5 pixels. These values were calculated by analyzing the coordinates of the 400 manually labeled images. In summary, the LFA training set contains 4400 images in which the landmark coordinates are normally distributed. Figure 4 shows an example of artificial images with their corresponding landmarks.

Fig. 4. Example of artificial images during data augmentation, applying translation and rotation operations.
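As an illustration of this augmentation step, the following sketch (assuming OpenCV, with landmarks stored as a (4, 2) NumPy array; the function name and defaults are ours) generates randomly rotated and translated copies of an image together with its landmark coordinates.

```python
import numpy as np
import cv2

def augment(image, landmarks, n_copies=10, max_angle=10.0, max_shift=5.0, rng=None):
    """Generate randomly rotated and translated copies of an image and its landmarks.

    landmarks : (4, 2) array with the provisional points Q1..Q4 in pixel coordinates.
    Rotation is about the image center within [-max_angle, max_angle] degrees,
    translation within [-max_shift, max_shift] pixels, as described above.
    """
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape[:2]
    results = []
    for _ in range(n_copies):
        angle = rng.uniform(-max_angle, max_angle)
        tx, ty = rng.uniform(-max_shift, max_shift, size=2)
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)   # 2x3 affine matrix
        M[:, 2] += (tx, ty)                                       # add the translation
        warped = cv2.warpAffine(image, M, (w, h))
        pts = np.hstack([landmarks, np.ones((len(landmarks), 1))])
        results.append((warped, pts @ M.T))                       # transformed landmarks
    return results
```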

2.3 Estimating the Corner Coordinates of the Lung Region by Regression

As shown in Fig. 1, in the test stage a new image is introduced, from which the region of interest is to be extracted. Contrast enhancement and feature reduction are automatically applied to the test image by projecting it onto the “Eigenfaces”. The weights obtained from this projection are used in the weighted-regression K-NN algorithm to find the most similar neighbors in the “Eigenfaces” space, using the Euclidean distance. In order to reduce the computational cost, the calculations are performed at a resolution of 64\(\,\times \,\)64.

Once the nearest neighbors have been identified, a regression is performed using the ROI coordinates of these neighbors in order to predict the coordinates of the lungs in the test image. For this, regression equations (1) and (2) are used; they are applied to each coordinate, x or y, of each landmark Q, until the entire set of landmarks (Q1, Q2, Q3, and Q4) is completed. The regression equations are detailed below:

$$\begin{aligned} \hat{x}=\frac{1}{k}\sum _{i=1}^{k}{x}_{n_{i}} \end{aligned}$$
(1)
$$\begin{aligned} \hat{y}=\frac{1}{k}\sum _{i=1}^{k}{y}_{n_{i}} \end{aligned}$$
(2)

where \(\hat{x}\) and \(\hat{y}\) are the predicted coordinates of a landmark, and \({x}_{n_{i}}\) and \({y}_{n_{i}}\) are the coordinates of the same landmark in the i-th nearest neighbor.
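A minimal sketch of this regression step is given below, assuming the eigenface projections and the landmark coordinates of the augmented set are available as NumPy arrays. It applies the plain averaging of Eqs. (1) and (2); a distance-weighted average could be substituted for the weighted variant. The function name and the default k are illustrative.

```python
import numpy as np

def predict_landmarks(test_vec, train_vecs, train_landmarks, k=11):
    """Predict the four ROI landmarks of a test image by k-NN regression.

    test_vec        : eigenface projection of the test image, shape (d,)
    train_vecs      : projections of the augmented training set, shape (N, d)
    train_landmarks : landmark coordinates per training image, shape (N, 4, 2)

    Each coordinate of each landmark is averaged over the k nearest
    neighbours, as in Eqs. (1) and (2).
    """
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]
    return train_landmarks[nearest].mean(axis=0)            # (4, 2) prediction
```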

2.4 Image Warping

Once the coordinates are obtained through regression, a warping operation [36] is used to extract the region of interest. In Fig. 5, examples of different test radiographs from the data set are presented along with the automatically estimated coordinates of their provisional landmarks (red dots). The estimated coordinates are geometrically transformed to obtain the corners of the ROI (final landmarks, depicted as blue dots), which are used in the warping operation towards a standard fixed-size template. On the right side of each image, the normalized image resulting from the LFA is shown.

Fig. 5. Two examples of new images with their estimated ROI coordinates used to warp the inside region towards a fixed and normalized template. (Color figure online)
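The warping towards the fixed-size template can be sketched as follows, assuming OpenCV and corner landmarks ordered top-left, top-right, bottom-right, bottom-left; the 256×256 template size is an assumption consistent with the image sizes reported in Sect. 4.

```python
import numpy as np
import cv2

def warp_roi(image, corners, size=(256, 256)):
    """Warp the quadrilateral ROI defined by four corner landmarks onto a
    fixed-size rectangular template.

    corners : (4, 2) array ordered top-left, top-right, bottom-right, bottom-left.
    size    : (width, height) of the output template.
    """
    w, h = size
    dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]],
                   dtype=np.float32)
    M = cv2.getPerspectiveTransform(corners.astype(np.float32), dst)
    return cv2.warpPerspective(image, M, size)
```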

3 Feature Reduction and Selection

After using the LFA on all radiographs in the data set to extract the regions of interest, these new images undergo additional preprocessing before being processed by a classification algorithm. For our work, we propose the use of Eigenfaces [18, 39] as a feature reduction method. In addition, we incorporate a statistical analysis of these features using Fisher’s linear discriminant in order to preserve only the most discriminating features and to weight each of them according to its power of discrimination between classes. Together, these two methods ensure a reduced number of discriminant features suitable for efficient classification with traditional classifiers.

3.1 Eigenfaces for Dimensionality Reduction

Eigenfaces [18, 39] is based on principal component analysis (PCA), and its objective is to reduce the dimensionality of the images in the dataset [16]. Because each pixel becomes a dimension or feature to be analyzed, processing 256\(\,\times \,\)256 images can be time consuming. Moreover, a large number of features, compared to a smaller number of training examples, can produce misclassification when Euclidean-distance-based approaches such as K-NN are used.

The resulting eigenfaces are sorted according to the variance of the training set they capture, and every image in the training set can be reconstructed as a linear combination of them. Because the greatest amount of variance is concentrated in the first eigenfaces, only a few of them are needed to efficiently represent all the training images, and even novel ones. Thus, every normalized image from the training set can be represented with this compact set of features. Figure 6 shows the Eigenfaces equation and the matrix Q, whose columns are the Eigenfaces. The Eigenfaces method works better, and is capable of concentrating more variance in a smaller number of eigenfaces, when the training images are more similar. In our case, the normalized images are more similar to each other than the original images from the dataset. For this reason, the number of useful PCA features is necessarily reduced when using the proposed LFA.

Fig. 6. The reconstructed image (left) is computed as a linear combination of the columns of matrix Q (middle) plus the mean image (right).
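A minimal sketch of this reduction is shown below, assuming scikit-learn is used for PCA (the paper does not specify an implementation); the 600 components match the experimental setup of Sect. 4.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_eigenfaces(X, n_components=600):
    """Fit PCA on the normalized LFA output images.

    X : matrix of flattened images, one row per image (e.g. 256*256 columns).
    Returns the fitted model, the per-image projections ("weights"),
    the eigenfaces (rows of pca.components_, i.e. the columns of Q in Fig. 6),
    and the mean image.
    """
    pca = PCA(n_components=n_components)
    weights = pca.fit_transform(X)
    return pca, weights, pca.components_, pca.mean_

# A new image x (flattened) is projected with pca.transform(x.reshape(1, -1))
# and reconstructed with pca.inverse_transform(...), i.e. mean + weights @ Q^T.
```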

3.2 Using the Fisher Discriminant to Reduce the Number of Useful Features

The Fisher discriminant criterion, also known as the Fisher ratio (FR), has been used in Linear Discriminant Analysis to find a linear projection of the features that maximizes the separation between classes. Typically, only one important feature survives this process in two-class problems. However, since the PCA features are to some degree independent, we can use, in a naive fashion, the Fisher ratio as a measure of the separation between classes for each feature.

This is done by evaluating each feature individually, checking that the means of the observations in each class are as far apart as possible, while the variances within each class are as small as possible. Using this analysis, it is possible to select more than two of the features obtained by the Eigenfaces method, namely those that best discriminate the classes in the data set [35].

The FR has been used in works such as [14], and we denote it as J. The FR formula is given in Eq. (3) (Fig. 7).

$$\begin{aligned} J_{i}=\frac{\left( \mu _{i{c}_{0}}-\mu _{i{c}_{1}}\right) ^{2}}{\sigma _{i{c}_{0}}^{2}+\sigma _{i{c}_{1}}^{2}} \end{aligned}$$
(3)
Fig. 7. Example of frequency distributions for each class. The discriminative capability of a feature can be visually assessed by the separation between the means of the histograms. The pair of histograms on the right shows a greater separation, indicating higher discrimination between classes; conversely, the pair on the left exhibits lower discrimination.
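The per-feature computation of Eq. (3) can be sketched as follows, assuming the PCA projections of each class are stored as NumPy matrices with one row per image; the function name is ours.

```python
import numpy as np

def fisher_ratio(X_c0, X_c1):
    """Fisher ratio J_i of Eq. (3), computed per feature (per column).

    X_c0, X_c1 : observation matrices (rows = images, columns = PCA features)
                 for class c0 (e.g. normal) and class c1 (e.g. COVID-19).
    """
    mu0, mu1 = X_c0.mean(axis=0), X_c1.mean(axis=0)
    var0, var1 = X_c0.var(axis=0), X_c1.var(axis=0)
    return (mu0 - mu1) ** 2 / (var0 + var1)

# Features can then be ranked by J and only the most discriminative ones kept.
```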

3.3 The Fisher Ratio as a Weight for Each Feature

We propose to use the FR value as a weight for each feature, in such a way that the features with a greater capacity for discrimination are amplified.

As a first step, we standardize all selected features in order to give them uniform relevance. Then, we calculate \(\rho _{k} = \sqrt{J_{k}}\) for each feature k. Next, we normalize \(\rho _{k}\) as shown in (4).

$$\begin{aligned} \varrho _{k}=\frac{{\rho }_{k}}{\sum _{i=1}^{K}{\rho }_{i}} \end{aligned}$$
(4)

where K is the total number of selected features.

Finally, each \(\varrho _{k}\) is used to weight all the standardized observations of feature k.
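A possible implementation of the standardization and weighting steps described above is sketched below; the function name is ours, and the same training statistics and weights are assumed to be applied to the test observations.

```python
import numpy as np

def standardize_and_weight(X_train, X_test, J):
    """Standardize each selected feature and scale it by its normalized weight.

    rho_k = sqrt(J_k) and varrho_k = rho_k / sum(rho), as in Eq. (4);
    the training mean/std and the weights are reused for the test set.
    """
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    rho = np.sqrt(J)
    varrho = rho / rho.sum()
    apply = lambda X: (X - mean) / std * varrho
    return apply(X_train), apply(X_test)
```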

4 Experiments Setup

In this work, the weighted K-NN and MLP algorithms were used for classification. Several experiments were conducted to compare the impact of different image preprocessing and feature enhancement algorithms on classification accuracy. The algorithms used in the training and testing stages included LFA for image normalization and preprocessing, Eigenfaces for dimensionality reduction, FR for selection of the best features, and W for weighting the features according to their discriminative capacity between classes. The last two together aim to improve the discriminative ability of the features across classes. A total of five experiments were conducted for each classifier, as described in Fig. 8.

Fig. 8. Graphical representation of the different experiments conducted in image preprocessing. Each arrow represents a sequence of algorithms that may include image preprocessing or feature enhancement. A classification accuracy value is calculated for each arrow.

A total of 1250 COVID-19 images and 1250 normal images, all of size 256\(\,\times \,\)256 pixels, were used. The region of interest was extracted from these images using the LFA algorithm, forming a bank of normalized images. The images were divided into 2000 training images, with 1000 from each class. For the testing phase, 500 images were selected, with 250 from each class. In the experiments using Eigenfaces-based features, 600 features were used.

For the MLP topology, 4 hidden layers with 120 neurons each and a single neuron in the output layer were utilized. The training was conducted for 100 epochs.
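As a sketch of this setup, assuming scikit-learn (the paper does not state the implementation used), the two classifiers could be instantiated as follows; K = 11 is the value later reported as optimal in Sect. 5, and max_iter bounds the number of training epochs.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Weighted K-NN: neighbours vote with weights inversely proportional to their
# distance; K = 11 is the value later found to be optimal (Sect. 5).
knn = KNeighborsClassifier(n_neighbors=11, weights="distance")

# MLP with four hidden layers of 120 neurons each, trained for up to 100 epochs.
# scikit-learn adds the single output unit of a binary problem automatically.
mlp = MLPClassifier(hidden_layer_sizes=(120, 120, 120, 120), max_iter=100)

# Usage (X_* are the weighted feature matrices, y_* the class labels):
# knn.fit(X_train, y_train); acc_knn = knn.score(X_test, y_test)
# mlp.fit(X_train, y_train); acc_mlp = mlp.score(X_test, y_test)
```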

5 Experimental Results

Different values were tested for the parameter K of the weighted K-NN, and the optimal value was found to be 11. On the other hand, experiments were conducted with various topologies and numbers of epochs for the MLP, but no significant improvements in classification accuracy were observed. The classification accuracy results for all experiments with each classifier are shown in Table 2.

Table 2. Results of the Weighted K-NN and the MLP for the experiments using different preprocessing methods.

Additional tests were conducted for Experiment 5, varying the number of features for both classifiers, and 600 was found to be the optimal number of features for both. Furthermore, Experiment 5 underwent cross-validation to demonstrate the consistency of the set of algorithms proposed in this work. Table 3 displays the results of the 5 tests, along with the average and standard deviation for each classifier.

Table 3. Results of Weighted K-NN and MLP for cross-validation.

6 Discussion of Results

For both classifiers, the following statements can be made regarding the experiments conducted in image preprocessing:

1. Experiment one, where images have no preprocessing, generally exhibits the worst results.

2. Experiment two demonstrates that image normalization improves results compared to experiment one.

3. In experiment three, where an image representation is projected onto the Eigenfaces space, no notable improvement is observed.

4. Experiment four highlights the importance of feature selection that effectively separates classes using FR, resulting in improved accuracy.

5. Experiment five showcases the effectiveness of our algorithm sequence, which includes image normalization, feature selection, and weighting, yielding the best results.

Additionally, the results demonstrate consistency with minimal variability during cross-validation. Finally, the MLP achieved accuracy results that can compete with other state-of-the-art algorithms for classifying chest X-ray images.

7 Conclusions and Future Work

In this study, we proposed a technique for automatic detection and normalization of the Region of Interest (ROI) in chest radiographs, along with a feature selection method based on Fisher’s criterion (FR) using PCA for automatic COVID-19 detection. With the proposed method, a reduced number of highly discriminative features are obtained. The results demonstrate that by utilizing both ROI alignment and feature selection processes, a significant increase in classification accuracy is achieved when using traditional classifiers such as weighted K-NN and MLP. The reported results are reliable as cross-validation techniques were employed to obtain them.

The contributions of this work include a ROI normalization method for lung images and a technique for selecting highly discriminative features using FR. Our approach achieves accuracy values that compete with other state-of-the-art works employing CNN-based techniques.

For future work, the ROI normalization technique can be applied to other databases and for the detection of other lung diseases. Additionally, the feature selection and weighting approaches can be tested to enhance the accuracy of other classification algorithms.