INTRODUCTION

X-ray images are one of the main sources of information in medicine today. A deep learning convolutional neural network that analyzes X-ray images and determines the probability of disease (patient pathology) via binary classification can be used for the quick and high-quality automatic processing of images [1]. A system that uses such a network acts as a radiologist’s assistant. The main problem is training the neural network to the required quality of recognition (i.e., to classify images with minimal errors). One way of solving this problem is to preliminarily process the images in order to separate the required areas with lung and spine and to mask unnecessary areas.

We have developed a convolutional neural network for the preliminary processing of images and implemented a set of filters in script form. In this work, we describe the procedure for processing images, the architecture of the neural network, and its learning process. The algorithm for creating the final database of images processed with the neural network is presented along with the results from numerical experiments.

IMAGE PROCESSING

In the considered problem, we dealt with black-and-white images of lungs in DICOM, a special format for medical images. Lung and spine occupied most of each image; the pixels were stored in the JPG format using RGB coding with 256 shades of gray. An image might contain damaged pixels (as in an ordinary picture) due to the lens of the medical device being dirty. This noise can affect the quality of learning of the main neural network that classifies the images. One task in preliminarily processing the images is therefore to remove this noise and to separate fragments of lung and spine in the photofluorographic images [2].

There are several ways of solving this problem:

(1) Visual detection by analyzing histograms for an image’s lines and columns.

(2) Selecting contours via “sliding” two-threshold binarization.

(3) Using a combination of filters (slow, averaging, Gaussian).

(4) Analyzing histograms of image fragment color with a fully connected neural network using K-means and random forest algorithms.

(5) Classifying fragments with a convolutional neural network.

The effectiveness of these approaches was investigated in [3], and it was decided to develop and employ an auxiliary neural network for separating and classifying image fragments in combination with a set of filters.

ALGORITHM FOR SEPARATING LUNG AND SPINE IN AN IMAGE

Our software was developed in Python using the TensorFlow library [4]. The deep learning neural network is based on convolutional neural network (CNN) technology [5] with the Inception-ResNet-v2 architecture and the 1A2B1C set of base blocks (MaxPooling, Batch8) [6]. A set of scripts was developed for automatically solving such problems as dividing an image into fragments, selecting the area of spine, and so on. In our approach, the initial image is separated into fragments that form a matrix of 32 × 32 elements, each of which is subject to binary classification by the neural network. The separated classes are “lung” and “other.” The ratio between the two classes’ fragments is on average 1 : 3 for each image. After classification, the spine area and the matrix fragments associated with it in the image are determined mathematically (step 3 of the general algorithm). The learning set consists of 24 000 fragments taken from 30 statistically typical photofluorographic images, with a 50 : 50 ratio of the classes; fragments of the class without lung were selected uniformly because of that class’s statistical dominance.
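The division of an image into a 32 × 32 matrix of fragments can be illustrated with a minimal NumPy sketch. It assumes a square grayscale image of 512 × 512 pixels (the size given in the section on estimating efficiency), which yields fragments of 16 × 16 pixels; the function name `split_into_fragments` is hypothetical and is not taken from our scripts.

```python
import numpy as np

GRID = 32  # fragments per side, as stated in the text


def split_into_fragments(image: np.ndarray, grid: int = GRID) -> np.ndarray:
    """Split a square grayscale image into a grid x grid matrix of tiles.

    Returns an array of shape (grid, grid, tile, tile), where
    result[i, j] is the tile in row i and column j of the matrix.
    """
    height, width = image.shape
    if height != width or height % grid != 0:
        raise ValueError("expected a square image whose side is divisible by grid")
    tile = height // grid
    # (grid, tile, grid, tile) -> (grid, grid, tile, tile)
    return image.reshape(grid, tile, grid, tile).swapaxes(1, 2)


# Synthetic stand-in for a 512 x 512 X-ray image (assumption: no real data here).
image = np.arange(512 * 512, dtype=np.uint32).reshape(512, 512)
fragments = split_into_fragments(image)
print(fragments.shape)
```

Each tile `fragments[i, j]` would then be passed to the classifier, so that one image produces 1024 class membership values.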

A separate test sample of 5000 fragments, with the same ratio between classes and no overlap with the learning set, was used to verify the operation of the trained neural network. Fragments were created manually with automatic support and minimal redundancy. Only lung was used as a criterion of separation, because the results were of low quality when fragments with spine were included in the form of redundant bone areas. In testing, the final mean result of the trained neural network in classifying fragments of the two classes was 93.84% relative to the “other” class. The test results are presented in Fig. 1.

Fig. 1. X-ray image recognition by our auxiliary neural network.

General Algorithm for Processing Images

(1) Form a binary membership matrix for the “lung” class by comparing the degree of membership with a threshold value determined experimentally for the best separation.

(2) Apply a standard 3 × 3 median filter to both classes in the membership matrix, producing real values. A threshold value of 0.5 is used when operating with the matrix.

(3) Separate the area of spine in a copy of the matrix with a 7 × 7 filter, using a special algorithm of element analysis for the membership matrix: travel left to right to detect two points in different sets for the upper and lower parts of the image, then draw lines and a polygon in the membership matrix according to the detected points. At subsequent stages, the area of the “lung” class includes spine.

(4) Use the algorithm in [7], based on depth-first search, to detect spots of redundant lung separation with sizes of up to 50 fragments.

(5) Smooth lung contours via our own implementation of the 3 × 3 median filter with respect to the class without lung, with a threshold value higher than 4/8 fragments.

(6) Use the same algorithm with the same parameters to detect unseparated spots of lung.

(7) Color the fragments without lung black, according to the final values in the membership matrix.
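Steps (4) and (6) can be sketched as a search for small connected regions in the binary membership matrix. The code below is only an illustration under our assumptions, not the algorithm of [7]: it uses an iterative depth-first search with 4-connectivity (the connectivity is not specified in the text), and the function name `remove_small_spots` is hypothetical. Regions of the given class that contain at most 50 fragments are reassigned to the opposite class.

```python
import numpy as np


def remove_small_spots(mask: np.ndarray, value: int, max_size: int = 50) -> np.ndarray:
    """Flip connected regions of `value` (0 or 1) that contain at most `max_size` cells."""
    rows, cols = mask.shape
    result = mask.copy()
    visited = np.zeros(mask.shape, dtype=bool)
    for r0 in range(rows):
        for c0 in range(cols):
            if visited[r0, c0] or mask[r0, c0] != value:
                continue
            # Iterative depth-first search over 4-connected neighbours.
            stack = [(r0, c0)]
            visited[r0, c0] = True
            region = []
            while stack:
                r, c = stack.pop()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and not visited[nr, nc] and mask[nr, nc] == value:
                        visited[nr, nc] = True
                        stack.append((nr, nc))
            if len(region) <= max_size:
                for r, c in region:
                    result[r, c] = 1 - value
    return result


# Synthetic 32 x 32 membership matrix: 1 = "lung", 0 = "other".
mask = np.zeros((32, 32), dtype=np.uint8)
mask[4:28, 4:14] = 1      # large lung region (240 fragments)
mask[0:2, 20:22] = 1      # small false "lung" spot (4 fragments)
mask[10:12, 6:8] = 0      # small hole inside the lung (4 fragments)

without_spots = remove_small_spots(mask, value=1)       # step (4)
cleaned = remove_small_spots(without_spots, value=0)    # step (6)
print(int(mask.sum()), int(cleaned.sum()))
```

The large background region of the “other” class is left untouched in the second call because it exceeds the size limit.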

Once all fragments fed to the trained neural network are processed, it outputs real values (0‒1) for the classes “lung” and “other,” generating a matrix of the probability of belonging to the first class.
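The way the probability matrix is turned into a masked image (steps (1), (2), and (7)) can be sketched as follows. This is a minimal illustration, not our scripts: the membership threshold of 0.5 in step (1) is a placeholder, since the actual value was chosen experimentally; edge padding at the matrix border is our assumption; the function names are hypothetical.

```python
import numpy as np


def median_filter_3x3(matrix: np.ndarray) -> np.ndarray:
    """3 x 3 median filter with edge padding (the padding mode is an assumption)."""
    padded = np.pad(matrix, 1, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    return np.median(windows, axis=(2, 3))


def lung_membership(probabilities: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Steps (1) and (2): threshold, median-filter, and threshold again at 0.5."""
    binary = (probabilities > threshold).astype(float)
    return median_filter_3x3(binary) > 0.5


def mask_image(image: np.ndarray, lung_mask: np.ndarray) -> np.ndarray:
    """Step (7): color every fragment outside the lung mask black."""
    tile = image.shape[0] // lung_mask.shape[0]
    pixel_mask = np.repeat(np.repeat(lung_mask, tile, axis=0), tile, axis=1)
    masked = image.copy()
    masked[~pixel_mask] = 0
    return masked


# Synthetic probability matrix standing in for the network output.
probabilities = np.zeros((32, 32))
probabilities[8:24, 4:14] = 0.9   # region classified as "lung"
probabilities[0, 30] = 0.9        # isolated misclassified fragment

lung_mask = lung_membership(probabilities)
image = np.full((512, 512), 200, dtype=np.uint8)
masked = mask_image(image, lung_mask)
print(bool(lung_mask[0, 30]), masked.shape)
```

In this sketch the isolated fragment is suppressed by the median filter, while the pixels of fragments inside the lung region keep their original values.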

ESTIMATING THE EFFICIENCY OF PRELIMINARY PROCESSING

We preliminarily processed the initial Kazan database of X-ray lung images using our auxiliary neural network. Each initial image of 512 × 512 pixels was separated into fragments in a 32 × 32 matrix. The fragments were fed to the trained neural network and separated into classes. Further processing was done according to the algorithm described above. Examples of the initial and preliminarily processed images with unnecessary areas colored in black are presented in Fig. 2. We thus obtained a database of preliminarily processed X-ray images for the main neural network developed in [1]. To assess the effect of preliminarily processing the images, we compared the quality of the main neural network’s operation in three cases: when unprocessed images are used for learning, when the preliminarily processed image database is used, and when the preliminarily processed image database is used with HSV coding instead of RGB. We evaluated a test sample of 21 064 images from the Kazan image database (10 532 normal and 10 532 with pathology) that was kept separate from the images used for learning, with no overlap. We calculated the number of images accepted as the norm, which shows how much the physician’s workload is reduced (the system’s efficiency), and the number of images with pathology that are considered a norm with a probability higher than the level of confidence. The latter value determines the probability of neural network error. To store the metadata and image database, we used the PostgreSQL DBMS with replication [8].
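The two quantities described above can be sketched as follows. The denominators are our assumptions for illustration only, since the text does not fix them: the reduction of workload is taken as the share of all images accepted as the norm, and the error as the share of images with pathology accepted as the norm. The function name `screening_metrics` and the sample values are hypothetical.

```python
import numpy as np


def screening_metrics(p_normal, is_pathology, confidence):
    """Return (workload reduction, error rate, number of missed pathologies).

    p_normal     -- probability of the norm for each image
    is_pathology -- True where the image actually shows pathology
    confidence   -- level of confidence, e.g. 0.8 or 0.9
    """
    p_normal = np.asarray(p_normal, dtype=float)
    is_pathology = np.asarray(is_pathology, dtype=bool)
    accepted = p_normal > confidence          # images accepted as the norm
    missed = accepted & is_pathology          # pathology accepted as the norm
    workload_reduction = accepted.mean()      # assumption: relative to all images
    error_rate = missed.sum() / is_pathology.sum()  # assumption: relative to pathology
    return float(workload_reduction), float(error_rate), int(missed.sum())


# Hypothetical toy data: three normal images and three images with pathology.
p_normal = [0.95, 0.85, 0.60, 0.92, 0.30, 0.81]
is_pathology = [False, False, False, True, True, True]

for confidence in (0.8, 0.9):
    reduction, error, missed = screening_metrics(p_normal, is_pathology, confidence)
    print(f"confidence level {confidence}: reduction of workload {reduction:.1%}; "
          f"errors {error:.1%} ({missed} units)")
```

Raising the level of confidence accepts fewer images as the norm, which lowers both the reduction of workload and the number of errors.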

Fig. 2. Examples of images: initial, with spine; final, processed.

The results for the initial Kazan database of X-ray images with the 1A2B1C architecture of the main neural network (MaxPooling, Batch8) were

—confidence level, 0.8: reduction of workload, 62.8%; errors, 11.1% (111 units);

—confidence level, 0.9: reduction of workload, 45.7%; errors, 5.7% (57 units).

The results for the preliminarily processed initial Kazan database of X-ray images with the 1A2B1C architecture of the main neural network (MaxPooling, Batch8) were

—confidence level, 0.8: reduction of workload, 62.5%; errors, 9.5% (95 units);

—confidence level, 0.9: reduction of workload, 43.0%; errors, 4.2% (42 units).

The results for the preliminarily processed initial Kazan database of X-ray images when HSV coding is used instead of RGB with the improved 8A20B4C architecture of the main neural network (MaxPooling, Batch8) were

—confidence level, 0.8: reduction of workload, 59.5%; errors, 9.6% (59 units);

—confidence level, 0.9: reduction of workload, 43.0%; errors, 4.9% (49 units).
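To make the comparison between the unprocessed and the preliminarily processed RGB results concrete, the short sketch below computes the differences from the percentages listed above. Only values given in the text are used; the HSV results are left out because they were obtained with a different architecture. The variable and function names are hypothetical.

```python
# (reduction of workload %, errors %) per confidence level, copied from the text.
unprocessed = {0.8: (62.8, 11.1), 0.9: (45.7, 5.7)}
preprocessed = {0.8: (62.5, 9.5), 0.9: (43.0, 4.2)}


def compare(before, after):
    """Return the change in workload reduction and errors (percentage points)
    and the relative change in errors (percent of the initial error value)."""
    workload_delta = after[0] - before[0]
    error_delta = after[1] - before[1]
    relative_error_change = 100.0 * error_delta / before[1]
    return workload_delta, error_delta, relative_error_change


comparison = {
    level: compare(unprocessed[level], preprocessed[level])
    for level in unprocessed
}

for level, (workload_delta, error_delta, relative) in comparison.items():
    print(f"confidence level {level}: workload {workload_delta:+.1f} points; "
          f"errors {error_delta:+.1f} points ({relative:+.1f}% relative)")
```

At both levels of confidence the error percentage falls after preliminary processing, by roughly 14% and 26% in relative terms, while the reduction of workload does not increase.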

CONCLUSIONS

Our investigations show that the auxiliary neural network is efficient in separating areas with lung and spine in photofluorographic X-ray images. The quality of fragment classification was approximately 94%. The preliminary processing of images does not improve the quality of the main neural network’s operation according to the criterion of reducing the physician’s workload, but it does lower the number of errors in image classification. If HSV coding is used instead of RGB, we see no appreciable effect on the main neural network’s learning upon an increase in the number of basic units. The proposed approach can be used not only for image processing but also for several applied problems, particularly for detecting damage in gas pipelines [9] by processing numerical data from measurements.