
1 Introduction

As per research published by the Indian Journal of Medical Research under the title “Doctor Population Ratio of India—The Reality,” India has an estimated shortage of six lakh doctors and twenty lakh nurses. India is planning to establish two hundred new medical colleges in the next decade to meet this requirement. At the same time, the cost of medical treatment is rising for the common man in India: about 65% of health expenditure is borne by individuals themselves, and as per data recently released by the Government of India, medical expenses push an estimated 57 million people into poverty each year.

Medical diagnostics using machine learning, powered by computer vision and deep learning, will help extract useful information by filtering out non-essential and insignificant details from the diagnosis report [1]. Computer vision, neural networks, and artificial intelligence methods such as convolutional neural networks help identify and extract the useful information from the diagnosis report, which in turn assists in medical diagnosis. We will train and develop a medical diagnostics tool for organizations, government bodies, or individual users that will assist doctors and medical personnel in medical diagnosis.

2 Background

The National Electronic Health Records Survey (NEHRS) is an annual comprehensive survey of employed, office-based physicians. Health risks are usually evaluated, and research conducted, based on factors such as diet and prescribed exercise. Recently, many researchers have achieved promising results by applying computational techniques to electronic databases. At the same time, securing the data and maintaining patient privacy are primary concerns when health records are maintained electronically. This research article provides an optimal method to identify a specific disease using suitable computational methods and also justifies the reliability of the developed system.

3 Objectives

As per the above scope, the following objectives are defined in this research work:

  • Medical diagnostics using machine learning

  • Developing a medical diagnostics tool

  • Medical diagnostics tool powered by machine learning and deep learning with higher prediction capability than the traditional models

  • Identifying and extracting the most critical/important information from the diagnostic report

  • Reducing the manual touchpoints while performing the medical diagnosis.

4 Proposed Process Flow

To carry out the proposed research work, the resources needed are artificial neural networks, computer vision, and Python. The hardware requirements are 32 GB RAM, a 1 TB hard disk, and a Windows/Linux machine. The potential challenges and risks involved are that different sources of data have different patterns and quality, which usually requires considerable effort to prepare and clean for analysis [2]. Privacy of the data is also one of the challenges in this type of domain. Figure 1 presents the overall flow of the proposed model.

Fig. 1 Proposed flow diagram indicating all involved modules

Around 6000 JPEG X-ray images are considered for detecting the pneumonia condition. The images are distributed into different subfolders for the training, testing, and validation stages [3]. Anterior–posterior chest images from pediatric patients are selected for this study. Clinical and laboratory symptoms are considered while selecting chest images for the investigation [4]. Several chest radiographs are filtered to remove images with noise, poor quality, or unreadable data. Finally, the filtered images are certified by experts before being used for training our model. In this phase, grading errors are recorded and discarded from the training database.
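As a minimal sketch of this split (assuming a local directory layout with train/val/test subfolders and NORMAL/PNEUMONIA class folders; the names are chosen here for illustration only), the distribution of images per subset can be verified as follows:

```python
import os
from glob import glob

# Hypothetical directory layout: chest_xray/{train,val,test}/{NORMAL,PNEUMONIA}
DATA_DIR = "chest_xray"

for split in ("train", "val", "test"):
    for label in ("NORMAL", "PNEUMONIA"):
        files = glob(os.path.join(DATA_DIR, split, label, "*.jpeg"))
        print(f"{split}/{label}: {len(files)} images")
```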

5 Methodology

5.1 Dataset Collection

One of the major potential challenges for this work is obtaining relevant medical data. As mentioned earlier, around 6000 JPEG images from unique patients have been considered for this study. Based on the associated radiology reports, the text contents are extracted and used for classification through the preprocessing phases of language processing tools [5]. A unique labeling process is adopted to disambiguate and group the images according to the clinical text data, as per the proposal in the article “ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases” (Wang et al.). This dataset comprises 12 zip files, each of size 2–4 GB. The technologies used are Keras, Python, Spyder, Jupyter, OpenCV, TensorFlow, and image acquisition through CNN datasets.
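A minimal sketch of unpacking such an archive set into a working folder (assuming the 12 zip files have been downloaded into a local folder named images_zip; both folder names are illustrative assumptions):

```python
import os
import zipfile

ZIP_DIR = "images_zip"       # hypothetical folder holding the downloaded archives
OUT_DIR = "images_unpacked"  # extraction target

os.makedirs(OUT_DIR, exist_ok=True)

for name in sorted(os.listdir(ZIP_DIR)):
    if name.endswith(".zip"):
        with zipfile.ZipFile(os.path.join(ZIP_DIR, name)) as zf:
            zf.extractall(OUT_DIR)  # collect every image into one working directory
        print(f"extracted {name}")
```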

The typical text preprocessing steps involved are: removal of extra white space, expanding contractions, removing noise and special characters, normalizing all text to lower case, finding the maximum length of the text, tokenization, stop word removal, and stemming/lemmatization.
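These steps can be sketched in plain Python as follows; the contraction map, stop word set, and suffix-strip “stemming” are deliberately simplified illustrations, not the exact pipeline or resources used in this work:

```python
import re

# Small illustrative resources; a real pipeline would use a full stop word
# list and an NLTK/spaCy stemmer or lemmatizer.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "in"}

def preprocess(text: str) -> list:
    text = " ".join(text.split()).lower()                # remove extra white space, lower-case
    for short, full in CONTRACTIONS.items():             # expand contractions
        text = text.replace(short, full)
    text = re.sub(r"[^a-z\s]", " ", text)                # drop noise and special characters
    tokens = text.split()                                # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [re.sub(r"(ing|ed|s)$", "", t) for t in tokens]  # crude suffix-strip stemming

reports = ["No acute infiltrates. It's otherwise a normal chest X-ray."]
processed = [preprocess(r) for r in reports]
max_len = max(len(tokens) for tokens in processed)       # maximum length of the text
print(processed, max_len)
```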

5.2 Preprocessing

Standardization of the features is conducted by standardizing pixel values across the whole database. This action is applied to each column in a tabular database [6]. Feature-wise centering and feature-wise standard normalization parameters are used to standardize the entire image data generator class. This process is monitored closely in order to avoid multiple arguments with the same effect; otherwise, the redundant entries need to be filtered, which is an added effort. Typical image processing algorithms are applied to these digitized images. Digital image processing algorithms have many advantages over analog processing. The majority of digital image processing algorithms help enhance image features by eliminating noise or correcting skewed images. These enhanced image parameters considerably help in developing artificial intelligence models. The typical image processing phases include reading the image, resizing it, de-noising it (if needed), normalizing it, segmenting it, and smoothing edges as per the needs.
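A minimal sketch of these two stages, assuming the Keras ImageDataGenerator for feature-wise standardization and OpenCV for the basic per-image steps (the file names, target size, and the two-image stack are illustrative placeholders):

```python
import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def load_and_clean(path, size=(224, 224)):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)   # read the X-ray image
    img = cv2.resize(img, size)                    # resize to a fixed input shape
    img = cv2.GaussianBlur(img, (3, 3), 0)         # light de-noising
    return img.astype("float32") / 255.0           # normalize pixel values to [0, 1]

# Illustrative stack of cleaned images, shape (N, H, W, 1); paths are placeholders
images = np.expand_dims(
    np.stack([load_and_clean(p) for p in ["img1.jpeg", "img2.jpeg"]]), axis=-1
)

# Feature-wise centering and standard normalization across the whole dataset
datagen = ImageDataGenerator(featurewise_center=True,
                             featurewise_std_normalization=True)
datagen.fit(images)  # computes the dataset-wide mean and std used for standardization
```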

6 Model Building

The typical neural network model and convolutional neural network model are shown in Figs. 2 and 3. Convolutions are meant to extract key features from the input images. By learning image features, they preserve the relationship among the pixels of the input images [7]. Two inputs, the image matrix and the kernel (or filter), are considered for the mathematical operation. The output volume dimension is obtained from the input image dimension and the relevant filter [8]. The convolution multiplies the image matrix with the filter matrix to generate a feature map. The stride of the convolution layer is also significant.
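As a small illustration (plain NumPy, valid convolution with stride 1; not the layer implementation used later in the model), the feature map is produced by sliding the kernel over the image matrix, and its size follows (n - f)/s + 1 for an n x n input, f x f filter, and stride s:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    n, f = image.shape[0], kernel.shape[0]
    out = (n - f) // stride + 1                  # output size: (n - f) / s + 1
    fmap = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = image[i*stride:i*stride+f, j*stride:j*stride+f]
            fmap[i, j] = np.sum(patch * kernel)  # element-wise multiply and sum
    return fmap

image = np.arange(25, dtype=float).reshape(5, 5)   # 5 x 5 image matrix
kernel = np.array([[1., 0., -1.]] * 3)             # 3 x 3 vertical-edge filter
print(conv2d(image, kernel))                       # 3 x 3 feature map
```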

Fig. 2 Simple neural network architecture

Fig. 3 Convolutional neural network architecture

Fig. 4 A sample code showing preprocessing and visualization

This section provides a complete overview on the developed model along with the code samples. Figures 4 and 5 show the code for preprocessing steps and visualization steps.

Fig. 5 Graph showing the status of pneumonia

Fig. 6 Sample images of affected lungs

It is noticed that the feature map size is smaller than the input size. The feature map has to be prevented from shrinking [9], which is achieved through the padding process. Zero-valued pixels are added around the input in order to avoid shrinking of the feature map, which ensures that the spatial size remains constant. Padding improves the performance, and the kernel size is kept constant. When the input images are large, the number of parameters is reduced by pooling. By retaining the key features, dimensionality reduction happens through downsampling. The common downsampling methods are max pooling, average pooling, and sum pooling [10]. Suitable bias values are applied to ensure an efficient activation function. When the derivatives are too steep, several neurons can get deactivated, which results in a passive network. When an epoch consumes too much time to run, it is decomposed into batches. Binary cross-entropy is used to average the class-wise errors. The Adam optimizer is used to update the network weights iteratively based on the training data; unlike classical stochastic gradient descent, which maintains a single learning rate for all weight updates that remains unchanged during training, Adam adapts the learning rate for each weight. The convolution layer is basically a feature detector that automatically learns to filter out the information that is not needed. Pooling layers reduce the memory required for processing and also detect object characteristics even when they appear at unusual positions in the image.
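A minimal Keras sketch of a CNN of this kind (the layer sizes and input shape are illustrative assumptions, not the exact configuration reported in Figs. 8 and 9); it uses 'same' padding, max pooling, a sigmoid output for the binary normal/pneumonia decision, binary cross-entropy loss, and the Adam optimizer:

```python
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(224, 224, 1)),                             # grayscale chest X-ray
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),  # zero padding keeps spatial size
    layers.MaxPooling2D((2, 2)),                                   # downsample, retain key features
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),                         # normal vs. pneumonia
])

model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),  # per-weight adaptive updates
              loss="binary_crossentropy",                     # class-wise error measure
              metrics=["accuracy"])
model.summary()
```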

Fig. 7 Sample code showing data normalization, resizing, and augmentation

Fig. 8 Code sample to show the process of training the proposed model

7 Code Snippet

All the required class definitions and visualization steps essential for the model are shown. Figure 6 shows images of both normal and pneumonia-affected lungs. The deviations visible in the affected images indicate the extent of infection and become more evident as the number of CNN epochs is increased. The code for data normalization, which is responsible for noise elimination and filtering, is shown in Fig. 7; the figure also shows the resizing and augmentation code. Figure 8 illustrates the steps followed for training the model. We classified the trainable and non-trainable parameters from the input file and identified the percentage of data samples collected. Figure 9 lists the parameters for training the model.
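A minimal sketch of such a training step (not the exact code shown in Fig. 8; the directory layout, batch size, and epoch count are illustrative assumptions carried over from the earlier sketches) that also reports the trainable and non-trainable parameter counts:

```python
# Assumes `model` and `datagen` from the earlier sketches and the
# chest_xray/{train,val} directory layout used above for illustration.
train_flow = datagen.flow_from_directory(
    "chest_xray/train", target_size=(224, 224), color_mode="grayscale",
    class_mode="binary", batch_size=32)
val_flow = datagen.flow_from_directory(
    "chest_xray/val", target_size=(224, 224), color_mode="grayscale",
    class_mode="binary", batch_size=32)

history = model.fit(train_flow, validation_data=val_flow, epochs=10)

trainable = sum(w.numpy().size for w in model.trainable_weights)
non_trainable = sum(w.numpy().size for w in model.non_trainable_weights)
print(f"Trainable params: {trainable}, non-trainable params: {non_trainable}")
```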

Fig. 9 Status of total parameters considered for training the model

8 Analysis of Model Performance

The proposed work presents an optimal method of analyzing patients' health records in the form of images. Through the CNN, the training accuracy and the accuracy of the validated results are checked. Figure 10 shows the graph with promising results, and it is evident that the method followed is reliable. Another graph is also shown with a very minimal loss rate on the considered datasets.
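Such accuracy and loss curves can be reproduced from the training history returned by Keras (a sketch assuming the `history` object from the illustrative training step above):

```python
import matplotlib.pyplot as plt

# `history` is the object returned by model.fit(...) in the training sketch above
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.plot(history.history["accuracy"], label="training accuracy")
ax1.plot(history.history["val_accuracy"], label="validation accuracy")
ax1.set_xlabel("epoch"); ax1.legend()

ax2.plot(history.history["loss"], label="training loss")
ax2.plot(history.history["val_loss"], label="validation loss")
ax2.set_xlabel("epoch"); ax2.legend()

plt.tight_layout()
plt.show()
```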

9 Conclusion and Future Scope

The proposed solution would be used by organizations, government authorities, and medical authorities to reduce the workload of overloaded medical personnel and to provide medical facilities to everyone at an affordable cost.

Pneumonia is considered one of the serious health conditions that leads to a considerable proportion of mortality. The condition can be controlled through early diagnosis supported by computational techniques. Among the various diagnostic procedures, chest X-rays are considered a reliable tool for screening and examination. Even though considerable imaging equipment is available, the shortage of experts to interpret the images is an added challenge. This work proposes an additional procedure for early detection of the disease through clinical and laboratory evidence from chest X-ray images.

Fig. 10 Performance of the model developed