1 Introduction

Big data analytics is now widely applied to support care delivery and disease research. One aspect of healthcare innovation that has recently gained importance is addressing the growing pains of introducing big data analytics to medical image analysis (MIA). An image is a spatial map of physical properties of the anatomy, in which the pixel intensity represents the value of a physical property at that point. Imaging the anatomy is a way to record spatial, structural, and contextual information. In this setting the anatomy could be almost anything, but the objective of imaging is straightforward: convert a real-world scene into an array of pixels that represents it and can be stored in a computer. This review focuses on biomedical images, a subset of images that pertain to a biological specimen, generally some part of human or animal anatomy.

The categorization of medical imaging (MI) modalities is shown in Fig. 1. These modalities serve various purposes, such as imaging the inside of the body without harming it or imaging specimens that are too small to be seen with the naked eye. The purpose of MI is to help radiologists and doctors diagnose and treat patients more efficiently. The mechanisms of the cell, the basic building block of the body, can now be viewed with the latest MI equipment. Being able to view these phenomena is not sufficient, however; diagnosis also requires quantitative information generated through image analysis. Biomedical image analysis methods obtain quantitative measurements and inferences from images and are better suited to handling voluminous data, which makes it feasible to find and monitor certain biological processes and to extract information about them. The analysis remains challenging because images are diverse, complex, and irregular in shape.

Fig. 1. Categorization of medical imaging modalities [1].

Machine learning (ML) and artificial intelligence (AI) help doctors investigate and predict disease more precisely and more quickly. These approaches strengthen the ability of doctors and researchers to interpret the variations that lead to disease. Although automatic identification of diseases with traditional MI methods has shown considerable accuracy for decades, new developments in ML techniques have attracted many researchers seeking to increase the effectiveness of MIA.

Localization and interpretation of anatomical structures are key phases in the radiologist's MI workflow. The radiologist carries out both tasks by finding specific anatomical signatures, that is, image features that distinguish one structure from another. Segmentation of organs and their substructures allows quantitative analysis of clinical parameters related to size and shape [2]. It is often the first stage in a computer-aided diagnosis (CADx) pipeline. The segmentation task is usually defined as identifying the set of pixels that make up either the contour or the interior of an object. Detection, commonly referred to as computer-aided detection (CADe), is also an active field of study, because a lesion missed on a scan has serious consequences for both the patient and the clinician. In the classification task, one or more images are taken as input and a single diagnostic value is produced as output, for example whether the disease is present or not. Across all of these tasks, deep learning (DL) methods have steadily improved performance.

The most powerful models for image analysis to date are convolutional neural networks (CNNs). A single CNN can have many layers, with early layers identifying edges and other general features and deeper layers identifying more specific features. An image is convolved with filters (also called kernels) and pooling is then applied; this process may repeat over several layers until more recognizable features emerge [3]. The first real-world application of CNNs was handwritten digit recognition with LeNet (1998). MI still has many open research questions, and DL is expected to have a substantial impact on the field as a whole.
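The convolution and pooling steps described above can be illustrated with a minimal NumPy sketch. The toy 6x6 image, the hand-written vertical-edge filter, and the function names `conv2d_valid` and `max_pool2x2` are assumptions made for illustration; in a real CNN the filter weights are learned rather than fixed.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    This is cross-correlation, which is what most DL libraries call
    "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 block."""
    h2, w2 = feature_map.shape[0] // 2, feature_map.shape[1] // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

# Toy image (assumption): dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written 3x3 filter that responds to dark-to-bright vertical edges.
vertical_edge = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = conv2d_valid(image, vertical_edge)  # shape (4, 4)
pooled = max_pool2x2(feature_map)                 # shape (2, 2)
```

The feature map is non-zero only where the filter window straddles the intensity step, and pooling halves the resolution while keeping the strongest response in each block.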

The rest of the paper is organized as follows. Section 2 covers the history of MIA, the standards for image formats and representations, and the PACS system. Section 3 describes the research methodology, search criteria, and search process for this review. Section 4 describes the influence of DL on specific tasks in MIA. Finally, Sect. 5 discusses the results obtained and the research challenges in various application areas.

2 Evolution of Medical Image Analysis and Its Standards

In the 1970s, the prevailing AI paradigm led to the implementation of rule-based and expert systems. In medicine, the MYCIN system [4] by Shortliffe was the first such system to be implemented; it produced antibiotic therapy regimens for patients. Between 2015 and 2017 most algorithms focused on unsupervised ML, after which research shifted towards supervised ML methods, namely convolutional neural networks (CNNs) [5]. The first artificial neuron concept was described by McCulloch and Pitts [6] in 1943 and was later implemented as the perceptron by Rosenblatt [8] in 1958. The purpose of a deep neural network (DNN) is to identify important low-level features (LLF), such as lines or edges, automatically and then combine them into high-level features, such as shapes, in subsequent layers [7].

CNNs trace their origins to the Neocognitron proposed by Fukushima [9] in 1982, but it was LeCun et al. [10] who established CNNs, using the error backpropagation described by Rumelhart et al. [11] to recognize handwritten digits automatically. The approach became widely popular after Krizhevsky et al. [11] won the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) with an error rate of 15%, compared with 26% for the second-place entry. Krizhevsky et al. brought together several promising techniques for CNNs, including the rectified linear unit (ReLU) activation function, data augmentation, and dropout. The number of papers on CNN architectures and their applications has since risen sharply, and CNNs have become the principal architecture in MIA.
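The three techniques named above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation of Krizhevsky et al.: the dropout shown is the "inverted" variant common in current libraries (it rescales at training time, and `rate` must be below 1), and the augmentation is limited to a horizontal mirror. The function names are assumptions.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative activations become zero."""
    return np.maximum(x, 0.0)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero roughly a fraction `rate` of the activations
    and rescale the survivors so the expected value is unchanged."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def augment_flip(image):
    """Return the original image and its horizontal mirror."""
    return [image, image[:, ::-1]]

rng = np.random.default_rng(0)
activations = relu(np.array([-1.5, 0.0, 2.0]))
dropped = dropout(np.ones(1000), rate=0.5, rng=rng)
views = augment_flip(np.array([[1, 2, 3]]))
```

At inference time the same `dropout` function is called with `training=False` and returns its input unchanged.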

The development of image analytics and quantification methods builds on common standards for image formats, data representation, and the capture of metadata required for downstream analysis. Digital Imaging and Communications in Medicine (DICOM) [12] is a widely used standard for organizing, storing, printing, and transmitting MI data.

Health Level Seven (HL7) [13] is a more general standard used for the exchange, integration, sharing, and retrieval of electronic healthcare information. It defines standards not only for data but also for application interfaces that use electronic healthcare data. The Integrating the Healthcare Enterprise (IHE) [14] initiative promotes the adoption of the DICOM and HL7 standards for improved clinical care and better integration of the healthcare enterprise. MI data is generally collected and managed using specialized systems known as Picture Archiving and Communication Systems (PACS) [15]. PACS house medical images from most imaging modalities and can also contain electronic reports and radiologist annotations in encapsulated form. Commercial PACS not only support search, query/retrieve, display, and visualization of imaging data, but often also include sophisticated post-processing and analysis tools for exploring, analyzing, and interpreting image data.

3 Research Methodology

A systematic literature review (SLR) is a process of identifying, interpreting, and evaluating all available research relevant to a specific research question, topic area, or phenomenon of interest. An SLR is also called a secondary study. This approach makes it possible to summarize the existing evidence about a technology and helps to identify research gaps that should be addressed [16, 17].

3.1 Research Questions

The main aim of this SLR is to address the following research question: How can DL improve the identification and segmentation of organs and substructures, and how accurately can it perform classification? This research question is further divided into four sub-questions:

  • RQ1: Which DL techniques are used for organ, region, and landmark localization?

  • RQ2: What are the various DL methods considered to perform the segmentation of organs and other substructures?

  • RQ3: How can anomalous or suspicious regions in structural images be detected or localized using DL techniques?

  • RQ4: What are the various DL approaches considered for Image/Exam classification?

3.2 Search Strategy

Index terms act as keys for isolating relevant scientific articles, so it is necessary to choose keywords that reliably retrieve related articles and filter out surplus material [18]. The index terms chosen were: “Medical Image Analysis”, “Medical Imaging”, “Localization”, “Segmentation”, “Classification”, “Deep Learning”, “Machine Learning”, and “CNN”. Based on our familiarity with the journals, we selected databases that traditionally publish articles on the subject: PubMed, arXiv, IEEE Xplore Digital Library (IEEE), Web of Science, and Scopus.

Fig. 2. PRISMA flow chart diagram for selection of research papers [22].

To address these RQs, a search string was specified using the PICO approach, which divides the RQ into four parts: population, intervention, comparison, and outcome [17]. The comparison part was omitted because this SLR is concerned with identification. The remaining parts are as follows:

  1. Population: Research on MIA that helps to infer information about diseases from various imaging modalities. Localization keywords are included because localization is a preliminary step for segmentation, and the segmented region of interest is then used to classify the progression or grade of disease.

  2. Intervention: ML and CNN techniques are now prominent in image analysis. The ML keyword was selected from the family of ML approaches and the CNN keyword from the family of DL approaches.

  3. Outcome: Detecting whether the disease is present and, if so, identifying its progression.
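A search string of this kind can be assembled by joining the terms within each PICO part with OR and joining the parts with AND. The sketch below is illustrative only: the exact string used in this review is not reproduced here, and the assignment of the index terms to the population and intervention groups is an assumption.

```python
# Assumed grouping of the index terms listed in Sect. 3.2.
population = [
    "Medical Image Analysis",
    "Medical Imaging",
    "Localization",
    "Segmentation",
    "Classification",
]
intervention = ["Deep Learning", "Machine Learning", "CNN"]

def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

def build_query(*groups):
    """Join the OR-groups with AND so every PICO part must match."""
    return " AND ".join(or_group(g) for g in groups)

query = build_query(population, intervention)
print(query)
```

Individual databases use slightly different field tags and operators, so a string built this way would normally be adapted per database.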

3.3 Search Results

The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow chart [19] summarizes the selection of articles, as presented in Fig. 2. The search yielded a total of 270 articles.

Fig. 3. (a) Total number of publications considered; (b) publications per year.

The inclusion criteria used in this work were: the article has MIA as a major subject, rather than merely using software that performs localization, segmentation, detection, or classification; and the work was available for analysis, i.e., the complete paper was accessible. After this phase, 118 articles remained, which required an exhaustive assessment. The selected works were classified according to the key method used with respect to the segmentation procedure. Finally, tables were created in chronological order, summarizing the highlighted works. Fig. 3 shows the number of papers published per year from 2012 to 2018.

4 Deep Learning Applications in Medical Imaging

Traditional ML-based approaches have shown encouraging results in various computer vision (CV) tasks, but they still require some human involvement to solve certain tasks. DL-based approaches address this limitation and have outperformed earlier methods in various image analysis tasks in both accuracy and efficiency. These achievements have prompted many researchers to work on DL-based MIA in order to handle complicated tasks more efficiently. In this section, we briefly discuss successful applications of DL to localization, segmentation, detection, and classification tasks. As shown in the example in Fig. 4, a CNN takes an input image of raw pixels and transforms it through convolutional layers, rectified linear unit (ReLU) layers, and pooling layers. The result feeds into a final fully connected layer that produces class scores or probabilities, and the input is assigned to the class with the highest probability.

Fig. 4. Flow diagram of a CNN model for the brain tumor classification task [7].
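The final stage of such a pipeline, in which a fully connected layer turns the extracted features into class probabilities and the most probable class is chosen, can be sketched as follows. The feature vector, weights, bias, and class names are made-up values for illustration; in a trained network the features come from the preceding layers and the weights are learned.

```python
import numpy as np

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def classify(features, weights, bias, class_names):
    """Fully connected layer followed by softmax and an argmax decision."""
    scores = weights @ features + bias
    probs = softmax(scores)
    return class_names[int(np.argmax(probs))], probs

# Hypothetical values, not taken from any cited model.
class_names = ["no tumor", "tumor"]
features = np.array([0.2, 1.5, 0.7])       # stand-in for pooled CNN features
weights = np.array([[0.5, -1.0, 0.3],
                    [-0.4, 1.2, 0.1]])     # one row of weights per class
bias = np.array([0.1, -0.1])

label, probs = classify(features, weights, bias, class_names)
```

With these made-up values the second class receives the higher score, so `label` is "tumor".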

4.1 Organ, Region and Landmark Localization

The localization task places a bounding box around a single object in the image by identifying image features that distinguish one anatomical structure from others. To address this task we framed research question RQ1: Which DL techniques are used for organ, region, and landmark localization? Here we discuss the solutions proposed in the literature; important references are listed in Table 1.

Table 1. Summary of papers related to Organ, Region or Structure Localization.

Vaswani et al. [20] carried out the first GPU-based implementation for fast localization of 3D anatomical structures in medical volumes, using an extended Adaptive Bandwidth Mean-Shift Algorithm for Object Detection (ABMSOD). Applied to CT images of the brain stem, eye, and parotid gland, it achieved more than 90% in 40 runs and at least 50% partial structure identification in 65% of runs. Ciresan [21], the winner of the ICPR 2012 mitosis detection competition, proposed a supervised DNN for mitosis detection in breast histology images; on the publicly available MITOS dataset it achieved an F-score of 0.782. Shin et al. [22] were the first to use DL for organ detection in mixed MRI datasets; they used a probabilistic method for detection, with promising results. Sermanet [23] presented a multi-scale, sliding-window approach that can be implemented efficiently within a ConvNet and won the localization task of the 2013 ILSVRC competition with a 29.9% error rate. Zheng et al. [24] proposed an efficient and robust method for 3D landmark detection in volumetric data, with a mean error of 2.64 mm.
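For readers unfamiliar with the F-score reported for detection tasks such as mitosis detection, the sketch below shows how it is computed from true positives, false positives, and false negatives. The counts are hypothetical and are not the results of any cited study.

```python
def detection_scores(tp, fp, fn):
    """Return precision, recall, and F-score (harmonic mean of the two)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Hypothetical counts: 80 correct detections, 20 false alarms, 20 misses.
precision, recall, f_score = detection_scores(tp=80, fp=20, fn=20)
```

Recall is one minus the false-negative rate, so a detector that misses fewer objects scores higher on recall, while false alarms lower precision.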

Chen et al. [25] proposed a method to recognize the fetal abdominal standard plane (FASP) in ultrasound (US) videos, together with a transfer learning strategy that reduces overfitting. They also presented [26] a knowledge-transferred RNN that detects fetal standard planes in US images by learning spatio-temporal features. Su et al. [27] proposed a sparse reconstruction method using an adaptive dictionary and trivial templates to handle shape variations, inhomogeneous intensity, and cell overlap when detecting lung and brain tumor cells. De Vos [28] went a step further and localized regions of interest (ROI) around anatomical regions (heart, aortic arch, and descending aorta) using a rectangular 3D bounding box. A newer strategy uses pretrained CNN architectures for better localization, compensating for the shortage of training data needed to learn good feature representations [29, 30]. Other authors adapt the network's learning process to predict locations directly; Payer et al. [31] proposed an approach that regresses landmark positions directly with CNNs.
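One common way to turn a network output into a landmark position is to predict a heatmap and take its maximum; the localization error is then the Euclidean distance to the reference position in millimetres. The sketch below assumes this heatmap formulation, a (z, y, x) axis order, and made-up voxel spacing; it is not the exact procedure of any cited work.

```python
import numpy as np

def landmark_from_heatmap(heatmap):
    """Return the voxel index of the strongest heatmap response."""
    index = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return tuple(int(i) for i in index)

def localization_error_mm(pred_voxel, true_voxel, spacing_mm):
    """Euclidean distance between two voxel positions, in millimetres."""
    diff = (np.asarray(pred_voxel) - np.asarray(true_voxel)) * np.asarray(spacing_mm)
    return float(np.linalg.norm(diff))

# Hypothetical network output: a single peak in an 8x8x8 volume.
heatmap = np.zeros((8, 8, 8))
heatmap[2, 5, 4] = 1.0

pred = landmark_from_heatmap(heatmap)
error = localization_error_mm(pred, true_voxel=(2, 3, 4), spacing_mm=(1.0, 0.5, 0.5))
```

Scaling by the voxel spacing matters because medical volumes are often anisotropic, so the same offset in voxels can correspond to different physical distances along different axes.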

Ghesu et al. [32] proposed a sparse adaptive DNN, motivated by marginal space learning, that handles the complexity of the data to detect the aortic valve in 3D transesophageal echocardiograms. An interesting example is the approach of Sirinukunwattana et al. [33], which outperforms classification-based center localization by considering the center locations of nuclei. Liu et al. [34] proposed a novel algorithm for general cell detection that does not require fine-tuning of parameters; it is the first work to introduce a DCNN to provide weights for a graph. Trebeschi et al. [35] developed fully automatic localization and segmentation of locally advanced rectal tumors using a CNN. Humpire-Mamani et al. [36] provided an efficient method for the simultaneous localization of several structures in 3D thorax-abdomen CT scans.

Table 2. Summary of papers related to Segmentation of Organ and Substructure.

4.2 Segmentation of Organs and Other Substructures

Segmentation of medical images permits quantitative analysis of clinical parameters related to volume and shape. It is also often an important first step in the CADe process. The segmentation task is typically defined as identifying the set of voxels that make up either the contour or the interior of the object(s) of interest. It is the most common subject of papers applying DL to MI, some of which are summarized in Table 2, and it has also seen the widest variety of methods, including the development of distinctive CNN-based segmentation architectures and the wider application of RNNs (RQ2).
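The link between a voxel-wise segmentation and a quantitative measurement such as volume can be sketched as follows. The probability map stands in for the output of a segmentation network, and the threshold of 0.5 and the voxel spacing are assumptions chosen for illustration.

```python
import numpy as np

def segment(prob_map, threshold=0.5):
    """Label a voxel as foreground when its probability reaches the threshold."""
    return prob_map >= threshold

def volume_ml(mask, spacing_mm):
    """Volume of the segmented object in millilitres (1 mL = 1000 mm^3)."""
    voxel_mm3 = float(np.prod(spacing_mm))
    return float(mask.sum()) * voxel_mm3 / 1000.0

# Hypothetical per-voxel foreground probabilities for a 10x10x10 volume.
prob_map = np.zeros((10, 10, 10))
prob_map[2:6, 2:6, 2:6] = 0.9  # a 4x4x4 block of confident foreground voxels

mask = segment(prob_map)
volume = volume_ml(mask, spacing_mm=(1.0, 1.0, 2.5))
```

Here 64 foreground voxels of 2.5 mm^3 each give a volume of 0.16 mL; shape descriptors can be derived from the same mask.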

Ciresan et al. [37] presented a sliding-window approach for microscopy imagery that produces a pixel-wise segmentation of membranes. To handle spurious responses in voxel classification, many groups have found that combining fCNNs with graphical models such as Markov random fields is a good solution, for example Song et al. [38]. In subsequent work, Cicek et al. [39] showed that a U-Net given relatively few annotated 2D slices can produce a complete 3D segmentation. RNNs are now also well established for segmentation tasks. Xie et al. [40] used an RNN to segment the perimysium in H&E histopathology images. The network takes into account prior information from the row and column predecessors of the current patch. To integrate multi-directional information, the RNN is applied four times in different directions, and the results are combined and fed to a fully connected layer. Poudel et al. [41] combined a 2D U-Net with a gated recurrent unit to obtain 3D segmentation. Moeskops et al. [42] trained a single fCNN to segment brain MRI, breast MRI, and the coronary arteries in cardiac CT.

4.3 Computer Aided Detection

The primary objective of CADe is to find or localize abnormal or suspicious regions in structural images and thereby alert clinicians (RQ3). CADe aims to improve the detection rate of diseased regions by reducing false negatives, which may arise from a shortage of readers or from fatigue. Although CADe is well established in MI, DL techniques have been improving its performance in diverse clinical applications. In particular, most methods described in the literature use deep convolutional models to fully exploit structural information in 2, 2.5, or 3 dimensions. Table 3 summarizes the proposed solutions for the detection task using DL.

Table 3. Summary of papers related to Object/Lesion Detection.

The first object detection system using CNNs was proposed in 1995 by Lo et al. [43], who used a four-layer CNN to detect nodules in x-ray images. Contextual or 3D information is also incorporated using multi-stream CNNs (for example by Roth et al. [44] and Teramoto et al. [45]). van Grinsven et al. [46] proposed selective data sampling, in which misclassified samples were fed back to the network more often to focus on challenging areas in retinal images. Setio et al. [47] considered three sets of orthogonal views, nine views in total, from a 3D patch and used ensemble methods to integrate information from the different views for the detection of pulmonary nodules.
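A simple form of the multi-view idea is to cut three orthogonal 2D planes through a candidate location in a 3D volume and feed each plane to its own CNN stream. The sketch below extracts only these three planes; it assumes a (z, y, x) axis order and a patch that lies fully inside the volume, and it does not reproduce the nine-view scheme of any cited work.

```python
import numpy as np

def orthogonal_views(volume, center, half_size):
    """Return the axial, coronal, and sagittal patches through `center`."""
    z, y, x = center
    r = half_size
    axial = volume[z, y - r:y + r + 1, x - r:x + r + 1]
    coronal = volume[z - r:z + r + 1, y, x - r:x + r + 1]
    sagittal = volume[z - r:z + r + 1, y - r:y + r + 1, x]
    return axial, coronal, sagittal

# Hypothetical 9x9x9 volume filled with distinct values for easy checking.
volume = np.arange(9 ** 3, dtype=float).reshape(9, 9, 9)
views = orthogonal_views(volume, center=(4, 4, 4), half_size=2)
```

Each view is a 5x5 patch, and all three share the voxel at the candidate location as their central element, which gives a 2D network some 3D context at far lower cost than a full 3D convolution.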

4.4 Computer Aided Diagnosis

When assessing information from an image, CADx offers an additional, independent opinion. The key applications of CADx include distinguishing malignant from benign lesions and recognizing specific diseases from one or multiple images. Most CADx systems have traditionally relied on hand-crafted features designed by domain experts. In recent years, DL methods have been applied successfully to CADx systems. Image classification was one of the first areas in which DL made a major contribution to MIA (RQ4). Table 4 summarizes the proposed solutions for the classification task using DL.

Table 4. Summary of papers related to Image/Exam Classification.

A timeline similar to that of computer vision (CV) is apparent in the types of deep networks commonly used for exam classification (Esteva et al. [48]). The first papers applying these techniques to image classification appeared in 2013 and focused on neuroimaging. Suk et al. [49] and Suk et al. [50] used DBNs and SAEs to classify patients as having Alzheimer's disease based on brain MRI. A clear shift towards CNNs has been observed recently. Two papers used architectures that exploit specific attributes of medical data: Hosseini-Asl et al. [51] used 3D convolutions instead of 2D to classify patients as having Alzheimer's disease; Kawahara et al. [52] applied a CNN-like architecture to a brain connectivity graph derived from MRI diffusion-tensor imaging (DTI). They showed that their network outperformed previous methods in predicting cognitive and motor scores for brain development.

5 Discussion and Future Directions

The health sector differs from other industries. It is a high priority, and people expect the highest level of care and service regardless of cost. The interpretation of medical images by medical professionals is limited by the complexity and subjectivity of the task and by the heavy workload of the experts. DL has recently outperformed other approaches in various CV tasks, so it can provide promising and highly accurate solutions for MIA and can be seen as an important method for future applications.

In MIA, the shortage of data is an important problem with two sides: freely usable data is generally scarce, and high-quality labeled data is scarcer still. Data or class imbalance in the training set is also a significant difficulty. Generative models such as variational autoencoders (VAEs) and GANs may alleviate the data scarcity problem because they can create synthetic medical data. The effect of data imbalance can be reduced by using data augmentation to produce more training images of rare or abnormal cases, although overfitting remains possible. A further point deserves attention: the majority of DL methods are supervised, yet annotation of medical data is not always achievable [53]. To cope with the unavailability of large labeled datasets, supervised DL approaches need to shift towards unsupervised or semi-supervised ones, and the usefulness of such methods in medicine deserves investigation.
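The use of augmentation to offset class imbalance can be sketched as follows: randomly rotated and flipped copies of the rare-class images are added until the classes are the same size. The transformations, function names, and toy data are assumptions made for illustration, and square images are assumed so that rotation preserves the shape.

```python
import numpy as np

def augment(image, rng):
    """Random 90-degree rotation followed by an optional horizontal flip."""
    out = np.rot90(image, int(rng.integers(0, 4)))
    if rng.random() < 0.5:
        out = np.fliplr(out)
    return out

def oversample_minority(images, labels, minority_label, rng):
    """Add augmented minority-class copies until it matches the largest class."""
    images, labels = list(images), list(labels)
    minority = [img for img, lab in zip(images, labels) if lab == minority_label]
    if not minority:
        raise ValueError("no samples with the minority label")
    target = max(labels.count(lab) for lab in set(labels))
    for i in range(target - len(minority)):
        images.append(augment(minority[i % len(minority)], rng))
        labels.append(minority_label)
    return images, labels

# Toy data (assumption): six normal images (label 0), two abnormal (label 1).
rng = np.random.default_rng(0)
images = [np.full((4, 4), float(i)) for i in range(8)]
labels = [0] * 6 + [1] * 2

balanced_images, balanced_labels = oversample_minority(images, labels, 1, rng)
```

Because every added image derives from the same few originals, the balanced set contains no new anatomical variation, which is why the risk of overfitting noted above remains.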

Overall, CNNs that perform localization through 2D image classification are the most common strategy, with good results. Almost all of the discussed methods related to the brain concentrate on brain MRI images [8]. Other brain imaging modalities could also benefit from DL-based analysis. The increasing availability of large-scale gigapixel whole-slide images (WSI) of tissue specimens has made digital pathology and microscopy a very prominent application area for DL techniques. In histopathology image analysis, color normalization is a major research area. One of the major challenges is to balance the number of imaging features in the DL network (typically thousands) with the number of clinical features (typically only a handful) [54].

6 Conclusion

DL applied to MI may prove to be a transformative technology within the next 15 years. Applications of DL in health care will address a wide range of problems, from cancer screening and disease monitoring to personalized treatment suggestions. DL methods are often described as ‘black boxes’, which is a particular concern in medicine: because accountability is essential and can have serious legal consequences, a good prediction system alone is usually not sufficient. In this SLR, we have examined state-of-the-art DL approaches for image localization, segmentation, detection, and classification tasks.

This review began with a discussion of the evolution of MIA and the standards it follows. The research methodology adopted for this SLR was then described in detail. We then presented a brief summary of more than 130 contributions to the field and offered insights and future directions through discussion. As the communities of engineers, computer scientists, statisticians, physicians, and biologists continue to integrate, there is great promise for the development of methods that combine the best elements of modeling and learning approaches to solve new technical challenges. The field is highly promising, and there is ample scope for new researchers.