Introduction

Leukemia is the most common childhood malignancy and accounts for 30% of childhood cancer cases yearly. Acute lymphoblastic leukemia (ALL) is the predominant type, contributing approximately 80% of all cases of childhood leukemia [1]. In ALL, there is an abnormal, uncontrolled proliferation of lymphoid precursors, called lymphoblasts, in the bone marrow with arrested maturation [2]. In many cases, these lymphoblasts escape the bone marrow compartment and circulate in the patient’s peripheral blood.

The diagnosis of ALL depends on morphological, immunophenotypic and cytogenetic examination of blasts obtained from the bone marrow. Nevertheless, in clinical practice, children suspected of having leukemia first undergo microscopic examination of their peripheral blood film. This is to ascertain whether suspicious cells, e.g. blasts, are present before subjecting the child to an invasive procedure such as bone marrow aspiration.

Trained pediatric hematologists are often available only at tertiary-referral hospitals. The symptoms of ALL, e.g. fever, poor feeding and joint pain, are common in children and can also be found in benign conditions such as viral infection. Thus, it is not surprising that children with ALL may initially present to primary care general practitioners, who are usually unskilled in the interpretation of peripheral blood films [3–5]. Hence, a computer-based system that can accurately discern lymphoblast cells will be useful for the initial screening of peripheral blood films of children with leukemia symptoms. Subsequently, these children can be referred to the appropriate institutions for further evaluation and management.

This research focuses on the localization and segmentation of lymphoblast cells in microscopic images. The dataset consists of three types of ALL blasts (French-American-British FAB-L1, L2 and L3; previously reviewed and diagnosed by an expert hematologist). The goal is to localize the blast cells and then separate them completely from the image scene, placing each one in a small sub-image that fits a single cell. This is accomplished with a methodology developed using digital image processing techniques and shape features.

Literature review

Several authors have reported techniques for white blood cell (WBC) segmentation, studying both normal leucocytes and lymphoblast cells. However, many of these works have not focused on finding a proper algorithm to localize the WBCs and place each individual WBC in a separate sub-image that contains only one WBC. This localization process facilitates subsequent steps, especially the separation of the nucleus and cytoplasm of the cell. Sadeghian F et al. [6] reported an ALL blast segmentation method based on peripheral blood images. A Canny edge detector was used, followed by a gradient vector flow (GVF) active contour to detect the nucleus, and then Zack thresholding was used to define the cytoplasm component. However, the work was done directly on a sub-image that contains only a single WBC. Liao Q and Deng Y [7] proposed a method to localize WBCs using a simple thresholding approach. However, thresholding is not an effective method for the localization of WBCs: for instance, if two WBCs are in contact or overlap each other at the nucleus or the cytoplasm, thresholding will treat them as one object. Osowski S et al. [8] published a recognition method for myelogenous leukemia blasts obtained directly from the bone marrow aspirate. The watershed transformation was used to segment the images. However, the cells obtained from bone marrow samples are normally “crowded”, and in these cases the watershed transformation produced over-segmentation and was consequently inaccurate. Attempts to individuate the WBC using contrast stretching, edge detection and mathematical morphology to separate the whole cell from other image components have also been made by Piuri et al. and Scotti et al. [9, 10]. As with the earlier methods, neither group was able to separate touching WBCs into individual sub-images, owing to the low color distinction between the boundaries of two close WBCs. An active contour model was proposed by Ongun et al. [11, 12] to resolve this problem. However, the choice of the initial position of the active contour is considered the main drawback of this method.

In an earlier work by the present author [13], a method based on contrast stretching and image arithmetic to localize the WBC nucleus was reported. The method obtained up to 95% accuracy in localizing and segmenting the WBCs. Nevertheless, the method did not localize the whole cell but only the nucleus, for identification of the WBC type. Hence, it is inadequate for identifying a leukemia cell, since the recognition of a lymphoblast requires many features, such as the size of the complete cell and the unique characteristics of the nucleus.

Materials and methods

Image acquisition

The first step for any image understanding system is image acquisition. Acquiring images with high resolution, clarity, accuracy and fair brightness is considered a complicated task because of the complex settings of the video camera and the microscope, especially for non-medical people. For this experiment, all the images were obtained from the University of Malaya Medical Center (UMMC). A total of 180 RGB color images of ALL were made available for this research. Each image contains at least one leukemic cell. Figure 1 shows the equipment used to acquire the images, and Fig. 2 shows a sample of type L1 and L2 ALL images.

Fig. 1 The equipment used for the acquisition stage

Fig. 2 Sample of the acquired blood microscope image

The equipment used for image acquisition is an Olympus UC30 camera mounted on an optical microscope with a magnification of 4x100. The UC30 is a 3.2-megapixel digital color camera with a CCD chip. The live frame rate is 7.0 frames per second at 2080 × 1544, which is the resolution used for the images in this research. The digital images captured by the camera were verified by the hematologist at UMMC and, based on the microscopic details of the features and the contrast of the individual cells, were judged to be acceptable. All the peripheral blood slides were stained with May–Grünwald–Giemsa (MGG) stain. This staining method uses eosin-methylene blue (May-Grünwald) and azure-eosin-methylene blue (Giemsa) and yields dark blue-purple leukocyte nuclei, light cytoplasm, and blue or light-orange cytoplasmic granules.

White blood cell localization

The purpose of the localization process is to produce sub-images that each contain a single lymphoblast on a white background. This eases the later separation of nucleus and cytoplasm, which in turn improves the features and measurements that can be obtained from the two parts. The process is carried out in two stages:

  A) Remove most of the blood components, such as red blood cells (RBC) and platelets, and retain only the lymphoblast cells on a white background

  B) Extract each lymphoblast and put it in a small sub-image to be used for the subsequent steps, such as segmentation

Figure 3 shows the proposed framework of the lymphoblast localization and segmentation scheme.

Fig. 3 The proposed framework of the lymphoblast cell localization and segmentation scheme

Stage A

The original images obtained by the digital camera are all in the RGB color space (Fig. 4(a)). Each image is converted to the HSV (hue, saturation, value) color space (Fig. 4(b)) so that the hue and saturation channels can be used as a mask and a marker, respectively. The saturation band shows the lymphoblast nuclei as the brightest objects in the image (Fig. 4(c)), so this feature can be used as a starting point for extracting the lymphoblast cells. The hue describes the shade of color; this band highlights all the image components (Fig. 4(d)), and thus all the objects are fully present in the binary version of the hue channel image (Fig. 5).
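As a rough illustration of the color-space conversion, the sketch below splits a small RGB image into hue and saturation channels using Python's standard `colorsys` module. The function name `hue_saturation_channels`, the list-of-tuples image layout and the two toy pixel colors are assumptions made for this example; they are not taken from the original implementation.

```python
import colorsys

def hue_saturation_channels(rgb_image):
    """Split an RGB image (rows of (r, g, b) tuples, 0-255) into
    hue and saturation channels, each scaled to [0, 1]."""
    hue, sat = [], []
    for row in rgb_image:
        hsv_row = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
                   for r, g, b in row]
        hue.append([h for h, _, _ in hsv_row])
        sat.append([s for _, s, _ in hsv_row])
    return hue, sat

# Hypothetical 1x2 image: a deep purple "nucleus" pixel and a pale background pixel.
toy = [[(90, 20, 140), (230, 225, 235)]]
hue, sat = hue_saturation_channels(toy)
print(sat)  # the strongly colored pixel has a much higher saturation than the pale one
```

In this toy case the strongly stained pixel has a saturation well above 0.5 and the pale pixel a saturation close to zero, which is the property the saturation band relies on.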

Fig. 4 a Original RGB image, b original HSV image, c saturation band, d hue band

Fig. 5 a Binary version of the saturation band (Fig. 4c), b binary version of the hue band (Fig. 4d)

The two images in Fig. 4(c–d) are then converted to their binary counterparts using a simple global threshold with a value of 0.5, as given in Eq. 1:

$$ image(x,y) = \begin{cases} 1 & \text{if } f(x,y) \geqslant Threshold \\ 0 & \text{otherwise} \end{cases} $$
(1)
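Eq. 1 can be written in a few lines of code. The following is a minimal sketch that assumes the channel is stored as nested lists of values in [0, 1]; the function name `threshold` and the sample values are illustrative only.

```python
def threshold(channel, t=0.5):
    """Global threshold (Eq. 1): 1 where the pixel value is >= t, otherwise 0."""
    return [[1 if value >= t else 0 for value in row] for row in channel]

# Hypothetical saturation values for a 2x3 patch.
saturation = [[0.10, 0.62, 0.90],
              [0.05, 0.50, 0.49]]
binary = threshold(saturation)
print(binary)  # [[0, 1, 1], [0, 1, 0]]
```

Note that a value exactly equal to the threshold maps to 1, matching the "greater than or equal" condition in Eq. 1.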

A threshold of 0.5 was chosen because it does not affect the objects in the hue image, so the full size of each lymphoblast is retained. The histogram of the image in Fig. 4(d) tends to be bimodal: the green box in Fig. 6 marks the range where all the cells reside, and the part of the histogram surrounded by the yellow box represents the background.

Fig. 6 Histogram of the image in Fig. 4(d)

The same threshold value is used for the saturation image, even though its histogram is not bimodal. Because the binary version of the saturation image serves only as a marker, it does not matter if part of a lymphoblast is lost after thresholding; what matters is that the operation keeps at least one pixel from each lymphoblast to be used later as a seed point.

Both binary images must now be prepared for the next step, morphological reconstruction. A morphological opening is applied to the image in Fig. 5(b) using a disk-shaped structuring element (SE, a shape of a given size that is used to probe the image) with a size of 15 pixels. Morphological opening first erodes the image I by the SE and then dilates the result by the same SE. Erosion removes foreground regions that are too small to contain the SE, and dilation expands the objects in the image according to the size and shape of the SE. In our case, Eq. 2 is applied to the image first, and the result is then dilated using Eq. 3 with the same SE. The combination of erosion followed by dilation (morphological opening) is given in Eq. 4.

$$ {\hbox{Erosion}}:\,E = I \ominus SE = \left\{ z \mid (SE)_z \subseteq I \right\} $$
(2)
$$ {\hbox{Dilation}}:\,D = I \oplus SE = \left\{ z \mid (\widehat{SE})_z \cap I \neq \emptyset \right\} $$
(3)
$$ {\hbox{Opening}}:\,O = I \circ SE = \left( I \ominus SE \right) \oplus SE $$
(4)

Where I is the binary image, SE is the structuring element, (SE)_z is the SE translated by z, and \widehat{SE} is the reflection of SE about its origin.
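To make Eqs. 2–4 concrete, the sketch below implements binary erosion, dilation and opening on nested lists. It is an illustration only: the helper names, the representation of the SE as a list of offsets, the radius-1 disk (far smaller than the 15-pixel SE used in the paper) and the toy image are all assumptions, and pixels outside the image are treated as background.

```python
def disk(radius):
    """Offsets (dr, dc) of a disk-shaped structuring element."""
    span = range(-radius, radius + 1)
    return [(dr, dc) for dr in span for dc in span
            if dr * dr + dc * dc <= radius * radius]

def erode(image, se):
    """Eq. 2: keep a pixel only if the SE centred on it lies inside the foreground."""
    rows, cols = len(image), len(image[0])
    def fits(r, c):
        return all(0 <= r + dr < rows and 0 <= c + dc < cols and image[r + dr][c + dc]
                   for dr, dc in se)
    return [[1 if fits(r, c) else 0 for c in range(cols)] for r in range(rows)]

def dilate(image, se):
    """Eq. 3: set a pixel if the reflected SE centred on it hits the foreground."""
    rows, cols = len(image), len(image[0])
    def hits(r, c):
        return any(0 <= r - dr < rows and 0 <= c - dc < cols and image[r - dr][c - dc]
                   for dr, dc in se)
    return [[1 if hits(r, c) else 0 for c in range(cols)] for r in range(rows)]

def opening(image, se):
    """Eq. 4: erosion followed by dilation with the same SE."""
    return dilate(erode(image, se), se)

# Toy image: a 5x5 blob with a one-pixel-wide bridge attached to its right side.
image = [[0] * 10 for _ in range(7)]
for r in range(1, 6):
    for c in range(1, 6):
        image[r][c] = 1
for c in range(6, 9):
    image[3][c] = 1

opened = opening(image, disk(1))
for row in opened:
    print("".join(str(v) for v in row))
```

The opened image keeps the body of the blob while the far end of the thin bridge disappears, which is the behavior the opening is used for here: breaking thin connections between objects.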

The morphological opening smooths the contours of the lymphoblast cells and removes the thin connections that join two touching objects. The image in Fig. 5(a) is eroded using Eq. 2 with the same structuring element as used for the morphological opening. Because the structuring element is smaller than the smallest lymphoblast and larger than the largest RBC, at least one pixel is retained from each lymphoblast to serve as a seed point, and all the RBCs and platelets are removed from the image, as shown in Fig. 7(a–b).

Fig. 7 a Image in Fig. 5a after morphological erosion (marker), b image in Fig. 5b after morphological opening (mask)

After the marker and the mask have been prepared, morphological reconstruction is used to reconstruct the lymphoblast cells from the mask image based on the marker image. The concept behind morphological reconstruction is to repeatedly dilate one image, called the marker, until its contour fits under a second image, the mask. The procedure is iterative: the marker is dilated by the SE, the result is intersected with the mask, and the two steps are repeated until the image no longer changes. The SE used for reconstruction is a 3 × 3 square; it was chosen arbitrarily, since it does not affect the final result because all the pixels that fall outside the contour of the mask are removed by the intersection operation. Figure 8 shows the result of morphological reconstruction from the two images in Fig. 7(a–b).
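The reconstruction step can be sketched as repeated dilation of the marker, each time intersected with the mask, until nothing changes. The code below is a minimal illustration under that standard definition, not the authors' implementation; the function names and the toy marker and mask are hypothetical, and the 3 × 3 square SE implies 8-connectivity.

```python
def dilate3x3(image):
    """Binary dilation with a 3x3 square structuring element."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                for rr in range(max(r - 1, 0), min(r + 2, rows)):
                    for cc in range(max(c - 1, 0), min(c + 2, cols)):
                        out[rr][cc] = 1
    return out

def intersect(a, b):
    """Pixel-wise AND of two binary images."""
    return [[x & y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def reconstruct(marker, mask):
    """Grow the marker inside the mask until the result is stable."""
    current = intersect(marker, mask)
    while True:
        grown = intersect(dilate3x3(current), mask)
        if grown == current:
            return current
        current = grown

# Toy mask with two objects; only the left one contains a seed pixel.
mask = [[1, 1, 0, 0, 1],
        [1, 1, 0, 0, 1],
        [0, 0, 0, 0, 1]]
marker = [[1, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0]]
restored = reconstruct(marker, mask)
print(restored)  # [[1, 1, 0, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 0, 0]]
```

Only the object that contains a seed is restored in full; the unseeded object on the right is discarded, which is how RBCs and platelets without a marker pixel drop out of the result.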

Fig. 8 Result after morphological reconstruction

Once the localization is done, the RGB pixels of the localized objects can be copied onto a blank image of the same size as the original image. Figure 9 shows the resulting image.
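Copying the original pixels back is a simple masking operation. The sketch below assumes the image is a nested list of (r, g, b) tuples and that the blank image is white, as suggested by the white background mentioned earlier; the function name and sample data are illustrative.

```python
WHITE = (255, 255, 255)

def paste_on_blank(rgb_image, binary_mask, background=WHITE):
    """Keep RGB pixels where the mask is 1; fill everything else with the background."""
    return [[pixel if keep else background
             for pixel, keep in zip(image_row, mask_row)]
            for image_row, mask_row in zip(rgb_image, binary_mask)]

# Hypothetical 2x2 image and localization mask.
rgb = [[(90, 20, 140), (200, 120, 130)],
       [(95, 25, 150), (230, 225, 235)]]
mask = [[1, 0],
        [1, 0]]
localized = paste_on_blank(rgb, mask)
print(localized)
```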

Fig. 9 The localized lymphoblast cells with original RGB pixels

Stage B

Every lymphoblast can now be extracted separately from the resulting image in Fig. 8 using shape features such as area, axis length and centroid. Every object in the resulting binary image is labeled with a different number. In Fig. 8, for example, there are 7 objects: all the pixels of the first object are labeled 1, those of the second 2, those of the third 3, and so on. The purpose of this operation is to create a label matrix that distinguishes every object in the image.
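A label matrix of this kind can be built with a flood fill over connected foreground pixels. The following is a minimal sketch that assumes 8-connectivity (the source does not state which connectivity was used); the function name `label_objects` and the toy image are hypothetical.

```python
from collections import deque

def label_objects(binary):
    """Return (label matrix, object count) using 8-connected flood fill."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1                      # start a new object
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny in range(max(y - 1, 0), min(y + 2, rows)):
                        for nx in range(max(x - 1, 0), min(x + 2, cols)):
                            if binary[ny][nx] and not labels[ny][nx]:
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count

binary = [[1, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 0, 1]]
labels, count = label_objects(binary)
print(labels, count)  # [[1, 1, 0, 0], [0, 0, 0, 2], [0, 0, 0, 2]] 2
```

Objects are numbered in raster-scan order of their first pixel, so the numbering depends on object position rather than size.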

Lymphoblast cells can take an irregular or arbitrary shape, whereas a normal WBC is typically circular or oval, sometimes with a protrusion. The first step in stage B is to find the center point of each cell of interest in the binary image. The center point of the blast becomes the center point of the sub-image that will contain a single lymphoblast, and the width of the sub-image is double the axis length (AL) of the lymphoblast.

The extraction of the lymphoblast sub-images is based on the following algorithm:

figure a
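The description of stage B (a sub-image centered on the cell's centroid, with a width of twice the axis length) can be sketched as follows. This is a hedged reading of the text, not the algorithm from the figure: the function names are hypothetical, the axis length is approximated by the larger side of the object's bounding box (an assumption; the source does not define how AL is measured), and the crop is clipped at the image borders.

```python
def centroid_and_axis_length(labels, target):
    """Centroid (row, col) and a simple axis-length estimate for one labelled object."""
    pixels = [(r, c) for r, row in enumerate(labels)
              for c, value in enumerate(row) if value == target]
    rs = [r for r, _ in pixels]
    cs = [c for _, c in pixels]
    centroid = (sum(rs) / len(pixels), sum(cs) / len(pixels))
    # Assumption: AL is taken as the larger extent of the bounding box.
    axis_length = max(max(rs) - min(rs), max(cs) - min(cs)) + 1
    return centroid, axis_length

def extract_sub_image(image, labels, target):
    """Crop a window of width 2 * AL centred on the object's centroid."""
    (cy, cx), al = centroid_and_axis_length(labels, target)
    rows, cols = len(image), len(image[0])
    top, left = max(round(cy) - al, 0), max(round(cx) - al, 0)
    bottom, right = min(round(cy) + al, rows), min(round(cx) + al, cols)
    return [row[left:right] for row in image[top:bottom]]

# Toy 9x9 label matrix with a single 3x3 object labelled 1.
labels = [[0] * 9 for _ in range(9)]
for r in range(3, 6):
    for c in range(3, 6):
        labels[r][c] = 1

sub = extract_sub_image(labels, labels, 1)
print(len(sub), len(sub[0]))  # 6 6
```

Here the label matrix itself is cropped for simplicity; in practice the same window would be applied to the RGB image so that the sub-image carries the original pixels.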

Figure 10 illustrates the sub-image extraction algorithm.

Fig. 10 Illustration of the sub-image extraction algorithm

The red point in the center of the lymphoblast is the center point (X, Y). The yellow line represents the distance from the center of the lymphoblast to the top-left corner of the sub-image; it is derived from the axis length (AL) of the lymphoblast itself, which varies with the size of the cell. The blue line represents the width of the sub-image, and the black lines complete the sub-image rectangle. If another lymphoblast falls partly within the same sub-image, part of it will touch the border of the sub-image. In this case, all the object pixels touching the border can be removed with a clear-border operation, an application of mathematical morphology that removes every connected component of a binary image that touches an image border.
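The clear-border operation can be sketched as a flood fill that starts from every foreground pixel on the image border and erases whatever it reaches. The code below is an illustration under the assumption of 8-connectivity; the function name `clear_border` and the toy sub-image are hypothetical.

```python
def clear_border(binary):
    """Remove every 8-connected component that touches the image border."""
    rows, cols = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    # Seed the fill with all foreground pixels lying on the border.
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if (r in (0, rows - 1) or c in (0, cols - 1)) and out[r][c]]
    while stack:
        r, c = stack.pop()
        if out[r][c]:
            out[r][c] = 0
            for rr in range(max(r - 1, 0), min(r + 2, rows)):
                for cc in range(max(c - 1, 0), min(c + 2, cols)):
                    if out[rr][cc]:
                        stack.append((rr, cc))
    return out

# Toy sub-image: a central cell plus a fragment of a neighbour touching the border.
sub_image = [[0, 0, 0, 0, 1],
             [0, 1, 1, 0, 1],
             [0, 1, 1, 0, 0],
             [0, 0, 0, 0, 0]]
cleaned = clear_border(sub_image)
print(cleaned)
```

The central cell is untouched because it shares no connected pixel with the border, while the neighbouring fragment is removed entirely.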

Result and discussion

This section discusses the performance of the proposed lymphoblast localization and segmentation method. The proposed method is part of a leukemia diagnosis and classification system based on image processing and artificial intelligence. The system involves several stages: image acquisition, image segmentation, feature extraction and selection, and finally classification based on the acquired features. In this study, the focus was on image acquisition and the segmentation process. In the actual experiment, 180 blood images from acute lymphoblastic leukemia cases (FAB-L1, L2, L3 morphological subtypes) were obtained using an Olympus CX31 microscope with an Olympus UC30 camera. Using these peripheral blood images of actual patients with leukemia, we successfully designed a system that can localize and segment lymphoblast cells.

Although a large amount of work has been done in the area of WBC segmentation, many previous researchers did not focus on localizing and extracting the cell of interest from the original image [6, 8–10, 13–15]. In cases where there were many cells of interest in the same image, the focus has always been on segmenting the WBC or blast directly, with the sub-image of the cell of interest extracted manually, as done by Sadeghian et al. [6]. In contrast, our method localizes the blasts automatically rather than starting from a sub-image that already contains only a single WBC.

The current method is an improvement on the method described earlier by Madhloom et al., in which a localization method for WBCs was proposed based on the nucleus, using image arithmetic and automatic thresholding [13]. However, the method in [13] can only localize the nucleus of the WBC accurately and cannot easily find the WBC boundary. Similarly, several other authors have described WBC segmentation techniques, but all of these have shown some shortcomings. Osowski S et al. [8] used watershed segmentation on images of bone marrow aspirate; one of the drawbacks of watershed segmentation is the over-segmentation it may produce, especially if there are many crowded cells in an image. Markiewicz T et al. [15] used the same method as in [8], without any localization of the WBCs. Piuri V and Scotti F [10] localized the WBC based on the nucleus of the cell using a contrast stretching method. However, the authors did not mention the range of intensity stretching that is suitable for highlighting the nucleus of the WBC.

The clinical impact of our work is that it will alert primary care physicians and general practitioners who may see children with leukemia at initial presentation. In Malaysia, a country of 13 states and 27 million people, there are only four tertiary-referral centers for childhood cancer and fewer than 30 trained pediatric hemato-oncologists. Hence, having a tool to facilitate the initial screening of children suspected of having ALL would be beneficial to clinicians and laboratories located outside major hospitals.

The lymphoblast localization and segmentation algorithm is illustrated in Figs. 4–10. The method consists of localizing the lymphoblast cells and extracting each of them into a sub-image. The first stage locates the lymphoblasts and differentiates them from other blood components such as RBCs and platelets, and the second stage extracts each lymphoblast into a separate sub-image. The main idea behind Stage A (the localization stage) is the color feature of the lymphoblast nucleus. Figure 4(c) shows how the brightness of the lymphoblast nucleus dominates the image: in the saturation component of the HSV image, the lymphoblast nucleus shows up as the brightest object. The hue band image in Fig. 4(d) shows almost all the blood components with the same intensity, and this feature of the hue image is helpful for retrieving the complete lymphoblast. The key advantage of this method is the use of morphological reconstruction based on the two images produced from the hue and saturation bands, which are used as mask and marker, respectively. The two intensity images (hue and saturation), thresholded with a suitable value and then morphologically processed, are illustrated in Fig. 7. Because the histogram of the image in Fig. 4(d) is bimodal (Fig. 6), the objects in the image can be separated from the background without losing any part of the original objects, as shown in Fig. 5(a–b). The iterative morphological reconstruction was carried out with an experimentally fitted structuring element, and the result in Fig. 8 shows how the reconstruction restores only the lymphoblasts based on the binary images of the hue and saturation bands.

The proposed method shows good qualitative localization results, allowing the extraction of lymphoblast cells with their nucleus and cytoplasm. To measure the accuracy of the proposed method quantitatively, a global quantitative evaluation is used. The test is done on a subset of the original images; the test set includes images taken from several peripheral blood smears with color inconsistency, so that it reflects variation in staining. For the purpose of evaluation, the lymphoblast cells are localized and segmented manually by a hematologist, and the result is compared with that of the proposed method. The evaluation is based on the volume (pixel count) and proportion of correctly classified pixels of the whole cell obtained by the manual and proposed methods. This produces an error rate that can be used as an enhancement factor in future work. Table 1 shows part of the testing result: the number of pixels classified as cellular by the manual and the proposed methods, together with the error rate and accuracy rate.
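A pixel-count comparison of this kind might be computed as in the sketch below. The exact formula used for Table 1 is not given in the text, so the definitions here are assumptions: the error rate is taken as the relative difference between the manual and proposed cellular pixel counts, and the accuracy as its complement. The function name `evaluate` and the toy masks are likewise hypothetical.

```python
def evaluate(manual_mask, proposed_mask):
    """Compare cellular pixel counts of a manual and an automatic segmentation.

    Assumed definitions (not stated in the source):
    error rate = |manual - proposed| / manual * 100, accuracy = 100 - error rate.
    """
    manual = sum(map(sum, manual_mask))
    proposed = sum(map(sum, proposed_mask))
    error_rate = 100.0 * abs(manual - proposed) / manual
    return {
        "manual_pixels": manual,
        "proposed_pixels": proposed,
        "error_rate": error_rate,
        "accuracy": 100.0 - error_rate,
    }

# Toy masks: the manual outline covers 20 pixels, the automatic result misses one.
manual_mask = [[1] * 5 for _ in range(4)]
proposed_mask = [row[:] for row in manual_mask]
proposed_mask[0][0] = 0

result = evaluate(manual_mask, proposed_mask)
print(result)
```

A count-based measure like this cannot tell whether the two masks cover the same pixels, only whether they cover the same number; an overlap-based measure would be stricter.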

Table 1 Evaluation of the proposed method

The test results showed that the proposed method obtained an accuracy of 90–95% in restoring the lymphoblast pixels from the original image. The difference in the number of pixels considered ‘cellular’ by the proposed method is due to color inconsistency and to blood components that are directly adjacent to the lymphoblast, where the difference in color between the lymphoblast and the background is very small. However, manual segmentation by the hematologist is also subject to error, since it depends entirely on the operator's experience.

Conclusion

The first and most important part of an image understanding system, particularly in medical applications, is the ability of the system to find the region of interest and extract it from the image scene with a minimum rate of error. In the case of automatic blood cell recognition and analysis, the ability to find all the nucleated cells in the image scene and remove all the remaining components is considered the heart of the system. The main issue for such a system is localizing the cells of interest in the image accurately and distinguishing them from the other parts of the image. Until now, most research has relied on manually cropping the sub-image of the cell of interest and then segmenting its internal components.

We have described a method for lymphoblast cell localization and segmentation. This will serve as a first step towards a fully automatic system for leukemia diagnosis and classification using peripheral blood microscope images.

From a technical point of view, this work has some key advantages, such as robustness and accuracy in comparison with manual separation of the lymphoblasts by an expert. The results are very promising and open new possibilities. In addition, the present algorithm can also be used to detect normal WBCs such as lymphocytes and monocytes, so it can be used in differential blood count systems. From an end-user point of view, this work can facilitate laboratory work by reducing time and cost.