1 Introduction

Machine learning has developed rapidly in many fields as an aid to human decision making. It has great potential in health care, including medical imaging, medical analysis and machine learning-assisted medical diagnosis. Machine learning studies in the health sector include the diagnosis of leukemia [1], the classification of malaria [2] and the evaluation of hypertension drug administration [3]. This paper outlines a new research direction: machine learning to help detect follicles in ovarian ultrasound images. The number of follicles detected is useful as one means of diagnosing polycystic ovary syndrome.

Polycystic ovary syndrome (PCOS) affects women of reproductive age and is associated with excessive androgen production. Ovulation is usually irregular and sometimes absent, and these ovulation disorders make it difficult to become pregnant. Women with PCOS have a higher risk of type 2 diabetes mellitus, hypertension, cardiovascular disease, ovarian hyperstimulation syndrome (OHSS) and endometrial cancer [4, 5]. PCOS can be diagnosed using biochemical criteria, such as blood tests that check hormone, glucose and thyroid levels. Another method is to observe the follicles in the ovary by ultrasonography (USG). This method is often used because it is cheap, fast and low risk for patients [6, 7]. On ultrasound, polycystic ovary (PCO) morphology is characterized by 12 or more follicles measuring 2–9 mm and/or an ovarian volume of more than 10 cm³ [8]. Diagnosing PCOS by counting follicles is more reliable than by calculating ovarian volume [6]. Follicles are fluid-filled sacs in which eggs develop; they produce estrogen, which is needed for egg development, so monitoring the follicles in the ovary is very important for women who are planning a pregnancy. The characteristic appearance of follicles on ultrasound [9] is shown in Fig. 1.

Fig. 1. PCOS [9]

PCOS is the most common female endocrine disorder, affecting 10% of women of reproductive age [10, 11]. A diagnosis of PCOS can be a relief, because it explains the state of the ovaries and gives hope for treatment, but the result can also be frightening. An accurate diagnosis is therefore essential so that diagnostic errors are avoided [9, 12].

2 Data Collection

Follicle images are acquired by ultrasound examination according to specific rules. In women with regular menstrual cycles, the examination is carried out in the early follicular phase (days 3–5). For women with irregular menstrual cycles (oligo-/amenorrheic), the examination can be performed on a random day or, more precisely, in the first 3–5 days after bleeding induced by progesterone. The ultrasound settings should be adjusted to achieve the best contrast between the follicular fluid and the ovarian stroma [7, 13]. This research uses ovarian ultrasound images of Permata Hati patients at Sardjito Hospital; the analog images are converted into digital images using a scanner.

3 Method

In general, machine learning for detecting follicles and calculating their number and diameter in ovarian ultrasound images consists of several stages: pre-processing, speckle noise reduction, segmentation, feature extraction, feature selection, follicle counting, follicle diameter calculation and model performance evaluation.

3.1 Preprocessing

Pre-processing is the initial stage of image processing and must be done before the image is processed further; it improves image quality so that better features can be produced in the next step. This research will use histogram equalization and speckle noise reduction as pre-processing methods.

3.2 Segmentation

This research uses an active contour segmentation method. The active contour divides the ovarian ultrasound image into several regions and separates the object area from the background. Active contour segmentation has the advantage of adapting to the object's shape according to its input parameters, so that the contour can expand or shrink to find the objects in the ovarian ultrasound image.

3.3 Features Extraction

Feature extraction creates features and reduces the data from a high-dimensional to a lower-dimensional representation. Reliable feature extraction techniques are key to solving pattern recognition problems. Chain code is a shape feature extraction algorithm whose value does not change under rotation, translation, reflection and scaling. It produces values that encode the direction of the pixels making up the object. This research will use geometric features, such as area, perimeter, major axis, minor axis, eccentricity, extent, circularity and tortuosity, as measurements for recognizing true follicles.
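
As an illustration of a few of the geometric features listed above, the following is a minimal sketch that computes area, extent, axis lengths and eccentricity for one binary region. The function name `region_features` and the moment-based axis estimate (four times the square root of each eigenvalue of the pixel-coordinate covariance) are assumptions made for this example, not the implementation used in this research; perimeter, circularity and tortuosity are omitted.

```python
import math
import numpy as np

def region_features(mask):
    """Approximate geometric features of a single binary region (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    area = int(rows.size)  # number of pixels in the region

    # Extent: region area divided by the area of its bounding box.
    height = int(rows.max() - rows.min() + 1)
    width = int(cols.max() - cols.min() + 1)
    extent = area / (height * width)

    # Second central moments of the pixel coordinates.
    cov = np.cov(np.vstack([rows, cols]), bias=True)
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]

    # Axis lengths of the ellipse with the same second moments (approximation).
    major = 4.0 * math.sqrt(max(float(eigvals[0]), 0.0))
    minor = 4.0 * math.sqrt(max(float(eigvals[1]), 0.0))
    if major > 0:
        eccentricity = math.sqrt(max(0.0, 1.0 - (minor / major) ** 2))
    else:
        eccentricity = 0.0

    return {
        "area": area,
        "extent": extent,
        "major_axis": major,
        "minor_axis": minor,
        "eccentricity": eccentricity,
    }
```

A nearly circular follicle would be expected to give an eccentricity close to zero, while an elongated region gives a value close to one.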

3.4 Feature Selection

Once feature extraction is complete, the next step is to select the important features that can differentiate between very similar objects. Feature extraction may produce too many features, some of which may have nearly identical values. Feature selection is therefore needed to improve accuracy. This research uses the chi-square feature selection technique to select the most relevant features.
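
The chi-square scoring step can be sketched as follows. This is a minimal example under stated assumptions: the features are non-negative with non-zero totals, the class labels are known for each sample, and the names `chi2_scores` and `select_top_k` are hypothetical helpers introduced here. It compares the observed per-class feature totals with the totals expected if a feature were independent of the class.

```python
import numpy as np

def chi2_scores(X, y):
    """Chi-square statistic of each non-negative feature against the class labels."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    # One-hot class membership, shape (n_samples, n_classes).
    Y = (y[:, None] == classes[None, :]).astype(float)

    observed = Y.T @ X                      # per-class feature totals
    feature_total = X.sum(axis=0)           # total of each feature
    class_prob = Y.mean(axis=0)             # class frequencies
    expected = np.outer(class_prob, feature_total)

    # Assumes every feature total is non-zero, so expected > 0.
    return ((observed - expected) ** 2 / expected).sum(axis=0)

def select_top_k(X, y, k):
    """Indices of the k features with the highest chi-square score."""
    scores = chi2_scores(X, y)
    return np.argsort(scores)[::-1][:k]
```

A feature whose values are distributed identically across classes scores zero and is ranked last, which matches the goal of discarding features that cannot separate follicles from non-follicles.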

4 Proposed Method

The computational model for detecting follicles and calculating their number and diameter in ovarian ultrasound images is shown in Fig. 2. It consists of image acquisition, pre-processing, follicle detection, feature extraction, feature selection, follicle counting, follicle diameter calculation and model performance evaluation.

Fig. 2. Proposed method

Pre-processing is carried out first to improve the quality of the image so that better features can be produced in the next stage. The pre-processing methods used here are histogram equalization and speckle noise reduction.

Histogram equalization is used to improve the intensity distribution of the ovarian ultrasound images, in which the follicles appear as irregular circles. Ovarian ultrasonography contains a lot of speckle noise, which is reduced here with adaptive median filtering. This method can filter images corrupted by impulse noise and smooth other noise, so that the output image is much better than the result of standard median filtering.
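
The histogram equalization step can be sketched as follows. This is a minimal example that assumes an 8-bit grayscale image stored as a NumPy `uint8` array; the function name `equalize_histogram` is hypothetical and the code is not the implementation used in this research.

```python
import numpy as np

def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit grayscale image (assumed dtype uint8)."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()                 # cumulative pixel counts per gray level
    cdf_min = cdf[cdf > 0][0]           # count at the darkest level present
    total = image.size
    if total == cdf_min:                # constant image: nothing to stretch
        return image.copy()

    # Map each gray level so that the cumulative distribution becomes
    # approximately linear over the range 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]
```

After equalization the darkest gray level present maps to 0 and the brightest to 255, which stretches the contrast between follicular fluid and stroma.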

The adaptive filter works on a rectangular region \( S_{xy} \). The adaptive median filter changes the size of \( S_{xy} \) during the filtering operation depending on the criteria listed below. The output of the filter is a single value that replaces the current pixel value at \( \left( {x,y} \right) \), the point on which \( S_{xy} \) is centered at the time. The adaptive median filter classifies pixels as noise by comparing each pixel in the image with its neighboring pixels. Both the neighborhood size and the comparison threshold are adjustable. The adaptive median filter works on two levels [14]:

figure a

With

\( Z_{min} \) = Minimum gray level in \( S_{xy} \)

\( Z_{max} \) = Maximum gray level in \( S_{xy} \)

\( Z_{median} \)  = Median gray level in \( S_{xy} \)

\( Z_{xy} \)  = Gray level at coordinates \( \left( {x,y} \right) \)
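
The two levels can be sketched in code using the quantities defined above. The sketch follows the standard textbook formulation of the adaptive median filter, which may differ in detail from the listing in [14]; the maximum window size `s_max`, the edge padding and the choice to output the median when the window can no longer grow are assumptions made for this example.

```python
import numpy as np

def adaptive_median_filter(image, s_max=7):
    """Adaptive median filter; s_max is the assumed maximum (odd) window size."""
    img = np.asarray(image)
    pad = s_max // 2
    padded = np.pad(img, pad, mode="edge")   # replicate border pixels
    out = img.copy()
    rows, cols = img.shape

    for y in range(rows):
        for x in range(cols):
            z_xy = padded[y + pad, x + pad]
            size = 3
            while True:
                r = size // 2
                window = padded[y + pad - r:y + pad + r + 1,
                                x + pad - r:x + pad + r + 1]
                z_min, z_max = window.min(), window.max()
                z_med = np.median(window)

                # Level A: is the median itself free of impulse noise?
                if z_min < z_med < z_max:
                    # Level B: keep the pixel unless it is an impulse.
                    out[y, x] = z_xy if z_min < z_xy < z_max else z_med
                    break

                # Otherwise grow the window S_xy and repeat level A.
                size += 2
                if size > s_max:
                    out[y, x] = z_med   # window limit reached (assumed choice)
                    break
    return out
```

Because level B leaves a pixel unchanged whenever it lies strictly between the local minimum and maximum, detail is preserved in regions that are not corrupted by impulses.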

5 Proposed Method: Segmentation of Follicles

This research uses an active contour segmentation method to divide the ovarian ultrasound image into several regions and to separate the object area from the background. The segmentation steps using active contour are shown in Fig. 3.

Fig. 3. Active contour

This method has the advantage of adapting to the object's shape according to its input parameters, so that the contour can expand or shrink to find the objects in the ovarian ultrasound image. The active contour evolves to form a contour curve along the edge of the object in the image. The snake is a parametric active contour model, in which the contour is represented parametrically by \( z\left( s \right) = \left( {a\left( s \right),b\left( s \right)} \right) \).

The active contour model originates from the snake model proposed by Kass et al. [15]. The snake minimizes the energy functional in Eq. (1), which consists of two parts called the internal and external energy.

$$ \begin{aligned} E_{snake}^{*} & = \int_{0}^{1} {E_{snake} \left( {z\left( s \right)} \right)ds} \\ & = \int_{0}^{1} {E_{internal} \left( {z\left( s \right)} \right) + E_{imageforce} \left( {z\left( s \right)} \right) + E_{constraint} \left( {z\left( s \right)} \right)ds} \\ \end{aligned} $$
(1)

where \( E_{internal} \) represents the internal energy of the spline due to bending, \( E_{imageforce} \) gives rise to the image forces, and \( E_{constraint} \) gives rise to the external constraint forces [15].

Segmentation with the Chan-Vese model usually depends on the placement of the initial contour. The model is formulated as the minimization of the energy functional in Eq. (2) [16].

$$ \begin{aligned} F\left( {c_{1} ,c_{2} ,C} \right) & =\upmu\,Length\left( C \right) + v.Area\left( {inside\left( C \right)} \right) \\ & + \lambda_{1} \int_{inside\left( C \right)} {\left| {u_{0} \left( {x,y} \right) - c_{1} } \right|^{2} dxdy} \\ & + \lambda_{2} \int_{outside\left( C \right)} {\left| {u_{0} \left( {x,y} \right) - c_{2} } \right|^{2} dxdy} \\ \end{aligned} $$
(2)

Here µ is the parameter that controls the evolution of the curve, \( C \) is any variable curve, \( v \) is the parameter used to increase the speed, and the constants \( c_{1} \) and \( c_{2} \), which depend on \( C \), are the averages of \( u_{0} \) inside and outside \( C \), respectively. \( \lambda_{1} \) and \( \lambda_{2} \) are parameters that weight the intensity-fitting terms inside and outside \( C \). The active contour model with \( v = 0 \) and \( \lambda_{1} = \lambda_{2} = \lambda \) is a particular case of the minimal partition problem, in which the best approximation \( u \) of \( u_{0} \) is sought as a function taking only two values [16]:

$$ u = \left\{ {\begin{array}{*{20}l} {average\,\left( {u_{0} } \right)\,inside\,C} \hfill \\ {average\,\left( {u_{0} } \right)\,outside\,C} \hfill \\ \end{array} } \right. $$
(3)
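
The two-valued approximation in Eq. (3) can be illustrated with a deliberately simplified sketch. It drops the length and area terms of Eq. (2) (µ = 0, v = 0) and sets \( \lambda_{1} = \lambda_{2} \), so it only alternates between updating the region averages \( c_{1} \), \( c_{2} \) and reassigning pixels to the closer average; the full Chan-Vese model retains the length term and evolves a level set function. The names `two_phase_fit` and `fitting_energy` are hypothetical.

```python
import numpy as np

def two_phase_fit(u0, init_mask, n_iter=50):
    """Alternate between region averages and pixel reassignment (mu = 0, v = 0)."""
    u0 = np.asarray(u0, dtype=float)
    inside = np.asarray(init_mask, dtype=bool).copy()
    c1 = c2 = 0.0
    for _ in range(n_iter):
        if inside.all() or not inside.any():
            break                               # degenerate partition
        c1 = u0[inside].mean()                  # average of u0 inside C
        c2 = u0[~inside].mean()                 # average of u0 outside C
        new_inside = (u0 - c1) ** 2 < (u0 - c2) ** 2
        if np.array_equal(new_inside, inside):
            break                               # partition is stable
        inside = new_inside
    return inside, c1, c2

def fitting_energy(u0, inside, c1, c2, lam1=1.0, lam2=1.0):
    """The two fitting terms of Eq. (2) for a given partition."""
    u0 = np.asarray(u0, dtype=float)
    inner = ((u0[inside] - c1) ** 2).sum()
    outer = ((u0[~inside] - c2) ** 2).sum()
    return lam1 * inner + lam2 * outer
```

The approximation \( u \) of Eq. (3) is then `np.where(inside, c1, c2)`. Without the length term the result is sensitive to speckle, which is one reason the pre-processing stage precedes segmentation.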

To compute the associated Euler-Lagrange equation for the unknown function \( \phi \), slightly regularized versions of the Heaviside function \( H \) and the Dirac delta δ are considered, denoted here by \( H_{\varepsilon } \) and \( \delta_{\varepsilon } \). \( F_{\varepsilon } \) denotes the associated regularized functional [16].

$$ \begin{aligned} F_{\varepsilon } \left( {c_{1} ,c_{2} ,\phi } \right) & =\upmu\int_{\varOmega } {\delta_{\varepsilon } \left( {\phi \left( {x,y} \right)} \right)\left| {\nabla \phi \left( {x,y} \right)} \right|dx\,dy} \\ & + v\int_{\varOmega } {H_{\varepsilon } \left( {\phi \left( {x,y} \right)} \right)dx\,dy} \\ & + \lambda_{1} \int_{\varOmega } {\left| {u_{0} \left( {x,y} \right) - c_{1} } \right|^{2} H_{\varepsilon } \left( {\phi \left( {x,y} \right)} \right)} dx\,dy \\ & + \lambda_{2} \int_{\varOmega } {\left| {u_{0} \left( {x,y} \right) - c_{2} } \right|^{2} \left( {1 - H_{\varepsilon } \left( {\phi \left( {x,y} \right)} \right)} \right)} dx\,dy \\ \end{aligned} $$
(4)

The performance of the segmentation method will be evaluated using the Probabilistic Rand Index (PRI) and the Global Consistency Error (GCE). A good segmentation yields a high PRI and a low GCE.

The Probabilistic Rand Index (PRI) is a Rand index, that is, a similarity function that converts the problem of comparing two partitions with possibly differing numbers of classes into a problem of computing pairwise label relationships.

PRI computes the fraction of pixel pairs whose labels are consistent between the computed and the actual segmentation, and is thus a measure of similarity between two groupings of the data. PRI takes a value between zero and one: zero if the two segmentations have no similarity and one if they are identical. PRI is defined in Eq. (5) [17].

$$ PR\left( {S_{test} ,\left\{ {S_{GT} } \right\}} \right) = \frac{1}{{\left( {\begin{array}{*{20}c} n \\ 2 \\ \end{array} } \right)}}\sum\limits_{{\begin{array}{*{20}c} {i,j} \\ {i < j} \\ \end{array} }} {\left[ {c_{i\,j} p_{i\,j} + \left( {1 - c_{i\,j} } \right)\left( {1 - p_{i\,j} } \right)} \right]} $$
(5)

where \( \left\{ {S_{GT} } \right\} \) is the set of manually segmented ground truth images \( \left\{ {S_{1} ,S_{2} \ldots S_{GT} } \right\} \) corresponding to the ovary image segmented by the algorithm. The set of all perceptually correct segmentations defines the probabilities \( p_{ij} \), and \( c_{ij} \) denotes the event that a pair of pixels \( i \) and \( j \) has the same label in the test image \( S_{test} \) [17].

$$ c_{ij} = I\left( {L_{i}^{{S_{test} }} = L_{j}^{{S_{test} }} } \right) $$
(6)
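
Eqs. (5) and (6) can be evaluated directly on small label images, as in the sketch below. It assumes that \( p_{ij} \) is estimated as the fraction of ground truth segmentations in which pixels \( i \) and \( j \) share a label, and it builds full \( n \times n \) pair matrices, so it is only practical for small images. The function names are hypothetical.

```python
import numpy as np

def same_label_matrix(labels):
    """Boolean matrix whose (i, j) entry is True when pixels i and j share a label."""
    flat = np.asarray(labels).ravel()
    return flat[:, None] == flat[None, :]

def probabilistic_rand_index(s_test, ground_truths):
    """PRI of a test segmentation against a list of ground truth segmentations."""
    c = same_label_matrix(s_test).astype(float)               # c_ij, Eq. (6)
    # p_ij: assumed to be the mean same-label indicator over the ground truths.
    p = np.mean([same_label_matrix(g) for g in ground_truths], axis=0)

    n = c.shape[0]
    i, j = np.triu_indices(n, k=1)                             # all pairs i < j
    agreement = c[i, j] * p[i, j] + (1.0 - c[i, j]) * (1.0 - p[i, j])
    return agreement.sum() / (n * (n - 1) / 2)                 # Eq. (5)
```

The score depends only on whether pairs of pixels share a label, so it is unaffected by how the regions happen to be numbered in either segmentation.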

The Global Consistency Error (GCE) measures the extent to which one segmentation can be viewed as a refinement of the other. If one segment is a proper subset of the other, then the pixel lies in an area of refinement and the error must be zero. If there is no subset relationship, then the two regions overlap inconsistently. Let \( S \) and \( S^{{\prime }} \) be two segmentations of an image \( X = \left\{ {x_{1} ,x_{2} \ldots x_{N} } \right\} \) consisting of \( N \) pixels. The Local Consistency Error (LCE) allows refinement to occur in either direction at different locations in the segmentation, whereas GCE, defined in Eq. (7) in terms of the local refinement error (LRE), requires all local refinements to be in the same direction [17].

$$ {\text{GCE}}\left( {{\text{S}},{\text{S}}^{{\prime }} } \right) = \frac{1}{\text{N}}{ \hbox{min} }\left\{ {\sum\nolimits_{\text{i}} {{\text{LRE}}\left( {{\text{S}},{\text{S}}^{{\prime }} ,{\text{x}}_{\text{i}} } \right),\sum\nolimits_{\text{i}} {{\text{LRE}}\left( {{\text{S}}^{{\prime }} ,{\text{S}},{\text{x}}_{\text{i}} } \right)} } } \right\} $$
(7)
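
Eq. (7) can be computed as in the sketch below. The text does not spell out the local refinement error, so the sketch assumes the usual definition \( {\text{LRE}}\left( {S,S^{{\prime }} ,x_{i} } \right) = \left| {R\left( {S,x_{i} } \right)\backslash R\left( {S^{{\prime }} ,x_{i} } \right)} \right|/\left| {R\left( {S,x_{i} } \right)} \right| \), where \( R\left( {S,x_{i} } \right) \) is the region of \( S \) that contains pixel \( x_{i} \). The function names are hypothetical.

```python
import numpy as np

def _lre_sum(s_a, s_b):
    """Sum over all pixels of LRE(s_a, s_b, x_i), under the assumed LRE definition."""
    a = np.asarray(s_a).ravel()
    b = np.asarray(s_b).ravel()
    total = 0.0
    for label_a in np.unique(a):
        in_a = a == label_a
        size_a = int(in_a.sum())
        for label_b in np.unique(b[in_a]):
            overlap = int(np.logical_and(in_a, b == label_b).sum())
            # Every pixel in this overlap has LRE = (size_a - overlap) / size_a.
            total += overlap * (size_a - overlap) / size_a
    return total

def global_consistency_error(s1, s2):
    """GCE of Eq. (7): the smaller of the two one-directional error sums, over N."""
    n = np.asarray(s1).size
    return min(_lre_sum(s1, s2), _lre_sum(s2, s1)) / n
```

When one segmentation simply splits the regions of the other, one of the two sums is zero, so the GCE is zero even though the segmentations differ.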

6 Experiment Results

The proposed segmentation method will be tested on 100 ovarian ultrasound images from Sardjito Hospital, Yogyakarta, Indonesia. The research is currently at the stage of implementing the active contour segmentation method. Fig. 4(a) shows an original ultrasound image of the ovary, and Fig. 4(b) shows the result of adaptive median filtering.

Fig. 4. Original ultrasound image of the ovary and result image of the proposed method. (a) Original image, (b) Adaptive median filtering image

The images may have poor contrast and cannot be used directly. It is therefore necessary to remove the speckle noise present in the image, and adaptive median filters are commonly used for this purpose. An additional benefit of the adaptive median filter is that it seeks to preserve detail while smoothing speckle noise; here the adaptive algorithm performed quite well. Fig. 5(a) shows an original ultrasound image of the ovary, and Fig. 5(b) and (c) show the results of the proposed method.

Fig. 5. Original ultrasound image of the ovary and result image of the proposed method. (a) Original image, (b) Histogram equalized image, (c) Adaptive median filtering image, (d) Manual segmentation of follicles by medical expert.

7 Conclusion

The previous stage detects the presence of follicles in ovarian ultrasound images. The next step is to count the follicles using connected component labeling. This process is expected to label and color each follicle in an ovarian ultrasound image, making it easier to count the follicles in the image. The diameter of a follicle, which is approximately circular, is calculated from its area. The area is the number of pixels that make up a follicle (region); the diameter in pixels is then converted into millimeters.
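
The counting and diameter steps described above can be sketched as follows. The sketch assumes a binary mask in which follicle pixels are true, uses 4-connectivity for labeling, and takes the diameter of the circle with the same area, \( d = 2\sqrt {A/\pi } \). The pixel spacing `mm_per_pixel` depends on the ultrasound calibration and is a hypothetical parameter, as are the function names.

```python
import math
from collections import deque

import numpy as np

def label_components(mask):
    """Label 4-connected regions of a binary mask; returns (labels, count)."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    rows, cols = mask.shape
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0, c0] or labels[r0, c0]:
                continue
            count += 1                          # new follicle candidate found
            labels[r0, c0] = count
            queue = deque([(r0, c0)])
            while queue:                        # breadth-first flood fill
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and mask[rr, cc] and not labels[rr, cc]):
                        labels[rr, cc] = count
                        queue.append((rr, cc))
    return labels, count

def follicle_diameters_mm(labels, count, mm_per_pixel):
    """Equivalent-circle diameter of each labeled region, in millimeters."""
    diameters = []
    for k in range(1, count + 1):
        area = int((labels == k).sum())         # area in pixels
        diameters.append(2.0 * math.sqrt(area / math.pi) * mm_per_pixel)
    return diameters
```

The returned count is the number of follicles, and the diameters can be compared with the 2–9 mm range used in the PCO criterion.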