1 Introduction

Despite advances in oral healthcare, dental caries remain the most widespread of oral diseases, with approximately 36% of the world’s population showing signs of the infection [1]. This has led to many attempts to improve the detection rate of caries in order to prevent more serious oral diseases from developing. Traditionally, dental X-rays have been used by oral healthcare professionals to assess unobservable areas of the tooth and make a diagnosis through observation [2]. Newer advancements in computer vision have led to the development of Computer-aided Diagnosis systems in order to assist in the identification and diagnosis process. Unfortunately these systems have a high false positive rate at identifying caries and as such have not been usable as standalone systems [3].

The goal of this paper is to propose a caries detection model to assist in the treatment of dental caries. The proposed model aims to rectify the shortcomings of existing models and provide more accurate results by implementing a new approach to the diagnostic algorithm.

Several factors have led to the unfavourable identification results with respect to caries diagnosis. Firstly, dental X-rays are noisy and low in contrast due to the low dosage rates in the capture process [4]. These low dosage rates can also affect the visibility of caries due to the X-rays not fully penetrating the teeth. There is no workaround for this as the low dosage rates ensure the health of the patient [5], thus image enhancement techniques must be utilized to assist in computer vision. Secondly, the majority of segmentation research focuses on tooth segmentation for the purposes of human identification. As a result, features required for caries detection are lost in favour of preserving crown shape in order to match teeth. Finally, current caries detection algorithms use a supervised learning model as the basis of their comparative model. Suspected caries regions are compared against a set of classifiers which are obtained from a learning set where the presence of caries is known. If there are similarities between the test image and the classifiers, the algorithm provides a positive caries diagnosis.

There have been varying degrees of research into optimizing each of the specific aspects of radiograph processing. Ahmad et al. [6] tested the effects of four image enhancement techniques, namely adaptive histogram equalization (AHE), contrast limited adaptive histogram equalization (CLAHE), median adaptive histogram equalization (MAHE) and sharp contrast limited adaptive histogram equalization (SCLAHE), in an attempt to determine which provided better results in terms of improving X-ray quality. Further research by Bharathi et al. [7] looked at the effectiveness of median, finite impulse response (FIR) and Gaussian filters in reducing noise levels.

Research with regard to tooth segmentation alternates between the use of integral projection and active contours. Nomir et al. [8] proposed a method adapted from the works of Hu et al. [9]. A mask of the initial image was obtained by performing an iterative and adaptive threshold. Integral projection was performed on this mask based on the assumption that most, if not all, of the non-teeth pixels had been removed. This method was used again by Nomir et al. [10] for human identification. Lin et al. [11] also used an adaptation of the method presented in Ref. [8] for use in human identification. Jain et al. [12] and Frejlichowski and Wanat [13] further developed this method to incorporate a probability model. Segmentation through active contours was used by Zhou et al. [14] as well as Oliveira [15]. Rad et al. [16] compiled an evaluation of these various segmentation methods.

Not much research has been done with respect to caries identification itself. Solanki et al. [17] used an unsupervised learning approach where the shape contour of each tooth was analyzed. Oprea et al. [18] proposed that a binary threshold be applied to a high contrast image. A subsequent rule check was performed to determine if any black pixel groups occurred within the tooth or along its boundary; any such groups were flagged as caries. Oliveira [15] made use of a supervised learning approach and developed a set of classifiers for caries detection. Finally, Zhang et al. [19] used a blob detection method to isolate potential dental caries for 3-D rendering and assessment.

In order to achieve the goal of an improved caries detection model using an unsupervised learning approach, this paper presents a unique diagnostic model. This model comprises both algorithms adapted from previous research and novel algorithms which reduce the inaccuracies inherent in current methods.

2 Segmentation

The segmentation of the X-rays into individual teeth was achieved through a three-stage process consisting of pre-processing and image enhancement, adaptive and iterative thresholding, and separation line selection using a novel algorithm.

2.1 Pre-processing

The diagnostic rate of the algorithm detailed in this paper was improved by optimizing the quality of the images being processed. This optimization was achieved through the implementation of image enhancement techniques which were used to remove noise from the image and improve the overall image quality. To ensure the best outcome of the image enhancements, a preliminary cleanup process was implemented to remove any abnormalities present in the images due to the radiograph process.

Image Enhancement. Following the removal of all non-organic structures barring dental fillings, the image contrast was enhanced in order to provide better definition of the dental structures. To preserve feature detail, noise reduction obtained from blur filters was avoided. Following research conducted by Yoon et al. [20] which suggested that Adaptive Histogram Equalization provided a greater contrast improvement for computer vision, a combination of a median filter followed by histogram equalization was implemented.
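The enhancement chain described above can be illustrated with a minimal sketch. It assumes 8-bit grayscale images stored as nested lists, a 3 x 3 median window and plain global histogram equalization; the window size, the border handling and the function names are illustrative assumptions and are not taken from the implementation used in this paper.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel by the median of its neighbourhood.

    Border pixels use only the neighbours that exist (an assumption).
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(median_low(window))
        out.append(out_row)
    return out


def equalize_histogram(image, levels=256):
    """Global histogram equalization of an integer grayscale image."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of the grey levels.
    cdf, running = [0] * levels, 0
    for v in range(levels):
        running += hist[v]
        cdf[v] = running
    cdf_min = min(c for c in cdf if c > 0)
    denom = len(flat) - cdf_min
    if denom == 0:  # flat image: nothing to stretch
        return [row[:] for row in image]
    return [
        [round((cdf[p] - cdf_min) * (levels - 1) / denom) for p in row]
        for row in image
    ]


def enhance(image):
    """Median filter followed by histogram equalization."""
    return equalize_histogram(median_filter(image))
```

The median filter is applied first so that isolated noise spikes do not distort the histogram that the equalization step stretches.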

2.2 Thresholding

Iterative Thresholding. Due to the similarities in the X-ray images being processed, the model proposed in Ref. [8] was adapted as the basis for the thresholding implementation. A Canny edge detector was used to obtain the general outline of the teeth in each X-ray image. A morphological dilation was then applied to these edges in order to obtain the pixels in the area assumed to be the tooth boundary. Approximately half of the pixels obtained this way corresponded to teeth pixels, whilst the rest belonged to the jaw bone and other background objects. The initial threshold value was calculated from the average pixel value of the assumed teeth pixels and the background pixels, and subsequent thresholds were calculated as follows:

$$\begin{aligned} \mu _D^i&= \frac{\sum _{\left( i,j\right) \in dental}{f\left( i, j\right) }}{\#dental\_pixels}, \end{aligned}$$
(1)
$$\begin{aligned} \mu _B^i&= \frac{\sum _{\left( i,j\right) \in background}{f\left( i, j\right) }}{\#background\_pixels},\end{aligned}$$
(2)
$$\begin{aligned} T_{i+1}&= \frac{\mu _B^i + \mu _D^i}{2}. \end{aligned}$$
(3)

where f(i, j) is the grayscale value of a pixel at point (i, j), \( \mu _D^i \) and \( \mu _B^i \) are the mean grayscale values of the dental and background regions respectively, and \(T_i\) is the threshold value for the whole image calculated from the average values of the background and teeth pixels.

This step was repeated until the iterative threshold value did not change in subsequent re-evaluations or until a hard limit was reached. It was determined in Ref. [8] that convergence occurred within four to ten iterations for their set of images. Following several tests it was determined that convergence occurred within seven to fifteen iterations for the images discussed in this paper, so a maximum iteration limit of fifteen was used.

Once a final value had been determined for the iterative thresholding portion of the greater thresholding method, a mask of the X-ray was produced by isolating all teeth pixels whose grayscale value equaled or exceeded the iterative value.
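The iteration in Eqs. (1) to (3) and the subsequent mask generation can be sketched as follows. This is a minimal illustration rather than the implementation used here: it assumes that pixels at or above the current threshold are treated as dental and the remainder as background, and the convergence tolerance `tol` is a hypothetical parameter standing in for "did not change".

```python
def iterative_threshold(pixels, initial, max_iter=15, tol=0.5):
    """Repeat Eqs. (1)-(3) until the threshold settles or max_iter is hit."""
    t = float(initial)
    for _ in range(max_iter):
        dental = [p for p in pixels if p >= t]
        background = [p for p in pixels if p < t]
        if not dental or not background:
            break  # one class is empty, so its mean is undefined
        mu_d = sum(dental) / len(dental)          # mean of dental pixels
        mu_b = sum(background) / len(background)  # mean of background pixels
        t_next = (mu_d + mu_b) / 2                # updated threshold
        converged = abs(t_next - t) < tol
        t = t_next
        if converged:
            break
    return t


def make_mask(image, threshold):
    """Keep pixels whose value equals or exceeds the threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


# Small worked example with bright "teeth" and dark background.
image = [[10, 12, 200], [11, 210, 205], [9, 13, 198]]
flat = [p for row in image for p in row]
threshold = iterative_threshold(flat, sum(flat) / len(flat))
mask = make_mask(image, threshold)
```

The default limit of fifteen iterations mirrors the hard limit stated above.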

Adaptive Thresholding. In order to separate teeth pixels from miscellaneous background, jaw and gum pixels, an adaptive thresholding method was implemented. The adaptive thresholding method proposed in Ref. [8] determines the threshold value as follows. Following the standard adaptive threshold implementation, each pixel is treated as the centre of a window of size I \(\times \) J pixels and undergoes thresholding if its grayscale value is less than the mean value of all non-zero pixels within the window. The threshold is therefore given by

$$\begin{aligned} T(i, j) = \frac{\sum _{s=-\frac{I}{2}}^{\frac{I}{2}} \sum _{t=-\frac{J}{2}}^{\frac{J}{2}} f(i + s, j + t)}{\#nonzero\_pixels} \end{aligned}$$
(4)
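As a minimal sketch of Eq. (4), the local threshold can be computed as the mean of the non-zero pixels in the window centred on each pixel. The sketch assumes that the sum runs over the whole I x J window, that the window is clipped at the image border and that a pixel failing the test is set to zero; these choices and the function names are illustrative assumptions.

```python
def local_nonzero_mean(image, i, j, win_h=3, win_w=3):
    """Mean of the non-zero pixels in the window centred on (i, j)."""
    rows, cols = len(image), len(image[0])
    total, count = 0, 0
    for s in range(-(win_h // 2), win_h // 2 + 1):
        for t in range(-(win_w // 2), win_w // 2 + 1):
            r, c = i + s, j + t
            if 0 <= r < rows and 0 <= c < cols and image[r][c] != 0:
                total += image[r][c]
                count += 1
    return total / count if count else 0.0


def adaptive_threshold(image, win_h=3, win_w=3):
    """Zero every pixel that is darker than its local non-zero mean."""
    out = []
    for i, row in enumerate(image):
        out_row = []
        for j, p in enumerate(row):
            t = local_nonzero_mean(image, i, j, win_h, win_w)
            out_row.append(p if p >= t else 0)
        out.append(out_row)
    return out
```

Ignoring zero pixels keeps regions already removed by the iterative mask from dragging the local mean down.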

In order to account for the varying exposure rates of the X-Rays being tested, where some teeth would appear darker than background tissue, as well as the presence of darker regions in teeth which contained dental caries, the threshold value unique to each image could not be used. Likewise, a general threshold value for the entire dataset could not be used due to the varying brightness intensities in each X-Ray arising from the presence of dental fillings or caps.

A hybrid approach was therefore implemented to generate a threshold value which correctly removed as many of the non-teeth pixels as possible. A global threshold was determined by applying the adaptive threshold to the mask images of 40 images. The average of these thresholds was used to establish the global threshold value. The same process was applied to each image when it underwent thresholding to obtain its personal threshold. A final threshold value was obtained from the weighted sum of these two values, where the distribution of the pixel intensity affected the weights. The initial weight for each threshold was set to 0.5 on the basis that the global threshold represented the average intensity trend for the dataset and that the personal threshold corrected any slight deviations. If the two values differed by 10% or more, the following rule was applied:

$$ FT(i, j) = {\left\{ \begin{array}{ll} 0.6\, PAT + 0.4\, GAT, &{} \quad \text {if} \ \ \ 1 - \frac{PAT}{GAT} \ge 0.1, \\ 0.4\, PAT + 0.6\, GAT, &{} \quad \text {if} \ \ \ 1 - \frac{GAT}{PAT} \ge 0.1, \\ 0.5\, PAT + 0.5\, GAT, &{} \quad \text {otherwise,} \end{array}\right. } $$

where FT is the final threshold, PAT is the personal adaptive threshold and GAT is the global adaptive threshold.
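The weighting rule can be written as a short function. The sketch below is illustrative only: the function name is hypothetical, and the equal-weight fallback follows from the initial weights of 0.5 described above.

```python
def final_threshold(pat, gat):
    """Combine the personal (pat) and global (gat) adaptive thresholds."""
    if 1 - pat / gat >= 0.1:
        # Personal threshold is at least 10% below the global one.
        return 0.6 * pat + 0.4 * gat
    if 1 - gat / pat >= 0.1:
        # Global threshold is at least 10% below the personal one.
        return 0.4 * pat + 0.6 * gat
    # Values agree to within 10%: keep the initial equal weights.
    return 0.5 * pat + 0.5 * gat
```

For example, a personal threshold of 80 and a global threshold of 100 differ by 20%, so the result is 0.6 x 80 + 0.4 x 100 = 88.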

2.3 Tooth Separation

Tooth separation was handled in two parts. The potential separation lines are initially generated through integral projection, which determines the gap regions between identified teeth. Following this, an evaluation algorithm determines the line of best fit.

Integral Projection. Integral projection is able to analyze pixel intensities across an image and detect regions of darker pixels. As such, it provided the best solution for the detection of gaps between teeth, where the spaces between two adjacent teeth are easily identifiable from the thresholded mask obtained in the previous stage. Areas where clusters of black pixels were present between pairs of adjacent clusters of white pixels were identified as valleys.
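A minimal sketch of the valley search is given below. It assumes a binary mask (1 for teeth, 0 for background), projection along image columns only and a hypothetical tolerance `max_white` on the number of white pixels permitted in a gap column; the method described here may project along other directions as well.

```python
def vertical_projection(mask):
    """Sum the white pixels in every column of a binary mask."""
    return [sum(col) for col in zip(*mask)]


def find_valleys(projection, max_white=0):
    """Return (start, end) column ranges of dark runs flanked by teeth."""
    valleys = []
    start = None
    for x, value in enumerate(projection):
        if value <= max_white:
            if start is None:
                start = x  # a dark run begins here
        else:
            if start is not None and start > 0:
                # Dark run with white columns on both sides.
                valleys.append((start, x - 1))
            start = None
    # A dark run touching the right image edge has no tooth on its
    # right-hand side, so it is not reported as a gap.
    return valleys
```

Dark runs that touch either image edge are skipped because they are not enclosed by a pair of adjacent teeth.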

Line Selection. Separation lines were calculated using the gap clusters as training points for a linear regression model. Two variations of the simple linear regression algorithm were used. The first algorithm was the standard formula defined as follows:

$$\begin{aligned} \hat{\beta }&= \frac{\sum _{i=1}^{n}\left( x_{i} - \bar{x}\right) \left( y_{i} - \bar{y}\right) }{\sum _{i=1}^{n}\left( x_{i} - \bar{x}\right) ^2}, \end{aligned}$$
(5)
$$\begin{aligned}&= \frac{\sum _{i=1}^{n}{x_{i}y_{i}}-\frac{1}{n}\sum _{i=1}^{n}{x_{i}\sum _{j=1}^{n}{y_{j}}}}{\sum _{i=1}^{n}{x_{i}^2}-\frac{1}{n}\left( \sum _{i=1}^{n}{x_{i}}\right) ^2},\end{aligned}$$
(6)
$$\begin{aligned}&= \frac{\bar{xy}-\bar{x}\bar{y}}{\bar{x^2}-\bar{x}^2},\end{aligned}$$
(7)
$$\begin{aligned}&\hat{\alpha } = \bar{y} - \hat{\beta }\bar{x}. \end{aligned}$$
(8)

where n denotes the number of points, \(\hat{\beta }\) denotes the slope of the fitted line and \(\hat{\alpha }\) denotes its y-intercept.
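Eqs. (5) and (8) translate directly into code. The sketch below fits y on x for a list of gap points; the function name is illustrative, and because separation lines between teeth are close to vertical it may be preferable in practice to swap the roles of x and y, which is an assumption not stated in the text.

```python
def fit_line(points):
    """Least-squares fit y = alpha + beta * x for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    beta = sxy / sxx                # slope, Eq. (5)
    alpha = mean_y - beta * mean_x  # intercept, Eq. (8)
    return alpha, beta
```

For the points (0, 1), (1, 3) and (2, 5) the fit returns an intercept of 1 and a slope of 2.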

The second formula was a weighted linear regression model which proved effective in generating a correct separating line in cases where the cluster distribution was skewed in one direction. In cases where the points were equally distributed around the median, the simple linear regression model was used. If the points were unevenly distributed about the median, the value of n in the above equations was taken to be half the total number of points.

In order to determine the best separation lines, the algorithm proposed by Frejlichowski and Wanat [13] was adapted to work on periapical X-Rays. The original algorithm determined separation lines based on the nature of panoramic X-rays, where all teeth are in view. It uses the uniform nature of teeth sizing to determine spacing across the entire row of teeth. Due to the nature of the X-rays being analyzed, where the number and types of teeth present in each X-Ray varied across each image, the adapted algorithm was required in order to achieve correct results. By combining the rotation algorithms used in [8, 12] with an altered probability model derived from [13], the new segmentation algorithm was developed which incorporated both rotational and probabilistic functions.

For segmentation lines where the number of intersection points is equal to, or greater than, a previously determined optimal line, a new set of acceptance criteria were introduced. Based on the probability formula implemented in [12], vertical lines have a higher probability of generating successful segmentation results. The weighting system judged potential line candidates by using slope gradient and intersection point percentage relative to the total separation line. The probability of a line being the best segmentation line was determined by

$$\begin{aligned} P = |nW_T + \frac{W_I}{IP_{deep}} - J | \end{aligned}$$
(9)

where P was the selection score of the line, n was the number of already segmented teeth, \(W_T\) was the assumed width of the previous teeth, \(W_I\) was the width of the image, \(IP_{deep}\) was the number of integral projection points representing gaps between the teeth and J was the projected point of the segmentation line. Although P is referred to here as a probability, it effectively measures how likely a line is to be incorrect, so the algorithm sought to minimize it. Lines which fall on the projected segmentation value have a P value of 0, meaning there is close to zero probability of the line being incorrect. As segmentation lines move away from the projected point, the value of P increases, making them less likely to be selected.
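A worked sketch of Eq. (9) is given below, together with the selection of the candidate with the smallest value of P. The function and parameter names are illustrative assumptions, and all quantities are taken to be in pixels.

```python
def line_score(n_segmented, tooth_width, image_width, n_gap_points, projected_x):
    """Eq. (9): distance between the expected and the projected line position."""
    expected = n_segmented * tooth_width + image_width / n_gap_points
    return abs(expected - projected_x)


def best_line(candidates, n_segmented, tooth_width, image_width, n_gap_points):
    """Pick the candidate projected position with the smallest score."""
    return min(
        candidates,
        key=lambda j: line_score(
            n_segmented, tooth_width, image_width, n_gap_points, j
        ),
    )
```

With two segmented teeth of assumed width 100, an image width of 600 and four gap points, the expected position is 350, so a candidate at 340 scores 10 and is preferred over candidates at 300 or 420.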

To handle separation lines for impacted or very closely spaced adjacent teeth, a second check was used. If more than 60% of the separation line intersected with teeth pixels then the line was discarded and the gap was regarded as a space between molar roots.
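The 60% rule can be sketched as a simple overlap test. The representation of a line as a list of (row, column) pixel coordinates and the function names are assumptions made for illustration.

```python
def tooth_overlap_ratio(mask, line_points):
    """Fraction of the line's pixels that fall on teeth pixels in the mask."""
    hits = sum(1 for r, c in line_points if mask[r][c])
    return hits / len(line_points)


def keep_line(mask, line_points, max_overlap=0.6):
    """Discard a candidate line when more than 60% of it crosses teeth."""
    return tooth_overlap_ratio(mask, line_points) <= max_overlap
```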

3 Caries Detection

Caries detection was handled in two stages. Potential regions of interest were first identified using an edge detector which highlighted all locations where dark spots, and by extension possible caries, were present. Once a region of interest had been defined a novel algorithm was applied to the area in question in order to assess the validity of the caries flag.

3.1 Blob Detection

After testing several methods, a blob detector was implemented for the detection of potential caries in the demarcated search space. Regions of possible decay appeared substantially darker when compared to the surrounding tooth matter, due to the high contrast of the image from the top and bottom hat transformation performed during the boundary detection phase. Blob detection algorithms were able to capitalize on this, owing to their ability to locate local maxima. The blob detection model proposed by Lindeberg [21] was implemented as it was not affected by scaling issues which arose from the varying sizes of the teeth being processed. The model used a Laplacian of Gaussian approach to detect darker regions, which was defined as a convolution kernel of the form

$$\begin{aligned} LoG = \frac{x^2 + y^2 - 2\sigma ^2}{\sigma ^4} e^{-\frac{x^2+y^2}{2\sigma ^2}} \end{aligned}$$
(10)

where \(\sigma \) was the width of the kernel. This was approximated to a 5\(\,\times \,\)5 kernel for the purposes of implementation defined as

$$ LoG = \begin{bmatrix} 0&0&1&0&0 \\ 0&1&2&1&0 \\ 1&2&-16&2&1 \\ 0&1&2&1&0 \\ 0&0&1&0&0 \end{bmatrix} $$

The use of a 4-connected kernel resulted in some loss of definition around the edges of the caries clusters which negatively impacted the diagnostic method, therefore the 8-connected kernel was implemented.
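The effect of the 5 x 5 kernel can be illustrated with a small sketch. It applies the kernel to the interior of a grayscale image stored as nested lists and leaves a two-pixel border at zero; the border handling and function names are assumptions. Because the kernel sums to zero, uniform regions give no response, while a dark spot on a bright background gives a strong positive peak.

```python
LOG_KERNEL = [
    [0, 0, 1, 0, 0],
    [0, 1, 2, 1, 0],
    [1, 2, -16, 2, 1],
    [0, 1, 2, 1, 0],
    [0, 0, 1, 0, 0],
]


def log_response(image, kernel=LOG_KERNEL):
    """Apply the kernel to every interior pixel; border pixels stay at 0."""
    k = len(kernel) // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(k, rows - k):
        for c in range(k, cols - k):
            total = 0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    total += kernel[dr + k][dc + k] * image[r + dr][c + dc]
            out[r][c] = total
    return out


# A single dark pixel on a uniform bright background.
flat = [[100] * 7 for _ in range(7)]
spot = [row[:] for row in flat]
spot[3][3] = 0
response = log_response(spot)
```

In this example the response peaks at the dark pixel, which is the local maximum a blob detector would report.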

3.2 Caries Analysis

Region of Interest Generation. To achieve the goal of diagnosing whether dental caries were present with an unsupervised assessment model, image analysis techniques were implemented in order to assess the regions of interest using standard dentistry techniques. The depth of the search region was already known relative to the tooth, falling between 10% and 15% of the overall width. Teeth were approximated to fall between 7.5 and 9.0 mm in width as defined by Chu [22]. Due to the images being periapical X-rays and not panoramic, the exact tooth being analyzed was unknown as the X-rays were taken of varying locations. In order to approximate the width, the following formula was proposed:

$$\begin{aligned} W = T_{max} - \frac{T_{variance}(P_{max}-P_{calculated})}{P_{variance}} \end{aligned}$$
(11)

where W was the estimated width, T was the width of the tooth and P was the percentage depth of the search space, determined to be 10–15%. \(T_{variance}\) was obtained by calculating the difference of the maximum and minimum tooth width values and was determined to be 1.5. The value for \(P_{variance}\) was calculated to be 5 following the same process. This formula was derived using the probability that teeth which required smaller analysis regions represented the narrower spectrum of teeth, whereas teeth with wider search regions represented the wider spectrum.
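Eq. (11) is a linear interpolation between the minimum and maximum tooth widths. A minimal sketch using the ranges stated above (7.5 to 9.0 mm and 10% to 15%) follows; the function and parameter names are illustrative.

```python
def estimate_tooth_width(p_calculated, t_min=7.5, t_max=9.0, p_min=10.0, p_max=15.0):
    """Eq. (11): map the search-depth percentage onto a tooth width in mm."""
    t_variance = t_max - t_min  # 1.5 mm
    p_variance = p_max - p_min  # 5 percentage points
    return t_max - t_variance * (p_max - p_calculated) / p_variance
```

A search depth of 12.5%, midway through the range, gives an estimated width of 8.25 mm.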

Cluster Analysis. Positive classification of caries from flagged regions of interest required that several acceptance variables were met. A threshold value was obtained by calculating the mean pixel value of the cluster region. A second threshold value was also generated by calculating the mean pixel intensity of the area surrounding the suspected caries region. Due to caries originating in the enamel of the tooth, the search space was constrained to this region. This was done primarily to avoid incorrect caries classifications resulting from the darker dentin region interfering with the assessment algorithm. Calculations were based on enamel thickness varying from 0.87–1.45 mm as defined in Ref. [23]. The search space was obtained by creating an elliptical region centered along the perpendicular of the cluster with width equal to double its height and height defined by:

$$\begin{aligned} H = E_{max} - \frac{E_{variance}(T_{max} - T_{calculated})}{T_{variance}} \end{aligned}$$
(12)

where H was the height of the ellipse, E was the width of the enamel and T was the width of the tooth. \(E_{variance}\) was obtained by calculating the difference of the maximum and minimum enamel width values and was determined to be 0.58.
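The elliptical search space can be sketched as follows. The sketch evaluates Eq. (12) with the enamel range of 0.87 to 1.45 mm and tests whether a point lies inside an ellipse whose width is twice its height. It assumes an axis-aligned ellipse and coordinates already expressed in the same unit as H; the orientation along the cluster perpendicular and the conversion from millimetres to pixels are omitted.

```python
def enamel_height(tooth_width, e_min=0.87, e_max=1.45, t_min=7.5, t_max=9.0):
    """Eq. (12): ellipse height interpolated from the tooth width."""
    e_variance = e_max - e_min  # 0.58 mm
    t_variance = t_max - t_min  # 1.5 mm
    return e_max - e_variance * (t_max - tooth_width) / t_variance


def in_search_ellipse(x, y, cx, cy, height):
    """True when (x, y) lies inside the ellipse centred on (cx, cy)."""
    a = height      # horizontal semi-axis: full width is 2 * height
    b = height / 2  # vertical semi-axis: full height is `height`
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0
```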

By restricting the search space to this elliptical region, the second threshold value was calculated from neighbouring pixels contained within the enamel region. The two threshold values were compared to determine if there was a sizeable difference between the suspected caries cluster and the surrounding pixels. If the cluster was less than 5% darker than the surrounding area the cluster was discarded and no caries were identified. If the cluster region had a mean more than 15% darker than the surrounding area the cluster was identified as a caries region. If the cluster mean was between 5–15% darker, the algorithm proceeded to determine if the cluster represented a darkening of the X-ray itself or a site of early caries development. A Sobel operator with a kernel size of 3 was applied to the elliptical region in order to detect significant gradient changes within the region. By limiting the kernel size to the minimum possible, it was possible to apply the operator to any sized region of interest. In order to deal with the inaccuracies inherent to 3\(\,\times \,\)3 Sobel kernels, the Scharr function [24] was used, defined as two kernels of the form

$$ G_x = \begin{bmatrix} -3&0&+3 \\ -10&0&+10 \\ -3&0&+3 \end{bmatrix} \qquad \qquad G_y = \begin{bmatrix} -3&-10&-3 \\ 0&0&0 \\ +3&+10&+3 \end{bmatrix} $$

These kernels were applied to all pixels within the analysis region, resulting in the transformation of

$$\begin{aligned} G = |G_{x}| + |G_{y}| \end{aligned}$$
(13)

where G was the value of the new pixel. If no edges were detected using this algorithm, it implied that there were no regions of significant pixel intensity change within the search region. As such, the cluster was regarded as a darkening of the X-ray which was initially flagged due to the enhanced contrast brought on by the top and bottom hat transformations. If, however, an edge was detected, this represented a region of pixel intensity change not in line with the surrounding area. As such these areas were denoted as caries.
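The decision rule for a flagged cluster can be summarized in a short sketch. It uses the standard 3 x 3 Scharr kernels and the transformation in Eq. (13). The value `edge_threshold`, above which a gradient counts as an edge, is a hypothetical parameter, since no specific value is stated here, and the function names are illustrative.

```python
SCHARR_X = [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]]
SCHARR_Y = [[-3, -10, -3], [0, 0, 0], [3, 10, 3]]


def scharr_magnitude(region):
    """Eq. (13): |Gx| + |Gy| for every interior pixel of the region."""
    rows, cols = len(region), len(region[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    p = region[r + dr][c + dc]
                    gx += SCHARR_X[dr + 1][dc + 1] * p
                    gy += SCHARR_Y[dr + 1][dc + 1] * p
            out_row.append(abs(gx) + abs(gy))
        out.append(out_row)
    return out


def classify_cluster(cluster_mean, surround_mean, region, edge_threshold):
    """Return True when the cluster is classified as caries."""
    if surround_mean <= 0:
        return False  # no usable reference intensity
    darker = (surround_mean - cluster_mean) / surround_mean
    if darker < 0.05:
        return False  # too similar to the surrounding enamel
    if darker > 0.15:
        return True   # clearly darker than the surrounding enamel
    # Borderline case: look for an edge inside the search region.
    gradients = scharr_magnitude(region)
    return any(g >= edge_threshold for row in gradients for g in row)


# A region containing a vertical intensity step.
step = [[100, 100, 50, 50] for _ in range(4)]
```

A cluster that is 10% darker than its surroundings is therefore accepted only if the region contains a sufficiently strong gradient, as in the step example, and is rejected for a uniform region.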

4 Experimental Results

4.1 Segmentation Results

The success rate of the segmentation method was evaluated based on its ability to correctly separate teeth in the upper and lower jaw regions individually as well as the combined results of both regions. This provided both specific results as to whether the algorithm performed better on a particular jaw region, as well as a holistic view as to how well it performed on average when looking at both jaw regions.

Teeth were considered correctly separated if the separation line did not cause partial separation or division of the teeth. Teeth which were already partial as a result of being at the edge of the X-ray were considered correctly separated if no further partiality was caused. Incorrect segmentations were caused either by extremely poor contrast in the original image, where the enhancement techniques could not establish a distinction between teeth and non-teeth structures, or by impacted teeth.

A comparison of the results on a jaw-specific basis is presented in Table 1. The results obtained by Oliveira and by Nomir and Abdel-Mottaleb are used as a comparison, due to the similarity of the methods used to achieve dental segmentation.

Table 1. Region specific segmentation results comparison

As can be seen, with a combination of the adapted and novel algorithms discussed in this paper, the segmentation results improved over existing methods. Table 2 provides a comparison of the proposed method to other implementations of the segmentation process, as described in Ref. [16].

Table 2. Overall segmentation results comparison

These results indicate that the method proposed in this paper offers a noticeable improvement on existing models. Furthermore, it indicates the diagnostic algorithm received the greatest quantity of correctly segmented images for evaluation.

4.2 Caries Detection Results

A collection of ground truth data was used to evaluate the success rate of the detection method. The data contained markers for the location of identified caries, as well as the locations of false positive regions. The false positive regions were defined as locations along the boundary of each tooth where caries were incorrectly identified. This occurred due to a misinterpretation of the region, either due to the contrast of the X-ray, or because a partial set of caries identifiers was present, which led to the algorithm interpreting the results as a caries location.

Table 3. Caries identification results comparison

To determine whether these rates fall within acceptable limits, a comparison was done against the different diagnostic methods available. These comprised caries detection performed by dentists using the Logicon Caries Detector system, as discussed by Tracy et al. [28], unassisted caries diagnosis by dentists, as discussed by Dykstra [29], and caries detection performed by a supervised learning model, using the method proposed by Oliveira [15]. The results of this comparison can be seen in Table 3.

5 Conclusion

In this paper an unsupervised learning model for caries detection was presented. The proposed model is implemented using a segmentation method to separate the X-rays into individual teeth, a boundary detection method to determine the edges of the teeth for caries analysis and finally a diagnostic algorithm that assesses the boundary using image analysis techniques. Both the proposed segmentation method and caries detection algorithm obtained favourable results when compared to similar models due to the novel approaches described in this paper. As such, the caries detection model outlined in this paper provides a viable alternative to existing models for use in caries detection.