1 Introduction

The diabetic foot (DF) is one of the main pathologies developed by individuals with diabetes, characterized by foot lesions, infection, ulceration and/or destruction of deep tissues, and its development may even lead to limb amputation [1].

These lesions may be associated with neuropathy or vascular disease [2], and are directly linked to abnormal temperature changes in the sole of the foot; consequently, constant temperature monitoring and the identification of typical patterns are of high importance. In the feet of healthy individuals, it is possible to find a particular spatial temperature pattern, known as the butterfly pattern; unfortunately, no specific configuration can be detected in the feet of individuals with diabetes mellitus (DM) [3,4,5].

Infrared thermography (IRT) makes it possible to visualize the temperatures of the sole of the foot and to explore the changes that occur [6], and has thus captured the interest of several researchers [3,4,5,6,7,8,9,10].

Since the foot does not present a uniform temperature, it is extremely important to analyze the temperature of the foot considering a regional division in order to obtain more detailed information about the temperature variation [11]. Thus, some authors base the analysis of the variation on the observation of specific points or regions of interest (ROIs). For example, in [12, 13], 33 ROIs (points with the highest probability of ulceration) on both feet were considered for temperature measurement. As one of the main causes of ulceration in DF is low blood supply, other researchers used the concept of angiosome (a tissue region supplied with blood by a single artery) and divided the sole of the foot into four areas [3,4,5, 14, 15].

A temperature index calculated using the average of the temperature of some regions, defined using the clustering concept, is proposed in [16]. Unlike other works in which the regions are defined statically, the division of the foot using a clustering method has the advantage of dynamically creating regions that take into account the temperature distribution of each foot. The present study continues the work of [17], where the results of [16] were improved using different descriptors and the best performance was obtained with the minimum descriptor.

In the present study, the authors developed a methodology that uses thermograms to analyze the diversity of thermal changes that exist in the sole of a foot and classifies it as belonging, or not, to an individual with the possibility of ulcer development. To this end, different binary classifiers, based on the clustering concept, were developed using both quantitative temperature indexes (computed using statistical measures of the temperature) and classification temperature thresholds.

A clustering method allows close temperature values to be grouped, in this case dividing the foot into different areas (clusters) with similar temperature values [18]. Hence, each of the regions of the sole of the foot obtained using this technique presents similar temperature values, which makes it possible to obtain an index that measures thermal variations.

The k-means and the spectral algorithms were used to obtain the clusters and, for both of them, different indexes, based on the mean, minimum, maximum and range values of the temperature, were obtained. In order to improve the evaluation metrics obtained with these descriptors, a hybrid index was proposed. To measure the classifiers' performance, binary experiments were performed using a public data set; the performance metrics Sensitivity, Specificity, Precision, Accuracy, F-measure, and Area Under the Curve were used to assess and validate the proposed classification temperature thresholds.

This paper is organized as follows: in Sect. 2 the proposed method is described. In Sect. 3 the database used is introduced, and the obtained results are presented and analyzed. In Sect. 4 the results are discussed. Finally, in Sect. 5 conclusions and future work are addressed.

2 Methodology

The method that was developed has three processing phases: temperature clustering, index computation, and classification. More specifically: 1) using the temperature values, the sole of the foot is divided into regions (five clusters) using a clustering algorithm; 2) temperature descriptors are computed for each region, specifically the mean, maximum, minimum and range values, as well as an index that measures the variation of the temperature with respect to a reference value; 3) the individual is classified as healthy or as having the possibility of ulcer development by applying a threshold procedure to the index.

2.1 Clustering

Clustering is an important technique that is used in many areas such as statistics, computer science, biology and social sciences [19]. Several clustering algorithms exist, all of which group data samples based on a certain measure of similarity [20, 21]. The data set is grouped so that data points in the same group are similar to each other and different from data points in other groups [18].

Centroid-based algorithms, such as k-means [22], are one of the most popular types of clustering algorithms; graph-based algorithms, such as spectral clustering, are another popular type [23].

2.2 k-Means Clustering

Because of its speed and computational simplicity when applied to a large data set, k-means clustering is widely used. It is an iterative process that splits a data set into k groups (clusters) defined by their centroids, by minimizing the sum, over all clusters, of the distances of each point of the data set to its cluster centroid [22]; the number of clusters is chosen beforehand. The algorithm can be summarized in the following five steps:

  1. Initialize the number of clusters, k, and randomly choose k initial centers (centroids).

  2. Compute the point-to-cluster-centroid distances of all the elements of the data set to each centroid.

  3. Assign each of the elements of the data set to the cluster with the closest centroid.

  4. Compute the average of the elements in each cluster to obtain k new centroid locations.

  5. Repeat steps 2 through 4 until the centroids do not change, which means that the k-means algorithm has converged.
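The five steps above can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the function name `kmeans_1d`, the seed handling and the sample temperatures are assumptions made for illustration, and the data are treated as scalar temperature values.

```python
import random

def kmeans_1d(values, k, max_iter=100, seed=0):
    """Cluster scalar values (e.g. pixel temperatures) into k groups."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data values as initial centroids.
    centroids = rng.sample(sorted(set(values)), k)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Steps 2-3: assign every value to its closest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Step 4: recompute each centroid as the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        # Step 5: stop when the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Made-up temperatures (degrees Celsius) forming two well-separated groups.
temps = [25.0, 25.2, 25.1, 30.0, 30.3, 30.1]
labels, centroids = kmeans_1d(temps, k=2)
```

With well-separated groups such as these, the two centroids settle near the group means; on real thermograms the result depends on the initialization, which is why library implementations usually run several restarts.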

Figure 1 shows an example of clustering a set of points using the k-means algorithm.

Fig. 1. Clustering example using k-means (adapted from [24])

2.3 Spectral Clustering

Spectral clustering is a graph-based algorithm used to obtain k arbitrarily shaped clusters in a data set. It is especially suitable for non-convex data sets because it does not make any assumptions about the global structure of the data [23].

This algorithm is based on spectral graph theory. It involves constructing a similarity graph (an undirected graph that models the local neighborhood relationships between data points), computing its Laplacian matrix (a matrix representation of that graph), and finding k of its eigenvectors in order to split the graph k ways [24]. By default, the algorithm computes the normalized random-walk Laplacian matrix using the method described by Shi-Malik [25]; it also supports the unnormalized Laplacian matrix and the normalized symmetric Laplacian matrix, the latter corresponding to the Ng-Jordan-Weiss method [26].

Fig. 2. Clustering example using Spectral Clustering (adapted from [24]).

This algorithm can be summarized in the following three steps:

  1. Create a similarity graph between all the elements of the data set to cluster.

  2. Compute the first k eigenvectors of its Laplacian matrix to define a feature vector for each object.

  3. Run k-means on these features to separate objects into k classes.
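For reference, the Laplacian matrices used in step 2 have standard closed forms. Assuming a symmetric similarity matrix \(W\) with entries \(w_{ij}\) and a diagonal degree matrix \(D\) with \(d_{ii}=\sum_{j}w_{ij}\) (symbols introduced here for illustration; they do not appear in the original text), the unnormalized, random-walk and symmetric Laplacians are commonly written as

$$L=D-W,\qquad {L}_{rw}=I-{D}^{-1}W,\qquad {L}_{sym}=I-{D}^{-1/2}W{D}^{-1/2}$$

The "first k eigenvectors" are those associated with the k smallest eigenvalues of the chosen Laplacian; row j of the matrix formed by these eigenvectors is the feature vector of data point j.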

Spectral clustering, unlike k-means and other classic algorithms, can group objects belonging to irregularly shaped groups based on connectivity, as shown in Fig. 2 [27].

In this work, the image of the foot was divided into five clusters with both algorithms. After that, some descriptors of the temperature are computed for each cluster, more specifically the mean, maximum, minimum and range.
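As a hedged illustration of this descriptor step, the sketch below computes the four descriptors from a list of temperatures and their cluster labels. The function name `cluster_descriptors`, the data layout and the sample values are assumptions; the sketch also assumes that every cluster contains at least one pixel.

```python
def cluster_descriptors(temps, labels, k=5):
    """Return mean, max, min and range of temperature for each of k clusters."""
    result = []
    for j in range(k):
        # Collect the temperatures assigned to cluster j (assumed non-empty).
        members = [t for t, lab in zip(temps, labels) if lab == j]
        result.append({
            "mean": sum(members) / len(members),
            "max": max(members),
            "min": min(members),
            "range": max(members) - min(members),
        })
    return result

# Made-up example with two clusters instead of five, to keep it short.
temps = [26.0, 27.0, 30.0, 31.0, 32.0]
labels = [0, 0, 1, 1, 1]
desc = cluster_descriptors(temps, labels, k=2)
```

To compare cluster i of one foot with cluster i of the reference group, the clusters need a consistent ordering (for instance by increasing mean temperature); the text does not state which ordering was used, so this remains an assumption.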

2.4 Cluster Temperature Indexes

New indexes, based on the concept of clustering, are proposed to measure the temperature variations in the sole of the foot; they will be called Cluster Temperature Indexes (CTIs) and are computed from the variation of the temperature of each cluster with respect to a reference temperature obtained from the healthy individuals that constitute the control group. Each index is based on a cluster descriptor, specifically the average, maximum, minimum or range value of the temperature.

In each cluster, the reference value of the temperature descriptor (average, maximum, minimum or range) is computed for the control group.

The index is calculated for each individual as the mean of the absolute values of the differences between the temperatures of the individual (IND) and the corresponding reference values obtained from the healthy group (\(\overline{CTR}\)), as in (1), where \({\overline{CTR}}_{i}\) is the reference value of the descriptor for cluster i and \({IND}_{i}\) is the value of the chosen descriptor (average, maximum, minimum or range) for cluster i of the individual under study:

$$CTI= \frac{\sum_{i=1}^{5}\left|{\overline{CTR} }_{i}-{IND}_{i}\right|}{5}$$
(1)

Average CTI (ACTI) will be obtained, from (1), when using the mean descriptor; analogously, Maximum CTI (MCTI), the minimum CTI (mCTI) and the range CTI (RCTI) will be obtained when using in (1) the descriptors maximum, minimum and range, respectively. These indexes are used to classify a thermogram as being from an individual with possibility of ulcer arising or not.
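Equation (1) can be sketched in a few lines. The function name `cti` and the per-cluster temperatures below are made up for illustration and are not values from the dataset.

```python
def cti(reference, individual):
    """Mean absolute difference between per-cluster reference and individual values."""
    if len(reference) != len(individual):
        raise ValueError("both inputs need one value per cluster")
    return sum(abs(r - x) for r, x in zip(reference, individual)) / len(reference)

# Illustrative (made-up) per-cluster mean temperatures in degrees Celsius.
ctr_mean = [24.0, 25.5, 26.5, 27.5, 28.5]   # control-group reference, one per cluster
ind_mean = [26.0, 27.0, 28.5, 29.5, 31.0]   # individual under study
acti = cti(ctr_mean, ind_mean)              # absolute differences 2.0, 1.5, 2.0, 2.0, 2.5
```

Here the five absolute differences sum to 10.0, so the ACTI of this hypothetical individual is 2.0. Passing per-cluster minimum, maximum or range values instead yields mCTI, MCTI or RCTI in the same way.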

2.5 Classification and Performance Evaluation

CTI indexes measure the variations of the temperature values of the sole of the foot between people with DF and healthy ones. Thus, higher CTI values correspond to larger temperature variations and therefore imply a higher possibility of ulceration.

During the third stage of the method, the thresholding of CTI values is used to classify a thermogram as belonging to a healthy person or to a diabetic one with the possibility of foot ulceration. The success of this stage relies on the choice of an appropriate threshold value, which will be called the Cluster Temperature Threshold (CTT). A CTT is calculated for each descriptor: the average descriptor gives the Average CTT (ACTT), the minimum descriptor gives the minimum CTT (mCTT), the maximum descriptor gives the Maximum CTT (MCTT) and the range descriptor gives the Range CTT (RCTT).

With these thresholds and the CTIs, also calculated for each descriptor, a new index, called Hybrid CTI (HCTI), is proposed. This index uses a combination of the indexes mCTI and RCTI, according to (2).

$$HCTI= \frac{mCTI+\left|mCTT-RCTT\right|*RCTI}{2}$$
(2)
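A small sketch of equation (2) followed by the thresholding step is given below. The threshold values 1.84, 0.47 and 1.27 are the k-means thresholds reported in Sect. 3.1; the mCTI and RCTI inputs are made up, and the choice of flagging values at or above the threshold is an assumption, since the text does not say how ties are handled.

```python
def hcti(mcti, rcti, mctt, rctt):
    """Hybrid index of equation (2): mCTI plus RCTI weighted by |mCTT - RCTT|, halved."""
    return (mcti + abs(mctt - rctt) * rcti) / 2

def at_risk(index_value, threshold):
    # Assumption: values at or above the threshold are flagged as at risk.
    return index_value >= threshold

M_CTT, R_CTT, H_CTT = 1.84, 0.47, 1.27   # k-means thresholds reported in Sect. 3.1
value = hcti(mcti=2.0, rcti=0.5, mctt=M_CTT, rctt=R_CTT)   # mCTI and RCTI are made up
flag = at_risk(value, H_CTT)
```

For these inputs the weight |mCTT - RCTT| is 1.37, the hybrid index is approximately 1.34, and the hypothetical thermogram is flagged because this exceeds the hybrid threshold of 1.27.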

Several performance metrics were calculated during the experiments, namely Sensitivity (Se), Specificity (Sp), Precision, Accuracy, F-measure, and Area Under the Curve (AUC). The values of the first five are obtained from expressions (3) to (7), where the acronyms TP, FP, TN and FN represent, respectively, the number of True Positive, False Positive, True Negative and False Negative cases:

$$Sensitivity=\frac{TP}{TP+FN}$$
(3)
$$Specificity=\frac{TN}{FP+TN}$$
(4)
$$Precision=\frac{TP}{TP+FP}$$
(5)
$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}$$
(6)
$${F}_{measure}=\frac{2*TP}{2*TP+FP+FN}$$
(7)
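The five count-based metrics can be computed directly from the confusion-matrix entries, as in the following sketch. Accuracy uses its standard definition, the fraction of correctly classified cases, (TP + TN) divided by the total. The function name and the counts are made up for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision, accuracy and F-measure from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f_measure": 2 * tp / (2 * tp + fp + fn),
    }

# Made-up counts: 100 at-risk thermograms (80 detected), 100 healthy (90 recognized).
m = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

With these counts the sensitivity is 0.8, the specificity 0.9 and the accuracy 0.85.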

As the proposed classifiers are used to detect the risk of ulceration, Sensitivity and Specificity are the most relevant metrics to evaluate their performance [28]. The optimal value chosen for each CTT was obtained using an approach known as the point closest to the (0,1) corner in the Receiver Operating Characteristic (ROC) plane. This approach, called the ER criterion, defines the optimal cut-point as the point of the ROC curve that minimizes the Euclidean distance to the (0,1) point, computed using (8) [29].

$$ER=\sqrt{{(1-Se)}^{2}+ {(1-Sp)}^{2}}$$
(8)
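The closest-to-(0,1) selection can be sketched as a scan over candidate thresholds. The function names, the rule that a score at or above the threshold is predicted positive, and the sample scores are assumptions made for illustration; both classes are assumed to be present.

```python
import math

def er(se, sp):
    """Euclidean distance from the ROC point (1 - sp, se) to the (0, 1) corner."""
    return math.sqrt((1 - se) ** 2 + (1 - sp) ** 2)

def best_threshold(scores, labels):
    """Return (threshold, ER) minimizing ER; labels use 1 = at risk, 0 = healthy."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for thr in sorted(set(scores)):
        # Assumption: a score at or above the candidate threshold is predicted positive.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        dist = er(tp / n_pos, tn / n_neg)
        if best is None or dist < best[1]:
            best = (thr, dist)
    return best

# Made-up CTI values and ground-truth labels.
scores = [0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
thr, dist = best_threshold(scores, labels)
```

For this toy data the selected threshold is 2.5, where the sensitivity is 0.8 and the specificity is 1.0, giving an ER of about 0.2.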

The AUC is used to summarize the entire location of the ROC curve (a curve that shows the relationship between sensitivity and 1 - specificity), as it is an effective, combined measure of these metrics that describes the inherent validity of a diagnostic test rather than depending on a specific operating point [30].
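One way to compute the AUC without drawing the curve is the rank-based (Mann-Whitney) formulation: the AUC equals the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The sketch below uses this formulation; it is not necessarily how the authors computed the AUC, and the data are made up.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (1 = at risk, 0 = healthy)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairs where the positive outranks the negative; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up index values and ground-truth labels.
scores = [0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
area = auc(scores, labels)
```

In this example 19 of the 20 positive-negative pairs are ordered correctly, so the AUC is 0.95.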

3 Results

The public dataset used to evaluate the different classifiers contains 334 individual plantar thermograms (244 from diabetic and 90 from healthy individuals), segmented and vertically adjusted, corresponding to the sole of the left or the right foot of a total of 167 individuals [11]. The computational experiments were carried out using MATLAB.

Now, the results obtained with the two algorithms described in Sect. 2 are presented. Table 1 presents the descriptor values calculated by the two clustering methods. In Subsect. 3.1, the results of the classification with the k-means clustering are described, while in Subsect. 3.2 the results of the classification with the spectral clustering are addressed.

Table 1. Temperature (ºC) descriptors per cluster obtained with the two methods.

3.1 k-Means Clustering

In this subsection, the results obtained with the classifiers using the k-means clustering algorithm are presented. After some preliminary experiments, it was concluded that the division of the foot into five clusters gives the best results: the metric values improve up to five clusters, but with six or more clusters worse metric values are obtained, as only small variations of the temperature between each pair of clusters are observed [16]. The results for healthy people (reference group) and diabetics are presented in Fig. 3 and Table 1, which show the average values of the temperature per cluster for each descriptor, measured in degrees Celsius (ºC), and the average CTI values.

Fig. 3. Mean of the temperature (ºC) per cluster and of CTI, for each descriptor using k-means

Each CTI indicates the differences in a subject's foot temperature pattern relative to the normal temperature pattern (butterfly pattern). Analyzing Fig. 3, as well as Table 1, and comparing the average temperatures obtained using the descriptors for each cluster, it can be observed that the DM group has higher values than the healthy group, except for the range descriptor that presents similar values. Thus, it can be concluded that higher indexes correspond to higher temperature variations, signaling an increase of the probability of ulceration.

The threshold value used to determine whether a thermogram is from an individual with risk of ulceration or from a healthy one affects the performance of the classification. The ER measure was used to obtain the optimal threshold, in order to balance the metrics sensitivity and specificity. The values chosen to classify a thermogram as belonging to a healthy person or to a diabetic one with the possibility of foot ulceration are: 1.9 for the average descriptor, 1.84 for the minimum, 2.22 for the maximum, 0.47 for the range and 1.27 for the hybrid.

Fig. 4. Metrics scores and ER using k-means, for the 5 descriptors

From Fig. 4 and Table 1, it can be seen that similar metric results are obtained with the average and the maximum descriptors, but the latter gives less balanced specificity and sensitivity values. The best results are attained with the minimum descriptor, with an increase of 4% in some of the metrics, such as F-measure or Accuracy, when compared with the worst results; it is also worth noting that the values of the metrics increase when the hybrid descriptor is used.

3.2 Spectral Clustering

In this subsection, the results of the classifiers that use the spectral clustering algorithm are presented. As already mentioned in the previous subsection, these classifiers also use five clusters. Figure 5 illustrates the difference between the control group (reference group) and the DM group; the average values of the temperature per cluster for each descriptor, measured in Celsius degrees, and the CTI average values are presented.

Analyzing Fig. 5 and Table 1, it can be observed that, as previously mentioned, the DM group has higher average temperatures than the control group. Nonetheless, looking in particular at the range (amplitude) descriptor, it can be seen that, for the first cluster, the range is larger in the control group, since this cluster is associated with the toes, which are areas that undergo large temperature variations and can have either very low or very high temperature values. The range is therefore a descriptor that may bring advantages when combined with other descriptors.

Fig. 5. Mean of the temperature (ºC) per cluster and of CTI, for each descriptor using spectral

For the spectral clustering, the metric scores are presented in Fig. 6. The values used as the limit to catalogue the individuals as either healthy or DM with the possibility of ulceration, for each descriptor, are: 2.31 for the average descriptor, 1.96 for the minimum, 2.06 for the maximum, 0.6 for the range and 1.47 for the hybrid.

Fig. 6. Metrics scores and ER using Spectral, for the 5 descriptors

In Fig. 6, it can be observed that the metric results are similar for all the descriptors; however, when comparing the ER values, the minimum is the descriptor that presents the higher values. As with the k-means approach, an increase of the metric values can be noticed for the hybrid descriptor.

4 Discussion

In this work, two different clustering algorithms with five descriptors each, for a total of ten different classifiers, were used to classify foot thermograms. A binary cataloguing technique is proposed that splits the thermograms into two groups according to the CTI of each descriptor mentioned in Sect. 2.4.

The highest metric values were obtained with the minimum descriptor for both clustering algorithms. However, when comparing these two classifiers, the one using k-means presents the best results for metrics such as ER, F-measure, AUC and Accuracy, and also a better balance between sensitivity and specificity, although the differences are small.

Table 2. Values, per cluster, of CTI and of the temperature (ºC) for some foot thermograms of the control and the DM groups in [11].

To better illustrate the comparison between the used clustering algorithms, three thermograms from the dataset (CG005, DM030, DM010 in [11]) are presented in Table 2. The columns of the table indicate: the identification of the individual (Subject), the thermogram (Thermal), the image obtained after applying the k-means clustering (K-means), the HCTI value with k-means (HCTIkm), the image obtained after applying the spectral clustering (Spectral), the HCTI value with spectral (HCTISp) and the measures of the entire foot (General).

Comparing the performance of the best classifier for each clustering algorithm in different cases, it was noticeable that when the clusters have average temperatures similar to the reference ones (obtained using the feet of healthy people), small CTI values are obtained and the temperature distribution of the foot approaches the butterfly pattern, typical of non-diabetic subjects (CG005 in Table 2). It was also observed that when the variations are quite large, with very high temperatures across the sole of the foot, very high CTI values result (for example, DM010 in Table 2). In general, in these two cases, both methods accurately classify this type of foot.

The intermediate thermogram (DM030 in Table 2) represents a case where the temperature variations are small and the thermal pattern differs only slightly from the butterfly pattern, so the CTIs are close to the reference CTI values. In these cases, both classifiers have more difficulty detecting these small variations, which affects their performance; this is more evident when using spectral clustering. To address this difficulty, a hybrid CTI that combines mCTI and RCTI was proposed, which increases the classification performance. Finally, it can be observed that the algorithm that presents the best results is k-means.

5 Conclusions

A method that uses thermograms of the sole of the foot to help measure temperature variations and predict the risk of ulceration is proposed.

The concept of clustering was used to build binary classifiers with different descriptors and two clustering algorithms. For each descriptor, a temperature index and a cataloguing temperature threshold were calculated; additionally, an optimal threshold value was found by balancing two of the metrics used, sensitivity and specificity.

From the presented results and by comparing the algorithms, it can be noticed that both of them achieved a global performance similar to others already published [16, 17]. In general, when the butterfly pattern can be perceived in the distribution of the temperature of the sole of the foot, or when the hot spots cover the entire plantar area, the classification performance of the two algorithms is similar. On the other hand, when the temperature variations are small and the thermal pattern starts to deviate slightly from the butterfly pattern, both classifiers have more difficulty detecting these small variations, which affects their performance (this behavior being more evident with spectral clustering). To address this difficulty, a hybrid CTI that uses two of the presented indexes was proposed, and an increase of the classification performance was obtained.

As the next step, it is intended not only to expand the dataset, to balance the number of thermograms of healthy and DM individuals, but also to implement and test other approaches, as well as to create a GitHub repository where the developed code and the thermogram database will be placed.