1 Introduction

Studies conducted in the USA have reported that skin cancer is the most common type of cancer [1, 2]. Although skin cancer is extremely common, early diagnosis gives the disease high survival rates. For example, the 5-year survival rates in the UK and the USA are 86% and 90%, respectively [3]. Among skin cancer types, melanoma stands out for its high mortality rate [2]. Therefore, the correct diagnosis of melanoma lesions is essential. High diagnostic accuracy depends primarily on correct segmentation of the lesion images and accurate classification. Segmentation is the first stage in the analysis of skin lesions [3]. A large number of segmentation methods have been described in the literature, including histogram-based methods, color clustering methods, statistical or image-space methods, and adaptive or iterative thresholding methods [3, 4]. These methods can be divided into two main groups, supervised and unsupervised [5, 6]. One of the most popular unsupervised methods is the active contour (AC) method. AC models can exhibit different characteristics in practice, depending on the purpose of use and the mathematical methods employed. Sirakov et al. developed a new AC model using a partial differential approach and named it the shrinking active contour exact solution (S-ACES) model [7, 8]. They then used this method for the segmentation of skin lesions [9, 10]. In the present study, the S-ACES model was used for segmentation.

The purpose of segmenting images is to calculate the features to be used for the identification of the lesions. In studies on skin lesions, many descriptive features have been calculated after the segmentation process. In the method proposed by Menzies S. et al., 11 features were grouped as negative and positive. The negative attributes were the symmetry of the color patterns in the lesion and the presence of a single color. If at least one of them was present, the lesion was not melanoma. The positive attributes were as follows: blue-white envelope, multiple brown dots, pseudopods, radiant scattering, wound-like color loss, peripheral black spheres, multiple (5–6) colors, multiple blue-gray spots, and expanding network topology. There was a high risk of melanoma if at least one of the positive attributes was present [11]. Bakheet S. used the histogram of oriented gradients (HOG) to capture the texture characteristics of the image [12]. In that study, filters and other image enhancement techniques were used extensively. However, although the author emphasized that the proposed feature was insensitive to lighting problems, the brightness was still adjusted to eliminate lighting problems in the images. In addition, the author stated that the feature was affected by the rotation of the image.

Alquran H. et al. calculated the total dermoscopy score (TDS) value based on the asymmetry, edge irregularity, color, and diameter components. In addition, they calculated the properties of contrast, skewness, kurtosis, energy, mean, standard deviation, circulation, energy, correlation, and homogeneity [13]. Then, they used principal component analysis (PCA) to select five of these for use in the classification. The selected parameters were TDS, mean, standard deviation, energy, and contrast. When these attributes were examined, it was observed that most of them were related to color characteristics. Mete M. et al. [9, 10] used the number of colors (NC), abrupt color transitions, asymmetry of lesions, and color regions in accordance with the ABCD rule. They used advanced SVM methods for classification. Nachbar F. et al. designed a method called the ABCD rule [14]. In this method, they used descriptors such as asymmetry of the lesion region, NC, abrupt color transitions, and the presence of different structural elements. Dalila F. et al. used a total of 112 descriptors in three different groups and then selected 12 of them with the “Relief” attribute selection algorithm [15]. Ten of the selected features were related to color. Faziloglu Y. et al. evaluated the color properties of the lesion, developed the Percentage Melanoma Color (PMC) identifiers, and showed that the PMC feature was powerful in distinguishing melanoma samples [16]. When the attributes used in these studies are examined, it is understood that color information is the most important indicator of the lesion structure. The main factor determining the color content of skin lesions is melanin [17, 18]. The colors considered when detecting melanomas (black, light brown, dark brown, blue-gray) appear depending on the location of melanin in the skin tissue [17, 18]. In addition, the source of colors in the pink-red range is red blood cells [17]. Abbadi N.K.E. et al. used the ABCD rule and calculated the asymmetry, edge irregularity, number of colors, and diameter attributes accordingly. Unlike other studies, they performed color counting in RGB space [19].

As can be seen from the studies mentioned above, analysis of the color of the skin lesion is essential for the diagnosis of melanoma. In this respect, the NC detected in the lesion has been the strongest and most common indicator of melanoma [9, 17, 20]. However, to accept the presence of any color in the lesion and to calculate the total NC, users need to determine a color counting threshold (CCT). A color is considered present when its amount in the lesion exceeds the CCT; otherwise, it is considered absent. For this purpose, either a certain ratio of the lesion area or a fixed number of pixels has been used in the literature [9, 16, 19]. In the techniques using a fixed number of pixels [9], the size and resolution of the image pose a serious problem, and in some cases a different CCT value must be selected for each sample. In the techniques based on the lesion area [16, 19], it is important to choose the correct color ratio as the CCT. Choosing the CCT incorrectly causes the NC to be calculated incorrectly and the misdiagnosis rate to increase. Although studies have found that the NC in melanoma lesions is particularly high [9, 17, 20], a wrong choice of CCT can lead to the opposite result. In such cases, more powerful and complex classification methods become necessary to maintain high performance and avoid incorrect classification. Although it is of major importance, no specific method for determining the CCT has been proposed in the literature.

The aim of the present study is to define a new descriptor based on the NC for the diagnosis of melanoma. To achieve this goal, the effect of the proportional CCT on the selectivity of the NC parameter was examined and the CCTs with the highest discrimination were determined by a new method. Thereafter, a novel number of colors difference (NCD) attribute was defined based on the NC. Finally, univariate classification tests were carried out with this attribute and with other NC attributes calculated using the CCT values found in the literature. The results indicate that, in terms of f-measure, the classification success with this new feature is at least 20.2% and at most 52.7% higher than the literature, and in terms of true positives it is at least 43.8% and at most 84.5% higher.

2 Method

The method used in this study consists of the process steps shown in Fig. 1. Initially, the images in the data set were converted from RGB space to grayscale space. In this transformation, the coefficients 0.3, 0.59, and 0.11 were used for the R, G, and B channels, respectively [9]. Then, the spatial filter expressed in Eq. (1) was applied to the image [21]. Following the filtering process, the segmentation process was executed without any further pre-processing.

$$ h\left(x,y\right)=\sum \limits_{s=-a}^a\sum \limits_{t=-g}^g\varphi \left(s,t\right)f\left(x+s,y+t\right) $$
(1)
Fig. 1 Process steps applied in this study

In Eq. (1), f(x, y) indicates the input image, φ(s, t) indicates the spatial filter mask, h(x, y) indicates the filtered image, and s and t indicate the coordinates of the mask elements. x and y are the coordinates of the image pixel on which the filter is centered, and g and a indicate the size of the filter.
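The two pre-processing steps can be sketched as follows. This is a minimal illustration assuming NumPy arrays; the function names `to_gray` and `spatial_filter` and the zero padding at the image borders are assumptions made for the example, since the text does not specify border handling, and the mask passed in would be the one from Fig. 4.

```python
import numpy as np

def to_gray(rgb):
    """Weighted RGB-to-grayscale conversion with the 0.3/0.59/0.11 coefficients."""
    rgb = np.asarray(rgb, dtype=float)
    return 0.3 * rgb[..., 0] + 0.59 * rgb[..., 1] + 0.11 * rgb[..., 2]

def spatial_filter(f, phi):
    """Eq. (1): h(x, y) = sum_s sum_t phi(s, t) * f(x + s, y + t).

    Border handling is an assumption: the image is zero-padded.
    """
    f = np.asarray(f, dtype=float)
    phi = np.asarray(phi, dtype=float)
    a, g = phi.shape[0] // 2, phi.shape[1] // 2
    rows, cols = f.shape
    padded = np.pad(f, ((a, a), (g, g)), mode="constant")
    h = np.zeros_like(f)
    for s in range(-a, a + 1):
        for t in range(-g, g + 1):
            # Shifted view of the image: element (x, y) equals f(x + s, y + t).
            shifted = padded[a + s : a + s + rows, g + t : g + t + cols]
            h += phi[s + a, t + g] * shifted
    return h
```

With a mask whose only non-zero entry is a 1 at the center, `spatial_filter` returns the input unchanged, which is a quick sanity check of the indexing.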

2.1 S-ACES model

In this study, the shrinking active contour model which was developed by Sirakov N.M. et al. by applying the geometric differential equations and expressed in vector form was used [7,8,9,10]. The model was defined by the following four equations:

$$ \boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}\right)={\boldsymbol{e}}^{\boldsymbol{b}\boldsymbol{w}-\mathbf{4}{\boldsymbol{b}}^{\mathbf{2}}\boldsymbol{z}}\left(\mathbf{Acos}\left(\boldsymbol{ubw}\right),\mathbf{A}\ \mathbf{\sin}\left(\boldsymbol{ubw}\right)\right) $$
(2)

Initial values for Eq. (2):

$$ {\displaystyle \begin{array}{c}4{b}^2z=0.001,\kern0.75em u=1000,\kern1em \mathrm{A}=\sqrt{{\mathrm{st}}^2+{\mathrm{sa}}^2},\kern1em \nabla =160,\\ {}{z}_0=0.00001,\kern0.75em dz=0.000001,\kern0.75em \varepsilon =120,\kern0.75em m=n=15\end{array}} $$
(3)
$$ {\displaystyle \begin{array}{c}\boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}\right)=\boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}+\boldsymbol{dz}\right)\kern1.25em \boldsymbol{If}\kern0.5em \boldsymbol{R}\left(\boldsymbol{w},\boldsymbol{z}\right)>\mathbf{\nabla},\\ {}\boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}\right)\ne \boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}+\boldsymbol{dz}\right)\kern1.25em \boldsymbol{If}\kern0.5em \boldsymbol{R}\left(\boldsymbol{w},\boldsymbol{z}\right)\le \mathbf{\nabla},\end{array}} $$
(4)

In Eq. (4):

$$ R\left(w,z\right)=\sum \limits_{j=0}^{m-1}\sum \limits_{i=0}^{n-1}{Y}_{ji}\left(f\left(Q\left(w,z\right)\right)\right),\kern1em {Y}_{ji}\left(f\left(Q\left(w,z\right)\right)\right)=\left\{\begin{array}{c}1\ if\kern0.5em f\left(Q\left(w,z+ dz\right)\right)\in \left[{\varepsilon}_1,{\varepsilon}_2\right]\kern0.5em \\ {}0\ \mathrm{else}\kern13em \end{array}\right. $$
$$ {\displaystyle \begin{array}{c}\boldsymbol{If}\kern0.5em \frac{\boldsymbol{\delta} \boldsymbol{f}\left(\boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}\right)\right)}{\boldsymbol{\delta} \boldsymbol{z}}>\boldsymbol{\varepsilon} \kern0.5em \left(\mathbf{9}\ge \boldsymbol{z}\ge \mathbf{0.00001}\right),\kern0.5em \boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}\right)=\boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}+\boldsymbol{dz}\right)\\ {}\boldsymbol{Else}\kern0.5em \boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}\right)\ne \boldsymbol{Q}\left(\boldsymbol{w},\boldsymbol{z}+\boldsymbol{dz}\right)\end{array}} $$
(5)

In this model, Eqs. (2) and (5) were used to create the contour of the lesion. Q(w, z) represents the lesion contour. In Eq. (3), the initial values of the variables in the S-ACES model are given. Equation (4) is used to determine the color pixels in the lesion.

The parameter u in the S-ACES model is a real constant. With w ∈ [0, 2π/ub], the test points on the active contour are determined by b = |dw|/2, where |dw| is the size of the curve segment defined on w. z is the parameter that allows the active contour to shrink towards the center. The initial value of this parameter (z0) must be chosen by the user in accordance with the image dimensions. dz is the step by which the active contour advances towards the lesion center, in other words, the amount of change of z. For the lesion contour to be calculated correctly, the dz value must be chosen properly by the user. st and sa are the numbers of columns and rows of the image, respectively. f represents the image. A indicates the diagonal length of the image. Y represents the filter matrix of size m × n, where m and n are the numbers of rows and columns of this filter, respectively. R indicates the total number of pixels whose color values lie within [ε1, ε2] in the m × n area around the point (w, z) that the active contour is testing. If R > ∇, the point is an edge point and the test proceeds to the next AC point. This function filters out noise and other unwanted colors in the image.
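The counting test behind R can be sketched as below. This is only an illustration, assuming a grayscale NumPy image and a window centered on the tested pixel; the centering, the clipping at the image border, and the names `window_count` and `is_edge_point` are assumptions, while the defaults m = n = 15 and ∇ = 160 are taken from Eq. (3).

```python
import numpy as np

def window_count(gray, row, col, eps1, eps2, m=15, n=15):
    """Count pixels with gray value in [eps1, eps2] in an m x n window around (row, col).

    The window is assumed to be centered on the pixel and is clipped at the border.
    """
    gray = np.asarray(gray)
    r0 = max(row - m // 2, 0)
    c0 = max(col - n // 2, 0)
    window = gray[r0 : row + m // 2 + 1, c0 : col + n // 2 + 1]
    return int(np.count_nonzero((window >= eps1) & (window <= eps2)))

def is_edge_point(gray, row, col, eps1, eps2, nabla=160, m=15, n=15):
    """A tested point is treated as an edge point when R exceeds the threshold nabla."""
    return window_count(gray, row, col, eps1, eps2, m, n) > nabla
```

Because the full 15 × 15 window holds 225 pixels, the threshold of 160 means that roughly 71% of the neighborhood must fall in the color range before a point is accepted, which is what suppresses isolated noisy pixels.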

ε1 and ε2 are the lower and upper gray-value limits of each color taken into account in skin lesions; they are calculated from the values in Table 1 by using Eq. (6) [9]. ε is the threshold value that allows the lesion to be distinguished from the surrounding skin. Detailed studies on the success of this model have been carried out in [7, 8]. In this study, the same method was run on all data sets. The operation of the model is illustrated in Fig. 5.

Table 1 The smallest and largest values of colors in RGB space

After the segmentation process, the original lesion image was multiplied by the lesion mask, which was the output of the AC algorithm. Thus, the lesion was segmented with its original colors preserved and the negative effects of the filters were avoided. Then, conversion from RGB space to grayscale space was applied to the segmented lesion. All subsequent calculations were made in grayscale space. In this way, computational load and complexity were reduced.

$$ {\displaystyle \begin{array}{c}{\boldsymbol{\varepsilon}}_{\mathbf{1}}=\mathbf{0.3}{\boldsymbol{R}}_{\mathbf{min}}+\mathbf{0.59}{\boldsymbol{G}}_{\mathbf{min}}+\mathbf{0.11}{\boldsymbol{B}}_{\mathbf{min}},\\ {}{\boldsymbol{\varepsilon}}_{\mathbf{2}}=\mathbf{0.3}{\boldsymbol{R}}_{\mathbf{max}}+\mathbf{0.59}{\boldsymbol{G}}_{\mathbf{max}}+\mathbf{0.11}{\boldsymbol{B}}_{\mathbf{max}}\end{array}} $$
(6)
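As a small illustration of Eq. (6), the gray-value limits of one color can be computed from its RGB limits as follows. The function name `gray_limits` is an assumption, and the RGB triples in the usage note are made-up placeholders rather than the entries of Table 1.

```python
def gray_limits(rgb_min, rgb_max):
    """Eq. (6): map the (R, G, B) minimum and maximum of a color to (eps1, eps2)."""
    weights = (0.3, 0.59, 0.11)
    eps1 = sum(w * c for w, c in zip(weights, rgb_min))
    eps2 = sum(w * c for w, c in zip(weights, rgb_max))
    return eps1, eps2
```

For example, `gray_limits((0, 0, 0), (100, 100, 100))` gives limits of approximately 0 and 100, because the three coefficients sum to 1.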

2.2 CCT selection method

Deriving highly distinctive attributes from lesion images is essential for lesion classification [3]. Thus, in this section, a method to increase the discriminative ability of the NC attributes commonly used in the literature is presented.

Dermatological examinations have shown that melanoma lesions contain more colors than benign lesions [9, 17, 20]. Therefore, the determination of the NC in the lesions is extremely important for the diagnosis of melanomas. Six essential colors are taken into consideration in skin lesions: white, red, light brown, dark brown, blue-gray, and black [9, 14, 17, 19, 20, 22]. The lowest and highest limit values of these colors in RGB space are provided in Table 1 [9].

To accept the presence of a particular color in a skin lesion, a comparison with a CCT value is required. Regardless of how the CCT value is determined, a color is considered present in the lesion if the amount of pixels of that color exceeds the CCT. Two types of CCT have been used in the literature. The first is based on a certain ratio, and the other is based on a fixed number of pixels [9, 10, 16, 19]. In this study, the proportional threshold approach was used.

In this approach, the ratio (γ) of the number of pixels with the investigated color to the total number of pixels constituting the lesion was calculated as stated in Eq. (7). The calculated γ value was then compared with the γ′ value selected as the CCT.

If γ > γ′, the color was assumed to be present in the lesion and the NC value was increased by 1. Otherwise, the color was accepted as absent from the lesion. The NC is determined by this process and its highest value is 6. To be compatible with the literature, the NC was calculated for CCT values γ′ ∈ [0, 1] in steps of 0.0001.

In Eq. (7), Ra represents the total number of lesion pixels of the investigated color, that is, pixels whose values lie within the range [ε1, ε2] in the grayscale space. ε1 and ε2 are calculated with Eq. (6) according to the limits in Table 1. La is the total number of pixels in the area occupied by the lesion, and γ is the ratio of these two.

$$ \boldsymbol{\gamma} =\frac{\boldsymbol{Ra}}{\boldsymbol{La}} $$
(7)
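The counting procedure built on Eq. (7) can be sketched as follows. This is a minimal illustration assuming a grayscale image, a boolean lesion mask, and a list of (ε1, ε2) pairs; the function name `count_colors` and its argument names are assumptions, and the color limits used in the usage note are placeholders rather than the values of Table 1.

```python
import numpy as np

def count_colors(gray, mask, color_limits, cct):
    """NC: number of colors whose pixel ratio gamma = Ra / La exceeds the CCT."""
    lesion = np.asarray(gray)[np.asarray(mask, dtype=bool)]
    la = lesion.size  # La: number of pixels in the lesion
    if la == 0:
        return 0
    nc = 0
    for eps1, eps2 in color_limits:
        ra = np.count_nonzero((lesion >= eps1) & (lesion <= eps2))  # Ra
        if ra / la > cct:  # gamma > gamma'
            nc += 1
    return nc
```

For a 10-pixel lesion in which three placeholder colors occupy 60%, 30%, and 10% of the area, the sketch returns NC = 3 for a CCT of 0, NC = 2 for a CCT of 0.15, and NC = 1 for a CCT of 0.5, which shows how strongly the result depends on the chosen threshold.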

No method has been proposed in the literature to determine the CCT value, and the appropriate CCT depends on the image acquisition conditions and image properties. Nevertheless, a highly distinctive NC attribute should be able to distinguish melanoma and benign lesion types very well, as shown in Fig. 2. For this, the NC attribute must be calculated correctly, which is only possible with the most accurate CCT values. For this purpose, the method given by Eq. (8), which determines the most accurate CCT values, was developed.

$$ \boldsymbol{\varTheta} \left({\gamma}^{\prime}\right)={\mu}_{M{\boldsymbol{\gamma}}^{\prime }}-{\mu}_{B{\boldsymbol{\gamma}}^{\prime }};\left({\mu}_{M{\boldsymbol{\gamma}}^{\prime }}=\frac{\sum_{i=1}^k\left({M}_i\right)}{k},{\mu}_{B{\boldsymbol{\gamma}}^{\prime }}=\frac{\sum_{j=1}^d\left({B}_j\right)}{d},\gamma^{\prime}\in \left[0,1\right]\right) $$
(8)
Fig. 2 Ideally expected NC attribute. The graphic shows the relationship between NC and Θ. The dashed lines show the average NC values for each lesion type

In Eq. (8), Θ represents the difference between the average NC values of the melanoma and benign lesion groups. \( {\mu}_{M{\gamma}^{\prime }} \) and \( {\mu}_{B{\gamma}^{\prime }} \) are the average NC values of the melanoma and benign groups, respectively. Mi is the NC of melanoma sample i and Bj is the NC of benign sample j, both calculated with γ′ selected as the CCT value. k and d are the numbers of samples in the melanoma and benign groups, respectively. In the developed method, Θ is examined as a function of γ′.

If the magnitude of Θ is high, the NC values of the two groups are very different; if it is low, they are close to one another. According to Eq. (8), if Θ > 0, the melanoma samples have the higher average NC, and if Θ < 0, the benign samples have the higher average NC.

Therefore, the purpose of the method is to determine the values \( {\gamma}_1^{\prime } \) and \( {\gamma}_2^{\prime } \) that bring the Θ(γ′) function to its positive and negative extremes, respectively. In this way, the best CCT values were determined for both lesion types. The CCT values used in the literature are 0.001 [19], 0.00125 [16], and 150 pixels [9]. Therefore, to conform to the literature, the step size for γ′ in the present study was selected as 0.0001.

Equation (8) was applied separately for each data set. In this way, \( {\gamma}_1^{\prime } \) and \( {\gamma}_2^{\prime } \) are determined separately for datasets created under different image acquisition conditions.
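The search for the two critical CCT values can be sketched as below. This is a minimal illustration assuming that the NC of every sample has already been computed for every candidate γ′ and stored as one row per sample and one column per γ′; the function names `theta_curve` and `critical_ccts` are assumptions.

```python
import numpy as np

def theta_curve(nc_melanoma, nc_benign):
    """Eq. (8): Theta(gamma') = mean NC of melanoma minus mean NC of benign, per gamma'.

    Both inputs have shape (number of samples, number of candidate gamma' values).
    """
    mu_m = np.mean(np.asarray(nc_melanoma, dtype=float), axis=0)
    mu_b = np.mean(np.asarray(nc_benign, dtype=float), axis=0)
    return mu_m - mu_b

def critical_ccts(gammas, theta):
    """Return (gamma_1', gamma_2'): the gamma' at the positive and negative peaks of Theta."""
    gammas = np.asarray(gammas, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return gammas[np.argmax(theta)], gammas[np.argmin(theta)]
```

In the study the candidate grid would be γ′ = 0, 0.0001, ..., 1; the sketch works for any grid, and when several γ′ values tie for a peak it returns the first one.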

Then, a t test was applied to examine how the determined \( {\gamma}_1^{\prime } \) and \( {\gamma}_2^{\prime } \) values and the other CCT values used in the literature affect the selectivity of the NC calculated with them. The t test is a strong indicator of distinctiveness. In this method, the arithmetic means, standard deviations, and sample numbers of each class are taken as input and the t value is calculated by Eq. (9). In this equation, tC(γ′) indicates the t test result of the NC obtained with the selected γ′ value. A high t value indicates high distinctiveness [23]. These operations were applied to each data set separately and the results obtained were compared in terms of p value.

$$ {t}_C\left({\boldsymbol{\gamma}}^{\prime}\right)=\frac{\mid {\mu}_M-{\mu}_B\mid }{\sqrt{\frac{\left(k{\left({\sigma}_M\right)}^2+d{\left({\sigma}_B\right)}^2\right)}{\left(k+d\right)}}} $$
(9)
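Eq. (9) translates directly into a few lines of code. The sketch below follows the formula as printed, with the group variances weighted by the sample counts k and d; the function and argument names are assumptions.

```python
import math

def t_value(mu_m, mu_b, sigma_m, sigma_b, k, d):
    """Eq. (9): |mu_M - mu_B| divided by the root of the count-weighted mean variance."""
    pooled_variance = (k * sigma_m ** 2 + d * sigma_b ** 2) / (k + d)
    return abs(mu_m - mu_b) / math.sqrt(pooled_variance)
```

For instance, two groups with means 5 and 3 and a common standard deviation of 1 give a value of 2 regardless of the group sizes.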

2.2.1 Proposal of a new metric: NCD

As explained in the previous sections, the color characteristics of the lesions provide important information. These features need to be made measurable with appropriate methods. When a lesion contains many colors, its surface area is divided into many small color regions. This is especially the case for melanomas. Since benign lesions contain fewer colors than melanomas, their surface area is divided into fewer but larger color regions, as shown symbolically in Fig. 3 for the two lesion types. Therefore, when the CCT value changes, the NC value of melanoma lesions is affected more than the NC value of benign lesions. The method proposed in this section is based on this observation: if the calculated NC values vary greatly with the selected CCTs, the risk of melanoma is also high. This variation was calculated according to Eq. (10) using the \( {\gamma}_1^{\prime } \) and \( {\gamma}_2^{\prime } \) CCT values determined by the method described in Section 2.2.

Fig. 3 Symbolic representation of the amount of colors detected in skin lesions. C1…6: symbolic representation of the colors specified in Table 1

$$ \boldsymbol{NCD}=\left[\left({\boldsymbol{NC}}_{\mathbf{1}}-{\boldsymbol{NC}}_{\mathbf{2}}\right)\left(\frac{{\boldsymbol{NC}}_{\mathbf{1}}}{{\boldsymbol{NC}}_{\mathbf{2}}}\right)\right] $$
(10)

In Eq. (10), NC1 and NC2 are the NC attributes calculated on the same example for \( {\gamma}_1^{\prime } \) and \( {\gamma}_2^{\prime } \) CCT values, respectively.

The NC attributes calculated in the literature [9, 11, 16, 17, 19, 22] and in this study range from 1 to 6, whereas the proposed NCD attribute ranges from 0 to 30. Extending the value range in this way magnifies small differences between the lesion groups, which makes them easier to detect and increases the discrimination ability of the attribute. The proposed CCT determination method and the associated NCD calculation can also be applied easily to data sets created under different image acquisition conditions.
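A short sketch of Eq. (10) makes the stated value range concrete. It reads the square brackets in Eq. (10) as plain grouping, which is an assumption, and the function name `ncd` is likewise illustrative. Since the NC values are stated to lie between 1 and 6, a zero NC2 is rejected rather than handled.

```python
def ncd(nc1, nc2):
    """Eq. (10): NCD = (NC1 - NC2) * (NC1 / NC2)."""
    if nc2 == 0:
        raise ValueError("NC2 must be at least 1 for Eq. (10) to be defined")
    return (nc1 - nc2) * (nc1 / nc2)
```

The extremes follow directly: NC1 = 6 and NC2 = 1 give (6 − 1)(6/1) = 30, while NC1 = NC2 gives 0. An intermediate case such as NC1 = 3 and NC2 = 2 gives 1.5, so the attribute is not restricted to integers.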

2.2.2 Lesion classification

In this study, the classification of the lesions and the interpretation of the results were conducted in accordance with Table 2. Accordingly, true positive (Tp) denotes melanoma samples that were classified as melanoma. False positive (Fp) denotes samples that were classified as melanoma although they were benign. False negative (Fn) denotes samples that were classified as benign although they were melanoma. True negative (Tn) denotes benign samples that were classified as benign.

Table 2 Evaluation matrix and interpretation of results

Following the determination of the two most distinctive CCT values for the images in all data sets as described in Section 2.2, the NCD attribute was calculated as described in Section 2.2.1. In addition, to allow comparison with the results of this study, alternative NC attributes were calculated with the CCT values used in the literature: 0.001 [19], 0.00125 [16], and 150 pixels [9].

The univariate lesion classification was performed using classification threshold (CT) values Tj ∈ [0, 30] (for j = 1, 2). The CTs Tj were determined automatically by the methods in Table 3 during the training stage, with a step size of 0.01.

Table 3 Classification threshold selection methods

Values above the selected CT value were classified as melanoma (positive) and values below or equal to the selected CT were classified as benign (negative).

The common point of these methods was that they aim at a high total correct classification ratio. For each NC attribute, the CT values determined on the training sets in accordance with the methods in Table 3 were applied to the test sets to complete the lesion classification. The results from the test sets and the success rates were then evaluated.
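The thresholding rule and the four outcome counts can be sketched as follows. The function names are assumptions, and the f-measure is written in its standard form 2Tp / (2Tp + Fp + Fn), which is assumed here because the text does not spell out the formula.

```python
def classify(values, ct):
    """Label a sample melanoma (True) when its attribute value is above the CT."""
    return [v > ct for v in values]

def confusion_counts(predicted, actual):
    """Return (Tp, Fp, Fn, Tn) with melanoma as the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return tp, fp, fn, tn

def f_measure(tp, fp, fn):
    """Standard F1 score; 0.0 is returned when it would otherwise be undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Note that a sample whose value equals the CT is labeled benign, in line with the rule stated above.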

3 Results and discussion

3.1 Data set

In the present study, three data sets were used. The first database, consisting of 28 melanoma and 69 benign cases (DS1), was used to test the developed S-ACES model [9, 22]. Some of the images in the database have black frame problems, and cropping was applied to eliminate them. The preprocessing followed procedures similar to those used by Mete M. et al. [9].

The second data set consists of 209 melanomas and 279 benign samples (DS2) [24]. The third data set is composed of a combination of the two (DS3). The aim is to test the performance of the proposed attribute and method on a complex data set containing images obtained under different conditions. The training and test stages were applied to each data set separately. The details of the data sets are given in Table 4. Ground truth is available for diagnosis and segmentation verification of all samples in the datasets.

Table 4 Detail of the data sets used in the presented study

Accordingly, in the first step, color space transformation was applied to the images. Thus, the images were transferred from the RGB space to the single-channel grayscale space, which is easier to work with. The spatial filter expressed in Eq. (1) was then applied using the 3 × 3 spatial filter mask φ(s, t) shown in Fig. 4. The coefficients in the filter mask can be arranged in different ways to enhance different attributes of the image. For example, if the sum of the coefficients in the mask is zero, the background appears darker and the edges of the objects appear as thin white lines. Similarly, if the sum of the coefficients is increased, the contrast of the image also increases [25]. Here, the coefficients were arranged to improve the image and to obtain the best segmentation results. The coefficients that provided the best results in experimental studies are given in Fig. 4.

Fig. 4 Spatial filter mask φ(s, t) used in Eq. (1)

In addition, to deal with noise such as scars and hair in the image, the method stated in Eq. (4), which is embedded in the S-ACES algorithm, was used.

3.2 Segmentation

The S-ACES model as defined by Eqs. (2)–(5) was used. With this model, images were segmented by using the initial values specified in Eq. (3) [7, 9, 26]. An example of the operation of the model and the results of the segmentation is given in Fig. 5.

Fig. 5 Example of segmentation. a S-ACES model process. b Segmented lesion

3.3 The effect of CCT selection on the NC

All of the samples from melanoma and benign lesions were subject to segmentation by the active contour model as mentioned in the previous sections. The NC for each sample was then calculated on the segmented samples.

Then, the Θ value was calculated as described in Section 2.2. As a result of these processes, Θ value changes were obtained for all DSs according to γ values as shown in Fig. 6.

Fig. 6 Change of Θ with the CCT γ′ for the different DSs. a DS1. b DS2. c DS3

As shown in Fig. 6a, the positive and negative peaks showing the critical CCT values for DS1 were determined as \( \varTheta \left({\gamma}_1^{\prime }=0\right) \) = +0.6734 and \( \varTheta \left({\gamma}_2^{\prime }=0.1231\right) \) = −0.6444. The γ′ values providing these peaks were recorded as \( {\boldsymbol{\gamma}}_{\mathbf{1}}^{\prime } \) and \( {\boldsymbol{\gamma}}_{\mathbf{2}}^{\prime } \), respectively.

In addition, as mentioned in the previous sections, the negative peak, which indicates that benign lesions have higher color count depending on the CCT value, is clearly seen here.

This process has been applied to all DSs separately. The Θ values for DS2 and DS3 are given in Fig. 6b and c respectively. For all data sets, positive and negative peak values of Θ(γ′) and the critical \( {\gamma}_1^{\prime } \) and \( {\gamma}_2^{\prime } \) CCT values are given in Table 5.

Table 5 Critical CCT values determined in the data sets. \( {\gamma}_1^{\prime } \): CCT providing the highest Θ value for the melanoma group. \( {\gamma}_2^{\prime } \): CCT providing the highest Θ value for the benign group

Although melanoma lesions have been reported to have a higher NC in the literature [9, 17, 20], the experimental results indicated that this depends on the CCT value selected. Thus, the CCT was also found to be of critical importance.

Next, the NC value was calculated for all the samples in all data sets using the CCTs given in Table 5 and the CCT values used in the literature (0.001, 0.00125, and 150 pixels) [9, 16, 19]. Thus, by using five different CCT values, five different NC attributes were calculated for each DS. In addition, using the NC values calculated with the CCTs given in Table 5, the NCD attribute was calculated for each data set according to Eq. (10). The NC and NCD parameters calculated on DS1 are provided in Fig. 7.

Fig. 7 The NC calculated in accordance with the different CCTs (for DS1). a γ=0 CCT. b γ=0.1231 CCT. c γ=0.001 CCT. d γ=0.00125 CCT. e γ=150 pixel CCT. f The NCD calculated according to Eq. (10) using NC obtained for γ=0 and γ=0.1231. Asterisk symbols indicate the melanoma samples; circles indicate the benign samples

All of the NC parameters were statistically evaluated for their distinctiveness capabilities. For this purpose, the t test was applied. The results are provided in Table 6.

Table 6 t test results of NC obtained by different threshold and NCD parameters

The γ values used in the literature did not significantly increase the value of Θ between the benign and melanoma lesion groups for DS1, DS2, and DS3, as shown in the 6th column of Table 6. However, the 0 and 0.123x CCT values proposed in this study produced the highest Θ values for all data sets. In addition, although the literature suggests that melanoma lesions contain a greater number of colors than benign lesions, when γ = 0.123x, benign lesions were found to have the higher color count (Θ < 0).

In addition, when the CCT values in the literature were used, especially for DS2, no statistically significant difference was observed between the lesion groups in terms of NC (p > 0.05). However, the NC calculated with the 0 and 0.123x CCT values suggested in this study revealed significant differences in all data sets.

To avoid these contradictory situations and to benefit from the CCT values that maximize Θ, the NCD parameter was calculated as a function of the CCT values as given in Eq. (10). In the tests performed on all data sets with the proposed NCD attribute, the Θ value between the lesion groups reached its highest level and the power of distinctiveness was significantly higher.

3.4 Classification of lesions with NCD

As mentioned in Section 3.1, three data sets were used in the experiments. Four attributes were calculated on each data set: the three NC attributes calculated with the CCTs used in the literature and the NCD attribute proposed in this study. Each feature vector was divided into 70% training and 30% test groups. In this way, 20 different training/test sets were randomly created for each attribute. Thus, across the three data sets, experiments for each attribute were repeated with a total of 60 different training/test sets.
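The repeated random partitioning can be sketched as follows. This is a minimal illustration assuming plain (non-stratified) random splits, since the text does not state whether class proportions were preserved; the function name `random_splits` and the fixed seed are assumptions added for reproducibility.

```python
import numpy as np

def random_splits(n_samples, n_repeats=20, train_fraction=0.7, seed=0):
    """Return n_repeats (train_indices, test_indices) pairs drawn at random."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_fraction * n_samples))
    splits = []
    for _ in range(n_repeats):
        order = rng.permutation(n_samples)
        splits.append((order[:n_train], order[n_train:]))
    return splits
```

For DS1, with 28 + 69 = 97 samples, each split holds 68 training and 29 test samples.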

The classification experiments were executed using one attribute at a time. The CT values (T1 and T2) used in the classification tests were determined on each training set using the CT selection methods (M1 and M2) given in Table 3. These methods determine the values of T1 and T2 that provide the highest score (M1 and M2, respectively) on each training set. The working principle of these methods is illustrated for one case in Fig. 8. In this figure, the execution of the M1 method is shown on the 14th training set created for the NCD attribute calculated on DS1. Figure 8a shows the change of the Tp and Tn values during the training phase, while Fig. 8b shows the change of the M1 score during the training. In this training example, the T1 value for the NCD feature was determined as 3.34, as shown in Fig. 8b. In the test stage, samples above this CT value were classified as melanoma and all others as benign. All of these processes were repeated on 60 randomly generated training/test sets for each attribute.
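The threshold search in the training stage can be sketched as a simple sweep. The score functions M1 and M2 are defined in Table 3 and are not reproduced here, so the `score` argument below is a placeholder to be supplied by the caller; the function name `select_ct` is an assumption, and the sketch assumes that both classes are present in the training set.

```python
import numpy as np

def select_ct(values, is_melanoma, score, lo=0.0, hi=30.0, step=0.01):
    """Sweep the CT over [lo, hi] and return (best_ct, best_score).

    score(tp_rate, tn_rate) is a caller-supplied function; the first CT that
    reaches the highest score is kept.
    """
    values = np.asarray(values, dtype=float)
    is_melanoma = np.asarray(is_melanoma, dtype=bool)
    best_ct, best_score = lo, -np.inf
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        ct = lo + i * step
        predicted = values > ct                    # above the CT means melanoma
        tp_rate = np.mean(predicted[is_melanoma])  # correctly classified melanomas
        tn_rate = np.mean(~predicted[~is_melanoma])  # correctly classified benign
        current = score(tp_rate, tn_rate)
        if current > best_score:
            best_ct, best_score = ct, current
    return best_ct, best_score
```

As a purely hypothetical example, `select_ct(values, labels, lambda tp, tn: tp + tn)` would pick the threshold that maximizes the sum of the two correct classification rates; this stand-in score is not claimed to match either method in Table 3.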

Fig. 8

Determining the CT value for the NCD attribute in the training stage for DS1. The M1 method was used to determine the CT for the 14th training set. a Variation of the correct classification rates of the melanoma and benign groups with the selected threshold value. b Change of the score calculated according to the M1 method. In this sample, the CT was found to be T1 = 3.34. Asterisk symbols indicate Tp (melanoma); circles indicate Tn (benign)

As an example of the operations executed during the training and testing stages, the details of the results with the NCD attribute are provided in Table 7 for DS1. The first column in this table gives the sequence number of the randomly generated training/test sets. The second and third columns are the correct classification rates of melanoma and benign lesions in the training stage. These values were obtained by the M1 method. The fourth and fifth columns show the highest scores achieved by the M1 method for each training set and the T1 values that provide them.

Table 7 Details of the results of training and testing operations with 20 training/test sets for the NCD attribute using the M1 method, for DS1

Finally, the sixth and seventh columns give the correct classification rates for melanoma and benign lesions in the test stage. The average values of each column are provided at the bottom of Table 7.

The attributes were compared using the average Tp, Tn, and f-measure values for all attributes and all DSs. These mean values were calculated from the univariate classification results obtained with the CTs determined by the M1 and M2 methods.

The correct diagnosis of melanoma is vital for treatment. Misdiagnosis of an actual patient may lead to delayed treatment and death. Therefore, it is vital to have a high Tp ratio in the diagnostic support systems used. The Tn value is also expected to be high for diagnostic success. However, a low Tn value is not life-threatening; it only leads to increased diagnostic costs and psychological effects on patients.

The f-measure value is used to assess the accuracy of the tests and is also of vital importance for the correct interpretation of the results. These values were calculated for the melanoma and benign groups in the experiments and are tabulated in Table 8.
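As a reminder of how these three quantities relate, the sketch below computes Tp, Tn, and an f-measure from predicted and actual labels. It assumes that Tp and Tn denote the correct classification rates of the melanoma and benign groups, and that the f-measure is the standard harmonic mean of precision and recall with melanoma as the positive class; the exact definition used for Table 8 is not restated in this section, so this is an assumption.

```python
def evaluate(predicted, actual):
    """Return (Tp, Tn, f_measure) with melanoma (label 1) as the positive class.

    Tp : fraction of melanomas classified as melanoma (sensitivity)
    Tn : fraction of benign lesions classified as benign (specificity)
    f_measure : harmonic mean of precision and recall (standard F1, assumed)
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)

    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    tn_rate = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + tp_rate
    f_measure = 2 * precision * tp_rate / denom if denom else 0.0
    return tp_rate, tn_rate, f_measure
```

Because precision depends on the number of benign lesions misclassified as melanoma, a low Tn also lowers the f-measure, which is why the two rates are reported alongside it.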

Table 8 Average results for each DS obtained from 20 training/test sets that are randomly generated and used in lesion classification experiments

Classification tests were performed on the three databases with the NC attributes used in the literature and the proposed NCD. In Table 8, for each DS, the first three rows show the test results obtained with the NC attributes, and the last row shows the results of the new feature called “NCD”.

The first column in Table 8 shows the data set used in the tests. The second and third columns show the attributes used in the tests. The fourth through ninth columns show the average Tp, Tn, and f-measure results from 20 sets of tests for each attribute. Since the training/test procedures were performed with two different methods (M1, M2), the results are given in separate columns for each method. The tenth, eleventh, and twelfth columns are the averages of the test results obtained with the M1 and M2 methods. The thirteenth, fourteenth, and fifteenth columns show the improvement of the proposed NCD attribute relative to the average Tp, Tn, and f-measure values obtained with the NC attributes used in the literature.

When the first three rows are compared for DS1, Faziloglu et al. [16] had higher average Tp and f-measure values (55.6% and 62.8%, respectively) than the others in the literature. However, the newly proposed attribute (NCD) of this study had the highest average Tp and f-measure values (80.0% and 75.5%, respectively).

When the same comparison is made for DS2, Mete et al. [9] had higher average Tp and f-measure values (39.4% and 47.9%, respectively) than the others in the literature, but the new NCD attribute again had the highest average Tp and f-measure values (64.7% and 66.0%, respectively). When a similar comparison is made for DS3, Mete et al. [9] again had higher average Tp and f-measure values (40.1% and 49.0%, respectively) than the others in the literature. However, as in the previous two data sets, the proposed NCD attribute had the highest average Tp and f-measure values (68.2% and 67.4%, respectively) in this data set. In terms of Tn, the proposed NCD attribute showed relatively poor performance; however, as described above, low Tn values are not critical for diagnosis. Overall, the experimental results revealed that the NCD attribute improved the f-measure value by up to 52.7% and the Tp value by up to 84.5% compared with the NC attributes calculated with the different CCTs from the literature.
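The improvement figures can be read as relative percentage changes, which is an assumption about how the improvement columns of Table 8 were computed. The sketch below shows that calculation; the function name is illustrative. Note that the example compares NCD with the best single literature attribute on DS1 using the values quoted above, whereas Table 8 uses the average of the NC attributes as the baseline, so the resulting number is not one of the table's entries.

```python
def relative_improvement(proposed, baseline):
    """Relative change of proposed over baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (proposed - baseline) / baseline


# DS1, average Tp: NCD (80.0%) versus the best literature NC attribute (55.6%).
ds1_tp_gain = relative_improvement(80.0, 55.6)
print(f"Relative Tp improvement on DS1: {ds1_tp_gain:.1f}%")
```

With these inputs the relative gain in Tp is roughly 44%, which is lower than the "up to 84.5%" figure because the latter is measured against a different baseline and is the maximum over all cases.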

In the open literature, Faziloglu Y. et al. achieved a single-attribute classification success of Tp = 84.6% for melanoma and Tn = 83.0% for benign lesions [16] with a histogram-based attribute called PMC. However, that attribute is based on a histogram vector and requires a higher computational workload. In contrast, the proposed attribute (NCD) is easier to calculate because it is a scalar rather than a vector.

In addition, the NCD attribute showed the best classification performance compared with the NC attributes generated with the different CCTs. This result also indicates that NCD was more effective than the NC attributes commonly used in the classification of lesions.

4 Conclusion

It is known that NC and color properties are important for melanoma detection. However, there is no detailed study in the literature on the effects of CCT values in determining these characteristics. In the experiments, it was found that the selected CCT values significantly affected NC. For example, in the literature, the NC of melanomas is reported to be higher than that of benign lesions. However, this study revealed that the opposite result may be obtained depending on the selected CCT value. In addition, it was found statistically that the CCT values used in the literature were not suitable for use with different data sets (Table 6). Furthermore, it was shown that the most appropriate CCT values for the examination of melanoma and benign lesions were 0 and 0.123, respectively.

CCT values are highly dependent on the lighting, hardware conditions, and resolution under which the images are acquired. Therefore, for samples obtained under different conditions, the CCT may need to be determined again. In this study, a new CCT determination method was proposed for this purpose. Experimental and statistical results showed that the proposed threshold determination method was highly effective. Since there is no method to determine CCT values in the literature, the proposed method is an important contribution of this study.

When the CCT values determined by the proposed method were examined, it was seen that a CCT value of 0 was prominent in all data sets for melanomas. This shows that even the smallest color changes in the lesion should be taken into account for the diagnosis of melanoma.

In this case, the accuracy of the methods used for segmentation can become even more critical, because segmentation errors may cause the color properties of the skin surrounding the lesion to have a greater effect on the attribute calculation. However, even if there is an error in the segmentation process, the proposed CCT determination method will determine a new CCT value accordingly. In this regard, the effect of segmentation methods on the NCD attribute calculation is a separate research subject.

Still, threshold value selection alone is insufficient to increase classification performance. A new attribute called NCD was defined to address this. The dynamic range of NCD extends from 0 to 30, whereas that of NC is between 1 and 6. This wider range allows even the smallest difference between lesions to be evaluated. The proposed CCT determination method and the associated NCD calculation method can be easily applied to lesion images acquired under different conditions. NCD is not a simple color number attribute but a measure of the variation in the number of colors of the lesion for different CCTs. In this respect, it differs from the classic NC attribute. Univariate classification experiments using this feature revealed that it is a strong indicator for the diagnosis of melanoma.

In the experiments, the average Tp for melanomas reached up to 80%, the average Tn up to 68.5%, and the average f-measure up to 75.5%. These values correspond to an improvement of up to 52.7% in the f-measure and up to 84.5% in Tp compared with the results obtained with the NC attributes calculated with the different CCTs used in the literature. Providing such a high improvement with a single scalar descriptor might be one of the important contributions of this study to the literature. Owing to this improvement and its adaptability, using NCD instead of the color number (C) in the ABCD system, which is commonly used by dermatologists, is expected to provide more successful results.

In future studies, the interaction of the NCD attribute with other important features in the literature will be investigated.