1 Introduction

Coral reefs are among the richest and most diverse ecosystems on Earth [24, 34]. Although they cover just 0.1% of the ocean surface, they are home to at least 25% of all marine species. Healthy coral reefs are among the most important and precious ecosystems on the planet. In recent years, coral diseases have become a major threat to coral ecosystems worldwide [6]. Coral diseases are generally driven by increased sea surface temperatures, ultraviolet radiation and pollutants, which increase the susceptibility of corals to disease and lead to outbreaks where corals are abundant and diverse. Coral diseases can cause significant changes in coral reproduction rates, growth rates, community structure, species diversity, and the abundance of reef-associated organisms. Moreover, corals span a wide area, and their abundance supports biodiversity: reefs act as a natural barrier against deadly tsunamis and provide a habitat for many creatures. If the coral diseases present in a particular area during a given period (month) are reported early, the government can take countermeasures to save this precious ecosystem. Marine biologists are still looking for an automated system that provides such early reporting to protect the marine ecosystem.

Corals are generally classified as healthy, unhealthy and dead, as shown in Fig. 1. Healthy coral appears in shades of olive green, brown, tan and pale yellow. A healthy coral colony is not affected by disease or bleaching, and it provides shelter for many species that rely on the structure corals provide for their homes. Diseased corals exhibit a narrow band intruding on the living tissue of a healthy coral, leaving the dead white coral skeleton exposed. The band's colour comes from concentrations of bacteria that produce sulphur compounds and low-oxygen conditions, which infect the coral [34]. For example, black band disease is caused by the bacterium Phormidium corallyticum. Healthy and diseased coral images have to be classified so that any onset of disease can be nipped in the bud to protect corals for the future. An automated system that reports coral diseases early allows the government to take countermeasures to prevent further damage to the reef. There are many existing classification works applicable to texture data sets and some works on coral reef image classification, but only a few works have been reported on coral reef disease classification. An automated system is the need of the hour for coral reef disease classification, and an efficient feature vector must be devised to reduce the computational and time complexities and to improve the accuracy of the classification process.

Fig. 1
figure 1

a Healthy coral. b Diseased coral. c Dead coral

This paper is organized as follows. Section 2 presents research works that have been carried out on coral reef classification. Section 3 describes the overall framework of the proposed system. Section 4 describes the proposed novel feature descriptors. Experimental results are presented in Section 5, and the last section presents the conclusion and future work.

1.1 The contributions in this work are summarized as follows

  1. Both HSV and RGB color spaces are considered for feature extraction in diseased coral reef image classification.

  2. A novel feature descriptor, MDCP, is proposed, which considers every element for mean estimation and only the diagonal elements for assigning the direct code during feature extraction.

  3. A novel feature descriptor, DDVP, is proposed, which considers the diagonal directional elements for feature extraction.

  4. The feature vector is constructed by concatenating the HSV-MDCP and RGB-DDVP features.

  5. The proposed framework effectively reduces the computational and time complexities by considering only the diagonal elements.

  6. Since only the diagonal elements are considered, classification results improve, which is demonstrated with various classifiers: Decision Tree (DT), Classification And Regression Tree (CART), C4.5, AdaBoost, Rotation Forest (RoF), Random Forest (RF), SVM, CNN, PCCNN, KNN and Naive Bayes.

  7. A performance comparison of the proposed feature descriptors with other feature descriptors such as LBP [23], LDP [38], CLBP [33], ILDP [2], DLBP [16], LTxXORP [4], CS-LBP [11], RLTP [31], Z ⊕ TZLBP [1], OC-LBP [39], LTrP [25], PRI-CoLBP [30] and OPT [3] highlights the superiority of the proposed framework in terms of classification accuracy.

2 Related works

Coral disease is a major threat to coral reef ecosystems [14]. From the available literature it is observed that some works have reported coral reef image classification, but no work has reported coral reef disease classification. Pérez et al. [28] enhanced coral reef images using Contrast Limited Adaptive Histogram Equalization (CLAHE), segmented them using Gaussian mixture models [7] and extracted texture features using Gabor filters. Shiela et al. [23] classified corals in videos as living and nonliving by extracting texture features with the Local Binary Pattern (LBP) descriptor and classifying them using Linear Discriminant Analysis (LDA); they treated living coral as smooth texture and nonliving coral as irregular texture. For enhancing coral reef images, Shihavuddin et al. [33] used normalization, color correction, color stretching and Contrast Limited Adaptive Histogram Specification (CLAHS), whereas Ani et al. [2] used CLAHE and Contrast Stretching (CS). Multi-task learning and temporal pattern mining [19] are mostly applied for representing the features of activities and selecting discriminant features [21]. Generic human motions [18] are tracked using a fusion of low- and high-dimensional approaches [37]. Ye Liu et al. [20] used multi-task multi-view learning for urban water quality prediction.

For extracting features of coral reef images, Shihavuddin et al. [33] used Gabor filters, GLCM, CLBP, the opponent angle and the hue histogram, whereas Ani et al. [2, 3] used the Improved Local Derivative Pattern (ILDP) and the Octa-angled Pattern for Triangular sub-regions (OPT). For coral reef classification, Shihavuddin et al. [33] used KNN, SVM and NN classifiers, whereas Ani et al. [2, 3] used Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Pulse Coupled Convolutional Neural Network (PCCNN) and Convolutional Neural Network (CNN) classifiers. Ani et al. [1] segmented coral reef images using active contours and classified coral videos using a KNN classifier with various distance metrics, whereas Mohammad et al. [32] used KNN with the Euclidean distance for classification. For extracting features, Ani et al. [1] used the Z with Tilted Z Local Binary Pattern (Z ⊕ TZLBP) feature descriptor, whereas Mohammad et al. [32] used the Completed Local Binary Pattern (CLBP). To enhance coral reef images, Ani et al. [1] used CLAHE and contrast stretching, whereas Mohammad et al. [32] used contrast normalization. Beijbom et al. [5] provided the Moorea Labeled Corals (MLC) data set, a benchmark for coral reef image classification, in which a Maximum Response (MR) filter bank is used for feature extraction and library SVM is applied for classification. Beyond the reef itself, the fish that live in coral reefs have also been classified using SVM and CNN classifiers by extracting HOG features [35].

2.1 Observation

Just as coral reef images are classified into one of several classes, diseased coral reef images have to be classified into one of nine classes: Aspergillosis (Asper), Bacterial Bleaching (BaB), Black Band (BB), Black Spot/Dark Spot (BSDS), White Plague (WPl), White Band (WB), Yellow Band (YB), White Pox (WPo) and Pink Spot (PS). An automated system is expected to classify a diseased coral reef image into one of these classes. Many standard data sets are available for texture, but few data sets are available for coral reef diseases. Coral reef and texture data sets are classified using their textural features alone, but texture combined with colour spaces is important for diseased coral reef image classification.

3 System architecture

The proposed framework consists of the following steps for diseased coral reef image classification, and it is shown in Fig. 2. The steps in the proposed framework are segmentation, feature extraction and classification.

Fig. 2
figure 2

System Architecture of the proposed framework

3.1 Segmentation using gradient-based sobel operator

In a submarine environment, the edges of the coral reef must be estimated correctly, so the coral reef first has to be separated correctly from its background. K-Means is the most widely used clustering technique for image segmentation [36]. The difficulty with K-Means is that it assumes globular clusters, for which the results are often unacceptable; for different initial partitions, K-Means produces different final clusters, and it is not simple to choose the value of K. On the other hand, the Fuzzy C-Means method has difficulty managing outlier points [9]: it allocates high membership values to outliers, so that the membership of a data point depends directly on the membership values of other cluster centres, which leads to unwanted results.

A gradient-based approach is applied to the diseased coral reef images for efficient segmentation. Accurate edges of the coral image are identified using the Sobel operator, and the number of white points is estimated from the edge-extracted coral image. The gradient-based image segmentation technique separates the coral reef from its background efficiently, as shown in Fig. 3b, by identifying significant local changes in the intensity level of the coral image. For a coral image f(x, y), the magnitude and the direction of the gradient are computed as shown in Eqs. (1) and (2) respectively.

Fig. 3
figure 3

a Input coral reef b Background removed coral reef c Edges detected coral reef

$$ \left|G\right|=\sqrt{G_x^2+{G}_y^2} $$
(1)
$$ \upalpha \left(x,y\right)={\tan}^{-1}\left(\frac{G_y}{G_x}\right) $$
(2)

Accurate edges of the coral reef image are identified using the Sobel operator, as shown in Fig. 3c. The Sobel operator convolves the coral reef image with a small, separable, integer-valued filter in the horizontal and vertical directions, and it is inexpensive in terms of computation. The kernels of the Sobel operator are shown in Eq. (3):

$$ {G}_x=\begin{bmatrix}-1 & 0 & 1\\ -2 & 0 & 2\\ -1 & 0 & 1\end{bmatrix}\quad {G}_y=\begin{bmatrix}1 & 2 & 1\\ 0 & 0 & 0\\ -1 & -2 & -1\end{bmatrix} $$
(3)

where Gx and Gy are the horizontal and vertical derivative approximations, respectively.
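As a concrete illustration, the gradient computation of Eqs. (1)-(3) can be sketched in NumPy; `convolve3x3` and `sobel_gradient` are hypothetical helper names, not part of the paper's implementation, and a production pipeline would typically use an optimized library routine instead.

```python
import numpy as np

# Sobel kernels from Eq. (3): horizontal (Gx) and vertical (Gy) derivatives.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = np.array([[ 1,  2,  1],
               [ 0,  0,  0],
               [-1, -2, -1]], dtype=float)

def convolve3x3(img, kernel):
    """Valid-mode 3x3 correlation over a 2-D grayscale image."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def sobel_gradient(img):
    """Return gradient magnitude |G| (Eq. 1) and direction alpha (Eq. 2)."""
    gx = convolve3x3(img, KX)
    gy = convolve3x3(img, KY)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    direction = np.arctan2(gy, gx)
    return magnitude, direction
```

For a vertical step edge, the magnitude peaks along the edge and the direction is 0 (gradient pointing horizontally), matching Eqs. (1) and (2).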

3.2 Feature extraction

DDVP extracts features by finding the local differences between the elements of the three planes along the four diagonal directions in RGB format. The original image is then transformed to the HSV color space, which is more perceptually uniform than other color spaces. To extract features, MDCP considers all elements for mean estimation and only the diagonal elements for applying direct codes.

A feature descriptor is efficient only when it improves accuracy while also reducing computational complexity. For feature extraction, it is not necessary to use every pixel in an image; considering all pixels makes the process time-consuming. It has been observed that diagonal elements contribute more than other elements to feature extraction [2, 3]. The novel DDVP from the RGB planes and MDCP from the HSV planes are concatenated to generate the feature vector, which is explained in detail in Section 4.

Although deep learning could handle both the feature extraction and classification steps, the purpose of this work is to develop a feature descriptor that can be paired with existing classifiers to classify coral reefs and their diseases with greater accuracy. A further objective is to make the method suitable for real-time deployment, where computational resources and time are constrained. The drawback of deep learning is its computational and time complexity, which makes it unsuitable for real-time implementation under tight time constraints [3].

3.3 Classification

For diseased coral reef image classification, the texture feature vector is provided as input to various classifiers. A Decision Tree (DT) classifier is a non-parametric supervised learning technique used for classification and regression. To date, DT classifiers have provided good results for large and imbalanced data sets, so in this work the performances of different DT classifiers are evaluated and assessed. The classifiers used are Decision Trees (DT), Classification And Regression Trees (CART), Random Forest (RF), C4.5, AdaBoost, Rotation Forest (RoF), SVM, KNN, CNN, PCCNN [3] and Naive Bayes.

Classification using a DT classifier is straightforward. The DT [22] constructs a model that classifies an image by learning simple decision rules inferred from the trained features. Random forest [15] is an ensemble method that considers a subset of observations and a subset of variables to build each decision tree; it builds several such trees and combines them to yield accurate classification. AdaBoost improves classification results by combining weak predictors. CART considers every possible subset of features for classification.

CART produces Regression Trees and Classification Trees. Both divide the feature space into distinct, non-overlapping regions, and both apply recursive binary splitting with a top-down greedy approach; CART grows the tree over the entire data and then prunes it back. C4.5 [17] is a statistical classifier that extends the ID3 algorithm and uses an information-entropy evaluation function for feature selection. In Rotation Forest (RoF), every decision tree is trained by first applying Principal Component Analysis (PCA) to a random subset of the input features; a number of such trees are combined to improve classification accuracy, each depending on the values of a random vector sampled independently for the forest.

The experiment starts by classifying with a DT using default parameters to obtain a baseline. By default, the minimum number of samples per leaf node is set to 1, which naturally makes the tree over-fit and contain all data points, including outliers, and a greedy approach is used to reduce similar trees. The features considered for decision making are common to all classifiers: the bin containing the maximum pixel value, the bin containing the minimum pixel value and the bin containing the median pixel value. The features considered for the diseased coral reef image data set with nine classes are shown in Fig. 4, where FV represents the Feature Vector of the proposed framework. The bin indices for the maximum, minimum and median pixel values range over 24 bins because the histogram bin size is 24; this is the same for all classifiers. The maximum tree depth is set to 12, so every feature considered gets a chance to become a decision node.

Fig. 4
figure 4

Decision Tree Classifier for coral reef diseased image

In Random Forest, the bin indices for the maximum, minimum and median pixel values likewise range over 24 bins, and the number of estimators is set to 9 to find the optimal classification. In AdaBoost, the number of weak learners and the maximum tree depth are the same as for the other classifiers; here the learning rate is an important parameter for fitting the weak learners well, and it is set to the default of 0.01, which fits more weak learners. The two important parameters in C4.5 are the Minimum cases (M) and the Confidence Level (CL): M should be high and is set to 9, while CL should be low and is set to 1. The number of iterations and the number of splits are the two main parameters in RoF, set to 9 and 2 respectively.
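Assuming a scikit-learn environment, the tree-based classifier settings described above (maximum depth 12, nine random-forest estimators, AdaBoost learning rate 0.01) might be configured as follows; the helper name `build_classifiers` and the synthetic 24-bin feature data are illustrative only, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier

def build_classifiers():
    """Classifiers configured with the parameter values quoted in the text
    (assumed to mirror the paper's settings; values are illustrative)."""
    return {
        "DT": DecisionTreeClassifier(max_depth=12, min_samples_leaf=1),
        "RF": RandomForestClassifier(n_estimators=9, max_depth=12),
        "AdaBoost": AdaBoostClassifier(learning_rate=0.01),
    }

# Example: synthetic 24-bin histogram feature vectors for a 9-class problem.
rng = np.random.default_rng(0)
X = rng.random((90, 24))            # 90 images, one 24-bin feature vector each
y = (X[:, 0] * 9).astype(int)       # nine classes derived from the first bin

for name, clf in build_classifiers().items():
    clf.fit(X, y)
    print(name, "training accuracy:", clf.score(X, y))
```

With a maximum depth of 12 the decision tree can represent enough splits to fit the training set closely, which is what the baseline over-fitting discussion above anticipates.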

4 The proposed framework

4.1 Background

There are many differences between normal and diseased coral reef images. For coral reef classification in the existing literature, texture features alone are considered, since each coral species has its own unique texture. To improve the accuracy of diseased coral reef image classification, features extracted from the RGB and HSV color spaces are combined with the textural features. For texture feature extraction, Ojala et al. [26, 27] and Heikkila et al. [12, 13] used the Local Binary Pattern (LBP), which suffers from the 'curse of dimensionality' due to the size of its feature vector.

4.2 Diagonal direction value pattern (DDVP)

Several feature descriptors are available in the literature for texture classification. Most of them make use of only the gray-scale values of the pixels in an image, but the relationship between the diagonal directional elements across the three planes of an image is not considered.

DDVP extracts features diagonally along four directions (45°, 135°, 225° and 315°) from the three planes efficiently. The speciality of DDVP is that the diagonal direction differences among pixel values in the three planes are considered while the histogram bin size used in the classification stage is reduced. The steps of the proposed feature descriptor are summarized as follows:

In an image I, let Ri, j, Gi, j and Bi, j be the centre pixels of a 3 × 3 local region termed Block Bk in each of the three planes, as shown in Fig. 5, where 1 ≤ k ≤ N and N is the total number of 3 × 3 blocks in the image I. Ri − 1, j + 1, Ri, j + 1, Ri + 1, j + 1, Ri + 1, j, Ri + 1, j − 1, Ri, j − 1, Ri − 1, j − 1 and Ri − 1, j are the eight neighbors of the centre pixel Ri, j in the Red plane. Gi − 1, j + 1, Gi, j + 1, Gi + 1, j + 1, Gi + 1, j, Gi + 1, j − 1, Gi, j − 1, Gi − 1, j − 1 and Gi − 1, j are the eight neighbors of the centre pixel Gi, j in the Green plane. Bi − 1, j + 1, Bi, j + 1, Bi + 1, j + 1, Bi + 1, j, Bi + 1, j − 1, Bi, j − 1, Bi − 1, j − 1 and Bi − 1, j are the eight neighbors of the centre pixel Bi, j in the Blue plane.

Fig. 5
figure 5

RGB plane images

4.2.1 Direction difference estimation

For a Block Bk, Rθ, Gθ, Bθ, RGθ, RBθ, GRθ, GBθ, BRθ and BGθ are constructed as shown in Eqs. (4) to (12) from the neighbors of the centre pixel in the three planes along the directions θ = ⟨45°, 135°, 225°, 315°⟩, as shown in Fig. 6.

$$ {R}_{B_k,\theta}\left[m\right]={R}_{i,j}-{R}_{i+u,j+v} $$
(4)
$$ {G}_{B_k,\theta}\left[m\right]={G}_{i,j}-{G}_{i+u,j+v} $$
(5)
$$ {B}_{B_k,\theta}\left[m\right]={B}_{i,j}-{B}_{i+u,j+v} $$
(6)
$$ {RG}_{B_k,\theta}\left[m\right]={R}_{i,j}-{G}_{i+u,j+v} $$
(7)
$$ {RB}_{B_k,\theta}\left[m\right]={R}_{i,j}-{B}_{i+u,j+v} $$
(8)
$$ {BR}_{B_k,\theta}\left[m\right]={B}_{i,j}-{R}_{i+u,j+v} $$
(9)
$$ {BG}_{B_k,\theta}\left[m\right]={B}_{i,j}-{G}_{i+u,j+v} $$
(10)
$$ {GR}_{B_k,\theta}\left[m\right]={G}_{i,j}-{R}_{i+u,j+v} $$
(11)
$$ {GB}_{B_k,\theta}\left[m\right]={G}_{i,j}-{B}_{i+u,j+v} $$
(12)
Fig. 6
figure 6

Illustration of direction difference estimation

where Bk is of size 3 × 3,

$$ 1\le \mathrm{m}\le 4, $$
$$ 1\le k\le N $$

and

$$ \left(\mathrm{u},\mathrm{v}\right)=\left\{\begin{array}{c}u=1,v=1, if\ \theta ={45}^{{}^{\circ}}\\ {}u=-1,v=1, if\ \theta ={135}^{{}^{\circ}}\\ {}\ u=-1,v=-1, if\ \theta ={225}^{{}^{\circ}}\\ {}u=1,v=-1, if\ \theta ={315}^{{}^{\circ}}\end{array}\ \right. $$
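The direction difference estimation of Eqs. (4)-(12) can be sketched as follows; `direction_differences` is a hypothetical helper operating on one 3 × 3 block centred at (i, j), and the integer casts guard against unsigned-image overflow (an implementation detail not stated in the paper).

```python
import numpy as np

# Diagonal offsets (u, v) for theta = 45, 135, 225 and 315 degrees.
OFFSETS = {45: (1, 1), 135: (-1, 1), 225: (-1, -1), 315: (1, -1)}

def direction_differences(R, G, B, i, j, theta):
    """Eqs. (4)-(12): differences between the centre pixel of a block
    (centred at (i, j)) and its diagonal neighbour along theta, within
    and across the R, G and B planes."""
    u, v = OFFSETS[theta]
    return {
        "R":  int(R[i, j]) - int(R[i + u, j + v]),   # Eq. (4)
        "G":  int(G[i, j]) - int(G[i + u, j + v]),   # Eq. (5)
        "B":  int(B[i, j]) - int(B[i + u, j + v]),   # Eq. (6)
        "RG": int(R[i, j]) - int(G[i + u, j + v]),   # Eq. (7)
        "RB": int(R[i, j]) - int(B[i + u, j + v]),   # Eq. (8)
        "BR": int(B[i, j]) - int(R[i + u, j + v]),   # Eq. (9)
        "BG": int(B[i, j]) - int(G[i + u, j + v]),   # Eq. (10)
        "GR": int(G[i, j]) - int(R[i + u, j + v]),   # Eq. (11)
        "GB": int(G[i, j]) - int(B[i + u, j + v]),   # Eq. (12)
    }
```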

4.2.2 Vector generation using four diagonal directions in three planes

For each R, G and B plane, the vectors are generated along diagonal directions. Vector \( {R}_{B_k,\theta } \) is constructed using R, RB and RG values, whereas vector \( {G}_{B_k,\theta } \) is constructed using G, GR and GB values and vector \( {B}_{B_k,\theta } \) is constructed using B, BR and BG values. The vectors generated for three planes are shown in Eqs. (13) to (15).

$$ {R}_{B_k,\theta}\left[i\right]={\left\langle {R}_{\theta },{RB}_{\theta },{RG}_{\theta}\right\rangle}_{B_k} $$
(13)
$$ {G}_{B_k,\theta}\left[i\right]={\left\langle {G}_{\theta },{GB}_{\theta },{GR}_{\theta}\right\rangle}_{B_k} $$
(14)
$$ {B}_{B_k,\theta}\left[i\right]={\left\langle {B}_{\theta },{BR}_{\theta },{BG}_{\theta}\right\rangle}_{B_k} $$
(15)

where θ = < 45°, 135°, 225° and 315° > ,

1 ≤ i ≤ 4 and 1 ≤ k ≤ N.

Now, \( {R}_{B_k,\theta },{G}_{B_k,\theta } \), and \( {B}_{B_k,\theta } \) vectors may contain negative and non-negative values. Binary codes (0 and 1) are assigned depending on negative and non-negative values.

4.2.3 Binary codes assignment

The binary codes are assigned for three planes as given by Eqs. (16) to (18). Negative and non-negative values are converted into binary codes either 0 or 1.

$$ \mathrm{BC}{R}_{B_k,\theta}\left[a\right]=\left\{\begin{array}{c}1,{R}_{B_k,\theta}\left[i\right]>0\ \\ {}0, Otherwise\end{array}\right. $$
(16)
$$ \mathrm{BC}{G}_{B_k,\theta}\left[a\right]=\left\{\begin{array}{c}1,{G}_{B_k,\theta}\left[i\right]>0\\ {}0, Otherwise\end{array}\right. $$
(17)
$$ {\mathrm{BC}B}_{B_k,\theta}\left[a\right]=\left\{\begin{array}{c}1,{B}_{B_k,\theta}\left[i\right]>0\\ {}0, Otherwise\end{array}\right. $$
(18)

where θ =  < 45°, 135°, 225° and 315° > ,

$$ 1\le i\le 3, $$
$$ 1\le k\le N\ \mathrm{and} $$
$$ 1\le \mathrm{a}\le 3. $$
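A minimal sketch of the binary code assignment in Eqs. (16)-(18), assuming the nine differences of Eqs. (4)-(12) are supplied as a dictionary; `diffs` and `binary_codes` are illustrative names:

```python
def binary_codes(diffs):
    """Eqs. (16)-(18): map the per-plane difference vectors
    <R, RB, RG>, <G, GB, GR> and <B, BR, BG> to binary codes:
    1 for a positive difference, 0 otherwise."""
    bcr = [1 if diffs[k] > 0 else 0 for k in ("R", "RB", "RG")]
    bcg = [1 if diffs[k] > 0 else 0 for k in ("G", "GB", "GR")]
    bcb = [1 if diffs[k] > 0 else 0 for k in ("B", "BR", "BG")]
    return bcr, bcg, bcb
```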

4.2.4 Diagonal direction value assignment

DDV is assigned for three plane vectors estimated. Depending on the position of the binary codes of three planes, the diagonal directional value is assigned for diagonal directions as shown in Eq. (19) to Eq. (21).

$$ \mathrm{DDV}\_\mathrm{BC}R\left[{}_{B_k,\theta}\right]=\left\{\begin{array}{c}1, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=0\\ {}2, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=1\ \\ {}3, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=0\ \\ {}4, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=1\\ {}5, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=0\\ {}6, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=1\\ {}7, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=0\\ {}8, if\ \mathrm{BC}{R}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{R}_{B_k,\theta}\left[3\right]=1\end{array}\right. $$
(19)
$$ \mathrm{DDV}\_\mathrm{BC}G\left[{}_{B_k,\theta}\right]=\left\{\begin{array}{c}1, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=0\\ {}2, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=1\ \\ {}3, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=0\ \\ {}4, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=1\\ {}5, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=0\\ {}6, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=1\\ {}7, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=0\\ {}8, if\ \mathrm{BC}{G}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{G}_{B_k,\theta}\left[3\right]=1\end{array}\right. $$
(20)
$$ \mathrm{DDV}\_\mathrm{BC}B\left[{}_{B_k,\theta}\right]=\left\{\begin{array}{c}1, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=0\\ {}2, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=1\ \\ {}3, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=0\ \\ {}4, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=1\\ {}5, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=0\\ {}6, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=0\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=1\\ {}7, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=0\\ {}8, if\ \mathrm{BC}{B}_{B_k,\theta}\left[1\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[2\right]=1\ and\ \mathrm{BC}{B}_{B_k,\theta}\left[3\right]=1\end{array}\right. $$
(21)

where θ = 45°, 135°, 225° and 315°.
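The eight-way case tables of Eqs. (19)-(21) amount to reading the 3-bit binary code as a number and adding one, which a short sketch makes explicit; `diagonal_direction_value` is an illustrative name:

```python
def diagonal_direction_value(bits):
    """Eqs. (19)-(21): the eight cases correspond to the 3-bit binary
    code [b1, b2, b3] interpreted as a number, plus one."""
    b1, b2, b3 = bits
    return 4 * b1 + 2 * b2 + b3 + 1
```

The same function serves all three planes, since Eqs. (19), (20) and (21) use an identical case table.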

4.2.5 Summed diagonal direction value

The obtained DDV values are summed together along the four diagonal directions. Summed Diagonal Direction Value (SDDV) is estimated using Eq. (22).

$$ \mathrm{SDDV}\left({B}_k,\theta \right)=\left|\mathrm{DDV}\_\mathrm{BC}R\left[{}_{B_k,\theta}\right]+\mathrm{DDV}\_\mathrm{BC}G\left[{}_{B_k,\theta}\right]+\mathrm{DDV}\_\mathrm{BC}B\left[{}_{B_k,\theta}\right]\right| $$
(22)

where θ =  < 45°, 135°, 225° and 315° > .

Summed Diagonal Direction Value Vector (SDDVV) is estimated using Eq. (23)

$$ SDDVV=\left\langle SDDV\left({B}_1,\theta \right), SDDV\left({B}_2,\theta \right),\dots, SDDV\left({B}_{N-1},\theta \right), SDDV\left({B}_N,\theta \right)\right\rangle $$
(23)

where θ = < 45°, 135°, 225° and 315° > and

$$ 1\le k\le N. $$

For an entire image, SDDVV is considered as the feature vector. This is shown in Eq. (24).

$$ \mathrm{i}.\mathrm{e}.\mathrm{Feature}\ \mathrm{vector}\ \left(\mathrm{I}\right)= SDDVV $$
(24)

Figure 7 explains the algorithm for feature extraction using DDVP.

Fig. 7
figure 7

DDVP Algorithm

4.3 Mean direct code pattern (MDCP)

Many feature descriptors exploit the relationship between neighboring pixels in a gray-scale image based on a co-occurrence matrix, but the relationship between the diagonal neighbors is not considered. The proposed feature descriptor considers this relationship by providing a direct code pattern. The new feature descriptor, named MDCP, extracts texture features of diseased coral reef images in the HSV color space. The RGB color space cannot represent color in terms of human perception; the pixel variations due to illumination and intensity changes are overcome by HSV, which is why HSV is used in this work. First the color image is converted into the HSV color space. The steps of the proposed feature descriptor are summarized in the following subsections:

4.3.1 Mean estimation for HSV images

The image is transformed to the HSV color space as shown in Fig. 9. In a diseased coral reef image, consider a Block Bk of size 3 × 3, where 1 ≤ k ≤ N and N is the total number of blocks in a frame. Let Ei, j be the centre pixel of the 3 × 3 Block Bk, as shown in Fig. 8. Ei − 1, j + 1, Ei, j + 1, Ei + 1, j + 1, Ei + 1, j, Ei + 1, j − 1, Ei, j − 1, Ei − 1, j − 1 and Ei − 1, j are the eight neighbors of the centre pixel Ei, j, of which Ei − 1, j + 1, Ei + 1, j + 1, Ei − 1, j − 1 and Ei + 1, j − 1 are the four diagonal elements. Here, the input consists of three channels, i.e. the H-plane, S-plane and V-plane, as shown in Fig. 9.

Fig. 8
figure 8

Diagonal elements in a block are highlighted

Fig. 9
figure 9

HSV plane images

The mean value of Block Bk is computed as in Eq. (25) separately for each plane as in Eq. (26), Eq. (27) and Eq. (28).

$$ Mean= average\ of\ nine\ elements\ in\ a\ block $$
(25)
$$ HMean\left[k\right]=\frac{1}{n}\ \sum \limits_{i=1}^nH{\left[i\right]}_{\left({B}_k\right)} $$
(26)
$$ SMean\left[k\right]=\frac{1}{n}\ \sum \limits_{i=1}^nS{\left[i\right]}_{\left({B}_k\right)} $$
(27)
$$ VMean\left[k\right]=\frac{1}{n}\ \sum \limits_{i=1}^nV{\left[i\right]}_{\left({B}_k\right)} $$
(28)

where 1 ≤ k ≤ N, N is the total number of blocks and n = 9 is the number of elements in a Block Bk.
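The per-plane block means of Eqs. (26)-(28) can be sketched as follows; `block_means` is an illustrative helper, with the block taken as the 3 × 3 window centred at (i, j):

```python
import numpy as np

def block_means(H, S, V, i, j):
    """Eqs. (26)-(28): mean of the nine elements of the 3x3 block
    centred at (i, j), computed separately for the H, S and V planes."""
    window = (slice(i - 1, i + 2), slice(j - 1, j + 2))
    return H[window].mean(), S[window].mean(), V[window].mean()
```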

As shown in Fig. 10, more importance is given to the diagonal elements, since the diagonal directions along 45°, 135°, 225° and 315° are covered very efficiently (Fig. 10a). To improve accuracy, the diagonal elements are weighted more heavily than the vertical and horizontal neighbors: only the diagonal elements are compared with the block mean and used to assign the direct code, which then serves as the feature value for that block. This pattern is efficient because it depends chiefly on the diagonal elements.

Fig. 10
figure 10

(a) Diagonal elements are covered

4.3.2 Diagonal neighbors estimation for HSV images

In a Block Bk, for each of the H-, S- and V-planes, the mean value is compared with the four diagonal elements Ni, i = 1 to 4. If a diagonal element is greater than or equal to the block mean, it is assigned the code 1; otherwise it is assigned 0, as shown in Eqs. (29) to (31).

$$ BC{H}_{B_K}\left[j\right]=\left\{\begin{array}{c}1,{\left[{N}_i\right]}_{B_K}\ge HMean\left[k\right]\\ {}0,{\left[{N}_i\right]}_{B_K}< HMean\left[k\right]\end{array}\right. $$
(29)
$$ BC{S}_{B_K}\left[j\right]=\left\{\begin{array}{c}1,{\left[{N}_i\right]}_{B_K}\ge SMean\left[k\right]\\ {}0,{\left[{N}_i\right]}_{B_K}< SMean\left[k\right]\end{array}\right. $$
(30)
$$ BC{V}_{B_K}\left[j\right]=\left\{\begin{array}{c}1,{\left[{N}_i\right]}_{B_K}\ge VMean\left[k\right]\\ {}0,{\left[{N}_i\right]}_{B_K}< VMean\left[k\right]\end{array}\right. $$
(31)

where 1 ≤ j ≤ 4 and the diagonal neighbors are taken at positions (i − m, j + n), (i − m, j − n), (i + m, j + n) and (i + m, j − n) with (m, n) = (1, 1).
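A sketch of the diagonal comparison in Eqs. (29)-(31) for a single plane; `diagonal_binary_code` is an illustrative name, and the neighbour ordering follows the enumeration of diagonal elements given earlier:

```python
def diagonal_binary_code(plane, i, j, mean):
    """Eqs. (29)-(31): compare the four diagonal neighbours of the
    centre pixel (i, j) with the block mean; 1 if neighbour >= mean,
    else 0."""
    diagonals = [plane[i - 1][j + 1], plane[i + 1][j + 1],
                 plane[i - 1][j - 1], plane[i + 1][j - 1]]
    return [1 if d >= mean else 0 for d in diagonals]
```

The same function is applied to the H, S and V planes with their respective block means.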

4.3.3 Assigning direct code for HSV images

Direct codes are assigned for a Block Bk depending on the values of the diagonal neighbors. Each of the four neighbor values is either 0 or 1; depending on these values, a particular direct code is assigned for the H plane as shown in Eq. (32), for the S plane as shown in Eq. (33) and for the V plane as shown in Eq. (34), and as illustrated in Fig. 11.

Fig. 11
figure 11

Direct code pattern

$$ DC\_H\left[{B}_K\right]=\left\{\begin{array}{c}1, if\ {BCH}_{B_k}=0000\\ {}2, if\ {BCH}_{B_k}=0001\\ {}3, if\ {BCH}_{B_k}=0010\\ {}4, if\ {BCH}_{B_k}=0100\\ {}5, if\ {BCH}_{B_k}=1000\\ {}6, if\ {BCH}_{B_k}=1100\\ {}7, if\ {BCH}_{B_k}=1010\\ {}8, if\ {BCH}_{B_k}=1001\\ {}9, if\ {BCH}_{B_k}=0110\\ {}10, if\ {BCH}_{B_k}=0101\\ {}11, if\ {BCH}_{B_k}=0011\\ {}12, if\ {BCH}_{B_k}=1110\\ {}13, if\ {BCH}_{B_k}=1011\\ {}14, if\ {BCH}_{B_k}=1101\\ {}15, if\ {BCH}_{B_k}=0111\\ {}16, if\ {BCH}_{B_k}=1111\end{array}\right. $$
(32)
$$ DC\_S\left[{B}_K\right]=\left\{\begin{array}{c}1, if\ {BCS}_{B_k}=0000\\ {}2, if\ {BCS}_{B_k}=0001\\ {}3, if\ {BCS}_{B_k}=0010\\ {}4, if\ {BCS}_{B_k}=0100\\ {}5, if\ {BCS}_{B_k}=1000\\ {}6, if\ {BCS}_{B_k}=1100\\ {}7, if\ {BCS}_{B_k}=1010\\ {}8, if\ {BCS}_{B_k}=1001\\ {}9, if\ {BCS}_{B_k}=0110\\ {}10, if\ {BCS}_{B_k}=0101\\ {}11, if\ {BCS}_{B_k}=0011\\ {}12, if\ {BCS}_{B_k}=1110\\ {}13, if\ {BCS}_{B_k}=1011\\ {}14, if\ {BCS}_{B_k}=1101\\ {}15, if\ {BCS}_{B_k}=0111\\ {}16, if\ {BCS}_{B_k}=1111\end{array}\right. $$
(33)
$$ DC\_V\left[{B}_K\right]=\left\{\begin{array}{c}1, if\ {BCV}_{B_k}=0000\\ {}2, if\ {BCV}_{B_k}=0001\\ {}3, if\ {BCV}_{B_k}=0010\\ {}4, if\ {BCV}_{B_k}=0100\\ {}5, if\ {BCV}_{B_k}=1000\\ {}6, if\ {BCV}_{B_k}=1100\\ {}7, if\ {BCV}_{B_k}=1010\\ {}8, if\ {BCV}_{B_k}=1001\\ {}9, if\ {BCV}_{B_k}=0110\\ {}10, if\ {BCV}_{B_k}=0101\\ {}11, if\ {BCV}_{B_k}=0011\\ {}12, if\ {BCV}_{B_k}=1110\\ {}13, if\ {BCV}_{B_k}=1011\\ {}14, if\ {BCV}_{B_k}=1101\\ {}15, if\ {BCV}_{B_k}=0111\\ {}16, if\ {BCV}_{B_k}=1111\end{array}\right. $$
(34)

where 1 ≤ k ≤ N.
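Note that the ordering of the 16 patterns in Eqs. (32) to (34) is not a plain binary count (e.g. 0100 maps to 4 but 1000 maps to 5), so a lookup table is the simplest faithful implementation. A minimal sketch, with the hypothetical helper `direct_code`:

```python
# Lookup table reproducing the bit-pattern order of Eqs. (32)-(34);
# the same table serves the H, S and V planes.
DIRECT_CODE = {
    "0000": 1,  "0001": 2,  "0010": 3,  "0100": 4,
    "1000": 5,  "1100": 6,  "1010": 7,  "1001": 8,
    "0110": 9,  "0101": 10, "0011": 11, "1110": 12,
    "1011": 13, "1101": 14, "0111": 15, "1111": 16,
}

def direct_code(bits):
    """Map a 4-bit binary code (list of 0/1) to its direct code 1..16."""
    return DIRECT_CODE["".join(str(b) for b in bits)]

print(direct_code([0, 1, 1, 1]))   # -> 15
```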

4.3.4 Summed direct code pattern

The texture feature vector obtained has 16 bins for each of the H-plane, S-plane and V-plane. The feature vectors of the three planes are summed as shown in Eq. (35).

$$ SDC\left[{B}_k\right]=\sum \limits_{k=1}^N\left| DC\_H\left[{B}_K\right]+ DC\_S\left[{B}_K\right]+ DC\_V\left[{B}_K\right]\right| $$
(35)

The resultant feature vector for a particular block Bk, after combining the three planes, has 48 bins. This 48-bin histogram is then normalised to half its bin size, giving a 24-bin histogram, which provides efficient feature extraction.

For an entire image, SDC is considered as the feature vector. This is shown in Eq. (36).

$$ \mathrm{SDDC}=\frac{1}{2}\ SDC\left[{B}_k\right] $$
(36)

Feature vector (I) = SDDC
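Equations (35) and (36) can be sketched as follows, assuming the per-block direct codes of the three planes are held in parallel arrays and that the halving of Eq. (36) is applied to the per-block sums; the helper name `summed_direct_code` is hypothetical.

```python
import numpy as np

def summed_direct_code(dc_h, dc_s, dc_v):
    """Sum the three planes' direct codes per block (Eq. 35), then halve
    the result to obtain the SDDC feature (Eq. 36).
    dc_h, dc_s, dc_v: 1-D arrays of direct codes, one entry per block."""
    sdc = np.abs(dc_h + dc_s + dc_v)   # Eq. (35): |DC_H + DC_S + DC_V|
    return 0.5 * sdc                   # Eq. (36): SDDC = SDC / 2

# Direct codes for three example blocks on each plane
dc_h = np.array([3, 16, 7])
dc_s = np.array([5, 2, 7])
dc_v = np.array([4, 6, 10])
print(summed_direct_code(dc_h, dc_s, dc_v))   # -> [ 6. 12. 12.]
```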

4.4 Feature vector concatenation

The feature vectors obtained for DDVP and MDCP are concatenated to provide an efficient feature descriptor, as shown in Eq. (37).

$$ \mathrm{Feature}\ \mathrm{vector}\ \left(\mathrm{I}\right)=< SDDVV\ SDDC> $$
(37)

Figure 12 explains the algorithm of MDCP method. Appendix 1 explains the proposed method.

Fig. 12
figure 12

MDCP Algorithm

5 Experimental evaluation

5.1 Data sets description

The real-time diseased coral reef database was obtained from the Suganthi Devadasan Marine Research Institute (SDMRI) (http://www.sdmri.in), Beach Road, on the Tuticorin coast. The diseased coral reef images were taken in a shallow-water reef area at a depth of two metres at Shingle Island, between September 2017 and February 2018. The diseased coral reef images are classified into nine classes as shown in Fig. 13. The data set consists of 878 images at a resolution of 640 × 480; all images in the data set are diseased images.

Fig. 13
figure 13

SDMRI coral diseases

5.2 Quantitative comparison

Implementation is carried out using MATLAB 2016a on an Intel® Pentium® (Dual Core) processor with 4 GB memory and a 2 TB hard drive. The performance of the various classifiers is compared using metrics such as specificity, sensitivity, accuracy, F-measure, time and the Matthews correlation coefficient. To estimate the effectiveness of the proposed technique, the samples in the data set are split in the following ratios: 90% training / 10% testing, 75% training / 25% testing, and 50% training / 50% testing.
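The three split ratios can be sketched as below. This is an illustrative Python sketch, not the authors' MATLAB implementation; the `split` helper and the zero-filled stand-ins for the 878 SDMRI feature vectors are assumptions for demonstration only.

```python
import numpy as np

def split(X, y, test_frac, seed=0):
    """Shuffle indices and split data into train/test at the given fraction."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(len(X) * test_frac))
    te, tr = idx[:n_test], idx[n_test:]
    return X[tr], X[te], y[tr], y[te]

# Stand-ins for 878 feature vectors (24 bins each) and their class labels
X = np.zeros((878, 24))
y = np.zeros(878, dtype=int)

for frac in (0.10, 0.25, 0.50):      # 90/10, 75/25 and 50/50 splits
    X_tr, X_te, y_tr, y_te = split(X, y, frac)
    print(len(X_tr), len(X_te))      # -> 790 88 / 658 220 / 439 439
```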

5.2.1 F-measure

The F-measure is a metric used to test accuracy; it considers both precision and recall. In Table 1, TE denotes testing, TR denotes training, Prec denotes precision, Recal denotes recall and Fm denotes F-measure.

Table 1 Comparison of feature descriptors with F-measure (Fm) on various coral Diseases data sets. The highest OA (%) obtained for each data set is shown in bold
$$ \mathrm{F}-\mathrm{measure}=\frac{2\ast Prec\ast Recal}{Prec+ Recal} $$
(38)
$$ \mathrm{Prec}=\frac{Number\ of\ images\ classified\ accurately}{Total\ number\ of\ images\ classified} $$
(39)
$$ \mathrm{Recal}=\frac{Number\ of\ images\ classified\ accurately}{Total\ number\ of\ images\ in\ the\ database} $$
(40)
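Equation (38) can be sketched directly; the counts below are illustrative, not from the paper's tables, and follow the precision and recall definitions of Eqs. (39) and (40).

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (Eq. 38)."""
    return 2 * precision * recall / (precision + recall)

# Illustrative counts: 80 test images classified, 72 of them correctly,
# out of 88 images in the test set.
prec = 72 / 80     # Eq. (39): correct / total classified = 0.90
recal = 72 / 88    # Eq. (40): correct / total in set ~ 0.818
print(round(f_measure(prec, recal), 3))   # -> 0.857
```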

Table 1 presents the classification accuracy for the diseased coral reef data sets in comparison with existing feature descriptors. The proposed feature descriptor provides better classification accuracy than the existing feature descriptors. Since the data sets depend on both color and texture features, the proposed feature descriptor improves accuracy by 9% and 10% respectively over the existing feature descriptors. On the SDMRI real-time diseased coral reef data set, after the proposed descriptor, the OPT, Z ⊕ TZLBP and ILDP feature descriptors provide improved results compared to CS-LBP, LBP, DLBP, PRI-CoLBP, LTxXORP, CLBP and OC-LBP. In OPT, pixels are covered in eight diagonal directions to improve accuracy. In ILDP, diagonal directions and diagonal neighbors are used for estimation, which covers all pixels in an image efficiently and improves accuracy. In Z ⊕ TZLBP, all pixels are covered using Z and TZ (tilted Z) patterns, where the centre pixel is also considered as one of the neighbors. ILDP, OPT and Z ⊕ TZLBP consider only texture features, whereas the proposed feature descriptor considers textural features from the RGB and HSV color planes, which provides more efficient results.

5.2.2 Accuracy

$$ Accuracy=\frac{Images\ classified\ correctly}{All\ images\ classified} $$
(41)

In Table 2, the accuracy results of the proposed framework are compared with previous submarine image classification approaches. The proposed framework is not directly comparable with the results obtained by previous approaches in the literature, as the experimental setups differ. Even though no pre-processing techniques are used, the proposed framework achieves good results on various data sets. The recent techniques of Shihavuddin et al. [33], Ani et al. [2], Ani et al. [1], Ani et al. [3] and Mohammad et al. [32], together with the proposed method, are shown in Table 2. The table indicates that the proposed framework and OPT use a quicker and simpler classification framework, which provides outstanding results in coral reef image classification compared to existing coral reef classification frameworks.

Table 2 Overall accuracy (OA) (%) of the previous coral datasets with techniques and the proposed framework used for coral data sets

For the EILAT coral data set, the DT classifier provides the highest classification accuracy, as shown in Table 3. DT is a well-known classifier for multi-class classification. For the EILAT2 coral data set, the PCCNN classifier provides the highest classification accuracy. For the MLC 2012 coral data set and the SDMRI coral diseased data set, Random Forest provides the highest classification accuracy. For the RSMAS coral data set and the SDMRI real-time data set, SVM, PCCNN and KNN provide the highest classification accuracy compared to the other classifiers. The dimensionality of the proposed feature vector is very low, which reduces the complexity during classification (Table 4).

Table 3 Testing of the classification algorithms for coral data sets. The highest accuracy (%) acquired for every data set is highlighted in bold

5.2.3 Specificity

Specificity is defined as the true negative rate, which estimates the percentage of correct rejections of diseased coral reef images.

$$ Specificity=\frac{TN}{TN+ FP} $$
(42)

where TN represents True Negative rate and FP represents False Positive rate.

The specificity of the classifiers is excellent since the true negatives are mostly identified, as shown in Table 5. The specificity of the tree classifiers ranges from 94 to 100% because the true negative rate is good, i.e. misclassification of diseased images is low. PCCNN, SVM and KNN show excellent results due to majority voting and support vectors. This also indicates that the tree classifiers are selective. The DT classifier has better overall specificity than the other classifiers since it tries to maintain some prediction strength. RF also has a specificity close to the DT classifier because its generalization error in classification is limited. The other tree classifiers, AdaBoost, RoF, CART and C4.5, obtain values that range from 20 to 70% due to the smaller number of trees grown. For Naive Bayes, the specificity ranges from only 16 to 49%, which is low due to its misclassification rate. For the Black band class, SVM, PCCNN, KNN and Random Forest achieve perfect specificity. PCCNN and SVM also obtain the highest results for White plague, White band and Yellow band.

5.2.4 Sensitivity

Sensitivity is a metric that reveals how correctly a classifier classifies the true positives. It estimates the percentage of correctly classified diseased coral reef images.

$$ Sensitivity=\frac{TP}{TP+ FN} $$
(43)

where TP represents the True Positive rate and FN represents the False Negative rate.
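Equations (42) and (43) can be computed directly from the per-class confusion counts; the counts below are illustrative, assuming one disease class is scored one-vs-rest against the remaining classes.

```python
def specificity(tn, fp):
    """True negative rate (Eq. 42)."""
    return tn / (tn + fp)

def sensitivity(tp, fn):
    """True positive rate (Eq. 43)."""
    return tp / (tp + fn)

# Illustrative one-vs-rest counts for a single disease class
tp, fn, tn, fp = 45, 5, 760, 68
print(round(sensitivity(tp, fn), 2))   # -> 0.9
print(round(specificity(tn, fp), 2))   # -> 0.92
```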

Sensitivity is good for classes with many sample images and fair for classes with fewer sample images. Table 6 shows that for diseased image classes with a large number of samples, such as Black spot dark spot, White band and Pink spot, KNN gives the highest sensitivity, but for the classes with the fewest samples, such as Aspergillosis and Black band, it is not sensitive. DT has the highest overall sensitivity among the decision tree classifiers because of its procedure of creating many trees. It is able to classify a few sample images from the classes with fewer samples, i.e. White pox, whereas the other classifiers are not, as observed in column 9 of Table 6. SVM and PCCNN have the highest sensitivity for the classes with fewer samples such as White plague and Yellow band. The Naive Bayes, AdaBoost, CART, C4.5 and Random Forest classifiers are also sensitive to classes with fewer samples, especially Aspergillosis. However, the Random Forest classifier and PCCNN have the highest sensitivity for the Black band class among all the classifiers implemented because they do not need many features to perform well.

5.2.5 Mathew’s correlation coefficient (MCC)

It is considered one of the most effective and accurate performance parameters for any classification task. MCC takes values in the interval [−1, 1], where 1 indicates that the classifier classifies the diseased images correctly and −1 indicates that it classifies them incorrectly. MCC is estimated as shown in Eq. (44) for the real-time diseased data set (http://www.sdmri.in):

$$ MCC=\frac{TP\times TN- FP\times FN}{\sqrt{\left( TP+ FP\right)\left( TP+ FN\right)\left( TN+ FP\right)\left( TN+ FN\right)}} $$
(44)

where FN represents False Negative, TP represents True Positive, TN represents True Negative and FP represents False Positive.
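The MCC formula above can be sketched as follows; the confusion counts are illustrative, not taken from the paper's tables, and the degenerate case where any factor of the denominator is zero is conventionally mapped to 0.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion counts.
    Returns 0.0 when the denominator vanishes (by convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(mcc(10, 10, 0, 0))                 # perfect classifier -> 1.0
print(round(mcc(45, 760, 68, 5), 3))     # illustrative counts -> 0.566
```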

The overall highest Matthews correlation coefficient (MCC) values are obtained for SVM, CNN, PCCNN and KNN because they are accurate and powerful classifiers, as shown in Table 7. The C4.5 and Naive Bayes classifiers have the lowest MCC values, due to overfitting in C4.5 and the strong assumption on the shape of the feature distribution in Naive Bayes. From the study, it is found that the RF, PCCNN and DT classifiers perform better than the other classifiers on the performance measures accuracy, specificity, F-measure, time and the Matthews correlation coefficient for coral disease classification.

5.2.6 Time

It is essential to analyze the time taken by the various feature descriptors, classifiers and existing techniques to classify the diseased coral reef images, so as to make them usable for real-time implementations. Tables 8, 9 and 10 show the time estimated for the feature descriptors, the classifiers and the comparison with existing techniques. A smaller feature size helps in fast and efficient classification. In Table 8, the proposed feature descriptor is about two times faster to compute than LBP, CLBP, RLTP and LTrP. Even though LDP, ILDP, OPT and Z ⊕ TZLBP take less time, the proposed descriptor improves accuracy by 2% over them.

5.2.7 Computational complexity

LBP, LDP, CLBP, DLBP, LTxXORP, CS-LBP, OC-LBP, PRI-CoLBP, ILDP, Z ⊕ TZLBP and the proposed feature descriptor have linear complexity: when there is a single iteration whose variables increment linearly, the time complexity is O(n). The RLTP and LTrP techniques have quadratic complexity. This is shown in Table 11.

5.2.8 Space complexity

The space complexity of an algorithm is the amount of storage it uses. The feature vector obtained has a histogram bin size in the range 0 to 256 for the LBP, LDP, CLBP, DLBP, LTxXORP and LTrP feature descriptors, whereas for RLTP it is between 0 and 512, which is excessive. Hence the size of the feature vector and its bin size have an impact on the storage requirement. In ILDP and OC-LBP, the bin size is between 0 and 32. For Z ⊕ TZLBP and CS-LBP, the bin size is between 0 and 16. For OPT and the proposed technique, the bin size is between 0 and 24. This is presented in Table 11.

6 Results and discussions

6.1 Accuracy

The accuracy of the various classifiers implemented on the same SDMRI diseased coral data set with different feature classes is compared in Fig. 14. It is observed from Table 4 that DT and PCCNN produce higher classification rates of approximately 89-92%, which is greater than the other classifiers. RF performs well because it produces accurate predictions without overfitting the data. SVM (RBF kernel), CNN and KNN also provide better results compared to the other classifiers. DT and PCCNN provide the highest overall accuracies of 90.91% and 89.85%, and the highest accuracy for the diseased classes Pink spot, Aspergillosis, Black spot dark band and Yellow band. The KNN classifier also performs well: it provides the highest accuracy for the White plague, Black band and White pox disease classes. This is because White plague, Black band and White pox have larger samples compared to the other classes and KNN is capable of classifying them using majority voting.

Fig. 14
figure 14

Overall Accuracy of classifiers

Table 4 Accuracy calculation for SDMRI coral diseased images with DT Classifiers. The highest accuracy (%) acquired for every data set is highlighted in bold
Table 5 Specificity calculation for various coral diseases with DT Classifiers. The highest Specificity (%) acquired for every data set is highlighted in bold
Table 6 Sensitivity calculation for various coral classes with DT classifiers. The highest Sensitivity (%) acquired for every data set is highlighted in bold
Table 7 MCC Estimation
Table 8 Average time reported for the different feature descriptors using SDMRI Diseased data set
Table 9 Time taken by various classifiers for SDMRI diseased data set
Table 10 Comparison of the total execution time (in sec.) of SDMRI diseased data set with the existing and the proposed approaches
Table 11 Computational Complexity and Bin Size for various feature descriptors

The accuracy of the various classifiers implemented on the SDMRI real-time diseased coral data set with different feature classes is compared in Fig. 14. Among the various classifiers, DT, PCCNN and RF provide good classification accuracy. KNN also provides better classification results compared to the other classifiers. Classifiers such as AdaBoost and CART provide much lower classification accuracy because AdaBoost is sensitive to noisy data and outliers; compared to the other classifiers, AdaBoost also suffers from overfitting.

6.2 Specificity

As shown in Fig. 15, SVM, PCCNN, KNN, DT and RF show the highest specificity of 90-96%. Naive Bayes shows the lowest specificity due to its strong feature-independence assumptions. The other classifiers, AdaBoost, C4.5, CART and Rotation Forest, produce average results of 60-70%.

Fig. 15
figure 15

Specificity

6.3 Sensitivity

As shown in Fig. 16, PCCNN, CNN, SVM, KNN, DT and Random Forest report sensitivity of 80-88%. CART and C4.5 show the lowest sensitivity due to unstable decision tree formation and empty branches. The other classifiers, AdaBoost, Naive Bayes and Rotation Forest, produce average results of 45-52%.

Fig. 16
figure 16

Sensitivity

6.4 Time

The time taken by the various classifiers is shown in Table 9. The Naive Bayes classifier takes the least time of all the classifiers studied, but it is not as good on the other measures. AdaBoost, CNN and Rotation Forest, unlike Random Forest, take more time compared to the other classifiers, since they take more time to build their models. CART takes less time, but its other measures are not good. Random Forest and PCCNN perform well in all aspects because bootstrapping reduces both bias and variance, which makes RF more accurate and robust. AdaBoost has an average performance in all aspects, but in terms of time taken it is slower than Random Forest.

The time taken by the existing approaches is shown in Table 10. Even though Shihavuddin et al. [33] made a good contribution to accuracy for various coral data sets, their time complexity is higher due to the use of several feature descriptors such as GLCM, Gabor and CLBP. Shiela [23], Stokes [8], Oscar [29], Beijbom [5], Guo [10] and Mohammad [32] provided good classification accuracy with reasonable time complexity, but took no steps to reduce it. Ani et al. [1,2,3] concentrated on both improving accuracy and reducing time complexity by reducing the feature vector size of the proposed method.

6.5 Conclusion and future work

The proposed feature descriptors are competent for submarine diseased coral reef images. This paper presents test results on real-time diseased coral reef data, i.e. the SDMRI real-time diseased coral image data sets. The proposed feature descriptor extracts the important diagonal information while reducing the size of the feature vector. The combination of the proposed feature descriptor, segmentation and various classifiers attains the best results. MDCP achieves the highest overall classification accuracy and the smallest execution time. Moreover, the proposed feature descriptor is about two times faster to compute than LBP. Therefore, it can produce promising results for a large class of problems.

Even though many feature descriptors are available to improve classification accuracy, very few works have been reported that reduce the size of the feature vector. The CS-LBP, ILDP, Z ⊕ TZLBP and OC-LBP operators reduce the size of the original LBP histogram while improving accuracy, but these descriptors cover every pixel in an image, which is a time-consuming process. Compared to these techniques, the proposed feature descriptor shows its efficiency in classification, feature vector size and time complexity. It outperforms the OPT, CS-LBP, OC-LBP, ILDP and Z ⊕ TZLBP patterns with the smallest histogram size and maximum accuracy for coral reef image classification.

In this work, various classifiers are studied and their performance is analysed and compared using various performance measures. It is concluded that Random Forest and Decision Tree perform well in handling both classes with fewer sample images and classes with larger samples. The Naive Bayes classifier has the lowest performance of the classifiers compared.

Experimental results indicate that the proposed feature extraction method achieves the highest overall classification accuracy with minimum execution time when compared to other state-of-the-art methods. From the computational complexity point of view, the proposed feature descriptor has the smallest feature vector size of the feature descriptors compared, which makes classification faster. In addition, the speed of the proposed framework is considerably higher than some advanced coral reef classification approaches. The classification of diseased coral images would enhance the study of coral reef ecosystems, dynamics and processes by reducing the runtime.