1 Introduction

Analysis of binary objects invariant to geometric transformations such as translation, rotation, and scaling is a central task in pattern recognition, classification, and similar applications. Shape is key information for both human and machine vision when distinguishing one object from another in the real world, and several approaches have therefore been developed to construct geometrically invariant features for shape classification and recognition. Object analysis is widely used in applications such as object recognition and shape evolution [1,2,3,4], medical image and protein analysis [5], robot navigation [6], and topology in sensor networks [7]. It remains a challenging problem, as shape instances from the same category, which look similar to humans, often differ considerably under geometric transformations (translation, rotation, scaling, etc.) and nonlinear deformations (articulation, noise, and occlusion). Compared to geometric transformations, nonlinear deformations are much more challenging for shape similarity measures. Moreover, textual annotation of images is inefficient, and sometimes impossible, in a large database; retrieval by content-based image retrieval (CBIR) [1] has therefore received considerable attention, and finding good shape descriptors and efficient matching measures remain open issues in the image vision research area.

The shape recognition and classification methodologies developed for object analysis in computer vision are broadly divided into two groups: (a) those based on the contour points and (b) those based on the region pixels of an object. A contour-based shape descriptor extracts features from the boundary points (outer pixels) of an object, while a region-based shape descriptor extracts features from its inner pixels. A preprocessing step for object recognition using boundary representation is the sampling of boundary points from which the feature vector is extracted [2,3,4, 8]. Sampling contour points does not capture enough information about an object, which produces lossy data for shape classification and recognition models; likewise, the region pixels alone do not capture the shape boundary of an object. The shortfall of the existing boundary- and region-based shape classification methods can be overcome by a hybrid model that extracts features from both the inner and outer pixels of an object [8]. To provide a novel object recognition methodology, the proposed model is therefore built as a hybrid that combines contour- and region-based shape classification methods.

In our review of the literature, we examined a variety of approaches for achieving geometrically invariant shape-based feature vectors, which may be broadly classified into two categories: statistical methods and model-based methods. As stated in the literature [2], shape descriptors carrying only global or only local information are unlikely to be robust enough in either category. Global descriptors are robust to local deformations but cannot capture local details of the shape boundary; local descriptors represent local shape features precisely but are too sensitive to noise. Such approaches are discussed in [6, 9,10,11,12,13,14,15,16]. In fact, it is always challenging to distinguish noise from local details of the shape boundary. A natural solution to this problem is to define a rich shape descriptor that contains both global and local shape features [8]. To combine local and global shape features, we propose a local binary pattern and combine it with Hu's moments. In this paper, we find the local binary pattern of the successive boundaries of a binary object using a morphological operation based on a predefined notion from chemistry, the coordination number (CN)*. We then erode the object and subtract the eroded binary object from the previous image, repeating until all pixels of the original image have the background value. The LBP combined with Hu's moments thus plays the central role in contour-point pattern analysis.

Our hybrid model is supported by recent work on shape classification and recognition methodology. In [8], the authors developed a hybrid model that combines the height-function characteristic of contour points with the morphological features of region pixels. The authors of [11, 12] constructed a model that combines Gabor features, the histogram of oriented gradients (HoG), and the local binary pattern (LBP) for grass-weed classification, and also described grass texture by modeling a convolutional neural network (CNN) with the LBP of super-pixels. In [13], the authors combine the LBP with Hu's seven moments for binary image analysis and classification. In [14], the authors map the binary image to lattices and compute the energy of pixels for shape classification. The article [15] developed a model for shape segmentation that segments the image based on statistical range. The authors of [16] constructed a model that maps the lattice concept from chemical science to binary image analysis. In [17], a binary image shape descriptor is analyzed on concurrent lines by finding the shortest distance from each point of the shape boundary. The authors of [18] constructed a model that offers a collection of weighted-sum-rule-based techniques built on a set of widely used shape descriptors (shape context, height functions, and inner-distance shape context). In [19], pattern-spectrum shape descriptors were devised on the basis of pixel energies in an object; these descriptors are transformed into a matrix, from which a collection of texture descriptors is produced. Despite the sophisticated methods already deployed, the problem remains challenging; our aim here is therefore to carefully develop a new model that relates convolutional neural network concepts to the chemical properties of pixels.
In particular, machine learning (ML) techniques based on deep learning and convolutional neural networks (CNNs) are increasingly being investigated in several areas due to their high accuracy [20, 21], and they have become the state of the art in various binary shape classification and recognition tasks. The deep learning and CNN-based architecture discussed in [21], however, is not invariant to rotation of the object structure. In this case, precise modification of the subsampling and flattening layers is crucial for classification accuracy and for the object descriptor's invariance to geometric transformation. In [20], the authors deploy a PDE-based framework that generalizes group-equivariant convolutional neural networks (G-CNNs) for roto-translation of objects. The main goal of this research is to create a novel binary shape categorization system. The proposed model is also supported by an analysis within the MapReduce framework: the input is obtained and processed in the MapReduce architecture, where the partitioned input is used in the mapper step to remove noise from the input image and a pre-processing phase is applied. Segmentation is then carried out on the pre-processed data using a min filter, and the non-essential components are removed as needed to boost performance.

The rest of this paper is organized as follows: Sect. 2 describes the morphological erosion operation. Section 3 discusses the coordination number (CN)* of an object. Section 4 analyzes the sequence of shape contours CS(n) of a binary object. Section 5 gives a brief review of the local binary pattern (LBP). Section 6 describes Hu's seven moment invariants. Section 7 discusses the shape similarity measure using PCA. Section 8 presents the experimental results and discussions. Finally, Sect. 9 concludes the paper.

2 Erosion

The original idea of erosion is given in [22]. It is a fundamental morphological transformation that combines two sets using vector subtraction of set elements. If A and B are sets in \( Z^2 \), the erosion of A by B, denoted by \(A \ominus B\), is defined as:

$$\begin{aligned} A \ominus B = \{ Z \mid (B)_z \subseteq A \} \end{aligned}$$
(1)

In words, this equation indicates that the erosion of A by B is the set of all points Z such that B, translated by Z, is contained in A. In the following discussion, set B is assumed to be a structuring element. Equation (1) is the mathematical formulation of erosion. Because the requirement that B be contained in A is equivalent to B not sharing any elements with the background, we can express erosion in the following equivalent form:

$$\begin{aligned} A \ominus B = \{ Z \mid (B)_z \cap A^c =\emptyset \} \end{aligned}$$
(2)

where \(A^c\) is the complement of A and \(\emptyset \) is the empty set.
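
As a concrete illustration, the following C sketch implements Eq. (1) for 0/1 images stored row-major; the function name, array layout, and anchor convention are ours and are not part of the formal definition:

```c
#include <string.h>

/* Erosion of Eq. (1): out(y, x) = 1 iff the structuring element se
 * (sh x sw, anchored at (ay, ax)), translated to (y, x), fits entirely
 * inside the object pixels of img (h x w, row-major, values 0/1). */
void erode(const unsigned char *img, unsigned char *out, int h, int w,
           const unsigned char *se, int sh, int sw, int ay, int ax)
{
    memset(out, 0, (size_t)h * w);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            int fits = 1;
            for (int i = 0; i < sh && fits; i++)
                for (int j = 0; j < sw && fits; j++) {
                    if (!se[i * sw + j]) continue;   /* only test pixels of B */
                    int yy = y + i - ay, xx = x + j - ax;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w ||
                        !img[yy * w + xx])
                        fits = 0;                    /* (B)_z is not within A */
                }
            out[y * w + x] = (unsigned char)fits;
        }
}
```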

Here, we concentrate on object boundary points. Equations (1) and (2) show that the pixels eroded from a binary object depend on the structuring element, and the object after erosion varies from one shape to another. The contour of an object plays a vital role in binary object analysis for shape classification, which motivates the selection of an effective structuring element for boundary representation. One method for fixing the boundary points of an object is the Moore boundary tracking algorithm, after Moore [1968]. This algorithm represents a boundary as an ordered sequence of boundary points, but it does not represent the contribution of the object pixels to the boundary.

Fig. 1: Square binary pattern in \(Z^2\)

Therefore, we need to select an effective structuring element, shown in Fig. 1, that fixes the object points forming the boundary, by considering a square binary structuring element. This pattern erodes the foreground object pixels that share an 8-neighborhood with background pixels, but in a non-informative manner: the object boundary obtained with the square binary pattern does not represent the contribution of the boundary points numerically or informatively. This motivates the representation of the object boundary in an informative manner using the coordination number (CN)*, discussed in the next section.

3 Coordination Number (CN)*

Solid substances in chemical science are generally classified as either crystalline or amorphous. Crystalline solids are characterized by a regular, ordered arrangement of particles. The coordination number of a particle in a crystal is the number of its nearest neighbors (or touching particles). In a body-centered cubic unit cell, the atom at the center of the unit cell is surrounded by a maximum of eight neighboring atoms. Therefore, the possible coordination numbers in crystalline solids are given by the set \( S = \{0, 1, 2, 3, 4, 5, 6, 7, 8\},\) and we define contour points as all those points with a coordination number between 1 and 7. An atom with \(CN=0\) is an isolated point that does not contribute to the formation of any shape. Multiple geometric arrangements are possible for each value of the central atom's coordination number:

  • 2 for Linear

  • 3 for Trigonal planar, trigonal pyramidal, or T-shaped

  • 4 for Square planar or tetrahedral

  • 5 for Trigonal bipyramidal or square pyramid structures

  • 6 for Trigonal prism structure, hexagonal planar, or octahedral

  • 7 for Pentagonal bipyramidal, capped octahedron, or a capped trigonal prism structure.

  • 8 for Cubic, hexagonal bipyramidal, square antiprism, or dodecahedron

Therefore, for a given binary object, the boundary or contour consists of those pixels whose coordination number satisfies Eq. (3).

$$\begin{aligned} CN = j;\quad \text {where } j\in \{1,2,3,4,5,6,7\} \end{aligned}$$
(3)

The coordination number of a binary object pixel is obtained by counting its 8-neighbor object pixels, as discussed below.

\(CN(x,y)=f(x-1,y-1)+f(x-1,y)+f(x-1,y+1)+f(x,y-1)+f(x,y+1)+f(x+1,y-1)+f(x+1,y)+f(x+1,y+1)\), where f represents the object pixels. We can thus rewrite Eq. (3) as follows:

$$\begin{aligned} CN(x, y) = j;\quad \text {where } j\in \{1,2,3,4,5,6,7\} \end{aligned}$$
(4)

Equation (4) gives the coordination number of the object pixel at coordinate (x, y), which is invariant to translation and rotation and represents the contribution of the object pixel to the object boundary in an informative manner. In boundary-based binary object analysis and classification, such informative knowledge of the boundary points yields comparatively better features, as will be discussed in the results section.
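
Equation (4) amounts to counting object-valued 8-neighbors. Below is a minimal C sketch, reusing the row-major 0/1 layout of the erosion sketch in Sect. 2 and assuming that out-of-image neighbors count as background:

```c
/* Coordination number of the pixel at (y, x): the count of its eight
 * neighbors that are object pixels (Eq. (4)). Out-of-image neighbors
 * are treated as background; this is an assumption of the sketch. */
int coordination_number(const unsigned char *img, int h, int w, int y, int x)
{
    int cn = 0;
    for (int dy = -1; dy <= 1; dy++)
        for (int dx = -1; dx <= 1; dx++) {
            if (dy == 0 && dx == 0) continue;        /* skip the pixel itself */
            int yy = y + dy, xx = x + dx;
            if (yy >= 0 && yy < h && xx >= 0 && xx < w && img[yy * w + xx])
                cn++;
        }
    return cn;
}
```

A pixel then belongs to the contour when the returned value lies in \(\{1,\dots ,7\}\), matching Eq. (3).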

Fig. 2: a A binary square shape. b 1st boundary of the square shape. c 2nd boundary of the square shape. d 3rd boundary of the square shape

4 Shape Contour

In this section, we describe one of the principal applications of morphology that is useful in the representation and description of object shape. In particular, we consider morphological algorithms for extracting the successive shape-based contours of a compact binary object relative to a binary pattern such as that shown in Fig. 1; these are discussed in [23]. Let \(X\subseteq Z^2\) represent a finite-extent discrete binary image and let \(B\subseteq Z^2\) be a fixed finite binary pattern with \((0,0)\in B\); then the successive shape contours of a given binary object X are denoted by CS(n) and defined as follows:

$$\begin{aligned} CS(n)=(X\ominus nB)-(X\ominus (n+1)B) \end{aligned}$$
(5)

for \(n=0,1,2,...,N,\) where \(N=max(n\ge 0: X\ominus nB\ne \emptyset ). \) Equation (5) splits an object into a sequence of N successive closed boundaries in two-dimensional space that together contain all of the object pixels. It follows that an object may be reconstructed by merging the points of this set of boundaries. This is illustrated in Fig. 2.
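
A C sketch of Eq. (5), reusing the erode() routine sketched in Sect. 2; the buffer management and return convention are illustrative assumptions:

```c
#include <stdlib.h>
#include <string.h>

/* Contour sequence of Eq. (5): CS(n) = (X eroded n times) minus
 * (X eroded n+1 times). Reuses erode() from Sect. 2; cs must provide
 * room for maxN layers of h*w bytes. Returns the number of layers
 * CS(0)..CS(N) actually produced (error handling omitted). */
int contour_sequence(const unsigned char *img, int h, int w,
                     const unsigned char *se, int sh, int sw, int ay, int ax,
                     unsigned char *cs, int maxN)
{
    size_t sz = (size_t)h * w;
    unsigned char *cur = malloc(sz), *nxt = malloc(sz);
    int n = 0;
    memcpy(cur, img, sz);
    while (n < maxN) {
        erode(cur, nxt, h, w, se, sh, sw, ay, ax);   /* X eroded n+1 times */
        int empty = 1;
        for (size_t k = 0; k < sz; k++) {
            cs[(size_t)n * sz + k] = (unsigned char)(cur[k] && !nxt[k]);
            if (cur[k]) empty = 0;
        }
        if (empty) break;              /* X eroded n times is empty: done  */
        unsigned char *t = cur; cur = nxt; nxt = t;  /* step to next level */
        n++;
    }
    free(cur);
    free(nxt);
    return n;
}
```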

Consider the binary \(5 \times 5\) square shape shown in Fig. 2a; its decomposition into the sequence of shape contours \(CS(n)\), \(n=1,2,3,\) is shown in Fig. 2b–d. This representation records only the pixel values at the contour points, yet those contour points make different contributions to the formation of the object. The decomposition of an object into shape contours alone therefore does not give enough knowledge for further processing in shape classification and analysis, and we need more informative information from the sequence of shape contours; this can be obtained using the coordination number discussed in Sect. 3. We now reconsider the square shape of Fig. 2a and discuss its decomposition using (CN)*, the informative boundary representation. Figure 3a–d presents the overall concept of shape contour segmentation using the coordination number (CN)*.

Fig. 3: a The coordination number (CN)* matrix corresponding to the square shape, b the 1st contour of the square shape using (CN)*, c the 2nd contour of the square shape using (CN)*, d the 3rd contour of the square shape using (CN)*

Figure 3a shows the coordination numbers of the square shape object of Fig. 2a. By the definition of coordination-number-based contour segmentation stated in Sect. 3, the contour points of an object are those points whose coordination number belongs to the set \(\{1, 2, 3, 4, 5, 6, 7\}.\)

The decomposition of the square shape using (CN)* is shown in Fig. 3b–d, which represents the contributions of the object pixels to boundary formation in an informative manner. This supports the shape contour construction and Algorithm 1 for further processing in shape analysis and classification. Algorithm 1 describes the computation of the contour segments CS(n) for the \(j^{th}\) image in a database.

5 A Brief Review of the Local Binary Pattern (LBP)

In this section, we provide a brief review of the LBP, an excellent descriptor that focuses on the structure of the object boundary [10]. It is a powerful tool for extracting translation- and rotation-invariant features from shape boundaries. As mentioned above, the LBP provides a shape pattern descriptor in the form of a decimal value for each shape contour point. The LBP of a binary object boundary point is computed as follows.

Algorithm 1: Computation of the contour segments CS(n)
$$\begin{aligned} LBP^1(x,y)= & {} CN(x+1,y)*2^7+CN(x+1,y+1)*2^6 \nonumber \\{} & {} +CN(x,y+1)*2^5+CN(x-1,y+1)\nonumber \\{} & {} *2^4+CN(x-1,y)*2^3 \nonumber \\{} & {} + CN(x-1,y-1)*2^2+CN(x, y-1)\nonumber \\{} & {} *2^1+CN(x+1,y-1)*2^0 \end{aligned}$$
(6)
$$\begin{aligned} LBP^2(x,y)= & {} CN(x+1,y+1)*2^7+CN(x,y+1)*2^6 \nonumber \\{} & {} + CN(x-1,y+1)*2^5+CN(x-1,y)\nonumber \\{} & {} *2^4+CN(x-1,y-1)*2^3 \nonumber \\{} & {} +CN(x,y-1)*2^2+CN(x+1, y-1)\nonumber \\{} & {} *2^1+CN(x+1,y)*2^0 \end{aligned}$$
(7)
$$\begin{aligned} LBP^3(x,y)= & {} CN(x,y+1)*2^7+CN(x-1,y+1)*2^6 \nonumber \\{} & {} + CN(x-1,y)*2^5+CN(x-1,y-1)\nonumber \\{} & {} *2^4+CN(x,y-1)*2^3 \nonumber \\{} & {} + CN(x+1,y-1)*2^2+CN(x+1, y)\nonumber \\{} & {} *2^1+CN(x+1,y+1)*2^0 \end{aligned}$$
(8)
$$\begin{aligned} LBP^4(x,y)= & {} CN(x-1,y+1)*2^7+CN(x-1,y)*2^6 \nonumber \\{} & {} + CN(x-1,y-1)*2^5+CN(x,y-1)\nonumber \\{} & {} *2^4+CN(x+1,y-1)*2^3 \nonumber \\{} & {} + CN(x+1,y)*2^2+CN(x+1, y+1)\nonumber \\{} & {} *2^1+CN(x,y+1)*2^0 \end{aligned}$$
(9)
$$\begin{aligned} LBP^5(x,y)= & {} CN(x-1,y)*2^7+CN(x-1,y-1)*2^6 \nonumber \\{} & {} + CN(x,y-1)*2^5+CN(x+1,y-1)\nonumber \\{} & {} *2^4+CN(x+1,y)*2^3 \nonumber \\{} & {} + CN(x+1,y+1)*2^2+CN(x, y+1)\nonumber \\{} & {} *2^1+CN(x-1,y+1)*2^0 \end{aligned}$$
(10)
$$\begin{aligned} LBP^6(x,y)= & {} CN(x-1,y-1)*2^7+CN(x,y-1)*2^6 \nonumber \\{} & {} + CN(x+1,y-1)*2^5+CN(x+1,y)\nonumber \\{} & {} *2^4+CN(x+1,y+1)*2^3 \nonumber \\{} & {} + CN(x,y+1)*2^2+CN(x-1, y+1)\nonumber \\{} & {} *2^1+CN(x-1,y)*2^0 \end{aligned}$$
(11)
$$\begin{aligned} LBP^7(x,y)= & {} CN(x,y-1)*2^7+CN(x+1,y-1)*2^6 \nonumber \\{} & {} + CN(x+1,y)*2^5+CN(x+1,y+1)\nonumber \\{} & {} *2^4+CN(x,y+1)*2^3 \nonumber \\{} & {} + CN(x-1,y+1)*2^2+CN(x-1, y)\nonumber \\{} & {} *2^1+CN(x-1,y-1)*2^0 \end{aligned}$$
(12)
$$\begin{aligned} LBP^8(x,y)= & {} CN(x+1,y-1)*2^7+CN(x+1,y)*2^6 \nonumber \\{} & {} + CN(x+1,y+1)*2^5+CN(x,y+1)\nonumber \\{} & {} *2^4+CN(x-1,y+1)*2^3 \nonumber \\{} & {} +CN(x-1,y)*2^2+CN(x-1, y-1)\nonumber \\{} & {} *2^1+CN(x,y-1)*2^0 \end{aligned}$$
(13)
$$\begin{aligned} LBP = min(LBP^k);\quad \text {where } k=1,2,\dots ,8 \end{aligned}$$
(14)
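
The eight rotations of Eqs. (6)–(13) and the minimum of Eq. (14) can be computed compactly by reading the neighbor coordination numbers once in the circular order of Eq. (6). A C sketch, reusing coordination_number() from Sect. 3; note that this sketch does not apply the negative background (CN)* convention of Sect. 8:

```c
/* Rotation-invariant LBP of a contour pixel, Eqs. (6)-(14): the eight
 * neighbor coordination numbers are read once in the circular order of
 * Eq. (6); each of the 8 rotations is weighted by 2^7..2^0 and the
 * minimum is returned. Uses coordination_number() from Sect. 3. */
long lbp_min(const unsigned char *img, int h, int w, int y, int x)
{
    /* neighbor offsets in the order of Eq. (6):
       (x+1,y), (x+1,y+1), (x,y+1), (x-1,y+1),
       (x-1,y), (x-1,y-1), (x,y-1), (x+1,y-1) */
    static const int dx[8] = { 1, 1, 0, -1, -1, -1, 0, 1 };
    static const int dy[8] = { 0, 1, 1, 1, 0, -1, -1, -1 };
    int cn[8];
    for (int k = 0; k < 8; k++)
        cn[k] = coordination_number(img, h, w, y + dy[k], x + dx[k]);

    long best = -1;
    for (int r = 0; r < 8; r++) {       /* one rotation per Eqs. (6)-(13) */
        long v = 0;
        for (int k = 0; k < 8; k++)
            v += (long)cn[(r + k) % 8] << (7 - k);   /* weight 2^(7-k)   */
        if (best < 0 || v < best) best = v;
    }
    return best;                   /* Eq. (14): minimum over all rotations */
}
```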

In our work, the LBP is computed for the pixels of each object shape contour CS(n) with 8 neighbors. These LBP values are then summarized by Hu's moment invariant functions [24], whose details are given in the next section.

6 Moment Invariants

In this section, we define the two-dimensional \( (p+q)^{th} \)-order moments, discussed in [24], from which invariants to translation, rotation, and scaling are derived.

$$\begin{aligned} m_{pq} = \int _{-\infty }^{\infty } \int _{-\infty }^{\infty } x^py^qLBP(x,y) \,dxdy \end{aligned}$$
(15)

where \(p,q=0,1,2,...\)

If the contour-point function LBP(x, y) is a piecewise-continuous bounded function, then moments of all orders exist and the moment sequence \(m_{pq}\) is uniquely determined by LBP(x, y), and conversely. One should note that the moments in [25] may not be invariant when LBP(x, y) changes by translation, rotation, or scaling. Invariant features can be achieved using the central moments, defined as follows:

$$\begin{aligned} \mu _{pq} = \int _{-\infty }^{\infty } \int _{-\infty }^{\infty } (x-{\overline{x}} )^p(y-{\overline{y}})^qLBP(x,y) \,dxdy \end{aligned}$$
(16)

where \(p,q=0,1,2,...\) ; \({\overline{x}}=\frac{m_{10}}{m_{00}}\) and \({\overline{y}}=\frac{m_{01}}{m_{00}}\) .

The point \(({\overline{x}},{\overline{y}})\) is the centroid of the shape contour CS(n) of the given binary object, defined in Eq. (5). The central moment \(\mu _{pq}\), computed about the centroid of the corresponding contour-point function CN(x, y), is equivalent to \(m_{pq}\) with its center shifted to the centroid of CS(n). Therefore, the central moments are geometrically invariant to translation and rotation. Scale invariance can be achieved by normalizing the central moments, formulated as follows:

$$\begin{aligned} \eta _{p,q}=\frac{\mu _{pq}}{\mu _{00}^{\gamma }} , \gamma =\frac{(p+q+2)}{2} , p+q=2,3,... \end{aligned}$$
(17)

Based on the normalized central moments, Hu [25] introduced seven moment invariant functions, known as Hu's moment invariants, whose details are given below.

$$\begin{aligned} \emptyset _1= & {} \eta _{20}+\eta _{02} \end{aligned}$$
(18)
$$\begin{aligned} \emptyset _2= & {} (\eta _{20}-\eta _{02})^2 +4\eta _{11}^2 \end{aligned}$$
(19)
$$\begin{aligned} \emptyset _3= & {} (\eta _{30}-3\eta _{12})^2 +(3\eta _{21}-\eta _{03})^2 \end{aligned}$$
(20)
$$\begin{aligned} \emptyset _4= & {} (\eta _{30}+\eta _{12})^2 +(\eta _{21}+\eta _{03})^2 \end{aligned}$$
(21)
$$\begin{aligned} \emptyset _5= & {} (\eta _{30}-3\eta _{12})(\eta _{30}+\eta _{12})[(\eta _{30}+\eta _{12})^2-3(\eta _{21}+\eta _{03})^2] \nonumber \\{} & {} +(3\eta _{21}-\eta _{03})(\eta _{21}+\eta _{03})[3(\eta _{30}+\eta _{12})^2-(\eta _{21}+\eta _{03})^2] \end{aligned}$$
(22)
$$\begin{aligned} \emptyset _6= & {} (\eta _{20}-\eta _{02})[(\eta _{30}+\eta _{12})^2-(\eta _{21}+\eta _{03})^2] \nonumber \\ {}{} & {} +4\eta _{11}(\eta _{30}+\eta _{12})(\eta _{21}+\eta _{03}) \end{aligned}$$
(23)
$$\begin{aligned} \emptyset _7= & {} (3\eta _{21}-\eta _{03})(\eta _{30}+\eta _{12})[(\eta _{30}+\eta _{12})^2-3(\eta _{21}+\eta _{03})^2] \nonumber \\{} & {} -(\eta _{30}-3\eta _{12})(\eta _{21}+\eta _{03})[3(\eta _{30}+\eta _{12})^2-(\eta _{21}+\eta _{03})^2]\nonumber \\ \end{aligned}$$
(24)

These seven moment invariants, \(\emptyset _1\)–\(\emptyset _7\), have the useful property of remaining unchanged under geometric transformations of a binary object.
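
In discrete form, the integrals of Eqs. (15)–(17) become sums over pixels. The following C sketch computes \(\emptyset _1\)–\(\emptyset _7\) of one contour segment from its LBP-valued pixels (phi[0] holds \(\emptyset _1\), and so on); it is a sketch of the discrete case under our layout assumptions, not the exact implementation used in the experiments:

```c
#include <math.h>

/* Hu's seven moment invariants (Eqs. (15)-(24)) of one contour segment,
 * using the LBP value at each contour pixel as the "mass" LBP(x, y).
 * lbp is an h x w array, zero off the contour; phi receives phi_1..phi_7. */
void hu_moments(const double *lbp, int h, int w, double phi[7])
{
    double m00 = 0, m10 = 0, m01 = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            double f = lbp[y * w + x];
            m00 += f; m10 += x * f; m01 += y * f;
        }
    double xb = m10 / m00, yb = m01 / m00;      /* centroid of Eq. (16) */

    double mu[4][4] = {{0}};                    /* central moments       */
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            double f = lbp[y * w + x];
            if (f == 0) continue;
            for (int p = 0; p <= 3; p++)
                for (int q = 0; p + q <= 3; q++)
                    mu[p][q] += pow(x - xb, p) * pow(y - yb, q) * f;
        }

    double n[4][4];                             /* normalized, Eq. (17)  */
    for (int p = 0; p <= 3; p++)
        for (int q = 0; p + q <= 3; q++)
            n[p][q] = mu[p][q] / pow(mu[0][0], (p + q + 2) / 2.0);

    phi[0] = n[2][0] + n[0][2];
    phi[1] = pow(n[2][0] - n[0][2], 2) + 4 * pow(n[1][1], 2);
    phi[2] = pow(n[3][0] - 3 * n[1][2], 2) + pow(3 * n[2][1] - n[0][3], 2);
    phi[3] = pow(n[3][0] + n[1][2], 2) + pow(n[2][1] + n[0][3], 2);
    phi[4] = (n[3][0] - 3 * n[1][2]) * (n[3][0] + n[1][2])
             * (pow(n[3][0] + n[1][2], 2) - 3 * pow(n[2][1] + n[0][3], 2))
           + (3 * n[2][1] - n[0][3]) * (n[2][1] + n[0][3])
             * (3 * pow(n[3][0] + n[1][2], 2) - pow(n[2][1] + n[0][3], 2));
    phi[5] = (n[2][0] - n[0][2])
             * (pow(n[3][0] + n[1][2], 2) - pow(n[2][1] + n[0][3], 2))
           + 4 * n[1][1] * (n[3][0] + n[1][2]) * (n[2][1] + n[0][3]);
    phi[6] = (3 * n[2][1] - n[0][3]) * (n[3][0] + n[1][2])
             * (pow(n[3][0] + n[1][2], 2) - 3 * pow(n[2][1] + n[0][3], 2))
           - (n[3][0] - 3 * n[1][2]) * (n[2][1] + n[0][3])
             * (3 * pow(n[3][0] + n[1][2], 2) - pow(n[2][1] + n[0][3], 2));
}
```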

7 Similarity Measure Using the PCA

For the task of shape recognition and classification, a shape similarity or dissimilarity is usually computed, here using principal component analysis (PCA), by finding the optimal correspondence of the features extracted by applying Hu's seven moment invariant functions to the shape contours CS(n); this is then used to rank the database shapes for shape retrieval and classification. In this paper, we use the Euclidean distance between the eigenvalues of the correlation-coefficient matrix of Hu's seven moments over the shape contours of an object, which reduces the dimension of the extracted features to a common scale for robust data handling between two objects. For the shape contour segments CS(n) of Eq. (5), we find \(\emptyset _{1}^{n}\),\(\emptyset _{2}^{n}\),\(\emptyset _{3}^{n}\),\(\emptyset _{4}^{n}\),\(\emptyset _{5}^{n}\),\(\emptyset _{6}^{n}\),\(\emptyset _{7}^{n}\) for \(n=1,2,...,N\), which yields an N×7 matrix for further processing of the shape features, as shown in Table 1.

Table 1 Hu’s seven moments function of CS(n) of an object

Algorithm 2 describes the procedure for computing Hu's seven moments \((\emptyset _{i}^{n})\) for each contour segment CS(n), n=1,2,...,N, of a given object. Hu's seven moments describe the shape in terms of its pixel structure at N different contour segments.

Algorithm 2: Computation of Hu's seven moments for each contour segment CS(n)

Our main focus now is to measure the shape dissimilarity or similarity between two objects using a distance metric, computed here as the Euclidean distance between two shapes. For this purpose, the generated Hu's seven moments are not directly usable, because their dimension varies from shape to shape: Table 1 has dimension N×7 for \(CS(n), n=1,2,...,N,\) where \(N=max(n\ge 0: X\ominus nB\ne \emptyset )\). We therefore compute the correlation coefficients of Hu's seven moments as a dimension reduction, after which the Euclidean distance metric can be applied efficiently; this yields a low error rate for shape segment analysis and recognition. Equation (25) presents the correlation coefficient \(r_{i,j}\) in terms of the covariance between \(\emptyset _{i}^{n}\) and \(\emptyset _{j}^{n}\): we relate \(r_{i,j}\) to the covariance of the \(i^{th}\) and \(j^{th}\) Hu moments over all CS(n). The values \(r_{i,j}\), with \(-1\le r_{i,j} \le +1\), generate Table 2, which is used in the next step of the shape recognition and analysis process. The strength of the proposed methodology is that it does not depend on contour pixels or region pixels alone: it analyzes contour segments that are characterized by both the contour pixels and the region pixels of an object through the contour segments CS(n).

$$\begin{aligned} r_{i,j} = \frac{Cov(\emptyset _{i}^n,\emptyset _{j}^n)}{\sqrt{Var(\emptyset _{i}^n)}\sqrt{Var(\emptyset _{j}^n)}};\quad \text {for } i,j=1,2,\dots ,7 \text { and } n=1,2,\dots ,N,\ \text {where } N=max(n\ge 0: X\ominus nB\ne \emptyset ) \end{aligned}$$
(25)
$$\begin{aligned} A_{7,7}= & {} \begin{pmatrix} r_{1,1} &{} r_{1,2} &{} \cdots &{} r_{1,7} \\ r_{2,1} &{} r_{2,2} &{} \cdots &{} r_{2,7} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ r_{7,1} &{} r_{7,2} &{} \cdots &{} r_{7,7} \end{pmatrix} \end{aligned}$$
(26)
Fig. 4: Kimia's 99 dataset

Table 2 Correlation coefficient with respect to Table 1
Table 3 Retrieval results on Kimia’s 99 dataset

Equation (26), generated from Table 2, expresses the shape features as the correlation-coefficient square matrix \(A_{7,7}\) of order 7. In this work, we compute the distance between two shapes by way of the characteristic equation \(|A_{7,7}-\lambda I|=0\). The shape dissimilarity can then be measured using the characteristic equation represented in Eq. (27): the latent (characteristic) roots of the matrix \(A_{7,7}\) describe the shape dissimilarity in the context of the contour segments CS(n).

$$\begin{aligned} |A_{7,7}-\lambda I| = \begin{vmatrix} 1-\lambda &{} r_{1,2} &{} \cdots &{} r_{1,7} \\ r_{2,1} &{} 1-\lambda &{} \cdots &{} r_{2,7} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ r_{7,1} &{} r_{7,2} &{} \cdots &{} 1-\lambda \end{vmatrix} = 0 \end{aligned}$$
(27)

In Table 2, we have computed the correlation coefficients between the moments generated from the contour segments; this is the primary stage that brings data matrices of different lengths onto a predefined scale for the shape similarity measure. Given two shapes A and B, let their eigenvalue sequences obtained from Eq. (27) be \(A=\lambda _{1}^{a},\lambda _{2}^{a},\lambda _{3}^{a},\lambda _{4}^{a},\lambda _{5}^{a},\lambda _{6}^{a},\lambda _{7}^{a}\) and \(B=\lambda _{1}^{b},\lambda _{2}^{b},\lambda _{3}^{b},\lambda _{4}^{b},\lambda _{5}^{b},\lambda _{6}^{b},\lambda _{7}^{b}\), corresponding to the correlation coefficients of Table 2. The matching cost of the two eigenvalue sequences \(\lambda _{i}^{a}\) and \(\lambda _{i}^{b}\) is defined by the Euclidean distance E(A, B).

$$\begin{aligned} E(A,B)= \sqrt{\sum _{i=1}^{i=7}(\lambda _{i}^{a}-\lambda _{i}^{b})^2} \end{aligned}$$
(28)

E(A, B) in Eq. (28) determines the shape similarity over the eigenvalues ordered from largest to smallest.
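
A C sketch of the whole similarity pipeline of Eqs. (25)–(28): build the 7×7 correlation matrix from the N×7 table of Hu's moments, extract its eigenvalues, and compare two shapes by Euclidean distance. The paper does not prescribe an eigen-solver; a plain cyclic Jacobi iteration is used here as one possible choice, and sorting the eigenvalues in descending order is our assumption for aligning the sequences in Eq. (28):

```c
#include <math.h>

/* Eq. (25): 7x7 correlation matrix of Hu's moments over the N contour
 * segments; phi is N x 7 row-major (row n holds phi_1..phi_7 of CS(n)).
 * Assumes each moment column has nonzero variance. */
void correlation_matrix(const double *phi, int N, double A[7][7])
{
    double mean[7] = {0}, sd[7] = {0};
    for (int n = 0; n < N; n++)
        for (int i = 0; i < 7; i++) mean[i] += phi[n * 7 + i] / N;
    for (int n = 0; n < N; n++)
        for (int i = 0; i < 7; i++) {
            double d = phi[n * 7 + i] - mean[i];
            sd[i] += d * d;
        }
    for (int i = 0; i < 7; i++) sd[i] = sqrt(sd[i]);
    for (int i = 0; i < 7; i++)
        for (int j = 0; j < 7; j++) {
            double c = 0;
            for (int n = 0; n < N; n++)
                c += (phi[n * 7 + i] - mean[i]) * (phi[n * 7 + j] - mean[j]);
            A[i][j] = c / (sd[i] * sd[j]);          /* r_{i,j} of Eq. (25) */
        }
}

/* Eigenvalues of the symmetric matrix A (Eq. (27)) by cyclic Jacobi
 * rotations; A is destroyed. Results are sorted in descending order. */
void jacobi_eigenvalues(double A[7][7], double lam[7])
{
    for (int sweep = 0; sweep < 50; sweep++)
        for (int p = 0; p < 6; p++)
            for (int q = p + 1; q < 7; q++) {
                if (fabs(A[p][q]) < 1e-12) continue;
                double t = 0.5 * atan2(2 * A[p][q], A[q][q] - A[p][p]);
                double c = cos(t), s = sin(t);
                for (int k = 0; k < 7; k++) {       /* rotate rows p, q   */
                    double ap = A[p][k], aq = A[q][k];
                    A[p][k] = c * ap - s * aq;
                    A[q][k] = s * ap + c * aq;
                }
                for (int k = 0; k < 7; k++) {       /* rotate columns p, q */
                    double ap = A[k][p], aq = A[k][q];
                    A[k][p] = c * ap - s * aq;
                    A[k][q] = s * ap + c * aq;
                }
            }
    for (int i = 0; i < 7; i++) lam[i] = A[i][i];
    for (int i = 1; i < 7; i++) {                   /* sort descending     */
        double v = lam[i]; int j = i - 1;
        while (j >= 0 && lam[j] < v) { lam[j + 1] = lam[j]; j--; }
        lam[j + 1] = v;
    }
}

/* Eq. (28): Euclidean distance between two eigenvalue sequences. */
double shape_distance(const double lamA[7], const double lamB[7])
{
    double d = 0;
    for (int i = 0; i < 7; i++)
        d += (lamA[i] - lamB[i]) * (lamA[i] - lamB[i]);
    return sqrt(d);
}
```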

8 Results and Discussions

In the experiments, we assigned a negative value as the coordination number of all background pixels, which differentiates them from the computed coordination numbers of the object pixels (any negative number will do). The proposed algorithm achieved encouraging experimental results on popular benchmark datasets. We used the Kimia's 99 [26] and MPEG-7 [27] datasets for the experiments, taking as shape contour points of an object those pixels with CN = j for j=2,3,4,5,6,7 on both datasets. All experiments were implemented in C and tested on an Intel Core i5 CPU with 3 GB RAM running the Linux Mint operating system (OS).

8.1 Kimia’s dataset

The Kimia’s [26] dataset is widely used for testing the performances of shape contour preserving descriptors in the recent era of shape matching and classification. It contains 99 images from nine categories, each category contains eleven images (as shown in Fig. 4). In the experiment, every binary object in the data set is considered a query, and the retrieval result is summarized as the number of tops 1 to top 10 closest matches in the same class (excluding the query object). Therefore, the best possible result for each of the rankings is 99. Table 3 lists the results of our proposed method and some other recent methods. The performance of our approach is comparably better than recent approaches.

Under this experimental setting, Table 3 shows that the suggested methodology outperforms the existing strategies. In this setting, we describe a shape in terms of its pixel structure, which can be estimated from each pixel's neighbors via the coordination number. It can equivalently be defined by convolving the object with a \(3 \times 3\) box filter: Fig. 3a, introduced in Sect. 4, is the result of convolving the square shape of Fig. 2a with the box filter of Fig. 1. The method we suggest is thus equivalent to convolution by a box filter, followed by the application of a min filter to the convolution outputs. We have thereby presented a robust and effective methodology for geometric-transformation-invariant object recognition and classification in computer vision using the chemical properties of a lattice, one that also connects to deep learning models in the current era.

Fig. 5: The MPEG-7 CE-Shape-1 dataset

8.2 MPEG-7 Dataset

The other widely tested dataset is MPEG-7 CE-Shape-1 [27], which consists of 1400 silhouette images from 70 classes, with 20 different binary objects per class; some typical objects are shown in Fig. 5. The recognition rate is measured by the Bull's eye test used by several authors in the literature [2, 3, 9]. The Bull's eye score for every query image in the dataset is a hit ratio: each query is matched against all other images in the dataset and the top 40 most similar images are counted. Among these 40 images, at most 20 can come from the query image's class and count as correct hits. The score of the test is the ratio of the total number of correct hits over all images to the highest possible number of hits, which in this case is \(20*1,400=28,000\). Table 4 shows the result of our proposed algorithm in comparison with some other existing methods; in this table, the retrieval rate on the MPEG-7 dataset is given as the percentage Bull's eye score.

$$\begin{aligned} \% \text { of Bull's eye score} = 100\times \frac{N_{sq}}{N_{hp}} \end{aligned}$$
(29)
Table 4 Retrieval rate (Bull’s eye score) of different algorithms for the MPEG-7 CE-Shape-1 dataset

Equation (29) defines the percentage Bull's eye score, where \(N_{sq}\) is the number of correctly hit images among the top 40 most similar images over all queries (27,129 for our proposed algorithm) and \(N_{hp}\) is the highest possible number of hits (28,000). Based on the Bull's eye score, we compared the proposed methodology with existing shape classification and recognition algorithms; the LBP over the segments of the contour pixel structure outperforms them on the MPEG-7 dataset. Numerous shape classification methods have been developed by eminent researchers and published in the literature. One reports an 85.40% Bull's eye score, the best in its findings, for inner-distance shape classification using dynamic programming [4]. A framework for shape recognition based on adaptive contour evolution (ACE) and shape context (SC), discussed in [28], reports an 87.54% score when redundant pixels are filtered out of the object contour; that method uses the salient points on the contour for constructing the SC, followed by dynamic programming (DP) for shape matching. In our proposed methodology, we construct the structure of each object pixel, and so our result is better. In our experiments, we constructed the LBP of those pixels whose coordination number is greater than 1; pixels with coordination number 0 or 1 were filtered out of each contour segment, since isolated points do not form a discernible shape and their LBP values would only have been 0 and \(1*2^{0}=1\). The removal of these outliers from Hu's seven moments thus yields superior performance. In addition to the retrieval and recognition tests conducted on the MPEG-7 CE-Shape-1 dataset, several researchers have shown that the similarities found by a shape method can be enhanced by combining it with the locally constrained diffusion process (LCDP), a context-sensitive learning technique covered in [2], with a reported 96.45% Bull's eye score; the same author also combines shape complexity with the height function, obtaining a 90.35% score, compared with the 89.66% reported for his proposed methodology alone.

Fig. 6: Myth sample images

Table 5 The average shape characteristic roots of Myth dataset

Besides the above discussion in support of our work, we have also analyzed shape classification networks (SCN) built on CNNs, as discussed in [20, 21, 30]. That kind of hybrid structure is trained end-to-end and is composed of a semantic segmentation model followed by a feature generator and a discriminator. We have likewise devised a hybrid pipeline using the coordination number (semantic segmentation) followed by the LBP (feature generator) and the eigenvalues (discriminator), and report a 96.89% Bull's eye score on the MPEG-7 CE-Shape-1 dataset. In [30], the authors deploy a model inspired by the LeNet5 [21] basic network structure; the model uses transposed convolution and reports 90.99% classification accuracy on the MPEG-7 dataset. According to the authors of [20], the standard CNN architecture used for image classification is translation-equivariant but not, in general, equivariant to rotation, scaling, and other transformations; they therefore choose PDE-based group CNNs, which behave on roto-translation image datasets the way traditional CNNs do through convolution, pooling, and ReLU. The pooling layer is mainly used for reducing object size, but its output does not preserve the shape structure in the feature descriptor. Our proposed architecture is instead similar to convolving an object with a box filter followed by a min filter: Fig. 3a shows the result of convolving Fig. 2a with the box filter of Fig. 1, and applying the min filter after the convolution gives the result in Fig. 3c. The proposed algorithm shrinks the object while preserving its structure; at each shrinking phase we extract the LBP feature descriptor from the boundary structure, and we report better results than comparable existing algorithms.

During the experiments on all of the publicly available datasets, we assumed a negative (CN)* for background pixels and then observed that pixels with coordination number 0 or 1 do not belong to the object: a (CN)* of 0 or 1 does not construct any shape and corresponds to outliers or noise.

8.3 The Myth and Tools Dataset

The Myth dataset, used by [2, 8, 31], contains 15 binary images in 3 classes (humans, horses, centaurs), with 5 sample images per class. The dataset is shown in Fig. 6, and Table 5 lists the corresponding average eigenvalues for the three classes.

Fig. 7: Tool's dataset

Table 5 shows the average eigenvalue sequences over the 5 shapes in each class of the Myth database: the proposed algorithm minimizes the intra-class variation while maximizing the inter-class variation. The table demonstrates that our methodology discriminates the shape features of humans, horses, and centaurs, described by the Euclidean distances between the eigenvalues of the correlation coefficients of Hu's seven moments.

The Tool’s dataset, shown in Fig. 7, used by [2, 7, 8, 31], consists of 7 classes of 35 sample objects of different types of instruments, and each class contains 5 images. Table 6 represents the shape features of the Tool’s dataset corresponding to Fig. 7, and its retrieval results have been shown in terms of total sum of each shape characteristic roots in a dataset.

The total sum of the characteristic roots for each shape is \(\sum _{i=1}^{7} {\lambda _i^{a}}\), where the \(\lambda _{i}^{a}\) are the roots of Eq. (27) for the given shape 'a' of the Tool's dataset.

Table 6 demonstrates the proposed shape descriptor methodology in terms of \(\sum {\lambda _i^{a}}\), which characterizes the contour segmentation of an object through the coordination numbers of its shape pixels. The values extracted from the Tool's dataset in Table 6 are similar within each class. The methodology can be extended to the classification of noisy shape datasets, and it would become more compact by selecting object pixels with specific coordination-number characteristics so as to remove noisy pixels from the objects.

9 Conclusions and Future Works

This paper has presented an analysis of binary objects decomposed into a sequence of shape contours, with feature vectors built from the local binary pattern and Hu's moments on the different shape contours of an image. The moment invariant features computed from the LBP are robust and geometrically invariant to shape transformations. We designed the classifier by measuring the eigenvalues of the correlation-coefficient matrix of Hu's seven moments over the sequence of shape contours of an object. Experimental results on standard shape databases demonstrate the success of the proposed LBP approach using (CN)*. Several extensions of the proposed approach are possible. In this paper, the LBP and (CN)* are applied only to the contour points of an object; they could also be applied to the local and global points of a given binary object. Moreover, it is possible to apply the LBP and (CN)* contour-point functions within hierarchical matching frameworks, which may be called decision-making trees, for shape classification, segmentation, and retrieval.

Table 6 The total sum of characteristic roots for each shape of Tool’s dataset