1 INTRODUCTION

Shape plays an important role in human recognition and perception [13]. Deriving shape descriptors is an important task in content-based image retrieval [46]. In real applications, it is quite common for shapes to undergo changes in orientation, scale, and viewpoint; a robust shape descriptor should therefore be unaffected by translation, rotation, and scaling.

There are two main classical approaches to shape representation: boundary-based approaches and region-based approaches. Boundary-based approaches use only the boundary of a shape to extract its features; examples include Fourier descriptors [7], curvature scale space [8], wavelet descriptors [9], chain codes [10], autoregressive models [11], the Delaunay triangulation technique [12], and point set matching with dynamic programming [13, 14]. Because they exploit information only from the boundary, boundary-based approaches ignore potentially important information from the interior region of a shape. Region-based approaches utilize information from both the boundary and the interior region of a shape; these methods include geometric moments [15–18], radial moments [19], Zernike moments [18, 20–28], Legendre moments [26], Tchebychev moments [29], generic Fourier descriptors [30], wavelet descriptors [31–33], the shape matrix [34], hypergraphs [35, 36], kernel entropy component analysis [37], and compound image descriptors [38]. Soft computing [39, 40], proposed by L. Zadeh in 1990, deals with approximation, uncertainty, imprecision, and partial truth to achieve practicability, robustness, and low solution cost; it has been successfully applied in image analysis, with shape representations based on fuzzy sets [31, 41–44], neural networks [42, 44–46], and granular computing [32], respectively.

Zernike moments are highly effective in terms of shape representation [47–49]. They have good feature representation capability [22–24, 27, 48], invariance to linear transformations, geometric invariance with respect to rotation [21, 25], good image reconstruction [6, 50], and low noise sensitivity [48, 51]. Zernike polynomials are orthogonal within the unit circle, which implies that no redundancy exists between moments of different orders [48, 52].

However, Zernike moments are not directly invariant under scaling. There are two common approaches to achieving scale invariance of Zernike moments: the preprocessing approach [52] and the indirect approach [53]. In the preprocessing approach, scale invariance is achieved by rescaling the object within the unit circle; in the indirect approach, Zernike moments are expressed in terms of the regular moments, from which indirect Zernike moment invariants (ZMI) can be obtained. However, both of these approaches have high computational complexity. Another popular method for achieving scale invariance of Zernike moments [25, 54, 55] is based on normalizing the area of a shape, that is, the zeroth order of its geometric moments, to a constant number. However, this method may lead to computational errors: first, changes in the area are not always linearly dependent on changes in the size of the same shape; second, as the order of the moments grows, the dynamic range increases, which in turn amplifies the numerical errors; third, the conversion from geometric moments to Zernike moments increases the computational complexity; finally, this method requires that the object be clearly separated from its background.

To reduce computational complexity and computational errors, Belkasim et al. [20] expressed Zernike moments in Cartesian coordinates and used the zeroth- and second-order Cartesian Zernike moments to develop three scale invariance parameters. Using one of these parameters, the scale errors can be reduced considerably and efficiently. However, these scale parameters are unstable when processing shapes with large scale variations. Additionally, these scale invariance parameters cannot be uniquely represented by a single quantity. Zhao et al. [28] proposed a method to combine the scale invariance parameters [1] into one single quantity, which reduces the invariance errors over shapes with large scale variations. In this paper, we explore the properties of this combined scale invariance parameter further and find that it has good noise tolerance; even if the shapes are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise, the combined scale invariance parameter still performs well.

Our paper is organized as follows: in Section 1, we give the introduction; in Section 2, we introduce Cartesian Zernike moments and Cartesian Zernike moments invariants, especially for the three scale invariance parameters of Cartesian Zernike moments proposed by Belkasim et al. [20]; in Section 3, we introduce the combined scale invariance parameter to improve the stability of the scale invariance and the noise tolerance; in Section 4, we present the simulation to demonstrate the capability of our proposed scale invariance parameter in terms of reducing the scale errors and stabilizing the scale invariance, especially for processing the image data that are corrupted by different kinds of noises; in Section 5, we present the conclusion and the future work.

2 CARTESIAN ZERNIKE MOMENTS AND CARTESIAN ZERNIKE MOMENTS INVARIANTS

Direct translation and rotation invariance of Zernike moments can be achieved through explicitly expressing the original Zernike moments in their Cartesian coordinates form. In this section, we introduce Cartesian Zernike Moments and Cartesian Zernike moments invariants, especially for the three scale invariance parameters of Cartesian Zernike moments proposed by Belkasim et al. [20].

2.1 Cartesian Zernike Moments

Zernike moments are defined in terms of a set of orthogonal functions with simple rotation properties known as Zernike polynomials [26, 56]. Zernike polynomials can be expressed in Cartesian coordinates as follows [20]:

$$\begin{gathered} {{V}_{{nL}}}(x,y) \\ = {{R}_{{nL}}}(x,y)\left( {\left( {{{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}\,\mathop \sum \limits_{j = 0}^{{{m}_{r}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j} \end{array}} \right){{x}^{{L - 2j}}}{{y}^{{2j}}})} \right.} \right. \\ \left. {\left. { + \;i({{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}\,\mathop \sum \limits_{j = 0}^{{{m}_{i}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j + 1} \end{array}} \right){{x}^{{L - 2j - 1}}}{{y}^{{2j + 1}}}} \right)} \right), \\ \end{gathered} $$
((1))

where \(i = \sqrt { - 1} \); \({{({{x}^{2}} + {{y}^{2}})}^{{1/2}}} \leqslant 1\); \(n\) is a non-negative integer; \(L\) is a non-negative integer subject to the constraints that \(n - L\) is even and \(L \leqslant n\); and \({{R}_{{nL}}}(x,y)\) is the real-valued radial Zernike polynomial, defined as follows:

$$\begin{gathered} {{R}_{{nL}}}(x,y) = \mathop \sum \limits_{s = 0}^{(n - L)/2} {{( - 1)}^{s}} \\ \times \;\frac{{(n - s)!}}{{s!\left( {\frac{{n + L}}{2} - s} \right)!\left( {\frac{{n - L}}{2} - s} \right)!}}{{({{x}^{2}} + {{y}^{2}})}^{{\frac{{n - 2s}}{2}}}}, \\ \end{gathered} $$
((2))

\({{m}_{r}}\) and \({{m}_{i}}\) are defined with respect to \(L\) as follows:

$${\text{when }}L{\text{ is even}}\quad {{m}_{r}} = \frac{L}{2}\quad {{m}_{i}} = \frac{{L - 2}}{2},$$
$${\text{when }}L{\text{ is odd}}\quad {{m}_{r}} = \frac{{L - 1}}{2}\quad {{m}_{i}} = \frac{{L - 1}}{2}.$$
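As an illustration, Eq. (2) and the summation limits above can be sketched in Python (a minimal sketch; the function names are ours):

```python
from math import factorial

def radial_poly(n, L, x, y):
    """Real-valued radial Zernike polynomial R_nL(x, y) of Eq. (2)."""
    r2 = x**2 + y**2
    return sum(
        (-1)**s * factorial(n - s)
        / (factorial(s) * factorial((n + L)//2 - s) * factorial((n - L)//2 - s))
        * r2**((n - 2*s) / 2)
        for s in range((n - L)//2 + 1)
    )

def summation_limits(L):
    """m_r and m_i as functions of L (the even/odd cases above)."""
    if L % 2 == 0:
        return L//2, (L - 2)//2
    return (L - 1)//2, (L - 1)//2
```

For instance, radial_poly(2, 0, x, y) evaluates to \(2({{x}^{2}} + {{y}^{2}}) - 1\). Note that for \(L = 0\) the even case gives \({{m}_{i}} = - 1\), i.e., the imaginary sum in Eq. (1) is empty.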

The complex Cartesian Zernike moment \({{A}_{{nL}}}\) of order \(n\) and repetition \(L\) for a digital image \(f(x,y)\) is defined in Cartesian coordinates as follows:

$${{A}_{{nL}}} = \frac{{n + 1}}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y)V_{{nL}}^{{\text{*}}}(x,y).$$
((3))

The symbol * denotes the complex conjugate of \({{V}_{{nL}}}\).

From Eqs. (1)–(3), Cartesian Zernike moments \({{A}_{{nL}}}\) can be further expressed as follows [20]:

$${{A}_{{nL}}} = {{C}_{{nL}}} - i{{S}_{{nL}}},$$
((4))

where

$$\begin{gathered} {{C}_{{nL}}} = \frac{{n + 1}}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \left\{ {f(x,y){{R}_{{nL}}}(x,y){{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}} \right. \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{r}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j} \end{array}} \right){{x}^{{L - 2j}}}{{y}^{{2j}}}} \right\}, \\ \end{gathered} $$
((5))
$$\begin{gathered} ~{{S}_{{nL}}} = \frac{{n + 1}}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \left\{ {f(x,y){{R}_{{nL}}}(x,y){{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}} \right. \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{i}}} {{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j + 1} \end{array}} \right){{x}^{{L - 2j - 1}}}{{y}^{{2j + 1}}}} \right\}. \\ \end{gathered} $$
((6))

Note that for \(L = 0\), \({{S}_{{n0}}} = 0\).
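Equations (3), (5), and (6) can be sketched with NumPy as follows (a simplified sketch under our own conventions: the pixel grid is mapped onto the unit disc, pixels outside the disc are discarded, and all function names are ours):

```python
import numpy as np
from math import factorial, comb, pi

def radial_poly(n, L, r2):
    # R_nL of Eq. (2), written as a function of r^2 = x^2 + y^2
    return sum((-1)**s * factorial(n - s)
               / (factorial(s) * factorial((n + L)//2 - s) * factorial((n - L)//2 - s))
               * r2**((n - 2*s) / 2)
               for s in range((n - L)//2 + 1))

def cartesian_zernike(f, n, L):
    """Return (C_nL, S_nL) of Eqs. (5)-(6) for a square grayscale image f."""
    N = f.shape[0]
    c = (2*np.arange(N) + 1 - N) / N              # pixel centres mapped to [-1, 1]
    x, y = np.meshgrid(c, c, indexing='ij')
    r2 = x**2 + y**2
    f = np.where(r2 <= 1, f, 0)                   # keep only the unit disc
    mr = L//2 if L % 2 == 0 else (L - 1)//2
    mi = (L - 2)//2 if L % 2 == 0 else (L - 1)//2
    # (x^2 + y^2)^(-L/2); the origin is guarded, where the monomials vanish anyway
    base = f * radial_poly(n, L, r2) * np.where(r2 > 0, r2, 1)**(-L/2)
    C = sum((-1)**j * comb(L, 2*j) * base * x**(L - 2*j) * y**(2*j)
            for j in range(mr + 1)).sum()
    S = sum((-1)**j * comb(L, 2*j + 1) * base * x**(L - 2*j - 1) * y**(2*j + 1)
            for j in range(mi + 1))
    S = S.sum() if mi >= 0 else 0.0               # S_n0 = 0 when L = 0
    return (n + 1)/pi * C, (n + 1)/pi * S
```

For a constant image, \({{C}_{{00}}}\) reduces to \(\frac{1}{\pi }\sum f \), consistent with Eq. (19) below.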

2.2 Cartesian Zernike Moments Invariants

Based on the Cartesian Zernike moments above, Belkasim et al. [20] proposed a set of Cartesian Zernike moment invariants (CZMI) that describe image features and are directly invariant under scale, translation, and rotation without using regular moments, which avoids preprocessing or resizing the original image. This set of CZMI is computed as follows [20]:

$${{(CZMI)}_{{n0}}} = {{C}_{{n0}}}\,,$$
((7))
$${{(CZMI)}_{{nL}}} = {{\left| {{{C}_{{nL}}}} \right|}^{2}} + {{\left| {{{S}_{{nL}}}} \right|}^{2}},$$
((8))
$$\begin{gathered} {{(CZMI)}_{{n,z}}} = [({{C}_{{nL}}} + i{{S}_{{nL}}}){{({{C}_{{mh}}} - i{{S}_{{mh}}})}^{p}}] \\ \pm \;{\text{[}}({{C}_{{nL}}} + i{{S}_{{nL}}}){{({{C}_{{mh}}} - i{{S}_{{mh}}})}^{p}}{\text{]*,}} \\ \end{gathered} $$
((9))

where \(h \leqslant L\), \(p = \frac{L}{h}\), \(p \geqslant 1\), \((L\,{\text{mod}}\,h) = 0\), and \(z = p + L + h\).
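Writing \(t = ({{C}_{{nL}}} + i{{S}_{{nL}}}){{({{C}_{{mh}}} - i{{S}_{{mh}}})}^{p}}\), the "+" branch of Eq. (9) is \(t + t{\text{*}} = 2\operatorname{Re} t\) and the "−" branch is \(t - t{\text{*}} = 2i\operatorname{Im} t\). Equations (8) and (9) can thus be sketched as follows (function names are ours):

```python
def czmi_magnitude(CnL, SnL):
    # Eq. (8): rotation-invariant squared magnitudes
    return abs(CnL)**2 + abs(SnL)**2

def czmi_cross(CnL, SnL, Cmh, Smh, p, plus=True):
    # Eq. (9): cross-term invariant with p = L/h
    t = (CnL + 1j*SnL) * (Cmh - 1j*Smh)**p
    return t + t.conjugate() if plus else t - t.conjugate()
```

The "+" branch is always real and the "−" branch purely imaginary, so each yields a single real quantity per pair of moments.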

The main steps to achieve CZMI are summarized as follows [1]:

(1) use Eqs. (2), (5), and (6) to compute \({{C}_{{00}}}\), \({{C}_{{11}}}\), and \({{S}_{{11}}}\);

(2) compute the centroid \((\bar {x},\bar {y})\)

$$\bar {x} = \frac{{{{C}_{{11}}}}}{{2{{C}_{{00}}}}}\,,$$
((10))
$$\bar {y} = \frac{{{{S}_{{11}}}}}{{2{{C}_{{00}}}}}\,;$$
((11))

(3) compute the scale invariance parameter \(\beta \), which can be achieved using one of three parameters \({{\beta }_{{00}}}\), \({{\beta }_{ - }}\), and \({{\beta }_{ + }}\);

(4) use \(\beta \), \((\bar {x},\bar {y})\), and Eq. (2) to compute \({{R}_{{nL}}}(\beta (x - \bar {x}),\beta (y - \bar {y}))\);

(5) use \({{R}_{{nL}}}(\beta (x - \bar {x}),\beta (y - \bar {y}))\) and Eqs. (5), (6) to compute updated \({{C}_{{n0}}}\), \({{C}_{{nL}}}\), and \({{S}_{{nL}}}\)

$${{C}_{{n0}}} = \frac{{n + 1}}{\pi }{{\beta }^{2}}\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y){{R}_{{n0}}}(\beta (x - \bar {x}),\beta (y - \bar {y})),$$
((12))
$$\begin{gathered} {{C}_{{nL}}} = \frac{{n + 1}}{\pi }{{\beta }^{2}}\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,\left\{ {f(x,y){{R}_{{nL}}}{{{(\beta (x - \bar {x}),\beta (y - \bar {y}))}}^{{^{{^{{^{{}}}}}}}}}} \right. \\ \times \;{{({{(x - \bar {x})}^{2}} + {{(y - \bar {y})}^{2}})}^{{\frac{{ - L}}{2}}}} \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{r}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j} \end{array}} \right){{{(x - \bar {x})}}^{{L - 2j}}}{{{(y - \bar {y})}}^{{2j}}}} \right\}, \\ \end{gathered} $$
((13))
$$\begin{gathered} {{S}_{{nL}}} = \frac{{n + 1}}{\pi }{{\beta }^{2}}\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,\left\{ {f(x,y){{R}_{{nL}}}{{{(\beta (x - \bar {x}),\beta (y - \bar {y}))}}^{{^{{^{{^{{}}}}}}}}}} \right. \\ \times \;{{({{(x - \bar {x})}^{2}} + {{(y - \bar {y})}^{2}})}^{{\frac{{ - L}}{2}}}} \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{i}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j + 1} \end{array}} \right){{{(x - \bar {x})}}^{{L - 2j - 1}}}{{{(y - \bar {y})}}^{{2j + 1}}}} \right\}; \\ \end{gathered} $$
((14))

(6) use the updated \({{C}_{{n0}}}\), \({{C}_{{nL}}}\), \({{S}_{{nL}}}\) and Eqs. (7)–(9) to compute the CZMI.

2.3 Three Scale Invariance Parameters

The scale invariance of CZMI can be achieved using one of the three parameters [20]:

$${{\beta }_{{00}}} = \sqrt {\frac{\pi }{A}\,} ,$$
((15))
$${{\beta }_{ - }} = \sqrt {\frac{{3{{A}_{{00}}} - \sqrt {9A_{{00}}^{2} + 4\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)} }}{{2\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)}}\,} ,$$
((16))
$${{\beta }_{ + }} = \sqrt {\frac{{3{{A}_{{00}}} + \sqrt {9A_{{00}}^{2} + 4\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)} }}{{2\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)}}} ,$$
((17))

where

$$~A = \mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y),$$
((18))
$${{A}_{{00}}} = \frac{{\sum\limits_x {\sum\limits_y {f(x,y)} } }}{\pi },$$
((19))
$${{A}_{{20}}} = \frac{3}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y)[2({{x}^{2}} + {{y}^{2}}) - 1],$$
((20))

\({{\beta }_{{00}}}\) is a scale invariance parameter that depends on the area; \({{\beta }_{ - }}\) and \({{\beta }_{ + }}\) are scale invariance parameters that depend on Cartesian Zernike moments up to the second order.

These three scale invariance parameters can each be used to normalize objects against scale changes; however, they are not stable for shapes with large scale variations. Furthermore, having several parameters to normalize scale variations is inconvenient for automating feature extraction; therefore, we introduce an approach that combines these scale parameters into one scale parameter β.

3 THE COMBINED β SCALE INVARIANCE PARAMETER

In this section, we introduce an approach that combines the above scale parameters into one scale parameter β to improve the stability of scale invariance.

Our combined scale invariance parameter \(\beta \) is proposed as follows [28]:

$${\beta } = {{{\beta }}_{ - }}\sqrt {\frac{{{{{\beta }}_{ - }}}}{{{{{\beta }}_{ + }}}}}.$$
((21))

Furthermore we substitute (16) and (17) into (21):

$$\begin{gathered} {\beta } = \sqrt {\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \\ \times \;\sqrt[4]{{\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{3{{{\text{A}}}_{{00}}} + \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}}}\,. \\ \end{gathered} $$
((22))

The reason that we combine only \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) is that they already contain the information of \({{{\beta }}_{{00}}}\). \({{{\beta }}_{{00}}}\) is directly related to the area, as follows:

$${\text{A}} = \pi {\text{/}}\beta _{{00}}^{2}{\text{.}}$$
((23))

Normalization against the change in variance implicitly includes size (area). \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) contain the information of the area (\({{{\beta }}_{{00}}}\)), which can be seen through Eqs. (16)–(19).

From [20], we know that \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) are roots of the following equation:

$$\left( {3{{{\text{A}}}_{{00}}} + {{{\text{A}}}_{{20}}}} \right){{{\beta }}^{4}} - \left( {3{{{\text{A}}}_{{00}}}} \right){{{\beta }}^{2}} - 1 = 0\,.$$
((24))

Equation (24) keeps the value of the second-order Zernike moment constant as the scale or size changes [20].
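This root property can be checked numerically: the sketch below evaluates the left-hand side of Eq. (24) at \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) from Eqs. (16) and (17) (we use cmath.sqrt so the algebra remains valid even if an intermediate value is negative for a particular pair \({{{\text{A}}}_{{00}}}\), \({{{\text{A}}}_{{20}}}\); the function names are ours):

```python
import cmath

def betas(A00, A20):
    """beta_minus, beta_plus of Eqs. (16)-(17) from the moments A00, A20."""
    disc = cmath.sqrt(9*A00**2 + 4*(A20 + 3*A00))
    denom = 2*(A20 + 3*A00)
    return cmath.sqrt((3*A00 - disc) / denom), cmath.sqrt((3*A00 + disc) / denom)

def quartic(beta, A00, A20):
    # left-hand side of Eq. (24)
    return (3*A00 + A20)*beta**4 - 3*A00*beta**2 - 1
```

For any admissible moment pair, quartic(betas(A00, A20)[k], A00, A20) vanishes to machine precision, since Eqs. (16) and (17) are exactly the roots of the quadratic in \({{{\beta }}^{2}}\) underlying Eq. (24).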

We can consider the left-hand side of Eq. (24) as the characteristic polynomial of a linear transformation \(B\) and obtain the following characteristic equation of \(B\):

$$\det \left( {{\text{B}} - {{\text{I}\lambda }}} \right) = 0,$$
((25))

where

$${\lambda } = {\beta }{\text{.}}$$
((26))

This characteristic equation has four different roots:

$${{{\lambda }}_{1}} = - {{{\beta }}_{ + }} = - \sqrt {\frac{{3{{{\text{A}}}_{{00}}} + \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \,,$$
((27))
$${{{\lambda }}_{2}} = - {{{\beta }}_{ - }} = - \sqrt {\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \,,$$
((28))
$${{{\lambda }}_{3}} = {{{\beta }}_{ - }} = \sqrt {\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \,,$$
((29))
$${{{\lambda }}_{4}} = {{{\beta }}_{ + }} = \sqrt {\frac{{3{{{\text{A}}}_{{00}}} + \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} .$$
((30))

Therefore the condition number of the linear transformation \(B\) can be defined as follows:

$${\varkappa }\left( {\text{B}} \right) = \left| {\frac{{{{{\lambda }}_{{{\text{max}}}}}}}{{{{{\lambda }}_{{{\text{min}}}}}}}} \right| = \left| {\frac{{{{{\lambda }}_{4}}}}{{{{{\lambda }}_{3}}}}} \right| = \left| {\frac{{{{{\beta }}_{ + }}}}{{{{{\beta }}_{ - }}}}} \right| = \frac{{{{{\beta }}_{ + }}}}{{{{{\beta }}_{ - }}}}.$$
((31))

The condition number of a function measures the worst-case change in its output with respect to small changes in its input.

Since \({{{\beta }}_{ - }}\) is more effective than \({{{\beta }}_{ + }}\) for most shapes [20], we take the quantity \(\sqrt {\frac{1}{{{\varkappa }\left( {\text{B}} \right)}}} \), that is, \(\sqrt {\frac{{{{{\beta }}_{ - }}}}{{{{{\beta }}_{ + }}}}} \), as the weight factor of \({{{\beta }}_{ - }}\) to derive the combined scale invariance parameter \({{\beta }} = {{{{\beta }}}_{ - }}\sqrt {\frac{{{{{{\beta }}}_{ - }}}}{{{{{{\beta }}}_{ + }}}}} \).
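In code, the weighting just described amounts to the following (a sketch for real positive \({{{\beta }}_{ - }}\), \({{{\beta }}_{ + }}\); the function names are ours):

```python
def condition_number(beta_minus, beta_plus):
    # Eq. (31): kappa(B) = |beta_plus / beta_minus|
    return abs(beta_plus / beta_minus)

def combined_beta(beta_minus, beta_plus):
    # Eq. (21): beta_minus weighted by sqrt(1 / kappa(B))
    return beta_minus * (beta_minus / beta_plus)**0.5
```

For example, with \({{{\beta }}_{ - }} = 0.5\) and \({{{\beta }}_{ + }} = 2\), \({\varkappa }\left( {\text{B}} \right) = 4\) and the combined parameter is \(0.25\).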

This combined parameter \({\beta }\) includes information about both the area and the variance of an image; therefore, it is more powerful for processing different kinds of shapes and more robust when processing data corrupted by noise.

4 SIMULATION

4.1 Simulation Data

We choose two standard image databases, which are carefully designed to include most challenges in shape analysis and have been used by many researchers to test the robustness of their methods under changes of rotation, scale, and translation.

(1) MPEG7_70_OBJECTS [31]: 70 images of 70 objects taken from the MPEG-7 database.

(2) LEAF [58]: originally created for experiments on recognizing wood species from leaf shape. It contains leaves of 90 wood species growing in the Czech Republic, both trees and bushes: native, invasive, and imported species (only those imported species that are common in parks are included). The number of samples (leaves) per species varies from 2 to 25; their total number in the database is 795.

For each shape (image) of the above two databases, there are two copies with different sizes (untransformed and transformed shape). We add three kinds of noise respectively to these two databases: Gaussian noise with mean = 0 and variance = 0.01, salt & pepper noise with noise density = 0.05, and speckle noise with mean = 0 and variance = 0.15. Therefore, for each of these two databases, we have four sample data sets: the clean sample data set, the sample data set with Gaussian noise, the sample data set with salt & pepper noise, and the sample data set with speckle noise.
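The three noise models can be sketched with NumPy as follows (a sketch assuming grayscale images scaled to [0, 1]; the parameters mirror those listed above, speckle is the usual multiplicative model \(f + f \cdot n\), and the function names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian(img, mean=0.0, var=0.01):
    return np.clip(img + rng.normal(mean, np.sqrt(var), img.shape), 0, 1)

def add_salt_pepper(img, density=0.05):
    out = img.copy()
    u = rng.random(img.shape)
    out[u < density/2] = 0.0                      # pepper
    out[(u >= density/2) & (u < density)] = 1.0   # salt
    return out

def add_speckle(img, mean=0.0, var=0.15):
    # multiplicative noise: img + img * n
    return np.clip(img + img*rng.normal(mean, np.sqrt(var), img.shape), 0, 1)
```

Each function leaves the image size unchanged and, for salt & pepper, flips approximately a fraction `density` of the pixels.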

4.2 Simulation Process

Our simulation compares the performance of CZMI using our combined scale invariance parameter \(\beta \) [28] with four other methods: indirect Zernike moment invariants (ZMI) [53], CZMI using the area normalization parameter \({{\beta }_{{00}}}\) [20], CZMI using the positive scale invariance parameter \({{\beta }_{ + }}\) [20], and CZMI using the negative scale invariance parameter \({{\beta }_{ - }}\) [20].

For each of these five methods, moment invariants were computed for each untransformed shape (image) and its related transformed shape (image). For a given order \(n\), we obtain \(N\) moment invariants for each shape. We use the absolute relative mean error \({{E}_{r}}\) to quantify the scale accuracy of each method over a pair of untransformed and transformed shapes (images):

$${{E}_{r}} = \frac{1}{N}\mathop \sum \limits_{k = 1}^N \left| {\frac{{{{I}_{0}}\left( k \right) - {{I}_{t}}\left( k \right)}}{{{{I}_{0}}\left( k \right)}}} \right|,$$
((32))

where \({{I}_{0}}(k)\) is the value of the \(k\)th moment invariant for an untransformed shape (image) under order \(n\); \({{I}_{t}}(k)\) is the value of the \(k\)th moment invariant for the related transformed shape (image) under order \(n\); and \(k = 1,\,2,\,3,\, \ldots ,\,N\).

To evaluate the stability of the scale accuracy of each method, we compute the variance of the absolute relative mean error \({{E}_{r}}\) over \(W\) pairs of untransformed and transformed shapes (images):

$$var = \frac{{\sum\limits_{w = 1}^W {{{{({{E}_{r}}(w) - \mu )}}^{2}}} }}{{W - 1}}~,$$
((33))

where

$$\mu = \frac{{\sum\limits_{w = 1}^W {{{E}_{r}}(w)} }}{W}$$
((34))

\(w\) is the index of a pair of untransformed and transformed shapes (images); \(w = 1,\,2,\,3,\, \ldots ,\,W\).
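Both measures are straightforward to compute (a sketch; error_variance is the unbiased sample variance of Eq. (33), equivalent to np.var with ddof=1, and the function names are ours):

```python
import numpy as np

def relative_mean_error(I0, It):
    """E_r of Eq. (32) for one untransformed/transformed pair of invariants."""
    I0, It = np.asarray(I0, float), np.asarray(It, float)
    return np.mean(np.abs((I0 - It) / I0))

def error_variance(errors):
    """Eqs. (33)-(34): sample variance of E_r over W shape pairs."""
    e = np.asarray(errors, float)
    return ((e - e.mean())**2).sum() / (len(e) - 1)
```

For example, a pair with invariants (2, 4) and (1, 5) yields \({{E}_{r}} = (0.5 + 0.25){\text{/}}2 = 0.375\).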

4.3 Simulation Results Analysis

Tables 1 and 2 show the simulation results of the five methods: ZMI, CZMI using the area normalization parameter \({{\beta }_{{00}}}\), CZMI using the positive scale invariance parameter \({{\beta }_{ + }}\), CZMI using the negative scale invariance parameter \({{\beta }_{ - }}\), and CZMI using the combined scale invariance parameter \(\beta \), for the two standard image databases Mpeg7 and LEAF, respectively. These two tables clearly demonstrate that, for both Mpeg7 and LEAF, our method (CZMI using the combined scale invariance parameter \(\beta \)) always achieves the lowest mean and the lowest variance of \({{E}_{r}}\), even when the data sets are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise; this reveals that our method has the highest scaling accuracy and stability.

Table 1. Simulation results of ZMI, CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ + }}\), CZMI \({{\beta }_{ - }}\), CZMI \(\beta \) for Mpeg7
Table 2. Simulation results of ZMI, CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ + }}\), CZMI \({{\beta }_{ - }}\), CZMI \(\beta \) for LEAF

Tables 1 and 2 also clearly show that two of the methods, ZMI and CZMI using the positive scale invariance parameter \({{\beta }_{ + }}\), perform much worse (higher mean and variance of \({{E}_{r}}\)) than the other three methods. To compare the performance of the other three methods in detail, we use Figs. 1–8.

Fig. 1. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for clean Mpeg7.

Fig. 2. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for Mpeg7 with Gaussian noise.

Fig. 3. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for Mpeg7 with salt & pepper noise.

Fig. 4. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for Mpeg7 with speckle noise.

Fig. 5. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for clean LEAF.

Fig. 6. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for LEAF with Gaussian noise.

Fig. 7. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for LEAF with salt & pepper noise.

Fig. 8. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for LEAF with speckle noise.

Figures 1–8 show the simulation results of the three methods: CZMI using the area normalization parameter \({{\beta }_{{00}}}\), CZMI using the negative scale invariance parameter \({{\beta }_{ - }}\), and CZMI using the combined scale invariance parameter \(\beta \). In each figure, the error means the absolute relative mean error \({{E}_{r}}\), and the shape label means the ID of an image in the database. The database Mpeg7 has 70 images; therefore, the shape label in each figure related to Mpeg7 runs over 1, 2, 3, …, 70. The database LEAF has 795 images; therefore, the shape label in each figure related to LEAF runs over 1, 2, 3, …, 795. Each of these figures demonstrates that our method (CZMI using the combined scale invariance parameter \(\beta \)) generates the lowest variance of errors among the three methods. Furthermore, the results show that high errors may occur regardless of the technique used; however, our technique shows lower error counts than the other approaches in the comparison.

The simulation results clearly demonstrate that our method (CZMI using the combined scale invariance parameter \(\beta \)) reduces the scale errors and improves the stability of the scale invariance. Our method also has good noise tolerance; even when the images are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise, our method still performs well.

5 CONCLUSIONS AND FURTHER WORK

In real applications, it is quite common for shapes to undergo changes in orientation, scale, and viewpoint; a shape descriptor should be unaffected by translation, rotation, and scaling. Zernike moments have been widely applied in shape retrieval due to their rotation invariance. However, Zernike moments are not directly invariant under scaling and translation. Belkasim et al. [20] expressed Zernike moments in Cartesian coordinates to make them explicitly invariant to translation, rotation, and scale; this method reduces the computational complexity and improves the accuracy rate. However, when Cartesian Zernike moments are applied to shapes with large scale variations, the accuracy rate becomes unstable. In this paper, we introduce a combined scale invariance parameter that reduces the scale errors and improves the stability of the scale invariance. Our combined scale invariance parameter also has good noise tolerance; even when the shapes are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise, it still performs well. Because the combined parameter includes information from both the area and the variance of an image, it is more powerful for processing different kinds of shapes and more robust when processing data corrupted by different kinds of noise.

Our simulation compares the performance of CZMI using our combined scale invariance parameter with four other popular methods: indirect Zernike moment invariants (ZMI), CZMI using the area normalization parameter, CZMI using the positive scale invariance parameter, and CZMI using the negative scale invariance parameter. We choose two standard image databases: (1) MPEG7_70_OBJECTS, with 70 shapes from 70 different kinds of objects; and (2) LEAF, with 795 shapes of leaves belonging to 90 wood species. These two standard databases are carefully designed to include most challenges in shape analysis and have been used by many researchers to test the robustness of their methods under changes of rotation, scale, and translation. For each shape (image) of the above two databases, there are two copies with different sizes (untransformed and transformed shape) to test the scale invariance. To test the robustness, we add three kinds of noise to these two databases: Gaussian, salt & pepper, and speckle noise. The simulation results based on the two databases clearly demonstrate that CZMI using our combined scale invariance parameter always generates the lowest mean and the lowest variance of the errors, whether processing clean data or data corrupted by different kinds of noise. Furthermore, the results show that high errors may occur regardless of the technique used; however, our technique shows lower error counts than the other approaches in the comparison.

Further work would involve the use of emerging technologies such as deep learning, which has been shown to obtain promising results in image processing and computer vision. However, this approach would require considerably more data and may have high computational complexity; it will be the subject of study in the near future. Nevertheless, Zernike moments with our scale invariance parameter have proven to be an excellent image processing approach when dealing with different scales over a large data set.