1 INTRODUCTION

Shape plays an important role in human recognition and perception [13]. Deriving shape descriptors is an important task in content-based image retrieval [46]. In real applications, it is quite common for shapes to undergo changes in orientation, scale, and viewpoint; a robust shape descriptor should therefore be unaffected by translation, rotation, and scaling.

There are two main classical approaches to shape representation: boundary-based approaches and region-based approaches. Boundary-based approaches use only the boundary of a shape to extract its features; examples include Fourier descriptors [7], curvature scale space [8], wavelet descriptors [9], chain codes [10], autoregressive models [11], the Delaunay triangulation technique [12], and point set matching with dynamic programming [13, 14]. Because they exploit information only from the boundary, boundary-based approaches ignore potentially important information from the interior region of a shape. Region-based approaches utilize information from both the boundary and the interior region of a shape; these methods include geometric moments [15–18], radial moments [19], Zernike moments [18, 20–28], Legendre moments [26], Tchebychev moments [29], generic Fourier descriptors [30], wavelet descriptors [31–33], the shape matrix [34], hypergraphs [35, 36], kernel entropy component analysis [37], and compound image descriptors [38]. Soft computing [39, 40], proposed by L. Zadeh in 1990, deals with approximation, uncertainty, imprecision, and partial truth to achieve practicability, robustness, and low solution cost; it has been successfully applied in image analysis, with shape representations based on fuzzy sets [31, 41–44], neural networks [42, 44–46], and granular computing [32], respectively.

Zernike moments are highly effective in terms of shape representation [47–49]. They have good feature representation capability [22–24, 27, 48], invariance to linear transformations, geometric invariance with respect to rotation [21, 25], good image reconstruction [6, 50], and low noise sensitivity [48, 51]. Zernike polynomials are orthogonal within the unit circle, which implies that no redundancy exists between moments of different orders [48, 52].

However, Zernike moments are not directly invariant under scaling. There are two common approaches to achieving scale invariance of Zernike moments: the preprocessing approach [52] and the indirect approach [53]. In the preprocessing approach, scale invariance is achieved by rescaling the object within the unit circle; in the indirect approach, Zernike moments are expressed in terms of the regular moments, from which indirect Zernike moment invariants (ZMI) can be obtained. However, both of these approaches have high computational complexity. Another popular method for achieving scale invariance of Zernike moments [25, 54, 55] is based on normalizing the area of a shape, that is, the zeroth order of its geometric moments, to a constant number. However, this method may lead to computational errors: first, changes in the area are not always linearly dependent on changes in the size of the same shape; second, as the order of the moments grows, the dynamic range increases, which in turn amplifies the numerical errors; third, the conversion from geometric moments to Zernike moments increases the computational complexity; finally, this method requires that the object be clearly separated from its background.

To reduce computational complexity and computational errors, Belkasim et al. [20] expressed Zernike moments in Cartesian coordinates and used the zeroth- and second-order Cartesian Zernike moments to develop three scale invariance parameters. Using one of these parameters, the scale errors can be reduced considerably and efficiently. However, these scale parameters are unstable when processing shapes with large scale variations. Additionally, these scale invariance parameters cannot be uniquely represented by a single quantity. Zhao et al. [28] proposed a method to combine the scale invariance parameters [1] into one single quantity, which reduces the invariance errors over shapes with large scale variations. In this paper, we explore the properties of this combined scale invariance parameter further and find that it has good noise tolerance; even if the shapes are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise, the combined scale invariance parameter still performs well.

Our paper is organized as follows: in Section 1, we give the introduction; in Section 2, we introduce Cartesian Zernike moments and Cartesian Zernike moments invariants, especially for the three scale invariance parameters of Cartesian Zernike moments proposed by Belkasim et al. [20]; in Section 3, we introduce the combined scale invariance parameter to improve the stability of the scale invariance and the noise tolerance; in Section 4, we present the simulation to demonstrate the capability of our proposed scale invariance parameter in terms of reducing the scale errors and stabilizing the scale invariance, especially for processing the image data that are corrupted by different kinds of noises; in Section 5, we present the conclusion and the future work.

2 CARTESIAN ZERNIKE MOMENTS AND CARTESIAN ZERNIKE MOMENTS INVARIANTS

Direct translation and rotation invariance of Zernike moments can be achieved through explicitly expressing the original Zernike moments in their Cartesian coordinates form. In this section, we introduce Cartesian Zernike Moments and Cartesian Zernike moments invariants, especially for the three scale invariance parameters of Cartesian Zernike moments proposed by Belkasim et al. [20].

2.1 Cartesian Zernike Moments

Zernike moments are defined in terms of a set of orthogonal functions with simple rotation properties known as Zernike polynomials [26, 56]. Zernike polynomials can be expressed in Cartesian coordinates as follows [20]:

$$\begin{gathered} {{V}_{{nL}}}(x,y) \\ = {{R}_{{nL}}}(x,y)\left( {\left( {{{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}\,\mathop \sum \limits_{j = 0}^{{{m}_{r}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j} \end{array}} \right){{x}^{{L - 2j}}}{{y}^{{2j}}})} \right.} \right. \\ \left. {\left. { + \;i({{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}\,\mathop \sum \limits_{j = 0}^{{{m}_{i}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j + 1} \end{array}} \right){{x}^{{L - 2j - 1}}}{{y}^{{2j + 1}}}} \right)} \right), \\ \end{gathered} $$
((1))

where \(i = \sqrt { - 1} \); \({{({{x}^{2}} + {{y}^{2}})}^{{1/2}}} \leqslant 1\); \(n\) is a non-negative integer; \(L\) is a non-negative integer subject to the constraints that \(n - L\) is even and \(L \leqslant n\); and \({{R}_{{nL}}}(x,y)\) is the real-valued radial Zernike polynomial, defined as follows:

$$\begin{gathered} {{R}_{{nL}}}(x,y) = \mathop \sum \limits_{s = 0}^{(n - L)/2} {{( - 1)}^{s}} \\ \times \;\frac{{(n - s)!}}{{s!\left( {\frac{{n + L}}{2} - s} \right)!\left( {\frac{{n - L}}{2} - s} \right)!}}{{({{x}^{2}} + {{y}^{2}})}^{{\frac{{n - 2s}}{2}}}}, \\ \end{gathered} $$
((2))

\({{m}_{r}}\) and \({{m}_{i}}\) are defined with respect to \(L\) as follows:

$${\text{when }}L{\text{ is even}}\quad {{m}_{r}} = \frac{L}{2}\quad {{m}_{i}} = \frac{{L - 2}}{2},$$
$${\text{when }}L{\text{ is odd}}\quad {{m}_{r}} = \frac{{L - 1}}{2}\quad {{m}_{i}} = \frac{{L - 1}}{2}.$$
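As an illustration, Eq. (2) and the summation limits above can be sketched in Python (a minimal sketch; the function names are ours):

```python
from math import factorial

def radial_poly(n, L, x, y):
    """Real-valued radial Zernike polynomial R_nL(x, y) of Eq. (2)."""
    r2 = x**2 + y**2
    return sum(
        (-1)**s * factorial(n - s)
        / (factorial(s) * factorial((n + L)//2 - s) * factorial((n - L)//2 - s))
        * r2**((n - 2*s) / 2)
        for s in range((n - L)//2 + 1)
    )

def summation_limits(L):
    """m_r and m_i as functions of L (the even/odd cases above)."""
    if L % 2 == 0:
        return L//2, (L - 2)//2
    return (L - 1)//2, (L - 1)//2
```

For instance, radial_poly(2, 0, x, y) evaluates to \(2({{x}^{2}} + {{y}^{2}}) - 1\). Note that for \(L = 0\) the even case gives \({{m}_{i}} = - 1\), i.e., the imaginary sum in Eq. (1) is empty.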

The complex Cartesian Zernike moment \({{A}_{{nL}}}\) of order \(n\) and repetition \(L\) for a digital image \(f(x,y)\) is defined in Cartesian coordinates as follows:

$${{A}_{{nL}}} = \frac{{n + 1}}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y)V_{{nL}}^{{\text{*}}}(x,y).$$
((3))

The symbol * denotes the complex conjugate of \({{V}_{{nL}}}\).

From Eqs. (1)–(3), Cartesian Zernike moments \({{A}_{{nL}}}\) can be further expressed as follows [20]:

$${{A}_{{nL}}} = {{C}_{{nL}}} - i{{S}_{{nL}}},$$
((4))

where

$$\begin{gathered} {{C}_{{nL}}} = \frac{{n + 1}}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \left\{ {f(x,y){{R}_{{nL}}}(x,y){{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}} \right. \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{r}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j} \end{array}} \right){{x}^{{L - 2j}}}{{y}^{{2j}}}} \right\}, \\ \end{gathered} $$
((5))
$$\begin{gathered} ~{{S}_{{nL}}} = \frac{{n + 1}}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \left\{ {f(x,y){{R}_{{nL}}}(x,y){{{({{x}^{2}} + {{y}^{2}})}}^{{\frac{{ - L}}{2}}}}} \right. \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{i}}} {{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j + 1} \end{array}} \right){{x}^{{L - 2j - 1}}}{{y}^{{2j + 1}}}} \right\}. \\ \end{gathered} $$
((6))

Note that for \(L = 0\), \({{S}_{{n0}}} = 0\).
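Equations (3), (5), and (6) can be sketched with NumPy as follows (a simplified sketch under our own conventions: the pixel grid is mapped onto the unit disc, pixels outside the disc are discarded, and all function names are ours):

```python
import numpy as np
from math import factorial, comb, pi

def radial_poly(n, L, r2):
    # R_nL of Eq. (2), written as a function of r^2 = x^2 + y^2
    return sum((-1)**s * factorial(n - s)
               / (factorial(s) * factorial((n + L)//2 - s) * factorial((n - L)//2 - s))
               * r2**((n - 2*s) / 2)
               for s in range((n - L)//2 + 1))

def cartesian_zernike(f, n, L):
    """Return (C_nL, S_nL) of Eqs. (5)-(6) for a square grayscale image f."""
    N = f.shape[0]
    c = (2*np.arange(N) + 1 - N) / N              # pixel centres mapped to [-1, 1]
    x, y = np.meshgrid(c, c, indexing='ij')
    r2 = x**2 + y**2
    f = np.where(r2 <= 1, f, 0)                   # keep only the unit disc
    mr = L//2 if L % 2 == 0 else (L - 1)//2
    mi = (L - 2)//2 if L % 2 == 0 else (L - 1)//2
    # (x^2 + y^2)^(-L/2); the origin is guarded, where the monomials vanish anyway
    base = f * radial_poly(n, L, r2) * np.where(r2 > 0, r2, 1)**(-L/2)
    C = sum((-1)**j * comb(L, 2*j) * base * x**(L - 2*j) * y**(2*j)
            for j in range(mr + 1)).sum()
    S = sum((-1)**j * comb(L, 2*j + 1) * base * x**(L - 2*j - 1) * y**(2*j + 1)
            for j in range(mi + 1))
    S = S.sum() if mi >= 0 else 0.0               # S_n0 = 0 when L = 0
    return (n + 1)/pi * C, (n + 1)/pi * S
```

For a constant image, \({{C}_{{00}}}\) reduces to \(\frac{1}{\pi }\sum f \), consistent with Eq. (19) below.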

2.2 Cartesian Zernike Moments Invariants

Based on the Cartesian Zernike moments above, Belkasim et al. [20] proposed a set of Cartesian Zernike moment invariants (CZMI) that describe image features and are directly invariant under scale, translation, and rotation without using regular moments, which avoids preprocessing or resizing the original image. This set of CZMI is computed as follows [20]:

$${{(CZMI)}_{{n0}}} = {{C}_{{n0}}}\,,$$
((7))
$${{(CZMI)}_{{nL}}} = {{\left| {{{C}_{{nL}}}} \right|}^{2}} + {{\left| {{{S}_{{nL}}}} \right|}^{2}},$$
((8))
$$\begin{gathered} {{(CZMI)}_{{n,z}}} = [({{C}_{{nL}}} + i{{S}_{{nL}}}){{({{C}_{{mh}}} - i{{S}_{{mh}}})}^{p}}] \\ \pm \;{\text{[}}({{C}_{{nL}}} + i{{S}_{{nL}}}){{({{C}_{{mh}}} - i{{S}_{{mh}}})}^{p}}{\text{]*,}} \\ \end{gathered} $$
((9))

where \(h \leqslant L\), \(p = \frac{L}{h}\), \(p \geqslant 1\), \((L\,{\text{mod}}\,h) = 0\), and \(z = p + L + h\).
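Writing \(t = ({{C}_{{nL}}} + i{{S}_{{nL}}}){{({{C}_{{mh}}} - i{{S}_{{mh}}})}^{p}}\), the "+" branch of Eq. (9) is \(t + t{\text{*}} = 2\operatorname{Re} t\) and the "−" branch is \(t - t{\text{*}} = 2i\operatorname{Im} t\). Equations (8) and (9) can thus be sketched as follows (function names are ours):

```python
def czmi_magnitude(CnL, SnL):
    # Eq. (8): rotation-invariant squared magnitudes
    return abs(CnL)**2 + abs(SnL)**2

def czmi_cross(CnL, SnL, Cmh, Smh, p, plus=True):
    # Eq. (9): cross-term invariant with p = L/h
    t = (CnL + 1j*SnL) * (Cmh - 1j*Smh)**p
    return t + t.conjugate() if plus else t - t.conjugate()
```

The "+" branch is always real and the "−" branch purely imaginary, so each yields a single real quantity per pair of moments.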

The main steps to achieve CZMI are summarized as follows [1]:

(1) use Eqs. (2), (5), and (6) to compute \({{C}_{{00}}}\), \({{C}_{{11}}}\), and \({{S}_{{11}}}\);

(2) compute the centroid \((\bar {x},\bar {y})\)

$$\bar {x} = \frac{{{{C}_{{11}}}}}{{2{{C}_{{00}}}}}\,,$$
((10))
$$\bar {y} = \frac{{{{S}_{{11}}}}}{{2{{C}_{{00}}}}}\,;$$
((11))

(3) compute the scale invariance parameter \(\beta \), which can be achieved using one of three parameters \({{\beta }_{{00}}}\), \({{\beta }_{ - }}\), and \({{\beta }_{ + }}\);

(4) use \(\beta \), \((\bar {x},\bar {y})\), and Eq. (2) to compute \({{R}_{{nL}}}(\beta (x - \bar {x}),\beta (y - \bar {y}))\);

(5) use \({{R}_{{nL}}}(\beta (x - \bar {x}),\beta (y - \bar {y}))\) and Eqs. (5), (6) to compute updated \({{C}_{{n0}}}\), \({{C}_{{nL}}}\), and \({{S}_{{nL}}}\)

$${{C}_{{n0}}} = \frac{{n + 1}}{\pi }{{\beta }^{2}}\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y){{R}_{{n0}}}(\beta (x - \bar {x}),\beta (y - \bar {y})),$$
((12))
$$\begin{gathered} {{C}_{{nL}}} = \frac{{n + 1}}{\pi }{{\beta }^{2}}\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,\left\{ {f(x,y){{R}_{{nL}}}{{{(\beta (x - \bar {x}),\beta (y - \bar {y}))}}^{{^{{^{{^{{}}}}}}}}}} \right. \\ \times \;{{({{(x - \bar {x})}^{2}} + {{(y - \bar {y})}^{2}})}^{{\frac{{ - L}}{2}}}} \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{r}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j} \end{array}} \right){{{(x - \bar {x})}}^{{L - 2j}}}{{{(y - \bar {y})}}^{{2j}}}} \right\}, \\ \end{gathered} $$
((13))
$$\begin{gathered} {{S}_{{nL}}} = \frac{{n + 1}}{\pi }{{\beta }^{2}}\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,\left\{ {f(x,y){{R}_{{nL}}}{{{(\beta (x - \bar {x}),\beta (y - \bar {y}))}}^{{^{{^{{^{{}}}}}}}}}} \right. \\ \times \;{{({{(x - \bar {x})}^{2}} + {{(y - \bar {y})}^{2}})}^{{\frac{{ - L}}{2}}}} \\ \left. { \times \;\mathop \sum \limits_{j = 0}^{{{m}_{i}}} \,{{{( - 1)}}^{j}}\left( {\begin{array}{*{20}{c}} L \\ {2j + 1} \end{array}} \right){{{(x - \bar {x})}}^{{L - 2j - 1}}}{{{(y - \bar {y})}}^{{2j + 1}}}} \right\}; \\ \end{gathered} $$
((14))

(6) use the updated \({{C}_{{n0}}}\), \({{C}_{{nL}}}\), \({{S}_{{nL}}}\) and Eqs. (7)–(9) to compute the CZMI.

2.3 Three Scale Invariance Parameters

The scale invariance of CZMI can be achieved using one of the three parameters [20]:

$${{\beta }_{{00}}} = \sqrt {\frac{\pi }{A}\,} ,$$
((15))
$${{\beta }_{ - }} = \sqrt {\frac{{3{{A}_{{00}}} - \sqrt {9A_{{00}}^{2} + 4\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)} }}{{2\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)}}\,} ,$$
((16))
$${{\beta }_{ + }} = \sqrt {\frac{{3{{A}_{{00}}} + \sqrt {9A_{{00}}^{2} + 4\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)} }}{{2\left( {{{A}_{{20}}} + 3{{A}_{{00}}}} \right)}}} ,$$
((17))

where

$$~A = \mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y),$$
((18))
$${{A}_{{00}}} = \frac{{\sum\limits_x {\sum\limits_y {f(x,y)} } }}{\pi },$$
((19))
$${{A}_{{20}}} = \frac{3}{\pi }\,\mathop \sum \limits_x \,\mathop \sum \limits_y \,f(x,y)[2({{x}^{2}} + {{y}^{2}}) - 1],$$
((20))

\({{\beta }_{{00}}}\) is a scale invariance parameter that depends on the area; \({{\beta }_{ - }}\) and \({{\beta }_{ + }}\) are scale invariance parameters that depend on Cartesian Zernike moments up to the second order.

These three scale invariance parameters can each be used to normalize objects against scale changes; however, they are not stable for shapes with large scale variations. Furthermore, having several parameters to normalize scale variations is inconvenient for automating feature extraction; therefore, we introduce an approach that combines these scale parameters into one scale parameter β.

3 THE COMBINED β SCALE INVARIANCE PARAMETER

In this section, we introduce an approach that combines the above scale parameters into one scale parameter β to improve the stability of scale invariance.

Our combined scale invariance parameter \(\beta \) is proposed as follows [28]:

$${\beta } = {{{\beta }}_{ - }}\sqrt {\frac{{{{{\beta }}_{ - }}}}{{{{{\beta }}_{ + }}}}}.$$
((21))

Furthermore we substitute (16) and (17) into (21):

$$\begin{gathered} {\beta } = \sqrt {\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \\ \times \;\sqrt[4]{{\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{3{{{\text{A}}}_{{00}}} + \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}}}\,. \\ \end{gathered} $$
((22))

The reason that we combine only \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) is that they already contain the information of \({{{\beta }}_{{00}}}\). \({{{\beta }}_{{00}}}\) is directly related to the area, as follows:

$${\text{A}} = \pi {\text{/}}\beta _{{00}}^{2}{\text{.}}$$
((23))

Normalization against the change in variance implicitly includes size (area). \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) contain the information of the area (\({{{\beta }}_{{00}}}\)), which can be seen through Eqs. (16)–(19).

From [20], we know that \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) are roots of the following equation:

$$\left( {3{{{\text{A}}}_{{00}}} + {{{\text{A}}}_{{20}}}} \right){{{\beta }}^{4}} - \left( {3{{{\text{A}}}_{{00}}}} \right){{{\beta }}^{2}} - 1 = 0\,.$$
((24))

Equation (24) keeps the value of the second-order Zernike moment constant as the scale or size changes [20].
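This root property can be checked numerically: the sketch below evaluates the left-hand side of Eq. (24) at \({{{\beta }}_{ - }}\) and \({{{\beta }}_{ + }}\) from Eqs. (16) and (17) (we use cmath.sqrt so the algebra remains valid even if an intermediate value is negative for a particular pair \({{{\text{A}}}_{{00}}}\), \({{{\text{A}}}_{{20}}}\); the function names are ours):

```python
import cmath

def betas(A00, A20):
    """beta_minus, beta_plus of Eqs. (16)-(17) from the moments A00, A20."""
    disc = cmath.sqrt(9*A00**2 + 4*(A20 + 3*A00))
    denom = 2*(A20 + 3*A00)
    return cmath.sqrt((3*A00 - disc) / denom), cmath.sqrt((3*A00 + disc) / denom)

def quartic(beta, A00, A20):
    # left-hand side of Eq. (24)
    return (3*A00 + A20)*beta**4 - 3*A00*beta**2 - 1
```

For any admissible moment pair, quartic(betas(A00, A20)[k], A00, A20) vanishes to machine precision, since Eqs. (16) and (17) are exactly the roots of the quadratic in \({{{\beta }}^{2}}\) underlying Eq. (24).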

We can consider the left-hand side of Eq. (24) as the characteristic polynomial of a linear transformation \(B\) and obtain the following characteristic equation of \(B\):

$$\det \left( {{\text{B}} - {{\text{I}\lambda }}} \right) = 0,$$
((25))

where

$${\lambda } = {\beta }{\text{.}}$$
((26))

This characteristic equation has four different roots:

$${{{\lambda }}_{1}} = - {{{\beta }}_{ + }} = - \sqrt {\frac{{3{{{\text{A}}}_{{00}}} + \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \,,$$
((27))
$${{{\lambda }}_{2}} = - {{{\beta }}_{ - }} = - \sqrt {\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \,,$$
((28))
$${{{\lambda }}_{3}} = {{{\beta }}_{ - }} = \sqrt {\frac{{3{{{\text{A}}}_{{00}}} - \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} \,,$$
((29))
$${{{\lambda }}_{4}} = {{{\beta }}_{ + }} = \sqrt {\frac{{3{{{\text{A}}}_{{00}}} + \sqrt {9{\text{A}}_{{00}}^{2} + 4\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)} }}{{2\left( {{{{\text{A}}}_{{20}}} + 3{{{\text{A}}}_{{00}}}} \right)}}} .$$
((30))

Therefore the condition number of the linear transformation \(B\) can be defined as follows:

$${\varkappa }\left( {\text{B}} \right) = \left| {\frac{{{{{\lambda }}_{{{\text{max}}}}}}}{{{{{\lambda }}_{{{\text{min}}}}}}}} \right| = \left| {\frac{{{{{\lambda }}_{4}}}}{{{{{\lambda }}_{3}}}}} \right| = \left| {\frac{{{{{\beta }}_{ + }}}}{{{{{\beta }}_{ - }}}}} \right| = \frac{{{{{\beta }}_{ + }}}}{{{{{\beta }}_{ - }}}}.$$
((31))

The condition number of a function measures the worst-case change in its output with respect to small changes in its input.

Since \({{{\beta }}_{ - }}\) is more effective than \({{{\beta }}_{ + }}\) for most shapes [20], we take the quantity \(\sqrt {\frac{1}{{{\varkappa }\left( {\text{B}} \right)}}} \), that is, \(\sqrt {\frac{{{{{\beta }}_{ - }}}}{{{{{\beta }}_{ + }}}}} \), as the weight factor of \({{{\beta }}_{ - }}\) to derive the combined scale invariance parameter \({{\beta }} = {{{{\beta }}}_{ - }}\sqrt {\frac{{{{{{\beta }}}_{ - }}}}{{{{{{\beta }}}_{ + }}}}} \).
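In code, the weighting just described amounts to the following (a sketch for real positive \({{{\beta }}_{ - }}\), \({{{\beta }}_{ + }}\); the function names are ours):

```python
def condition_number(beta_minus, beta_plus):
    # Eq. (31): kappa(B) = |beta_plus / beta_minus|
    return abs(beta_plus / beta_minus)

def combined_beta(beta_minus, beta_plus):
    # Eq. (21): beta_minus weighted by sqrt(1 / kappa(B))
    return beta_minus * (beta_minus / beta_plus)**0.5
```

For example, with \({{{\beta }}_{ - }} = 0.5\) and \({{{\beta }}_{ + }} = 2\), \({\varkappa }\left( {\text{B}} \right) = 4\) and the combined parameter is \(0.25\).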

This combined parameter \({\beta }\) includes information about both the area and the variance of an image; therefore, it is more powerful for processing different kinds of shapes and more robust when processing data corrupted by noise.

4 SIMULATION

4.1 Simulation Data

We choose two standard image databases, which are carefully designed to include most challenges in shape analysis and have been used by many researchers to test the robustness of their methods under changes of rotation, scale, and translation.

(1) MPEG7_70_OBJECTS [31]: 70 images of 70 objects taken from the MPEG-7 database.

(2) LEAF [58]: originally created for experiments on recognizing wood species from leaf shape. It contains leaves of 90 wood species growing in the Czech Republic, both trees and bushes: native, invasive, and imported species (only those imported species that are common in parks are included). The number of samples (leaves) per species varies from 2 to 25; their total number in the database is 795.

For each shape (image) of the above two databases, there are two copies with different sizes (untransformed and transformed shape). We add three kinds of noise respectively to these two databases: Gaussian noise with mean = 0 and variance = 0.01, salt & pepper noise with noise density = 0.05, and speckle noise with mean = 0 and variance = 0.15. Therefore, for each of these two databases, we have four sample data sets: the clean sample data set, the sample data set with Gaussian noise, the sample data set with salt & pepper noise, and the sample data set with speckle noise.
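The three noise models can be sketched with NumPy as follows (a sketch assuming grayscale images scaled to [0, 1]; the parameters mirror those listed above, speckle is the usual multiplicative model \(f + f \cdot n\), and the function names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian(img, mean=0.0, var=0.01):
    return np.clip(img + rng.normal(mean, np.sqrt(var), img.shape), 0, 1)

def add_salt_pepper(img, density=0.05):
    out = img.copy()
    u = rng.random(img.shape)
    out[u < density/2] = 0.0                      # pepper
    out[(u >= density/2) & (u < density)] = 1.0   # salt
    return out

def add_speckle(img, mean=0.0, var=0.15):
    # multiplicative noise: img + img * n
    return np.clip(img + img*rng.normal(mean, np.sqrt(var), img.shape), 0, 1)
```

Each function leaves the image size unchanged and, for salt & pepper, flips approximately a fraction `density` of the pixels.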

4.2 Simulation Process

Our simulation compares the performance of CZMI using our combined scale invariance parameter \(\beta \) [28] with four other methods: indirect Zernike moment invariants (ZMI) [53], CZMI using the area normalization parameter \({{\beta }_{{00}}}\) [20], CZMI using the positive scale invariance parameter \({{\beta }_{ + }}\) [20], and CZMI using the negative scale invariance parameter \({{\beta }_{ - }}\) [20].

For each of these five methods, moment invariants were computed for each untransformed shape (image) and its related transformed shape (image). For a given order \(n\), we obtain \(N\) moment invariants for each shape. We use the absolute relative mean error \({{E}_{r}}\) to quantify the scale accuracy of each method over a pair of untransformed and transformed shapes (images):

$${{E}_{r}} = \frac{1}{N}\mathop \sum \limits_{k = 1}^N \left| {\frac{{{{I}_{0}}\left( k \right) - {{I}_{t}}\left( k \right)}}{{{{I}_{0}}\left( k \right)}}} \right|,$$
((32))

where \({{I}_{0}}(k)\) is the value of the \(k\)th moment invariant for an untransformed shape (image) under order \(n\); \({{I}_{t}}(k)\) is the value of the \(k\)th moment invariant for the related transformed shape (image) under order \(n\); and \(k = 1,\,2,\,3,\, \ldots ,\,N\).

To evaluate the stability of the scale accuracy of each method, we compute the variance of the absolute relative mean error \({{E}_{r}}\) over \(W\) pairs of untransformed and transformed shapes (images):

$$var = \frac{{\sum\limits_{w = 1}^W {{{{({{E}_{r}}(w) - \mu )}}^{2}}} }}{{W - 1}}~,$$
((33))

where

$$\mu = \frac{{\sum\limits_{w = 1}^W {{{E}_{r}}(w)} }}{W}$$
((34))

\(w\) is the index of a pair of untransformed and transformed shapes (images); \(w = 1,\,2,\,3,\, \ldots ,\,W\).
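Both measures are straightforward to compute (a sketch; error_variance is the unbiased sample variance of Eq. (33), equivalent to np.var with ddof=1, and the function names are ours):

```python
import numpy as np

def relative_mean_error(I0, It):
    """E_r of Eq. (32) for one untransformed/transformed pair of invariants."""
    I0, It = np.asarray(I0, float), np.asarray(It, float)
    return np.mean(np.abs((I0 - It) / I0))

def error_variance(errors):
    """Eqs. (33)-(34): sample variance of E_r over W shape pairs."""
    e = np.asarray(errors, float)
    return ((e - e.mean())**2).sum() / (len(e) - 1)
```

For example, a pair with invariants (2, 4) and (1, 5) yields \({{E}_{r}} = (0.5 + 0.25){\text{/}}2 = 0.375\).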

4.3 Simulation Results Analysis

Tables 1 and 2 show the simulation results of the five methods: ZMI, CZMI using the area normalization parameter \({{\beta }_{{00}}}\), CZMI using the positive scale invariance parameter \({{\beta }_{ + }}\), CZMI using the negative scale invariance parameter \({{\beta }_{ - }}\), and CZMI using the combined scale invariance parameter \(\beta \), for the two standard image databases Mpeg7 and LEAF, respectively. These two tables clearly demonstrate that, for both Mpeg7 and LEAF, our method (CZMI using the combined scale invariance parameter \(\beta \)) always achieves the lowest mean and the lowest variance of \({{E}_{r}}\), even when the data sets are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise; this reveals that our method has the highest scaling accuracy and stability.

Table 1. Simulation results of ZMI, CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ + }}\), CZMI \({{\beta }_{ - }}\), CZMI \(\beta \) for Mpeg7
Table 2. Simulation results of ZMI, CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ + }}\), CZMI \({{\beta }_{ - }}\), CZMI \(\beta \) for LEAF

Tables 1 and 2 also clearly show that two of the methods, ZMI and CZMI using the positive scale invariance parameter \({{\beta }_{ + }}\), perform much worse (higher mean and variance of \({{E}_{r}}\)) than the other three methods. To compare the performance of the other three methods in detail, we use Figs. 1–8.

Fig. 1. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for clean Mpeg7.

Fig. 2. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for Mpeg7 with Gaussian noise.

Fig. 3. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for Mpeg7 with salt & pepper noise.

Fig. 4. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for Mpeg7 with speckle noise.

Fig. 5. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for clean LEAF.

Fig. 6. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for LEAF with Gaussian noise.

Fig. 7. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for LEAF with salt & pepper noise.

Fig. 8. Simulation results of CZMI \({{\beta }_{{00}}}\), CZMI \({{\beta }_{ - }}\), and CZMI \(\beta \) for LEAF with speckle noise.

Figures 1–8 show the simulation results of the three methods: CZMI using the area normalization parameter \({{\beta }_{{00}}}\), CZMI using the negative scale invariance parameter \({{\beta }_{ - }}\), and CZMI using the combined scale invariance parameter \(\beta \). In each figure, the error means the absolute relative mean error \({{E}_{r}}\), and the shape label means the ID of an image in the database. The database Mpeg7 has 70 images; therefore, the shape label in each figure related to Mpeg7 runs over 1, 2, 3, …, 70. The database LEAF has 795 images; therefore, the shape label in each figure related to LEAF runs over 1, 2, 3, …, 795. Each of these figures demonstrates that our method (CZMI using the combined scale invariance parameter \(\beta \)) generates the lowest variance of errors among the three methods. Furthermore, the results show that high errors may occur regardless of the technique used; however, our technique shows lower error counts than the other approaches in the comparison.

The simulation results clearly demonstrate that our method (CZMI using the combined scale invariance parameter \(\beta \)) reduces the scale errors and improves the stability of the scale invariance. Our method also has good noise tolerance; even when the images are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise, our method still performs well.

5 CONCLUSIONS AND FURTHER WORK

In real applications, it is quite common for shapes to undergo changes in orientation, scale, and viewpoint; a shape descriptor should be unaffected by translation, rotation, and scaling. Zernike moments have been widely applied in shape retrieval due to their rotation invariance. However, Zernike moments are not directly invariant under scaling and translation. Belkasim et al. [20] expressed Zernike moments in Cartesian coordinates to make them explicitly invariant to translation, rotation, and scale; this method reduces the computational complexity and improves the accuracy rate. However, when Cartesian Zernike moments are applied to shapes with large scale variations, the accuracy rate becomes unstable. In this paper, we introduce a combined scale invariance parameter that reduces the scale errors and improves the stability of the scale invariance. Our combined scale invariance parameter also has good noise tolerance; even when the shapes are corrupted by different kinds of noise, such as Gaussian, salt & pepper, and speckle noise, it still performs well. Because the combined parameter includes information from both the area and the variance of an image, it is more powerful for processing different kinds of shapes and more robust when processing data corrupted by different kinds of noise.

Our simulation compares the performance of CZMI using our combined scale invariance parameter with four other popular methods: indirect Zernike moment invariants (ZMI), CZMI using the area normalization parameter, CZMI using the positive scale invariance parameter, and CZMI using the negative scale invariance parameter. We choose two standard image databases: (1) MPEG7_70_OBJECTS, with 70 shapes from 70 different kinds of objects; and (2) LEAF, with 795 shapes of leaves belonging to 90 wood species. These two standard databases are carefully designed to include most challenges in shape analysis and have been used by many researchers to test the robustness of their methods under changes of rotation, scale, and translation. For each shape (image) of the above two databases, there are two copies with different sizes (untransformed and transformed shape) to test the scale invariance. To test the robustness, we add three kinds of noise to these two databases: Gaussian, salt & pepper, and speckle noise. The simulation results based on the two databases clearly demonstrate that CZMI using our combined scale invariance parameter always generates the lowest mean and the lowest variance of the errors, whether processing clean data or data corrupted by different kinds of noise. Furthermore, the results show that high errors may occur regardless of the technique used; however, our technique shows lower error counts than the other approaches in the comparison.

Further work would involve the use of emerging technologies such as deep learning, which has been shown to obtain promising results in image processing and computer vision. However, this approach would require considerably more data and may have high computational complexity; it will be the subject of study in the near future. Nevertheless, Zernike moments with our scale invariance parameter have proven to be an excellent image processing approach when dealing with different scales over a large data set.