1 Introduction

With the continuous development of computer science and software system, the research of object detection and recognition makes remarkable progress, which attracts a growing number of researchers [1,2,3]. As one of the important branches of computer vision, shape recognition has yielded fruitful achievements in object description and matching [4, 5]. Shape feature is crucial for object recognition. Specifically, objects suffering from a lack of information such as brightness, texture or color will have to rely on their shapes. In addition, shape is robust to the variance of object color, illumination and texture [6]. In such cases, it is imperative to use shape to improve the accuracy of object representation and recognition. For instance, a large number of shape descriptors have been proposed for remote sensing applications since the texture and color information acquired by remote sensors is not clear. In particular, the target is transient in the battlefield environment, so the speed of the target shape recognition directly determines the accuracy of the strike. Figure 1 shows the shape of a telemetry ship target. Owing to the above-mentioned advantages, the research of shape-based object description and recognition has been a hot research area in the field of image processing.

Fig. 1
figure 1

Optical sensing ship target, the upper part is the measured ship target optical sensing image, and the lower part is the binary processed shape image of ship

1.1 Related work

As the key technique of shape features, shape coding plays a significant role in the definition, representation and processing of visual objects [7]. Generally, shape coding methods can be divided into two categories: bitmap-based methods and contour-based methods. A bitmap-based method represents the object shape as a binary image and then adopts a binary image coding method, such as context-based arithmetic coding (CAE) of JBIG, JBIG2 and MPEG-4 [8,9,10,11,12]. Contour-based methods sequentially extract and code the contour curve of a visual object, which are usually based on chain code, polygon, spline and other curve-fitting approaches [13,14,15]. Typically, a curve fitting method could obtain higher coding efficiency, which is suitable for lossy coding rather than lossless coding. A number of techniques and algorithms for shape coding have been proposed. Liu et al. [16] proposed a shape coding method based on H.264 standard. Wulan et al. [17] extended the traditional chain code and proposed a shape coding method called vertex chain code. Lai et al. [18] focused on the quality evaluation and distortion estimation in shape coding based on B-spline fitting and then proposed a distortion measurement method combined with visual perception characteristics. In [19], a binary image shape coding method is proposed based on context arithmetic coding, which utilizes the local linear edge feature in the image contour and applies it to the context modeling of arithmetic coding to improve the accuracy and precision of context. The performance is proved to be better than several traditional methods. A shape representation and coding method based on image correlation is proposed in [20], which associates the object shape coding and decoding with the image data itself. Consequently, it could make full use of the correlation between the image content and the object shape, thus significantly improving the coding efficiency. Nevertheless, it also belongs to the lossy coding category which is not suitable for lossless applications.

In addition, some classical shape recognition methods, such as [16, 17, 21,22,23,24,25,26,27], all use dynamic programming for shape matching, which will undoubtedly increase the complexity of recognition and reduce the speed of recognition. From the identification time in Sect. 5 (experimental part), it is clear that the recognition time of these methods is too long to the transient battlefield requirements.

Recently, deep learning-based methods exhibit outstanding performance in object detection and recognition after data training. However, the practicability is restricted by two major factors, some image datasets are difficult to obtain, and existing public datasets are generally small. Therefore, feature-based shape description is particularly important. On the other hand, rather than conventional methods, Bicego and Lovato [16] proposed the biogenic informatics model to achieve two-dimensional shape recognition. The key idea is to code the shape contour into biological information sequences according to its spatial distribution, i.e., DNA molecular sequences, and then adopt some chain code matching calculation methods to analyze the similarity between sequences, thus achieving shape recognition and classification.

1.2 Essential problem of shape recognition

While the above shape coding algorithms improve the performance of recognition in various aspects, there is still room for further improving recognition accuracy and running efficiency. That is to say, they do not take into account the intrinsic problem affecting the performance of shape recognition. Neither shape coding nor shape description methods consider the impact of structural features in shape extraction during the whole process of shape recognition. Every single input in a system would play a vital role in the whole system, especially features in shape recognition. Conventional object recognition methods mainly focus on accuracy and precision without considering computational and executional efficiencies. To this end, the essential problem is explored and pinpointed as “structural restraints” within the existing methods, and the restraint of feature structure is detailed in this paper. Specifically, shape descriptors are usually poor in stability if they are directly extracted from edge contour point information from flat shapes. Consequently, it easily leads to unsatisfactory accuracy in feature matching. In addition, conventional descriptors are usually complex in feature structure, resulting in lower running efficiency. Therefore, the cause of the restraint of feature structure can be summarized as follows:

  1. 1.

    It is easy to produce unstable feature structures when the image plane is variable;

  2. 2.

    Overly complex shape descriptors usually pose large computational burden in the subsequent matching stage;

  3. 3.

    A strong local deformation in the shape contour easily evokes feature distortion. In this case, the distorted global feature may be completely mismatched in the matching stage;

  4. 4.

    For the vast majority of international general databases, the shape degree of shape contour, that is, the orientation of contour, is different. Shape features with the same restraints may produce different description abilities for different databases, thus reducing the overall effectiveness of shape descriptors.

1.3 Contributions

Mathematically, restraint is the condition that the solution of an optimization problem needs to meet. When analyzing some specific logic functions, we often encounter such a situation that the value of input variables is not arbitrary. Restrictions on the value of input variables become restraints. Reducing restraints is to change the relevant factors affecting the process in order to meet a specific result or requirement, that is, the input or output parameters of a process, and reducing the influence of these factors will have a positive effect on the expected results. Similarly, there may be a series of restraints in shape recognition analysis, including restraints generated by extracting shape features, restraints generated in the process of shape feature calculation and restraints generated in the process of shape matching, etc. Here, all factors that affect the accuracy and speed of shape recognition results in the process of shape recognition are collectively referred to as shape recognition restraints. Therefore, in shape recognition, we consider the new concept of reducing shape recognition restraints in order to improve the speed and accuracy of shape recognition.

The contributions of this paper are as follows:

  1. (1)

    The concept and method of structural restraint is first proposed in shape recognition, including the restraint of shape contour extraction and expression;

  2. (2)

    20-connected chain code based on twofold restraint reduction is adopted in shape recognition and the Hamming code distance is first used to measure shape similarity;

  3. (3)

    Experimental results verify that the method significantly improves the performance, which runs up to 500 times faster than the existing description methods.

The rest of the paper is organized as follows. In Sect. 2, related new concepts and research work of contour coding shape fast recognition with double reduction of restraints are introduced. Section 3 describes related research ideas and work of the proposed method and explains in detail the model of the shape recognition method proposed in this paper. In Sect. 4, the proposed method is tested on three public databases and compared with other existing shape recognition methods. In addition, the results of the proposed double reduced restraint method are compared with the same descriptor without reduced restraint. Finally, the conclusion and future outlook are given in Sect. 5.

2 Preliminaries

In this section, some essential problems affecting the speed and accuracy of shape recognition are identified. At the same time, the general form of chain code coding and the related process of shape contour coding in the field of shape recognition are introduced. Finally, we introduce representative shape descriptors and the advantages and disadvantages of shape features based on contour.

2.1 Constraint theoretic perspective

In this subsection, the concept of the proposed “structural restraint” is elaborated, including structural restraints, feature structure restraints, contour feature extraction restraints and contour feature expression restraints. The definition of restraint reduction is also expressed.

Structural restraints The process of shape recognition mainly contains three stages: shape feature description, shape feature calculation and shape feature matching. Therefore, a series of restraints that affect the performance (speed and accuracy) of shape recognition in the three stages are defined as structural restraints, which includes feature structure constraint, calculation structure constraint and matching structure constraint.

Feature structure restraint The feature structure restraint is the parent restraint in the feature description stage of shape recognition, and its four sub-restraints are contour feature extraction restraint, contour feature calculation restraint, contour feature expression restraint and contour feature matching restraint, respectively.

Restraints in contour feature extraction When shape is used as the shape feature in the process of shape recognition, shape contour is used as the input variable of shape descriptor in the process of feature extraction.

Restraints in contour feature expression After selecting a specific shape descriptor, a mathematical expression result (usually vector) can be obtained by using the shape description method. The buffering stage of obtaining shape contour as shape recognition and matching stage is defined as shape feature expression stage. Therefore, the contour feature expression restraint is defined as the sub-constraint of the contour feature output which affects the shape recognition performance after taking the contour as the restraint and selecting a specific contour feature description method.

Restraints reduction Define the initial condition parameters of all aspects that need to be reduced in order to achieve the best performance or optimal model in shape recognition and the limitation as reducing constraint.

Restraints reduction of contour feature In order to achieve the best performance or the best model in shape recognition, the feature initial conditions and restraints that need to reduce the influence of the result are defined as reducing restraints of contour feature.

2.2 8-Connected chain code

Chain code, also known as Freeman code, is utilized to describe curves or boundaries through the coordinates of starting points of curves and the direction codes of boundary points. As a popular tool, it is widely used to represent curves and region boundaries in image processing, computer graphics, pattern recognition and other fields.

The commonly used chain codes are divided into 4-connected chain codes and 8-connected chain codes according to the number of adjacent directions of the central pixel. 4-connected chain codes have four adjacent points, i.e., above, below, left and right of the center point, respectively. As shown in Fig. 2, compared with the 4-connected chain code, the 8-connected chain code has another four oblique directions. Since there are 8 adjacent points around (except at the edges) every single pixel in an image plane, the 8-connected chain code could exactly describe the information of the central pixel and its adjacent points, which makes it more popular than the 4-connected chain.

Fig. 2
figure 2

Schematic diagram of connected direction of 8-connected chain codes

When representing a curve, the starting point of curves is a prerequisite for chain codes. In 8-chain codes, the corresponding line segment lengths of odd and even digits(shown in Fig. 2) are different. The unit length of even and odd digits is set to 1 and \(\sqrt{2}\), respectively.

Original chain code

Starting from a starting point of the boundary (curve or line), observe the direction of each line segment clockwise, and represent it with the corresponding pointer of 8-chain codes. The result is a digital sequence representing the boundary (curve or line), which is called the original chain code. The original chain code has translation invariance (the index is not changed during translation), but when the starting point S is changed, a different chain code representation will be obtained, that is, it is not unique.

Differential chain code

Differential coding is also known as incremental coding. Differential coding refers to the code of a digital data stream in which every element, except the first element, is represented as the difference between that element and its predecessor. Differential chain codes and original chain codes have the same properties, that is, translation invariance and scale invariance. The computation of difference code is shown in Eq. 1.

$$\begin{aligned} \left\{ \begin{array}{l}M_N^{\prime }=\sum _{i=1}^N a_i^{\prime } \\ a_1^{\prime }=\left( a_1-a_i\right) , MOD N, i=2,3, \ldots , N \\ a_i^{\prime }=\left( a_i-a_{i-1}\right) , MOD N\end{array}\right. \end{aligned}$$
(1)

where the value of N is the number of difference codes, and \(a_{i}\) is the value of each sequence in the difference chain code. It can start from any code value in the difference code and recurse step by step in one direction.

2.3 Contour-based shape descriptors

Belongie et al. [28] proposed a description method of shape context (SC) according to the spatial position relationship of contour points. In this method, each point in the contour sequence is represented by a vector and the shape object is described by constructing a set of feature vectors. The method focuses on the spatial distribution of one point in the contour sequence and all the other sampling points in the contour sequence, and has the characteristics of large amount of information, strong description ability and robustness.

Fig. 3
figure 3

Framework of the proposed fast shape recognition method via a bi-level restraint reduction of contour coding

Based on the idea of Shape Context, Ling and Jacobs [29] proposed to use inner-distance shape context (IDSC) between contour points to replace Belongie et al. [28] in calculating the Euclidean distance adopted in shape context. The shortest path length of two sampling points in contour sequence through the shape is defined as the internal distance. This method is suitable for the identification of the target with flexible change, and good results are achieved in the experiment.

Biswass et al. [30] proposed a new shape index and retrieval framework based on the shape context description method of internal distance. The similarity between the shape to be retrieved and the shape in the database can be calculated efficiently by index, which improves the efficiency of the traditional shape matching algorithm based on dynamic programming.

In general, methods based on contour shape description have the following three advantages.

  1. 1.

    The overall and local two-dimensional shape information can be organically combined to describe the shape and the feature of structure more accurately.

  2. 2.

    It can be used with a variety of shape matching algorithms to flexibly adopt shape matching based on dynamic programming or shape index and matching based on dictionary.

  3. 3.

    It can simplify the shape feature structure and greatly reduce the complexity of the calculation process.

On the other hand, there are several shortcomings in these methods.

  1. 1.

    Shape contour feature extraction is not flexible enough, it merely considers the contour boundary information of the object that easily suffers from shape deformation. As a result, the extracted feature sometimes fails to reflect the object information accurately.

  2. 2.

    Once the extraction method of shape contour is fixed, only one feature expression result will appear in the feature expression stage due to the limitation of extracted contour features. In other words, the contour feature is limited by the extraction.

  3. 3.

    In general, there is local deformation within shapes. Nevertheless, all of the above contour descriptors are based on global features. In the matching stage, the accuracy of the global feature description will be affected by the local feature difference, thus greatly deteriorating the matching process.

3 The proposed method

As mentioned above, almost all contour coding-based shape recognition methods ignore the essence of “feature structure restraint.” Accordingly, it is difficult to optimize the performance by merely improving the coding method, while it may be highly efficient and accurate. Intrinsically, if the restraints that affect the performance of shape recognition in the shape description stage are reduced, even a simple shape contour coding method could benefit from it. In this section, the restraint problems in contour feature extraction and expression are analyzed in a constraint theoretic perspective, which produces a simple shape contour coding method based on a bi-level restraint reduction. As shown in Fig. 3, the proposed method can be divided into six parts, from the input shape contour to the output shape recognition result, which represents the complete step of shape recognition.

Assume a shape contour sampling point set \(P=\left\{ P_{1}, P_{2}, \ldots , P_{N_{i}}\right\} \), where \(N_ {i} \) is the number of sampling points. In this paper, the number of sampling points to 100. Then, the relative direction between the two sampling points on the contour is expressed as Eq. 2.

$$\begin{aligned} {\text {dir}}_{N_{i}}=\left\{ \overrightarrow{P_{1} P_{2}}, \overrightarrow{P_{2} P_{3}}, \ldots , \overrightarrow{P_{N_{i}} P_{1}}\right\} \end{aligned}$$
(2)
Algorithm 1
figure e

CC2RR

3.1 The first reduction of restraint for contour feature extraction

Figure 4 demonstrates a method of drawing still life in art teaching. First, the rough contour of the object is outlined by some line segments. Then, the length of each line segment is gradually reduced. The process is iterated until the length of the line segment approaches zero. In this way, the contour of the object varies from line segments to a series of points, which is termed as the approximate contour of the actual object.

As shown in Fig. 4a, a semi-finished product is outlined with line segments. After being repeated outlined, the sketch acquires vivid details in Fig. 4b. It is still easy to distinguish the object like a teapot in the early stages (Fig. 4a). From another perspective, all the curves in Fig. 4b are replaced by straight lines in Fig. 4a. Consequently, no matter how a rough contour is iterated to fine (i.e., no matter what the curvature or smoothness of the actual arc segment represented by the straight-line segment is), it is still obvious that the result of the next drawing work is indeed a teapot according to the rough contour in Fig. 4a. If the fine outline in Fig. 4b is adopted as a baseline, the depiction of the object might be completely limited from the beginning. Since the parameters such as curvature and smoothness of the arc segment of the object are fixed, it is impossible to use the course contour with fewer restraints to draw instead. This limitation undoubtedly reduces the plasticity of the object.

Fig. 4
figure 4

Object sketch, a is the semi-finished rough contour of object sketch, b is the finished fine contour sketch

Inspired by the rough contour in sketch drawing, the proposed work adopts the idea of extracting rough contour as input in the feature description stage, thus reducing the restraint of contour feature extraction, which can be termed as restraint reduction of contour feature extraction.

In actual shapes, a set of shape contour points can be obtained by uniform sampling of a shape, while the contour coding is to code the extracted shape contour. Therefore, the extraction of shape contour features (evenly sampling a set of points) is earlier than coding the contour. To reduce the deformation of different local contours corresponding to the same shape, the idea of extracting shape rough contour is adopted. Meanwhile, the direction code value of the last contour point relative to the previous one is no longer calculated in the coding process. It is the rough contour segments formed by the fourth contour point and the current contour point, that is, the code value of the line segments is the relative direction between the current contour line segment and the next contour line segment. In other words, the direction of \(\overrightarrow{P_{1} P_{2}}\) convert to \(\overrightarrow{P_{1} P_{4}}\), the direction of the whole contour \(\hbox {dir}_{N_{i}}\) convert to \(\hbox {dir}_{N_{i}}^{R}\), It can be expressed as Eq. 3.

$$\begin{aligned} {\text {dir}}_{N_{i}}^{R}=\left\{ \overrightarrow{P_{1} P_{4}}, \overrightarrow{P_{4} P_{7}}, \ldots , \overrightarrow{P_{N_{i}-3} P_{N_{i}}}\right\} \end{aligned}$$
(3)

In this way, the influence of local deformation of the original contour arc corresponding to the line segment on the shape features will be greatly reduced. In this case, local rough contour features could be obtained while keeping the overall shape features unchanged. In other words, the selection of contour points to form the rough contour segment would be based on the recognition of the shape rough contour. The reason why four points are selected as one contour segment in this method is obtained from the experimental results. Corresponding comparative experiments were conducted, respectively. For the three main datasets, the recognition accuracy would be reduced if less than four points and more than four points were selected. In addition, choosing less than four points will slow down the recognition speed. The specific experimental result and analysis are shown in Sect. 5.6.1 in this paper, the idea of selecting the rough contour segment of the current contour point and the fourth contour point is derived from experimental restraints, which is undoubtedly optimal for the current experiment. Moreover, the restraint of the previous fine contour will also be greatly reduced, and the problem of insufficient stability in the case of local deformation of the contour could be alleviated to a certain extent.

Figure 5 shows the contour obtained after the extraction of contour features with restraint reduction. Obviously, although Fig. 5 shows the contour extracted after the reduction of constraint, it is still explicit that the object represented by the shape is bone, as the two teapots shown in Fig. 4.

Fig. 5
figure 5

Contour features of bone were extracted after first reducing restraint

Fig. 6
figure 6

Diagram of 20-connected chain code and corresponding feature coding of part of bone rough contours, the rightmost column is the coding value of the local contour

3.2 20-Connected chain code for shape contours

The conventional 4-chain code and 8-chain code are both suitable for simple two-dimensional shapes, while complex shape contours must describe themselves with more direction codes. Therefore, 20-direction chain code is selected in this paper to code shape contour. In the experiment, we select 16-chain code, 20-chain code and-24 chain code, respectively, for shape and contour coding. According to the experimental results, it is found that the selection of 20-direction coding is the best choice for the international general datasets used in the experiment, and the best performance results can be obtained. The specific experimental results and analysis are shown in Sect. 5.6.2

The chain code can be divided into primary chain code, differential chain code and normalized chain code. The requirements of shape recognition in all aspects, that is, the shape representation method should meet the requirements of translation invariance, scale transformation invariance and rotation invariance. It is found that the normalized difference chain code can meet the requirement of shape descriptor for shape recognition, and it has invariance of translation, scale and rotation. Therefore, it can be used as feature variables to describe shape feature and form corresponding shape coding descriptors.

The connected direction of the 20-direction chain code proposed is shown in Fig. 6a. The 20-connected directions corresponding to the current rough contour lines are divided into 20 regions, respectively. The position area of each possible next rough contour segment is represented by letters from A to T. During the calculation, the letters in the 20 directions correspond to natural numbers from 0 to 19, respectively. The starting direction is set as the right direction of the image, that is, the coding of the rough contour segment is represented by the relative position between the next rough contour segment and the current rough contour segment. By coding the rough contour segment, a shape’s feature coding can be obtained (Fig. 6b). The coding direction is counterclockwise and the starting point of coding is the rough contour segment closest to the upper left of the contour part. All rough contour segments on the shape are coded in the clockwise direction according to the above rules, and finally return to the initial line segment.

According to the contour feature, the restraint reduction is extracted, and the 20-connected chain code value between contour points is called \({\text {Code}}^{R} \approx {\text {dir}}_{N_{i}}^{R}\), expressed as Eq. 4. Where \(M_{i}\) represents the number of directions for the current shape contour. Due to the variability of the encoding directions in shape contours, it is highly likely that they may not precisely match the given set of 20 encoding directions. Therefore, we draw inspiration from the description of directional encoding in reference [31] and provide more accurate encoding rules based on mathematical theory. The rules are as follows: For the 20 directional code values, we define an angle \(\theta \) to represent the directional range of a specific code value dir. If the local shape contour encoding direction Code falls within this range, specifically \(- \theta /2 \le \theta _\textrm{Code} < \theta /2\), we consider the encoding direction of that local shape contour to be dir. In Eq. (4), we use the approximate equality symbol to denote the relationship between the local contour direction and the 20 directional code values.

$$\begin{aligned} {\text {Code}}_{M_{i}}^{R}=\left\{ C_{\overline{P_{1} P_{4}}}, C_{\overline{P_{4} P_{7}}}, \ldots C_{\overline{P_{N_{i}-3} P_{N_{i}}}}\right\} \nonumber \\ \approx \left\{ {\text {Code}}_{1}, {\text {Code}}_{2}, \ldots , {\text {Code}}_{M_{i}}\right\} \end{aligned}$$
(4)
Fig. 7
figure 7

Differential chain code of the bone’s local shape rough contour

According to the coding direction in the left figure in Fig. 6, the direction coding of partial rough contour features of bone in the right figure is: CEBSOQSSSSSSSTTQNLN, the code is the 20-connected feature chain code of the rough contour of the local area. From the original chain code, the difference code can be calculated according to Eq. (1). It only needs to set \(N=20\) in the formula. The specific calculation process is shown in Fig. 7. The difference code of the local rough contour can be obtained as CRRQCCAAAAAABAPRSCG.

In general, the difference chain codes need normalization. For the proposed 20-direction chain codes, it is not possible to use letters to form natural numbers and find their minimum value. Therefore, this paper converts the letters corresponding to the direction into binary, and then uses the binary values corresponding to the letters in the base 32-bit system to make the binary values form natural numbers and minimize them. Therefore, the binary values corresponding to the above difference chain codes are \(00010, 10011, 10011,\ldots , 01000\). Finally, similar to the normalized difference code calculation Formula (2) of the 8-direction chain code, the normalized difference chain code corresponding to the local contour is AAAAAABAPRSCGCRRQCC.

3.3 The second reduction of restraint for contour feature expression

After obtaining the differential chain code of the shape contour, the feature of the shape can be obtained as well. The extracted rough shape contour can reduce the influence of small deformation on the recognition result to a certain extent. Nevertheless, the shapes and deformations in the popular datasets are changeable in size. After the normalized difference coding of 20-direction chain code, there might be mismatching of local feature directions, that is, restraint of contour feature expression. As a result, it would greatly reduce the matching rate due to the mismatching of these local feature directions in the calculation process. In fact, these local feature mismatches do not affect the judgment of the overall shape of the object in the actual situation. In addition, the shape feature will be fixed after feature expression, which would then be input into the matching stage to obtain the unique matching result. Therefore, in the shape feature expression stage, the coding direction can be used to approximate fit the direction value of the local contour (code value). The rule of the direction fitting is that the current rough contour direction has the same descending restraint direction as the former rough contour direction and the latter rough contour direction, respectively, so as to reduce the influence of the obtained contour coding feature on the matching result when it enters the final matching stage and improve the result performance of shape recognition. This process is called contour feature expression reduction restraint.

In what follows the contour feature expression is described in more details. Figure 8 shows the process of the rough contour obtained from two images in the public datasets after reducing the restraint of contour feature extraction. They represent a bone as the same object. However, there are still similar but mismatched local directional features in the calculated differential chain codes. The reduction of restraint for contour feature expression (direction fitting) is used to deal with this case. As shown in Fig. 8, according to the 20-direction difference chain code, there is only one direction value difference between the rough contour directions of the two pictures at the local corresponding contours ①, ②,and ③, respectively. Specifically, they are F to E, F to G and R to S. Therefore, in the feature expression stage before the matching of Figure a and Figure b in Fig. 8, local contours (①, ②, ③) are first, respectively, fitted into the same feature direction code as the current query graph, which is used as the final shape feature for matching recognition. Finally, the recognition result can be obtained after reducing the restraint of feature expression.

The restraint reduction is expressed according to contour features. The restraint reduction of the directional code value between the current contour points is reduced to the former directional code value and the later one. It is expressed as Eq. 5.

$$\begin{aligned} {\text {Code}}_{M_i}^{R R}=\left\{ \begin{array}{l} {\text {Code}}_{M_i-1} \\ {\text {Code}}_{M_i} \\ {\text {Code}}_{M_i+1} \end{array} \text{, } \text{ when } \text{ Code } ={\text {Code}}_{M_i}\right. \end{aligned}$$
(5)
Fig. 8
figure 8

Two bone shapes correspond to fine contour and three local similarity directions based on the reduction of restraint for contour feature expression

Table 1 Rule of reducible restraint code corresponding to the current code in chain code

The rules for reducing restraint of contour expression are shown in Table 1. The first row is the code value of the current direction, and the second row is the code values that can be reduced to the current, which is termed as the reducible restraint code. In this paper, the step of the reduction of restraint is set as 1 unit direction. The reason for setting the constraint code with step of 1 comes from the experiment ( Sect. 5.6.3, Table 9). We choose step size of 0, 1, 2 and 3 to carry out experiments, respectively, and find that the recognition accuracy will be significantly reduced. Therefore, for the datasets tested, the step of 1 can obtain the optimal result. The specific experimental methods and results are shown in Table 8.

3.4 Shape matching

Shape matching is the process of similarity correspondence after obtaining the final shape features. Conventional studies usually calculate the Euclidean distance between two shapes to judge whether two shapes are similar or not. Nevertheless, shape features based on contour coding described in this paper are relatively simple. If Euclidean distance is still used for similarity measurement, the effectiveness of coding cannot be reflected. What’s more, the low identification accuracy of Euclidean distance calculation method in this paper is verified by experiments (see Sect. 5.6.4). Hamming code is a linear block code, which divides the information coding sequence into sequence segments of length K, whose structure is similar to the feature sequence code proposed in this paper. The Hamming code distance refers to the number of bits with different code values between any two code words in a code set. That is, the smaller the Hamming code distance between two code groups, the greater the similarity between two code groups, which could indicate the shape similarity of the corresponding contour coding features in this paper. Therefore, this section refers to the calculation method of Hamming code distance and uses Hamming code distance as the shape similarity measurement. In the calculation process, the binary sequence corresponding to the contour direction code word is still used. For example, the letter B corresponds to the binary sequence 00010, which simplifies the calculation process of Hamming code distance between sequences. The Hamming code calculation formula of contour coding feature sequence based on double reduction of restraint is shown in Eq. 6.

$$\begin{aligned} \hbox {dis}_{CC2RR}\left( {x,y} \right) = {\sum {x\left[i \right]\bigoplus }}y\left[i \right]\end{aligned}$$
(6)

Here, \(i = 0,1,\ldots ,n - 1\) are n-bit sequence codes, and \(\oplus \) stands for XOR operator.

4 Computation complexity

Computational complexity in shape recognition refers to the complexity of shape description method in feature calculation. The proposed shape contour coding method is a mathematical expression method with low computational complexity. Coupled with the processing of the first reduction of restraint, the amount of input computational data has greatly reduced; thereby, it has low computational complexity. When the input data are a certain shape contour point \(N_{i}\), the reduction of restraint method proposed also reduces the amount of calculated data to a certain extent. Compared with other shape descriptors with the same calculation method, the proposed approach has the lowest computational complexity, which also becomes an important prerequisite for fast recognition. Table 2 lists the computational complexity of some representative shape description methods. T represents the number of iterations of the method. In LCDP, T is less than 50. In Co-transduction and LP, T is 5000. N represents the number of contour points that make up the rough contour. In CC2RR, n is 3.

Table 2 Some representative shape description methods feature computational complexity

5 Experiment and result analysis

To evaluate the performance of the proposed shape recognition method, the corresponding shape descriptors are tested on three popular datasets for shape recognition, the MPEG-7 Part B dataset, leaf dataset and Kimia99 dataset. There are different types of 2D shape maps in each dataset, which can be used for validating the running efficiency and recognition accuracy. In this section, all methods used for comparison are implemented in MATLAB and executed on a laptop with an Intel Core i7-4850 CPU (2.30 GHz) and 8GB RAM.

5.1 Performance on MPEG-7 Part B

The MPEG-7 Part B dataset is widely used in shape recognition and retrieval classification [17, 18, 32, 33] which contains a total of 1400 shape templates (70 classes, each containing 20 shapes). It is very helpful for the study of shape features and is popular in many classical descriptors. Figure 9 shows some of the shapes.

Fig. 9
figure 9

Partial shapes in the MPEG-7 Part B dataset

In the evaluation of retrieval accuracy, the Bullseye test method was used to test the retrieval rate of the descriptors proposed. Specifically, for each query shape, it calculates the percentage of shapes belonging to the same class in the top 40 results sorted by similarity. Table 3 lists the experimental results of the proposed method as well as some representative shape descriptors.

Table 3 Comparison of classification accuracy on MPEG-7 Part B dataset

It can be seen from Table 3 that the proposed shape descriptor outperforms the other seven state-of-the-art methods under the same test environment. Compared with IDSC, which shows a better overall performance of retrieval rate and matching speed, the accuracy of the proposed descriptor improves by 1.4\(\%\). The reason lies in the bi-level restraint reduction on the feature description stage. In addition, since the proposed feature description method of contour coding is more concise in feature structure, it is more than 300 times faster than IDSC in terms of matching speed. It is also worth noting that while BioClass [38] shows a higher retrieval rate, its speed is extremely slow. Overall, the method proposed in this paper is more robust.

5.2 Performance on Swedish leaf

The Swedish leaf dataset is a collection of plant leaf images, containing 15 classes with 75 kinds of shapes (1125 shapes in total). Since the contour in the Swedish leaf dataset is relatively diverse, while the overall contour similarity is strong, it is proper to verify the robustness of the proposed shape recognition method on this dataset. Figure 10 shows several shapes in this dataset.

Fig. 10
figure 10

Several shapes in Swedish leaf dataset

In the evaluation of recognition accuracy, the leave-one-out method based on K nearest neighbors is adopted. K shapes with the highest similarity to each shape(excluding itself) are found, and the category of this shape is recorded as the category with the highest frequency of occurrence among the K shapes. Finally, the number of successful shape recognition (classification) is counted to obtain the overall recognition accuracy. The experimental results of some representative shape descriptors tested in this dataset are shown in Table 4, including our method.

Table 4 Comparison of classification accuracy on Swedish leaf dataset

Table 4 lists the experimental results. It can be seen that the classification accuracy rate on this dataset is higher than that of the MPEG-7 dataset. It is mainly because the dataset has relatively few shapes and the overall contour of the shape is more apparent. As a result, the proposed method performs better than other shape description methods, excluding CBW [34] which is outstanding in leaf shape classification. Although CBW has a little bit better performance in accuracy rate, the proposed method shows a marked improvement in running efficiency, which is more than 500 times faster than CBW.

5.3 Performances on Kimia99

Apart from the above two datasets, Kimia99 [17] is also popular for shape recognition. A large number of shape descriptors in the literature use Kimia 99 as the test database. It contains 9 classes with 11 shapes (99 shapes in total). Since the dataset is small in size, it is not suitable for shape recognition methods requiring multiple learning in the early stage. On the other hand, it has comprehensive shape contours that are very suitable for validating contour-based shape description methods. All shapes in the dataset are shown in Fig. 11.

Fig. 11
figure 11

All shapes in the Kimia99 dataset

During the evaluation, each shape in the database is set as a query in turn, and similar shapes are then identified among the remaining shapes. As for the results, the number of correct hits from the first most similar shape to the tenth most similar shape for each query is counted. The final statistical results are used to evaluate the performance of the descriptors.

Table 5 Average hit rate of the 10 most similar shapes on the Kimia99 dataset

The experimental results of some representative shape descriptors tested in this database are shown in Table 5. It can be seen that the contour coding reduction method proposed in this paper still has relatively higher accuracy and matching efficiency on the Kimia99 dataset. It is also worth noting that the reason for the higher accuracy than the Mpeg-7 dataset lies in its fewer shapes and comprehensive contour distribution. In addition, the overlapping of shape features is cut by the bi-level restraint reduction, which makes them easier to identify.

5.4 Performance on the occluded shape dataset

Since the datasets tested are deformed to some extent in contour, the proposed method in this paper is tested in the noise datasets itself. The above experiments show that its anti-noise ability is stronger than some other classical methods. Therefore, this section only analyzes the occluded performance. Since there is no public dataset of occlusion shape, we constructs an occluded shape dataset ourself, which is named FEW100 [41]. The shapes in this dataset come from the open remote sensing shape dataset, and we occluded them artificially. In the shape matching stage, we also employ the Hamming code distance to measure the similarity between occluded shapes [see Eq. (6)]. There are five classes of shapes (aircraft, ship, storeroom, helicopter, oil tank), twenty shapes in each class. All the shapes are shown in Fig. 12.

Fig. 12
figure 12

Leftmost column is a schematic diagram of the target shapes artificially occluded (to verify the effectiveness of occlusion resistance), and the remaining part is 100 single remote sensing target shapes collected from the Internet and remote sensing image databases

The test results are shown in Table 6. Its performance on this dataset is better than that of some other classical methods. It can also be seen from the experimental results that the method in this paper has stronger anti-occlusion ability.

Table 6 Performance on FEW100 dataset

5.5 Influences of different restraints

To clarify the effectiveness of the proposed method in contour coding shape recognition, a comparative experiment is conducted on the recognition accuracy between the normal restraint and the reduction restraint. The experiment contains four parts: normal restraints, only the first reduction of restraint, only the second reduction of restraint and bi-level restraint reduction. All experiments are carried out in the above three shape datasets. The experimental results are compared in Fig. 13. It can be clearly seen from the experimental results that the recognition accuracy is low under normal constraints, and the results with only one reduction of restraint are improved but not satisfactory, while the results of bi-level restraint reduction are greatly improved. Consequently, the effectiveness of the proposed contour coding method based on double reduction of restraint is validly proved.

Fig. 13
figure 13

Experimental results of different times reduction constraints are compared in three databases, the abscissa represents three different international general shape recognition datasets, and the ordinate represents the recognition accuracy after reducing different times restraint

5.6 Supplementary experiments

In order to better illustrate the effectiveness of the proposed method, supplementary experiments are conducted. The details are as follows.

5.6.1 Fourth point as the restraint reduction of contour extraction

The degree of restraint reduction here depends on the category difference of shapes in the dataset and the deformation degree of shape contours. As we all know, the greater the degree of deformation, the stronger the difference between the two same shapes, the worse the recognition accuracy. Therefore, the reduction constraint is to reduce the influence brought by deformation, but it must ensure that the shape features will not become very similar due to a high degree of reduction constraint, because this will greatly reduce the recognition accuracy of the algorithm.

To prove the correctness of selecting 4 in Sect. 3.1 as the step of reducing restraint for contour extraction, the authors conducted an ablation experiment in this section. Step of 3, 4 and 5 was selected, respectively, to conduct experiments in the three datasets with other conditions and parameters unchanged. The experimental results are shown in Table 6. It is obvious from the experimental results that selecting 4 is the optimal choice in the datasets. It is worth mentioning that experiments on step 1, 2 and 6 were also conducted, and the performance was not as good as 4.

Table 7 Recognition accuracy (%)/matching time (s) of different step in reducing restraint for contour extraction

5.6.2 20-Connected chain as the number of coding direction

Generally speaking, the existing coding directions are 8 direction and 16 directions, but the shape contour of the dataset in this paper is some complex, and the recognition accuracy will be greatly reduced if the feature quantity of the contour is greatly reduced. Therefore, according to the experimental results, we choose a more complex 20-direction chain code.

To prove the correctness of selecting 20-connected chain code in Sect. 3.2 for contour coding, ablation experiments were conducted. With other conditions and parameters unchanged, 8-connected chain code, 16-connected chain code, 20-connected chain code, and 24-connected chain code were selected, respectively, to conduct experiments in the three datasets. The experimental results are shown in Table 7. It is obvious from the experimental results that selecting 20-connected chain code in the datasets is the optimal choice.

5.6.3 Former and latter directions as the restraint reduction of feature expression

Like the first layer of reducing restraint, the degree of reduction restraint directly affects the difference of contour features, while the shape contour in the dataset are complex to some extent. Therefore, restraint cannot be reduced too much, which will greatly reduce the shape discriminability and ultimately affect the recognition accuracy.

To prove the correctness of selecting 1 in Sect. 3.3 as the step of reducing restraint for feature extraction, ablation experiments were conducted. With other conditions and parameters unchanged, step of 1, 2 and 3 was selected, respectively, to conduct experiments in the three datasets. The experimental results are shown in Table 8. It is obvious from the experimental results that selecting 1 as the step of reducing restraint in the datasets is the optimal choice (Table 9).

Table 8 Recognition accuracy of different number of direction in reducing restraint for contour coding
Table 9 Recognition accuracy of different step in reducing restraint for feature expression (matching time does not change)

5.6.4 Hamming code distance compared with Euclidean distance

The reason why Hamming code distance is selected in the feature matching stage is that the shape features of the proposed method are represented by coding. The shape feature that is not represented by code cannot measure its hamming code distance. To verify the superiority of using Hamming code distance compared with Euclidean distance, they are used, respectively, to measure the similarity of the encoded features. The experimental results in Table 10 clearly support this view. The reason for the analysis is that the shape features used Hamming code distance for similarity measurement before the code is transformed to binary code value, which has a higher resolution than converting it to decimal code for Euclidean distance measurement.

Table 10 Hamming code distance compared with Euclidean distance on the MPEG -7 Part B dataset

5.6.5 Relationships between reduction degree of restraint and corresponding accuracy and efficiency

Two reduction restraints were carried out in this paper. The first was contour extraction stage (Sect. 3.1), and the second was feature expression stage (Sect. 3.3).

It can be seen from Table 6 that the first reduction restraint will not only affect the recognition accuracy but also the matching time. With the increase in the degree of reduction restraint, the time will undoubtedly decrease. Because the reduced restraint degree in this stage directly affects the volume of shape features, the less the structure of the feature, the faster the matching time. The reduction constraint in this stage will also affect the recognition accuracy. It can be seen from the recognition results of steps 1 to 6 that the comprehensive recognition performance of the datasets all is the best when the step is 4. So this is the optimal degree of the first reduction constraint.

However, the second restraint reduction only affects the recognition accuracy. With the increase in the restraint degree (from 0 to 3), the best performance is achieved when the step is 1. And increasing the degree of reducing restraint will significantly reduce the recognition accuracy. So this is also the optimal degree of the second reduction restraint. It is worth mentioning that when the step is 0 (without reducing restraint), the result is lower than that after reducing restraint, which further explains the effectiveness of reducing restraint at this stage.

6 Conclusion

This paper first points out that the essential problem affecting the accuracy and speed of shape recognition is “structural restraint,” which is represented by the characteristic structural restraint. Accordingly, a fast contour coding shape recognition method via a bi-level restraint reduction of contour coding is proposed to reduce two major restraints of extraction and expression of contour features. In terms of shape contour coding feature, which possesses the advantage of structural simplicity, the main downside is the problem of feature structure restraint. Therefore, the efficiency and robustness could be improved significantly after reducing the restraint. Experimental results prove that this method not only improves the accuracy of shape recognition but also greatly improves the matching speed in the matching stage. In addition, it is more than 500 times faster than some representative shape description methods, which provides greater possibilities for practical applications. Future work will focus on the essential problem of “structural restraints.” The influence of computational structural restraints and matching structural restraints will be intensively studied, thus improving the performance of shape recognition, as well as solving the intrinsic problems with the relevant concepts of restraints in this field. Besides, this paper provides a theoretical basis for computer vision engineering.