Abstract
In this paper, two new accelerated and robust template matching and object detection algorithms are established. The algorithms retain the accuracy of relevant well-known standard and recent schemes while remarkably reducing the execution time. Instead of dealing with images in grayscale or color representations, the proposed methods measure resemblance on extracted binary images. The two techniques are based on the φ-correlation coefficient and on logical circuits, respectively. Both techniques enhance the accuracy, compared to two classical and two recent methods, while giving very efficient running times. The algorithms are carried out on two different datasets used in state-of-the-art methods, with performance and comparison analysis. The robustness of the proposed algorithms against real and artificial noise is examined and assured as well.
1 Introduction
Template recognition, matching, and tracking, as well as object detection, are major signal and image processing techniques with many applications in different disciplines, see e.g. [6, 17, 27, 39]. Such applications include, but are not limited to, detection of colonic polyps [22, 23], detection of anatomical landmarks in brain Magnetic Resonance (MR) images [47], identification of film defect types [53], fingerprint matching [20, 46], analysis of stock market behavior [16], detection of ventricular tachycardia [8, 9, 19, 24], detection of cancerous masses in various types of images [2, 5, 13, 38], and detection of defects in printed circuit boards (PCB) [3, 4, 10]. See Table 1 for a complete list of the acronyms used in the paper.
In more detail, Kilic et al. [23] introduced a computer-aided detection system that detects colonic polyps in computed tomography images using cellular neural networks, a genetic algorithm and a three-dimensional (3D) template matching (TM) technique based on fuzzy-rule-based thresholding. For this purpose, three different templates are created genetically and implemented in the 3D-TM algorithm. See also [22].
Yoon et al. [53] presented an effective defect inspection system that identifies film defects and determines their types in order to produce polarized films for TFT-LCD (thin film transistor-liquid crystal display). The proposed system is designed and implemented to find defects from polarized film images using image segmentation techniques and to determine defect types through the image analysis of detected defects using TM techniques. They extracted features of the defects such as shape and texture, and compared them to the features of referential defect images stored in a template database.
Uz et al. [46] established an automated fingerprint matching system for low-quality fingerprints, which may also be affected by various distortions due to the acquisition filters and other enhancement procedures. To account for within-class variations, their approach captures multiple enrollment impressions of a finger. Combining minutiae information from multiple impressions of the same finger increases the coverage area, retrieves missing minutiae, and removes spurious minutiae. A super-template is thus produced for each finger for TM applications. Passos et al. [40] established an eye detection system using an ensemble of weak classifiers based on a correlation filter, and Peng et al. [41] presented a corner detection and scale estimation algorithm.
In [5], Bator and Nieniewski implemented the correlation coefficient to detect cancerous masses in mammograms, and Osman et al. [38] used a 3D convolution TM approach to detect lung nodules, see also [13]. In the work of Ambrosini et al. [2], spherical tumor appearance models are created so that they match the expected geometry of brain metastases, while accounting for partial volume effects and offsets due to the cut of the MR image sampling planes. A 3D normalized cross correlation (NCC) then measures the similarity between the brain volume and the created spherical templates of varying radii to detect the positions of the brain metastases.
Kurosaki et al. [24] used the NCC as a similarity measure of an automated TM system to detect Tachycardia of the right ventricular outflow. Ciaccio et al. [8] adapted a TM technique based on the quantification of beat-to-beat changes in electrograms to locate functional reentrant circuits that are relatively stable and cause monomorphic ventricular tachycardia, cf. [9, 19].
Aggarwal and Kumar [1] used a convolutional neural network to classify image surface texture. In [11], Dastanova et al. presented the hardware implementation of a novel algorithm for moving-object detection, which can be integrated with complementary metal oxide semiconductor (CMOS) image sensors. Bit planes of consecutive frames are stored in memristive crossbar arrays and compared using threshold-logic XOR gates. The resulting outputs are combined using weighted summation circuits and thresholded using comparators to obtain binary images.
In this paper two new and fast TM and object detection (OD) algorithms are established. We implement two different approaches using the φ-correlation coefficient and logic circuits. The proposed techniques are carried out on the bit-plane slices of grayscale or color images. While these approaches reduce the execution time, they retain the accuracy of relevant state-of-the-art methods. In addition, both proposed techniques show robustness against various types of noise. The next section gives a brief account of closely related works as well as some basic mathematical formulations. The proposed techniques are introduced in Section 3, which contains two subsections: one establishes a TM and OD algorithm via the φ-correlation coefficient, and the other uses Boolean functions between bit-plane slices to establish TM and OD schemes. Section 4 is devoted to the performance analysis in comparison with four state-of-the-art techniques. We conclude that, while the TM execution time is remarkably reduced, accuracy and robustness are maintained. To sum up, the main contributions of this paper are:
- Introducing two novel, fast and robust TM and OD techniques.
- Computing the running time in comparison with relevant techniques.
- Comparing accuracy with four relevant techniques.
- Investigating and assuring robustness against various types of noise, including occlusions, geometric deformation, illumination variation and change of background, as well as artificial noise.
2 Related works
In image TM techniques, with which this work is concerned, a reference image f(x,y) is given together with a template image T(u,v) to be detected and matched. Usually, the association between T and a sub-image \(f_{c}\) of f is measured at every pixel of f(x,y), where the sizes of \(f_{c}\) and T coincide. Consequently, the template \( T\equiv f_{c}^{*}\), where \(f_{c}^{*}\) is the sub-image with the highest association, or similarity, with T.
The NCC and the zero-mean normalized cross correlation (ZNCC) are considered basic measures of association in TM algorithms, cf. [6]. However, due to their very high computational cost, many techniques have been derived to reduce this cost as well as to enhance accuracy, cf. e.g. [12, 14, 33, 43, 50]. To avoid the inaccuracy of the NCC, Choi and Kim [7] gave a two-stage template matching method for rotation and illumination invariance. In the first stage the matching candidates are selected using computationally low-cost features, while in the second a rotation-invariant TM technique is performed on these candidates using Zernike moments. Lei and Zhang [26] also used an adaptive low-cost ZNCC approach to derive a rotation-invariant TM algorithm, provided that the rotation angle is limited to the range [−20°, 20°], cf. [15, 28, 42].
In many fast TM algorithms based on NCC and ZNCC, researchers implement partial eliminations to reduce the number of compared pixels at which the similarity between T and \(f_{c}(x,y)\) is measured. For instance, Muramatsu et al. [35] speed up the application of the NCC as a measure of association in TM algorithms by shrinking the compared images in a way that retains the main features. This is done in [43] for NCC computations and in [44] for ZNCC computations by neglecting pixels whose correlation with T falls beyond a certain bound. Di Stefano et al. [44] established a fast ZNCC-based TM algorithm using bounded partial correlation, and Mattoccia et al. [33] implemented this technique for TM on multichannel images. However, as noted in [31], due to the non-monotonicity of the NCC or ZNCC over pixels, the fast elimination techniques may not speed up the search algorithm. In addition, the search may suffer premature termination at pixels that cannot compete with the best match locations, cf. [31]. These issues are treated in [31] by creating a monotonic formulation over pixels, see also [30] for other elimination-based algorithms.
As indicated above, partial elimination methods are useful in many TM techniques, as in [33]. Lee and Chen [25] implemented the technique of [43] with the help of the box-filtering technique of [34] to derive a rotation-invariant TM technique. In [37], a chamfer TM method is established using image segmentation. Super-pixels are used in [51], and a search that compares target signatures using the NCC is implemented in [36]. In [50] the authors enhanced computation on tensor cores by using low-level descriptions together with a local normalization. In [21] the TM procedure is accelerated using an integral image. See [3] for computing the NCC for images on 1-D feature vectors and [52] for computing the NCC using an addition-based criterion on 1-D vectors.
In the present work novel approaches to establish fast TM and object detection algorithms are created. Instead of dealing with images in their gray or color representations, the algorithms are developed by measuring similarity/dissimilarity between binary processed images. If \({f_{c}^{r}}(x,y)\) and \(T^{r}(u,v)\) represent the r-th bit-plane images of \(f_{c}\) and T respectively, then we measure the association between \(f_{c}\) and T by computing the association between \({f_{c}^{r}}, T^{r}\). If the image and template are in 8-bit gray scale, then \({f_{c}^{7}}, T^{7}\) and/or \({f_{c}^{6}}, T^{6}\) are compared, as they bear the major properties of the images. In this respect, we use the low-computation φ-correlation coefficient [48] instead of the computationally expensive NCC or ZNCC. Moreover, a suitably chosen Boolean circuit is implemented as another measure in a second TM algorithm. Both techniques are computationally fast and give accurate results compared to known relevant techniques. It is worth mentioning that both the φ-correlation and Boolean functions are efficiently implemented for defect detection in [4].
3 The proposed methods
This section presents two TM techniques based on the φ-coefficient and Boolean circuits.
3.1 A φ-correlation TM algorithm
It is known that correlation criteria such as NCC and ZNCC are effective tools in various pattern recognition applications, cf. [6, 43, 44]. Nevertheless, the implementation of NCC or ZNCC is based on pixel-by-pixel comparisons, and consequently it is computationally exhaustive. Therefore fast TM and OD techniques are required, see also [45]. Moreover, NCC and ZNCC measure association between data when they are linearly correlated. To overcome these disadvantages, the first proposed technique of this paper implements the φ-correlation coefficient as a measure of similarity/dissimilarity.
The φ-correlation coefficient is a measure of association for 2 × 2 tables, which measures association between data that are not necessarily linearly related. In addition, cf. [29], it overcomes the disadvantage that arises when the margins deviate largely. Using the φ-coefficient as the major measure of association remarkably reduces the matching time, as indicated below, while maintaining accuracy and robustness similar to state-of-the-art techniques. The main reason for the reduction in execution time of the matching procedure is that, instead of performing comparisons on the gray-level image f(x,y), we carry out the comparisons on bit-plane slices. Thus each pixel is represented by 1 bit only, 0 or 1, instead of n bits in the n-bit gray level. In the following, and without loss of generality, n is taken to be 8.
Let f(x,y) be a given n-bit gray scale image. Then f(x,y) can be sliced into n bit-plane binary images \(f^{r}(x,y)\), r = 0,…,n − 1, with the gray scale
It is well known [18] that \(f^{0}(x,y)\) contains all lowest order bits in the bit patterns comprising the pixels of f(x,y), while \(f^{n-2}(x,y), f^{n-1}(x,y)\) contain the higher order ones and preserve the major properties of f(x,y).
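The slicing can be illustrated with a short Python sketch. It assumes the usual binary expansion, in which the gray value equals the sum of \(2^{r} f^{r}(x,y)\) over r = 0,…,n − 1; the helper names `bit_plane` and `reconstruct` are hypothetical and images are represented as plain lists of rows.

```python
def bit_plane(image, r):
    """Return the r-th bit plane of an integer-valued image (list of rows)."""
    return [[(pixel >> r) & 1 for pixel in row] for row in image]


def reconstruct(planes):
    """Rebuild the gray-level image as the sum of 2**r * plane_r (assumed expansion)."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [
        [sum(planes[r][i][j] << r for r in range(len(planes))) for j in range(cols)]
        for i in range(rows)
    ]


# Tiny 8-bit example: slice into 8 planes and keep the most significant one.
image = [[200, 13], [128, 255]]
planes = [bit_plane(image, r) for r in range(8)]
most_significant = planes[7]
```

In this toy example the pixels 200, 128 and 255 have their top bit set while 13 does not, so the most significant plane already separates bright from dark pixels.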
Now let us describe how to use the φ-correlation on the binary images to carry out the TM process. Let I(i,j) be the reference image under examination, of size M × N pixels, T(k,l) be the target template of size m × n pixels, and \(I_{(x,y)}(k,l)\) be the sub-image of I(i,j) of size m × n located at pixel coordinates (x,y) in I(i,j), such that m < M and n < N. In real applications, we want to find the sub-image \(I_{(x,y)}(k,l)\) of the original image I(i,j) that is very close to the given template T(k,l). To formalize the proposed algorithm, suppose that the reference I(i,j) is such that 0 ≤ i ≤ M − 1, 0 ≤ j ≤ N − 1 and the template T(k,l) is such that 0 ≤ k ≤ m − 1, 0 ≤ l ≤ n − 1, where m,n are relatively small compared to M,N, respectively. We define \(I_{(x,y)}(k,l)\) to be the m × n sub-image of I(i,j) given by
where 0 ≤ k ≤ m − 1, 0 ≤ l ≤ n − 1. Thus, the domain of (x,y), where the block image \(I_{(x,y)}(k,l)\) is defined, is given by
As mentioned above, the r-th bit-plane images of \(I,T,I_{(x,y)}\) are denoted by \(I^{r},T^{r},I^{r}_{(x,y)}\) respectively, 0 ≤ r ≤ 7. Using (1) the r-th bit-plane image of I(i,j) is obtained via
Likewise, \(T^{r}(k,l),I^{r}_{(x,y)}(k,l)\) are sliced.
The current TM algorithm is based on measuring the association between the binary images \(T^{r}(k,l),I^{r}_{(x,y)}(k,l)\) at each pixel (x,y) of the domain (3). Then we select the optimum \((x_{0},y_{0})\) at which T(k,l) and \(I_{(x,y)}(k,l)\) nearly coincide. The φ-correlation coefficients are computed between \(T^{7}(k,l),I^{7}_{(x,y)}(k,l)\) for each pixel (x,y) of the domain (3). Indeed, we form the contingency table, with the similarity/dissimilarity values
i.e. \(\lambda_{pq}(x,y)\) are the cardinalities of the sets
Then, we define the margins, cf. [48],
Thus, we form the contingency table, Table 2.
The rest of the technique is based on computing the φ-correlation coefficient between \(T^{7},I^{7}_{(x,y)}\), which is
where none of the marginal numbers is zero.
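As a rough illustration, the φ coefficient of two equally sized binary images can be computed as in the Python sketch below. It assumes the standard textbook form of φ for a 2 × 2 table (the difference of the diagonal products divided by the square root of the product of the four margins); the function names, the key convention `(p, q)` and the correspondence between the row/column sums here and the margins μ1,…,μ4 are assumptions. The zero-margin case is only signalled with `None` and the redefinition used in the text is not reproduced.

```python
from math import sqrt


def contingency(a, b):
    """Count the four cases between two binary images.

    counts[(p, q)] is the number of pixels where a == p and b == q.
    """
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for row_a, row_b in zip(a, b):
        for p, q in zip(row_a, row_b):
            counts[(p, q)] += 1
    return counts


def phi_coefficient(a, b):
    """Standard phi coefficient of the 2x2 table; None if any margin is zero."""
    n = contingency(a, b)
    row0 = n[(0, 0)] + n[(0, 1)]  # pixels with a == 0
    row1 = n[(1, 0)] + n[(1, 1)]  # pixels with a == 1
    col0 = n[(0, 0)] + n[(1, 0)]  # pixels with b == 0
    col1 = n[(0, 1)] + n[(1, 1)]  # pixels with b == 1
    denom = row0 * row1 * col0 * col1
    if denom == 0:
        return None  # degenerate table: needs the separate definition in the text
    return (n[(1, 1)] * n[(0, 0)] - n[(1, 0)] * n[(0, 1)]) / sqrt(denom)
```

Only four integer counters and one square root are needed per window, which is where the saving over pixel-wise gray-level correlation comes from.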
To give a sense of the φ-correlation values in the proposed method, an example from dataset1 is displayed in Fig. 1. Figures 1(a) and (b) show a 10 × 10 patch in a printed circuit board (PCB) reference image and the corresponding template sub-image, respectively; Fig. 1(c) and (d) present the gray scale values of the two 10 × 10 image patches A1 and B1; and Fig. 1(e) and (f) show the binary values of patches A1 and B1, respectively, extracted using the 7th bit-plane slice. Finally, Fig. 1(g) shows the φ-correlation coefficients between the template image in Fig. 1(b) and 100 sub-images from the reference image in Fig. 1(a). The number in boldface is the maximum φ-correlation value. See Section 4 for the description of the datasets used in this paper.
In (8), the φ-coefficient overcomes the problem in percentage statistics when the margins are extremely scattered. As in the case of the NCC, \(\varphi (I^{r}_{(x,y)},T^{r})=1\) expresses the strongest correlation between \(I^{r}_{(x,y)}\) and \(T^{r}\); \(\varphi (I^{r}_{(x,y)},T^{r})=-1\) represents the weakest association between \(I^{r}_{(x,y)}\) and \(T^{r}\), and the zero value reveals no conclusion in this sense. It is known, cf. [29], that the exact correlation is determined if \(\lambda_{00}=\lambda_{11}=0\) or \(\lambda_{01}=\lambda_{10}=0\); or \(\mu_{1}=\mu_{3}\) or \(\mu_{2}=\mu_{4}\). If \(\mu_{1}\mu_{2}\mu_{3}\mu_{4}=0\), we redefine \(\varphi (T^{7},I^{7}_{(x,y)})\) as follows:
This definition keeps φ between −1 and 1. We also notice that negative and positive associations are preserved. For instance, if \(\mu_{1}=0\), then \(\lambda_{00}=\lambda_{10}=0\), and the similarity depends on the difference \(\lambda_{11}-\lambda_{01}\), i.e. there is a positive correlation when \(\lambda_{11}>\lambda_{01}\), a negative correlation when \(\lambda_{01}>\lambda_{11}\), and no correlation if \(\lambda_{11}=\lambda_{01}\). The other cases are similar. The larger the value of \(\varphi (T^{7},I^{7}_{(x,y)})\), the more similar the template T(k,l) and the sub-image \(I_{(x,y)}(i,j)\) at the position (x,y) are. When \(\varphi (T^{7},I^{7}_{(x,y)})\) is very close to 1, the best matching is obtained. Therefore the exact template matched in I(i,j) is \(I_{(x_{0},y_{0})}(i,j)\) where
The pseudocode of the proposed technique is summarized in Algorithm 1, and Fig. 2 is a sketch of the technique.
3.2 TM via Boolean functions
In this subsection we propose another novel TM-OD technique based on Boolean functions. The Boolean functions are ubiquitous in signal and image processing, where they provide models for logical operations performed by computers on digital signals. They can be used with binary images to solve the TM problem in image processing. Recall that for two binary images η(i,j) and 𝜗(i,j) both of size m × n, the exclusive OR circuit between η and 𝜗 is also a binary image and it is given by:
Using the XOR circuit (11), the two binary images η and 𝜗 are matched if the number of zeros of η ⊕ 𝜗 is very close to m × n. As indicated above, bit-plane slicing is a vital technique to obtain a sequence of binary images from a grayscale or color image. The number of these binary images depends on the length of the bit pattern that represents each pixel in the original grayscale image. The first binary images in the bit-plane sequence consist of the last bits of the gray value of every pixel. These bits have the least effect on the magnitude of the gray value, so we call the binary images consisting of those bits the least significant binary images in the bit-plane sequence. On the other hand, the last binary images in the bit-plane sequence consist of the first bits of each gray value. These bits have the greatest effect on the magnitude of the gray value, and we call the binary images consisting of those bits the most significant binary images in the bit-plane sequence. In a clean dataset the Boolean function in (11) can be applied directly to the least or most significant binary images to solve the TM problem. However, the simple Boolean function in (11) applied to the least significant binary images is too weak to find the template in a noisy reference image. For these reasons, we propose a more effective Boolean function and choose the most significant binary images to solve the TM problem for noisy images.
Instead of working on the least significant binary images we choose the most significant binary images I6,I7 from the reference image and T6,T7 from the template image. Then a robust similarity measure ψ is built using the logical OR and XOR functions. Then ψ is applied to I6,I7,T6, and T7 as follows:
0 ≤ k ≤ m − 1, 0 ≤ l ≤ n − 1. The output of ψ at a location is 1 if either \((T^{7}\oplus I^{7}_{(x,y)})\) or \((T^{6}\oplus I^{6}_{(x,y)})\) is non-zero at that location; otherwise it is zero. The two sub-images \(I^{7}\) and \(T^{7}\) are matched if \(I^{7}\oplus T^{7}\) is zero at each pixel in the domain, and likewise for \(I^{6}\) and \(T^{6}\). Therefore, the number of zeros in ψ indicates the degree of similarity between the template and a selected sub-image in the reference image. Now for each (x,y) of the domain (3), we compute the cardinal numbers
Notice that 0 ≤ σ(x,y) ≤ mn and that the best possible TM occurs when \(\sigma(x_{0},y_{0})=mn\), since \(\psi(T,I_{(x,y)})(k,l)=0\) if and only if \(T^{7},T^{6}\) and their corresponding \(I^{7}_{(x,y)},I^{6}_{(x,y)}\) take the same values. The sub-image \(I_{(x_{0},y_{0})}\) is the best match, if
The sketch and pseudocode of the Boolean template matching are outlined in Fig. 3 and Algorithm 2.
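The Boolean matching steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 2: the window at (x,y) is taken to be I(x + k, y + l) with x indexing rows, the best position is the first one maximizing σ, and the function names are hypothetical. Images are lists of rows of 8-bit integers.

```python
def psi_pixel(t, w):
    """psi for one pixel pair: (t7 XOR w7) OR (t6 XOR w6), returned as 0 or 1."""
    d7 = ((t >> 7) & 1) ^ ((w >> 7) & 1)
    d6 = ((t >> 6) & 1) ^ ((w >> 6) & 1)
    return d7 | d6


def sigma(reference, template, x, y):
    """Number of zeros of psi between the template and the window at (x, y)."""
    m, n = len(template), len(template[0])
    return sum(
        1 - psi_pixel(template[k][l], reference[x + k][y + l])
        for k in range(m)
        for l in range(n)
    )


def boolean_match(reference, template):
    """Return ((x0, y0), sigma) for the first position with the largest sigma."""
    M, N = len(reference), len(reference[0])
    m, n = len(template), len(template[0])
    best_pos, best_sigma = None, -1
    for x in range(M - m + 1):          # every admissible top-left corner
        for y in range(N - n + 1):
            s = sigma(reference, template, x, y)
            if s > best_sigma:
                best_pos, best_sigma = (x, y), s
    return best_pos, best_sigma


reference = [
    [10, 20, 30, 40],
    [50, 200, 70, 80],
    [90, 100, 130, 120],
    [15, 25, 35, 45],
]
template = [[201, 71], [101, 131]]      # the central patch with low-order bits perturbed
position, score = boolean_match(reference, template)
```

Because only bits 7 and 6 enter ψ, the perturbation of the low-order bits in `template` does not change the score; the sketch still returns the central position with σ equal to the template size.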
It is worth mentioning that the application of both techniques, i.e. Algorithm 1 and Algorithm 2, does not depend entirely on slicing images into their bit-plane binary images. They are also applicable if any thresholding technique is used to create binary images from color or grayscale images.
4 Performance and comparison analysis
This section assesses the performance of the proposed algorithms introduced in the previous section. Several experiments are carried out to investigate the accuracy of the proposed methods. We also discuss robustness in the presence of real and artificial noise. A detailed comparison with four different TM and object detection techniques is also carried out. Before discussing the performance, robustness and comparison analysis, we define the datasets, the compared methods and the machine used in these experiments.
The performance of the two proposed methods introduced above is assessed and compared to that of two classic methods, namely the NCC and the ZNCC, and two methods introduced by Yoo et al. [52] and Xia et al. [49]. Hereafter we call them Yoo2010 and Xia2019, respectively. Together with the two proposed algorithms, this gives six methods, all of which are run on two different datasets. The first one, prepared by Mattoccia et al. [32], is called dataset1, and the second, due to Xia et al. [49], is called dataset2. Experiments are carried out using Matlab 2016 on a laptop with an Intel Pentium Core2 Duo 3.00 GHz processor.
As the detailed experiments below indicate, the proposed systems retain accuracy because they measure the association between the template and candidate windows in the source images using a Boolean function and the φ-correlation, which is robust against noise. This is because we use the last two bit-plane slices, which preserve the major properties of the original image. The performance analysis presented here also shows that the proposed techniques are very fast compared to standard relevant methods. The main reason for the reduction in execution time of the matching procedure is that, instead of performing comparisons on gray-level images f(x,y), we carry out the comparisons on the compactly represented bit-plane slices extracted from f(x,y). Thus each pixel is represented by 1 bit, 0 or 1, instead of n bits in the n-bit gray level.
4.1 Performance in presence of real noise
To validate the robustness of the proposed algorithms against real noise, we have chosen dataset1, which contains real distortions typically occurring in TM applications, such as changes in viewpoint and camera noise. Dataset1 consists of seven grayscale source images and twelve templates of various sizes, as shown in Table 3. Source images and templates are listed with their sizes together with both the top-left corner \((x_{0},y_{0})\) of the template in the source image and \(\varphi(x_{0},y_{0}),\sigma(x_{0},y_{0})\). The results indicate that the proposed techniques are accurate in matching templates. Note that \(\varphi(x_{0},y_{0})\) and \(\sigma(x_{0},y_{0})\) are not always close to 1 and to the size of the template, respectively. This is because of the real noise in the images. Figures 4, 5 and 6 exhibit the performance of the proposed algorithms visually on the Board, Wafer, and Catalonia images, respectively.
Table 4 displays the execution time in seconds required to search for each template in its corresponding source image when the six methods are applied to dataset1 of Table 3. The experiments found that Yoo2010 failed to find the correct match for the ringo template in both source images ringo1 and ringo2. It also failed to find the correct match for wafer1 and wafer3 in the wafer image. On the other hand, the NCC, ZNCC, Xia2019 and the proposed φ-correlation and Binary circuit methods correctly identify the optimal solution for every template in its corresponding source image.
All experimental results on dataset1 show a significant improvement in efficiency (accuracy and speed) of the proposed Binary circuit algorithm compared to the NCC, ZNCC, Xia2019 and Yoo2010 methods. Nevertheless, the φ-correlation method comes third in terms of time cost, i.e. after the Binary circuit and NCC methods. In addition, the accuracy of the proposed algorithms is identical to that of the standard ZNCC, but our methods outperform ZNCC, Yoo2010, and Xia2019 in terms of the running time needed to find the optimal solutions in the presence of real noise. From the above results, we can see that the proposed φ-correlation and Boolean circuit algorithms are robust against the real distortion, while Yoo2010 remains sensitive to the same distortion.
4.2 Performance in the presence of artificial noise
In this subsection we introduce additional experimental results to check the performance of the proposed φ-correlation and Boolean circuit algorithms in the case of imposing an artificial noise in larger templates. For this purpose, the three different templates Cata1, Cata2, and Cata3 of sizes 162 × 118, 152 × 128, and 160 × 164, respectively, are manually selected from the Catalonia image of the dataset1. These three templates are then matched into the source Catalonia image, after being contaminated by pepper-and-salt noise that is used as outliers, with 0.15 % outlier ratio. Figure 7 visualizes Catalonia image with pepper-salt noise with 0.15 % outlier ratio. The templates have been extracted accurately using the proposed methods.
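For readers who wish to reproduce this kind of contamination, a minimal Python sketch of salt-and-pepper noise is given below. It is an assumption about how such noise is typically generated, not the procedure actually used in the experiments; the function name, the `ratio` parameter (the fraction of pixels replaced) and the fixed seed are illustrative choices.

```python
import random


def add_salt_and_pepper(image, ratio, seed=0):
    """Return a copy of an 8-bit image where roughly `ratio` of the pixels
    are replaced by 0 (pepper) or 255 (salt), chosen with equal probability."""
    rng = random.Random(seed)           # fixed seed keeps the sketch reproducible
    noisy = [row[:] for row in image]   # copy so the input image is left untouched
    for i, row in enumerate(noisy):
        for j in range(len(row)):
            if rng.random() < ratio:
                noisy[i][j] = rng.choice((0, 255))
    return noisy


clean = [[100] * 20 for _ in range(20)]
noisy = add_salt_and_pepper(clean, ratio=0.15)
```

Since a salt or pepper pixel flips the two most significant bits of most gray values, each outlier typically turns one agreement into one disagreement in the binary comparisons, leaving the remaining pixels of the window unaffected.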
The execution times of the six methods applied to the noisy Catalonia image are shown in Table 5. In the artificial noise case, the top-left corner coordinates found for Cata1, Cata2, and Cata3 by the six methods are (725,167), (434,152), and (258,134), respectively. These coordinates agree with the results in Table 2. This indicates that the six methods detect the correct location of the templates in the presence of artificial noise, i.e. the six pattern matching algorithms are robust against artificial noise. The proposed Binary circuit algorithm turns out to be the fastest, both with and without artificial noise, for the three templates. Table 5 reveals that the execution time increases after imposing the salt-and-pepper noise. When the salt-and-pepper noise is added at 15%, the average running times of NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Boolean circuit increase by 1.25%, 2.99%, 2.37%, 3.36%, 11.78%, and 3.99%, respectively. It can be readily seen that, with larger templates, the proposed techniques generally run notably faster than the other four methods.
4.3 Performance in unconstrained environments
In this subsection, we investigate the robustness of the proposed algorithms on Dataset2, which contains more complicated noise. In these experiments the template is extracted from the first image and is used to find the best matching location in a sequence of five images of the same scene with different types of noise, such as partial occlusions, geometric deformation, illumination variation and change of background. A bounding-box of the target object has been determined manually within each test image. A total of 210 (35 × 6) images are generated from 35 color videos that have been previously used to evaluate the performance of the best-buddies similarity measure (BBS) template matching algorithm of [49]. Since these images are all in color, they are converted to grayscale to extract the bit-plane binary images. The bounding-box in the initial frame of the video sequence is used to define a template. The proposed template matching algorithms are applied to calculate the similarity between the template and every position in the other five frames of the video sequence. The single position with the highest similarity is considered the estimated position of the target. A bounding-box of the same size as the one in the initial frame is placed around this estimated position, and the overlap between this box and the ground-truth bounding-box is computed and used as a measure of accuracy. This overlap is calculated as the intersection over union between the two bounding-boxes.
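The intersection-over-union overlap can be computed as in the following Python sketch. The box format (x, y, width, height) for axis-aligned boxes and the function name `iou` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
estimate = (5, 0, 10, 10)       # same size, shifted by half its width
overlap = iou(ground_truth, estimate)
```

Because the estimated box has the same size as the template box, the overlap depends only on how far the estimated position is from the ground-truth position: shifting a box by half its width already reduces the overlap to one third.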
Template matching results for some examples from Dataset2 are shown in Figs. 8, 9, 10, 11, 12 and 13. The image in the top row is the initial frame of the video sequence, in which the template is marked by a green rectangle. The red box in the remaining rows shows the location of the object estimated by the different algorithms. We can see that the accuracy of NCC and Yoo2010 is low (see for example Figs. 8 and 11, second and fourth rows), that the accuracy is moderate for ZNCC, Xia2019, and φ-correlation (see for example Figs. 9 and 13, third, fifth, and sixth rows), and that the accuracy is high for the binary circuit algorithm (see Figs. 8–13, bottom row). The accuracy results across all 210 images of Dataset2 are averaged, summarized and depicted in Fig. 14. This graph shows the averaged proportion of overlap between the ground-truth box and the box estimated by the different algorithms. The accuracy rate of the binary circuit algorithm exceeds that of the other algorithms; it is followed by Xia2019, φ-correlation, and ZNCC, while NCC and Yoo2010 have the lowest accuracy rates. It is worth pointing out that the performance of the φ-correlation is similar to that of the ZNCC. This is expected, since both methods compute the correlation of the intensity values in the template and in each image patch based on the mean and variance values. Having said that, we note that the φ-correlation technique is implemented on one bit-plane slice, not two as in the case of the Boolean function algorithm. The accuracy of the φ-correlation method is expected to improve if it is implemented on two bit-plane slices.
To evaluate the time efficiency, many experiments have been carried out on templates of different sizes extracted from the images of Dataset2. Template sizes vary from 71 × 31 to 151 × 111. The total running times are presented in Table 6. It can be seen from this table that the proposed φ-correlation method is faster than the ZNCC, Yoo2010, and Xia2019 methods, and that the proposed binary circuit algorithm is substantially faster than all the other methods. As expected, the computational benefits increase with the use of binary bit-plane slices, since Boolean functions on binary images run faster than the traditional techniques. It is also noted that the proposed binary circuit algorithm is faster than the compared algorithms for all template and reference image sizes considered, and that it performs better with bigger source images, because the computational overhead of NCC and ZNCC becomes expensive for larger images.
4.4 Object detection application
This subsection is devoted to the last experimental results of this paper, concerning the efficiency of the proposed φ-correlation and Boolean circuit algorithms in an object detection application. For this purpose, fifteen image pairs are randomly taken from the real-world dataset2 previously used by [49] to evaluate accuracy and running time. Each image pair consists of a reference image and a target image. The target images are taken in the presence of real noise, such as partial occlusions, geometric deformation, illumination variation and change of background, affecting one object in the reference image. This object is considered the object of interest in the reference image. Samples of reference and target images from this set are shown in Fig. 15.
A ground truth bounding box is determined manually in the target image to compute the accuracy. The accuracy of each method is computed from the ground truth bounding box Bg and the bounding box Be estimated by that method.
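The accuracy formula itself is not reproduced in this text. A common overlap measure that is consistent with the "proportion of overlap" reported for Dataset2 is the intersection over union of Bg and Be, that is, area(Bg ∩ Be) / area(Bg ∪ Be). The sketch below assumes this measure and assumes boxes given as (x, y, width, height); both are assumptions, and the paper's exact definition may differ.

```python
def iou(box_g, box_e):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    xg, yg, wg, hg = box_g
    xe, ye, we, he = box_e
    # Width and height of the overlapping rectangle (zero if disjoint).
    iw = max(0, min(xg + wg, xe + we) - max(xg, xe))
    ih = max(0, min(yg + hg, ye + he) - max(yg, ye))
    inter = iw * ih
    union = wg * hg + we * he - inter
    return inter / union if union > 0 else 0.0
```

Under this measure, a perfect estimate scores 1, an estimate that misses the object entirely scores 0, and an estimated box shifted by half its width along one axis scores 1/3.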
The proposed methods were compared with four TM methods: NCC, ZNCC, Yoo2010 and Xia2019. Figures 16 and 17 demonstrate the detection results of these methods on the selected sample from Dataset2. Figure 16 shows that the proposed methods and Xia2019 detect the correct position of the Coke object in the Coke images in the first column, while the other methods fail to detect the object. For the lemming images in the second and third columns, NCC and ZNCC fail to detect the lemming object, while the other methods detect it with accuracy percentages ranging from 10% to 90%. The liquor object is detected well by all methods in the images in the last column, but only the binary circuit and Xia2019 methods detect the correct position of the same object in the images in the fourth column. Finally, the Boolean circuit method is the only one of the compared methods that successfully matches the object of interest in all of these examples.
Table 7 lists the average running time and the average accuracy of the six methods applied to the fifteen randomly picked pairs from Dataset2 to detect the objects of interest. As can be seen, the Boolean circuit method achieves the highest accuracy, 84.08, ahead of the other compared methods. The confidence Boolean function and the most significant binary images show distinct and well-localized modes for detecting the object of interest. For the other methods, the modes are not well localized, and the difference in confidence between the correct location and the estimated locations is relatively large, so a more rapid drop in accuracy is expected. Xia2019 consumes excessive time because it depends on the best-buddies similarity measure, which computes the nearest neighbor between every pixel in the template and all pixels in every possible window in the target image, and vice versa. On the other hand, the proposed binary circuit method outperforms the other methods in terms of running time because it is based on Boolean functions, which work very efficiently on binary images.
5 Conclusions
Two fast, efficient and robust TM algorithms are introduced, tested and compared with two classic methods, NCC and ZNCC, and two recent schemes, Yoo2010 and Xia2019. The algorithms are executed on the whole images, pixel by pixel, without any partial elimination techniques. However, the proposed algorithms operate on lower representations of the gray-scale and color values of both images and templates, particularly the highest-order bit-plane images, which are binary images. The novelty of the approaches lies not only in the use of these binary representations of images, but also in the use of the φ-correlation coefficient, which is rarely used in TM techniques. Boolean circuits are also implemented, giving the fastest results among the mentioned methods. Both methods are fast and accurate. Robustness is also tested against real and artificial noise. Partial elimination techniques associated with both methods, once developed, are expected to accelerate them further. It is worthwhile to note that the use of Boolean circuits is a very promising approach, as it improves accuracy and robustness and reduces running time. The φ-correlation coefficient is faster than ZNCC, but comparable to NCC. The use of similarity/dissimilarity measures other than NCC and ZNCC has recently been recognized as an efficient tool.
References
Aggarwal A, Kumar M (2021) Image surface texture analysis and classification using deep learning. Multimed Tools Appl 80:1289–1309. https://doi.org/10.1007/s11042-020-09520-2
Ambrosini RD, Wang P, O’Dell WG (2010) Computer-aided detection of metastatic brain tumors using automated three-dimensional template matching. J Magn Reson Imaging 31(1):85–93. https://doi.org/10.1002/jmri.22009
Annaby MH, Fouda YM, Rushdi MA (2019) Improved normalized cross-correlation for defect detection in printed-circuit boards. IEEE Trans Semicond Manuf 32(2):199–211. https://doi.org/10.1109/TSM.2019.2911062
Annaby MH, Basha SH, Fouda YM (2021) Defect detection methods using boolean functions and the φ-coefficient between bit-plane slices. Opt Lasers Eng 139:106474. https://doi.org/10.1016/j.optlaseng.2020.106474
Bator M, Nieniewski M (2012) Detection of cancerous masses in mammograms by template matching: optimization of template brightness distribution by means of evolutionary algorithm. J Digit Imaging 25:162–172. https://doi.org/10.1007/s10278-011-9402-1
Brunelli R (2009) Template matching techniques in computer vision: theory and practice. Wiley, NJ
Choi MS, Kim WY (2002) A novel two stage template matching method for rotation and illumination invariance. Pattern Recogn 35(1):119–129. https://doi.org/10.1016/S0031-3203(01)00025-5
Ciaccio EJ, Scheinman MM, Fridman V, Schmitt H, Coromilas J, Wit AL (1999) Dynamic changes in electrogram morphology at functional lines of block in reentrant circuits during ventricular tachycardia in the infarcted canine heart: a new method to localize reentrant circuits from electrogram features using adaptive template matching. J Cardiovasc Electrophysiol 10(2):194–213. https://doi.org/10.1111/j.1540-8167.1999.tb00661.x
Ciaccio EJ, Scheinman MM, Wit AL (2000) Relationship of specific electrogram characteristics during sinus rhythm and ventricular pacing determined by adaptive template matching to the location of functional reentrant circuits that cause ventricular tachycardia in the infarcted canine heart. J Cardiovasc Electrophysiol 11(4):446–457. https://doi.org/10.1111/j.1540-8167.2000.tb00341.x
Crispin A, Rankov V (2007) Automated inspection of pcb components using a genetic algorithm template-matching approach. Int J Adv Manuf Technol 35:293–300. https://doi.org/10.1007/s00170-006-0730-0
Dastanova N, Duisenbay S, Krestinskaya O, James AP (2018) Bit-plane extracted moving-object detection using memristive crossbar-cam arrays for edge computing image devices. IEEE Access 6:18954–18966. https://doi.org/10.1109/ACCESS.2018.2819986
Duan H, Xu C, Liu S, Shao S (2010) Template matching using chaotic imperialist competitive algorithm. Pattern Recogn Lett 31(13):1868–1875. https://doi.org/10.1016/j.patrec.2009.12.005. Meta-heuristic Intelligence Based Image Processing
Ertaş G, Özcan Gülçür H, Osman O, Uçan ON, Tunaci M, Dursun M (2008) Breast mr segmentation and lesion detection with cellular neural networks and 3d template matching. Comput Biol Med 38(1):116–126. https://doi.org/10.1016/j.compbiomed.2007.08.001
Esmaeil AG, Lucas C (2007) Imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition. In: Proceedings of the IEEE congress on evolutionary computation. IEEE, pp 4661–4667
Fredriksson K, Navarro G, Ukkonen E (2005) Sequential and indexed two-dimensional combinatorial template matching allowing rotations. Theor Comput Sci 347(1):239–275. https://doi.org/10.1016/j.tcs.2005.06.029
Fu TC, Chung FL, Luk R, Ng CM (2007) Stock time series pattern matching: template-based vs. rule-based approaches. Eng Appl Artif Intel 20(3):347–364. https://doi.org/10.1016/j.engappai.2006.07.003
Cheng G, Han J, Zhou P, Xu D (2019) Learning rotation-invariant and fisher discriminative convolutional neural networks for object detection. IEEE Trans Image Process 28(1):265–278. https://doi.org/10.1109/TIP.2018.2867198
Gonzalez RC, Woods RE (2007) Digital image processing, 3rd edn. Addison-Wesley Longman Publishing Co, Boston
Greenhut SE, Deering TF, Steinhaus BM, Ingram JL, Camp SR, Dicarlo LA (1992) Separation of ventricular tachycardia from sinus rhythm using a practical, real-time template matching computer system. Pacing Clin Electrophysiol 11(2):2146–53. https://doi.org/10.1111/j.1540-8159.1992.tb03038.x
Hermann I, Martínez-Heras E, Rieger B, Schmidt R, Golla AK, Hong JS, Lee W, Yu-Te W, Nagtegaal M, Solana E, Llufriu S, Gass A, Schad LR, Weingärtner S, Zöllner FG (2021) Accelerated white matter lesion analysis based on simultaneous t1 and \(t_{2}^{*}\) quantification using magnetic resonance fingerprinting and deep learning. Magn Reson Med 86(1):471–486. https://doi.org/10.1002/mrm.28688
Jung J, Lee H, Lee JH, Park D (2010) A template matching procedure for automatic target recognition in synthetic aperture sonar imagery. IEEE Signal Process Lett 17(1):107–110. https://doi.org/10.1109/LSP.2009.2032452
Kilic N, Ucan ON, Osman O (2009) Colon segmentation and colonic polyp detection using cellular neural networks and three-dimensional template matching. Expert Syst 26(5):378–390. https://doi.org/10.1111/j.1468-0394.2009.00499.x
Kilic N, Ucan ON, Osman O (2009) Colonic polyp detection in CT colonography with fuzzy rule based 3D template matching. J Med Syst 33(1):9–18. https://doi.org/10.1007/s10916-008-9159-3
Kurosaki K, Nogami A, Sakamaki M, Kowase S, Sugiyasu A, Oginosawa Y, Kubota S (2009) Automated template matching to pinpoint the origin of right ventricular outflow tract tachycardia. Pacing Clin Electrophysiol 32(1):347–364. https://doi.org/10.1111/j.1540-8159.2008.02226.x
Lee W, Chen C (2012) A fast template matching method with rotation invariance by combining the circular projection transform process and bounded partial correlation. IEEE Signal Process Lett 19(11):737–740. https://doi.org/10.1109/LSP.2012.2212010
Lei M, Zhang G (2008) A novel algorithm for a rotation invariant template matching. Optoelectron Lett 4:379–383. https://doi.org/10.1007/s11801-008-8043-1
Li K, Wan G, Cheng G, Meng L, Han J (2020) Object detection in optical remote sensing images: a survey and a new benchmark. ISPRS J Photogramm Remote Sens 159:296–307. https://doi.org/10.1016/j.isprsjprs.2019.11.023
Lin YH, Chen CH (2008) Template matching using the parametric template vector with translation, rotation and scale invariance. Pattern Recogn 41(7):2413–2421. https://doi.org/10.1016/j.patcog.2008.01.017
Liu R (1980) A note on phi-coefficient comparison. Res High Educ 13:3–8. https://doi.org/10.1007/BF00975772
Mahmood A, Khan S (2010) Exploiting transitivity of correlation for fast template matching. IEEE Trans Image Process 19(8):2190–2200. https://doi.org/10.1109/TIP.2010.2046809
Mahmood A, Khan S (2012) Correlation-coefficient-based fast template matching through partial elimination. IEEE Trans Image Process 21(4):2099–2108. https://doi.org/10.1109/TIP.2011.2171696
Mattoccia S, Tombari F, Stefano LD (2008) Fast full-search equivalent template matching by enhanced bounded correlation. IEEE Trans Image Process 17(4):528–538. https://doi.org/10.1109/TIP.2008.919362
Mattoccia S, Tombari F, Stefano LD (2011) Efficient template matching for multi-channel images. Pattern Recogn Lett 32(5):694–700. https://doi.org/10.1016/j.patrec.2010.12.004
Mcdonnell M (1981) Box-filtering techniques. Comput Graph Image Process 17:65–70
Muramatsu S, Otsuka Y, Kobayashi Y, Shimizu E (2003) Strategy of high-speed template matching and its optimization by using ga. Syst Comput Japan 34(2):81–91. https://doi.org/10.1002/scj.1192
Myers V, Fawcett J (2010) A template matching procedure for automatic target recognition in synthetic aperture sonar imagery. IEEE Signal Process Lett 17(7):683–686. https://doi.org/10.1109/LSP.2010.2051574
Nguyen DT, Vu N, Do T, Nguyen T, Yearwood J (2018) Improving chamfer template matching using image segmentation. IEEE Signal Process Lett 25(11):1635–1939. https://doi.org/10.1109/LSP.2018.2862645
Osman O, Ozekes S, Ucan ON (2007) Lung nodule diagnosis using 3d template matching. Comput Biol Med 37(8):1167–1172. https://doi.org/10.1016/j.compbiomed.2006.10.007
Padilla R, Passos WL, Dias TLB, Netto SL, da Silva EAB (2021) A comparative analysis of object detection metrics with a companion open-source toolkit. Electronics 10(279). https://doi.org/10.3390/electronics10030279
Passos WL, Araujo GM, Lima AA, Ribeiro FML, Silva EAB (2018) Eye detection using ensemble of weak classifiers based on correlation filter. In: Proceedings of international joint conference on neural networks (IJCNN). International Neural Network Society and the IEEE Computational Intelligence Society, pp 1–6
Peng Z, Wu J, Zhang Y, Lin X (2021) A high-speed feature matching method of high-resolution aerial images. J Real-Time Image Proc 18:705–722. https://doi.org/10.1007/s11554-020-01012-8
Saitoh F (2003) Rotation invariant image template matching based on correlation of curvature distribution. Electr Eng Japan 145(4):56–63. https://doi.org/10.1002/eej.10235
Stefano LD, Mattoccia S (2003) Fast template matching using bounded partial correlation. Mach Vis Appl 13:213–221. https://doi.org/10.1007/s00138-002-0070-5
Stefano LD, Mattoccia S, Tombari F (2005) ZNCC-based template matching using bounded partial correlation. Pattern Recogn Lett 26(14):2129–2134. https://doi.org/10.1016/j.patrec.2005.03.022
Tsai DM, Lin CT, Chen JF (2003) The evaluation of normalized cross correlations for defect detection. Pattern Recogn Lett 24(15):2525–2535. https://doi.org/10.1016/S0167-8655(03)00098-9
Uz T, Bebis G, Erol A, Prabhakar S (2009) Minutiae-based template synthesis and matching for fingerprint authentication. Comput Vis Image Underst 113(9):979–992. https://doi.org/10.1016/j.cviu.2009.04.002
Vidal C, Jedynak B (2009) Learning to match: deriving optimal template-matching algorithms from probabilistic image models. Int J Comput Vis 88:189–213. https://doi.org/10.1007/s11263-009-0258-5
Warrens MJ (2008) On association coefficients for 2 × 2 tables and properties that do not depend on the marginal distributions. Psychometrika 73(4):777–789. https://doi.org/10.1007/s11336-008-9070-3
Xia H, Zhao W, Jiang F, Li H, Xin J, Zhou Z (2019) Fast template matching based on deformable best-buddies similarity measure. Multimed Tools Appl 78:11905–11925. https://doi.org/10.1007/s11042-018-6722-x
Yamaguchi T, Ichimura T, Fujita K, Kato A, Nakagawa S (2019) Matched filtering accelerated by tensor cores on Volta GPUs with improved accuracy using half-precision variables. IEEE Signal Process Lett 26(12):1857–1861. https://doi.org/10.1109/LSP.2019.2951305
Yang H, Huang C, Wang F, Song K, Yin Z (2019) Robust semantic template matching using a super-pixel region binary descriptor. IEEE Trans Image Process 28(6):3061–3074. https://doi.org/10.1109/TIP.2019.2893743
Yoo JC, Choi BD, Choi HK (2010) 1-D fast normalized cross-correlation using additions. Digit Signal Process 20(5):1482–1493. https://doi.org/10.1016/j.dsp.2010.01.002
Yoon YG, Lee SL, Chung CW, Kim SH (2008) An effective defect inspection system for polarized film images using image segmentation and template matching techniques. Comput Industr Eng 55(3):567–583. https://doi.org/10.1016/j.cie.2008.01.015
Funding
Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Annaby, M., Fouda, Y. Fast template matching and object detection techniques using φ-correlation and binary circuits. Multimed Tools Appl 83, 6469–6496 (2024). https://doi.org/10.1007/s11042-023-15564-x
DOI: https://doi.org/10.1007/s11042-023-15564-x