Abstract
With the development of content-based image retrieval technology, retrieval efficiency has steadily improved. Depending on the image data, retrieval based on color features or on shape features can be used to improve retrieval efficiency. However, retrieval that relies on a single image feature still cannot meet users' needs. In this paper, we propose an image retrieval algorithm based on the combination of color and shape features. The cumulative histogram method is used to calculate the color features of the image, and the seven Hu invariant moments are calculated as shape features. The color and shape features are combined with certain weights, and the Euclidean distance is used as the similarity measure. Finally, retrieval experiments are carried out. Comparison with related algorithms shows that the proposed algorithm effectively improves the accuracy of image retrieval.
1 Introduction
With the development of the Internet, the volume of image data keeps growing, and image retrieval is widely used in target recognition, photo filtering and other scenarios. As more and more image data is stored, the corresponding security problems and their solutions [1,2,3], as well as the demand for efficient image retrieval, are also growing. Therefore, it is increasingly important to effectively improve the retrieval efficiency of images.
Many content-based image retrieval (CBIR) algorithms have been proposed and are widely used [4]. Content-based image retrieval mainly uses low-level features of the image, such as color, texture, shape and spatial features. Color is the most direct and simple feature of an image, and it depends little on the size, orientation or rotation of the image itself. Color-based retrieval is therefore the most commonly used basic method in content-based image retrieval. However, the color feature is sensitive to changes in brightness, and a histogram used as the color feature contains no information about the spatial layout of the colors. The shape feature is one of the essential features of an object and does not change with the surrounding environment or brightness; compared with color and texture, it is more intuitive and carries a certain amount of spatial layout information. Image retrieval based on a single feature is therefore currently widely used.
However, the retrieval efficiency of single-feature image retrieval can no longer meet current needs, so finding an efficient image retrieval method is especially important. The criminal investigation image retrieval algorithm in the literature, which combines the dual-tree complex wavelet transform with a four-direction, six-parameter gray-level co-occurrence matrix and Hu invariant moments, achieves high retrieval efficiency but is not widely applicable beyond criminal investigation images [5,6,7].
Therefore, this paper proposes an image retrieval algorithm based on the combination of color features and shape features. First, the local cumulative histogram of the image is computed and a similarity ranking is performed. Then, the shape features are obtained by calculating the Hu invariant moments of the image, and a similarity ranking is performed. Finally, certain weights are assigned to the color and shape features for image retrieval. Comparative experiments show that the algorithm effectively improves the accuracy of image retrieval [8, 9].
In summary, the main contribution of this paper is to assign certain weights to color features and shape features, improve the retrieval efficiency of images, and make up for the problem of low efficiency when only single feature image retrieval is performed.
In section 2, we briefly introduce the related methods of color feature extraction. In section 3, we introduce the method of shape feature. In section 4, we introduce the method of similarity measure. In section 5, we introduce the steps of the algorithm in this paper. In section 6, we introduce the experimental environment and the results of the comparative experiments and the retrieval efficiency of each algorithm. We conclude this paper in section 7.
2 Color Feature
The color histogram method proposed by Swain et al. [10] divides the color space into several fixed subspaces, counts the number of pixels of each image that fall into each subspace, and uses the intersection of color histograms to measure the similarity between images. The color histogram is also invariant to scale, translation and rotation. Its main disadvantage is that it records only the frequency of each color and loses the position information of the pixels. Every image has exactly one histogram, but different images may share the same histogram; that is, the relation between a histogram and images is one-to-many, which does not match human visual perception. As a result, the false-positive rate is high.
To further improve the color histogram method, Pass et al. proposed the color coherence vector (CCV) as a color feature of the image [11]. The core idea is that when the area of a contiguous region occupied by pixels of similar color is greater than a certain threshold, the pixels of that region are aggregated pixels; otherwise they are non-aggregated pixels. The ratio of aggregated to non-aggregated pixels for each color in the image is called the color aggregation vector of the image, and during retrieval the aggregation vector of the target image is matched against that of each retrieved image. The aggregation vector preserves the spatial information of the image colors to some extent. Stricker et al. [12] proposed the cumulative color histogram and the color moment method, the latter mainly using the first, second and third moments of each color component. He Heng et al. [13] proposed a fuzzy histogram method that applies fuzzy theory to image retrieval. A new weighted dominant color descriptor is proposed in [14]: based on the proportion of each dominant color, the dominant colors of the image are extracted using the MP7DCD or fast LBA algorithm, the weight of each dominant color is obtained, and the weights are combined into a new dominant color descriptor that takes the background color of the image into account. In [15], the color difference histogram is used to represent color features; it considers not only the role of image edge points, color and color differences but also the spatial layout features of the image, without using any image segmentation technique. In [16], a Gaussian mixture model generated from the training set with the expectation-maximization algorithm is used to quantize the colors, and color space information is then used to construct a new spatial color histogram.
To address the insufficient retrieval precision of algorithms that use color features alone, this paper combines color and shape features for image retrieval. The cumulative histogram method is used to calculate the color features, and the seven Hu invariant moments are calculated as shape features. The color and shape features are combined, with the Euclidean distance as the similarity measure, to give an algorithm based on the combination of color and shape features, whose effectiveness is verified by experiments [6, 17].
2.1 Color Histogram
The color histogram describes the proportion of each color in the entire image and ignores the spatial position of each pixel. The definition of the color histogram is as follows.
The pixel frequency h[ck] of the kth color appearing in the image is as follows.
where N1 and N2 represent the width and height of the image.
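As a minimal sketch of the histogram described above, assuming the image has already been quantized to integer color indices, the frequency of each color can be computed as its pixel count divided by N1 × N2. The function name `color_histogram` and the toy images are illustrative and do not come from the paper.

```python
from collections import Counter

def color_histogram(image, num_colors):
    """Return the frequency of each color index 0..num_colors-1.

    `image` is a list of rows; each entry is an already-quantized
    color index (an assumption made for this sketch).
    """
    height = len(image)
    width = len(image[0])
    counts = Counter(pixel for row in image for pixel in row)
    total = width * height  # N1 * N2 in the text
    return [counts.get(k, 0) / total for k in range(num_colors)]

# Two toy images with the same colors in different positions.
image = [
    [0, 0, 1],
    [2, 1, 0],
]
shuffled = [
    [1, 0, 0],
    [0, 2, 1],
]
hist = color_histogram(image, num_colors=4)
hist_shuffled = color_histogram(shuffled, num_colors=4)
```

Both toy images produce the same histogram, which illustrates why the histogram alone carries no spatial layout information.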
2.2 Cumulative Histogram
The local cumulative histogram method is as follows. Given a picture, it is first converted to 256 × 256 pixels for convenience of processing. Let ai,j be the color value of a pixel (where i is the abscissa and j is the ordinate of the pixel). The pixels ai,j, …, ai+16,j+16 form a color block, and the accumulated color value C of the block is calculated:
Let the C value of the hth block in the picture be Ch. The picture has 256 blocks and therefore 256 values, and the local cumulative histogram is Ha(h) = Ch / (C1 + C2 + … + C256).
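The block-wise accumulation above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each pixel is taken to hold a single numeric color value, and blocks are taken to be non-overlapping 16 × 16 tiles, which is what yields 256 blocks for a 256 × 256 image. The function name `local_cumulative_histogram` is hypothetical.

```python
def local_cumulative_histogram(image, block=16):
    """Return Ha(h) = C_h / (C_1 + ... + C_n) over non-overlapping blocks.

    `image` is a list of rows of numeric color values. The tile size is an
    assumption: 16 x 16 tiles give 256 blocks for a 256 x 256 image.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    sums = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Accumulated color value C of one block.
            c = sum(image[y][x]
                    for y in range(top, top + block)
                    for x in range(left, left + block))
            sums.append(c)
    total = sum(sums)
    if total == 0:
        return [0.0] * len(sums)
    return [c / total for c in sums]

# Small example: a 4 x 4 image split into four 2 x 2 blocks.
tiny = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
tiny_hist = local_cumulative_histogram(tiny, block=2)

# Full-size example: a synthetic 256 x 256 gradient image.
gradient = [[(x + y) % 256 for x in range(256)] for y in range(256)]
full_hist = local_cumulative_histogram(gradient)
```

In the small example the block sums are 4, 8, 12 and 16, so the normalized values are 0.1, 0.2, 0.3 and 0.4.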
2.3 Color Moment
The color moment is a simple and effective color feature representation that is widely used in image processing. Because the distribution of image color information is mainly concentrated in the low-order moments, the first moment (mean), second moment (variance) and third moment (skewness) of the color information are sufficient to express the color distribution of the image. The mathematical model is as follows.
where μi, σi, and si represent the first, second, and third moments, respectively, and N represents the number of pixels.
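The three color moments of one channel can be sketched as below. The exact normalization is an assumption: this sketch follows the commonly used form in which the second moment is the square root of the variance and the third moment is the signed cube root of the third central moment. The function name `color_moments` is illustrative.

```python
import math

def color_moments(channel):
    """Return (mean, second moment, third moment) of one color channel.

    `channel` is a flat list of the N pixel values of that channel.
    Assumed form: second moment = sqrt(variance),
    third moment = signed cube root of the third central moment.
    """
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((p - mean) ** 2 for p in channel) / n
    third = sum((p - mean) ** 3 for p in channel) / n
    sigma = math.sqrt(variance)
    # Keep the sign so that left-skewed channels give a negative value.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, sigma, skew

mu, sigma, s = color_moments([0, 0, 0, 4])
```

For a color image, the function would be applied to each channel separately, giving a nine-dimensional feature for three channels.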
3 Shape Feature
Shape is another important feature for describing image content, and shape description is a fundamental issue in computer vision and pattern recognition. With shape-based retrieval, the user can retrieve similar images from the image library by sketching the shape or outline of the target. There are two kinds of shape-based retrieval: one obtains the contour of the target after edge extraction and performs retrieval on the contour features; the other searches based on the regional features of the image [9, 18].
Methods for describing shape contour features include the boundary histogram [19], chain coding [20], curvature scale space [21] and Fourier descriptors [22]. The most typical is the Fourier descriptor. Its basic idea is to use the Fourier transform of the object boundary as the shape description, exploiting the closedness and periodicity of the region boundary to reduce the two-dimensional problem to a one-dimensional one, thereby improving retrieval efficiency. Methods for describing regional features mainly include invariant moments of the shape, the area of the region, and the aspect ratio of the shape. For shape-based retrieval, the extraction, description and matching of shapes are the key issues to be solved, and shape-based retrieval is more difficult than color- and texture-based retrieval.
3.1 Definition of Geometric Moment
Moments are used in imaging as an effective description of image shape features and have been widely used in image analysis, pattern recognition, and other fields.
Let the gray-level function of an image be f(x, y), where (x, y) denotes a pixel of the image. The (p + q)-order geometric moment (standard moment) of the image is then defined as [23]:
\( m_{pq}=\sum_{x=1}^{M}\sum_{y=1}^{N}x^{p}y^{q}f(x,y),\quad p,q=0,1,2,\dots \)  (7)
The (p + q)-order central moment is defined as
\( \mu_{pq}=\sum_{x=1}^{M}\sum_{y=1}^{N}(x-\overline{x})^{p}(y-\overline{y})^{q}f(x,y) \)  (8)
where \( \overline{x} \) and \( \overline{y} \) are the coordinates of the center of gravity of the image, \( \overline{x}=m_{10}/m_{00} \), \( \overline{y}=m_{01}/m_{00} \), and N and M are the height and width of the image, respectively.
The normalized central moment is defined as
\( \eta_{pq}=\frac{\mu_{pq}}{\mu_{00}^{\rho}} \)  (9)
where \( \rho=\frac{p+q}{2}+1 \), p + q = 2, 3, 4, …
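The three kinds of moments defined in this subsection can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the image is a plain list of rows of gray values, and coordinates are 0-based (x is the column, y is the row), which does not affect the central moments because they are measured relative to the center of gravity.

```python
def geometric_moment(img, p, q):
    """m_pq: sum over all pixels of x^p * y^q * f(x, y)."""
    return sum(x ** p * y ** q * value
               for y, row in enumerate(img)
               for x, value in enumerate(row))

def central_moment(img, p, q):
    """mu_pq: like m_pq, but measured from the center of gravity."""
    m00 = geometric_moment(img, 0, 0)
    x_bar = geometric_moment(img, 1, 0) / m00
    y_bar = geometric_moment(img, 0, 1) / m00
    return sum((x - x_bar) ** p * (y - y_bar) ** q * value
               for y, row in enumerate(img)
               for x, value in enumerate(row))

def normalized_central_moment(img, p, q):
    """eta_pq = mu_pq / mu_00^rho with rho = (p + q) / 2 + 1."""
    rho = (p + q) / 2 + 1
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** rho

def place_block(size, left, top):
    """Return a size x size image containing one 2 x 2 block of ones."""
    img = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            img[y][x] = 1
    return img

corner = place_block(5, 0, 0)
shifted = place_block(5, 2, 3)
```

The geometric moment m10 differs between `corner` and `shifted` (2 versus 10), while the central moment of order (2, 0) is 1.0 for both, which shows the translation invariance that the central moments provide.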
3.2 Hu Invariant Moment
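Combining the normalized second-order central moments η11, η20, η02 and the third-order central moments η12, η21, η03, η30 yields seven moments that are invariant to translation, rotation, and scale. These are used as the shape feature of the image, i.e.,
\( M_1=\eta_{20}+\eta_{02} \)  (10)
\( M_2=(\eta_{20}-\eta_{02})^{2}+4\eta_{11}^{2} \)  (11)
\( M_3=(\eta_{30}-3\eta_{12})^{2}+(3\eta_{21}-\eta_{03})^{2} \)  (12)
\( M_4=(\eta_{30}+\eta_{12})^{2}+(\eta_{21}+\eta_{03})^{2} \)  (13)
\( M_5=(\eta_{30}-3\eta_{12})(\eta_{30}+\eta_{12})\left[(\eta_{30}+\eta_{12})^{2}-3(\eta_{21}+\eta_{03})^{2}\right]+(3\eta_{21}-\eta_{03})(\eta_{21}+\eta_{03})\left[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}\right] \)  (14)
\( M_6=(\eta_{20}-\eta_{02})\left[(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}\right]+4\eta_{11}(\eta_{30}+\eta_{12})(\eta_{21}+\eta_{03}) \)  (15)
\( M_7=(3\eta_{21}-\eta_{03})(\eta_{30}+\eta_{12})\left[(\eta_{30}+\eta_{12})^{2}-3(\eta_{21}+\eta_{03})^{2}\right]-(\eta_{30}-3\eta_{12})(\eta_{21}+\eta_{03})\left[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}\right] \)  (16)
Thus, the internal grayscale distribution can be expressed by the seven invariant moments M1, M2, …, M7.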
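A minimal sketch of the seven Hu invariant moments, written in their standard form, is given below. It assumes a non-empty image stored as a list of rows of gray values; the function names and the toy L-shaped image are illustrative and are not taken from the paper.

```python
def normalized_central_moments(img):
    """Return a function eta(p, q) for a 2-D list of gray values."""
    rows, cols = len(img), len(img[0])
    m00 = sum(sum(row) for row in img)
    x_bar = sum(x * img[y][x] for y in range(rows) for x in range(cols)) / m00
    y_bar = sum(y * img[y][x] for y in range(rows) for x in range(cols)) / m00

    def eta(p, q):
        mu = sum((x - x_bar) ** p * (y - y_bar) ** q * img[y][x]
                 for y in range(rows) for x in range(cols))
        return mu / m00 ** ((p + q) / 2 + 1)  # mu_00 equals m_00

    return eta

def hu_moments(img):
    """Return [M1, ..., M7] in the standard Hu form."""
    eta = normalized_central_moments(img)
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03          # sums that recur in M4..M7
    c, d = n30 - 3 * n12, 3 * n21 - n03  # differences that recur in M3, M5, M7
    m1 = n20 + n02
    m2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    m3 = c ** 2 + d ** 2
    m4 = a ** 2 + b ** 2
    m5 = c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2)
    m6 = (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b
    m7 = d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2)
    return [m1, m2, m3, m4, m5, m6, m7]

def rotate90(img):
    """Rotate the image by 90 degrees (exact on a pixel grid)."""
    return [list(row) for row in zip(*img[::-1])]

l_shape = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
original = hu_moments(l_shape)
rotated = hu_moments(rotate90(l_shape))
```

The two lists agree up to floating-point error, which is the rotation invariance that makes these moments usable as a shape feature.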
4 Similarity Measure
In image retrieval, the similarity or difference between images is measured by calculating the distance between the feature vectors of the query image and the database images. In this paper, the histogram intersection and the Euclidean distance are used as similarity measures between image feature vectors, and their applicability to the image library is verified by experiments [5, 7, 24].
Let the database image be D and the query image be Q, let HQ(k) and HD(k) be the histograms of Q and D, respectively, and let L be the number of gray levels of the histogram. The histogram intersection formula is as follows:
The Euclidean distance formula is as follows:
The smaller the calculated distance, the greater the similarity between the two images; the larger the distance, the smaller the similarity.
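The two measures can be sketched as follows. The normalization of the histogram intersection is an assumption: the sketch uses one common form, the sum of bin-wise minima divided by the sum of the database histogram. In this form the intersection is a similarity (larger means more alike), whereas the Euclidean distance is a distance (smaller means more alike). The function names are illustrative.

```python
import math

def histogram_intersection(h_query, h_db):
    """Sum of bin-wise minima, normalized by the database histogram."""
    overlap = sum(min(q, d) for q, d in zip(h_query, h_db))
    return overlap / sum(h_db)

def euclidean_distance(h_query, h_db):
    """Square root of the summed squared bin differences."""
    return math.sqrt(sum((q - d) ** 2 for q, d in zip(h_query, h_db)))

h_q = [0.5, 0.5, 0.0]
h_d = [0.25, 0.5, 0.25]
similarity = histogram_intersection(h_q, h_d)
distance = euclidean_distance(h_q, h_d)
```

If a single distance-like score is wanted, `1 - similarity` turns the intersection into a value that also decreases as images become more alike.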
5 Image Retrieval Algorithm Based on Color Feature and Shape Feature (CSIR)
This paper proposes a retrieval algorithm based on a combination of color and shape features. The specific steps are as follows:
Step 1 Convert the image library image into an image of 256 × 256 pixels;
Step 2 Calculate an image cumulative histogram according to eq. (3);
Step 3 Calculate the shape feature vector from the seven Hu invariant moments of the image according to eqs. (10) to (16);
Step 4 Combine the features obtained in step 2 and step 3 to obtain a combination of color and shape characteristics;
Step 5 Calculate the Euclidean distance between the target image feature and the image feature in the image library according to eq. (18) and sort the values obtained by the Euclidean distance from small to large to obtain the search result.
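Steps 4 and 5 can be sketched as below, assuming the color and shape feature vectors have already been computed in Steps 2 and 3. The paper does not state the weight values, so the 0.5/0.5 weights, the function names, the image names and the toy feature values are all hypothetical.

```python
import math

def combine_features(color, shape, w_color=0.5, w_shape=0.5):
    """Step 4: concatenate weighted color and shape features.

    The weights are placeholders; the paper only says that
    "certain weights" are used.
    """
    return [w_color * c for c in color] + [w_shape * s for s in shape]

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def retrieve(query_feature, library, top_k=10):
    """Step 5: rank library images by ascending distance to the query."""
    ranked = sorted(library,
                    key=lambda name: euclidean(query_feature, library[name]))
    return ranked[:top_k]

# Toy library: name -> combined feature (values are made up).
library = {
    "sign_a": combine_features([0.5, 0.5], [0.1]),
    "sign_b": combine_features([0.9, 0.1], [0.1]),
    "leaf_a": combine_features([0.5, 0.5], [0.9]),
}
results = retrieve(library["sign_a"], library)
```

Because the query is itself in the library, it comes back first with distance zero, which matches the experimental setup in which the first result is the target image.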
6 Experimental Results and Analysis
The experimental environment is as follows: the CPU is an Intel Core i7-7700 (eight cores, 3.6 GHz), the memory is 16 GB, the operating system is Windows 10 (64-bit), and the programming software is Visual C++ 6.0.
The image library used in the experiment contains four categories (traffic signs, gestures, cars and leaves) with 80 images each, for a total of 320 images. Traffic signs are numbered 1 to 80, gestures 81 to 160, cars 161 to 240 and leaves 241 to 320. Examples of each category are shown in Fig. 1.
To verify the effectiveness of the proposed algorithm, the simulation experiment compares it with six related algorithms: color histogram retrieval, cumulative histogram retrieval, color moment retrieval, invariant moment retrieval, retrieval based on the combination of the color histogram and invariant moments, and retrieval based on the combination of the color moment and invariant moments.
6.1 Search Results
Before the experiment, each image in the image library was converted to 256 × 256 pixels. The target images all come from the image library. The 10 images with the highest similarity are displayed in the query result, and the first of them is the target image itself. Some of the search results are shown in Figs. 2a–g. Among the 10 result images obtained by searching for a traffic sign image, the color histogram algorithm returns 4 related images, the cumulative histogram algorithm 8, the color moment algorithm 2, the Hu invariant moment algorithm 5, the algorithm of [25] 8, the combination of color moment and Hu invariant moments 8, and the combination of color and shape features proposed in this paper 9.
These results show that the algorithm combining color and shape features proposed in this paper outperforms the image retrieval algorithms that use a single feature.
6.2 Precision and Recall
Precision and recall are important indicators of image retrieval efficiency. To analyze the effectiveness of the algorithm objectively, this experiment uses the algorithm of [25], the color moment based retrieval algorithm, the Hu invariant moment based algorithm, and the combination of color and shape features proposed in this paper. The images are searched one by one, and the average precision and average recall are calculated, as shown in Tables 1 and 2.
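For one query, precision and recall can be sketched as below, using the image numbering of the experimental library (80 images per category, numbered consecutively). The sketch assumes that an image counts as relevant when it belongs to the query's category and that the query image itself is included in the count; the paper does not spell out either convention. The function names and the example result list are hypothetical.

```python
IMAGES_PER_CATEGORY = 80  # 4 categories x 80 images in the library

def category_of(image_id):
    """Map image numbers 1..320 to a category index 0..3."""
    return (image_id - 1) // IMAGES_PER_CATEGORY

def precision_recall(query_id, retrieved_ids):
    """Return (precision, recall) for one query and its result list."""
    target = category_of(query_id)
    hits = sum(1 for i in retrieved_ids if category_of(i) == target)
    precision = hits / len(retrieved_ids)  # relevant among retrieved
    recall = hits / IMAGES_PER_CATEGORY    # relevant among all in category
    return precision, recall

def average(values):
    """Arithmetic mean, used to average the per-query scores."""
    return sum(values) / len(values)

# Hypothetical top-10 result for traffic sign image 1: nine signs, one car.
retrieved = [1, 5, 17, 23, 42, 60, 71, 79, 80, 200]
precision, recall = precision_recall(1, retrieved)
```

With nine relevant images among ten results, precision is 0.9 and recall is 9/80. Averaging these per-query values over all queries of a category gives the average precision and average recall.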
The following conclusions can be drawn from Table 1 and Fig. 3. First, compared with retrieval algorithms that use a single color or shape feature, combining color and shape features improves the retrieval effect to some extent. Second, comparing the color histogram algorithm with the algorithm that combines the color histogram and invariant moments shows that adding the invariant moment shape feature improves the retrieval effect to some extent. Third, compared with the algorithm of this paper, the algorithm of [25] requires a large amount of computation when extracting color features with the color histogram. Fourth, compared with the algorithm that combines the color moment and Hu invariant moments, the proposed algorithm achieves a higher average precision in all four categories (traffic signs, gestures, cars and leaves), so its retrieval effect is improved to some extent.
In addition, the average recall in Table 2 and Fig. 4 further confirms that the algorithm combining color and shape features proposed in this paper has higher retrieval accuracy and better retrieval performance than retrieval with a single feature.
Finally, the comparison shows that, taking computational overhead, average precision and average recall into account, the combination of color and shape features proposed in this paper performs best among the seven algorithms.
7 Conclusions
In this paper, we propose a retrieval algorithm based on the combination of color and shape features, which gives users a better image retrieval service. The method can be applied to many different scenes, and combining color features with shape features improves the efficiency of image retrieval to some extent. Experimental and theoretical results show that the method effectively improves retrieval efficiency and is portable. In future work, other image feature extraction methods will be considered to obtain better retrieval results.
References
Qiu, H., Noura, H., Qiu, M., Ming, Z., & Memmi, G. (2019). A user-centric data protection method for cloud storage based on invertible DWT[J]. IEEE Transactions on Cloud Computing, early access 1–1.
Qiu, H., Memmi, G., Chen, X., & Xiong, J. (2019). DC coefficient recovery for JPEG images in ubiquitous communication systems[J]. Future Generation Computer Systems, 96, 23–31.
Qiu, H., Kapusta, K., Lu, Z., Qiu, M., & Memmi, G. (2019). All-or-nothing data protection for ubiquitous communication: Challenges and perspectives[J]. Information Sciences, 502, 434.
Gandhani, S., & Singhal, N. (2015). Content based image retrieval: Survey and comparison of CBIR system based on combined features[J]. International Journal of Signal Processing, Image Processing and Pattern Recognition, 8(10), 155–162.
Qiu, M., Sha, E. H. M., Liu, M., Lin, M., Hua, S., & Yang, L. T. (2008). Energy minimization with loop fusion and multi-functional-unit scheduling for multidimensional DSP[J]. Journal of Parallel and Distributed Computing, 68(4), 443–455.
Shao, Z., Wang, M., Chen, Y., Xue, C., Qiu, M., Yang, L., & Sha, E. (2007). Real-time dynamic voltage loop scheduling for multi-core embedded systems[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 54(5), 445–449.
Wang, J., Qiu, M., & Guo, B. (2017). Enabling real-time information service on telehealth system over cloud-based big data platform[J]. Journal of Systems Architecture, 72, 69–79.
Li, J., Qiu, M., Niu, J., Gao, W., Zong, Z., & Qin, X. (2010). Feedback dynamic algorithms for preemptable job scheduling in cloud systems [C]. In 2010 IEEE/WIC/ACM international conference on web intelligence (pp. 561–564). IEEE Computer Society.
Qiu, M., Jia, Z., Xue, C., Shao, Z., & Sha, E. H. M. (2007). Voltage assignment with guaranteed probability satisfying timing constraint for real-time multiprocessor DSP[J]. The Journal of VLSI Signal Processing Systems, 46(1), 55–73.
Swain, M. J., & Ballard, D. H. (1991). Color indexing[J]. International Journal of Computer Vision, 7(1), 11–32.
Pass, G., Zabih, R., & Miller, J. (1996). Comparing images using color coherence vectors[C] (Vol. 96, pp. 65–73). ACM multimedia.
Stricker, M. A., & Orengo, M. (1995). Similarity of color images[C]. Storage and retrieval for image and video databases III (Vol. 2420, pp. 381–392). International Society for Optics and Photonics.
Heng, H. E., & Lin, Y. Y. (2001). Image retrieval using combined fuzzy histogram[J]. Journal of Image and Graphics, 7(106), 84–88.
Talib, A., Mahmuddin, M., Husni, H., & George, L. E. (2013). A weighted dominant color descriptor for content-based image retrieval[J]. Journal of Visual Communication and Image Representation, 24(3), 345–360.
Walia, E., & Pal, A. (2014). Fusion framework for effective color image retrieval[J]. Journal of Visual Communication and Image Representation, 25(6), 1335–1348.
Zeng, S., Huang, R., Wang, H., & Kang, Z. (2016). Image retrieval using spatiograms of colors quantized by Gaussian mixture models[J]. Neurocomputing, 171, 673–684.
Gai, K., Qiu, M., Xiong, Z., & Liu, M. (2018). Privacy-preserving multi-channel communication in edge-of-things[J]. Future Generation Computer Systems, 85, 190–200.
Li, J., Qiu, M., Niu, J., Gao, W., Zong, Z., & Qin, X. (2010). Feedback dynamic algorithms for preemptable job scheduling in cloud systems[C]. In 2010 IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology (Vol. 1, pp. 561–564). IEEE.
Sundara Vadivel, P., Yuvaraj, D., Navaneetha Krishnan, S., & Mathusudhanan, S. R. (2019). An efficient CBIR system based on color histogram, edge, and texture features[J]. Concurrency and Computation: Practice and Experience, 31(12), e4994.
Dhou, K. (2019). An innovative design of a hybrid chain coding algorithm for bi-level image compression using an agent-based modeling approach[J]. Applied Soft Computing, 79, 94–110.
Lisheng, R., & Lizhong, W. (2016). Analysis of image matching algorithm for corner detection based on curvature scale space[J]. Electronic Technology Application, 42(12), 112–114.
XU Qiang, & MA Dengwu. (2014). Classification tree of contours based on Fourier descriptor’s principal coefficients[J]. Journal of Computer Applications, 34(A01), 124–126.
Yongku, Z., Yunfeng, L. I., & Jingguang, S. (2014). Image retrieval based on clustering according to color and shape features[J]. Journal of Computer Applications, 34(12), 3549–3553.
Lan, R., Guo, S., & Jia, S. (2018). Criminal investigation image retrieval algorithm based on texture and shape feature fusion[J]. Computer Engineering and Design, 39(4), 1106–1110.
Imran, M., Hashim, R., & Khalid, N. E. A. (2014). Segmentation-based fractal texture analysis and color layout descriptor for content based image retrieval[C]. In 2014 14th international conference on intelligent systems design and applications (pp. 30–33). IEEE.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61972136, No. 61471161, No. 61971339), the Hubei Provincial Department of Education Outstanding Youth Scientific Innovation Team Support Foundation (T201410), and the Hubei Province Higher Education Teaching Research Project (No. 2018432).
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zenggang, X., Zhiwen, T., Xiaowen, C. et al. Research on Image Retrieval Algorithm Based on Combination of Color and Shape Features. J Sign Process Syst 93, 139–146 (2021). https://doi.org/10.1007/s11265-019-01508-y
DOI: https://doi.org/10.1007/s11265-019-01508-y