1 Introduction

Texture is an essential characteristic in image retrieval and analysis, and it has attracted considerable research over the past few decades. Texture analysis and classification is an active research topic in image processing (Luo and Crandall 2006). Nowadays, the demand for highly efficient and highly accurate image retrieval systems has increased. Earlier information retrieval systems relied on text-based methods; the use of such methods later declined with the arrival of content-based image retrieval (CBIR) systems, because content-based methods return results that are visually more accurate. According to Quellec et al. (2012) and Umamaheswaran et al. (2015), CBIR methods retrieve images more effectively than text-based methods.

The primary aim of an image retrieval system is efficient and accurate searching, browsing and retrieval from large data sets, either online or offline. Recent image retrieval applications include face recognition and the detection and retrieval of human body actions from stored databases (Jones and Shao 2013; Zarchi et al. 2014; Singh et al. 2012; Zhang et al. 2012; Low 2004). The accuracy of an image retrieval system depends on an appropriate representation of the image feature descriptors. An image retrieval system should be able to return images similar to an arbitrarily selected query image. Currently, many image retrieval systems use semantic-based methods, because these address the problems of CBIR (Low 2004; Haralick and Shangmugam 1973; Tamura et al. 1978). Cross and Jain (1983), Manjunathi and Ma (1996) and Ojala et al. (2002) stated that feedback algorithms may help to identify semantic properties and to produce results closely related to human perception in CBIR.

Wang et al. (2011) estimated statistical parameters from Gray Level Co-occurrence Matrix (GLCM) features computed over sequential and random windows. That work applies image preprocessing to the texture, and the window count wcount is obtained by dividing the texture size by the window size. When the statistical window method is followed, the variables i, j and count are initialized to 1. A window is read from the preprocessed image, and GLCM features are calculated from the window at 0°, 45°, 90° and 135°. The process stops when count is greater than wcount, and the results are then passed to the retrieval system. Random windows are handled in the same way. That work concludes that retrieval performed after preprocessing gives better results than other leading methods (Liu and Zhang 2010; Luo and Crandall 2006; Liu and Yang 2013; Su and Jure 2012).

2 Related works

Much research has been performed in the area of CBIR using a varied number of features. In this section, we discuss some recent CBIR work based on texture features. In global feature based methods, the image is not split or sub-grouped into multiple regions. Chen et al. (2010) used color content alone to describe the image features. Their method extracts and preserves the third moment of the color distribution, and its output is finer than that of other color properties, but only color features are considered. Wang et al. (2011) used the low-level primary attributes of texture, color and shape effectively to describe the global image. Their method uses pseudo-Zernike moments and a steerable filter decomposition to acquire the dominant color information needed to construct the descriptor. These methods do not consider the structure of local neighbours to encode the connection between neighbouring pixels. In region based methods, many researchers have integrated spatial location to build the feature description. Hsiao et al. (2010) designed a method that splits each image into five regions based on fixed absolute locations, and does the same for semantic retrieval; these methods also require user intervention during the retrieval process. In addition, these methods use local neighbour pixel values to boost local power. Lin et al. (2011) developed a method that uses three feature descriptors to represent the spatial and color properties of the images. This method uses an unsupervised algorithm to split the image into a variety of clusters based on pixel intensity. It gives good results, at the cost of high computation and sizable multidimensional descriptors.

Liu and Yang (2008) designed a method based on the co-occurrence matrix combined with histogram features, and from these features developed a new feature descriptor for CBIR, the Multi-texton Histogram (MTH). Liu et al. (2011) presented the Micro-Structure Descriptor (MSD), which combines all three low-level features with image spatial properties for efficient image retrieval. Xingyuan and Zongyu (2013) designed the Structure Element Histogram (SEH) descriptor, which combines color and texture features. SEH-based methods give promising results in CBIR, but their performance degrades under scaling and rotation. Using a single low-level feature reduces the complexity of image classification and CBIR problems; nevertheless, many methods prefer to combine features while keeping the dimensionality of the descriptor within limits. Wang et al. (2013) designed a method that uses two low-level features, color and texture. This method obtains the color attribute from the Zernike chromaticity distribution in the opponent chromaticity space. Bella and Vasuki (2019) proposed an image retrieval method based on a fused information feature. Zhang et al. (2019) proposed a new image retrieval algorithm based on shadowed sets. The algorithm uses shadowed set theory to extract the salient regions, shadowed regions and non-salient regions from an image, and combines the features extracted from the salient and shadowed regions for use in the retrieval process. Shamna et al. (2019) introduced a new automated image retrieval system using a topic and location model; topic information is obtained with Guided Latent Dirichlet Allocation, together with the spatial information of visual words. They used a new metric, position-weighted precision, to evaluate the system.

To overcome the drawbacks of the descriptors discussed above, a novel composite micro-structure descriptor (CMSD) is presented for image retrieval. The present approach treats the full image content as one region and builds a new descriptor. The CMSD unifies the color cues and textural cues of the image. The rest of the paper is organized as follows: Section 3 describes the proposed technique, Section 4 presents the detailed result analysis, and Section 5 gives the conclusion and future enhancements.

3 Feature extraction process

Feature extraction is a form of dimensionality reduction that transforms the input data into a reduced representation. The features extracted from an image are expected to capture its relevant characteristics and are supplied to the classifier as a feature vector. The core objective of feature extraction from images is to reduce the original data set by measuring attributes (Oliva and Torralba 2001).

Feature extraction provides the different characteristics of the image that are considered in the image retrieval system. The proposed CBIR system uses three feature vectors:

  1. F(V1) of the original image.

  2. F(V2) of the oriented Gabor filtered image.

  3. F(V3) using the micro-structure descriptor.

3.1 Calculation of feature vector F(V1) using original image

In general, dividing an image into a number of smaller sub-images is called gridding. The present method divides the original image using 4, 18 and 24 grids. The block count value is computed from the pixel intensity, which ranges from 1 to 255. The resulting F(V1) is calculated from the gridded original image.
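The gridding step can be illustrated with a minimal sketch. The text does not define the block count precisely, so the sketch assumes it means counting, within each grid cell, how often each intensity from 1 to 255 occurs; the function names `split_into_grid` and `intensity_counts` and the 2 × 2 layout are hypothetical choices made only for illustration.

```python
def split_into_grid(image, rows, cols):
    """Split a 2-D list of pixel intensities into rows * cols rectangular cells."""
    height, width = len(image), len(image[0])
    cells = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cells.append([row[left:right] for row in image[top:bottom]])
    return cells


def intensity_counts(cell):
    """Count occurrences of each intensity 1..255 in one cell (assumed 'block count')."""
    counts = [0] * 255  # index 0 holds the count for intensity 1
    for row in cell:
        for value in row:
            if 1 <= value <= 255:
                counts[value - 1] += 1
    return counts


# Toy 8 x 8 image divided into 4 grids (2 x 2); the per-cell counts are
# laid end to end to form one feature vector.
image = [[(x + y) % 256 for x in range(8)] for y in range(8)]
cells = split_into_grid(image, 2, 2)
feature = [n for cell in cells for n in intensity_counts(cell)]
```

Under this reading, a layout with more grids yields a longer vector that retains more spatial detail.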

3.2 Calculation of F(V2) using Gabor transform

A Gabor function is a Gaussian envelope modulated by a complex sinusoid. It is specified by the sinusoid frequency W and the standard deviations \(\sigma_{x}\) and \(\sigma_{y}\) of the Gaussian envelope.

$$\varPsi_{mn} \left( {x,y} \right) = \frac{1}{{2\pi \sigma_{x} \sigma_{y} }}exp\left[ { - \frac{1}{2}\left( {\frac{{x^{2} }}{{\sigma_{x}^{2} }} + \frac{{y^{2} }}{{\sigma_{y}^{2} }}} \right)} \right]exp\left( {j2\pi Wx} \right)$$
(3.1)
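Eq. (3.1) can be sampled on a discrete grid to obtain a filter kernel. The sketch below is illustrative only: the kernel size and the parameter values are assumptions, and the scale and orientation indices m and n are not modelled because the text does not define how they alter the function.

```python
import cmath
import math


def gabor_kernel(size, sigma_x, sigma_y, W):
    """Sample Eq. (3.1) on a size x size grid centred at the origin (size must be odd)."""
    half = size // 2
    norm = 1.0 / (2.0 * math.pi * sigma_x * sigma_y)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Gaussian envelope with standard deviations sigma_x and sigma_y
            envelope = math.exp(-0.5 * (x * x / sigma_x ** 2 + y * y / sigma_y ** 2))
            # Complex sinusoid of frequency W along the x axis
            carrier = cmath.exp(2j * math.pi * W * x)
            row.append(norm * envelope * carrier)
        kernel.append(row)
    return kernel


# Hypothetical parameter values, chosen only for illustration.
kernel = gabor_kernel(5, 2.0, 2.0, 0.25)
```

At the origin the carrier equals 1, so the centre value of the kernel is the normalisation constant 1/(2πσxσy).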

The response of the Gabor filter for an image I is given by

$$G_{mn} \left( {x,y} \right) = \mathop \sum \limits_{s} \mathop \sum \limits_{t} I\left( {x - s, y - t} \right)\varPsi_{mn}^{*} \left( {s, t} \right)$$
(3.2)
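The double sum in Eq. (3.2) is a discrete convolution of the image with the complex conjugate of the Gabor function. A direct, unoptimised sketch follows; it assumes zero padding outside the image borders and a square kernel of odd size whose centre corresponds to (s, t) = (0, 0). The function name `gabor_response` is hypothetical.

```python
def gabor_response(image, kernel):
    """Evaluate G(x, y) = sum_s sum_t I(x - s, y - t) * conj(kernel(s, t))."""
    height, width = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0j] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0j
            for t in range(-half, half + 1):
                for s in range(-half, half + 1):
                    yy, xx = y - t, x - s
                    # Pixels outside the image are treated as zero (assumption).
                    if 0 <= yy < height and 0 <= xx < width:
                        total += image[yy][xx] * kernel[t + half][s + half].conjugate()
            out[y][x] = total
    return out


# A single bright pixel reproduces the conjugated kernel around its position.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 1
toy_kernel = [[1 + 1j, 2, 3], [4, 5 - 2j, 6], [7, 8, 9j]]
response = gabor_response(impulse, toy_kernel)
```

A real-valued image for the later gridding step would typically be taken from the magnitude of this complex response, although the text does not state which quantity is used.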

After the Gabor orientation process is applied to the original image, the gridding process is applied, and the block count value is computed from the pixel intensity values of the Gabor-oriented image. The resulting F(V2) is obtained from the gridded Gabor-transformed image.

3.3 Calculation of histogram vector F(V3)

The histogram vector computation consists of two steps:

  i. Formation of the texton structure image.

  ii. Calculation of the block count value for each intensity value of the texton structure image.

3.4 Texton image structure creation process

The texton is a widely used and essential feature in texture analysis. According to Julesz (1981), a texton is a type of pattern whose common properties are shared all over the image. First, texton templates are framed using four types of texton; these templates are then used to extract the texton structure maps from the image; finally, the texton structure image is formed by combining all the texton structure maps.

The extraction of the texton structure map is a four-stage process. The original image f(a, b) is split into 3 × 3 blocks as described below.

A 3 × 3 block is moved through the original image f(a,b) horizontally and vertically, from left to right and from top to bottom, starting at the origin (0,0) with a fixed step length of three pixels. This creates the texton structure map T1(a,b), where 0 ≤ a ≤ M − 1, 0 ≤ b ≤ N − 1.

The previous step is repeated starting from (0,1), (1,0) and (1,1) to create the texton structure maps T2(a,b), T3(a,b) and T4(a,b), respectively, each with 0 ≤ a ≤ M − 1, 0 ≤ b ≤ N − 1.

The final texton structure map T(a,b) is created using Eq. (3.3):

$${\text{T}}\left( {\text{a,b}} \right) = \hbox{max} \left\{ {{\text{T}}_{1} \left( {\text{a,b}} \right),{\text{T}}_{2} \left( {\text{a,b}} \right),{\text{T}}_{3} \left( {\text{a,b}} \right),{\text{T}}_{4} \left( {\text{a,b}} \right)} \right\}$$
(3.3)
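The four-offset scan and the fusion of Eq. (3.3) can be sketched as follows. The texton templates themselves are not specified in the text, so the matching rule used here is an assumption in the spirit of the micro-structure descriptor: inside each 3 × 3 block, the centre and those neighbours that share its value are kept, and every other position stays empty (0). The function names are hypothetical.

```python
def structure_map(image, start_row, start_col):
    """Scan non-overlapping 3 x 3 blocks from the given offset with a step of three pixels."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for top in range(start_row, height - 2, 3):
        for left in range(start_col, width - 2, 3):
            centre = image[top + 1][left + 1]
            # Assumed rule: a neighbour matches when it equals the centre value.
            matches = [(r, c)
                       for r in range(top, top + 3)
                       for c in range(left, left + 3)
                       if (r, c) != (top + 1, left + 1) and image[r][c] == centre]
            if matches:
                out[top + 1][left + 1] = centre
                for r, c in matches:
                    out[r][c] = centre
    return out


def final_structure_map(image):
    """Fuse the maps from offsets (0,0), (0,1), (1,0), (1,1) with the max of Eq. (3.3)."""
    maps = [structure_map(image, r, c) for r, c in [(0, 0), (0, 1), (1, 0), (1, 1)]]
    height, width = len(image), len(image[0])
    return [[max(m[r][c] for m in maps) for c in range(width)] for r in range(height)]


toy = [[5, 1, 2, 7],
       [3, 5, 4, 7],
       [6, 8, 7, 7],
       [7, 7, 7, 7]]
fused = final_structure_map(toy)
```

Because the four scans start at different offsets, a structure that straddles a block boundary in one scan can still be captured by another before the maps are fused.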

The texton structure map T1(a,b) extraction process is shown in Fig. 1.

Fig. 1 Texton structure map T1(a,b) extraction process

The final texton structure T(a,b), formed by combining the texton structure maps T1(a,b), T2(a,b), T3(a,b) and T4(a,b), is shown in Fig. 2.

Fig. 2 Organization of the final texton structure T(a,b) from the texton structure maps T1(a,b), T2(a,b), T3(a,b) and T4(a,b)

The final texton structure mask is applied to the original image. Mask positions are set to empty where the pixels do not match. The final texton structure image is created based on the modified MSD, as displayed in Fig. 3.

Fig. 3 Texton structure image extraction process

3.5 Calculation of blocks count value

The values of a texton image T(a,b) are expressed as T(a, b) = w, w ∈ {0, 1, 2, 3, …, N − 1}. In each 3 × 3 block of T(a,b), P0 = (a0, b0) denotes the centre position and T(P0) = w0; Pi = (ai, bi) denotes the eight neighbours of P0 and \(T(P_{i} ) = w_{i} ,\,\,i = 1,2,3, \ldots, 8\). Let N stand for the number of co-occurrences of the two values w0 and wi, and let \(\bar{N}\) stand for the number of occurrences of w0. A 3 × 3 block is moved through the image structure from bottom to top and from right to left. The texton structure feature of the image is then computed using Eq. (3.4), following Liu et al. (2011):

$$H\left( {w_{0} } \right) = \frac{{N\left\{ {T\left( {P_{0} } \right) = w_{0} \wedge T\left( {P_{i} } \right) = w_{i} \,\middle|\, \left\| {P_{i} - P_{0} } \right\| = 1} \right\}}}{{8\bar{N}\left\{ {T\left( {P_{0} } \right) = w_{0} } \right\}}},\quad where\,w_{0} = w_{i} ,\, i \in \left\{ {1,2, \ldots, 8} \right\}$$
(3.4)
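Eq. (3.4) can be read as follows: for every value w0, count how many neighbour positions share the value of their centre, and divide by eight times the number of pixels holding w0. The sketch below follows that reading under two assumptions: the condition ‖Pi − P0‖ = 1 is taken to mean the eight-neighbourhood, and border pixels contribute only the neighbours that exist. The function name is hypothetical.

```python
from collections import Counter


def cooccurrence_histogram(texton):
    """Return H(w0) of Eq. (3.4) for every value w0 present in the texton image."""
    height, width = len(texton), len(texton[0])
    pair_count = Counter()   # N: neighbour positions sharing the centre value w0
    value_count = Counter()  # N-bar: pixels whose value is w0
    for r in range(height):
        for c in range(width):
            w0 = texton[r][c]
            value_count[w0] += 1
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr, dc) == (0, 0):
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width and texton[rr][cc] == w0:
                        pair_count[w0] += 1
    return {w: pair_count[w] / (8 * value_count[w]) for w in value_count}


hist = cooccurrence_histogram([[1, 1],
                               [1, 2]])
```

In this toy image each of the three pixels with value 1 has two neighbours with value 1, so H(1) = 6 / (8 × 3) = 0.25, while the isolated value 2 gives H(2) = 0.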

The block count value is then measured from the image pixel intensity values. Finally, the resulting vector F(V3) is obtained from the texton structure image.

3.6 Derivation of the final feature vector F(V)

Thus, the proposed CMSD uses the multidimensional vector \(F(V) = F(V_{1} ) + F(V_{2} ) + F(V_{3} )\) as the final image feature for image retrieval.

4 Result analysis

This section discusses the available HIDs-based methods for CBIR, and then describes in detail the image data sets, the result analysis and comparison with existing methods, and the optimization strategy for the model parameters.

4.1 Retrieval metrics

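Precision and recall are the major metrics in image and information retrieval. The two metrics are combined as their weighted harmonic mean, the F-measure, which helps to analyse system performance (Liu and Zhang 2010).

The metrics are defined as follows, where P is the precision, R is the recall, \(I_{N}\) is the number of relevant images among those retrieved, N is the number of images retrieved, M is the number of relevant images in the data set, and β weights recall relative to precision: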

$$F = \frac{{\left( {1 + \beta^{2} } \right)*R*P}}{{\left( {\beta^{2} *P} \right) + R}}$$
(4.1)
$$P = \frac{{I_{N} }}{N}$$
(4.2)
$$R = \frac{{I_{N} }}{M}$$
(4.3)
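Eqs. (4.1) to (4.3) translate directly into code. The numbers in the example below are hypothetical and serve only to show the arithmetic; they are not results from this work.

```python
def precision(relevant_retrieved, retrieved):
    """Eq. (4.2): P = I_N / N."""
    return relevant_retrieved / retrieved


def recall(relevant_retrieved, relevant_total):
    """Eq. (4.3): R = I_N / M."""
    return relevant_retrieved / relevant_total


def f_measure(p, r, beta=1.0):
    """Eq. (4.1): weighted harmonic mean of precision and recall."""
    denominator = beta ** 2 * p + r
    # Return 0 when both precision and recall are 0 to avoid division by zero.
    return (1 + beta ** 2) * r * p / denominator if denominator else 0.0


# Hypothetical query: 10 images retrieved, 8 of them relevant,
# 100 relevant images in the data set.
p = precision(8, 10)
r = recall(8, 100)
f1 = f_measure(p, r)
```

With β = 1 the F-measure reduces to the ordinary harmonic mean 2PR / (P + R).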

4.2 Similarity metrics

System accuracy does not depend only on the stored feature content; it also depends on an efficient way of measuring the similarity between feature vectors. After feature extraction, the feature vector of the query image Q is described as \(f_{Q} = (f_{Q,1} , f_{Q,2} , \ldots , f_{{Q,L_{g} }} )\). Similarly, the feature vector of each image in the database DB is calculated and described as \(f_{{DB_{j} }} = (f_{{DB_{j1} }} , f_{{DB_{j2} }} , \ldots , f_{{DB_{{jL_{g} }} }} );\,\,j = 1,2, \ldots ,\left| {DB} \right|\). In this work, the Canberra distance is used, as shown below:

4.2.1 Canberra (L1)

$$D\left( {Q, DB} \right) = \mathop \sum \limits_{i = 1}^{{L_{g} }} \frac{{\left| {f_{{DB_{ji} }} - f_{Q,i} } \right|}}{{\left| {f_{{DB_{ji} }} } \right| + \left| {f_{Q,i} } \right|}}$$
(4.4)

where \(f_{{DB_{ji} }}\) is the ith feature of the jth image in the database and \(f_{Q,i}\) is the ith feature of the query image.
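A minimal sketch of Eq. (4.4) and of ranking database images by it is given below. Terms in which both features are zero are skipped, which is a common convention for the Canberra distance and an assumption here, since the text does not say how the 0/0 case is handled. The function names and the toy vectors are hypothetical.

```python
def canberra(query, candidate):
    """Eq. (4.4): sum over features of |d - q| / (|d| + |q|)."""
    total = 0.0
    for q, d in zip(query, candidate):
        denominator = abs(d) + abs(q)
        if denominator:  # skip 0/0 terms (assumed convention)
            total += abs(d - q) / denominator
    return total


def rank_database(query, database):
    """Return database indices ordered from most to least similar to the query."""
    return sorted(range(len(database)), key=lambda j: canberra(query, database[j]))


# Toy feature vectors, for illustration only.
query = [1, 2]
database = [[3, 2], [1, 2], [0, 5]]
order = rank_database(query, database)
```

Each term of the sum lies between 0 and 1, so no single large-valued feature can dominate the distance.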

We compared the proposed method with other image retrieval methods based on various descriptors, namely CHD, CCM, MSD and MTH. The results are analysed using two evaluation measures, precision and recall, and are displayed in Table 1.

Table 1 Evaluation analysis of various methods using the Corel-1000 data set

The performance graphs show that the proposed system gives better results than the previously available methods. The precision and recall performance graphs are given in Figs. 4 and 5.

Fig. 4 Comparison of precision results on Corel 1000 for each category

Fig. 5 Comparison of recall results on Corel 1000 for each category

Experiments with the proposed CMSD-based method were conducted on the Corel-1000 data set and four other data sets, in comparison with MTH, CHD, CCM and MSD. The average experimental results of the different methods on four data sets are given in Table 2.

Table 2 Comparison of retrieval results for five methods on four data sets

5 Conclusion and future work

The proposed method, a novel composite micro-structure descriptor (CMSD), shows better performance than other CBIR systems and integrates the advantages of both the multi-texton histogram and the micro-structure descriptor. The CMSD is designed so that internal correlations are explored and multi-scale features are analysed. In addition, Gabor filters are integrated into a bar-shaped structure to simulate the orientation-selective mechanism and to represent the image. The block value is then measured for each pixel intensity value from 1 to 255 for the original and transformed images. Experimental results show that the proposed approach (81.44% average precision and 9.75% average recall on the Corel 1000 data set) performs better than other descriptors such as CCM, CHD, MTH and MSD. A similar comparison shows better performance in terms of average precision and average recall on other data sets (Oliva, New Caltech and Corel 10000).