Content Based Image Retrieval Using Self Organizing Map

Shrinivasacharya, Purohit; Sudhamani, M. V.

doi:10.1007/978-81-322-0997-3_48

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 221)

Abstract

Recent literature shows that information storage and retrieval over the Internet have made impressive progress, yet practical searching is still constrained by the retrieval systems currently available. A Content Based Image Retrieval (CBIR) system provides an efficient way of retrieving related images from image collections. In this paper we present a new feature extraction technique, together with clustering of the extracted features, to achieve better performance in an image retrieval system. The proposed method combines edge information with median filtering to extract features from the image, and uses the Self Organizing Map (SOM) technique to cluster them. Median filtering is applied to the original image to obtain a smooth image, and the edge information is extracted using the Bidimensional Empirical Mode Decomposition (BEMD) technique. The values of the smooth image at the edge positions are then replaced with the edge image values detected by BEMD, and a 64-bin gray-level feature vector is extracted. These features are supplied as input to the SOM neural network, which clusters them into nine groups. Finally, the query image features are fed to the network to identify the cluster to which the query image belongs. The features in that cluster and its surrounding clusters are compared with the query image features, and the most similar images are displayed. The experiment is carried out on a ground truth database of 1000 images in different categories, and the results are compared with the conventional median filter histogram technique. The combination of median filtering, edge information and SOM clustering improves average precision by 2.37 % and recall by 2.82 % compared with the existing system.

1 Introduction

In the past few years, an enormous number of images has been added every minute to the World Wide Web (WWW), which creates the need for an effective and efficient image retrieval system. Retrieving an image with given characteristics from a big database is a crucial task. Searching for an image among a collection of images can be done by different approaches. Text based search uses the information surrounding an image, such as the image name and the surrounding text of the web page, and it requires humans to describe every image in the database personally. This is impractical for large image databases, and it is possible to miss images whose descriptions use different synonyms. These limitations of text based search are moving the trend towards the CBIR technique. CBIR uses features of the image content, instead of the surrounding text, to search for an image. Its main intention is to retrieve images from the image database automatically using image features.

CBIR is an alternative or complementary method to textual indexing search. It is an important part of multimedia information retrieval, and image feature extraction and representation form its foundation. The process of retrieving images from a database on the basis of features extracted from the images themselves is called CBIR. This paper presents a CBIR system which accepts a query image as input and retrieves relevant images based on the similarity between the features of the query image and the features of the individual images stored in the database. The proposed method uses edge detection by BEMD [1, 2], median filtering [3], histograms [4] and SOM [5, 6] clustering to build the new system. Images from ten categories have been chosen for the experiments to show that the proposed method performs well; the categories are shown in Fig. 1.

The rest of the paper is organized as follows: Sect. 2 describes the general clustering method. The proposed CBIR architecture and its phases are presented in Sect. 3. Section 4 gives the details of feature extraction and clustering: edge detection using BEMD and training the SOM network with the batch algorithm. Experimental results and graphs are presented in Sect. 5. Finally, the summary, conclusions and future work are given in Sect. 6.

2 Clustering

A clustering ‘P’ is a partitioning of a data set into a set of clusters $P_i$, $i=1, 2,\ldots ,m$, in such a way that each data sample belongs to exactly one cluster. The clustering can be carried out using a two-level approach, where the data set is first clustered using the SOM. The important benefit of the SOM is that the computational load decreases considerably, making it possible to cluster large data sets and to consider several types of preprocessing strategies in a limited time.
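Stated formally, and writing $D$ for the whole data set (a symbol introduced here for illustration, not used in the original), the requirement that every sample belongs to exactly one cluster can be read as:

$$\begin{aligned} \bigcup _{i=1}^{m} P_i = D, \qquad P_i \cap P_j = \emptyset \;\; \text{for } i \ne j \end{aligned}$$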

Self organizing maps were developed by Kohonen [5] in the 1980s. A SOM is a type of artificial neural network that is trained with unsupervised learning to produce a low dimensional, discretized representation of the input space of the training samples, called a map. The SOM consists of small working components called neurons. Associated with each neuron are a weight vector of the same dimension as the input vectors and a position in the map space. The weights of the neurons are initialized either to random values or sampled evenly from the subspace spanned by the two largest principal component eigenvectors. Internally the SOM uses the Euclidean distance metric to form the clusters. The main features of the SOM are

– Clustering of high dimensional data
– Resulting clusters are arranged on a grid.

3 Proposed Content Based Image Retrieval (CBIR) System

The CBIR system mainly consists of the following two phases.

Offline Phase: In this phase a feature vector is extracted from each image in the collection, and the features and their index are stored in the database. These features are used to create clusters with the SOM neural network. The clustering process is repeated whenever new images are added to the system. This phase is the top portion of Fig. 2.

Online Phase: In this phase the user gives a query image in order to retrieve similar images from the database. The query image feature vector is extracted in the same manner and is given to the SOM neural network to identify the cluster to which it belongs. The feature vectors of the identified cluster and its surrounding clusters are compared with the query features to identify similar images. The selected similar images are displayed on the Graphical User Interface (GUI). This phase is the bottom part of Fig. 2.

4 Feature Extraction

4.1 Edge Detection Using BEMD

In this paper, the empirical mode decomposition algorithm is used to detect the edges of the image. When the original image is decomposed using BEMD, the first Intrinsic Mode Function (IMF) shows a fine edge characterization. A clear edge image is obtained from the first IMF by applying a suitable threshold. The process of extracting the IMFs from the image is called sifting and proceeds as follows.

Assume that $Y(t)$ is the original signal and let $S(t)=Y(t)$, $k=1$ and $i=0$.

1. Find the local minima and maxima of $S(t)$.
2. Find the lower envelope $LE(t)$ by interpolating between the minima and the upper envelope $UE(t)$ by interpolating between the maxima.
3. Compute the mean envelope as an approximation to the local average
$$\begin{aligned} M(t)=(UE(t)+LE(t))/2 \end{aligned}$$
(1)
4. Let $i=i+1$ and define the intermediate function as
$$\begin{aligned} IM_i(t)=S(t)-M(t) \end{aligned}$$
(2)
5. Repeat steps 1 to 4 on $IM_i(t)$ until it is an IMF, then record the IMF
$$\begin{aligned} C_k(t)=IM_i(t) \end{aligned}$$
(3)
6. Let $S(t)=S(t)-C_k (t)$. If the stopping criterion is reached, stop the sifting process; otherwise let $k=k+1$, $i=0$ and go to step 1.

After completing the sifting process, the original signal $Y(t)$ can be represented using the extracted IMFs as follows

$$\begin{aligned} Y(t) = \sum _{j=1}^{n} C_j (t)+S_n(t) \end{aligned}$$

(4)

where $C_j (t)$ is the $j^{th}$ IMF and $S_n(t)$ is the residue after $n$ IMFs have been extracted.
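The sifting steps and the reconstruction in Eq. (4) can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: it assumes linear interpolation for the envelopes (cubic splines are more usual), a fixed number of sifting passes in place of a formal IMF test, and a stopping rule that ends when the residue has no interior extrema. The function names and parameters are hypothetical.

```python
import math

def local_extrema(s):
    """Step 1: indices of strict interior local minima and maxima."""
    minima, maxima = [], []
    for t in range(1, len(s) - 1):
        if s[t] < s[t - 1] and s[t] < s[t + 1]:
            minima.append(t)
        elif s[t] > s[t - 1] and s[t] > s[t + 1]:
            maxima.append(t)
    return minima, maxima

def envelope(s, idx):
    """Step 2: piecewise-linear envelope through the extrema in idx.
    Assumption: the two end samples are added as extra knots."""
    knots = sorted(set([0] + idx + [len(s) - 1]))
    env = [0.0] * len(s)
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * s[a] + w * s[b]
    return env

def sift_imf(s, max_sift=10):
    """Steps 1-5 with a fixed number of passes (assumed, not from the paper)."""
    im = list(s)
    for _ in range(max_sift):
        minima, maxima = local_extrema(im)
        if not minima or not maxima:
            break
        lower, upper = envelope(im, minima), envelope(im, maxima)
        mean = [(u + l) / 2 for u, l in zip(upper, lower)]   # Eq. (1)
        im = [x - m for x, m in zip(im, mean)]               # Eq. (2)
    return im

def emd(y, max_imfs=3):
    """Step 6: peel off IMFs until the residue stops oscillating."""
    s, imfs = list(y), []
    for _ in range(max_imfs):
        minima, maxima = local_extrema(s)
        if not minima or not maxima:
            break
        c = sift_imf(s)                                      # Eq. (3)
        imfs.append(c)
        s = [a - b for a, b in zip(s, c)]
    return imfs, s

# Synthetic test signal: a slow and a fast oscillation.
y = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) for t in range(200)]
imfs, residue = emd(y)
# Eq. (4): the IMFs plus the residue give back the original signal.
rebuilt = [sum(c[t] for c in imfs) + residue[t] for t in range(len(y))]
max_err = max(abs(a - b) for a, b in zip(y, rebuilt))
```

Because each IMF is subtracted from the running signal, the reconstruction holds by construction up to floating point error, whatever envelope interpolation is chosen.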

The above process is described for a one-dimensional signal, whereas an image is two-dimensional. For images, the following standard deviation criterion for stopping the sifting process [2] is used

$$\begin{aligned} SD_k=\sum _{m=0}^N \frac{|IM_{i-1}(m)-IM_i (m)|^2}{(IM_{i-1}^2 (m))} \end{aligned}$$

(5)
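As a rough illustration of Eq. (5), the criterion can be computed from two successive sifting results as below. The small `eps` term guarding against division by zero and the threshold value are assumptions made for this sketch; the paper does not state them.

```python
def sd_criterion(prev, cur, eps=1e-12):
    """Eq. (5): squared change between successive sifting results,
    normalised by the previous result and summed over all samples."""
    return sum((p - c) ** 2 / (p * p + eps) for p, c in zip(prev, cur))

prev = [1.0, -2.0, 3.0]        # IM_{i-1}, toy values
cur = [1.1, -2.1, 2.9]         # IM_i, toy values
sd = sd_criterion(prev, cur)

threshold = 0.3                # hypothetical stopping threshold
stop_sifting = sd < threshold
```

For a two-dimensional image the same sum would simply run over all pixel positions instead of a single index.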

4.2 Feature Extraction Using Edge and Median Filtering

The following steps are carried out to generate the feature vector for an image.

1. The image is converted to a gray scale image.
2. Histogram equalization is applied to the gray scale image.
3. The edges are extracted using empirical mode decomposition.
4. Median filtering with a $3\times 3$ block is applied to the histogram equalized gray scale image.
5. The values at the edge positions of the median filtered image are replaced with the edge values detected by BEMD.
6. A 64-bin feature vector is extracted and stored in the database.
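The steps above can be sketched in plain Python for a small 8-bit gray image stored as a list of rows. This is only an illustration under stated assumptions: the BEMD edge map is taken as given input (`edge_mask` and `edge_values` are hypothetical names), the 64-bin feature is assumed to be a count of pixels per gray-level bin, and border pixels of the median filter use whatever neighbours exist.

```python
def equalize(img):
    """Step 2: histogram equalization of an 8-bit gray image."""
    flat = [p for row in img for p in row]
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:                      # constant image, nothing to do
        return [row[:] for row in img]
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * 255)) for c in cdf]
    return [[lut[p] for p in row] for row in img]

def median3x3(img):
    """Step 4: 3x3 median filter (upper median for even-sized border windows)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            win = sorted(img[rr][cc]
                         for rr in range(max(0, r - 1), min(h, r + 2))
                         for cc in range(max(0, c - 1), min(w, c + 2)))
            out[r][c] = win[len(win) // 2]
    return out

def feature_vector(img, edge_mask, edge_values, bins=64):
    """Steps 2, 4, 5 and 6; the edge map (step 3) is supplied by the caller."""
    smooth = median3x3(equalize(img))
    for r, row in enumerate(edge_mask):
        for c, is_edge in enumerate(row):
            if is_edge:                   # step 5: paste edge values back in
                smooth[r][c] = min(255, max(0, int(edge_values[r][c])))
    hist = [0] * bins
    width = 256 // bins                   # 4 gray levels per bin for 64 bins
    for row in smooth:
        for p in row:
            hist[p // width] += 1         # assumed: pixel count per bin
    return hist

# Toy 4x4 image with a vertical step; column 2 is marked as an edge.
img = [[10, 10, 200, 200] for _ in range(4)]
edge_mask = [[False, False, True, False] for _ in range(4)]
edge_values = [[0, 0, 128, 0] for _ in range(4)]
vec = feature_vector(img, edge_mask, edge_values)
```

In this toy case the equalized image contains only the levels 0 and 255, the median filter leaves the step unchanged, and the four edge pixels end up in the bin that holds gray level 128.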

4.3 Training with the Batch Algorithm

In this approach the SOM neural network is created and trained using the batch training algorithm. The network consists of a 3-by-3, two-dimensional map of 9 neurons. During training, 200 iterations of the batch algorithm are run to form the clusters, so that the SOM distributes the image feature space over the 9 neurons.
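A batch update of this kind might look like the following sketch. The map size and the 200 iterations come from the text; everything else is an assumption made for illustration: a rectangular grid, a Gaussian neighbourhood whose radius shrinks linearly to a floor of 0.3, and initial weights drawn from the training samples.

```python
import math
import random

def bmu(weights, x):
    """Index of the neuron whose weight vector is closest to x (Euclidean)."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(weights[j], x)))

def train_batch_som(data, rows=3, cols=3, epochs=200, sigma0=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumed initialization: copy randomly chosen training vectors.
    weights = [list(rng.choice(data)) for _ in grid]
    for epoch in range(epochs):
        sigma = max(0.3, sigma0 * (1 - epoch / epochs))   # shrinking radius
        winners = [bmu(weights, x) for x in data]         # assign all samples
        new_weights = []
        for rj, cj in grid:
            num, den = [0.0] * dim, 0.0
            for x, w in zip(data, winners):
                rw, cw = grid[w]
                d2 = (rj - rw) ** 2 + (cj - cw) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))      # neighbourhood weight
                den += h
                for i in range(dim):
                    num[i] += h * x[i]
            # Each neuron moves to the neighbourhood-weighted mean of the data.
            new_weights.append([v / den for v in num])
        weights = new_weights
    return weights

# Toy data: two well separated groups of 2-D points.
rng = random.Random(1)
data = ([[rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2)] for _ in range(15)] +
        [[rng.uniform(0.8, 1.0), rng.uniform(0.8, 1.0)] for _ in range(15)])
weights = train_batch_som(data)
cluster_of_first = bmu(weights, data[0])   # cluster number in 0..8
```

Unlike the sample-by-sample update, the batch form needs no learning rate: all samples are assigned to their winning neurons first, and every weight vector is then replaced in one step.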

Figure 3 shows the total number of image feature points associated with each neuron.

4.4 SOM Algorithm

The following steps describe the SOM algorithm used to make the clusters.

1. Initialization: Choose random values for the initial weight vectors $W_j$.
2. Sampling: Take a sample training input vector $X$ from the input space.
3. Matching: Find the winning neuron $WN(X)$ whose weight vector is closest to the input vector.
4. Updating: Apply the weight update equation $\Delta W_{ji} = \eta (t)\; T_{j,WN(X)}(t)\; (X_i - W_{ji})$, where $\eta (t)$ is the learning rate and $T_{j,WN(X)}(t)$ is the neighborhood function centered on the winning neuron.
5. Continuation: Repeat steps 2–4 until the feature map stops changing.
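The five steps of the SOM algorithm (initialization, sampling, matching, updating, continuation) can be sketched as a sample-by-sample training loop. The decay schedules for the learning rate and the neighbourhood radius, the Gaussian neighbourhood, and the fixed number of steps used in place of a convergence test are all assumptions of this sketch, and the function name is hypothetical.

```python
import math
import random

def train_online_som(data, rows=3, cols=3, steps=2000,
                     eta0=0.5, sigma0=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # 1. Initialization: random weight vectors in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for t in range(steps):                    # 5. Continuation (fixed budget)
        frac = t / steps
        eta = eta0 * (1 - frac)               # learning rate, decays to 0
        sigma = max(0.3, sigma0 * (1 - frac)) # neighbourhood radius
        x = rng.choice(data)                  # 2. Sampling
        # 3. Matching: winning neuron by squared Euclidean distance.
        win = min(range(len(grid)),
                  key=lambda j: sum((xi - wi) ** 2
                                    for xi, wi in zip(x, weights[j])))
        rw, cw = grid[win]
        for j, (rj, cj) in enumerate(grid):   # 4. Updating
            d2 = (rj - rw) ** 2 + (cj - cw) ** 2
            t_j = math.exp(-d2 / (2 * sigma ** 2))
            for i in range(dim):
                weights[j][i] += eta * t_j * (x[i] - weights[j][i])
    return weights

# Toy data in the unit square.
rng = random.Random(1)
data = ([[rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2)] for _ in range(15)] +
        [[rng.uniform(0.8, 1.0), rng.uniform(0.8, 1.0)] for _ in range(15)])
weights = train_online_som(data)
```

Each update moves a weight vector part of the way towards the sampled input, so weights that start inside the range of the data stay inside it.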

4.5 Similarity Measure

Figure 4 shows the structure of each neuron and its neighboring neurons. The query features are submitted to the SOM neural network to identify the cluster number. Once the cluster number is identified, the features of that cluster and its surrounding clusters are fetched from the database and compared with the query features using the Euclidean distance. The images with the smallest distances are selected and displayed as the result.

Let ‘Q’ be a query image and ‘A’ be an image in the database, and let $Q (n)$ and $A(n)$ be the average pixel value of each bin. The difference between the values of each bin is calculated as $diff (n) =|A (n) - Q (n) |$, where $n=1, 2,\ldots ,64$. The average value of $diff (n)$ is stored in the array ‘SI’. Finally, the chosen images are arranged in ascending order of this difference so that the closest images are displayed on top.
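The ranking described here can be sketched as follows. The layout of the database (image id mapped to a cluster number and a feature vector) and the use of the eight surrounding grid cells as the "surrounding clusters" are assumptions for illustration; the actual neighbourhood structure is the one shown in Fig. 4. The example uses 4-bin vectors only to keep it short.

```python
def neighbour_clusters(k, rows=3, cols=3):
    """Cluster k plus its adjacent cells on the map (assumed 8-neighbourhood)."""
    r, c = divmod(k, cols)
    return sorted(rr * cols + cc
                  for rr in range(max(0, r - 1), min(rows, r + 2))
                  for cc in range(max(0, c - 1), min(cols, c + 2)))

def mean_abs_diff(a, q):
    """Average of diff(n) = |A(n) - Q(n)| over all bins."""
    return sum(abs(x - y) for x, y in zip(a, q)) / len(q)

def rank_images(query, database, candidates):
    """Score only images in the candidate clusters; closest images first."""
    scored = [(mean_abs_diff(feat, query), img_id)
              for img_id, (cluster, feat) in database.items()
              if cluster in candidates]
    return [img_id for _, img_id in sorted(scored)]

# Hypothetical database: image id -> (cluster number, feature vector).
database = {
    "a": (4, [1, 2, 3, 4]),
    "b": (4, [1, 2, 3, 8]),
    "c": (0, [1, 2, 3, 5]),
    "d": (8, [9, 9, 9, 9]),
}
query = [1, 2, 3, 4]
query_cluster = 0                      # as returned by the trained SOM
candidates = neighbour_clusters(query_cluster)
ranking = rank_images(query, database, candidates)
```

Image "d" sits in cluster 8, which is not adjacent to cluster 0 on the 3-by-3 grid, so it is never compared with the query. This is where the clustering saves comparisons.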

5 Experimental Results

The performance of the CBIR system is evaluated by submitting query images to retrieve similar images from the various categories of database images. The experiments are conducted on the ground truth database provided by James Z. Wang et al. [7, 8]. The database consists of 1000 images in 10 different categories, and each category has 100 related images. A sample query image and its corresponding retrievals are shown in Figs. 5 and 6. Here, only the top nine similar images of the results are shown.

Precision and recall are two commonly used metrics for evaluating the accuracy of a CBIR system. They are calculated as follows:

$$\begin{aligned} Precision&=\frac{|\{relevant\; images\} \cap \{retrieved\; images\}|}{|\{retrieved\; images\}|} \\ Recall&= \frac{|\{relevant\; images\} \cap \{retrieved\; images\}|}{|\{relevant\; images\; in\; the\; DB\}|} \end{aligned}$$
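As a small worked example of the two measures, assuming image ids are plain integers and that the ids 0 to 99 form one category of 100 relevant images (both choices are illustrative, not taken from the experiments):

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one query from sets of image ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)          # relevant images that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = set(range(100))                    # one category of 100 images
retrieved = list(range(6)) + [500, 501, 502]  # top nine results, six correct
p, r = precision_recall(retrieved, relevant)
```

With six of the nine displayed images correct, precision is 6/9 and recall is 6/100, which shows why recall stays low when only a few top results are returned per query.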

Table 1 Precision and recall for existing and proposed method

The precision and recall values of the existing method and the proposed system are shown in Table 1 and plotted in Fig. 7. The experimental results show that the average precision improves to 65.37 % and the recall to 39.82 %, compared with 63 % and 37 % respectively for the existing system [10]. Performance also improves because fewer comparisons against database features are needed.

6 Conclusion

In this paper we have presented a novel approach to image retrieval that combines edge information and the median filter histogram with a SOM neural network for feature classification. The techniques were implemented and tested with 500 queries on an image database of 1000 images in 10 different categories. The experimental results show that there is a substantial improvement in the performance of the image retrieval system with respect to precision and recall. The system performance can be enhanced further by exploring different techniques, which is our current research focus.

References

1. LingFei L, ZiLiang P (2008) An edge detection algorithm of image based on empirical mode decomposition. Second international symposium on intelligent information technology application. In: Proceedings of IEEE, vol 1. pp 128–132
2. Nunes JC (2005) Texture analysis based on the bidimensional empirical mode decomposition. Mach Vis Appl, Guwahati 16(3):177–188
3. Hui Z, Pankoo K, Jongan P (2009) Feature analysis based on edge extraction and median filtering for CBIR. In: 11th International Conference on Computer Modelling and Simulation, vol 48. pp 245–249
4. Sizintsev M, Derpanis KG, Hogue A (2008) Histogram-based search: a comparative study. In: Proceedings of IEEE, CVPR, pp 1–8
5. Kohonen T (1990) The self organizing map. Proc IEEE 78(9):1464–1480
6. Juha V, Esa A (2000) Clustering of the self-organizing map. IEEE Trans Neural Netw 11(3):586–600
7. Li J, Wang JZ (2003) Automatic linguistic indexing of pictures by a statistical modeling approach. IEEE Trans Pattern Anal Mach Intell 25:1075–1088
8. Wang JZ, Li L, Wiederhold G (2000) SIMPLIcity: semantics-sensitive integrated matching for picture libraries. In: Advances in visual information systems: 4th international conference VISUAL. Lyon, France
9. Ait Aoudia S, Mahiou R, Benzaid B (2010) YACBIR-Yet another content based image retrieval system. In: 14th international conference information visualisation, pp 570–575
10. Shrinivasacharya P, Kavitha H, Sudhamani MV (2011) Content based image retrieval by combining median filtering and BEMD technique. Int Conf Data Eng Commun Syst (ICDECS) 1(2):231–236

Author information

Authors and Affiliations

Department of ISE, Siddaganga Institute of Technology, 03, Tumkur, India
Purohit Shrinivasacharya
Department of ISE, RNS Institute of Technology, 61, Bengaluru, Karnataka
M. V. Sudhamani

Corresponding author

Correspondence to Purohit Shrinivasacharya.

Editor information

Editors and Affiliations

Computer Science & Engineering, Dr. N.G.P. Institute of Technology, Kalapatti Road, Coimbatore, 641048, Tamil Nadu, India
Mohan S
Electronics & Communication Engineering, Dr. N.G.P. Institute of Technology, Kalapatti Road, Coimbatore, 641048, Tamil Nadu, India
S Suresh Kumar

Cite this paper

Shrinivasacharya, P., Sudhamani, M.V. (2013). Content Based Image Retrieval Using Self Organizing Map. In: S, M., Kumar, S. (eds) Proceedings of the Fourth International Conference on Signal and Image Processing 2012 (ICSIP 2012). Lecture Notes in Electrical Engineering, vol 221. Springer, India. https://doi.org/10.1007/978-81-322-0997-3_48

DOI: https://doi.org/10.1007/978-81-322-0997-3_48
Published: 11 January 2013
Publisher Name: Springer, India
Print ISBN: 978-81-322-0996-6
Online ISBN: 978-81-322-0997-3
eBook Packages: Engineering, Engineering (R0)
