Abstract
This paper presents implementation techniques for an intelligent Web image search engine and describes a reference architecture for the system. The system comprises a crawler, a preprocessor, a semantic extractor, an indexer, a knowledge learner, and a query engine. The crawler traverses Web sites using a multithreaded access model and dynamically controls the load it places on a Web server based on the capacity of the local system. The preprocessor cleans and normalizes the resources downloaded from Web sites by applying stop-word removal and word stemming to the raw resources. The semantic extractor derives Web image semantics by partitioning and combining the associated text. The indexer creates and maintains inverted indices using a relational model. The knowledge learner automatically acquires knowledge from users' query activities. Finally, the query engine delivers search results in two phases in order to mine users' feedback.
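The preprocessing and indexing steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a tiny hypothetical stop-word list, a crude suffix stripper standing in for a real stemmer such as Porter's, and a single SQLite table named `postings` as one possible relational layout for an inverted index. The function names and example URLs are likewise illustrative only.

```python
import re
import sqlite3

# Hypothetical, abbreviated stop-word list (assumption, not the paper's list).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "are", "to"}


def simple_stem(word: str) -> str:
    """Crude suffix stripper used as a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize, drop stop words, then stem each remaining token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [simple_stem(t) for t in tokens if t not in STOP_WORDS]


def build_index(conn: sqlite3.Connection, images: dict[str, str]) -> None:
    """Store one (term, image_url, term frequency) row per posting."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "term TEXT NOT NULL, image_url TEXT NOT NULL, tf INTEGER NOT NULL, "
        "PRIMARY KEY (term, image_url))"
    )
    for url, text in images.items():
        counts: dict[str, int] = {}
        for term in preprocess(text):
            counts[term] = counts.get(term, 0) + 1
        conn.executemany(
            "INSERT OR REPLACE INTO postings VALUES (?, ?, ?)",
            [(term, url, tf) for term, tf in counts.items()],
        )
    conn.commit()


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return image URLs matching any query term, ranked by summed frequency."""
    terms = preprocess(query)
    if not terms:
        return []
    placeholders = ",".join("?" for _ in terms)
    rows = conn.execute(
        "SELECT image_url, SUM(tf) AS score FROM postings "
        f"WHERE term IN ({placeholders}) "
        "GROUP BY image_url ORDER BY score DESC, image_url",
        terms,
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn, {
        "http://example.com/a.jpg": "Sailing boats in the harbour",
        "http://example.com/b.jpg": "A cat sleeping on the sofa",
    })
    print(search(conn, "boats"))
```

Keeping postings as rows keyed by `(term, image_url)` lets an ordinary SQL `GROUP BY` do the ranking. This is one plausible reading of "inverted indices with a relational model", and the paper itself may use a different schema.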
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gong, Z., U, L.H., Cheang, C.W. (2004). An Implementation of Web Image Search Engines. In: Chen, Z., Chen, H., Miao, Q., Fu, Y., Fox, E., Lim, Ep. (eds) Digital Libraries: International Collaboration and Cross-Fertilization. ICADL 2004. Lecture Notes in Computer Science, vol 3334. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30544-6_39
DOI: https://doi.org/10.1007/978-3-540-30544-6_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24030-3
Online ISBN: 978-3-540-30544-6
eBook Packages: Computer Science, Computer Science (R0)