Abstract
The Web is considered the largest information pool, and the search engine is the tool for extracting information from it, but the Web's unorganized structure makes it increasingly difficult to find relevant information with search engines. Future search engines will not rely merely on keyword search; they will interpret the meaning of web content to produce relevant results. Designing such tools requires extracting from that content information that supports logic and inferential capability. This paper discusses the conceptual differences between the traditional web and the semantic web and specifies the need for crawling semantic web documents. It proposes a framework for crawling ontologies and semantic web documents. The proposed framework is implemented and validated on different collections of web pages. The system extracts heterogeneous documents from the web, filters the ontology-annotated web pages, and extracts triples from them, which supports better inferential capability.
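The filter-then-extract pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the fetched documents are already in memory, treats a document as ontology annotated only if it parses as XML with an rdf:RDF root, and handles only flat rdf:Description nodes in RDF/XML. The function names, the sample URLs, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical stand-in for fetched documents (URL -> body text).
SAMPLE_PAGES = {
    "http://example.org/plain.html": "<html><body>No annotations</body></html>",
    "http://example.org/notes.txt": "just some plain text",
    "http://example.org/doc.rdf": (
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        '<rdf:Description rdf:about="http://example.org/paper">'
        "<dc:title>SemCrawl</dc:title>"
        '<dc:creator rdf:resource="http://example.org/people/author"/>'
        "</rdf:Description>"
        "</rdf:RDF>"
    ),
}


def is_rdf_document(text):
    """Assumed filter: keep only text that parses as XML with an rdf:RDF root."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    return root.tag == f"{{{RDF_NS}}}RDF"


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into a full URI string."""
    return tag[1:].replace("}", "", 1) if tag.startswith("{") else tag


def extract_triples(text):
    """Collect (subject, predicate, object) from flat rdf:Description nodes."""
    root = ET.fromstring(text)
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # Object is either a resource reference or a literal value.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, expand(prop.tag), obj))
    return triples


def crawl(pages):
    """Filter heterogeneous documents, then extract triples from the RDF ones."""
    return {
        url: extract_triples(text)
        for url, text in pages.items()
        if is_rdf_document(text)
    }


if __name__ == "__main__":
    for url, triples in crawl(SAMPLE_PAGES).items():
        print(url)
        for triple in triples:
            print("  ", triple)
```

A real crawler would also need fetching, link discovery, and support for other serializations and nested RDF/XML; a dedicated RDF toolkit would normally replace the hand-written parsing shown here.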
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dhingra, V., Bhatia, K.K. (2015). SemCrawl: Framework for Crawling Ontology Annotated Web Documents for Intelligent Information Retrieval. In: Buyya, R., Thampi, S. (eds) Intelligent Distributed Computing. Advances in Intelligent Systems and Computing, vol 321. Springer, Cham. https://doi.org/10.1007/978-3-319-11227-5_19
DOI: https://doi.org/10.1007/978-3-319-11227-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11226-8
Online ISBN: 978-3-319-11227-5
eBook Packages: Engineering, Engineering (R0)