Abstract
Twenty years ago Tim Berners-Lee proposed a distributed hypertext system based on standard Internet protocols. The Web that resulted fundamentally changed the ways we share information and services, both on the public Internet and within organizations. That original proposal contained the seeds of another effort that has not yet fully blossomed: a Semantic Web designed to enable computer programs to share and understand structured and semi-structured information easily. We will review the evolution of the idea and technologies to realize a Web of Data and describe how we are exploiting them to enhance information retrieval and information extraction. A key resource in our work is Wikitology, a hybrid knowledge base of structured and unstructured information extracted from Wikipedia.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Syed, Z., Finin, T. (2011). Creating and Exploiting a Hybrid Knowledge Base for Linked Data. In: Filipe, J., Fred, A., Sharp, B. (eds) Agents and Artificial Intelligence. ICAART 2010. Communications in Computer and Information Science, vol 129. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19890-8_1
DOI: https://doi.org/10.1007/978-3-642-19890-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19889-2
Online ISBN: 978-3-642-19890-8
eBook Packages: Computer Science (R0)