Abstract
We present a method for the rapid development of benchmarks for Semantic Web knowledge base systems. At its core is a synthetic data generation approach for OWL that is scalable and models real-world data. The data generation algorithm learns from real domain documents and generates benchmark data based on the extracted properties that are relevant for benchmarking. We believe this is important because the relative performance of systems will vary depending on the structure of the ontology and data used. However, because the Semantic Web is still new, sufficient real data for benchmarking is rarely available. Our approach helps overcome this shortage of real-world data and allows us to develop benchmarks for a variety of domains and applications in a very time-efficient manner. Based on our method, we have created a new Lehigh BibTeX Benchmark and conducted an experiment on four Semantic Web knowledge base systems. We have verified our hypothesis about the need for representative data by comparing the experimental results to those of our previous Lehigh University Benchmark. The differences between the two experiments demonstrate the influence of ontology and data on the capability and performance of the systems, and thus the need to use a benchmark that is representative of the systems' intended application.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, SY., Guo, Y., Qasem, A., Heflin, J. (2005). Rapid Benchmarking for Semantic Web Knowledge Base Systems. In: Gil, Y., Motta, E., Benjamins, V.R., Musen, M.A. (eds) The Semantic Web – ISWC 2005. ISWC 2005. Lecture Notes in Computer Science, vol 3729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11574620_54
DOI: https://doi.org/10.1007/11574620_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29754-3
Online ISBN: 978-3-540-32082-1