MatchBench: Benchmarking Schema Matching Algorithms for Schematic Correspondences

Guo, Chenjuan; Hedeler, Cornelia; Paton, Norman W.; Fernandes, Alvaro A. A.

doi:10.1007/978-3-642-39467-6_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7968)

Included in the following conference series:

British National Conference on Databases

Abstract

Schema matching algorithms aim to identify relationships between database schemas, which are useful in many data integration tasks. However, the results of most matching algorithms are expressed as semantically inexpressive, 1-to-1 associations between pairs of attributes or entities, rather than semantically rich characterisations of relationships. This paper presents a benchmark for evaluating schema matching algorithms in terms of their semantic expressiveness. The definition of such semantics is based on the classification of schematic heterogeneities of Kim et al. The benchmark explores the extent to which matching algorithms are effective at diagnosing schematic heterogeneities. The paper contributes: (i) a wide range of scenarios that are designed to systematically cover several reconcilable types of schematic heterogeneities; (ii) a collection of experiments over the scenarios that can be used to investigate the effectiveness of different matching algorithms; and (iii) an application of the experiments for the evaluation of matchers from three well-known and publicly available schema matching systems, namely COMA++, Similarity Flooding and Harmony.
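
The contrast between 1-to-1 associations and richer characterisations can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `AttributeMatch`, `group_by_entity_pair` and `one_to_many_candidates`, the toy schemas, the scores and the 0.5 threshold are all assumptions made for illustration. It shows how flat attribute-level matches might be grouped by entity pair to hint at a relationship that involves more than two entities.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeMatch:
    """A 1-to-1 association between two attributes (hypothetical structure)."""
    source: str   # "entity.attribute" in the source schema
    target: str   # "entity.attribute" in the target schema
    score: float  # similarity score in [0, 1]


def group_by_entity_pair(matches, threshold=0.5):
    """Collect matches scoring at least `threshold` under (source entity, target entity)."""
    grouped = defaultdict(list)
    for m in matches:
        if m.score >= threshold:
            src_entity = m.source.split(".", 1)[0]
            tgt_entity = m.target.split(".", 1)[0]
            grouped[(src_entity, tgt_entity)].append(m)
    return dict(grouped)


def one_to_many_candidates(grouped):
    """Return source entities whose attributes match into more than one target entity."""
    targets = defaultdict(set)
    for src, tgt in grouped:
        targets[src].add(tgt)
    return {src: sorted(tgts) for src, tgts in targets.items() if len(tgts) > 1}


# Toy matcher output (invented data): customer attributes spread over two target entities.
matches = [
    AttributeMatch("customer.id", "client.id", 0.9),
    AttributeMatch("customer.name", "client.name", 0.8),
    AttributeMatch("customer.street", "address.street", 0.7),
    AttributeMatch("customer.phone", "address.zip", 0.2),  # below threshold, dropped
]

grouped = group_by_entity_pair(matches)
candidates = one_to_many_candidates(grouped)
print(candidates)
```

Each `AttributeMatch` on its own says nothing about how `customer` relates to `client` and `address` together; only the grouped view suggests that one source entity may correspond to several target entities, which is the kind of information a flat list of pairs leaves implicit.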

References

Ontology Alignment Evaluation Initiative (OAEI), http://oaei.ontologymatching.org/
Alexe, B., Tan, W.C., Velegrakis, Y.: STBenchmark: towards a benchmark for mapping systems. PVLDB 1(1), 230–244 (2008)
Bernstein, P., Melnik, S.: Model management 2.0: manipulating richer mappings. ACM SIGMOD, 1–12 (2007)
Bernstein, P.A., Madhavan, J., Rahm, E.: Generic schema matching, ten years later. PVLDB 4(11), 695–701 (2011)
Bonifati, A., Chang, E.Q., Ho, T., Lakshmanan, L.V.S., Pottinger, R., Chung, Y.: Schema mapping and query translation in heterogeneous P2P XML databases. VLDB J. 19(2), 231–256 (2010)
Bonifati, A., Mecca, G., Pappalardo, A., Raunich, S., Summa, G.: Schema mapping verification: the spicy way. In: EDBT, pp. 85–96 (2008)
Dhamankar, R., Lee, Y., Doan, A., Halevy, A.Y., Domingos, P.: iMAP: Discovering complex mappings between database schemas. In: SIGMOD Conference, pp. 383–394 (2004)
Do, H., Rahm, E.: Matching large schemas: Approaches and evaluation. Information Systems 32(6), 857–885 (2007)
Do, H.-H., Melnik, S., Rahm, E.: Comparison of schema matching evaluations. In: Chaudhri, A.B., Jeckle, M., Rahm, E., Unland, R. (eds.) NODe-WS 2002. LNCS, vol. 2593, pp. 221–237. Springer, Heidelberg (2003)
Duchateau, F., Bellahsene, Z., Hunt, E.: XBenchMatch: a benchmark for XML schema matching tools. In: VLDB, pp. 1318–1321 (2007)
Fagin, R., Haas, L.M., Hernández, M., Miller, R.J., Popa, L., Velegrakis, Y.: Clio: Schema mapping creation and data exchange. In: Borgida, A.T., Chaudhri, V.K., Giorgini, P., Yu, E.S. (eds.) Conceptual Modeling: Foundations and Applications. LNCS, vol. 5600, pp. 198–236. Springer, Heidelberg (2009)
Franklin, M., Halevy, A., Maier, D.: From databases to dataspaces: a new abstraction for information management. SIGMOD Record 34(4), 27–33 (2005)
Haas, L.: Beauty and the beast: The theory and practice of information integration. In: Schwentick, T., Suciu, D. (eds.) ICDT 2007. LNCS, vol. 4353, pp. 28–43. Springer, Heidelberg (2006)
Kim, W., Seo, J.: Classifying schematic and data heterogeneity in multidatabase systems. IEEE Computer 24(12), 12–18 (1991)
Lee, Y., Sayyadian, M., Doan, A., Rosenthal, A.: eTuner: tuning schema matching software using synthetic scenarios. VLDB J. 16(1), 97–122 (2007)
Massmann, S., Engmann, D., Rahm, E.: COMA++: Results for the ontology alignment contest OAEI 2006. Ontology Matching (2006)
Melnik, S., Garcia-Molina, H., Rahm, E.: Similarity flooding: a versatile graph matching algorithm and its application to schema matching. In: ICDE, pp. 117–128 (2002)
Melnik, S., Rahm, E., Bernstein, P.: Rondo: a programming platform for generic model management. In: ACM SIGMOD, pp. 193–204 (2003)
Ozsu, M.T., Valduriez, P.: Principles of distributed database systems. Addison-Wesley, Reading Menlo Park (1989)
Rahm, E., Bernstein, P.: A survey of approaches to automatic schema matching. VLDB J. 10(4), 334–350 (2001)
Seligman, L., Mork, P., Halevy, A.Y., Smith, K., Carey, M.J., Chen, K., Wolf, C., Madhavan, J., Kannan, A., Burdick, D.: OpenII: an open source information integration toolkit. In: SIGMOD Conference, pp. 1057–1060 (2010)
Smith, K., Morse, M., Mork, P., Li, M.H., Rosenthal, A., Allen, D., Seligman, L.: The role of schema matching in large enterprises. In: CIDR (2009)

Author information

Authors and Affiliations

School of Computer Science, University of Manchester, M13 9PL, UK
Chenjuan Guo, Cornelia Hedeler, Norman W. Paton & Alvaro A. A. Fernandes

Editor information

Editors and Affiliations

Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, OX1 3QD, Oxford, UK
Georg Gottlob
Department of Computer Science, Oxford University, Wolfson Building, Parks Road, OX1 3QD, Oxford, UK
Giovanni Grasso & Christian Schallhart
University of Oxford, Wolfson Building, Parks Road, OX1 3QD, Oxford, UK
Dan Olteanu

Cite this paper

Guo, C., Hedeler, C., Paton, N.W., Fernandes, A.A.A. (2013). MatchBench: Benchmarking Schema Matching Algorithms for Schematic Correspondences. In: Gottlob, G., Grasso, G., Olteanu, D., Schallhart, C. (eds) Big Data. BNCOD 2013. Lecture Notes in Computer Science, vol 7968. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39467-6_11

DOI: https://doi.org/10.1007/978-3-642-39467-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39466-9
Online ISBN: 978-3-642-39467-6
eBook Packages: Computer Science, Computer Science (R0)
