Abstract
A key aspect of any data integration endeavor is determining the relationships between the source schemata and the target schema. This schema integration task must be tackled regardless of the integration architecture or mapping formalism. In this paper, we provide a task model for schema integration. We use this breakdown to motivate a workbench for schema integration in which multiple tools share a common knowledge repository. In particular, the workbench facilitates the interoperation of research prototypes for schema matching (which automatically identify likely semantic correspondences) with commercial schema mapping tools (which help produce instance-level transformations). Currently, each of these tools provides its own ad hoc representation of schemata and mappings; combining these tools requires aligning these representations. The workbench provides a common representation so that these tools can be combined more rapidly.
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mork, P., Seligman, L., Rosenthal, A., Korb, J., Wolf, C. (2008). The Harmony Integration Workbench. In: Spaccapietra, S., et al. Journal on Data Semantics XI. Lecture Notes in Computer Science, vol 5383. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92148-6_3
DOI: https://doi.org/10.1007/978-3-540-92148-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92147-9
Online ISBN: 978-3-540-92148-6
eBook Packages: Computer Science (R0)