Abstract
The Clio project at IBM Almaden investigates foundational aspects of data transformation, with particular emphasis on the design and execution of schema mappings. We now use Clio as part of a broader data-flow framework in which mappings are just one component. These data-flows express complex transformations between several source and target schemas and require multiple mappings to be specified. This paper describes research issues we have encountered as we try to create and run these mapping-based data-flows. In particular, we describe how we use Unified Famous Objects (UFOs), a schema abstraction similar to business objects, as our data model, how we reason about flows of mappings over UFOs, and how we create and deploy transformations into different run-time engines.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Alexe, B. et al. (2009). Simplifying Information Integration: Object-Based Flow-of-Mappings Framework for Integration. In: Castellanos, M., Dayal, U., Sellis, T. (eds) Business Intelligence for the Real-Time Enterprise. BIRTE 2008. Lecture Notes in Business Information Processing, vol 27. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03422-0_9
DOI: https://doi.org/10.1007/978-3-642-03422-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03421-3
Online ISBN: 978-3-642-03422-0
eBook Packages: Computer Science (R0)