Abstract
Dataflow repositories are databases containing dataflows and their different runs. We propose a formal conceptual data model for such repositories. Our model includes careful formalisations of such features as complex data manipulation, external service calls, subdataflows, and the provenance of output values.
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Hidders, J., Kwasnikowska, N., Sroka, J., Tyszkiewicz, J., Van den Bussche, J. (2007). A Formal Model of Dataflow Repositories. In: Cohen-Boulakia, S., Tannen, V. (eds) Data Integration in the Life Sciences. DILS 2007. Lecture Notes in Computer Science, vol 4544. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73255-6_11
DOI: https://doi.org/10.1007/978-3-540-73255-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73254-9
Online ISBN: 978-3-540-73255-6
eBook Packages: Computer Science, Computer Science (R0)