Abstract
The complexity of scientific workflows for analyzing biological data creates a number of challenges for current workflow and provenance systems. This complexity is due in part to the nature of scientific data (e.g., heterogeneous, nested data collections) and the programming constructs required for automation (e.g., nested workflows, looping, pipeline parallelism). We present an extended version of the Kepler scientific workflow system to address these challenges, tailored for the systematics community. Our system combines novel approaches for representing scientific data, modeling and automating complex analyses, and recording and browsing associated provenance information.
This work was supported in part by NSF grants IIS-0630033, OCI-0722079, IIS-0612326, and DBI-0533368, and by DOE grant DE-FC02-01ER25486.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bowers, S., McPhillips, T., Riddle, S., Anand, M.K., Ludäscher, B. (2008). Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life. In: Freire, J., Koop, D., Moreau, L. (eds) Provenance and Annotation of Data and Processes. IPAW 2008. Lecture Notes in Computer Science, vol 5272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89965-5_9
DOI: https://doi.org/10.1007/978-3-540-89965-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89964-8
Online ISBN: 978-3-540-89965-5
eBook Packages: Computer Science, Computer Science (R0)