Abstract
In the life sciences, researchers face exponential growth of biological data, especially in genomics and proteomics. Efficient management and use of these data, and their transformation into knowledge, are basic requirements for biological research. The integration of diverse applications and data from geographically distributed computing resources will therefore become a major issue. We present the status of our efforts to realize an automated protein structure prediction pipeline as an example of a complex biological workflow in a Grid environment based on Web services. This case study demonstrates that complex biological workflows can be orchestrated easily using Web services as building blocks and Triana as the workflow engine.
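To illustrate the idea of composing a workflow from independent building blocks, the following is a minimal sketch and not the pipeline described in this paper. It assumes each step can be modelled as a function that would, in a real deployment, wrap a remote Web service call. The step names (`find_homologs`, `predict_secondary_structure`, `build_model`), their placeholder outputs, and the `run_workflow` helper are hypothetical; Triana and the actual services are not used here.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Tuple

# A step is (function, names of the upstream results it consumes).
Step = Tuple[Callable[..., Any], Tuple[str, ...]]


# Hypothetical stand-ins for remote Web service calls. Each returns
# placeholder data so the sketch runs without any network access.
def find_homologs(sequence: str) -> list:
    return [sequence[::-1]]  # pretend a single homolog was found


def predict_secondary_structure(sequence: str) -> str:
    return "C" * len(sequence)  # pretend every residue is coil


def build_model(homologs: list, secondary: str) -> dict:
    return {"templates": len(homologs), "secondary_structure": secondary}


# The workflow is a small dependency graph: two independent analyses
# of the input sequence feed one model-building step.
PIPELINE: Dict[str, Step] = {
    "homologs": (find_homologs, ("input",)),
    "secondary": (predict_secondary_structure, ("input",)),
    "model": (build_model, ("homologs", "secondary")),
}


def run_workflow(steps: Dict[str, Step], sequence: str) -> Dict[str, Any]:
    """Run every step after the steps it depends on and collect results."""
    graph = {name: set(deps) for name, (_, deps) in steps.items()}
    results: Dict[str, Any] = {"input": sequence}
    for name in TopologicalSorter(graph).static_order():
        if name in results:  # the workflow input is already available
            continue
        func, deps = steps[name]
        results[name] = func(*(results[dep] for dep in deps))
    return results


if __name__ == "__main__":
    print(run_workflow(PIPELINE, "MKTAYIAK")["model"])
```

Describing the workflow as data (a mapping from step name to function and dependencies) keeps the orchestration logic separate from the individual services, which is the property the abstract attributes to a workflow engine. A real engine would additionally handle remote invocation, parallel execution of independent steps, and failures.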
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
May, P., Ehrlich, H.-C., Steinke, T. (2006). ZIB Structure Prediction Pipeline: Composing a Complex Biological Workflow Through Web Services. In: Nagel, W.E., Walter, W.V., Lehner, W. (eds) Euro-Par 2006 Parallel Processing. Euro-Par 2006. Lecture Notes in Computer Science, vol 4128. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823285_121
DOI: https://doi.org/10.1007/11823285_121
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37783-2
Online ISBN: 978-3-540-37784-9
eBook Packages: Computer Science, Computer Science (R0)