A Dynamic Workflow Approach for the Integration of Bioinformatics Services

Mahoui, Malika; Lu, Lingma; Gao, Ning; Li, Nianhua; Chen, Jessica; Bukhres, Omran; Miled, Zina Ben

doi:10.1007/s10586-005-4095-1

Published: October 2005

Volume 8, pages 279–291 (2005)

Cluster Computing

Abstract

Modern biological and chemical studies rely on life science databases and on sophisticated software tools (e.g., homology search tools, modeling and visualization tools). These tools often have to be combined and integrated to support a given study. SIBIOS (System for the Integration of Bioinformatics Services) serves this purpose. Its services include both life science database search services and software tools. The task engine, the core component of SIBIOS, supports the execution of dynamic workflows that incorporate multiple bioinformatics services. This paper presents the architecture of SIBIOS and its approaches to addressing the heterogeneity and interoperability of bioinformatics services, including data integration.
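
To make the idea of a dynamic workflow concrete, the following is a minimal sketch and not the SIBIOS implementation. It assumes that each service declares a typed input and output, and that the next step is selected at run time from the services whose input type matches the current data. All names here (`Service`, `TaskEngine`, `run_dynamic`, the stub services and the type labels) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Service:
    """Hypothetical service descriptor: a name plus typed input and output."""
    name: str
    input_type: str
    output_type: str
    run: Callable[[str], str]


class TaskEngine:
    """Toy engine that chains services whose types are compatible."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def register(self, service: Service) -> None:
        self._services[service.name] = service

    def compatible_next(self, data_type: str) -> List[str]:
        # Names of services that can consume data of the given type.
        return sorted(
            s.name for s in self._services.values() if s.input_type == data_type
        )

    def run_dynamic(
        self,
        data: str,
        data_type: str,
        choose: Callable[[List[str]], str] = lambda options: options[0],
        max_steps: int = 10,
    ) -> Tuple[str, List[str]]:
        # The workflow is not fixed in advance: after each step, the next
        # service is picked from those compatible with the current output.
        trace: List[str] = []
        for _ in range(max_steps):
            options = self.compatible_next(data_type)
            if not options:
                break
            service = self._services[choose(options)]
            data = service.run(data)
            data_type = service.output_type
            trace.append(service.name)
        return data, trace


def build_demo_engine() -> TaskEngine:
    # Stub services (assumptions): real ones would call remote databases or tools.
    engine = TaskEngine()
    engine.register(Service("translate", "dna", "protein", lambda s: f"protein({s})"))
    engine.register(
        Service("homology_search", "protein", "hit_list", lambda s: f"hits({s})")
    )
    return engine


if __name__ == "__main__":
    result, steps = build_demo_engine().run_dynamic("ATG", "dna")
    print(steps, result)
```

In this sketch the `choose` callback stands in for a user or a policy selecting among compatible services at each step; the `max_steps` bound is only a guard against cyclic service chains.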

Author information

Authors and Affiliations

School of Informatics, IUPUI, Walker Plaza, 719 Indiana Avenue, Indianapolis, IN, 4620, USA
Malika Mahoui & Jessica Chen
Department of Computer and Information Science, IUPUI, USA
Lingma Lu & Omran Bukhres
Department of Electrical and Computer Engineering, IUPUI, USA
Ning Gao, Nianhua Li & Zina Ben Miled

Corresponding author

Correspondence to Malika Mahoui.

About this article

Cite this article

Mahoui, M., Lu, L., Gao, N. et al. A Dynamic Workflow Approach for the Integration of Bioinformatics Services. Cluster Comput 8, 279–291 (2005). https://doi.org/10.1007/s10586-005-4095-1

Issue Date: October 2005
DOI: https://doi.org/10.1007/s10586-005-4095-1
