Tapis: An API Platform for Reproducible, Distributed Computational Research

  • Conference paper
Advances in Information and Communication (FICC 2021)

Abstract

Modern computational research increasingly spans multiple, geographically distributed data centers and leverages instruments, experimental facilities and a network of national and regional cyberinfrastructure (CI). Tapis is an open-source API platform developed at the Texas Advanced Computing Center at the University of Texas at Austin to increase reproducibility and minimize time-to-solution for distributed computational experiments. Core features of Tapis include data management and code execution, a fine-grained permissions system enabling objects to be saved privately, shared with individuals or “published” to a community, and provenance endpoints exposing the detailed history Tapis collects on analyses, enabling workflows to be repeated and results reproduced. In this paper, we describe the evolution of the Tapis platform, from its origins in 2008, and discuss the growth and success of the project as well as challenges and limitations that have led to a new design effort, funded by the National Science Foundation in September of 2019. We present a detailed overview of the new system, including reference architecture and new features such as support for streaming/sensor data, and we discuss some of the early science use cases driving its design. We conclude with the roadmap for future work.

Acknowledgment

This material is based upon work supported by the National Science Foundation Office of Advanced CyberInfrastructure, grant numbers 1931439 and 1931575.

Author information

Corresponding author

Correspondence to Joe Stubbs.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Stubbs, J. et al. (2021). Tapis: An API Platform for Reproducible, Distributed Computational Research. In: Arai, K. (eds) Advances in Information and Communication. FICC 2021. Advances in Intelligent Systems and Computing, vol 1363. Springer, Cham. https://doi.org/10.1007/978-3-030-73100-7_61
