Abstract
Modern computational research increasingly spans multiple, geographically distributed data centers and leverages instruments, experimental facilities, and a network of national and regional cyberinfrastructure (CI). Tapis is an open-source API platform developed at the Texas Advanced Computing Center at the University of Texas at Austin to increase reproducibility and minimize time-to-solution for distributed computational experiments. Core features of Tapis include data management and code execution; a fine-grained permissions system that allows objects to be saved privately, shared with individuals, or “published” to a community; and provenance endpoints that expose the detailed history Tapis collects on analyses, so that workflows can be repeated and results reproduced. In this paper, we describe the evolution of the Tapis platform from its origins in 2008 and discuss the growth and success of the project, as well as the challenges and limitations that led to a new design effort funded by the National Science Foundation in September 2019. We present a detailed overview of the new system, including a reference architecture and new features such as support for streaming/sensor data, and we discuss some of the early science use cases driving its design. We conclude with a roadmap for future work.
Acknowledgment
This material is based upon work supported by the National Science Foundation Office of Advanced CyberInfrastructure, grant numbers 1931439 and 1931575.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Stubbs, J. et al. (2021). Tapis: An API Platform for Reproducible, Distributed Computational Research. In: Arai, K. (eds) Advances in Information and Communication. FICC 2021. Advances in Intelligent Systems and Computing, vol 1363. Springer, Cham. https://doi.org/10.1007/978-3-030-73100-7_61
DOI: https://doi.org/10.1007/978-3-030-73100-7_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73099-4
Online ISBN: 978-3-030-73100-7
eBook Packages: Intelligent Technologies and Robotics (R0)