Abstract
High-level languages such as Python offer convenient language constructs and abstractions for readability and productivity. Such features, together with Python’s ability to serve both as a steering language and as a self-contained language for scientific computations, have made Python a viable choice for high-performance computing. However, the Python interpreter’s reliance on shared objects and dynamic loading causes scalability issues that, at large scale, consume hours of wall-clock time just for loading the interpreter.
The work in this paper explores an approach that bypasses the conventional software stack by replacing the Python interpreter on the compute nodes with an adaptable runtime system capable of executing the compute-intensive portions of a Python program. This allows a single instance of the Python interpreter to interpret the user’s program and moves program interpretation off the compute nodes. The approach thereby avoids the scalability issue of the interpreter and provides a means of running Python programs on restrictive compute nodes that are otherwise unable to run Python.
The approach is experimentally evaluated through a prototype implementation of an extension to the Bohrium runtime system. The evaluation shows promising results and identifies issues for future work to address.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Lund, S.A.F., Kristensen, M.R.B., Vinter, B., Katsaros, D. (2014). Bypassing the Conventional Software Stack Using Adaptable Runtime Systems. In: Lopes, L., et al. Euro-Par 2014: Parallel Processing Workshops. Euro-Par 2014. Lecture Notes in Computer Science, vol 8806. Springer, Cham. https://doi.org/10.1007/978-3-319-14313-2_26
DOI: https://doi.org/10.1007/978-3-319-14313-2_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14312-5
Online ISBN: 978-3-319-14313-2
eBook Packages: Computer Science, Computer Science (R0)