Abstract
The nano-threads programming model was proposed to integrate multiprogramming on shared-memory multiprocessors effectively with the exploitation of fine-grain parallelism in standard applications. A prerequisite for the applicability of the model is the ability of the runtime environment to manage parallelism at any level of granularity with minimal overhead. In this paper, we introduce runtime techniques for efficient memory management and user-level scheduling in an experimental runtime system designed to support the nano-threads programming model. We evaluate the exploitation of processor affinity for the management of nano-thread contexts, and the use of hierarchical queues to implement user-level scheduling strategies for applications with inherent multilevel parallelism. The proposed mechanisms attempt to obtain maximum benefit from data locality on cache-coherent NUMA multiprocessors. Using synthetic benchmarks, we find that our mechanism for memory management in the runtime system reduces overheads by 52% on average, compared to other known mechanisms. The use of hierarchical queues improves performance by 17% to 40%, compared to scheduling strategies that use only local queues.
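As a rough illustration of the hierarchical-queue idea, the following minimal sketch keeps one ready queue per processor, one per NUMA node, and one global queue, and lets a processor search them from the closest level outward. The abstract does not specify the queue organization used in the paper, so the three-level layout, the class name `HierarchicalQueues`, and the parameters `num_procs` and `procs_per_node` are all assumptions made for this example, and the sketch is single-threaded (no locking).

```python
from collections import deque


class HierarchicalQueues:
    """Illustrative ready-queue hierarchy: per-processor, per-node, global.

    Hypothetical layout, not the paper's actual data structure.
    """

    LEVELS = ("local", "node", "global")

    def __init__(self, num_procs, procs_per_node):
        self.procs_per_node = procs_per_node
        # Round up so a partially filled last node still gets a queue.
        num_nodes = (num_procs + procs_per_node - 1) // procs_per_node
        self.local = [deque() for _ in range(num_procs)]
        self.node = [deque() for _ in range(num_nodes)]
        self.global_q = deque()

    def node_of(self, proc):
        """Map a processor id to the id of the node that hosts it."""
        return proc // self.procs_per_node

    def enqueue(self, task, level, proc):
        """Place a task at the given level, relative to processor `proc`."""
        if level == "local":
            self.local[proc].append(task)
        elif level == "node":
            self.node[self.node_of(proc)].append(task)
        elif level == "global":
            self.global_q.append(task)
        else:
            raise ValueError(f"unknown level: {level!r}")

    def dequeue(self, proc):
        """Return the nearest available task for `proc`, or None if idle."""
        search_order = (
            self.local[proc],               # best locality: own queue
            self.node[self.node_of(proc)],  # shared within the same node
            self.global_q,                  # visible to every processor
        )
        for queue in search_order:
            if queue:
                return queue.popleft()
        return None


hq = HierarchicalQueues(num_procs=4, procs_per_node=2)
hq.enqueue("inner-loop chunk", "local", proc=0)
hq.enqueue("outer task", "node", proc=1)
hq.enqueue("top-level task", "global", proc=0)
first = hq.dequeue(0)   # taken from processor 0's local queue
second = hq.dequeue(0)  # taken from node 0's queue, shared by processors 0 and 1
third = hq.dequeue(3)   # processor 3 finds work only in the global queue
```

Searching from the local queue outward is one plausible way to favor data locality for work created at inner levels of parallelism while still letting outer-level work reach idle processors; a real runtime would also need synchronization on the shared queues.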
This work was supported by the NANOS project (ESPRIT No. 21907).
© 1998 Springer-Verlag Berlin Heidelberg
Nikolopoulos, D.S., Polychronopoulos, E.D., Papatheodorou, T.S. (1998). Efficient runtime thread management for the nano-threads programming model. In: Rolim, J. (eds) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64359-1_688
DOI: https://doi.org/10.1007/3-540-64359-1_688
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64359-3
Online ISBN: 978-3-540-69756-5
eBook Packages: Springer Book Archive