Efficient runtime thread management for the nano-threads programming model

  • Workshop on Run-Time Systems for Parallel Programming: Matthew Haines, University of Wyoming, USA; Koen Langendoen, Vrije Universiteit, The Netherlands; Greg Benson, University of California at Davis, USA
  • Conference paper
Parallel and Distributed Processing (IPPS 1998)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1388))

Abstract

The nano-threads programming model was proposed to effectively integrate multiprogramming on shared-memory multiprocessors with the exploitation of fine-grain parallelism from standard applications. A prerequisite for the applicability of the nano-threads programming model is the ability of the runtime environment to manage parallelism at any level of granularity with minimal overheads. In this paper, we introduce runtime techniques for efficient memory management and user-level scheduling in an experimental runtime system designed to support the nano-threads programming model. We evaluate the exploitation of processor affinity for the management of nano-thread contexts, and the use of hierarchical queues to implement user-level scheduling strategies for applications with inherent multilevel parallelism. The proposed mechanisms attempt to obtain maximum benefits from data locality on cache-coherent NUMA multiprocessors. Through the use of synthetic benchmarks, we find that our mechanism for memory management in the runtime system reduces overheads by 52% on average, compared to other known mechanisms. The use of hierarchical queues improves performance by between 17% and 40%, compared to scheduling strategies that use local queues.
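
The two ideas named in the abstract, hierarchical run queues and affinity-aware reuse of thread contexts, can be illustrated with a small sketch. This is not the paper's implementation, which the abstract does not describe in detail. The class names `HierarchicalQueues` and `ContextPool`, the two-level queue layout, the search order, and the dictionary used as a stand-in for a thread context are all assumptions made for illustration. The sketch is single-threaded and omits the locking a real runtime would need.

```python
from collections import deque


class HierarchicalQueues:
    """Two-level run queues: one local queue per processor plus a shared root.

    Hypothetical layout; a real system may use more levels (e.g. per node).
    """

    def __init__(self, num_procs):
        self.local = [deque() for _ in range(num_procs)]
        self.root = deque()

    def enqueue(self, task, proc=None):
        # Work bound to a processor goes to its local queue; unbound work
        # goes to the shared root queue.
        if proc is None:
            self.root.append(task)
        else:
            self.local[proc].append(task)

    def dequeue(self, proc):
        # Assumed search order: own local queue, then the root, then peers.
        if self.local[proc]:
            return self.local[proc].popleft()
        if self.root:
            return self.root.popleft()
        for victim, queue in enumerate(self.local):
            if victim != proc and queue:
                return queue.pop()  # take from the tail of a peer's queue
        return None


class ContextPool:
    """Per-processor free lists, so a released context is reused on the
    processor that last ran it (an assumed form of affinity-aware reuse)."""

    def __init__(self, num_procs):
        self.free = [[] for _ in range(num_procs)]
        self.allocated = 0

    def acquire(self, proc):
        if self.free[proc]:
            # Most recently released context first: most likely cache-warm.
            return self.free[proc].pop()
        self.allocated += 1
        return {"id": self.allocated, "home": proc}  # stand-in for a context

    def release(self, ctx, proc):
        self.free[proc].append(ctx)
```

In this sketch a processor prefers work and contexts it touched most recently, and falls back to shared or remote structures only when its own are empty. That ordering is the locality argument the abstract makes for cache-coherent NUMA machines.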

This work was supported by the NANOS project (ESPRIT No. 21907).

Editor information

José Rolim

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nikolopoulos, D.S., Polychronopoulos, E.D., Papatheodorou, T.S. (1998). Efficient runtime thread management for the nano-threads programming model. In: Rolim, J. (eds) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64359-1_688

  • DOI: https://doi.org/10.1007/3-540-64359-1_688

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64359-3

  • Online ISBN: 978-3-540-69756-5

  • eBook Packages: Springer Book Archive
