Summary
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine distributed memory parallelization across the node interconnect with shared memory parallelization inside each node. This paper compares various hybrid MPI+OpenMP programming models with pure MPI and analyzes the strengths and weaknesses of several parallel programming models on clusters of SMP nodes. There are several mismatch problems between the (hybrid) programming schemes and the hybrid hardware architectures. Benchmark results on a Myrinet cluster and on recent Cray, NEC, IBM, Hitachi, Sun and SGI platforms show that the hybrid-masteronly programming model can be used more efficiently on some vector-type systems, but also on clusters of dual-CPU nodes. On other systems, one CPU is not able to saturate the inter-node network, and the commonly used masteronly programming model suffers from insufficient inter-node bandwidth. This paper analyzes strategies to overcome typical drawbacks of this easily usable programming scheme on systems with weaker interconnects. The best performance can be achieved by overlapping communication and computation, but this scheme is harder to use.
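The trade-off between the masteronly scheme and overlapping communication with computation can be illustrated with a toy cost model. This is only a sketch under simplifying assumptions that are not taken from the paper: the function names, the parameters and the example numbers are hypothetical, computation is assumed to scale perfectly across the threads of a node, and the overlapping scheme is assumed to balance the remaining work ideally. In the masteronly scheme all threads compute and then wait while the master thread communicates; in the overlapping scheme one thread communicates while the others keep computing.

```python
def masteronly_time(work, comm, threads):
    """Time per step when all threads compute first and then idle
    while the master thread alone performs the communication."""
    return work / threads + comm


def overlap_time(work, comm, threads):
    """Time per step when one thread communicates while the others compute.

    Assumes ideal load balancing: the communicating thread joins the
    computation once it is done, so the step cannot finish before the
    communication ends, nor before the combined load is shared out.
    """
    return max(comm, (work + comm) / threads)


if __name__ == "__main__":
    # Hypothetical numbers: 8 time units of computation, 1 of communication,
    # 4 threads per SMP node.
    work, comm, threads = 8.0, 1.0, 4
    print("masteronly:", masteronly_time(work, comm, threads))
    print("overlap:   ", overlap_time(work, comm, threads))
```

In this model the overlapping scheme is never slower than masteronly, which matches the qualitative claim above; what the model leaves out is the programming effort that overlapping requires, such as distributing work explicitly among the non-communicating threads.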
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rabenseifner, R., Wellein, G. (2005). Comparison of Parallel Programming Models on Clusters of SMP Nodes. In: Bock, H.G., Phu, H.X., Kostina, E., Rannacher, R. (eds) Modeling, Simulation and Optimization of Complex Processes. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27170-8_31
DOI: https://doi.org/10.1007/3-540-27170-8_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23027-4
Online ISBN: 978-3-540-27170-3
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)