Abstract
We consider distribution at compile time of the array data in a distributed-memory implementation of a data-parallel program written in a language like Fortran 90. We allow dynamic redistribution of data and define a heuristic algorithmic framework that chooses distribution parameters to minimize an estimate of program completion time. We represent the program as an alignment-distribution graph. We propose a divide-and-conquer algorithm for distribution that initially assigns a common distribution to each node of the graph and successively refines this assignment, taking computation, realignment, and redistribution costs into account. We explain how to estimate the effect of distribution on computation cost and how to choose a candidate set of distributions. We present the results of an implementation of our algorithms on several test problems.
The work of these authors was supported by the NAS Systems Division via Contract NAS 2-13721 between NASA and the Universities Space Research Association (USRA).
Preview
Unable to display preview. Download preview PDF.
References
Robert Bixby, Ken Kennedy, and Ulrich Kremer. Automatic data layout using 0–1 integer programming. Technical Report CRPC-TR93349-S, Center for Research on Parallel Computation, Rice University, Houston, TX, November 1993.
Siddhartha Chatterjee, John R. Gilbert, and Robert Schreiber. Mobile and replicated alignment of arrays in data-parallel programs. In Proceedings of Supercomputing'93, pages 420–429, Portland, OR, November 1993. Also available as RIACS Technical Report 93.08 and Xerox PARC Technical Report CSL-93-7.
Siddhartha Chatterjee, John R. Gilbert, Robert Schreiber, and Thomas J. Sheffler. Array distribution in data-parallel programs. Technical Report 94.09, RIACS, 1994.
Siddhartha Chatterjee, John R. Gilbert, Robert Schreiber, and Thomas J. Sheffler. Modeling data-parallel programs with the alignment-distribution graph. Journal of Programming Languages, 1994. Special issue on compiling and run-time issues for distributed address space machines. To appear.
Siddhartha Chatterjee, John R. Gilbert, Robert Schreiber, and Shang-Hua Teng. Optimal evaluation of array expressions on massively parallel machines. In Proceedings of the Second Workshop on Languages, Compilers, and Runtime Environments for Distributed Memory Multiprocessors, Boulder, CO, October 1992. Published in SIGPLAN Notices, 28(1), January 1993, pages 68–71. An expanded version is available as RIACS Technical Report TR 92.17 and Xerox PARC Technical Report CSL-92-11.
Siddhartha Chatterjee, John R. Gilbert, Robert Schreiber, and Shang-Hua Teng. Automatic array alignment in data-parallel programs. In Proceedings of the Twentieth Annual ACM SIGACT/SIGPLAN Symposium on Principles of Programming Languages, pages 16–28, Charleston, SC, January 1993. Also available as RIACS Technical Report 92.18 and Xerox PARC Technical Report CSL-92-13.
Roger Fletcher. Practical Methods of Optimization. John Wiley & Sons, second edition, 1989.
Philip E. Gill, Walter Murray, and Margaret H. Wright Practical Optimization. Academic Press, Orlando, FL, 1981.
Manish Gupta. Automatic Data Partitioning on Distributed Memory Multicomputers. PhD thesis, University of Illinois at Urbana-Champaign, Urbana, IL, September 1992. Available as technical reports UILU-ENG-92-2237 and CRHC-92-19.
High Performance Fortran Forum. High Performance Fortran language specification. Scientific Programming, 2(1–2):1–170, 1993.
Seema Hiranandani, Ken Kennedy, and Chau-Wen Tseng. Compiling Fortran D for MIMD distributed-memory machines. Communications of the ACM, 35(8):66–80, August 1992.
Patty Hough and Thomas J. Sheffler. A performance analysis of collective communication on the cm5. Technical report, RIACS, 1994. In Preparation.
David Karger and Clifford Stein. On Õ(n 2) algorithm for minimum cuts. In Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pages 757–765, 1993.
Ulrich Kremer. NP-completeness of dynamic remapping. Technical Report CRPC-TR93-330-S, Center for Research on Parallel Computation, Rice University, Houston, TX, August 1993. Appears in the Proceedings of the Fourth Workshop on Compilers for Parallel Computers, Delft, The Netherlands, December 1993.
Ulrich Kremer, John Mellor-Crummey, Ken Kennedy, and Alan Carle. Automatic data layout for distributed-memory machines in the D programming environment. Technical Report CRPC-TR93-298-S, Center for Research on Parallel Computation, Rice University, Houston, TX, February 1993. Appears in Proceedings of the First International Workshop on Automatic Distributed Memory Parallelization, Automatic Data Distribution and Automatic Parallel Performance Prediction (AP'93), Vieweg Verlag, Wiesbaden, Germany.
Alex Pothen, Horst D. Simon, and Kang-Pu Liou. Partitioning sparse matrices with eigenvectors of graphs. SIAM Journal of Matrix Analysis and Applications, 11(3):430–452, July 1990.
Richard S. Varga. Matrix Iterative Analysis. Prentice-Hall, Inc., Englewood Cliffs, NJ, 1962.
Skef Wholey. Automatic Data Mapping for Distributed-Memory Parallel Computers. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May 1991. Available as Technical Report CMU-CS-91-121.
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chatterjee, S., Gilbert, J.R., Schreiber, R., Sheffler, T.J. (1995). Array distribution in data-parallel programs. In: Pingali, K., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1994. Lecture Notes in Computer Science, vol 892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0025872
Download citation
DOI: https://doi.org/10.1007/BFb0025872
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58868-9
Online ISBN: 978-3-540-49134-7
eBook Packages: Springer Book Archive