Semi-automatic Composition of Data Layout Transformations for Loop Vectorization

Xu, Shixiong; Gregg, David

doi:10.1007/978-3-662-44917-2_40

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8707)

Included in the following conference series:

IFIP International Conference on Network and Parallel Computing

Abstract

In this paper we put forward an annotation system for specifying a sequence of data layout transformations for loop vectorization. We propose four basic primitives that programmers can compose to achieve complex data layout transformations. Our system automatically modifies all loops and other code operating on the transformed arrays. In addition, we propose data-layout-aware loop transformations to reduce the overhead of address computation and to help vectorization. Taking the Scalar Penta-diagonal (SP) solver from the NAS Parallel Benchmarks as a case study, we show that the programmer can achieve significant speedups using our annotations.
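
To make the idea of composing layout primitives concrete, the sketch below models a layout as a shape plus an index mapping and chains two simple steps. The primitive names `strip_mine` and `interchange`, the helper `aos_to_aosoa`, and the sizes `N`, `F`, and `V` are assumptions chosen for illustration; they are not taken from the chapter and need not match its four primitives or its annotation syntax. The example only shows how a composed transformation can turn a strided access to one field of an array of records into a unit-stride access, which is the access pattern vectorizers generally prefer.

```python
# Illustrative sketch only: the primitive names and sizes are hypothetical,
# not the annotation system described in the chapter.

def linearize(index, shape):
    """Row-major memory offset of a multi-dimensional index."""
    offset = 0
    for idx, extent in zip(index, shape):
        assert 0 <= idx < extent
        offset = offset * extent + idx
    return offset

def strip_mine(shape, index, dim, width):
    """Split dimension `dim` into an outer part and an inner part of size `width`."""
    assert shape[dim] % width == 0, "sketch assumes the extent needs no padding"
    new_shape = shape[:dim] + (shape[dim] // width, width) + shape[dim + 1:]
    new_index = (index[:dim]
                 + (index[dim] // width, index[dim] % width)
                 + index[dim + 1:])
    return new_shape, new_index

def interchange(shape, index, d0, d1):
    """Swap two dimensions of the layout."""
    s, i = list(shape), list(index)
    s[d0], s[d1] = s[d1], s[d0]
    i[d0], i[d1] = i[d1], i[d0]
    return tuple(s), tuple(i)

def aos_to_aosoa(shape, index, width):
    """Compose the two steps: (N, F) -> (N/V, V, F) -> (N/V, F, V)."""
    shape, index = strip_mine(shape, index, 0, width)
    return interchange(shape, index, 1, 2)

def aosoa_offset(i, f, n, fields, width):
    """Offset of logical element (i, f) in the transformed layout."""
    shape, index = aos_to_aosoa((n, fields), (i, f), width)
    return linearize(index, shape)

N, F, V = 8, 5, 4  # points, fields per point, assumed vector width

# Field 0 of the first V points: stride F before, unit stride after.
aos_offsets = [linearize((i, 0), (N, F)) for i in range(V)]
aosoa_offsets = [aosoa_offset(i, 0, N, F, V) for i in range(V)]

# Every element still has exactly one slot in the transformed layout.
all_offsets = sorted(aosoa_offset(i, f, N, F, V)
                     for i in range(N) for f in range(F))
```

Here `aos_offsets` is `[0, 5, 10, 15]` and `aosoa_offsets` is `[0, 1, 2, 3]`. Expressing each step as a function from (shape, index) to (shape, index) is one way to make steps composable: any loop that addresses the array through the composed mapping keeps its logical indices while the physical layout changes underneath.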

This work was supported, in part, by Science Foundation Ireland grant 10/CE/I185 to Lero - the Irish Software Engineering Research Centre (www.lero.ie).

Author information

Authors and Affiliations

Lero, The Irish Software Engineering Research Centre, Ireland
Shixiong Xu & David Gregg
Software Tools Group, Department of Computer Science, University of Dublin, Trinity College, Dublin, Ireland
Shixiong Xu & David Gregg

Editor information

Editors and Affiliations

Chung Hua University, 707, Sec. 2, WuFu Rd., 30012, Hsinchu, Taiwan
Ching-Hsien Hsu
Huazhong University of Science and Technology, 1037#, Luoyu Road, 430074, Wuhan, China
Xuanhua Shi
IBM Thomas J. Watson Research Center, 1101 Kitchawan Rd., 10598, Yorktown Heights, NY, USA
Valentina Salapura

Cite this paper

Xu, S., Gregg, D. (2014). Semi-automatic Composition of Data Layout Transformations for Loop Vectorization. In: Hsu, CH., Shi, X., Salapura, V. (eds) Network and Parallel Computing. NPC 2014. Lecture Notes in Computer Science, vol 8707. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44917-2_40

DOI: https://doi.org/10.1007/978-3-662-44917-2_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44916-5
Online ISBN: 978-3-662-44917-2
eBook Packages: Computer Science, Computer Science (R0)
