Abstract
SAC is a functional array processing language particularly designed with numerical applications in mind. In this field the runtime performance of programs critically depends on the efficient utilization of the memory hierarchy. Cache conflicts due to limited set associativity are one relevant source of inefficiency. This paper describes the realization of an optimization technique which aims at eliminating cache conflicts by adjusting the data layout of arrays to specific access patterns and cache configurations. Its effect on cache utilization and runtime performance is demonstrated by investigations on the PDE1 benchmark.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
D. F. Bacon, S. L. Graham, and O. J. Sharp. Compiler Transformations for High-Performance Computing. ACM Computing Surveys, vol. 26(4), pp. 345–420, 1994.
B. Bershad, D. Lee, T. Romer, and B. Chen. Avoiding Conflict Misses in Large Direct-Mapped Caches. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOSVI), San José, California, USA, 1994.
M. Cierniak and W. Li. Unifying Data and Control Transformations for Distributed Shared-Memory Machines. In Proceedings of the ACM SIGPLAN Conference on Programming Design and Implementation (PLDI’95), La Jolla, California, USA, 1995.
S. Coleman and K. McKinley. Tile Size Selection Using Cache Organization and Data Layout. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’95), La Jolla, California, USA, pp. 279–290, 1995.
D. Gannon, W. Jalby, and K. Gallivan. Strategies for Cache and Local Memory Management by Global Program Transformation. Journal of Parallel and Distributed Computing, vol. 5(5), pp. 587–616, 1988.
S. Ghosh, M. Martonosi, and S. Malik. Cache Miss Equations: A Compiler Framework for Analyzing and Tuning Memory Behavior. ACM Transactions on Programming Languages and Systems, vol. 21(4), pp. 703–746, 1999.
C. Grelck, D. Kreye, and S.-B. Scholz. On Code Generation for Multi-Generator WITH-Loops in SAC. In P. Koopman and C. Clack, editors, Proceedings of the 11th International Workshop on Implementation of Functional Languages (IFL’99), Lochem, The Netherlands, selected papers, Lecture Notes in Computer Science, vol. 1868, pp. 77–94. Springer-Verlag, 2000.
C. Grelck and S.-B. Scholz. HPF vs. SAC — A Case Study. In A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors, Proceedings of the 6th International Euro-Par Conference on Parallel Processing (Euro-Par’00), Munich, Germany, Lecture Notes in Computer Science, vol. 1900, pp. 620–624. Springer-Verlag, 2000.
J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach, Second Edition. Morgan Kaufmann, 1995.
M. S. Lam, E. E. Rothberg, and M. E. Wolf. The Cache Performance of Blocked Algorithms. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IV), Palo Alto, California, USA, pp. 63–74, 1991.
N. Manjikian and T. S. Abdelrahman. Fusion of Loops for Parallelism and Locality. IEEE Transactions on Parallel and Distributed Systems, vol. 8(2), pp. 193–209, 1997.
K. McKinley, S. Carr, and C.-W. Tseng. Improving Data Locality with Loop Transformations. ACM Transactions on Programming Languages and Systems, vol. 18(4), pp. 424–453, 1996.
K. McKinley and O. Temam. A Quantative Analysis of Loop Nest Locality. In Proceedings f the 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), Boston, Massachusetts, USA, 1996.
T. Mowry, M. Lam, and A. Gupta. Design and Evaluation of a Compiler Algorithm for Prefetching. In Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-V), Boston, Massachusetts, USA, pp. 62–73, 1992.
P. R. Panda, H. Nakamura, N. D. Dutt, and A. Nicolau. A Data Alignment Technique for Improving Cache Performance. In Proceedings of the International Conference on Computer Design VLSI in Computers and Processors, Austin, Texas, USA, pp. 587–592. IEEE Computer Society Press, 1997.
G. Rivera and C.-W. Tseng. Data Transformations for Eliminating Conflict Misses. In Proceedings of the ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI’98), Montréal, Canada,ACM SIGPLAN Notices, vol. 33(5), pp. 38–49. ACM Press, 1998.
G. Rivera and C.-W. Tseng. Eliminating Conflict Misses for High Performance Architectures. In Proceedings of the ACM International Conference on Supercomputing (ICS’98), Melbourne, Australia. ACM Press, 1998.
G. Rivera and C.-W. Tseng. A Comparison of Compiler Tiling Algorithms. In Proceedings of the 8th International Conference on Compiler Construction (CC’99), Amsterdam, The Netherlands, Lecture Notes in Computer Science, vol. 1575, pp. 168–182. Springer-Verlag, 1999.
V. Sarkar and R. Thekkath. A General Framework for Iteration-Reordering Loop Transformations. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’92), San Francisco, California, USA, pp. 175–187, 1992.
S.-B. Scholz. On Defining Application-Specific High-Level Array Operations by Means of Shape-Invariant Programming Facilities. In S. Picchi and M. Micocci, editors, Proceedings of the International Conference on Array Processing Languages (APL’98), Rome, Italy, pp. 40–45. ACM Press, 1998.
S.-B. Scholz. A Case Study: Effects of WITH-Loop Folding on the NAS Benchmark MGin SAC. In K. Hammond, T. Davie, and C. Clack, editors, Proceedings of the 10th International Workshop on Implementation of Functional Languages (IFL’98), London, UK, selected papers, Lecture Notes in Computer Science, vol. 1595, pp. 216–228. Springer-Verlag, 1999.
O. Temam, C. Fricker, and W. Jalby. Cache Interference Phenomena. In Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Nashville, Tennessee, USA, pp. 261–271. ACM Press, 1994.
M. E. Wolf and M. S. Lam. A Data Locality Optimizing Algorithm. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’91), pp. 30–44, 1991.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grelck, C. (2001). Improving Cache Effectiveness through Array Data Layout Manipulation in SAC. In: Mohnen, M., Koopman, P. (eds) Implementation of Functional Languages. IFL 2000. Lecture Notes in Computer Science, vol 2011. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45361-X_14
Download citation
DOI: https://doi.org/10.1007/3-540-45361-X_14
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41919-8
Online ISBN: 978-3-540-45361-1
eBook Packages: Springer Book Archive