Register Saturation in Instruction Level Parallelism

Touati, Sid-Ahmed-Ali

doi:10.1007/s10766-005-6466-x

Published: August 2005

Volume 33, pages 393–449, (2005)

International Journal of Parallel Programming

Sid-Ahmed-Ali Touati¹

Abstract

Register constraints are usually taken into account during the scheduling pass of an acyclic data dependence graph (DAG): any schedule of the instructions inside a basic block must keep the register requirement under a given limit. In this work, we show how to handle register pressure before the instruction scheduling of a DAG. We mathematically study an approach that consists of managing the exact upper bound of the register need over all valid schedules of a considered DAG, independently of the functional unit constraints. We call this computed limit the register saturation (RS) of the DAG. Its aim is to detect possibly obsolete register constraints, i.e., cases where RS does not exceed the number of available registers. If RS does exceed that number, we add some serial edges to the original DAG such that the worst-case register need no longer exceeds the number of available registers. We propose an appropriate mathematical formalism for this problem. Our generic processor model takes into account superscalar, VLIW and EPIC/IA64 architectures. Our deeper analysis of the problem and our formal methods enable us to provide nearly optimal heuristics and strategies for register optimization in the face of ILP.
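
To make the notion of register saturation concrete, the following is a minimal sketch, not the method of the article. It assumes a deliberately simplified model: instructions are issued one per step in a sequential order, every instruction defines one value, a value occupies a register from its definition until its last consumer is issued, and values without consumers stay live until the end of the block. The names `register_need`, `register_saturation`, `consumers` and `serial_edges`, as well as the example DAG, are hypothetical and chosen only for illustration. RS is obtained here by brute-force enumeration of all valid orders, which is only feasible for tiny DAGs.

```python
from itertools import permutations


def register_need(order, consumers):
    """Maximum number of simultaneously live values for one sequential order."""
    position = {node: i for i, node in enumerate(order)}
    # Assumption: a value dies when its last consumer is issued; a value
    # without consumers stays live until the end of the block.
    last_use = {
        v: max((position[c] for c in consumers.get(v, ())), default=len(order))
        for v in order
    }
    need = 0
    for step in range(len(order)):
        live = sum(1 for v in order if position[v] <= step < last_use[v])
        need = max(need, live)
    return need


def is_valid(order, edges):
    """Check that every precedence edge (u, v) has u issued before v."""
    position = {node: i for i, node in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)


def register_saturation(nodes, consumers, serial_edges=()):
    """Exact maximum register need over all valid orders (brute force)."""
    flow_edges = [(v, c) for v, cs in consumers.items() for c in cs]
    # Serial edges constrain the order but carry no value, so they do not
    # extend any lifetime.
    edges = flow_edges + list(serial_edges)
    return max(
        register_need(order, consumers)
        for order in permutations(nodes)
        if is_valid(order, edges)
    )


# Hypothetical basic block: g = (a + b) + (c + d), with a..d as loads.
nodes = ["a", "b", "c", "d", "e", "f", "g"]
consumers = {"a": ["e"], "b": ["e"], "c": ["f"], "d": ["f"],
             "e": ["g"], "f": ["g"]}

rs = register_saturation(nodes, consumers)
rs_reduced = register_saturation(nodes, consumers, serial_edges=[("e", "c")])
print(rs, rs_reduced)
```

Under these assumptions the example DAG has a saturation of 4, reached when all four loads are issued first. With only 3 registers available, adding the single serial edge from `e` to `c` forces `a` and `b` to be consumed before `c` is loaded and lowers the saturation to 3, which mirrors the edge-insertion idea described in the abstract.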

Author information

Authors and Affiliations

PRiSM laboratory, University of Versailles, France
Sid-Ahmed-Ali Touati

Corresponding author

Correspondence to Sid-Ahmed-Ali Touati.

About this article

Cite this article

Touati, SAA. Register Saturation in Instruction Level Parallelism. Int J Parallel Prog 33, 393–449 (2005). https://doi.org/10.1007/s10766-005-6466-x

Received: 03 May 2004
Revised: 01 October 2004
Accepted: 01 March 2005
Issue Date: August 2005
DOI: https://doi.org/10.1007/s10766-005-6466-x
