Abstract
Register constraints are usually taken into account during the scheduling pass of an acyclic data dependence graph (DAG): any schedule of the instructions inside a basic block must keep the register requirement under a certain limit. In this work, we show how to handle register pressure before the instruction scheduling of a DAG. We mathematically study an approach that consists in managing the exact upper bound of the register need over all valid schedules of a given DAG, independently of the functional unit constraints. We call this computed limit the register saturation (RS) of the DAG. Its aim is to detect register constraints that may be obsolete, i.e., cases where RS does not exceed the number of available registers. If RS does exceed that number, we add serial edges to the original DAG so that the worst-case register need no longer exceeds the number of available registers. We propose an appropriate mathematical formalism for this problem. Our generic processor model takes into account superscalar, VLIW and EPIC/IA64 architectures. Our deeper analysis of the problem and our formal methods enable us to provide nearly optimal heuristics and strategies for register optimization in the face of ILP.
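To make the notion of register saturation concrete, the following is a minimal brute-force sketch on a toy DAG. It is not the formalism or the heuristics of the article. It assumes sequential schedules only, one value defined per node, and a value that stays alive from its definition until its last consumer is issued. All names (`register_need`, `valid_orders`, `register_saturation`, the graph `flow`) are hypothetical and chosen for illustration.

```python
from itertools import permutations

def register_need(order, flow):
    """Maximum number of simultaneously alive values for one sequential order.

    Assumption: a value is alive after the step that defines it and dies
    when its last consumer is issued; a value without consumers is never alive.
    """
    pos = {v: i for i, v in enumerate(order)}
    last_use = {v: max((pos[c] for c in flow[v]), default=pos[v]) for v in order}
    return max(
        sum(1 for v in order if pos[v] <= t < last_use[v])
        for t in range(len(order))
    )

def valid_orders(flow, serial=()):
    """Yield every sequential order that respects flow and serial edges."""
    edges = [(u, v) for u in flow for v in flow[u]] + list(serial)
    for order in permutations(flow):
        pos = {v: i for i, v in enumerate(order)}
        if all(pos[u] < pos[v] for u, v in edges):
            yield order

def register_saturation(flow, serial=()):
    """Exact upper bound of the register need over all valid orders (brute force)."""
    return max(register_need(o, flow) for o in valid_orders(flow, serial))

# Toy DAG: d consumes a and b; e consumes c and d.
flow = {"a": ["d"], "b": ["d"], "c": ["e"], "d": ["e"], "e": []}

rs = register_saturation(flow)
# A serial edge d -> c only constrains the order; it carries no value.
rs_serial = register_saturation(flow, serial=[("d", "c")])
print(rs, rs_serial)
```

On this toy graph the bound drops from 3 to 2 once the serial edge forces `c` to be issued after `d`. If three or more registers were available, the original DAG could be scheduled without any register constraint. The enumeration is exponential in the number of nodes, so the sketch only serves to illustrate the definition.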
Cite this article
Touati, SAA. Register Saturation in Instruction Level Parallelism. Int J Parallel Prog 33, 393–449 (2005). https://doi.org/10.1007/s10766-005-6466-x
DOI: https://doi.org/10.1007/s10766-005-6466-x