Techniques for critical path reduction of scalar programs

Schlansker, Michael; Kathail, Vinod

doi:10.1007/BF02700034

Published: June 1997

Volume 25, pages 147–181 (1997)
International Journal of Parallel Programming

Abstract

Scalar performance on processors with instruction level parallelism (ILP) is often limited by control and data dependences. This paper describes a family of compiler techniques, called Critical Path Reduction (CPR) techniques, which reduce the length of critical paths through control and data dependences. Control CPR reduces the number of branches on the critical path and improves the performance of branch intensive codes on processors with inadequate branch throughput or excessive branch latency. Data CPR reduces the number of arithmetic operations on the critical path. Optimization and scheduling are adapted to support CPR.
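
As an illustration of what reducing "the number of arithmetic operations on the critical path" can mean, the sketch below compares a serial sum with a pairwise (balanced-tree) sum of the same values. It is not taken from the paper: reassociation is a standard height-reduction idea used here as a stand-in, and the function names `serial_sum` and `tree_sum` are hypothetical. Each function returns the result together with the number of dependent additions on its longest chain.

```python
def serial_sum(values):
    """Left-to-right sum of a non-empty list.

    Every add depends on the previous one, so the dependence
    height equals the number of adds: len(values) - 1.
    """
    total, height = values[0], 0
    for v in values[1:]:
        total += v
        height += 1
    return total, height


def tree_sum(values):
    """Pairwise sum of a non-empty list.

    Adds within one level are mutually independent, so an ILP
    processor with enough adders could issue them together.
    The height is the number of levels, about log2(len(values)).
    """
    level = list(values)
    height = 0
    while len(level) > 1:
        # Combine neighbours; these adds do not depend on each other.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element carries over unchanged
        level = nxt
        height += 1
    return level[0], height


if __name__ == "__main__":
    data = list(range(1, 9))
    print(serial_sum(data))  # (36, 7)
    print(tree_sum(data))    # (36, 3)
```

Both versions perform the same number of additions; only the longest dependence chain shrinks. The rewrite is exact for integers, while for floating-point values reassociation can change rounding, so a compiler would need permission to apply it.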

Author information

Authors and Affiliations

Hewlett-Packard Laboratories, Palo Alto, California 94304
Michael Schlansker & Vinod Kathail

About this article

Cite this article

Schlansker, M., Kathail, V. Techniques for critical path reduction of scalar programs. Int J Parallel Prog 25, 147–181 (1997). https://doi.org/10.1007/BF02700034

Issue Date: June 1997
DOI: https://doi.org/10.1007/BF02700034
