Abstract
Instruction hints have become an important way to communicate compile-time information to the hardware. They can be generated by the compiler or the post-link optimizer to reduce cache misses, improve branch prediction, and mitigate other performance bottlenecks. This paper discusses the instruction hints available on modern processor architectures and shows their potential performance impact on many benchmark programs. Some hints can be selected effectively at compile time with profile feedback. However, because the same executable can behave differently on different inputs, and because performance bottlenecks can shift from one micro-architecture to another, significant performance opportunities can be exploited by selecting instruction hints dynamically.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, R., Lu, J., Zhai, A., Hsu, WC. (2006). A Study of the Performance Potential for Dynamic Instruction Hints Selection. In: Jesshope, C., Egan, C. (eds) Advances in Computer Systems Architecture. ACSAC 2006. Lecture Notes in Computer Science, vol 4186. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11859802_7
DOI: https://doi.org/10.1007/11859802_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40056-1
Online ISBN: 978-3-540-40058-5
eBook Packages: Computer Science (R0)