Abstract
This paper describes initial experiences with semi-automated performance tuning of a sparse linear solver in LS-DYNA, a large, widely used engineering application. Through a collection of tools supporting empirical optimization, we alleviate the burden of performance tuning for mapping today’s sophisticated engineering software to increasingly complex hardware platforms. We describe a tool that automatically isolates code segments to create benchmark subsets for the purposes of performance tuning. We present a collection of automatically generated empirical results that demonstrate the sensitivity of the application’s performance to optimization parameters. Through this case study, we demonstrate the importance of developing automatic performance tuning support for performance-sensitive applications.
Cite this article
Lee, YJ., Diniz, P.C., Hall, M.W. et al. Empirical Optimization for a Sparse Linear Solver: A Case Study. Int J Parallel Prog 33, 165–181 (2005). https://doi.org/10.1007/s10766-005-3581-7
DOI: https://doi.org/10.1007/s10766-005-3581-7