Reducing the Overhead of Direct Application Instrumentation Using Prior Static Analysis

Mußler, Jan; Lorenz, Daniel; Wolf, Felix

doi:10.1007/978-3-642-23400-2_7

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6852)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

Preparing performance measurements of HPC applications is usually a tradeoff between accuracy and granularity of the measured data. When using direct instrumentation, that is, the insertion of extra code around performance-relevant functions, the measurement overhead increases with the rate at which these functions are visited. If applied indiscriminately, the measurement dilation can even be prohibitive. In this paper, we show how static code analysis in combination with binary re-writing can help eliminate unnecessary instrumentation points based on configurable filter rules. In contrast to earlier approaches, our technique does not rely on dynamic information, making extra runs prior to the actual measurement dispensable. Moreover, the rules can be applied and modified without re-compilation. We evaluate filter rules designed for the analysis of computation and communication performance and show that in most cases the measurement dilation can be reduced to a few percent while still retaining significant detail.
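To make the idea of static filter rules concrete, the following is a minimal sketch and not the authors' implementation. It assumes a call graph and a per-function cyclomatic complexity have already been extracted by some static analysis; the names FunctionInfo, reaches, and select_instrumentation, the complexity threshold, and the "MPI_" prefix rule are all hypothetical choices made for illustration. A function is kept as an instrumentation point if it is complex enough to matter for computation analysis or if it lies on a call path to a communication routine; every other function would be left uninstrumented.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionInfo:
    """Statically derived facts about one function (hypothetical model)."""
    name: str
    cyclomatic_complexity: int
    callees: set = field(default_factory=set)


def reaches(functions, start, targets):
    """Return True if any name in targets is reachable from start
    by following callee edges of the static call graph."""
    seen = set()
    stack = [start]
    while stack:
        current = stack.pop()
        if current in targets:
            return True
        if current in seen:
            continue
        seen.add(current)
        info = functions.get(current)
        if info is not None:  # external routines have no recorded callees
            stack.extend(info.callees)
    return False


def select_instrumentation(functions, min_complexity, comm_prefix="MPI_"):
    """Apply two illustrative filter rules and return the names to instrument.

    Rule 1 (computation): keep functions whose cyclomatic complexity is
    at least min_complexity.
    Rule 2 (communication): keep functions from which a routine whose name
    starts with comm_prefix is reachable.
    """
    comm_targets = {
        callee
        for info in functions.values()
        for callee in info.callees
        if callee.startswith(comm_prefix)
    }
    selected = set()
    for name, info in functions.items():
        if (info.cyclomatic_complexity >= min_complexity
                or reaches(functions, name, comm_targets)):
            selected.add(name)
    return selected


# Hand-written example call graph (assumed data, not from the paper).
functions = {
    "main": FunctionInfo("main", 4, {"solve", "exchange"}),
    "solve": FunctionInfo("solve", 12, {"dot"}),
    "dot": FunctionInfo("dot", 1),
    "exchange": FunctionInfo("exchange", 2, {"MPI_Sendrecv"}),
}
selected = select_instrumentation(functions, min_complexity=5)
print(sorted(selected))
```

In this example the small leaf function dot is filtered out, since it is neither complex nor on a path to communication, while main, solve, and exchange remain instrumented. Because the decision uses only static information, the rules can be changed and re-applied without a prior profiling run, which is the property the abstract emphasizes.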

This material is based upon work supported by the US Department of Energy under Award Number DE-SC0001621.

Author information

Authors and Affiliations

Jülich Supercomputing Centre, 52425, Jülich, Germany
Jan Mußler, Daniel Lorenz & Felix Wolf
German Research School for Simulation Sciences, 52062, Aachen, Germany
Felix Wolf
RWTH Aachen University, 52056, Aachen, Germany
Felix Wolf

Editor information

Editors and Affiliations

Equipe Runtime, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France
Emmanuel Jeannot & Raymond Namyst
Equipe HIEPACS, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France
Jean Roman

Cite this paper

Mußler, J., Lorenz, D., Wolf, F. (2011). Reducing the Overhead of Direct Application Instrumentation Using Prior Static Analysis. In: Jeannot, E., Namyst, R., Roman, J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6852. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23400-2_7

DOI: https://doi.org/10.1007/978-3-642-23400-2_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23399-9
Online ISBN: 978-3-642-23400-2
eBook Packages: Computer Science, Computer Science (R0)
