Abstract
This paper presents two tools: an execution-driven cache simulator that relates event metrics to a dynamically built call graph, and a graphical front end that visualizes the generated data in various ways. To provide a general-purpose, easy-to-use tool suite, the simulation approach takes advantage of runtime instrumentation, i.e., no preparation of application code is needed, and enables sophisticated preprocessing of the data already in the simulation phase. These tools form the basis of an ongoing research project on advanced cache analysis. Taking a multigrid solver as an example, we present the results obtained from the cache simulation together with real data measured by hardware performance counters.
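To make the idea of relating cache events to a call graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy direct-mapped cache and a profiler that is told explicitly when functions are entered and left; the names `DirectMappedCache`, `CallGraphProfiler`, `enter`, `leave`, and `mem_access` are hypothetical and chosen only for illustration. The tools described here obtain these events through runtime instrumentation instead of explicit calls.

```python
from collections import defaultdict


class DirectMappedCache:
    """Toy direct-mapped cache: one tag per set, no write policy modeled."""

    def __init__(self, size_bytes=1024, line_bytes=64):
        self.line_bytes = line_bytes
        self.num_sets = size_bytes // line_bytes
        self.tags = [None] * self.num_sets

    def access(self, addr):
        """Return True on a hit; on a miss, fill the line and return False."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag
        return False


class CallGraphProfiler:
    """Attributes simulated cache events to functions and call-graph edges."""

    def __init__(self, cache):
        self.cache = cache
        self.stack = ["<root>"]  # current call chain, built up dynamically
        self.self_cost = defaultdict(lambda: {"accesses": 0, "misses": 0})
        self.edge_calls = defaultdict(int)   # (caller, callee) -> call count
        self.edge_misses = defaultdict(int)  # (caller, callee) -> inclusive misses

    def enter(self, func):
        self.edge_calls[(self.stack[-1], func)] += 1
        self.stack.append(func)

    def leave(self):
        self.stack.pop()

    def mem_access(self, addr):
        hit = self.cache.access(addr)
        cost = self.self_cost[self.stack[-1]]
        cost["accesses"] += 1
        if not hit:
            cost["misses"] += 1
            # Inclusive cost: charge each distinct edge on the active call
            # chain once, so recursive chains are not double-counted.
            for edge in set(zip(self.stack, self.stack[1:])):
                self.edge_misses[edge] += 1


# Hypothetical workload: "main" calls "smooth", which sweeps 512 bytes
# in 8-byte steps through a 256-byte cache with 64-byte lines.
prof = CallGraphProfiler(DirectMappedCache(size_bytes=256, line_bytes=64))
prof.enter("main")
prof.enter("smooth")
for addr in range(0, 512, 8):
    prof.mem_access(addr)
prof.leave()
prof.leave()
```

In this sketch each of the eight 64-byte lines touched by the sweep misses once on first use, so the self cost of `smooth` and the inclusive cost of the edge from `main` to `smooth` both record eight misses out of 64 accesses. A front end could then render such per-function and per-edge totals in whatever visual form suits the analysis.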
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Weidendorfer, J., Kowarschik, M., Trinitis, C. (2004). A Tool Suite for Simulation Based Analysis of Memory Access Behavior. In: Bubak, M., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science - ICCS 2004. ICCS 2004. Lecture Notes in Computer Science, vol 3038. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24688-6_58
DOI: https://doi.org/10.1007/978-3-540-24688-6_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22116-6
Online ISBN: 978-3-540-24688-6
eBook Packages: Springer Book Archive