Abstract
Bug checker tools for Java require fine-grained heap abstractions including object-sensitive call graphs, field information for objects, and points-to sets for program variables to find bugs in source codes. However, heap abstractions coined commonly as points-to analysis, have high runtime-complexity especially when the points-to analysis is context- sensitive, and, hence, state-of-the-art points-to analyses do not scale for large code bases.
In this paper, we introduce a new points-to framework that facilitates the computation of context-sensitive points-to analysis for large code bases. The framework is demand-driven, i.e., a client queries the points-to information for some program variables. The novelty of our approach is a pre-analysis technique that is a combination of staged points-to analyses with program slicing and program compaction. We implemented the proposed points-to framework in Datalog for a proprietary bug checker that could identify security vulnerabilities in the OpenJDKTM library which has approximately 1.3 million variables and 500,000 allocation-sites. For the clients that we have chosen, our technique is able to eliminate about 73% of all variables and about 95% of allocation-sites. Thus our points-to framework scales for code bases with millions of program variables and hundreds of thousands of methods.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995)
Andersen, L.O.: Program Analysis and Specialization for the C Programming Language. Ph.D. thesis, DIKU, University of Copenhagen (Fall 1994)
Appel, A.W.: Modern Compiler Implementation in Java. Cambridge University Press (1998)
Ball, T., Rajamani, S.K.: The SLAM toolkit. In: Berry, G., Comon, H., Finkel, A. (eds.) CAV 2001. LNCS, vol. 2102, pp. 260–264. Springer, Heidelberg (2001)
Bessey, A., Block, K., Chelf, B., Chou, A., Fulton, B., Hallem, S., Henri-Gros, C., Kamsky, A., McPeak, S., Engler, D.: A few billion lines of code later – using static analysis to find bugs in the real world. Comm. ACM 53, 66–75 (2010)
Blackburn, S.M., Garner, R., Hoffmann, C., Khan, A.M., McKinley, K.S., Bentzur, R., Diwan, A., Feinberg, D., Frampton, D., Guyer, S.Z., Hirzel, M., Hosking, A., Jump, M., Lee, H., Moss, J.E.B., Phansalkar, A., Stefanovic, D., VanDrunen, T., von Dincklage, D., Wiedermann, B.: The DaCapo benchmarks: Java benchmarking development and analysis. In: OOPSLA 2006: Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (2006)
Bravenboer, M., Smaragdakis, Y.: Strictly declarative specification of sophisticated points-to analyses. In: Proceeding of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA 2009, pp. 243–262. ACM (2009), http://doi.acm.org/10.1145/1640089.1640108
Cifuentes, C., Keynes, N., Li, L., Hawes, N., Valdiviezo, M.: Transitioning Parfait into a development tool. IEEE Security and Privacy 10(3), 16–23 (2012)
Corporation, O.: Secure coding guidelines for java se (April 2014), http://www.oracle.com/technetwork/java/seccodeguide-139067.html
Debray, S.K., Evans, W., Muth, R., De Sutter, B.: Compiler techniques for code compaction. ACM Transactions on Programming Languages and Systems 22(2), 378–415 (2000)
Feng, Y., Anand, S., Dillig, I., Aiken, A.: Apposcopy: Semantics-based detection of android malware through static analysis. In: International Symposium on Foundations of Software Engineering (2014) (to appear)
Gotsman, A., Berdine, J., Cook, B.: Interprocedural shape analysis with separated heap abstractions. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 240–260. Springer, Heidelberg (2006)
Green, T.J., Aref, M., Karvounarakis, G.: Logicblox, platform and language: A tutorial. In: Barceló, P., Pichler, R. (eds.) Datalog 2.0 2012. LNCS, vol. 7494, pp. 1–8. Springer, Heidelberg (2012)
Hind, M., Pioli, A.: Which pointer analysis should i use? In: Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), pp. 113–123. ACM (2000)
Lhoták, O., Hendren, L.J.: Context-sensitive points-to analysis: Is it worth it? In: Mycroft, A., Zeller, A. (eds.) CC 2006. LNCS, vol. 3923, pp. 47–64. Springer, Heidelberg (2006)
Lhoták, O., Hendren, L.J.: Evaluating the benefits of context-sensitive points-to analysis using a BDD-based implementation. ACM Transactions on Software Engineering Methodology 18(1) (2008)
Lu, Y., Shang, L., Xie, X., Xue, J.: An incremental points-to analysis with cfl-reachability. In: Jhala, R., De Bosschere, K. (eds.) Compiler Construction. LNCS, vol. 7791, pp. 61–81. Springer, Heidelberg (2013)
Milanova, A., Rountev, A., Ryder, B.G.: Parameterized object sensitivity for points-to analysis for Java. ACM Transaction on Software Engineering Methodolology 14(1), 1–41 (2005), http://doi.acm.org/10.1145/1044834.1044835
Octeau, D., McDaniel, P., Jha, S., Bartel, A., Bodden, E., Klein, J., Le Traon, Y.: Effective inter-component communication mapping in android with epicc: An essential step towards holistic security analysis. In: Proceedings of the 22nd USENIX Conference on Security (SEC), pp. 543–558. USENIX Association (2013), http://dl.acm.org/citation.cfm?id=2534766.2534813
Oh, H., Lee, W., Heo, K., Yang, H., Yi, K.: Selective context-sensitivity guided by impact pre-analysis. In: ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pp. 475–484. ACM (2014)
Ryder, B.G.: Dimensions of precision in reference analysis of object-oriented programming languages. In: Hedin, G. (ed.) CC 2003. LNCS, vol. 2622, pp. 126–137. Springer, Heidelberg (2003)
Smaragdakis, Y., Balatsouras, G., Kastrinis, G.: Set-based pre-processing for points-to analysis. In: ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA), pp. 253–270 (2013)
Smaragdakis, Y., Bravenboer, M., Lhoták, O.: Pick your contexts well: understanding object-sensitivity. In: Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, pp. 17–30. ACM (2011), http://doi.acm.org/10.1145/1926385.1926390
Sridharan, M., Bodík, R.: Refinement-based context-sensitive points-to analysis for Java. In: Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2006, pp. 387–400. ACM (2006), http://doi.acm.org/10.1145/1133981.1134027
Sridharan, M., Gopan, D., Shan, L., Bodik, R.: Demand-driven points-to analysis for Java. In: Proceedings of the 20th Annual ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 59–76. ACM (2005), http://doi.acm.org/10.1145/1094811.1094817
Tip, F., Palsberg, J.: Scalable propagation-based call graph construction algorithms. In: Rosson, M.B., Lea, D. (eds.) OOPSLA 2000, pp. 281–293. ACM (2000)
Yan, D., Xu, G., Rountev, A.: Demand-driven context-sensitive alias analysis for Java. In: Proceedings of the 2011 International Symposium on Software Testing and Analysis (ISSTA), pp. 155–165. ACM (2011), http://doi.acm.org/10.1145/2001420.2001440
Zheng, X., Rugina, R.: Demand-driven alias analysis for C. In: Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2008, pp. 197–208 (2008), http://doi.acm.org/10.1145/1328438.1328464
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Allen, N., Scholz, B., Krishnan, P. (2015). Staged Points-to Analysis for Large Code Bases. In: Franke, B. (eds) Compiler Construction. CC 2015. Lecture Notes in Computer Science(), vol 9031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46663-6_7
Download citation
DOI: https://doi.org/10.1007/978-3-662-46663-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46662-9
Online ISBN: 978-3-662-46663-6
eBook Packages: Computer ScienceComputer Science (R0)