Abstract
The benefits of collecting data provenance information have driven research both on extending or modifying applications and systems to provide it and on architectures built from the ground up with provenance capabilities. In this paper we propose a universal data provenance framework, based on dynamic instrumentation, which gathers data provenance information for real-world applications without any code modifications. Our framework simplifies the task of finding the right points to instrument, which can be cumbersome in large and complex systems. We have built a proof-of-concept implementation of the framework on top of DTrace and evaluated its functionality in three different scenarios: file-system operations, database transactions, and web browser HTTP requests. Based on our experience, we believe that it is possible to provide data provenance, transparently, to any layer of the software stack.
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Gessiou, E., Pappas, V., Athanasopoulos, E., Keromytis, A.D., Ioannidis, S. (2012). Towards a Universal Data Provenance Framework Using Dynamic Instrumentation. In: Gritzalis, D., Furnell, S., Theoharidou, M. (eds) Information Security and Privacy Research. SEC 2012. IFIP Advances in Information and Communication Technology, vol 376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30436-1_9
DOI: https://doi.org/10.1007/978-3-642-30436-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30435-4
Online ISBN: 978-3-642-30436-1
eBook Packages: Computer Science (R0)