Keywords

1 Introduction

Dynamic program analyses, such as profiling, tracing and bug-finding tools, are essential for software development. However, despite this importance, the Java platform currently provides very limited support for creating these tools. Shortcomings common both to existing Java Virtual Machines (JVMs) and Android’s Dalvik Virtual Machine (DVM) include lack of high-level abstractions for expressing analyses, lack of support for complete code coverage, and difficulty of avoiding interference between analysis and the underlying program. Instead, dynamic program analysis tools must be implemented using low-level mechanisms, such as the JVM Tool Interface (JVMTI) [19]—making for code that is error-prone and difficult to maintain, and often supporting only a particular virtual machine. Bytecode instrumentation presents fundamental interference and coverage difficulties, meaning that many analysis tools necessarily produce output that is unsound or incomplete, in order to avoid crashing or corrupting the application [11].

In this tool paper, we present our dynamic program analysis framework ShadowVM, which offers a high-level programming model for comprehensive, multi-platform analysis. ShadowVM ensures complete bytecode coverage and isolates the execution of the analysis code from the observed program. With our framework, the same implementation of an analysis can be applied to programs running on the JVM and on the DVM. ShadowVM offers dedicated support for analyzing distributed applications comprising multiple communicating processes—fundamental for the analysis of Android applications, which are typically split into multiple components running in separate DVM processes.

This tool demonstration complements our previous publications on ShadowVM. In [14] we presented our initial design of ShadowVM, which only supported the JVM. In [20] we described the challenges of enabling dynamic program analysis on Android and presented an updated design of ShadowVM suited for Android applications running in DVM processes. The corresponding software release of ShadowVM for Android supported only Android SDK ARM emulator, resulting in extremely slow analysis. This tool demonstration provides the first complete presentation of the multi-platform analysis framework that has evolved from these two pieces of work. Additionally, our framework now offers two new deployment options: Android SDK x86 emulator and Android devices.

Section 2 gives an overview of the ShadowVM architecture. Section 3 illustrates how to implement a simple code-coverage analysis with ShadowVM and evaluates its performance on Android. Section 4 shows a new demonstration scenario, where we use ShadowVM to analyze a distributed application comprised of a Java server backend and an Android client frontend. Section 5 discusses related work and Sect. 6 summarizes the strengths and limitations of our framework.

2 ShadowVM Overview

Dynamic program analysis can be regarded as the processing of events that are produced within an observed application. Depending on the purpose of the analysis, different kinds of events are relevant, such as e.g. method call, method entry/exit, field access, or object allocation and reclamation. Most of them correspond to the execution of a specific location in the bytecode of the observed application; such events can be produced using bytecode instrumentation, and we call them instrumentation events. Other lifecycle events, such as e.g. object reclamation or virtual machine termination, do not correspond to any specific code location; they are produced by the framework using some internal mechanism of the virtual machine (such as the JVMTI [19] on the JVM) [14]. Similarly, special communication events are produced by the framework in the case of inter-process communication in an Android application [20]. Events may carry various context information, such e.g. an object reference or the name of an accessed field.

An analysis implemented on top of ShadowVM consists of two parts, the event producer and the event consumer. ShadowVM offers high-level programming abstractions for implementing the event-producing and event-consuming logic. Instrumentation events are expressed in the domain-specific aspect language DiSL (DSL for Instrumentation) [13, 15, 24], whereas lifecycle and communication events are automatically generated by the framework. ShadowVM also includes a library of DiSL code for commonly used instrumentation events. The event-consuming logic is expressed as Java methods that handle the required events.

The deployment of an analysis consists of three components: the observed application (running with instrumentation), an instrumentation server, and an analysis server. The instrumentation server and the analysis server may be deployed within the same JVM process. The observed application is always running in one (or in a set of) separate JVM or DVM processes to avoid unwanted interferences between the observed application and the instrumentation/analysis code [14]. Instrumentation is performed at class load time; any method that has a bytecode representation can be instrumented (including those in the class library and in dynamically generated classes). For the DVM, a conversion between dex files and JVM class files occurs before and after instrumentation, which allows the instrumentation server (running DiSL) to only deal with Java class files.

The events produced in the observed application are sent to the analysis server through sockets. For each received event, a corresponding method (defined in an analysis class) is invoked by the framework’s event dispatcher. The payload of the event can include primitives (passed by value) and object references; the latter are exposed to the analysis as shadow objects. These preserve the identity of the objects from the original program, and expose reflective metadata mirroring the class hierarchy of the observed application. Shadow objects allow attaching and accessing arbitrary analysis state—perhaps analysis-specific data (e.g. timestamps) or perhaps the real object’s contents (by observing field writes). For convenience, shadow strings replicate the real strings’ contents.

Fig. 1.
figure 1

Overview of an analysis receiving events from multiple observed Java and Android application components.

Figure 1 illustrates multiple observed components of a distributed application, all sending events to the same analysis server. The server-side components of the application are running in JVM processes, whereas the client-side frontend components are executing in DVM processes. The origin of a received event is represented by a JVM/DVM context object. Thanks to ShadowVM, a single implementation of the event-producing and event-consuming parts can observe the entire distributed multi-platform deployment. For example, the analysis may trace all communication between the distributed components. Section 4 demonstrates such an analysis.

Fig. 2.
figure 2

Event producer and consumer of JaCoCo recast for ShadowVM.

3 Code Coverage Analysis with ShadowVM

To illustrate how to implement a simple analysis with ShadowVM, we recast the popular code coverage tool JaCoCo [7].

Figure 2a shows the DiSL [13, 15] instrumentation code producing branch events. Figure 2b shows the plain Java analysis code which consumes these branch events. The instrumentation assigns each branch a dedicated number for indexing, and emits an event indicating which branch is taken (the event marshalling code not shown here is in class CodeCoverageAnalysisProxy). In DiSL, Java annotations mark a snippet (a static method) with places where it should be inserted (here before and after branches). The extra “synthetic” local boolean is inserted into each method body and used to select only the taken branches. Although snippets appear as static methods within a Java class, along with auxiliary definitions (like the synthetic local), this is simply a convenient container; it is never loaded nor instantiated, and is used only by the instrumentation server. The snippet produces an event consisting of two strings and an integer, uniquely identifying the branch. The analysis maintains a simple data structure tracking taken branches, updated in reaction to the events received.

In [20] we compared the analysis results produced by our analysis and by the original JaCoCo. Both tools support JVM and DVM, and produce equivalent results for application classes. In contrast to the original JaCoCo, our tool is also capable of analyzing code coverage in the class library.

Below we report some new results on source-code metrics and on performance. The original JaCoCo (excluding report generation features) has 1959 logical lines of code (LOC), whereas our recast has only 363 LOC for the same functionality (including event producing and consuming logic). That is, with ShadowVM we can express the same analysis in less than 19 % of the LOC of the original tool.

Our current version of ShadowVM supports three deployment options for Android 4.4: (a) Android SDK ARM emulator; (b) Android SDK x86 emulator; (c) Android devices. Options (b) and (c) are new. Table 1 compares the performance of JaCoCo and our recast for the three Android deployment options.Footnote 1 We use GrinderBenchFootnote 2 for the analysis. The reported metric is the elapsed wall time from the start of a benchmark until completion of the analysis.

In general, our ShadowVM recast introduces more overhead than JaCoCo. The slowdown is explained by event transmission overheads, whereas JaCoCo collects the coverage data within the observed process. The ShadowVM recast of JaCoCo offers greater flexibility—we can switch to different metrics, such as basic block profiling, without redeploying the profiler onto the device, since the instrumentation server runs separately. The option of deploying on the Android x86 emulator greatly reduces overhead (down to a factor of 2, instead of 8.3) by eliminating the cost of emulating a non-native instruction set architecture. We note that even the higher overhead need not be prohibitive; tools with overheads a factor of 10 or greater have gained acceptance among developers [17].

Table 1. Execution time (in seconds) of the Grinder benchmarks on ARM emulator, x86 emulator or real device. Each deployment option is evaluated without instrumentation (baseline), with JaCoCo, or with our ShadowVM recast.

4 Fuzzing a Distributed Multi-platform Application

We will demonstrate ShadowVM with a trace validation tool for fuzzing a distributed application comprising a Java server and an Android device as frontend. Fuzzing is an automatic testing technique that feeds random inputs to a program to trigger exceptional behavior [3]. Monkey is a fuzzing tool generating Android user-interface events; it has been applied to finding security bugs [12]. As Android applications increasingly rely on server-side components, they increasingly suffer difficult-to-find bugs triggered only by proper coordination of the client and the server. ShadowVM enables fuzzing these distributed multi-platform applications, by analyzing the whole distributed application’s state at once. For example, we can validate traces against a whole-application state machine; on failure, we report to the developer the unexpected next state, the current state, and the triggering input stream.

Figure 3 illustrates the state machine of a login routine. The login credentials are collected by an Android application and validated by a Java server. To avoid brute-force attacks, only three successive login attempts are allowed: the server counts the number of unsuccessful attempts, and will block the account once it exceeds three. During login, the user may also reset the password; the “Reset Password” sub-routine should not change the “Login Attempts” state.

Suppose the implementation of the server-side component violates this specification, by accidentally resetting the “Login Attempts” state after resetting the password. This would permit brute-force attacks by resetting the password every two login attempts. Such a bug is difficult to detect when fuzzing only client or server code in isolation. One must either “mock up” the remote party’s interactions in a testing harness usable by the fuzzer or deal with the state-space explosion that can result from overapproximating this behaviour. By contrast, with ShadowVM we can easily fuzz the whole ensemble.

We will demonstrate fuzzing of the aforementioned distributed application (driven by Monkey) using a ShadowVM-based trace validation tool. In the observed virtual machines, events are produced for the login-related actions of both client (DVM) and server (JVM). In the analysis server, a state machine validates the event trace. The specification of the state machine is configurable such that the trace validation tool can analyze different applications.

5 Related Work

The baseline approach on which ShadowVM improves is explicit bytecode instrumentation. Various systems offering services and simple abstractions for this have been described in the literature [4, 6, 9, 23], but in all cases direct manipulation of instructions, much lower-level than the aspect-oriented primitives proposed in DiSL [15], on which ShadowVM is based, and lack our framework’s multi-platform capability. RoadRunner [8] offers a pipeline abstraction, but is specialized to the specific analysis domain of dynamic data-race detection.

Fig. 3.
figure 3

State machine illustrating all possible login sequences.

Other research addresses the dynamic analysis of distributed systems. Pinpoint [5] aims to identify components correlated with failed requests in multi-tier JavaEE server systems, by tracking incoming requests using a unique identifier propagated along a request execution path. ARM [22] (Application Response Measurement) defines a standardized infrastructure for monitoring multi-tier enterprise applications, using tags to associate system behavior with individual requests. Magpie [2] allows analyzing server workloads and resource consumption of individual requests, enabling the construction of workload models for clusters of canonicalized requests. Aguilera et al. [1] use statistical methods to infer dominant causal paths in a distributed system, using only message-level traces.

The common denominator of these approaches is their use of events for the analysis of the system behavior. None of them focuses on the task of instrumentation and event production. In contrast, ShadowVM is primarily a framework providing easy-to-use event-producer and event-consumer programming models which can be used for implementing cross-platform distributed program analyses.

ShadowVM also relates to distributed aspect-oriented programming [16, 18, 21], which introduces the concept of a remote pointcut and allows deploying aspects on a set of hosts. In contrast, ShadowVM is primarily tailored for the development of dynamic analyses deployed on a single host, correlating events from potentially many observed hosts.

Existing fuzzing or active testing systems such as CalFuzzer [10] have tackled scenarios similar to our demonstration, with some degree of extensibility, but stop short of being general-purpose dynamic analysis frameworks. For example, CalFuzzer provides a fixed set of instrumentation callbacks that cover only synchronization and shared-memory operations.

6 Conclusions

We gave the first demonstration of our multi-platform dynamic program analysis framework ShadowVM, which reconciles a high-level programming model, expressiveness, complete bytecode coverage, and isolation of the analysis from the observed application. ShadowVM offers seamless support for both the JVM and the DVM; an analysis written for Java applications also supports Android applications out-of-the-box. As demonstrated before, one analysis server can handle the events of multiple observed JVM and DVM processes, enabling centralized analysis of distributed systems. Because the event consumer executes in a separate analysis server, ShadowVM implicitly parallelizes the execution of the observed application and the analysis. This approach also minimizes the extra memory requirements on the observed virtual machine, crucial for deploying heavyweight analyses on resource-constrained Android devices.

The design of ShadowVM also has some drawbacks. As shown in our performance evaluation, the analysis overhead is often higher than with straightforward analysis within the observed virtual machine. In particular, using Android SDK’s ARM emulator results in excessive overhead. The newly supported deployment options (Android SDK x86 emulator and Android devices) mitigate this overhead. For the analysis of Android applications, a version-specific patch needs to be applied to the DVM first. The implicit conversion between JVM and DVM bytecode may introduce some bias in metrics related to individual bytecodes or basic blocks of code. As events are processed remotely and asynchronously by the analysis, ShadowVM is not suited for interactive debugging. Finally, our system cannot observe execution in native code.

ShadowVM is available as part of the DiSL 2.1 open-source release (http://disl.ow2.org/). DVM support is currently available as a prototype (http://dag.inf.usi.ch/downloads/); it will be part of the forthcoming DiSL 3.0 open-source release.