1 INTRODUCTION

Software often contains bugs of the type

• use-after-free (UAF) and

• buffer or heap overflow.

Since a large part of software is used in critically important applications, bugs can cause serious consequences. There are a number of tools that help resolve this problem using static [1, 2] and dynamic [37] analysis.

Static analysis examines the code without executing it. A drawback of this method is that no information about the state of the program during execution (registers, program execution trace, input data, etc.) is available, which results in a large number of false positives. For this reason, static analysis is mostly used prior to the dynamic analysis to reveal fragments of the program that can potentially contain bugs.

The tool described in [1] designed for detecting UAF bugs performs the analysis similar to the analysis of available expressions (the expression x + y is available at the point p if this expression is computed on any path from the entry point to p and x and y remain invariable after the last such computation until p [8]). All paths in the program are examined and the condition object is defined before its use is checked. If this condition is not satisfied, the memory use is considered erroneous, and a message is generated.

GUEB [2] is based on the examination of the program binary code. The process of analysis is divided into two main phases. In the first phase, the heap access and address assignment operations are traced (the correspondence between pointers and heap elements). The pairs {address, size} are stored in the sets alloc_set and free_set when memory is allocated and freed, respectively.

In the second phase, UAF bugs are detected. Using the data collected in the first phase for each point in the program, the tool constructs the set access_heap that contains all pairs {address, size} available at this point. If the intersection of the sets access_heap and free_set is nonempty, then a UAF bug is registered.

A reason for the popularity of dynamic analysis is its capability to examine programs at run time, which gives access to registers and memory contents. The tool Avalanche [3] iteratively analyzes the program’s execution code based on dynamic binary translation. In the process of analysis, the tool computes the input data of the program in order to automatically traverse all attainable paths in the program and detect its abnormal termination.

In DangNull [4] and FreeSentry [5], the focus is on detecting and nullifying pointers to the dynamically allocated memory after this memory is freed, thus preventing UAF bugs. Both tools use static instrumentation of programs.

Undangle [6] also prevents UAF bugs. This tool labels the return value of each memory allocation function and uses the analysis of tainted data for tracing these labels. When memory is freed, the tool checks which memory locations are associated with the corresponding label and detects dangling pointers (i.e., the pointers that reference an already freed memory location).

The tool Mayhem [7] is based on the method of bug detection in the binary code that combines offline and online approaches to the symbolic execution of programs. The offline approach sequentially examines the program execution paths. During each run, the tool covers only one execution path. A drawback is that the common initial fragment of several paths is repeatedly executed at each run of the program. The online approach examines all possible execution paths simultaneously, which can lead to low memory situations.

The combination of these two approaches works as follows. When the limiting value of the memory consumption is achieved, control points are created, and the examination of some paths is suspended, and data about the current state of the execution, symbolic execution context, and specific input data are saved. When resources become available (the examination of certain paths has completed), one of the control points is recovered (i.e. the execution up to this point is reproduced using the saved data). Next, the symbolic execution context is loaded, and the analysis of a new path begins. This approach makes it possible to avoid the repeated symbolic execution of the program up to the control point.

In this paper, we consider the approach based on the dynamic analysis and dynamic instrumentation [910]. We describe a method for detecting UAF bugs that checks the correctness of using pointers for all possible execution paths of the program. The method is based on the code coverage algorithm used in SAGE [11], and it uses the infrastructure of the dynamic analyzer Triton [10].

In this work, we modified the code coverage algorithm used in Triton, which significantly improved its performance, and we added the support of the analysis of programs that work with input data read from files; this feature was not supported in the implementation of Triton.

In the second section of this paper, we describe the code coverage algorithm used in Triton and in the proposed modification. In the third section, we discuss the initial implementation of UAF bug detection and its combination with the dynamic code coverage. The results are presented in the fourth section.

2 THE CODE COVERAGE ALGORITHM

2.1 Code Coverage in Triton

In this paper, we use the algorithm for maximizing code coverage developed in Microsoft and used in SAGE [11]. This algorithm is partially implemented in Triton. It consists of two phases:

• the choice of initial input data and gathering constraints for each program execution path;

• generating new input data by solving logic expressions consisting of constraints gathered in the first phase.

Consider the example of a program shown in Fig. 1.

Fig. 1.
figure 1

An example of program from [10].

In order to examine all paths of this program, it must be provided with the input string bad!. To produce the required data, the algorithm starts by running the program with the initial input string, which is placed in the list of input data. After the first run, the set of constraints < i0 ≠ b, i1 ≠ a, i2 ≠ d, i3 ≠ ! >, where i0, i1, i2, and i3 are the memory locations input [0], input [1], input [2], and input [3], respectively, is obtained.

In the course of the algorithm execution, these constraints are solved to generate, for each element in the list of input data, descendant data satisfying these constraints; the generated data are placed in the list of input data. The program is again run on each element of this list, and the work of the algorithm is resumed.

This process continues until all elements in the list of input data are examined (the pseudocode of the algorithm is shown in Fig. 3). By applying this algorithm to the program shown in Fig. 1 with the initial input string good, we obtain the list of solutions shown in Fig. 2.

Fig. 2.
figure 2

Input data after each iteration of the algorithm.

Fig. 3.
figure 3

Pseudocode of the program code coverage algorithm.

Since the work of this algorithm requires the program to be run many times, Triton implements the option of saving the program state. This significantly improves the performance of the algorithm. Note that a part of the SAGE algorithm [11] designed to decrease the set of input data was not implemented in Triton. For this reason, the tool multiply runs the program under analysis on the input data that do not open new execution paths. In this work, we added new features to the code coverage algorithm that considerably improve the performance.

2.2 Modification of the Coverage Algorithm

In the initial implementation of the SAGE algorithm [11] in Triton, the program always obtained the last element from the list of input data after every iteration without taking into account that the number of program’s basic blocks opened using this element. As a result, together with the input data affecting the code coverage, the input data that did not open any new paths were processed.

Since the algorithm generates descendant data of each input element, the length of the list of input data significantly increases. Therefore, in order to execute the algorithm efficiently, the generated input data should be prioritized.

In the algorithm modification proposed in this paper, each element in the list of input data is assigned a weight equal to the number of basic blocks opened by this element. At the start of the algorithm, the input data are assigned a zero weight. During the first iteration of the algorithm, the weights of the initial input data are calculated.

After each iteration, the weights are updated as follows: the weights of the examined elements are passed to the descendant elements (obtained by solving the logic equations). Thus, the hierarchical traversal of input data is used. Before each run of the program, the element with the maximum weight is chosen from the list of input data. This significantly simplifies the list of solutions as is shown in Fig. 4.

Fig. 4.
figure 4

Input data after each iteration of the algorithm after adding weight.

It is seen in this figure that after adding weights the amount of input data to be examined decreased almost by a factor of two. This significantly improves the performance of the algorithm (in some tests, the performance increased almost by 90%).

Another drawback of Triton is the support of only the programs that accept only command-line arguments at the input. To extend the range of programs that can be analyzed, we added the support of programs that use files as the source of input data. We also added the capability of determining the ranges of input data that will be marked as symbolic ones in the course of the analysis.

The approach to calculating weights described above is not the only possible one, and in future research other techniques for determining weights will be considered.

3 BUG DETECTION

The dynamic analysis of programs is based on the analysis of programs in the course of their execution. This makes it possible to analyze programs taking into account specific execution conditions and use specific values of pointers. A drawback of dynamic analysis is the requirement to have a good code coverage. However, for the detected bugs, input data on which the bug is reproduced can in many cases be generated. The UAF bug is caused by two successive events:

• creation of a dangling pointer;

• memory access using a dangling pointer.

An example of UAF is shown in Fig. 5. After checking the condition in line 3, the memory referenced by the pointer ptr is freed (line 4), and then the control is transferred to line 12, where the same memory location is freed again.

Fig. 5.
figure 5

An example of UAF.

3.1 Triton’s Bug Detection Algorithm

Using the program instrumentation, the algorithm traces the memory allocation (malloc) and freeing (free) functions. At the start of the algorithm, two sets (allocSET and freeSET) are created. They are used to trace the memory locations that were allocated and then freed during the program execution. The elements of these sets are pairs (address, size).

Each time when malloc or free are called, the sets allocSET and freeSET are updated by adding or removing elements with the corresponding address and size of allocated memory. When malloc is called, a new element (address_2, size_2) is created and added to the set allocSET and removed from the second set if it is included in freeSET (i.e., both the address and size coincide).

If only the address of an element matches, then the address and size are additionally processed before updating the sets (if size_2 < size_1, then the element (address_2 + size_2, size_1size_2) is added to freeSET. When the function free is called, the corresponding element is moved from the set allocSET to freeSET.

When memory access instructions are executed, it is checked whether the pointer is in both sets. A UAF bug is registered in two cases: the element is found in freeSET but is not included in allocSet; one and the same element occurs in freeSET more than once. The work of the algorithm is illustrated in Fig. 6.

Fig. 6.
figure 6

An example of the algorithm operation.

3.2 The Proposed Method

To improve the efficiency of bug detection, we propose to combine both algorithms described above. The combined method allows one to find UAF bugs on various execution paths that occur due to checks in deeper and nontrivial parts of the program. Figure 7 shows examples of programs in which double deallocation bugs cannot be found without using information about the code coverage (due to the presence of conditional control transfers that will be executed only if the program is run on specific input data).

Fig. 7.
figure 7

Double deallocation bugs. The first example is based on on the UAF bug in the program libssh. In the first example, the bug can occur in line 26, and in the second example in line 27.

For the examples shown in Fig. 7, the UAF bug occurs if the program execution reaches lines 21 and 23 in the first and the second program, respectively. The proposed method makes it possible to find the input data that guarantee that the desired block in the code (lines 20–21 and 22–23 in Fig. 7) is reached and then check this part of the code for UAF bugs.

3.3 Comparison of Dynamic Analysis Approaches

Table 1 compares the approach described in this paper with the approach used in Mayhem and Triton.

Table 1. Comparison results

4 RESULTS

The proposed method was tested on manually generated benchmarks including those shown in Figs. 1 and 7; the testing results are presented in Fig. 8. These results show that the performance on manually generated benchmarks increased by about 80% compared with the Triton implementation. The trial runs of the analyzer on real-life programs showed that in the majority of cases the number of symbolic equations is so large that the Triton’s code coverage algorithm cannot resolve these equations for all paths.

Fig. 8.
figure 8

Comparison results in terms of the analysis execution time.

To test the proposed approach, we intentionally injected UAF bugs into the code of real-life projects. We analyzed the projects gvgen from the package graphviz, jasper from the package libjasper-runtime, and gif2rgb from the package giflib. In these projects, bugs were found at different injection levels. In the case of gvgen, bugs were injected in a number of functions that call each other (the maximum injection depth was three levels (functions)). The injected code was a conditional expression depending on the input data and the code of the bug itself. The satisfaction of this condition resulted in the occurrence of the bug.

In the projects jasper and gif2rgb, injection was done only in one function due to the complexity of the symbolic equations. The injected code was exactly the bug code. We also selected specific code fragments in real-life programs that contained UAF bugs on which the proposed approach was able to find bugs.

5 CONCLUSIONS

A method for detecting the use-after-free (UAF) bugs occurring due to incorrect processing of dynamic memory pointers is described. The method was implemented using the Triton infrastructure [10] based on the algorithm described in [11] and the UAF bug detection algorithm. The modification and improvement of the existing implementation allowed us to significantly improve the performance of the analysis.