1 INTRODUCTION

Verification of a multithreaded program is always a much more complicated task than verification of a sequential program. Precise computation of all possible interleavings leads to a state explosion. Thus, most verification tools try to perform different kinds of optimizations: partial order reduction [1, 2], counter abstraction [3] and others. Nevertheless, most state-of-the-art tools do not scale well on real-world software, which is confirmed by the software verification competition [4]. Concurrency benchmarks based on Linux device drivers cause significant difficulties for any software model checking tool.

The other side of model checking is static analysis, which is aimed at finding bugs quickly without any confidence in the final verdict. Such tools apply different filters and unsound heuristics to speed up the analysis and thus cannot prove correctness. Our goal was to develop a tool that is a golden mean between precise but slow software model checkers and fast but imprecise static analyzers. It should be easily configured and targeted at a particular task.

The main idea of the suggested approach is the following. As the verification object is a large multithreaded program, we do not consider all possible thread interleavings; instead, we consider every thread separately. In this case the state of every thread becomes partial, i.e. it does not contain information about the other threads and, therefore, cannot describe a complete state of the program. Possible thread interaction is over-approximated by a set of actions which the threads can perform over the shared data, including synchronization primitives. Thus, an approximation of possible actions, or effects, is constructed simultaneously for all threads. We call it an environment. Though the environment is shared by all threads, it does not mean that all its effects can be applied to a thread, as every effect has conditions of its application. The conditions depend on the analysis. Moreover, the conditions may include requirements on certain operators of a program or on thread states.

The precision of the environment defines the precision and speed of the whole tool. The precision may be increased by combining different verification techniques. An implementation was performed on top of the CPAchecker [5] framework, which provides a rich set of verification approaches. The thread-modular approach [6–9] can thus be implemented as one more technique integrated into the CPAchecker framework, complementing the traditional set of techniques, i.e. the CEGAR algorithm [10] and predicate abstraction [11].

Software model checkers usually have the following two stages:

(1) Construction of a set of reachable states.

(2) Checking a property over the set of reached states.

The two steps may be performed sequentially or in parallel, and the reached-set construction may even be driven by the checked property. For example, some static verification tools check for a potential race condition each time a new reached state is added. Such a tool may stop the analysis upon detecting an error. This approach is slow and not practical for data race detection, but it is successfully used for solving reachability tasks or detecting memory leaks.

Data race detection approaches are usually based on the Lockset algorithm [12]. We use a more intelligent way to detect data races: a potential data race is a pair of compatible abstract transitions which modify the same memory. Compatibility here means that the two partial abstract transitions may start from a single global state. Thus, the precision of this definition corresponds to the level of abstraction. For instance, a lock abstraction leads to the classical Lockset algorithm, while predicate analysis with the BnB memory model [13–15] improves the precision of dealing with pointer accesses and allows keeping soundness under reasonable assumptions.
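For instance, under a pure lock abstraction, where a partial state tracks only the set of locks held by the thread, compatibility reduces to the classical Lockset condition (a sketch of this correspondence, not a formal definition from the paper): two partial transitions holding the lock sets \(S_1\) and \(S_2\) may start from one global state exactly when no lock is held by both of them, i.e.

$$compatible(e_1, e_2) \Leftrightarrow S_1 \cap S_2 = \emptyset .$$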

We evaluate our approach on a set of benchmarks based on Linux device drivers. They are prepared by Klever, a framework for verification of large software systems [16, 17], which divides a large codebase into separate verification tasks (usually, one or two Linux modules) and prepares an environment model.

Our contributions are:

– CPA with transition abstractions – an extension of the CPA theory that makes it possible to describe the thread-modular approach;

– Thread-Modular CPA – an implementation of the thread-modular approach in terms of the extended CPA theory, which allows combining the thread-modular approach with other approaches, like predicate abstraction;

– a tool, which was successfully evaluated on benchmarks based on Linux device drivers.

The rest of the paper is organized as follows. In Section 2 we present challenges for state-of-the-art verifiers and our high-level solutions. Section 3 gives an overview of the suggested approach. Section 4 introduces a program model and basic definitions. The following sections are related to the efficient computation of an abstraction of the program. Section 5 describes an extension of the CPA theory. Section 6 presents a thread-modular analysis in terms of CPA. Sections 7–12 contain extended CPAs for the thread-modular approach, including predicate analysis and lock analysis. In Section 13 we discuss some specifics of data race detection with our approach. Section 14 presents the results of our approach on two benchmark sets: the software verification competition (SV-COMP) benchmarks and Linux device drivers. Section 15 gives a brief overview of similar works.

2 MOTIVATING EXAMPLE

Consider an example from SV-COMP’19 [4]. The verification task is based on a real data race. The task contains more than 7 kLOC and 4 artificially created threads: one thread for basic platform callbacks, one for interrupt handling, one for power management operations of the driver and one main thread, which performs init-exit operations. All kernel mutexes and spinlocks are replaced by pthread mutexes. The known data race is encoded as a reachability task in the following way: the value of the shared field is stored into a local variable, and an assertion immediately re-reads the field, so a concurrent write between the two accesses violates the assertion:

tmp = tspi->rst;

assert(tmp == tspi->rst);

The detailed results may be found on the SV-COMP web-site. Most of the state-of-the-art verifiers faced the following problems:

– CBMC: “pointer handling for concurrency is unsound – UNKNOWN”;

– CPAchecker: “Unsupported feature: BDD-analysis does not support arrays”;

– SMACK: “Exception thrown in lockpwn”;

– yogar-cbmc: “out of memory”;

– Ultimate: “Ultimate could not prove your program”.

The main challenge for a verifier in this example is the large number of operations in all threads; many of them are over the same data, which means they affect each other. This leads to the following subproblems, which are usually ignored for small artificial examples:

(1) Analysis of multithreaded programs should be precise enough to produce a small number of false alarms, but efficient enough to be able to solve the task. Many efficient analyses do not support complicated data structures (e.g. BDD analysis [18], analysis of explicit values [19]). And vice versa, precise approaches have problems with long paths (e.g. predicate analysis [20], the BMC approach [21]).

(2) Efficient encoding of synchronization primitives. Many approaches encode locks as variables which are simultaneously checked and assigned (e.g. [7, 9]). This encoding is mixed with other variables and complicates the general analysis.

Note that the race condition in the benchmarks is encoded as a reachability task, thus it is a hint for the verifier about which memory accesses are buggy. In practice a verifier does not know the precise location of the data race, thus it has to check all potential memory accesses. This complicates the task much more. That is why we made efforts to develop an approach which is able to solve the corresponding task.

3 OVERVIEW OF THE APPROACH

As we have already discussed, a data race detection method may be divided into two stages: construction of a set of reached states and checking of requirements, in particular, searching for pairs of states which form a data race. An analysis of a program may be performed by a sequential combination of the phases or by their parallel execution. Note that the requirement check usually takes little time, as it is represented by a condition over a state. Examples of checks are the absence of states of a special kind (reachability task) or the absence of a special pair of states (data race detection task). Construction of the set of reached states usually leads to problems related to efficiency. Thus, further we concentrate on this task.

Consider a simple program which has only two threads (Fig. 1). This is a model example which contains an implicit synchronization between the threads: the first thread initializes some data (in this case, a global variable \(g\)), then sets a flag encoding that the data are ready. The second thread can read the data only after the flag was set, thus there is no data race in this example (Fig. 2).

Fig. 1. An example of code.

Fig. 2. An example of possible thread execution.
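A minimal C sketch of such a program, reconstructed from the description above (the exact code in Fig. 1 may differ; the second thread writes g here to match the later discussion of transitions #A1 and #B4):

#include <pthread.h>

int g = 0; /* shared data                    */
int d = 0; /* flag: "the data are ready"     */

void *thread1(void *arg) {
    g = 1;            /* initialize the data (transition #A1)        */
    d = 1;            /* publish it by setting the flag (transition #A2) */
    return 0;
}

void *thread2(void *arg) {
    if (d == 1) {     /* access the data only after the flag is set  */
        g = 2;        /* guarded access: cannot race with g = 1 above */
    }
    return 0;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, 0, thread1, 0);
    pthread_create(&t2, 0, thread2, 0);
    pthread_join(t1, 0);
    pthread_join(t2, 0);
    return 0;
}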

Classical methods of model checking have to check all possible interleavings of the two threads. The number of states grows rapidly even in a simple example and with different optimizations (Fig. 3), and a so-called “combinatorial explosion” happens, which leads to resource exhaustion. Thus, classical methods of model checking cannot prove the correctness of large real-world software.

Fig. 3. Construction of the interleaving set.

Simple methods of static analysis try to compute a quick overapproximation of thread interaction, a so-called thread effect. However, they are not able to consider complicated dependencies between shared variables. So, for the example in Fig. 1 there is only the information that the global variable d may be set to one. With such an approximation this simple kind of analysis has to conclude that there is a potential data race in the example.

The suggested approach is based on the well-known thread-modular approach. It considers each thread separately but in a special environment, which is also constructed during the analysis. The environment computation is based on the analysis of all threads, as every thread is a part of the environment for the other threads. For each thread, a set of its actions which may affect other threads is collected. The actions include modification of shared variables, acquiring of synchronization primitives and so on. The precision of the environment strongly affects the precision of the whole analysis. However, the main question is how to compute and represent the environment efficiently.

In sequential analysis, there is a successful technique which allows reducing the number of considered program states – abstraction. It allows abstracting from minor details of a program and considering general (abstract) states. Each abstract state may correspond to a set of real (concrete) program states. This idea allows significantly increasing the efficiency of the analysis.


The key idea of the suggested approach is an extension of abstraction not only to the program states but also to the operations of a thread. By adjusting the level of the abstraction, it is possible to choose a balance between speed and precision of the tool.

Figure 4 shows a part of the Abstract Reachability Graph (ARG) for the first and the second threads from the example (Fig. 1). No interaction is considered yet, so this is not the final step of the analysis. The transitions are based on a simple value analysis [19], which tracks only explicit values of variables. A transition contains an abstract state and an abstract operation. The first abstract state in both threads contains the information that both global variables (d and g) are equal to zero. After performing the operation g = 1 (transition #A1) the value of g is updated to 1 in the second abstract state (transition #A2).

Fig. 4. Abstract transitions for two threads without any interaction.

After constructing an ARG for the two threads separately, we need to consider the influence of the threads on each other, i.e. to construct an environment. For every thread operation, we compute its projection – a representation of the operation of one thread as an environment for the other threads. For example, modification of local variables cannot affect other threads, so the corresponding projection is empty. Modification of a global variable may affect other threads, so the projection may be equal to the original transition or overapproximate it, for example, by abstracting from the precise assigned value. A projection may contain not only information about an action but also a condition for performing this action, a so-called \(guard\). Consider the transition #A1. We may represent the corresponding projection in the following way: if the value of g equals zero, it may be changed to one. In other words, the projection consists of two parts: a \(guard\) (\([g == 0]\)) and an \(action\) (\(g \to 1\)). The guard corresponds to the predecessor abstract state and the action corresponds to the operation.

Let us return to the example (Fig. 4). After the computation of transitions in the threads separately, we have to construct the environment, i.e. the set of projections. Figure 5 presents the computed projections for the first thread.

Fig. 5. Application of projections to the second thread.

There are two transitions which may affect the global variables: #A1 and #A2. We compute the corresponding projections: #P1 and #P2. Then every projection has to affect the second thread, i.e. to be applied to all possible (according to the guard) transitions of the second thread. We apply the projection #P1 to the transition #B1, as the state is compatible with the guard of the projection. Only after that may we apply the projection #P2 to the new transition #B2, which requires \(g\) to be equal to one. And only then may the second thread go through a new transition #B3, which discovers new paths. Note that the figure presents only the projections for the first thread; in the complete ARG there should be projections for the second thread as well.

For data race detection we have to find two transitions which modify the same variable. The example has potential candidates: #A1 and #B4. Now, we should check whether the two transitions may be executed in parallel. That means that the corresponding abstract states must be parts of one global state, i.e. they must be \(compatible\). In this case, the partial states contradict each other, as one has \(g \to 0\) and the other \(g \to 1\). So, the corresponding transitions cannot be executed simultaneously. Thus, we conclude there is no data race for g. Note that there is a data race for \(d\) (transitions #A2 and #B2), but it may be considered as lock-free synchronization, and a potential race is a part of its implementation.

The suggested approach provides a lot of possible options and configurations for targeting a particular task. A projection may be represented by a more or less precise abstraction. Several projections may be joined together or considered separately. An example of more abstract transitions is presented in Fig. 6.

Fig. 6. Application of projections to the second thread.

The two initial projections (#P1 and #P2) are joined into a single one (#P3). Usually, this leads to losing some information; for example, here we lost the precise value of the variable g. The action of the projection also becomes more complicated. The second thread cannot identify a precise value of the variables, as both of the variables are now equal to zero or one. The simple kind of analysis operates only with single explicit values of variables, and both variables are considered to be equal to any random value. Now, transitions #A1 and #C3 become compatible, which means that the race is reported on the variable g.

The level of abstraction strongly affects the precision of the analysis and its speed, but the analysis always remains sound.

4 PRELIMINARY DEFINITIONS

In this section, we present preliminary definitions of a parallel program and reachable concrete states in the program.

We restrict the presentation to a simple imperative programming language, where all operations are assignments, assumptions, acquire/release synchronization operations and thread creation. We denote the set of all operations in a program as Ops.

A parallel program is represented by a Control Flow Automaton (CFA) [22], which consists of a set \(L\) of program locations (modeled by a program counter, pc), and a set \(G \subseteq L \times Ops \times L\) of control-flow edges (modeling the operation that is executed when control flows from one program location to another). There is a thread create operation in Ops which creates a thread with an identifier from the set \(T\) and the thread starts from some location in L. The set of program variables that occur in assignment and assumption operations from Ops is denoted by \(X\) having values from \(\mathbb{Z}\). The parts of X, containing local and global variables, are denoted by \({{X}^{{local}}}\) and \({{X}^{{global}}}\) respectively. Acquire/release operations are defined over a set of lock variables \(S\) having values from \(T \cup \{ {{ \bot }_{T}}\} \), where \(t \in T\) means that the lock is acquired by a thread t and \({{ \bot }_{T}}\) means that the lock is not acquired.
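For example, the first thread of the sketch in Section 3 could be modeled by a CFA fragment with three locations and two assignment edges (an illustration of the notation; the locations are named arbitrarily):

$$L \supseteq \{l_0, l_1, l_2\}, \qquad G \supseteq \{(l_0,\ g := 1,\ l_1),\ (l_1,\ d := 1,\ l_2)\}, \qquad X^{global} \supseteq \{g, d\}.$$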

A concrete state of a parallel program is a quadruple of \(({{c}_{{pc}}},{{c}_{l}},{{c}_{g}},{{c}_{s}})\), where

(1) A mapping \({{c}_{{pc}}}:T \to L\) is a partial function from thread identifiers to locations.

(2) A mapping \({{c}_{l}}:T \to {{C}^{{local}}}\) is a partial function from thread identifiers to assignments to local variables \({{C}^{{local}}}\), where \({{C}^{{local}}}:{{X}^{{local}}} \to \mathbb{Z}\) assigns each local variable its value.

(3) \({{c}_{g}}:{{X}^{{global}}} \to \mathbb{Z}\) is an assignment of values to global variables.

(4) \({{c}_{s}}:S \to T \cup \{ {{ \bot }_{T}}\} \) is an assignment of values to lock variables.

A set of all concrete states of a program is denoted by C.
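For instance, a concrete state of the two-thread sketch right after the first thread has executed g = 1 could be (the thread identifiers \(t_1, t_2\) and the locations \(l_1, l'_0\) are illustrative; the sketch has no local or lock variables):

$$c = (c_{pc}, c_l, c_g, c_s), \quad c_{pc} = \{t_1 \mapsto l_1,\ t_2 \mapsto l'_0\}, \quad c_g = \{g \mapsto 1,\ d \mapsto 0\},$$

with \(c_l(t_1)\) and \(c_l(t_2)\) empty and \(c_s\) the empty mapping.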

We define a (labeled) transition relation \(\xrightarrow{{g,t}} \subseteq C \times G \times T \times C\), where \(g \in G\) is an edge and \(t \in T\) is a thread. Note that there is a special ε-transition from every state to itself: \(\forall c \in C\), \(t \in T:c\xrightarrow{{\varepsilon ,t}}c\). This transition is used only for a convenient description of the next state in case there is no normal transition.

We define the set of concrete transitions as \(\mathcal{T} = C \times G \times T\). An element \(\tau \in \mathcal{T}\) is a triple \(\tau = (c,g,t)\). For \({{\tau }_{1}} = ({{c}_{1}},{{g}_{1}},{{t}_{1}})\) and \({{\tau }_{2}} = ({{c}_{2}},{{g}_{2}},{{t}_{2}})\) we write \({{\tau }_{1}} \to {{\tau }_{2}}\) if \(\exists {{c}_{3}} \in C:{{c}_{1}}\xrightarrow{{{{g}_{1}},{{t}_{1}}}}{{c}_{2}}\xrightarrow{{{{g}_{2}},{{t}_{2}}}}{{c}_{3}}\). We denote \(Reac{{h}_{ \to }}(\tau ) = \{ \tau {\kern 1pt} '\,|\,\exists {{\tau }_{1}}, \ldots ,{{\tau }_{n}} \in \mathcal{T}:\tau \to {{\tau }_{1}} \to \ldots \to {{\tau }_{n}} = \tau {\kern 1pt} '\} \).

A transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\) will be formally defined later for each \(g = (l, \cdot ,l{\kern 1pt} ') \in G\). In any case, every correct transition fulfills the following requirements:

(1) Transition starts in state c : \(t \in dom({{c}_{{pc}}}) \wedge {{c}_{{pc}}}(t)\) = l.

(2) Program counter of thread \(t\) proceeds to \(l{\kern 1pt} '\): \(t \in dom(c_{{pc}}^{'}) \wedge c_{{pc}}^{'}(t) = l{\kern 1pt} '\).

(3) Each transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\) may change local parts of the state only for the thread t. Thus, the local parts for the other threads remain unchanged; formally, \(\forall t{\kern 1pt} ' \in T:(t{\kern 1pt} ' \ne t) \wedge (t{\kern 1pt} ' \in dom({{c}_{{pc}}}) \cap dom(c_{{pc}}^{'})) \Rightarrow c_{{pc}}^{'}(t{\kern 1pt} ') = {{c}_{{pc}}}(t{\kern 1pt} ') \wedge c_{l}^{'}(t{\kern 1pt} ') = {{c}_{l}}(t{\kern 1pt} ')\). Note that there are no restrictions on changes in global parts of the state.

We denote by \(eval(c,t,expr)\) the value of expression expr over variables in \({{X}^{{local}}} \cup {{X}^{{global}}}\) with values from state \(c \in C\) and thread \(t \in T\).

Further we define the semantics of the operators in a program: assignment, assumption, acquire/release of locks and the thread create operator. For every operator we define how it changes a concrete state of a program. We do not repeat requirements 1–3, which hold for every transition.

4.1 Assumptions

For an assumption edge g = (l, assume(expr), \(l{\kern 1pt} ') \in G\), \(t \in T\), \(l,l{\kern 1pt} ' \in L\) a transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\), c, \(c{\kern 1pt} ' \in C\) exists, if

\(c = c{\kern 1pt} '\) – transition does not change the state.

\(eval(c,t,expr) \ne 0\) – the expression evaluates to a non-zero value.
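For example, for an assumption edge \((l, assume(d == 1), l') \in G\) in a thread \(t\), a transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\) exists only from states \(c\) with \(eval(c,t,d == 1) \ne 0\), i.e. with \(c_g(d) = 1\); from a state with \(c_g(d) = 0\) no such transition exists.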

4.2 Assignments

For an assignment edge g = (l, assign(x, expr), \(l{\kern 1pt} ') \in G\), \(t \in T\), \(l,l{\kern 1pt} ' \in L\) a transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\), c = (cpc, cl, cg, cs), \(c{\kern 1pt} ' = (c_{{pc}}^{'},c_{l}^{'},c_{g}^{'},c_{s}^{'}) \in C\) exists, if

\(dom(c) = dom(c{\kern 1pt} ')\) – transition does not change the set of threads.

– If \(x \in {{X}^{{local}}}\) then \(c_{l}^{'}(t)(x) = eval(c,t,expr)\) and \(c_{l}^{'}(t)(x{\kern 1pt} ') = {{c}_{l}}(t)(x{\kern 1pt} ')\) for \(x{\kern 1pt} ' \in {{X}^{{local}}}:x{\kern 1pt} ' \ne x\). Otherwise \(c_{l}^{'}(t) = {{c}_{l}}(t)\).

– If \(x \in {{X}^{{global}}}\) then \(c_{g}^{'}(x) = eval(c,t,expr)\) and \(c_{g}^{'}(x{\kern 1pt} ') = {{c}_{g}}(x{\kern 1pt} ')\) for \(x{\kern 1pt} ' \in {{X}^{{global}}}:x{\kern 1pt} ' \ne x\). Otherwise \(c_{g}^{'} = {{c}_{g}}\).

\(c_{s}^{'} = {{c}_{s}}\) – locks stay unchanged.
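For instance, for an assignment edge \((l, g := 1, l')\) executed by a thread \(t\) from a state with \(c_g = \{g \mapsto 0,\ d \mapsto 0\}\), the successor satisfies

$$c'_g = \{g \mapsto 1,\ d \mapsto 0\}, \qquad c'_l = c_l, \qquad c'_s = c_s,$$

as a direct instance of the items above.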

4.3 Synchronization Operations

We define the synchronization primitives acquire/release. We assume that the \(acquire(s)\) operation in a thread \(t \in T\), where \(s \in S\) is a lock variable, has the following semantics: if \(s = {{ \bot }_{T}}\) in the previous state, then s = t in the successor state, and the operation is performed atomically on a single edge (similar to [7]).

For an edge \(g = (l,acquire(s),l{\kern 1pt} ') \in G\) and \(t \in T\), \(l,l{\kern 1pt} ' \in L\), we have a transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\), c = (cpc, cl, cg, \({{c}_{s}})\), \(c{\kern 1pt} ' = (c_{{pc}}^{'},c_{l}^{'},c_{g}^{'},c_{s}^{'}) \in C\), if \({{c}_{s}}(s) = \,{{ \bot }_{T}}\) and (we acquire a lock s by assigning a value t to s and changing location in thread t to \(l{\kern 1pt} '\))

\(dom(c) = dom(c{\kern 1pt} ')\) – transition does not change the set of threads.

\(c_{l}^{'}(t) = {{c}_{l}}(t)\) – local variables are unchanged.

\(c_{g}^{'} = {{c}_{g}}\) – global variables are unchanged.

\(c_{s}^{'}(s) = t\) and \(\forall s{\kern 1pt} ' \in S:s{\kern 1pt} ' \ne s \Rightarrow c_{s}^{'}(s{\kern 1pt} ') = {{c}_{s}}(s{\kern 1pt} ')\).

We assume that the release(s) operation in a thread \(t \in T\) has the semantics of assigning the value \({{ \bot }_{T}}\) to s if s = t.

For an edge \(g = (l,release(s),l{\kern 1pt} ') \in G\) and \(t \in T\), l, \(l{\kern 1pt} '\, \in \,L\), we have a transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\), \(c\, = \,({{c}_{{pc}}},{{c}_{l}},{{c}_{g}}\), cs), \(c{\kern 1pt} ' = (c_{{pc}}^{'},c_{l}^{'},c_{g}^{'},c_{s}^{'}) \in C\), if \({{c}_{s}}(s) = t\) and

\(dom(c) = dom(c{\kern 1pt} ')\) – transition does not change the set of threads.

\(c_{l}^{'}(t) = {{c}_{l}}(t)\) – local variables are unchanged.

\(c_{g}^{'} = {{c}_{g}}\) – global variables are unchanged.

\(c_{s}^{'}(s) = {{ \bot }_{T}}\) and \(\forall s{\kern 1pt} ' \in S:s{\kern 1pt} ' \ne s \Rightarrow c_{s}^{'}(s{\kern 1pt} ') = {{c}_{s}}(s{\kern 1pt} ')\).
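As a worked instance of both definitions, consider a lock \(s \in S\) that is free in a state \(c\), i.e. \(c_s(s) = \bot_T\). Thread \(t\) may take the acquire edge, giving \(c'_s(s) = t\); from \(c'\) only thread \(t\) may take the release edge for \(s\) (the condition \(c'_s(s) = t\) fails for every other thread), after which the lock is free again: \(c''_s(s) = \bot_T\).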

4.4 Thread Creation

We define the semantics of \(thread\_create({{l}_{\nu }})\) such that the current thread proceeds to the location after \(thread\_create\) and a new thread is created with a new identifier \(\nu \in T\), which is added to the local parts of the state. The starting location of thread ν is \({{l}_{\nu }} \in L\).

Note, that we consider programs with unbounded thread creation, as \(thread\_create\) may occur in a loop.

For an edge \(g = (l,thread\_create({{l}_{\nu }}),l{\kern 1pt} ')\), we have a transition \(c\xrightarrow{{g,t}}c{\kern 1pt} '\), \(c = ({{c}_{{pc}}},{{c}_{l}},{{c}_{g}},{{c}_{s}})\), c' = \((c_{{pc}}^{'}\), \(c_{l}^{'},c_{g}^{'},c_{s}^{'}) \in C\), \(\nu \in T\), if

\(\nu \notin dom(c) \wedge dom(c{\kern 1pt} ') = dom(c) \cup \{ \nu \} \) – a thread ν is added to the set of threads.

\(c_{{pc}}^{'}(t) = l{\kern 1pt} '\) – program counter proceeds to \(l{\kern 1pt} '\) in the current thread.

\(c_{{pc}}^{'}(\nu ) = {{l}_{\nu }}\) – program counter of the new thread is lν.

\(c_{l}^{'}(\nu ) = {{c}_{l}}(t)\) – we inherit local part of the state from t.

\(c_{l}^{'}(t) = {{c}_{l}}(t)\) and \(c_{g}^{'} = {{c}_{g}}\) and \(c_{s}^{'} = {{c}_{s}}\) – all the other variables stay unchanged.
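For example, if a thread \(t\) executes the edge \((l, thread\_create(l_\nu), l')\) in a state where \(dom(c_{pc}) = \{t\}\), then the successor has \(dom(c'_{pc}) = \{t, \nu\}\), \(c'_{pc}(t) = l'\), \(c'_{pc}(\nu) = l_\nu\) and \(c'_l(\nu) = c_l(t)\), i.e. the new thread starts at \(l_\nu\) with a copy of the creator's local variables.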

We do not consider join operations, as they complicate the description of the theory. Nevertheless, it is possible to add these operations to the theory.

5 CONFIGURABLE PROGRAM ANALYSIS WITH ABSTRACT TRANSITIONS

In the original CPA theory [5, 22], an abstract state represents a set of concrete states of a program. In our theory, an abstract state is a partial one, so on its own it may not represent any concrete state of a program. That is why the concretization function in our theory differs from the original one: it is defined on a set of elements.

As a consequence, an abstract transition is also a partial one. Thus the analysis cannot guarantee that succeeding concrete transitions are reachable by one step of an abstract transition. In the general case, the analysis needs k steps. In the thread-modular approach k = 2: the analysis performs a usual transition in a thread and then spreads it to all other threads as a transition in the environment. This takes two iterations of the algorithm.

Now we formally define a Configurable Program Analysis with Transition Abstractions \(\mathbb{D} = (D\), Π, merge, stop, prec, \( \rightsquigarrow \)). It consists of an abstract domain D of abstract elements, a set of precisions Π, a merge operator merge, a termination check stop, a precision adjustment function prec and a transfer relation \( \rightsquigarrow \).

(1) The abstract domain \(D = (\mathcal{T},\mathcal{E}, [\kern-0.15em[ \cdot ]\kern-0.15em] )\) is defined by the set \(\mathcal{T}\) of concrete transitions, \(\mathcal{T} \subseteq C \times G \times T\), the semi-lattice \(\mathcal{E}\) of abstract transitions and a concretization function \( [\kern-0.15em[ \cdot ]\kern-0.15em] \). The semi-lattice \(\mathcal{E} = (E, \top , \bot , \sqsubseteq , \sqcup )\) consists of the (possibly infinite) set E of abstract domain elements, a top element \( \top \in E\), a bottom element \( \bot \in E\), a partial order \( \sqsubseteq \subseteq E \times E\) and a function \( \sqcup :E \times E \to E\) (join operator). The function \( \sqcup \) yields the least upper bound of two lattice elements, and the symbols \( \top \) and \( \bot \) denote the least upper bound and the greatest lower bound of the set E respectively.

The concretization function \( [\kern-0.15em[ \cdot ]\kern-0.15em] :{{2}^{E}} \to {{2}^{\mathcal{T}}}\) assigns to each set of abstract transitions \(R \subseteq E\) its meaning, i.e. the set of concrete transitions that it represents. Note, that we use concretization on sets of transitions instead of a single transition. Thus we have

$$\forall R \subseteq E: [\kern-0.15em[ R ]\kern-0.15em] \supseteq \bigcup\limits_{e \in R} \, [\kern-0.15em[ {\text{\{}}e{\text{\}}} ]\kern-0.15em] $$

meaning that the summary knowledge for a set of partial transitions may be bigger than the union of the knowledge for the single (partial) transitions.

(2) The set \(\Pi \) of precisions determines the possible precisions of the abstract domain. The program analysis uses precision elements to keep track of different precisions for different abstract elements. A pair (e, π) is called abstract element e with precision π. The operators on the abstract domain are parametric in the precision. For \(R \subseteq E \times \Pi \) we denote \( [\kern-0.15em[ R ]\kern-0.15em] = [\kern-0.15em[ \bigcup\nolimits_{(e,\pi ) \in R} {\text{\{}}e{\text{\}}} ]\kern-0.15em] \).

(3) The transfer relation \( \rightsquigarrow \subseteq E \times \Pi \times {{2}^{E}} \times E\) assigns to each partial transition \(\hat {e}\) with precision \(\pi \) possible new abstract transitions \(e{\kern 1pt} '\), which are abstract successors of \(\hat {e}\). Note that the result may depend on the reached elements \(R \subseteq E\). We write \((\hat {e},\pi )\mathop \rightsquigarrow \limits^R e{\kern 1pt} '\) if \((\hat {e},\pi ,R,e{\kern 1pt} ') \in \, \rightsquigarrow \).

Let us denote Reachk as

$$\begin{gathered} \forall R \subseteq E:Reac{{h}^{0}}(R) = R \\ \forall k \geqslant 1:Reac{{h}^{{k + 1}}}(R) \\ = \bigcup\limits_{e \in Reac{{h}^{k}}(R)} \,{\text{\{}}e{\kern 1pt} '|e \rightsquigarrow e{\kern 1pt} '{\text{\}}} \cup Reac{{h}^{k}}(R) \\ \end{gathered} $$
(1)

The requirement on the transfer states that Reachk over-approximates the concrete transitions:

$$\exists k \geqslant 1:\forall R\, \subseteq \,E: [\kern-0.15em[ Reac{{h}^{k}}(R) ]\kern-0.15em] \, \supseteq \,\bigcup\limits_{\tau \in [[R]]} {\text{\{}}\tau {\kern 1pt} '|\tau \to \tau {\kern 1pt} '{\text{\}}}$$
(2)

Requirement 2 weakens the requirement on the transfer operator in the classical CPA theory. It means that we may produce the corresponding concrete transitions not after one step of an abstract transition (as in the classical theory), but after k steps. For the thread-modular approach k = 2, as we will see later.

(4) The merge operator \(merge:E \times E \times \Pi \to E\) weakens the second parameter using the information of the first parameter and returns a new abstract element of the precision that is given as the third parameter. The merge operator has to fulfill the following requirement:

$$\forall e,e{\kern 1pt} ' \in E,\pi \in \Pi :e{\kern 1pt} ' \sqsubseteq merge(e,e{\kern 1pt} ',\pi )$$
(3)

(5) The termination check operator stop : E × \({{2}^{{E \times \Pi }}} \times \Pi \to \mathbb{B}\) checks if the abstract element given as the first parameter, with the precision given as the third parameter, is covered by the set of abstract elements given as the second parameter. The termination check can, for example, go through the elements of the set R that is given as the second parameter and search for a single element that subsumes (\( \sqsupseteq \)) the first parameter. The termination check has to fulfill the following requirement:

$$\begin{aligned} \forall e \in E,R \subseteq E,\pi \in \Pi : \\ stop(e,R,\pi ) \Rightarrow \forall \hat {R} \subseteq E: [\kern-0.15em[ {\text{\{}}e{\text{\}}} \cup \hat {R} ]\kern-0.15em] \subseteq [\kern-0.15em[ R \cup \hat {R} ]\kern-0.15em] \\ \end{aligned} $$
(4)

(6) The precision adjustment function prec : E × Π × \({{2}^{{E \times \Pi }}} \to E \times \Pi \) computes a new abstract element and a new precision, for a given abstract element with precision and a set of abstract elements with precision. The precision adjustment function may perform widening of the abstract element, in addition to a change of precision. The precision adjustment function has to fulfill the following requirement:

$$\begin{gathered} \forall e,e{\kern 1pt} ' \in E,\pi ,\pi {\kern 1pt} ' \in \Pi ,R \subseteq E \times \Pi : \\ (e{\kern 1pt} ',\pi {\kern 1pt} ') = prec(e,\pi ,R) \Rightarrow e \sqsubseteq e{\kern 1pt} ' \\ \end{gathered} $$
(5)

The precision domain Π, the termination check stop, the merge operator merge and the precision adjustment prec are the same as in the original CPA theory. The main algorithm, which computes a set of reached abstract transitions, also stays the same except for the extension of the transfer operator.

Data: a configurable program analysis with partial states \(\mathbb{D}\), an initial abstract transition \({{e}_{0}}\) with precision \({{\pi }_{0}} \in \Pi \), a set reached of elements of E, a set waitlist of elements of E

Result: a set of reachable states reached

\(waitlist: = {\text{\{}}({{e}_{0}},{{\pi }_{0}}){\text{\}}}\);
\(reached: = {\text{\{}}({{e}_{0}},{{\pi }_{0}}){\text{\}}}\);
while \(waitlist \ne \emptyset \) do
  pop \((e,\pi )\) from \(waitlist\);
  for each \(e{\kern 1pt} '\) such that \((e,\pi )\mathop \rightsquigarrow \limits^{reached} e{\kern 1pt} '\) do
    \((\hat {e},\hat {\pi }) := prec(e{\kern 1pt} ',\pi ,reached)\);
    for each \((e{\kern 1pt} '',\pi {\kern 1pt} '') \in reached\) do
      \({{e}_{{new}}} := merge(\hat {e},e{\kern 1pt} '',\hat {\pi })\);
      if \({{e}_{{new}}} \ne e{\kern 1pt} ''\) then
        \(waitlist := (waitlist{\backslash \{ }(e{\kern 1pt} '',\pi {\kern 1pt} ''){\text{\}}}) \cup {\text{\{}}({{e}_{{new}}},\pi {\kern 1pt} ''){\text{\}}}\);
        \(reached := (reached{\backslash \{ (}e{\kern 1pt} '',\pi {\kern 1pt} ''){\text{\}}}) \cup {\text{\{}}({{e}_{{new}}},\pi {\kern 1pt} ''){\text{\}}}\);
      end
    end
    if \(\neg stop(\hat {e},reached,\hat {\pi })\) then
      \(waitlist := waitlist \cup {\text{\{}}(\hat {e},\hat {\pi }){\text{\}}}\);
      \(reached := reached \cup {\text{\{}}(\hat {e},\hat {\pi }){\text{\}}}\);
    end
  end
end

Algorithm 1. Algorithm \(CPA(\mathbb{D},{{e}_{0}},{{\pi }_{0}})\).

For the algorithm, the main Theorem 1 may be proven even with weaker requirements. The proof replicates the proof of the classical theorem.

Theorem 1. (Soundness) For a given configurable program analysis with transition abstractions \(\mathbb{D}\) and an initial abstract state e0 with precision π0, Algorithm CPA computes a set of abstract states that overapproximates the set of reachable concrete states:

$$ [\kern-0.15em[ CPA(\mathbb{D},{{e}_{0}},{{\pi }_{0}}) ]\kern-0.15em] \supseteq Reac{{h}_{ \to }}( [\kern-0.15em[ {\text{\{}}{{e}_{0}}{\text{\}}} ]\kern-0.15em] ).$$

Note that several CPAs are used in practice for the analysis of source code. The set of CPAs has a tree structure, where a root CPA provides its operators to Algorithm 1. Different CPAs interact with each other to increase the precision of the whole analysis. Section 6 presents such a root CPA, which requires a nested CPA. Section 7 presents a CPA which implements a parallel composition of analyses. Examples of nested CPAs are presented in Sections 8–12. Section 13 describes the data race detection approach, which is based on the considered CPAs.

Some auxiliary CPAs (CallstackCPA, AutomatonCPA), which do not implement any kind of analysis, may be applied without changes compared to the original version of the theory. Thus, they will not be described further. CPAs which implement different kinds of analysis (analysis of explicit values, predicate analysis, etc.) should be modified to support transitions in the environment. These CPAs will be described in the corresponding sections.

6 THREAD-MODULAR ANALYSIS

In this section, we present a CPA which implements the thread-modular functionality. The main task of this CPA is the computation of a potential effect on the environment, i.e. a projection, and also the application of the projections to the corresponding thread. ThreadModularCPA requires a nested CPA with an extended set of operators: a projection operator, a compatibility check operator and a composition operator. Thus, first, we define an extended inner CPA with the new operators.

6.1 Extended Inner CPA for Thread-Modular Analysis

A definition of CPA for thread-modular analysis is extended with three new operators: \(\mathbb{I} = ({{D}_{I}},\) \({{\Pi }_{I}},{{ \rightsquigarrow }_{I}},\) \(merg{{e}_{I}},sto{{p}_{I}},\) \(pre{{c}_{I}},\) \(compatibl{{e}_{I}},\)⋅|p, \(compos{{e}_{I}}\)).

An abstract domain \({{D}_{I}} = ({{\mathcal{T}}_{I}},{{\mathcal{E}}_{I}},{{ \oplus }_{I}})\) consists of a set of concrete transitions \({{\mathcal{T}}_{I}}\), a semi-lattice of abstract transitions \({{\mathcal{E}}_{I}}\) and a composition operator \({{ \oplus }_{I}}\). Note that an inner analysis is required to define not a concretization function \( [\kern-0.15em[ \cdot ]\kern-0.15em] \) but a composition operator \( \oplus \), because the thread-modular approach requires a unified schema for the calculation of concrete states.

As we have already discussed, all states and transitions are partial, so they may not be directly related to concrete states and transitions. To get a complete transition, we should compose a set of partial transitions which represent all available threads. Compatible partial transitions can be composed into a complete concrete transition with an operator \( \oplus : E \times T \times {{2}^{{E \times T}}} \to {{2}^{\mathcal{T}}}\). It returns the set of concrete transitions which is represented by the given partial transitions.

\( \oplus \) is required to be consistent with the semi-lattice: if one abstract transition is less than another, its composition with the same set must not yield a larger set of concrete transitions.
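A possible formalization of this consistency requirement (our reading of the informal statement above) is

$$\forall e_1, e_2 \in E,\ t \in T,\ R \subseteq E \times T: \quad e_1 \sqsubseteq e_2 \;\Rightarrow\; \oplus\left(\binom{e_1}{t}, R\right) \subseteq \oplus\left(\binom{e_2}{t}, R\right).$$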

A compatibility operator \(compatibl{{e}_{I}}\) : E × E \( \to {\text{\{}}true,false{\text{\}}}\) checks if two partial transitions can be started from a common complete predecessor.

The projection operator ⋅|p: \(E \to E\) projects a transition of one thread to another thread. For example, a projection may contain modifications of global variables and omit changes of thread-local data.

\(compos{{e}_{I}}:E \times E \to E\) combines two abstract transitions into a single one. It applies an abstract edge from one transition to an abstract state of the other transition.

Further we will use operator \(appl{{y}_{I}}\) as a combination of the three operators: ⋅|p, composeI and \(compatibl{{e}_{I}}\):

$$\begin{gathered} \forall e,e{\kern 1pt} ' \in E:apply(e,e{\kern 1pt} ') \\ = \left\{ \begin{gathered} compos{{e}_{I}}(e,e{\kern 1pt} '{{{\text{|}}}_{p}}),\quad {\text{if}}\quad compatibl{{e}_{I}}(e,e{\kern 1pt} '{{{\text{|}}}_{p}}) \hfill \\ \bot ,\quad {\text{else}}{\text{.}} \hfill \\ \end{gathered} \right. \\ \end{gathered} $$
(6)

Thus, apply means that the transitions may be composed only if they are compatible. The operator produces an applied transition, which is also called a transition in the environment, because it represents the effect of the environment.

6.2 Thread-Modular CPA

Define a thread-modular CPA \(\mathbb{T}\mathbb{M} = ({{D}_{{TM}}},\) ΠTM, \({{ \rightsquigarrow }_{{TM}}},\) \(merg{{e}_{{TM}}},sto{{p}_{{TM}}},\) \(pre{{c}_{{TM}}})\), which is based on an inner CPA \(\mathbb{I} = ({{D}_{I}},\) \({{\Pi }_{I}},{{ \rightsquigarrow }_{I}},\) \(merg{{e}_{I}},sto{{p}_{I}},\) \(pre{{c}_{I}},\) \(compatibl{{e}_{I}},\) ⋅|p, \(compos{{e}_{I}}\)).

(1) The abstract domain \({{D}_{{TM}}} = (\mathcal{T},\mathcal{E},{{ [\kern-0.15em[ \cdot ]\kern-0.15em] }_{P}})\).

The complete semi-lattice \(\mathcal{E} = {{\mathcal{E}}_{I}}\) is equal to the one of the inner analysis.

For thread-modular analyses a concretization function \( [\kern-0.15em[ \cdot ]\kern-0.15em] \) means all possible compositions of partial transitions:

$$\begin{gathered} \forall R \subseteq E:{{ [\kern-0.15em[ R ]\kern-0.15em] }_{P}} \\ = \bigcup\limits_{k = 1}^\infty \,\bigcup\limits_{\begin{array}{*{20}{c}} {{{e}_{0}},{{e}_{1}}, \ldots ,{{e}_{k}} \in R} \\ {{{t}_{0}},{{t}_{1}}, \ldots ,{{t}_{k}} \in T} \end{array}} \,\mathop \oplus \nolimits_I \left( {\left( {\begin{array}{*{20}{c}} {{{e}_{0}}} \\ {{{t}_{0}}} \end{array}} \right),\left\{ {\left( {\begin{array}{*{20}{c}} {{{e}_{1}}} \\ {{{t}_{1}}} \end{array}} \right), \ldots ,\left( {\begin{array}{*{20}{c}} {{{e}_{k}}} \\ {{{t}_{k}}} \end{array}} \right)} \right\}} \right) \\ \end{gathered} $$
(7)

(2) The transfer relation computes the next transitions; after that, it applies all reached transitions as transitions in the environment to the new one, and applies the new transition as a transition in the environment to all reached transitions.

Data: a preceding transition \({{e}_{0}}\) with precision \({{\pi }_{0}} \in \Pi \)

Result: a set of succeeding transitions \(result\)

\(result: = \emptyset \);
for each \(\hat {e}\) such that \({{e}_{0}}{{\mathop \rightsquigarrow \limits^R }_{I}}\hat {e}\) do
  \(result: = result \cup {\text{\{}}\hat {e}{\text{\}}}\);
  for each \(e{\kern 1pt} ' \in reached\) do
    \(result: = result \cup {\text{\{}}apply(e{\kern 1pt} ',\hat {e}){\text{\}}}\);
    \(result: = result \cup {\text{\{}}apply(\hat {e},e{\kern 1pt} '){\text{\}}}\);
  end
end
return \(result\)

Algorithm 2. transferTM(e0, π0, reached)

(3) \({{\Pi }_{{TM}}} = {{\Pi }_{I}}\)

(4) \(merg{{e}_{{TM}}} = merg{{e}_{I}}\).

(5) \(sto{{p}_{{TM}}} = sto{{p}_{I}}\).

(6) \(pre{{c}_{{TM}}} = pre{{c}_{I}}\).

7 COMPOSITE ANALYSIS

A composite analysis is used to combine different kinds of analyses. Examples of the inner CPAs will be presented further.

CompositeCPA \(\mathcal{C} = ({{D}_{\mathcal{C}}},\) \({{\Pi }_{\mathcal{C}}},{{ \rightsquigarrow }_{\mathcal{C}}},\) \(merg{{e}_{\mathcal{C}}},sto{{p}_{\mathcal{C}}},\) \(pre{{c}_{\mathcal{C}}}\), \(compatibl{{e}_{\mathcal{C}}}, \cdot {{|}_{p}}\), \(compos{{e}_{\mathcal{C}}}\)) operates with inner analyses \({{\Delta }_{i}} = ({{D}_{{{{\Delta }_{i}}}}},\) \({{\Pi }_{{{{\Delta }_{i}}}}},{{ \rightsquigarrow }_{{{{\Delta }_{i}}}}},\) \(merg{{e}_{{{{\Delta }_{i}}}}},sto{{p}_{{{{\Delta }_{i}}}}},\) \(pre{{c}_{{{{\Delta }_{i}}}}},\) \(compatibl{{e}_{{{{\Delta }_{i}}}}}\),\( \cdot {{|}_{p}}\), \(compos{{e}_{{{{\Delta }_{i}}}}}\))

(1) \({{D}_{\mathcal{C}}} = {{D}_{{{{\Delta }_{1}}}}} \times \ldots \times {{D}_{{{{\Delta }_{n}}}}}\)

$$\begin{gathered} \forall {{e}_{0}}, \ldots ,{{e}_{m}} \in {{E}_{\mathcal{C}}},\forall 0 \leqslant i \leqslant m:{{e}_{i}} = (e_{i}^{1}, \ldots ,e_{i}^{n}) \\ \mathop \oplus \nolimits_\mathcal{C} \left( {\left( {\begin{array}{*{20}{c}} {{{e}_{0}}} \\ {{{t}_{0}}} \end{array}} \right),\left\{ {\left( {\begin{array}{*{20}{c}} {{{e}_{1}}} \\ {{{t}_{1}}} \end{array}} \right), \ldots ,\left( {\begin{array}{*{20}{c}} {{{e}_{m}}} \\ {{{t}_{m}}} \end{array}} \right)} \right\}} \right) = \mathop \oplus \nolimits_{{{\Delta }_{1}}} \left( {\left( {\begin{array}{*{20}{c}} {e_{0}^{1}} \\ {{{t}_{0}}} \end{array}} \right),\left\{ {\left( {\begin{array}{*{20}{c}} {e_{1}^{1}} \\ {{{t}_{1}}} \end{array}} \right), \ldots ,\left( {\begin{array}{*{20}{c}} {e_{m}^{1}} \\ {{{t}_{m}}} \end{array}} \right)} \right\}} \right) \\ \cap \, \ldots \, \cap \mathop {\, \oplus }\nolimits_{{{\Delta }_{n}}} \left( {\left( {\begin{array}{*{20}{c}} {e_{0}^{n}} \\ {{{t}_{0}}} \end{array}} \right),\left\{ {\left( {\begin{array}{*{20}{c}} {e_{1}^{n}} \\ {{{t}_{1}}} \end{array}} \right), \ldots ,\left( {\begin{array}{*{20}{c}} {e_{m}^{n}} \\ {{{t}_{m}}} \end{array}} \right)} \right\}} \right) \\ \end{gathered} $$

(2) \({{\Pi }_{\mathcal{C}}} = {{\Pi }_{{{{\Delta }_{1}}}}} \times \ldots \times {{\Pi }_{{{{\Delta }_{n}}}}}\)

(3) Transfer relation applies inner transfers to corresponding part of the state.

$$\forall {{e}_{1}},{{e}_{2}} \in E:{{e}_{1}}{{ \rightsquigarrow }_{\mathcal{C}}}{{e}_{2}} \Leftrightarrow \forall 1 \leqslant i \leqslant n:e_{1}^{i}{{ \rightsquigarrow }_{{{{\Delta }_{i}}}}}e_{2}^{i}$$

(4) The merge operator calls the inner merge operators: \(\forall {{e}_{1}},{{e}_{2}} \in E,\pi \in \Pi :merg{{e}_{\mathcal{C}}}({{e}_{1}},{{e}_{2}},\pi ) = (merg{{e}_{{{{\Delta }_{1}}}}}(e_{1}^{1},e_{2}^{1},{{\pi }_{1}}), \ldots ,merg{{e}_{{{{\Delta }_{n}}}}}(e_{1}^{n},e_{2}^{n},{{\pi }_{n}}))\).

(5) The stop operator calls the inner stop operators: \(\forall e \in E,R \subseteq E,\pi \in \Pi :sto{{p}_{\mathcal{C}}}(e,R,\pi ) = \exists \hat {e} \in R\,\forall 1 \leqslant i \leqslant n:sto{{p}_{{{{\Delta }_{i}}}}}({{e}^{i}},{\text{\{}}{{\hat {e}}^{i}}{\text{\}}},{{\pi }_{i}})\).

(6) \(pre{{c}_{\mathcal{C}}}(e,R,\pi ) = (pre{{c}_{{{{\Delta }_{1}}}}}({{e}^{1}},R,{{\pi }_{1}})\), ..., \(prec_{{{{\Delta }_{n}}}}^{E}({{e}^{n}}\), R, πn)).

(7) \(compatibl{{e}_{\mathcal{C}}}({{e}_{1}},{{e}_{2}})\) = \(compatibl{{e}_{{{{\Delta }_{1}}}}}(e_{1}^{1},e_{2}^{1}) \wedge \) ... \( \wedge compatibl{{e}_{{{{\Delta }_{n}}}}}(e_{1}^{n},e_{2}^{n})\).

(8) \(e{{{\text{|}}}_{p}} = ({{e}^{1}}{{{\text{|}}}_{p}}, \ldots ,{{e}^{n}}{{{\text{|}}}_{p}})\).

(9) \(compos{{e}_{\mathcal{C}}}({{e}_{1}},{{e}_{2}})\) = \((compos{{e}_{{{{\Delta }_{1}}}}}(e_{1}^{1},e_{2}^{1})\), ..., \(compos{{e}_{{{{\Delta }_{n}}}}}(e_{1}^{n},e_{2}^{n}))\).

8 LOCATION ANALYSIS

This section presents a simple Location Analysis \(\mathbb{L} = ({{D}_{L}},\) \({{\Pi }_{L}},{{ \rightsquigarrow }_{L}},\) \(merg{{e}_{L}},sto{{p}_{L}},\) \(pre{{c}_{L}},\) \(compatibl{{e}_{L}}\), ·|p, \(compos{{e}_{L}}\)) which tracks abstract locations. The analysis extends the original LocationCPA to support the thread-modular analysis.

(1) \({{D}_{L}} = (\mathcal{T},{{\mathcal{E}}_{L}},{{ \oplus }_{L}})\).

An abstract transition consists of an abstract state \(s \in E_{L}^{S}\) and an abstract edge \(q \in E_{L}^{T}\).

\(E_{L}^{S}\) is a set of abstract program locations, which are mapped to concrete CFA locations with the function loc : \(E_{L}^{S} \to {{2}^{L}}\). \( \top _{L}^{S}\) means that the analysis does not know the particular program location; formally, \(loc( \top _{L}^{S}) = L\). In the general case the analysis is able to operate with abstract locations which express several concrete locations, but further we describe a simple version of the analysis, which considers only single locations: \(\forall s \in E_{L}^{S}:s = \top _{L}^{S} \vee s = \bot _{L}^{S} \vee loc(s) = \{ l\} ,l \in L\). The defined \(\mathcal{E}_{L}^{S}\) is a flat lattice, meaning that two different locations are not comparable.

$$\begin{gathered} \forall {{s}_{1}},{{s}_{2}} \in E_{L}^{S}{\text{:}} \\ {{s}_{1}} \sqsubseteq {{s}_{2}} \Leftrightarrow {{s}_{1}} = \, \bot _{L}^{S} \vee \,{{s}_{2}} = \, \top _{L}^{S} \\ {{s}_{1}}\, \sqcup \,{{s}_{2}} = \left\{ \begin{gathered} {{s}_{1}},\quad {\text{if}}\quad {{s}_{2}} = \, \bot _{L}^{S} \hfill \\ {{s}_{2}},\quad {\text{if}}\quad {{s}_{1}} = \, \bot _{L}^{S} \hfill \\ \top _{L}^{S},\quad {\text{if}}\quad {{s}_{1}} \ne \, \bot _{L}^{S} \wedge \,{{s}_{2}} \ne \, \bot _{L}^{S} \hfill \\ \end{gathered} \right. \\ \end{gathered} $$

An abstract edge is based on an ordinary CFA edge, containing only a source location and a destination location: \(E_{L}^{T} \subseteq E_{L}^{S} \times E_{L}^{S}\).

$$\begin{gathered} \forall {{q}_{1}},{{q}_{2}} \in E_{L}^{T}:{{q}_{{1,2}}} = ({{s}_{{1,2}}},{{d}_{{1,2}}}) \\ {{q}_{1}} \sqsubseteq {{q}_{2}} \Leftrightarrow loc({{s}_{1}}) \subseteq loc({{s}_{2}}) \wedge loc({{d}_{1}}) \subseteq loc({{d}_{2}}) \\ {{q}_{3}} = {{q}_{1}} \sqcup {{q}_{2}} \Leftrightarrow loc({{s}_{3}}) \\ = loc({{s}_{1}}) \cup loc({{s}_{2}}) \wedge loc({{d}_{3}}) = loc({{d}_{1}}) \cup loc({{d}_{2}}) \\ \end{gathered} $$

Define an auxiliary operator to check if partial abstract transitions can be composed into a complete one.

$$\begin{gathered} \forall {{e}_{0}},{{e}_{1}}, \ldots ,{{e}_{n}} \in E,{{e}_{i}} = ({{s}_{i}},{{q}_{i}}),{{q}_{i}} = (s_{i}^{{src}},s_{i}^{{dst}}){\text{:}} \\ {{C}_{{check}}}({{e}_{0}},{\text{\{}}{{e}_{1}}, \ldots ,{{e}_{n}}{\text{\}}}) \\ = loc({{s}_{0}}) \cap loc(s_{0}^{{src}}) \cap \ldots \cap loc(s_{n}^{{src}}) \ne \emptyset \\ \wedge \,loc(s_{1}^{{dst}}) \cap \ldots \cap loc(s_{n}^{{dst}}) \ne \emptyset \\ \end{gathered} $$
(8)

Requirement 8 claims that all partial transitions may be related to the same concrete transition: they have the same source location and the same destination location.
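For example, if \(loc(s_0) = loc(s_0^{src}) = \ldots = loc(s_n^{src}) = \{l\}\) and \(loc(s_1^{dst}) = \ldots = loc(s_n^{dst}) = \{l'\}\) for some \(l, l' \in L\), then both intersections are non-empty and the check holds: all partial transitions may describe the same concrete step from \(l\) to \(l'\).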

The most complicated part is the definition of composition operator.

$$\begin{gathered} \forall {{e}_{1}}, \ldots ,{{e}_{n}} \in E,\quad {{e}_{i}} = ({{s}_{i}},{{q}_{i}}),\quad {{q}_{i}} = (s_{i}^{{src}},s_{i}^{{dst}}): \\ \mathop \oplus \nolimits_L \left( {\left( {\begin{array}{*{20}{c}} {{{e}_{0}}} \\ {{{t}_{0}}} \end{array}} \right),\left\{ {\left( {\begin{array}{*{20}{c}} {{{e}_{1}}} \\ {{{t}_{1}}} \end{array}} \right), \ldots ,\left( {\begin{array}{*{20}{c}} {{{e}_{n}}} \\ {{{t}_{n}}} \end{array}} \right)} \right\}} \right) \\ = \left\{ \begin{gathered} \left\{ {\begin{array}{*{20}{c}} {(c,g,{{t}_{0}}) \in \mathcal{T}}&{\kern 1pt} \\ {c\, = \,({{c}_{{pc}}},{{c}_{l}},{{c}_{g}},{{c}_{s}})}&{\kern 1pt} \end{array}\left| {\begin{array}{*{20}{c}} {{{c}_{{pc}}} = \left\{ {\begin{array}{*{20}{c}} {{{t}_{0}} \to {{l}_{0}},} \\ \ldots \\ {{{t}_{n}} \to {{l}_{n}}} \end{array}} \right\},} \\ {{{s}_{i}} \ne \, \top _{L}^{S}{{l}_{i}} = {{s}_{i}}} \\ {g\, \in \,G,\,g\, = \,({{l}_{0}},\,op,\,{{l}_{1}}):} \\ {{{l}_{1}} \in loc(s_{0}^{{dst}}) \cap \ldots \cap loc(s_{n}^{{dst}})} \end{array}} \right.} \right\} \hfill \\ \emptyset ,\quad {\text{otherwise}} \hfill \\ \end{gathered} \right., \\ {\text{if}}\quad {{C}_{{check}}}({{e}_{0}},{\text{\{}}{{e}_{1}}, \ldots ,{{e}_{j}}{\text{\}}}) \\ \end{gathered} $$
(9)

(2) \({{\Pi }_{L}} = {\text{\{}}{{\pi }_{0}}{\text{\}}}\). The location state does not depend on the precision, so there is only one dummy precision element.

(3) The transition \(e{{\mathop \rightsquigarrow \limits^R }_{L}}e{\kern 1pt} '\) exists if the transition in a thread changes the location according to the abstract edge. More formally, let \(e = (s,q)\), \(e{\kern 1pt} ' = (s{\kern 1pt} ',q{\kern 1pt} ')\), q = \((pred\), suc), \(q{\kern 1pt} '\, = \,(pred{\kern 1pt} '\), suc'). Then \(loc(s) \cap loc(pred) \ne \emptyset \), \(\exists g{\kern 1pt} ' \in G:g{\kern 1pt} ' = ({{l}_{1}},op,{{l}_{2}}) \wedge {{l}_{1}} \in loc(pred{\kern 1pt} ') \cap loc(s{\kern 1pt} ') \wedge {{l}_{2}} \in loc(suc{\kern 1pt} ')\) and \(s{\kern 1pt} ' = suc\).

(4) The merge operator does not join states: \(merg{{e}_{L}}(e,e{\kern 1pt} ',\pi )\) = e'.

(5) The termination check considers abstract states individually: \(sto{{p}_{L}}(e,R,\pi ) = (e \in R)\).

(6) The precision is never adjusted: \(pre{{c}_{L}}(e,\pi ,R)\) = (e, π).

(7) \(e{\text{|}}_{p}^{L} = e\).

(8) \(\forall e,e{\kern 1pt} ' \in E,e = (s,q)\), e' = (s', q') : composeL(e, e') = \(\tilde {e} = (s,\hat {q})\), where \(\hat {q} = (s,s)\), as a transition in the environment does not change the location of the current thread.

(9) \(\forall {{e}_{1}},{{e}_{2}} \in E:compatible({{e}_{1}},{{e}_{2}}) \equiv true\)

9 THREAD ANALYSIS

We define a simple Thread Analysis \(\mathbb{T} = ({{D}_{T}},\) ΠT, \({{ \rightsquigarrow }_{T}},\) \(merg{{e}_{T}},sto{{p}_{T}},\) \(pre{{c}_{T}},\) \(compatibl{{e}_{T}}\), ⋅|p, \(compos{{e}_{T}}\)) which tracks thread identifiers.

The Thread Analysis inherits the limitations of [6] and is restricted to programs with bounded thread creation. We suppose that the program has a finite number of threads, identified by the locations where they are created, i.e. \(T \subseteq L\), and for \(thread\_create(p{{c}_{\nu }})\) we always create a thread with identifier \(p{{c}_{\nu }}\). Note that the other analyses are not bounded.

(1) The domain DT is based on the flat lattice for the set of threads T: \({{D}_{T}} = (C \times G \times T,{{\mathcal{E}}_{T}},{{ \oplus }_{T}})\), with \({{\mathcal{E}}_{T}} = \mathcal{E}_{T}^{S} \times \mathcal{E}_{T}^{T}\). \(E_{T}^{S} = T \cup \{ \bot _{T}^{S}, \top _{T}^{S}\} \), \( \bot _{T}^{S}\, \sqsubseteq _{T}^{S}t \sqsubseteq _{T}^{S}\, \top _{T}^{S}\) and \(t \ne t{\kern 1pt} ' \Rightarrow t\, {\not \sqsubseteq }{}_T^S \,t{\kern 1pt} '\) for all elements \(t,t{\kern 1pt} ' \in T\) (this implies \( \bot _{T}^{S} \sqcup _{T}^{S}t = t\), \( \top _{T}^{S} \sqcup t = \, \top _{T}^{S}\), \(t \sqcup _{T}^{S}t{\kern 1pt} ' = \, \top _{T}^{S}\) for all elements \(t,t{\kern 1pt} ' \in T\), \(t \ne t{\kern 1pt} '\))

$$E_{T}^{T} = {\text{\{}} \bot _{T}^{T},\varepsilon , \top _{T}^{T}{\text{\}}} \cup G$$

Define an auxiliary operator to check if partial abstract transitions can be composed into a complete one.

$$\begin{gathered} \forall {{e}_{1}}, \ldots ,{{e}_{n}} \in E,\quad {{e}_{i}}\, = \,({{s}_{i}},{{q}_{i}}),\quad {{s}_{i}} \in E_{T}^{S},\quad {{q}_{i}} \in E_{T}^{T} \\ {{C}_{{check}}}({\text{\{}}{{e}_{1}}, \ldots ,{{e}_{n}}{\text{\}}})\, = \,\forall k\, \ne \,m\,:\,{{s}_{k}}\, \ne \,{{s}_{m}} \\ \end{gathered} $$
(10)

Now define a composition operator \({{ \oplus }_{T}}\)

$$\begin{gathered} \forall {{e}_{0}},{{e}_{1}}, \ldots ,{{e}_{n}} \in E, \\ {{e}_{i}} = ({{s}_{i}},{{q}_{i}}),\quad {{t}_{0}},{{t}_{1}}, \ldots ,{{t}_{n}} \in T,\quad {{t}_{i}} \ne {{t}_{j}} \\ {{ \oplus }_{T}}\left( {\left( {\begin{array}{*{20}{c}} {{{e}_{0}}} \\ {{{t}_{0}}} \end{array}} \right),\left\{ {\left( {\begin{array}{*{20}{c}} {{{e}_{1}}} \\ {{{t}_{1}}} \end{array}} \right), \ldots ,\left( {\begin{array}{*{20}{c}} {{{e}_{n}}} \\ {{{t}_{n}}} \end{array}} \right)} \right\}} \right) \\ = \left\{ \begin{gathered} \left\{ {\begin{array}{*{20}{c}} {(c,g,{{t}_{0}}) \in \mathcal{T}} \\ {({{c}_{{pc}}},{{c}_{l}},{{c}_{g}},{{c}_{s}}) \in C} \end{array}\left| {\begin{array}{*{20}{c}} {dom({{c}_{{pc}}}) = {\text{\{}}{{s}_{1}}, \ldots ,{{s}_{j}}{\text{\}}}} \\ {dom({{c}_{l}}) = {\text{\{}}{{s}_{1}}, \ldots ,{{s}_{j}}{\text{\}}}} \\ {g \in G,{{t}_{0}} = {{s}_{0}}} \end{array}} \right.} \right\}, \hfill \\ \emptyset ,\quad {\text{otherwise}} \hfill \\ \end{gathered} \right. \\ {\text{if}}\quad {{C}_{{check}}}({\text{\{}}{{e}_{0}}, \ldots ,{{e}_{n}}{\text{\}}}) \\ \end{gathered} $$
(11)

(2) There is only one empty precision: \({{\Pi }_{T}} = {\text{\{}}\emptyset {\text{\}}}\).

(3) The transfer relation \({{ \rightsquigarrow }_{T}}\) has the transfer \(\tau \,{{ \rightsquigarrow }_{T}}\,\tau {\kern 1pt} '\), \(\tau = (s,g)\), \(g = ( \cdot ,op, \cdot )\) if

\(op \ne t{{c}_{{child}}}\) (the syntactical successor in the CFA without considering the semantics of the operation \(op\)), \(s = s{\kern 1pt} '\), \(g{\kern 1pt} ' \in G\).

\(op = t{{c}_{{child}}}({{l}_{\nu }})\) then \(s{\kern 1pt} ' = {{l}_{\nu }}\), \(g{\kern 1pt} ' \in G\)

(4) The merge operator does not combine elements when control flow meets: \(merg{{e}_{T}}(e,e{\kern 1pt} ',\pi ) = e{\kern 1pt} '\).

(5) The termination check considers abstract states individually: \(sto{{p}_{T}}(e,R,\pi ) = (e \in R)\).

(6) The precision is never adjusted: \(prec_{T}^{E}(e,\pi ,R)\) = (e, π).

(7) Definition of the projection:

$$\forall e \in E,e = (s,g):e{{{\text{|}}}_{p}} = \left\{ \begin{gathered} e,\quad {\text{if}}\quad g \notin G \hfill \\ (s,\varepsilon ),\quad {\text{otherwise}} \hfill \\ \end{gathered} \right.$$

(8) \(\forall e,e{\kern 1pt} ' \in {{E}_{T}},e = (s,q)\), e' = (s', q') : composeT(e, e') = \(\tilde {e}\) = (s, q').

(9) \(\forall {{e}_{1}},{{e}_{2}} \in E:compatibl{{e}_{T}}({{e}_{1}},{{e}_{2}}) = {{C}_{{check}}}({\text{\{}}{{e}_{1}},{{e}_{2}}{\text{\}}})\).

10 PREDICATE ANALYSIS

In this section, we present the commonly used predicate analysis [20] with transition abstraction. Predicate transitions consist of two parts: a predicate state and a predicate abstract edge, which can be expressed by a normal CFA edge or by a formula encoding an operation. However, the formulas should be renamed to avoid collisions of local variables.

Let \(\mathcal{P}\) be the set of formulas over program variables in a quantifier-free theory \(\mathcal{T}\). Let \(\pi \subseteq \mathcal{P}\) be a set of predicates. Let \({v}:X \to \mathbb{Z}\) be a mapping from variables to values. We write \({v} \vDash \varphi \) if \({v}\) is a model of \(\varphi\).

Let us define renaming of variables \(\theta :X \to X{\kern 1pt} '\) from the names \(X\) to \(X{\kern 1pt} '\) which is applicable to formulas \(\theta (\varphi )\). We denote as

$${{\theta }_{{X,i}}} = \left\{ \begin{gathered} x \mapsto x\# i,\quad {\text{if}}\quad x \in X \hfill \\ x \mapsto x,\quad {\text{otherwise}}. \hfill \\ \end{gathered} \right.$$
$$\theta _{{X,i}}^{{ - 1}} = \left\{ \begin{gathered} x\# i \mapsto x,\quad {\text{if}}\quad x \in X \hfill \\ x \mapsto x,\quad {\text{otherwise}}. \hfill \\ \end{gathered} \right.$$

Define \({{(\varphi )}^{\pi }}\) – the boolean predicate abstraction of formula φ with respect to the set of predicates π.

Define \(S{{P}_{{op}}}(\varphi )\) – strongest postcondition of φ and operation op.
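As a worked illustration of the two definitions (the concrete formulas are ours, not taken from the paper), consider the assignment op = (g := 1) and the formula \(\varphi = (g = 0 \wedge d = 0)\):

$$SP_{g := 1}(g = 0 \wedge d = 0) = (g = 1 \wedge d = 0),$$

and the renaming \(\theta_{X^{local},1}\) applied to a formula over a local variable tmp replaces every occurrence of tmp by tmp#1, leaving global variables such as \(g\) untouched.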

We define the Predicate Analysis \(\mathbb{P} = ({{D}_{P}},\) \({{\Pi }_{P}},{{ \rightsquigarrow }_{P}},\) \(merg{{e}_{P}}\), stopP, \(pre{{c}_{P}},\) \(compatibl{{e}_{P}}\), \( \cdot {{{\text{|}}}_{p}}\), \(compos{{e}_{P}}\)), which tracks the validity of predicates over program variables.

It consists of the following components.

(1) The abstract domain \({{D}_{P}} = (\mathcal{T},{{\mathcal{E}}_{P}},{{ \oplus }_{P}})\).

\(\mathcal{T} = C \times G \times T\). \({{\mathcal{E}}_{P}} = ({{E}_{P}},{{ \top }_{P}},{{ \bot }_{P}},{{ \sqsubseteq }_{P}},{{ \sqcup }_{P}})\). An abstract transition consists of an abstract state \(s \in E_{P}^{S}\) and an abstract edge \(q \in E_{P}^{T}\), so \({{E}_{P}} = E_{P}^{S} \times E_{P}^{T}\) and \({{\mathcal{E}}_{P}} = \mathcal{E}_{P}^{S} \times \mathcal{E}_{P}^{T}\).

\(E_{P}^{S} = \mathcal{P}\), so a state is a quantifier free formula. The top element \( \top _{P}^{S}\) = true, and bottom element \( \bot _{P}^{S}\) = false. The partial order \( \sqsubseteq _{P}^{S}\, \subseteq E_{P}^{S}\) × \(E_{P}^{S}\) is defined as \(e \sqsubseteq _{P}^{S}e{\kern 1pt} '\) \( \Leftrightarrow \) \(e \Rightarrow e{\kern 1pt} '\). The join \( \sqcup _{P}^{S}:E_{P}^{S} \times E_{P}^{S} \to E_{P}^{S}\) yields the least upper bound according to partial order.

An abstract edge \(q \in E_{P}^{T}\) is either an action expressed by a formula or an ordinary CFA edge: \(E_{P}^{T} = E_{P}^{S} \cup G\).

We do not provide a formal definition of \({{ \oplus }_{P}}\), as it requires a lot of space. The basic idea is that it returns the set of concrete transitions that correspond to a formula (a strongest postcondition). The trickier part of the definition is the check \({{C}_{{check}}}\) of whether a set of partial transitions corresponds to a single concrete one. For a transition e0 in the thread with an ordinary CFA edge \({{q}_{0}} \in G\) and environment transitions \({{e}_{1}}, \ldots ,{{e}_{n}}\) the check consists of two parts:

(a) for the state s0 of e0 and all states \({{s}_{1}}, \ldots ,{{s}_{n}}\) of the environment transitions there exists a common model \({v}\);

(b) the edge qp of the projection \({{e}_{0}}{{{\text{|}}}_{p}}\) is covered by the abstract edges qj of the environment transitions: \({{q}_{p}} \sqsubseteq {{q}_{j}}\).

Formally,

$$\begin{gathered} \forall {{e}_{0}},{{e}_{1}}, \ldots ,{{e}_{n}} \in {{E}_{P}}, \\ {{e}_{i}} = ({{s}_{i}},{{q}_{i}}),\quad {{s}_{i}} \in E_{P}^{S},\quad {{q}_{i}} \in E_{P}^{T} \\ {{C}_{{check}}}({{e}_{0}},{\text{\{}}{{e}_{1}}, \ldots ,{{e}_{n}}{\text{\}}}) \\ = \,\exists {v}\,:\,{v}\, \vDash \,{{s}_{0}}\, \wedge \,{{\theta }_{{{{X}^{{local}}},1}}}({{s}_{1}})\, \wedge \, \ldots \, \wedge \,{{\theta }_{{{{X}^{{local}}},n}}}({{s}_{n}}) \\ \wedge \,({{q}_{0}} \in G\, \wedge \,\forall 0\, < \,j\, \leqslant \,n:{{q}_{j}}\, \notin \,G\, \wedge \,{{e}_{0}}{{{\text{|}}}_{p}}\, = \,({{s}_{p}},{{q}_{p}})\,:\,{{q}_{p}}\, \sqsubseteq \,{{q}_{j}}) \\ \end{gathered} $$
(12)

(2) The set of precisions \({{\Pi }_{P}} = {{2}^{\mathcal{P}}}\) models a precision for an abstract state as a set of predicates.

(3) The merge operator may have several variants, for example (a code sketch of the three variants follows the list):

(a) \(merg{{e}_{{Join}}}\) merges both parts of the transition:

$$\begin{gathered} \forall e,e{\kern 1pt} ' \in {{E}_{P}},\pi \in {{\Pi }_{P}},e = (s,g),e{\kern 1pt} ' = (s{\kern 1pt} ',g{\kern 1pt} '): \\ merg{{e}_{{Join}}}(e,e{\kern 1pt} ',\pi ) \\ = \left\{ \begin{gathered} (s \vee s{\kern 1pt} ',g),\quad {\text{if}}\quad g = g{\kern 1pt} ',\quad g \in G \hfill \\ e{\kern 1pt} ',\quad {\text{if}}\quad g \in G \wedge g{\kern 1pt} ' \in \mathcal{P} \vee g \in \mathcal{P} \wedge g{\kern 1pt} ' \in G \hfill \\ (s \vee s{\kern 1pt} ',g \vee g{\kern 1pt} '),\quad {\text{if}}\quad g,g{\kern 1pt} ' \in \mathcal{P} \hfill \\ \end{gathered} \right. \\ \end{gathered} $$
(13)

(b) \(merg{{e}_{{Eq}}}\) merges only abstract edges for equal (or covered) states.

(c) \(merg{{e}_{{Sep}}}\) does not merge elements.
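A minimal sketch of the three merge strategies is given below. The Transition class, the string-based formulas, and the or(...) helper are illustrative placeholders only and are not the actual CPAchecker data structures.

// Illustrative sketch of the three merge strategies for predicate transitions.
// Transition, or(...) and the string-based formulas are placeholders,
// not the actual CPAchecker data structures.
final class PredicateTransitionMerge {

    static final class Transition {
        final String state;      // abstract state s: a quantifier-free formula
        final String edge;       // abstract edge q: a CFA edge id or a formula
        final boolean edgeIsCfa; // true if q is an ordinary CFA edge

        Transition(String state, String edge, boolean edgeIsCfa) {
            this.state = state; this.edge = edge; this.edgeIsCfa = edgeIsCfa;
        }
    }

    static String or(String f1, String f2) { return "(" + f1 + " | " + f2 + ")"; }

    // mergeJoin: join the states for equal CFA edges, join both parts for formula edges.
    static Transition mergeJoin(Transition e, Transition e2) {
        if (e.edgeIsCfa && e2.edgeIsCfa) {
            return e.edge.equals(e2.edge)
                ? new Transition(or(e.state, e2.state), e2.edge, true)
                : e2;                                // different CFA edges: keep the new element
        }
        if (e.edgeIsCfa != e2.edgeIsCfa) {
            return e2;                               // mixed CFA edge / formula edge: do not merge
        }
        return new Transition(or(e.state, e2.state), or(e.edge, e2.edge), false);
    }

    // mergeEq: join the abstract edges only if the states coincide.
    static Transition mergeEq(Transition e, Transition e2) {
        if (!e.edgeIsCfa && !e2.edgeIsCfa && e.state.equals(e2.state)) {
            return new Transition(e2.state, or(e.edge, e2.edge), false);
        }
        return e2;
    }

    // mergeSep: never merge, always keep the new element.
    static Transition mergeSep(Transition e, Transition e2) {
        return e2;
    }
}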

(4) The termination check tests whether e is covered by another state in the reached set: \(sto{{p}_{P}}(e,R,\pi )\) = \(\exists e{\kern 1pt} ' \in R:(e \sqsubseteq e{\kern 1pt} ')\).

(5) The precision adjustment function constructs a predicate abstraction over the predicates in the precision π: \(pre{{c}_{P}}(e,\pi ,R) = {{e}^{\pi }} = ({{s}^{\pi }},q)\).

(6) The transfer relation \(e{{ \rightsquigarrow }_{P}}(e{\kern 1pt} ',\pi )\), \(e = (s,g)\), \(e{\kern 1pt} ' = (s{\kern 1pt} ', \cdot )\). As the predicate analysis does not track relevant edges, it returns all possible ones. An example of the assignment case is given after the two cases below.

– For \(g \in G\) we have the transfer \(e{{ \rightsquigarrow }_{P}}(e{\kern 1pt} ',\pi )\) with \(g = ( \cdot ,op, \cdot )\), if

$$s{\kern 1pt} ' = \left\{ \begin{gathered} {{(S{{P}_{{op}}}(s))}^{\pi }},\quad {\text{if}} \hfill \\ op = assign(w,expr) \vee op = assume(expr) \hfill \\ s,\quad {\text{otherwise}}. \hfill \\ \end{gathered} \right.$$

– For \(g = \hat {\varphi } \in \mathcal{P}\) the transfer computes the new state by conjoining the formula: \(s{\kern 1pt} ' = \hat {\varphi } \wedge s\).
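As an illustration of the first case, consider a hypothetical assignment \(op = assign(x,x + 1)\) and a state \(s = (x > 0)\). The strongest postcondition introduces a fresh copy of the old value of \(x\):

$$S{{P}_{{op}}}(x > 0) = \exists {{x}_{0}}:x = {{x}_{0}} + 1 \wedge {{x}_{0}} > 0.$$

Under a precision containing, for example, the single predicate \(x > 1\), the boolean abstraction yields \({{(S{{P}_{{op}}}(s))}^{\pi }} = (x > 1)\), since the postcondition implies this predicate.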

(7) Definition of the projection:

$$\begin{gathered} \forall e \in {{E}_{P}},\quad e = (s,g): \\ e{{{\text{|}}}_{p}} = \left\{ \begin{gathered} e,\quad {\text{if}}\quad g \notin G \hfill \\ ({{\theta }_{{{{X}^{{local}}},env}}}(s),{{\theta }_{{{{X}^{{local}}},env}}}(S{{P}_{{op}}}(s))),\quad {\text{otherwise}} \hfill \\ \end{gathered} \right. \\ \end{gathered} $$
(14)

The projection represents how the transition looks to the environment. The local variables are renamed to avoid name collisions, so that only predicates over global variables remain meaningful. The state (the first part of the transition) is an abstraction of the global part of the thread state, and the edge (the second part) corresponds to the concrete operation restricted to its global changes.

(8) \(\forall e,e{\kern 1pt} ' \in {{E}_{P}}\), \(e = (s,q)\), \(e{\kern 1pt} ' = (s{\kern 1pt} ',q{\kern 1pt} ')\): \(compos{{e}_{P}}(e,e{\kern 1pt} ') = \tilde {e} = (s,q{\kern 1pt} ')\).

(9) \(\forall {{e}_{1}},{{e}_{2}} \in {{E}_{P}}\), \({{e}_{i}} = ({{s}_{i}},{{q}_{i}})\), \({{s}_{i}} \in E_{P}^{S}\): \(compatibl{{e}_{P}}({{e}_{1}},{{e}_{2}}) = \exists {v}:{v} \vDash {{\theta }_{{{{X}^{{local}}},1}}}({{s}_{1}}) \wedge {{\theta }_{{{{X}^{{local}}},2}}}({{s}_{2}})\).

The compatibility check determines whether the transitions may be composed into a global one. This is impossible if the predicates over global variables are inconsistent. Thus, we check whether there exists a single model of the two partial formulas.
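A sketch of this check is given below. The Solver interface is a hypothetical stand-in for a real SMT back-end, and the renaming helper assumes variable names are plain identifiers; none of these names belong to the actual implementation.

// Illustrative compatibility check for two partial predicate states.
// Solver is a hypothetical stand-in for a real SMT back-end; variable names
// are assumed to be plain identifiers.
import java.util.Set;

final class PredicateCompatibilitySketch {

    interface Solver { boolean isSatisfiable(String formula); }

    // Rename every local variable x of the formula to x#i (cf. theta_{X^local, i}).
    static String renameLocals(String formula, Set<String> locals, int i) {
        String result = formula;
        for (String x : locals) {
            result = result.replaceAll("\\b" + x + "\\b", x + "#" + i);
        }
        return result;
    }

    // compatible_P(e1, e2): the two state formulas, with their local variables
    // kept apart by renaming, must have a common model.
    static boolean compatible(String s1, String s2, Set<String> locals, Solver solver) {
        String renamed1 = renameLocals(s1, locals, 1);
        String renamed2 = renameLocals(s2, locals, 2);
        return solver.isSatisfiable("(" + renamed1 + ") & (" + renamed2 + ")");
    }
}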

11 VALUE ANALYSIS

Define a Value Analysis \(\mathbb{V} = ({{D}_{V}},\) \({{\Pi }_{V}},{{ \rightsquigarrow }_{V}},\) \(merg{{e}_{V}}\), stopV, \(pre{{c}_{V}},\) \(compatibl{{e}_{V}}\), \( \cdot {{{\text{|}}}_{p}}\), \(compos{{e}_{V}})\), which tracks explicit values of variables. Unlike the predicate analysis, the value analysis stores only explicit values, so it cannot handle more complicated relations between variables, for example, inequalities.

The Value Analysis consists of the following components:

(1) An abstract state of the analysis is a mapping from variables to values: \(\forall s \in E_{V}^{S},s:X \to \mathcal{Z}\), where \(\mathcal{Z}\, = \,\mathbb{Z}\, \cup \,{\text{\{}}{\kern 1pt} {{ \bot }_{Z}},\,{{ \top }_{Z}}{\kern 1pt} {\text{\}}}\). Thus, the set of abstract states \(E_{V}^{S}\) is a flat lattice over the integers. The top element \( \top _{V}^{S}\) = \({\text{\{}}{v}|\forall x \in X:{v}(x)\) = \({{ \top }_{Z}}\} \) is the mapping in which every variable may have an arbitrary value. The bottom element \( \bot _{V}^{S}\) = \({\text{\{}}{v}|\exists x \in X:{v}(x) = \,{{ \bot }_{Z}}\} \) is a mapping in which at least one variable has no value; such a state is unreachable in a real execution of the program. The order is trivial: any two states (other than \( \top _{V}^{S}\) and \( \bot _{V}^{S}\)) are incomparable.

The operator \({{C}_{{Check}}}\) checks whether there exists a common mapping for the global variables: \(\forall {{s}_{0}}, \ldots ,{{s}_{n}} \in E_{V}^{S}:{{C}_{{Check}}}({\text{\{}}{{s}_{0}}, \ldots ,{{s}_{n}}{\text{\}}})\) = \(\forall x \in {{X}^{g}}\), \(\exists z \in \mathcal{Z}:\forall 0 \leqslant i \leqslant n:z\) = \({{s}_{i}}(x) \vee {{s}_{i}}(x)\) = \({{ \top }_{Z}}\). Denote this common mapping as \({{{\hat {v}}}_{g}}\).

Now, define a composition operator:

$$\begin{gathered} \forall {{s}_{0}}, \ldots ,{{s}_{n}} \in E_{V}^{S}: \\ {{ \oplus }_{V}}\left( {\left( {\begin{array}{*{20}{c}} {{{s}_{0}}} \\ {{{t}_{0}}} \end{array}} \right),\left\{ {\left( {\begin{array}{*{20}{c}} {{{s}_{1}}} \\ {{{t}_{1}}} \end{array}} \right), \ldots ,\left( {\begin{array}{*{20}{c}} {{{s}_{n}}} \\ {{{t}_{n}}} \end{array}} \right)} \right\}} \right) \\ = \left\{ \begin{gathered} {\text{\{}}c \in C\,{\text{|}}\,\forall x \in {{X}^{g}}:{{{\hat {v}}}_{g}}(x) \ne {{ \top }_{Z}} \Rightarrow c(x) = {{{\hat {v}}}_{g}}(x){\text{\}}}, \hfill \\ \quad {\text{if}}\quad {{C}_{{Check}}}({\text{\{}}{{s}_{0}}, \ldots ,{{s}_{n}}{\text{\}}}) \hfill \\ \emptyset ,\quad {\text{otherwise}} \hfill \\ \end{gathered} \right. \\ \end{gathered} $$

This definition of the operator \({{ \oplus }_{V}}\) satisfies the required properties of a composition operator, as the only element above a given state (with respect to \( \sqsubseteq _{V}^{S}\)) is \( \top _{V}^{S}\), which only widens the set of concrete states.

The set of abstract edges contains the ordinary CFA edges and the environment transitions, which are defined by changes of global variables: \(E_{V}^{T} = {{2}^{{X \to \mathcal{Z}}}} \cup G\). The identity transition \(\varepsilon = \emptyset \) is the empty mapping, meaning that no variable changes its value.
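A minimal sketch of the check \({{C}_{{Check}}}\) defined above is given below. A state is modeled as a map from variable names to known integer values, and the absence of a key plays the role of the top element \({{ \top }_{Z}}\); this representation is illustrative only.

// Illustrative agreement check for value-analysis states on global variables.
// A state is modeled as a map from variable names to known integer values;
// a missing key plays the role of the "unknown" top element of the flat lattice.
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ValueCompatibilitySketch {

    // C_Check({s0, ..., sn}): every global variable must admit one common value,
    // i.e. all states that track the variable must agree on it.
    static boolean cCheck(List<Map<String, Long>> states, Set<String> globals) {
        for (String x : globals) {
            Long common = null;
            for (Map<String, Long> s : states) {
                Long value = s.get(x);   // null means "any value" (top)
                if (value == null) {
                    continue;
                }
                if (common == null) {
                    common = value;
                } else if (!common.equals(value)) {
                    return false;        // two states disagree on the global x
                }
            }
        }
        return true;
    }
}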

(2) A precision of the value analysis is a set of tracked variables: \({{\Pi }_{V}} = {{2}^{X}}\).

(3) The transfer relation \({{ \rightsquigarrow }_{V}}\) contains a transition \(e{{\mathop \rightsquigarrow \limits^R }_{V}}e{\kern 1pt} '\), with \(e = (s,q)\), \(e{\kern 1pt} ' = (s{\kern 1pt} ', \top _{V}^{T})\), if

\(q \in G,q = ( \cdot ,assume(expr), \cdot {\kern 1pt} ')\)

$$\begin{gathered} \forall x \in X:s{\kern 1pt} '(x) \\ = \left\{ \begin{gathered} {{ \bot }_{Z}},\quad {\text{if}}\quad \nexists a \in \mathbb{Z}.(x \to a):{{(expr \ne 0)}_{{/s}}} \hfill \\ a,\quad {\text{if}}\quad \exists {\kern 1pt} !a \in \mathbb{Z}.(x \to a):{{(expr \ne 0)}_{{/s}}} \vee s(x) = a \hfill \\ {{ \top }_{Z}},\quad {\text{otherwise}} \hfill \\ \end{gathered} \right. \\ \end{gathered} $$

Here \({{(expr)}_{{/s}}}\) denotes the interpretation of the expression expr over the variables in X under the abstract assignment s, and the expression \((x \to a):{{(expr \ne 0)}_{{/s}}}\) means that the value a of the variable x satisfies this interpretation.

\(q \in G,q = ( \cdot ,assign(w,expr), \cdot ')\)

$$\forall x \in X:s{\kern 1pt} '(x) = \left\{ \begin{gathered} {{(expr)}_{{/s}}},\quad {\text{if}}\quad x = w \hfill \\ s(x),\quad {\text{otherwise}} \hfill \\ \end{gathered} \right.$$

For other \(q \in G\) the state is not changed: \(s{\kern 1pt} ' = s\).

\(q:X \to \mathcal{Z}\) denotes an environment transition, which changes the specified variables. In the general case the successor is computed as follows (a sketch of this case is given after the formula):

$$\forall x \in X:s{\kern 1pt} '(x) = \left\{ \begin{gathered} q(x),\quad {\text{if}}\quad x \in dom(q) \hfill \\ s(x),\quad {\text{otherwise}} \hfill \\ \end{gathered} \right.$$
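For this last case, the successor simply overwrites the variables mentioned by the effect; a minimal sketch, using the same illustrative map representation as in the earlier value-analysis sketch:

// Illustrative application of an environment effect q : X -> Z to a value state.
import java.util.HashMap;
import java.util.Map;

final class ValueTransferSketch {

    // Variables in dom(q) take the value prescribed by the effect,
    // all other variables keep their current value.
    static Map<String, Long> applyEffect(Map<String, Long> state, Map<String, Long> effect) {
        Map<String, Long> successor = new HashMap<>(state);
        successor.putAll(effect);   // dom(q) is overwritten, the rest is preserved
        return successor;
    }
}

For instance, an effect {g ↦ 1} applied to the state {g ↦ 0, x ↦ 5} yields {g ↦ 1, x ↦ 5}.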

(4) The merge operator does not join states: \(merg{{e}_{V}}(e,e{\kern 1pt} ',\pi ) = e{\kern 1pt} '\). The condition 3 is evidently fulfilled.

(5) The stop operator considers states separately: \(sto{{p}_{V}}(e,R,\pi ) = (\exists e{\kern 1pt} ' \in R:e \sqsubseteq e{\kern 1pt} ')\).

(6) The precision adjustment computes a new abstract state by restricting the assignment to the variables in the precision: \(pre{{c}_{V}}(e,\pi ,R) = ({{e}_{\pi }},\pi )\).

(7) \(\forall {{e}_{1}},{{e}_{2}} \in {{E}_{V}}\), \({{e}_{i}} = ({{s}_{i}},{{q}_{i}})\): \(compatibl{{e}_{V}}({{e}_{1}},{{e}_{2}}) = {{C}_{{Check}}}({\text{\{}}{{s}_{1}},{{s}_{2}}{\text{\}}})\).

(8) The operator \(compos{{e}_{V}}\) is the default one: \(\forall {{e}_{1}},{{e}_{2}} \in {{E}_{V}}\), \({{e}_{i}} = ({{s}_{i}},{{q}_{i}})\): \(compos{{e}_{V}}({{e}_{1}},{{e}_{2}}) = ({{s}_{1}},{{q}_{2}})\).

(9) An environment transition may affect only global variables: \(\forall e \in {{E}_{V}}\), \(e = (s,q):e{{{\text{|}}}_{p}} = ({{s}^{{global}}},{{q}^{{global}}})\). Here \({{s}^{{global}}}\) denotes the part of the mapping s that is related to the global variables.

12 LOCK ANALYSIS

We define Lock Analysis \(\mathbb{S} = ({{D}_{S}},\) \({{\Pi }_{S}},{{ \rightsquigarrow }_{S}},\) \(merg{{e}_{S}}\), stopS, \(pre{{c}_{S}},\) \(compatibl{{e}_{S}}\), \( \cdot {{{\text{|}}}_{p}}\), \(compos{{e}_{S}}\)) which tracks the set of acquired locks (synchronization variables) for each thread.

It consists of the following components.

(1) The abstract domain \({{D}_{S}} = (C \times G \times T,{{\mathcal{E}}_{S}},{{ \oplus }_{S}})\) uses the semi-lattice \({{\mathcal{E}}_{S}} = \mathcal{E}_{S}^{S} \times \mathcal{E}_{S}^{T}\). \(E_{S}^{S} = {{2}^{S}} \cup \{ \top _{S}^{S}, \bot _{S}^{S}\} \) consists of sets of synchronization variables, \( \bot _{S}^{S} \sqsubseteq _{S}^{S}ls \sqsubseteq _{S}^{S} \top _{S}^{S}\) and \(ls \subseteq ls{\kern 1pt} ' \Rightarrow ls \sqsupseteq _{S}^{S}ls{\kern 1pt} '\) for all elements ls, \(ls{\kern 1pt} ' \subseteq S\) (this implies \( \bot _{S}^{S} \sqcup _{S}^{S}ls\) = ls, \( \top _{S}^{S} \sqcup ls = \top _{S}^{S}\), \(ls \sqcup _{S}^{S}ls{\kern 1pt} '\) = \(ls \cap ls{\kern 1pt} '\) for all elements \(ls,ls{\kern 1pt} ' \subseteq S\), \(ls \ne ls{\kern 1pt} '\)), and \(E_{S}^{T}\) = \({\text{\{}}{\kern 1pt} \bot _{S}^{T},\varepsilon ,\, \top _{S}^{T}{\kern 1pt} {\text{\}}}\, \cup \,G\).

(2) There is only one empty precision: \({{\Pi }_{S}} = {\text{\{}}\emptyset {\text{\}}}\).

(3) The transfer adds a lock to the stored set when it passes an acquire operator and removes it on release. Formally, the transfer relation \({{ \rightsquigarrow }_{S}}\) has the transfer \(\tau {{ \rightsquigarrow }_{S}}\tau {\kern 1pt} '\), \(\tau = (ls,g)\), \(\tau {\kern 1pt} ' = (ls{\kern 1pt} ',g{\kern 1pt} ')\), \(g = ( \cdot ,op, \cdot {\kern 1pt} ')\) if

– \(op = acquire(s)\) and \(s \notin ls \wedge ls{\kern 1pt} ' = ls \cup {\text{\{}}s{\text{\}}}\), \(g{\kern 1pt} ' \in G\).

– \(op = release(s)\) and \(ls{\kern 1pt} ' = ls{\backslash \{ }s{\text{\}}}\), \(g{\kern 1pt} ' \in G\).

– \(op = thread\_create({{l}_{\nu }})\) and \(ls{\kern 1pt} ' = \emptyset \), \(g{\kern 1pt} ' \in G\).

– otherwise, \(ls = ls{\kern 1pt} '\), \(g{\kern 1pt} ' \in G\).

(4) The merge operator does not combine elements: \(merg{{e}_{S}}(e,e{\kern 1pt} ',\pi ) = e{\kern 1pt} '\).

(5) The termination check succeeds if there exists a state in the reached set that holds a subset of the current locks:

$$sto{{p}_{S}}(e,R,\pi ) = (\exists e{\kern 1pt} ' \in R:e \sqsubseteq e{\kern 1pt} ').$$

(6) The precision is never adjusted: \(pre{{c}_{S}}(e,\pi ,R)\) = (e, π).

(7) Definition of the projection: \(\forall e \in {{E}_{S}}\), \(e = (s,g)\): \(e{{{\text{|}}}_{p}} = (s,\varepsilon )\).

Note that the environment transitions (ε transitions) cannot change the set of acquired locks, as one thread cannot affect the locks held by another thread. Thus, there is only one option for the projection. Nevertheless, the projection strongly affects the compatibility check, as two threads cannot hold the same lock simultaneously.

(8) \(\forall e,e{\kern 1pt} ' \in {{E}_{S}}\), \(e = (s,q)\), \(e{\kern 1pt} ' = (s{\kern 1pt} ',q{\kern 1pt} ')\): \(compos{{e}_{S}}(e,e{\kern 1pt} ') = \tilde {e} = (s,q{\kern 1pt} ')\).

(9) \(\forall e,e{\kern 1pt} ' \in {{E}_{S}}\), \(e = (ls,q)\), \(e{\kern 1pt} ' = (ls{\kern 1pt} ',q{\kern 1pt} ')\): \(compatibl{{e}_{S}}(e,e{\kern 1pt} ') = (ls\, \cap \,ls{\kern 1pt} '\, = \,\emptyset )\). The compatibility check is very close to the basic Lockset algorithm: if both threads hold the same lock, the operations cannot be composed into a concrete one, as two threads cannot acquire the same lock simultaneously.
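The lock-analysis transfer and the compatibility check together amount to the classical Lockset discipline; a compact sketch is given below. The operation names acquire, release and thread_create are placeholders for the synchronization operators of the CFA.

// Illustrative lockset tracking and compatibility check (cf. the Lockset algorithm).
// The operation names are placeholders for the synchronization operators of the CFA.
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class LockAnalysisSketch {

    // Transfer: acquire adds a lock (the real transfer additionally requires that
    // the lock is not held yet), release removes it, thread_create starts the new
    // thread with an empty lockset, and any other operation keeps the set unchanged.
    static Set<String> transfer(Set<String> locks, String op, String lock) {
        Set<String> successor = new HashSet<>(locks);
        switch (op) {
            case "acquire":       successor.add(lock);    break;
            case "release":       successor.remove(lock); break;
            case "thread_create": successor.clear();      break;
            default:              /* no change */         break;
        }
        return successor;
    }

    // compatible_S: two transitions may happen in parallel only if their locksets
    // are disjoint; a shared lock rules out simultaneous execution.
    static boolean compatible(Set<String> locks1, Set<String> locks2) {
        return Collections.disjoint(locks1, locks2);
    }
}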

13 DATA RACE DETECTION

As we have already discussed, the approach to data race detection is divided into two steps: construction of a reached set and detection of pairs of transitions that form a data race. The first subtask is solved by the set of CPAs considered above.

Figure 7 presents an example of a CPA configuration for data race detection. The set of CPAs contains ThreadModularCPA (Section 6) as the root one. It includes ARGCPA, which provides different relations between states, including "parent-child", "projected-projection", and others. CompositeCPA (Section 7) implements a parallel composition of nested CPAs: LocationCPA (Section 8) models the program counter, CallstackCPA provides interprocedural analysis, LockCPA (Section 12) tracks acquired locks, ThreadCPA (Section 9) tracks thread creation points, and PredicateCPA (Section 10) implements the predicate analysis. Note that this is not the only possible configuration; moreover, different tasks require different configurations of CPAs.

Fig. 7. An example of CPA configuration.

An Abstract Reachability Graph (ARG) is constructed with the help of the set of CPAs. It consists of the transitions that are reachable at a particular level of abstraction; thus, they may be unreachable in a real execution of the program. The next step is to find pairs of transitions that form data races.

Usually, a data race is defined as a situation where accesses to the same memory location are performed simultaneously by different threads and at least one of the accesses is a write. There are two main issues here: how to identify the same memory statically and how to detect that the accesses may be simultaneous. Below we discuss the two corresponding features of our approach.

The presented theory supports shared data expressed only by global variables. In real-world software, there are many operations with pointers, structure fields, and so on. We use the BnB memory model, which divides memory into a disjoint set of regions [13–15]. A region corresponds to a particular data type or to a particular structure field if that field is never addressed. The memory model has certain limitations. First of all, it does not support address arithmetic and casts, which reduces soundness. Second, there may be false alarms for general data types, like integers, since they are accessed very frequently.
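A simplified sketch of how a memory access could be mapped to a BnB-style region is shown below; the region naming scheme is hypothetical, and address arithmetic and casts, which the model does not support, are ignored.

// Illustrative BnB-style region assignment for a memory access.
// The region naming scheme is hypothetical; address arithmetic and casts,
// which the model does not support, are ignored here.
import java.util.Set;

final class BnbRegionSketch {

    // An access is described by the static type of the lvalue and, for a structure
    // field access, by the structure and field names. A field whose address is
    // never taken forms its own region; everything else falls back to a per-type region.
    static String regionOf(String type, String structName, String fieldName,
                           Set<String> addressedFields) {
        if (structName != null && fieldName != null
                && !addressedFields.contains(structName + "." + fieldName)) {
            return "field:" + structName + "." + fieldName;
        }
        return "type:" + type;
    }
}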

If we find a pair of accesses to the same memory region, we need to check the possibility of simultaneous access. We use the compatibility notion here: compatibility means that the two partial abstract states may be parts of a single global state, or, in other words, that one transition can be applied after the other and vice versa, i.e., the two operations may be executed in parallel. Our approach to static race detection is a generalization of Lockset [12], which reports a data race for two accesses with disjoint sets of locks. One limitation of the Lockset approach is the lack of support for other synchronization primitives. We use the \(compatible\) operator to identify potentially parallel operations. As the compatibility check is based on several kinds of analyses, including the lock analysis, predicate analysis, and others, it is more precise than Lockset.

The data race detection algorithm consists of the following steps (a sketch of steps 2 and 3 is given after the list):

(1) compute a complete set of reached abstract transitions (Algorithm 1);

(2) for every reached transition, extract the memory region it accesses;

(3) for every memory region, try to find a compatible pair of transitions that forms a race condition;

(4) check every potential data race for feasibility and refine a predicate abstraction if necessary [20].
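Steps 2 and 3 can be sketched as a simple grouping of reached transitions by accessed region followed by a pairwise compatibility check. The types below are illustrative placeholders, and the feasibility check of step 4 is left abstract.

// Illustrative pairing step of the data race detection algorithm (steps 2 and 3).
// Access, Compatibility and the region strings are placeholders; the feasibility
// check of step 4 is left abstract.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RaceDetectionSketch {

    static final class Access {
        final Object transition;   // reached abstract transition (placeholder)
        final String region;       // memory region it accesses
        final boolean isWrite;

        Access(Object transition, String region, boolean isWrite) {
            this.transition = transition; this.region = region; this.isWrite = isWrite;
        }
    }

    interface Compatibility { boolean compatible(Object t1, Object t2); }

    // For every region, report each compatible pair of accesses where at least
    // one access is a write: such a pair is a potential data race.
    static List<Access[]> potentialRaces(List<Access> accesses, Compatibility check) {
        Map<String, List<Access>> byRegion = new HashMap<>();
        for (Access a : accesses) {
            byRegion.computeIfAbsent(a.region, r -> new ArrayList<>()).add(a);
        }
        List<Access[]> races = new ArrayList<>();
        for (List<Access> group : byRegion.values()) {
            for (int i = 0; i < group.size(); i++) {
                for (int j = i + 1; j < group.size(); j++) {
                    Access a = group.get(i);
                    Access b = group.get(j);
                    if ((a.isWrite || b.isWrite) && check.compatible(a.transition, b.transition)) {
                        races.add(new Access[] { a, b });
                    }
                }
            }
        }
        return races;
    }
}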

Note that the abstraction refinement algorithm (Counterexample Guided Abstraction Refinement, CEGAR) was reused without significant modifications. However, it refines only local transitions within a single thread, so it cannot detect contradictions between different threads. This is not a fundamental limitation of the approach: extending the CEGAR algorithm would make it possible to obtain more precise results.

14 EVALUATION

We have implemented the Thread-Modular Analysis with transition abstraction as a composition of the following analyses:

– Location Analysis \(\mathbb{L}\) (Section 8),

– Callstack Analysis \(\mathbb{C}\mathbb{S}\) (tracks function callstack),

– Thread Analysis \(\mathbb{T}\) (Section 9),

– Lock Analysis \(\mathbb{S}\) (Section 12),

– Predicate Analysis \(\mathbb{P}\) (Section 10) with options:

\(Merg{{e}_{{Join}}}\) – merges both the state and the abstract edge of an abstract transition.

\(Merg{{e}_{{Eq}}}\) – merges abstract edges only if the states are equal.

\(Merg{{e}_{{Sep}}}\) – no merging.

– Value Analysis \(\mathbb{V}\) (Section 11).

The benchmark sets were launched on a set of machines with a GNU/Linux operating system (x86_64-linux, Ubuntu 18.04), Intel Xeon E3-1230 v5, 3.40 GHz. We used default SV-COMP limits: 15 min of CPU time and 15 GB of memory.

14.1 SV-COMP Benchmark Set

The approaches compared are the following:

(1) Variants of thread-modular analysis with abstract transition.

(a) Variants of merge for Predicate Analysis.

(i) MergeJoin. (\(\mathbb{L}\), \(\mathbb{C}\mathbb{S}\), \(\mathbb{T}\), \(\mathbb{S}\), \(\mathbb{P}\)), \(Merg{{e}_{{Join}}}\).

(ii) MergeEq. (\(\mathbb{L}\), \(\mathbb{C}\mathbb{S}\), \(\mathbb{T}\), \(\mathbb{S}\), \(\mathbb{P}\)), \(Merg{{e}_{{Eq}}}\).

(iii) MergeSep. (\(\mathbb{L}\), \(\mathbb{C}\mathbb{S}\), \(\mathbb{T}\), \(\mathbb{S}\), \(\mathbb{P}\)), \(Merg{{e}_{{Sep}}}\).

(b) Value. Value Analysis. (\(\mathbb{L}\), \(\mathbb{C}\mathbb{S}\), \(\mathbb{T}\), \(\mathbb{S}\), \(\mathbb{V}\)).

(2) Threading. The ThreadingCPA described in [18] uses the original CPA theory and considers all possible interleavings.

Discussion. The thread-modular approach does not produce incorrect true verdicts, which confirms the soundness of the approach.

MergeJoin shows better performance than the MergeSep configuration. This is mostly because of the large number of environment steps: a step has to be performed for each transition in the environment, whereas MergeJoin combines them and applies them at once, which saves a large amount of time. On the other hand, the MergeSep configuration avoids some imprecision due to separate state exploration.

The simple Value configuration is a fast analysis, but it sometimes has to explore all possible values of a variable, which are numerous, and thus it runs into a timeout.

The classic analysis (Threading) is sound and precise, so it does not produce incorrect verdicts at all. However, it requires a lot of resources, which is the main disadvantage of the approach. Threading is able to solve only 1 of the 7 real-world benchmarks based on Linux device drivers. The thread-modular approaches (MergeJoin, MergeEq, MergeSep) solve 5 of 7.

Most of the new true verdicts proved by the thread-modular approach (26 of 27 for the MergeJoin) were not proved by Threading. That is one of the contributions of the approach.

The thread-modular approach has many incorrect false verdicts. Most of them are due to unsupported atomic constructions like compare-and-swap. In some cases, we do not support the happens-before ordering induced by thread creation (the child thread cannot interfere with its parent before creation). One more minor problem is the current limitation of the refinement procedure, which does not allow propagating discovered interpolants to the other thread. A small number of benchmarks are sensitive to the exploration of interleavings, which is a limitation of the thread-modular approach.

14.2 Data Race Detection in Device Drivers

The set of benchmarks, based on Linux device drivers, was prepared by the Klever tool, a framework for verification of large software systems [16, 17]. It divides a large codebase into separate verification tasks. For the Linux kernel, a verification task consists of one Linux module. Klever then automatically prepares an environment model, which includes a thread model, a kernel model, and operations over the module. After the preparation of a verification task, Klever calls a verification tool via a common interface, BenchExec [23].

The benchmarks are checked for data race conditions as described in Section 13. Note that the SV-COMP benchmarks are reachability tasks; thus, we cannot evaluate SV-COMP participants on the new set, because they do not support data race detection.

We chose the drivers/net/ subsystem of Linux kernel 4.2.6, for which Klever prepared 425 verification tasks. We compared the following configurations:

(1) Base. MergeJoin configuration from the previous subsection.

(2) Havoc. The MergeJoin configuration from the previous subsection without any predicates over global variables in the predicate precision. This means that we abstract from the values of global variables and consider them to be changed arbitrarily.

Discussion. A false verdict means that there is at least one potential data race condition. The Havoc configuration is a bit faster and less precise: it cannot prove 8 true modules, but it is able to find more unsafes. Six of these false unsafes correspond to the missed true verdicts and are spurious due to imprecision, and 13 unsafes correspond to unknowns in the Base configuration, which means that Havoc finds new unsafes.

We inspected the 8 modules that were proved correct. The precise approach proves that the device cannot be initialized in an appropriate way, and thus there is no race. Actually, this is a problem of the environment model (the preparation of the verification task), as it misses the semantics of the data.

We analyzed a part of the produced unsafes. There may be several different race conditions per verification task; we call them warnings. The true positive rate is about 42% (34 correct and 47 incorrect false warnings). The main cause of false alarms is problems with the memory model (>90%). The rest of the false alarms are related to kernel specifics (for instance, interrupt handling), function pointer analysis, different synchronization primitives, and other specific cases. So, unlike with the SV-COMP benchmarks, we did not face problems due to imprecision of the thread-modular approach itself. This confirms our thesis that the SV-COMP benchmarks contain mostly complicated but artificial cases.

We reported most of the true race conditions, and they were confirmed by Linux maintainers. However, most of the bugs were found in ancient drivers and nobody wants to fix them. Only a couple of patches were applied upstream as a part of a Google Summer of Code project.

15 RELATED WORK

The existing approaches to the analysis of multithreaded programs differ in features and performance. On the one side, there are precise approaches, which can prove the correctness of a program under certain assumptions. Starting from bounded model checking, most approaches investigate different techniques to reduce the state space; examples of such optimizations are partial-order reduction [1], context bounding [24, 25], etc.

An attempt to abstract from the irrelevant environment is the thread-modular approach, first suggested in [6]; that version does not use any abstraction. A thread-modular approach to formal verification was presented in [26]: the idea is to provide invariants for every process which together imply the formal requirement, and the evaluation covers two mutual exclusion protocols. Predicate abstraction was combined with a thread-modular approach in [27]. The main distinction of that approach is that there is only one thread in several copies, so the environment of the thread is formed by the thread itself; moreover, no synchronization primitives were considered.

An extension of the thread-modular approach that also uses an abstraction was first presented in [28] and then implemented in the TAR tool [7]. Our approach has the following main differences:

– TAR considers locks as ordinary variables. Our tool has a special Lock Analysis, which is composed with the other analyses. This allows avoiding extra refinement iterations, as the analysis handles locks directly.

– TAR applies thread effects, which are precisely related to thread operators. Our tool provides a possibility to abstract (operator ⋅|p) and to join (operator merge) different transitions. That may increase the speed of analysis but decrease its precision.

– TAR supports a fixed number of threads, whereas our approach supports unbounded thread creation.

– For the environment, TAR uses an under-approximation, whereas our tool uses an over-approximation.

A similar approach was also implemented in the Threader tool [29]. Threader uses an over-approximation of the environment based on Horn clauses. Similar to our approach, Threader can provide modular proofs, but it can also search for non-modular ones, in which the complete interaction between threads is considered.

Many works provide techniques for sequentialization of the program so that it can then be verified with the aid of sequential tools [30–32]. One example is WHOOP [33], which uses a sequentialization technique and does not consider thread interaction. Moreover, it relies heavily on the Lockset algorithm and has no way to extend the approach with other analyses. The authors applied the tool as a front-end to a more precise verifier, CORRAL [34].

On the other side, there exist many lightweight approaches that can be applied to a large amount of code. Such techniques are characterized by modest resource requirements and low precision. Examples are RELAY [35] and Locksmith [36], which were evaluated on the Linux kernel source code. RELAY reports thousands of warnings when analyzing the Linux kernel and then employs unsound post-analysis filters; the tool does not consider thread interaction at all.

In contrast to RELAY, Locksmith considers thread creation points, but it does not precisely identify shared data and thread interactions: it computes a general future effect related to the whole thread. The experimental evaluation shows that the tool has problems with scalability.

In [37], the authors presented an extension of Andersen's points-to analysis for multithreaded programs. The idea is similar to the thread-modular approach: they compute a set of operators that can be executed in parallel and apply these operators in the other threads.

There are many specialized approaches for efficient data race detection for particular software or properties: for example, low-level software with nested interrupts [38], data race detection in FreeRTOS [39, 40], and concurrent use-after-free bugs in Linux device drivers [41]. Such approaches demonstrate good results, but they rely significantly on property and code specifics. Our approach aims to be more general; nevertheless, it may also be improved by targeting particular code and a specific property.

16 CONCLUSIONS

The paper presents an approach to practical data race detection in complicated software. We extend the existing CPA theory and implement it in a new tool. The experiments show the benefit of the approach on large examples; on the other hand, small and complicated benchmarks are better solved by other approaches. Nevertheless, the approach is sound and may be improved and optimized in the future.

Thus, we may conclude that the requirements for the new tool are fulfilled, as it is successfully applied to different software systems, including Linux device drivers. As in classical model checking approaches, a guarantee of correctness can be provided under certain conditions: the requirements on the CPA operators and the condition of disjoint BnB regions.

The extended CPA theory allows describing complicated kinds of analysis, including ones that are not covered by the thread-modular approach. Nevertheless, the theory does not cover all possible kinds of analysis; for example, an efficient description of interleaving analysis would require an extension. This is a question for further investigation.

One possible direction is to extend the thread-modular approach in such a way that it considers some thread interaction. One idea is to implement an analysis that keeps an adjustable balance between interleaving analysis and the thread-modular one.

Another interesting practical improvement is the integration of different approaches into one tool: for example, a combination of a fast thread-modular analysis as the first stage and a precise classical analysis as the second stage, which may be implemented according to the cooperative verification idea [42].

One of the weaknesses of the thread-modular approach is the difficulty of computing a real interleaved path. Such a path would be very helpful for the investigation and refinement of the abstraction. Thus, reconstruction of the full path from a path with environment transitions is in our future plans.