Abstract
This paper addresses the problem of verifying heap evolution properties of pointer programs. To this end, a new unified model checking approach with MSVL (Modeling, Simulation and Verification Language) and PPTL\(^{\tiny \text{ SL }}\) is presented. The former is an executable subset of PTL (Projection Temporal Logic) while the latter is an extension of PPTL (Propositional Projection Temporal Logic) with separation logic. MSVL is used to model pointer programs, and PPTL\(^{\tiny \text{ SL }}\) to specify heap evolution properties. In addition, we implement a prototype in order to demonstrate our approach.
This research is supported by the National Natural Science Foundation of China Grant Nos. 61133001, 61322202, 61420106004, and 91418201.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
1 Introduction
Pointers are indispensable in real-world programs or applications. Reasoning about pointer programs is quite challenging since pointer usage is often complex and flexible. Potential bugs encountered in pointer programs such as null pointer dereference, memory leaks, or shape destruction are due to the nature of pointers. The problem is more serious for concurrent programs since we need to consider all possible execution sequences of processes. Alias analysis, as the name implies, is a point-to analysis which naively checks whether pointers can be aliased. Shape analysis is another form of pointer analysis that attempts to discover the possible shapes of heap structures. It aims to prove that these structures are not misused or corrupted.
Reynolds [1] proposes a famous Hoare-style logic known as separation logic which has received much attention. For the last decade, many works extend separation logic to do automated assertion checking [2] and shape analysis [3] in real-world applications. PTL (Pointer Assertion Logic) is a notation for expressing assertions about the heap structures of imperative languages. PALE (PAL Engine) is a complete implementation of PAL that encodes both programs and partial assertions as formulas in monadic second-order logic. However, loop invariants have to be manually provided so that it is not fully automatic.
In this paper we intend to apply the model checking framework to verify heap evolution properties. As the property-specification language we use a variant of temporal logic namely PPTL\(^{\tiny \text{ SL }}\) [4]. PPTL\(^{\tiny \text{ SL }}\) is a two-dimensional (spatial and temporal) logic by extending PPTL (Propositional Projection Temporal Logic) [5] with a decidable fragment of separation logic. For the program part, our method makes use of a temporal logic programming language, which is an executable subset of PTL (Projection Temporal Logic), called MSVL (Modeling, Simulation and Verification Language) [6], to model heap programs. PPTL\(^{\tiny \text{ SL }}\) can be translated (preserving satisfiability) into a strict subset of PTL. Specifications and models lie in the same logic framework, hence the name unified model checking. The previous unified model checking approach [7] cannot verify heap evolution properties. We extend the approach in [7] by replacing PPTL with the more expressive specification language PPTL\(^{\tiny \text{ SL }}\) such that heap evolution properties can be verified, and also the corresponding model checking approach is developed in this paper.
The work in [8] studies the problem of establishing temporal properties, including liveness properties of Java programs with evolving heaps. A specification language Evolution Temporal Logic (ETL) is defined which is a first-order linear temporal logic with transitive closure. ETL mainly focuses on describing behaviors of large granularity heap objects and high-level threads. Navigation Temporal Logic (NTL) [9] extends LTL with pointer assertions on single-reference structures including primitives for the birth and death of entities. The abstracted model checking algorithm for NTL is a non-trivial extension of the tableau based algorithm for LTL, which can be applied for both sequential and concurrent pointer programs. The major disadvantage of the approach of [9] is that it can only verify programs manipulating singly-linked lists. In [10], Rieger presents an abstraction and verification framework for pointer programs operating on unbounded heaps. In his work, two abstraction techniques are introduced, one is for singly-linked structures and the other employs context-free hyperedge replacement graph grammars to model more general heap structures. A two-dimensional (time and space) logic named maLTL is developed in [11] by combining temporal logic LTL and CTL, which is suitable to deal with pointers and heap management in the context of C programs. Since both dimensions of maLTL are realized by temporal logics that makes the difference between the two dimensions unclear.
2 Projection Temporal Logic
Let Var be a countable set of typed variables consisting of static and dynamic variables, and Prop a countable set of propositions. \(\mathbb {B}\) represents the boolean domain \(\{true, false\}\), and \(\mathbb {D}\) denotes the data domain. The terms e and formulas Q of PTL are given by the following grammar:
where \(q \in Prop\) is a proposition, \(x \in Var\) a variable, and fun a function of arity n and Pred is a predicate of arity n.
A state s is defined to be a pair \((I_{v},I_{p})\) of state interpretations \(I_{v}\) and \(I_{p}\), \(I_{v} \, : \, Var \rightarrow \mathbb {D} \cup \{nil\}, I_{p} \, : \, Prop \rightarrow \mathbb {B}\). An interval \(\sigma = \langle s_{0},s_{1},\ldots \rangle \) is a non-empty sequence of states, finite or infinite. The length of \(\sigma \), denoted by \(|\sigma |\), is \(\omega \) if \(\sigma \) is infinite, otherwise it is the number of states minus one. To have a uniform notation for both finite and infinite intervals, we will use extended integers as indices. That is, we consider the set \(N_{0}\) of non-negative integers, and define \(N_{\omega }=N_{0} \cup \{\omega \}\), and extend the comparison operators, \(=\), <, \(\le \), to \(N_{\omega }\) by considering \(\omega =\omega \), and for all \(i \in N_{0}\), \(i < \omega \). Moreover, we define \(\preceq \) as \(\le - \{(\omega ,\omega )\}\). With such a notation, \(\sigma _{(i .. j)}(0 \le i \preceq j \le |\sigma |)\) denotes the sub-interval \(\langle s_{i},\ldots ,s_{j} \rangle \) and \(\sigma ^{(k)}(0 \le k \preceq |\sigma |)\) denotes the suffix interval \(\langle s_{k},\ldots ,s_{|\sigma |} \rangle \) of \(\sigma \). The concatenation of \(\sigma \) with another interval \(\sigma '\) is denoted by \(\sigma \cdot \sigma '\). Further, let \(\sigma = \langle s_{k},\ldots ,s_{|\sigma |} \rangle \) be an interval and \(r_{1},\ldots ,r_{h}\) be integers \((h \ge 1)\) such that \(0 \le r_{1} \le r_{2} \le \cdots \le r_{h} \preceq |\sigma |\). The projection of \(\sigma \) onto \(r_{1},\ldots ,r_{h}\) is the interval, \(\sigma \downarrow (r_{1},\ldots ,r_{h}) = \langle s_{t_{1}}, \ldots , s_{t_{l}} \rangle \), where \(t_{1},\ldots ,t_{l}\) is obtained from \(r_{1},\ldots ,r_{h}\) by deleting all duplicates.
An interpretation for a PTL formula is a triple \(\mathcal {I} = (\sigma , k, j)\) where \(\sigma = \langle s_{0},s_{1},\ldots \rangle \) is an interval, k a non-negative integer and j an integer or \(\omega \) such that \(k \preceq j \le |\sigma |\). We write \((\sigma ,k,j) \models Q\) to mean that a formula Q is interpreted over a sub-interval \(\sigma _{(k .. j)}\) of \(\sigma \) with the current state being \(s_{k}\). The notation \(s_{k}=(I^{k}_{v},I^{k}_{p})\) indexed by k represents the k-th state of an interval \(\sigma \). The semantics of terms and PTL formulas are defined by:
A formula Q is satisfied over an interval \(\sigma \), written \(\sigma \models Q\), if \((\sigma ,0,|\sigma |) \models Q\) holds. Also we have the following derived formulas:
where \(\varepsilon \) (or len(0)) denotes an interval with zero length, “; ” and “\(^{+}(^{*})\)” are used to describe sequential and loop properties respectively.
3 Modeling, Simulation and Verification Language
MSVL is an executable temporal logic with framing technique which is recently extended with function calls [12] and groups of types such as basic data types, pointer types and struct types [13]. Thus it is capable of modeling pointer programs. The arithmetic and boolean expressions of MSVL can be defined as:
Some useful elementary statements of MSVL can be inductively defined as follows. A convenient way to execute MSVL programs is to transform them into their equivalent normal forms (Definition 1).
Definition 1 (Normal Form of MSVL)
An MSVL program Q is in normal form if , where \(Q_{f_{j}}\) is a general MSVL program, whereas \(Q_{e_{i}}\) and \(Q_{c_{j}}\) are true or all are state formulas of the form: \((x_{1}=e_{1}) \wedge \cdots \wedge (x_{m}=e_{m}) \wedge \overset{\cdot }{q_{x_{1}}} \wedge \cdots \wedge \overset{\cdot }{q_{x_{m}}}\). \(\overset{\cdot }{q}\) denotes either q or \(\lnot q\).
The normal form divides the formula into two parts: the present part and the future part. A key conclusion is that any MSVL program can be reduced to its normal form [5, 6]. Therefore, we can use an incremental way to execute MSVL programs based on normal form.
Theorem 1
Any MSVL program Q can be reduced to its normal form.
Using normal form of MSVL as a basis, a graph can be constructed, namely Normal Form Graph (NFG) [5, 6], by recursively progressing the future part of a normal form, which explicitly illustrates the state space of an MSVL program.
3.1 Examples of MSVL Programs Manipulating Pointers
Producer-consumer program. Consider the producer-consumer problem encoded in MSVL below. The producer process and the consumer process share a buffer which is realized as a global singly-linked list. The producer repeatedly generates new items by allocating new memory heap cells, and adds them to the tail x of the buffer, whereas the consumer removes items from the head y of the buffer and disposes them.
The parallel operator “||” in MSVL considers the true concurrency semantics of programs. In order to simulate the interleaving semantics of the two concurrent processes, the await statement is employed to force a process to sleep when the waiting condition is false, otherwise the process will continue to execute.
4 The Two-Dimensional Logic PPTL\(^{\tiny \text{ SL }}\)
Previously, we integrate a decidable fragment of separation logic (referred to as SL) with PPTL to obtain a two-dimensional logic. The logic, referred to as PPTL\(^{\tiny \text{ SL }}\) [4], allows us to express heap evolution properties. We assume a countable set PVar of variables with pointer type, and a finite set Loc of memory locations. \(PVal=Loc \cup \{null\}\) denotes the set of pointer values which are either locations or null. The syntax of PPTL\(^{\tiny \text{ SL }}\) formulas P is defined by the grammar:
\(l \in Loc\), \(x \in PVar\), \(\phi \) represents SL formulas, and P PPTL\(^{\tiny \text{ SL }}\) formulas. Formula \(e_{0} \mapsto \{ e_{1},\ldots ,e_{n} \}\) denotes that \(e_{0}\) points to \(e_{1},\ldots ,e_{n}\), where \(e_{0}\) represents an address in the heap and \(e_{1},\ldots ,e_{n}\) the consecutive values held in that address. The formula \(\phi _{1} {{\#}} \phi _{2}\) specifies properties holding respectively for disjoint portions of the current heap, one makes \(\phi _{1}\) true and the other makes \(\phi _{2}\) true. The temporal operators as well as their semantics are taken from PTL.
We refer to a pair \((I_{s},I_{h})\) as a memory state, \(I_{s} \, : \, PVar \rightharpoonup PVal, I_{h} \, : \, Loc \rightharpoonup {\bigcup }^{n}_{i=1} PVal^{i}\), where \(I_{s}\) represents a stack and \(I_{h}\) a heap. \(I_{s}\) serves as valuations of pointer variables and \(I_{h}\) as valuations of heap cells. We write dom(f) to denote the domain of mapping f. Given two mappings \(f_{1}\) and \(f_{2}\), the notation \(f_{1} \perp f_{2}\) means that \(f_{1}\) and \(f_{2}\) have disjoint domains. Moreover, we use \(f_{1} \cdot f_{2}\) to denote the union of \(f_{1}\) and \(f_{2}\). The semantics of SL formulas is given by:
The semantics of P is similar to that of Q since the only difference is their state formulas (formulas without temporal operators). Therefore, we only give the interpretation of state formulas, i.e., \(\varvec{\mathcal {I} \models \phi } \text { iff } I^{k}_{s},I^{k}_{h} \models _{_{SL}} \phi \). We abusively use the notation \(\mathcal {I} \models P\), and in this case P is interpreted over an interval of memory states.
SL can describe various heap structures, we present the following derived formulas expressed in SL which are related to singly-linked lists.
Here, \(ls(e_{1},e_{2})\) describes a list segment starting from the location \(e_{1}\) to \(e_{2}\), \(e_{1} \rightarrow ^{+} e_{2}\) and \(e_{1} \rightarrow ^{*} e_{2}\) mean that \(e_{2}\) is reachable from \(e_{1}\) via certain pointer links. emp denotes an empty heap, and alloc(e) indicates the address e is allocated in the current heap. Formulas describing other heap structures can also be derived.
4.1 Specify Heap Evolution Properties
For the producer-consumer program with a shared list, some interesting heap evolution properties can be specified by PPTL\(^{\tiny \text{ SL }}\):
-
(1)
Absence of memory leaks, i.e., any item can be reached by certain variable during the execution of the program: \(\square (\forall z: alloc(z) \rightarrow (x \rightarrow ^{*} z \vee y \rightarrow ^{*} z \vee t \rightarrow ^{*} z \vee r \rightarrow ^{*} z))\).
-
(2)
The tail of the list is never deleted nor disconnected from the head: \(\bigcirc ^{2} \square (alloc(x) \wedge y \rightarrow ^{*} x)\).
-
(3)
Shape integrity of the buffer, i.e., the shape of the buffer is repeatedly formed as a linked list within every five time units: \(\bigcirc ^{2} ((len(5) \wedge \lozenge ls(y,null))^{+})\).
In [4], we have proved that PPTL\(^{\tiny \text{ SL }}\) is decidable by an equisatisfiable translation. The key idea is that all the formulas expressing heaps in SL are reduced into the first-order theory. Since the model of first-order set has no heap ingredient, we use extra variables to simulate it. Let H denote a vector of n tuples of pointer variables, i.e., \(H = ((H_{1,1}, \ldots , H_{1,m_{1}}), \ldots , (H_{n,1}, \ldots , H_{n,m_{n}}))\). The translations \(f(\phi ,H)\) and F(P, H) generate an equisatisfiable state formula \(\phi '\) to \(\phi \) and an equisatisfiable temporal formula \(P'\) to P respectively. Both \(\phi '\) and \(P'\) belong to PTL. The detailed definitions of f and F can be found in [4]. Analogous to MSVL, by using a very similar approach, we can construct a graph structure (also called NFG) that explicitly characterizes the model for any reduced PPTL\(^{\tiny \text{ SL }}\) formula. The detailed proofs are given in [4], here we only give a brief summary.
Theorem 2
PPTL\(^{\tiny \text{ SL }}\) is decidable, and its complexity is the same as PPTL, i.e., non-elementary.
5 Model Checking with MSVL and PPTL\(^{\tiny \text{ SL }}\)
Our model checking approach is similar to the traditional automata-based model checking except that ours is based on NFG. The starting point is a pointer program modeled by an MSVL program M, and a PPTL\(^{\tiny \text{ SL }}\) formula P that formalizes the desired heap evolution property on M.
In general, the model checking procedure in this framework first creates the NFG \(G_{M}\) of an MSVL program and the NFG \(G_{\lnot P}\) of the negation of the input PPTL\(^{\tiny \text{ SL }}\) formula, then constructs the product G of the two NFGs. The nodes (edges) of G are conjunctions of nodes (edges) in \(G_{M}\) and \(G_{\lnot P}\). If there exist a valid path in G, a counterexample is found, otherwise M satisfies P.
We have developed a unified model checking tool (prototype) based on the approach presented in this paper. As shown Fig. 1, the tool structure consists of three essential modules: MSV, PPTL\(^{\tiny \text{ SL }}\) solver and unified model checker. An MSVL program M is feeded into MSV, and a property P is given to the PPTL\(^{\tiny \text{ SL }}\) solver. MSV constructs the NFG of M and PPTL\(^{\tiny \text{ SL }}\) solver builds the NFG of \(\lnot P\). The Model checker does not try to build the complete production of the two NFGs in practice. Instead, it works in “on the fly” manner and tries to find one valid path as early as possible. The SMT solver Z3 is called when checking whether an edge in the product NFG is satisfied or not. We have successfully verified the producer-consumer program with respect to the corresponding heap evolution properties mentioned in Sect. 4.
6 Conclusion
In this paper, we propose a unified model checking approach with MSVL and PPTL\(^{\tiny \text{ SL }}\). We can apply this approach to verify heap evolution properties of pointer programs, including both safety and liveness properties on heap structures. Since PPTL\(^{\tiny \text{ SL }}\) is able to be reduced to a subset of PTL so that programs and properties both belong to the same logic framework which makes the verification more convenient. We have developed a model checking tool that exploits the SMT solver Z3 as the verification engine. In the future, more case studies on various heap structures in addition to singly-linked structures will be carried out.
References
Reynolds, J.C.: Separation logic: a logic for shared mutable data structures. In: 17th Annual IEEE Symposium on Logic in Computer Science, Computer Society, pp. 55–74. IEEE, Washington (2002)
Berdine, J., Calcagno, C., O’Hearn, P.W.: Symbolic execution with separation logic. In: Yi, K. (ed.) APLAS 2005. LNCS, vol. 3780, pp. 52–68. Springer, Heidelberg (2005)
Calcagno, C., Distefano, D., O’Hearn, P.W., Yang, H.: Compositional shape analysis by means of bi-abduction. J. ACM (JACM) 58(6), 26 (2011)
Lu, X., Duan, Z., Tian, C.: Extending PPTL for verifying heap evolution properties. arXiv preprint arXiv:1507.08426 (2015)
Duan, Z.: An extended interval temporal logic and a framing technique for temporal logic programming. Ph.D. thesis, University of Newcastle upon Tyne (1996)
Duan, Z., Yang, X., Koutny, M.: Framed temporal logic programming. Sci. Comput. Program. 70(1), 31–61 (2008)
Duan, Z., Tian, C.: A unified model checking approach with projection temporal logic. In: Liu, S., Maibaum, T., Araki, K. (eds.) ICFEM 2008. LNCS, vol. 5256, pp. 167–186. Springer, Heidelberg (2008). doi:10.1007/978-3-540-88194-0_12
Yahav, E., Reps, T.W., Sagiv, S., Wilhelm, R.: Verifying temporal heap properties specified via evolution logic. Logic J. IGPL 14(5), 755–783 (2006)
Distefano, D., Katoen, J.-P., Rensink, A.: Safety and liveness in concurrent pointer programs. In: Boer, F.S., Bonsangue, M.M., Graf, S., Roever, W.-P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 280–312. Springer, Heidelberg (2006). doi:10.1007/11804192_14
Rieger, S.: Verification of pointer programs. Ph.D. thesis, RWTH Aachen University (2009)
del Mar Gallardo, M., Merino, P., Sanán, D.: Model checking dynamic memory allocation in operating systems. J. Autom. Reasoning 42(2–4), 229–264 (2009)
Zhang, N., Duan, Z., Tian, C.: Extending MSVL with function calls. In: Merz, S., Pang, J. (eds.) ICFEM 2014. LNCS, vol. 8829, pp. 446–458. Springer, Heidelberg (2014). doi:10.1007/978-3-319-11737-9_29
Wang, X., Duan, Z., Zhao, L.: Formalizing and implementing types in MSVL. In: Liu, S., Duan, Z. (eds.) SOFL+MSVL 2013. LNCS, vol. 8332, pp. 60–73. Springer, Heidelberg (2014)
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Lu, X., Duan, Z., Tian, C. (2016). Using Unified Model Checking to Verify Heaps. In: Chan, TH., Li, M., Wang, L. (eds) Combinatorial Optimization and Applications. COCOA 2016. Lecture Notes in Computer Science(), vol 10043. Springer, Cham. https://doi.org/10.1007/978-3-319-48749-6_55
Download citation
DOI: https://doi.org/10.1007/978-3-319-48749-6_55
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48748-9
Online ISBN: 978-3-319-48749-6
eBook Packages: Computer ScienceComputer Science (R0)