Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

Deductive program verification is the process of establishing correctness by proving verification conditions (VCs) extracted from a program. It relies on a formal pre- and postcondition specification as well as loop invariants being provided by the programmer. This task by itself requires proficiency in mathematical logic. Further challenges include assessing completeness of the specification, whether invariants are sufficiently strong to establish the postcondition while sufficiently weak to be maintained, and using automatic theorem provers. Training in formal methods aims at giving the necessary conceptual and technical skills to address these challenges.

In instructional settings, verification is often taught by examples from tangible and visually perspicuous domains, such as arrays of colored objects in the case of the famous Dutch national flag three-way-partitioning problem [1]. While such examples are a valuable pedagogical device, how to generalize the reasoning to more typical programming problems is often left unexplained. Also, the transition from instructional pictures to a mathematically precise reasoning language does not always follow a happy path. Even though influential authors have already several decades ago highlighted the benefits of pictures in formal reasoning [2, 3], pictures have by and large been employed as stepping stones towards some final, textual, formalization suitable for conventional symbolic reasoning. While reasons therefore (lack of precision, technical limitations, convention) may be legitimate, we suspect that this demoted role of pictures means their full benefit as reasoning tools is not being realized.

A notable exception is Reynold’s interval and partition diagrams [4], which integrate pictures with mathematical notation seamlessly, allowing invariants and even proofs over arrays to be expressed in a way that simultaneously maintains visual perspicuity and mathematical precision. It is on this trajectory that we position the approach described in this paper. A partition diagram, in its base form, is a precise, compact and embeddable diagram stating that a collection of integer indexes subdivides an array into disjoint partitions. Precise means that the language has a well-defined mathematical meaning, compact that it is space-conserving, and embeddable that it can be integrated into another diagram or a textual formula. Associated with a partitioning is some collection of properties, that the elements in the partitions should satisfy. A property can be expressed precisely by a formula universally quantifying over the partition diagram (e.g., [4, p. 94]), or by (less formally) annotating the partition diagrams with the properties (e.g., [3, p. 94]). In line with the second approach, we extend partition diagrams with the notion of coloring a partition. Formally, a coloring is a function from array indexes to a small finite set (“palette”) of programmer-defined colors. The programmer gives interpretation to the colors through the legend construct. Analogously to its cartography namesake, a legend is a mapping from colors to a universally-quantified predicate over the colored partitions. Together, partitionings, colorings and legends provide a precise and expressive pictorial language for array invariants.

As an umbrella framework we use invariant-based programming (IBP), a correct-by-construction formal verification approach geared towards teaching [5]. In IBP, preconditions, postconditions and invariants—under the common nomen situations—serve as the main organizing structures of a program. The program is represented by an invariant diagram, a graph of nested situations connected by transitions. The situations represent state predicates, such as pre- and postconditions and invariants, while the transitions constitute the actual executable code. We define the semantics of colorings and legends by translation into predicates over the program state. After translation, the VCs of the diagram are extracted using the proof rules of invariant diagrams. Nesting allows substitutions to inherit constraints from outer situations. In our extension, nesting also allows legends to be shared by multiple situations, as well as to be extended in substitutions with additional color interpretations. We illustrate the approach with examples from the domain of searching and sorting. The examples have been mechanically verified using the Why3 platform [6], a front-end for a number of automatic theorem provers.

We proceed as follows. Section 2 introduces the pictorial language in the context of two search programs. Section 3 describes the verification semantics. A verification of a slightly more complex program is given in Sect. 4. We discuss related work in Sect. 5 and end the paper with conclusions and future work in Sect. 6.

Fig. 1.
figure 1

Linear and binary search

2 Pictorial Invariant Diagrams

Consider the leftmost invariant diagram in Fig. 1. Each rounded rectangle—called a situation—identifies a subset of all possible program states. The role of a situation in a program is determined by the transitions, guarded program statements, connecting to it: a situation with no incoming transitions corresponds to a precondition (); a situation with no outgoing transitions corresponds to a postcondition ( and ). A situation with both incoming and outgoing transitions is an intermediate situation; an intermediate situation (or collection of intermediate situations) connected through a cycle of transitions corresponds to a loop (). There are five types of declarations that can appear inside a situation:

  • Variable declarations introduce program variables and associate them with types. For example, the declaration “\(\mathrm {A:}\mathbf{array }\mathrm {[N]}{\mathbf {\, of\, int}}\)” in the situation types the variable \(\mathrm {A}\) as an integer array of length \(\mathrm {N}\), indexed from 0 to \(\mathrm {N}-1\).

  • Legends introduce colors and assign them their meanings. For instance, the legend “” states that the red elements in \(\mathrm {A}\) are different from \(\mathrm {x}\). A legend is not a state assertion; rather it introduces an implication, allowing invariants over an array to be expressed visually by “painting” sub-arrays with a relevant property (in this case, that the sub-array is known to not hold the value \(\mathrm {x}\)). We make this notion more precise in the next section. Legends may introduce any number of new colors, but the color palettes for distinct arrays must be disjoint.

  • Invariants are assertions over the program variables of the situation. We can express invariants using standard mathematical and logical notation. E.g., “\(0\le \mathrm {k}\le \mathrm {N}\)” expresses that the value of \(\mathrm {k}\) is between 0 and \(\mathrm {N}\), inclusively. For asserting that a collection of variables form apartitioning we prefer to use Reynolds-style partition diagrams. The basic partition diagram is a rectangular contour:

    where i and j are integer expressions over the program variables. It stands for the predicate “\(i<j\)”. The bounds may be juxtaposed with respect to the adjacent edge to specify whether they are inclusive or exclusive:

    Conjunctions of partition diagrams, when the upper bound of the predecessor coincides with the lower bound of its successor, may be written in chained form:

    The following abbreviations denote singleton intervals:

    Using partition diagrams, the aforementioned predicate is equivalently expressed as:

    As mentioned, partition diagrams can be embedded in textual formulas; e.g., the invariant of states that \(\mathrm {A}\) is sorted.

  • Colorings are pictorial invariants similar to partition diagrams, but appear as colored regions rather than as contours. The basic form is

    where c is the chosen color of the regions (for contrast, we chiefly pick , and ). It stands for the partial definition of a coloring function over the integer interval (ij]. Colorings allow the same syntactic shorthands as partition diagrams (bound juxtaposition, chaining and singleton intervals). For example, the following coloring asserts that the coloring function takes the value between 0 (inclusive) and k (exclusive):

    For compact representation, partitioning and coloring invariants may be drawn overlapping when their bounds coincide. For example, the invariant of situation \(\triangleright \) is the conjunction of a partitioning and a coloring:

  • Variants are written in the upper right corner of intermediate situations that are part of a loop. To verify termination, we need to show that the variant (\(\mathrm {N}-k\)) is decreased by each transition through the situation and does not decrease below the lower bound (0).

Finally, we note that situations can be nested. Nesting is conjunctive: an inner situation inherits all declarations, with the exception of variants, from the enclosing situations.

3 Verification of Pictorial Invariant Diagrams

An invariant diagram is correct iff it is consistent, terminating and live. A transition t from a situation satisfying predicate p to situation satisfying predicate q is consistent if \(p\Rightarrow \mathsf {wp}(t,q)\) is true, where \(\mathsf {wp}\) is the weakest precondition transformer. For termination, we check that the variant v decreases and that its lower bound is maintained on re-entry to the situation, i.e., \(v=v_{0}\wedge p\Rightarrow \mathsf {wp}(t,0\le v<v_{0})\). A situation satisfying p is live if at least one outgoing transition is always enabled, i.e, \(p\Rightarrow \mathsf {wp}(t,g_{1}\vee \dots \vee g_{n})\), where \(g_{1},\dots ,g_{n}\) are the guards of outgoing transitions. Next, we describe how the pictorial elements of a situation (legends and colorings) combine into a predicate onto which these rules can be applied. For a formal treatment of the proof rules themselves, see [7].

For a given situation s, let \(\overline{x}\) be the declared variables, \(\overline{T}\) their types, and \(\overline{a}\) the subset of \(\overline{x}\) containing only the variables of array type. The coloring function associated with a variable \(A\in \overline{a}\) of type \(\mathbf{array }\mathrm {[N]}\) in situation s is a total function from the program state and an array index

where the set \(C_{s,A}\) is the color palette associated with A in s, and is a special value indicating that no coloring has been specified. The coloring function formalizes the mapping between legends and invariants, is fully defined, and is intended to be fully eliminated from the final VC. Given the colorings declared for array A in situation s:

where \(c_{1},\dots ,c_{n}\in C_{s,A}\), the coloring function is defined as:

Disjointness of partitioning means that the if-conditions are mutually exclusive, and the else-clause ensures that the function is total. A legend declaration for variable a in situation s has the general form:

where \(c_{1},\dots ,c_{n}\in C_{s,a}\) and p is a predicate on the program state. Semantically, this legend stands for the following predicate:

$$\begin{aligned}{}\begin{array}[t]{lcccl} \mathsf {lgd}_{s,A}(\overline{x}) &{} = &{} (\forall i_{1},\dots ,i_{n},j_{1},\dots ,j_{n} &{} . &{} (0\le i_{1}<j_{1}<\mathrm {N})\wedge \dots \wedge (0\le i_{n}<j_{n}<\mathrm {N})\\ &{} &{} &{} \wedge &{} (\forall k\;.\;i_{1}<k\le j_{1}\Rightarrow \mathsf {col}_{s,A}(\overline{x})(k)=c_{1})\\ &{} &{} &{} \vdots \\ &{} &{} &{} \wedge &{} (\forall k\;.\;i_{n}<k\le j_{n}\Rightarrow \mathsf {col}_{s,A}(\overline{x})(k)=c_{n})\\ &{} &{} &{} \Rightarrow &{} p(\overline{x},i_{1},\dots ,i_{n},j_{1},\dots ,j_{n})) \end{array} \end{aligned}$$

That is, a legend is an assertion that p holds for subintervals of A matching the sequence of colorings given in the legend. Like invariants, legends are conjunctive.

To verify a diagram, we generate a theory including the coloring functions, legend predicates and invariants of each situation, and a lemma to be proved for each transition. For example, the theory of \(\triangleright \) in Fig. 1 contains the following declarations (for brevity, in we have omitted the inner quantifications, as the ranges are singletons in both cases):

To generate the VCs for situation , we can now apply the proof rules of IBP, taking the conjunction of and as the situation predicate. For example, to prove that the loop transition “\([\mathrm {t-s>1];\;k:=(s+t)\;\mathsf {div}\;2;\;[x>A(k)];\;s:=k}\)” is consistent we will need to discharge the following VC:

Additionally, to prove that the same transition is decreasing the variant of :

Note that the antecedents are identical to those of the consistency VC. Finally, the liveness condition for situation is:

The VCs can now be discharged using an automatic theorem prover.

4 Example: Insertion Sort

Figure 2 shows an invariant diagram interpretation of insertion sort. It consists of an outer loop () maintaining a sorted partition (green), and an inner loop () moving the next element from the unsorted partition into its correct position in the sorted partition. The inner loop, as it moves the element back one step per iteration, maintains two sorted partitions (green and blue). The control flow transfers from the inner to the outer loop when the concatenation of the partitions becomes sorted. The outer loop terminates when every element of the array has been processed. Transitions must additionally ensure that \(\mathrm {A}\) is a permutation of the original \(\mathrm {A_{0}}\).

Fig. 2.
figure 2

Insertion sort (Color figure online)

Like invariants, legends are inherited from outer situations. For instance, that any two adjacent green elements are sorted is visible to both and . We note that legends may introduce new colors limited in scope to the declaring situation and its nested situations. For example, blue introduced by the legend of is visible only within . The coloring functions, legend predicates and invariant predicates for situations and are shown in Fig. 3. Given these functions and predicates, the VCs for the transitions are formulated as described in the previous section (omitted here for brevity). The VCs are automatically proved by Why3 and its associated SMT solvers Z3 [8] and CVC4 [9].

Fig. 3.
figure 3

Coloring function, invariant and legend predicate of situations and . (Color figure online)

5 Related Work

Reynolds [4] introduced interval and partition diagrams to express constraints on arrays. Gries’s seminal textbook [3] uses array pictures in several examples. Astrachan [2] suggests pictorial representations of arrays and linked lists. Ginat [10] considers loop invariants as mathematical games, with emphasis on the heuristics of invariant identification. Some recent approaches have explored transforming invariant problems into games [11, 12] and crowdsourcing verification to online communities. Partitioning has been employed in static analysis and heuristics-driven loop invariant generation [13, 14]. Reasoning on range predicates is the basis of the axiomatic rules on array manipulations for correctness proofs of programs involving arrays in [15]. The converse problem, generating visual representations from textual specifications, has been addressed in the context of the Z language [16], and also with the purpose of visualizing VCs on arrays [17]. While pictures and colors are a staple in algorithm animation, we are not aware of prior work combining partitionings and colorings for formal reasoning.

6 Conclusions and Future Work

In this paper, we have introduced a pictorial language for invariants over arrays. The language extends two existing visual formalisms: the notation for invariants and predicates builds on Reynold’s partition diagrams, extending them with colorings to connect partitions with desired properties; the language for specifying the invariant structure and program statements is invariant diagrams, extended with a hierarchical mapping of colorings to predicates. Partition diagrams, colorings and legends seem to be rather expressive visual constructs, allowing many common array invariants to be stated.

This work is in its initial phases with multiple directions to be explored. First and foremost, tool support (in the form of editors and VC generators) would be needed for practical use. Existing tools for IBP [18] do not support the array-specific visual notations introduced here. Secondly, we would like to generalize the approach to more advanced data structures, such as trees and graphs. One challenge here is finding equally expressive and intuitive visual partitioning notations to state invariants over these non-linear data structures. Thirdly, we believe that colorings could serve runtime visualization and animation by overlaying the colors on a data structure instance picture, and analogously, to produce color-coded counterexamples during verification.