Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

The ClawZ toolset is used to verify that automatically generated code from (a subset of) SimulinkFootnote 1 into (a verifiable subset of) Ada is correct [7]. This is achieved by encoding the semantics of the two representations in the Z notation. Correctness of the generation is then ensured by a formal proof of a refinement conjecture from the Z representation of Simulink into the Z representation of Ada. This proof is supported by a very powerful proof tactic for the ProofPower theorem prover called Supertac [7]. This tactic has been developed over a number of years and has a very high degree of automation as it is tailor-made for these types of conjectures.

Whilst Ada is used extensively for avionics software, other sectors, such as automotive, normally use the C programming language. The TargetLink code generator from dSPACEFootnote 2 is able to generate C code from a Simulink model. However, Supertac is configured for Ada and will, as will be shown in the next section, not be able to verify correctness of C generation.

A side-effect of the automation achieved by Supertac is that the code base has become highly complex and large. In fact, it consists of almost 50 K lines of dense ML codeFootnote 3. Adapting it from Ada to C is therefore a non-trivial problem.

Tactic languages, such as the one used to encode Supertac, are often difficult to analyse and debug as the error may manifest itself a different place from where the problem actually lies. This is further complicated by: (1) the non-deterministic nature of tactics, meaning multiple branches can be generated; and (2) that they tend to be untyped in the sense that subgoals cannot be differentiated. This non-deterministic nature makes it more difficult to debug than most software, and often the only method to find the mistakes is to insert “writeln” statements in the code to print information. Needless to say, this can be a hard task for 50 K LoC. As proofs are getting larger and more commonplace, proof maintenance is going to be a problem. Improved debugging features for ML would be beneficial and a step in the right direction, but this alone will not be sufficient: the non-deterministic nature of tactics and the non-trivial flow of sub-goals will require their own solutions.

Fig. 1.
figure 1

The GUI of tinker (left) & an illustrative evaluation step (right)

To ease maintenance, debugging and general understanding of tactics, we have previously developed the PSGraph language [1] and the supporting Tinker tool [2, 5], as can be seen in Fig. 1 (left). Here, proof tactics are encoded as directed hierarchical graphs, where the boxes contain tactics or nested graphs, and are composed by labelled wires. The labels are called goal types and are predicates that describe the expected properties of sub-goals. Each sub-goal becomes a special goal node in the graph, which “lives” on a wire. Evaluation is handled by applying a tactic to a goal node that is placed on one of its input wires. The resulting sub-goals are sent to the out wires of the tactic node. To add a goal node to a wire, the goal type must be satisfied. Figure 1 (top right) illustrates a single evaluation step where tactic C_rewrite_crucial, which is discussed in Sect. 3, is applied to a goal labelled i:

figure a

The tactic produces a goal labelled j on its output wire (bottom right of Fig. 1)

figure b

The goal types introduce a notion of ‘types’ to the tactic language, which improves upon the static properties for composition and evaluation found in the tactic language currently used to encode SupertacFootnote 4. As can be seen in Fig. 1 (left), Tinker provides support for tactic developers to step through the proofs using ‘interactive operations’ to e.g. ‘step over’ or ‘step into’ a tactic. A novel breakpoint feature has recently been introduced [5], where sub-goals are evaluated automatically until a sub-goal reaches the breakpoint. At this point evaluation will stop, and the user can guide the evaluation step-by-step from this point onwards. This is particularly useful for large and complex tactics such as Supertac. We will return to how we have exploited these special Tinker features for this work in Sect. 3. We believe that this support for inspection and adaptations through simple graph visualization is novel.

We will show our initial encoding of Supertac in PSGraph in Sect. 2. We will then show how this is used to analyse Supertac using PSGraph and adapt this to support C code in Sect. 3. We conclude and detail our next steps in Sect. 4.

Fig. 2.
figure 2

Supertac

2 PSGraph Encoding of Supertac for Ada

PSGraph handles modularity and complexity through hierarchies, represented by boxes in the graph that contains sub-graphs. The architecture of Supertac consists of four sub-tactics executed in sequence, where the first of these has been decomposed in a hierarchical node. This is done by unfolding and breaking it, and its sub-tactics, down, and then re-compositing the sub-tactics in PSGraph with proper goal-types. In the future we plan to decompose the remaining three. Figure 2 shows the top-level of our Supertac encoding:

  • structure_tac is used to classify the conjectures and enhancing them with meta-information. In addition, it unpacks structure surrounding conjectures, such as quantifiers.

  • end_proof_tac1 gets rid of Simulink and Ada vocabulary, to make it easier to reason about.

  • end_proof_tac2 deals with mathematical statements, and gets rid of high-level concepts (e.g. functions) to reduce it to set theoretical primitives that are easier to reason about.

  • end_proof_tac3 handles case statements and is mainly used as a brute-force strategy if end_proof_tac2 has not been able to discharge the conjecture.

Fig. 3.
figure 3

The structure_tac sub-tactic of Supertac

Figure 3 shows the tactic nested by the structure_tac node in PSGraph. As a proof of concept, this encoding has been successfully applied to discharge VCs generated from a Nose-Gear Velocity case study [8] in ClawZ.

3 Adapting Supertac to C-Code

The main difference between conjectures generated for Ada and C is that Ada specific semantics in the refinement conjecture has now been replaced by C constructsFootnote 5. The remaining parts, e.g. the encoding of Simulink using Z Notation, are unchanged. The key challenge is thus to replace the parts of Supertac that reduce the Ada vocabulary with pure set theory into parts that reduce the C vocabulary into set theory (and develop these). This is non-trivial, because semantically variables in Ada can be treated simply when aliasing restrictions are in place, however the semantics of a variable in C come with a side condition stating that it is disjoint from other C objects. PSGraph enables users to step through evaluation and see how the goals evolve. This is a very useful when adapting a tactic in this way. To illustrate, the following VC has been generated from a simple C program:

figure c

With the debugging support of Tinker to interactively step through our PSGraph version of Supertac, we were able to pinpoint where in end_proof_tac1 it was assumed that the target programming language constructs need to be eliminated. First we split this sub-tactic into end_proof_tac1_1 and end_proof_tac1_2. The C specific parts will need to be eliminated between these two sub-tactics. To illustrate, after end_proof_tac1_1 our VC looks as follows:

figure d
figure e

where assumption (* 1 *) has been inserted by structure_tac to guide the rest of the proof. The intuition behind this VC is that the C value of comparing with , using the C version of integer comparison operator , is not equal to the C value of 0. When the VC contains C specific parts, then these have to be reduced to set theory before end_proof_tac1_2 executes. As can be seen in Fig. 4 (left), we introduce a new nested tactic called qcz_conversion, and insert it between the two tactics discussed above. VCs containing C vocabulary are identified by the is_qcz_conv predicate, found on the wire leading to the qcz_conversion tactic.

Fig. 4.
figure 4

New end_proof_tac1 (left) & code conversion types (right)

The nested graph of this tactic is shown in Fig. 4 (right). Depending on certain properties of the goal, identified by the wire labels of the PSGraph, one of three tactics may be applied.

Each of them applies some conversions, which are rewrite rules proven from the semantics of the language formalisation, to simplify the goal: C_rewrite_patterns does general simplifications of C constructs; C_rewrite_Int_Boolean deals with special simplifications related to C booleans (represented as integer in C); while C_rewrite_crucial does the crucial steps in eliminating C vocabulary. To illustrate, the following are some of the crucial conversions of C_rewrite_crucial:

figure f

These changes were sufficient to complete the running example, and has been successfully applied to fully automate six handcrafted VCs.

4 Conclusion and Future Work

In this paper we have reported on a successful technology transfer project, where the PSGraph language has been used to start the adaptation of an industrial proof strategy to a new domain – using the state-based Z notation. Through an example, we have illustrated how it has been used to pinpoint and support the development of this adapted proof strategy.

Proofs are becoming more commonplace and increasingly complex. Like with software maintenance, proof maintainence is going to be a problem, as people who did the proof originally will have moved on or retired. For example, the substantial proof effort on the sel4 kernel [4] will be around for decades, and even small changes to the underlying kernel is likely to break many proofs. We have shown the advantages of PSGraph for the maintainence of Supertac; this experience has given us confidence that it has potential to play a supporting role in proof engineering [3] other large proof-based developments.

So far we have only decomposed structure_tac. In the medium term we aim to complete this work by decomposing the remaining tactics and we are hoping that PSGraph can be used to remove some of the “clutter” that has been the result of updating Supertac to new applications. Particular challenges will be to discover new goal types and find suitable graphical representions of some domain specific and non-trivial combinators used in Supertac.

In the long term we would like to re-implement the overall structure of Supertac from scratch, using the existing Supertac components as building blocks. Ideally, we will support both Ada and C. This will enable us to reflect on the intuition and develop a more conscious strategy, where the overall proof plan is clear. For example, we should separate reasoning about the denotational semantics given to expressions from the operational semantics of statements, as these will be tackled in different ways. We believe that this will give a much cleaner proof strategy that would be easier to analyse and adapt for future applications and changes to the modelling language or the target programming language.