
1 Introduction

In previous work, we designed a new language called Cogent [9] for easing the verification of certain classes of systems code such as file systems. Cogent is a linearly-typed, pure, polymorphic, functional language with a certifying compiler. We used it in separate work to write two Linux filesystems, \(\mathtt {ext2}\) and \(\mathtt {BilbyFs}\), and achieved performance comparable to their native C implementations [2].

From a Cogent program the Cogent compiler produces three artefacts: C code, a shallow embedding of the Cogent program in Isabelle/HOL [8], and an Isabelle/HOL proof relating the two. The compiler certificate is a series of language-level proofs and per-program translation validation phases that are combined into one top-level theorem in Isabelle/HOL. The most involved phase, and the phase we discuss in this paper, is the translation validation phase relating Cogent’s imperative semantics to the generated C.

We present a refinement framework that enables the full automation of this phase of Cogent’s certifying compilation. This framework has several components that relate Cogent values, states, types, and statements to their C counterparts. We put significant proof engineering work into enabling the framework to bridge the gap between the Cogent store and the C heap semantics. Moreover, we introduced the idea of partial type erasure to eliminate linearity information from a Cogent type in order to relate it to the corresponding C type. Furthermore, to relate Cogent and C statements, we developed a refinement calculus which contains a set of compositional proof rules. Given a program, our framework then customises the proof rules based on the values, types, and states that are used in this program. Finally, our refinement tactic applies the customised rules in a syntax-directed manner, certifying the refinement for this phase.

The method scales to significant Cogent code size, as demonstrated in the two Linux filesystems [2] mentioned above. A snapshot of our work is available online [1].

Fig. 1. An overview of the verification chain and our refinement framework.

2 Overview and Background

This section explains the contribution of this paper within the broader Cogent project. The heart of the Cogent project is its certifying compiler. The certificate the compiler produces is a refinement theorem relating the generated shallow embedding and the generated C code. To ensure the C code behaves correctly at the binary level, it can be compiled by CompCert [7]. The code also falls within the C subset handled by Sewell et al.’s gcc translation validator [12], which can be made to compose directly with our compiler certificate.

The shallow Isabelle/HOL embedding is convenient for manual reasoning; however, the compiler additionally produces a deep embedding of each Cogent program, for the sake of structuring the generated certificate theorem and proof. There are two formal semantics for this deep embedding: (1) a functional value semantics where programs evaluate to values and (2) an imperative update semantics where programs manipulate references to mutable global state.

The left side of Fig. 1 summarises the generated program representations and the breakdown of the compiler certificate. The program representations are (from the bottom of Fig. 1): the C code, the semantics of the C code [13] expressed in Simpl [11], which is a generic imperative language inside Isabelle/HOL, the same expressed as a monadic program [4], an A-normal [10] deep embedding of the Cogent program, and a shallow embedding. Several theorems rely on the Cogent program being well-typed, which we prove automatically using type inference information from the compiler.

The labelled arrows and the arrow from C to Simpl represent refinement proofs and the arrow labels correspond to the numbers in the following description. The only arrow that is not verified is the one crossing from C code into Isabelle/HOL at the bottom of Fig. 1 — this is the C parser [13], which is a mature tool used in a number of large-scale verifications [5]. It could additionally be checked by Sewell et al.’s gcc translation validation tool.

We briefly describe each intermediate theorem, starting with Simpl at the bottom. For well-typed Cogent programs, we automatically prove the following four theorems, which together form the compiler certificate:

① The C parser’s Simpl code corresponds to a monadic representation of the C code.

② The monadic code terminates and is a refinement of the update semantics of the Cogent deep embedding. To relate Cogent’s linear type system to the monadic one, we introduce the reusable idea of partial type erasure.

③ If a Cogent deep embedding evaluates in the update semantics, it evaluates to the same result in the value semantics.

④ If the Cogent deep embedding evaluates in the value semantics then the Cogent shallow embedding evaluates to a corresponding shallow Isabelle/HOL value.

In order to prove high-level functional correctness, an additional step is necessary:

Arrow ⑤ indicates verification of user-supplied abstract data types (ADTs) implemented in C and manual high-level proofs on top of the shallow embedding. We demonstrated that this step is enabled by the previous steps for two real-world filesystems [2].

Step ③ is a consequence of linear types. It is a general property of the language and has been proven manually, once and for all [9]. Steps ①, ②, and ④, as well as their respective proofs, are generated by our compiler for every program. The proof for step ① is generated by an adjusted version of the AutoCorres tool [4]. For steps ② and ④ we define compositional refinement calculi which enable the automation of the proofs. The most involved refinement proof is the one for step ②, which we present in this paper. It took about three person-years to develop tools for automating this proof. The calculus for step ④ is similar but much simpler, as at this stage one does not reason about the state. In comparison, its development took only a few person-weeks.

The right side of Fig. 1 expands on the refinement framework used for proving step ②. The bottom layer represents the underlying theory we developed for defining primitive value and type relations which we use to create a refinement calculus between Cogent deeply embedded expressions and corresponding monadic statements. The middle layer represents the proof tools that automate the refinement proof on a per-program basis. These proof tools rely on the underlying theories about the language in general, and on compiler generated theories specific to the program. In particular, we have a tool for generating non-primitive data relations, one for specialising complex rules in the calculus to support automation, and finally a proof automation tactic which composes the proof rules to provide a fully-automatic refinement proof.

2.1 Cogent

Cogent is a restricted, polymorphic, higher-order, and purely functional language with linear types. The linear types ensure that resources such as memory are disposed of correctly without run-time support like garbage collection. Crucially for us, they also allow Cogent to be compiled into efficient C, including destructive updates to values rather than the repeated copying common in purely functional styles.

Fig. 2. Definitions for Cogent fragment.

Variables of linear type must be used exactly once. This means each active mutable heap object has exactly one active pointer in scope at any point in the program. Hence, the difference between a destructive update and a pure copy-update is unobservable.
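This observation can be made concrete with a small Python model (ours, not part of the Cogent toolchain): when exactly one reference to a heap object is live, a pure copy-update and a destructive in-place update are observationally equivalent.

```python
# Illustrative model: with exactly one live reference to an object, a pure
# copy-update and a destructive in-place update cannot be told apart.

def copy_update(heap, ptr, field, value):
    """Pure semantics: allocate a fresh object with the field replaced."""
    new_ptr = max(heap) + 1
    heap = dict(heap)                      # the heap itself is copied too
    heap[new_ptr] = {**heap[ptr], field: value}
    del heap[ptr]                          # linearity: the old pointer is consumed
    return heap, new_ptr

def inplace_update(heap, ptr, field, value):
    """Compiled semantics: mutate the object behind the pointer."""
    heap[ptr][field] = value
    return heap, ptr

h1, p1 = copy_update({0: {"f": 1}}, 0, "f", 42)
h2, p2 = inplace_update({0: {"f": 1}}, 0, "f", 42)
# The single remaining reference sees the same object in both semantics.
assert h1[p1] == h2[p2] == {"f": 42}
```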

The Cogent compiler generates C code, a shallow embedding, and a collection of “hints” used by the proof tactic to certify the compilation. Importantly, the performance of the generated C is comparable to carefully handwritten C.

Cogent’s certifying compilation makes the verification of filesystems more cost-effective, fully automating a significant part of the low-level proofs. We demonstrate this on two real-world Cogent filesystems, with a minimal TCB [2].

This paper focuses on the lower-level generated refinement proofs, which connect Cogent’s update semantics to C. Figure 2 introduces a relevant fragment of Cogent. Many features of the full language are omitted here and described in detail elsewhere [9], including polymorphism, sum types, and the foreign function interface. The following gives a brief summary.

Much of the syntax presented in our fragment is standard for a functional language, such as handling control flow (if) and local bindings (let). The main point of difference is Cogent’s record system: some care is needed to reconcile record types and linear types. If a record contains at least one linear field, the whole record must itself be of linear type; otherwise, the linear field could be duplicated simply by sharing the record.

Accessing records becomes more complex as well. For instance, assume that \(\mathtt {Object}\) is a type synonym for a record type containing an integer and two (linear) buffers of type \(\mathtt {Buf}\). Let us say we want to extract the field \(\text {b}_1\) from an \(\mathtt {Object}\). If we extract just a single \(\mathtt {Buf}\), we have implicitly discarded the other buffer \(\text {b}_2\). However, we cannot return the entire \(\mathtt {Object}\) along with the \(\mathtt {Buf}\), as this would introduce aliasing. Our solution is to return along with the \(\mathtt {Buf}\) an \(\mathtt {Object}\) where the field \(\text {b}_1\) cannot be extracted again, and to reflect this in the field’s type by marking it as taken. This field extractor, written \(\mathbf {take}\ x\ \{\text {f} = y\} = e_1\ \mathbf {in}\ e_2\), operates as follows: given a record \(e_1\), it binds the field f of \(e_1\) to the variable y, and the new record to the variable x in \(e_2\). If that field is linear, it will then be marked as unavailable, or taken, in the type of the new record x.

Conversely, we also introduce a \(\mathbf {put}\) operation, which, given a record with a taken field, allows a new value to be supplied in its place. A \(\mathbf {put}\) expression returns the record in \(e_1\) where the field \(\text {f}\) has been replaced with the result of \(e_2\). Unless the type of the field f allows it to be discarded, the field must already be taken, to avoid accidentally destroying our only reference to a linear resource.
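The interplay of these two operations can be sketched in Python; the dictionary-based record with an explicit set of taken fields is our own illustrative encoding, not the compiler’s representation.

```python
# Hypothetical model of take/put: a record tracks which of its fields are
# "taken", mirroring how Cogent's type system marks extracted linear fields.

def take(record, field):
    """Extract `field`; return its value and a record where it is taken."""
    assert field not in record["taken"], "field already taken"
    new = {"fields": dict(record["fields"]),
           "taken": record["taken"] | {field}}
    return record["fields"][field], new

def put(record, field, value):
    """Supply a new value for a taken field, making it available again."""
    assert field in record["taken"], "cannot overwrite a live linear field"
    return {"fields": {**record["fields"], field: value},
            "taken": record["taken"] - {field}}

obj = {"fields": {"n": 3, "b1": "buf1", "b2": "buf2"}, "taken": frozenset()}
b1, obj2 = take(obj, "b1")              # b1 extracted, marked taken in obj2
assert b1 == "buf1" and "b1" in obj2["taken"]
obj3 = put(obj2, "b1", "buf1'")         # a new value fills the taken slot
assert obj3["fields"]["b1"] == "buf1'" and "b1" not in obj3["taken"]
```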

We distinguish boxed records stored on the heap from unboxed records that are passed by value. Unboxed records can be created using a simple struct literal. Boxed records are created by invoking an externally-defined C allocator function. For these allocation functions, it is often convenient to allocate a record with all fields already taken, to indicate that they are uninitialised: a function for allocating \({\mathtt {Object}}\)-like records might therefore return an \(\mathtt {Object}\) with every field marked as taken.

Also included in a record type is the storage mode of the type. A record is stored on the heap when its associated mode m is not unboxed. For boxed records, the storage mode distinguishes between those that are writable vs. read-only.

Fig. 3. Example function in Cogent. \(flip\) updates a record on the heap in place.

Example 1

Figure 3 defines a simple function in Cogent which, given a mutable record x, first takes the field f and, depending on its value, destructively updates the field with a new value, returning the updated record.

The details of Cogent’s type system, semantics, and this proof are presented in [9]; we only repeat the top-level concepts here.

The dynamic big-step update semantics maps a triple of environment U, expression e, and mutable store \(\mu \) to a result value u and a new mutable store \(\mu '\), written \(U \vdash e\ |\ \mu \Downarrow _u u\ |\ \mu '\). The rules [9] for variables and \(\mathbf {let}\) are straightforward. Functions are top-level functions in Cogent, and a function name simply evaluates to the lambda-expression it represents. The \(\mathbf {take}\) and \(\mathbf {put}\) rules evaluate as described above.
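As an illustration, the following toy Python evaluator implements this big-step judgement for a slice of the fragment; the tuple-based AST and constructor names are hypothetical, not the compiler’s deep embedding.

```python
# Toy big-step evaluator: eval_u(U, e, mu) returns (u, mu'), modelling
# the judgement  U |- e | mu  ⇓u  u | mu'.

def eval_u(env, e, store):
    tag = e[0]
    if tag == "var":                     # look up x in U; store unchanged
        return env[e[1]], store
    if tag == "lit":
        return e[1], store
    if tag == "let":                     # let x = e1 in e2
        _, x, e1, e2 = e
        v, store = eval_u(env, e1, store)
        return eval_u({**env, x: v}, e2, store)
    if tag == "if":                      # if c then e1 else e2
        _, c, e1, e2 = e
        b, store = eval_u(env, c, store)
        return eval_u(env, e1 if b else e2, store)
    if tag == "put":                     # put: field update through pointer
        _, e1, f, e2 = e
        p, store = eval_u(env, e1, store)
        v, store = eval_u(env, e2, store)
        store = {**store, p: {**store[p], f: v}}
        return p, store
    raise ValueError(tag)

expr = ("let", "x", ("put", ("var", "r"), "f", ("lit", 9)), ("var", "x"))
p, mu2 = eval_u({"r": 0}, expr, {0: {"f": 1}})
assert p == 0 and mu2[0] == {"f": 9}
```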

The static semantics include the standard typing judgement \(\varGamma \vdash e : \tau \). Unlike conventional type systems, linear type systems are substructural, which means that the context \(\varGamma \) cannot be treated merely as a set of assumptions that always grows as one descends into the syntax tree. Instead, assumptions may also be removed from the context. This complication occasionally requires us to generalise the corres rules presented in Sect. 3.4 to carry multiple typing assumptions over different contexts.

To state type preservation for Cogent, we define a corresponding typing judgement for dynamic values, together with a generalisation of it to environments and contexts. With this, we can prove the following (see also [9]).

Theorem 1

(type preservation). For a program e, if \(\varGamma \vdash e : \tau \), the environment U and store \(\mu \) are well-typed with respect to \(\varGamma \), and \(U \vdash e\ |\ \mu \Downarrow _u u\ |\ \mu '\), then the result u is a well-typed value of type \(\tau \) in the new store \(\mu '\).

Fig. 4. Partial type erasure of dynamic typing relation for update semantics.

For a Cogent value, e.g. a record, to be well-typed, all pointers accessible from it must be valid. This is important for proving safety, but becomes cumbersome when showing refinement to C, as there exist values in the C code, such as those for taken fields, which may include temporarily invalid pointers. We therefore include additional information in each Cogent value, called its representation, which provides enough type information to determine the corresponding C type, without requiring recursive descent into the heap. In other words, the representation shown in Fig. 4 contains only the type information which is pertinent to C, with the linearity information erased. We call this technique partial type erasure. The value typing relation ensures that the representation information agrees with the value’s type.
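A Python sketch of partial type erasure follows, under an assumed encoding of types in which record fields carry a taken flag and records carry a storage mode; the encoding and the name `repr_` are ours.

```python
# Sketch of partial type erasure: map a Cogent-like type to a representation
# that forgets linearity information (taken marks, writable vs read-only)
# but keeps everything that determines the C layout.

def repr_(ty):
    tag = ty[0]
    if tag == "prim":                        # e.g. ("prim", "U8")
        return ("prim", ty[1])
    if tag == "record":                      # ("record", fields, mode)
        _, fields, mode = ty
        erased = tuple((f, repr_(t)) for f, t, _taken in fields)
        boxed = mode != "unboxed"            # boxed vs unboxed is kept: it
        return ("record", erased, boxed)     # changes the C type (pointer/value)
    raise ValueError(tag)

live  = ("record", (("f", ("prim", "U8"), False),), "writable")
taken = ("record", (("f", ("prim", "U8"), True),),  "readonly")
# Types differing only in linearity information erase to the same
# representation, so both relate to the same C struct type.
assert repr_(live) == repr_(taken)
```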

2.2 AutoCorres and C Monads in Isabelle/HOL

We use the C-to-Simpl [13] parser to provide a formal semantics for the generated C code. In principle, we could work from the C parser’s output directly; however, this would mean dealing with the details of its low-level memory model. Instead, we opt to work with a typed heap model, provided by AutoCorres [4]. Specifically, the \( state \) of the AutoCorres monadic representation contains a set of typed heaps, each of type \(\tau \ \text {ptr} \Rightarrow \tau \), one for each type \(\tau \) used on the heap in the C input program.

As AutoCorres was designed for human-guided verification, it uses many context-sensitive rules to simplify the generated code. As we aim to verify code automatically, we switch off most of these simplification stages in order to obtain predictable output.

AutoCorres generates shallow embeddings of code in the nondeterministic state monad of Cock et al. [3]. In this monad, computation is represented by functions of type \( state \Rightarrow (\alpha \mathrel {\times } state ) \; set \mathrel {\times } bool \). Here \( state \) is the global state of the monadic program, including global variables, while \(\alpha \) is the return-type of the computation. A computation takes as input the global state and returns a set, \( results \), of pairs with new state and result value. Additionally the computation returns a boolean, \( failed \), indicating whether there potentially was undefined behaviour.
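A toy Python model of this monad is given below, using lists in place of Isabelle’s sets; the names `ret`, `bind`, `guard`, `gets` and `modify` mirror the monadic operations that appear later in this paper, but the model itself is our own sketch.

```python
# Model of the nondeterministic state monad: a computation maps a state to
# (collection of (result, state) pairs, failed flag).

def ret(x):                        # return: one result, state unchanged
    return lambda s: ([(x, s)], False)

def bind(m, f):                    # do v <- m; f v od
    def run(s):
        results, failed = m(s)
        out = []
        for v, s1 in results:
            rs, fl = f(v)(s1)
            out += rs
            failed = failed or fl
        return out, failed
    return run

def guard(p):                      # failure (undefined behaviour) if p fails
    return lambda s: (([(None, s)], False) if p(s) else ([], True))

def gets(f):                       # read a projection of the state
    return lambda s: ([(f(s), s)], False)

def modify(f):                     # transform the state
    return lambda s: ([(None, f(s))], False)

# Increment the value at heap location 0, guarded by pointer validity.
prog = bind(guard(lambda s: 0 in s),
            lambda _: bind(gets(lambda s: s[0]),
                           lambda v: modify(lambda s: {**s, 0: v + 1})))
results, failed = prog({0: 41})
assert not failed and results == [(None, {0: 42})]
bad, failed2 = prog({})            # invalid pointer: the guard flags failure
assert failed2 and bad == []
```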

As C does not guarantee that all pointer locations are valid, AutoCorres emits \( is\text{- }{}\!valid \) guards before each memory access. When proving refinement between Cogent and monadic code, we need to discharge those guards using a state invariant (Sect. 3.2).

Figure 5 shows an example AutoCorres specification; the monadic keywords it uses (\(\mathbf {return}\), \(\mathbf {do}\,\dots \,\mathbf {od}\), \(\mathbf {gets}\), \(\mathbf {modify}\), \(\mathbf {guard}\), and \(\mathbf {condition}\)) are explained where they first appear in Sect. 3.4.

3 Refinement Framework

Recall that for a well-typed Cogent program, the compiler emits C code, a deep embedding of the program’s semantics, and a proof that the C code correctly refines this embedding. We choose C as a compilation target because most existing systems code is written in C, and thanks to tools like CompCert and gcc translation validation, our C subset has formalised semantics and an existing formal verification infrastructure.

Fig. 5. Intermediate representations of the Cogent function from Fig. 3. Left: A-normalised source code, embedded into Isabelle/HOL. Right: AutoCorres monadic semantics for generated C code.

The right side of Fig. 1 provides an overview of the generation of our refinement proof. To phrase the refinement statement, we first define how deeply-embedded Cogent values relate to values in the monadic embedding (Sect. 3.2).

The C code generation is straightforward and this step itself does not perform global optimisations or transformations. Such transformations, for instance A-normalisation, are performed in earlier compiler phases. A-normalisation in particular is performed to simplify code generation, but it also simplifies our C refinement. Since it is performed early (and verified early on top of the shallow embedding [9]), it is sufficient for us to only consider Cogent expressions in A-normal form here, where nested subexpressions are replaced with explicit variable bindings. With this, the refinement calculus contains a set of compositional \(\mathbf {corres}\) proof rules, typically one for each A-normal Cogent construct, which are applied automatically in a syntax-directed manner (Sect. 3.4).
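The A-normalisation transformation itself can be sketched as follows; the tuple-based AST is hypothetical and this is not the compiler’s implementation.

```python
# Sketch of A-normalisation: nested subexpressions are lifted into explicit
# let-bindings, so every operator is applied only to variables.
import itertools

fresh = (f"_t{i}" for i in itertools.count())   # supply of fresh names

def anf(e):
    """Hypothetical AST: variables are strings, applications are tuples."""
    if isinstance(e, str):
        return e
    op, *args = e
    bindings, atoms = [], []
    for a in args:
        a = anf(a)
        if isinstance(a, str):                  # already atomic
            atoms.append(a)
        else:                                   # lift into a let-binding
            v = next(fresh)
            bindings.append((v, a))
            atoms.append(v)
    out = (op, *atoms)
    for v, b in reversed(bindings):
        out = ("let", v, b, out)
    return out

# f (g x) y  ==>  let t = g x in f t y
r = anf(("f", ("g", "x"), "y"))
assert r[0] == "let" and r[2] == ("g", "x") and r[3][0] == "f"
```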

The \(\mathbf {corres} \) proof rules depend on preconditions about the expected state of the program, such as preconditions about the type and validity of pointers in the heap. We propagate the conditions similarly to the proof calculus of Cock et al. [3]. Our refinement theorem does not need an explicit assumption of well-typedness for the whole Cogent program — the proof tactic will simply fail for programs that are ill-typed.

Since our \(\mathbf {corres} \) proof rules are specialised to Cogent and to the operation of the compiler, we can predict the form of their preconditions and design proof rules to combine them. This forms the basis for automation.

3.1 Refinement Statement

We define refinement generically between a monadic computation \(p_m\) and a Cogent expression e, evaluated under the update semantics. We denote the refinement predicate \(\mathbf {corres}\). The state relation R changes for each Cogent program, so we parametrise \(\mathbf {corres}\) by an arbitrary state relation R. It is additionally parametrised by the typing context \(\varGamma \) and the environment U, as well as by the initial update semantics store \(\mu \) and monadic shallow embedding state \(\sigma \).

Definition 1

(correspondence)

Definition 1 states for well-typed stores \(\mu \) that if the state relation R holds initially, then the monadic computation \(p_m\) cannot fail and, moreover, for every execution of \(p_m\) there must exist a corresponding execution of the expression e under the update semantics such that the final states are again related by R and the value relation \( val\text{- }{}\!rel \) holds between the results of e and \(p_m\). We present the state and value relations in Sect. 3.2.

AutoCorres proves that if the monadic code never fails, then the C code is type- and memory-safe, and is free of undefined behaviour [4]. We prove non-failure as a side-condition of the refinement statement, essentially using Cogent’s type system to guarantee C memory safety during execution. The \(\mathbf {corres}\) predicate can compose with itself sequentially: it both assumes and shows the relation R, and the additional typing assumptions are preserved thanks to type preservation (Theorem 1).
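An executable reading of Definition 1 can be given for toy models in which the update semantics is a function from stores to the collection of possible (value, store) results; the encoding below is ours and only sketches the shape of the statement.

```python
# Executable reading of Definition 1: corres holds if, from R-related initial
# states, the monadic program cannot fail, and every monadic result is matched
# by a Cogent execution with R-related final states and val_rel-related values.

def corres(R, eval_e, p_m, val_rel, mu, sigma):
    if not R(mu, sigma):
        return True                  # nothing to show if R fails initially
    results, failed = p_m(sigma)     # monadic runs: (list of (v, state), flag)
    if failed:
        return False                 # the monadic code must not fail
    cogent_runs = eval_e(mu)         # all (u, mu') the update semantics allows
    return all(any(R(mu2, s2) and val_rel(u, v)
                   for u, mu2 in cogent_runs)
               for v, s2 in results)

# A trivial instance: both sides return the value stored at location 0.
R = lambda mu, s: mu == s
eval_e = lambda mu: [(mu[0], mu)]
p_m = lambda s: ([(s[0], s)], False)
val_rel = lambda u, v: u == v
assert corres(R, eval_e, p_m, val_rel, {0: 7}, {0: 7})
```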

3.2 Data Relations

For each program, based on a library for primitive types, we generate a set of relations between the values, types and heaps of the Cogent and monadic code. We denote these as \( val\text{- }{}\!rel \), \( type\text{- }{}\!rel \) and \( \mathcal {R} \) respectively.

We must give these relations separate definitions for each Cogent type, because each C struct type is embedded as a distinct Isabelle/HOL record. We use Isabelle’s ad-hoc overloading mechanism for this.

Recall that AutoCorres generates different typed heaps for each C type. The type relation \( type\text{- }{}\!rel \) is used by the state relation \( \mathcal {R} \) to select the corresponding typed heap for each Cogent type. It is defined using the \(\mathbf {repr}\) function (Fig. 4) which performs partial type erasure, unifying Cogent types that differ only in linear annotations in order to relate them to the same C type.

Given \( val\text{- }{}\!rel \) and \( type\text{- }{}\!rel \) for a particular Cogent program, the state relation \( \mathcal {R} \) defines the correspondence between the store \(\mu \) over which the Cogent update semantics operates, and the state \(\sigma \) of the monadic shallow embedding. This relation is made into an invariant in \(\mathbf {corres} \) (Sect. 3.1); it allows us to show that all C pointer accesses satisfy \( is\text{- }{}\!valid \), whenever there are corresponding objects in the Cogent store \(\mu \).

Definition 2

(state relation). \((\mu ,\sigma ) \in \mathcal {R} \) if and only if for all pointers p in the domain of \(\mu \), there exists a value v in the appropriate heap of \(\sigma \) (as defined by \( type\text{- }{}\!rel \)) at location p, such that \( val\text{- }{}\!rel \ (\mu \, p)\ v\) holds.

Generating Data Relations. We generate \( \mathcal {R} \), \( val\text{- }{}\!rel \) and \( type\text{- }{}\!rel \) after obtaining the monadic program and its typed heaps from AutoCorres. Our Cogent compiler outputs a list of \(({{\textsc {Cogent}}}, \text {C})\) type pairs, which is used by an Isabelle/ML procedure to generate the needed relations.

Example 2

The program in Fig. 5 uses the types \(\mathtt {U8}\), \(\mathtt {Bool}\) and \(\{\text {f} \, {:}{:}\, \mathtt {U8}\}\), which correspond to the C types \(\mathtt {word8}\), \(\mathtt {bool}\) and \(\mathtt {rec}_1\), respectively. For \( val\text{- }{}\!rel \) and \( type\text{- }{}\!rel \), the \(\mathtt {U8}\)–\(\mathtt {word8}\) relation can be defined a priori, but \(\mathtt {bool}\) and \(\mathtt {rec}_1\) only come into existence with the monadic program, so their data relations are generated dynamically.

Note that the \( val\text{- }{}\!rel \) definition for \(\{\text {f} \, {:}{:}\, \mathtt {U8}\}\) depends on the definition for its field of type \(\mathtt {U8}\). The Cogent compiler always outputs the type list in dependency order, so this does not pose a problem.

The state relation \( \mathcal {R} \) cannot be overloaded in the same way as \( val\text{- }{}\!rel \) and \( type\text{- }{}\!rel \), because it relates the heaps for every type simultaneously. We introduce an intermediate state relation, \( heap\text{- }{}\!rel \), which relates a particular typed heap with a portion of the Cogent store. Like the other relations, this intermediate relation can make use of type-based overloading. Following Definition 2, we define \( heap\text{- }{}\!rel \) for each type \(\tau \) that appears on the heap as follows:

where \(\mathbf {vrepr}\) gives the partially-erased type for a value, similar to \(\mathbf {repr}\). The state relation \( \mathcal {R} \) is then the conjunction of \( heap\text{- }{}\!rel \) over all typed heaps \(\sigma _{\tau _k}\).
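The layered construction of the state relation can be sketched in Python as follows; store entries are tagged with their (erased) type, and all names are hypothetical.

```python
# Sketch of the layered state relation: one heap_rel per C type, with the
# overall state relation the conjunction over all typed heaps.

def heap_rel(val_rel, mu, typed_heap, ty):
    """Every store entry of (erased) type ty must have a related value in
    the corresponding typed heap at the same location."""
    return all(p in typed_heap and val_rel(v, typed_heap[p])
               for p, (v, vty) in mu.items() if vty == ty)

def state_rel(val_rels, mu, sigma):
    """Conjunction of heap_rel over all typed heaps in the monadic state."""
    return all(heap_rel(val_rels[ty], mu, heap, ty)
               for ty, heap in sigma.items())

mu = {0: (1, "word8"), 1: (True, "bool")}        # Cogent store, typed entries
sigma = {"word8": {0: 1}, "bool": {1: 1}}        # typed heaps; C bool as 1/0
val_rels = {"word8": lambda u, v: u == v,
            "bool":  lambda u, v: v == (1 if u else 0)}
assert state_rel(val_rels, mu, sigma)
```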

3.3 Refinement Theorem

We state the overall top-level C refinement theorem below. In addition to the assumptions listed here, it also assumes that \(\mathbf {corres}\) holds for all the foreign functions used in the program.

Theorem 2

Let f be a Cogent function, with type \(\tau \) and body e. Let \(p_m\) be the monadic embedding of its generated C code. Let u and \(v_m\) be arguments of appropriate type for f and \(p_m\) respectively. Then:

$$ \begin{array}{l} \forall \mu \ \sigma . \; val\text{- }{}\!rel \ u \ v_m \longrightarrow \mathbf {corres}\ \mathcal {R} \ e\ (p_m\ v_m)\ (x \mapsto u)\ (x : \tau )\ \mu \ \sigma \end{array} $$

Example 3

In Fig. 5, \(f = flip\), \(p_m = flip_C\), and \(\tau = \tau ' = \{\text {f} \, {:}{:}\, \mathtt {U8}\}\).

3.4 Refinement Proof

This section describes the main components of the refinement proof automation, as shown in Fig. 1: the proof calculus used to relate Cogent and C programs, the generation of well-typedness theorems for Cogent, and the automated tactic that combines these two components to perform the overall refinement proof.

Refinement Calculus. Figure 6 depicts the \(\mathbf {corres}\) rules in our calculus for variables, \(\mathbf {let}\), \(\mathbf {if}\), and for \(\mathbf {take}\) and \(\mathbf {put}\) expressions for boxed records. The full calculus is available online [1] under \(\mathtt {c-refinement/COGENT\_Corres.thy}\). The proofs of the \(\mathbf {corres}\) rules for compound expressions rely on Theorem 1 to infer value well-typedness.

Fig. 6. Some of the important \(\mathbf {corres}\) rules.

The assumptions for these rules fall under three main groups:

  1. Well-typedness assumptions; we generate typing theorems to discharge these.

  2. Assumptions relating the values and mutable heaps of Cogent and C. Once a C program is read and concrete data relations (Sect. 3.2) are defined, we specialise the \(\mathbf {corres} \) rules to simplify these assumptions.

  3. \(\mathbf {corres} \) assumptions on sub-expressions, discharged through our proof automation.

The rules Var and Let correspond respectively to the two basic monadic operations \(\mathbf {return}\), which yields values, and \(\mathbf {do}\,\dots {;} \ \dots \,\mathbf {od} \), for sequencing computations.

Observe that Let is compositional: to prove that \(\mathbf {let}\ x = e_1\ \mathbf {in}\ e_2\) corresponds to \(\mathbf {do}\,e_1' {;} \ e_2'\,\mathbf {od} \), we must prove that (1) \(e_1\) corresponds to \(e_1'\) and (2) \(e_2\) corresponds to \(e_2'\) when each is executed over corresponding results \(v_u\) and \(v_m\) (e.g. as yielded by \(e_1\) and \(e_1'\) respectively). This compositionality, which is present throughout our calculus, significantly simplifies the automation of the refinement proof.

The If rule relates \(\mathbf {if}\ c\ \mathbf {then}\ e_1\ \mathbf {else}\ e_2\) expressions to monadic \(\mathbf {condition}\ ( bool \, c' \ne 0)\ e_1' \ e_2' \) statements. It works similarly to Let, requiring an equivalence between c and \(( bool \,c' \ne 0)\), and correspondences between \(e_1\) and \(e_1'\), and between \(e_2\) and \(e_2'\). Note that we represent booleans in C using a struct \(\mathbf {bool}\) with an integer field named \( bool \); we avoid C’s builtin type \(\mathtt {\_Bool}\) because it may be an alias for an existing integer type like \(\mathtt {U8}\) and therefore indistinguishable from that integer type.

The more intricate rules in Fig. 6 are \(\textsc {Put}\) and \(\textsc {Take}\), which apply to \(\mathbf {put}\) and \(\mathbf {take}\) on boxed records (additional rules exist for unboxed records). Recall that boxed records are stored on the heap and are subject to the linear typing rules. These two rules are involved and contain many assumptions. They are mainly presented here to illustrate to the reader why we have a separate phase later on dedicated to simplifying them.

The Put rule handles the correspondence between \(\mathbf {put}\) expressions and \((\mathbf {do}\,\_ \leftarrow \mathbf {guard} \, (\lambda \sigma .\ is\text{- }{}\!valid \, \sigma \, p') {;} \ \_ \leftarrow \mathbf {modify} \, h; \ e_3'\,\mathbf {od})\) statements. Note that unlike \(\mathbf {let}\), \(\mathbf {if}\), and \(\mathbf {take}\), \(\mathbf {put}\) does not contain a continuation. Therefore, the compiler ensures that \(\mathbf {put}\) expressions always appear within \(\mathbf {let}\) expressions, which allows us to have a compositional rule for \(\mathbf {put}\) in the same style as the other operators.

Recall that if \(e_1\) is a pointer p, \(\mathbf {put}\) updates the field \(f_k\) of the record pointed to by p to the value of \(e_2\). Similarly, the monadic code asserts that the corresponding \(p'\) is a valid pointer, then modifies the record at \(p'\) in h. At this stage h and \( is\text{- }{}\!valid \) are left unspecified, as these rules are defined generically regardless of type. Therefore, our Put rule additionally includes a number of assumptions describing the expected properties of h and \( is\text{- }{}\!valid \). In the next subsection, we specialise this rule to eliminate these assumptions.

Take is similar: it relates \(\mathbf {take}\) expressions to

$$(\mathbf {do}\, \_ \leftarrow \mathbf {guard}\ (\lambda \sigma .\ is\text{- }{}\!valid \ \sigma \ p') {;} \ y' \leftarrow \mathbf {gets}\ f'; \ e' \ y'\,\mathbf {od})$$

statements. Recall that \(\mathbf {take}\) removes the field \(\text {f}_k\) from \(e_1\), binds it to a new variable y and runs \(e_2\). The \(\mathbf {corres}\) assumptions of Take are that (1) \(p'\) and \(e_1\)’s value are related, and (2) given related values \(v_u\) and \(v_m\), \(e_2\) corresponds to \(e'_2\ v_m\) under the extended value environment (\(f_k\mapsto \ v_u, e_1\mapsto p\, (\mathtt {Ptr}\ r), U\)). We need to re-add \(e_1\) to U because it is linear and cannot be reused.

Generating Specialised Rules. As mentioned earlier, we generate program-specific proof rules for operators involving specific C types, such as \(\mathbf {take}\) and \(\mathbf {put}\). This is because the set of C types, different for each program, is shallowly embedded into Isabelle/HOL. Thus, the assumptions for rules involving those types can only be discharged once the C code has been parsed into Isabelle/HOL.

We could prove these assumptions while applying the \(\mathbf {corres} \) rules, but this would be inefficient for rules that are applied many times. Thus, we generate specialised rules in a separate preprocessing phase. Implemented as an Isabelle/ML program, this phase reads the \(({\textsc {Cogent}}, \text {C})\) type list used for generating data relations to produce rules for the appropriate C and Cogent types.

Example 4

For the Cogent record \(\{\text {f} \, {:}{:}\, \mathtt {U8}\}\) in Fig. 5, we generate specialised rules for \(\mathbf {take}\) and \(\mathbf {put}\), in which the generic assumptions of Fig. 6 are instantiated to the concrete C type \(\mathtt {rec}_1\).

Note that the cumbersome record-update assumptions from Fig. 6 have been reduced to \( val\text{- }{}\!rel \) and \( type\text{- }{}\!rel \) statements. This is only possible after we obtain the concrete program and its data relations. We also instantiate the state relation \( \mathcal {R} \) and show that \(\mathbf {take}\) and \(\mathbf {put}\) preserve it, allowing us to simplify the heap-update assumptions.

Well-Typedness. The Cogent compiler proves, via an automated Isabelle tactic, that the deep embedding of the input program is well-typed. Specifically, it shows for each function f with argument x, body e, and type \(\tau _1 \rightarrow \tau _2\), that \(x \mapsto \tau _1 \vdash e : \tau _2\).

Recall that the type system is substructural, and that proving refinement requires access to the typing judgements for each sub-expression of the program. To solve this, the Cogent compiler instructs Isabelle to store all intermediate typing judgements established during type checking. These theorems are stored in a tree structure, isomorphic to the Cogent program’s type derivation tree. Each node is a typing theorem for a program sub-expression, and can be retrieved by the refinement proof tactic as it descends into the program.
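This retrieval scheme can be sketched as follows; the tree encoding and the toy typing judgement are ours, standing in for Isabelle theorems indexed by the type derivation.

```python
# Sketch: typing judgements cached in a tree isomorphic to the syntax tree,
# so a proof tactic can fetch the judgement for any sub-expression by the
# path it takes while descending into the program.

def build_typing_tree(e, judgement):
    """judgement(e) -> (type, sub-expressions); build the derivation tree."""
    ty, subs = judgement(e)
    return {"expr": e, "type": ty,
            "children": [build_typing_tree(s, judgement) for s in subs]}

def lookup(tree, path):
    """Retrieve the stored type at a path of child indices."""
    for i in path:
        tree = tree["children"][i]
    return tree["type"]

# Toy judgement for  let x = 1 in x : U8
def judge(e):
    if e == ("lit", 1): return ("U8", [])
    if e == ("var", "x"): return ("U8", [])
    if e[0] == "let": return ("U8", [e[2], e[3]])
    raise TypeError(e)

tree = build_typing_tree(("let", "x", ("lit", 1), ("var", "x")), judge)
assert lookup(tree, []) == "U8" and lookup(tree, [0]) == "U8"
```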

Proof Automation. The core of our refinement prover is an Isabelle/ML tactic that proves the \(\mathbf {corres} \) refinement theorem (Sect. 3.3) for each Cogent function in the program, by applying the \(\mathbf {corres} \) rules previously proven, both generic and specialised (Sect. 3.4). This algorithm is straightforward as our rules are syntax-directed.

The tactic also expands definitions of \( val\text{- }{}\!rel \) and \( type\text{- }{}\!rel \) (Sect. 3.2) in order to discharge data relation assumptions in those \(\mathbf {corres} \) rules, and retrieves the type derivation tree for the given Cogent function to discharge all well-typedness assumptions.

Example 5

For \(flip\) in Fig. 5, we wish to prove the \(\mathbf {corres} \) refinement theorem relating it to its compiled C code.

The first step of the proof applies the specialised \(\mathbf {take}\) rule for \(\{\text {f}\,{:}{:}\,\mathtt {U8}\}\) (Sect. 3.4). After discharging its typing and \( val\text{- }{}\!rel \) assumptions, we are left with a \(\mathbf {corres} \) obligation on the remainder of the function, which can in turn be solved using the other proof rules.

Our tactic handles single functions easily, but extending it to whole programs required significant proof engineering effort, as we must handle function calls both to externally defined C functions and to (potentially higher-order) Cogent functions.

Foreign functions. Cogent code depends on calls to foreign C functions to perform loops and I/O. Our framework requires these functions to be well-behaved, i.e. they respect Cogent’s termination order and do not break the Cogent type system (e.g. by modifying variables they do not have access to).

Foreign functions are user-supplied and not verified automatically. Thus, when proving refinement theorems for Cogent code that calls these functions, we automatically insert assumptions that they are well-behaved. These assumptions remain until they are resolved by manual verification.

Whole-program refinement. Cogent is a total language and does not permit recursion, so we have, in principle, a well-ordering on function calls in any program. However, for higher-order functions, this well-ordering is non-obvious and difficult to work with.

In practice, most function calls in systems code are direct calls to first-order functions. For such functions, we can simply prove the \(\mathbf {corres} \) theorems in bottom-up fashion, starting from the leaf functions and ending at the top-level functions.
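Concretely, this bottom-up order can be read off the (acyclic) first-order call graph. The sketch below is hypothetical: it assumes a graph mapping each internal function to its callees, where callees absent from the graph stand for foreign functions whose \(\mathbf {corres}\) theorems are assumed rather than proved here:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: compute a bottom-up proof order from a
   first-order call graph. A function is ready once every internal
   callee is already proved; callees absent from the graph are foreign
   and never block progress. Totality guarantees acyclicity. */

#define MAX_CALLS 8

typedef struct {
    const char *name;
    const char *calls[MAX_CALLS];   /* NULL-terminated callee list */
} graph_node;

static int in_list(const char *x, const char **xs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(xs[i], x) == 0) return 1;
    return 0;
}

static int is_internal(const char *x, const graph_node *g, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(g[i].name, x) == 0) return 1;
    return 0;
}

/* Writes the proof order into out[0..n-1]; returns 0 on success,
   -1 if the graph is cyclic (impossible for a Cogent program). */
int bottom_up_order(const graph_node *g, size_t n, const char **out)
{
    size_t proved = 0;
    int done[64] = {0};   /* assumes n <= 64 for this sketch */

    while (proved < n) {
        int progress = 0;
        for (size_t i = 0; i < n; i++) {
            if (done[i]) continue;
            int ready = 1;
            for (size_t c = 0; g[i].calls[c] != NULL; c++)
                if (is_internal(g[i].calls[c], g, n) &&
                    !in_list(g[i].calls[c], out, proved))
                    ready = 0;
            if (ready) {
                out[proved++] = g[i].name;
                done[i] = 1;
                progress = 1;
            }
        }
        if (!progress) return -1;   /* cycle */
    }
    return 0;
}
```

Leaves are emitted first, so by the time a caller’s \(\mathbf {corres}\) theorem is proved, the theorems for all of its callees are already available.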

There is one major exception: Cogent code cannot express loops using only first-order functions. Our Cogent programs use iteration combinators, which are second-order foreign functions that take a Cogent function pointer as the loop body (similar to the map or fold combinators in functional programming).
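For illustration, such a combinator can be modelled as a fold over a bounded counter. The following is a hypothetical C sketch, not the actual Cogent library API; in the real setting the combinator is a foreign C function and the body is a Cogent function passed as a function pointer:

```c
#include <stdint.h>

/* Hypothetical model of an iteration combinator: a second-order C
   function that folds the loop body over the counter values in
   [from, to). The body plays the role of the Cogent function pointer
   that the real foreign combinator receives. */
typedef uint32_t (*body_fn)(uint32_t i, uint32_t acc);

uint32_t seq_fold(uint32_t from, uint32_t to, body_fn body, uint32_t acc)
{
    for (uint32_t i = from; i < to; i++)
        acc = body(i, acc);
    return acc;
}
```

The loop itself lives in C, where termination is bounded by the counter, while the body remains a first-order Cogent function amenable to the refinement proof.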

Therefore, our framework also supports second-order calls to foreign functions. Before assuming \(\mathbf {corres} \) for these functions, we first prove \(\mathbf {corres} \) for the argument function (i.e. the loop body).

This technique allows us to automate refinement for code with first- and second-order calls. While this restriction means that not all Cogent programs can be verified in our framework, we developed the Cogent code for two file system drivers [2] within this fragment, demonstrating that substantial programs can be written in this subset.

4 Related Work

To date, the largest trustworthy compilation projects are the CompCert [7] C compiler and the CakeML [6] ML environment. In contrast to Cogent, they compile general-purpose programming languages and rely more heavily on verified compilation passes.

CompCert translates (a subset of) C to binary while our compiler translates the functional Cogent language to C. CompCert’s core compilation process is verified and its optimisation passes are validated; the compiler executable itself is extracted from Coq into OCaml. There is ongoing work to validate the Coq code extraction process and the OCaml compiler for CompCert.

We chose to use certificates for most of Cogent’s compiler passes, because our proof tools for C run in Isabelle directly, and our Cogent compiler is written in Haskell, which does not have a formal semantics nor a verified runtime at present. On one hand, processing the certificates is time-intensive. On the other hand, we do not need to trust the code extractor, nor the runtime for the extracted language. We do need to either trust the C compiler or use a verified one.

Cogent is closer to CakeML in that it is a high-level source language. However, Cogent targets a different application area. CakeML is a Turing-complete dialect of ML with complex semantics, and is suited for application code. On the other hand, Cogent is a restricted language of total functions with simple semantics that facilitate equational reasoning. Cogent avoids the need for a large runtime and a garbage collector so it can be used for embedded systems code, especially layered systems code with minimal sharing such as the control code of filesystems or network protocol stacks.

5 Take Away Lessons and Future Work

When designing the certifying compiler, we made a trade-off by writing the Cogent compiler tool-chain in Haskell, while the proof component was written in Isabelle’s Standard ML environment. This divide allows the Cogent tool-chain to be used outside the theorem prover, and allows the proof tools to build on the existing C parser and AutoCorres framework.

On the other hand, this choice leads to complexity in designing the interface between these components. This is illustrated by our well-typedness proof of Sect. 3.4, where the Cogent compiler generates a certificate with the necessary type derivation hints. Initially, we used a naïve format consisting of the entire derivation tree, resulting in gigabyte-sized certificates. We implemented various compression techniques to reduce the certificates to a reasonable size (a few megabytes). It is possible to avoid these certificates entirely by duplicating the type inference algorithm in Isabelle/ML, but this would increase the code maintenance burden.

Although reusing the C parser and AutoCorres is desirable, they take a long time to process our verbose generated C code: 12 CPU hours to translate the \(\mathtt {ext2}\) filesystem into a monadic embedding, and 32 CPU hours for \(\mathtt {BilbyFs}\). Further proof optimisation is needed.

Optimisation of the generated code is another topic for future work. High-level Cogent-to-Cogent optimisations will be easy, as they can be verified over the shallow embedding of Cogent using equational rewriting. For instance, we verified A-normalisation using rewriting; while it is not an optimisation, it is an example of a code transformation that does not affect the Cogent-to-C proof. For low-level optimisations, we rely on the C compiler so as not to complicate our syntax-directed proof approach.

6 Conclusions

We developed a compositional refinement calculus and proof tools to create a fully automatic refinement certificate from Cogent’s update semantics to C, including the use of partial type erasure to relate Cogent’s expressive types to simpler C types. This refinement certificate is the most involved step in the full automation of the overall compiler certificate. Through the co-generation of code and proofs, our framework significantly reduces the cost of reasoning about efficient C code, by automatically discharging cumbersome safety obligations, and providing an embedding more amenable to verification. Our framework has been applied successfully to two real-world file systems.