1 Introduction

The theory of institutions dates to Joseph Goguen and Rod M. Burstall’s 1984 paper [7] and the subsequent more detailed analysis in 1992 [8]. An institution is a mathematical realisation of the notion of “logical system” which does not commit to any single concrete system. The key insight is that many general results about logical systems do not depend in any interesting way on the details of that system.

In her PhD thesis [6], Marie Farrell uses the theory of institutions to provide a semantics for the Event-B formal modelling method, with an eye to addressing some drawbacks of the Event-B language—namely its lack of standardised modularisation constructs. The resulting institution, \( EVT \), was shown by Farrell [6], on paper, to support such constructs.

Indeed, the theory of institutions has been applied to a wide variety of languages and formal methods; CLEAR [2], CSP [15], and UML [10] have been given an institution-theoretic semantics, to name but a few. The Hets tool for heterogeneous specification [12] has the largest single repository of such institutions and their logical relationships, represented mainly by institution morphisms and comorphisms; but as far as we know there are no machine-checked proofs that these constructions are correct. Many of the requirements—checking that categories are really categories, that functors are really functors, as well as satisfaction conditions for institutions and for the (co)morphisms that relate them—amount in some parts to simple bookkeeping, and in other parts to more novel and interesting results.

We therefore provide here a framework in the Coq proof assistant [5] for interactive, machine-assisted proofs about institutions, together with an instantiation of this framework to two institutions: the institution \( FOPEQ \) for first-order predicate logic and the institution \( EVT \) for Event-B. Coq has two properties desirable for this work. First, it is based on a dependent type theory called the calculus of inductive constructions (CIC), which makes it easier to represent mathematical objects and the subtle constraints they impose on one another than in a system without dependent types. Second, it is an interactive rather than an automated proof assistant. The user can design automated tactics that discharge many simple goals, but crucially Coq allows the user to step in and spell out the proofs in detail where necessary. Our framework is available on GitHub at https://github.com/ConorReynolds/coq-institutions.

We build directly on the work of Emmanuel Gunther, Alejandro Gadea, and Miguel Pagano [9] formalizing multi-sorted universal algebra in Agda. We also note other work in this direction in Coq by Venanzio Capretta [3], and by Gianluca Amato, Marco Maggesi, Maurizio Parton, and Cosimo Perini Brogi [1], which makes use of homotopy type theory—but at the time of this writing none goes quite as far as defining institutions or instantiating first-order logic. This is the first such formalization of which we are aware.

We will begin by laying the basic mathematical groundwork for institution theory, multi-sorted universal algebra, and first-order predicate logic, before explaining how these concepts are defined in our Coq developments. First-order logic is a central institution on which many others build (including \( EVT \)), and it provides an appropriate first example. We then give the same treatment to \( EVT \), both as a further case study and as a concrete example of one institution building on another.

2 Mathematical Background

Institutions are based on category theory. A category consists of a collection of objects, and a collection of arrows or morphisms between those objects, subject to some straightforward laws. A functor is a map between categories which preserves the categorical structure—more precisely, it preserves identity morphisms and composition of morphisms. Definitions for these concepts can be found in Emily Riehl’s freely available Category Theory in Context [14]. We only require very light familiarity with categories and functors for this paper.

Definition 1

An institution [7] consists of

  • a category \( \mathsf {Sig}\) of signatures;

  • a sentence functor \( \mathsf {Sen}: \mathsf {Sig}\rightarrow \mathsf {Set}\);

  • a model functor \( \mathsf {Mod}: \mathsf {Sig}^{\mathsf {op}} \rightarrow \mathsf {Cat}\); and

  • a semantic entailment relation \( {\models _{\varSigma }} \subseteq | \mathsf {Mod}(\varSigma ) | \times \mathsf {Sen}(\varSigma ) \) for each \( \varSigma \in \mathsf {Sig}\),

such that for any signature morphism \( \sigma : \varSigma \rightarrow \varSigma ' \), any sentence \( \phi \in \mathsf {Sen}(\varSigma ) \), and any model \( M' \in \mathsf {Mod}(\varSigma ') \), the satisfaction condition holds:

$$\begin{aligned} M' \models _{\varSigma '} \mathsf {Sen}(\sigma )(\phi ) \quad \text {iff} \quad \mathsf {Mod}(\sigma )(M') \models _{\varSigma } \phi \end{aligned}$$

ensuring that a change in signature induces a consistent change in the satisfaction of sentences by models.

The signatures contain non-logical symbols: data types, constants, functions, and so on. The sentence functor explains how to build sentences over the non-logical symbols. The model functor explains how to interpret the symbols in any given signature. The semantic entailment relation explains how to decide if a given sentence is true or false in a given model. The requirement that the signatures form a category, and that the sentence and model constructors are functors, is due to the central concept of a signature morphism, a mapping between signatures—a “change in notation”. If the sentence construction is a functor then we can be sure that signature translations preserve the sentence structure.

The satisfaction condition explains how the components interact with one another, and in particular how they behave under a change in signature. Without such a condition, the semantic entailment relation \( \models \) could behave just as expected on one signature but utterly erratically on another—yet we expect the entailment relation to change only so much with a change in signature. The satisfaction condition guarantees exactly this consistency.

2.1 First-Order Predicate Logic

We provide a brief account of multi-sorted universal algebra and first-order predicate logic, in preparation for a formal encoding in Coq (Sect. 4); see Sannella and Tarlecki’s Foundations of Algebraic Specification [17] for details.

Definition 2

An \( S \) -indexed set is a family of sets \( X = (X_s)_{s \in S} \).

Definition 3

A signature is a 3-tuple \( \langle S, \mathscr {F}, \mathscr {P} \rangle \) where \( S \) is a set of sorts, \( \mathscr {F} \) is a \( ({{\,\mathrm{\mathsf {List}}\,}}(S) \times S) \)-indexed set of function symbols, and \( \mathscr {P} \) is a \( {{\,\mathrm{\mathsf {List}}\,}}(S) \)-indexed set of predicate symbols.

Here \( {{\,\mathrm{\mathsf {List}}\,}}(A) \) is just as expected: the set of all finite sequences of elements from \( A \). The idea is that a symbol \( F \in \mathscr {F}_{w,s} \) has arity \( w \) and result sort \( s \), and a predicate symbol \( P \in \mathscr {P}_{w} \) has arity \( w \) and no result sort (since it represents a predicate). If the signature is clear from context, we instead write \( F : w \rightarrow s \) for function symbols and \( P : w \) for predicate symbols. If a function symbol \( C \) has arity \( \mathsf {nil} \) and result sort \( s \), then it is called a constant symbol and we denote it \( C : s \).

As a running example, let \( \mathsf {stackSig} \) be a signature consisting of the symbols required to describe a stack. It has two sorts \( \mathsf {elem} \) and \( \mathsf {stack} \); function symbols \( \mathsf {empty} : \mathsf {stack} \), \( \mathsf {push} : [\mathsf {elem}, \mathsf {stack}] \rightarrow \mathsf {stack} \), and \( \mathsf {pop} : [\mathsf {stack}] \rightarrow \mathsf {stack} \); and a predicate symbol \( \mathsf {isEmpty} : [\mathsf {stack}] \).

Definition 4

Let \( \varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) and \( \varSigma ' = \langle S', \mathscr {F}', \mathscr {P}' \rangle \) be two signatures. A signature morphism \( \sigma : \varSigma \rightarrow \varSigma ' \) consists of a function \( \sigma _{ sorts } : S \rightarrow S', \) which will usually be written \( \sigma \), as well as a pair of functions

$$\begin{aligned} \sigma _{ funcs }&: \prod _{w,s}\ \mathscr {F}_{w,s} \rightarrow \mathscr {F}'_{\sigma (w),\sigma (s)} \\ \sigma _{ preds }&: \prod _{w} \mathscr {P}_{w} \rightarrow \mathscr {P}'_{\sigma (w)} \end{aligned}$$

respectively mapping function symbols and predicate symbols, in such a way that arities and result sorts are translated consistently with \( \sigma _{ sorts } \). We define \( \sigma (w) \) as the action of \( \sigma _{ sorts } \) on each of the sorts in \( w \).

Definition 5

An algebra \( A \) for a signature \( \varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) consists of three functions \( \langle A_{ sorts }, A_{ funcs }, A_{ preds } \rangle \), all of which we denote by \( A \), each respectively interpreting the sorts, function symbols, and predicate symbols as sets, functions, and predicates:

  • for any sort \( s \in S \), \( A(s) \) is a set, which we typically denote \( A_s \);

  • for any \( F \in \mathscr {F}_{w,s} \), we have \( A(F) : A_{w_1} \times \dots \times A_{w_n} \rightarrow A_s \); and

  • for any \( P \in \mathscr {P}_w \), we have \( A(P) \subseteq A_{w_1} \times \dots \times A_{w_n} \).

Algebras give meaning to the symbols in a signature. Consider again our running example \( \mathsf {stackSig} \); we could interpret the sort \( \mathsf {elem} \) as the set \( \mathbb {N}\) of natural numbers, and the sort \( \mathsf {stack} \) as the set \( {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \) of lists of natural numbers; the function symbols \( \mathsf {empty} \), \( \mathsf {push} \), and \( \mathsf {pop} \) as \( \mathsf {nil} \in {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \), \( \mathsf {cons} : \mathbb {N}\times {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \rightarrow {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \), and \( \mathsf {tail} : {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \rightarrow {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \), respectively; and the predicate symbol \( \mathsf {isEmpty} \) as the predicate \( \left\{ {s \mid s = \mathsf {nil}} \right\} \subseteq {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \). We are by no means bound to this interpretation, of course.

Definition 6

Let \( \varSigma \) and \( \varSigma ' \) be signatures, let \( \sigma : \varSigma \rightarrow \varSigma ' \) be a signature morphism, and let \( A' \) be a \( \varSigma ' \)-algebra. The reduct algebra \( A'{\mid }_{\sigma } \) is a \( \varSigma \)-algebra defined at each component of the algebra to be \( A' \circ \sigma \).

Algebras are best thought of (loosely) as functions providing a concrete denotation for the symbols in a signature—functions from symbols to “real” mathematical objects. In the presence of a change in signature \( \sigma : \varSigma \rightarrow \varSigma ' \), a \( \varSigma ' \)-algebra can interpret symbols in \( \varSigma \) by first applying \( \sigma \) and interpreting the resulting \( \varSigma ' \)-symbol; hence we “precompose” \( A' \) by \( \sigma \) to obtain a \( \varSigma \)-algebra. Note that the direction is reversed; we are taking \( \varSigma ' \)-algebras to \( \varSigma \)-algebras using \( \sigma : \varSigma \rightarrow \varSigma ' \). Now is a good time to note the contravariance of the model functor in the definition of an institution: if \( \sigma : \varSigma \rightarrow \varSigma ' \) then \( \mathsf {Mod}(\sigma ) : \mathsf {Mod}(\varSigma ') \rightarrow \mathsf {Mod}(\varSigma ) \).

The following pair of definitions explain how we may build more complex expressions, which we will call terms, out of the basic symbols of a signature.

Definition 7

A set of variables for a signature \( \varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) is an \( S \)-indexed set.

Definition 8

A term over a signature \( \varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) with variables in \( X \) is defined inductively as follows.

  • A variable \( x \in X_s \) is a term of sort \( s \).

  • A constant symbol \( C \in \mathscr {F}_{\mathsf {nil},s} \) is a term of sort \( s \).

  • If \( | w | = n > 0 \), then given terms \( t_1 : w_1, \dots , t_n : w_n \) and a function symbol \( F \in \mathscr {F}_{w,s} \), the expression \( F(t_1, \dots , t_n) \) is a term of sort \( s \).

Terms explain how sorted variables and symbols may be put together. For example, let \( x : \mathsf {elem} \) and \( s : \mathsf {stack} \) be two variables; then \( \mathsf {push}(x, s) : \mathsf {stack} \) is a valid term; as is \( \mathsf {push}(x, \mathsf {push}(x, s)) \). But, for example, \( \mathsf {pop}(x) \) is not since \( x \) has the wrong sort.

With terms and algebras defined, all that is left is to define first-order sentences.

Definition 9

Let \(\varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) be a signature. The sentences of first-order logic are built from the logical symbols \( = \), \( \rightarrow \), \( \lnot \), \( \wedge \), \( \vee \), \( \forall \), \( \exists \). The atomic sentences are

  • \( u = v \) for terms \( u \) and \( v \) with the same sort; and

  • \( P(t_1, \dots , t_n) \) for any predicate symbol \( P \in \mathscr {P}_w \) and terms \( t_i \).

The sentences in general are defined inductively as follows:

  • Any atomic sentence \( \phi \) is a sentence.

  • The expressions \( \lnot \phi \), \( \phi \rightarrow \psi \), \( \phi \wedge \psi \), \( \phi \vee \psi \), \( \forall x.\ \phi \) and \( \exists x.\ \phi \), for any sentences \( \phi , \psi \) and variable \( x \), are all sentences.

We can now write sentences like \( \forall x.\ \forall s.\ \mathsf {pop}(\mathsf {push}(x,s)) = s \). The interpretation of first-order sentences is defined by induction on the sentence structure. We will give a more precise account in Sect. 4.

3 Institutions in Coq

Coq is an interactive proof assistant for higher-order logic based on a dependent type theory called the calculus of inductive constructions (CIC). Dependent type theories allow for an elegant representation of complex mathematical objects such as those found in category theory, institution theory, and universal algebra. All of the work presented here is fully formalised in Coq.

We depend on a formalization of category theory by John Wiegley [20]. Morphisms between objects x and y are denoted x ~> y, and functors between categories C and D are denoted C ⟶ D. The generic form of an institution can be defined directly as a dependent record, sketched below.
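The following sketch shows the shape of this record. The field names, and the stand-in category SetCat of Coq types and functions, are illustrative rather than the development's exact code; the notations come from Wiegley's library.

```coq
(* Shape of the Institution record (illustrative field names).
   [SetCat] stands for the category of Coq types and functions;
   [~>], [⟶], [^op] and [fmap[F]] are notations from Wiegley's
   category-theory library. *)
Record Institution := {
  Sig : Category ;
  Sen : Sig ⟶ SetCat ;
  Mod : Sig^op ⟶ SetCat ;
  interp : forall Σ : Sig, Mod Σ -> Sen Σ -> Prop ;
  satisfaction :
    forall (Σ Σ' : Sig) (σ : Σ ~> Σ') (φ : Sen Σ) (M' : Mod Σ'),
      interp Σ' M' (fmap[Sen] σ φ) <-> interp Σ (fmap[Mod] σ M') φ
}.
```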

Here interp is the semantic entailment relation \( \models \) for an institution, and SetCat is the category of sets, which here just means the category of Coq types and functions. The name fmap is short for “functor map” and describes the action of a functor on morphisms. For the purposes of this paper, we implement so-called set/set institutions [11], in which the target of \( \mathsf {Mod}\) is \( \mathsf {Set}\), the category of sets, and not \( \mathsf {Cat}\), the category of all small categories.

Our focus for this paper is on an instantiation of this object to \( FOPEQ \), the institution for first-order predicate logic, and \( EVT \), the institution for Event-B defined in [6]. Since \( EVT \) builds on \( FOPEQ \), we will begin with \( FOPEQ \) and work up.

4 First-Order Logic in Coq

We partially build upon a formalization of multi-sorted universal algebra in Agda [9], though we deviate in many of the details. As we define objects in Coq, we will make reference back to their mathematical definitions from Sect. 2.1. We do not show everything, only what we deem crucial to follow the basic idea of the formalization.

4.1 Representing FOL

Signatures (cf. Definition 3) are represented by a dependent record, mirroring the mathematical definition exactly.
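A sketch of this record, under the assumption that field names in the development may differ:

```coq
From Coq Require Import List.
Import ListNotations.

(* Signatures (cf. Definition 3) as a dependent record; arities are
   lists of sorts. *)
Record Signature := {
  Sorts : Type ;
  Funcs : list Sorts -> Sorts -> Type ;
  Preds : list Sorts -> Type
}.
```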

An algebra (cf. Definition 5) for a signature needs to interpret sorts as Coq types and the function and predicate symbols as Coq functions with the right type.

For this we use heterogeneous lists—henceforth h-lists—following Gunther et al. [9]. A heterogeneous list can contain elements of different types, as distinguished from a homogeneous list which contains only elements of a single type. Our definition of h-lists comes from Chlipala’s CPDT [4], where the reader can find a more detailed description of the implementation details.
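The definition is essentially Chlipala's; we also fix the cons notation used later:

```coq
(* Heterogeneous lists, essentially as in CPDT: the index list [w]
   fixes the type of each element. *)
Inductive hlist {A : Type} (B : A -> Type) : list A -> Type :=
| hnil  : hlist B []
| hcons : forall {x xs}, B x -> hlist B xs -> hlist B (x :: xs).

Arguments hnil  {A B}.
Arguments hcons {A B x xs} _ _.

(* h-list cons, written ::: later in the paper. *)
Notation "b ::: v" := (hcons b v) (at level 60, right associativity).
```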

Let \( \mathcal {U} \) be a universe of types. Given an index type \( I : \mathcal {U} \), a list \( w : {{\,\mathrm{\mathsf {List}}\,}}(I) \) and a type family \( A : I \rightarrow \mathcal {U} \) which selects for each \( i : I \) a type \( A_i : \mathcal {U} \), we can build a h-list \( v : \mathsf {HList}(A, w) \) which contains \( | w | \) elements and where the \( i \)th element of \( v \), denoted \( v_i \), has the type \( A_{w_i} \). For example, if \( w = [\mathbb {N}, \mathsf {bool}, \mathsf {string}] \) and if \( A \) is the identity, then \( \langle 3, \mathsf {true}, \text {`hello'} \rangle \) would be a valid h-list of type \( \mathsf {HList}(A, w) \). Another example, more pertinent to our discussion: Consider again our running example \( \mathsf {stackSig} \). Let \( I = \left\{ {\mathsf {elem}, \mathsf {stack}} \right\} \), let \( w = [\mathsf {elem}, \mathsf {stack}] \), and let \( A : I \rightarrow \mathcal {U} \) be defined by \( \mathsf {elem} \mapsto \mathbb {N}\) and \( \mathsf {stack} \mapsto {{\,\mathrm{\mathsf {List}}\,}}(\mathbb {N}) \). Then \( \langle 2, [3, 4] \rangle \) would be a valid term of type \( \mathsf {HList}(A, w) \).

An h-list is a concrete implementation of a kind of dependent \( n \)-tuple; that is to say, \( \mathsf {HList}(A, w) \) is a concrete Coq encoding of the product \( A(w_1) \times \dots \times A(w_n) \). Now, let \( \varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) be a signature, let \( F \in \mathscr {F}_{w, s} \) and let \( A \) be a \( \varSigma \)-algebra. Since \( \mathsf {HList}(A, w) \rightarrow A(s) \cong A(w_1) \times \dots \times A(w_n) \rightarrow A(s) \), we should interpret \( A(F) \) as a function \( \mathsf {HList}(A, w) \rightarrow A(s) \).
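An algebra can then be sketched as follows (field names illustrative; the Arguments directives merely make the signature argument implicit for the projections):

```coq
(* Algebras (cf. Definition 5): sorts as types, function symbols as
   functions on h-lists, predicate symbols as predicates on h-lists. *)
Record Algebra (Σ : Signature) := {
  carriers     : Sorts Σ -> Type ;
  interp_funcs : forall w s,
      Funcs Σ w s -> hlist carriers w -> carriers s ;
  interp_preds : forall w,
      Preds Σ w -> hlist carriers w -> Prop
}.

Arguments carriers     {Σ} _ _.
Arguments interp_funcs {Σ} _ {w s} _ _.
Arguments interp_preds {Σ} _ {w} _ _.
```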

We are so far no different from Gunther et al. [9]. Our first deviation is in the definition of variables and terms (cf. Definition 8).

Variables (cf. Definition 7) are not represented here by members of an indexed set; instead they are dependent de Bruijn indices—see CPDT chapter nine [4]. The \( \mathsf {member} \) type is exactly as it appears there; a term \( i : \mathsf {member}(s, \varGamma ) \) can be thought of as a constructive proof that \( s \) appears at index \( i \) in the list \( \varGamma \). By defining variables this way, we can quite easily define quantifiers which correctly track the locations of free variables, as we will see.
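Concretely, a sketch of the member type and of terms over a context Γ of sorts (cf. Definition 8); constructor names follow CPDT, and the Term constructors are assumptions about the development's naming:

```coq
(* Dependent de Bruijn indices: [member s Γ] proves that s occurs
   somewhere in Γ (CPDT chapter nine). *)
Inductive member {A : Type} (a : A) : list A -> Type :=
| HFirst : forall {xs}, member a (a :: xs)
| HNext  : forall {b xs}, member a xs -> member a (b :: xs).

Arguments HFirst {A a xs}.
Arguments HNext  {A a b xs} _.

(* Terms over Σ with variables drawn from the context Γ. *)
Inductive Term (Σ : Signature) (Γ : list (Sorts Σ)) : Sorts Σ -> Type :=
| var : forall {s}, member s Γ -> Term Σ Γ s
| app : forall {w s}, Funcs Σ w s -> hlist (Term Σ Γ) w -> Term Σ Γ s.

Arguments var {Σ Γ s} _.
Arguments app {Σ Γ w s} _ _.
```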

Signature morphisms (cf. Definition 4) are the cornerstone of institution theory; much of the implementation depends on this definition.
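A sketch of the record, with illustrative field names:

```coq
(* Signature morphisms (cf. Definition 4); arities in the types of
   on_funcs and on_preds are translated by mapping on_sorts. *)
Record SignatureMorphism (Σ Σ' : Signature) := {
  on_sorts : Sorts Σ -> Sorts Σ' ;
  on_funcs : forall w s,
      Funcs Σ w s -> Funcs Σ' (map on_sorts w) (on_sorts s) ;
  on_preds : forall w,
      Preds Σ w -> Preds Σ' (map on_sorts w)
}.

Arguments on_sorts {Σ Σ'} _ _.
Arguments on_funcs {Σ Σ'} _ {w s} _.
Arguments on_preds {Σ Σ'} _ {w} _.
```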

No surprises here, but note that the arities appearing in the types of on_funcs and on_preds must be translated using map on_sorts. With this we can now define reduct algebras (cf. Definition 6).
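A sketch under the same naming assumptions as above:

```coq
(* [reindex] converts between the equivalent types
   [hlist (fun i => B (f i)) w] and [hlist B (map f w)];
   it is computationally the identity. *)
Fixpoint reindex {A A'} (f : A -> A') (B : A' -> Type) {w}
    (v : hlist (fun i => B (f i)) w) : hlist B (map f w) :=
  match v in hlist _ w return hlist B (map f w) with
  | hnil       => hnil
  | hcons b v' => hcons b (reindex f B v')
  end.

(* The reduct algebra A'|σ (cf. Definition 6): interpret a Σ-symbol
   by first translating it along σ. *)
Definition reduct {Σ Σ'} (σ : SignatureMorphism Σ Σ')
    (A' : Algebra Σ') : Algebra Σ := {|
  carriers     := fun s => carriers A' (on_sorts σ s) ;
  interp_funcs := fun w s F args =>
      interp_funcs A' (on_funcs σ F)
                   (reindex (on_sorts σ) (carriers A') args) ;
  interp_preds := fun w P args =>
      interp_preds A' (on_preds σ P)
                   (reindex (on_sorts σ) (carriers A') args)
|}.
```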

Note that the function \( \mathsf {reindex} \) is computationally the identity but converts between the equivalent types \( \mathsf {HList}(A \circ f, w) \) and \( \mathsf {HList}(A, \mathsf {map}\ f\ w) \).

We can now start to build the syntactic and semantic structure of first-order sentences. The syntax is as follows.
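A sketch of the syntax, with some connectives omitted (constructor names are assumptions):

```coq
(* First-order sentences over Σ in context Γ. Quantifiers bind the
   most recently added variable, shrinking the context. *)
Inductive FOL (Σ : Signature) : list (Sorts Σ) -> Type :=
| Equal  : forall {Γ s}, Term Σ Γ s -> Term Σ Γ s -> FOL Σ Γ
| Pred   : forall {Γ w}, Preds Σ w -> hlist (Term Σ Γ) w -> FOL Σ Γ
| Not    : forall {Γ}, FOL Σ Γ -> FOL Σ Γ
| And    : forall {Γ}, FOL Σ Γ -> FOL Σ Γ -> FOL Σ Γ
| Forall : forall {Γ} s, FOL Σ (s :: Γ) -> FOL Σ Γ
| Exists : forall {Γ} s, FOL Σ (s :: Γ) -> FOL Σ Γ.
```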

We omit the other connectives since their definitions are straightforward. Syntactically, a quantifier accepts as argument a sentence whose context contains at least one variable, and binds the most recent one. If \( \psi \) is a sentence with context \( s \mathbin {{:}{:}} \varGamma \), then the sentence \( \mathsf {Q}_s.\ \psi \) is a sentence with context \( \varGamma \), where \( \mathsf {Q} \) is either quantifier. Formally, we have the following syntactic formation rule:

$$\begin{aligned} \frac{\psi : \mathsf {FOL}(\varSigma ,\ s \mathbin {{:}{:}} \varGamma )}{\mathsf {Q}_s.\ \psi : \mathsf {FOL}(\varSigma , \varGamma )} \end{aligned}$$

To interpret a first-order sentence, we must decide what the logical symbols mean and what values the free variables will get. If \( \theta \) is an environment providing values for the variables in \( \varGamma \), then we denote the semantic interpretation of a sentence \( \phi \) with free variables from \( \varGamma \) by an algebra \( A \) with environment \( \theta \) by \( A \vDash ^{\theta } \phi \). Precisely, in the case of the quantifiers, we have

$$\begin{aligned} A \vDash ^{\theta } \mathsf {Forall}_s(\psi )&\quad \text {iff}\quad \text {for all}\ x \in A_s\ \text {we have}\ A \vDash ^{x, \theta } \psi \\ A \vDash ^{\theta } \mathsf {Exists}_s(\psi )&\quad \text {iff}\quad \text {there exists}\ x \in A_s\ \text {such that}\ A \vDash ^{x, \theta } \psi \end{aligned}$$

This setup makes the definition of the semantic entailment relation relatively painless. (The triple-colon operator denotes the cons function for h-lists.)
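A sketch of the interpretation function; eval_term and eval_terms are the term evaluation functions shown after the next paragraph:

```coq
(* Interpretation of sentences by an algebra A under an environment θ
   for the context Γ; quantifiers extend θ with ::: (h-list cons). *)
Fixpoint interp_fol {Σ} (A : Algebra Σ) {Γ} (φ : FOL Σ Γ) {struct φ}
    : hlist (carriers A) Γ -> Prop :=
  match φ in FOL _ Γ return hlist (carriers A) Γ -> Prop with
  | Equal u v   => fun θ => eval_term A θ u = eval_term A θ v
  | Pred P args => fun θ => interp_preds A P (eval_terms A θ args)
  | Not ψ       => fun θ => ~ interp_fol A ψ θ
  | And ψ χ     => fun θ => interp_fol A ψ θ /\ interp_fol A χ θ
  | Forall s ψ  => fun θ => forall x : carriers A s, interp_fol A ψ (x ::: θ)
  | Exists s ψ  => fun θ => exists x : carriers A s, interp_fol A ψ (x ::: θ)
  end.
```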

The institution \( FOPEQ \) requires closed first-order sentences, i.e. sentences of the form \( \psi : \mathsf {FOL}(\varSigma , \mathsf {nil}) \); hence \( A \models \psi \) will really mean \( \mathsf {interp\_fol}(A, \psi , \mathsf {hnil}) \).

The relation above relies on two mutually-defined term evaluation functions, for which we use the Equations library [18].
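A plain-Fixpoint sketch of these functions (the development writes them with Equations; the h-list lookup helpers are our own):

```coq
(* Head, tail, and lookup for h-lists, via the member index. *)
Definition hhead {A} {B : A -> Type} {x xs}
    (v : hlist B (x :: xs)) : B x :=
  match v in hlist _ l
        return match l with [] => unit | y :: _ => B y end with
  | hnil      => tt
  | hcons b _ => b
  end.

Definition htail {A} {B : A -> Type} {x xs}
    (v : hlist B (x :: xs)) : hlist B xs :=
  match v in hlist _ l
        return match l with [] => unit | _ :: ys => hlist B ys end with
  | hnil      => tt
  | hcons _ t => t
  end.

Fixpoint hget {A} {B : A -> Type} {x xs} (i : member x xs)
    : hlist B xs -> B x :=
  match i in member _ xs return hlist B xs -> B x with
  | HFirst   => hhead
  | HNext i' => fun v => hget i' (htail v)
  end.

(* Term evaluation: variables are looked up in θ; applications
   interpret the symbol and recurse on the arguments. *)
Fixpoint eval_term {Σ Γ} (A : Algebra Σ) (θ : hlist (carriers A) Γ)
    {s} (t : Term Σ Γ s) {struct t} : carriers A s :=
  match t in Term _ _ s return carriers A s with
  | var i      => hget i θ
  | app F args => interp_funcs A F (eval_terms A θ args)
  end
with eval_terms {Σ Γ} (A : Algebra Σ) (θ : hlist (carriers A) Γ)
    {w} (ts : hlist (Term Σ Γ) w) {struct ts} : hlist (carriers A) w :=
  match ts in hlist _ w return hlist (carriers A) w with
  | hnil        => hnil
  | hcons t ts' => (eval_term A θ t) ::: (eval_terms A θ ts')
  end.
```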

On variables, it looks up the right value in the environment. On function applications, it interprets the function symbol with the given algebra and calls itself on the function symbol's arguments. It is possible to write this function (and others) without Equations, but Equations gives the best computational behaviour and plays relatively nicely with the proofs.

4.2 Proofs and Proof Strategy

There are some more definitions that are crucial for proving the satisfaction condition for first-order predicate logic. The following mutually-defined functions promote signature morphisms to term translations:
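A sketch of this promotion, continuing with the naming assumptions above:

```coq
(* Translating variables and terms along a signature morphism: on a
   variable only the type changes — the underlying index is intact. *)
Fixpoint member_map {A A'} (f : A -> A') {x xs}
    (i : member x xs) : member (f x) (map f xs) :=
  match i in member _ xs return member (f x) (map f xs) with
  | HFirst   => HFirst
  | HNext i' => HNext (member_map f i')
  end.

Fixpoint on_terms {Σ Σ'} (σ : SignatureMorphism Σ Σ') {Γ s}
    (t : Term Σ Γ s) {struct t}
    : Term Σ' (map (on_sorts σ) Γ) (on_sorts σ s) :=
  match t in Term _ _ s
        return Term Σ' (map (on_sorts σ) Γ) (on_sorts σ s) with
  | var i      => var (member_map (on_sorts σ) i)
  | app F args => app (on_funcs σ F) (map_on_terms σ args)
  end
with map_on_terms {Σ Σ'} (σ : SignatureMorphism Σ Σ') {Γ w}
    (ts : hlist (Term Σ Γ) w) {struct ts}
    : hlist (Term Σ' (map (on_sorts σ) Γ)) (map (on_sorts σ) w) :=
  match ts in hlist _ w
        return hlist (Term Σ' (map (on_sorts σ) Γ)) (map (on_sorts σ) w) with
  | hnil        => hnil
  | hcons t ts' => (on_terms σ t) ::: (map_on_terms σ ts')
  end.
```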

Applying a signature translation to a variable amounts only to a reindexing; all the work is happening at the type level, but the underlying “number” \( i \) doesn’t change. To apply a signature translation to a term, we just apply it to the function symbol and then apply it to all its arguments. Promoting this a level higher to first-order sentences is a simple matter, since the sentence structure will be ignored by signature morphisms.

We will also need to define a custom induction principle for terms; the induction principle automatically generated by Coq is too weak because it is missing a hypothesis in the case where the term has the form \( F(t_1, \dots , t_n) \); namely that the predicate \( P : \prod _{i} A_i \rightarrow \mathsf {Prop} \) holds for all \( t_1, \dots , t_n \). In Coq this is represented by \( \mathsf {HForall}\ P\ \langle t_1, \dots , t_n \rangle \).

Proofs Involving Indexed Types. Consider how to define the composition of two signature morphisms \( \sigma \) and \( \tau \). Doing so directly will result in a type-level mismatch between \( \mathsf {map}\ \tau \ (\mathsf {map}\ \sigma \ w) \) and \( \mathsf {map}\ (\tau \circ \sigma )\ w \). Of course, these terms are propositionally equal via the proof \( p = \mathsf {map\_map}\ \sigma \ \tau \ w \); so we need to mention this to Coq at the point of definition. As an example, here is the definition of the composition of two first-order signature morphisms, simplified for readability.
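A sketch of such a composition, using the rew notation for transport along an identity proof:

```coq
From Coq Require Import List.
Import EqNotations.  (* for rew [P] p in x *)

(* Composition of signature morphisms, simplified: the image of a
   function symbol must be transported along map_map, since
   map τ (map σ w) and map (τ ∘ σ) w are only propositionally equal. *)
Definition compose_morphisms {Σ1 Σ2 Σ3}
    (τ : SignatureMorphism Σ2 Σ3) (σ : SignatureMorphism Σ1 Σ2)
    : SignatureMorphism Σ1 Σ3 := {|
  on_sorts := fun s => on_sorts τ (on_sorts σ s) ;
  on_funcs := fun w s F =>
      rew [fun w' => Funcs Σ3 w' (on_sorts τ (on_sorts σ s))]
          (map_map (on_sorts σ) (on_sorts τ) w)
      in on_funcs τ (on_funcs σ F) ;
  on_preds := fun w P =>
      rew [Preds Σ3] (map_map (on_sorts σ) (on_sorts τ) w)
      in on_preds τ (on_preds σ P)
|}.
```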

Many definitions in our developments take a similar form. Proofs of most propositions involving such terms follow by computation and by induction on the involved identity proofs—an identity proof being a proof of the form \( p : x = y \). We call upon a range of tactics and rewriting strategies for identity proofs, some of which come from the homotopy type theory [19] Coq developments.

Proofs about terms caused the most consternation. Using a lemma that expresses \( \mathsf {map\_on\_terms} \) in terms of \( \mathsf {hmap} \) and \( \mathsf {on\_terms} \), we can expose \( \mathsf {reindex} \), which a further pair of lemmas converts into an explicit transport along an identity proof. This process pulls out the hidden identity proof so that it may be combined with others. We then required one more lemma for converting the hypothesis generated by our custom induction principle for terms into a useful rewrite rule.

There is one more trick we employ, for which we must assume proof irrelevance. Often the subject of an identity proof is of the form \( x \mathbin {{:}{:}} xs \), as in, for example, \( p' : \mathsf {map}\ \mathsf {id}\ (x \mathbin {{:}{:}} xs ) = x \mathbin {{:}{:}} xs \). If one has a proof \( p : \mathsf {map}\ \mathsf {id}\ xs = xs \), then \( \mathsf {f\_equal}\ (\mathsf {cons}\ x)\ p \) is also a proof that \( \mathsf {map}\ \mathsf {id}\ (x \mathbin {{:}{:}} xs ) = x \mathbin {{:}{:}} xs \). By proof irrelevance, \( p' \) and \( \mathsf {f\_equal}\ (\mathsf {cons}\ x)\ p \) are themselves equal, but the point is that the latter form has useful structure that we can exploit. This is not always necessary—often ordinary rewriting is enough—but we found it indispensable in proofs which required more careful manipulation of identity proofs.
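The trick can be stated directly in Coq; this small self-contained example is ours, not the development's code:

```coq
Require Import Coq.Lists.List Coq.Logic.ProofIrrelevance.

(* The restructuring trick: under proof irrelevance, an opaque proof
   p' about x :: xs can be replaced by f_equal (cons x) p, whose
   structure we can compute with. *)
Example restructure {A} (x : A) (xs : list A)
    (p' : map id (x :: xs) = x :: xs)
    (p  : map id xs = xs) :
  p' = f_equal (cons x) p.
Proof. apply proof_irrelevance. Qed.
```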

The Proof of Satisfaction. Throughout the process, we identified at least one non-obvious lemma required for the proof of satisfaction for first-order logic.

Lemma 1

Let \( \sigma : \varSigma _1 \rightarrow \varSigma _2 \) be a signature morphism, let \( t_1 \) be a \( \varSigma _1 \)-term with \( \varSigma _1 \)-context \( \varGamma _1 \), and let \( A_2 \) be a \( \varSigma _2 \)-algebra. Let \( \theta : \mathsf {HList}(A_2{\mid }_{\sigma }, \varGamma _1) \) be a valuation of the variables in \( \varGamma _1 \). Then

$$\begin{aligned} A_{2}^{\sigma (\theta )}(\sigma (t_1)) = (A_2{\mid }_{\sigma })^{\theta }(t_1) \end{aligned}$$

Since \( \theta \) is a h-list, the action of \( \sigma \) on \( \theta \) is just a reindexing; hence we obtain \( \sigma (\theta ) : \mathsf {HList}(A_2, \mathsf {map}\ \sigma \ \varGamma _1) \). The specific requirement generated by the proof of satisfaction, for one of the atomic sentences, \( t_1 = t_2 \), is

$$\begin{aligned} A' \models ^{\sigma (\theta )} (\sigma (t_1 = t_2)) \quad \text {iff} \quad (A'{\mid }_{\sigma }) \models ^{\theta } (t_1 = t_2) \end{aligned}$$

Lemma 1 is a strict strengthening of this requirement, since it shows in fact that the terms under analysis are equal. This lemma handles the atomic sentences; the other cases follow without much trouble.

5 Formalizing \( EVT \)

Readers should consult the backmatter of [16] for a summary of the Event-B language by Thai Son Hoang. Not much, if any, familiarity with the system will be required beyond what we describe here. Event-B machines consist, at the most basic level, of discrete state transitions called events. Our approach is to use first-order sentences to represent updates to the machine state; for example, the variable-update statement \( x := x + 1 \) is written as the first-order sentence \( x' = x + 1 \). The unprimed variables represent the state of the machine before an event, and the primed variables represent the state of the machine after an event.

We will formally describe a more generic institution for Event-B than is presented in Farrell [6], dropping event names from the representation. We will call this institution \( EVT \) where there is no room for confusion. We found that defining \( EVT \) without event names results in simplified constructions and gives us more room for defining potential extensions to \( EVT \)—but, crucially, without changing the details of the proof of satisfaction. For a short account of a formalization that matches Farrell’s more closely, see [13].

First, we’ll define signatures and signature morphisms for \( EVT \).

Definition 10

An \( EVT \) -signature \( \hat{\varSigma } \) is a 3-tuple \( \langle \varSigma , X, X' \rangle \) where \( \varSigma \) is a first-order signature and \( X \) and \( X' \) are \( \mathsf {Sorts}(\varSigma ) \)-indexed sets, such that \( (-)' : X \rightarrow X' \) is an equivalence.

Definition 11

An \( EVT \) -signature morphism \( \hat{\sigma } : \hat{\varSigma _1} \rightarrow \hat{\varSigma _2} \) consists of a first-order signature morphism \( \sigma : \varSigma _1 \rightarrow \varSigma _2 \) and two variable morphisms \( \mathsf {on\_vars} : X_1 \rightarrow X_2 \) and \( \mathsf {on\_vars'} : X'_1 \rightarrow X'_2 \) such that the following diagram commutes.

(Diagram: the square \( \mathsf {on\_vars'} \circ (-)' = (-)' \circ \mathsf {on\_vars} \) commutes, i.e. renaming variables commutes with priming.)

In all cases where not otherwise specified, the \( EVT \)-signature \( \hat{\varSigma } \) is given by \( \langle \varSigma , X, X' \rangle \).

A standard construction in institution theory is the signature extension. We add variables by adding them directly into the signature as constant function symbols.

Definition 12

Let \( \varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) be a first-order signature and let \( X \) be an \( S \)-indexed set. The expansion of \( \varSigma \) by \( X \) is a first-order signature \( \varSigma + X \) which is equal to \( \varSigma \) everywhere except on the constant function symbols; \( \varSigma + X \) has constant symbols \( \mathscr {F}_{\mathsf {nil},s} + X_s \), for each \( s \in S \).

To model signatures of the form \( \varSigma + X \), we need only expand a given \( \varSigma \)-algebra by a valuation \( X \rightarrow A \).

Definition 13

Let \( \varSigma = \langle S, \mathscr {F}, \mathscr {P} \rangle \) be a first-order signature and let \( A \) be a \( \varSigma \)-algebra. Let \( X \) be an \( S \)-indexed set of variables and let \( \theta : X \rightarrow A \) be a valuation of variables. The expansion of \( A \) by \( \theta \) is a \( (\varSigma + X) \)-algebra \( A^{\theta } \), which behaves like \( A \) on symbols from \( \varSigma \) and takes variables \( x \in X_s \) to \( \theta (x) \in A_s \).

Definition 14

A \( \hat{\varSigma } \)-model \( M \) is a 3-tuple \( \langle A,\theta , \theta ' \rangle \), where \( A \) is a \( \varSigma \)-algebra and \( \theta : X \rightarrow A \) and \( \theta ' : X' \rightarrow A \) are valuations of variables.

To illustrate what we have so far, consider the following example. A model for a \( (\varSigma + X + X') \)-sentence consists of a \( \varSigma \)-algebra \( A \) and two valuations of the variables \( \theta : X \rightarrow A \) and \( \theta ' : X' \rightarrow A \), usually referred to as \( \varSigma \)-states: \( \theta \) is the pre-state and \( \theta ' \) is the post-state. One possible model for the sentence \( x' = x + 1 \) consists of the usual algebra for natural numbers, a pre-state \( x \mapsto 2 \), and a post-state \( x' \mapsto 3 \). One possible model for the sentence \( s' = \mathsf {push}(x, s) \) consists of an algebra for a stack of characters, a pre-state \( x \mapsto \text {e} \), \( s \mapsto [\text {v}, \text {t}] \), and a post-state \( s' \mapsto [\text {e}, \text {v}, \text {t}] \). Here, \( x' \) can consistently be assigned anything. If we wish to avoid this, we can assume that sentences \( \psi \) which don't mention a given primed variable \( x' \) are really shorthand for \( \psi \wedge (x' = x) \).

Let us formally define the sentences for \( EVT \). Note that \( \mathsf {FOSen}(\varSigma ) \) denotes the set of all first-order \( \varSigma \)-sentences.

Definition 15

Let \( \hat{\varSigma } \) be an \( EVT \)-signature. A \( \hat{\varSigma } \)-sentence is either an initialization sentence \( \mathsf {Init}(\phi ) \) where \( \phi \in \mathsf {FOSen}(\varSigma + X') \), or an event sentence \( \mathsf {Event}(\phi ) \) where \( \phi \in \mathsf {FOSen}(\varSigma + X + X') \).

Initialization sentences constrain the range of possible initial states for a machine; often only one such state is possible. There is no previous state yet, so any initialization sentence is built over \( \varSigma + X' \). Event sentences explain how an event updates the state, and therefore can access both pre- and post-variables; thus event sentences are built over \( \varSigma + X + X' \).

Finally, let’s define the semantic entailment relation for \( EVT \).

Definition 16

Let \( \hat{\varSigma } \) be an \( EVT \)-signature, \( M = \langle A, \theta , \theta ' \rangle \) a \( \hat{\varSigma } \)-model, and \( \psi \) an \( EVT \)-sentence. We define \( M \models \psi \) by induction on \( \psi \): \( M \models \mathsf {Init}(\phi ) \) if \( A^{\theta '} \models \phi \); and \( M \models \mathsf {Event}(\phi ) \) if \( A^{\theta + \theta '} \models \phi \).

5.1 Representing EVT

We have a much easier job here than we did for first-order predicate logic since \( EVT \) builds directly on \( FOPEQ \). We rely on a couple of major first-order constructions. First, we will define signature extensions by a set of variables (cf. Definition 12); the variables live in the category of indexed types \( I \rightarrow \mathcal {U} \).
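A sketch of the extension, reusing the earlier Signature sketch:

```coq
(* Signature extension Σ + X (cf. Definition 12): X contributes new
   constant symbols; everything else is unchanged. *)
Definition extend_signature (Σ : Signature) (X : Sorts Σ -> Type)
    : Signature := {|
  Sorts := Sorts Σ ;
  Funcs := fun w s => match w with
                      | [] => (Funcs Σ [] s + X s)%type
                      | _  => Funcs Σ w s
                      end ;
  Preds := Preds Σ
|}.
```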

The main part of an algebra expansion (cf. Definition 13) is given by the following function; no other part of the algebra is changed.
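A sketch of the expansion (for convenience we also package it as a full algebra; names are illustrative):

```coq
(* Algebra expansion A^θ (cf. Definition 13): new constants are
   interpreted by the valuation θ, all other symbols by A. *)
Definition expand_interp {Σ X} (A : Algebra Σ)
    (θ : forall s, X s -> carriers A s)
    : forall w s, Funcs (extend_signature Σ X) w s ->
        hlist (carriers A) w -> carriers A s :=
  fun w =>
    match w return forall s,
        Funcs (extend_signature Σ X) w s ->
        hlist (carriers A) w -> carriers A s with
    | [] => fun s F _ =>
        match F with
        | inl F0 => interp_funcs A F0 hnil  (* an original constant *)
        | inr x  => θ s x                   (* a variable *)
        end
    | _ :: _ => fun s F args => interp_funcs A F args
    end.

Definition expand_algebra {Σ X} (A : Algebra Σ)
    (θ : forall s, X s -> carriers A s)
    : Algebra (extend_signature Σ X) := {|
  carriers     := carriers A ;
  interp_funcs := expand_interp A θ ;
  interp_preds := fun w P args => interp_preds A P args
|}.
```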

\( EVT \)-signatures (cf. Definition 10) are represented exactly as they are given mathematically.
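A sketch of the record; we inline the equivalence as a pair of inverse maps with round-trip proofs, though the development may package it differently:

```coq
(* EVT-signatures (cf. Definition 10): a first-order signature with
   indexed sets of plain and primed variables and an equivalence
   between them (illustrative field names). *)
Record EvtSignature := {
  base    : Signature ;
  Vars    : Sorts base -> Type ;
  Vars'   : Sorts base -> Type ;
  prime   : forall s, Vars s -> Vars' s ;
  unprime : forall s, Vars' s -> Vars s ;
  prime_unprime : forall s x, prime s (unprime s x) = x ;
  unprime_prime : forall s x, unprime s (prime s x) = x
}.
```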

The proof that \( (-)' : X \rightarrow X' \) is an equivalence is carried alongside the signature.

For \( EVT \) signature morphisms (cf. Definition 11), we simply define \( \mathsf {on\_vars}' \) in terms of \( \mathsf {on\_vars} \) to simplify matters.

\( EVT \)-models (cf. Definition 14) and \( EVT \)-sentences (cf. Definition 15) also offer no surprises.
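A sketch of both, where event sentences are built over the signature extended by the indexed sum \( X + X' \):

```coq
(* Closed first-order sentences, and EVT-models and sentences
   (cf. Definitions 14 and 15). *)
Definition FOSen (Σ : Signature) := FOL Σ [].

Record EvtModel (E : EvtSignature) := {
  algebra : Algebra (base E) ;
  pre     : forall s, Vars E s -> carriers algebra s ;
  post    : forall s, Vars' E s -> carriers algebra s
}.

Arguments algebra {E} _.
Arguments pre     {E} _ _ _.
Arguments post    {E} _ _ _.

Inductive EvtSen (E : EvtSignature) : Type :=
| Init  : FOSen (extend_signature (base E) (Vars' E)) -> EvtSen E
| Event : FOSen (extend_signature (base E)
                   (fun s => (Vars E s + Vars' E s)%type)) -> EvtSen E.
```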

Finally, the semantic entailment relation for \( EVT \) (cf. Definition 16) defers directly to entailment for \( FOPEQ \).
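A sketch of the deferral, including the stitching helper (which we call join here):

```coq
(* Stitching two valuations into one on X + X'. *)
Definition join {S} {X X' M : S -> Type}
    (θ : forall s, X s -> M s) (θ' : forall s, X' s -> M s)
    : forall s, (X s + X' s)%type -> M s :=
  fun s v => match v with
             | inl x => θ s x
             | inr x => θ' s x
             end.

(* EVT entailment (cf. Definition 16) defers to first-order
   entailment in the appropriately expanded algebra. *)
Definition interp_evt {E} (M : EvtModel E) (ψ : EvtSen E) : Prop :=
  match ψ with
  | Init φ  => interp_fol (expand_algebra (algebra M) (post M)) φ hnil
  | Event φ => interp_fol
      (expand_algebra (algebra M) (join (pre M) (post M))) φ hnil
  end.
```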

Here, join stitches two valuations \( \theta : X \rightarrow M \) and \( \theta ' : X' \rightarrow M \) (with the same target) into \( \varTheta : X + X' \rightarrow M \).

5.2 Proofs and Proof Strategy

Most of the tricks we needed for first-order logic apply just as well here. The main additional proof strategy emerged while proving equality for dependent records.

As an example, let’s consider \( EVT \) signature morphisms. To prove that two \( EVT \) signature morphisms are equal, we need to prove that they are equal componentwise—the first-order signature morphisms have to agree everywhere, as do the two variable morphisms. The proofs that the variable morphisms are equal appear at first to depend on the proof that the base first-order signature morphisms are equal. But actually, that’s not strictly the case; we only need to know that the first-order signature morphisms agree on sorts to prove that the two variable morphisms are equal.

We can write custom equality lemmas which state the dependencies between proofs more precisely. Here is one such lemma for \( EVT \) signature morphisms.
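The general shape of such lemmas, illustrated here on dependent pairs (our own example, not the development's lemma): equality is built componentwise, with the second proof transported along the first. For \( EVT \) signature morphisms the analogous transport runs along the sorts-only proof \( p' \), not the full proof \( p \).

```coq
Import EqNotations.

Lemma sigT_eq {A} {P : A -> Type} {x y : A} {u : P x} {v : P y}
    (p : x = y) (q : rew [P] p in u = v) :
  existT P x u = existT P y v.
Proof. destruct p, q. reflexivity. Qed.
```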

Here we build a proof that two signature morphisms are equal from proofs that they are equal at each of their components. Note that \( q \) depends on \( p' \) only, and not on \( p \). Normally the dependency would be on \( p \)—but \( p' \) is typically much simpler than \( p \) and is all that is necessary. Often \( p' \) is \( \mathsf {refl} \), meaning the transport computes away, simplifying the proofs considerably.

Most other constructions and proofs revolved around signature and model extensions. The following was the main non-trivial lemma which we identified while proving the satisfaction condition for \( EVT \). Note that the following holds for any indexed sets \( X_1 \) and \( X_2 \) and any function \( f : X_1 \rightarrow X_2 \circ \sigma \).

Lemma 2

Let \( \sigma : \varSigma _1 \rightarrow \varSigma _2 \) be a first-order signature morphism, let \( f : X_1 \rightarrow X_2 \circ \sigma \) be a variable morphism, let \( A_2 \) be a \( \varSigma _2 \)-algebra, and let \( \theta _2 : X_2 \rightarrow A_2 \) be a valuation of variables. Then

$$\begin{aligned} (A_2{\mid }_{\sigma })^{\theta _2 \circ f} = (A_2^{\theta _2}){\mid }_{\sigma + f} \end{aligned}$$

We’re taking some liberties with the notation. Note that \( \theta _2 \circ f \) is a shorthand for \( \lambda s,x.\ \theta _2(\sigma (s), f(s, x)) \) and \( \sigma + f : \varSigma _1 + X_1 \rightarrow \varSigma _2 + X_2 \) is a shorthand for the extension of \( \sigma \) by \( f \).

The proof of satisfaction itself proceeds by two cases, both of which are essentially the same, and both of which rely on the satisfaction condition for \( FOPEQ \) and Lemma 2, with \( f \) instantiated to different maps in each.

6 Conclusion

We have detailed the most important points of our formalisation of two institutions in Coq: the institution \( FOPEQ \) for first-order predicate logic, and the institution \( EVT \) for Event-B. According to the coqwc tool, we have over 3,000 significant lines of Coq, not including the many experimental or variant implementation attempts. Initial progress was slow, but the overall approach was successful; the satisfaction condition is fully formalized for both—with some difficulty for \( FOPEQ \), but with far greater ease for \( EVT \). Furthermore, both institutions have many reusable components which will aid in the construction of other concrete institutions and in proving their satisfaction conditions. Proofs involving indexed types in Coq are notoriously difficult, but we suspect that, for the purposes of our formalism, they are all difficult in the same way, so that the lessons we learn here can be applied more generally.

Having a formal framework for defining institutions continues to be useful in our own work. Indeed, we have already begun applying it to the problem of integrating linear-time temporal logic with Event-B specifications. We have also defined some institution-independent constructions not covered here, specifically modal and linear-time temporal logics over an arbitrary institution.

We intend in the future to add more concrete institutions to this framework; to show that both \( FOPEQ \) and \( EVT \) have the amalgamation property; to build more institution-independent constructions; to improve proof automation for institutions; and to define and verify some institution (co)morphisms. This work could also, in time, become a fully formal basis for the work already done for the Hets tool for heterogeneous specification.