Keywords

1 Introduction

A proof of the completeness theorem for a given logic conforms to the Henkin style when it applies nonconstructive methods to build models out of maximal consistent sets of formulas (possibly after a Henkin language extension) using the deductive system itself. Henkin-style completeness proofs for modal logics have been around for over five decades [9] but the formal verification of completeness with respect to Kripke semantics is comparatively recent.

We present a formalization of a Henkin-style completeness proof for the propositional modal logic S5 using the Lean theorem prover. Although the proof is specific to S5, the same techniques can be applied to weaker normal modal systems such as K, T, S4, and B, by forgetting the appropriate extra accessibility conditions (as described in [8]). The formalization covers the syntax and semantics of S5, the deduction theorem, structural rules (weakening, contraction, exchange), the recursive enumerability of the language, and the soundness and completeness theorems. It has approximately 1,500 lines of code (but only two thirds of the development is required for the completeness proof). The full source code is available online. It has been typechecked with Lean 3.4.2.

1.1 Related Work

The use of proof assistants in the mechanization of completeness proofs in the context of Kripke semantics has been recently studied in the literature for a variety of formal systems. Coquand [2] uses ALF to give a constructive formal proof of soundness and completeness w.r.t. Kripke models for intuitionistic propositional logic with implication as the sole logical constant. Building on Coquand’s work, a constructive completeness proof w.r.t Kripke semantics for intuitionistic logic with implication and universal quantification has been verified with Coq by Heberlin and Lee [7]. Also using Coq, Doczal and Smolka present a constructive formal proof of completeness w.r.t. Kripke semantics and decidability of the forcing relation for an extension of modal logic K [4] and a variety of temporal logic [5]. In his formal verification of cut elimination for coalgebraic logics, Tews [13] provides a formalization of soundness and completeness proofs covering many different logics, including modal logic K.

To the best of our knowledge, our formalization of a Henkin-style completeness proof for propositional modal logic S5 is the first of its kind.Footnote 1 Our proof is close to that of Hughes and Cresswell [8], but given for a system based on a different choice of axioms. In Hughes and Cresswell’s book, the basis of S5 is that of T plus an additional axiom. Here S5 is built on axiom schemes for K, T, S4 and B. This has the advantage that we can easily adapt the proof for different weaker systems. Another choice had to be made between using a deep or a shallow embedding for the formalization. Because our aim is metatheoretical, we use a deep embedding for the encoding of the syntax, as it allows us to prove metatheorems by structural induction on formulas or derivations.

1.2 Lean

Lean is an interactive theorem prover developed principally by Leonardo de Moura based on a version of dependent type theory known as the Calculus of Inductive Constructions [3, 11]. Theorem proving in Lean can be done by constructing proof terms directly, as in Agda [10], by using tactics, imperative commands that describe how to construct a proof term, as in Coq [12], or by mixing them together in the same environment. Lean also supports classical reasoning, which is employed in the formalization along with the declaration of noncomputable constructions. The formalization also presupposes a few basic results on data structures which are not in the standard library, so, our development makes use of mathlib, the library of formalized mathematics for Lean [1].

In the remainder of this paper, Lean code will be used to illustrate design choices and give an overview of the proof method, not to discuss the proof itself. Interested readers are encouraged to consult completeness.lean, the main file of the formalization where the crucial proof steps are given in detail. We shall also give an informal proof sketch of the completeness theorem using mathematical notation to convey the key ideas of the proof.

2 Modal Logic

2.1 The Language

For simplicity, we shall work with a language which has implication (\(\supset \)) and the false (\(\bot \)) as the only primitive logical connectives, and necessity (\(\Box \)) as the only primitive modal operator. This language can be conveniently defined using inductive types in Lean:

figure a

Using one of the four constructors displayed above (atom, bot, impl, box) is the only way to construct a term of type form. The elimination rule of this type is precisely the principle of induction on the structure of a formula.

While newly-defined terms are always exhibited in Polish notation by default, Lean supports unicode characters and has an extensible parser which allows the declaration of customized prefix or infix notations for terms and types.

figure b

Contexts are just sets of formulas. In Lean, sets are functions of type A \(\rightarrow \) :

figure d

2.2 The Proof System

We define a Mendelson system augmented with axiom schemes for K, T, S4, and B, and the necessitation rule as a rule of inference. The proof system is implemented with a type of proofs, which is inductively defined as follows:

figure e

2.3 Semantics

Kripke Models. The semantics for S5 are given by Kripke semantics. A model \(\mathcal {M}\) is a triple \(\langle \mathcal {W,R}, v \rangle \) where

  • \(\mathcal {W}\) is a non-empty set of objects called possible worlds;

  • \(\mathcal {R}\) is a binary, equivalence relation on possible worlds;

  • \(\mathsf {v}\) specifies the truth value of a formula at a world.

It is useful to let the members of \(\mathcal {W}\) (possible worlds) be sets of formulas:

figure f

Kripke models can be implemented as structures (inductive types with only one constructor). This can be done using the  command in Lean. We define a 6-tuple composed of a domain, an accessibility relation, a valuation function, and reflexivity, symmetry and transitivity proofs for the given relation:

figure h

The Boolean type  is used for truth values (i.e. either tt or ff).

Semantic Consequence. We have a forcing function which takes a model, a formula, and a world as inputs and returns a boolean value. Non-modal connectives are given truth-functionally and a formula \(\Box p\) is true at a world w iff if \(\mathcal {R}(w,v)\) then p is true at v, for all \(v \in \mathcal {W}\):

figure j

This function can be extended to contexts in the obvious way:

figure k

A formula p is a semantic consequence of a context \(\Gamma \) iff, for all \(\mathcal {M}\) and \(w \in \mathcal {W}\), \(\Gamma \) being true at w in \(\mathcal {M}\) implies p being true at w in \(\mathcal {M}\).

figure l

3 The Completeness Theorem

In this section we formalize a proof of completeness with respect to the proof system and semantics developed in the previous sections.

Theorem 1

(Completeness). For every context \(\Gamma \), any formula p that follows semantically from \(\Gamma \) is also derivable from \(\Gamma \) in the modal logic S5. In symbols:

$$\begin{aligned}\Gamma \vDash _{\mathrm {S5}} p \Longrightarrow \Gamma \vdash _{\mathrm {S5}} p\end{aligned}$$

That is, every semantic consequence is also a syntactic consequence in S5.

The proof requires a non-constructive use of contraposition. We assume that both \(\Gamma \vDash _{\mathrm {S5}} p\) and \(\Gamma \nvdash _{\mathrm {S5}} p\) hold, and then derive a contradiction using the syntax to build a model \(\mathcal {M} = \langle \mathcal {W,R}, v \rangle \) (the canonical model) where \(\Gamma \) is true but p is false at a specific world in the domain.

We shall focus on sketching the formal argument for the following facts:

  1. 1.

    \(\Gamma \cup \{\mathord {\sim }p\}\) has a maximal consistent extension \(\Delta =\bigcup _{n \in \mathbb {N}} \Delta _{n}\), for

    $$\begin{aligned} \Delta _0 :=&\Gamma \cup \{\mathord {\sim }p\} \\ \Delta _{n+1} :=&{\left\{ \begin{array}{ll} \Delta _n \cup \{\varphi _{n+1}\} &{}\text { if }\Delta _n \cup \{\varphi _{n+1}\}\text { is consistent } \\ \Delta _n \cup \{\mathord {\sim }\varphi _{n+1}\} &{}\text { otherwise} \end{array}\right. } \end{aligned}$$
  2. 2.

    There exists a canonical model where p is true at w iff \(p \in w\);

Maximal Consistent Sets. We say that a context is maximal consistent if it is consistent and, moreover, for every formula expressible in the language, either it or its negation is contained in that context.

figure m

Our language is countable, so we can construct each \(\Delta _{n+1}\) using natural numbers to run through the set of all formulas, deciding whether or not a number’s corresponding formula (when it exists) is consistent with \(\Delta _n\) or not. The enumerability of the language is expressed using encodable types, which are constructively countable types in Lean. Essentially, a type \(\alpha \) is encodable when it has an injection and a (partial) inverse .

figure p

Before proceeding any further, we must show that \(\Gamma \) in contained in max \(\Gamma \) and that max \(\Gamma \) is maximal and consistent. For each maxn \(\Gamma \) n of the family of sets, we have \(\Gamma \subseteq \) maxn \(\Gamma \) n. So \(\Gamma \) must also be contained in their union, max \(\Gamma \). This proof argument produces a term of type:

figure q

Now, every formula must be in the enumeration somewhere, so suppose that the formula p has index i. By the definition of maxn \(\Gamma \) i, either p or \(\mathord {\sim }p\) is a member of maxn \(\Gamma \) i, so one of them is a member of max \(\Gamma \). Thus, we have a term

figure r

Assume for the sake of contradiction that \(\Gamma \) is consistent but max \(\Gamma \) is not. By structural induction on the proof tree, we prove that there exists an i such that maxn \(\Gamma \) i is inconsistent. However, each maxn \(\Gamma \) i preserves consistency. This gives a function

figure s

The above proof sketches are implemented purely by unfolding definitions and inductive reasoning. They consist of approximately 150 lines of code in completeness.lean. There is even a one-line short case-reasoning proof that maximal consistent sets are closed under derivability:

figure t

The Canonical Model Construction.

We build the model as follows:

  • \(\mathcal {W}\) is the set of all maximal consistent sets of formulas;

  • \(\mathcal {R}(w,v)\) iff \(\Box p \in w\) implies \(p \in v\);

  • \(\textsf {v}(w,p)=1\) if \(w\in \mathcal {W}\) and \(p \in w\), for a propositional letter p.

We have to show that \(\mathcal {R}\) is an equivalence relation. Reflexivity translates as follows: \(\Box p \in w\) implies \(p \in w\) for a given world \(w \in \mathcal {W}\). But this is easy because w is closed under derivability (it is a maximal consistent set of formulas) and our proof system has modus ponens and axiom schema (t).

Proving symmetry requires more work. Given any worlds \(w, v \in \mathcal {W}\), suppose first that \(\Box \varphi \in w\) implies \(\varphi \in v\) for all formulas \(\varphi \), and suppose that \(\Box p \in v\). We want to show that \(p \in w\). Since \(\Diamond \Box p \supset p\) is a theorem of S5 (see syntax/lemmas.lean) we just have to prove that \(\Diamond \Box p \in w\), or, equivalently, that \(\Box \mathord {\sim }\Box p \notin w\). By contraposition on our initial hypothesis, it suffices to show that \(\mathord {\sim }\Box p \notin v\). But \(\Box p \in v\) and v is consistent.

For transitivity, we must show that \(p \in u\), on the assumptions that \(\Box p \in w\), that \(\Box \varphi \in w\) implies \(\varphi \in v\), and that \(\Box \varphi \in v\) implies \(\varphi \in u\), for any formula \(\varphi \). In other words, we want to show that \(\Box \Box p \in w\). But this follows from modus ponens and axiom scheme (s4).

This model construction is represented by the Lean code

figure u

What is here called unbox is a set operation which takes a set of formulas w as an input and returns the set of formulas p such that \(\Box p\) is a member of w.

A useful lemma about this operation is that if p is deducible from unbox w then actually \(\Box p \in w\). It can be proved by structural induction on the derivation using the necessitation rule, giving us a term:

figure v

Truth and Membership. To prove completeness, we first show that a formula is true at a world in the canonical model iff it is a member of that world:

figure w

Here model is the canonical model defined in the previous section. To prove this, we use induction on the structure of the formula p.

In the proof mechanization, we use the tactic in the tactic mode. This tactic produces four goals, of which the fourth is the most relevant one. To prove it, we begin by assuming the inductive hypothesis for p. If w is a world, and, if it is a maximal consistent set of formulas, then, by unfolding the definition of truth of a formula at a world in a model, the biconditional statement becomes

figure y

In the forwards direction, we assume that \(\Box p\) is true at w in the canonical model and that \(\mathord {\sim }\Box p \in w\). But then, by lemma mem_box_of_unbox_prf, the context unbox \(w \cup \{ \mathord {\sim }p \}\) is consistent and can be extended to a maximal consistent set (i.e. a world in the domain). It is accessible to w because unbox \(w \subseteq \) max (unbox \(w \cup \{ \sim p \}\)), so p should be true at w. But \(p \notin \) max (unbox \(w \cup \{\sim p\})\) because it is consistent.

For the backwards direction, assume that \(\Box p \in w\). Given a maximal consistent set of formulas v and assuming that \(\Box \varphi \in w\) then \(\varphi \in v\) for all \(\varphi \), we have to show that p is true at v in the model. By the inductive hypothesis, however, it suffices to show that \(p \in v\), but this follows from \(\Box p \in w\).

The Completeness Proof. We now complete our proof by putting together all the above pieces into 24 lines of code. Since we know by hypothesis that \(\Gamma \nvdash _{\mathrm {S5}} p\), it follows that \(\Gamma \cup \{\mathord {\sim }p\}\) is consistent–otherwise, if the false were deducible from it, we would have a contradiction by double negation elimination.

figure z

Now assuming that \(\Gamma \vDash _{\mathrm {S5}} p\), the basic idea for deriving the contradiction is that, as max \(\Gamma \cup \{\mathord {\sim }p\}\) is a world in the canonical model, and each formula \(\varphi \in \Gamma \) is true at that world, \(\Gamma \) is true as well. Clearly, p is not consistent with \(\Gamma \cup \{\mathord {\sim }p\}\), so \(p \notin \) max \(\Gamma \cup \{\mathord {\sim }p\}\), meaning that p must be false at that world.

This allows us to prove the desired theorem

figure aa