1 Motivation

The proof assistant community is mainly divided into two successful camps. One camp, represented by provers such as Agda [2], Coq [3], Matita [4] and Nuprl [5], uses expressive type theories as a foundation. The other camp, represented by the HOL family of provers (including HOL4 [6], HOL Light [7], HOL Zero [8] and Isabelle/HOL [9]), remains faithful to a form of classical set theory typed using simple types with rank-1 polymorphism. (Other successful provers, such as ACL2 [10] and Mizar [11], could be seen as being closer to the HOL camp, although technically they are not based on HOL.)

According to the HOL school of thought, a main goal is to hit a sweet spot: keep the logic as simple as possible while retaining sufficient expressiveness. The notion of sufficient expressiveness is of course debatable, and has been debated. For example, PVS [12] includes dependent types (but excludes polymorphism), HOL-Omega [13] adds first-class type constructors to HOL, and Isabelle/HOL adds ad hoc overloading of polymorphic constants. In this paper, we propose a gentler extension of HOL. We do not want to promote new “first-class citizens,” but merely to give better credit to an old and venerable HOL citizen: the notion of types emerging from sets.

The problem that we address in this paper is best illustrated by an example. Let \({{\mathsf {lists}}}: \alpha \,\mathsf {set}\rightarrow \alpha \,\mathsf {list}\;\mathsf {set}\) be the constant that takes a set A and returns the set of lists whose elements are in A, and \(\mathsf {P}: \alpha \,\mathsf {list}\rightarrow \mathsf {bool}\) be another constant (whose definition is not important here). Consider the following two statements, which use either types or sets to represent semantically the same mathematical fact (for clarity, we extend the usual HOL syntax by explicitly quantifying over types at the outermost level as in HOL-Omega):

$$\begin{aligned}&\forall \alpha .\exists { xs }_{\alpha \,\mathsf {list}} .\mathsf {P}\;{ xs } \end{aligned}$$
(1)
$$\begin{aligned}&\forall \alpha .\forall A_{\alpha \,\mathsf {set}}.A \ne \emptyset \longrightarrow (\exists { xs }\in {{\mathsf {lists}}}\;A.\mathsf {P}\;{ xs }) \end{aligned}$$
(2)

The formula (2) is a relativized form of (1), quantifying not only over all types \(\alpha \), but also over all their nonempty subsets A, and correspondingly relativizing the quantification over all lists to quantification over the lists built from elements of A. We call theorems such as (1) type based and theorems such as (2) set based.

Type-based theorems have obvious advantages compared to the set-based ones. First, they are more concise. Moreover, automatic proof procedures work better for them, thanks to the fact that they encode properties more rigidly and more implicitly, namely, in the HOL types (such as membership to \(\alpha \;\mathsf {list}\)) and not via formulas (such as membership to the set \({{\mathsf {lists}}}\;A\)). On the downside, type-based theorems are less flexible, and therefore unsuitable for some developments. Indeed, when working with mathematical structures, it is often the case that they have the desired property only on a proper subset of the whole type. For example, a function f from \(\tau \) to \(\sigma \) may be injective or continuous only on a subset of \(\tau \).

When wishing to apply type-based theorems from the library to deal with such situations, users are forced to produce ad hoc workarounds: they can often define a new (global) type isomorphic to the concrete set, which HOL allows only if the set can be represented by a closed term.Footnote 1 But quite often the set depends on some local variables or local definitions (typically because it was constructed inside of a proof), in which case one needs the set-based relativization of the type-based library theorem. In the most striking cases, such a relativization is created manually. For example, in Isabelle/HOL there exists the constant \({{\mathsf {inj\hbox {-}{}on}}}\;A\;f = (\forall x\;y\in A.f\;x=f\;y\longrightarrow x = y)\) together with a small library about functions being injective only on a subset of a type. In summary, while it is easier to reason about type-based statements such as (1), the set-based statements such as (2) are more general and more widely applicable.
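To make the last point concrete, here is a small, self-contained Isabelle sketch of the set-based workaround: the library constant inj_on (the “inj-on” mentioned above) restricts injectivity to a set, and facts about it are proved directly, without any auxiliary type. The constant and its defining theorem inj_on_def come from the Isabelle/HOL library; the particular lemma below is only our toy illustration.

  theory Inj_On_Sketch
    imports Main
  begin

  (* inj_on f A: f is injective on the set A (library constant);
     the concrete lemma is our own toy example, not a library fact. *)
  lemma inj_on_Suc: "inj_on Suc A"
    by (simp add: inj_on_def)

  end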

An additional nuance to this situation is specific to Isabelle/HOL, which allows users to annotate types with Haskell-like [14] type-class constraints [15]. This provides a further level of implicit reasoning. For example, instead of explicitly quantifying a statement over an associative operation \(*\) on a type \(\sigma \), one marks \(\sigma \) as having class \(\mathsf {semigroup}\) (which carries implicitly the assumptions). This would also need to be reversed when relativizing from types to sets. If (1) made the assumption that \(\alpha \) is a semigroup, as in \(\forall \alpha _\mathsf {semigroup}.\exists { xs }_{\alpha \,\mathsf {list}}.\mathsf {P}\;{ xs }\), then (2) would need to quantify universally not only over A, but also over a binary operation on A, and explicitly assume it to be associative.

Isabelle provides a large collection of arithmetic simplification procedures, which serve as an example of automation that works better for type-based goals than for set-based ones, since the procedures work smoothly only if the corresponding simplification rules apply to the whole type. For example, the cancellation procedure can discharge the goal \(x - x = 0\) only if x’s type is restricted to an appropriate algebraic structure such as \(\alpha _\mathsf {group}\), in which case the goal can be easily rewritten to \(\mathsf {{True}}\). In the set-based setting, in which x has an unrestricted type \(\alpha \), the formula does not hold in general. The cancellation would work only if the simplifier could prove that \(x \in G\) and \(\mathsf {group}\,G\) for some G, which would have to be synthesized or obtained from the proof context. Moreover, since G itself could be an arbitrary term, the simplifier could rewrite G in such a way that \(x \in G\) could not be proved anymore.
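For illustration, the type-based variant of such a goal can be stated and discharged automatically in Isabelle as follows. This is a minimal sketch of ours; it uses the library type class group_add for the additive-group structure.

  theory Cancellation_Sketch
    imports Main
  begin

  (* Type-based cancellation: the whole type 'a is assumed to form a group,
     so the simplifier discharges the goal without any side conditions. *)
  lemma fixes x :: "'a :: group_add"
    shows "x - x = 0"
    by simp

  end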

The aforementioned problem, of the mismatch between type-based theorems from libraries and set-based versions needed by users, shows up regularly in requests posted on the Isabelle community mailing lists. Here is an example [16]: “Various lemmas [from the theory Finite_Set] require me to show that f [commutes with \(\circ \)] for all x and y. This is a too strong requirement. I can show that it holds for all x and y in A, but not for all x and y in general.”

Often, users feel the need to convert entire libraries from type-based theorems to set-based ones. For example, our colleague Fabian Immler writes about his large formalization experience [17, §5.7]: “The main reason why we had to introduce this new type [of finite maps] is that almost all topological properties are formalized in terms of type classes, i.e., all assumptions have to hold on the whole type universe. It feels like a cleaner approach [would be] to relax all necessary topological definitions and results from types to sets because other applications might profit from that, too.”

A prophylactic alternative is of course to develop the libraries in a set-based fashion from the beginning, agreeing to pay the price in terms of verbosity and lack of automation. And numerous developments in different HOL-based provers do just that [18,19,20,21,22].

In this paper, we propose an alternative that gets the best of both worlds: Prove easily and still be flexible. More precisely, develop the libraries type based, but export the results set based. We start from the observation that, from a set-theoretic semantics standpoint, the theorems (1) and (2) are equivalent: they both state that, for every nonempty collection of elements, there exists a list of elements from that collection for which P holds. Unfortunately, the HOL logic in its current form is blind to one direction of this equivalence: assuming that (1) is a theorem, one cannot prove (2). Indeed, in a proof attempt of (2), one would fix a nonempty set A and, to invoke (1), one would need to define a new type corresponding to A—an action not currently allowed inside a HOL proof context.

In this paper, we propose a natural and sound extension of HOL (and of Isabelle/HOL) that enables type definitions to be emulated inside HOL proof contexts, which allows us to prove the above-mentioned equivalences. We will also show how this can be used to address the user needs outlined above.

The paper is organized as follows. In Sect. 2, we recall the logics of HOL and Isabelle/HOL. In Sect. 3, we describe the envisioned extension of HOL: adding a new rule for simulating type definitions in proof contexts. In Sect. 4, we prove that the new rule is a consistent addition to HOL. In Sect. 5, we demonstrate how the new rule allows us to relativize type-based theorems to set-based ones in HOL. Due to the presence of type classes, we need to extend Isabelle/HOL’s logic further to achieve the relativization—this is the topic of Sect. 6. In Sect. 7, we outline the process of performing the relativization in a principled and automated way. In Sect. 8, we describe an application of our results to a more general version of relativization, namely from types to terms. Finally, in Sect. 9 we describe limitations of our approach and outline our future work.

Starting from Isabelle2016-1, the implementation of the proposed logical extensions and some examples discussed in this paper are part of the Isabelle distribution [23].

2 HOL and Isabelle/HOL Recalled

In this section, we briefly recall the logics of HOL and Isabelle/HOL. We distinguish between the core logic, which is common to HOL and Isabelle/HOL, and the definitional mechanisms, which differ between HOL and Isabelle/HOL.

2.1 Core Logic

The core logic of HOL and Isabelle/HOL is classical higher-order logic with rank-1 polymorphism, Hilbert choice and the axiom of infinity.

In this paper, we chose one of the equivalent formulations of the core logic of HOL. Other authors [24,25,26] might start from different primitives or use a slightly different set of primitive inference rules (or add derived rules as primitives for performance reasons in a concrete implementation). All these approaches are equivalent and do not affect our results.

We fix the following:

  • an infinite set \(\mathsf {{TVar}}\) of type variables, ranged over by \(\alpha ,\beta \)

  • an infinite set \(\mathsf {VarN}\) of (term) variable names, ranged over by x, y, z

  • a set K of symbols, ranged over by \(\kappa \), called type constructors, containing three special symbols: “\(\mathsf {bool}\)”, “\(\mathsf {ind}\)” and “\(\rightarrow \)” (aimed at representing the type of booleans, an infinite type and the function type constructor, respectively)

We fix a function \({{\mathsf {arOf}}}: K \rightarrow \mathbb {N}\) giving arities to type constructors, such that \({{\mathsf {arOf}}}(\mathsf {bool}) = {{\mathsf {arOf}}}(\mathsf {ind}) = 0\) and \({{\mathsf {arOf}}}(\rightarrow ) = 2\). If \({{\mathsf {arOf}}}(\kappa ) = n\), we say that \(\kappa \) is an n-ary type constructor. Types, ranged over by \(\sigma ,\tau \), are defined as follows:

$$\begin{aligned} \sigma = \; \alpha \;\mid \; (\sigma _1,\ldots ,\sigma _{{{\mathsf {arOf}}}(\kappa )})\,\kappa \end{aligned}$$

Thus, a type is either a type variable or an n-ary type constructor \(\kappa \) postfix-applied to a number of types corresponding to its arity. If \(n=1\), instead of \((\sigma )\,\kappa \) we write \(\sigma \,\kappa \).

Finally, we fix the following:

  • a set \({{\mathsf {Const}}}\), ranged over by c, of symbols called constants, containing five special symbols: “\(\longrightarrow \)”, “\(=\)”, “\({{\varepsilon }}\)”, “\({{\mathsf {zero}}}\)” and “\({{\mathsf {suc}}}\)” (aimed at representing logical implication, equality, Hilbert choice of some element from a type, zero and successor, respectively)

  • a function \({{\mathsf {tpOf}}}: {{\mathsf {Const}}}\rightarrow \mathsf {{Type}}\) associating a type to every constant, such that:

    $$\begin{aligned} {{\mathsf {tpOf}}}(\longrightarrow )&= \mathsf {bool}\rightarrow \mathsf {bool}\rightarrow \mathsf {bool}\qquad&{{\mathsf {tpOf}}}({{\varepsilon }})&= (\alpha \rightarrow \mathsf {bool})\rightarrow \alpha \\ {{\mathsf {tpOf}}}(=)&= \alpha \rightarrow \alpha \rightarrow \mathsf {bool}\qquad&{{\mathsf {tpOf}}}({{\mathsf {zero}}})&= \mathsf {ind}\qquad {{\mathsf {tpOf}}}({{\mathsf {suc}}}) = \mathsf {ind}\rightarrow \mathsf {ind} \end{aligned}$$

\({{\mathsf {TV}}}(\sigma )\) is the set of type variables of a type \(\sigma \). Given a function \(\rho : \mathsf {{TVar}}\rightarrow \mathsf {{Type}}\), its support is the set of type variables where \(\rho \) is not the identity: \(\mathsf {supp}(\rho ) = \{\alpha \mid \rho (\alpha ) \ne \alpha \}\). A type substitution is a function \(\rho : \mathsf {{TVar}}\rightarrow \mathsf {{Type}}\) with finite support. We let \({{\mathsf {TSubst}}}\) denote the set of type substitutions. Each \(\rho \in {{\mathsf {TSubst}}}\) extends to a function on all types, which we denote also by \(\rho \): it acts as \(\rho \) on type variables and satisfies \(\rho ((\sigma _1,\ldots ,\sigma _n)\,\kappa ) = (\rho (\sigma _1),\ldots ,\rho (\sigma _n))\,\kappa \) on constructed types.
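As an illustration only (our own sketch, not part of the formalization in this paper), the type syntax and the extension of a substitution to all types can be modelled in Isabelle as follows, representing type variables and type constructors by strings; the finite-support condition is not enforced here.

  theory Type_Syntax_Sketch
    imports Main
  begin

  (* Types: a type variable, or a type constructor applied to a list of types. *)
  datatype htype =
      HTVar string
    | HTApp "htype list" string

  (* Applying a (total) type substitution, recursing through the argument list. *)
  primrec tsubst :: "(string ⇒ htype) ⇒ htype ⇒ htype" where
    "tsubst rho (HTVar a) = rho a"
  | "tsubst rho (HTApp ss k) = HTApp (map (tsubst rho) ss) k"

  end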

We say that \(\sigma \) is an instance of \(\tau \) via a substitution \(\rho \in {{\mathsf {TSubst}}}\), written \(\sigma \le _\rho \tau \), if \(\rho (\tau ) = \sigma \) and \(\mathsf {supp}(\rho ) \subseteq {{\mathsf {TV}}}(\tau )\). We say that \(\sigma \) is an instance of \(\tau \), written \(\sigma \le \tau \), if there exists \(\rho \in {{\mathsf {TSubst}}}\) such that \(\sigma \le _\rho \tau \). Notice that if \(\sigma \le \tau \), there is a unique \(\rho \) such that \(\sigma \le _\rho \tau \). We say that \(\sigma \) is in \(\tau \) if \(\sigma \) is syntactically contained in \(\tau \).

A (typed) variable is a pair of a variable name x and a type \(\sigma \), written \(x_\sigma \). Let \(\mathsf {{Var}}\) denote the set of all variables. A constant instance is a pair of a constant and a type, written \(c_\sigma \), such that \(\sigma \le {{\mathsf {tpOf}}}(c)\). We let \({{\mathsf {CInst}}}\) denote the set of constant instances.

The tuple \({\varSigma }= (K,{{\mathsf {arOf}}},{{\mathsf {Const}}},{{\mathsf {tpOf}}})\), which will be fixed in what follows, is called a signature. This signature’s pre-terms, ranged over by s, t, are defined by the grammar:

$$\begin{aligned} t \;=\; x_\sigma \;\mid \; c_\sigma \;\mid \; t_1\;t_2 \;\mid \; \lambda x_\sigma .\,t \end{aligned}$$
Thus, a pre-term is either a typed variable, or a constant instance, or an application, or an abstraction. As usual, we identify pre-terms modulo alpha-equivalence. Typing is defined as a binary relation between pre-terms and types, written \(t : \sigma \), structurally inductively as follows:

$$\begin{aligned} \frac{}{x_\sigma : \sigma }\qquad \quad \frac{}{c_\sigma : \sigma }\qquad \quad \frac{t_1 : \sigma \rightarrow \tau \qquad t_2 : \sigma }{t_1\;t_2 : \tau }\qquad \quad \frac{t : \tau }{\lambda x_\sigma .\,t \,:\, \sigma \rightarrow \tau } \end{aligned}$$

A term is a well-typed pre-term, i.e., a pre-term t for which there exists a (necessarily unique) type \(\tau \) such that \(t : \tau \). We let \({{\mathsf {Term}}}\) be the set of well-typed terms. We can apply a type substitution \(\rho \) to a term t, written \(\rho (t)\), by applying \(\rho \) to the types of all variables and constant instances occurring in t and potentially renaming some bound variables if they would get identified. We say that a constant c (or a type \(\sigma \)) is in a term t if t syntactically contains c (or \(\sigma \), respectively). The set \({{\mathsf {FV}}}(t)\) is the set of t’s free variables. The term t is called closed if it has no free variables: \({{\mathsf {FV}}}(t) = \emptyset \). When writing terms, we sometimes omit the types of variables if they can be inferred. We sometimes use brackets to indicate that certain types and/or terms are a part of the term, e.g., \(t[\alpha ]\), \(t[\sigma ,c_\tau ]\).

A formula is a term of type \(\mathsf {bool}\). The logical constants \(\mathsf {{True}}\) and \(\mathsf {{False}}\), formula connectives and quantifiers are defined in the standard way, starting from the implication and equality primitives.

In HOL, types represent “rigid” collections of elements. More flexible collections can be obtained using sets. Essentially, a set on a type \(\sigma \), also called a subset of \(\sigma \), is given by a predicate \(S : \sigma \rightarrow \mathsf {bool}\). Then membership of an element a to S, written \(a \in S\), is defined to mean \(S\;a\) (which is the same as \(S\;a = \mathsf {{True}}\)). HOL systems differ in how abstractly they represent this concept of “sets as predicates”. Our paper is agnostic on this in the following sense: we write \(\alpha \,\mathsf {{set}}\) to be either a textual abbreviation for \(\alpha \rightarrow \mathsf {bool}\) (corresponding to HOL Light, which directly uses the predicate type), or syntactic sugar for predicates (as in HOL4), or an actual type constructor defined to be isomorphic to the predicates (as in Isabelle/HOL). All these approaches yield essentially the same notion, and our results apply to all of them.
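For instance, in Isabelle/HOL, where \(\alpha \,\mathsf {set}\) is an actual type constructor, the passage between the two views is mediated by the library constant Collect (from predicates to sets) and by membership (back); a tiny sketch:

  theory Sets_As_Predicates_Sketch
    imports Main
  begin

  (* Membership in the set built from a predicate is just predicate application. *)
  lemma "(a ∈ Collect P) = P a"
    by simp

  end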

A proof context \({\varGamma }\) and a HOL theory D are finite sets of formulas. We say that \(({\varGamma },\varphi )\) is a sequent if \({\varGamma }\) is a context and \(\varphi \) is a formula. The HOL deduction relation \(\vdash \) is a ternary relation between theories, contexts and formulas, \(D;{\varGamma }\vdash \varphi \), and is defined inductively starting from the formulas in D and HOL axioms (containing axioms for equality, infinity, choice, and excluded middle) and applying deduction rules (introduction and elimination of \(\longrightarrow \), term and type instantiation, extensionality and \(\beta \)-reduction). Precise definitions of (equivalent variations of) the HOL deduction system can be found elsewhere [25, 27, 28].

The definitional mechanisms for constants and types extend the signature (by adding the newly defined symbol) and the theory (by adding the definitional formula). We call a theory D definitional if D was created by a sequence of theory extensions corresponding either to a constant or a type definition. Since Isabelle/HOL allows for more flexible constant definitions than HOL does, we will cover their definitional mechanisms separately in the next two sections.

A theory D is consistent if it cannot derive \(\mathsf {{False}}\) in the empty context, i.e., if \(D; \emptyset \vdash \mathsf {{False}}\) does not hold.

2.2 Definitional Mechanisms of HOL

Most of the systems implementing HOL follow the tradition of discouraging their users from using arbitrary underlying theories D and of promoting merely definitional ones, containing only definitions of constants and types.

A HOL constant definition is a formula \(c_\sigma = t\), where:

  • c is a fresh constant of type \(\sigma \)

  • t is a term that is closed (i.e., has no free term variables) and whose type variables are included in those of \(\sigma \)

HOL type definitions are more complex entities. They are based on the notion of a newly defined type \(\beta \) being embedded in an existing type \(\alpha \), i.e., being isomorphic to a given nonempty subset S of \(\alpha \) via mappings \( Abs \) and \( Rep \). Let \({}_{\alpha }(\beta \approx S)^{ Abs }_{ Rep }\) denote the formula expressing this:

$$\begin{aligned} (\forall x_\beta .\; Rep \,x \in S) \,\wedge \, (\forall x_\beta .\; Abs \,( Rep \;x) = x) \,\wedge \, (\forall y_\alpha .\;y \in S \longrightarrow Rep \,( Abs \;y) = y) \end{aligned}$$

When the user issues a command \(\texttt {typedef}\ \ \tau = S_{\sigma \,\mathsf {set}}\), they are required to discharge the goal \(S \ne \emptyset \), after which the system introduces a new type \(\tau \) and two constants \(\mathsf {{Abs}}^\tau : \sigma \rightarrow \tau \) and \(\mathsf {{Rep}}^\tau : \tau \rightarrow \sigma \) and adds the axiom \({}_{\sigma }(\tau \approx S)^{\mathsf {{Abs}}^\tau }_{\mathsf {{Rep}}^\tau }\) to the theory.
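In Isabelle syntax, the two definitional mechanisms look roughly as follows. This is a minimal sketch of ours; the constant swap_pair, the type three and the chosen witness are arbitrary toy examples.

  theory Definitional_Mechanisms_Sketch
    imports Main
  begin

  (* A constant definition: a fresh constant equated with a closed term. *)
  definition swap_pair :: "'a × 'b ⇒ 'b × 'a" where
    "swap_pair p = (snd p, fst p)"

  (* A type definition: the new type three is isomorphic to the given nonempty
     subset of nat; the proof discharges the nonemptiness obligation. *)
  typedef three = "{0 :: nat, 1, 2}"
    by (rule exI[of _ 0]) simp

  end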

2.3 Definitional Mechanisms of Isabelle/HOL

While a member of the HOL family, Isabelle/HOL is special w.r.t. constant definitions. Namely, a constant is allowed to be declared with a given type \(\sigma \) and then “overloaded” on various types \(\tau \) less general than \(\sigma \) and mutually orthogonal. For example, we can have \(\mathsf {d}\) declared to have type \(\alpha \), and then \(\mathsf {d}_\mathsf {bool}\) defined to be \(\mathsf {{True}}\) and \(\mathsf {d}_{\alpha \,\mathsf {list}}\) defined to be \([\mathsf {d}_\alpha ]\). We shall write \({\varDelta }_c\) for the collection of all types where c has been overloaded. In the above example, \({\varDelta }_\mathsf {d}= \{\mathsf {bool}, \alpha \,\mathsf {list}\}\).
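In concrete Isabelle syntax, the \(\mathsf {d}\) example might be written roughly as follows; this is a sketch of ours using the consts and overloading commands and only reproduces the shape of the example.

  theory Overloading_Sketch
    imports Main
  begin

  (* Declare d with the general type 'a ... *)
  consts d :: "'a"

  (* ... and overload it on the mutually orthogonal types bool and 'a list. *)
  overloading
    d_bool ≡ "d :: bool"
    d_list ≡ "d :: 'a list"
  begin
    definition "d_bool = True"
    definition "d_list = [d :: 'a]"
  end

  end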

The mechanism of overloaded definitions offers broad expressive power. But with power also comes responsibility. The system has to make sure that the defining equations cannot form a cycle. To guarantee that, a binary constant/type dependency relation \(\rightsquigarrow \) on types and constants is maintained, where \(u \rightsquigarrow v\) holds true iff one of the following holds:

  1. u is a constant c that was declared with type \(\sigma \), and v is a type in \(\sigma \)

  2. u is a constant c defined as \(c = t\), and v is a type or constant in t

  3. u is a type \(\sigma \) defined as \(\sigma = A\), and v is a type or constant in A

We write \(\rightsquigarrow ^{\downarrow }\) for the (type-)substitutive closure of the constant/type dependency relation, i.e., if \(p \rightsquigarrow q\), then \(\rho (p) \rightsquigarrow ^{\downarrow } \rho (q)\) for every type substitution \(\rho \). The system accepts only overloaded definitions for which \(\rightsquigarrow ^{\downarrow }\) does not contain an infinite chain.
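For instance (a hypothetical definition, not one from the Isabelle library), if \(\mathsf {c}_{\alpha \,\mathsf {list}}\) were defined by a term containing \(\mathsf {c}_{\alpha \,\mathsf {list}\,\mathsf {list}}\), then \(\rightsquigarrow ^{\downarrow }\) would contain the infinite chain

$$\begin{aligned} \mathsf {c}_{\alpha \,\mathsf {list}} \;\rightsquigarrow ^{\downarrow }\; \mathsf {c}_{\alpha \,\mathsf {list}\,\mathsf {list}} \;\rightsquigarrow ^{\downarrow }\; \mathsf {c}_{\alpha \,\mathsf {list}\,\mathsf {list}\,\mathsf {list}} \;\rightsquigarrow ^{\downarrow }\; \cdots \end{aligned}$$

and the definition would be rejected, whereas the definitions of \(\mathsf {d}\) above give rise to finite chains only, such as \(\mathsf {d}_{\alpha \,\mathsf {list}\,\mathsf {list}} \rightsquigarrow ^{\downarrow } \mathsf {d}_{\alpha \,\mathsf {list}} \rightsquigarrow ^{\downarrow } \mathsf {d}_{\alpha }\).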

In addition, Isabelle supports user-defined axiomatic type classes, which are essentially predicates on types. They effectively enrich the type system with the ability to carry implicit assumptions. For example, we can define the type class \({{\mathsf {finite}}}(\alpha )\) expressing that \(\alpha \) has a finite number of inhabitants. Then, we are allowed to annotate type variables by such predicates, e.g., \(\alpha _{{\mathsf {finite}}}\) or \(\alpha _\mathsf {semigroup}\) from Sect. 1. Finally, we can substitute a type \(\tau \) for \(\alpha _{{\mathsf {finite}}}\) only if \(\tau \) has been previously proved to fulfill \({{\mathsf {finite}}}(\tau )\).

The axiomatic type classes become truly useful when we use overloaded constants for their definitions. This combination allows the use of Haskell-style type classes. E.g., we can reason about arbitrary semigroups by declaring a global constant \(* : \alpha \rightarrow \alpha \rightarrow \alpha \) and defining the HOL predicate \(\mathsf {semigroup}(\alpha )\) stating that \(*\) is associative on \(\alpha \).
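A minimal Isabelle sketch of this combination follows; the names my_semigroup, smult, ⊗ and smult_assoc are ours (chosen to avoid clashing with the library’s semigroup classes). The class fixes the overloaded operation and carries the associativity assumption, and any lemma proved in the class context is a type-based theorem in the sense of Sect. 1.

  theory Semigroup_Class_Sketch
    imports Main
  begin

  (* A Haskell-style type class: an overloaded binary operation ⊗ together
     with the (implicit) assumption that it is associative on the whole type. *)
  class my_semigroup =
    fixes smult :: "'a ⇒ 'a ⇒ 'a"  (infixl "⊗" 70)
    assumes smult_assoc: "(x ⊗ y) ⊗ z = x ⊗ (y ⊗ z)"

  (* A type-based lemma: the class annotation carries the assumption implicitly. *)
  lemma (in my_semigroup) "x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z"
    by (simp add: smult_assoc)

  end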

In this paper, we are largely concerned with results relevant for the entire HOL family of provers, but also take special care with the Isabelle/HOL maverick. Namely, we show that our local typedef proposal can be adapted to cope with Isabelle/HOL’s type classes.

3 Proposal of a Logic Extension: Local Typedef

To address the limitation described in Sect. 1, we propose extending the HOL logic with a new rule for type definition with the following properties:

  • It enables type definitions to be emulated inside proofs while avoiding the introduction of dependent types by a simple syntactic check.Footnote 2

  • It is natural and sound w.r.t. the standard HOL semantics à la Pitts [28], as well as consistent with the logic of Isabelle/HOL.

To motivate the formulation of the new rule and to understand the intuition behind it, we will first look deeper into the idea behind type definitions in HOL. Let us take a purely semantic perspective and ignore the rank-1 polymorphism for a minute. Then the principle behind type definitions simply states that for all types \(\alpha \) and nonempty subsets A of them, there exists a type \(\beta \) isomorphic to A:

$$\begin{aligned} \forall \alpha .\,\forall a_{\alpha \,\mathsf {set}}.\;a \ne \emptyset \longrightarrow (\exists \beta .\,\exists Abs \; Rep .\;{}_{\alpha }(\beta \approx a)^{ Abs }_{ Rep })\qquad \qquad (\star ) \end{aligned}$$

The typedef mechanism can be regarded as the result of applying a sequence of standard rules for connectives and quantifiers to (\(\star \)) in a more expressive logic (notationally, we use Gentzen’s sequent calculus):

  1. Left \(\forall \) rule, instantiating \(\alpha \) and \(a_{\sigma \,\mathsf {set}}\) with the type \(\sigma \) and the closed term A (of type \({\sigma \,\mathsf {set}}\)), both provided by the user, followed by the left implication rule:

     $$\begin{aligned} \frac{{\varGamma }\;\vdash \; A \ne \emptyset \qquad \quad {\varGamma },\; \exists \beta .\,\exists Abs \; Rep .\;{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep } \;\vdash \; \varphi }{{\varGamma }\;\vdash \; \varphi } \end{aligned}$$

  2. Left \(\exists \) rule for \(\beta \), \( Abs \) and \( Rep \), introducing some new/fresh type \(\tau \) and functions \(\mathsf {{Abs}}^\tau \) and \(\mathsf {{Rep}}^\tau \):

     $$\begin{aligned} \frac{{\varGamma }\;\vdash \; A \ne \emptyset \qquad \quad {\varGamma },\; {}_{\sigma }(\tau \approx A)^{\mathsf {{Abs}}^\tau }_{\mathsf {{Rep}}^\tau } \;\vdash \; \varphi }{{\varGamma }\;\vdash \; \varphi } \end{aligned}$$

The user further discharges \({\varGamma }\vdash A \ne \emptyset \), and therefore the overall effect of this chain is the sound addition of \({}_{\sigma }(\tau \approx A)^{\mathsf {{Abs}}^\tau }_{\mathsf {{Rep}}^\tau }\) as an extra assumption when trying to prove an arbitrary fact \(\varphi \).

What we propose is to use a variant of the above with fewer instantiations, which takes better advantage of the result of Step 1: It refrains from introducing the type \(\tau \) from Step 2, but keeps \(\beta \) as a fresh type variable. We obtain:

$$\begin{aligned} \frac{{\varGamma }\;\vdash \; A \ne \emptyset \qquad \quad {\varGamma },\; \exists Abs \; Rep .\;{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep } \;\vdash \; \varphi }{{\varGamma }\;\vdash \; \varphi }\quad [\beta \text { fresh}] \end{aligned}$$

To conclude, the overall rule, written (LT) as in “Local Typedef”, looks as follows:

$$\begin{aligned} \frac{{\varGamma }\;\vdash \; A_{\sigma \,\mathsf {set}} \ne \emptyset \qquad \quad {\varGamma }\;\vdash \; (\exists Abs \; Rep .\;{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep }) \longrightarrow \varphi }{{\varGamma }\;\vdash \; \varphi }\;(\mathrm {LT})\quad [\beta \text { fresh for } A,\ {\varGamma } \text { and } \varphi ] \end{aligned}$$

This rule allows us to locally assume that there is a type \(\beta \) isomorphic to an arbitrary nonempty set A. The requirement that \(\beta \) be fresh for A prevents the introduction of a dependent type (since A may contain term variables).

The above discussion shows that (LT) is morally correct and, more importantly, natural, in the sense that it is an instance of a more general principle, namely the rule (\(\star \)). In the next section, we give a rigorous proof of the rule’s consistency.

4 Consistency of Local Typedef

A typical development in HOL consists of issuing definitions and proving theorems from them. Thus, the only “axioms” that are introduced are the definitions. The appeal of this definitional approach to the many users of HOL is of course the guaranteed consistency. We will show that our new rule does not break the consistency of definitional theories. To this end, we first need to revisit the standard, model-theoretic argument for consistency.

4.1 Model Theory of HOL

We follow the model theory of HOL developed by Andrew Pitts [28]. Our presentation might diverge in concrete details and notation but the overall approach is the same.

We fix a Grothendieck universe \(\mathcal {V}\), i.e., a transitive set that contains an infinite set and is closed under all standard set operations such as power set and union. It follows from transitivity and power-set closure that \(\mathcal {V}\) is also closed under inclusion: if \(A \in \mathcal {V}\) and \(B \subseteq A\) then \(B \in \mathcal {V}\). From this and the fact that \(\mathcal {V}\) contains an infinite set, we obtain that \(\mathcal {V}\) contains (copies of) \(\mathbb {N}\) and \(\mathbb {B}= \{ \mathsf {{false}}, \mathsf {{true}}\}\).

We define our universe \(\mathcal {U}\) as \(\mathcal {U}= \mathcal {V}\setminus \{\emptyset \}\) (since we will interpret types as nonempty sets). Moreover, we fix a choice function, \({{\mathsf {choice}}}\), that assigns to each set \(A \in \mathcal {U}\) an element \({{\mathsf {choice}}}(A) \in A\).

We fix an interpretation function I of type constructors from the signature \({\varSigma }\), such that, for an n-ary type constructor \(\kappa \in K\), \(I(\kappa )\) is a function in \(\mathcal {U}^n \rightarrow \mathcal {U}\). We assume the standard (fixed) interpretation for the built-in type constructors: \(I(\mathsf {bool}) = \mathbb {B}\), \(I(\mathsf {ind}) = \mathbb {N}\) and, for the function type constructor \(\rightarrow \), we assume that \(I(\rightarrow )\) sends any pair of sets (A, B) to \(A \rightarrow B\) (the set of functions from A to B).

The interpretation of a type \(\tau \) is a function \([\tau ]_I : (\mathsf {{TVar}}\rightarrow \mathcal {U}) \rightarrow \mathcal {U}\), which takes an assignment to type variables \(\theta : \mathsf {{TVar}}\rightarrow \mathcal {U}\) as input. Instead of \([\tau ]_I(\theta )\), we will write \([\tau ]_{I,\theta }\). The function is defined as follows:

$$\begin{aligned}&[\alpha ]_{I,\theta } = \theta (\alpha )&\\&\quad [(\sigma _1,\ldots ,\sigma _n)\,\kappa ]_{I,\theta } = I(\kappa )([\sigma _1]_{I,\theta },\ldots ,[\sigma _n]_{I,\theta }) \text{ where } {{\mathsf {arOf}}}(\kappa ) = n \end{aligned}$$
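For example, for the type of predicates over \(\alpha \), unfolding the above clauses and the fixed interpretation of the built-in type constructors gives

$$\begin{aligned}{} [\alpha \rightarrow \mathsf {bool}]_{I,\theta } = I(\rightarrow )([\alpha ]_{I,\theta },[\mathsf {bool}]_{I,\theta }) = \theta (\alpha ) \rightarrow \mathbb {B}, \end{aligned}$$

i.e., the set of all functions from \(\theta (\alpha )\) to \(\mathbb {B}\).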

We fix an interpretation function for constants, also denoted by I, which assigns to any constant \(c \in {{\mathsf {Const}}}\) a function \(I(c) \in \prod _{\theta \in \mathsf {{TVar}}\rightarrow \mathcal {U}} [{{\mathsf {tpOf}}}(c)]_{I,\theta }\). We assume the standard (fixed) interpretation for the built-in constants: \(I({\longrightarrow })(\theta )\) is the logical implication on \(\mathbb {B}\); \(I(=)(\theta )\) is the equality predicate in \(\theta (\alpha ) \rightarrow \theta (\alpha ) \rightarrow \mathbb {B}\); \(I({{\mathsf {zero}}})(\theta )\) and \(I({{\mathsf {suc}}})(\theta )\) are the zero and successor on \(\mathbb {N}\); the Hilbert choice operator is interpreted as

$$\begin{aligned} \quad&I({{\varepsilon }})(\theta )(f) = {\left\{ \begin{array}{ll} {{\mathsf {choice}}}(\{a \in \theta (\alpha ) \mid f(a) = \mathsf {{true}}\}) &{} \text{ if } \text{ the } \text{ set } \text{ is } \text{ non-empty, }\\ {{\mathsf {choice}}}(\theta (\alpha )) &{} \text{ otherwise. } \end{array}\right. }&\end{aligned}$$

We say that an assignment \(\xi : \mathsf {{Var}}\rightarrow \mathcal {U}\) (of term variables to elements of \(\mathcal {U}\)) is \(\theta \)-compatible if \(\xi \in \prod _{x_\tau \in \mathsf {{Var}}} [\tau ]_{I,\theta }\), i.e., \(\xi (x_{\tau })\in [\tau ]_{I,\theta }\) for all term variables \(x_\tau \). The interpretation of a term \(t : \sigma \) is a function

$$\begin{aligned}{}[t]_I \in \prod _{\theta \in \mathsf {{TVar}}\rightarrow \mathcal {U}} \left( \prod _{x_{\tau } \in \mathsf {{Var}}} [\tau ]_{I,\theta }\right) \rightarrow [\sigma ]_{I, \theta }. \end{aligned}$$

Thus, the interpretation of terms takes an assignment to type variables \(\theta \) and a (\(\theta \)-compatible) assignment to term variables \(\xi \) as input. We will write \([t]_{I,\theta , \xi }\) instead of \([t]_I(\theta )(\xi )\). We define \([t]_{I, \theta , \xi }\) as followsFootnote 3:

$$\begin{aligned}{} [x_\sigma ]_{I,\theta ,\xi }&= \xi (x_\sigma )\\ [c_\sigma ]_{I,\theta ,\xi }&= I(c)(\theta ^\rho ) \quad \text { where } \sigma \le _\rho {{\mathsf {tpOf}}}(c) \text { and } \theta ^\rho (\alpha ) = [\rho (\alpha )]_{I,\theta }\\ [t_1\;t_2]_{I,\theta ,\xi }&= [t_1]_{I,\theta ,\xi }\,([t_2]_{I,\theta ,\xi })\\ [\lambda x_\sigma .\,t]_{I,\theta ,\xi }&= \big (a \in [\sigma ]_{I,\theta }\big ) \longmapsto [t]_{I,\theta ,\xi [x_{\sigma } \leftarrow a]} \end{aligned}$$

Above, the interpretation of lambda abstraction is the function sending each \(a \in [\sigma ]_{I,\theta }\) to \([t]_{I,\theta , \xi [x_{\sigma } \leftarrow a]}\), where \(\xi [x_{\sigma } \leftarrow a]\) denotes \(\xi \) updated with a at \(x_{\sigma }\).

Thus, an interpretation I consists of two homonymous functions, one for type constructors and one for constants. We say that I satisfies a sequent \(({\varGamma }, \varphi )\), written \({\varGamma }\vDash _I \varphi \), if for all \(\theta \) and \(\xi \) it holds that \([\varphi ]_{I, \theta , \xi } = \mathsf {{true}}\) whenever \([\psi ]_{I, \theta , \xi } = \mathsf {{true}}\) for all \(\psi \in {\varGamma }\). If \({\varGamma }\) is empty, we write \(\vDash _I \varphi \). A deduction rule has the form \(\frac{({\varGamma }_1,\varphi _1) \ldots ({\varGamma }_n,\varphi _n)}{({\varGamma }, \varphi )}\); we say that such a deduction rule is sound for I if \({\varGamma }\vDash _I \varphi \) holds whenever \({\varGamma }_j \vDash _I \varphi _j\) holds for all \(j \in \{1,\ldots ,n\}\).

We say that an interpretation I is a model of a theory D if I satisfies all HOL axioms and all formulas from D. Pitts proved that (i) the deduction rules of HOL are sound for all standard interpretations, and (ii) every definitional theory has a standard model. The facts (i) and (ii) give us that, for every definitional theory D, if \(D;{\varGamma }\vdash \varphi \), then there exists a standard model I of D such that \({\varGamma }\vDash _I\varphi \). In particular, this implies the consistency of the HOL definitional theories, since \(\not \vDash _I\mathsf {{False}}\).

4.2 Proof of Consistency

Now we have enough background material to carry out the proof of consistency of (LT) in HOL. We shall employ the following folklore lemma, stating that interpretations only depend on the free variables:

Lemma 1

Let I be an interpretation.

(1) If \(\theta \) and \(\theta '\) do not differ on \({{\mathsf {TV}}}(\tau )\), then \([\tau ]_{I,\theta } = [\tau ]_{I,\theta '}\).

(2) If \(\theta \) and \(\theta '\) do not differ on \({{\mathsf {TV}}}(t)\) and \(\xi \) and \(\xi '\) do not differ on \({{\mathsf {FV}}}(t)\), then \([t]_{I,\theta , \xi } = [t]_{I,\theta ', \xi '}\).

Lemma 2

The rule (LT) is sound for any standard interpretation.

Proof

Let us fix a standard interpretation \(I\) and assume that the hypotheses of the (LT) rule are satisfied by I, i.e.,

$$\begin{aligned}&{\varGamma }\vDash _IA \ne \emptyset \end{aligned}$$
(3)
$$\begin{aligned}&{\varGamma }\vDash _I(\exists Abs \; Rep .{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep }) \longrightarrow \varphi \end{aligned}$$
(4)

We wish to prove the rule’s conclusion, \({\varGamma }\vDash _I\varphi \). To this end, let us fix \(\theta \) and a \(\theta \)-compatible \(\xi \) such that \([\psi ]_{I,\theta ,\xi } = \mathsf {{true}}\) for all \(\psi \in {\varGamma }\). We wish to prove \([\varphi ]_{I,\theta ,\xi } = \mathsf {{true}}\).

Let us temporarily fix \(B \in \mathcal {U}\), and let \(\theta _B\) denote \(\theta [\beta \leftarrow B]\) (\(\theta \) with \(\beta \) updated to B). We let \(\xi _B\) be a \(\theta _B\)-compatible assignment to term variables such that \(\xi _B\) is the same as \(\xi \) on the free variables of A, \({\varGamma }\) and \(\varphi \). Since \(\beta \) is fresh for \({\varGamma }\), by Lemma 1 we have \([\psi ]_{I,\theta _B,\xi _B} = \mathsf {{true}}\) for all \(\psi \in {\varGamma }\). Using (3) and (4) and under the standard interpretation of \(\longrightarrow \) in (4), we obtain:

$$\begin{aligned}&[A \ne \emptyset ]_{I,\theta _B,\xi _B} = \mathsf {{true}}\\&[\exists Abs \; Rep .{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep }]_{I,\theta _B,\xi _B} = \mathsf {{true}}\; \text{ implies } \;[\varphi ]_{I,\theta _B,\xi _B} = \mathsf {{true}}. \end{aligned}$$

Since \(\beta \) is fresh for A and \(\varphi \), with Lemma 1 we obtain:

$$\begin{aligned}&[A \ne \emptyset ]_{I,\theta ,\xi } = \mathsf {{true}},\end{aligned}$$
(5)
$$\begin{aligned}&[\exists Abs \; Rep .{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep }]_{I,\theta _B,\xi _B} = \mathsf {{true}}\; \text{ implies } \;[\varphi ]_{I,\theta ,\xi } = \mathsf {{true}}. \end{aligned}$$
(6)

Since B above was arbitrary, if we were able to find some \(B \in \mathcal {U}\) such that the antecedent of (6), namely,

$$\begin{aligned}{}[\exists Abs \; Rep .{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep }]_{I,\theta _B,\xi _B} = \mathsf {{true}} \end{aligned}$$
(7)

holds, then the proof would be finished, since we would obtain \([\varphi ]_{I,\theta ,\xi } = \mathsf {{true}}\), as desired.

From (5) we obtain

$$\begin{aligned} \{a \in [\sigma ]_{I,\theta } \mid [A]_{I,\theta ,\xi }\,a = \mathsf {{true}}\} \ne \emptyset . \end{aligned}$$
(8)

Recall that \(\mathcal {V}\) is closed under inclusions and thus \(\mathcal {U}\) is closed under inclusions of nonempty subsets. To prove (7), we take B to be \(\{a \in [\sigma ]_{I,\theta } \mid [A]_{I,\theta ,\xi }\,a = \mathsf {{true}}\}\) and since \(B \subseteq [\sigma ]_{I,\theta }\), \([\sigma ]_{I,\theta } \in \mathcal {U}\) and (8), we obtain \(B \in \mathcal {U}\). Note that, again by Lemma 1, we have \([A]_{I,\theta _B,\xi _B} = [A]_{I,\theta ,\xi }\) and, since \({{\mathsf {TV}}}(\sigma ) \subseteq {{\mathsf {TV}}}(A)\), we have \([\sigma ]_{I,\theta _B} = [\sigma ]_{I,\theta }\). We define \(\mathsf {{Abs}}: [\sigma ]_{I,\theta } \rightarrow B\) as

$$\begin{aligned} \mathsf {{Abs}}(x) = {\left\{ \begin{array}{ll} x &{} \text {if } x \in B \\ {{\mathsf {choice}}}(B) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

and \(\mathsf {{Rep}}: B \rightarrow [\sigma ]_{I,\theta }\) as the inclusion map. It is routine to verify

$$\begin{aligned}{}[{}_{\sigma }(\beta \approx A)^{ Abs }_{ Rep }]_{I,\theta _B,\xi _B[ Abs \leftarrow \mathsf {{Abs}}, Rep \leftarrow \mathsf {{Rep}}]} = \mathsf {{true}}, \end{aligned}$$

which proves (7). \(\square \)

The bottom line of the above proof was to show a semantic analog of (\(\star \)), our intuitive formulation of HOL’s typedef in a logic richer than HOL: Assuming (8) holds, we find B such that (7) holds. The important twist is that this was performed in an arbitrary proof context, freed from the restrictions imposed by traditional HOL.

Proposition 1

Any HOL definitional theory is consistent w.r.t. HOL deduction extended with the (LT) rule.

Proof

Let D be a definitional theory. Assume, for a contradiction, that \(D;\vdash ' \mathsf {{False}}\) where \(\vdash '\) denotes HOL deduction enriched with (LT). We take \(I\) to be a standard model of D (which exists by Pitts’s result (ii) above, since D is definitional). Since the HOL rules are sound for \(I\) and, from the above lemma, (LT) is also sound for \(I\), we obtain \(\vDash _I\mathsf {{False}}\), which is impossible. \(\square \)

Consistency also holds for the case of Isabelle/HOL:

Proposition 2

Any Isabelle/HOL definitional theory is consistent w.r.t. HOL deduction extended with the (LT) rule.

The proof of this proposition can be found in our recent paper on proving Isabelle/HOL’s consistency [27]. There, we first propose a stronger logic, HOL with comprehension types (HOLC), and show that HOLC is consistent. Then we soundly translate Isabelle/HOL’s logic into HOLC and prove that the (LT) rule is admissible in HOLC, which shows that (LT) is a consistent addition to Isabelle/HOL [27, §4].

5 From Types to Sets in HOL

Let us look again at the motivating example from Sect. 1 and see how the rule (LT) allows us to achieve the relativization from a type-based theorem to a set-based theorem in HOL or Isabelle/HOL without type classes. We assume (1) is a theorem, and wish to prove (2). We fix \(\alpha \) and \(A_{\alpha \,\mathsf {set}}\) and assume \(A \ne \emptyset \). Applying (LT), we obtain a type \(\beta \) (represented by a fresh type variable) such that \(\exists Abs \; Rep .{}_{\alpha }(\beta \approx A)^{ Abs }_{ Rep }\), from which we obtain \(\mathsf {{Abs}}\) and \(\mathsf {{Rep}}\) such that \({}_{\alpha }(\beta \approx A)^{\mathsf {{Abs}}}_{\mathsf {{Rep}}}\). From this, (1) with \(\alpha \) instantiated to \(\beta \), and the definition of \({{\mathsf {lists}}}\), we obtain

$$\begin{aligned} \exists { xs }_{\beta \,\mathsf {list}} \in {{\mathsf {lists}}}\;({{\mathsf {UNIV}}}_{\beta \,\mathsf {set}}).\mathsf {P}_{\beta \,\mathsf {list}\rightarrow \mathsf {bool}}\;{ xs }. \end{aligned}$$

Furthermore, using that \(\mathsf {{Abs}}\) and \(\mathsf {{Rep}}\) are isomorphisms between \(A_{\alpha \,\mathsf {set}}\) and \({{\mathsf {UNIV}}}_{\beta \,\mathsf {set}}\), we obtain

$$\begin{aligned} \exists { xs }_{\alpha \,\mathsf {list}} \in {{\mathsf {lists}}}\;A_{\alpha \,\mathsf {set}}.\mathsf {P}_{\alpha \,\mathsf {list}\rightarrow \mathsf {bool}}\;{ xs }, \end{aligned}$$

as desired.Footnote 4

Let us now consider the general case. We start with a type-based theorem

$$\begin{aligned} \forall \alpha .\varphi [\alpha ], \end{aligned}$$
(9)

where \(\varphi [\alpha ]\) is a formula containing \(\alpha \). We fix \(\alpha \) and \(A_{\alpha \,\mathsf {set}}\), assume \(A \ne \emptyset \) and “define” a new type \(\beta \) isomorphic to A. Technically, we fix a fresh type variable \(\beta \) and assume

$$\begin{aligned} \exists Abs \; Rep .{}_{\alpha }(\beta \approx A)^{ Abs }_{ Rep }. \end{aligned}$$
(10)

From the last formula, we obtain the isomorphisms \(\mathsf {{Abs}}\) and \(\mathsf {{Rep}}\) between \(\beta \) and A. Having the isomorphisms, we can carry out the relativization along them and prove

$$\begin{aligned} \varphi [\beta ] \longleftrightarrow \varphi ^\mathsf {on}[\alpha , A_{\alpha \,\mathsf {set}}], \end{aligned}$$
(11)

where \(\varphi ^\mathsf {on}[\alpha , A_{\alpha \,\mathsf {set}}]\) is the relativization of \(\varphi [\beta ]\). In the motivational example:

$$\begin{aligned} \varphi [\beta ]&= \exists { xs }_{\beta \,\mathsf {list}}.\mathsf {P}\;{ xs }\\ \varphi ^\mathsf {on}[\alpha , A_{\alpha \,\mathsf {set}}]&= \exists { xs }_{\alpha \,\mathsf {list}} \in {{\mathsf {lists}}}\;A.\mathsf {P}\;{ xs }\end{aligned}$$

We postpone the discussion of how to derive \(\varphi ^\mathsf {on}\) from \(\varphi \) in a principled way and how to automatically prove the equivalence between them until Sect. 7. Here, we only appeal to the intuition: For example, if \(\varphi \) contains the universal quantification \(\forall x_\beta \), we replace it by the related bounded quantification \(\forall x_\alpha \in A\) in \(\varphi ^\mathsf {on}\). Or, if \(\varphi \) contains the predicate \({{\mathsf {inj}}}\;f_{\beta \rightarrow \gamma }\), we replace it by the related notion of \({{\mathsf {inj}}}^\mathsf {on}\;A_{\alpha \,\mathsf {set}}\;f_{\alpha \rightarrow \gamma }\) in \(\varphi ^\mathsf {on}\).
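Schematically (the precise treatment follows in Sect. 7), the relativization replaces the type-based constructs over \(\beta \) by their bounded counterparts over A:

$$\begin{aligned} \forall x_\beta .\,\psi \;&\longmapsto \; \forall x_\alpha \in A.\,\psi ^\mathsf {on}&\qquad \exists x_\beta .\,\psi \;&\longmapsto \; \exists x_\alpha \in A.\,\psi ^\mathsf {on}\\ {{\mathsf {UNIV}}}_{\beta \,\mathsf {set}} \;&\longmapsto \; A_{\alpha \,\mathsf {set}}&\qquad {{\mathsf {inj}}}\;f_{\beta \rightarrow \gamma } \;&\longmapsto \; {{\mathsf {inj}}}^\mathsf {on}\;A\;f_{\alpha \rightarrow \gamma } \end{aligned}$$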

Since the left-hand side of the equivalence (11) is an instance of (9), we discharge the left-hand side and obtain \(\varphi ^\mathsf {on}[\alpha , A_{\alpha \,\mathsf {set}}]\), which no longer contains the locally “defined” type \(\beta \). Thus we can discard \(\beta \). Technically, we use the (LT) rule and remove the assumption (10). Thus we obtain the final result:

$$\begin{aligned} \forall \alpha .\forall A_{\alpha \,\mathsf {set}}.A \ne \emptyset \longrightarrow \varphi ^\mathsf {on}[\alpha , A] \end{aligned}$$

This theorem is the set-based version of \(\forall \alpha .\varphi [\alpha ]\).

We will move to Isabelle/HOL in the next section and explore how the isomorphic journey between types and sets proceeds in the environment where we are allowed to restrict type variables by type-class annotations.

6 From Types to Sets in Isabelle/HOL

Isabelle/HOL goes beyond traditional HOL, extending it by ad hoc overloading and axiomatic type classes. In this section, we explain how these features are in conflict with the algorithm described in Sect. 5 and show how to circumvent the problem.

6.1 Local Axiomatic Type Classes

The first complication is the implicit assumptions on types given by the axiomatic type classes. Recall that \(\alpha _{{\mathsf {finite}}}\) means that \(\alpha \) can be instantiated only with a type that we proved to fulfill the conditions of the type class \({{\mathsf {finite}}}\), namely, that the type must contain finitely many elements.

To explain the complication by means of an example, let us modify (9) to speak about types of class \({{\mathsf {finite}}}\):

$$\begin{aligned} \forall \alpha _{{\mathsf {finite}}}.\varphi [\alpha _{{\mathsf {finite}}}] \end{aligned}$$
(12)

As a modification of the algorithm from Sect. 5, we fix a set A and assume that it is nonempty and finite. As before, we locally define a new type \(\beta \) isomorphic to A. Although \(\beta \) fulfills the condition of the type class \({{\mathsf {finite}}}\), we cannot add the type into the type class since this action is allowed only at the global theory level in Isabelle and not locally in a proof context.

On the other hand, without adding \(\beta \) into \({{\mathsf {finite}}}\) we cannot continue since we need to instantiate \(\beta \) for \(\alpha _{{\mathsf {finite}}}\) to prove the analog of the equivalence (11). Our solution is to internalize the type-class assumption in (12) and obtain

$$\begin{aligned} \forall \alpha .{{\mathsf {finite}}}(\alpha ) \longrightarrow \varphi [\alpha ], \end{aligned}$$
(13)

where \({{\mathsf {finite}}}(\alpha )\) is a term of type \(\mathsf {bool}\), which is true if and only if \(\alpha \) is a finite type.Footnote 5 Now we can instantiate \(\alpha \) by \(\beta \) and get \({{\mathsf {finite}}}(\beta ) \longrightarrow \varphi [\beta ]\). Using the fact that the relativization of \({{\mathsf {finite}}}(\beta )\) is \({{\mathsf {finite}}}\;A\), we apply the isomorphic translation between \(\beta \) and A and obtain

$$\begin{aligned} {{\mathsf {finite}}}\;A \longrightarrow \varphi ^\mathsf {on}[\alpha , A]. \end{aligned}$$

Quantifying over the fixed variables and adding the assumptions yields the final result, the set-based version of (12):

$$\begin{aligned} \forall \alpha .\forall A_{\alpha \,\mathsf {set}}.A \ne \emptyset \longrightarrow {{\mathsf {finite}}}\;A \longrightarrow \varphi ^\mathsf {on}[\alpha , A] \end{aligned}$$

The internalization of type classes (inferring (13) from (12)) is already supported by the kernel of Isabelle—thus no further work is required from us. The rule for internalization of type classes is a result of the work by Haftmann and Wenzel [29, 30].

6.2 Local Overloading

In the previous section we addressed implicit assumptions on types given by axiomatic type classes and showed how to reduce the relativization of such types to the original translation algorithm by internalizing the type classes as predicates on types. As we explained in Sect. 2.3, the mechanism of Haskell-like type classes in Isabelle is more general than the notion of axiomatic type classes since additionally we are allowed to associate operations with every type class. In this respect, the type class \({{\mathsf {finite}}}\) is somewhat special since there are no operations associated with it.

Therefore we use semigroups as the running example in this section since semigroups require an associated operation—multiplication. A general specification of a semigroup would contain a nonempty set \(A_{\alpha \,\mathsf {set}}\), a binary operation \(f_{\alpha \rightarrow \alpha \rightarrow \alpha }\) such that A is closed under f, and a proof of the specific property of semigroups that f is associative on A. We capture the last property by the predicate

$$\begin{aligned} \mathsf {semigroup}^\mathsf {on}_\mathsf {with}\;A\;f = (\forall x\;y\;z \in A.f\;(f\;x\;y)\;z = f\;x\;(f\;y\;z)), \end{aligned}$$

which we read along the paradigm: a structure on the set A with operations \(f_1, \dots , f_n\).

The realization of semigroups by type classes in Isabelle is an example that reflects the problem’s full generality. The type \(\sigma \) can belong to the type class \(\mathsf {semigroup}\) if \(\mathsf {semigroup}(\sigma )\) is provable, where

$$\begin{aligned} \mathsf {semigroup}(\alpha ) \text{ iff } \; \forall x_\alpha \;y_\alpha \;z_\alpha .(x*y)*z = x*(y*z). \end{aligned}$$
(14)

Notice that the associated multiplication operation is represented by the global overloaded constant \(*_{\alpha \rightarrow \alpha \rightarrow \alpha }\), which will cause the complication.

Let us now relativize \(\forall \alpha _\mathsf {semigroup}.\varphi [\alpha _\mathsf {semigroup}]\). We fix a nonempty set A, a binary operation f such that A is closed under f, and assume \(\mathsf {semigroup}^\mathsf {on}_\mathsf {with}\;A\;f\). As before, we locally define \(\beta \) to be isomorphic to A and obtain the respective isomorphisms \(\mathsf {{Abs}}\) and \(\mathsf {{Rep}}\).

Having defined \(\beta \), we want to prove that \(\beta \) belongs to \(\mathsf {semigroup}\). Using the approach from the previous section, this goal translates into proving \(\mathsf {semigroup}(\beta )\), which requires that the overloaded constant \(*_{\beta \rightarrow \beta \rightarrow \beta }\) used in the definition of \(\mathsf {semigroup}\) [see (14)] be isomorphic to f on A. In other words, we have to locally define \(*_{\beta \rightarrow \beta \rightarrow \beta }\) to be a projection of f onto \(\beta \), i.e., \(x_\beta * y_\beta \) must equal \(\mathsf {{Abs}}(f\;(\mathsf {{Rep}}\;x)\;(\mathsf {{Rep}}\;y))\). Although we can locally “define” a new constant (fix a fresh term variable c and assume \(c = t\)), we cannot overload the global symbol \(*\) locally for \(\beta \); this is not supported by Isabelle.

We will cope with the complication by compiling out the overloaded constant \(*\) from

$$\begin{aligned} \forall \alpha .\mathsf {semigroup}(\alpha ) \longrightarrow \varphi [\alpha ] \end{aligned}$$
(15)

as follows: Whenever \(c = \ldots * \ldots \) (i.e., c was defined in terms of \(*\) and thus depends implicitly on the overloaded meaning of \(*\)), define \(c_\mathsf {with}\;f = \dots f \dots \) and use it instead of c. The parameter f abstracts over the meaning of \(*\) here: whenever we want to use \(c_\mathsf {with}\), we have to explicitly specify how to perform multiplication in \(c_\mathsf {with}\) by instantiating f. That is to say, the implicit meaning of \(*\) in c was made explicit by f in \(c_\mathsf {with}\). Using this approach, we obtain:

$$\begin{aligned} \forall \alpha .\forall f_{\alpha \rightarrow \alpha \rightarrow \alpha }.\mathsf {semigroup}_\mathsf {with}\;f \longrightarrow \varphi _\mathsf {with}[\alpha ,f], \end{aligned}$$
(16)

where \(\mathsf {semigroup}_\mathsf {with}\;f_{\alpha \rightarrow \alpha \rightarrow \alpha } = (\forall x_\alpha \;y_\alpha \;z_\alpha .f\;(f\;x\;y)\;z = f\;x\;(f\;y\;z))\) and similarly for \(\varphi _\mathsf {with}\). For now, let us assume that we can provably obtain (16) from (15) and look at how it helps us finish the relativization; later we will explain how to derive (16) as a theorem.

Given (16), we will instantiate \(\alpha \) with \(\beta \) and obtain

$$\begin{aligned} \forall f_{\beta \rightarrow \beta \rightarrow \beta } .\mathsf {semigroup}_\mathsf {with}\;f \longrightarrow \varphi _\mathsf {with}[\beta ,f]. \end{aligned}$$

Recall that the quantification over all functions of type \(\beta \rightarrow \beta \rightarrow \beta \) is isomorphic to the bounded quantification over all functions of type \(\alpha \rightarrow \alpha \rightarrow \alpha \) under which \(A_{\alpha \,\mathsf {set}}\) is closed.Footnote 6 The difference after compiling out the overloaded constant \(*\) is that now we are isomorphically relating two bounded (local) variables from the quantification and not a global constant \(*\) to a local variable.

Thus we reduced the relativization once again to the original algorithm and can obtain the set-based version

$$\begin{aligned}&\forall \alpha .\forall A_{\alpha \,\mathsf {set}}.A\ne \emptyset \longrightarrow \\&\quad \forall f_{\alpha \rightarrow \alpha \rightarrow \alpha }.(\forall x_\alpha \;y_\alpha \in A.f\;x\;y \in A) \longrightarrow \mathsf {semigroup}^\mathsf {on}_\mathsf {with}\;A\;f \longrightarrow \varphi ^\mathsf {on}_\mathsf {with}[\alpha ,A,f]. \end{aligned}$$

Let us discuss how to obtain (16) as a theorem. We derive it by performing the dictionary construction, whose detailed description can be found, for example, in the paper by Krauss and Schropp [31, §5.2]. We will outline the process only informally here. Our task is to compile out an overloaded constant \(*\) from a term s. As a first step, we transform s into \(s_\mathsf {with}[*/f]\) such that \(s = s_\mathsf {with}[*/f]\) and such that unfolding the definitions of all constants in \(s_\mathsf {with}\) does not yield \(*\) as a subterm. We proceed for every constant c in s as follows: if c has no definition, we do not do anything. If c was defined as \(c = t\), we first apply the construction recursively on t and obtain \(t_\mathsf {with}\) such that \(t = t_\mathsf {with}[*/f]\); thus \(c = t_\mathsf {with}[*/f]\). Now we define a new constant \(c_\mathsf {with}\;f = t_\mathsf {with}\). As \(c_\mathsf {with}\;* = c\), we replace c in s by \(c_\mathsf {with}\;*\). At the end, we obtain \(s = s_\mathsf {with}[*/f]\) as a theorem. Notice that this procedure produces \(s_\mathsf {with}\) that does not semantically depend on \(*\) only if there is no type in s that depends on \(*\).
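For instance, for a hypothetical constant \(\mathsf {square}\) defined from \(*\) (our own toy example), the construction introduces a new constant \(\mathsf {square}_\mathsf {with}\) and replaces \(\mathsf {square}\) by \(\mathsf {square}_\mathsf {with}\;*\):

$$\begin{aligned} \mathsf {square}\;x = x * x \quad&\leadsto \quad \mathsf {square}_\mathsf {with}\;f\;x = f\;x\;x, \qquad \mathsf {square} = \mathsf {square}_\mathsf {with}\;(*),\\ s = \mathsf {P}\;(\mathsf {square}\;y)\quad&\leadsto \quad s_\mathsf {with}= \mathsf {P}\;(\mathsf {square}_\mathsf {with}\;f\;y), \qquad s = s_\mathsf {with}[*/f]. \end{aligned}$$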

Thus the above-described step applied to (15) produces

$$\begin{aligned} \forall \alpha .\mathsf {semigroup}_\mathsf {with}\;*_{\alpha \rightarrow \alpha \rightarrow \alpha } \longrightarrow \varphi _\mathsf {with}[\alpha ,f_{\alpha \rightarrow \alpha \rightarrow \alpha }][*_{\alpha \rightarrow \alpha \rightarrow \alpha }/f_{\alpha \rightarrow \alpha \rightarrow \alpha }]. \end{aligned}$$

To finish the dictionary construction, we replace every occurrence of \(*_{\alpha \rightarrow \alpha \rightarrow \alpha }\) by a universally quantified variable \(f_{\alpha \rightarrow \alpha \rightarrow \alpha }\) and obtain (16). This derivation step is not currently allowed in Isabelle. The intuition for why this derivation is sound is as follows: since \(*_{\alpha \rightarrow \alpha \rightarrow \alpha }\) is a type-class operation, there exist overloaded definitions only for strict instances of \(*\) (such as \(*_{\mathsf {nat}\rightarrow \mathsf {nat}\rightarrow \mathsf {nat}}\)) but never for \(*_{\alpha \rightarrow \alpha \rightarrow \alpha }\); thus the meaning of \(*_{\alpha \rightarrow \alpha \rightarrow \alpha }\) remains unrestricted. That is to say, \(*_{\alpha \rightarrow \alpha \rightarrow \alpha }\) permits any interpretation and hence it must behave as a term variable. We will formulate a rule (an extension of Isabelle’s logic) that allows us to perform the above-described derivation.

First, let us recall that \(\rightsquigarrow ^{\downarrow }\) is the substitutive closure of the constant/type dependency relation \(\rightsquigarrow \) from Sect. 2.3 and \({\varDelta }_c\) is the set of all types for which c was overloaded. The notation \(\sigma \not \le S\) means that \(\sigma \) is not an instance of any type in S. We shall write \(R^+\) for the transitive closure of R. Now we can formulate the Unoverloading Rule (UO):

$$\begin{aligned} \frac{\vdash \varphi }{\vdash \forall x_\sigma .\,\varphi [x_\sigma /c_\sigma ]}\;(\mathrm {UO})\quad \left[ \begin{array}{l} \lnot \big (u \;(\rightsquigarrow ^{\downarrow })^{+}\; c_\sigma \big ) \text { for every type or constant instance } u \text { in } \varphi \\ \sigma \not \le {\varDelta }_c \end{array}\right] \end{aligned}$$

This means that we can replace occurrences of the constant \(c_\sigma \) in \(\varphi \) by the universally quantified variable \(x_\sigma \) under the following two side conditions:

  1. No type or constant instance in \(\varphi \) semantically depends on \(c_\sigma \) through a chain of constant and type definitions. The constraint is fulfilled in the first step of the dictionary construction since, for example, \(\varphi _\mathsf {with}[\alpha ,*]\) does not contain any hidden \(*\)s due to the construction of \(\varphi _\mathsf {with}\).Footnote 7

  2. There is no matching definition for \(c_\sigma \). In our use case, \(c_\sigma \) is always a type-class operation with its most general type (e.g., \(*_{\alpha \rightarrow \alpha \rightarrow \alpha }\)). As already mentioned, we overload a type-class operation only for strictly more specific types (such as \(*_{\mathsf {nat}\rightarrow \mathsf {nat}\rightarrow \mathsf {nat}}\)) and never for its most general type; thus the condition \(\sigma \not \le {\varDelta }_c\) is always fulfilled.

Proposition 3

Any Isabelle/HOL definitional theory is consistent w.r.t. HOL deduction extended with the (LT) and (UO) rules.

Again, the proof of this proposition is based on the translation from Isabelle/HOL to HOL with comprehension types (HOLC) from [27]. It is presented in [27, §4].

Notice that the (UO) rule suggests that even in the presence of ad hoc overloading, the polymorphic overloaded constants retain parametricity under some conditions. In the next section, we will look at a concrete example of relativization of a formula with type classes.

6.3 Example: Relativization of Topological Spaces

We will show an example of relativization of a type-based theorem with type classes into a set-based theorem from the field of topology (addressing Immler’s concern discussed in Sect. 1). The type class in question will be a topological space, which has one associated operation \(\mathsf {open}: \alpha \,\mathsf {set}\rightarrow \mathsf {bool}\), a predicate defining the open subsets of \(\alpha \). We require that the whole space is open, finite intersections of open sets are open, finite or infinite unions of open sets are open, and that every two distinct points can be separated by two disjoint open sets containing them. Such a topological space is called a T2 space and therefore we call the respective type class \(\mathsf {T2\hbox {-}{}space}\).

One of the basic properties of T2 spaces is the fact that every compact set is closed:

$$\begin{aligned} \forall \alpha _\mathsf {T2\hbox {-}{}space}.\forall S_{\alpha \,\mathsf {set}}.\mathsf {compact}\;S\longrightarrow \mathsf {closed}\;S \end{aligned}$$
(17)

A set is compact if every open cover of it has a finite subcover. A set is closed if its complement is open, i.e., \(\mathsf {closed}\;S = \mathsf {open}\;(-S)\). Recall that our main motivation is to solve the problem when we have a T2 space on a proper subset of \(\alpha \). Let us show the translation of (17) into a set-based variant, which solves the problem. We will observe what happens to the predicate \(\mathsf {closed}\) during the translation.
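For concreteness, the two predicates can be spelled out as follows; we state them in the notation of this paper, and the Isabelle library formulation may differ in details:

$$\begin{aligned} \mathsf {compact}\;S&\longleftrightarrow \big (\forall \mathcal {C}.\;(\forall U \in \mathcal {C}.\;\mathsf {open}\;U) \wedge S \subseteq \textstyle \bigcup \mathcal {C} \longrightarrow (\exists \mathcal {D} \subseteq \mathcal {C}.\;\mathsf {finite}\;\mathcal {D} \wedge S \subseteq \textstyle \bigcup \mathcal {D})\big )\\ \mathsf {closed}\;S&\longleftrightarrow \mathsf {open}\;(-S) \end{aligned}$$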

We will first internalize the type class \(\mathsf {T2\hbox {-}{}space}\) and then abstract over its operation \(\mathsf {open}\) via the first step of the dictionary construction. As a result, we obtain

$$\begin{aligned} \forall \alpha .\mathsf {T2\hbox {-}{}space}_\mathsf {with}\;\mathsf {open}\longrightarrow \forall S_{\alpha \,\mathsf {set}}.\mathsf {compact}_\mathsf {with}\;\mathsf {open}\;S\longrightarrow \mathsf {closed}_\mathsf {with}\;\mathsf {open}\;S, \end{aligned}$$

where \(\mathsf {closed}_\mathsf {with}\; open \;S = open \;(-S)\). Let us apply (UO) and generalize over \(\mathsf {open}\):

$$\begin{aligned} \begin{aligned}&\forall \alpha .\forall open _{\alpha \,\mathsf {set}\rightarrow \mathsf {bool}}.\\&\quad \mathsf {T2\hbox {-}{}space}_\mathsf {with}\; open \longrightarrow \forall S_{\alpha \,\mathsf {set}}.\mathsf {compact}_\mathsf {with}\; open \;S\longrightarrow \mathsf {closed}_\mathsf {with}\; open \;S \end{aligned} \end{aligned}$$
(18)

The last formula is a variant of (17) after we have internalized the type class \(\mathsf {T2\hbox {-}{}space}\) and compiled out its operation. We have thus reduced the task to the original algorithm (using Local Typedef) from Sect. 5. As always, we fix a nonempty set \(A_{\alpha \,\mathsf {set}}\), locally define \(\beta \) to be isomorphic to A and transfer the \(\beta \)-instance of (18) onto the \(A_{\alpha \,\mathsf {set}}\)-level:

$$\begin{aligned}&\forall \alpha .\forall A_{\alpha \,\mathsf {set}}.A\ne \emptyset \longrightarrow \forall open _{\alpha \,\mathsf {set}\rightarrow \mathsf {bool}}.\mathsf {T2\hbox {-}{}space}^\mathsf {on}_\mathsf {with}\;A\; open \longrightarrow \\&\quad \forall S_{\alpha \,\mathsf {set}}\subseteq A.\mathsf {compact}^\mathsf {on}_\mathsf {with}\;A\; open \;S\longrightarrow \mathsf {closed}^\mathsf {on}_\mathsf {with}\;A\; open \;S \end{aligned}$$

This is the set-based variant of the original theorem (17). Let us show what happened to \(\mathsf {closed}_\mathsf {with}\): its relativization is defined as \(\mathsf {closed}_\mathsf {with}^\mathsf {on}\;A\; open \;S = open \;(-S \cap A)\). Notice that we did not have to restrict \( open \) while moving between \(\beta \) and A (since the function does not produce any values of type \(\beta \)), whereas S is restricted since subsets of \(\beta \) correspond to subsets of A.
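As an illustration, the two compiled-out versions of \(\mathsf {closed}\) could be introduced in Isabelle roughly as follows (the names closed_with and closed_with_on are ours, standing for \(\mathsf {closed}_\mathsf {with}\) and \(\mathsf {closed}^\mathsf {on}_\mathsf {with}\)):

(* closed, with the class operation "open" abstracted into an explicit parameter opn *)
definition closed_with :: "('a set ⇒ bool) ⇒ 'a set ⇒ bool"
  where "closed_with opn S ⟷ opn (- S)"

(* the set-based relativization: the complement is taken relative to the carrier A *)
definition closed_with_on :: "'a set ⇒ ('a set ⇒ bool) ⇒ 'a set ⇒ bool"
  where "closed_with_on A opn S ⟷ opn (- S ∩ A)"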

6.4 General Case

Having seen a concrete example, let us finally aim for the general case. Let us assume that \(\varUpsilon \) is a type class depending on the overloaded constants \(*_1, \dots , *_n\), written \(\overline{*}\). We write \(A\downarrow \overline{f}\) to mean that A is closed under operations \(f_1, \dots , f_n\).

The following derivation tree shows how we derive, from the type-based theorem \(\vdash \forall \alpha _\varUpsilon .\varphi [\alpha _\varUpsilon ]\) (the topmost formula in the tree), its set-based version (the bottommost formula). Explanation of the derivation steps follows after the tree.

[Derivation tree (figure): the type-based theorem at the top is transformed into its set-based version at the bottom by steps (1)–(8).]

Derivation steps:

  (1) The class internalization from Sect. 6.1.

  (2) The first step of the dictionary construction from Sect. 6.2.

  (3) The Unoverloading rule (UO) from Sect. 6.2.

  (4) We fix fresh \(\alpha \), \(A_{\alpha \,\mathsf {set}}\) and assume that A is nonempty. We locally define a new type \(\beta \) to be isomorphic to A; i.e., we fix fresh \(\beta \), \( Abs _{\alpha \rightarrow \beta }\) and \( Rep _{\beta \rightarrow \alpha }\) and assume \({}_{\alpha }(\beta \approx A)^{ Abs }_{ Rep }\).

  (5) We instantiate \(\alpha \) in the conclusion with \(\beta \).

  (6) Relativization along the isomorphism between \(\beta \) and A—see Sect. 7.

  (7) Since \( Abs \) and \( Rep \) are present only in \({}_{\alpha }(\beta \approx A)^{ Abs }_{ Rep }\), we can existentially quantify over them and replace the hypothesis with \(\exists Abs \; Rep .{}_{\alpha }(\beta \approx A)^{ Abs }_{ Rep }\), which we discharge by the Local Typedef rule from Sect. 3, as \(\beta \) is not present elsewhere either (the previous step (6) removed all occurrences of \(\beta \) in the conclusion).

  (8) We move all hypotheses into the conclusion and quantify over all fixed variables.

As previously discussed, step (2), the dictionary construction, cannot be performed for types depending on overloaded constants unless we want to compile out such types too. In the next section, we will explain the last missing piece: the relativization step (6).

Note that our approach addresses one of the long-standing user complaints: the impossibility of providing two different orders for the same type when using the type class of orders. With our approach, users can still enjoy the advantages of type classes while proving abstract properties about orders, and then export only the final product as a set-based theorem (which quantifies over all possible orders).

7 Transfer: Automated Relativization

In this section, we will describe a procedure that automatically achieves relativization of type-based theorems. Recall that we are facing the following problem: we have two types \(\beta \) and \(\alpha \) such that \(\beta \) is isomorphic to some (nonempty) set \(A_{\alpha \,\mathsf {set}}\), a proper subset of \(\alpha \), via two isomorphisms \(\mathsf {{Abs}}_{\alpha \rightarrow \beta }\) and \(\mathsf {{Rep}}_{\beta \rightarrow \alpha }\). In this setting, given a formula \(\varphi [\beta ]\), we want to find its isomorphic image \(\varphi ^\mathsf {on}[\alpha , A]\) along \(\mathsf {{Abs}}\) and \(\mathsf {{Rep}}\) and prove \(\varphi [\beta ] \iff \varphi ^\mathsf {on}[\alpha , A]\). Thanks to previous work in which the first author of this paper participated [32], we can use Isabelle’s Transfer tool, which automatically synthesizes the relativized formula \(\varphi ^\mathsf {on}[\alpha , A]\) and proves its equivalence with the original formula \(\varphi [\beta ]\).

We will sketch the main principles of the tool on the following example, where (20) is a relativization of (19):

$$\begin{aligned}&\forall f_{\beta \rightarrow \gamma }\;{ xs }_{\beta \;\mathsf {list}}\;{ ys }_{\beta \;\mathsf {list}}.{{\mathsf {inj}}}\;f\longrightarrow (\mathsf {{map}}\;f\;{ xs }= \mathsf {{map}}\;f\;{ ys }) \iff ({ xs }= { ys }) \end{aligned}$$
(19)
$$\begin{aligned}&\forall f_{\alpha \rightarrow \gamma }.\forall { xs }\;{ ys }\in {{\mathsf {lists}}}\;A_{\alpha \,\mathsf {set}}.\nonumber \\&\quad {{\mathsf {inj}}}_\mathsf {on}\;A\;f\longrightarrow (\mathsf {{map}}\;f\;{ xs }= \mathsf {{map}}\;f\;{ ys }) \iff ({ xs }= { ys }) \end{aligned}$$
(20)

First of all, we reformulate the problem slightly. We will not talk about the isomorphisms \(\mathsf {{Abs}}\) and \(\mathsf {{Rep}}\) but instead express the isomorphism between A and \(\beta \) by a binary relation \(\mathsf {T}_{\alpha \rightarrow \beta \rightarrow \mathsf {bool}}\) defined as \(\mathsf {T}\;x\;y = (\mathsf {{Rep}}\;y = x)\). We call \(\mathsf {T}\) a transfer relation. Notice that \(\mathsf {T}\) captures the inverse of the injection of \(\beta \) into \(\alpha \), and therefore the relation is right-total (\(\forall y.\exists x.R\;x\;y\)), right-unique (\(\forall x\;y\;z.R\;x\;y \longrightarrow R\;x\;z\longrightarrow y = z\)) and left-unique (\(\forall x\;y\;z.R\;x\;z \longrightarrow R\;y\;z\longrightarrow x = y\)).
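These three properties follow directly from the definition of \(\mathsf {T}\) and the injectivity of \(\mathsf {{Rep}}\); a short worked check:

$$\begin{aligned}&\text {right-total: given } y, \text { take } x := \mathsf {{Rep}}\;y, \text { so that } \mathsf {T}\;x\;y \text { holds;}\\&\text {right-unique: } \mathsf {T}\;x\;y \text { and } \mathsf {T}\;x\;z \text { give } \mathsf {{Rep}}\;y = x = \mathsf {{Rep}}\;z, \text { hence } y = z \text { by injectivity of } \mathsf {{Rep}};\\&\text {left-unique: } \mathsf {T}\;x\;z \text { and } \mathsf {T}\;y\;z \text { give } x = \mathsf {{Rep}}\;z = y. \end{aligned}$$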

The Transfer tool requires two sorts of setup as input: (1) we need more structural information about the type constructors in \(\varphi \) (more than that they are just sets of elements); (2) we need to know how to transfer the constants in \(\varphi \) along the transfer relation \(\mathsf {T}\).

Concerning (1), we assume that there exists a relator for every nonnullary type constructor in \(\varphi \). Relators lift relations over type constructors: Related data structures have the same shape, with pointwise-related elements (e.g., the relator \(\mathsf {list\_{}all2}\) for lists), and related functions map related input to related output. Concrete definitions follow:

$$\begin{aligned}&\mathsf {list\_{}all2}: (\alpha \rightarrow \beta \rightarrow \mathsf {bool})\rightarrow \alpha \,\mathsf {list}\rightarrow \beta \,\mathsf {list}\rightarrow \mathsf {bool}\\&(\mathsf {list\_{}all2}\;R)\;xs\;ys \equiv ({{\mathsf {length}}}\;xs = {{\mathsf {length}}}\;ys) \wedge (\forall (x, y) \in \mathsf {set}\, (\mathsf {zip}\;xs\;ys).\;R\;x\;y)\\&{\mathbin {\Mapsto }} : (\alpha \rightarrow \gamma \rightarrow \mathsf {bool}) \rightarrow (\beta \rightarrow \delta \rightarrow \mathsf {bool}) \rightarrow (\alpha \rightarrow \beta ) \rightarrow (\gamma \rightarrow \delta ) \rightarrow \mathsf {bool}\\&(R \mathbin {\Mapsto }S)\;f\;g \equiv \forall x\;y.R\;x\;y \longrightarrow S\;(f\;x)\;(g\;y) \end{aligned}$$
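For instance, on concrete lists the list relator unfolds as follows (a small worked instance of the definition above):

$$\begin{aligned}&\mathsf {list\_{}all2}\;\mathsf {T}\;[x_1, x_2]\;[y_1, y_2] \iff \mathsf {T}\;x_1\;y_1 \wedge \mathsf {T}\;x_2\;y_2\\&\mathsf {list\_{}all2}\;\mathsf {T}\;[x_1, x_2]\;[y_1] \iff \mathsf {False}\quad \text {(the lengths differ)} \end{aligned}$$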

Concerning (2), we need a transfer rule for every constant present in \(\varphi \). The transfer rules express how constants on \(\alpha \) relate to constants on \(\beta \) along \(\mathsf {T}\). Let us look at some examples:

$$\begin{aligned}&((\mathsf {T}\mathbin {\Mapsto }{=}) \mathbin {\Mapsto }{=})\;({{\mathsf {inj}}}_\mathsf {on}\;A)\;{{\mathsf {inj}}} \end{aligned}$$
(21)
$$\begin{aligned}&((\mathsf {T}\mathbin {\Mapsto }{=}) \mathbin {\Mapsto }{=})\;(\forall \_ \in A)\;(\forall )\end{aligned}$$
(22)
$$\begin{aligned}&((\mathsf {list\_{}all2}\;\mathsf {T}\mathbin {\Mapsto }{=}) \mathbin {\Mapsto }{=})\;(\forall \_ \in {{\mathsf {lists}}}\;A)\;(\forall )\end{aligned}$$
(23)
$$\begin{aligned}&((\mathsf {T}\mathbin {\Mapsto }{=}) \mathbin {\Mapsto }\mathsf {list\_{}all2}\;\mathsf {T}\mathbin {\Mapsto }\mathsf {list\_{}all2}\;{=})\;\mathsf {{map}}\;\mathsf {{map}}\end{aligned}$$
(24)
$$\begin{aligned}&(\mathsf {list\_{}all2}\;\mathsf {T}\mathbin {\Mapsto }\mathsf {list\_{}all2}\;\mathsf {T}\mathbin {\Mapsto }{=})\;(=)\;(=) \end{aligned}$$
(25)

As already mentioned, the universal quantification on \(\beta \) corresponds to a bounded quantification over A on \(\alpha \) (\(\forall \_ \in A\)). The relation between the two constants is obtained purely syntactically from the type of the constant on \(\beta \) (e.g., \((\beta \rightarrow \gamma ) \rightarrow \mathsf {bool}\) for \({{\mathsf {inj}}}\)): We replace every type that does not change (\(\gamma \) and \(\mathsf {bool}\)) by the identity relation \(=\), every nonnullary type constructor by its corresponding relator (\(\rightarrow \) by \(\mathbin {\Mapsto }\) and \(\mathsf {list}\) by \(\mathsf {list\_{}all2}\)) and every type that changes by the corresponding transfer relation (\(\beta \) by \(\mathsf {T}\)).
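As a worked instance of this recipe, the type of \(\mathsf {{map}}\) on the \(\beta \) side yields exactly the relation used in rule (24):

$$\begin{aligned} (\beta \rightarrow \gamma )\rightarrow \beta \,\mathsf {list}\rightarrow \gamma \,\mathsf {list}\quad \leadsto \quad (\mathsf {T}\mathbin {\Mapsto }{=}) \mathbin {\Mapsto }\mathsf {list\_{}all2}\;\mathsf {T}\mathbin {\Mapsto }\mathsf {list\_{}all2}\;{=} \end{aligned}$$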

Having the appropriate setup, the Transfer tool builds a derivation tree whose root is the equivalence theorem between (19) and (20), whose leaves contain the above-stated transfer rules (21)–(25) and whose other nodes are derived by the following three structural rules (for a bound variable, application and lambda abstraction):
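(The following is a reconstruction of these rules in the notation of [32]; the tool’s precise formulation may differ in minor details.)

$$\begin{aligned} \frac{(R\;x\;y)\in {\varGamma }}{{\varGamma }\vdash R\;x\;y}\;(\mathsf {Var}) \qquad \frac{{\varGamma }\vdash (R \mathbin {\Mapsto }S)\;f\;g\qquad {\varGamma }\vdash R\;x\;y}{{\varGamma }\vdash S\;(f\;x)\;(g\;y)}\;(\mathsf {App}) \qquad \frac{{\varGamma },\,R\;x\;y\vdash S\;t\;u}{{\varGamma }\vdash (R \mathbin {\Mapsto }S)\;(\lambda x.\,t)\;(\lambda y.\,u)}\;(\mathsf {Lam}) \end{aligned}$$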

The similarity of these rules to the typing rules of the simply typed lambda calculus is not a coincidence: a judgment here involves two terms instead of one, and a binary relation takes the place of a type. The environment \({\varGamma }\) collects the local assumptions for bound variables. Thus, since (19) and (20) are of type \(\mathsf {bool}\) and the corresponding relation for \(\mathsf {bool}\) is \({=}\), the procedure produces (19) = (20). Overall, if the Transfer tool has the appropriate setup available (for type constructors and transfer rules), it can automatically derive the equivalence theorem for any closed lambda term.

Finally, let us discuss for which class of formulas we can obtain the appropriate setup. First, the setup for type constructors:Footnote 8 Besides the manual setup for the function type (it is a part of Isabelle’s library), the Transfer tool generates the setup automatically for every type constructor that is a natural functor (sets, finite sets, all algebraic datatypes and codatatypes) [34].

Second, transfer rules: Instead of providing a specific transfer rule for every particular transfer relation (such as \(\mathsf {T}\)) and for every type instance of the given constant (for \(\forall \) we needed two rules, (22) and (23)), the Transfer tool needs only one general rule for a given polymorphic constant—the specific ones are derived internally from the general one. These general rules capture a certain notion of restricted parametricityFootnote 9 and take a transfer relation as a parameter. Since in our Types-to-Sets project we transfer only over relations that are inverses of injections, we can additionally assume that the relation is right-total, right-unique and left-unique. Such parametricity rules hold for \(\forall \) and \(=\):

$$\begin{aligned}&\mathsf {right\_{}total}\;R \longrightarrow ((R \mathbin {\Mapsto }{=}) \mathbin {\Mapsto }{=})\;(\forall \_ \in (\mathsf {Domain}\;R))\;(\forall )\\&\mathsf {left\_{}unique}\;R \longrightarrow \mathsf {right\_{}unique}\;R \longrightarrow (R \mathbin {\Mapsto }R \mathbin {\Mapsto }{=})\;({=})\;({=}) \end{aligned}$$
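For example, instantiating R with \(\mathsf {T}\) in the first rule gives
$$\begin{aligned} \mathsf {right\_{}total}\;\mathsf {T} \longrightarrow ((\mathsf {T} \mathbin {\Mapsto }{=}) \mathbin {\Mapsto }{=})\;(\forall \_ \in (\mathsf {Domain}\;\mathsf {T}))\;(\forall ), \end{aligned}$$
and since \(\mathsf {T}\) is right-total and \(\mathsf {Domain}\;\mathsf {T} = \{x.\;\exists y.\;\mathsf {{Rep}}\;y = x\} = A\), this yields exactly the specific rule (22).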

Such parametricity rules hold for a broad class of functions—in particular, transitively for any function whose defining term is built from equality and logical quantifiers. The only troublemaker here is the choice operator, which is in general not parametric.

To conclude, the Transfer tool can automatically produce the relativization of any theorem that contains only (1) types built from the function type and natural functors and (2) constants that are parametric w.r.t. right-total, right-unique and left-unique relations.

8 Beyond Types to Sets: Types to Terms

So far we have observed that types are semantically equivalent to nonempty sets in HOL and developed a technique that allows us to translate types into sets in a principled way. In this section, we go beyond the types–sets connection and look at how types can represent mathematical objects other than sets (for example, numbers) and thus simulate restricted dependent types in HOL. We show how to use our Types-to-Sets mechanism to compile such pseudo-dependent types back into terms in case they become too restrictive.

8.1 Harrison’s Trick

HOL does not support dependent types. For example, one cannot define a type of vectors of length n by a type definition that takes the term n as a parameter. However, making use of polymorphism, the type definition can be parametrized by a type whose cardinality encodes the length n. This technique is known as Harrison’s trick in the HOL community—it was introduced in John Harrison’s work on Euclidean spaces [38], where the spaces \(\mathbb {R}^n\) are represented by a type constructor \(\alpha \,\mathsf {rvec}\), with the cardinality of \(\alpha \) encoding the dimension n [39, §1–2].
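A minimal Isabelle-style sketch of such a definition (the name rvec and the representation by functions are illustrative; moreover, unlike Harrison’s actual definition, this simplified version gives no special dimension-1 treatment to infinite index types):

(* vectors over the reals indexed by the type 'n; the cardinality of 'n encodes the dimension
   (requires the real numbers, e.g., by importing Complex_Main) *)
typedef 'n rvec = "UNIV :: ('n ⇒ real) set"
  by auto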

This allows us to prove theorems about Euclidean spaces of any dimension and to obtain theorems about specific dimensions by taking suitable type instances. For example, a theorem about \(\mathbb {R}^3\) is obtained by instantiating \(\alpha \) in \(\alpha \,\mathsf {rvec}\) with any type of cardinality 3. Harrison also used the trick to encode compatibility of dimensions in matrix multiplication—making the dimensions automatically inferable (most of the time) by HOL’s type inference. Naturally, this trick is still far from the expressiveness of full-fledged dependent types; in particular, one cannot perform arithmetic operations or induction on these type variables (at least not without a complex automation setup).

In Isabelle/HOL, we can additionally use type classes (recall that they are essentially predicates on types) to further improve the encoding: for example, we would use \(\alpha _{{\mathsf {finite}}}\,\mathsf {rvec}\) to avoid any special treatment of infinite types \(\alpha \), whereas in Harrison’s definition these are artificially designated to represent dimension 1.

8.2 Finite Fields as a Type

Divasòn et al. used Harrison’s trick in their recent work [40] to encode Galois fields \(\mathsf {GF}(p)\), i.e., finite fields of prime order (prime cardinality). These are represented as the type \(\alpha _\mathsf {pc}\,\mathsf {mod\hbox {-}{}ring}\), where:

  • \(\mathsf {pc}\) is the class of finite types of prime cardinality—note that \(\mathsf {pc}\) is a subclass of \({{\mathsf {finite}}}\)

  • \(\mathsf {mod\hbox {-}{}ring}\) is a type constructor defined as \(\alpha _{{\mathsf {finite}}}\,\mathsf {mod\hbox {-}{}ring}= \{0,\ldots ,|\alpha |-1\}\), i.e., consisting of the first \(|\alpha |\) numbers

We will abbreviate \(\alpha _\mathsf {pc}\,\mathsf {mod\hbox {-}{}ring}\) as \(\alpha _\mathsf {pc}\,\mathsf {GF}\).

Besides the usual advantage of types making proofs easier, defining prime fields as a (polymorphic) type enables the reuse of type-based results from Isabelle’s library (such as Gauss-Jordan elimination working over any \(\alpha _\mathsf {field}\)).

While using the type \(\alpha _\mathsf {pc}\,\mathsf {GF}\), Divasòn et al. faced a natural limitation of Harrison’s trick. They work with an (executable) factorization algorithm \(\mathsf {factor}: \alpha _\mathsf {pc}\,\mathsf {GF}\,\mathsf {poly}\rightarrow \alpha _\mathsf {pc}\;\mathsf {GF}\times \alpha _\mathsf {pc}\,\mathsf {GF}\,\mathsf {poly}\), where \(\beta \;\mathsf {poly}\) is the type of polynomials over \(\beta \), with \(\beta \) assumed to be a ring (as usual, via the type class mechanism). As long as the field’s order is static (e.g., when one executes \(\mathsf {factor}\) directly), one can manually obtain the appropriate instance of \(\mathsf {factor}\). But a problem arises if \(\mathsf {factor}\) is used as a black box in a bigger algorithm in which the order is computed dynamically. The algorithm would have to dynamically create a type of the given cardinality to obtain the right instance of \(\mathsf {factor}\)—this is not possible in HOL.

Divasòn et al. solved the problem by using Local Typedef to compile out the implicit type parameter into an explicit term parameter in two steps. First, they defined \(\mathsf {factor}\)’s term-based version, \(\mathsf {factor}' : \mathsf {nat}\rightarrow \mathsf {int}\,\mathsf {poly}\rightarrow \mathsf {int}\times \mathsf {int}\,\mathsf {poly}\), where the type \(\mathsf {int}\) (of integers) is the “universe” holding the carriers of all finite fields. The order p of the field is now an explicit term parameter, as in \(\mathsf {factor}'\,p\). They defined a relation between \(\alpha \,\mathsf {GF}\) and \(\mathsf {int}\) and used the Transfer tool to move between these two representations and to obtain the correctness theorem for \(\mathsf {factor}'\) from the correctness theorem for \(\mathsf {factor}\). Concretely, given the theorem \(\varphi [\alpha _\mathsf {pc}\,\mathsf {GF}, \mathsf {factor}]\), Transfer produces a theorem

$$\begin{aligned} \mathsf {prime}\,p \longrightarrow p = |\alpha _\mathsf {pc}| \longrightarrow \varphi '[\mathsf {int}, \mathsf {factor}'\,p] \end{aligned}$$

where \(\varphi '\) is the isomorphic image of \(\varphi \) along the relation between \(\alpha \,\mathsf {GF}\) and \(\mathsf {int}\).

Second, since \(\alpha \) is not present in \(\varphi '\), the assumption about \(\alpha \) really means “for any given prime p there exists a type of cardinality p.” Local Typedef allowed them to remove this assumption after proving “for any given prime p, there exists a nonempty subset of natural numbers such that the subset has cardinality p” (which is trivial, with witness \(\{0,\dots , p-1\}\)). Divasòn et al. have automated the second step for their specific use case.

8.3 Types to Terms: General Case

We now describe how to carry out the second step of Divasòn et al.’s approach in general. Recall that our goal is to use types to implicitly encode some “term information” in order to overcome the lack of dependent types. This can be achieved through a decoding function \(f : \forall \alpha _\varUpsilon .\sigma \), where the type \(\sigma \) does not contain \(\alpha \).Footnote 10 Thus any type \(\tau \) of class \(\varUpsilon \) represents the term “f applied to \(\tau \),” of type \(\sigma \).

When working with types becomes too restrictive (e.g., when we want to perform induction over \(\alpha _\varUpsilon \)), we need to switch back to terms (which allows us to take advantage of existing induction principles for \(\sigma \)), and transfer the affected type-based theorems to term-based ones. Say we have proved \(\forall \alpha _\varUpsilon .\;\varphi [\alpha ]\), where \(\varphi [\alpha ]\) is a polymorphic formula that represents a property about the encoded (term) entities. From this, applying the decoding function together with ad hoc knowledge about the behavior of the encoding, we obtain a term-based version of the above theorem, \(\forall \alpha _\varUpsilon .\;\varphi '[f(\alpha )]\).

However, this is not good enough, since the theorem still depends on \(\alpha \) and requires f to be applied explicitly to \(\alpha \). Let us imagine for a moment that f were an ordinary function (from terms to terms), say, \(f : \tau \rightarrow \sigma \). Then we could easily remove f from \(\forall x_\tau .\varphi '[f\,x]\), obtaining \(\forall y_\sigma \in \mathsf {range}\,f.\varphi '[y]\) (where \(\mathsf {range}\,f\) can be described independently of f in most practical applications). The idea is to use Types-to-Sets to translate the “types-to-terms” f into a related “terms-to-terms” function as imagined above.

We assume that there exists f’s relativization \(f^\mathsf {on}: \alpha \,\mathsf {set}\rightarrow \sigma \) together with the corresponding transfer rule. Applying our Types-to-Sets tool from Sect. 6 to the formula \(\forall \alpha _\varUpsilon .\varphi '[f(\alpha )]\), we obtain its relativization to sets

$$\begin{aligned} \forall \alpha .\forall A_{\alpha \,\mathsf {set}}.A \ne \emptyset \longrightarrow \varUpsilon ^\mathsf {on}\,A\longrightarrow \varphi '[f^\mathsf {on}\;A]. \end{aligned}$$

Finally,Footnote 11 let us assume there existFootnote 12 a representation type \(\tau \) and a function \(g : \sigma \rightarrow \tau \,\mathsf {set}\) that is a right inverse of the \(\tau \)-instance \(f^\mathsf {on}: \tau \,\mathsf {set}\rightarrow \sigma \) on some set \(S_{\sigma \,\mathsf {set}}\), i.e., \(f^\mathsf {on}\;(g\,y) = y\) for all \(y \in S\). Then we can compile \(f^\mathsf {on}\) out: instantiating \(\alpha \) with \(\tau \) and A with \(g\,y\) for \(y \in S\), and rewriting with \(f^\mathsf {on}\;(g\,y) = y\), turns the last theorem into the desired form

$$\begin{aligned} \forall y_\sigma \in S.g\,y \ne \emptyset \longrightarrow \varUpsilon ^\mathsf {on}\,(g\,y)\longrightarrow \varphi '[y]. \end{aligned}$$

For the Galois field example, we use the following instantiation: \(\sigma = \mathsf {nat}\), \(f(\alpha ) = |\alpha |\), \(f^\mathsf {on}: \alpha \,\mathsf {set}\rightarrow \mathsf {nat}\) is the cardinality function \(|\cdot |\), \(S = {\mathbb {N}}\), and \(g\,p = \{0,\ldots ,p-1\}\). Since it holds that \(\varUpsilon ^\mathsf {on}\,(g\,p) \iff \mathsf {prime}\,p\) and \(g\,p \ne \emptyset \) for every prime number p, we obtain

$$\begin{aligned} \forall p_\mathsf {nat}.\mathsf {prime}\,p\longrightarrow \varphi '[\mathsf {int},\mathsf {factor}'\,p]. \end{aligned}$$

We showed how Types-to-Sets allows us to compile out Harrison’s trick when we reach its limits. This matches the discussion in our introduction about chasing the sweet spot between expressiveness and complexity: we can bring some of the expressiveness of dependent types into HOL theorem provers and still offer an exit strategy once the approach turns out to be too restrictive. We plan to look more deeply into these unknown waters to see how far we can reliably and flexibly bend HOL towards dependent types.

9 Limitations and Future Work

We already mentioned that the dictionary construction cannot be performed for a type constructor depending on overloaded constants. To work around the problem, we could also “unfold” the definition of such a type and relativize the formula to the type’s defining predicate, as we did in our work on conservativity of type definitions in HOL and Isabelle/HOL [41]. Whether this approach would still yield practically usable theorems remains an open question.

As mentioned, we implemented the proposed extensions (the (LT) and (UO) rules), which have been part of the Isabelle distribution since Isabelle2016-1. Our future work is to implement an infrastructure that streamlines the relativization process as follows: it would automatically define the relativized constants (all the \(c_\mathsf {with}^\mathsf {on}\) constants), prove their transfer rules w.r.t. their nonrelativized counterparts and call the Transfer tool. At the moment, these steps must be performed manually. We do not expect any fundamental problems; it is just an engineering task that remains to be done.

With the infrastructure in place, we plan to perform a case study in which we would transform one of the set-based formalizations into a type-based one and obtain the set-based formulation by our Types-to-Sets tool. A natural candidate for such a case study would be Isabelle’s HOL-Algebra library [42], which does not use the type class system and instead represents various algebraic structures by locales, which are user-space parametrized theory modules with assumptions [43]. Every such locale represents the algebra’s axioms as assumptions and is parametrized by the algebra’s explicit carrier and its operations. This is a truly set-based approach.

The HOL-Algebra example exposes another limitation of Isabelle’s type system: type classes can have only one type parameter. Since we translate sets into type parameters, we cannot express algebraic structures with multiple set parameters (carriers) as a type class. Indeed, our case study would have to stop at modules, which are parametrized by two sets (corresponding to a “scalar” ring and a “vector” abelian group). In short, not every locale can be transformed into a corresponding type class.

Since all the derived type definitional commands in the various HOL systems are implemented in terms of the foundational type definition principle and this paper shows how to localize this principle, we could in principle localize all the derived definitional commands: we could have local quotients, local datatypes or local records. Needless to say, the local variants would be restricted to monomorphic definitions, just as the Local Typedef is. Whether this localization proposal is worth the effort, only time will tell.

10 Conclusion

In this paper, we proposed extending Higher-Order Logic with a Local Typedef (LT) rule. We showed that the rule is not an ad hoc addition but a natural one, in that it embodies a semantic perspective characteristic of HOL: for every nonempty set A, there must be a type that is isomorphic to A. At the same time, (LT) is careful not to introduce dependent types, since it is an open question how to integrate them into HOL.

We demonstrated how the rule allows for more flexibility in proof development: with (LT) in place, HOL users can enjoy the succinctness and proof automation provided by types while proving, and still have access to the more widely applicable, set-based theorems.

Being natural, semantically well justified and useful, the Local Typedef rule is, we believe, a good candidate for HOL citizenship. We have implemented this extension in Isabelle/HOL, but its implementation should be straightforward and noninvasive in any HOL prover. In a more expressive prover, such as HOL-Omega [13], this rule could simply be added as an axiom in the user space.

In addition, we showed that our method for relativizing theorems is applicable to types restricted by type classes as well, if we extend the logic by a rule for compiling out overloaded constants (UO). With (UO) in place, Isabelle users can reason abstractly using type classes, while at the same time having access to different instances of the relativized result.

All along according to the motto: Prove easily and still be flexible.