1 Introduction

Auto2, introduced by the author in [17], is a proof automation tool for the proof assistant Isabelle. It is designed to be a powerful, extensible prover that can consistently solve “routine” tasks encountered during a proof, thereby enabling a style of formalization using succinct proof scripts written in a custom, purely declarative language.

In this paper, we present an application of auto2 to formalization of mathematics in untyped set theoryFootnote 1. In particular, we discuss the formalization in Isabelle/FOL of the entire chain of development from the axioms of set theory to the definition of the fundamental group for an arbitrary topological space. Along the way, we discuss several improvements to auto2 as well as strategies of usage that allow us to work effectively with untyped set theory.

The contribution of this paper is two-fold. First, we demonstrate that the auto2 system is capable of independently supporting proof developments on a relatively large scale. In the previous paper, several case studies for auto2 were given in Isabelle/HOL. Each case study is at most several hundred lines long, and the use of auto2 is mixed with the use of other Isabelle tactics, as well as proof scripts provided by Sledgehammer. In contrast, the example we present in this paper is a unified development consisting of over 13,000 lines of theory files and 3,500 lines of ML code (not including the core auto2 program). The auto2 prover is used exclusively starting from basic set theory.

Second, we demonstrate one way to manage the additional complexity in proofs that arise when working with untyped set theory. For a number of reasons, untyped set theory is considered to be difficult to work with. For example, everything is represented as sets, including objects such as natural numbers that we usually do not think of as sets. Moreover, statements of theorems tend to be longer in untyped set theory than in typed theories, since assumptions that would otherwise be included in type constraints must now be stated explicitly. In this paper, we show that with appropriate definitions of basic concepts and setup for automation, all these complexities can be managed, without sacrificing the inherent flexibility of the logic.

We now give an outline for the rest of the paper. In Sect. 2, we sketch our choice of definitions of basic concepts in axiomatic set theory. In particular, we describe how to use tuples to realize extensible records, and build up the hierarchy of algebraic structures. In Sect. 3, we review the main ideas of the auto2 system, and describe several additional features, as well as strategies of usage, that allow us to manage the additional complexities of untyped set theory.

In Sect. 4, we give two examples of proof scripts using auto2, taken from the proofs of the Schroeder-Bernstein theorem and a challenge problem in analysis from Lasse Rempe-Gillen. In Sect. 5, we describe our main example, the definition of the fundamental group, in detail. Given a topological space X and a base point x on X, the fundamental group \(\pi _1(X,x)\) is defined on the quotient of the set of loops in X based at x, under the equivalence relation given by path homotopy. Multiplication on \(\pi _1(X,x)\) comes from joining two loops end-to-end. Formalizing this definition requires reasoning about algebraic and topological structures, equivalence relations, as well as continuous functions on real numbers. We believe this is a sufficiently challenging task with which to test the maturity of our framework, although it has been achieved before in the Mizar system. HOL Light and Isabelle/HOL also formalized the essential ideas on path homotopy. We review these and other related works in Sect. 6, and conclude in Sect. 7.

2 Basic Constructions in Set Theory

We now discuss our choice of definitions of basic concepts, starting with the choice of logic. Our development is based on the FOL (first-order logic) instantiation of Isabelle. The initial parts are similar to those in Isabelle/ZF, and we refer to [13, 14] for detailed explanations.

The only Isabelle types available are i for sets, o for propositions (booleans), and function types formed from them. We call objects with types other than i and o meta-functions, to distinguish them from functions defined within set theory (which have type i). It is possible to define higher-order meta-functions in FOL, and supply them with arguments in the form of lambda expressions. Theorems can be quantified over variables with functional type at the outermost level. These can be thought of as theorem-schemas in a first-order theory. However, one can only quantify over variables of type i inside the statement of a theorem, and the only equalities defined within FOL are those between types i (notation \(\cdot = \cdot \)) and o (notation \(\cdot \longleftrightarrow \cdot \)). In practice, these restrictions mean that any functions that we wish to consider as first-class objects must be defined as set-theoretic functions.

2.1 Axioms of Set Theory

For uniformity of presentation, we start our development from FOL rather than theories in Isabelle/ZF. However, the list of axioms we use is mostly the same. The only main addition is the axiom of global choice, which we use as an easier-to-apply version of the axiom of choice. Note that as in Isabelle/ZF, several of the axioms introduce new sets or meta-functions, and declare properties satisfied by them. The exact list of axioms is as follows:

figure afigure a

Next, we define several basic constructions in set theory. They are summarized in the following table. See [13] for more explanations.

figure bfigure b

2.2 Extensible Records as Tuples

We now consider the problem of representing records. In our framework, records are used to represent functions, algebraic and topological structures, as well as morphisms between structures. It is often advantageous for records of different types to share certain fields. For example, groups and rings should share the multiplication operator, rings and ordered rings should share both addition and multiplication operators, and so on.

It is well-known that when formalizing mathematics using set theory, records can be represented as tuples. To achieve sharing of fields, the key idea is to assign each shared field a fixed position in the tuple.

We begin with the example of functions. A function is a record consisting of a source set (domain), a target set (codomain), and the graph of the function. In particular, we consider two functions with the same graph but different target sets to be different functions (another structure called family is used to represent functions without specified target set). The three fields are assigned to the first three positions in the tuple:

figure cfigure c

A function with source S, target T, and graph G is represented by the tuple (we append an at the end so the definition of works properly). For G to actually represent a function, it must satisfy the conditions for a functional graph:

figure dfigure d

The set of all functions from S to T (denoted ) is then given by:

figure efigure e

Functions can be created using the following constructor. Note this is a higher-order meta-function. The argument b can be supplied by a lambda expression.

figure ffigure f

Evaluation of a function f at x (denoted ) is then defined as:

figure gfigure g

2.3 Algebraic Structures

The second major use of records is to represent algebraic structures. In our framework, we will define structures such as groups, abelian groups, rings, and ordered rings. The carrier set of a structure is assigned to the first position. The order relation, additive data, and multiplicative data are assigned to the third, fourth, and fifth position, respectively. This is expressed as follows:

figure hfigure h

Here is a subset of , and are elements of . Hence, the operators \(\le , +,\) and \(*\) can be defined as follows:

figure ifigure i

These are abbreviated to , and , respectively (in both theory files and throughout this paper, we use \(*\) to denote multiplication in groups and rings, and \(\times \) to denote product on sets and other structures). We also abbreviate to .

The constructor for group-like structures is as follows:

figure jfigure j

The following predicate asserts that a structure contains at least the fields of a group-like structure, with the right membership properties ( abbreviates ):

figure kfigure k

To check whether such a structure is in fact a monoid/group, we use the following predicates:

figure lfigure l
figure mfigure m
figure nfigure n

Note these definitions are meaningful on any structure that has multiplicative data. Likewise, we can define a predicate for abelian groups, that is meaningful for any structure that has additive data. These can be combined with distributive properties to define the predicate for a ring:

figure ofigure o

Likewise, we can define the predicate for ordered rings, and constructors for such structures. Structures are used to represent the hierarchy of numbers: we let nat int, ra, and real denote the set of natural numbers, integers, etc., while \({\mathbb {N}}, {\mathbb {Z}}, {\mathbb {Q}}\), and \({\mathbb {R}}\) denote the corresponding structures. Hence, addition on natural numbers is denoted by , addition on real numbers by , etc. We can also state and prove theorems such as , which contains all proof obligations for showing that the real numbers form an ordered field.

2.4 Morphism Between Structures

Finally, we discuss morphisms between structures. Morphisms can be considered as an extension of functions, with additional fields specifying structures on the source and target sets. The two additional fields are assigned to the fourth and fifth positions in the tuple:

figure pfigure p

The constructor for a morphism is as follows (here S and T are the source and target structures, while the source and target sets are automatically derived):

figure qfigure q

The space of morphisms (denoted ) is given by:

figure rfigure r

Note the notation for evaluation still works for morphisms. Several other concepts defined in terms of evaluation, such as image and inverse image, continue to be valid for morphisms as well, as are lemmas about these concepts. However, operations that construct new morphisms, such as inverse and composition, must be redefined. We will use to denote the composition of two functions, and to denote the composition of two morphisms.

Having morphisms store the source and target structures means we can define properties such as homomorphism on groups as a predicate:

figure sfigure s

The following lemma then states that the composition of two homomorphisms is a homomorphism (this is proved automatically using auto2):

figure tfigure t

3 Auto2 in Untyped Set Theory

In this section, we describe several additional features of auto2, as well as general strategies of using it to manage the complexities of untyped set theory.

We begin with an overview of the auto2 system (see [17] for details). Auto2 is a theorem prover packaged as a tactic in Isabelle. It works with a collection of rules of reasoning called proof steps. New proof steps can be added at any time within an Isabelle theory. They can also be deleted at any time, although it is rarely necessary to add and delete the same proof step more than once. In general, when building an Isabelle theory, the user is responsible for specifying, by adding proof steps, how to use the results proved in that theory. In return, the user no longer needs to worry about invoking these results by name in future developments.

The overall algorithm of auto2 is as follows. First, the statement to be proved is converted into contradiction form, so the task is always to derive a contradiction from a list of assumptions. During the proof, auto2 maintains a list of items, the two most common types of which are propositions (that are derived from the assumptions) and terms (that have appeared so far in the proof). Each item resides in a box, which can be thought of as a subcase of the statement to be proved (the box corresponding to the original statement is called the home box). A proof step is a function that takes as input one or two items, and outputs either new items, new cases, or the action of shadowing one of the input items, or resolving a box by proving a contradiction in that box.

The main loop of the algorithm repeatly applies the current collection of proof steps and adds any new items and cases in a best-first-search manner, until some proof step derives a contradiction in the home box. In addition to the list of items, auto2 also maintains several tables. The most important of which is the rewrite table, which keeps track of the list of currently known equalities (not containing arbitrary variables), and maintains the congruence closure of these equalities. There are two other tables: the property table and the well-form table, which we will discuss later in this section.

There are two broad categories of proof steps, which we call the standard and special proof steps in this paper. A standard proof step applies an existing theorem in a specific direction. It matches the input items to one or two patterns in the statement of the theorem, and applies the theorem to derive a new proposition. Here the matching is up to rewriting (E-matching) using the rewrite table. A special proof step can have more complex behavior, and is usually written as an ML function. The vast majority of proof steps in our example are standard, although special proof steps also play an important role.

The auto2 prover is not intended to be complete. For example, it may intentionally apply a theorem in only one of several possible directions, in order to narrow the search space. For more difficult theorems, auto2 provides a custom language of proof scripts, allowing the user to specify intermediate steps of the proof. Generally, when proving a result using auto2, the user will first try to prove it without any scripts, and in case of failure, successively add intermediate steps, perhaps by referring to an informal proof of the result. In case of failure, auto2 will indicate the first intermediate step that it is unable to prove, as well as what it is able to derive in the course of proving that step. We will show examples of proof scripts in Sect. 4.

The current version of auto2 can be set up to work with different logics in Isabelle. It contains a core program, for reasoning about predicate logic and equality, that is parametrized over the list of constants and theorems for the target logic. In particular, auto2 is now set up and tested to work with both HOL and FOL in Isabelle.

3.1 Encapsulation of Definitions

One commonly cited problem with untyped set theory is that every object is a set, including those that are not usually considered as sets. Common examples of the latter include ordered pairs, natural numbers, functions, etc. In informal treatments of mathematics, these definitions are only used to establish some basic properties of the objects concerned. Once these properties are proved, the definitions are never used again.

In formal developments, when automation is used to produce large parts of the proof, one potential problem is that the automation may needlessly expand the original definitions of objects, rather than focusing on their basic properties. This increases the search space and obscures the essential ideas of the proof. Using the ability to delete proof steps in auto2, this problem can be avoided entirely. For any definition that we wish to drop in the end, we use the following three-step procedure:

  1. 1.

    The definition is stated and added to auto2 as rewrite rules.

  2. 2.

    Basic properties of the object being defined are stated and proved. These properties are added to auto2 as appropriate proof steps.

  3. 3.

    The rewrite rules for the original definition are deleted.

For example, after the definitions concerning the representation of functions as tuples in Sect. 2.2, we prove the following lemmas, and add them as appropriate proof steps (as indicated by the attributes in brackets):

figure ufigure u
figure vfigure v
figure wfigure w

After proving these (and a few more) lemmas, the rewriting rules for the definitions of , etc., are removed. Note that all lemmas above are independent of the representation of functions as tuples. Hence, this representation is effectively hidden from the point of view of the prover. Some of the original definitions may be temporarily re-added in rare instances (for example when defining the concept of morphisms).

3.2 Property and Well-Form Tables

In this section, we discuss two additional tables maintained by auto2 during a proof. The property table is already present in the version introduced in [17], but not discussed in that paper. The well-form table is new.

The main motivation for both tables is that for many theorems, especially those stated in an untyped logic, some of its assumptions can be considered as “side conditions”. To give a basic example, consider the following lemma:

figure xfigure x

In this lemma, the last two assumptions are the “main” assumptions, while the first three are side conditions asserting that the variables in the main assumptions are well-behaved in some sense. In Isabelle/HOL, these side conditions may be folded into type or type-class constraints.

We consider two kinds of side conditions. The first kind, like the first assumption above, checks that one of the variables in the main assumptions satisfy a certain predicate. In Isabelle/HOL, these may correspond to type-class constraints. In auto2, we call these property assumptions. More precisely, given any predicate (in FOL this means constant of type ), we can register it as a property. The property table records the list of properties satisfied by each term that has appeared so far in the proof. Properties propagate through equalities: if P(a) is in the property table, and a = b is known from the rewrite table, then P(b) is automatically added to the property table. The user can also add theorems of certain forms as further propagation rules for the property table (we omit the details here).

The second kind of side conditions assert that certain terms occuring in the main assumptions are well-formed. We use the terminology of well-formedness to capture a familiar feature of mathematical language: that an expression may make implicit assumptions about its subterms. These conditions can be in the form of type constraints. For example, the expression implicitly assumes that a and b are elements in the carrier set of R. However, this concept is much more general. Some examples of well-formedness conditions are summarized in the following table:

figure yfigure y

In general, given any meta-function f, any propositional expression in terms of the arguments of f can be registered as a well-formedness condition of f. In particular, well-formedness conditions are not necessarily properties. For example, the condition for involves two variables and hence is not a property. The well-form table records, for every term encountered so far in the proof, the list of its well-formedness conditions that are satisfied. Whenever a new fact is added, auto2 checks against every known term to see whether it verifies a well-formedness condition of that term.

The property and well-form tables are used in similar ways in standard proof steps. After the proof step matches one or two patterns in the “main” assumptions or conclusion of the theorem that it applies, it checks for the side conditions in the two tables, and proceed to apply the theorem only if all side conditions are found. Of course, this requires proof steps to be re-applied if new properties or well-formedness conditions of a term becomes known.

3.3 Well-Formed Conversions

Algebraic simplification is an important part of any automatic prover. For every kind of algebraic structure, e.g. monoids, groups, abelian groups, and rings, there is a concept of normal form of an expression, and two terms can be equated if they have the same normal form. In untyped set theory, such computation of normal forms is complicated by the fact that the relevant rewriting rules have extra assumptions. For example, the rule for associativity of addition is:

figure zfigure z

The first assumption can be verified at the beginning of the normalization process. The remaining assumptions, however, are more cumbersome. In particular, they may require membership status of terms that arise only during the normalization. For example, when normalizing the term , we may first rewrite it to . The next step, however, requires , where does not occur initially and may not have occured so far in the proof. In typed theories, this poses no problem, since will be automatically given the same type as b and c when the term is created.

In untyped set theory, such membership information must be kept track of and derived when necessary. The concept of well-formed terms provides a natural framework for doing this. Before performing algebraic normalization on a term, we first check for all relevant well-formedness conditions. If all conditions are present, we produce a data structure (of type wfterm in Isabelle/ML) containing the certified term as well as theorems asserting well-formedness conditions. A theorem is called a well-formed rewrite rule if its main conclusion is an equality, each of its assumptions is a well-formedness condition for terms on the left side of the equality, and it has additional conclusions that verify all well-formedness conditions for terms on the right side of the equality that are not already present in the assumptions. For example, the associativity rule stated above is not yet a well-formed rewrite rule: there is no justification for , which is a well-formedness condition for the term on the right side of the equality. The full well-formed rewrite rule is:

figure aafigure aa

Given a well-formed rewrite rule, we can produce a well-formed conversion that acts on wfterm objects, in a way similar to how equalities produce regular conversions that act on cterm objects in Isabelle/ML. Like regular conversions, well-formed conversions can be composed in various ways, and full normalization procedures can be written using the language of well-formed conversions. These normalization procedures in turn form the basis of several special proof steps. We give two examples:

  • Given two terms s and t that are non-atomic with respect to operations in , where is a monoid (group/abelian group/ring), normalize s and t using the rules for . If the normalizations are equal, output \(s = t\).

  • Given two propositions and , where R is an ordered ring. Compare the normalizations of and . If they are equal, output a contradiction.

These proof steps, when combined with proof scripts provided by the user, allow algebraic manipulations to be performed rapidly. They replace the handling of associative-commutative functions for HOL discussed in [17].

3.4 Discussion

We conclude this section with a discussion of our overall approach to untyped set theory, and compare it with other approaches. One feature of our approach is that we do not seek to re-institute a concept of types in our framework, but simply replace type constraints with set membership conditions (or predicates, for constraints that cannot be described by a set). The aim is to fully preserve the flexibility of set-membership as compared to types. Empirically, most of the extra assumptions that arise in the statement of theorems can be taken care of by classifying them as properties or well-formedness conditions. Our approach can be contrasted with that taken by Mizar, which defines a concept of soft types [16] within the core of the system.

Every framework for formalizing modern mathematics need a way to deal with structures. In Mizar, structures are defined in the core of the system as partial functions on selectors [9, 15]. In both Isabelle/HOL and IsarMathLib’s treatement of abstract algebra, structures are realized with extensive use of locales. For Coq, one notable approach is the use of Canonical Structures [10] in the formalization of the Odd Order Theorem. We chose a relatively simple scheme of realizing structures as tuples, which is sufficient for the present purposes. Representing them as partial functions on selectors, as in Mizar, is more complicated but may be beneficial in the long run.

Finally, we emphasize that we do not make any modification to Isabelle/FOL in our development. The concept of well-formed terms, for example, is meaningful only to the automation. The whole of auto2’s design, including the ability for users to add new proof steps, follows the LCF architecture. To have confidence in the proofs, one only need to trust the existing Isabelle system, the ten axioms stated in Sect. 2.1, and the definitions involved in the statement of the results.

4 Examples of Proof Scripts

Using the techniques in the above two sections, we formalized enough mathematics in Isabelle/FOL to be able to define the fundamental group. In addition to work directly used for that purpose, we also formalized several interesting results on the side. These include the well-ordering theorem and Zorn’s lemma, the first isomorphism theorem for groups, and the intermediate value theorem. Two more examples will be presented in the remainder of this section, to demonstrate the level of succinctness of proof scripts that can be achieved.

Throughout our work, we referred to various sources including both mathematical texts and other formalizations. We list these sources here:

  • Axioms of set theory and basic operations on sets, construction of natural numbers using least fixed points: from Isabelle/ZF [13, 14].

  • Equivalence and order relations, arbitrary products on sets, well-ordering theorem and Zorn’s lemma: from Bourbaki’s Theory of Sets [2].

  • Group theory and the construction of real numbers using Cauchy sequences: from my previous case studies [17], which in turn is based on corresponding theories in the Isabelle/HOL library.

  • Point-set topology and construction of the fundamental group: from Topology by Munkres [12].

4.1 Schroeder-Bernstein Theorem

For our first example, we present the proof of the Schroeder-Bernstein theorem. See [14] for a presentation of the same proof in Isabelle/ZF. The bijection is constructed by gluing together two functions. Auto2 is able to prove automatically that under certain conditions, the gluing is a bijection (lemma ). For the Schroeder-Bernstein theorem, a proof script (provided by the user) is needed. This is given immediately after the statement of the theorem.

figure abfigure ab
figure acfigure ac
figure adfigure ad

4.2 Rempe-Gillen’s Challenge

For our second example, we present our solution to a challenge problem proposed by Lasse Rempe-Gillen in a mailing list discussionFootnote 2. See [1] for proofs of the same result in several other systems. The statement to be proved is:

Lemma 1

Let f be a continuous real-valued function on the real line, such that \(f(x) > x\) for all x. Let \(x_0\) be a real number, and define the sequence \(x_n\) recursively by \(x_{n+1} := f(x_n)\). Then \(x_n\) diverges to infinity.

Our solution is as follows. We make use of several previously proved results: any bounded increasing sequence in \(\mathbb {R}\) converges (line 2), a continuous function f maps a sequence converging to x to a sequence converging to (line 4), and finally that the limit of a sequence in \(\mathbb {R}\) is unique.

figure aefigure ae

5 Construction of the Fundamental Group

In this section, we describe our construction of the fundamental group. We will focus on stating the definitions and main results without proof, to demonstrate the expressiveness of untyped set theory under our framework. The entire formalization including proofs is 864 lines long.

Let I be the interval , equipped with the subspace topology from the topology on \(\mathbb {R}\). Given two continuous maps f and g from S to T, a homotopy between f and g is a continuous map from the product topology on to T that restricts to f and g at and , respectively:

figure affigure af

A path is a continuous function from the interval. A homotopy between two paths is a path homotopy if it remains constant on and :

figure agfigure ag
figure ahfigure ah

Two paths are path-homotopic if there exists a path homotopy between them. This is an equivalence relation on paths.

figure aifigure ai

The path product is defined by gluing two morphisms. It is continuous by the pasting lemma:

figure ajfigure aj
figure akfigure ak

The loop space is a set of loops on X. Path homotopy gives an equivalence relation on the loop space, and we define to be the quotient set:

figure alfigure al
figure amfigure am
figure anfigure an

Finally, the fundamental group is defined as:

figure aofigure ao

To show that the fundamental group is actually a group, we need to show that the path product respects the equivalence relation given by path homotopy, and is associative up to equivalence (along with properties about inverse and identity). The end result is:

figure apfigure ap

An important property of the fundamental group is that a continuous function between topological spaces induces a homomorphism between their fundamental groups. This is defined as follows:

figure aqfigure aq

The induced map is a homomorphism satisfying functorial properties:

figure arfigure ar
figure asfigure as
figure atfigure at

6 Related Work

In Isabelle, the main library for formalized mathematics using FOL is Isabelle/ZF. The basics of Isabelle/ZF is described in [13, 14]. We also point to [13] for a review of older work on set theory from automated deduction and artificial intelligence communities. Outside the official library, IsarMathLib [5] is a more recent project based on Isabelle/ZF. It formalized more results in abstract algebra and point-set topology, and also constructed the real numbers. The initial parts of our development closedly parallels that in Isabelle/ZF, but we go further in several directions including constructing the number system. The primary difference between our work and IsarMathLib is that we use auto2 for proofs, and develop our own system for handling structures, so that we do not make use of Isabelle tactics, Isar, or locales.

Outside Isabelle, the major formalization projects using set theory include Metamath [11] and Mizar [4], both of which have extensive mathematical libraries. There are some recent efforts to reproduce the Mizar environment in HOL-type systems [6, 8]. While there are some similarities between our framework and Mizar’s, we do not aim for an exact reproduction. In particular, we maintain the typical style of stating definitions and theorems in Isabelle. More comparisons between our approach and Mizar are discussed in Sect. 3.4.

Mizar formalized not just the definition of the fundamental group [7], but several of its properties, including the computation of the fundamental group of the circle. There is also a formalization of path homotopy in HOL Light which is then ported to Isabelle/HOL. This is used for the proof of the Brouwer fixed-point theorem and the Cauchy integral theorem, although the fundamental group itself does not appear to be constructed.

In homotopy type theory, one can work with fundamental groups (and higher-homotopy groups) using synthetic definitions. This has led to formalizations of results about homotopy groups that are well beyond what can be achieved today using standard definitions (see [3] for a more recent example). We emphasize that our definition of the fundamental group, as with Mizar’s, follows the standard one in set theory.

7 Conclusion

We applied the auto2 prover to the formalization of mathematics using untyped set theory. Starting from the axioms of set theory, we formalized the definition of the fundamental group, as well as many other results in set theory, group theory, point-set topology, and real analysis. The entire development contains over 13,000 lines of theory files and 3,500 lines of ML code, taking the author about 5 months to complete. On a laptop with two 2.0 GHz cores, it can be compiled in about 24 min. Through this work, we demonstrated the ability of auto2 to scale to relatively large projects. We also hope this result can bring renewed interest to formalizing mathematics in untyped set theory in Isabelle.