1 Introduction

Graph theory supplies many well-quasi-ordering theorems for proof theory to study. The best known of these is Kruskal’s theorem, which as discovered independently by Schmidt [13] and Friedman (published by Simpson [14]) possesses an unusually high proof-theoretic strength that lies above that of \(\text {ATR}_0\). This result was then extended by Friedman to extended Kruskal’s theorem, a form of Kruskal’s theorem that uses labelled trees for which the embedding has to obey a certain gap-condition, which was shown to have proof-theoretic strength just above even the theory of \(\Pi _1^1\text {--CA}_0\), the strongest of the five main theories considered in the research program known as reverse mathematics.

Reverse mathematics (RM) strives to classify the strength of particular theorems, or bodies of theorems, of “ordinary” mathematics by means of isolating the essential set existence principles used to prove them, mainly in the framework of subsystems of second order arithmetic. The program is often summarized by saying that there are just five systems, known as the “Big Five”, that are sufficient for this classification. The picture of RM that we currently see, though, is more complicated:

  1. 1.

    Those parts of mathematics that have been analyzed in RM, are mostly results from the 19th century and the early 20th century with rather short proofs (varying from half a page to a few pages in length). By contrast, e.g., the large edifice of mathematics that Wiles’ proof of Fermat’s Last Theorem utilizes has not been analyzed in detail.

  2. 2.

    By now there are quite a number of theorems that do not fit the mold of the Big Five. For instance, Ramsey’s theorem for pairs, Kruskal’s theorem and the graph minor theorem do not equate to any of them. For several others, such as Hindman’s theorem, this is still an open question.

  3. 3.

    There are areas of mathematics where complicated double, triple and more times nested transfinite inductions play a central role. Such proof strategies are particularly frequent in set theory (e.g. in fine structure theory and combinatorial theorems pertaining to L) and in higher proof theory (e.g. in the second predicative cut elimination theorem and the impredicative cut elimination and collapsing theorems). As RM is usually presented, one might be tempted to conclude that such transfinite proof modes are absent from or even alien to “ordinary” mathematics. However, they are used in the proof of the graph minor theorem. Are they really necessary for its proof?

In this paper we will be concerned with the proof of the graph minor theorem, which is a fairly recent result. It has a very complicated and long proof that features intricate transfinite inductions. In particular, we will be analyzing these inductions and classify them according to principles that are familiar from proof theory and the foundations of mathematics. As to the importance attributed to the graph minor theorem, let’s quote from a book on Graph Theory [2], p. 249.

Our goal \([\ldots ]\) is a single theorem, one which dwarfs any other result in graph theory and may doubtless be counted among the deepest theorems that mathematics has to offer: in every infinite set of graphs there are two such that one is a minor of the other. This graph minor theorem, inconspicuous though it may look at first glance, has made a fundamental impact both outside graph theory and within. Its proof, due to Neil Robertson and Paul Seymour, takes well over 500 pages.

The starting point of this grand proof is the bounded graph minor theorem, i.e. the graph minor theorem restricted to those graphs of bounded “tree-width”. The bounded graph minor theorem was connected to Friedman’s extended Kruskal’s theorem by Friedman et al. [4], and the two were even shown to be equivalent. This provided a natural example of a theorem of combinatorial mathematics that has extremely high proof-theoretic strength, and at the same time gave a lower bound on the graph minor theorem. While the precise proof-theoretic strength of the bounded graph minor theorem was established by Friedman et al. [4], the same was not the case for the full graph minor theorem, for which not even an upper bound was found, which no doubt was due to the fact that the proof spreads over 500 pages of complicated combinatorial arguments. In the following, we will thus outline how the graph minor theorem and other important theorems of the Graph Minors series, like the immersion theorem, can be proved in \(\Pi _1^1\text {--CA}_0\) with the additional principles of \(\Pi ^1_3\)-induction and \(\Pi ^1_2\)-bar induction.

2 Well-Quasi-ordering Theorems of the Graph Minors Series

The relations of minor and immersion can be understood as finding a certain expansion of one graph \(G_1\) in another graph \(G_2\). All graphs in this paper are finite and without loops unless noted otherwise, and we denote the vertex set of a graph G by V(G) and its edge set by E(G). For the minor relation, define a minor-expansion of \(G_1\) to be a function \(f:G_1\longrightarrow G_2\) so that \(v\in V(G_1)\) gets mapped to a connected subgraph \(f(v)\subseteq G_2\) so that \(f(v)\cap f(u)=\emptyset \) if \(u\ne v\), and each edge \(e\in E(G_1)\) gets mapped injectively to an edge \(f(e)\in E(G_2)\) so that if the endpoints of e are u and v, then f(e) connects vertices \(u'\in f(u)\) and \(v'\in f(v)\). If an expansion of \(G_1\) is a subgraph of \(G_2\), \(G_1\) is said to be a minor of \(G_2\), denoted \(G_1\le G_2\). An immersion relation between graphs \(G_1\) and \(G_2\) is similarly witnessed by an immersion-expansion \(f:G_1\longrightarrow G_2\) so that vertices of \(G_1\) are mapped injectively to vertices of \(G_2\), and so that an edge e with endpoints u and v is mapped to a path f(e) in \(G_2\) between f(u) and f(v) so that for distinct edges \(e_1,e_2\in E(G_1)\) the paths \(f(e_1)\) and \(f(e_2)\) are edge-disjoint (but may intersect at vertices), i.e. \(E(f(e_1))\cap E(f(e_2)) = \emptyset \). The graph minor and immersion theorem are then the following theorems.

Theorem 1

(Graph minor theorem, Robertson and Seymour [11]) For every sequence \(\left\langle G_i:i\in \mathbb {N}\right\rangle \) of graphs there are \(i < j\) so that \(G_i\) is a minor of \(G_j\).

Theorem 2

(Immersion theorem, Robertson and Seymour [12]) For every sequence \(\left\langle G_i:i\in \mathbb {N}\right\rangle \) of graphs there are \(i < j\) so that there is an immersion of \(G_i\) into \(G_j\).

The proof of the graph minor theorem can be divided into two major steps. First, the excluded minor theorem is proved, which takes up most of the Graph Minors series. The excluded minor theorem says that if one graph G does not contain another graph H as a minor, then G has to have a certain structure, namely that it can be decomposed into parts which are connected in a tree-like shape and can almost be embedded into a surface into which H can not be embedded. This is then used as follows: In a proof of the graph minor theorem, for any sequence of graphs \(\left\langle G_1, G_2,\ldots \right\rangle \) one may assume that \(G_1\) is not a minor of any \(G_j\), \(j > 1\), as otherwise the graph minor theorem holds. Thus, it suffices to prove the graph minor theorem for any sequence of graphs possessing the structure obtained by applying the excluded minor theorem for \(G_1\), for any such \(G_1\). This means that it is enough to prove the graph minor theorem for graphs which consist of parts connected in a tree-like shape that are almost embeddable into some fixed surface, which is the second major step of the proof of the graph minor theorem.

The proof of the excluded minor theorem is not very complex from a metamathematical point of view. This is due to the fact that surfaces are uniquely determined by their fundamental polygons, and that graph embeddings on any surface can thus be represented by a natural number encoding a graph drawing with rational coordinates in this fundamental polygon. With this approach, the entire proof of the excluded minor theorem does not feature any infinite objects nor any infinite proof techniques, and it is straightforward to carry it out in \(\text {ACA}_0\), which will be our base theory in the following. The only papers of the Graph Minors series that use more advanced proof techniques are Graph Minors IV [7], VIII [8], XVIII [9], XIX [10], XX [11] and XXIII [12].

Graph Minors IV [7] proves in a sense an early version of the graph minor theorem for graphs with a certain structure as described above, namely the graph minor theorem for graphs that have bounded tree-width, a property which is defined in terms of tree-decompositions. A tree-decomposition of a graph G is essentially a decomposition of G into parts that are connected in a tree-like shape, i.e. a tree-decomposition of G consists of a tree T and for every \(t\in V(T)\) a subgraph \(G_t\) of G so that

  • \(\bigcup _{t\in V(T)}G_t = G\), and

  • if an edge e of T has endpoints \(t_1\) and \(t_2\), and \(T_1\) and \(T_2\) are the two components of T obtained by removing e from T, then every path in G from some \(v\in \bigcup _{t\in V(T_1)}G_t\) to some \(u\in \bigcup _{t\in V(T_2)}G_t\) has to contain a vertex of \(G_{t_1}\cap G_{t_2}\).

The width of such a tree-decomposition is then defined to be \(\max _{t\in V(T)}\left| V(G_t)\right| - 1\). The tree-width tw(G) of G is the minimum width of all its tree-decompositions, and the bounded graph minor theorem can be stated as follows.

Theorem 3

(Bounded graph minor theorem, [7]) Let n be a natural number, then in any sequence \(\left\langle G_i:i\in \mathbb {N}\right\rangle \) of graphs so that \(tw(G_i)\le n\) for every \(i\in \mathbb {N}\), there are \(G_i\) and \(G_j\) with \(i < j\) so that \(G_i\) is a minor of \(G_j\).

The bounded graph minor theorem has been analyzed from a metamathematical perspective by Friedman, Robertson and Seymour [4], who determined that its proof-theoretic strength lies just above that of \(\Pi _1^1\text {--CA}_0\). They observed that the bounded graph minor theorem can be proved for each individual tree-width in \(\Pi _1^1\text {--CA}_0\), and since the bounded graph minor theorem is a \(\Pi ^1_1\)-statement, that an application of \(\Pi ^1_1\)-reflection for \(\Pi _1^1\text {--CA}_0\) thus suffices to prove the bounded graph minor theorem. This approach circumvents a \(\Pi ^1_3\)-induction, which is roughly used to show that some minimal bad sequence always exists under certain circumstances, and Friedman et al. [4] in turn showed that no theory of lower proof-theoretic strength than \(\Pi _1^1\text {--CA}_0\) augmented with \(\Pi ^1_1\)-reflection for \(\Pi _1^1\text {--CA}_0\) can prove the bounded graph minor theorem. There is however no such proof for some theorems of Graph Minors IV [7] which are more important for the rest of the Graph Minors series, and for these theorems only the upper bound of \(\Pi _1^1\text {--CA}_0+ \Pi ^1_3\text {-IND}\) is known. Friedman et al. [4] further showed that the bounded graph minor theorem is equivalent to the planar graph minor theorem, i.e. the graph minor theorem for those graphs which can be drawn (or equivalently, embedded) in the plane.

Graph Minors VIII [8] proves a generalization of the planar graph minor theorem. Define for every surface \(\Sigma \) the \(\Sigma \)-graph minor theorem:

Theorem 4

(\(\Sigma \)-graph minor theorem) For every sequence \(\left\langle G_i:i\in \mathbb {N}\ldots \right\rangle \) of graphs that can be drawn in \(\Sigma \) without crossings there are \(i < j\) so that \(G_i\le G_j\).

If \(S^2\) denotes the sphere, then the planar graph minor theorem is just the \(S^2\)-graph minor theorem, since embeddability in the sphere and drawability in the plane are equivalent. Denote by \(\forall \Sigma \)-GMT the statement that the \(\Sigma \)-graph minor theorem holds for every surface \(\Sigma \). It is shown in Graph Minors VIII that the \(\Sigma \)-graph minor theorem and \(\forall \Sigma \)-GMT are indeed true, and it can further be shown that both of these theorems are equivalent to the planar and hence also the bounded graph minor theorem. This is done by extending the proof that each instance of the bounded graph minor theorem is provable in \(\Pi _1^1\text {--CA}_0\) all the way into Graph Minors VII [8], so that it can be shown that for each surface \(\Sigma \), the \(\Sigma \)-graph minor theorem is provable in \(\Pi _1^1\text {--CA}_0\). An application of \(\Pi ^1_1\)-reflection for \(\Pi _1^1\text {--CA}_0\) then establishes the equivalence of \(\forall \Sigma \)-GMT and the planar graph minor theorem, and hence also that of \(\forall \Sigma \)-GMT and the bounded graph minor theorem. The results of [4] can thus be extended as follows, see [5].

Theorem 5

The following are equivalent over \(\text {ACA}_0\):

  • The well-orderedness of the ordinal \(\psi _0(\Omega _\omega )\),

  • Friedman’s extended Kruskal’s theorem,

  • the bounded graph minor theorem,

  • the planar graph minor theorem,

  • the \(\Sigma \)-graph minor theorem, for any surface \(\Sigma \), and

  • \(\forall \Sigma \)-GMT.

The next use of strong infinitary proof-techniques is in Graph Minors XVIII [9] which provides another restricted form of the graph minor theorem that facilitates the proof of the version of the graph minor theorem necessary for the second major step of the proof of the graph minor theorem outlined above. The theorem of Graph Minors XVIII [9] in a sense allows one to focus on the individual pieces of the graph decomposition obtained by the excluded minor theorem, thereby avoiding the need to work with tree-decompositions. The theorem that these individual pieces of the above graph decomposition are well-quasi-ordered by the minor relation is then proved in Graph Minors XIX [10]. The proof of this version of the graph minor theorem requires a further very strong proof principle, namely that of \(\Pi ^1_2\)-bar induction. In Graph Minors XX [11] these results are then combined to prove the full graph minor theorem. Finally, Graph Minors XXIII [12] proves the immersion theorem and a generalization of the graph minor theorem to hypergraphs in a certain sense.

This generalization to hypergraphs can be stated as follows. For a vertex set V denote by \(K_V\) the complete graph on V, i.e. the graph with vertex set V in which every two distinct vertices are connected by an edge. Then a collapse f of \(G_2\) to \(G_1\) is a function mapping vertices of \(G_1\) to disjoint connected subgraphs of \(K_{V(G_2)}\) and edges of \(G_1\) injectively to edges of \(G_2\) so that f(e) is incident with a vertex of f(v) whenever e is incident with v for all \(e\in E(G_1)\) and \(v\in V(G_1)\), and further that for every vertex v and every edge \(e_v\) of f(v) with endpoints \(v_1\) and \(v_2\), there must be an edge of \(G_2\) that has among its endpoints the vertices \(v_1\) and \(v_2\). Further, if Q is a well-quasi-order and the edges of \(G_1\) and \(G_2\) are labelled via functions \(\phi _1:E(G_1)\longrightarrow Q\), \(\phi _2:E(G_2)\longrightarrow Q\), then f is also required to respect the edge labels of \(G_1\) and \(G_2\), in the sense that \(\phi _1(e)\le _Q\phi _2(f(e))\) has to hold for every edge \(e\in E(G_1)\). Then Graph Minors XXIII [12] shows that the following generalization of the graph minor theorem holds.

Theorem 6

Let Q be a well-quasi-order. Then in every infinite sequence \(\left\langle G_i: i\in \mathbb {N}\right\rangle \) of Q-edge-labelled hypergraphs there are \(j > i\) so that there is a collapse of \(G_j\) to \(G_i\) which respects the labels of \(G_i\) and \(G_j\).

Further, Graph Minors XXIII [12] also proves that similar labelled versions of the graph minor and immersion theorem hold. If Q is a well-quasi-order and \(\phi _1:E(G)\longrightarrow Q\), \(\phi _2:E(G)\longrightarrow Q\) are labelling functions for the edges of \(G_1\) and \(G_2\), then a minor relation \(G_1\le G_2\) via an expansion f is said to respect these labels if \(\phi _1(e)\le _Q \phi _2(f(e))\) for every edge \(e\in G_1\). Similarly, for vertex-labelling functions \(\phi _1:V(G)\longrightarrow Q\), \(\phi _2:V(G)\longrightarrow Q\) the minor relation is said to respect the labels if for every \(v\in V(G_1)\) there is a \(v'\in f(v)\) so that \(\phi _1(v)\le _Q \phi _2(v')\). If \(\phi _1\) and \(\phi _2\) are vertex-labelling functions from a well-quasi-order Q of \(G_1\) and \(G_2\) respectively, say that an immersion f respects this labelling if \(\phi _1(v)\le _Q\phi _2(f(v))\) for every \(v\in V(G)\). Then the labelled graph minor and immersion theorem are true as well.

Theorem 7

(Labelled graph minor theorem) Let Q be a well-quasi-order and let \(\left\langle G_i:i\in \mathbb {N}\right\rangle \) be a sequence of Q-vertex- and edge-labelled graphs. Then there are \(i < j\) and a minor expansion \(f:G_i\longrightarrow G_j\) that respects the labels of \(G_i\) and \(G_j\).

Theorem 8

(Labelled immersion theorem) Let Q be a well-quasi-order and let \(\left\langle G_i:i\in \mathbb {N}\right\rangle \) be a sequence of Q-vertex-labelled graphs. Then there are \(i < j\) and an immersion expansion \(f:G_i\longrightarrow G_j\) that respects the labels of \(G_i\) and \(G_j\).

In order to prove these theorems, Graph Minors XXIII [12] requires another \(\Pi ^1_2\)-bar induction similar to that used in Graph Minors XIX [10]. The bar induction of Graph Minors XIX [10] is used when assuming that a certain class of graph embeddings is minimal with respect to certain properties, in order to prove that the above mentioned sequence of graphs embedded in a surface is good. As said above, the graphs themselves might not actually be completely embeddable in the surface, and so the non-embeddable parts are coded as labels from a well-quasi-order, to provide a (now labelled) graph that is completely embeddable into the surface. When assuming that the set of possible labels is a minimal well-quasi-order so that the set of corresponding graphs is a counterexample, one essentially performs a \(\Pi ^1_2\)-bar induction on a well-quasi-order.

3 Bar Induction in the Graph Minors Series

More precisely, in Graph Minors XIX [10] two \(\Pi ^1_2\)-bar inductions and three ordinary \(\Pi ^1_2\)-inductions need to be performed. These inductions take the form of the assumption that there is no minimal bad counterexample to a version of the graph minor theorem. This version of the graph minor theorem is for graphs that are embedded in a fixed surface and have labels from well-quasi-orders on the edges. Further, the minor relation between these graphs is altered in such a way that edges incident with a cuff stay fixed on the surface under minor-expansions, and so that it respects the labels of the well-quasi-order. The minimal counterexample to the graph minor theorem for such graphs is then required to have as few handles, crosscaps, cuffs and edges around cuffs as possible, which correspond to the ordinary \(\Pi ^1_2\)-inductions mentioned above, since the well-quasi-orders for the edges are not required to be the same for “smaller” possible counterexamples.

The \(\Pi ^1_2\)-bar inductions then occur when requiring that the well-quasi-orders of the counterexample are also minimal with respect to the initial ideal ordering and so-called refinement relation. We present the bar induction corresponding to the initial ideal relation in greater detail to illustrate that it can deal with the induction principle actually performed in Graph Minors XIX [10]; the relation corresponding to refinement can be handled analogously. As already noted, the counterexample to our version of the graph minor theorem is required to have labels from a well-quasi-order that is minimal with regard to the initial ideal relation. A well-quasi-order X is an initial ideal of another well-quasi-order \(X'\), denoted \(X\preceq X'\), if \(X\subseteq X'\) and if X is closed downward with regard to \(X'\), that is if

$$\begin{aligned} \forall x\in X\forall x'\in X'(x'\le _{X'} x\rightarrow x'\in X). \end{aligned}$$

Assuming that the counterexample has minimal well-quasi-orders with regard to this relation then corresponds to the induction scheme

$$\begin{aligned} \forall X(WQO(X) \rightarrow (\forall X'\prec X (\forall X''\prec X' \varphi (X'') \rightarrow \varphi (X'))\rightarrow \varphi (X))). \end{aligned}$$

This is different from the standard bar induction scheme, which postulates that

$$\begin{aligned} \forall X(WF(X) \rightarrow \forall j(\forall i<_X j \varphi (i)\rightarrow \varphi (j))\rightarrow \forall n\in X \varphi (n)). \end{aligned}$$

Further, it is not clear whether the induction scheme used in Graph Minors XIX [10] is actually implied by the usual bar-induction scheme, and it does not seem to be the case that this initial ideal induction scheme has been considered before in the literature of reverse mathematics. Note also that due to the different kinds of quantifiers present in second order arithmetic, it may for instance occur that the initial ideal induction scheme quantifies over uncountably many predecessor objects while the ordinary bar induction scheme is constrained to only countably many predecessor objects. Inspecting the proofs of Graph Minors XIX [10] further, it can however be discerned that a more restricted notion of initial ideal is sufficient to carry out the proofs. In the proofs of Graph Minors XIX [10], the minimality of the counterexample with regard to this initial ideal relation is only used when a whole segment above a certain element is “cut out” of the well-quasi-ordering, that is only the relation \(\preceq _1\) defined by

$$\begin{aligned} X'\prec _1 X :\Leftrightarrow \exists \left\langle x_1,\ldots ,x_n\right\rangle \in X^{<\omega }\forall x'(x'\in X' \leftrightarrow x'\in X \wedge \forall i < n( x'\not \ge x_i)) \end{aligned}$$

is actually used in Graph Minors XIX [10]. Defining a relation \(\le _1\) (in other contexts known as the Smyth quasi-order) on the finite subsets \([X]^{<\omega }\) of a well-quasi-ordered set X by

$$\begin{aligned} \{y_1,\ldots ,y_n\} \le _1 \{z_1,\ldots ,z_m\} :\Leftrightarrow \forall j\in \{1,\ldots ,m\}\exists i\in \{1,\ldots ,n\} y_i\le z_j, \end{aligned}$$

and setting \(X^{z_1,\ldots ,z_n} := \{x\in X:\forall i < n(x\not \ge z_i)\}\) it can be shown that bar induction for \(\le _1\) implies initial ideal induction for \(\preceq _1\):

Lemma 9

Assume that for every well-quasi-ordered set \(X^*\) and every \(\Pi ^1_2\)-formula \(\varphi '(n)\) the ordinary bar induction scheme holds with regard to \([X^*]^{<\omega }\) and \(\le _1\), i.e. that

$$\begin{aligned} \forall j(\forall i<_1 j \varphi '(i)\rightarrow \varphi '(j))\rightarrow \forall n\in [X^*]^{<\omega } \varphi '(n). \end{aligned}$$

Then also the initial ideal induction scheme holds for every well-quasi-ordered set X and every \(\Pi ^1_2\)-formula \(\varphi (Y)\) with regard to \(\preceq _1\), i.e.

$$\begin{aligned} \forall X'\prec _1 X (\forall X''\prec _1 X' \varphi (X'') \rightarrow \varphi (X'))\rightarrow \varphi (X). \end{aligned}$$

Proof

Note that if X is well-quasi-ordered then \(\le _1\) is well-founded on \([X]^{<\omega }\) since a bad \(\le _1\)-sequence in X would in particular induce a bad \(\preceq \)-sequence in X (see e.g. [3]), which is in contradiction to the well-quasi-orderedness of X.

Now let X be well-quasi-ordered and let \(\top \) be a new element so that \(\top > x\) for all \(x\in X\). Define \(\hat{X} := X\cup \{\top \}\). The idea for showing that the initial ideal induction scheme holds given the ordinary induction scheme is to encode the predecessors of X with regard to \(\preceq _1\) by finite subsets of \(\hat{X}\), and to perform an ordinary bar induction on \([\hat{X}]^{<\omega }\) instead.

So assume that the usual bar induction scheme for \(\Pi ^1_2\)-formulas with regard to \([\hat{X}]^{<\omega }\) and \(\le _1\) holds. Let \(\varphi (X)\) be any \(\Pi ^1_2\)-formula, then we need to show that \(\prec _1\)-initial ideal induction over X holds for \(\varphi \). Hence assume \(\varphi \) is progressive with respect to \(\prec _1\), i.e. that

$$\begin{aligned} \forall X'\prec _1 X(\forall X''\prec _1 X'\varphi (X'')\rightarrow \varphi (X')). \end{aligned}$$

Then we need to show that \(\varphi (X)\) holds. To do this, we define a formula \(\varphi '(i)\) so that \(\varphi '(\{y_1,\ldots ,y_n\})\) essentially emulates \(\varphi (\{x\in \hat{X}:\forall j < n: x\not \ge y_j\})\), as follows:

$$\begin{aligned} \varphi '(i) := \forall Y(i = \{y_1,\ldots ,y_n\} \rightarrow (\forall x(x\in Y\leftrightarrow x\in \hat{X}\wedge \forall j<n: x\not \ge _1 y_j) \rightarrow \varphi (Y))). \end{aligned}$$

By \(\Sigma ^0_0\)-comprehension a set Y satisfying the conditions in the antecedent always exists, and so \(\varphi '\) is in fact the intended statement. Note that \(\varphi '(i)\) is further still a \(\Pi ^1_2\)-formula, and that we can thus utilize our idea to employ \(\Pi ^1_2\)-bar induction for \(\varphi '\) in order to show that \(\varphi '(\{\top \})\) and hence \(\varphi (X)\) holds. To this end we need to prove the progressiveness of \(\varphi '\). So assume (letting i, j be codes for finite subsets of \(\hat{X}\)) that \(\forall i <_1 j\varphi '(i)\), then we need to show \(\varphi '(j)\).

For this, we first show that \(\forall i <_1 j\varphi '(i)\) implies \(\forall X''\prec _1 X^j \varphi (X'')\). But if \(j = \{x_1,\ldots ,x_m\}\), say, then \(X''\prec _1 X^j\) means that \(X'' = X^{x_1,\ldots ,x_m,z_1,\ldots ,z_k}\) for some \(z_1,\ldots ,z_k\), and trivially \(\{x_1,\ldots ,x_m,z_1,\ldots ,z_k\} <_1 \{x_1,\ldots ,x_m\}\), where the inequality must be strict since \(X''\prec _1 X^j\). Let \(i = \{x_1,\ldots ,x_m,z_1,\ldots ,z_k\}\). Then \(\varphi '(i)\) holds since we assumed \(\forall i <_1 j\varphi '(i)\), and since \(X^i = X''\) we can infer that \(\varphi (X'')\) holds as well.

So we have shown that \(\forall X''\prec _1 X^j \varphi (X'')\). Since \(\varphi \) was assumed to be progressive with regard to \(\prec _1\), this gives \(\varphi (X^j)\) and therefore \(\varphi '(j)\). This is what we needed to show for \(\varphi '\) to be progressive. Since \(\varphi '\) is progressive we can apply \(\Pi ^1_2\)-bar induction on \(\varphi '\) to obtain \(\forall x\in [\hat{X}]^{<\omega } \varphi '(x)\). This gives us in particular \(\varphi '(\{\top \})\), which in turn implies \(\varphi (X)\) and thus completes the proof.    \(\square \)

In the above, finite sets of elements of X are used to code the appropriate subsets of X. For the bar induction corresponding to the refinement relation, a finite sequence of such finite sets is needed instead. The critical condition of the refinement relation says in a sense that the well-quasi-orders from which some of the edges are allowed to be labelled can be arranged in such a way that some of those well-quasi-orders are initial ideals of others, and at most identical. More precisely, a sequence \(\left\langle X_1,\ldots , X_n\right\rangle \) is a refinement of a sequence \(\left\langle X_1',\ldots ,X_m'\right\rangle \) if \(n\ge m\) and there is a function \(f:\{1,\ldots ,n\}\longrightarrow \{1,\ldots ,m\}\) with the property that \(X_i\preceq X_{f(i)}\) for all \(i\le n\), so that additionally \(X_i,X_j\prec X_{f(i)}\) whenever \(f(i) = f(j)\) for \(i\ne j\), and so that \(X_i\prec X_{f(i)}\) for some i. As in the previous induction, the \(\prec \)-relations are not actually required in their full form and can be replaced by \(\prec _1\) relations, which enables us to perform a bar-induction in order to simulate the induction corresponding to the refinement relation. We write \(\left\langle X_1,\ldots , X_n\right\rangle \prec _2\left\langle X_1',\ldots ,X_m'\right\rangle \) if \(\left\langle X_1,\ldots , X_n\right\rangle \) is a refinement of \(\left\langle X_1',\ldots ,X_m'\right\rangle \). To perform the bar-induction, we need a relation corresponding to \(\prec _2\). As above, denote the set of finite subsets of a set Y by \([Y]^{<\omega }\), and use \(\rho \) and \(\sigma \) as variables for such finite subsets. Define then on \(([X]^{<\omega })^{<\omega }\) a relation \(<_2\) by

$$\begin{aligned} \left\langle \rho _1,\ldots ,\rho _n\right\rangle<_2 \left\langle \sigma _1,\ldots ,\sigma _m\right\rangle :\Leftrightarrow \,&\exists f:\{1,\ldots ,n\}\longrightarrow \{1,\ldots ,m\} \\&\quad (\forall i\le n(\rho _i \le _1 \sigma _{f(i)})\wedge \exists i\le n(\rho _i<_1 \sigma _{f(i)})\wedge \\&\quad \forall i,j(i\ne j\wedge f(i)=f(j)\rightarrow \rho _i <_1 \sigma _{f(i)}) ). \end{aligned}$$

In order to be able to carry out a bar-induction along this relation, we need to show that it is well-founded. This is done in the next lemma.

Lemma 10

Let X be a well-quasi-ordered set. Then \(([X]^{<\omega })^{<\omega }\) is well-founded with regard to \(\le _2\).

Proof

Because X is well-quasi-ordered, \([X]^{<\omega }\) is well-founded with regard to \(<_1\) by the remarks in the proof of the above lemma. Our aim is to employ König’s lemma in order to show that there can be no infinite descending \(\le _2\)-sequence in \(([X]^{<\omega })^{<\omega }\). Thus if \(\left\langle \rho _1,\ldots ,\rho _n\right\rangle <_2 \left\langle \sigma _1,\ldots ,\sigma _m\right\rangle \) via f, we say that \(\sigma _j\) branches into \(\rho _{i_1},\ldots ,\rho _{i_{m_j}}\) if \(f^{-1}(j) = \{i_1,\ldots ,i_{m_j}\}\) and \(\rho _{i_1} <_1 \sigma _j\) (which is immediate if \(f^{-1}(j)\) consists of more than one element).

Now assume that there is a sequence \(s:=\left\langle \left\langle \rho _1^i,\ldots ,\rho ^i_{n_i}\right\rangle :i\in \mathbb {N}\right\rangle \) so that \(s(i) >_2 s(i + 1)\) for all i, and let \(\left\langle f_i:\{1,\ldots ,n_i\}\longrightarrow \{1,\ldots ,n_{i - 1}\}\right\rangle _{i \ge 2}\) be the corresponding sequence of functions witnessing the \(<_2\) relations. In order to avoid confusing duplicate elements that may appear multiple times in that sequence, we interpret each \(\rho ^i_k\) as a term, and identify two such terms transitively if \(\rho ^{i + 1}_k = \rho ^i_l\) and \(f_{i + 1}(k) = l\).

We now turn toward defining the tree we want to use König’s lemma on. Let \(S=\{\rho ^i_k:i\in \mathbb {N}\wedge k\le n_i\}\), and for \(\rho ,\sigma \in S\) define \(\sigma \) to be a successor of \(\rho \) if at some step in s an element underlying \(\rho \) branches into an element underlying \(\sigma \). Note that due to the definition of \(<_2\) every \(\rho \) can branch only once, and that it can only branch into finitely many successors. This successor relation thus defines a forest on S, which is infinite since s is an infinite descending sequence and in which every tree is finitely branching. Since this forest consists of \(n_1\) and hence finitely many trees, one of these trees must be infinite as well. We can thus apply König’s Lemma to this tree to obtain an infinite, strictly decreasing \(<_1\)-sequence in \([X]^{<\omega }\), which is a contradiction since \([X]^{<\omega }\) is well-founded by \(<_1\).    \(\square \)

Similarly to \(\prec _1\)-initial ideal induction, we can now prove a lemma that shows that ordinary bar induction for \(\le _2\) implies the induction scheme corresponding to refinement. This is made precise in the following lemma.

Lemma 11

Assume that for every well-quasi-ordered set \(X^*\) and every \(\Pi ^1_2\)-formula \(\varphi '(n)\) the bar induction scheme holds with regard to \(([X^*]^{<\omega })^{<\omega }\) and \(\le _2\), i.e. that

$$\begin{aligned} \forall j(\forall i<_2 j \varphi '(i)\rightarrow \varphi '(j))\rightarrow \forall n\in ([X^*]^{<\omega })^{<\omega } \varphi '(n). \end{aligned}$$

Then for every finite sequence of well-quasi-ordered sets \(X := \left\langle X_1,\ldots , X_n\right\rangle \) and every \(\Pi ^1_2\)-formula \(\varphi (Y)\) the induction scheme corresponding to refinement

$$\begin{aligned} (\forall X'\prec _2 X (\forall X''\prec _2 X' \varphi (X'') \rightarrow \varphi (X'))\rightarrow \varphi (X)) \end{aligned}$$

holds as well.

Proof

The proof is essentially the same as the one for Lemma 9.    \(\square \)

This shows that the critical parts of Graph Minors XIX [10] can be dealt with by a \(\Pi ^1_2\)-bar induction. A similar induction is performed in the proof of the immersion theorem in Graph Minors XXIII [12] that can be dealt with by the same techniques. To give an overview, based on unpublished research we have the following placements of proof-theoretic strength:

  1. (a)

    \(|\Pi ^1_1{-}{\mathbf {CA}}_0| =\psi _0(\Omega _{\omega })\).

  2. (b)

    \(|\Pi ^1_1{-}{\mathbf {CA}}_0+\Pi ^1_2\text{-IND }| =\psi _0(\Omega _{\omega }{\cdot }\omega ^{\omega })\).

  3. (c)

    \(|\Pi ^1_1{-}{\mathbf {CA}}| =\psi _0(\Omega _{\omega }{\cdot }\varepsilon _0)\).

  4. (d)

    \(|\Pi ^1_1{-}{\mathbf {CA}}_0+\Pi ^1_2\text{-BI }| = \psi _0(\Omega _{\omega }^{\omega })\).

  5. (e)

    \(|\Pi ^1_1{-}{\mathbf {CA}}_0+\Pi ^1_2\text{-BI }+\Pi ^1_3\text{-IND }| = \psi _0(\Omega _{\omega }^{\omega ^{\omega }})\).

  6. (f)

    \(\psi _0(\Omega _{\omega })\;<\;\text{ ordinal } \text{ of } \text{ graph } \text{ minor } \text{ and } \text{ immersion } \text{ theorems }\;\le \; \psi _0(\Omega _{\omega }^{\omega ^{\omega }})\).

4 Possible Lower Bound Improvements

To narrow down the corridor in which the proof-theoretic strength of the theorems considered above lies, one might try to increase their lower bounds. The immersion theorem with well-quasi-ordered labels seems to be particularly suited for such a task, since it almost imposes an approach similar to that of Friedman’s extended Kruskal’s theorem EKT [14]. There, a function is used to relate labelled trees ordered by embedding with gap-condition to ordinals from the ordinal notation system \(OT(\Omega _\omega )\). This ordinal notation system is used for the ordinal analysis of \(\Pi _1^1\text {--CA}_0\), which shows that \(\left| \Pi _1^1\text {--CA}_0\right| = \Psi _0(\Omega _\omega )\), and derived from the set \(C_0(\Omega _\omega )\) from Buchholz [1]. In Simpson [14] it is then shown that the above approach yields:

Theorem 12

\(\text {ACA}_0\vdash EKT\rightarrow WO(\Psi _0(\Omega _\omega ))\). In particular, EKT is not provable in \(\Pi _1^1\text {--CA}_0\).

Similar to EKT, a principle \(GKT_\omega (Q)\), denoting generalized Kruskal’s theorem with labels from \(\omega \) and additional well-quasi-ordered labels from a well-quasi-order Q, can be defined as follows. First, the objects related to this principle are rooted trees T that have two labelling functions associated with them, one function \(l:V(T)\longrightarrow \omega \) and another function \(l_Q:V(T)\longrightarrow Q\). They are ordered by embeddings \(f:T_1\longrightarrow T_2\) that satisfy the gap-condition

$$\begin{aligned} \forall x\in V(T_1)\forall y\in V(T_2)(y\le f(x)\wedge \lnot \exists z\in V(T_1)(z< x\wedge y\le f(z))\rightarrow l(y)\ge l(x)), \end{aligned}$$

and additionally respect the labels from Q in the sense that

$$\begin{aligned} \forall x\in V(T_1)(l_Q(x)\le l_Q(f(x))). \end{aligned}$$

For any vertex \(v\ne root(T)\) in such a tree, if w is the first vertex on the path from v to root(T), we define \(T^v\) to be the component of \(T\setminus w\) which includes v, and set \(root(T^v) := v\). Then one can relate ordinals to a subset of these trees, by decreeing that the well-quasi-order Q have the form \(Q = W_Q\cup \{+,\omega ^\cdot ,\psi \}\), where \(W_Q\) is a well-order and the elements of \(\{+,\omega ^\cdot ,\psi \}\) are incomparable to all others, in the following way. First, we need an ordinal notation system \(OT(\Omega _\omega \cdot W)\) from [6] which relativizes \(OT(\Omega _\omega )\) by putting \(\sup (W)\) many copies of \(\Omega _\omega \) above \(\Omega _\omega \). Interpret a well-order W as an ordinal and for \(w\in W\) set \(\overline{w}:= \Omega _\omega \cdot (1 + w)\). Define then sets \(C^W_m(\alpha )\), \(m\in \mathbb {N}\), and collapsing functions \(\psi ^W_m (\alpha )\), \(m\in \mathbb {N}\) by induction on \(\alpha \). Let \(C^W_m(\alpha )\) be the least set \(C\supseteq \Omega _m\cup \{\Omega _i:i\in \mathbb {N}\}\cup \{\overline{w}:w\in W\}\) so that:

  • \(C\cap \Omega _\omega \) is closed under \(+\) and \(\omega ^\cdot \),

  • \(\overline{w} + \alpha \in C\) whenever \(w\in W\) and \(\alpha \in C\cap \Omega _\omega \), and

  • \(C\cap \alpha \) is closed under \(\psi _n\) for all \(n\in \mathbb {N}\).

Then we can define \(\psi ^W_m (\alpha )\) by

$$\begin{aligned} \psi ^W_m (\alpha ) := \min \{\xi : \xi \notin C^W_m(\alpha )\}. \end{aligned}$$

We also write \(\psi _m\) instead of \(\psi ^W_m\) if no confusion is possible. The proof-theoretic ordinal of \(\Pi _1^1\text {--CA}\) in terms of these collapsing functions is then \(\psi _0(\Omega _\omega \cdot \varepsilon _0)\). Let \(w' := \sup (W)\). In the following we will always assume that ordinals are in normal form with regard to the ordinal notation system \(OT(\Omega _\omega \cdot \ W)\) that corresponds to \(C_0\left( \overline{w'}\right) \); see [6] for details.

To define the ordinal related to a tree, we additionally assume that W has a special element \(w_0\) so that \(w_0 < w\) for all \(w\in W\setminus \{w_0\}\) (normally \(w_0\) would correspond to 0, but we need it to be “less than” 0). We then define \(\psi _m(w_0) := \Omega _m\), and to simplify notation, we define further \(\psi _m(w + \alpha ) := \psi _m(\overline{w} + \alpha )\) for all \(w\in W\setminus \{w_0\}\). A tree T can then be assigned an ordinal o(T) from \(OT(\Omega _\omega \cdot W)\cap \Omega _\omega \) as follows:

  • If \(l_Q(root(T))\in W\) and root(T) has no successor, then set \(o(T) := \psi _n (w)\), where \(n = l(root(T))\) and \(w = l_Q(root(T))\).

  • If \(l_Q(root(T))\in W\setminus \{w_0\}\) and root(T) has one successor v, then set \(o(T) := \psi _n (w + o(T^v))\), where \(n = l(root(T))\) and \(w = l_Q(root(T))\).

  • If \(l_Q(root(T)) = +\) and \(v_1\), \(v_2\) are the successors of root(T) ordered so that \(o(T^{v_1}) \ge o(T^{v_2})\), then set \(o(T) := o(T^{v_1}) + o(T^{v_2})\).

  • If \(l_Q(root(T)) = \omega ^\cdot \) and v is the successor of root(T), then set \(o(T) := \omega ^{o(T^{v})}\).

  • If \(l_Q(root(T)) = \psi \) and v is the successor of root(T), then set \(o(T) := \psi _n o(T^{v})\), where \(n = l(root(T))\).

  • If none of these cases can be applied, T is not assigned an ordinal.

In the following we will restrict ourselves to trees that can be assigned an ordinal as above, and well-quasi-orders suitable for labelling those trees. Then it can be shown that:

Theorem 13

([5]) Let Q be a well-quasi-order and \(T_1\), \(T_2\) be trees as above. Then \(o(T_1)\le o(T_2)\) whenever \(T_1\le T_2\).

In particular, \(GKT_\omega (Q)\) implies the well-orderedness of \(OT(\Omega _\omega \cdot W_Q)\).

From which, letting \(GKT_\omega (\forall Q) := \forall Q( WQO(Q)\rightarrow GKT_\omega (Q))\), follows immediately:

Theorem 14

\(\text {ACA}_0\vdash GKT_\omega (\forall Q)\rightarrow [\forall X(WO(X)\rightarrow WO(OT(\Omega _\omega \cdot X)))]\).

Then, observing that \(\left| \Pi _1^1\text {--CA}\right| = \Psi _0(\Omega _\omega \cdot \varepsilon _0)\), we get stronger lower bounds on \(GKT_\omega (\forall Q)\) (and in fact even \(GKT_\omega (\varepsilon _0)\)).

Corollary 15

\(\Pi _1^1\text {--CA}_0+ GKT_\omega (\forall Q)\) proves \(WO(\psi _0(\Omega _\omega \cdot \varepsilon _0))\).

Corollary 16

\(\Pi _1^1\text {--CA}\not \vdash GKT_\omega (\forall Q)\).

This idea might possibly be leveraged in the following way, by extending it to theorems of the Graph Minors series. Recall that an immersion of one graph \(G_1\) into another graph \(G_2\) is an injective function \(f:G_1\longrightarrow G_2\) that maps vertices injectively to vertices and edges to edge-disjoint paths (the paths may intersect at vertices however). Given a labelled tree T as in the statement \(GKT_\omega (Q)\) with \(Q = W_Q\cup \{+,\omega ^\cdot ,\psi \}\), one can then define a tree-like graph which under immersion expansion aims to behave like the labelled tree.

Set \(Q' := Q\cup \{root\}\) where root is incomparable to all other elements of \(Q'\), and define \(V(G) := V(T)\cup \{r\}\), where r is a new vertex. Set further \(l_{Q'}(v) := l_Q(v)\) if \(v\in V(T)\) and set \(l_{Q'}(r) := root\). Connect then vertices v of G to their immediate predecessors by \(l(v) + 1\) parallel edges, and connect root(T) to r by \(l(root(T)) + 1\) parallel edges. We then adopt the notation \(v\le u\) if when deleting edges in G until no multiple edges remain (which results in a tree), v lies on the unique path from u to the vertex labelled with root in G. We also speak of predecessors and successors in G with regard to this ordering. For v in V(G) define then \(G^v\) to be the induced subgraph of G with vertex-set \(\{u\in V(G): v\le u\}\cup \{r'\}\) where \(r'\) is a new vertex labelled with root, and where \(r'\) is connected to v by as many edges as v was connected to its immediate predecessor p(v) in G. For vertices v not labelled with root set further \(l(v) := \left| \{e\in E(G): \text { e{ connects}v{ and}p(v)}\}\right| - 1\) (which is the same as l(v) in T).

One can then relate an ordinal to G in the obvious way, by definining o(G) as follows:

  • If the successor v of r is labelled from W and v has no successors, let \(o(G) := \psi _{l(v)}(l_{Q'}(v))\).

  • If the successor v of r is labelled from W and v has a successor w, let \(o(G) := \psi _{l(v)}(l_{Q'}(v) + o(G^w))\).

  • If the successor v of r is labelled with \(+\), set \(o(G) := o(G^{w_1}) + o(G^{w_2})\), where \(w_1\) and \(w_2\) are the successors of v so that \(o(G^{w_1})\ge o(G^{w_2})\).

  • If the successor v of r is labelled with \(\omega ^\cdot \), set \(o(G) := \omega ^{o(G^{w})}\), where w is the successor of v.

  • If the successor v of r is labelled with \(\psi \), set \(o(G) := \psi _{l(v)}o(G^{w})\), where w is the successor of v.

One could hope that \(o(G_1)\le o(G_2)\) whenever \(G_1\) can be immersed into \(G_2\), but sadly this result has not been established yet. When doing the proof for labelled trees, an induction on the height of the tree with additional induction hypotheses is usually used. However, aside from mapping the vertex labelled with root in \(G_1\) to the vertex labelled with root in \(G_2\), an immersion from \(G_1\) into \(G_2\) does not have to respect the “tree-structure” of \(G_1\), as illustrated in Fig. 1.

Fig. 1
figure 1

One example where a valid immersion embedding does not respect “infima” of the graphs. The labels of the vertices are drawn inside the nodes, with r used instead of root. The vertex map of the immersion embedding is given by the dashed arrows, with the edge map implied in the obvious way

The induction hypotheses necessary for proving \(o(G_1)\le o(G_2)\) can not always be used in such a case, which makes the proof that this holds (if it should indeed hold) a lot harder. It should be noted that the immersion relation between two such graphs corresponds to a root preserving embedding f between edge-labelled trees that is not order or infimum preserving (i.e. so that f maps vertices injectively to vertices and edges to paths that do not have to be disjoint), that however satisfies a different gap-condition, namely that for \(e\in E(G_2)\) it has to hold that \(l(e)\ge \sum _{e'\in f^{-1}(e)}{l(e')}\), where \(f^{-1}(e)\) denotes the set of edges \(e'\) so that e is an edge of \(f(e')\).

While it is not clear whether this construction works with immersions due to the above, it should be noted that it does work when using directed graphs and immersions, i.e. so that edges are directed from u to v if \(u\le v\) and so that an immersion expansion maps edges to edge-disjoint directed paths. However, the immersion theorem is known to not hold for the class of all directed graphs in general, and it is currently an ongoing effort in graph theory to establish for which classes of directed graphs it does hold. Thus, it is an open question whether lower bounds like these can be established for a more natural class of directed graphs, and further whether these results can be extended to undirected immersions.