1 Introduction

A central topic in extremal set theory is intersection properties of families of finite sets. This topic began with classical results of Erdős, Ko and Rado [14] determining the maximum size of an intersecting family of subsets of an n-element set, for both arbitrary subsets and for subsets of a given size.

A family \({\mathcal {F}}\) of sets is maximal (or saturated) with respect to some property if it satisfies the property, but no family properly containing \({\mathcal {F}}\) satisfies the property. The problem of finding the smallest object which is maximal with respect to some property has an extensive literature. In the setting of graphs the topic was initiated by Erdős, Hajnal and Moon [15], who determined the minimum number of edges in a maximal n-vertex graph not containing a clique of size k (see [16] for an extensive survey on graph saturation). For set systems consisting of sets of a fixed size r, Erdős and Lovász [12] suggested the problem of finding a maximal intersecting family of minimum size. For this problem, the best known lower bound of 3r was proved by Dow, Drake, Füredi and Larson [7], and the best known upper bound of \(r^2/2 + 5r + o(r)\) (when \(r-1\) is a prime power) is due to Boros, Füredi and Kahn [5]. Kahn [22] conjectured that O(r) is an upper bound. Other properties for which saturation results have been obtained for set systems include the k-Sperner [18], equal disjoint union free [8], VC-dimension at most k [17] and k-matching free [6] properties.

In this paper we study a natural saturation problem concerning an intersection property. To state the problem we need to introduce some notation. For a positive integer n, denote by [n] the set \(\{1,2,\ldots , n\}\). Given a set S, we write \(2^S\) for the power set of S, which is the family of all subsets of S. For \({\mathcal {F}}\subseteq 2^{[n]}\), let \({\bar{{\mathcal {F}}}} :=\{F^c: F \in {\mathcal {F}}\}\) where \(F^c\) denotes \([n]\setminus F\). We use the notation \({\mathcal {F}}\Delta {\mathcal {G}}\) for the symmetric difference \(({\mathcal {F}}\setminus {\mathcal {G}})\cup ({\mathcal {G}}\setminus {\mathcal {F}})\) of two set families \({\mathcal {F}}\) and \({\mathcal {G}}\).

We say that a family \({\mathcal {F}}\subseteq 2^{[n]}\) is maximal k-wise intersecting if the intersection of every collection of at most k sets in \({\mathcal {F}}\) is non-empty, and no set from \(2^{[n]} \setminus {\mathcal {F}}\) can be added to \({\mathcal {F}}\) while preserving this property. In the case \(k=2\), it is well-known that every maximal 2-wise intersecting family has the same size, namely \(2^{n-1}\). For \(k \ge 3\), an old question of Erdős and Kleitman [13] from 1974 asks for the size of the smallest maximal k-wise intersecting family. We answer this question for the case \(k=3\) and n sufficiently large.

We now present our construction, which we call a balanced pair of linked cubes. A balanced pair of linked cubes is a set family of the form \(\{A: S \subsetneq A \subseteq [n]\} \cup \{B: S^c \subsetneq B \subseteq [n]\}\) for some \(S \subseteq [n]\) with \(|S|=\lfloor n/2 \rfloor \). It is not hard to check that a balanced pair of linked cubes is a maximal 3-wise intersecting family of size \(2^{\lfloor n/2 \rfloor } + 2^{\lceil n/2 \rceil } -3\).

Our main result is that for sufficiently large n, the smallest maximal 3-wise intersecting families are balanced pairs of linked cubes.

Theorem 1.1

If \({\mathcal {F}}\) is a maximal 3-wise intersecting family of minimum size on ground set [n], where n is sufficiently large, then \({\mathcal {F}}\) is a balanced pair of linked cubes.

Remark

The even case of Theorem 1.1 was first proved by the third, fourth, sixth and seventh authors [20]. In [4], the first, second and fifth authors used the method of [20] together with some new ideas to establish the odd case. The present paper provides a unified treatment of both cases.

A key ingredient in the proof of Theorem 1.1 is the following stability result, which shows that a maximal 3-wise intersecting family of size at most \((1+o(1))(2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil })\) must be close in structure to a balanced pair of linked cubes.

Theorem 1.2

For every \(\varepsilon > 0\) there exists \(\delta >0\) such that, if \({\mathcal {F}}\) is a maximal 3-wise intersecting family on [n] of size \(|{\mathcal {F}}| \le (1+\delta ) (2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil })\), where n is sufficiently large, then there exists a balanced pair of linked cubes \({\mathcal {F}}_0\) such that \(|{\mathcal {F}}\triangle {\mathcal {F}}_0| \le \varepsilon 2^{\lfloor n/2\rfloor }\).

Organization and Notation The rest of the paper is organized as follows. In Sect. 2 we prove Theorem 1.2, in Sect. 3 we deduce Theorem 1.1 from Theorem 1.2 using a perturbation argument, and in Sect. 4 we discuss maximal k-wise intersecting families when \(k\ge 4\).

We use standard asymptotic notation throughout. For functions \(f = f(n)\) and \(g = g(n)\) we write \(f=o(g)\), \(f\ll g\), or \(g \gg f\) to mean \(f/g \rightarrow 0\) as \(n\rightarrow \infty \). All asymptotics are as \(n \rightarrow \infty \).

2 Proof of Theorem 1.2

2.1 Proof Overview

In the proof of Theorem 1.2, we find it more convenient to work with \({\bar{{\mathcal {F}}}}\).

Definition

A balanced pair of cubes on [n] is a set system of the form \(\{A:A \subsetneq S\}\cup \{B:B\subsetneq S^c\}\), where \(S\subseteq [n]\) with \(|S|=\lfloor \frac{n}{2}\rfloor \). More generally, a balanced series of k cubes on [n], denoted by \({\mathcal {F}}_{n,k}\), is a set system of the form \(\cup _{i=1}^k\{A:A \subsetneq S_i\}\), where \(S_1,\ldots ,S_k\) is a partition of [n] with \(\lfloor \frac{n}{k}\rfloor \le |S_i| \le \lceil \frac{n}{k}\rceil \) for each \(i\in [k]\).

We also need the following key concepts, whose relevance will be revealed in Lemma 2.1 below.

Definition

A set system \({\mathcal {H}}\subseteq 2^{[n]}\) is a \((1-\varepsilon )\)-k-generator for [n] if all but at most \(\varepsilon 2^n\) subsets of [n] can be expressed as a union of at most k disjoint sets of \({\mathcal {H}}\).

Definition

Given a set system \({\mathcal {H}}\subseteq 2^{[n]}\), denote by \({\text {dp}}({\mathcal {H}})\) the number of disjoint pairs in \({\mathcal {H}}\), i.e., \({\text {dp}}({\mathcal {H}}) =\left| \{\{A,B\}:A,B\in {\mathcal {H}},\, A\cap B=\emptyset \}\right| .\)

The following lemma is central to the proof of Theorem 1.2.

Lemma 2.1

If \({\mathcal {F}}\) is a maximal 3-wise intersecting family on [n], then

$$\begin{aligned} {\text {dp}}({\bar{{\mathcal {F}}}} )\ge 2^n-|{\mathcal {F}}|. \end{aligned}$$

Moreover, if \(|{\mathcal {F}}| \le \varepsilon 2^n\), then \({\bar{{\mathcal {F}}}}\) is a \((1-\varepsilon )\)-2-generator for [n].

Proof

Let \({\mathcal {F}}^c :=2^{[n]}\setminus {\mathcal {F}}\), then \(|{\mathcal {F}}^c|=2^n-|{\mathcal {F}}|\). For every \(A\in {\mathcal {F}}^c\), there exist \(B,C\in {\mathcal {F}}\) such that \(B\cap C\subseteq A^c\), since \({\mathcal {F}}\) is a maximal 3-wise intersecting family. Notice that \({\mathcal {F}}\) is upward closed, i.e., if \(F\in {\mathcal {F}}\) and \(F\subseteq F'\subseteq [n]\), then \(F'\in {\mathcal {F}}\). Thus, we may select these sets \(B,C\in {\mathcal {F}}\) so that \(A^c=B\cap C\) and \(B\cup C=[n]\), which means \(B^c\cap C^c=\emptyset \). This (not necessarily unique) choice of B and C defines an injective map from \({\mathcal {F}}^c\) into disjoint pairs of sets in \({\bar{{\mathcal {F}}}}\) (where \(\emptyset \) is mapped to \(\{\emptyset , \emptyset \}\)). Therefore,

$$\begin{aligned} {\text {dp}}({\bar{{\mathcal {F}}}} )\ge |{\mathcal {F}}^c|=2^n-|{\mathcal {F}}|. \end{aligned}$$

Now assume that \(|{\mathcal {F}}| \le \varepsilon 2^n\). Since \(|{\mathcal {F}}^c|=2^n-|{\mathcal {F}}| \ge (1-\varepsilon )2^n\) and every set in \({\mathcal {F}}^c\) can be expressed as a union of at most two disjoint sets of \({\bar{{\mathcal {F}}}}\), \({\bar{{\mathcal {F}}}}\) is a \((1-\varepsilon )\)-2-generator for [n]. \(\square \)

The following result together with Lemma 2.1 immediately implies Theorem 1.2. We will prove it in Sect. 2.3.

Theorem 2.2

For every \(\varepsilon '>0\), there exists \(\delta =\delta (\varepsilon ')>0\) such that for sufficiently large n, if \({\mathcal {H}}\subseteq 2^{[n]}\) is a \((1-\delta )\)-2-generator for [n] with \(|{\mathcal {H}}|\le (1+\delta )(2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil })\), then there exists a balanced pair of cubes \({\mathcal {F}}_{n,2}\) such that \(|{\mathcal {H}}\triangle {\mathcal {F}}_{n,2}|\le \varepsilon ' 2^{\lfloor n/2\rfloor }\).

Remark

Theorem 2.2 for n even was first proved by Ellis and Sudakov [9, Proposition 9]. Our proof of Theorem 2.2 uses their method.

Proof of Theorem 1.2 assuming Theorem 2.2

Let \(\varepsilon >0\) and n be a sufficiently large integer. Let \(\delta =\delta (\varepsilon )>0\) be obtained from Theorem 2.2. Let \({\mathcal {F}}\) be a maximal 3-wise intersecting family on [n] of size \(|{\mathcal {F}}| \le (1+\delta ) (2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil })\), then \(|{\mathcal {F}}|\le \delta 2^n\), when n is sufficiently large. Since \(|{\bar{{\mathcal {F}}}}|=|{\mathcal {F}}|\le \delta 2^n\), \({\bar{{\mathcal {F}}}}\) is a \((1-\delta )\)-2-generator for [n] by Lemma 2.1.

Applying Theorem 2.2 to \({\mathcal {H}}:={\bar{{\mathcal {F}}}}\), we conclude that there exists a balanced pair of cubes \({\mathcal {F}}_{n,2}\) with \(|{\bar{{\mathcal {F}}}}\triangle {\mathcal {F}}_{n,2}|\le \varepsilon 2^{\lfloor n/2\rfloor }\). Taking the complements of the sets in \({\bar{{\mathcal {F}}}}\) and \({\mathcal {F}}_{n,2}\) yields the desired result. \(\square \)

Lemma 2.1 also leads to another classic question posed by Erdős [11]: How many disjoint pairs of sets can there be in a set system of given size? A lower bound comes from \({\mathcal {F}}_{n,k}\), a balanced series of k cubes, i.e. \({\mathcal {F}}_{n,k}=\cup _{i=1}^k\{A: A \subsetneq S_i\}\), where \(S_1,\ldots ,S_k\) is a partition of [n] with \(\lfloor \frac{n}{k}\rfloor \le |S_i| \le \lceil \frac{n}{k}\rceil \) for each \(i\in [k]\). With the additional assumption that k divides n, it is easy to see that \(|{\mathcal {F}}_{n,k}|= k2^{n/k}-2k+1\) and \({\text {dp}}({\mathcal {F}}_{n,k}) > (1-1/k) \left( {\begin{array}{c}|{\mathcal {F}}_{n,k}|\\ 2\end{array}}\right) \). Solving a conjecture of Daykin and Erdős [19], Alon and Frankl [1] proved that \({\mathcal {F}}_{n,k}\) has asymptotically the maximum number of disjoint pairs, given its size.

Theorem 2.3

(Alon–Frankl) For every positive integer k, there exists \(\beta =\beta (k)>0\) such that, if \({\mathcal {H}}\) is a set system on [n] of size \(m:=|{{\mathcal {H}}}| \ge 2^{(1/(k+1)+\varepsilon )n}\), where \(\varepsilon >0\), then

$$\begin{aligned} {\text {dp}}({\mathcal {H}}) \le \left( 1-\frac{1}{k}\right) \left( {\begin{array}{c}m\\ 2\end{array}}\right) +O\left( m^{2-\beta \varepsilon ^2}\right) . \end{aligned}$$

We strengthen Theorem 2.3 by proving a stability result, showing families with close to \(\left( 1-\frac{1}{k}\right) \left( {\begin{array}{c}m\\ 2\end{array}}\right) \) disjoint pairs must resemble a balanced series of k cubes.

Theorem 2.4

For every \(\varepsilon >0\) and integer \(k\ge 2\), there exists an \(\eta >0\) such that for every sufficiently large n, if a set system \({\mathcal {H}}\) on [n] of size \(m:=|{\mathcal {H}}| \ge (1-\eta )k2^{n/k}\) has at least \((1-\frac{1}{k}-\eta )\frac{m^2}{2}\) disjoint pairs, then k divides n, and there exists a balanced series of k cubes \({\mathcal {F}}_{n,k}\) with \(|{\mathcal {H}}\triangle {\mathcal {F}}_{n,k}| \le \varepsilon 2^{\lfloor n/k\rfloor }\).

The proof of Theorem 2.4, given in Sect. 2.2, borrows some ideas of Alon and Frankl [1], and Alon, Das, Glebov and Sudakov [2]. The even case of Theorem 2.2 can be deduced from Theorem 2.4 as follows.

Proof of Theorem 2.2 for even n assuming Theorem 2.4

Let \(\eta \) be as given by Theorem 2.4 for \(k=2\) and \(\varepsilon >0\), and let \(\delta =\eta ^2\). Let \({\mathcal {H}}' \supseteq {\mathcal {H}}\) such that \(|{\mathcal {H}}'| = \lfloor (1+\delta )2^{(n+2)/2} \rfloor \), and every element of \({\mathcal {H}}' \setminus {\mathcal {H}}\) has size greater than n/2. This is possible because n is large and the number of sets of size greater than n/2 is nearly \(2^{n-1}\). Since \({\mathcal {H}}'\supseteq {\mathcal {H}}\) and \({\mathcal {H}}\) is a \((1-\delta )\)-2-generator for [n], we have

$$\begin{aligned} |{\mathcal {H}}'|+{\text {dp}}({\mathcal {H}}') \ge |{\mathcal {H}}|+{\text {dp}}({\mathcal {H}}) \ge (1-\delta )2^n. \end{aligned}$$

Hence \({\text {dp}}({\mathcal {H}}') \ge (1-\delta )2^n-(1+\delta )2^{(n+2)/2} \ge (1-2\delta )2^n\) for n sufficiently large. On the other hand,

$$\begin{aligned} \left( \frac{1}{2} - \eta \right) \frac{|{\mathcal {H}}'|^2}{2} \le \left( \frac{1}{2} - \eta \right) (1+\eta )^2 2^{n+1} = (1-3\eta ^2-2\eta ^3) 2^n \le (1-3\delta )2^n. \end{aligned}$$

It follows that \({\text {dp}}({\mathcal {H}}') \ge (1/2 - \eta )|{\mathcal {H}}'|^2/2\), and the hypotheses of Theorem 2.4 are satisfied for \({\mathcal {H}}'\). Theorem 2.4 gives a balanced pair of cubes \({\mathcal {F}}_{n,2}\) with \(|{\mathcal {H}}'\triangle {\mathcal {F}}_{n,2}|\le \varepsilon 2^{n/2}\). By the second property of \({\mathcal {H}}'\), \({\mathcal {H}}'\setminus {\mathcal {H}}\) is disjoint from \({\mathcal {F}}_{n,2}\). Hence, \(|{\mathcal {H}}\triangle {\mathcal {F}}_{n,2}|\le |{\mathcal {H}}'\triangle {\mathcal {F}}_{n,2}|\le \varepsilon 2^{n/2}\).

Before starting the proofs of Theorems 2.2 and 2.4, we introduce the so-called disjointness graph, which is the main object that we analyze in Sects. 2.2 and 2.3.

Definition

For a set system \({\mathcal {H}}\), the disjointness graph \(G_{\mathcal {H}}\) is the graph with vertex set \({\mathcal {H}}\) and edge set \(\{\{A,B\}\subseteq {\mathcal {H}}:A\cap B =\emptyset \}\). For two (not necessarily disjoint) set families \({\mathcal {H}}_1,{\mathcal {H}}_2\), the disjointness bipartite graph \(G_{{\mathcal {H}}_1, {\mathcal {H}}_2}\) is the bipartite graph with classes \(({\mathcal {H}}_1,{\mathcal {H}}_2)\), where there is an edge between \(A \in {\mathcal {H}}_1\) and \(B \in {\mathcal {H}}_2\) if and only if \(A \cap B = \emptyset \).

2.2 Proof of Theorem 2.4

The heart of the proof of Theorem 2.4 is the following lemma, which shows that the disjointness graph of a large family has only few cliques of size \(k+1\). This lemma essentially appears in [1]. For completeness, we include its proof here.

Lemma 2.5

For every \(\varepsilon >0\), \(\gamma >0\) and integer \(k\ge 2\), and sufficiently large integer n, if \({\mathcal {H}}\) is a set system on [n] of size \(m:=|{\mathcal {H}}| \ge 2^{(1/(k+1)+\varepsilon )n}\), then \(G_{{\mathcal {H}}}\) contains at most \(\gamma \left( {\begin{array}{c}m\\ k+1\end{array}}\right) \) copies of \(K_{k+1}\).

We denote by \(K_r(t)\) the complete r-partite graph with parts of size t. The following proposition is standard and follows from a result of Erdős [10] by a simple averaging argument (see for instance [2, Proposition 3.2]).

Proposition 2.6

For integers \(r\ge 2\), \(t \ge 1\) and any real \(\gamma >0\), there exists \(\delta _{(2.6)}=\delta _{(2.6)}(r,t,\gamma )>0\), such that if m is sufficiently large and G is a graph on m vertices with at least \(\gamma \left( {\begin{array}{c}m\\ r\end{array}}\right) \) copies of \(K_r\), then G contains at least \(\delta _{(2.6)} \left( {\begin{array}{c}m\\ rt\end{array}}\right) \) copies of \(K_r(t)\).

We will use a probabilistic argument to derive Lemma 2.5 from Proposition (2.6).

Proof of Lemma 2.5

Let \(t=\lceil \frac{2}{\varepsilon }\rceil +k\). Select t sets \(A_1,\ldots ,A_t \in {{\mathcal {H}}}\) independently uniformly at random, with repetitions allowed. The probability that \(|A_1\cup A_2\cup \cdots \cup A_t| \le \frac{n}{k+1}\) is bounded above by

$$\begin{aligned} \sum _{\begin{array}{c} S\subseteq [n] \\ |S|=\lfloor n/(k+1)\rfloor \end{array} } \Pr \big [ A_i\subseteq S, i=1,\ldots ,t\big ] \le 2^n \left( 2^{n/(k+1)}/m\right) ^t \le m^{-k}, \end{aligned}$$

since \(m\ge 2^{(1/(k+1)+\varepsilon )n}\), where \(2^n\) estimates the number of choices of S, and S has \(2^{\lfloor n/(k+1)\rfloor }\) many subsets, each could be chosen as \(A_i\), if they are in \({\mathcal {H}}\).

Let \({\textbf{A}}=(A_1,A_2,\ldots ,A_{(k+1)t})\) be a random sample (chosen independently uniformly at random, allowing repetition) of \((k+1)t\) vertices of \(G_{{\mathcal {H}}}\). It follows from the discussion above that the probability that \(|A_{(i-1)t+1}\cup A_{(i-1)t+2}\cup \cdots \cup A_{it}| \le \frac{n}{k+1}\) for some \(i\in [k+1]\) is at most \((k+1)m^{-k}\). On the other hand, if our random sample \({\textbf{A}}\) gives a copy of \(K_{(k+1)}(t)\) with \(\{A_{(i-1)t+1},\ldots , A_{it}\}\) being the i-th vertex class for every \(i\in [k+1]\), then \(A_{1}\cup \cdots \cup A_{t}, \ldots , A_{kt+1}\cup \cdots \cup A_{(k+1)t}\) are \(k+1\) disjoint subsets of [n]. Hence, in this case, we must have \(|A_{(i-1)t+1}\cup A_{(i-1)t+2}\cup \cdots \cup A_{it}| \le \frac{n}{k+1}\) for some \(i\in [k+1]\). Moreover, the probability that our random sample gives such a copy of \(K_{k+1}(t)\) is precisely

$$\begin{aligned} m^{-(k+1)t}\cdot (k+1)! (t!)^{k+1} \cdot \# \ \text {copies of } K_{k+1}(t) ~\text { in}~ G_\mathcal {{\mathcal {H}}}. \end{aligned}$$

Therefore, the number of copies of \(K_{k+1}(t)\) in \(G_{{\mathcal {H}}}\) is at most

$$\begin{aligned} \frac{m^{(k+1)t-k}}{k! (t!)^{k+1}} \le \delta _{(2.6)}(k+1,t,\gamma )\left( {\begin{array}{c}m\\ (k+1)t\end{array}}\right) \end{aligned}$$

for m sufficiently large. Hence, by Proposition 2.6, \(G_{{\mathcal {H}}}\) has at most \(\gamma \left( {\begin{array}{c}m\\ k+1\end{array}}\right) \) copies of \(K_{k+1}\). This completes the proof of Lemma 2.5. \(\square \)

We will make use of the following theorem of Balogh, Bushaw, Collares, Liu, Morris and Sharifzadeh [3, Theorem1.2].

Theorem 2.7

For every \(m,k,t \in {\mathbb {N}}\), the following holds. Suppose G is graph on m vertices with at most \(\frac{m^{k-1}}{e^{2k}\cdot k!}\left( e(G)+t-\left( 1-\frac{1}{k}\right) \frac{m^2}{2}\right) \) copies of \(K_{k+1}\). Then there is a partition of the vertex set of G as \(V(G)=V_1\cup V_2\cup \cdots \cup V_k\) with \(\sum _{i=1}^k e(V_i) \le t\).

We also use the following well-known estimate on the size of a set system in terms of the binary entropy function.

Lemma 2.8

Let \({\mathcal {F}}\) be a set system on [n]. Denote by \(p_i\) the fraction of sets in \({\mathcal {F}}\) that contain i. Then

$$\begin{aligned} |{\mathcal {F}}| \le 2^{\sum _{i=1}^n h(p_i)}, \end{aligned}$$

where \(h(p)=-p\log _2 p-(1-p)\log _2 (1-p)\) is the binary entropy function.

We are now ready to prove Theorem 2.4. The proof closely follows the approach from [2].

Proof of Theorem 2.4

Let \({\mathcal {H}}\) be a family of subsets of [n] such that \(m:=|{{\mathcal {H}}}| \ge (1 - o(1)) k 2^{n/k}\) and \({\text {dp}}({\mathcal {H}}) \ge \left( 1-1/k-o(1) \right) \frac{m^2}{2}\). By Lemma 2.5, \(G_{{\mathcal {H}}}\) contains at most \(o(m^{k+1})\) copies of \(K_{k+1}\). Thus, applying Theorem 2.7 with \(t=o(m^2)\), we conclude that \(G_{{\mathcal {H}}}\) has a k-partite subgraph with \((1 - 1/k - o(1))\frac{m^2}{2}\) edges. In order to contain this many edges, it is clear that the vertex classes must all have size \((1 - o(1)) m/k > (1 - o(1)) 2^{n/k}\). Furthermore, again by the edge density, all but at most o(m) vertices must have at least \((1-1/k-o(1))m\) neighbors. Thus, we may take a k-partite subgraph G of \(G_{{\mathcal {H}}}\) consisting of parts of size at least \((1 - o(1)) 2^{n/k}\) such that each vertex has at least \((1-1/k-o(1)) k 2^{n/k}\) neighbors, and thus at least \((1-o(1)) 2^{n/k}\) neighbors in every other class.

Let \(\varepsilon >0\) be a sufficiently small constant with respect to k. Let \({\mathcal {H}}_1, {\mathcal {H}}_2,\dots ,{\mathcal {H}}_k\) be the sets from \({\mathcal {H}}\) corresponding to the k color classes of G, respectively. By the above discussion,

$$\begin{aligned} |{\mathcal {H}}_i|\ge (1-\varepsilon )2^{n/k} \quad \text {for all} i\in [k]. \end{aligned}$$
(1)

We will construct a balanced series of k cubes containing \(\bigcup _{i=1}^k {\mathcal {H}}_i\). To do this we begin with a partition of the ground set [n] into \(2k+1\) disjoint sets: \(A_1,B_1,\ldots ,A_k,B_k,R\). For \(i \in [k]\), let \(A_i\) and \(B_i\) be the set of those elements of \([n] \setminus \bigcup _{j=1}^{i-1} (A_j \cup B_j)\) occurring in more than \(\varepsilon |{\mathcal {H}}_i|\) members of \({\mathcal {H}}_i\) and between 1 and \(\varepsilon |{\mathcal {H}}_i|\) members of \({\mathcal {H}}_i\), respectively. Let R be the set of remaining elements of [n].

By the definition of the partition, if for some \(i \in [k]\) we have \(x \in B_i\), then there is a set \(F \in {\mathcal {H}}_i\) such that \(x \in F\). It follows from the density properties of G discussed above that for \(j \ne i\), x is not contained in at least \((1 - \varepsilon ) |{{\mathcal {H}}_j}|\) members of \({\mathcal {H}}_j\). If \(x \in A_i\), then for \(j\ne i\), x does not appear in any member of \({\mathcal {H}}_{j}\), since each member of \({\mathcal {H}}_{j}\) is disjoint from at least \((1 - \varepsilon ) |{\mathcal {H}}_i|\) members of \({\mathcal {H}}_i\).

It follows that no set \(F\in {\mathcal {H}}_i\) has vertices from \(\bigcup _{j=1}^{i-1} A_j\). Furthermore, each element \(x \in \bigcup _{j=1}^{i} B_j\) is contained in at most \(\varepsilon |{\mathcal {H}}_i|\) sets in \({\mathcal {H}}_i\). Using Lemma 2.8 we have

$$\begin{aligned} |{\mathcal {H}}_i|\le 2^{|A_i|+h(\varepsilon )(\sum _{j \le i}|B_j|)}. \end{aligned}$$

We thus get

$$\begin{aligned} 2^{n-\frac{1}{2}} < (1-\varepsilon )^k 2^{n}&\le \prod _{i=1}^k|{\mathcal {H}}_i| \le 2^{\sum _{i\in [k]}\left( |A_i| + h(\varepsilon )(k-i+1)|B_i|\right) }\\&= 2^{n - |R|-\sum _{i\in [k]} \left( 1-h(\varepsilon )(k-i+1)\right) |{B_i}|} x\le 2^{n -|R|- \frac{3}{4}\sum _{i\in [k]}|B_i|}, \end{aligned}$$

where in the first equality we used the fact that \(n=|R|+\sum _{i\in [k]}(|A_i|+|B_i|)\), and the last inequality holds since \(1-h(\varepsilon )k \ge 3/4\) for \(\varepsilon \) sufficiently small with respect to k. This implies that \(R=B_1=\cdots =B_k=\emptyset \). As \({\mathcal {H}}_i \subseteq 2^{A_i\cup B_i}\), by the definition of \(A_i\) and \(B_i\), we have that \(|{\mathcal {H}}_i| \le 2^{|A_i|+|B_i|}=2^{|A_i|}\). Together with the lower bound \(|{\mathcal {H}}_i|\ge (1-\varepsilon )2^{n/k}\) from (1), we must have \(|{A_i}| > (n-1)/k\), and so \(\left| A_i \right| \ge n/k\). Since \(\sum _{i \in [k]} \left| A_i \right| = n\), we in fact have \(|A_i|=n/k\) for all \(i\in [k]\). It follows that the family \(\bigcup _{i=1}^{k} {\mathcal {H}}_i\) is contained in the balanced series of k cubes \(A_1,\ldots ,A_k\). Since \({\mathcal {H}}\) contains at most \(\varepsilon |{\mathcal {H}}|\) members not in \(\bigcup _{i=1}^{k} {\mathcal {H}}_i\), the proof is complete. \(\square \)

2.3 Proof of Theorem 2.2

Before stating the proof of Theorem 2.2, we need some preparation.

Definition

Given set systems \({\mathcal {H}}_1, {\mathcal {H}}_2\) and a bipartite subgraph \(B\subseteq G_{{\mathcal {H}}_1}\) with bipartition \(({\mathcal {X}},{\mathcal {Y}})\), we say B (and sometimes we say E(B)) generates \({\mathcal {H}}_2\) if every \(H\in {\mathcal {H}}_2\) can be expressed as a disjoint union of some \(X\in {\mathcal {X}}\) and \(Y\in {\mathcal {Y}}\), i.e., every \(H\in {\mathcal {H}}_2\) corresponds to an edge of B.

Definition

For a set system \({\mathcal {H}}\) and \(i\in [n]\), let \({\mathcal {H}}_i^- :=\{H\in {\mathcal {H}}: i\notin H\}\) be the subfamily of \({\mathcal {H}}\), consisting of sets not containing i and \({\mathcal {H}}_i^+ :=\{H\setminus \{i\}: i\in H\in {\mathcal {H}}\}\). Note that \(|{\mathcal {H}}_i^+|+|{\mathcal {H}}_i^-|=|{\mathcal {H}}|\).

We will use the following result by Ellis and Sudakov [9].

Lemma 2.9

( [9, Proposition 18]) Let \(c>0\). Then, for every set family \({\mathcal {H}}\subseteq 2^{[n]}\) with \(|{\mathcal {H}}|\ge c2^{n/2}\), the disjointness graph \(G_{{\mathcal {H}}}\) can be made bipartite by deleting at most \(o(|{\mathcal {H}}|^2)\) edges.

We are now ready to prove Theorem 2.2.

Proof of Theorem 2.2

We have already proved the even case of Theorem 2.2, hence from now on we assume that \(n=2\ell +1\) is a sufficiently large odd integer. \(\square \)

Let \(\varepsilon '\gg \varepsilon \gg \delta >0\) be sufficiently small. Suppose that \({\mathcal {H}}\subseteq 2^{[n]}\) is a \((1-\delta )\)-2-generator for [n] with \(|{\mathcal {H}}|\le (1+\delta ) (2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil })=3(1+\delta )2^\ell \). Then, the number of ways to choose at most two disjoint sets (whose unions are all different) from \({\mathcal {H}}\) is at least \((1-\delta )2^n\), by the definition of a \((1-\delta )\)-2-generator. Hence, \(|{\mathcal {H}}|^2\ge (1-\delta )2^n\), implying

$$\begin{aligned} |{\mathcal {H}}|\ge \sqrt{1-\delta }\cdot 2^{n/2}. \end{aligned}$$

Moreover, let \(G_0\) be the disjointness graph of \({\mathcal {H}}\), then

$$\begin{aligned} 1+|{\mathcal {H}}|+e(G_0)\ge (1-\delta )2^n, \end{aligned}$$

which implies that

$$\begin{aligned} e(G_0)\ge (1-\delta )2^{2\ell +1}-3(1+\delta )2^\ell -1. \end{aligned}$$

We conclude that \(G_0\) has edge-density

$$\begin{aligned} \frac{e(G_0)}{\left( {\begin{array}{c}|{\mathcal {H}}|\\ 2\end{array}}\right) }\ge \frac{(1-\delta )2^{2\ell +1}-3(1+\delta ) 2^\ell -1}{9(1+\delta )^2 2^{2\ell -1}} \ge \frac{4-5\delta }{9(1+\delta )^2}, \end{aligned}$$

where the last inequality comes from \(3(1+\delta )2^\ell +1\le \delta 2^{2\ell -1}\). Applying Lemma 2.9 to \({\mathcal {H}}\) with \(c=\sqrt{1-\delta }\), we conclude that one can delete at most

$$\begin{aligned} o(|{\mathcal {H}}|^2)=o(2^{2\ell +1}) \end{aligned}$$

edges from \(G_0\) and obtain a bipartite graph \(G=({\mathcal {X}},{\mathcal {Y}})\) with \({\mathcal {X}}\cup {\mathcal {Y}}={\mathcal {H}}\). Note that G generates all but at most

$$\begin{aligned} \delta 2^{2\ell +1}+1+|{\mathcal {H}}|+o(2^{2\ell +1}) \end{aligned}$$

subsets of [n].

Since \(|{\mathcal {H}}|\le 3(1+\delta ) 2^\ell =o(2^{2\ell +1})\) and \(\delta \ll \varepsilon \), G generates all but at most \(\varepsilon 2^{2\ell +1}\) subsets of [n]. In particular,

$$\begin{aligned} e(G)\ge (1-\varepsilon )2^{2\ell +1}. \end{aligned}$$
(2)

Let \(\alpha :=|{\mathcal {X}}|/2^\ell \) and \(\beta :=|{\mathcal {Y}}|/2^\ell \). Since \(|{\mathcal {X}}| + |{\mathcal {Y}}| = |{\mathcal {H}}| \le 3 (1+\delta ) 2^\ell \), we have

$$\begin{aligned} \alpha +\beta \le 3(1+\delta ). \end{aligned}$$
(3)

Therefore, we have

$$\begin{aligned} \alpha \beta \le \frac{9}{4}(1+\delta )^2. \end{aligned}$$
(4)

Since \(\alpha \beta 2^{2\ell }=|{\mathcal {X}}||{\mathcal {Y}}|\ge e(G)\ge (1-\varepsilon )2^{2\ell +1}\), we also have \(\alpha \beta \ge 2-2\varepsilon \). Combining with (3), we get

$$\begin{aligned} 1-3\varepsilon<\alpha ,\beta <2+3\varepsilon , \end{aligned}$$
(5)

where we use \(\varepsilon \gg \delta \).

Let

$$\begin{aligned} {\mathcal {X}}_{1/3}:=\{i\in [n]: |{\mathcal {X}}_i^+|\ge |{\mathcal {X}}|/3\}\quad \quad \mathrm{{and}} \quad {\mathcal {Y}}_{1/3}:=\{i\in [n]: |{\mathcal {Y}}_i^+|\ge |{\mathcal {Y}}|/3\}.\nonumber \\ \end{aligned}$$
(6)

Letting \(A:={\mathcal {X}}_{1/3}\) and \(B:={\mathcal {Y}}_{1/3}\), we will show that AB forms an equipartition of [n], and that \({\mathcal {X}}\) and \({\mathcal {Y}}\) are not too far from \(2^A\) and \(2^B\), respectively, thereby proving Theorem 2.2. We will achieve this goal step by step via the following lemma and a series of claims.

Lemma 2.10

It cannot happen that all the following equations hold at the same time:

$$\begin{aligned}{} & {} |{\mathcal {X}}|, |{\mathcal {Y}}|=(3/2-o(1))2^\ell ,\quad |{\mathcal {X}}_n^+|, |{\mathcal {Y}}_n^+|=(1-o(1))2^{\ell -1} \quad \text {and} \nonumber \\{} & {} |{\mathcal {X}}_n^-|, |{\mathcal {Y}}_n^-|=(1-o(1))2^\ell . \end{aligned}$$
(7)

Proof

Suppose for a contradiction that (7) holds. Since all but at most \(\varepsilon 2^n=2\varepsilon \cdot 2^{n-1}\) sets in [n] can be expressed as a union of at most two disjoint sets in \({\mathcal {H}}\), we have \({\mathcal {H}}_n^-={\mathcal {X}}_n^- \cup {\mathcal {Y}}_n^-\) is a \((1-2\varepsilon )\)-2-generator for \([n-1]\). By the assumption that (7) holds, \(|{\mathcal {H}}_n^-|=(2-o(1))2^\ell =(2-o(1))2^{(n-1)/2}\). As \(|{\mathcal {F}}_{n-1,2}|=2\cdot 2^{(n-1)/2}-1\), we can apply the even case of Theorem 2.2 to \({\mathcal {H}}_n^-\) and conclude that there exists an equipartition \(S_1\cup S_2=[n-1]\) such that each of \({\mathcal {X}}_n^-, {\mathcal {Y}}_n^-\) contains at least \((1-o(1))2^\ell \) sets in \(2^{S_1}, 2^{S_2}\). Define \({\mathcal {U}}:=\{F\in {\mathcal {X}}: F\cap S_2=\emptyset \}, {\mathcal {V}}:=\{F\in {\mathcal {Y}}: F\cap S_1=\emptyset \}\). Note that \({\mathcal {U}}_n^- = {\mathcal {X}}_n^- \cap 2^{S_1}\) and \({\mathcal {V}}_n^- = {\mathcal {Y}}_n^- \cap 2^{S_2}\). We have \(|{\mathcal {U}}_n^-|, |{\mathcal {V}}_n^-|=(1-o(1))2^\ell \), implying

$$\begin{aligned} |{\mathcal {X}}_n^-\setminus {\mathcal {U}}_n^-|, |{\mathcal {Y}}_n^-\setminus {\mathcal {V}}_n^-|=o(2^\ell ). \end{aligned}$$
(8)

Now we prove that

$$\begin{aligned} |{\mathcal {X}}_n^+\setminus {\mathcal {U}}_n^+|, |{\mathcal {Y}}_n^+\setminus {\mathcal {V}}_n^+|=o(2^\ell ). \end{aligned}$$

In fact, for every \(X\in {\mathcal {X}}_n^+\setminus {\mathcal {U}}_n^+\), we have \(X\cap S_2\ne \emptyset \) by the definition of \({\mathcal {U}}\), thus \(X\cup \{n\}\) is disjoint from at most \(2^{\ell -1}\) subsets of \(S_2\). Since \(|{\mathcal {Y}}_n^-\setminus 2^{S_2}|=|{\mathcal {Y}}_n^-\setminus {\mathcal {V}}_n^-|=o(2^\ell )\), the set \(X\cup \{n\}\) is disjoint from at most \(2^{\ell -1}+o(2^\ell )\) sets in \({\mathcal {Y}}\). Similarly, for every \(Y\in {\mathcal {Y}}_n^+\setminus {\mathcal {V}}_n^+\), the set \(Y\cup \{n\}\) is disjoint from at most \(2^{\ell -1}+o(2^\ell )\) sets in \({\mathcal {X}}\). Let \(e_n\) be the number of edges \(XY\in E(G)\) such that \(n\in X\cup Y\). Since G generates all but at most \(\varepsilon 2^n\) subsets of [n], at least \(2^{2\ell }-\varepsilon 2^n=(1-2\varepsilon )2^{2\ell }\) sets containing n correspond to edges of G, which implies that

$$\begin{aligned} e_n\ge (1-2\varepsilon )2^{2\ell }=\left( 1-o(1)\right) 2^{2\ell }. \end{aligned}$$

Let \(\phi :=|{\mathcal {U}}_n^+|/|{\mathcal {X}}_n^+|\) and \(\theta :=|{\mathcal {V}}_n^+|/|{\mathcal {Y}}_n^+|\), then \(\phi ,\theta \in [0,1]\). Combining with (7), we have

$$\begin{aligned} \begin{aligned} e_n&\le |{\mathcal {U}}_n^+||{\mathcal {Y}}_n^-|+|{\mathcal {X}}_n^+\setminus {\mathcal {U}}_n^+|\left( 2^{\ell -1}+o(2^\ell )\right) +|{\mathcal {V}}_n^+||{\mathcal {X}}_n^-|+|{\mathcal {Y}}_n^+\setminus {\mathcal {V}}_n^+|\left( 2^{\ell -1}+o(2^\ell )\right) \\&=\phi (1-o(1))2^{2\ell -1}+(1-\phi )(1-o(1))2^{2\ell -2}\\&\quad +\theta (1-o(1))2^{2\ell -1}+(1-\theta )(1-o(1))2^{2\ell -2}\\&=\left( 2+\phi +\theta -o(1)\right) 2^{2\ell -2}. \end{aligned} \end{aligned}$$

Hence,

$$\begin{aligned} \left( 1-o(1)\right) 2^{2\ell }\le e_n\le \left( 2+\phi +\theta -o(1)\right) 2^{2\ell -2}, \end{aligned}$$

which implies that \(\phi ,\theta =1-o(1)\). Therefore, \(|{\mathcal {X}}_n^+\setminus {\mathcal {U}}_n^+|=o(|{\mathcal {X}}_n^+|)=o(2^{\ell })\), and \(|{\mathcal {Y}}_n^+\setminus {\mathcal {V}}_n^+|=o(|{\mathcal {Y}}_n^+|)=o(2^{\ell }),\) as desired.

Now we have

$$\begin{aligned} |{\mathcal {X}}\setminus {\mathcal {U}}|+ |{\mathcal {Y}}\setminus {\mathcal {V}}|=|{\mathcal {X}}_n^-\setminus {\mathcal {U}}_n^-|+|{\mathcal {X}}_n^+\setminus {\mathcal {U}}_n^+|+|{\mathcal {Y}}_n^-\setminus {\mathcal {V}}_n^-|+|{\mathcal {Y}}_n^+\setminus {\mathcal {V}}_n^+|=o(2^\ell ). \end{aligned}$$

For a set \(F\in {\mathcal {H}}\), notice that \(F\in ({\mathcal {X}}\setminus {\mathcal {U}})\cup ({\mathcal {Y}}\setminus {\mathcal {V}})\) if and only if \(F\in {\mathcal {X}}\) satisfies \(F\cap S_2\ne \emptyset \) or \(F\in {\mathcal {Y}}\) satisfies \(F\cap S_1\ne \emptyset \). For \(S\subseteq S_1\), if \(S\cup \{n\}\in {\mathcal {X}}\), then \(S\in {\mathcal {X}}_n^+\) by definition. Hence, there are at least \(2^{|S_1|}-|{\mathcal {X}}_n^+|=2^\ell -(1-o(1))2^{\ell -1}=(1+o(1))2^{\ell -1}\) sets \(S\subseteq S_1\) satisfying \(S\cup \{n\}\notin {\mathcal {X}}\). Similarly, there are at least \((1+o(1))2^{\ell -1}\) sets \(S'\subseteq S_2\) satisfying \(S'\cup \{n\}\notin {\mathcal {Y}}\). Thus, there are at least \((1+o(1))2^{2\ell -2}\) sets of the form \(S\cup S'\cup \{n\}\subseteq [n]\) satisfying \(S\subseteq S_1, S\cup \{n\}\notin {\mathcal {X}}\) and \(S'\subseteq S_2, S'\cup \{n\}\notin {\mathcal {Y}}\). Denote the family of sets of this form by \({\mathcal {S}}\). Since G generates all but at most \(\varepsilon 2^n\) subsets of [n], at least \((1+o(1))2^{2\ell -2}-\varepsilon 2^{2\ell +1}=(1-o(1))2^{2\ell -2}\) sets in \({\mathcal {S}}\) correspond to edges of G. If \(F\in {\mathcal {S}}\) can be expressed as a disjoint union of \(X\in {\mathcal {X}}\) and \(Y \in {\mathcal {Y}}\), then either \(X\cap S_2\ne \emptyset \) or \(Y\cap S_1\ne \emptyset \), implying that either X or Y is in \(({\mathcal {X}}\setminus {\mathcal {U}})\cup ({\mathcal {Y}}\setminus {\mathcal {V}})\). Hence, the number of choices for F is at most \(o(2^\ell )|{\mathcal {H}}|\le o(2^\ell )\cdot 3(1+\delta )2^\ell \ll (1-o(1))2^{2\ell -2}\), a contradiction. \(\square \)

Claim 2.11

\(A\cup B=[n]\).

Proof

Suppose for a contradiction that \(A\cup B\ne [n]\). We may assume without loss of generality that \(n\notin A\cup B\).

Let \(x:=|{\mathcal {X}}_n^+|/|{\mathcal {X}}|\) and \(y:=|{\mathcal {Y}}_n^+|/|{\mathcal {Y}}|\), then \(x,y<1/3\) by the definitions of A and B. Recalling that \(e_n\ge (1-2\varepsilon )2^{2\ell }\) is the number of disjoint pairs \(X\in {\mathcal {X}}, Y\in {\mathcal {Y}}\) such that \(n\in X\cup Y\), we have

$$\begin{aligned} \begin{aligned} (1-2\varepsilon )2^{2\ell }\le e_n&\le |{\mathcal {X}}_n^+||{\mathcal {Y}}_n^-|+|{\mathcal {Y}}_n^+||{\mathcal {X}}_n^-|= |{\mathcal {X}}_n^+|\left( |{\mathcal {Y}}|-|{\mathcal {Y}}_n^+|\right) +|{\mathcal {Y}}_n^+|\left( |{\mathcal {X}}|-|{\mathcal {X}}_n^+|\right) \\&= x\alpha 2^\ell (\beta 2^\ell - y\beta 2^\ell ) + y\beta 2^\ell (\alpha 2^\ell - x\alpha 2^\ell ) =(x+y-2xy)\alpha \beta 2^{2\ell }. \end{aligned}\nonumber \\ \end{aligned}$$
(9)

Define the function \(f(x,y):=x+y-2xy\). On \(0\le x,y\le 1/3\), the function f(xy) attains its maximum value 4/9, when \(x=y=1/3\). Combining with (9), we have

$$\begin{aligned} 1-2\varepsilon \le f(x,y)\alpha \beta \le \frac{4}{9}\alpha \beta . \end{aligned}$$

Recalling that \(\alpha +\beta \le 3(1+\delta )\) by (3), we then get

$$\begin{aligned} \frac{3}{2}-4\sqrt{\frac{\varepsilon }{2}}\le \alpha ,\beta \le \frac{3}{2}+4\sqrt{\frac{\varepsilon }{2}}. \end{aligned}$$

Additionally, \(\alpha \beta \le \frac{9}{4}(1+\delta )^2\) by (4), implying that \(1-2\varepsilon \le f(x,y)\alpha \beta \le \frac{9}{4}(1+\delta )^2f(x,y)\), so

$$\begin{aligned} \frac{1}{3}- 3\varepsilon \le x,y\le \frac{1}{3}. \end{aligned}$$

In summary, we have

$$\begin{aligned}{} & {} |{\mathcal {X}}|, |{\mathcal {Y}}|=(3/2-o(1))2^\ell ,\quad |{\mathcal {X}}_n^+|, |{\mathcal {Y}}_n^+|=(1-o(1))2^{\ell -1} \quad \text {and} \\{} & {} |{\mathcal {X}}_n^-|, |{\mathcal {Y}}_n^-|=(1-o(1))2^\ell , \end{aligned}$$

where \(\varepsilon =o(1)\). Therefore, Claim 2.11 follows from Lemma 2.10. \(\square \)

Claim 2.12

\(A\cap B=\emptyset \).

Proof

Suppose for a contradiction that \(A\cap B\ne \emptyset \). We may assume without loss of generality that \(n\in A\cap B\). Let \(x:=|{\mathcal {X}}_n^+|/|{\mathcal {X}}|\) and \(y:=|{\mathcal {Y}}_n^+|/|{\mathcal {Y}}|\), then \(x,y\ge 1/3\) by the definitions of A and B. Notice that

$$\begin{aligned} (2-2\varepsilon )2^{2\ell }\le e(G)\le |{\mathcal {X}}||{\mathcal {Y}}|-|{\mathcal {X}}_n^+||{\mathcal {Y}}_n^+|=(1-xy)\alpha \beta 2^{2\ell }. \end{aligned}$$

Hence,

$$\begin{aligned} 2-2\varepsilon \le (1-xy)\alpha \beta \le \frac{8}{9}\alpha \beta . \end{aligned}$$

Recalling that \(\alpha + \beta \le 3(1+\delta )\) by (3), we then get

$$\begin{aligned} \frac{3}{2}-2\sqrt{\varepsilon }\le \alpha ,\beta \le \frac{3}{2}+2\sqrt{\varepsilon }. \end{aligned}$$

Additionally, \(\alpha \beta \le \frac{9}{4}(1+\delta )^2\) by (4), implying that \(2-2\varepsilon \le (1-xy)\alpha \beta \le \frac{9}{4}(1+\delta )^2(1-xy)\), so

$$\begin{aligned} \frac{1}{3}\le x,y\le \frac{1}{3}+3\varepsilon . \end{aligned}$$

In summary, we have

$$\begin{aligned}{} & {} |{\mathcal {X}}|, |{\mathcal {Y}}|=(3/2-o(1))2^\ell ,\quad |{\mathcal {X}}_n^+|, |{\mathcal {Y}}_n^+|=(1-o(1))2^{\ell -1} \quad \text {and} \\{} & {} |{\mathcal {X}}_n^-|, |{\mathcal {Y}}_n^-|=(1-o(1))2^\ell , \end{aligned}$$

where again \(\varepsilon =o(1)\). By Lemma 2.10, we have completed the proof of Claim 2.12. \(\square \)

By Claims 2.11 and 2.12, we now know that \(A\cup B\) is a partition of [n]. It remains to show that \(A\cup B\) is in fact an equipartition of [n] and \({\mathcal {X}}, {\mathcal {Y}}\) are close to \(2^A,2^B\), respectively. The following observation is simple but will be useful hereafter.

Observation 2.13

If \(F\in {\mathcal {X}}\setminus 2^A\), then F has at most \(2|{\mathcal {Y}}|/3\) neighbors in \({\mathcal {Y}}\). Similarly, if \(F\in {\mathcal {Y}}\setminus 2^B\), then F has at most \(2|{\mathcal {X}}|/3\) neighbors in \({\mathcal {X}}\).

Proof

By symmetry, it suffices to prove the first part. Suppose \(F\in {\mathcal {X}}\setminus 2^A\), then there exists \(i\in [n]\) such that \(i\in F\cap B\). By the definition of \(B={\mathcal {Y}}_{1/3}\) (see (6)), there are at least \(|{\mathcal {Y}}|/3\) sets in \({\mathcal {Y}}\) containing i, which therefore have non-empty intersection with F. By the definition of G, we conclude that F has at most \(|{\mathcal {Y}}|-|{\mathcal {Y}}|/3=2|{\mathcal {Y}}|/3\) neighbors in \({\mathcal {Y}}\), as desired. \(\square \)

Claim 2.14

We have \(|{\mathcal {X}}\cap 2^A|\ge (2/3-3\varepsilon )|{\mathcal {X}}|\) and \(|{\mathcal {Y}}\cap 2^B|\ge (2/3-3\varepsilon )|{\mathcal {Y}}|\). Additionally, \([n]=A\cup B\) is an equipartition.

Proof

Let \(\theta :=\frac{|{\mathcal {X}}\cap 2^A|}{|{\mathcal {X}}|}\) and \(\phi :=\frac{|{\mathcal {Y}}\cap 2^B|}{|{\mathcal {Y}}|}\). By Observation 2.13 and (4), we have

$$\begin{aligned} (2-2\varepsilon )2^{2\ell }\le & {} e(G)\le |{\mathcal {X}}\cap 2^A|\cdot |{\mathcal {Y}}|+|{\mathcal {X}}\setminus 2^A|\cdot \frac{2}{3}|{\mathcal {Y}}|=\frac{2+\theta }{3}\alpha \beta 2^{2\ell }\\\le & {} \frac{2+\theta }{3}\cdot \frac{9}{4}(1+\delta )^2 2^{2\ell }. \end{aligned}$$

Hence \(\frac{2+\theta }{3}\cdot \frac{9}{4}(1+\delta )^2\ge 2-2\varepsilon \), which implies that \(\theta \ge 2/3-3\varepsilon \), as desired. Similarly, we can prove \(\phi \ge 2/3-3\varepsilon \).

If \(|A|\le \ell -1\), then

$$\begin{aligned} |{\mathcal {X}}|=\frac{|{\mathcal {X}}\cap 2^A|}{\theta }\le \frac{|2^A|}{\theta }\le \frac{2^{\ell -1}}{2/3-3\varepsilon }<(1-2\varepsilon )2^\ell , \end{aligned}$$

a contradiction to (5), so we have \(|A|\ge \ell \). Similarly, we have \(|B|\ge \ell \). Therefore, \([n]=A\cup B\) is an equipartition. \(\square \)

According to Claim 2.14, we can assume from now on that \(|A|=\ell \) and \(|B|=\ell +1\). We claim that

$$\begin{aligned} \alpha \le \frac{3}{2}+8\varepsilon ,\quad \beta \ge \frac{3}{2}-7\varepsilon , \end{aligned}$$
(10)

which are better bounds for \(\alpha , \beta \) than (5). Notice that we only need to show \(\beta \ge 3/2-7\varepsilon \), as \(\alpha +\beta \le 3(1+\delta )\) will then imply \(\alpha \le 3(1+\delta )-\beta \le 3/2+8\varepsilon \). Suppose that \(\beta =3/2-\gamma \) with some \(\gamma \ge 0\), then \(\alpha \le 3/2+3\delta +\gamma \) since \(\alpha +\beta \le 3(1+\delta )\) by (3). By (2) and Observation 2.13, we have

$$\begin{aligned} \begin{aligned}&(2-2\varepsilon )2^{2\ell }\le e(G)\le |{\mathcal {X}}\cap 2^A||{\mathcal {Y}}|+|{\mathcal {X}}\setminus 2^A| \cdot \frac{2}{3}|{\mathcal {Y}}|= |{\mathcal {X}}\cap 2^A||{\mathcal {Y}}|\\&\quad +\left( |{\mathcal {X}}|-|{\mathcal {X}}\cap 2^A|\right) \cdot \frac{2}{3}|{\mathcal {Y}}|\\&=\left( \frac{1}{3}|{\mathcal {X}}\cap 2^A|+\frac{2}{3}|{\mathcal {X}}|\right) |{\mathcal {Y}}| \le \left( \frac{1}{3}\cdot 2^\ell +\frac{2}{3}\right. \\&\quad \left. \left( \frac{3}{2}+3\delta +\gamma \right) 2^\ell \right) \left( \frac{3}{2}-\gamma \right) 2^\ell \le \left( 2+3\delta -\frac{1}{3}\gamma \right) 2^{2\ell }, \end{aligned} \end{aligned}$$

which implies that \(2-2\varepsilon \le 2+3\delta -\gamma /3\). Therefore, \(\gamma \le 7\varepsilon \).

The final three claims show that \({\mathcal {X}}\) and \({\mathcal {Y}}\) are not too far from \(2^A\) and \(2^B\), respectively.

Claim 2.15

  1. (i)

    \(|2^A\setminus {\mathcal {X}}|\le 24\varepsilon 2^\ell \).

  2. (ii)

    \(|{\mathcal {Y}}\setminus 2^B|\le (\sqrt{\varepsilon }+3\varepsilon )2^\ell \).

Proof

Let

$$\begin{aligned} D:=\min \{|2^B\setminus {\mathcal {Y}}|, |{\mathcal {Y}}\setminus 2^B|\}, \end{aligned}$$

then \(D\le |{\mathcal {Y}}\setminus 2^B|\le (1/3+3\varepsilon )|{\mathcal {Y}}|<2|{\mathcal {Y}}|/3\) by Claim 2.14. Define \({\mathcal {Y}}'\) from \({\mathcal {Y}}\) by adding D sets in \(2^B\setminus {\mathcal {Y}}\) and deleting D sets in \({\mathcal {Y}}\setminus 2^B\). Thus, \(|{\mathcal {Y}}'|=|{\mathcal {Y}}|\le (2+3\varepsilon )2^\ell \) by (5). Note that \({\mathcal {X}}\) and \({\mathcal {Y}}'\) are not necessarily disjoint. Let \(G_1 = G_{{\mathcal {X}},{\mathcal {Y}}'}\). If \(D=|2^B\setminus {\mathcal {Y}}|\le |{\mathcal {Y}}\setminus 2^B|\), then \(2^B\subseteq {\mathcal {Y}}'\) and \(|{\mathcal {Y}}'\setminus 2^B|=|{\mathcal {Y}}'|-2^{\ell +1}\le 2\varepsilon 2^{\ell +1}\); if \(D=|{\mathcal {Y}}\setminus 2^B|\le |2^B\setminus {\mathcal {Y}}|\), then \({\mathcal {Y}}'\subseteq 2^B\) and \(|{\mathcal {Y}}'\setminus 2^B|=0\). In both cases, we have \(|{\mathcal {Y}}'\cap 2^B|\ge |{\mathcal {Y}}\cap 2^B|\) and

$$\begin{aligned} |{\mathcal {Y}}'\setminus 2^B|\le 2\varepsilon 2^{\ell +1}. \end{aligned}$$
(11)

We now compare \(e(G_1)\) and e(G). Every deleted \(Y\in {\mathcal {Y}}\setminus 2^B\) has at most \(2|{\mathcal {X}}|/3\) neighbors in \({\mathcal {X}}\) by Observation 2.13. On the other hand, every added \(S\in 2^B\setminus {\mathcal {Y}}\) is disjoint from every set in \(2^A\cap {\mathcal {X}}\), thus has at least \(|2^A\cap {\mathcal {X}}|\ge (2/3-3\varepsilon )|{\mathcal {X}}|\) neighbors in \({\mathcal {X}}\) by Claim 2.14. Therefore, by (3),

$$\begin{aligned} e(G)-e(G_1)\le{} & {} D\left( \frac{2}{3}|{\mathcal {X}}|-\left( \frac{2}{3}-3\varepsilon \right) |{\mathcal {X}}|\right) \le \frac{2}{3} |{\mathcal {Y}}|\cdot 3 \varepsilon |{\mathcal {X}}|\\{} & {} \quad =2\varepsilon \alpha \beta 2^{2\ell }\le 2\varepsilon \cdot \frac{9}{4}(1+\delta )^2 2^{2\ell }\le 3\varepsilon 2^{2\ell +1}, \end{aligned}$$

which, with (2), implies that

$$\begin{aligned} e(G_1)\ge e(G)-3\varepsilon 2^{2\ell +1}\ge (1-\varepsilon )2^{2\ell +1}-3\varepsilon 2^{2\ell +1}=(1-4\varepsilon )2^{2\ell +1}. \end{aligned}$$

Similarly, we define another bipartite graph \(G_2\). Let

$$\begin{aligned} C:=\min \{|2^A\setminus {\mathcal {X}}|, |{\mathcal {X}}\setminus 2^A|\}. \end{aligned}$$

Define \({\mathcal {X}}'\) from \({\mathcal {X}}\) by adding C sets in \(2^A\setminus {\mathcal {X}}\) and deleting C sets in \({\mathcal {X}}\setminus 2^A\). Thus, \(|{\mathcal {X}}'|=|{\mathcal {X}}|\ge (1-3\varepsilon )2^\ell \) by (5). Note that \({\mathcal {X}}'\) and \({\mathcal {Y}}'\) are not necessarily disjoint. Let \(G_2 = G_{{\mathcal {X}}', {\mathcal {Y}}'}\). If \(C=|2^A\setminus {\mathcal {X}}|\le |{\mathcal {X}}\setminus 2^A|\), then \(2^A\subseteq {\mathcal {X}}'\) and \(|{\mathcal {X}}'\cap 2^A|=|2^A|=2^\ell \); if \(C=|{\mathcal {X}}\setminus 2^A|\le |2^A\setminus {\mathcal {X}}|\), then \({\mathcal {X}}'\subseteq 2^A\) and \(|{\mathcal {X}}'\cap 2^A|=|{\mathcal {X}}'|\ge (1-3\varepsilon )2^\ell \). In both cases, we have \(|{\mathcal {X}}'\cap 2^A|\ge (1-3\varepsilon )2^\ell \). We now compare \(e(G_2)\) and \(e(G_1)\). Every deleted \(X\in {\mathcal {X}}\setminus 2^A\) intersects B, thus has at most \(2^\ell \) neighbors in \(2^B\). By (11), X has at most \(2\varepsilon 2^{\ell +1}\) neighbors in \({\mathcal {Y}}'\setminus 2^B\), thus has at most \((1+4\varepsilon )2^\ell \) in \({\mathcal {Y}}'\). On the other hand, every added \(S\in 2^A\setminus {\mathcal {X}}\) is disjoint from every set in \(2^B\cap {\mathcal {Y}}'\), thus has at least

$$\begin{aligned} |2^B\cap {\mathcal {Y}}'|=|{\mathcal {Y}}'|-|{\mathcal {Y}}'\setminus 2^B|\ge (3/2-7\varepsilon )2^\ell -(2\varepsilon )2^{\ell +1}=(3/2-11\varepsilon )2^\ell \end{aligned}$$

neighbors by (10) and (11). Therefore,

$$\begin{aligned} e(G_2)\ge e(G_1)+C\left( \frac{3}{2}-11\varepsilon - (1+4\varepsilon )\right) 2^\ell \ge (1-4\varepsilon )2^{2\ell +1} + C\left( \frac{1}{2}-15\varepsilon \right) 2^\ell . \end{aligned}$$
(12)

If \(|{\mathcal {X}}'|\le 2^\ell \), then by (3), we have that \(e(G_2)\le |{\mathcal {X}}'||{\mathcal {Y}}'|\le |{\mathcal {X}}'|(3(1+\delta )2^\ell -|{\mathcal {X}}'|)\) attains its maximum value \((1+\frac{3}{2}\delta )2^{2\ell +1}\) when \(|{\mathcal {X}}'|=2^\ell \). If \(|{\mathcal {X}}'| > 2^\ell \), then let \(|{\mathcal {X}}'|=(1+\gamma )2^\ell \) for some \(\gamma > 0\). Since \(|{\mathcal {X}}'|+|{\mathcal {Y}}'| = |{\mathcal {X}}| + |{\mathcal {Y}}| \le 3(1+\delta ) 2^\ell \) by (3), we have \(|{\mathcal {Y}}'|\le (2+3\delta -\gamma )2^\ell \). Recalling \(|{\mathcal {Y}}'| = |{\mathcal {Y}}| \ge (3/2 - 7\varepsilon ) 2^{\ell }\) by (10), we have \(\gamma \le 1/2+7\varepsilon +3\delta \). By the definition of \({\mathcal {X}}'\), we have \({\mathcal {X}}'\supseteq 2^A\), so \(|{\mathcal {X}}' \setminus 2^A| = |{\mathcal {X}}'| - |2^A| = \gamma 2^\ell \). Every \(X\in {\mathcal {X}}'\setminus 2^A\) intersects B, thus has at most \(2^\ell \) neighbors in \({\mathcal {Y}}'\cap 2^B\), so it has at most \((1+4\varepsilon )2^\ell \) neighbors in \({\mathcal {Y}}'\) by (11). Therefore,

$$\begin{aligned} \begin{aligned} e(G_2)&\le |{\mathcal {X}}'\setminus 2^A|(1+4\varepsilon )2^\ell +|{\mathcal {X}}'\cap 2^A||{\mathcal {Y}}'| \le \gamma 2^\ell (1+4\varepsilon )2^\ell +2^\ell (2+3\delta -\gamma )2^\ell \\&=\left( 1+2\varepsilon \gamma +\frac{3}{2}\delta \right) 2^{2\ell +1}\le (1+1.01\varepsilon )2^{2\ell +1}. \end{aligned} \end{aligned}$$

In both cases, we have

$$\begin{aligned} e(G_2)\le (1+1.01\varepsilon )2^{2\ell +1}. \end{aligned}$$
(13)

Combining (2), (12) and (13), we conclude that

$$\begin{aligned} e(G_1)-e(G)\le e(G_2)-e(G)\le (1+1.01\varepsilon )2^{2\ell +1}-(1-\varepsilon )2^{2\ell +1} \le 3\varepsilon 2^{2\ell +1}.\nonumber \\ \end{aligned}$$
(14)

Now we prove (i). By (12) and (13), we have

$$\begin{aligned} C\le \frac{5.01\varepsilon }{1/2-15\varepsilon }2^{\ell +1} \le 21\varepsilon 2^\ell , \end{aligned}$$
(15)

so if \(C=|2^A\setminus {\mathcal {X}}|\), then we are done. Assume \(C=|{\mathcal {X}}\setminus 2^A|\). Then, by (5) and (15), we have

$$\begin{aligned} |2^A\setminus {\mathcal {X}}|= & {} |2^A|-|2^A\cap {\mathcal {X}}|=2^{|A|}-(|{\mathcal {X}}|-|{\mathcal {X}}\setminus 2^A|)\\\le & {} 2^\ell -(1-3\varepsilon )2^\ell +21\varepsilon 2^\ell =24\varepsilon 2^\ell , \end{aligned}$$

as desired.

For (ii), we show that it suffices to prove

$$\begin{aligned} D\le \sqrt{\varepsilon }2^\ell . \end{aligned}$$
(16)

Indeed, if \(D=|{\mathcal {Y}}\setminus 2^B|\), then we are done. Assume \(D=|2^B\setminus {\mathcal {Y}}|\). Then, by (5), we have

$$\begin{aligned}{} & {} |{\mathcal {Y}}\setminus 2^B|=|{\mathcal {Y}}|-|2^B\cap {\mathcal {Y}}|=\\{} & {} \qquad |{\mathcal {Y}}|-(|2^B|-|2^B\setminus {\mathcal {Y}}|)\le (2+3\varepsilon )2^\ell -2^{2\ell +1}+\sqrt{\varepsilon }2^\ell =\left( \sqrt{\varepsilon }+ 3\varepsilon \right) 2^\ell , \end{aligned}$$

as desired.

Now suppose for a contradiction that \(D> \sqrt{\varepsilon }2^\ell \). We claim that there exists some \(Y\in {\mathcal {Y}}\setminus 2^B\) having at least \(2|{\mathcal {X}}|/3-8\sqrt{\varepsilon }2^\ell \) neighbors in \({\mathcal {X}}\). Otherwise, recall that every \(S\in {\mathcal {Y}}'\setminus {\mathcal {Y}}\subseteq 2^B\setminus {\mathcal {Y}}\) has at least \((2/3-3\varepsilon )|{\mathcal {X}}|\) neighbors in \({\mathcal {X}}\) and \(|{\mathcal {X}}|\le (3/2+8\varepsilon )2^\ell \) by (10). Then,

$$\begin{aligned}{} & {} e(G_1)-e(G) \ge D\left( \left( \frac{2}{3}-3 \varepsilon \right) |{\mathcal {X}}|-\left( \frac{2|{\mathcal {X}}|}{3}-8\sqrt{\varepsilon }2^\ell \right) \right) \\{} & {} \qquad \ge \sqrt{\varepsilon }\left( 8\sqrt{\varepsilon }-\frac{9}{2}\varepsilon -24\varepsilon ^2\right) 2^{2\ell }>3\varepsilon 2^{2\ell +1}, \end{aligned}$$

a contradiction to (14). Fix some \(Y\in {\mathcal {Y}}\setminus 2^B\) having at least \(2|{\mathcal {X}}|/3-8\sqrt{\varepsilon }2^\ell \) neighbors in \({\mathcal {X}}\). Note that there exists \(i\in [n]\) such that \(i\in Y\cap A\). We may assume without loss of generality that \(i=n\). By the definition of G, at most \(|{\mathcal {X}}|-(2|{\mathcal {X}}|/3-8\sqrt{\varepsilon }2^\ell )=(1/3+o(1))|{\mathcal {X}}|\) sets in \({\mathcal {X}}\) contain n, i.e., \(|{\mathcal {X}}_n^+|/|{\mathcal {X}}| \le 1/3+o(1)\). On the other hand, since \(|2^A\setminus {\mathcal {X}}|\le 24\varepsilon 2^\ell \) by (i), at least \(2^{\ell -1}-24\varepsilon 2^\ell =(1-o(1))2^{\ell -1}\) subsets of A containing n are contained in \({\mathcal {X}}\), so \(|{\mathcal {X}}_n^+|\ge (1-o(1))2^{\ell -1}\) and hence \(|{\mathcal {X}}| \ge (3/2-o(1))2^\ell \). Recalling that \(|{\mathcal {X}}| \le (3/2 + o(1))2^\ell \) by (10), we have \(|{\mathcal {X}}|=(3/2-o(1))2^\ell \), which then implies that \(|{\mathcal {X}}_n^+|=(1-o(1))2^{\ell -1}\) and \(|{\mathcal {Y}}|=(3/2-o(1))2^\ell \). By the definition of B, we have \(|{\mathcal {Y}}_n^+|/|{\mathcal {Y}}| \le 1/3\), since \(n\in A\). Recall (9), where we have now \(x,y\le 1/3 + o(1)\) and \(\alpha ,\beta \le 3/2 + o(1)\). We get \(y = 1/3 + o(1)\) and hence

\(|{\mathcal {Y}}_n^+|=(1-o(1))2^{\ell -1}\). Therefore, \(|{\mathcal {X}}_n^-|,|{\mathcal {Y}}_n^-|=(1-o(1))2^\ell \). Again, by Lemma 2.10, we obtain a contradiction and complete the proof of Claim 2.15. \(\square \)

Claim 2.16

\(|2^B\setminus {\mathcal {Y}}|\le 6\sqrt{\varepsilon }2^\ell \).

Proof

For every \(e\in E(G)\), call e bad if e has an endpoint in \({\mathcal {Y}}\setminus 2^B\) and good otherwise. By Claim 2.15 (ii) and (10), G has at most

$$\begin{aligned} |{\mathcal {Y}}\setminus 2^B||{\mathcal {X}}|\le \left( \sqrt{\varepsilon }+3\varepsilon \right) 2^\ell \cdot \left( 3/2+8\varepsilon \right) 2^\ell \le 2\sqrt{\varepsilon }2^{2\ell } \end{aligned}$$
(17)

bad edges.

Fix \(S'\in 2^B\setminus {\mathcal {Y}}\) and choose a set \(F\subseteq [n]\) of the form \(F=S\cup S'\), where \(S\subseteq A\).

If F corresponds to a good edge of G, assume that \(F = X\cup Y\) with \(X\in {\mathcal {X}}, Y\in {\mathcal {Y}}\cap 2^B, X \cap Y=\emptyset \). By the assumption that \(S' \in 2^B \setminus {\mathcal {Y}}\) and \(Y \in {\mathcal {Y}}\cap 2^B\), we have \(Y \ne S'\) and hence \(Y \subsetneq S'\). Hence, \(X \cap B \ne \emptyset \), so \(X \in {\mathcal {X}}\setminus 2^A\). Furthermore, \(X \cap A = S\),

which implies that S is determined by X.

By (10) and Claim 2.15 (i), we have

$$\begin{aligned}{} & {} |{\mathcal {X}}\setminus 2^A|=|{\mathcal {X}}|-|{\mathcal {X}}\cap 2^A|\\{} & {} \quad =|{\mathcal {X}}| - (|2^A| - |2^A \setminus {\mathcal {X}}|) \le (3/2+8\varepsilon )2^\ell -(1-24\varepsilon ) 2^\ell =(1/2+32\varepsilon )2^\ell . \end{aligned}$$

Hence, for every fixed \(S'\in 2^B\setminus {\mathcal {Y}}\), there are at most \((1/2+32\varepsilon )2^\ell \) sets F of the form \(F=S\cup S'\), where \(S\subseteq A\), corresponding to good edges of G. There are \(2^\ell |2^B\setminus {\mathcal {Y}}|\) sets of the form \(S\cup S'\) with \(S\subseteq A, S'\in 2^B\setminus {\mathcal {Y}}\) in total, so there are at least \(|2^B\setminus {\mathcal {Y}}|(2^\ell -(1/2+32\varepsilon )2^\ell )\) sets of the form \(S\cup S'\) with \(S\subseteq A, S'\in 2^B\setminus {\mathcal {Y}}\) that correspond to bad edges of G or do not correspond to edges of G. Recalling that \({\mathcal {H}}\) is a \((1-\varepsilon )\)-2-generator for [n] and using (17), we have

$$\begin{aligned} |2^B\setminus {\mathcal {Y}}|\left( 2^\ell -\left( 1/2+32\varepsilon \right) 2^\ell \right) \le 2\sqrt{\varepsilon }2^{2\ell }+\varepsilon 2^n=\left( 2\sqrt{\varepsilon }+2\varepsilon \right) 2^{2\ell }. \end{aligned}$$

Therefore, we get \(|2^B\setminus {\mathcal {Y}}|\le 6\sqrt{\varepsilon }2^\ell \). \(\square \)

Claim 2.17

\(|{\mathcal {X}}\setminus 2^A| \le 7\sqrt{\varepsilon } 2^\ell \).

Proof

By Claims 2.15 (i) and 2.16, we have

$$\begin{aligned} 3(1+\delta )2^\ell \ge |{\mathcal {H}}|&\quad = |{\mathcal {X}}| + |{\mathcal {Y}}|\\&\quad =|{\mathcal {X}}\setminus 2^A| + \left( |2^A| - |2^A \setminus {\mathcal {X}}| \right) + |{\mathcal {Y}}\setminus 2^B| {+} \left( |2^B| {-} |2^B \setminus {\mathcal {Y}}| \right) \\&\quad \ge |{\mathcal {X}}\setminus 2^A| + (1-24\varepsilon )2^\ell + (2 - 6\sqrt{\varepsilon })2^\ell =\\&\qquad |{\mathcal {X}}\setminus 2^A| +3\cdot 2^\ell -(6\sqrt{\varepsilon } + 24\varepsilon )2^\ell , \end{aligned}$$

implying \(|{\mathcal {X}}\setminus 2^A| \le 7\sqrt{\varepsilon } 2^\ell \). \(\square \)

By Claim 2.14, \([n] = A \cup B\) is an equipartition. By Claims 2.152.16 and 2.17, we have

$$\begin{aligned} |{\mathcal {H}}\Delta (2^A \cup 2^B)| \le (14\sqrt{\varepsilon } + 27\varepsilon )2^\ell \le \varepsilon ' 2^\ell , \end{aligned}$$

completing the proof of Theorem 2.2.

3 Proof of Theorem 1.1

In this section, we will employ a perturbation argument to deduce Theorem 1.1 from Theorem 1.2. Let n be a sufficiently large integer and \({\mathcal {F}}\) be a maximal 3-wise intersecting family on [n] of size at most \(2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil }-3\). Let \({\mathcal {H}}:={\bar{{\mathcal {F}}}}=\{F^c: F\in {\mathcal {F}}\}\) and fix some \(\varepsilon \in (0,1/4)\). By Theorem 1.2, there exists \(S\subseteq [n]\) of size \(\lfloor n/2 \rfloor \) such that \({\mathcal {F}}_0:=\{A: A\subseteq S\}\cup \{B: B\subseteq S^c\}\) satisfies \(|{\mathcal {H}}\Delta {\mathcal {F}}_0|\le \varepsilon 2^{\lfloor n/2 \rfloor }\).

Recall that \({\mathcal {H}}\) is downward closed since \({\mathcal {F}}\) is upward closed, so \(\emptyset \in {\mathcal {H}}\). We first prove that \(S\notin {\mathcal {H}}={\bar{{\mathcal {F}}}}\). Suppose for a contradiction that \(S^c\in {\mathcal {F}}\). Since \(|{\mathcal {H}}\Delta {\mathcal {F}}_0|\le \varepsilon 2^{\lfloor n/2 \rfloor }\), among the \(2^{\lceil n/2 \rceil }\) subsets of \(S^c\), there exists some \(A\subseteq S^c\) such that both A and \(S^c\setminus A\) are contained in \({\mathcal {H}}\). However, this would imply that \(S^c, A^c, (S^c\setminus A)^c=S\cup A\) are in \({\mathcal {F}}\). As \(S^c\cap A^c\cap (S\cup A)=\emptyset \), this contradicts that \({\mathcal {F}}\) is a 3-wise intersecting family. It can be proved similarly that \(S^c\notin {\mathcal {H}}\).

We work with the following partition \({\mathcal {H}}={\mathcal {H}}_1\cup {\mathcal {H}}_2\cup {\mathcal {H}}_3\cup \{\emptyset \}\), where

$$\begin{aligned} {\mathcal {H}}_1:=\{A\in {\mathcal {H}}:\emptyset \subsetneq A\subsetneq S\}, \quad {\mathcal {H}}_2:=\{B\in {\mathcal {H}}:\emptyset \subsetneq B\subsetneq S^c\}, \quad {\mathcal {H}}_3:={\mathcal {H}}\setminus {\mathcal {F}}_0. \end{aligned}$$

Claim 3.1

\(|2^{[n]}\setminus ({\mathcal {F}}\cup {\mathcal {F}}_0)|\le |{\mathcal {H}}_1||{\mathcal {H}}_2|+|{\mathcal {H}}_3|\cdot \varepsilon 2^{\lfloor n/2 \rfloor }\).

Proof

If \(A\notin {\mathcal {F}}\), then as stated in the proof of Lemma 2.1, there exist \(B,C\in {\mathcal {H}}\) such that \(A=B\cup C\) with \(B\cap C=\emptyset \). There exists a pair \(\{B,C\}\) such that

$$\begin{aligned} g(\{B,C\}):=\min \{|B\setminus S|+|C\setminus S^c|, |B\setminus S^c|+|C\setminus S|\} \end{aligned}$$

attains its minimum value over all such choices of B and C. Therefore, we can define an injection f on \(2^{[n]}\setminus ({\mathcal {F}}\cup 2^S\cup 2^{S^c})\) by mapping A to a pair of sets \(\{B,C\}\) in \({\mathcal {H}}\), which has minimum \(g(\{B,C\})\) given \(A=B\cup C\) and \(B\cap C=\emptyset \). Since \(\emptyset \in {\mathcal {H}}\), \(S,S^c\notin {\mathcal {H}}\) and \({\mathcal {H}}\) is downward closed, we have that \(f(A)=\{B,C\}\) must be one of the following two types: \(|\{B,C\}\cap {\mathcal {H}}_1|=|\{B,C\}\cap {\mathcal {H}}_2|=1\); \(|\{B,C\}\cap {\mathcal {H}}_3|\ge 1\). The number of pairs \(\{B,C\}\) satisfying \(|\{B,C\}\cap {\mathcal {H}}_1|=|\{B,C\}\cap {\mathcal {H}}_2|=1\) is at most \(|{\mathcal {H}}_1||{\mathcal {H}}_2|\), so it suffices to prove that if \(f(A)=\{B,C\}\) with \(B\in {\mathcal {H}}_3\), then there are at most \(\varepsilon 2^{\lfloor n/2 \rfloor }\) choices for C.

Suppose \(B=X\cup Y\), where \(\emptyset \ne X\subsetneq S, \emptyset \ne Y\subsetneq S^c\). Then, \(g(\{B,C\})>0\) and \(Y\in {\mathcal {H}}\), since \(Y\subseteq B\in {\mathcal {H}}\) and \({\mathcal {H}}\) is downward closed. There are three possibilities for C.

  1. (1)

    \(C\subseteq S\): Define \(B' :=X\cup C, C':=Y\), then \(B'\cup C'=B\cup C=A\). Since \(g(\{B',C'\})=0<g(\{B,C\})\) and \(C'=Y\in {\mathcal {H}}\), we have that \(B'\notin {\mathcal {H}}\) by the definition of f. Note that \(B'=X\cup C\) is determined by C, so the number of choices for \(C\subseteq S\) is at most \(|2^S\setminus {\mathcal {H}}|\).

  2. (2)

    \(C\subseteq S^c\): Similarly, the number of choices for C is at most \(|2^{S^c}\setminus {\mathcal {H}}|\).

  3. (3)

    \(C\in {\mathcal {H}}\setminus {\mathcal {F}}_0\): The number of choices for such C is at most \(|{\mathcal {H}}\setminus {\mathcal {F}}_0|\).

In summary, the number of choices for C is at most

$$\begin{aligned} |2^S\setminus {\mathcal {H}}|+|2^{S^c}\setminus {\mathcal {H}}|+|{\mathcal {H}}\setminus {\mathcal {F}}_0|=|{\mathcal {H}}\Delta {\mathcal {F}}_0|\le \varepsilon 2^{\lfloor n/2 \rfloor }, \end{aligned}$$

as desired. \(\square \)

Claim 3.2

\(|{\mathcal {H}}_1||{\mathcal {H}}_2|+|{\mathcal {H}}_3|\cdot \varepsilon 2^{\lfloor n/2 \rfloor } \le (2^{\lfloor n/2 \rfloor }-2)(2^{\lceil n/2 \rceil }-2)\). Equality holds if and only if \({\mathcal {H}}_1=\{A:\emptyset \subsetneq A\subsetneq S\}\), \({\mathcal {H}}_2=\{B:\emptyset \subsetneq B\subsetneq S^c\}\) and \({\mathcal {H}}_3=\emptyset \).

Proof

Define the function \(\varphi (x_1,x_2,x_3) :=x_1x_2+x_3 \cdot \varepsilon 2^{\lfloor n/2 \rfloor }\) on the domain

$$\begin{aligned}{} & {} D :=\{(x_1,x_2,x_3) \in {\mathbb {R}}^3: (1-\varepsilon )2^{\lfloor n/2 \rfloor } \le x_1 \le 2^{\lfloor n/2 \rfloor }-2,\,(1-\varepsilon )2^{\lfloor n/2 \rfloor } \le x_2 \le 2^{\lceil n/2 \rceil }\\{} & {} \quad -2,\, x_3 \ge 0,\, x_1+x_2+x_3 \le 2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil }-4 \}. \end{aligned}$$

Let \((x_1,x_2,x_3)\) be a point in D, which achieves the maximum value of \(\varphi (x_1,x_2,x_3)\) over D. If \(x_3 > 0\), then we have \(2^{\lfloor n/2 \rfloor }-2 - x_1 \ge x_3 /2\), in which case we let \((x'_1,x'_2,x'_3) = (x_1 + x_3/2, x_2, x_3/2)\), or \(2^{\lceil n/2 \rceil }-2 - x_2 \ge x_3 /2\), in which case we let \((x'_1,x'_2,x'_3) = (x_1, x_2+x_3/2, x_3/2)\). Then we have \((x'_1,x'_2,x'_3)\in D\), and

$$\begin{aligned} \varphi (x'_1,x'_2,x'_3) - \varphi (x_1,x_2,x_3) \ge \frac{x_3}{2} \cdot \left( (1-\varepsilon )2^{\lfloor n/2 \rfloor } - \varepsilon \cdot 2^{\lfloor n/2 \rfloor } \right) > 0, \end{aligned}$$

a contradiction. Hence, \(x_3 = 0\). Therefore, the maximum value of \(\varphi (x_1,x_2,x_3)\) over D is \((2^{\lfloor n/2 \rfloor }-2)(2^{\lceil n/2 \rceil }-2)\), which is achieved only when \((x_1,x_2,x_3) = (2^{\lfloor n/2 \rfloor }-2, 2^{\lceil n/2 \rceil }-2, 0)\).

Note that \(\min \{|{\mathcal {H}}_1|,|{\mathcal {H}}_2|\}\ge 2^{\lfloor n/2 \rfloor }-|{\mathcal {F}}_0\setminus {\mathcal {H}}|\ge 2^{\lfloor n/2 \rfloor }-|{\mathcal {F}}_0\Delta {\mathcal {H}}|\ge (1-\varepsilon )2^{\lfloor n/2 \rfloor }\) and \(|{\mathcal {H}}_1| + |{\mathcal {H}}_2| + |{\mathcal {H}}_3| = |{\mathcal {H}}| -1 \le 2^{\lfloor n/2\rfloor }+2^{\lceil n/2 \rceil }-4\), thus \((|{\mathcal {H}}_1|,|{\mathcal {H}}_2|,|{\mathcal {H}}_3|) \in D\). Therefore, we have

$$\begin{aligned} |{\mathcal {H}}_1||{\mathcal {H}}_2|+|{\mathcal {H}}_3|\cdot \varepsilon 2^{\lfloor n/2 \rfloor }\le (2^{\lfloor n/2 \rfloor }-2)(2^{\lceil n/2 \rceil }-2), \end{aligned}$$

where equality holds if and only if \({\mathcal {H}}_1=\{A:\emptyset \subsetneq A\subsetneq S\}\), \({\mathcal {H}}_2=\{B:\emptyset \subsetneq B\subsetneq S^c\}\) and \({\mathcal {H}}_3=\emptyset \). \(\square \)

By Claims 3.1 and 3.2, we have

$$\begin{aligned}{} & {} 2^{n}-(2^{\lfloor n/2 \rfloor } + 2^{{\lceil n/2 \rceil }}-1)\\{} & {} \quad -|{\mathcal {F}}| \le |2^{[n]}\setminus ({\mathcal {F}}\cup {\mathcal {F}}_0)| \le (2^{\lfloor n/2 \rfloor }-2)(2^{\lceil n/2 \rceil }-2)=2^{n}-2\cdot 2^{\lfloor n/2 \rfloor } - 2 \cdot 2^{\lceil n/2 \rceil }+4, \end{aligned}$$

which implies that \(|{\mathcal {F}}|\ge 2^{\lfloor n/2 \rfloor } + 2^{\lceil n/2 \rceil }-3\). By our assumption that \(|{\mathcal {F}}|\le 2^{\lfloor n/2 \rfloor } + 2^{\lceil n/2 \rceil }-3\), equality has to hold in Claim 3.2. Hence, we have \({\mathcal {H}}={\mathcal {F}}_0 \setminus \{S, S^c\}\), which means that \({\mathcal {F}}\) is a balanced pair of linked cubes. This completes our proof of Theorem 1.1.

4 The Case When \(k\ge 4\)

Denote by f(nk) the minimum possible size of a maximal k-wise intersecting family on [n]. For the case when \(k\ge 4\), we remark that our method also gives the following general bounds for f(nk).

Proposition 4.1

For every \(k\ge 4\) there exist positive constants \(c_k\) and \(d_k\) such that for every positive integer n, we have

$$\begin{aligned} c_k\cdot 2^{n/(k-1)}\le f(n,k)\le d_k\cdot 2^{n/\lceil k/2\rceil }. \end{aligned}$$

Proof

The proof of Lemma 2.1 easily generalizes to give a lower bound for f(nk).

Indeed, there is an injective map from \({\mathcal {F}}^c\) to \({\mathcal {F}}^{k-1}\), such that each set \(A \in {\mathcal {F}}^c\) is sent to a \((k-1)\)-tuple \((B_1, B_2, \ldots , B_{k-1})\) of sets in \({\mathcal {F}}\) with \(B_1 \cap B_2 \cap \cdots \cap B_{k-1} = A^c\).

This observation immediately leads to the lower bound.

The linked cubes construction also generalizes. In more detail, suppose that \(\ell \) divides n, and let \(S_1, S_2, \ldots , S_\ell \) be a partition of [n] such that \(|S_i| = n/\ell \) for each i. Let \({\mathcal {F}}_i = \{A:[n]\setminus S_i \subsetneq A \subseteq [n]\}\), and let \({\mathcal {F}}= {\mathcal {F}}_1 \cup {\mathcal {F}}_2 \cup \cdots \cup {\mathcal {F}}_\ell \). For any set \({\mathcal {G}}\) of \(2\ell -1\) elements of \({\mathcal {F}}\), there is an i such that \(|{\mathcal {G}}\cap {\mathcal {F}}_i| \le 1\). The sets in \({\mathcal {G}}-{\mathcal {F}}_i\) all contain \(S_i\), and the set in \({\mathcal {G}}\cap {\mathcal {F}}_i\) contains at least one element of \(S_i\), which element is in every set of \({\mathcal {G}}\). Hence, \({\mathcal {G}}\) is \((2\ell -1)\)-wise intersecting. On the other hand, if \(A \notin {\mathcal {F}}\), then either A is missing elements from two distinct sets \(S_i\), or A is missing all of the elements from some \(S_i\). In either case, it is easy to find a non-intersecting \((2\ell - 1)\)-tuple of elements in \({\mathcal {F}}\cup \{A\}\).

For the case when k is even, let \({\mathcal {F}}'\) be the maximal \((k-1)\)-intersecting family on \([n-1]\) constructed as in the previous paragraph. Then the family \({\mathcal {F}}= \{A \cup \{n\}: A \in {\mathcal {F}}'\} \cup \{[n-1]\}\) is a maximal k-wise intersecting family.

These constructions give the upper bound. \(\square \)

Very recently, Janzer [21] showed that the lower bound in Proposition 4.1 is the correct order of magnitude of f(nk), by constructing a maximal k-wise intersecting family of size \(O(2^{n/(k-1)})\) for every \(k\ge 3\) and n. Note that in the special case \(k=3\), Theorem 1.1 matches Janzer’s [21] construction.

Assume that n is sufficiently large. Let \({\mathcal {F}}\) be a maximal k-wise intersecting family on [n] with minimum size. Similarly to the case \(k=3\), one can show that \({\bar{{\mathcal {F}}}}\) is a \((1-\varepsilon )\)-\((k-1)\)-generator for [n], where \(\varepsilon =o(1)\). A modification of the method in this paper could be used to determine the structure of \({\bar{{\mathcal {F}}}}\), if \(|{\bar{{\mathcal {F}}}} \setminus {\mathcal {F}}_{n,k-1}|\) was small. However, Janzer [21] showed that \(|{\bar{{\mathcal {F}}}} \setminus {\mathcal {F}}_{n,k-1}|\) cannot be small.

Theorem 4.2

([21, Lemma 1.2]) For every \(k \ge 4\) and \(d \ge 0\), there exist \(c = c(k,d) >0\) and \(n_0 = n_0(k,d) >0\) such that the following holds when \(n \ge n_0\). Let \(S_1 \cup \cdots \cup S_{k-1}\) be a partition of [n], where \(\frac{n}{k-1} -d \le |S_i| \le \frac{n}{k-1} +d\) for every \(i \in [k-1]\). Let \({\mathcal {F}}_0=2^{S_1}\cup \cdots \cup 2^{S_{k-1}}\). For every set system \({\mathcal {F}}\subseteq 2^{[n]}\) with \(|{\mathcal {F}}\setminus {\mathcal {F}}_0| \le c\cdot 2^{n/(k-1)}\), \({\bar{{\mathcal {F}}}}\) cannot be maximal k-wise intersecting.

Combining our method with Theorem 4.2, we obtain the following result.

Proposition 4.3

For every \(k \ge 4\), there exists \(c = c(k)>0\) such that

$$\begin{aligned} f(n,k)\ge (1+c)|{\mathcal {F}}_{n,k-1}| = (1+c) \left( (k-1)2^{\frac{n}{k-1}} -k + 2\right) , \end{aligned}$$

when n is divisible by \(k-1\).

Therefore, a maximal k-wise intersecting family of minimum size will necessarily have a more complex structure when \(k \ge 4\). It is worth mentioning that the exact value of the upper bound on Janzer’s construction [21] is \((k-1)2^{k-3}2^{n/(k-1)}-(k-2)(2^{k-1}-1)\), which is about \(2^{k-3}|{\mathcal {F}}_{n,k-1}|\).