1 Introduction

The principal aim of this note is to prove the following result.

Theorem 1

The irrationality exponent \(\mu (\zeta (2))\) of \(\zeta (2)=\pi ^2/6\) is bounded from above by \(5.09541178\dots \) .

Recall that the irrationality exponent \(\mu (\alpha )\) of a real number \(\alpha \) is the supremum of the set of exponents \(\mu \) for which the inequality \(|\alpha -p/q|<q^{-\mu }\) has infinitely many solutions in rationals \(p/q\).

The history of \(\mu (\zeta (2))\) can be found in the 1996 paper [13] of G. Rhin and C. Viola, where they not only establish the previous record estimate \(\mu (\zeta (2))\le 5.44124250\dots \) but also introduce the remarkable permutation group arithmetic method based on birational transformations of underlying multiple integrals.

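One of the corollaries of Theorem 1 is the estimate \(\mu (\pi \sqrt{d})\le 10.19082357\dots \) valid for any choice of nonzero rational \(d\) (indeed, \((\pi \sqrt{d})^2=6d\,\zeta (2)\) is a rational multiple of \(\zeta (2)\), and \(\mu (\alpha )\le 2\mu (\alpha ^2)\) for any real \(\alpha \)). Note, however, that for some particular values of \(d\) better bounds are known: the results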

$$\begin{aligned} \mu (\pi )\le 7.606308\dots , \;\; \mu (\pi \sqrt{3})\le 4.601057\dots \;\;\text {and}\;\;\mu (\pi \sqrt{10005})\le 10.021363\dots \end{aligned}$$

are due to Salikhov [9], Androsenko and Salikhov [1], and the present author [18], respectively.

A particular case of the hypergeometric constructions below was discussed in [20, Section 1.3] (see also [19, Section 2]) in connection with simultaneous rational approximations to \(\zeta (2)\) and \(\zeta (3)\). In the joint paper [4] with S. Dauguet, we address these simultaneous approximations more specifically.

Our proof of Theorem 1 below is organised as follows. In Sect. 2 we introduce some analytical and arithmetic ingredients, while Sects. 3 and 4 are devoted to exposing details of our first hypergeometric construction of rational approximations to \(\zeta (2)\). Sections 2–4 are closely related to the corresponding material in [4]. In Sect. 5 we discuss an identity between two hypergeometric integrals that motivates another hypergeometric construction of approximations to \(\zeta (2)\), the construction we further examine in Sect. 6. We finalise our findings in Sect. 7, where we prove Theorem 1 and comment on related hypergeometric constructions.

2 Prelude: auxiliary lemmata

This section discusses auxiliary results about decomposition of Barnes–Mellin-type integrals and special arithmetic of integer-valued polynomials.

Lemma 1

For \(\ell =0,1,2,\dots \),

$$\begin{aligned} \frac{1}{2\pi i}\int _{1/2-i\infty }^{1/2+i\infty }\left( \frac{\pi }{\sin \pi t}\right) ^2 \frac{(t-1)(t-2)\cdots (t-\ell )}{\ell !}\,{\mathrm {d}}t \;=\; \frac{(-1)^\ell }{\ell +1}\cdot \end{aligned}$$
(1)

Proof

The integrand is

$$\begin{aligned} \frac{1}{\ell !}\left( \frac{\pi }{\sin \pi t}\right) ^2 \frac{\Gamma (t)}{\Gamma (t-\ell )}\; =\; \frac{(-1)^\ell }{\ell !}\,\Gamma (t)^2\Gamma (1-t)\,\Gamma (1+\ell -t); \end{aligned}$$

the evaluation in (1) follows from Barnes’s first lemma [10, Section 4.2.1]. \(\square \)

Lemma 2

For \(k=0,1,2,\dots \),

$$\begin{aligned} \frac{1}{2\pi i}\int _{1/2-i\infty }^{1/2+i\infty }\left( \frac{\pi }{\sin \pi t}\right) ^2\frac{{\mathrm {d}}t}{t+k}\; =\;\sum _{m=1}^\infty \frac{1}{(m+k)^2}\; =\;\zeta (2)-\sum _{\ell =1}^k\frac{1}{\ell ^2}\cdot \end{aligned}$$
(2)

Proof

Since

$$\begin{aligned} \left( \frac{\pi }{\sin \pi t}\right) ^2{\mathrm {d}}t ={\mathrm {d}}(-\pi \,\cot \pi t), \end{aligned}$$

partial integration on the left-hand side in (2) transforms the integral into

$$\begin{aligned} -\frac{1}{2\pi i}\int _{1/2-i\infty }^{1/2+i\infty }\frac{\pi \,\cot \pi t\,{\mathrm {d}}t}{(t+k)^2}\cdot \end{aligned}$$

By considering first integration along the rectangular closed contour with vertices at

$$\begin{aligned} (1/2\pm iN,N+1/2\pm iN), \end{aligned}$$

where \(N>0\) is an integer, applying the residue sum theorem as in [15, Lemma 2.4] and finally letting \(N\rightarrow \infty \), we arrive at claim (2). \(\square \)
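Lemmas 1 and 2 can also be checked numerically. The following Python sketch is not part of the proofs: it parametrises the line \(t=1/2+iy\), on which \(\sin \pi t=\cosh \pi y\), and applies the trapezoidal rule. The function names, the step size and the cut-off are illustrative assumptions.

```python
import math

def line_integral(g, h=0.01, cutoff=12.0):
    """Approximate (1/(2*pi*i)) * int (pi/sin(pi t))^2 g(t) dt over Re t = 1/2.

    On t = 1/2 + i*y we have sin(pi t) = cosh(pi y) and dt = i dy, so the
    integral equals (pi/2) * int g(1/2 + i*y) / cosh(pi y)^2 dy over the real line.
    The step h and the cut-off are ad hoc choices for this sketch.
    """
    steps = int(cutoff / h)
    total = 0j
    for s in range(-steps, steps + 1):
        y = s * h
        total += g(complex(0.5, y)) / math.cosh(math.pi * y) ** 2
    return (math.pi / 2) * h * total

def P(ell, t):
    """The polynomial (t-1)(t-2)...(t-ell)/ell! from Lemma 1."""
    out = 1
    for j in range(1, ell + 1):
        out *= (t - j) / j
    return out

ZETA2 = math.pi ** 2 / 6

# Lemma 1: compare with (-1)^ell / (ell + 1).
for ell in range(5):
    print(ell, line_integral(lambda t: P(ell, t)).real, (-1) ** ell / (ell + 1))

# Lemma 2: compare with zeta(2) - sum_{l <= k} 1/l^2.
for k in range(4):
    rhs = ZETA2 - sum(1 / l ** 2 for l in range(1, k + 1))
    print(k, line_integral(lambda t: 1 / (t + k)).real, rhs)
```

The two printed columns should agree to many digits, because the integrand decays like \(e^{-2\pi |y|}\) and is analytic in a strip around the line of integration.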

Remark 1

The form and principal ingredients of Lemma 2 are suggested by [7, Lemma 2]. The statement is essentially a particular case of [15, Lemma 2.4], where an artificial assumption on the growth of a regular rational function at infinity was used; the assumption can be dropped by applying partial integration as above.

In what follows \(D_n\) denotes the least common multiple of the numbers \(1,2,\dots ,n\).

Lemma 3

Given integers \(b<a\), set

$$\begin{aligned} R(t)=R(a,b;t)=\frac{(t+b)(t+b+1)\cdots (t+a-1)}{(a-b)!}\cdot \end{aligned}$$

Then for any \(k,\ell \in {\mathbb {Z}}\), \(\ell \ne k\),

$$\begin{aligned} R(k)\in {\mathbb {Z}}, \quad D_{a-b}\cdot \frac{{\mathrm {d}}R(t)}{{\mathrm {d}}t}\bigg |_{t=k}\in {\mathbb {Z}} \quad \text {and}\quad D_{a-b}\cdot \frac{R(k)-R(\ell )}{k-\ell }\in {\mathbb {Z}}. \end{aligned}$$

Proof

Denote by \(m=a-b\) the degree of the polynomial \(R(t)\). The first two families of inclusions are classical [17]. For the remaining one, introduce the polynomial

$$\begin{aligned} P(t)=\frac{R(t)-R(\ell )}{t-\ell } \end{aligned}$$

of degree \(m-1.\) As \(D_m\cdot 1/(k-\ell )\) is an integer for \(k=\ell +1,\ell +2,\dots ,\ell +m\) as well as \(R(k)-R(\ell )\in {\mathbb {Z}}\), we deduce that \(D_m\cdot P(k)\in {\mathbb {Z}}\) for those values of \(k\). This means that the polynomial \(D_mP(t)\) of degree \(m-1\) assumes integer values at \(m\) successive integers. By [12, Division 8, Problem 87] the polynomial is integer-valued. \(\square \)
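The three inclusions of Lemma 3 are easy to test in exact rational arithmetic. The sketch below is only an illustration on a finite window of integers, not a proof; the helper names `D`, `R`, `dR` and `check_lemma3` are our own, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import factorial, lcm

def D(n):
    """Least common multiple of 1, 2, ..., n (with D(0) = 1)."""
    return lcm(*range(1, n + 1))

def R(a, b, t):
    """R(a, b; t) = (t+b)(t+b+1)...(t+a-1)/(a-b)! as an exact Fraction."""
    out = Fraction(1, factorial(a - b))
    for j in range(b, a):
        out *= t + j
    return out

def dR(a, b, t):
    """Derivative of R(a, b; t) with respect to t, via the product rule."""
    total = Fraction(0)
    for i in range(b, a):
        term = Fraction(1, factorial(a - b))
        for j in range(b, a):
            if j != i:
                term *= t + j
        total += term
    return total

def check_lemma3(a, b, span=15):
    """Test the three inclusions of Lemma 3 for all k, l in [-span, span]."""
    m = a - b
    pts = range(-span, span + 1)
    for k in pts:
        assert R(a, b, k).denominator == 1
        assert (D(m) * dR(a, b, k)).denominator == 1
        for l in pts:
            if l != k:
                quotient = D(m) * (R(a, b, k) - R(a, b, l)) / (k - l)
                assert quotient.denominator == 1
    return True

print(check_lemma3(7, 2), check_lemma3(3, -4))
```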

Lemma 4

Let \(R(t)\) be a product of several integer-valued polynomials

$$\begin{aligned} R_j(t)=R(a_j,b_j;t)=\frac{(t+b_j)(t+b_j+1)\cdots (t+a_j-1)}{(a_j-b_j)!}, \end{aligned}$$

where \( b_j<a_j\) and \(m=\max _j\{a_j-b_j\}\). Then for any \(k,\ell \in {\mathbb {Z}}\), \(\ell \ne k\),

$$\begin{aligned} R(k)\in {\mathbb {Z}}, \quad D_m\cdot \frac{{\mathrm {d}}R(t)}{{\mathrm {d}}t}\bigg |_{t=k}\in {\mathbb {Z}} \quad \text {and}\quad D_m\cdot \frac{R(k)-R(\ell )}{k-\ell }\in {\mathbb {Z}}. \end{aligned}$$
(3)

Proof

It is sufficient to establish the result for a product of just two polynomials \(R(t)\) and \(\widetilde{R}(t)\) satisfying the assertions in (3) and then use mathematical induction on the number of such factors. We have

$$\begin{aligned} \left\{ \begin{aligned}&\bigl (R(t)\widetilde{R}(t)\bigr )\big |_{t=k} =R(k)\widetilde{R}(k) \in {\mathbb {Z}},\\&D_m\frac{{\mathrm {d}}(R(t)\widetilde{R}(t))}{{\mathrm {d}}t}\Big |_{t=k} = R(k)\cdot D_m\frac{{\mathrm {d}}\widetilde{R}(t)}{{\mathrm {d}}t}\Big |_{t=k} +D_m\frac{{\mathrm {d}}R(t)}{{\mathrm {d}}t}\Big |_{t=k}\cdot \widetilde{R}(k) \in {\mathbb {Z}},\\&D_m\frac{R(k)\widetilde{R}(k)-R(\ell )\widetilde{R}(\ell )}{k-\ell } = D_m\frac{R(k)-R(\ell )}{k-\ell }\cdot \widetilde{R}(k) +R(\ell )\cdot D_m\frac{\widetilde{R}(k)-\widetilde{R}(\ell )}{k-\ell } \in {\mathbb {Z}},\\ \end{aligned} \right. \end{aligned}$$

and the result follows. \(\square \)

3 First hypergeometric tale

The construction in this section is a general case of the one considered in [19, Section 2].

For a set of parameters

$$\begin{aligned} ({\varvec{a}},{\varvec{b}})=\left( \begin{matrix} a_1, \, a_2, \, a_3, \, a_4 \\ b_1,\, b_2, \, b_3, \, b_4\\ \end{matrix}\right) \end{aligned}$$

subject to the conditions

$$\begin{aligned} \left\{ \begin{array}{l} b_1,b_2,b_3\le a_1,a_2,a_3,a_4<b_4, \\ d=(a_1+a_2+a_3+a_4)-(b_1+b_2+b_3+b_4)\ge 0,\\ \end{array}\right. \end{aligned}$$
(4)

define the rational function

$$\begin{aligned} R(t)&= R({\varvec{a}},{\varvec{b}};t)\nonumber \\&= \frac{(t+b_1)\cdots (t+a_1-1)}{(a_1-b_1)!} \times \frac{(t+b_2)\cdots (t+a_2-1)}{(a_2-b_2)!} \nonumber \\&\quad \times \frac{(t+b_3)\cdots (t+a_3-1)}{(a_3-b_3)!} \times \frac{(b_4-a_4-1)!}{(t+a_4)\cdots (t+b_4-1)} \end{aligned}$$
(5)
$$\begin{aligned}&= \Pi ({\varvec{a}},{\varvec{b}})\times \frac{\Gamma (t+a_1)\,\Gamma (t+a_2)\,\Gamma (t+a_3)\,\Gamma (t+a_4)}{\Gamma (t+b_1)\,\Gamma (t+b_2)\,\Gamma (t+b_3)\,\Gamma (t+b_4)}, \end{aligned}$$
(6)

where

$$\begin{aligned} \Pi ({\varvec{a}},{\varvec{b}}) =\frac{(b_4-a_4-1)!}{(a_1-b_1)!\,(a_2-b_2)!\,(a_3-b_3)!}\cdot \end{aligned}$$

We also introduce the ordered versions \(a_1^*\le a_2^*\le a_3^*\le a_4^*\) of the parameters \(a_1,a_2,a_3,a_4\) and \(b_1^*\le b_2^*\le b_3^*\) of \(b_1,b_2,b_3\), so that \(\{a_1^*,a_2^*,a_3^*,a_4^*\}\) coincide with \(\{a_1,a_2,a_3,a_4\}\) and \(\{b_1^*,b_2^*,b_3^*\}\) coincide with \(\{b_1,b_2,b_3\}\) as multi-sets. Then \(R(t)\) has poles at \(t=-k\) where \(k=a_4^*,a_4^*+1,\dots ,b_4-1\), has zeroes at \(t=-\ell \) where \(\ell =b_1^*,b_1^*+1,\dots ,a_3^*-1\) and has double zeroes at \(t=-\ell \) where \(\ell =b_2^*,b_2^*+1,\dots ,a_2^*-1\).

Decomposing (5) into the sum of partial fractions, we get

$$\begin{aligned} R(t)=\sum _{k=a_4^*}^{b_4-1}\frac{C_k}{t+k}+P(t), \end{aligned}$$
(7)

where \(P(t)\) is a polynomial of degree \(d\) (see (4)) and

$$\begin{aligned} C_k&= \bigl (R(t)(t+k)\bigr )|_{t=-k} \nonumber \\&= (-1)^{d+b_4+k}\left( {\begin{array}{c}k-b_1\\ k-a_1\end{array}}\right) \left( {\begin{array}{c}k-b_2\\ k-a_2\end{array}}\right) \left( {\begin{array}{c}k-b_3\\ k-a_3\end{array}}\right) \left( {\begin{array}{c}b_4-a_4-1\\ k-a_4\end{array}}\right) \in {\mathbb {Z}} \end{aligned}$$
(8)

for \(k=a_4^*,a_4^*+1,\dots ,b_4-1\).

Lemma 5

Set \(c=\max \{a_1-b_1,a_2-b_2,a_3-b_3\}\). Then \(D_cP(t)\) is an integer-valued polynomial of degree \(d\).

Proof

Write \(R(t)=R_1(t)R_2(t)\), where

$$\begin{aligned} R_1(t)= \frac{ \prod _{j=b_1}^{a_1-1}(t+j)}{(a_1-b_1)!} \times \frac{ \prod _{j=b_2}^{a_2-1}(t+j)}{(a_2-b_2)!} \times \frac{\prod _{j=b_3}^{a_3-1}(t+j)}{(a_3-b_3)!} \end{aligned}$$

is the product of three integer-valued polynomials and

$$\begin{aligned} R_2(t)\;=\; \frac{(b_4-a_4-1)!}{ \prod _{j=a_4}^{b_4-1}(t+j)} \;=\; \sum _{k=a_4}^{b_4-1}\frac{(-1)^{k-a_4} \left( {\begin{array}{c}b_4-a_4-1\\ k-a_4\end{array}}\right) }{t+k}\cdot \end{aligned}$$

It follows from Lemma 4 that

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l} \displaystyle D_c\cdot \frac{{\mathrm {d}}R_1(t)}{{\mathrm {d}}t}\Big |_{t=j}\in {\mathbb {Z}} &{}\displaystyle \text {for } j\in {\mathbb {Z}}, \\ \displaystyle D_c\cdot \frac{R_1(j)-R_1(m)}{j-m}\in {\mathbb {Z}}&{}\displaystyle \text {for } j,m\in {\mathbb {Z}}, \; j\ne m.\\ \end{array}\right. \end{aligned}$$
(9)

Furthermore, note that

$$\begin{aligned} C_k&= R_1(-k)\cdot \bigl (R_2(t)(t+k)\bigr )\big |_{t=-k} \\&= R_1(-k)\cdot (-1)^{k-a_4}\left( {\begin{array}{c}b_4-a_4-1\\ k-a_4\end{array}}\right) \quad \text {for}\; k\in {\mathbb {Z}}, \end{aligned}$$

and the expression in fact vanishes if \(k\) is outside the range \(a_4^*\le k\le b_4-1\).

For \(\ell \in {\mathbb {Z}}\), we have

$$\begin{aligned}&\frac{{\mathrm {d}}}{{\mathrm {d}}t}\bigl (R(t)(t+\ell )\bigr )\bigg |_{t=-\ell }\\&\quad =\frac{{\mathrm {d}}}{{\mathrm {d}}t}\bigl (R_1(t)\cdot R_2(t)(t+\ell )\bigr )\bigg |_{t=-\ell } \\&\quad =\frac{{\mathrm {d}}R_1(t)}{{\mathrm {d}}t}\bigg |_{t=-\ell } \cdot \bigl (R_2(t)(t+\ell )\bigr )\big |_{t=-\ell } +R_1(-\ell )\cdot \frac{{\mathrm {d}}}{{\mathrm {d}}t}\bigr (R_2(t)(t+\ell )\bigr )\bigg |_{t=-\ell } \\&\quad =\frac{{\mathrm {d}}R_1(t)}{{\mathrm {d}}t}\bigg |_{t=-\ell }\cdot (-1)^{\ell -a_4}\left( {\begin{array}{c}b_4-a_4-1\\ \ell -a_4\end{array}}\right) \\&\qquad +R_1(-\ell )\cdot \frac{{\mathrm {d}}}{{\mathrm {d}}t}\sum _{k=a_4}^{b_4-1}(-1)^{k-a_4} \left( {\begin{array}{c}b_4-a_4-1\\ k-a_4\end{array}}\right) \left( 1-\frac{-\ell +k}{t+k}\right) \bigg |_{t=-\ell }\\&\quad =\frac{{\mathrm {d}}R_1(t)}{{\mathrm {d}}t}\bigg |_{t=-\ell }\cdot (-1)^{\ell -a_4}\left( {\begin{array}{c}b_4-a_4-1\\ \ell -a_4\end{array}}\right) +R_1(-\ell )\sum _{\begin{array}{c} k=a_4\\ k\ne \ell \end{array}}^{b_4-1}\frac{(-1)^{k-a_4}\left( {\begin{array}{c}b_4-a_4-1\\ k-a_4\end{array}}\right) }{-\ell +k} \end{aligned}$$

and

$$\begin{aligned} \frac{{\mathrm {d}}}{{\mathrm {d}}t}\left( \sum _{k=a_4^*}^{b_4-1}\frac{C_k}{t+k}\cdot (t+\ell )\right) \bigg |_{t=-\ell }&= \frac{{\mathrm {d}}}{{\mathrm {d}}t}\left( \sum _{k=a_4}^{b_4-1}\frac{C_k}{t+k}\cdot (t+\ell )\right) \bigg |_{t=-\ell } \\&= \frac{{\mathrm {d}}}{{\mathrm {d}}t}\sum _{k=a_4}^{b_4-1}C_k\left( 1-\frac{-\ell +k}{t+k}\right) \bigg |_{t=-\ell } = \sum _{\begin{array}{c} k=a_4\\ k\ne \ell \end{array}}^{b_4-1}\frac{C_k}{-\ell +k} \\&= \sum _{\begin{array}{c} k=a_4\\ k\ne \ell \end{array}}^{b_4-1} \frac{R_1(-k)\cdot (-1)^{k-a_4}\left( {\begin{array}{c}b_4-a_4-1\\ k-a_4\end{array}}\right) }{-\ell +k}\cdot \end{aligned}$$

Therefore,

$$\begin{aligned} P(-\ell )&= \left. \frac{{\mathrm {d}}}{{\mathrm {d}}t}\bigl (P(t)(t+\ell )\bigr )\right| _{t=-\ell } =\frac{{\mathrm {d}}}{{\mathrm {d}}t}\left( R(t)(t+\ell ) -\sum _{k=a_4^*}^{b_4-1}\frac{C_k}{t+k}\cdot (t+\ell )\right) \bigg |_{t=-\ell }\\&= \frac{{\mathrm {d}}R_1(t)}{{\mathrm {d}}t}\bigg |_{t=-\ell }\cdot (-1)^{\ell -a_4}\left( {\begin{array}{c}b_4-a_4-1\\ \ell -a_4\end{array}}\right) \\&+\sum _{\begin{array}{c} k=a_4\\ k\ne \ell \end{array}}^{b_4-1}(-1)^{k-a_4}\left( {\begin{array}{c}b_4-a_4-1\\ k-a_4\end{array}}\right) \frac{R_1(-\ell )-R_1(-k)}{-\ell +k}, \end{aligned}$$

and this implies, on the basis of the inclusions (9) above, that \(D_cP(-\ell )\in {\mathbb {Z}}\) for all \(\ell \in {\mathbb {Z}}\). \(\square \)

Finally, define the quantity

$$\begin{aligned} r({\varvec{a}},{\varvec{b}}) =\frac{(-1)^d}{2\pi i}\int _{C-i\infty }^{C+i\infty }\left( \frac{\pi }{\sin \pi t}\right) ^2R({\varvec{a}},{\varvec{b}};t)\,{\mathrm {d}}t, \end{aligned}$$
(10)

where \(C\) is arbitrary from the interval \(-a_2^*<C<1-b_2^*\). The definition does not depend on the choice of \(C\), as the integrand does not have singularities in the strip \(-a_2^*<\mathrm{Re }t<1-b_2^*\).

Proposition 1

We have

$$\begin{aligned} r({\varvec{a}},{\varvec{b}})=q({\varvec{a}},{\varvec{b}})\zeta (2)-p({\varvec{a}},{\varvec{b}}) \quad \text{ with } \left\{ \begin{array}{l} q({\varvec{a}},{\varvec{b}})\in {\mathbb {Z}} ,\\ D_{c_1}D_{c_2}p({\varvec{a}},{\varvec{b}})\in {\mathbb {Z}},\\ \end{array}\right. \end{aligned}$$
(11)

where

$$\begin{aligned}\left\{ \begin{array}{l} c_1=\max \{a_1-b_1,a_2-b_2,a_3-b_3,b_4-a_2^*-1\},\\ c_2=\max \{d+1,b_4-a_2^*-1\}.\\ \end{array} \right. \end{aligned}$$

Furthermore, the quantity \(r({\varvec{a}},\,{\varvec{b}})/\Pi ({\varvec{a}},\,{\varvec{b}})\) is invariant under any permutation of the parameters \(a_1,a_2,a_3,a_4\).

Proof

We choose \(C=1/2-a_2^*\) in (10) and write (7) as

$$\begin{aligned} R(t)=\sum _{k=a_4^*}^{b_4-1}\frac{C_k}{t+k}+\sum _{\ell =0}^dA_\ell P_\ell (t+a_2^*), \end{aligned}$$

where

$$\begin{aligned} P_\ell (t)=\frac{(t-1)(t-2)\cdots (t-\ell )}{\ell !} \end{aligned}$$

and \(D_cA_\ell \in {\mathbb {Z}}\) in accordance with Lemma 5. Applying Lemmas 1 and 2 we obtain

$$\begin{aligned} r({\varvec{a}},{\varvec{b}})&= \frac{(-1)^d}{2\pi i}\int _{1/2-i\infty }^{1/2+i\infty }\left( \frac{\pi }{\sin \pi t}\right) ^2R(t-a_2^*)\,{\mathrm {d}}t \\&= \zeta (2)\cdot (-1)^d\sum _{k=a_4^*}^{b_4-1}C_k -(-1)^d\sum _{k=a_4^*}^{b_4-1}C_k\sum _{\ell =1}^{k-a_2^*}\frac{1}{\ell ^2} +\sum _{\ell =0}^d\frac{(-1)^{d+\ell }A_\ell }{\ell +1}\cdot \end{aligned}$$

This representation clearly implies that \(r({\varvec{a}},{\varvec{b}})\) has the desired form (11), while the invariance of \(r({\varvec{a}},{\varvec{b}})/\Pi ({\varvec{a}},{\varvec{b}})\) under permutations of \(a_1,a_2,a_3,a_4\) follows from (6) and definition (10) of \(r({\varvec{a}},{\varvec{b}})\). \(\square \)
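The decomposition used in this proof can be carried out in exact arithmetic. The following Python sketch computes \(q\) and \(p\) for the parameters (12) with \(n=1\) and tests the inclusions (11). It is an illustration under stated assumptions, not the computation behind the paper: all helper names are our own, the coefficients \(A_\ell \) are recovered as forward differences of the polynomial part \(P(t)\), and we assume that \(a_4\) is the largest of the \(a_j\) (as in (12)), so that the sample points avoid the poles.

```python
from fractions import Fraction
from math import comb, factorial, lcm

def D(n):
    """Least common multiple of 1, 2, ..., n."""
    return lcm(*range(1, n + 1))

def prod_range(t, lo, hi):
    """(t+lo)(t+lo+1)...(t+hi-1) as an exact Fraction."""
    out = Fraction(1)
    for j in range(lo, hi):
        out *= t + j
    return out

def R(a, b, t):
    """The rational function (5) at a point t that is not a pole."""
    val = Fraction(factorial(b[3] - a[3] - 1)) / prod_range(t, a[3], b[3])
    for j in range(3):
        val *= prod_range(t, b[j], a[j]) / factorial(a[j] - b[j])
    return val

def C(a, b, k):
    """The residue C_k, using the closed form (8)."""
    d = sum(a) - sum(b)
    val = comb(b[3] - a[3] - 1, k - a[3])
    for j in range(3):
        val *= comb(k - b[j], k - a[j])
    return (-1) ** (d + b[3] + k) * val

def linear_form(a, b):
    """Return (q, p, c1, c2) with r(a, b) = q*zeta(2) - p, as in Proposition 1."""
    d = sum(a) - sum(b)
    a2s = sorted(a)[1]                       # a_2^*
    ks = range(max(a), b[3])                 # k = a_4^*, ..., b_4 - 1
    Ck = {k: C(a, b, k) for k in ks}
    # Samples f(u) = P(u + 1 - a_2^*), u = 0..d, of the polynomial part in (7).
    f = []
    for u in range(d + 1):
        t = u + 1 - a2s
        f.append(R(a, b, t) - sum(Fraction(Ck[k], t + k) for k in ks))
    # P(t) = sum_l A_l * binom(t + a_2^* - 1, l): A_l is the l-th forward difference of f at 0.
    A = [sum((-1) ** (l - i) * comb(l, i) * f[i] for i in range(l + 1))
         for l in range(d + 1)]
    q = (-1) ** d * sum(Ck.values())
    harmonic = sum(Ck[k] * sum(Fraction(1, l * l) for l in range(1, k - a2s + 1))
                   for k in ks)
    p = (-1) ** d * (harmonic - sum((-1) ** l * A[l] / (l + 1) for l in range(d + 1)))
    c1 = max(a[0] - b[0], a[1] - b[1], a[2] - b[2], b[3] - a2s - 1)
    c2 = max(d + 1, b[3] - a2s - 1)
    return q, p, c1, c2

n = 1
a = (7 * n + 1, 6 * n + 1, 5 * n + 1, 8 * n + 1)
b = (1, n + 1, 2 * n + 1, 14 * n + 2)
q, p, c1, c2 = linear_form(a, b)
print(q, float(p / q), c1, c2)
print((D(c1) * D(c2) * p).denominator)       # 1 means the inclusion in (11) holds
```

Since \(r({\varvec{a}},{\varvec{b}})\) is small compared with \(q({\varvec{a}},{\varvec{b}})\), the printed ratio \(p/q\) should be close to \(\zeta (2)=1.6449340668\dots \); the form \(q\zeta (2)-p\) itself is not evaluated in floating point, as it would suffer from cancellation.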

4 Towards proving Theorem 1

For the particular case

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} a_1=7n+1, &{} a_2=6n+1, &{} a_3=5n+1, &{} a_4= 8n+1,\\ b_1=1, &{} b_2=n+1, &{} b_3=2n+1, &{} b_4=14n+2,\\ \end{array}\right. \end{aligned}$$
(12)

from Proposition 1 we obtain

$$\begin{aligned} r_n=r({\varvec{a}},{\varvec{b}})=q_n\zeta (2)-p_n, \qquad \text {where}\quad q_n, \, D_{9n}D_{8n}p_n\in {\mathbb {Z}}. \end{aligned}$$
(13)

The asymptotic behaviour of \(r_n\) and \(q_n\) for a generic choice

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} a_1=\alpha _1n+1, &{} a_2=\alpha _2n+1, &{} a_3=\alpha _3n+1, &{} a_4=\alpha _4n+1, \\ b_1=\beta _1n+1, &{} b_2=\beta _2n+1, &{} b_3=\beta _3n+1, &{} b_4=\beta _4n+2,\\ \end{array} \right. \end{aligned}$$
(14)

where the integral parameters \(\alpha _j\) and \(\beta _j\) satisfy

$$\begin{aligned} \beta _1,\beta _2,\beta _3<\alpha _1,\alpha _2,\alpha _3,\alpha _4<\beta _4, \quad \alpha _1+\alpha _2+\alpha _3+\alpha _4>\beta _1+\beta _2+\beta _3+\beta _4 \end{aligned}$$

(to ensure the earlier imposed conditions (4)), is pretty standard.

Lemma 6

Assume that the cubic polynomial

$$\begin{aligned} \prod _{j=1}^4(\tau -\alpha _j)-\prod _{j=1}^4(\tau -\beta _j) \end{aligned}$$

has one real zero \(\tau _1\) and two complex conjugate zeroes \(\tau _0\) and \(\overline{\tau _0}\). Then

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{\log |r_n|}{n}=\mathrm{Re }f_0(\tau _0) \quad \text {and}\quad \lim _{n\rightarrow \infty }\frac{\log |q_n|}{n}=f_0(\tau _1), \end{aligned}$$

where

$$\begin{aligned} f_0(\tau )&= \sum _{j=1}^4\bigl (\alpha _j\log (\tau -\alpha _j)-\beta _j\log (\tau -\beta _j)\bigr )\\&\quad -\sum _{j=1}^3(\alpha _j-\beta _j)\log (\alpha _j-\beta _j)+(\beta _4-\alpha _4)\log (\beta _4-\alpha _4). \end{aligned}$$

For a proof of the statement we refer to similar considerations in [15–17]. An alternative proof can be given, based on Poincaré’s theorem and on explicit recurrence relations satisfied by both \(r_n\) and \(q_n\); we touch on the latter aspect for our concrete choice (12) in Sect. 5.

When the parameters are chosen in accordance with (12), we obtain

$$\begin{aligned} \left\{ \begin{aligned} -\limsup \limits _{n\rightarrow \infty }\frac{\log |r_n|}{n}&=C_0=15.88518998\dots ,\\ \lim \limits _{n\rightarrow \infty }\frac{\log |q_n|}{n}&=C_1=23.22906071\cdots .\\ \end{aligned}\right. \end{aligned}$$
(15)

For a generic choice (14), the quantities \(c_1\) and \(c_2\) in Proposition 1 assume the form \(\gamma _1n\) and \(\gamma _2n\), where the integers \(\gamma _1\) and \(\gamma _2\) only depend on the data \(\alpha _j,\beta _j\) for \(j=1,\dots ,4\); for simplicity we order them: \(\gamma _1\ge \gamma _2\). In what follows, \(\lfloor \,\cdot \,\rfloor \) denotes the integer part of a real number.

Lemma 7

In the above notation, we have

$$\begin{aligned} \Phi _n^{-1}q_n,\,\Phi _n^{-1}D_{\gamma _1n}D_{\gamma _2n}p_n\in {\mathbb {Z}} \end{aligned}$$
(16)

with

$$\begin{aligned} \Phi _n=\prod _{\sqrt{2\gamma _0n}<p\;\text {prime}\,\le \gamma _2n}p^{\varphi (n/p)}, \end{aligned}$$

where

$$\begin{aligned} \varphi (x)=\max _{\varvec{\alpha }'=\sigma \varvec{\alpha }:\sigma \in \mathfrak {S}_4} \left( \begin{array}{l} \lfloor (\beta _4-\alpha _4)x\rfloor -\lfloor (\beta _4-\alpha _4')x\rfloor \\ \quad -\sum _{j=1}^3\bigl (\lfloor (\alpha _j-\beta _j)x\rfloor -\lfloor (\alpha _j'-\beta _j)x\rfloor \bigr ) \end{array}\right) , \end{aligned}$$

the maximum being taken over all permutations \((\alpha _1',\alpha _2',\alpha _3',\alpha _4')\) of \((\alpha _1,\alpha _2,\alpha _3,\alpha _4)\). Furthermore,

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\log \Phi _n}{n} =\int _0^1\varphi (x)\,{\mathrm {d}}\psi (x)-\int _0^{1/\gamma _2}\varphi (x)\,\frac{{\mathrm {d}}x}{x^2}, \end{aligned}$$

where \(\psi (x)\) is the logarithmic derivative of the gamma function.

Proof

The arithmetic “correction” in (16) uses a by now standard method, based on the permutation group from Proposition 1; see the original source [13] or its adaptation to hypergeometric settings in [17] for details. The function \(\varphi (x)\) is chosen to count the maximum

$$\begin{aligned} \varphi \left( \frac{n}{p}\right) =\max _{\sigma \in \mathfrak {S}_4}\mathrm{ord }_p\frac{\Pi ({\varvec{a}},{\varvec{b}})}{\Pi (\sigma {\varvec{a}},{\varvec{b}})}\cdot \end{aligned}$$

\(\square \)

Remark 2

There is an alternative way to compute \(\varphi (x)\) using

$$\begin{aligned} \varphi (x)=\min _{0\le y<1} \left( \begin{array}{l} \sum _{j=1}^3 \bigl (\lfloor y-\beta _jx\rfloor -\lfloor y-\alpha _jx\rfloor -\lfloor (\alpha _j-\beta _j)x\rfloor \bigr )\\ \quad +\lfloor (\beta _4-\alpha _4)x\rfloor -\lfloor \beta _4x-y\rfloor -\lfloor y-\alpha _4x\rfloor \end{array}\right) , \end{aligned}$$

though it is not at all straightforward that this expression represents the same function \(\varphi (x)\) as in Lemma 7. The technique is discussed in related contexts, for example, in [15, Section 4], [17, Section 7] and [8, Section 2]. We use this strategy in Sect. 6 below.

Under the choice (12), we get \(\gamma _1=9\), \(\gamma _2=8\) and

$$\begin{aligned} \varphi (x)={\left\{ \begin{array}{ll} 2 &{}\text {if}\,x\in \bigl [\frac{1}{6},\frac{1}{5}\bigr )\cup \bigl [\frac{1}{4},\frac{2}{7}\bigr )\cup \bigl [\frac{1}{2},\frac{4}{7}\bigr )\cup \bigl [\frac{5}{6},\frac{6}{7}\bigr ),\\ 1 &{}\text {if}\,x\in \bigl [\frac{1}{8},\frac{1}{7}\bigr )\cup \bigl [\frac{1}{5},\frac{1}{4}\bigr )\cup \bigl [\frac{2}{7},\frac{3}{7}\bigr )\cup \bigl [\frac{4}{7},\frac{5}{6}\bigr )\cup \bigl [\frac{6}{7},\frac{8}{9}\bigr ),\\ 0 &{}\text {otherwise}, \end{array}\right. } \end{aligned}$$
(17)

so that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\log \Phi _n}{n}=8.12793878\cdots . \end{aligned}$$
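The step function (17) can be spot-checked directly from the definition of \(\varphi (x)\) in Lemma 7 by running over all 24 permutations. In the Python sketch below the tuples `ALPHA` and `BETA` encode the choice (12), that is, \((\alpha _1,\dots ,\alpha _4)=(7,6,5,8)\) and \((\beta _1,\dots ,\beta _4)=(0,1,2,14)\); the function name and the sample points are our own, and exact fractions are used so that the floors are not affected by rounding.

```python
from fractions import Fraction
from itertools import permutations
from math import floor

ALPHA = (7, 6, 5, 8)     # alpha_1, ..., alpha_4 for the choice (12)
BETA = (0, 1, 2, 14)     # beta_1, ..., beta_4 for the choice (12)

def phi(x, alpha=ALPHA, beta=BETA):
    """phi(x) from Lemma 7: maximum over all permutations alpha' of alpha."""
    best = None
    for ap in permutations(alpha):
        val = floor((beta[3] - alpha[3]) * x) - floor((beta[3] - ap[3]) * x)
        for j in range(3):
            val -= floor((alpha[j] - beta[j]) * x) - floor((ap[j] - beta[j]) * x)
        best = val if best is None else max(best, val)
    return best

# One sample point in several of the intervals listed in (17).
for x in (Fraction(1, 10), Fraction(13, 100), Fraction(9, 50),
          Fraction(1, 2), Fraction(9, 10)):
    print(x, phi(x))
```

The sample points lie in \([0,\frac{1}{8})\), \([\frac{1}{8},\frac{1}{7})\), \([\frac{1}{6},\frac{1}{5})\), \([\frac{1}{2},\frac{4}{7})\) and \([\frac{8}{9},1)\), where (17) gives the values 0, 1, 2, 2 and 0, respectively.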

Taking then

$$\begin{aligned} C_2=\lim _{n\rightarrow \infty }\frac{\log (\Phi _n^{-1}D_{9n}D_{8n})}{n}=9+8-8.12793878\ldots =8.87206121\dots \end{aligned}$$

and applying [6, Lemma 2.1] we arrive at the following irrationality measure for \(\zeta (2)\):

$$\begin{aligned} \mu (\zeta (2))\le \frac{C_0+C_1}{C_0-C_2}=5.57728968\cdots . \end{aligned}$$

This estimate is clearly worse than the one obtained by Rhin and Viola in [13]. We will see later that the inclusions (16) can be further sharpened in our case (12).
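The arithmetic behind this intermediate estimate is easily reproduced. The short Python sketch below uses only the truncated decimal values quoted above, so the result is reliable to about seven decimal places; the variable names are ours.

```python
# Truncated values from (15) and from the limit of log(Phi_n)/n above.
C0 = 15.88518998
C1 = 23.22906071
LOG_PHI = 8.12793878

C2 = 9 + 8 - LOG_PHI                 # growth rate of Phi_n^{-1} D_{9n} D_{8n}
mu_bound = (C0 + C1) / (C0 - C2)     # irrationality measure from [6, Lemma 2.1]

RHIN_VIOLA = 5.44124250              # the bound of [13], for comparison
print(mu_bound, mu_bound > RHIN_VIOLA)
```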

Remark 3

A different choice of parameters than in (12), namely,

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} a_1=4n+1, &{} a_2=5n+1, &{} a_3=6n+1, &{}a_4= 7n+1, \\ b_1=1, &{} b_2=n+1, &{} b_3=2n+1, &{} b_4=12n+2,\\ \end{array}\right. \end{aligned}$$

allows us to obtain the estimate \(\mu (\zeta (2))\le 5.20514736\dots \) already better than the one in [13]. This choice, however, fails to achieve a significant sharpening by means of the machinery that we discuss below.

5 Interlude: a hypergeometric integral

Let us prove the following result.

Proposition 2

For each \(n=0,1,2,\dots \), the following identity is true:

$$\begin{aligned}&\frac{(6n)!}{(7n)!\,(5n)!\,(3n)!}\,\frac{1}{2\pi i}\int _{C_1-i\infty }^{C_1+i\infty } \frac{\Gamma (7n+1+t)\,\Gamma (6n+1+t)\,\Gamma (5n+1+t)}{\Gamma (1+t)\,\Gamma (n+1+t)\,\Gamma (2n+1+t)}\qquad \nonumber \\&\qquad \times \frac{\Gamma (8n+1+t)}{\Gamma (14n+2+t)}\left( \frac{\pi }{\sin \pi t}\right) ^2{\mathrm {d}}t \nonumber \\&\quad =\frac{(6n)!^2}{(9n)!\,(3n)!}\, \frac{1}{2\pi i}\int _{C_2-i\infty }^{C_2+i\infty } \frac{\Gamma (11n+2+2t)\,\Gamma (3n+1+t)}{\Gamma (2n+2+2t)\,\Gamma (1+t)} \nonumber \\&\quad \quad \times \frac{\Gamma (4n+1+t)\,\Gamma (5n+1+t)}{\Gamma (10n+2+t)\,\Gamma (11n+2+t)} \,\frac{\pi }{\sin 2\pi t}\,{\mathrm {d}}t, \end{aligned}$$
(18)

where the integration paths separate the two groups of poles of the integrands (for example, \(C_1=-2n-1/2\) and \(C_2=-1/2\)).

Proof

Executing the Gosper–Zeilberger algorithm of creative telescoping for the rational functions

$$\begin{aligned} R(t)=\frac{\prod _{j=1}^{7n}(t+j)}{(7n)!}\times \frac{\prod _{j=1}^{5n}(t+n+j)}{(5n)!}\times \frac{\prod _{j=1}^{3n}(t+2n+j)}{(3n)!}\times \frac{(6n)!}{\prod _{j=1}^{6n+1}(t+8n+j)} \end{aligned}$$

and

$$\begin{aligned} \hat{R}(t) =\frac{\prod _{j=2}^{9n+1}(2t+2n+j)}{(9n)!}\!\times \!\frac{\prod _{j=1}^{3n}(t+j)}{(3n)!}\!\times \! \frac{(6n)!}{\prod _{j=1}^{6n+1}(t+4n+j)}\!\times \! \frac{(6n)!}{\prod _{j=1}^{6n+1}(t+5n+j)}, \end{aligned}$$

we find out that the integrals

$$\begin{aligned} r_n=\frac{1}{2\pi i}\int _{-i\infty }^{i\infty }R(t)\left( \frac{\pi }{\sin \pi t}\right) ^2{\mathrm {d}}t \quad \text {and}\quad \hat{r}_n=\frac{1}{2\pi i}\int _{-i\infty }^{i\infty }\hat{R}(t)\,\frac{\pi }{\sin 2\pi t}\,{\mathrm {d}}t \end{aligned}$$

satisfy the same recurrence equation

$$\begin{aligned} s_0(n)r_{n+3}+s_1(n)r_{n+2}+s_2(n)r_{n+1}+s_3(n)r_n=0 \quad \text {for}\; n=0,1,2,\dots , \end{aligned}$$

where \(s_0(n)\), \(s_1(n)\), \(s_2(n)\) and \(s_3(n)\) are polynomials in \(n\) of degree 64. Verifying the equality in (18) directly for \(n=0\), \(1\) and \(2\), we conclude that it is valid for all \(n\). \(\square \)

Other applications of the algorithm of creative telescoping to proving identities for Barnes-type integrals are discussed in [5, 11].

Remark 4

Note that the left-hand side in (18) is the linear form from Sect. 3 which corresponds to our particular choice (12) of the parameters. The characteristic polynomial of the recurrence equation is equal to

$$\begin{aligned} 2^2\,3^{12}\,7^{14}\,\lambda ^3+3^3\,7^7\,794493690983053821271\,\lambda ^2 -2^{20}\,3^4\,7^5\,2687491277\,\lambda +2^{48}, \end{aligned}$$

and its zeroes determine the asymptotics (15) of \(r_n\) and \(q_n\) by means of Poincaré’s theorem.
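The consistency of this characteristic polynomial with the asymptotics (15) can be checked without computing its zeroes explicitly. The Python sketch below rests on an assumption taken from Lemma 6 and (15): there is one real zero \(\lambda _1\) with \(\log |\lambda _1|=C_1\) and a complex conjugate pair with \(\log |\lambda _0|=-C_0\). Vieta's formulas then give \(\lambda _1\approx -c_2/c_3\) and \(|\lambda _1|\,|\lambda _0|^2=c_0/c_3\). The names `c0`, ..., `c3` for the coefficients are ours.

```python
import math
from fractions import Fraction

# Coefficients of c3*x^3 + c2*x^2 + c1*x + c0, as given in Remark 4.
c3 = 2**2 * 3**12 * 7**14
c2 = 3**3 * 7**7 * 794493690983053821271
c1 = -(2**20) * 3**4 * 7**5 * 2687491277
c0 = 2**48

# Sum of the zeroes is -c2/c3; the two small zeroes contribute negligibly.
C1_est = math.log(c2) - math.log(c3)
# Product of the zeroes is -c0/c3, so log|lambda_1| - 2*C0 = log(c0/c3).
C0_est = (C1_est - (math.log(c0) - math.log(c3))) / 2

# One exact Newton step from x0 = -c2/c3 shows how good the approximation is.
x0 = Fraction(-c2, c3)
poly = c3 * x0**3 + c2 * x0**2 + c1 * x0 + c0
dpoly = 3 * c3 * x0**2 + 2 * c2 * x0 + c1
relative_correction = abs(float(poly / dpoly / x0))

print(C0_est, C1_est, relative_correction)
```

The two estimates should reproduce the constants \(C_0\) and \(C_1\) in (15) to the digits quoted there.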

For a “sufficiently generic” set of integral parameters, the following identity is expected to be true:

$$\begin{aligned}&\frac{1}{2\pi i}\int _{-i\infty }^{i\infty } \frac{\Gamma (a+t)\,\Gamma (b+t)\,\Gamma (e+t)\,\Gamma (f+t)}{\Gamma (1+t)\,\Gamma (1+a-e+t)\,\Gamma (1+a-f+t)\,\Gamma (g+t)} \left( \frac{\pi }{\sin \pi t}\right) ^2{\mathrm {d}}t\qquad \qquad \nonumber \\&\quad =(-1)^{a+b+e+f}\frac{\Gamma (e+f-a)\,\Gamma (e)\,\Gamma (f)}{\Gamma (g-b)} \nonumber \\&\qquad \times \frac{1}{2\pi i}\int _{-i\infty }^{i\infty } \frac{\Gamma (a-b+g+2t)\,\Gamma (a+t)\,\Gamma (e+t)\,\Gamma (f+t)}{\Gamma (1+a+2t)\,\Gamma (1+a-b+t)\,\Gamma (e+f+t)\,\Gamma (g+t)} \left( \frac{\pi }{\sin 2\pi t}\right) {\mathrm {d}}t.\nonumber \\ \end{aligned}$$
(19)

The satellite identity, in which \((\pi /\sin \pi t)^2\) and \(\pi /\sin 2\pi t\) are replaced with

$$\begin{aligned} \pi ^3\cos \pi t/ (\sin \pi t)^3\quad \text{ and } \quad (\pi /\sin \pi t)^2 \end{aligned}$$

respectively, is expected to hold as well; the other integrals represent rational approximations to \(\zeta (3)\) [4, 20]. These identities can possibly be shown in full generality using contiguous relations for the integrals on both sides, though this seems to be a tough argument.

Proposition 2 is a particular case of (19) when

$$\begin{aligned} a=8n+1, \quad b=5n+1, \quad e=6n+1, \quad f=7n+1 \quad \text {and}\quad g=14n+2. \end{aligned}$$

Identity (19) and its satellite should be a special case of a hypergeometric-integral identity valid for generic complex parameters. We could not detect the existence of the more general identity in the literature, though there are a few words about it at the end of W. N. Bailey’s paper [2]:

“The formula (1.4)\(^{1}\) and its successor are rather more troublesome to generalize, and the final result was unexpected. The formulae obtained involve five series instead of three or four as previously obtained. In each case two of the series are nearly-poised and of the second kind, one is nearly-poised and of the first kind, and the other two are Saalschützian in type. In the course of these investigations some integrals of Barnes’s type are evaluated analogous to known sums of hypergeometric series. Considerations of space, however, prevent these results being given in detail.”

It is quite similar in spirit to Fermat’s famous “I have discovered a truly marvelous proof of this, which this margin is too narrow to contain”, is it not? Interestingly enough, the last paragraph in Chapter 6 of Bailey’s book [3] again reveals no details about the troublesome generalization. Did Bailey possess the identity?

6 Second hypergeometric tale

Our discussion in the previous section suggests a different construction of rational approximations to \(\zeta (2)\). This time we design the rational function to be

$$\begin{aligned} \hat{R}(t) =\hat{R}(\hat{{\varvec{a}}},\hat{{\varvec{b}}};t)&= \frac{(2t+\hat{b}_0)(2t+\hat{b}_0+1)\cdots (2t+\hat{a}_0-1)}{(\hat{a}_0-\hat{b}_0)!} \times \frac{(t+\hat{b}_1)\cdots (t+\hat{a}_1-1)}{(\hat{a}_1-\hat{b}_1)!}\\&\quad \times \frac{(\hat{b}_2-\hat{a}_2-1)!}{(t+\hat{a}_2)\cdots (t+\hat{b}_2-1)} \times \frac{(\hat{b}_3-\hat{a}_3-1)!}{(t+\hat{a}_3)\cdots (t+\hat{b}_3-1)}\\&= \hat{\Pi }(\hat{{\varvec{a}}},\hat{{\varvec{b}}})\cdot \frac{\Gamma (2t+\hat{a}_0)\,\Gamma (t+\hat{a}_1)\,\Gamma (t+\hat{a}_2)\,\Gamma (t+\hat{ a}_3)}{\Gamma (2t+\hat{b}_0)\,\Gamma (t+\hat{b}_1)\,\Gamma (t+\hat{b}_2)\,\Gamma (t+\hat{b}_3)}, \end{aligned}$$

where

$$\begin{aligned} \hat{\Pi }(\hat{{\varvec{a}}},\hat{{\varvec{b}}}) =\frac{(\hat{b}_2-\hat{a}_2-1)!\,(\hat{b}_3-\hat{a}_3-1)!}{(\hat{a}_0-\hat{b}_0)!\,(\hat{a}_1-\hat{ b}_1)!} \end{aligned}$$

and the integral parameters

$$\begin{aligned} (\hat{{\varvec{a}}},\hat{{\varvec{b}}}) =\left( \begin{array}{l} \hat{a}_0; \hat{a}_1, \, \hat{a}_2, \, \hat{a}_3 \\ \hat{b}_0; \, \hat{b}_1, \, \hat{b}_2, \, \hat{b}_3 \end{array}\right) \end{aligned}$$

satisfy the conditions

$$\begin{aligned} \left\{ \begin{array}{l} \tfrac{1}{2}\hat{b}_0,\hat{b}_1\le \tfrac{1}{2}\hat{a}_0,\hat{a}_1,\hat{a}_2,\hat{a}_3<\hat{b}_2,\hat{b}_3,\\ \hat{a}_0+\hat{a}_1+\hat{a}_2+\hat{a}_3=\hat{b}_0+\hat{b}_1+\hat{b}_2+\hat{b}_3-2.\\ \end{array}\right. \end{aligned}$$
(20)

The latter condition implies that \(\hat{R}(t)=O(1/t^2)\) as \(t\rightarrow \infty \). Though it will not be as important as it was in our arithmetic consideration of Sect. 3, we introduce the ordered versions \(\hat{a}_1^*\le \hat{a}_2^*\le \hat{a}_3^*\) of the parameters \(\hat{a}_1,\hat{a}_2,\hat{a}_3\) and \(\hat{b}_2^*\le \hat{b}_3^*\) of \(\hat{b}_2,\hat{b}_3\). Then this ordering and conditions (20) imply that the rational function \(\hat{R}(t)\) has poles at \(t=-k\) for \(\hat{a}_2^*\le k\le \hat{b}_3^*-1\), double poles at \(t=-k\) for \(\hat{a}_3^*\le k\le \hat{b}_2^*-1\), and zeroes at \(t=-\ell /2\) for \(\hat{b}_0\le \ell \le \hat{a}_0^*-1\) where \(\hat{a}_0^*=\min \{\hat{a}_0,2\hat{a}_2^*\}\).

The partial-fraction decomposition of the regular rational function \(\hat{R}(t)\) assumes the form

$$\begin{aligned} \hat{R}(t)=\sum _{k=\hat{a}_3^*}^{\hat{b}_2^*-1}\frac{A_k}{(t+k)^2}+\sum _{k=\hat{a}_2^*}^{\hat{b}_3^*-1}\frac{B_k}{t+k}, \end{aligned}$$

where

$$\begin{aligned} A_k&= \bigl (\hat{R}(t)(t+k)^2\bigr )|_{t=-k} \nonumber \\&= (-1)^{\hat{d}}\left( {\begin{array}{c}2k-\hat{b}_0\\ 2k-\hat{a}_0\end{array}}\right) \left( {\begin{array}{c}k-\hat{b}_1\\ k-\hat{a}_1\end{array}}\right) \left( {\begin{array}{c}\hat{b}_2-\hat{a}_2-1\\ k-\hat{a}_2\end{array}}\right) \left( {\begin{array}{c}\hat{b}_3-\hat{a}_3-1\\ k-\hat{a}_3\end{array}}\right) \in {\mathbb {Z}} \end{aligned}$$
(21)

with \(\hat{d}=\hat{b}_2+\hat{b}_3\), for \(k=\hat{a}_3^*,\hat{a}_3^*+1,\dots ,\hat{b}_2^*-1\) and, similarly,

$$\begin{aligned} B_k =\left. \frac{{\mathrm {d}}}{{\mathrm {d}}t}\bigl (\hat{R}(t)(t+k)^2\bigr )\right| _{t=-k} \end{aligned}$$

for \(k=\hat{a}_2^*,\hat{a}_2^*+1,\dots ,\hat{b}_3^*-1\). In addition,

$$\begin{aligned} \sum _{k=\hat{a}_2^*}^{\hat{b}_3^*-1}B_k =-\mathrm{Res }_{t=\infty }\hat{R}(t)=0 \end{aligned}$$
(22)

by the residue sum theorem.
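The closed form (21) and the vanishing (22) can be tested in exact arithmetic. The Python sketch below is an illustration with our own helper names: it treats \(\hat{R}(t)(t+k)^2\) as a product of linear factors and reads off its value and derivative at \(t=-k\). The parameters are those of \(\hat{R}(t)\) from the proof of Proposition 2 with \(n=1\), namely \((\hat{a}_0;\hat{a}_1,\hat{a}_2,\hat{a}_3)=(13;4,5,6)\) and \((\hat{b}_0;\hat{b}_1,\hat{b}_2,\hat{b}_3)=(4;1,12,13)\), which satisfy (20).

```python
from fractions import Fraction
from math import comb, factorial

def binom(n, k):
    """Binomial coefficient that vanishes outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def coefficients(a, b, k):
    """Return (A_k, B_k): value and derivative of R-hat(t)*(t+k)^2 at t = -k."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    # Linear factors (c, e, s), each standing for (c*t + e)**s.
    fs = [(2, j, 1) for j in range(b0, a0)] + [(1, j, 1) for j in range(b1, a1)]
    fs += [(1, j, -1) for j in range(a2, b2)] + [(1, j, -1) for j in range(a3, b3)]
    value = Fraction(factorial(b2 - a2 - 1) * factorial(b3 - a3 - 1),
                     factorial(a0 - b0) * factorial(a1 - b1))
    m, logder = 2, Fraction(0)      # m: net power of (t+k) left after cancellation
    for c, e, s in fs:
        v = e - c * k               # value of the factor c*t + e at t = -k
        if v == 0:                  # the factor equals c*(t+k)
            m += s
            value *= Fraction(c) ** s
        else:
            value *= Fraction(v) ** s
            logder += Fraction(s * c, v)
    A = value if m == 0 else Fraction(0)
    B = value * logder if m == 0 else (value if m == 1 else Fraction(0))
    return A, B

def A_closed_form(a, b, k):
    """The right-hand side of (21), with d-hat = b2-hat + b3-hat."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return ((-1) ** (b2 + b3) * binom(2 * k - b0, 2 * k - a0) * binom(k - b1, k - a1)
            * binom(b2 - a2 - 1, k - a2) * binom(b3 - a3 - 1, k - a3))

a_hat = (13, 4, 5, 6)
b_hat = (4, 1, 12, 13)
ks = range(min(a_hat[2], a_hat[3]), max(b_hat[2], b_hat[3]))
print(sum(coefficients(a_hat, b_hat, k)[1] for k in ks))     # the sum in (22)
print([coefficients(a_hat, b_hat, k)[0] for k in range(6, 12)])
```

Here the binomial coefficients in (21) are understood to vanish when the lower index is negative, which happens exactly when a zero of the numerator of \(\hat{R}(t)\) lowers the order of the pole.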

The inclusions

$$\begin{aligned} D_{\max \{\hat{a}_0-\hat{b}_0,\hat{a}_1-\hat{b}_1,\hat{b}_3^*-\hat{a}_2-1,\hat{b}_3^*-\hat{a}_3-1\}}\cdot B_k\in {\mathbb {Z}} \end{aligned}$$
(23)

follow then from standard considerations; see, for example, Lemma 3 and the proof of Lemma 4 in [17]. More importantly, for primes \(p\) we have

$$\begin{aligned} \mathrm{ord }_pA_k, \, 1+\mathrm{ord }_pB_k&\ge \biggl \lfloor \frac{2k-\hat{b}_0}{p}\biggr \rfloor -\biggl \lfloor \frac{2k-\hat{a}_0}{p}\biggr \rfloor -\biggl \lfloor \frac{\hat{a}_0-\hat{b}_0}{p}\biggr \rfloor \nonumber \\&\quad +\biggl \lfloor \frac{k-\hat{b}_1}{p}\biggr \rfloor -\biggl \lfloor \frac{k-\hat{a}_1}{p}\biggr \rfloor -\biggl \lfloor \frac{\hat{a}_1-\hat{b}_1}{p}\biggr \rfloor \nonumber \\&\quad +\sum _{j=2}^3\left( \biggl \lfloor \frac{\hat{b}_j-\hat{a}_j-1}{p}\biggr \rfloor -\biggl \lfloor \frac{k-\hat{a}_j}{p}\biggr \rfloor -\biggl \lfloor \frac{\hat{b}_j-1-k}{p}\biggr \rfloor \right) \quad \qquad \end{aligned}$$
(24)

for \(k=\hat{a}_2^*,\dots ,\hat{b}_3^*-1\). These estimates on the \(p\)-adic order of the coefficients in the partial-fraction decomposition of \(\hat{R}(t)\) follow from [17, Lemmas 17 and 18].
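The terms with \(j=2,3\) in (24) have the shape of the first term of Legendre's formula for the \(p\)-adic order of a binomial coefficient. The following minimal Python sketch checks that lower bound on small cases; the helper names `ord_p`, `floor_bound` and `bound_holds` are illustrative and do not come from [17].

```python
from math import comb

def ord_p(n, p):
    """p-adic order of a positive integer n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

def floor_bound(n, m, p):
    """First term of Legendre's formula for ord_p of binomial(n, m)."""
    return n // p - m // p - (n - m) // p

def bound_holds(max_n, primes):
    """Check ord_p(binomial(n, m)) >= floor_bound(n, m, p) for 0 <= m <= n < max_n."""
    return all(
        ord_p(comb(n, m), p) >= floor_bound(n, m, p)
        for p in primes
        for n in range(max_n)
        for m in range(n + 1)
    )

print(bound_holds(60, (2, 3, 5, 7, 11)))
```

With \(n=\hat{b}_j-\hat{a}_j-1\) and \(m=k-\hat{a}_j\), the quantity `floor_bound(n, m, p)` is exactly the \(j\)th summand in the last line of (24).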

The quantity of interest in this section is

$$\begin{aligned} \hat{r}(\hat{\varvec{a}},\hat{\varvec{b}}) =\frac{(-1)^{\hat{d}}}{2\pi i}\int _{C/2-i\infty }^{C/2+i\infty }\frac{\pi }{\sin 2\pi t}\,\hat{R}(\hat{\varvec{a}},\hat{\varvec{b}};t)\,{\mathrm {d}}t, \end{aligned}$$

where \(C\) is an arbitrary real number in the interval \(-\hat{a}_0^*<C<1-\hat{b}_0\).

Proposition 3

We have

$$\begin{aligned} \hat{r}(\hat{\varvec{a}},\hat{\varvec{b}})=\hat{q}(\hat{\varvec{a}},\hat{\varvec{b}})\zeta (2)-\hat{p}(\hat{\varvec{a}},\hat{\varvec{b}}) \quad \text {with }\left\{ \begin{array}{l} \hat{q}(\hat{\varvec{a}},\hat{\varvec{b}})\in {\mathbb {Z}},\\ D_{\hat{c}_1}D_{\hat{c}_2}\hat{p}(\hat{\varvec{a}},\hat{\varvec{b}})\in {\mathbb {Z}},\\ \end{array}\right. \end{aligned}$$
(25)

where

$$\begin{aligned} \left\{ \begin{array}{ll} \hat{c}_1=\max \{\hat{a}_0-\hat{b}_0,\hat{a}_1-\hat{b}_1,\hat{b}_3^*-\hat{a}_2-1,\hat{b}_3^*-\hat{a}_3-1,2\hat{b}_2^*-\hat{a}_0^*-2\},\\ \hat{c}_2=2\hat{b}_3^*-\hat{a}_0^*-2.\\ \end{array}\right. \end{aligned}$$

Proof

We use

$$\begin{aligned} \mathrm{Res }_{t=m/2}\frac{\pi \hat{R}(t)}{\sin 2\pi t} = \frac{(-1)^m}{2}\,\hat{R}(t)\Big |_{t=m/2} \end{aligned}$$

for \(m\ge 1-\hat{a}_0^*\;\) to write

$$\begin{aligned} \hat{r}(\hat{\varvec{a}},\hat{\varvec{b}})&= -\frac{(-1)^{\hat{d}}}{2}\sum _{m=1-\hat{a}_0^*}^\infty (-1)^m\hat{R}(t)\bigg |_{t=m/2} \\&= (-1)^{\hat{d}}\sum _{k=\hat{a}_3^*}^{\hat{b}_2^*-1}2A_k\sum _{m=1-\hat{a}_0^*}^\infty \frac{(-1)^{m-1}}{(m+2k)^2} +(-1)^{\hat{d}}\sum _{k=\hat{a}_2^*}^{\hat{b}_3^*-1}B_k\sum _{m=1-\hat{a}_0^*}^\infty \frac{(-1)^{m-1}}{m+2k} \\&= 2\sum _{\ell =1}^\infty \frac{(-1)^{\ell -1}}{\ell ^2}\cdot (-1)^{\hat{d}}\sum _{k=\hat{a}_3^*}^{\hat{b}_2^*-1}A_k -(-1)^{\hat{d}}\sum _{k=\hat{a}_3^*}^{\hat{b}_2^*-1}2A_k\sum _{\ell =1}^{2k-\hat{a}_0^*}\frac{(-1)^{\ell -1}}{\ell ^2} \\&\quad +\sum _{\ell =1}^\infty \frac{(-1)^{\ell -1}}{\ell }\cdot (-1)^{\hat{d}}\sum _{k=\hat{a}_2^*}^{\hat{b}_3^*-1}B_k -(-1)^{\hat{d}}\sum _{k=\hat{a}_2^*}^{\hat{b}_3^*-1}B_k\sum _{\ell =1}^{2k-\hat{a}_0^*}\frac{(-1)^{\ell -1}}{\ell } \\&= \zeta (2)\cdot (-1)^{\hat{d}}\sum _{k=\hat{a}_3^*}^{\hat{b}_2^*-1}A_k -(-1)^{\hat{d}}\sum _{k=\hat{a}_3^*}^{\hat{b}_2^*-1}2A_k\sum _{\ell =1}^{2k-\hat{a}_0^*}\frac{(-1)^{\ell -1}}{\ell ^2} \\&\quad -(-1)^{\hat{d}}\sum _{k=\hat{a}_2^*}^{\hat{b}_3^*-1}B_k\sum _{\ell =1}^{2k-\hat{a}_0^*}\frac{(-1)^{\ell -1}}{\ell }, \end{aligned}$$

where the equality (22) was used. In view of the inclusions (21), (23), the representation of \(\hat{r}(\hat{\varvec{a}},\hat{\varvec{b}})\) obtained above implies the form (25). \(\square \)
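For convenience, the two elementary evaluations behind the last equality of the computation can be spelled out as follows:

$$\begin{aligned} 2\sum _{\ell =1}^\infty \frac{(-1)^{\ell -1}}{\ell ^2} =2\biggl (\sum _{\ell =1}^\infty \frac{1}{\ell ^2}-2\sum _{\ell =1}^\infty \frac{1}{(2\ell )^2}\biggr ) =2\bigl (1-\tfrac{1}{2}\bigr )\zeta (2)=\zeta (2), \qquad \sum _{\ell =1}^\infty \frac{(-1)^{\ell -1}}{\ell }=\log 2. \end{aligned}$$

The second series is multiplied by \(\sum _kB_k\), which vanishes by (22), so that no term with \(\log 2\) survives in the linear form.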

Remark 5

The binomial expressions (8) and (21) allow us to write

$$\begin{aligned} q({\varvec{a}},{\varvec{b}})=(-1)^d\sum _{k=a_4^*}^{b_4-1}C_k \quad \text {and}\quad \hat{q}(\hat{\varvec{a}},\hat{\varvec{b}})=(-1)^{\hat{d}}\sum _{k=\hat{a}_3^*}^{\hat{b}_2^*-1}A_k \end{aligned}$$

as certain \({}_4F_3\)- and \({}_5F_4\)-hypergeometric series, respectively (see the books [3, 10] for the definition of generalized hypergeometric series). Then Whipple’s classical transformation [10, p. 65, eq. (2.4.2.3)],

$$\begin{aligned}&{}_4F_3\left( \begin{matrix} f, \, 1+f-h, \, h-a, \, -N \\ h, \, 1+f+a-h, \, g\\ \end{matrix} \biggm |1\right) =\frac{(g-f)_N}{(g)_N} \nonumber \\&\quad \times {}_5F_4\left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} a,\,-N, \, &{} 1+f-g, \, &{} \tfrac{1}{2}f, \, &{} \tfrac{1}{2}f+\tfrac{1}{2} \\ h, \, &{} 1\!+\!f\!+\!a\!-\!h, \, &{} \tfrac{1}{2}(1+f-N-g), \, &{} \tfrac{1}{2}(1\!+\!f-N-g)\!+\!\tfrac{1}{2} \\ \end{array}\biggm |1\right) \!, \end{aligned}$$
(26)

can be stated as the following identity:

$$\begin{aligned}&q\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} a_1, &{} a_2, &{} a_3, &{} a_4\\ 1, &{} a_4-a_1+1, &{} a_4-a_2+1, &{} b_4\\ \end{array}\right) \\&\quad =\hat{q}\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} b_4-a_3+a_4; &{} a_2, &{}a_1, &{} a_4\\ a_4+1; &{} a_4-a_3+1, &{} a_1+ a_2, &{} b_4\\ \end{array}\right) . \end{aligned}$$

Note that (19) is equivalent to

$$\begin{aligned}&r\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} a_1, &{} a_2, &{} a_3, &{} a_4\\ 1, &{} a_4-a_1+1, &{} a_4-a_2+1, &{} b_4\\ \end{array}\right) \\&\quad =\hat{r}\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} b_4-a_3+a_4; &{} a_2, &{}a_1, &{} a_4\\ a_4+1; &{} a_4-a_3+1, &{} a_1+a_2, &{} b_4\\ \end{array}\right) , \end{aligned}$$

so that it is Whipple’s transformation (26) that leads us to expect that the two families of linear forms in \(1\) and \(\zeta (2)\) coincide.

As in Sect. 4, we take the parameters \((\hat{\varvec{a}},\hat{\varvec{b}})\) as follows:

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} \hat{a}_0=\hat{\alpha }_0n+2, &{} \hat{a}_1=\hat{\alpha }_1n+1, &{} \hat{a}_2=\hat{\alpha }_2n+1, &{} \hat{a}_3=\hat{\alpha }_3n+1, \\ \hat{b}_0=\hat{\beta }_0n+2, &{} \hat{b}_1=\hat{\beta }_1n+1, &{} \hat{b}_2=\hat{\beta }_2n+2, &{} \hat{b}_3=\hat{\beta }_3n+2, \end{array}\right. \end{aligned}$$
(27)

where the fixed integers \(\hat{\alpha }_j\) and \(\hat{\beta }_j\), \(j=0,\dots ,3\), satisfy

$$\begin{aligned} \tfrac{1}{2}\hat{\beta }_0,\hat{\beta }_1<\tfrac{1}{2}\hat{\alpha }_0,\hat{\alpha }_1,\hat{\alpha }_2,\hat{\alpha }_3<\hat{\beta }_2,\hat{\beta }_3, \quad \hat{\alpha }_0+\hat{\alpha }_1+\hat{\alpha }_2+\hat{\alpha }_3=\hat{\beta }_0+\hat{\beta }_1+\hat{\beta }_2+\hat{\beta }_3 \end{aligned}$$

to ensure that all hypotheses (20) are satisfied. Although an analogue of Lemma 6 can be given, our principal interest in the construction of this section is purely arithmetic.

Lemma 8

Assuming the choice (27), for each prime \(p\) we have

$$\begin{aligned} \mathrm{ord }_p\hat{q}(\hat{\varvec{a}},\hat{\varvec{b}})\ge \hat{\varphi }(n/p) \quad \text {and}\quad \mathrm{ord }_p\hat{p}(\hat{\varvec{a}},\hat{\varvec{b}})\ge -2+\hat{\varphi }(n/p), \end{aligned}$$

where the (\(1\)-periodic and integer-valued) function \(\hat{\varphi }(x)\) is given by

$$\begin{aligned} \hat{\varphi }(x)=\min _{0\le y<1} \left( \begin{array}{l} \lfloor 2y-\hat{\beta }_0x\rfloor -\lfloor 2y-\hat{\alpha }_0x\rfloor -\lfloor (\hat{\alpha }_0-\hat{\beta }_0)x\rfloor \\ \quad +\lfloor y-\hat{\beta }_1x\rfloor -\lfloor y-\hat{\alpha }_1x\rfloor -\lfloor (\hat{\alpha }_1-\hat{\beta }_1)x\rfloor \\ \quad + \sum _{j=2}^3\bigl (\lfloor (\hat{\beta }_j-\hat{\alpha }_j)x\rfloor -\lfloor \hat{\beta }_jx-y\rfloor -\lfloor y-\hat{\alpha }_jx\rfloor \bigr )\\ \end{array} \right) . \end{aligned}$$

Proof

This follows from the estimates (24) and the explicit expressions for \(\hat{q}(\hat{\varvec{a}},\hat{\varvec{b}})\) and \(\hat{p}(\hat{\varvec{a}},\hat{\varvec{b}})\) given in the proof of Proposition 3: we simply assign \(y=(k-1)/p\) and minimise over \(k\). \(\square \)
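The function \(\hat{\varphi }(x)\) of Lemma 8 can be evaluated exactly at rational points, because the expression under the minimum is piecewise constant in \(y\). The Python sketch below is one possible way to do this, not the procedure used in the paper: it samples every breakpoint in \([0,1)\) and every midpoint between consecutive breakpoints. The name `phi_hat` and its default arguments are illustrative; the defaults correspond to \(\hat{\alpha }=(11,3,4,5)\), \(\hat{\beta }=(2,0,10,11)\), one admissible choice of the parameters.

```python
from fractions import Fraction as F
from math import floor

def phi_hat(x, alpha=(11, 3, 4, 5), beta=(2, 0, 10, 11)):
    """Exact value of the minimum over 0 <= y < 1 in Lemma 8 at a rational x."""
    x = F(x)
    a0, a1, a2, a3 = alpha
    b0, b1, b2, b3 = beta

    def g(y):
        # The expression under the minimum, term by term.
        t = floor(2 * y - b0 * x) - floor(2 * y - a0 * x) - floor((a0 - b0) * x)
        t += floor(y - b1 * x) - floor(y - a1 * x) - floor((a1 - b1) * x)
        for a, b in ((a2, b2), (a3, b3)):
            t += floor((b - a) * x) - floor(b * x - y) - floor(y - a * x)
        return t

    # Points of [0, 1] where some floor term may jump as y varies.
    pts = {F(0), F(1)}
    for c in (b1, a1, a2, a3, b2, b3):
        pts.add((c * x) % 1)              # jumps of floor(y - c x), floor(c x - y)
    for c in (b0, a0):
        half = (c * x / 2) % F(1, 2)      # jumps of floor(2 y - c x)
        pts.update((half, half + F(1, 2)))
    pts = sorted(pts)

    # g is constant between consecutive breakpoints, so breakpoints
    # together with midpoints cover all values taken on [0, 1).
    samples = [p for p in pts if p < 1]
    samples += [(u + v) / 2 for u, v in zip(pts, pts[1:])]
    return min(g(y) for y in samples)

for x in (F(1, 10), F(1, 6), F(3, 10), F(1, 2)):
    print(x, phi_hat(x))
```

Working with `Fraction` avoids rounding errors at the endpoints of the intervals on which the minimum is constant.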

Note that the special choice of parameters \((\hat{\varvec{a}},\hat{\varvec{b}})\),

$$\begin{aligned}\left\{ \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} \hat{a}_0=11n+2, &{} \hat{a}_1=3n+1, &{} \hat{a}_2=4n+1, &{} \hat{a}_3=5n+1, \\ \hat{b}_0=2n+2, &{} \hat{b}_1=1, &{} \hat{b}_2=10n+2, &{} \hat{b}_3=11n+2,\\ \end{array}\right. \end{aligned}$$

results in the linear forms

$$\begin{aligned} \hat{r}_n=\hat{r}(\hat{\varvec{a}},\hat{\varvec{b}})=\hat{q}_n\zeta (2)-\hat{p}_n, \end{aligned}$$

which are related, by Proposition 2, to the linear forms (12), (13) as follows:

$$\begin{aligned} r_n=q_n\zeta (2)-p_n=\hat{q}_n\zeta (2)-\hat{p}_n, \end{aligned}$$

so that \(q_n=\hat{q}_n\) and \(p_n=\hat{p}_n\) for \(n=0,1,2,\dots \) .
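In the notation of (27), this special choice corresponds to \(\hat{\alpha }=(11,3,4,5)\) and \(\hat{\beta }=(2,0,10,11)\), and the conditions imposed on these integers are readily checked:

$$\begin{aligned} \tfrac{1}{2}\hat{\beta }_0=1,\;\hat{\beta }_1=0 \;<\; \tfrac{1}{2}\hat{\alpha }_0=\tfrac{11}{2},\;\hat{\alpha }_1=3,\;\hat{\alpha }_2=4,\;\hat{\alpha }_3=5 \;<\; \hat{\beta }_2=10,\;\hat{\beta }_3=11, \qquad 11+3+4+5=23=2+0+10+11. \end{aligned}$$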

Then with the help of Lemma 8, we find that

$$\begin{aligned} \hat{\Phi }_n^{-1}q_n=\hat{\Phi }_n^{-1}\hat{q}_n\in {\mathbb {Z}} \quad \text {and}\quad \hat{\Phi }_n^{-1}D_{9n}D_{8n}p_n=\hat{\Phi }_n^{-1}D_{9n}D_{8n}\hat{p}_n\in {\mathbb {Z}}, \end{aligned}$$

where

$$\begin{aligned} \hat{\Phi }_n=\prod _{p\le 8n}p^{\hat{\varphi }(n/p)} \end{aligned}$$

and

$$\begin{aligned} \hat{\varphi }(x)&= \min \limits _{0\le y<1} \left( \begin{array}{l} \lfloor 2y-2x\rfloor -\lfloor 2y-11x\rfloor -\lfloor 9x\rfloor +\lfloor y\rfloor \\ \quad -\lfloor y-3x\rfloor -\lfloor 3x\rfloor +\lfloor 6x\rfloor -\lfloor 10x-y\rfloor \\ \quad -\lfloor y-4x\rfloor +\lfloor 6x\rfloor -\lfloor 11x-y\rfloor -\lfloor y-5x\rfloor \\ \end{array} \right) \nonumber \\&= {\left\{ \begin{array}{ll} 2 &{}\text {if}\,x\in \bigl [\frac{1}{6},\frac{2}{9}\bigr )\cup \bigl [\frac{1}{2},\frac{5}{9}\bigr )\cup \bigl [\frac{5}{6},\frac{7}{8}\bigr ),\\ 1 &{}\text {if}\,x\in \bigl [\frac{2}{9},\frac{4}{9}\bigr )\cup \bigl [\frac{5}{9},\frac{7}{9}\bigr )\cup \bigl [\frac{7}{8},\frac{8}{9}\bigr ),\\ 0 &{}\text {otherwise},\\ \end{array}\right. } \end{aligned}$$
(28)

so that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\log \hat{\Phi }_n}{n}=7.03418177\dots . \end{aligned}$$

Comparing (17) and (28) we find that \(\varphi (x)\ge \hat{\varphi }(x)\) for all \(x\in [0,1)\) except for \(x\in \bigl [\frac{1}{5},\frac{2}{9}\bigr )\cup \bigl [\frac{3}{7},\frac{4}{9}\bigr )\cup \bigl [\frac{6}{7},\frac{7}{8}\bigr )\). This means that with the choice

$$\begin{aligned} \tilde{\Phi }_n=\prod _{p\le 8n}p^{\tilde{\varphi }(n/p)} \end{aligned}$$

where

$$\begin{aligned} \tilde{\varphi }(x) =\max \{\varphi (x),\hat{\varphi }(x)\} ={\left\{ \begin{array}{ll} 2 &{}\text {if}\,x\in \bigl [\frac{1}{6},\frac{2}{9}\bigr )\cup \bigl [\frac{1}{4},\frac{2}{7}\bigr )\cup \bigl [\frac{1}{2},\frac{4}{7}\bigr )\cup \bigl [\frac{5}{6},\frac{7}{8}\bigr ),\\ 1 &{}\text {if}\,x\in \bigl [\frac{1}{8},\frac{1}{7}\bigr )\cup \bigl [\frac{2}{9},\frac{1}{4}\bigr )\cup \bigl [\frac{2}{7},\frac{4}{9}\bigr )\cup \bigl [\frac{4}{7},\frac{5}{6}\bigr )\cup \bigl [\frac{7}{8},\frac{8}{9}\bigr ),\\ 0 &{}\text {otherwise},\\ \end{array}\right. } \end{aligned}$$

we have the inclusions

$$\begin{aligned} \tilde{\Phi }_n^{-1}q_n,\,\tilde{\Phi }_n^{-1}D_{9n}D_{8n}p_n\in {\mathbb {Z}} \quad \text {for}\; n=0,1,2,\dots , \end{aligned}$$
(29)

and

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\log \tilde{\Phi }_n}{n} =8.79117698\dots . \end{aligned}$$
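The limits of \(\log \hat{\Phi }_n/n\) and \(\log \tilde{\Phi }_n/n\) can be reproduced numerically. The Python sketch below assumes the standard consequence of the prime number theorem that \(\frac{1}{n}\sum _p\varphi (n/p)\log p\rightarrow \int _0^1\varphi (x)\,{\mathrm {d}}\psi (x)\) for a \(1\)-periodic step function \(\varphi \), where \(\psi \) is the logarithmic derivative of the gamma function. Since both \(\hat{\varphi }\) and \(\tilde{\varphi }\) vanish on \([0,\frac{1}{8})\), the restriction \(p\le 8n\) does not change the limit. The names `digamma`, `limit`, `PHI_HAT` and `PHI_TILDE` are illustrative.

```python
import math

def digamma(x):
    """psi(x) for x > 0: upward recurrence, then the asymptotic expansion."""
    s = 0.0
    while x < 10.0:
        s -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ log x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    tail = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return s + math.log(x) - 0.5 / x - tail

def limit(pieces):
    """Integral of a step function against d(psi); pieces are (value, lo, hi)."""
    return sum(v * (digamma(hi) - digamma(lo)) for v, lo, hi in pieces)

# Step function (28).
PHI_HAT = [
    (2, 1/6, 2/9), (2, 1/2, 5/9), (2, 5/6, 7/8),
    (1, 2/9, 4/9), (1, 5/9, 7/9), (1, 7/8, 8/9),
]

# Step function max{phi, phi_hat} as tabulated in the text.
PHI_TILDE = [
    (2, 1/6, 2/9), (2, 1/4, 2/7), (2, 1/2, 4/7), (2, 5/6, 7/8),
    (1, 1/8, 1/7), (1, 2/9, 1/4), (1, 2/7, 4/9), (1, 4/7, 5/6), (1, 7/8, 8/9),
]

print(round(limit(PHI_HAT), 6), round(limit(PHI_TILDE), 6))
```

Double precision suffices here, because the omitted terms of the asymptotic expansion are of order \(10^{-10}\) once the argument exceeds \(10\).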

7 Finale: Proof of Theorem 1 and concluding remarks

Here is the

Proof of Theorem 1

In the course of our study, we constructed the forms

$$\begin{aligned} r_n=q_n\zeta (2)-p_n, \quad n=0,1,2,\dots , \end{aligned}$$

such that their rational coefficients \(q_n\) and \(p_n\) satisfy (29), while the growth of \(r_n\) and \(q_n\) as \(n\rightarrow \infty \) is determined by (15). Denoting

$$\begin{aligned} \tilde{C}_2=\lim _{n\rightarrow \infty }\frac{\log (\tilde{\Phi }_n^{-1}D_{9n}D_{8n})}{n}=8.20882301\dots \end{aligned}$$

and applying [6, Lemma 2.1] we arrive at the estimate

$$\begin{aligned} \mu (\zeta (2))\le \frac{C_0+C_1}{C_0-\tilde{C}_2}=5.09541178\dots \end{aligned}$$

for the irrationality exponent of \(\zeta (2)=\pi ^2/6\). \(\square \)
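The numerical value of \(\tilde{C}_2\) can be traced as follows, assuming that \(D_N\) denotes the least common multiple of \(1,2,\dots ,N\), so that \(\log D_N/N\rightarrow 1\) as \(N\rightarrow \infty \) by the prime number theorem:

$$\begin{aligned} \tilde{C}_2 =\lim _{n\rightarrow \infty }\frac{\log D_{9n}}{n}+\lim _{n\rightarrow \infty }\frac{\log D_{8n}}{n}-\lim _{n\rightarrow \infty }\frac{\log \tilde{\Phi }_n}{n} =9+8-8.79117698\dots =8.20882301\dots . \end{aligned}$$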

As discussed in [4], the sequence of approximations \(r_n=q_n\zeta (2)-p_n\) constructed in the proof of Theorem 1 can be complemented with the satellite sequence

$$\begin{aligned} r_n'=q_n\zeta (3)-p_n' \end{aligned}$$

of rational approximations to \(\zeta (3)\), which satisfy

$$\begin{aligned} \tilde{\Phi }_n^{-1}D_{9n}D_{8n}^2p_n'\in {\mathbb {Z}}\quad \text{ for }\;n=0,1,2,\dots \end{aligned}$$

and

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{\log |r_n'|}{n}=-C_0=-15.88518998\dots \end{aligned}$$

(cf. (15)). Because

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\log (\tilde{\Phi }_n^{-1}D_{9n}D_{8n}^2)}{n}=16.20882301\ldots >C_0, \end{aligned}$$

the linear forms

$$\begin{aligned} \tilde{\Phi }_n^{-1}D_{9n}D_{8n}^2r_n'\in {\mathbb {Z}}\zeta (3)+{\mathbb {Z}} \end{aligned}$$

are unbounded as \(n\rightarrow \infty \) and, therefore, cannot be used for proving the irrationality of \(\zeta (3)\) (which would in this case also lead to the \(\mathbb Q\)-linear independence of \(1\), \(\zeta (2)\) and \(\zeta (3)\)).
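Quantitatively, assuming again that \(\log D_N/N\rightarrow 1\) as \(N\rightarrow \infty \), the rate of growth of these forms is

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{\log |\tilde{\Phi }_n^{-1}D_{9n}D_{8n}^2r_n'|}{n} =(9+2\cdot 8-8.79117698\dots )-15.88518998\dots =0.3236\dots >0. \end{aligned}$$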

With the help of the recurrence equation used in our proof of Proposition 2 and satisfied by the sequences

$$\begin{aligned} q_n, \quad r_n=q_n\zeta (2)-p_n\quad \text{ and } \quad r_n'=q_n\zeta (3)-p_n', \end{aligned}$$

we computed the first 300 terms of the sequence

$$\begin{aligned} \Lambda _n=\gcd (\tilde{\Phi }_n^{-1}D_{9n}D_{8n}^2q_n,\tilde{\Phi }_n^{-1}D_{9n}D_{8n}^2p_n,\tilde{\Phi }_n^{-1}D_{9n}D_{8n}^2p_n'), \quad n=0,1,2,\dots . \end{aligned}$$

The primes involved in the factorisation of \(\Lambda _n\) do not seem to possess a structural dependence on \(n\), and for the majority of values of \(n\) these primes \(p\) lie in the (asymptotically negligible) range \(p\le \sqrt{8n}\). Nevertheless, the absolute values of the forms

$$\begin{aligned} (\Lambda _n\tilde{\Phi }_n)^{-1}D_{9n}D_{8n}^2r_n\in {\mathbb {Z}}\zeta (2)+{\mathbb {Z}} \quad \text {and}\quad (\Lambda _n\tilde{\Phi }_n)^{-1}D_{9n}D_{8n}^2r_n'\in {\mathbb {Z}}\zeta (3)+{\mathbb {Z}} \end{aligned}$$

happen to be simultaneously less than 1 for

$$\begin{aligned} n&= 1,\dots ,21,23,\dots ,35,37,38,39,41,42,43,47,\dots ,50,53,54,64,68,\\&\quad 71,\dots ,74,79,80,81,84,85,89,101,102,106,110,113,128,129,178,228 \end{aligned}$$

in the range \(n\le 300\).

It would be nice to investigate arithmetically the other classical hypergeometric instances from Bailey’s and Slater’s books [3, 10]: the philosophy is that behind any hypergeometric transformation there is some interesting arithmetic. Already the previously achieved irrationality measure for \(\zeta (2)\) in [13] and the best known irrationality measure for \(\zeta (3)\) in [14], both due to Rhin and Viola, have deep hypergeometric roots (see [17]). Another example in this direction is the hypergeometric construction of rational approximations to \(\zeta (4)\) in [16].