1 Introduction

In physical processes, infinite-time averages are often of more interest than particular solutions. Their dependence on parameters is of fundamental theoretical interest and also of practical importance. The derivation of bounds for such averages by algebraic optimization has received increasing attention in recent years (Fantuzzi 2016; Goluskin 2018; Olson et al. 2021; Olson and Doering 2022). Bounds derived by estimates need not be sharp, and even sharp bounds may be misleading if they are realized by dynamically unstable states (Wen et al. 2022a, b). Inspired by Souza and Doering (2015), Goluskin (2018) and Olson et al. (2021), in this paper we study such a situation in the regime of large Rayleigh number \(\rho \gg 1\) for the famous Lorenz equations and variants thereof. The Lorenz equations,

$$\begin{aligned} \begin{array}{l} X' = \sigma (Y -X) \\ Y' = \rho X - Y -XZ \\ Z' = -\beta Z + XY, \end{array} \end{aligned}$$
(1)

arose first in the context of atmospheric convection. In the original derivation (Lorenz 1963), Lorenz considered a fluid in a periodic box, heated from below and cooled from above, and obtained (1) from a PDE model by retaining only the lowest order Fourier modes. The parameters in (1) stem from the PDE, where \(\sigma >0\) is the Prandtl number, characterizing the viscosity of the fluid, and where \(\beta > 0\) is a shape parameter, measuring the ratio of the length of the box to its height. Often of primary interest is the parameter \(\rho \ge 0\), which is the rescaled Rayleigh number, measuring the intensity of the heating. The rescaling is chosen such that a bifurcation occurs at \(\rho =1\), and indeed the Lorenz model accurately captures the onset of steady atmospheric convection rolls for \(\rho >1\). For larger values of \(\rho \), the Lorenz equations do not accurately capture the full PDE model, but the relation between the hierarchy of higher-order mode truncations and the convection PDE model continues to be of interest, e.g., Felicio and Rech (2018), Park (2021), Olson et al. (2021), Olson and Doering (2022).

Although physically unrealistic for large regions of parameter space, (1) is frequently used as a benchmark and test bed for nonlinear dynamics, in particular in the famous chaotic regime, but also in the context of time averages (Olson et al. 2021). Of particular interest is the average

$$\begin{aligned} H(\rho ,\beta , \sigma , {\textbf {X}}_0) = \limsup _{t \rightarrow \infty } \frac{1}{t} \int \nolimits _0^t X(s) Y(s) \, ds, \end{aligned}$$
(2)

which we refer to as transport, following Souza and Doering (2015). The quantity H is the mode-truncated form of the excess heat transport, which defines the Nusselt number up to a scaling factor and a constant shift. H is well defined due to the dissipation at infinity in (1) and in particular depends on the initial condition \({\textbf {X}}_0\) of the solution \((X, Y, Z)(t)\) of (1). The dependence of the Nusselt number on \(\rho \) in the full PDE model is of major physical interest, but is difficult to determine or bound analytically and numerically (Wen et al. 2022a, b). Examining H and its parameter dependence in the simplified context of the Lorenz equations provides a tractable, non-trivial case study which can provide insight into the analysis of the Nusselt number for the full PDE. As such, an optimal bound for H has been a longstanding question, which was settled by Souza and Doering (2015). They proved that transport is maximal in the non-trivial equilibria of (1), which emerge in the bifurcation at \(\rho =1\). However, since these equilibria are unstable in large parameter regimes, the question remains what values the transport takes in attractors.

The inclusion of stability in the study of transport and the resulting scaling for \(\rho \gg 1\) is our main motivation for this paper. The relation to stability is particularly clear in the case \(0\le \rho \le 1\), where for any value of \(\sigma , \beta \) the system admits a Lyapunov function and the origin is the global attractor (see for instance (Sparrow 1982)). Hence, for any initial condition we obtain \(H = 0\). For higher Rayleigh numbers, the functional form of transport becomes more complicated. At \(\rho =1\), a pitchfork bifurcation occurs, where the aforementioned nonzero fixed points \({\textbf {X}}_{\pm } = (\pm \sqrt{\beta (\rho -1)}, \pm \sqrt{\beta (\rho - 1)}, \rho - 1)\) emerge. For \(\rho \) sufficiently near 1, these nonzero fixed points seem to attract every trajectory except for \({\textbf {X}}_0\) belonging to the stable manifold of the origin, \(W^s({\textbf {0}})\), so that

$$\begin{aligned} H(\rho ,\beta ,\sigma ,{\textbf {X}}_0) = \left\{ \begin{array}{cl} 0 &{} \text {if } {\textbf {X}}_0 \in W^s({\textbf {0}}), \\ H_{\pm }(\rho ,\beta ):=\beta ( \rho - 1) &{} \text {otherwise. } \end{array} \right. \end{aligned}$$

Here, \(W^s({\textbf {0}})\) is a surface of dimension 2, which means almost all initial conditions give a positive transport, namely that of the fixed points \(H_{\pm }(\rho ,\beta )\). This is certainly the case for initial data in their non-trivial basins of attraction.

Upon increasing \(\rho \) further, additional periodic orbits emerge and further complicate the function H. It has been noticed in Sparrow (1982) that a decisive parameter for (1) at higher \(\rho \) is

$$\begin{aligned} \lambda = \frac{\sigma + 1}{\beta + 2}. \end{aligned}$$

For \(0 < \lambda \le 1\), the fixed points \({\textbf {X}}_{\pm }\) are locally stable for all \(\rho \), cf. Sparrow (1982) so that the fixed point transport \(H_{\pm }\) is observed at least for \({\textbf {X}}_{0}\) belonging to a basin of attraction of positive measure. On the other hand for \(\lambda > 1\), the fixed points are locally stable only for \(1< \rho < \rho ^* = \sigma (\sigma + \beta + 3)/(\sigma - \beta - 1)\). As \(\rho \) is increased through \(\rho ^*\), the fixed points \({\textbf {X}}_{\pm }\) lose stability via a sub-critical Andronov–Hopf bifurcation and at least for some open set containing \(\beta = 8/3\), \(\sigma = 10\), generic initial conditions give chaotic solutions (Tucker 1999).
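As a small illustration of the two quantities just introduced, the following sketch evaluates \(\lambda \) and the Hopf threshold \(\rho ^*\) for given \(\sigma , \beta \). The function names `lam` and `rho_star` are hypothetical, and returning infinity for \(\lambda \le 1\) is merely a convention chosen here to encode that no loss of stability occurs in that case.

```python
def lam(sigma, beta):
    """Parameter lambda = (sigma + 1) / (beta + 2) from the text."""
    return (sigma + 1.0) / (beta + 2.0)


def rho_star(sigma, beta):
    """Hopf threshold rho* = sigma (sigma + beta + 3) / (sigma - beta - 1).

    For lambda <= 1 (equivalently sigma <= beta + 1) the fixed points stay
    locally stable for all rho; we encode this as an infinite threshold
    (a convention of this sketch, not of the paper).
    """
    if sigma <= beta + 1.0:
        return float("inf")
    return sigma * (sigma + beta + 3.0) / (sigma - beta - 1.0)


if __name__ == "__main__":
    sigma, beta = 10.0, 8.0 / 3.0  # classical parameter values
    print("lambda =", lam(sigma, beta))
    print("rho*   =", rho_star(sigma, beta))
```

For the classical values \(\sigma = 10\), \(\beta = 8/3\) this gives \(\lambda = 33/14\) and \(\rho ^* = 470/19 \approx 24.74\).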

As mentioned, it was proven in Souza and Doering (2015) that despite this complexity, for all \(\rho > 1\) and any \(\sigma , \beta \in {\mathbb {R}}_{>0}\), \({\textbf {X}}_0 \in {\mathbb {R}}^3\) one has the simple bound

$$\begin{aligned} H(\rho ,\beta ,\sigma ,{\textbf {X}}_0) \le H_{\pm }(\rho ,\beta ). \end{aligned}$$

For the Lorenz equations, this bound is actually sharp as it is realized by the steady states \({\textbf {X}}_\pm \). However, since \({\textbf {X}}_\pm \) are unstable for \(\rho>\rho ^*, \lambda >1\), the transport that is realized by typical solutions might be much lower. Indeed, numerical experiments presented in Souza and Doering (2015) indicate that for \(\rho >\rho ^*\) a gap \(\Delta H = H_{\pm } - H>0\) occurs, cf. Fig. 1. To the best of our knowledge, quantitative results for the size of the observed transport gap at large Rayleigh number, which we provide in this paper, have not been established previously.
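The transport (2) and the gap \(\Delta H\) can be estimated numerically. The following is a minimal sketch, not the numerical method used for the figures: it integrates (1) with a fixed-step classical Runge-Kutta scheme, discards an initial transient, and averages \(XY\) over a finite window as a proxy for the infinite-time average. The step size, transient length, window length and initial condition are assumptions chosen for illustration.

```python
def lorenz(state, sigma, rho, beta):
    """Right-hand side of the Lorenz equations (1)."""
    x, y, z = state
    return (sigma * (y - x), rho * x - y - x * z, -beta * z + x * y)


def rk4_step(state, dt, sigma, rho, beta):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = lorenz(state, sigma, rho, beta)
    k2 = lorenz(shifted(k1, 0.5 * dt), sigma, rho, beta)
    k3 = lorenz(shifted(k2, 0.5 * dt), sigma, rho, beta)
    k4 = lorenz(shifted(k3, dt), sigma, rho, beta)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def transport(rho, sigma=10.0, beta=8.0 / 3.0, x0=(1.0, 1.0, 1.0),
              dt=0.01, burn_in=50.0, window=200.0):
    """Finite-window estimate of the time average of X*Y after a transient."""
    state = x0
    for _ in range(int(round(burn_in / dt))):
        state = rk4_step(state, dt, sigma, rho, beta)
    n = int(round(window / dt))
    total = 0.0
    for _ in range(n):
        total += state[0] * state[1]
        state = rk4_step(state, dt, sigma, rho, beta)
    return total / n


def steady_transport(rho, beta=8.0 / 3.0):
    """Transport beta (rho - 1) of the nonzero fixed points (zero for rho <= 1)."""
    return beta * (rho - 1.0) if rho > 1.0 else 0.0


if __name__ == "__main__":
    for rho in (0.5, 10.0, 28.0):
        h = transport(rho)
        print(f"rho={rho}: H~{h:.4f}, gap~{steady_transport(rho) - h:.4f}")
```

With these settings one expects the estimate to vanish for \(\rho = 0.5\), to coincide with \(H_\pm \) for \(\rho = 10 < \rho ^*\), and to fall visibly below \(H_\pm \) in the chaotic regime at \(\rho = 28\). In the last case the finite-window value fluctuates with the window and the initial condition.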

Fig. 1

a Plot of the transport \(H(\rho ,\beta ,\sigma ,{\textbf {X}}_0)\) for \(\beta = \frac{8}{3}\), \(\sigma = 10\) near the transition to chaos \(\rho \approx \rho ^*\) depicting the steady convection rate \(H_{\pm }(\rho ,\beta )\) (orange) and the rate obtained from a numerical simulation of (1) with randomly selected initial conditions \({\textbf {X}}_0\) (blue). b Plot of the transport gap \(\Delta H(\rho ) = H_{\pm }(\rho ) - H(\rho )\) for \(\beta =\frac{8}{3}\), \(\sigma = 10\) and for \(\rho \) large. The vertical lines mark bifurcations that bound the region of ‘large \(\rho \)’ as in Fig. 5b, in particular for \(\rho \) above the blue line the transport is dominated by stable periodic orbits (Color figure online)

It has been observed already in Robbins (1979) that for sufficiently large \(\rho \) the chaotic attractor collapses and a periodic attractor occurs, but it appears that the implications for transport and other time averages have not been studied. Moreover, we found the arguments given in Robbins (1979) and also Li and Zhang (1993) to be incomplete. Sparrow (1982) devotes a chapter to the regime of large \(\rho \) and obtains various results based on a formal application of the method of averaging. In this paper, we rigorously confirm several of these results and provide a complete proof of the existence and stability of symmetric periodic attractors \({\textbf {X}}_{\textrm{sym}}={\textbf {X}}_{\textrm{sym}}(\rho ,\beta ,\sigma )\) for sufficiently large \(\rho \) and their dependence on \(\lambda >2/3\). We also prove the existence and instability of a pair of asymmetric periodic orbits and obtain some results on homoclinic orbits. Our analysis is based on the observation that the well-known limit system as \(\rho \rightarrow \infty \) possesses a Hamiltonian structure that allows one to apply the extended Melnikov theory of Wiggins and Holmes (1987b).

For the periodic orbits, it becomes tractable to compute the transport analytically and we quantify the gap \(\Delta H\) to leading order: We show that \(H_{\textrm{sym}}(\rho ,\beta ,\sigma ):=H(\rho , \beta ,\sigma ,{\textbf {X}}_{\textrm{sym}})\) is a monotone increasing function of \(\lambda \), for fixed \(\rho \gg 1\), whose range is a subinterval of \((0,H_\pm )\) that limits to this interval as \(\rho \rightarrow \infty \). Specifically, \(H(\rho ,\beta ,\sigma ,{\textbf {X}}_{\textrm{sym}})\sim \rho \) for \(\rho \gg 1\), just as \(H_\pm \), but with a \(\lambda \)-dependent downshift that can bring it arbitrarily close to zero, and we provide the leading-order term of the downshift in terms of elliptic integrals. The resulting bifurcation diagram contains a hysteresis loop between \({\textbf {X}}_\pm \) and \({\textbf {X}}_{\textrm{sym}}\) in terms of \(\lambda \). This highlights the difficulty of recovering from low transport once \(\lambda \) grows beyond the ‘tipping point’ at \(\lambda =1\). Moreover, for any \(\gamma \in (0,1]\) we can choose \(\sigma (\rho )\), so that \(H_{\textrm{sym}}(\rho ,\beta ,\sigma (\rho ))\sim \rho ^\gamma \) along the periodic attractors \({\textbf {X}}_{\textrm{sym}}\).

We employ numerical path-following to corroborate the analytical results for large fixed \(\rho \) and find that for large \(\lambda \) the symmetric periodic orbits terminate in a symmetric heteroclinic cycle, akin to cycles found in a different regime in Sparrow (1982). We also compute the stability boundary of the symmetric orbits in the \((\lambda ,\rho )\)-plane and find that it extends to values of \(\rho \) below 200. In Fig. 1, this region begins near \(\rho = 313\). It is well known that beyond this boundary various period-doubling bifurcations occur (Robbins 1979).

Finally, we turn to variants and extensions of the Lorenz equations, identify regimes in which our analytical results remain valid, and explicitly illustrate this for the Lorenz–Stenflo system.

2 Periodic Orbits at Large Rayleigh Number

It is well known that (1) possesses a semi-Lyapunov function at infinity, cf. Souza and Doering (2015), and thus, a bounded trapping region exists. However, \({\textbf {X}}_\pm \) grow unboundedly with \(\rho \), and as pointed out by Sparrow (1982, Chapter 7), see also Robbins (1979), a suitable scaling of the variables with respect to \(\rho \) is given by

$$\begin{aligned} \varepsilon = \rho ^{-\frac{1}{2}}, \;\; X = \varepsilon ^{-1} \xi , \;\; Y = \varepsilon ^{-2}\sigma ^{-1} \eta , \;\; Z = \varepsilon ^{-2} (\sigma ^{-1} \zeta + 1), \;\; t = \varepsilon \tau , \end{aligned}$$
(3)

which yields the equivalent system

$$\begin{aligned} \dot{\xi }&= \eta - \varepsilon \sigma \xi \nonumber \\ \dot{\eta }&= -\xi \zeta - \varepsilon \eta \nonumber \\ \dot{\zeta }&= \xi \eta - \varepsilon \beta (\zeta + \sigma ). \end{aligned}$$
(4)

Notably, the symmetry \((X,Y,Z) \mapsto (-X,-Y,Z)\) of (1) turns into \((\xi ,\eta ,\zeta ) \mapsto (-\xi ,-\eta ,\zeta )\) in (4).
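The equivalence of (1) and (4) under the scaling (3) can be checked numerically at a sample point. The sketch below maps \((\xi ,\eta ,\zeta )\) to \((X,Y,Z)\), evaluates the right-hand side of (1), and converts the derivatives back with the chain rule, using \(d/d\tau = \varepsilon \, d/dt\). The function names and the sample values are assumptions made for illustration.

```python
def lorenz_rhs(X, Y, Z, sigma, rho, beta):
    """Right-hand side of the Lorenz equations (1)."""
    return (sigma * (Y - X), rho * X - Y - X * Z, -beta * Z + X * Y)


def scaled_rhs(xi, eta, zeta, sigma, beta, eps):
    """Right-hand side of the rescaled system (4)."""
    return (
        eta - eps * sigma * xi,
        -xi * zeta - eps * eta,
        xi * eta - eps * beta * (zeta + sigma),
    )


def scaled_rhs_via_lorenz(xi, eta, zeta, sigma, rho, beta):
    """Derive d(xi, eta, zeta)/dtau from (1) through the scaling (3)."""
    eps = rho ** -0.5
    X = xi / eps
    Y = eta / (eps ** 2 * sigma)
    Z = (zeta / sigma + 1.0) / eps ** 2
    dX, dY, dZ = lorenz_rhs(X, Y, Z, sigma, rho, beta)
    # xi = eps X, eta = eps^2 sigma Y, zeta = sigma (eps^2 Z - 1), t = eps tau
    return (eps ** 2 * dX, eps ** 3 * sigma * dY, eps ** 3 * sigma * dZ)


def scaling_mismatch(xi, eta, zeta, sigma, rho, beta):
    """Largest componentwise difference between the two right-hand sides."""
    direct = scaled_rhs(xi, eta, zeta, sigma, beta, rho ** -0.5)
    pulled = scaled_rhs_via_lorenz(xi, eta, zeta, sigma, rho, beta)
    return max(abs(a - b) for a, b in zip(direct, pulled))


if __name__ == "__main__":
    print(scaling_mismatch(0.7, -1.3, 0.4, sigma=10.0, rho=400.0, beta=8.0 / 3.0))
```

The mismatch should be at the level of floating-point rounding.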

2.1 Hamiltonian Structure and the Extended Melnikov Theory

It is well known that the limiting system at \(\varepsilon =0\),

$$\begin{aligned} \dot{\xi }&= \eta \nonumber \\ \dot{\eta }&= -\xi \zeta \nonumber \\ \dot{\zeta }&= \xi \eta , \end{aligned}$$
(5)

is integrable with the conserved quantities

$$\begin{aligned} A = \frac{1}{2} \xi ^2 - \zeta , \;\; B = (\eta ^2 + \zeta ^2)^{1/2}, \end{aligned}$$
(6)

and we recap known results, essentially as provided in Sparrow (1982), in preparation of the existence and stability proofs.
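Conservation of \(A\) and \(B\) along solutions of (5) follows from \(\dot{A} = \xi \eta - \xi \eta = 0\) and \(\eta \dot{\eta } + \zeta \dot{\zeta } = 0\). The sketch below illustrates this numerically by integrating (5) with a fixed-step Runge-Kutta scheme and recording the drift of both quantities. The initial condition, step size and integration time are assumptions for illustration.

```python
import math


def limit_rhs(state):
    """Right-hand side of the limiting system (5)."""
    xi, eta, zeta = state
    return (eta, -xi * zeta, xi * eta)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step of size dt for (5)."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = limit_rhs(state)
    k2 = limit_rhs(shifted(k1, 0.5 * dt))
    k3 = limit_rhs(shifted(k2, 0.5 * dt))
    k4 = limit_rhs(shifted(k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def invariants(state):
    """Conserved quantities (A, B) from (6)."""
    xi, eta, zeta = state
    return (0.5 * xi * xi - zeta, math.hypot(eta, zeta))


def max_drift(state, dt=1e-3, steps=20000):
    """Largest deviation of A and B from their initial values along the run."""
    a0, b0 = invariants(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        a, b = invariants(state)
        worst = max(worst, abs(a - a0), abs(b - b0))
    return worst


if __name__ == "__main__":
    # (xi, eta, zeta) = (1, 0, 1) has A = -1/2 and B = 1
    print(max_drift((1.0, 0.0, 1.0)))
```

The observed drift reflects only the discretization error of the integrator.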

The character of a solution is completely determined by its location in the \((A, B)\)-half-plane with \(B\ge 0\). In particular, \(B = 0\) consists of a line of equilibria. When \(B >0\) there are two domains with distinct behavior, \(D_1 =\{ 0<|A| < B\}\), \(D_2= \{A>B\}\), and two boundaries \(D_3=\{ A = B\}\), \(D_4= \{A = -B\}\), since the region \(A < -B\) has no real solutions. \(D_4\) has the simplest solutions, consisting only of the line of equilibria \(\xi = \eta = 0\), \(\zeta = B\). In other regions, the solutions are given in terms of Jacobi elliptic functions and complete elliptic integrals. Following Byrd and Friedman (1971), for a given elliptic modulus \(0< k < 1\), the complete elliptic integrals of the first and second kind are defined by

$$\begin{aligned} K(k) = \int \nolimits _0^1 \frac{1}{\sqrt{(1-t^2)(1-k^2t^2)}} dt \quad , \quad E(k) = \int \nolimits _0^1 \sqrt{\frac{1-k^2t^2}{1-t^2}} dt. \end{aligned}$$
(7)
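For numerical work, \(K(k)\) and \(E(k)\) in the modulus convention of (7) can be evaluated without special libraries. The sketch below uses the classical arithmetic-geometric mean (AGM) iteration, which is a standard technique and not something taken from the paper. The function name `ellip_ke`, the iteration cap and the stopping tolerance are assumptions.

```python
import math


def ellip_ke(k):
    """Return (K(k), E(k)) for modulus 0 <= k < 1 via the AGM iteration.

    Uses a_0 = 1, b_0 = sqrt(1 - k^2), c_0 = k and
    a_{n+1} = (a_n + b_n)/2, b_{n+1} = sqrt(a_n b_n), c_{n+1} = (a_n - b_n)/2,
    with K = pi / (2 a_inf) and E = K (1 - sum_n 2^(n-1) c_n^2).
    """
    if not 0.0 <= k < 1.0:
        raise ValueError("modulus must satisfy 0 <= k < 1")
    a, b, c = 1.0, math.sqrt(1.0 - k * k), k
    weight = 0.5                 # equals 2^(n-1) for the current index n
    series = weight * c * c      # n = 0 term of the sum
    for _ in range(40):          # quadratic convergence; 40 is a safe cap
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        weight *= 2.0
        series += weight * c * c
        if abs(c) <= 1e-15 * a:
            break
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - series)


if __name__ == "__main__":
    print(ellip_ke(math.sqrt(0.5)))
```

Note that some libraries parameterize these integrals by \(m = k^2\) rather than by the modulus \(k\) used here.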

For the Jacobi elliptic functions, one first defines the amplitude function \(\textsf{am}(u,k)\) as an inverse function via

$$\begin{aligned} u = \int \nolimits _{0}^{\phi } \frac{d\theta }{\sqrt{1-k^2 \sin ^2 \theta }} \quad \Leftrightarrow \quad \textsf{am}(u,k) = \phi \end{aligned}$$

and the Jacobi elliptic functions are then defined by

$$\begin{aligned} {{\,\textrm{sn}\,}}(u,k)&= \sin \big [ \textsf{am} (u,k) \big ] \quad , \quad {{\,\textrm{cn}\,}}(u,k) = \cos \big [ \textsf{am} (u,k) \big ],\\ {{\,\textrm{dn}\,}}(u,k)&= \Big ( 1 - k^2 \sin ^2 \big [ \textsf{am} (u,k) \big ] \Big )^{1/2}. \end{aligned}$$

Each \((A,B)\in D_1\) defines a symmetric periodic orbit \(L_1^{A, B} = (\xi _1(\tau ), \eta _1(\tau ), \zeta _1(\tau ))\) with period \(T_1 = 4 K (k_1) B^{-\frac{1}{2}}\), which can be written in terms of Jacobi elliptic functions as

$$\begin{aligned} u&= \sqrt{B} \tau , k_1^2 = \displaystyle \frac{A + B}{2B} \nonumber \\ \xi _1(\tau )&= 2 k_1 \sqrt{B} {{\,\textrm{cn}\,}}(u,k_1) \nonumber \\ \eta _1(\tau )&= - 2 k_1 B {{\,\textrm{dn}\,}}(u,k_1) {{\,\textrm{sn}\,}}(u,k_1) \nonumber \\ \zeta _1(\tau )&= B(1 - 2 k_1^2 {{\,\textrm{sn}\,}}^2 (u,k_1)). \end{aligned}$$
(8)

Each \((A,B)\in D_2\) defines a pair of asymmetric periodic orbits \(L_{2,\pm }^{A, B} = (\xi _{2,\pm }(\tau ), \eta _{2,\pm }(\tau ), \zeta _2(\tau ))\) with period \(\displaystyle T_2 = 4 K(k_2) k_2 B^{-\frac{1}{2}}\) that can be represented in elliptic functions as

$$\begin{aligned} u&= \sqrt{B}k_2^{-1} \tau , k_2^2 = 2 \left( 1+\frac{A}{B}\right) ^{-1} \nonumber \\ \xi _{2,\pm }(\tau )&= \pm 2 \sqrt{B} k_2^{-1} {{\,\textrm{dn}\,}}(u,k_2) \nonumber \\ \eta _{2,\pm }(\tau )&= \mp 2 B {{\,\textrm{sn}\,}}(u,k_2) {{\,\textrm{cn}\,}}(u,k_2) \nonumber \\ \zeta _2(\tau )&= B(1 - 2 {{\,\textrm{sn}\,}}^2 (u,k_2)). \end{aligned}$$
(9)

The region \(D_3: \{ A = B\}\) corresponds to the line of saddle equilibria \(\xi = \eta = 0, \;\; \zeta = -B\), each with a pair of homoclinic orbits \(L_{3,\pm }^{B} = (\xi _{3,\pm }(\tau ), \eta _{3,\pm }(\tau ), \zeta _3(\tau ))\) contained in \(D_3\) and given by

$$\begin{aligned} \displaystyle u&= \sqrt{B} \tau \nonumber \\ \xi _{3,\pm }(\tau )&= \pm 2 \sqrt{B} {{\,\textrm{sech}\,}}(u) \nonumber \\ \displaystyle \eta _{3,\pm }(\tau )&= \mp 2B \tanh (u) {{\,\textrm{sech}\,}}(u) \nonumber \\ \displaystyle \zeta _3(\tau )&= B \big ( 1 - 2 \tanh ^2(u) \big ) \end{aligned}$$
(10)

Due to the reflection symmetry of (5), without loss of generality we consider only \(L_{2}^{A, B} = L_{2,+}^{A, B}\) and \(L_{3}^{B} = L_{3,+}^{B}\).
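The homoclinic solution (10) can be verified directly. The sketch below evaluates (10) with the `+` sign, checks by central finite differences that it satisfies (5), and confirms that it lies on \(A = B\) with \(\eta ^2 + \zeta ^2 = B^2\). The function names, the difference step and the sample values are assumptions for illustration.

```python
import math


def homoclinic(tau, B):
    """Homoclinic orbit (10), '+' branch, at time tau for given B > 0."""
    u = math.sqrt(B) * tau
    sech = 1.0 / math.cosh(u)
    th = math.tanh(u)
    return (2.0 * math.sqrt(B) * sech, -2.0 * B * th * sech, B * (1.0 - 2.0 * th * th))


def limit_rhs(xi, eta, zeta):
    """Right-hand side of the limiting system (5)."""
    return (eta, -xi * zeta, xi * eta)


def ode_residual(tau, B, h=1e-5):
    """Max difference between a central-difference derivative of (10) and (5)."""
    plus, minus = homoclinic(tau + h, B), homoclinic(tau - h, B)
    deriv = [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]
    field = limit_rhs(*homoclinic(tau, B))
    return max(abs(d - f) for d, f in zip(deriv, field))


def invariant_defects(tau, B):
    """Return (|A - B|, |eta^2 + zeta^2 - B^2|) on the orbit."""
    xi, eta, zeta = homoclinic(tau, B)
    return (abs(0.5 * xi * xi - zeta - B), abs(eta * eta + zeta * zeta - B * B))


if __name__ == "__main__":
    for tau in (-2.0, 0.0, 0.7, 3.0):
        print(tau, ode_residual(tau, 1.5), invariant_defects(tau, 1.5))
```

As \(\tau \rightarrow \pm \infty \) the orbit approaches the saddle \(\xi = \eta = 0\), \(\zeta = -B\).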

In the following, we suppress the dependence of the elliptic functions and integrals on the elliptic modulus \(k=k_i\), e.g., writing \({{\,\textrm{sn}\,}}{u}\) for \({{\,\textrm{sn}\,}}(u, k_i)\) and K for \(K(k_i)\).

Next we deviate from the approach of Sparrow, who proceeds with a formal use of the method of averaging, and instead follow that of Li and Zhang (1993) with corrections. In order to exploit the Hamiltonian structure of (5) that is available when \(\varepsilon = 0\), we introduce polar coordinates with B from (6) given by

$$\begin{aligned} \zeta = B \cos \varphi , \;\; \eta = B \sin \varphi , \;\; \xi = \xi \end{aligned}$$
(11)

which transform (4) into

$$\begin{aligned} \dot{\xi }&= B \sin \varphi - \varepsilon \sigma \xi \nonumber \\ \displaystyle \dot{\varphi }&= -\xi + \varepsilon \frac{\sin \varphi }{B}((\beta - 1) B \cos \varphi + \beta \sigma )\nonumber \\ \dot{B}&= -\varepsilon (B + (\beta - 1)B \cos ^2 \varphi + \beta \sigma \cos \varphi ). \end{aligned}$$
(12)

Notably, for \(\varepsilon > 0\) the radial variable B is no longer a conserved quantity and must be included as a dynamical variable.
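The polar form (12) follows from (4) by the chain rule, \(\dot{B} = (\eta \dot{\eta } + \zeta \dot{\zeta })/B\) and \(\dot{\varphi } = (\zeta \dot{\eta } - \eta \dot{\zeta })/B^2\). A quick numerical consistency check at a sample point can be sketched as follows; the function names and sample values are assumptions for illustration.

```python
import math


def rhs4(xi, eta, zeta, sigma, beta, eps):
    """Right-hand side of system (4)."""
    return (
        eta - eps * sigma * xi,
        -xi * zeta - eps * eta,
        xi * eta - eps * beta * (zeta + sigma),
    )


def rhs12(xi, phi, B, sigma, beta, eps):
    """Right-hand side of system (12) in the variables (xi, phi, B)."""
    s, c = math.sin(phi), math.cos(phi)
    return (
        B * s - eps * sigma * xi,
        -xi + eps * s / B * ((beta - 1.0) * B * c + beta * sigma),
        -eps * (B + (beta - 1.0) * B * c * c + beta * sigma * c),
    )


def rhs12_via_chain_rule(xi, phi, B, sigma, beta, eps):
    """Derive (xi', phi', B') from (4) using zeta = B cos(phi), eta = B sin(phi)."""
    eta, zeta = B * math.sin(phi), B * math.cos(phi)
    dxi, deta, dzeta = rhs4(xi, eta, zeta, sigma, beta, eps)
    dphi = (zeta * deta - eta * dzeta) / (B * B)
    dB = (eta * deta + zeta * dzeta) / B
    return (dxi, dphi, dB)


def polar_mismatch(xi, phi, B, sigma, beta, eps):
    """Largest componentwise difference between the two evaluations."""
    direct = rhs12(xi, phi, B, sigma, beta, eps)
    derived = rhs12_via_chain_rule(xi, phi, B, sigma, beta, eps)
    return max(abs(a - b) for a, b in zip(direct, derived))


if __name__ == "__main__":
    print(polar_mismatch(0.8, 1.1, 2.0, sigma=10.0, beta=8.0 / 3.0, eps=0.05))
```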

We will use that (12) has the form

$$\begin{aligned} \dot{\xi }&= f_1(\xi , \varphi , B) +\varepsilon g_1 (\xi , \varphi , B) \nonumber \\ \dot{\varphi }&= f_2(\xi , \varphi , B) +\varepsilon g_2 (\xi , \varphi , B) \nonumber \\ \dot{B}&= \varepsilon g_3 (\xi , \varphi , B). \end{aligned}$$
(13)

for smooth functions \(f_i\), \(g_i\). In this formulation, at \(\varepsilon = 0\), the first two equations possess the Hamiltonian structure with \(A(\xi , \varphi , B)\) defined in (6) serving as a Hamiltonian:

$$\begin{aligned} A(\xi , \varphi , B) = \displaystyle \frac{1}{2}\xi ^2 - B \cos \varphi , \;\; \displaystyle f_1 =\frac{\partial A}{\partial \varphi }, \;\; f_2 = -\frac{\partial A}{\partial \xi }. \end{aligned}$$
(14)

As noted above, the solutions at \(\varepsilon = 0\) are given by families of periodic orbits, saddle equilibria and their homoclinic orbits. Therefore, as already noticed in Li and Zhang (1993), we can apply the extended Melnikov perturbation theory of Wiggins and Holmes (1987b, 1987a) in order to identify which periodic and homoclinic orbits of (12) persist for small \(\varepsilon > 0\). The main idea of this method is that for \(\varepsilon = 0\) the phase space is represented as a one-parametric family of two-dimensional manifolds (parameterized by B in our case), and on each manifold the system is Hamiltonian. Then, generically, in the phase space there exist two-parameter families of periodic orbits, parameterized by A (the Hamiltonian) and B, and one-parameter families of homoclinic orbits. Upon perturbing by \(\varepsilon >0\), this structure is destroyed, and, generically, one can expect that only isolated periodic orbits and isolated homoclinic orbits will exist. The existence, local uniqueness and the topological type of these objects can be established with the help of Melnikov integrals.

For the periodic orbits \(L^{A,B}_{i}\), the analysis simplifies when changing canonical Hamiltonian variables for \(\varepsilon = 0\) from \((\xi , \varphi )\) to action-angle variables \((I, \theta )\). The action variable I is computed as

$$\begin{aligned} I(A, B) = \oint _{L_i^{A, B}} \varphi d \xi =-\oint _{L_i^{A, B}} \xi d \varphi =\int \limits _0^{T_i} \xi ^2 d\tau , \end{aligned}$$
(15)

and the angle \(\theta \in [0,1)\) increases from 0 to 1 along the periodic orbits with constant frequency \(\Omega (A, B)\) given by

$$\begin{aligned} \frac{1}{\Omega (A, B)} = \frac{\partial I}{\partial A}\Bigr |_B. \end{aligned}$$
(16)

Here and below, we use the notation \(\displaystyle \frac{\partial R}{\partial P}\Bigr |_Q\) from thermodynamics and elsewhere to emphasize that we differentiate the quantity R only with respect to its explicit dependence on P, neglecting implicit relationships between P and Q.

In these new variables (12) turns into

$$\begin{aligned} \dot{I}&= \varepsilon F(I, \theta , B) \nonumber \\ \dot{\theta }&= \Omega + \varepsilon G(I, \theta , B) \nonumber \\ \dot{B}&= \varepsilon \tilde{g}_3(I, \theta , B), \end{aligned}$$
(17)

where

$$\begin{aligned} \displaystyle F(I, \theta , B)&= \frac{\partial I}{\partial \xi } \tilde{g}_1 + \frac{\partial I}{\partial \varphi } \tilde{g}_2 + \frac{\partial I}{\partial B} \tilde{g}_3 \nonumber \\ \displaystyle G(I, \theta , B)&= \frac{\partial \theta }{\partial \xi } \tilde{g}_1 + \frac{\partial \theta }{\partial \varphi } \tilde{g}_2 + \frac{\partial \theta }{\partial B} \tilde{g}_3 \nonumber \\ \tilde{g}_i(I, \theta , B)&= g_i(\xi (I, \theta , B), \varphi (I, \theta , B), B), \;\; i = 1, 2, 3. \end{aligned}$$
(18)

For small \(\varepsilon > 0\), a trajectory starting from the initial point \((I_0, 0, B_0)\) has monotonically increasing \(\theta \) from 0 to 1 with velocity \(\varepsilon \)-close to \(\Omega \), and slowly evolving \((I, B)\) with velocity \(O(\varepsilon )\). Hence, the trajectory stays in an \(\varepsilon \)-neighborhood of the corresponding unperturbed periodic trajectory and reaches \(\theta = 1\) after finite time \(T_\varepsilon = 1/\Omega + O(\varepsilon )\), thus returning to the starting plane \(\{\theta =0\}\). This defines a Poincaré map \((I_0, B_0) \rightarrow (I_1, B_1)\),

$$\begin{aligned} (I_1, B_1) = (I_0, B_0) + \varepsilon (M_1, M_3) + O(\varepsilon ^2), \end{aligned}$$
(19)

whose fixed points correspond to periodic orbits, and where \(M_1, M_3\) will be Melnikov integral terms. We denote the linearization matrix at \((I_0, B_0)\) as

$$\begin{aligned} DM(I_0, B_0) = \displaystyle \frac{\partial (M_1, M_3)}{\partial (I, B)} (I_0, B_0). \end{aligned}$$

Theorem 3.2 of Wiggins and Holmes (1987b) states that for any \((I_0,B_0)\) for which \(M_1 = M_3 = 0\) and \(\det DM(I_0,B_0) \ne 0 \), there exist \(\varepsilon _0>0\) and an isolated fixed point of the Poincaré map (19) in an \(\varepsilon \)-neighborhood of \((I_0,B_0)\) for any \(0<\varepsilon <\varepsilon _0\). This corresponds to a persistent periodic orbit of (4) and (1) for \(0<\varepsilon \ll 1\). Moreover, if \((M_1,M_3)\ne 0\) at \((I_0,B_0)\), then for \(0<\varepsilon \ll 1\) there is no periodic orbit in a neighborhood of \((I_0,B_0)\). The expressions for \(M_1\) and \(M_3\) are given in Wiggins and Holmes (1987b) and in our notation read

$$\begin{aligned} \displaystyle M_1&= \frac{1}{\Omega } \tilde{M}_1 +\frac{\partial I}{\partial B}\Bigr |_A M_3, \;\; \; \tilde{M}_1 = \int \limits _0^{T_i} \left[ f_1 \tilde{g}_2 - f_2 \tilde{g}_1 + \frac{\partial A}{\partial B} \Bigr |_{\xi , \varphi } \tilde{g}_3 \right] (L_i^{A,B}(\tau )) d\tau \nonumber \\ \displaystyle M_3&= \int \limits _0^{T_i} \tilde{g}_3 (L_i^{A,B}(\tau )) d \tau ; \end{aligned}$$
(20)

in particular \((M_1,M_3)=0\) is equivalent to \((\tilde{M}_1, M_3) =0\). We emphasize that we apply (Wiggins and Holmes 1987b, Theorem 3.2) separately for two topologically different families of periodic orbits, lying, respectively, in domains \(D_1\) and \(D_2\). For every fixed point of the Poincaré map \((I_0, B_0) \in D_i\), there exists small \(\varepsilon _0>0\) such that for \(0< \varepsilon < \varepsilon _0\) the corresponding periodic orbit lies entirely in \(D_i\). Also, it is clear that upon increasing \(\varepsilon \) further, the trajectory can touch the common boundary \(D_3\) of both domains and cross it. This scenario is confirmed below by numerical experiments in Sect. 4, see Fig. 6.

The stability type of such a fixed point, and thus the periodic orbit, is determined by the eigenvalues \(\nu _\pm \) of the linearization of the Poincaré map (19), which in terms of \(DM(I_0,B_0)\) are given by

$$\begin{aligned} \nu _\pm = 1 + \frac{\varepsilon }{2} \big ( {{\,\textrm{tr}\,}}DM \pm \sqrt{({{\,\textrm{tr}\,}}DM)^2 - 4 \det DM} \big ) + O(\varepsilon ^2). \end{aligned}$$
(21)
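To leading order in \(\varepsilon \), formula (21) reduces the stability question to the signs of \({{\,\textrm{tr}\,}}DM\) and \(\det DM\): both multipliers lie inside the unit circle for small \(\varepsilon >0\) when \(\det DM > 0\) and \({{\,\textrm{tr}\,}}DM < 0\), while \(\det DM < 0\) gives a saddle. The following sketch encodes this reading; it ignores the \(O(\varepsilon ^2)\) remainder, and the function names as well as the label for borderline cases are assumptions.

```python
import cmath


def multipliers(tr, det, eps):
    """Leading-order eigenvalues nu_+- from (21), dropping the O(eps^2) term."""
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (1.0 + 0.5 * eps * (tr + root), 1.0 + 0.5 * eps * (tr - root))


def classify(tr, det):
    """Stability type for small eps > 0 implied by the signs of tr and det."""
    if det < 0.0:
        return "saddle"
    if det > 0.0 and tr < 0.0:
        return "stable"
    if det > 0.0 and tr > 0.0:
        return "unstable"
    return "undetermined at leading order"


if __name__ == "__main__":
    for tr, det in ((-1.0, 1.0), (1.0, -2.0), (2.0, 0.5)):
        print(tr, det, classify(tr, det), multipliers(tr, det, 0.01))
```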

Although \((M_1, M_3)\) depend on \((I, B)\), it is more convenient to consider them as functions of \(k_i, B\), where \(k_i\) are the moduli defined in (8), (9). The following simplified formulas for the trace and determinant can be obtained by using the fact that the Melnikov functions are zero at the appropriate value of \((I_0,B_0)\), and by changing variables:

$$\begin{aligned} {{\,\textrm{tr}\,}}DM&= \left( \frac{\partial I}{\partial k_i}\Bigr |_{B}\right) ^{-1} \Big [ \frac{1}{\Omega } \frac{\partial \tilde{M}_1 }{\partial k_i} \Bigr |_{B} + \frac{\partial I}{\partial B}\Bigr |_A \frac{\partial M_3 }{\partial k_i} \Bigr |_{B} \Big ] + \frac{\partial M_{3}}{\partial B}\Bigr |_{k_i} \!\!-\frac{\partial M_{3}}{\partial k_i}\Bigr |_{B} \frac{\partial I}{\partial B}\Bigr |_{k_i}\!\! \left( \frac{\partial I}{\partial k_i}\Bigr |_{B}\right) ^{-1} \nonumber \\ \det DM&= \frac{1}{\Omega } \left( \frac{\partial I}{\partial k_i} \Bigr |_{B}\right) ^{-1} \Big [ \frac{\partial \tilde{M}_1}{\partial k_i} \Bigr |_{B} \frac{\partial M_{3}}{\partial B}\Bigr |_{k_i} -\frac{\partial \tilde{M}_1}{\partial B} \Bigr |_{k_i} \frac{\partial M_{3}}{\partial k_i}\Bigr |_{B} \Big ]. \end{aligned}$$
(22)

Remark 2.1

We briefly comment on the previous existence and stability studies in the literature. The authors of Li and Zhang (1993) also follow the method from Wiggins and Holmes (1987b) with the difference that in formula (11) they define the radius as \((B + \rho )\), where \(B = {{\,\textrm{const}\,}}\) and \(\rho \) is the third dynamical variable in system (12) (not the Rayleigh number, as in our notation). However, in the new coordinates they then write formula (14) as

$$\begin{aligned} A(\xi , \theta , \rho ) = \displaystyle \frac{1}{2}\xi ^2 - B\cos \theta , \end{aligned}$$

which is incorrect as the term \(\displaystyle \frac{\partial A}{\partial \rho }\) is missing from the Melnikov integrals (20). This is why the results of Li and Zhang (1993) differ from Sparrow (1982), Robbins (1979) and our results.

In Robbins (1979), a different form of the Lorenz system is considered. It can be obtained from system (1) via a coordinate transformation and setting \(\beta =1\), and thus, the parameter space is reduced. Moreover, there is a gap in the argument. Namely, for the unperturbed periodic orbit \((x_0(t), y_0(t), z_0(t))\) a perturbation \((x_1(t), y_1(t), z_1(t)) = O(\varepsilon )\) is considered. The author solves the system of differential equations for \((x_1(t), y_1(t), z_1(t))\) under the assumption \(z_0(t) \ne 0\) (see (Robbins 1979, formula (6))), but on the symmetric orbit the function \(z_0(t)\) vanishes twice. Thus, while the results of the computation seem correct, they are not sufficiently justified.

In the book (Sparrow 1982), system (4) is analyzed by formally averaging over the unperturbed periodic orbits. Isolated equilibrium points of the averaged system then correspond to isolated periodic orbits of the original system. Formulas (27) and (33), which determine the existence and uniqueness of periodic orbits in domains \(D_1\) and \(D_2\), were also obtained there, however, without a rigorous proof of monotonicity. Also, analogues of (29) and (36) were obtained in Sparrow (1982), giving the sign of the trace of the linearization matrix. However, the determinant was not computed, and thus, the stability of the symmetric periodic orbit and instability of the pair of non-symmetric periodic orbits could not be determined.

Remark 2.2

The question of persistence also arises for the homoclinic orbits (10) of (12) in region \(D_3\). From the line of the corresponding saddle equilibria, for \(0<\varepsilon \ll 1\) only one saddle equilibrium \((\xi , \varphi , B) = (0, \pi , \sigma )\) remains. Hence, it is possible that its stable and unstable manifolds will form homoclinic orbits, and periodic orbits may appear in bifurcations from the homoclinic orbits at \(\varepsilon >0\). In Wiggins and Holmes (1987a), such a phenomenon was studied; however, the results are not valid in the claimed generality and do not apply in our case as discussed in §2.4.

2.2 Symmetric Periodic Orbits (\(D_1\))

In the regime \(D_1\), the explicit solutions (8) have the following form in polar coordinates:

$$\begin{aligned} \xi&= 2 k_1 \sqrt{B} \textsf{cn} (\sqrt{B} \tau ) \\ \sin \varphi&= - 2k_1 \textsf{dn}(\sqrt{B}\tau ) \textsf{sn} (\sqrt{B}\tau ) \\ \cos \varphi&= 1-2k_1^2 \textsf{sn}^2(\sqrt{B}\tau ). \end{aligned}$$

Hence, from (15) and (16) the period and the action are

$$\begin{aligned} \displaystyle T_1 = \frac{4K}{\sqrt{B}} = \frac{1}{\Omega (k_1, B)} \quad , \quad I(k_1, B) = 16 \sqrt{B}(E - (1 - k_1^2)K). \end{aligned}$$
(23)
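The two formulas in (23) are linked by (16): with \(k_1^2 = (A+B)/(2B)\), differentiating \(I\) with respect to \(A\) at fixed \(B\) should reproduce the period \(T_1 = 4K/\sqrt{B}\). The sketch below checks this with a central finite difference. The AGM evaluation of \(K, E\), the function names, the difference step and the sample points are assumptions and not part of the paper.

```python
import math


def ellip_ke(k):
    """Return (K(k), E(k)) for modulus 0 <= k < 1 via the AGM iteration."""
    a, b, c = 1.0, math.sqrt(1.0 - k * k), k
    weight = 0.5
    series = weight * c * c
    for _ in range(40):
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        weight *= 2.0
        series += weight * c * c
        if abs(c) <= 1e-15 * a:
            break
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - series)


def action(A, B):
    """Action I = 16 sqrt(B) (E - (1 - k1^2) K) from (23), for |A| < B."""
    k2 = (A + B) / (2.0 * B)
    K, E = ellip_ke(math.sqrt(k2))
    return 16.0 * math.sqrt(B) * (E - (1.0 - k2) * K)


def period(A, B):
    """Period T1 = 4 K / sqrt(B) from (23), for |A| < B."""
    K, _ = ellip_ke(math.sqrt((A + B) / (2.0 * B)))
    return 4.0 * K / math.sqrt(B)


def action_derivative(A, B, h=1e-6):
    """Central finite-difference approximation of dI/dA at fixed B."""
    return (action(A + h, B) - action(A - h, B)) / (2.0 * h)


if __name__ == "__main__":
    for A, B in ((0.3, 1.0), (-0.5, 2.0)):
        print(A, B, action_derivative(A, B), period(A, B))
```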

Substituting these explicit solutions into (20) using the formulas for \(f_i,g_i\) from (13) allows one to compute the Melnikov integrals explicitly as

$$\begin{aligned} \tilde{M}_1&= \int _0^{4 K} \big [ - 4 \sigma k_1^2 \sqrt{B} \textsf{cn}^2(u ) + \sqrt{B} \beta (1 - 2 k_1^2 \textsf{sn}^2(u)) + \frac{\beta \sigma }{\sqrt{B}} \big ] du\\&= (B\beta + \beta \sigma ) \frac{4K}{\sqrt{B}} - 16 \sigma \sqrt{B} ( E - (1-k_1^2 ) K) + 8 \beta \sqrt{B} ( E - K) \\ M_3&= -(B + \sigma ) \beta \frac{4K}{\sqrt{B}} \\&\quad + \int _0^{4 K} \bigg [ \left( 4 k_1^2 (\beta -1)\sqrt{B} + \frac{2 \sigma \beta k_1^2}{\sqrt{B}}\right) \textsf{sn}^2(u) - 4k_1^4 (\beta -1)\sqrt{B} \textsf{sn}^4(u) \bigg ] du \\&= -(B + \sigma ) \beta \frac{4K}{\sqrt{B}} -8\left( 2(\beta -1)\sqrt{B} + \frac{\sigma \beta }{\sqrt{B}}\right) (E-K)\\&\quad - \frac{16 (\beta -1)\sqrt{B}}{3} ( -2(1+k_1^2)E + (2+k_1^2) K ). \end{aligned}$$

Equating these expressions to zero equivalently gives the equations

$$\begin{aligned}&K \beta \sigma + \beta B (2E - K ) - 4 B \sigma ( E - (1-k_1^2)K) = 0 \nonumber \\&4B \big [ (1-k_1^2) K + (2k_1^2-1) E \big ] + 3 \beta \sigma (2 E - K)\nonumber \\&\quad + \beta B \big [ (4k_1^2-1) K + 4(1-2k_1^2)E \big ] = 0 \end{aligned}$$
(24)

for \(k_1, B\) as necessary conditions for the existence of persistent periodic orbits. Indeed, the same equations were obtained by C. Sparrow and P. Swinnerton-Dyer (see (Sparrow 1982, Appendix K)) by formally using the method of averaging. In fact, part of the following analysis is an extension of that in Sparrow (1982) regarding (24).
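For reference, the closed forms of \(\tilde{M}_1\) and \(M_3\) in terms of K and E can be reproduced from the following standard integrals of Jacobi elliptic functions over one period \(4K\) (modulus \(k_1\) suppressed), as tabulated for instance in Byrd and Friedman (1971); we state them here as an aid to the reader rather than as part of the original derivation:

$$\begin{aligned} \int _0^{4K} \textsf{sn}^2(u)\, du&= \frac{4(K-E)}{k_1^2}, \qquad \int _0^{4K} \textsf{cn}^2(u)\, du = \frac{4\big (E-(1-k_1^2)K\big )}{k_1^2}, \\ \int _0^{4K} \textsf{sn}^4(u)\, du&= \frac{4\big ((2+k_1^2)K - 2(1+k_1^2)E\big )}{3k_1^4}. \end{aligned}$$

For example, the \(\sigma \)-term of \(\tilde{M}_1\) is \(-4\sigma k_1^2\sqrt{B}\cdot 4(E-(1-k_1^2)K)/k_1^2 = -16\sigma \sqrt{B}(E-(1-k_1^2)K)\).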

We are now ready to formulate and prove our main result for symmetric periodic orbits.

Proposition 2.3

For every \(\lambda = \frac{\sigma + 1}{\beta + 2} > 2/3\) and \(\varepsilon \) sufficiently small, there exists a (locally) asymptotically stable symmetric periodic orbit. It is the unique periodic orbit in an order \(\varepsilon \)-neighborhood of an explicit solution \(L_1^{A, B}\), where \((A, B)\) corresponds via (8) to the unique solution \((k_1,B)\) of (24) for the chosen value of \(\lambda \).

Proof

We start by showing that (24) possesses a unique solution in terms of \(k_1\) and B. In order to solve (24), we first show that the coefficient of B in the second equation is strictly positive for \(0 \le k_1 \le 1\). Let \(e_1,e_2\) be defined by

$$\begin{aligned} e_1(k_1) = (1-k_1^2) K + (2k_1^2-1) E, \qquad e_2(k_1) = (4k_1^2-1) K + 4(1-2k_1^2)E \end{aligned}$$
(25)

For \(e_1\), write

$$\begin{aligned} e_1(k_1) = (1-k_1^2) K + (2k_1^2-1) E = (K - E) (1-k_1^2) + E k_1^2 \end{aligned}$$

and since it is clear from the definitions (7) that \(K \ge E \ge 0\), this quantity is strictly positive except when \(k_1 = 0\), in which case it is zero. The function \(e_2\) is also strictly positive, although it is more work to prove it. Rather than interrupt the Melnikov analysis, we include this proof in Appendix A. Thus, in order to have positive solutions for B, one should have \(2E - K < 0\). As noticed in Sparrow (1982, Appendix K), this is possible precisely when \(k_* < k_1 \le 1\) with \(k_* \approx 0.908909\).

Since the coefficient of B is strictly positive, one can solve the second equation for B:

$$\begin{aligned} B = \frac{3 \beta \sigma (K-2E)}{4 e_1 + \beta e_2 } =: \frac{3 \beta \sigma (K-2E)}{ d_1 }, \end{aligned}$$
(26)

where \(d_1 = 4 e_1 + \beta e_2 \) denotes the denominator of the first fraction. Substituting this into the first equation gives

$$\begin{aligned} Kd_1 - 3\beta (K-2E)^2 -12 \sigma (K-2E)(E-(1-k_1^2)K) = 0 \end{aligned}$$

as the remaining equation. Moving the term involving \(\sigma \) to the right-hand side, using the definition \(\lambda =(\sigma +1)/(\beta +2)\), and after some arithmetic one obtains

$$\begin{aligned} \displaystyle 2 \lambda - 1 = \frac{K((1 - k_1^2)K +(2 k_1^2 - 1)E)}{3 (K - 2E)(E - (1 - k_1^2)K)}. \end{aligned}$$
(27)

Next, we prove the claim of Sparrow (1982) that the right-hand side in equation (27) is a monotonically decreasing function of \(k_1\) in the domain \(k_* < k_1 \le 1\) and the right-hand side limits to 1/3 at \(k_1=1\). This immediately implies the existence and uniqueness of the solution of equations (24) for \(\lambda > 2/3\). Writing the right-hand side of (27) as

$$\begin{aligned} \displaystyle \frac{1}{3(E - (1 - k_1^2)K)}\left( K (1 - k_1^2) +\frac{E}{1 - 2 E/K} \right) , \end{aligned}$$

we claim that the first factor and both summands in the parentheses are nonnegative and monotonically decreasing. Indeed, we compute for the prefactor and the first summand that

$$\begin{aligned}&\displaystyle \frac{d}{dk_1} (E - (1 - k_1^2)K) = k_1 K> 0, \quad \left. (E - (1 - k_1^2)K) \right| _{k_1 = k_*} = (2 k_*^2 - 1) E > 0,\\&\displaystyle \frac{d}{dk_1}( K (1 - k_1^2)) = \frac{E - K}{k_1} - k_1 K < 0, \quad \lim \limits _{k_1 \rightarrow 1}(K (1 - k_1^2)) = 0, \end{aligned}$$

cf. Byrd and Friedman (1971). The second summand is decreasing due to the monotonicity of E and K and is positive since \(K>2E\). Together with \(E(1)=1\) and \(K\rightarrow \infty \) as \(k_1\rightarrow 1\), the right-hand side of equation (27) is monotonically decreasing from \(\infty \) to 1/3 when \(k_1\) increases from \(k_*\) to 1. This range corresponds to the values \(\lambda > 2/3\) on the left-hand side of (27). Also, every \(k_1 \in (k_*, 1)\) gives a unique positive value for B via formula (26).
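To illustrate how (26) and (27) determine \((k_1, B)\) in practice, the following Python sketch solves (27) by bisection and then evaluates (26). It is only a sketch under stated assumptions: K and E are taken to be the standard complete elliptic integrals of the modulus, the bracket endpoints are chosen by hand (so only a bounded range of \(\lambda \) is covered), and all function names are hypothetical.

```python
import math

def ellip_KE(k):
    """Return (K(k), E(k)) for modulus 0 <= k < 1 using the AGM iteration."""
    a, b, c = 1.0, math.sqrt((1.0 - k) * (1.0 + k)), k
    s = 0.5 * c * c
    weight = 1.0
    for _ in range(40):
        if abs(c) < 1e-15:
            break
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        s += weight * c * c
        weight *= 2.0
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - s)

K1_LOWER = 0.90891      # assumed bracket: slightly above k_* so that K - 2E > 0
K1_UPPER = 1.0 - 1e-9   # assumed bracket: close to, but below, k_1 = 1

def rhs27(k1):
    """Right-hand side of (27) as a function of k_1."""
    K, E = ellip_KE(k1)
    kc2 = (1.0 - k1) * (1.0 + k1)                 # 1 - k_1^2
    e1 = kc2 * K + (2.0 * k1 * k1 - 1.0) * E      # e_1 from (25)
    return K * e1 / (3.0 * (K - 2.0 * E) * (E - kc2 * K))

def solve_symmetric(sigma, beta):
    """Return (k_1, B) solving (27) and (26) for the given sigma, beta."""
    lam = (sigma + 1.0) / (beta + 2.0)
    target = 2.0 * lam - 1.0
    lo, hi = K1_LOWER, K1_UPPER
    if not (rhs27(hi) < target < rhs27(lo)):
        raise ValueError("lambda is outside the range covered by the bracket")
    for _ in range(100):                          # rhs27 is decreasing in k_1
        mid = 0.5 * (lo + hi)
        if rhs27(mid) > target:
            lo = mid
        else:
            hi = mid
    k1 = 0.5 * (lo + hi)
    K, E = ellip_KE(k1)
    kc2 = (1.0 - k1) * (1.0 + k1)
    e1 = kc2 * K + (2.0 * k1 * k1 - 1.0) * E
    e2 = (4.0 * k1 * k1 - 1.0) * K + 4.0 * (1.0 - 2.0 * k1 * k1) * E
    B = 3.0 * beta * sigma * (K - 2.0 * E) / (4.0 * e1 + beta * e2)   # (26)
    return k1, B

if __name__ == "__main__":
    print(solve_symmetric(10.0, 8.0 / 3.0))       # classical Lorenz sigma, beta
```

With the hand-chosen bracket, values of \(\lambda \) very close to 2/3 are rejected, since the corresponding \(k_1\) is exponentially close to 1.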

Having found the unique solution to the condition \(M_1 = M_3 =0\) for each \(\lambda >2/3\), we now consider the determinant from (22). Computing the derivatives in (22) gives

$$\begin{aligned} \det DM = \frac{1}{3 k_1^2 B^{3}(1-k_1^2)} \Big [ \tilde{c}_2 K^2 + \tilde{c}_1 KE + \tilde{c}_0 E^2 \Big ], \end{aligned}$$

where \(\tilde{c}_0,\tilde{c}_1,\tilde{c}_2\) are coefficients depending on \(\sigma , \beta , B\) only. Using the identity (26) to eliminate B and the identity \(\sigma =\lambda (\beta +2)-1\) to eliminate \(\sigma \), one obtains

$$\begin{aligned} \tilde{c}_2 K^2 + \tilde{c}_1 KE + \tilde{c}_0 E^2 = \frac{48 \sigma ^2 \beta ^2 (\beta +2)}{d_1^2 } \Big [ \hat{c}_4 K^4 + \hat{c}_3 K^3E + \hat{c}_1 KE^3 + \hat{c}_0 E^4 \Big ] \end{aligned}$$

for coefficients \(\hat{c}_0,\hat{c}_1,\hat{c}_2,\hat{c}_3,\hat{c}_4\) depending only on \(\lambda , \beta \). Finally, applying (27) to eliminate \(\lambda \) yields

$$\begin{aligned} \hat{c}_4 K^4 + \hat{c}_3 K^3E + \hat{c}_1 KE^3 + \hat{c}_0 E^4 = \frac{d_1}{E-(1-k_1^2)K} F_1 \end{aligned}$$

where

$$\begin{aligned} F_1&= K^3 (1-k_1^2) \left( \frac{7}{20} E - (1-k_1^2) K\right) + E^3 \left( (2k_1^2-1) E - (1-k_1^2) K \right) \\&\quad + KE(1-k_1^2) \left( (\frac{73}{20}-2k_1^2)K^2 - 6 KE +5E^2 \right) . \end{aligned}$$

This equality expresses \(F_1\) as a sum of strictly positive terms, since for the first two terms we have

$$\begin{aligned}&\frac{d}{dk_1} \left( \frac{7}{20} E - (1 - k_1^2)K\right) =\left( k_1 + \frac{13}{20 k_1}\right) K - \frac{13}{20 k_1} E> 0,\\&\quad \left( \frac{7}{20} E - (1 - k_1^2)K\right) \big |_{k_1 = k_*} = \left( 2 k_*^2 - \frac{33}{20}\right) E> 0, \\&(2k_1^2-1)E - (1 - k_1^2)K \ge \frac{7}{20} E - (1 - k_1^2)K > 0, \end{aligned}$$

and we prove positivity of the last term in Appendix A.2. This proves positivity of the determinant and thus existence and local uniqueness for \(0<\varepsilon \ll 1\) by Wiggins and Holmes (1987b, Theorem 3.2) as mentioned above.

Having proven the existence of a persistent periodic orbit, we can now prove its stability. Since we already have \(\det DM > 0\), we study the trace from (22). In region \(D_1\), one has \(\displaystyle \frac{\partial I}{\partial B} \big |_A = \frac{4}{\sqrt{B}}(2 E - K)\), so that by computing the derivatives in the formula (22) we find

$$\begin{aligned} {{\,\textrm{tr}\,}}DM = \frac{4}{3 B^{3/2}} \Big [ \hat{c}_1 K + \hat{c}_0 E \Big ], \end{aligned}$$

where \(\hat{c}_0, \hat{c}_1\) are coefficients depending only on \(\sigma \), \(\beta \), B and \(k_1\) (but not E or K). Using the first equation of (24) and \(\lambda >2/3\) (so that \(\beta <2\sigma \)), we express E as

$$\begin{aligned} \displaystyle E = \frac{K ( \beta (B - \sigma ) - 4 B (1 - k_1^2) \sigma )}{2 B (\beta - 2 \sigma )} \end{aligned}$$
(28)

and substituting this into the equation for the trace, we find

$$\begin{aligned} \displaystyle {{\,\textrm{tr}\,}}DM = -\frac{4 K(1 + \beta + \sigma )}{\sqrt{B}} < 0. \end{aligned}$$
(29)

Since \({{\,\textrm{tr}\,}}DM < 0\) and \(\det DM > 0\), from (21) it follows that for \(\varepsilon >0\) sufficiently small the eigenvalues lie inside the unit circle so that the periodic orbit is asymptotically stable. \(\square \)

2.3 Asymmetric Periodic Orbits (\(D_2\))

In the regime \(D_2\), we consider the explicit solutions \(L_2^{A,B}(\tau )\) given in (9). In polar coordinates, these are given as

$$\begin{aligned} \xi&= 2 \sqrt{B}k_2^{-1} \textsf{dn} (\sqrt{B}k_2^{-1} \tau ) \\ \sin \varphi&= - 2 \textsf{sn}( \sqrt{B}k_2^{-1} \tau ) \textsf{cn} (\sqrt{B}k_2^{-1} \tau ) \\ \cos \varphi&= 1-2 \textsf{sn}^2(\sqrt{B}k_2^{-1}\tau ) , \end{aligned}$$

with period and action

$$\begin{aligned} \displaystyle T_2 = \frac{4K k_2}{\sqrt{B}} = \frac{1}{\Omega (k_2, B)}, \quad I(k_2, B) = 16 \sqrt{B} k_2^{-1} E. \end{aligned}$$
(30)

Substituting these explicit solutions into (13) and (20), we compute the Melnikov integrals explicitly as

$$\begin{aligned} \tilde{M}_1&= \int _0^{4K} \frac{k_2}{\sqrt{B}} \big [ - 4 \sigma B k_2^{-2} \textsf{dn}^2(u) +\beta B \big ( 1-2 \textsf{sn}^2(u) \big ) +\beta \sigma \big ] du\\&= \frac{4 k_2}{\sqrt{B}} \big [ \beta \sigma K - 4\sigma B k_2^{-2} E +\beta B \big ( K - 2k_2^{-2}(K-E) \big ) \big ] \\ M_3&= -\int _0^{4 K} \frac{k_2}{\sqrt{B}} \big [\beta (B+\sigma ) - 2 (\beta \sigma + 2 (\beta - 1)B ) \textsf{sn}^2(u) + 4 (\beta - 1) B \textsf{sn}^4(u) \big ] du \\&= -\frac{\beta k_2 (B+\sigma )}{\sqrt{B}}4K +\frac{8}{\sqrt{B}}(\beta \sigma + 2 (\beta - 1)B ) \frac{K-E}{k_2}\\&\quad - 16 (\beta - 1) \sqrt{B} \frac{(2+k_2^2) K - 2(1+k_2^2)E}{3k_2^3}. \end{aligned}$$

Equating these expressions to zero, we see that we must solve

$$\begin{aligned}&\beta \sigma K - 4 B \sigma k_2^{-2} E - \beta B k_2^{-2} ((2 - k_2^2)K - 2 E) = 0 \nonumber \\&4 B \left[ (2 - k_2^2) E - 2 (1 - k_2^2) K\right] +\beta B \left[ 4 (k_2^2 - 2) E + (3 k_2^4 - 8 k_2^2 + 8) K\right] \nonumber \\&\quad + 3 \beta \sigma k_2^2 \left[ 2 E - K (2 - k_2^2)\right]&= 0. \end{aligned}$$
(31)

for \(k_2, B\). As above, the same conditions were obtained in Sparrow (1982, Appendix K) by formal averaging. We can now state and prove our existence and instability result for asymmetric periodic orbits:

Proposition 2.4

For every \(\varepsilon \) sufficiently small and all \(2/3< \lambda <1\), there exists a pair of saddle-type asymmetric periodic orbits. Each is unique in an order \(\varepsilon \)-neighborhood of \(L_2^{A,B}\), where \((A, B)\) corresponds via (9) to the unique solution \((k_2,B)\) of (31) for the chosen value of \(\lambda \).

Proof

The coefficient of B in the first equation of (31) is always positive, since

$$\begin{aligned} (2-k_2^2) K - 2E = \frac{1}{4}\int _0^{4K} \textsf{sn}^2(u) \textsf{cd}^2(u) du > 0, \end{aligned}$$

so that the unique solution for B is

$$\begin{aligned} B = \frac{\beta \sigma K k_2^2}{4 \sigma E +\beta ( ( 2-k_2^2)K-2E)} =: \frac{\beta \sigma K k_2^2}{d_2}, \end{aligned}$$
(32)

where \(d_2\) denotes the denominator. Substitution into the second equation of (31) gives an equation for \(k_2\):

$$\begin{aligned} 2 \lambda - 1 = \frac{K((2 - k_2^2)E - 2 (1 - k_2^2)K)}{3 E ((2 - k_2^2)K - 2 E)}. \end{aligned}$$
(33)

Next, we prove that the right-hand side of (33) monotonically decreases from 1 to 1/3 as \(k_2\) increases from 0 to 1, which implies that a unique solution \(k_2\) exists precisely for every \(\lambda \in (2/3,1)\). To prove monotonicity, we first compute

$$\begin{aligned} \frac{d}{d k_2} \left[ \frac{K((2 - k_2^2)E - 2 (1 - k_2^2)K)}{3 E ((2 - k_2^2)K - 2 E)} \right] = \frac{2 P}{3 E^2 k_2 ( 1-k_2^2)((2-k_2^2)K-2E)^2 } \end{aligned}$$

where

$$\begin{aligned} P&= E^3 \big ( -(2-k_2^2)E + 2(1-k_2^2)K \big ) +3 K E^2 (1-k_2^2) \big ( 2E - (2-k_2^2)K \big )\nonumber \\&\quad + \frac{1}{2} K^2 (1-k_2^2)(2-k_2^2) \big ( 3E ( -2E + (2-k_2^2)K) + K ((2-k_2^2)E-2K ) \big ). \end{aligned}$$
(34)

The three summands of P are each negative since for the first we have

$$\begin{aligned}{} & {} \frac{d}{dk_2} \big [ -(2-k_2^2)E+2(1-k_2^2)K \big ] = 2 k_2( E - K) < 0 \qquad \text { and }\\{} & {} \lim _{k_2 \rightarrow 0 } -(2-k_2^2)E+2(1-k_2^2)K = 0, \end{aligned}$$

for the second \(K\ge 0\) and \(2E-(2-k_2^2)K<0\), and for the third we estimate its non-trivial factor from above by the negative \(-4K^2+2E^2\) using the quadratic estimate \(2KE\le K^2+E^2\).
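The monotonicity of the right-hand side of (33) can also be checked numerically, and (33), (32) can be solved for \((k_2, B)\) in the same way as in the symmetric case. The following Python sketch assumes the standard complete elliptic integrals of the modulus and a hand-chosen bisection bracket; the function names are hypothetical and the sketch is an illustration rather than part of the proof.

```python
import math

def ellip_KE(k):
    """Return (K(k), E(k)) for modulus 0 <= k < 1 using the AGM iteration."""
    a, b, c = 1.0, math.sqrt((1.0 - k) * (1.0 + k)), k
    s = 0.5 * c * c
    weight = 1.0
    for _ in range(40):
        if abs(c) < 1e-15:
            break
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        s += weight * c * c
        weight *= 2.0
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - s)

def rhs33(k2):
    """Right-hand side of (33) as a function of k_2."""
    K, E = ellip_KE(k2)
    kc2 = (1.0 - k2) * (1.0 + k2)                          # 1 - k_2^2
    num = K * ((2.0 - k2 * k2) * E - 2.0 * kc2 * K)
    den = 3.0 * E * ((2.0 - k2 * k2) * K - 2.0 * E)
    return num / den

def solve_asymmetric(sigma, beta):
    """Return (k_2, B) solving (33) and (32) for the given sigma, beta."""
    lam = (sigma + 1.0) / (beta + 2.0)
    target = 2.0 * lam - 1.0
    lo, hi = 0.1, 1.0 - 1e-9                               # assumed bracket
    if not (rhs33(hi) < target < rhs33(lo)):
        raise ValueError("lambda is outside the range covered by the bracket")
    for _ in range(100):                                   # rhs33 is decreasing
        mid = 0.5 * (lo + hi)
        if rhs33(mid) > target:
            lo = mid
        else:
            hi = mid
    k2 = 0.5 * (lo + hi)
    K, E = ellip_KE(k2)
    d2 = 4.0 * sigma * E + beta * ((2.0 - k2 * k2) * K - 2.0 * E)
    B = beta * sigma * K * k2 * k2 / d2                    # (32)
    return k2, B

if __name__ == "__main__":
    beta = 8.0 / 3.0
    sigma = 0.8 * (beta + 2.0) - 1.0                       # gives lambda = 0.8
    print(solve_asymmetric(sigma, beta))
```

The lower bracket endpoint is kept away from \(k_2 = 0\) because numerator and denominator of (33) both vanish to fourth order there, which amplifies rounding errors.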

Regarding the determinant, by computing the derivatives in (22) we find it has the form

$$\begin{aligned} \det DM = \frac{1}{3 k_2^2 B^{3}(1-k_2^2)} \Big [ \tilde{c}_2 K^2 + \tilde{c}_1 KE +\tilde{c}_0 E^2 \Big ], \end{aligned}$$

where \(\tilde{c}_0,\tilde{c}_1,\tilde{c}_2\) are coefficients depending only on \(\sigma , \beta , B, k_2\). Again using the identities (32), \(\sigma = \lambda (\beta +2)-1\) and (33) to eliminate \(B, \sigma , \lambda \), one finds

$$\begin{aligned} \det DM = \frac{ -16 (\beta + 2) d_2 F_2 }{3B k_2^2 (1-k_2^2) KE ((2-k_2^2)K - 2E) }, \end{aligned}$$
(35)

where \(F_2\) is given by

$$\begin{aligned} F_2(k_2)&= (2-k_2^2)E^4 - 8(1-k_2^2)E^3 K + 6(1-k_2^2) (2-k_2^2)E^2K^2 \\ {}&\quad - 2(2-k_2^2)^2(1-k_2^2)E K^3 + (2-k_2^2) (1-k_2^2)^2 K^4. \end{aligned}$$

In Appendix A.3, we prove that \(F_2(k_2)>0\) for \(0< k_2 < 1\), and hence, it follows that \(\det DM < 0\). This proves existence and local uniqueness for \(0<\varepsilon \ll 1\) by Wiggins and Holmes (1987b, Theorem 3.2) as mentioned above.

Turning now to the question of stability, in region \(D_2\) one has \(\displaystyle \frac{\partial I}{\partial B} \big |_A = 4(2 E - K(2-k_2^2))/(k_2\sqrt{B})\), and hence by computing the derivatives in (22) one obtains

$$\begin{aligned} {{\,\textrm{tr}\,}}DM = \frac{1}{3 B^{3/2}k_2^3} \Big [ \check{c}_1 K +\check{c}_0 E \Big ], \end{aligned}$$

where \(\check{c}_1, \check{c}_0\) are coefficients depending only on \(\sigma \), \(\beta \), B and \(k_2\) (but not E or K). Using the identity (32) to eliminate B, the identity \(\sigma = \lambda (\beta +2)-1\) to eliminate \(\sigma \), and the identity (33) to eliminate \(\lambda \) we obtain

$$\begin{aligned} \displaystyle {{\,\textrm{tr}\,}}DM = -\frac{4 K k_2 (1+\beta + \sigma )}{\sqrt{B}}<0. \end{aligned}$$
(36)

Since \({{\,\textrm{tr}\,}}DM < 0\) and \(\det DM < 0\), by (21), for any sufficiently small \(\varepsilon >0\), there is one eigenvalue inside the unit circle and one outside, and thus, these periodic orbits are of saddle type. \(\square \)

2.4 Homoclinic Orbits (\(D_3\))

In \(D_3\), the homoclinic solutions from (10) are given for every \(B > 0\) in polar coordinates as

$$\begin{aligned}{} & {} \displaystyle \xi = 2 \sqrt{B} {{\,\textrm{sech}\,}}( \sqrt{B} \tau ) \\{} & {} \displaystyle \sin \phi = - 2 \tanh (\sqrt{B} \tau ) {{\,\textrm{sech}\,}}(\sqrt{B} \tau ) \\{} & {} \displaystyle \cos \phi = \big ( 1 - 2 \tanh ^2(\sqrt{B} \tau ) \big ). \end{aligned}$$

For \(\varepsilon >0\), the slow evolution of the quantities A and B can be written as (cf. Sparrow (1982)):

$$\begin{aligned} \dot{A}&= \varepsilon (- \sigma \xi ^2 + \beta \zeta + \beta \sigma ) =\varepsilon (- \sigma \xi _0^2 + \beta \zeta _0 + \beta \sigma ) + O(\varepsilon ^2)\nonumber \\ B \dot{B}&= -\varepsilon (\eta ^2 + \beta \zeta ^2 + \beta \sigma \zeta ) =-\varepsilon (\eta _0^2 + \beta \zeta _0^2 + \beta \sigma \zeta _0) +O(\varepsilon ^2), \end{aligned}$$
(37)

where the zero index denotes the unperturbed solutions from (10).

As already mentioned in Remark 2.2, in the present case the only equilibrium point which survives the perturbation \(0<\varepsilon \ll 1\) is \((\xi , \eta , \zeta ) = (0, 0, -\sigma )=: \Xi _\sigma \), corresponding to the origin in the original Lorenz system (1), and we have \(B = A = \sigma \) at that point. For \(\varepsilon > 0\) the equilibrium \(\Xi _\sigma \) has two stable eigenvalues \(\displaystyle \nu _1 = -\sqrt{\sigma } + O(\varepsilon )\) and \(\nu _2 = - \varepsilon \beta \) and the unstable one \(\displaystyle \nu _3 = \sqrt{\sigma } + O(\varepsilon )\). The leading stable direction corresponding to the eigenvalue \(\nu _2\) is the invariant line \(\{\xi = 0, \; \eta = 0 \}\) of (4), which is \(\{A = B\}\) in (37).

We note that the second equation of system (37) determines the dynamics of variable B and is in particular equivalent to the last equation of system (13). Integration over the unperturbed homoclinic orbit gives, to leading order,

$$\begin{aligned} B^2(+\infty ) - B^2(-\infty ) = -\frac{4}{3}(\beta + 2)\sigma ^{3/2}\varepsilon , \end{aligned}$$
(38)

which is nonzero for \(\varepsilon >0\) since \(\beta >0\). This is inconsistent with the persistence result for perturbed homoclinic orbits in Wiggins and Holmes (1987a), where it is claimed that this integral always vanishes. Hence, Wiggins and Holmes (1987a) is not applicable in the present situation (and is not valid in the claimed generality).

Toward a correct prediction, first note that a homoclinic orbit converges to the equilibrium point as \(\tau \rightarrow -\infty \) along the unstable direction, and \(\lim \limits _{\tau \rightarrow -\infty } A(\tau ) = \lim \limits _{\tau \rightarrow -\infty } B(\tau ) = \sigma \). Let us assume that a generic homoclinic orbit exists for \(0<\varepsilon \ll 1\), which approaches the equilibrium along the leading direction. In the present case, this is the line \(\{A = B\}\), along which it slowly converges to the equilibrium with rate \(\nu _2 = -\varepsilon \beta \). For \(0<\varepsilon \ll 1\), this slow part dominates the Melnikov integral, which means a connection of the unstable manifold along \(\{A=B\}\) requires that the jumps of \(B^2\) in (38), and of \(A^2\) coincide to leading order in \(\varepsilon \). The latter can be computed as

$$\begin{aligned} \int \limits _{-\infty }^{+\infty }A(\tau )\dot{A}(\tau )d \tau = 4\varepsilon (\beta - 2 \sigma )\sigma ^{3/2}, \end{aligned}$$
(39)

which equals (38) if and only if \(\sigma = \frac{1}{3}(1 + 2 \beta )\), that is, \(\lambda = 2/3\). This is consistent with the discussion of periodic orbits in Proposition 2.4, and although we do not give a full proof, we thus expect there exists a homoclinic bifurcation curve that tends to \(\lambda = 2/3\) when \(\varepsilon \rightarrow 0\).
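For the reader's convenience, the matching condition can be spelled out. Equating the right-hand sides of (38) and (39) and dividing by \(4\varepsilon \sigma ^{3/2}\) gives

$$\begin{aligned} \beta - 2\sigma = -\tfrac{1}{3}(\beta +2) \quad \Longleftrightarrow \quad \sigma = \tfrac{1}{3}(1+2\beta ), \qquad \text {so that}\quad \lambda = \frac{\sigma +1}{\beta +2} = \frac{2\beta +4}{3(\beta +2)} = \frac{2}{3}. \end{aligned}$$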

3 Implications for Infinite Time Averages

In this section, we use the angle bracket notation to denote the infinite time average:

$$\begin{aligned} \langle f({\textbf {X}}) \rangle := \lim _{t\rightarrow \infty } \frac{1}{t} \int _0^t f({\textbf {X}}(s))ds \end{aligned}$$

In Goluskin (2018), Goluskin illustrates an application of semi-definite programming to dynamical systems by obtaining bounds on time averages for the Lorenz equations. He proves that the time averages of the following monomials are maximized by the value obtained at the fixed points \({\textbf {X}}_{\pm }\) for a wide range of parameters \((\beta , \sigma )\) and for all \(0< \rho < \infty \):

$$\begin{aligned} \langle Z \rangle \quad , \quad \langle X^2 \rangle \quad , \quad \langle XY \rangle \quad , \quad \langle Z^2 \rangle \quad , \quad \langle XYZ \rangle \quad , \quad \langle Z^3 \rangle \quad , \quad \langle XYZ^2 \rangle . \end{aligned}$$

Due to the form of the Lorenz equations, certain infinite time averages must be proportional. For instance, one has

$$\begin{aligned} \langle XY \rangle= & {} \lim _{t\rightarrow \infty } \frac{1}{t} \int _0^t X(s)Y(s) ds = \lim _{t\rightarrow \infty } \frac{1}{t} \int _0^t \big [ \beta Z(s) + Z'(s) \big ] ds\\= & {} \beta \langle Z \rangle + \lim _{t\rightarrow \infty } \frac{Z(t)-Z(0)}{t} = \beta \langle Z \rangle . \end{aligned}$$

In this way, one has the following equalities

$$\begin{aligned} \langle X^2 \rangle= & {} \langle XY \rangle =\beta \langle Z \rangle \quad , \quad \langle XYZ \rangle = \beta \langle Z^2 \rangle , \\ \langle XYZ^2 \rangle= & {} \beta \langle Z^3 \rangle , \end{aligned}$$

and hence, it suffices to consider the following three time averages:

$$\begin{aligned} \langle Z \rangle \quad , \quad \langle Z^2 \rangle \quad , \quad \langle Z^3 \rangle . \end{aligned}$$
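The proportionality relations above can be checked on a single numerically computed trajectory. The following Python sketch integrates (1) with a classical fourth-order Runge-Kutta step and compares finite-time averages. The parameter values \(\sigma =10\), \(\rho =28\), \(\beta =8/3\), the step size, the averaging time and the initial condition are our illustrative assumptions; over a finite time the identities hold only up to the boundary terms, which decay like 1/t.

```python
def lorenz_rhs(state, sigma, rho, beta):
    """Right-hand side of the Lorenz equations (1)."""
    x, y, z = state
    return (sigma * (y - x), rho * x - y - x * z, -beta * z + x * y)

def rk4_step(state, dt, sigma, rho, beta):
    """One classical Runge-Kutta step of size dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz_rhs(state, sigma, rho, beta)
    k2 = lorenz_rhs(shifted(state, k1, 0.5 * dt), sigma, rho, beta)
    k3 = lorenz_rhs(shifted(state, k2, 0.5 * dt), sigma, rho, beta)
    k4 = lorenz_rhs(shifted(state, k3, dt), sigma, rho, beta)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def finite_time_averages(sigma=10.0, rho=28.0, beta=8.0 / 3.0,
                         t_end=200.0, dt=0.005, start=(1.0, 1.0, 1.0)):
    """Return finite-time averages (<X^2>, <XY>, <Z>) along one trajectory."""
    steps = int(round(t_end / dt))
    state = start
    sum_x2 = sum_xy = sum_z = 0.0
    for _ in range(steps):
        x, y, z = state
        sum_x2 += x * x
        sum_xy += x * y
        sum_z += z
        state = rk4_step(state, dt, sigma, rho, beta)
    return sum_x2 / steps, sum_xy / steps, sum_z / steps

if __name__ == "__main__":
    x2, xy, z = finite_time_averages()
    print("<X^2> =", x2, " <XY> =", xy, " beta <Z> =", 8.0 / 3.0 * z)
```

The averages are plain Riemann sums over the sampled states, which is sufficient here because the quadrature error is much smaller than the 1/t boundary term.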

As stated previously, for \(\lambda > 1\) the fixed points become unstable for sufficiently large \(\rho \), and hence, while these sharp upper bounds are indeed valid, they do not represent the values that most trajectories attain. However, with the results of the previous section in hand, we can provide complementary results that give the values of these time averages observed for all trajectories within the non-trivial basin of attraction of the symmetric periodic orbit. For such initial conditions, the infinite time averages above are given by the average over the periodic orbit, i.e.,

$$\begin{aligned} \langle Z \rangle = \lim _{t\rightarrow \infty } \frac{1}{t} \int _0^t Z(s,\varepsilon ) ds = \frac{1}{ \varepsilon ^2 T_1^{\varepsilon }} \int _0^{T_1^{\varepsilon }} \big [ \sigma ^{-1} \zeta _1 (\tau ',\varepsilon ) + 1 \big ] d\tau ', \end{aligned}$$

where \(T_1^{\varepsilon }\) is the period of the solution \(\zeta _1(\tau , \varepsilon )\). It seems likely these values do not have a simple closed form and instead we compute the time averages via an expansion in \(\varepsilon \). For instance, the lowest order term can be found from the explicit formulas for the unperturbed solution and period, \(\zeta _1(\tau )\) and \(T_1\), as follows

$$\begin{aligned} \langle Z \rangle&= \frac{1}{\varepsilon ^2} \Big [ 1 + \frac{1}{\sigma T_{1}} \int _0^{T_{1}} \zeta _1 (\tau ') d\tau ' + \mathscr {O}(\varepsilon ) \Big ]\\&= \frac{1}{\varepsilon ^2} \Big [ 1 + \frac{\sqrt{B}}{\sigma 4K} \int _0^{\frac{4 K}{\sqrt{B}}} B \big ( 1-2k_1^2 \textsf{sn}^2 (\sqrt{B}\tau ') \big ) d\tau ' + \mathscr {O}(\varepsilon ) \Big ] \\&= \frac{1}{\varepsilon ^2}\left[ 1 - \frac{B}{\sigma } \left( 1 - \frac{2 E}{K}\right) +\mathscr {O}(\varepsilon ) \right] . \end{aligned}$$

In this way, we obtain the following expressions for the infinite time averages, expressed in terms of \(\rho \) rather than \(\varepsilon \):

$$\begin{aligned} \langle Z \rangle&= \rho \left[ 1 - \frac{B}{\sigma } \left( 1 - \frac{2 E}{K} \right) \right] + \mathscr {O} (\rho ^{1/2} ) \nonumber \\ \langle Z^2 \rangle&= \rho ^2 \Big [ 1 - \frac{3\beta (K-2E)^2}{d_1^2 K} \big ( 8e_1 + \beta e_2 \big ) \Big ] + \mathscr {O}(\rho ^{3/2}) \nonumber \\ \langle Z^3 \rangle&= \rho ^3 \Big [ 1 - \frac{9\beta (K-2E)^2}{5 d_1^3 K} \Big ( 20d_1e_1 + \beta ^2 (K-2E)\nonumber \\&\quad \big ( (32 k_1^4-36k_1^2 +19) K - (64 k_1^4 -64k_1^2+34 ) E \big ) \Big ) \Big ] + \mathscr {O}(\rho ^{5/2}) . \end{aligned}$$
(40)

Recall that the expressions \(e_1,e_2,d_1\) defined in (25), (26) were shown to be positive. Hence, for the time averages \(\langle Z \rangle \) and \(\langle Z^2 \rangle \), these expressions resolve the coefficient of the leading-order term as a function of \(\sigma , \beta \) that is strictly less than one, and hence less than that of the fixed point value. For \(\langle Z^3 \rangle \), the term \(d_1 e_1\) is always positive, whereas one can check that the other expression inside the parentheses is positive for all \(\lambda > 2.5611...\), and negative for \(\lambda \) less than this. Hence, by fixing \(\lambda \) in the range where this expression is negative and choosing \(\beta \) sufficiently large, the leading-order coefficient exceeds the fixed point value. This does not contradict Goluskin’s result, however, since the region in parameter space where \(\langle Z^3 \rangle \) is maximized at \({\textbf {X}}_{\pm }\) does not include large \(\beta \).

As mentioned previously, the transport \(\langle XY \rangle \) is of particular interest, since this is the truncated version of the Nusselt number from fluid dynamics and hence the most well-studied such average. Toward understanding the organization of solution branches and the associated transport, we denote the transport of the stable symmetric periodic orbit by

$$\begin{aligned} H_1 = \beta \langle Z \rangle = \beta \rho \left[ 1 - \frac{B}{\sigma } \left( 1 - \frac{2 E}{K} \right) \right] + \mathscr {O}(\rho ^{1/2} ), \end{aligned}$$

and we also compute the transport obtained by the unstable, asymmetric periodic orbits

$$\begin{aligned} H_2= & {} \beta \rho \left[ 1 + \frac{B}{\sigma T_2} \int \limits _0^{ T_2} \left( 1 - 2 {{\,\textrm{sn}\,}}^2 \left( \frac{\sqrt{B}}{k_2} \tau ' \right) \right) d\tau ' + O(\rho ^{-1/2}) \right] \\= & {} \beta \rho \left[ 1 - \frac{B}{\sigma }\frac{K(2 - k_2^2) - 2 E}{K k_2^2} \right] +\mathscr {O}(\rho ^{1/2}). \end{aligned}$$

Next, we cast \(H_1, H_2\) in terms of \(\rho \) and rescale to a finite range of transport values. This gives

$$\begin{aligned}{} & {} h_{\rho ,1}(\lambda ):= H_1/\rho = \beta \left( 1 - R_1(\lambda ) +O(\rho ^{-1/2}) \right) ,\nonumber \\{} & {} R_1(\lambda ) := \frac{B}{\sigma } \left( 1 - \frac{2 E}{K}\right) , \; \lambda \in (2/3,\infty ), \end{aligned}$$
(41)
$$\begin{aligned}{} & {} h_{\rho ,2}(\lambda ):= H_2/\rho = \beta \left( 1 - R_2(\lambda ) + O(\rho ^{-1/2}) \right) ,\nonumber \\{} & {} R_2(\lambda ) := \frac{B}{\sigma } \frac{K(2 - k_2^2) - 2 E}{K k_2^2}, \; \lambda \in (2/3,1). \end{aligned}$$
(42)

We first show that, to leading order in \(\rho \), \(h_{\rho ,1}(2/3)=h_{\rho ,2}(2/3)=0\), understood as the limit \(\lambda \searrow 2/3\), and analogously \(h_{\rho ,1}(\infty )= h_{\rho ,2}(1)=\beta \). Indeed, \(R_1(2/3)=R_2(2/3)=1\), \(R_1(\infty )=R_2(1)=0\) due to the following.

  • \(R_1(2/3)=1\): \(\lambda \rightarrow 2/3\) gives \(k_1, k_2\rightarrow 1\) so that \(E(1)=1\), \(K(1)=\infty \) and (26), (32) imply \(B/\sigma \rightarrow 1\);

  • \(R_2(2/3)=1\): the previous also implies \((K(2 - k_2^2) - 2 E)/(K k_2^2)\rightarrow 1\);

  • \(R_1(\infty )=0\): \(\lambda \rightarrow \infty \) means \(k_1\rightarrow k_*\), B is bounded and \(E/K\rightarrow 1/2\);

  • \(R_2(1)=0\): \(\lambda \rightarrow 1\) gives \(k_2\rightarrow 0\) and using (32) as well as \(E=K= \pi /2\) at \(k_2=0\) implies

    $$\begin{aligned} R_2(\lambda )= & {} \frac{B}{\sigma } \left( \frac{2(K-E) -K k_2^2}{K k_2^2}\right) \\= & {} \beta \frac{2(K-E) -K k_2^2}{4 \sigma E + \beta ( 2(K-E) -k_2^2 K)} \rightarrow 0 \text { as}\ k_2\rightarrow 0. \end{aligned}$$
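The limiting behavior listed above can be explored numerically by evaluating \(R_1(\lambda )\) and \(R_2(\lambda )\) from (41), (42). The following Python sketch does so for a fixed \(\beta \), solving (27) and (33) by bisection and inserting (26) and (32). It assumes the standard complete elliptic integrals of the modulus and hand-chosen brackets, so \(\lambda \) very close to 2/3 is not covered; the function names are hypothetical.

```python
import math

def ellip_KE(k):
    """Return (K(k), E(k)) for modulus 0 <= k < 1 using the AGM iteration."""
    a, b, c = 1.0, math.sqrt((1.0 - k) * (1.0 + k)), k
    s = 0.5 * c * c
    weight = 1.0
    for _ in range(40):
        if abs(c) < 1e-15:
            break
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        s += weight * c * c
        weight *= 2.0
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - s)

def rhs27(k1):
    """Right-hand side of (27)."""
    K, E = ellip_KE(k1)
    kc2 = (1.0 - k1) * (1.0 + k1)
    e1 = kc2 * K + (2.0 * k1 * k1 - 1.0) * E
    return K * e1 / (3.0 * (K - 2.0 * E) * (E - kc2 * K))

def rhs33(k2):
    """Right-hand side of (33)."""
    K, E = ellip_KE(k2)
    kc2 = (1.0 - k2) * (1.0 + k2)
    num = K * ((2.0 - k2 * k2) * E - 2.0 * kc2 * K)
    return num / (3.0 * E * ((2.0 - k2 * k2) * K - 2.0 * E))

def bisect_decreasing(f, target, lo, hi, iters=100):
    """Solve f(k) = target for a decreasing function f on [lo, hi]."""
    if not (f(hi) < target < f(lo)):
        raise ValueError("target is outside the range covered by the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def R1(lam, beta):
    """Leading-order downshift R_1 of the symmetric orbit, see (41)."""
    k1 = bisect_decreasing(rhs27, 2.0 * lam - 1.0, 0.90891, 1.0 - 1e-9)
    K, E = ellip_KE(k1)
    kc2 = (1.0 - k1) * (1.0 + k1)
    e1 = kc2 * K + (2.0 * k1 * k1 - 1.0) * E
    e2 = (4.0 * k1 * k1 - 1.0) * K + 4.0 * (1.0 - 2.0 * k1 * k1) * E
    b_over_sigma = 3.0 * beta * (K - 2.0 * E) / (4.0 * e1 + beta * e2)  # (26)
    return b_over_sigma * (1.0 - 2.0 * E / K)

def R2(lam, beta):
    """Leading-order downshift R_2 of the asymmetric orbits, see (42)."""
    sigma = lam * (beta + 2.0) - 1.0
    k2 = bisect_decreasing(rhs33, 2.0 * lam - 1.0, 0.1, 1.0 - 1e-9)
    K, E = ellip_KE(k2)
    gap = (2.0 - k2 * k2) * K - 2.0 * E
    return beta * gap / (4.0 * sigma * E + beta * gap)                  # via (32)

if __name__ == "__main__":
    beta = 8.0 / 3.0
    for lam in (1.0, 2.0, 10.0, 50.0):
        print("lambda =", lam, " R_1 =", R1(lam, beta))
    for lam in (0.8, 0.9, 0.99):
        print("lambda =", lam, " R_2 =", R2(lam, beta))
```

The rescaled leading-order transport then follows as \(\beta (1 - R_j(\lambda ))\).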

The differences to the scaled maximum transport \(H_{\pm }/\rho = \beta (1-\rho ^{-1})\) are the positive quantities

$$\begin{aligned} \beta (1-\rho ^{-1})-h_{\rho ,j} =\beta (R_j(\lambda ) +O(\rho ^{-1/2})), \; j=1,2. \end{aligned}$$
(43)

In particular, the stable symmetric periodic orbits L yield the same order of magnitude of transport with respect to \(\rho \), but feature a \(\lambda \) dependent downshift that vanishes at \(\lambda =\infty \), i.e., at \({\textbf {X}}_{\pm }\). In addition, from our perhaps rough error estimates we obtain a correction of order (at least) \(\rho ^{-1/2}\) compared to the next order being \(\rho ^{-1}\) for \({\textbf {X}}_{\pm }\). Numerical computations suggest that this term might in fact be of order \(\rho ^{-1}\), but we do not explore this further here.

4 Numerical Computations and Hysteresis Loop

We present numerical results that corroborate the analytical results for \(\rho =\infty \) and \(1\ll \rho <\infty \) of the previous section, and that highlight the occurrence of a hysteresis loop.

Fig. 2

Bifurcation diagrams (solid=stable, dashed=unstable) of relevant equilibria and periodic orbits in terms of \(\lambda \) and the rescaled transport, illustrating the hysteresis. a Equilibria in the averaged planar system at \(\rho =\infty \) that persist for large finite \(\rho \) as equilibria (orange) or periodic orbits (blue). Computations are done via the elliptic integrals with Mathematica. The bullet marks \(\lambda =2/3\) at zero transport; for illustration, the thin blue line extends numerical computations to the theoretical limit at zero transport. b Overlay of (a) with branches of stable or unstable symmetric (red solid/dashed) and unstable asymmetric (red dashed) periodic solutions to the full system at \(\rho =1000\) computed with Auto (Doedel [5]) (Color figure online)

In Fig. 2a, we plot the numerical evaluation of (41) for symmetric periodic orbits and (42) for asymmetric ones. However, the elliptic integral routines of the current version of the software Mathematica for \(B, E, K\) have failed to numerically converge for transport below \(\approx 0.6\). The analytical prediction is that the branches terminate at \(\lambda =2/3\) in homoclinic bifurcations of the zero equilibrium and thus at zero transport. Indeed, at \(\lambda =2/3\) the intersection of the level sets of the conserved quantities \(A, B\) forms a symmetric pair of homoclinic loops, cf. Fig. 3a, which is the limit of the branch of symmetric periodic orbits, and each branch of asymmetric periodic orbits limits on one of the homoclinic loops.

The arrangement of branches in Fig. 2a together with the stability properties suggests a hysteresis loop of equilibria and periodic orbits in terms of \(\lambda \): For \(\lambda <2/3\), the equilibria \({\textbf {X}}_{\pm }\) that maximize transport are stable, while for \(\lambda >1\) the symmetric periodic orbit is. Intermediate values \(2/3<\lambda <1\) lie in the analytically predicted region of bistability with stable equilibria \({\textbf {X}}_{\pm }\) and stable symmetric periodic orbit L.

For large finite \(\rho \) and moderate values of \(\lambda \), numerical pathfollowing computations using Auto corroborate that branches of symmetric and asymmetric periodic orbits persist as predicted. See Fig. 2b. Toward zero transport, the branches of symmetric and asymmetric periodic orbits appear to terminate in homoclinic bifurcations to the zero equilibrium near \(\lambda =0.688\). See also Fig. 3a, b. The asymmetric periodic orbits are unstable as predicted, but the symmetric periodic orbits lose stability at low transport. For \(\rho =1000\), this occurs at \(\lambda =\lambda _{\textrm{bp}}(\rho )\approx 0.79\) in a supercritical pitchfork bifurcation. A branch of stable periodic orbits bifurcates, which are asymmetric in a different sense, but these lose stability at \(\lambda =\lambda _{\textrm{pd}}(\rho )\approx 0.787\) in a period-doubling bifurcation. We plot the loci of \(\lambda _{\textrm{bp}}, \lambda _{\textrm{pd}}\) in Fig. 5b, showing that as \(\rho \) increases, \(\lambda _{\textrm{bp}}, \lambda _{\textrm{pd}}\) approach \(\lambda =2/3\).

Fig. 3

Profiles of periodic solutions for \(\rho =1000\) from the red solid branch in Fig. 2b. a Near the double homoclinic loop at the left termination point; b at \(\lambda \approx 1\); and c near an apparent symmetric heteroclinic cycle at \(\lambda \approx 11.3\)

Further numerical simulations corroborate the hysteresis-type loop: For \(\lambda <2/3\), the maximum transport equilibria \({\textbf {X}}_{\pm }\) appear to be global attractors, while for \(\lambda >1\) this seems to be the stable symmetric periodic orbit, as in Fig. 3b. See also Fig. 1b. For \(2/3<\lambda <1\), the situation with large finite \(\rho \) is complicated by the fact that symmetric periodic orbits are born in a homoclinic bifurcation at some \(\lambda _{\textrm{hom}}\in (2/3,1)\) and, as mentioned, are unstable until a bifurcation point \(\lambda _{\textrm{bp}} \in (\lambda _{\textrm{hom}},1)\). Up to the aforementioned region of stable asymmetric periodic orbits that bifurcate from \(\lambda _{\textrm{bp}}\), the global attractors for \(\lambda <\lambda _{\textrm{bp}}\) seem to be \({\textbf {X}}_{\pm }\) and the region of bistability with the symmetric periodic orbit is effectively \(\lambda \in (\lambda _{\textrm{bp}}, 1)\).

Using time-varying values of \(\lambda \), we consistently found hysteresis as plotted in Fig. 4a: For slowly increasing \(\lambda \) from 0, the solution is quickly close to \({\textbf {X}}_+\) so that maximum local transport is realized, i.e., transport computed over a time interval of finite length, which can be chosen longer for slower change of \(\lambda \). As \(\lambda \) increases beyond 1, the solution eventually approaches the stable symmetric periodic orbit, so that the realized local transport is smaller than the theoretical maximum. Analogous to delayed bifurcations, this transition to the periodic orbit does not occur immediately after crossing \(\lambda =1\) at \(t=100\), but with a delay, here until around \(t=190\). Subsequent decrease of \(\lambda \) causes the solution to track the stable branch of symmetric periodic orbits, cf. Fig. 4b, which decreases the observed local transport further until \(\lambda =\lambda _{\textrm{bp}} \approx \lambda _{\textrm{pd}}\). Upon decreasing \(\lambda \) below this threshold, a switch to a stable equilibrium \({\textbf {X}}_{\pm }\) occurs, thus re-creating maximum local transport.

Fig. 4

We plot a simulation of the hysteresis loop with time-varying \(\lambda \) for \(\rho =1000\) using MATLAB’s ode45 routine. a The X-coordinates of the resulting solution (blue curve, left axis) from a parabolic variation of \(\lambda \) (orange curve, right axis). The vertical bars mark the homoclinic bifurcation point \(\lambda \approx 0.688\) (green), the period-doubling \(\lambda \approx 0.787\) (purple), and the Hopf bifurcation \(\lambda =1\) (black). b The value of X vs. Y of the simulation in (a) (Color figure online)

Fig. 5

a Bifurcation diagram for larger values of \(\lambda \) analogous to right panel of Fig. 2 but without stability information of the periodic solutions (long dashed lines). The red curve corresponds to the extension of the branch of stable symmetric periodic orbits from Fig. 2. The magenta curve is the analogue for \(\rho =4000\). Continuing along these branches from their lower left ends, numerically before each fold a destabilizing period-doubling bifurcation occurs and the solution restabilizes at the fold point. Each branch appears to terminate in a symmetric heteroclinic cycle between the pair of equilibria \({\textbf {X}}_{\pm }\). b loci of branch points of symmetric periodic orbits (blue) and period-doubling points on the bifurcating branches (purple); for values of \(\rho \) above the blue curve periodic orbits appear to be stable. Gray lines mark \(\rho =1000\) (horizontal) and \(\lambda \approx 2.36\) at the classical Lorenz values \(\sigma =10, \beta =8/3\) (Color figure online)

While the asymptotically predicted branch of symmetric periodic orbits of Fig. 2 continues for increasing \(\lambda \) monotonically and unboundedly, we found that for finite \(\rho \) this is not the case. As plotted in Fig. 5, the branch of stable symmetric periodic orbits turns around, oscillates, and appears to terminate in a symmetric heteroclinic bifurcation of \({\textbf {X}}_{\pm }\) at a finite value of \(\lambda \). See also Fig. 3c. Upon increasing \(\rho \), this turning and termination occurs at larger values of \(\lambda \). Hence, this scenario is consistent with the analytical results, which concern \(\rho \rightarrow \infty \) for bounded ranges of \(\lambda \). The appearance of a symmetric heteroclinic cycle between \({\textbf {X}}_{\pm }\) in the Lorenz system has already been noticed in Sparrow (1982), Glendinning and Sparrow (1986), albeit apparently not in the regime of large \(\rho \).

The transport at such a heteroclinic cycle is that of the symmetric equilibria, i.e., \(\beta (1-\rho ^{-1})\), which is indeed very closely matched at the numerical termination points. The \((\lambda ,h_\rho )\)-loci of the termination points lie near the curve of symmetric periodic orbits (blue solid), which therefore appear to predict the loci of the heteroclinic cycles. The oscillating stability along the branch creates multi-stable regions in \(\lambda \); we note that generic unfoldings of the type of heteroclinic cycle with leading oscillating dynamics yield chaotic attractors (Bykov 1999).

We plot the projection of a solution near the symmetric heteroclinic cycle into the \((A,B)\)-plane in Fig. 6. This corroborates the conjecture of Sparrow (1982) that orbits which cross the diagonal \(A=B\) bifurcate from \(\rho =\infty \). We also find that the solutions near the double homoclinic loop with small transport cross the diagonal. In contrast, the solutions for moderate transport remain in \(D_1\) as predicted by the limit \(\rho \rightarrow \infty \).

Fig. 6

Projections into the \((A,B)\)-coordinate plane of periodic solutions for \(\rho =1000\) (blue) and the diagonal (orange). a Near the double homoclinic loop at the left termination point of the red solid branch in Fig. 2b. b From the red solid branch of Fig. 2b at \(\lambda \approx 1\). c From the red long dashed branch of Fig. 5a at \(\lambda =7.2\), toward the heteroclinic cycle. d Near the heteroclinic cycle from Fig. 2b (Color figure online)

5 Other Lorenz-like Systems

The analysis for large \(\rho \) carries over to other models related to the Lorenz equations (1). For the general context of extensions, we refer to Curry (1978), Sparrow (1982), Park (2021), Olson and Doering (2022) and the references therein. For illustration purposes, let us consider linear additions to (1) in the form

$$\begin{aligned} {\textbf {X}}'&= {\textbf {F}}({\textbf {X}}) + {\textbf {A}}w + {\textbf {b}}\nonumber \\ w'&= {\textbf {B}}({\textbf {X}}, w), \end{aligned}$$
(44)

with \(w\in {\mathbb {R}}^k\), linear \({\textbf {A}}, {\textbf {B}}\) and constant \({\textbf {b}}\). Upon rescaling as in (3) and \(w=\varepsilon ^{-j}\omega \), with \(\varvec{\Xi }=(\xi ,\eta ,\zeta )^\intercal \), we obtain the form

$$\begin{aligned} \dot{\varvec{\Xi }}&= \varvec{\Phi }_\varepsilon (\varvec{\Xi }) + \textrm{diag}(\varepsilon ^{2-j}, \varepsilon ^{3-j},\varepsilon ^{3-j}) {\textbf {A}}\omega + \textrm{diag} (\varepsilon ^2,\varepsilon ^3,\varepsilon ^3){\textbf {b}}\nonumber \\ \dot{\omega }&= {\textbf {B}}( \textrm{diag}(\varepsilon ^j,\varepsilon ^{j-1}, \varepsilon ^{j-1})\varvec{\Xi }, \varepsilon \omega ), \end{aligned}$$
(45)

where \(\varvec{\Phi }_\varepsilon \) is the right-hand side in (4). For \({\textbf {A}}={\textbf {B}}=0\), i.e., in the absence of \(\omega \), the difference from (4) is of order \(\varepsilon ^2\). Hence, the leading-order analysis is unchanged, which means that periodic orbits bifurcate or persist as for \({\textbf {b}}=0\), although their symmetry properties may be broken. In particular, this applies to the Lorenz models with offsets from Weady (2018), Palmer (1998) for which one can also show that the transport is maximized in an equilibrium (Ovsyannikov 2022).

Nonzero \({\textbf {B}}\) generally requires \(j\ge 1\) for a regular limit in which the right-hand side of the equation for \(\omega \) becomes independent of \(\omega \) and vanishes for \(j>1\). For \(j=1\), we obtain, up to terms of order \(\varepsilon ^2\),

$$\begin{aligned} \dot{\varvec{\Xi }}&= \varvec{\Phi }_\varepsilon (\varvec{\Xi }) +(\varepsilon {\textbf {A}}_1 \omega ,0,0)^\intercal \nonumber \\ \dot{\omega }&= {\textbf {B}}( \textrm{diag}(\varepsilon ,1,1)\varvec{\Xi }, \varepsilon \omega ), \end{aligned}$$
(46)

with \({\textbf {A}}_1\) being the first row of \({\textbf {A}}\). For \({\textbf {B}}\) of the form \({\textbf {B}}= [B_1 | 0 | 0 | B_2]\), the equation for \(\omega \) has the slow form

$$\begin{aligned} \dot{\omega } = \varepsilon (B_1 \xi + B_2\omega ), \end{aligned}$$
(47)

which occurs with \(w\in {\mathbb {R}}\) in the Lorenz–Stenflo model from Stenflo (1996), its magnetic variant (Wawrzaszek and Krasinska 2019), and with \(w\in {\mathbb {R}}^2\) in the models from Molteni et al. (1993); for the latter we choose \(\alpha =O(\varepsilon )\) and shift the auxiliary variables (which gives \({\textbf {b}}\ne 0\)) to obtain the form (47). An extension of Lorenz–Stenflo with nonlinear additional equations is considered in Moon et al. (2021), but still fits into the present framework when, e.g., scaling the variables in addition to Lorenz–Stenflo with \(j=3\) and choosing the Lewis number of order \(\varepsilon ^{-2}\). Other extensions of the Lorenz model with two nonlinear auxiliary equations are studied in Da Costa et al. (1981), Shen (2014), Felicio and Rech (2018), which also fit into the present framework when suitably scaling the auxiliary modes and parameters. However, in many cases the situation is more complicated, for instance, for the three-dimensional extension in Shen (2015), Felicio and Rech (2018).
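As a concrete instance, sketched here under the scaling \(V=\varepsilon ^{-1}\chi \), \(\varepsilon =\rho ^{-1/2}\) that is adopted for the Lorenz–Stenflo model in §5.1, the auxiliary equation \(V'=-X-\sigma V\) of that model takes the form (47) with \(\omega =\chi \) and \(j=1\):

$$\begin{aligned} \dot{\chi } = -\varepsilon (\xi + \sigma \chi ) = \varepsilon (B_1\xi + B_2\chi ), \qquad B_1=-1,\quad B_2=-\sigma . \end{aligned}$$

Likewise, the coupling term sV in the X-equation contributes \(\varepsilon s\chi \) to the \(\xi \)-equation, which corresponds to \({\textbf {A}}_1=s\) in (46). Since \(\sigma >0\), the coefficient \(B_2\) is invertible in this example.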

We next show that for the case (47) the results of the previous sections also carry over; this analysis is made more explicit in §5.1 for the model from Stenflo (1996). In the case (47), the Melnikov analysis of §2 can be simply extended by adding the slow equation for \(\omega \) to the action-angle formulation. The additional Melnikov integral term \({\textbf {M}}_3\) is then simply the integral of \(B_1 \xi (t) + B_2\omega _0\) over the period T, with \(\omega _0\) constant. Since \(\xi \) has zero average (\(\dot{\varphi }=\xi \) at \(\varepsilon =0\) in (12)), this term becomes \(T B_2\omega _0\), so that, for invertible \(B_2\), \({\textbf {M}}_3=0\) requires \(\omega _0=0\). This means that the values of the other two Melnikov integrals for (46), \(\tilde{M}_1, \tilde{M}_2\), actually coincide with those of \(M_1, M_2\) from §2. The non-degeneracy condition turns into invertibility of the matrix

$$\begin{aligned} DM = \frac{\partial (\tilde{M}_1,\tilde{M}_2, {\textbf {M}}_3)}{\partial (I,B, \omega )}, \end{aligned}$$

where \(\tilde{M}_2\) is independent of \(\omega \), and it turns out that also \(\tilde{M}_1\) is: In its integrand F from (17), the additional term from \({\textbf {A}}_1 \omega _0\) is constant and has a factor \(\frac{\partial I}{\partial \xi } =\frac{\partial I}{\partial A} \frac{\partial A}{\partial \xi } = T \xi \), where \(\xi \) has zero average as noted above. Hence, the matrix has lower left triangular block structure and the block \(\partial _\omega {\textbf {M}}_3 = T B_2\) is invertible, if \(B_2\) is. In that case, the non-degeneracy condition is therefore the same as for \(M_1, M_2\) from the original Lorenz system.
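Written out, and using only the independence of \(\tilde{M}_1, \tilde{M}_2\) from \(\omega \) together with \(\partial _\omega {\textbf {M}}_3 = T B_2\), the block structure can be sketched as follows (the symbol \(*\) denotes entries that do not enter the determinant):

$$\begin{aligned} DM = \begin{pmatrix} \dfrac{\partial (\tilde{M}_1,\tilde{M}_2)}{\partial (I,B)} &{} 0 \\ * &{} T B_2 \end{pmatrix}, \qquad \det DM = \det (T B_2)\, \det \frac{\partial (\tilde{M}_1,\tilde{M}_2)}{\partial (I,B)}. \end{aligned}$$

For \(\omega \in {\mathbb {R}}^k\), the first factor equals \(T^k \det B_2\), so the determinant is nonzero exactly when \(B_2\) is invertible and the two-dimensional condition from §2 holds.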

5.1 Lorenz–Stenflo

The Lorenz–Stenflo system is given as follows:

$$\begin{aligned} X'&= \sigma ( Y-X ) +s V \nonumber \\ Y'&= \rho X - Y - XZ \nonumber \\ Z'&= -\beta Z + XY \nonumber \\ V'&= - X - \sigma V. \end{aligned}$$
(48)

This system is a mode truncation of the rotating Boussinesq equations:

$$\begin{aligned} \partial _t {\textbf {u}} + ({\textbf {u}} \cdot \nabla ) {\textbf {u}} + \nabla P + 2\Omega \hat{z} \times {\textbf {u}}&= \nu _m \Delta {\textbf {u}} + \alpha g T \hat{z} \nonumber \\ \partial _t T + {\textbf {u}} \cdot \nabla T&= \nu _T \Delta T \nonumber \\ \nabla \cdot {\textbf {u}}&= 0, \end{aligned}$$
(49)

where one considers convection in a fluid in a rotating frame, and a term representing the Coriolis force has been added. One obtains (48) by making the analogous reduction to a system of ODEs for the Fourier coefficients, but one must include an additional Fourier coefficient V(t) in the expansion of the velocity, which couples to the X-mode via the Coriolis force. The parameter s measures the speed of the rotation.
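As a quick consistency check on (48), the following minimal Python sketch evaluates the right-hand side and confirms that it vanishes at the nontrivial equilibria. The function names and the parameter values (in particular \(s=5\)) are illustrative assumptions and are not taken from the text. Setting the right-hand side of (48) to zero gives \(V=-X/\sigma \), \(Y=cX\) with \(c=1+s/\sigma ^2\), \(Z=\rho -c\) and \(X^2=\beta (\rho -c)/c\), which for \(s=0\) reduces to the Lorenz equilibria \({\textbf {X}}_{\pm }\).

```python
import math

def lorenz_stenflo_rhs(state, sigma, rho, beta, s):
    """Right-hand side of (48) for state = (X, Y, Z, V)."""
    X, Y, Z, V = state
    return (
        sigma * (Y - X) + s * V,
        rho * X - Y - X * Z,
        -beta * Z + X * Y,
        -X - sigma * V,
    )

def nontrivial_equilibrium(sigma, rho, beta, s, sign=1):
    """Equilibrium with X != 0, derived by hand from (48); requires rho > c."""
    c = 1.0 + s / sigma**2
    X = sign * math.sqrt(beta * (rho - c) / c)
    return (X, c * X, rho - c, -X / sigma)

# Illustrative parameter values (assumed, not from the text).
params = dict(sigma=10.0, rho=1000.0, beta=8.0 / 3.0, s=5.0)
eq = nontrivial_equilibrium(**params)
residual = max(abs(r) for r in lorenz_stenflo_rhs(eq, **params))
print("equilibrium (X, Y, Z, V):", eq)
print("max |rhs| at equilibrium:", residual)
```

The residual is zero up to rounding error; calling `nontrivial_equilibrium` with `sign=-1` gives the symmetric partner.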

Since X and V both represent velocity variables, we expect they have the same scaling in \(\rho \), and hence, we scale

$$\begin{aligned} \epsilon = \rho ^{-1/2}, X = \epsilon ^{-1} \xi , Y = \epsilon ^{-2} \sigma ^{-1} \eta , Z = \epsilon ^{-2} (\sigma ^{-1} \zeta + 1 ), V = \epsilon ^{-1} \chi , t = \epsilon \tau \end{aligned}$$

and we obtain the system of equations

$$\begin{aligned} \frac{d \xi }{d\tau }&= \eta - \epsilon ( \sigma \xi - s \chi ) \nonumber \\ \frac{d \eta }{d\tau }&= -\xi \zeta - \epsilon \eta \nonumber \\ \frac{d \zeta }{d\tau }&= \xi \eta - \epsilon \beta (\zeta + \sigma ) \nonumber \\ \frac{d \chi }{d\tau }&= -\epsilon ( \xi + \sigma \chi ). \end{aligned}$$
(50)
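The passage from (48) to (50) can be checked numerically. The sketch below, with hypothetical helper names and an arbitrarily chosen test point, maps scaled variables back to \((X,Y,Z,V)\), evaluates (48), and pushes the result through the chain rule \(d/d\tau = \epsilon \, d/dt\); the outcome should agree with the right-hand side of (50) up to rounding.

```python
def rhs_original(X, Y, Z, V, sigma, rho, beta, s):
    """Right-hand side of the Lorenz-Stenflo system (48)."""
    return (sigma * (Y - X) + s * V,
            rho * X - Y - X * Z,
            -beta * Z + X * Y,
            -X - sigma * V)

def rhs_scaled(xi, eta, zeta, chi, sigma, beta, s, eps):
    """Right-hand side of the rescaled system (50)."""
    return (eta - eps * (sigma * xi - s * chi),
            -xi * zeta - eps * eta,
            xi * eta - eps * beta * (zeta + sigma),
            -eps * (xi + sigma * chi))

def scaling_mismatch(xi, eta, zeta, chi, sigma, rho, beta, s):
    """Largest difference between (50) and (48) pushed through the scaling."""
    eps = rho ** -0.5
    # Invert the scaling to obtain the original variables.
    X, Y = xi / eps, eta / (sigma * eps**2)
    Z, V = (zeta / sigma + 1.0) / eps**2, chi / eps
    dX, dY, dZ, dV = rhs_original(X, Y, Z, V, sigma, rho, beta, s)
    # Chain rule with t = eps * tau, xi = eps X, eta = eps^2 sigma Y,
    # zeta = sigma (eps^2 Z - 1), chi = eps V.
    pushed = (eps**2 * dX, eps**3 * sigma * dY, eps**3 * sigma * dZ, eps**2 * dV)
    direct = rhs_scaled(xi, eta, zeta, chi, sigma, beta, s, eps)
    return max(abs(p - d) for p, d in zip(pushed, direct))

# Arbitrary test point and parameters (assumed for illustration only).
mismatch = scaling_mismatch(0.7, -1.3, 0.4, 0.2,
                            sigma=10.0, rho=1000.0, beta=8.0 / 3.0, s=5.0)
print("max mismatch between (48) and (50):", mismatch)
```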

The limiting system for \(\epsilon = 0\) coincides with (4) except for trivial dynamics in the variable \(\chi \), so that the system now admits three invariants of motion

$$\begin{aligned} \xi ^2 - 2\zeta = 2A,\quad \eta ^2 + \zeta ^2 = B^2,\quad \chi . \end{aligned}$$
(51)
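For completeness, the conservation of the first two quantities in (51) can be verified directly from (50) at \(\epsilon =0\), where \(\dot{\xi }=\eta \), \(\dot{\eta }=-\xi \zeta \), \(\dot{\zeta }=\xi \eta \) and \(\dot{\chi }=0\):

$$\begin{aligned} \frac{d}{d\tau }\big ( \xi ^2 - 2\zeta \big )&= 2\xi \eta - 2\xi \eta = 0, \\ \frac{d}{d\tau }\big ( \eta ^2 + \zeta ^2 \big )&= -2\xi \eta \zeta + 2\xi \eta \zeta = 0. \end{aligned}$$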

Using the first two invariants as for the Lorenz system, (50) can be solved at \(\varepsilon =0\), where the solutions have a different form depending on the choice of \((A,B)\) as described in §2. Analogous to §2, we change coordinates via

$$\begin{aligned} \zeta = B \cos ( \phi ) \quad , \quad \eta = B \sin ( \phi ) \quad , \quad \xi = \xi \quad , \quad \chi = \chi \end{aligned}$$

and (50) becomes

$$\begin{aligned} \begin{array}{lll} \dot{\xi } = f_1 + \epsilon g_1 \qquad &{} \qquad &{} g_1 = - ( \sigma \xi - s \chi )\\ \dot{\phi } = f_2 + \epsilon g_2 &{} f_1 = B \sin (\phi ) &{} g_2 = \sin (\phi ) \big [ (\beta -1) \cos (\phi ) + \frac{ \beta \sigma }{B} \big ] \\ \dot{B} = \epsilon g_3 &{} f_2 = -\xi &{} g_3 = - \big [ B \sin ^2(\phi ) + \beta \cos (\phi ) ( B \cos (\phi ) + \sigma ) \big ] \\ \dot{\chi } = \epsilon g_4 &{} &{} g_4 = - ( \xi + \sigma \chi ) \end{array} \end{aligned}$$
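As a brief check of the perturbation terms, \(g_3\) and \(g_2\) follow from differentiating \(B^2=\eta ^2+\zeta ^2\) and \(\tan \phi =\eta /\zeta \) along (50):

$$\begin{aligned} B\dot{B}&= \eta \dot{\eta } + \zeta \dot{\zeta } = -\epsilon \big [ \eta ^2 + \beta \zeta (\zeta +\sigma ) \big ], \\ B^2\dot{\phi }&= \zeta \dot{\eta } - \eta \dot{\zeta } = -\xi B^2 + \epsilon \, \eta \big [ (\beta -1)\zeta + \beta \sigma \big ]. \end{aligned}$$

Dividing by B and \(B^2\), respectively, and inserting \(\zeta =B\cos (\phi )\), \(\eta =B\sin (\phi )\) recovers \(g_3\) and \(g_2\) as stated; in particular, \(\dot{B}\) and \(\dot{\chi }\) are of order \(\epsilon \).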

Hence, B and \(\chi \) are constant at \(\varepsilon =0\), and for fixed B and \(\chi \), the remaining system for \((\xi ,\phi )\) possesses the same Hamiltonian structure as (5). Converting to action-angle coordinates, the system becomes

$$\begin{aligned} \dot{I}&= \epsilon F_1(I,\theta ,B,\chi ) \nonumber \\ \dot{\theta }&= \Omega (I,B) + \epsilon F_2(I,\theta ,B,\chi ) \nonumber \\ \dot{B}&= \epsilon g_3(I,\theta ,B,\chi ) \nonumber \\ \dot{\chi }&= \epsilon g_4(I,\theta ,B,\chi ) \end{aligned}$$

where

$$\begin{aligned}&F_1 = \frac{1}{\Omega (I,B)} \big ( \xi g_1 + B \sin (\phi ) g_2 - \cos (\phi ) g_3 \big ) + \frac{\partial I}{\partial B}|_A g_3, \\&F_2 = \frac{\partial \theta }{\partial \xi } g_1 + \frac{\partial \theta }{\partial \phi } g_2 + \frac{\partial \theta }{\partial B} g_3. \end{aligned}$$

In this case, we have a three-dimensional Melnikov function given by

$$\begin{aligned} \begin{pmatrix} M_1 \\ M_3 \\ M_4 \end{pmatrix} = \begin{pmatrix} \int _0^T F_1 \big ( I_0,\theta _0 + \Omega (I_0,B_0) t, B_0,\chi _0 \big ) dt \\ \int _0^T g_3 \big ( I_0,\theta _0 + \Omega (I_0,B_0) t, B_0,\chi _0 \big ) dt \\ \int _0^T g_4 \big ( I_0,\theta _0 + \Omega (I_0,B_0) t, B_0,\chi _0 \big )dt \end{pmatrix}. \end{aligned}$$

In order to find the persistent periodic orbits, we need to find the zeros of this vector valued Melnikov function such that the non-degeneracy condition

$$\begin{aligned} \det DM = \det \left( \frac{\partial (M_1,M_3,M_4)}{\partial (I,B, \chi )}\right) \ne 0 \end{aligned}$$

is satisfied. Since we can write

$$\begin{aligned} M_1 = \frac{1}{\Omega (I_0,B_0)} \tilde{M}_1 +\frac{\partial I}{\partial B}|_A M_3 \quad \text { for } \quad \tilde{M}_1 = \int _0^T \big [ \xi g_1 + B \sin \phi g_2 - \cos \phi g_3 \big ] dt, \end{aligned}$$

it suffices to find \((A,B)\) such that \(\tilde{M}_1 = M_3 = M_4 = 0\). Explicitly, the Melnikov integrals are given by

$$\begin{aligned} \tilde{M}_1&= \int _0^T \big [ -\xi ^2 \sigma + s \xi \chi +B \beta \cos (\phi ) + \beta \sigma \big ] dt \\ M_3&= -\int _0^T \big [ B + (\beta -1) B \cos ^2 \phi +\sigma \beta \cos \phi \big ] dt \\ M_4&= -\int _0^T ( \xi + \sigma \chi ) dt. \end{aligned}$$

Since we aim at illustration, we consider \(|A| \le B\) only and then compute

$$\begin{aligned} \tilde{M}_1&= (B \beta + \beta \sigma ) \frac{4K(k_1)}{\sqrt{B}} -16 \sigma \sqrt{B} ( E(k_1) - (1-k_1^2 ) K(k_1)) + 8 \beta \sqrt{B} ( E(k_1) - K(k_1)) \\ M_3&= -(B + \sigma ) \beta \frac{4K(k_1)}{\sqrt{B}} -8(2(\beta -1)\sqrt{B} + \frac{\sigma \beta }{\sqrt{B}}) (E(k_1)-K(k_1))\\&\quad - \frac{16 (\beta -1)\sqrt{B}}{3} (-2(1+k_1^2)E(k_1) + (2+k_1^2) K(k_1)) \\ M_4&= -\frac{4K(k_1)}{\sqrt{B}} \sigma \chi _0. \end{aligned}$$

As noticed a priori for such an extension of the Lorenz system, the first two Melnikov functions are the same as for (4), the third vanishes if and only if \(\chi _0 = 0\), and \(M_1\), \(M_3\) are independent of \(\chi _0\). Hence,

$$\begin{aligned} DM = \begin{pmatrix} \frac{\partial M_1}{\partial I_0} &{} \frac{\partial M_1}{\partial B_0} &{} 0 \\ \frac{\partial M_3}{\partial I_0} &{} \frac{\partial M_3}{\partial B_0} &{} 0 \\ \frac{\partial M_4}{\partial I_0} &{} \frac{\partial M_4}{\partial B_0} &{} \frac{\partial M_4}{\partial \chi _0} \\ \end{pmatrix} \end{aligned}$$

and the determinant is given by

$$\begin{aligned} \textsf{det} DM = \frac{\partial M_4 }{\partial \chi _0} \Big [ \frac{\partial M_1}{\partial I_0} \frac{\partial M_3}{\partial B_0} - \frac{\partial M_1}{\partial B_0} \frac{\partial M_3}{\partial I_0} \Big ] = -\frac{4 K(k_1) \sigma }{\sqrt{B}} \Big [ \frac{\partial M_1}{\partial I_0} \frac{\partial M_3}{\partial B_0} - \frac{\partial M_1}{\partial B_0} \frac{\partial M_3}{\partial I_0} \Big ], \end{aligned}$$

which is nonzero as shown in §2.
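The role of the \(\chi \)-component can be illustrated numerically. The following Python sketch integrates the unperturbed \((\xi ,\phi )\)-flow \(\dot{\xi }=B\sin (\phi )\), \(\dot{\phi }=-\xi \) over one period together with the running integral of \(g_4=-(\xi +\sigma \chi _0)\). It assumes the standard pendulum modulus \(k^2=(A+B)/(2B)\) with period \(4K(k)/\sqrt{B}\) for \(|A|<B\); whether this k coincides with the \(k_1\) of §2 is an assumption here, and all names and parameter values are illustrative.

```python
import math

def elliptic_k(k):
    """Complete elliptic integral K(k) via the arithmetic-geometric mean."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):  # AGM converges quadratically; 40 steps is ample
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def unperturbed_rhs(state, B, sigma, chi0):
    """epsilon = 0 flow for (xi, phi) plus the running integral J of g_4."""
    xi, phi, J = state
    return (B * math.sin(phi), -xi, -(xi + sigma * chi0))

def rk4(f, state, h, n, *args):
    """Classical fixed-step fourth-order Runge-Kutta integrator."""
    for _ in range(n):
        k1 = f(state, *args)
        k2 = f(tuple(x + 0.5 * h * d for x, d in zip(state, k1)), *args)
        k3 = f(tuple(x + 0.5 * h * d for x, d in zip(state, k2)), *args)
        k4 = f(tuple(x + h * d for x, d in zip(state, k3)), *args)
        state = tuple(x + h / 6.0 * (p + 2 * q + 2 * r + w)
                      for x, p, q, r, w in zip(state, k1, k2, k3, k4))
    return state

A, B, sigma, chi0 = 0.5, 2.0, 10.0, 0.3      # illustrative values with |A| < B
modulus = math.sqrt((A + B) / (2.0 * B))     # assumed pendulum modulus
T = 4.0 * elliptic_k(modulus) / math.sqrt(B)
xi0 = math.sqrt(2.0 * (A + B))               # orbit point with phi = 0
n = 4000
xi_T, phi_T, J_T = rk4(unperturbed_rhs, (xi0, 0.0, 0.0), T / n, n, B, sigma, chi0)
print("closure error after time T:", abs(xi_T - xi0), abs(phi_T))
print("integral of g_4:", J_T, " vs  -sigma*chi0*T:", -sigma * chi0 * T)
```

The orbit closes after time T and the contribution of \(\xi \) averages out, so the period integral of \(g_4\) reduces to \(-\sigma \chi _0 T\) and vanishes only for \(\chi _0=0\).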

6 Discussion

In this paper, we have revisited the dynamics of the Lorenz equations in the regime of large Rayleigh number \(\rho \), which is known to feature periodic attractors rather than the famous chaotic dynamics (Robbins 1979; Sparrow 1982; Da Costa et al. 1981). Our main motivation was to study the transport properties of attractors in a parameter regime where states that maximize transport are dynamically unstable. For the Lorenz equations, it was proven in Souza and Doering (2015) that maximal transport is realized by the nonzero fixed points, which are unstable for \(\rho>\rho ^*, \lambda >1\). However, we found that the literature concerning existence and stability theory of periodic states for large \(\rho \) was incomplete. We have therefore provided a rigorous treatment, which essentially confirms the predictions of Sparrow (1982). Numerical computations for large finite \(\rho \) based on continuation methods and direct simulations have further corroborated these findings. In addition, we have quantified the transport of the periodic attractors and thus the gap in transport relative to the maximum possible. In particular, the transport of the periodic attractors can be arbitrarily small in a parameter range of bistability, where the states that maximize transport are also stable. Indeed, for fixed \(\rho \) we have identified a hysteresis loop in terms of the parameter \(\lambda = \frac{\sigma + 1}{\beta + 2}\), which illustrates the difficulty of recovering from a loss in transport once \(\lambda \) exceeds the ‘tipping point’ \(\lambda =1\). Moreover, we have computed the stability boundary of periodic attractors in the \((\lambda ,\rho )\)-plane and found that it extends to relatively low values of \(\rho \) below 200. For fixed \(\rho \), we also found a relation to well-known period-doubling bifurcations and symmetric heteroclinic cycles, which produce further regions of bi- and multi-stability of local attractors.

The Lorenz equations are the crudest mode truncation of the physical model, and there are numerous extensions. For several such generalizations, we have found that our results apply in suitable parameter regimes, in particular for the Lorenz–Stenflo system (Stenflo 1996). Although our results have no immediate implications in the context of atmospheric convection, we believe they provide a relevant case study for the relation of theoretical bounds and dynamically realized transport. The approach of perturbing selected solutions from the infinite Rayleigh number limit by exploiting structural properties would be interesting to explore for higher mode truncations and even the viscous Boussinesq equations. Indeed, recent numerical investigations for meaningful bounds in the Boussinesq equations are based on specific solutions and consider stability properties (Wen et al. 2022a, b). We remark that the mode-reduced Nusselt number \(\textrm{Nu}=1+\frac{2}{\beta \rho } H(\rho ,\beta ,\sigma ,{\textbf {X}}_0)\), cf. Souza and Doering (2015), is bounded by 3 as \(\rho \rightarrow \infty \) due to the transport bound from Souza and Doering (2015). However, this is far from the ‘ultimate’ or ‘classical’ Nusselt number bounds of order \(\rho ^{1/2}\) or \(\rho ^{1/3}\) for the PDE model (Wen et al. 2022a, b).

The present paper makes a step toward completely settling the question of transport for the Lorenz model. The set of parameter values for which the transport has not been analytically determined is now reduced to a compact set for which the dynamics are chaotic. In the large \(\rho \) regime, we have analytically determined stable structures and their transport. Although we have found numerical evidence for further stable invariant structures, it numerically appears (but remains to be proven) that for fixed \(\lambda >1\) and sufficiently large \(\rho \) the symmetric periodic orbits are the only attractors. In the chaotic regime for intermediate Rayleigh numbers, the transport is also reduced compared to the nonzero steady states. However, despite the numerous analytical results for the Lorenz attractor, it seems difficult to quantify the transport in that case. It would also be interesting to explore the possible emergence of discrete Lorenz attractors in extended Lorenz systems such as Palmer’s (Palmer 1998), which is close to a periodic forcing of the Lorenz system in a suitable parameter regime.