1 Introduction

Random matrix theory (RMT) is a powerful mathematical tool for studying complex quantum systems; it describes the universal properties of random matrices that are determined only by the system’s symmetry and are independent of microscopic details. For this reason, RMT has been applied to various fields ranging from disordered nuclei to isolated quantum many-body systems [1,2,3].

Among the various statistical quantities of RMT, the most widely used are the distributions of nearest level spacings \(\left\{ s_{i}=E_{i+1}-E_{i}\right\} \) and gap ratios \(\left\{ r_{i}=s_{i+1}/s_{i}\right\} \). It is well established that in a chaotic system, the level spacing distribution \(P\left( s\right) \) follows a Wigner–Dyson distribution [4, 5] (see Eq. (3) in Sect. 2), which reveals the level repulsion in a direct way. However, when computing \(P\left( s\right) \), an unfolding procedure is required to erase the model-dependent information about the local density of states (DOS). In contrast, the gap ratio distribution \(P\left( r\right) \) is independent of the DOS and requires no unfolding procedure [6, 7]; it has found various applications, especially in the context of many-body localization (MBL) [7,8,9,10,11,12,13,14,15,16,17,18].

Both the nearest level spacing and the gap ratio capture short-range level correlations. However, long-range correlations are also important, especially in the study of the MBL transition. There are a number of RMT models for the intermediate level statistics between the Wigner–Dyson and Poisson ensembles [19,20,21,22,23,24], and some of them have been suggested to describe the level statistics in the MBL transition regime [25,26,27,28,29]. All of them describe the short-range level correlations more or less well, and their differences are revealed only when long-range correlations are considered. Commonly, the long-range correlation in a random matrix ensemble is described by the number variance \(\Sigma ^{2}\) or the Dyson–Mehta \(\Delta _{3}\) statistics [5]; however, both are very sensitive to the concrete unfolding strategy, which has already been a source of misleading signatures [30]. Instead, it is more direct and numerically easier to study the higher-order level spacings and gap ratios, as has been done in a number of recent works [28, 29, 31,32,33,34,35,36,37,38,39,40].

Formally, the nth-order level spacing and gap ratio are defined as \( \big \{ s_{i}^{\left( n\right) }=E_{i+n}-E_{i}\big \} \) and \(\big \{ r_{i}^{\left( n\right) }=s_{i+n}^{\left( n\right) }/s_{i}^{\left( n\right) }\big \} \), respectively. The higher-order gap ratios were first studied in Ref. [33], where the authors provide strong numerical evidence that \(P\left( r^{\left( n\right) }\right) \) in a random matrix ensemble with Dyson index \(\beta \) bears the same form as \(P\left( r\right) \) with a rescaled parameter

$$\begin{aligned} \gamma =\frac{n(n+1)}{2}\beta +n-1, \end{aligned}$$
(1)

where \(\beta =1,2,4\) correspond to the Gaussian orthogonal ensemble (GOE), Gaussian unitary ensemble (GUE) and Gaussian symplectic ensemble (GSE), respectively. On the other hand, it was later proved in Ref. [39] that \(P\left( s^{\left( n\right) }\right) \) has the same form as the Wigner–Dyson distribution with a rescaled parameter \(\gamma \) that is identical to the one in Eq. (1) for \(\beta \in \left( 0,\infty \right) \). This strongly hints at a homogeneous relation between the higher-order level spacing and gap ratio, but an explanation is still lacking, which gives the first motivation of this work.

The second motivation comes from recent works on physical systems that go beyond the three standard ensembles with \(\beta =1,2,4\). For example, \(\beta =3\) behavior has been found in a 2D lattice with non-Hermitian disorder [41], and ensembles with non-integer \(\beta \) have been suggested to describe the level statistics across the whole MBL transition region in a 1D random spin system [27], although their efficiency in describing long-range spectral correlations is controversial [29]. Therefore, it is beneficial to have an expression for the higher-order spacing distributions in these ensembles, which is also provided in this study.

In this work, we find that the key to link the higher-order level spacing and gap ratio is the reduced energy spectrum \(\big \{ E_{i}^{\left( n\right) }\equiv E_{in}\big \} \), i.e. the spectrum constructed by picking one level from every n levels in the original spectrum \(\left\{ E_{i}\right\} \). By this construction, the nth-order level spacing and gap ratio in \(\big \{ E_{i}\big \} \) become the lowest order ones in \(\big \{ E_{i}^{\left( n\right) }\big \} \). It will be verified that the joint probability distribution of \(\big \{ E_{i}^{\left( n\right) }\big \} \) (to leading order) bears the same form as \(\big \{ E_{i}\big \} \) with a rescaled parameter \(\gamma \) expressed in Eq. (1), which hence explains their distributions by virtue of Wigner surmise. Furthermore, this rescaling relation holds for general \(\beta \) beyond \(\beta =1,2,4\); therefore, the higher-order level spacing and gap ratio distributions in these ensembles can be obtained accordingly, which is thus a natural extension of Ref. [33].

This paper is organized as follows. In Sect. 2 we summarize the formulas regarding the higher-order level spacings and gap ratios, which motivate the construction of the reduced energy spectrum. In Sect. 3 we focus on the cases where \(\big \{ E_{i}^{\left( n\right) }\big \} \) has two and three levels and provide compelling numerical evidence for the scaling of its joint probability distribution in ensembles with general \(\beta \). In Sect. 4 we provide numerical simulations for the distributions of the nearest level spacing and gap ratio in \(\big \{ E_{i}^{\left( n\right) }\big \} \) with a large number of levels and confirm their coincidence with the higher-order ones in \(\big \{ E_{i}\big \} \). In Sect. 5 we briefly discuss the higher-order reduced energy spectra. Conclusion and discussion come in Sect. 6.

2 Motivating the reduced energy spectrum

The starting point for studying the spectral statistics of a Gaussian ensemble of random matrices is the joint probability distribution (JPDT) of its energy levels, whose form is [4, 5]

$$\begin{aligned} P\left( \beta ,\left\{ E_{i}\right\} \right) =C\prod _{i<j}\left| E_{i}-E_{j}\right| ^{\beta }e^{-A\sum _{i=1}^{N}E_{i}^{2}}, \end{aligned}$$
(2)

where the Dyson index \(\beta \in \left( 0,\infty \right) \) is a continuous parameter, and C and A are coefficients related by the normalization condition \(\int \prod _{i}dE_{i}P\left( \beta ,\left\{ E_{i}\right\} \right) =1\). It is worth noting that only the ensembles with \(\beta =1,2,4\) are physically invariant ensembles, that is, each matrix element is allowed to be drawn from a Gaussian distribution provided the matrix is invariant under orthogonal, unitary or symplectic transformations, while for other values of \(\beta \), the JPDT stems from a special tridiagonal random matrix (see Eq. (21) in Sect. 4).

From the JPDT in Eq. (2), we can in principle calculate any statistical quantity we want, in particular the distributions of nearest level spacings and gap ratios. Their general forms for large dimension N are complicated, but for most practical purposes it is sufficient to adopt the so-called Wigner surmise, which deals with the smallest matrix that holds the quantity of interest. For example, to study the nearest level spacing it is sufficient to consider a \(2\times 2\) matrix, which gives the celebrated Wigner–Dyson distribution [5]

$$\begin{aligned} P\left( \beta ,s\right) =C\left( \beta \right) s^{\beta }e^{-A\left( \beta \right) s^{2}}, \end{aligned}$$
(3)

where the coefficients \(C\left( \beta \right) ,A\left( \beta \right) \) are determined by the normalization conditions

$$\begin{aligned} \int _{0}^{\infty }P\left( \beta ,s\right) \mathrm{d}s=1,\int _{0}^{\infty }sP\left( \beta ,s\right) \mathrm{d}s=1. \end{aligned}$$
(4)

It’s easy to see \(P\left( \beta ,s\rightarrow 0\right) \sim s^{\beta }\); hence, \(\beta \) is the parameter that controls the strength of level repulsion.
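The two conditions in Eq. (4) fix \(C\left( \beta \right) \) and \(A\left( \beta \right) \) in closed form through Gamma functions. The following Python sketch evaluates them; the function names are illustrative choices and not part of the original work.

```python
import math

def wigner_dyson_coefficients(beta):
    """Return (C, A) of Eq. (3), fixed by unit norm and unit mean spacing, Eq. (4)."""
    g1 = math.gamma((beta + 1) / 2)
    g2 = math.gamma((beta + 2) / 2)
    a = (g2 / g1) ** 2                     # from the unit-mean condition
    c = 2 * a ** ((beta + 1) / 2) / g1     # from the unit-norm condition
    return c, a

def wigner_dyson_pdf(beta, s):
    """Evaluate P(beta, s) = C s^beta exp(-A s^2)."""
    c, a = wigner_dyson_coefficients(beta)
    return c * s ** beta * math.exp(-a * s * s)

c_goe, a_goe = wigner_dyson_coefficients(1)   # GOE: C = pi/2, A = pi/4
c_gue, a_gue = wigner_dyson_coefficients(2)   # GUE: C = 32/pi^2, A = 4/pi
```

Because \(\beta \) enters only through the Gamma functions, the same routine can be called with any positive index, including the rescaled \(\gamma \) of Eq. (1).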

As for the nearest gap ratios \(\left\{ r_{i}=s_{i+1}/s_{i}\right\} \), a Wigner-like surmise is applicable by studying \(3\times 3\) matrices [6], which gives

$$\begin{aligned} P\left( \beta ,r\right) =\frac{1}{Z_{\beta }}\frac{\left( r+r^{2}\right) ^{\beta }}{\left( 1+r+r^{2}\right) ^{1+3\beta /2}} \end{aligned}$$
(5)

where \(Z_{\beta }\) is the normalization factor determined by requiring \( \int _{0}^{\infty }P\left( \beta ,r\right) dr=1\). It is crucial to note that the derivations for Eqs. (3) and (5) are purely mathematical, that is, applicable for arbitrary positive \(\beta \).

For the higher-order level spacings \(\big \{ s_{i}^{\left( n\right) }=E_{i+n}-E_{i}\big \} \), the distribution is studied in Ref. [39] using a Wigner-like surmise that deals with an \(\left( n+1\right) \times \left( n+1\right) \) matrix, and the result shows that they follow a generalized Wigner–Dyson distribution that bears the same form as Eq. (3) with the parameter \(\beta \) rescaled to \(\gamma \) as expressed in Eq. (1).

On the other hand, higher-order gap ratios come in two different ways, i.e. the “overlapping” way [31] and “non-overlapping” way [33]. In the former case, we are dealing with

$$\begin{aligned} {\widetilde{r}}_{i}^{\left( n\right) }=\frac{E_{i+n+1}-E_{i+1}}{E_{i+n}-E_{i}}= \frac{s_{i+n}+s_{i+n-1}+\cdots +s_{i+1}}{s_{i+n-1}+s_{i+n-2}+\cdots +s_{i}}, \end{aligned}$$
(6)

which is termed “overlapping” gap ratio since there are shared spacings between the numerator and denominator, while the nth-order “non-overlapping” gap ratio is defined as

$$\begin{aligned} r_{i}^{\left( n\right) }=\frac{E_{i+2n}-E_{i+n}}{E_{i+n}-E_{i}}=\frac{ s_{i+2n-1}+s_{i+2n-2}+\cdots +s_{i+n}}{s_{i+n-1}+s_{i+n-2}+\cdots +s_{i}}. \end{aligned}$$
(7)

Both generalizations reduce to the nearest gap ratio when \(n=1\), but they are very different when studying their distributions using the Wigner surmise: for the non-overlapping ratio \(r^{\left( n\right) }\), the smallest matrix dimension is \(\left( 2n+1\right) \times \left( 2n+1\right) \), while it is \(\left( n+2\right) \times \left( n+2\right) \) for the overlapping ratio. Naively, it is expected that \(P\left( {\widetilde{r}}^{\left( n\right) }\right) \) is more complicated due to the overlapping spacings. Indeed, the analytical form of \(P\left( {\widetilde{r}}^{\left( 2\right) }\right) \) was worked out in Ref. [31] and the result is quite involved. For the non-overlapping ratios, Ref. [33] provides compelling numerical evidence that in cases with \(\beta =1,2,4\), \(P\left( r^{\left( n\right) }\right) \) bears the same form as \(P\left( r\right) \) with the same rescaling parameter as the higher-order level spacing, that is, Eq. (1).
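The two definitions in Eqs. (6) and (7) differ only in which levels enter the numerator. A minimal Python sketch, with hypothetical function names and a toy spectrum chosen for illustration, makes the index bookkeeping explicit:

```python
def overlapping_ratios(levels, n):
    """Eq. (6): (E[i+n+1] - E[i+1]) / (E[i+n] - E[i]) for a sorted spectrum."""
    return [(levels[i + n + 1] - levels[i + 1]) / (levels[i + n] - levels[i])
            for i in range(len(levels) - n - 1)]

def non_overlapping_ratios(levels, n):
    """Eq. (7): (E[i+2n] - E[i+n]) / (E[i+n] - E[i]) for a sorted spectrum."""
    return [(levels[i + 2 * n] - levels[i + n]) / (levels[i + n] - levels[i])
            for i in range(len(levels) - 2 * n)]

# Toy spectrum with nearest spacings 1, 2, 3, 4, 5 (illustrative only).
toy_levels = [0.0, 1.0, 3.0, 6.0, 10.0, 15.0]
first_order_equal = overlapping_ratios(toy_levels, 1) == non_overlapping_ratios(toy_levels, 1)
second_order_overlapping = overlapping_ratios(toy_levels, 2)
second_order_non_overlapping = non_overlapping_ratios(toy_levels, 2)
```

The overlapping version needs \(n+2\) consecutive levels per ratio and the non-overlapping version needs \(2n+1\), which mirrors the minimal matrix dimensions quoted above.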

In summary, we have

$$\begin{aligned} P\left( \beta ,s^{\left( n\right) }\right) =P\left( \gamma ,s\right) , \end{aligned}$$
(8)
$$\begin{aligned} P\left( \beta ,r^{\left( n\right) }\right) =P\left( \gamma ,r\right) , \end{aligned}$$
(9)

where \(\gamma \) is expressed in Eq. (1). For the rest of this paper, “gap ratio” will always refer to the non-overlapping one.
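As a quick reference, the rescaled index of Eq. (1) can be tabulated for the three standard ensembles. The helper name below is an illustrative choice.

```python
def rescaled_index(beta, n):
    """gamma = n(n+1)/2 * beta + n - 1, Eq. (1)."""
    return n * (n + 1) / 2 * beta + n - 1

# Rows: beta = 1 (GOE), 2 (GUE), 4 (GSE); columns: n = 1, 2, 3, 4.
gamma_table = {beta: [rescaled_index(beta, n) for n in (1, 2, 3, 4)]
               for beta in (1, 2, 4)}
```

For \(n=1\) the function returns \(\beta \) itself, so Eqs. (8) and (9) reduce to Eqs. (3) and (5).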

The identical rescaling behavior of the higher-order level spacing and gap ratio hints that they may be attributed to a single origin, which we find to be the reduced energy spectrum \(\big \{ E_{i}^{\left( n\right) }\big \} \). Formally, a reduced energy spectrum \(\big \{ E_{i}^{\left( n\right) }\big \} \) is constructed by picking one level from every n levels in the original spectrum \(\big \{ E_{i}\big \} \), which is mathematically achieved by tracing out the \(n-1\) levels in between. This construction is very similar to that of the reduced density matrix, where we trace out the degrees of freedom in a subsystem; hence, \(\big \{ E_{i}^{\left( n\right) }\big \} \) is named “reduced energy spectrum”. By this construction, the nth-order level spacing and gap ratio in \(\big \{ E_{i}\big \} \) are mapped to the lowest-order counterparts in \(\big \{ E_{i}^{\left( n\right) }\big \} \). It is then natural to conjecture that the probability distributions of \(\big \{ E_{i}\big \} \) and \(\big \{ E_{i}^{\left( n\right) }\big \} \) bear the same form (to leading order), with the Dyson index \(\beta \) for the former rescaled to \(\gamma \) for the latter according to Eq. (1). Therefore, by applying the Wigner surmise to \(\big \{ E_{i}^{\left( n\right) }\big \} \), the scaling behaviors in Eqs. (8) and (9) can be explained simultaneously. Verifying this conjecture is the main task of the current work.
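On a numerically obtained spectrum, the construction amounts to keeping every nth level of the sorted list. The sketch below uses hypothetical helper names and an `offset` parameter that is not in the original text; it checks on a toy spectrum that nearest spacings of the reduced spectrum are nth-order spacings of the original one.

```python
def reduced_spectrum(levels, n, offset=0):
    """Keep one level out of every n from a sorted spectrum."""
    return levels[offset::n]

def nth_order_spacings(levels, n):
    """s_i^(n) = E[i+n] - E[i] for all admissible i."""
    return [levels[i + n] - levels[i] for i in range(len(levels) - n)]

# Toy spectrum (illustrative only).
toy_levels = [0, 1, 3, 6, 10, 15, 21]
reduced = reduced_spectrum(toy_levels, 2)
reduced_nearest = nth_order_spacings(reduced, 1)
original_second_order = nth_order_spacings(toy_levels, 2)
```

Here `reduced_nearest` coincides with every second entry of `original_second_order`, which is the mapping described in the paragraph above.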

Before proceeding, we mention the relatively trivial case of the Poisson ensemble. The reduced energy spectrum in the Poisson ensemble has been studied in Ref. [42] (where it is named the “Daisy model” by the authors), and the nth-order level spacing is shown to follow the generalized semi-Poisson distribution

$$\begin{aligned} P\left( s^{\left( n\right) }=s\right) =\frac{n^{n}}{(n-1)!}s^{n-1}e^{-ns}, \end{aligned}$$
(10)

which reduces to the conventional Poisson distribution \(P\left( s\right) =\exp \left( -s\right) \) when \(n=1\). For the nth-order gap ratios, we can derive their distribution from the results in Ref. [42], that is

$$\begin{aligned} P\left( r^{(n)}=r\right) =\frac{\left( 2n-1\right) !}{\left[ \left( n-1\right) !\right] ^{2}}\frac{r^{n-1}}{\left( 1+r\right) ^{2n}}, \end{aligned}$$
(11)

which reduces to the one given in Ref. [6] when \(n=1\). The formulas in Eqs. (8), (9), (10) and (11) will be used for later numerical simulations.
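The Poisson case is simple enough to check by direct sampling. The following sketch, with an arbitrary seed and sample size chosen for illustration, builds an uncorrelated spectrum from exponential spacings, takes its reduced spectrum with \(n=3\), and compares two consequences of Eqs. (10) and (11): the unit-mean spacing should have variance \(1/n\), and, because the ratio distribution is symmetric under \(r\rightarrow 1/r\), about half of the ratios should lie below one.

```python
import random

def poisson_levels(num_levels, rng):
    """Uncorrelated spectrum: cumulative sum of exponential spacings."""
    levels, energy = [], 0.0
    for _ in range(num_levels):
        energy += rng.expovariate(1.0)
        levels.append(energy)
    return levels

rng = random.Random(1)          # fixed seed, illustrative choice
n = 3
reduced = poisson_levels(150_000, rng)[::n]

# Nearest spacings of the reduced spectrum, rescaled to unit mean.
spacings = [b - a for a, b in zip(reduced, reduced[1:])]
mean_spacing = sum(spacings) / len(spacings)
unit_spacings = [s / mean_spacing for s in spacings]
variance = sum((s - 1.0) ** 2 for s in unit_spacings) / len(unit_spacings)

# Nearest gap ratios of the reduced spectrum.
ratios = [b / a for a, b in zip(spacings, spacings[1:])]
fraction_below_one = sum(r < 1 for r in ratios) / len(ratios)
```

These two summary numbers are only a coarse check; a histogram of `unit_spacings` and `ratios` against Eqs. (10) and (11) would be the fuller comparison.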

3 Scaling of \(P\left( \left\{ E_{i}^{\left( n\right) }\right\} \right) \)

By the construction of reduced energy spectrum \(\big \{ E_{i}^{\left( n\right) }\big \} \), its formal joint probability distribution is

$$\begin{aligned} P\left( \left\{ E_{i}^{\left( n\right) }\right\} \right) =\prod _{i}\int _{E_{in}}^{E_{\left( i+1\right) n}}\prod _{j=in+1}^{\left( i+1\right) n-1}dE_{j}P\left( \beta ,\left\{ E_{i}\right\} \right) . \end{aligned}$$
(12)

For reasons described in the previous section, we conjecture that \(P\left( \big \{ E_{i}^{\left( n\right) }\big \} \right) \) (to leading order) bears the same form as \(P\left( \left\{ E_{i}\right\} \right) \) with the rescaled parameter \(\gamma \) as in Eq. (1), that is,

$$\begin{aligned} P\left( \left\{ E_{i}^{\left( n\right) }\right\} \right) \sim \prod _{i<j}\left| E_{i}^{\left( n\right) }-E_{j}^{\left( n\right) }\right| ^{\gamma }e^{-A^{\prime }\sum _{i=1}^{N/n}\left( E_{i}^{\left( n\right) }\right) ^{2}}. \end{aligned}$$
(13)

An analytical derivation of Eq. (13) from Eq. (12) for arbitrary matrix dimension N is mathematically challenging; only the special case with \(\beta =2/k\) (k being a positive integer) has been proven rigorously in Ref. [43]. It is not our purpose to give a general proof; instead, to explain the behaviors of \(P\left( s^{\left( n\right) }\right) \) and \(P\left( r^{\left( n\right) }\right) \) in Eqs. (8) and (9), we only need to verify Eq. (13) in the sense of the Wigner surmise, that is, in the cases where \( \big \{ E_{i}^{\left( n\right) }\big \} \) has only two and three levels, for which we provide strong numerical evidence in the following.

First of all, the constant \(A^{\prime }\) is not important since it is only a decay rate parameter, whose value can be tuned by global rescaling of the energy levels without affecting the distribution of level spacing or gap ratio. Therefore, we will focus on the parameter \(\gamma \) that controls the strength of level repulsion.

The ensembles with general positive \(\beta \) can be divided into four typical categories: (i) \(\beta =1,2,4\), corresponding to the three standard Gaussian ensembles, which are of most physical interest; (ii) \(\beta \) is an integer that goes beyond the three standard ensembles, for which we choose \( \beta =3\); (iii) \(\beta \) is a fraction, for which we choose \(\beta =1/3\); (iv) \(\beta \) is irrational, for which we choose \(\beta =\left( \sqrt{5}-1\right) /2\) (the inverse of the golden ratio). We will verify the rescaling relation Eq. (13) in these cases.

We start with the case that \(\big \{ E_{i}^{\left( n\right) }\big \} \) has only two levels, where the rescaling in Eq. (13) becomes

$$\begin{aligned} I\left( E_{0},E_{n}\right)= & {} \int _{E_{0}}^{E_{n}}\prod _{i=1}^{n-1}dE_{i}P\left( \beta ,\left\{ E_{i}\right\} \right) \nonumber \\\sim & {} \left| E_{0}-E_{n}\right| ^{\gamma }e^{-A^{\prime }\left( E_{0}^{2}+E_{n}^{2}\right) }. \end{aligned}$$
(14)

Writing \(E_{0}=R\cos \theta \) and \(E_{n}=R\sin \theta \) and keeping R constant, we arrive at

$$\begin{aligned} \log I\left( \theta \right) =\gamma \log \left| \cos \theta -\sin \theta \right| +\text {const.} \end{aligned}$$
(15)

Without loss of generality, we take \(A=1\) and \(R=1\) and randomly generate 200 values of \(\theta \) in the range \([0,2\pi )\). We then numerically calculate \(\log I\left( \theta \right) \) and \(\log \left| \cos \theta -\sin \theta \right| \), and collect the results for \(n=2,3,4\), which are presented in Fig. 1. As can be seen, \(\log I\left( \theta \right) \) and \(\log \left| \cos \theta -\sin \theta \right| \) show a perfect linear dependence in all cases, with the fitted values of \( \gamma \) quite close to the expected ones in Eq. (1), as summarized in Table 1.
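The fitting procedure can be sketched in a few lines for the simplest case \(n=2\), \(\beta =1\), where only one level is integrated out and the expected slope is \(\gamma =4\). This is an illustrative reimplementation under stated assumptions (Simpson integration, a fixed random seed, ordinary least squares), not the code used to produce Fig. 1.

```python
import math
import random

def two_level_weight(e0, e2, beta, steps=400):
    """I(E0, E2) of Eq. (14) for n = 2 and A = 1, up to the constant C."""
    lo, hi = min(e0, e2), max(e0, e2)
    h = (hi - lo) / steps

    def integrand(e1):
        # Part of Eq. (2) that depends on the integrated level E1.
        return abs(e1 - lo) ** beta * abs(hi - e1) ** beta * math.exp(-e1 * e1)

    total = integrand(lo) + integrand(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    integral = total * h / 3
    return integral * (hi - lo) ** beta * math.exp(-(lo * lo + hi * hi))

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)          # fixed seed, illustrative choice
beta = 1.0
xs, ys = [], []
for _ in range(200):
    theta = rng.uniform(0.0, 2.0 * math.pi)
    e0, e2 = math.cos(theta), math.sin(theta)   # R = 1
    xs.append(math.log(abs(e0 - e2)))
    ys.append(math.log(two_level_weight(e0, e2, beta)))

gamma_fit = fit_slope(xs, ys)   # to be compared with 3 * beta + 1 = 4
```

The fitted slope is not expected to equal 4 exactly, because the Gaussian factor of the integrated level adds a correction that vanishes only for small \(\left| E_{0}-E_{n}\right| \); this is the sense of "to leading order" in Eq. (13).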

Fig. 1

The fitting of Eq. (15) for \(\beta =1,2,3,4, \frac{1}{3},\frac{\sqrt{5}-1}{2}\) with \(n=2,3,4\). The fitted slopes are shown in the figure legends, where the numbers in brackets are the expected values according to Eq. (1)

Table 1 The values of \(\gamma \) with different \(\beta \) and order n, where \(\gamma _{\text {e}}\) is the expected value according to Eq. (1). \(\gamma _{\text {num.}}^{\left( 1\right) }\) and \(\gamma _{\text {num.}}^{\left( 2\right) }\) are the values fitted from Eqs. (15) and (18); their relative errors with respect to \(\gamma _{\text {e}}\) are denoted by \(\hbox {error}^{\left( 1\right) }\) and \(\hbox {error}^{\left( 2\right) }\), respectively

Next, we consider the case that \(\left\{ E_{i}^{\left( n\right) }\right\} \) has three levels; now the rescaling in Eq. (13) becomes

$$\begin{aligned}&Q\left( E_{0},E_{n},E_{2n}\right) \nonumber \\&\quad \equiv \int _{E_{0}}^{E_{n}}\prod _{i=1}^{n-1}dE_{i}\int _{E_{n}}^{E_{2n}} \prod _{j=n+1}^{2n-1}dE_{j}P\left( \beta ,\left\{ E_{i}\right\} \right) \nonumber \\&\quad \sim \left| E_{0}-E_{n}\right| ^{\gamma }\left| E_{n}-E_{2n}\right| ^{\gamma }\left| E_{0}-E_{2n}\right| ^{\gamma }e^{-A^{\prime }\left( E_{0}^{2}+E_{n}^{2}+E_{2n}^{2}\right) }. \end{aligned}$$
(16)

With the transformation to spherical coordinates

$$\begin{aligned} E_{0}=R\sin \theta \cos \varphi ,E_{n}=R\sin \theta \sin \varphi ,E_{2n}=R\cos \theta \end{aligned}$$
(17)

and keeping R constant, we arrive at

$$\begin{aligned} \log Q\left( R,\theta ,\varphi \right) =\gamma \log G\left( \theta ,\varphi \right) +\text {const.} \end{aligned}$$
(18)

where

$$\begin{aligned} G\left( \theta ,\varphi \right)= & {} |\left( \sin \theta \cos \varphi -\sin \theta \sin \varphi \right) \nonumber \\&\times \left( \sin \theta \cos \varphi -\cos \theta \right) \nonumber \\&\times \left( \sin \theta \sin \varphi -\cos \theta \right) |. \end{aligned}$$
(19)

We then perform numerical checks, where we fix \(R=1\) and \(A=1\) as before. We randomly generate 200 pairs of \(\left( \theta ,\varphi \right) \) and numerically determine \(\log Q\left( \theta ,\varphi \right) \equiv \log Q\left( 1,\theta ,\varphi \right) \) and \(\log G\left( \theta ,\varphi \right) \); the results are displayed in Fig. 2. As can be seen, the linear dependence between \(\log Q\left( \theta ,\varphi \right) \) and \(\log G\left( \theta ,\varphi \right) \) is still perfect in all cases, and the fitted values of \(\gamma \) are close to the expected ones in Eq. (1).

Fig. 2

The fitting of Eq. (18) for \(\beta =1,2,3,4, \frac{1}{3},\frac{\sqrt{5}-1}{2}\) with \(n=2,3,4\). The fitted \( \gamma \) values are shown in the figure legends, with the expected values according to Eq. (1) in brackets

For convenience, we collect the theoretical and numerical values of \( \gamma \) in Table 1, where \(\gamma _{\text {e}}\) refers to the expected value according to Eq. (1), \(\gamma _{\text {num.} }^{\left( 1\right) }\) and \(\gamma _{\text {num.}}^{\left( 2\right) }\) are the numerical values fitted from Eqs. (15) and (18), and \(\hbox {error}^{\left( 1\right) }\) and \(\hbox {error}^{\left( 2\right) }\) are their relative deviations from \(\gamma _{\text {e}}\), respectively. In general, the numerical errors increase with larger \(\beta \), but they remain at a satisfactory level in all cases.

Up to now, we have verified the scaling behavior for the probability distribution of \(\big \{ E_{i}^{\left( n\right) }\big \} \) with two and three levels. The cases with more levels can be verified with the same method, but this becomes more and more tedious as the number of integrals increases. Nevertheless, the present results are sufficient to justify the scaling behaviors of the higher-order level spacing and gap ratio by applying the Wigner surmise to \(\big \{ E_{i}^{\left( n\right) }\big \} \). More importantly, we have verified that the scaling behavior holds for general \( \beta \) beyond GOE, GUE and GSE (\(\beta =1,2,4\)), even when \(\beta \) is non-integer or irrational. This indicates that the distributions of the higher-order level spacing and gap ratio in Eqs. (8) and (9) also hold for general \(\beta \) ensembles, for which we present numerical evidence in the next section.
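A heuristic power-counting argument, added here as a sketch that ignores the Gaussian factor and is not a proof, indicates why the exponent takes the form of Eq. (1). Consider \(n+1\) consecutive levels \(E_{0}<E_{1}<\cdots <E_{n}\) with small \(d=E_{n}-E_{0}\), and substitute \(E_{i}=E_{0}+d\,t_{i}\) with \(t_{0}=0\) and \(t_{n}=1\):

$$\begin{aligned} I\left( E_{0},E_{n}\right)&\sim \int \prod _{i=1}^{n-1}dE_{i}\prod _{0\le i<j\le n}\left| E_{i}-E_{j}\right| ^{\beta } \nonumber \\&=d^{\frac{n(n+1)}{2}\beta +n-1}\int _{0<t_{1}<\cdots <t_{n-1}<1}\prod _{i=1}^{n-1}dt_{i}\prod _{0\le i<j\le n}\left| t_{i}-t_{j}\right| ^{\beta }. \end{aligned}$$

Each of the \(n(n+1)/2\) pairwise differences contributes a factor \(d^{\beta }\) and each of the \(n-1\) integration measures contributes a factor d, while the remaining integral is a constant independent of d. For \(n=2\) this gives \(d^{3\beta +1}\), in agreement with \(\gamma =3\beta +1\) from Eq. (1).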

4 Numerical simulations

In this section, we numerically check the distributions of the nearest level spacing and gap ratio in \(\big \{ E_{i}^{\left( n\right) }\big \} \) with many levels, and show that they indeed coincide with the nth-order counterparts in \(\big \{ E_{i}\big \} \). Before that, a technical issue needs to be pointed out: the nearest level spacings in \(\big \{ E_{i}^{\left( n\right) }\big \} \) are \(\big \{ E_{i+1}^{\left( n\right) }-E_{i}^{\left( n\right) }=E_{\left( i+1\right) n}-E_{in}\big \}\), whose total number is \(\left[ \frac{N}{n}\right] -1\), while the nth-order level spacings in the original energy spectrum \(\big \{ E_{i}\big \} \) are \(\big \{ E_{i+n}-E_{i}\big \} \), with total number \(N-n\); therefore, the mapping is not strictly one-to-one, and the same holds for the gap ratios. However, since the distribution is extracted from a large number of level spacings (gap ratios), it is natural to expect that the difference is negligible when the number of samples and the matrix dimension are large, which we will soon justify.

Fig. 3

Distribution of nearest level spacing and gap ratio in the reduced energy spectrum \(\left\{ E_{i}^{\left( n\right) }\right\} \) of the model in Eq. (20) for an \(L=12\) chain with \(n=2\) (a, d), \(n=3\) (b, e) and \(n=4\) (c, f). The data from \(h=1\) in the orthogonal (unitary) model represent GOE (GUE), and those from \(h=5\) in the orthogonal model represent Poisson. The reference curves correspond to the higher-order spacing distributions in \(\left\{ E_{i}\right\} \) according to Eqs. (8), (9) for GOE and GUE, and Eqs. (10), (11) for Poisson; the parameter \(\gamma \) for the former is calculated by Eq. (1). The close agreement in all cases confirms the coincidence between higher-order spacing distributions in \(\left\{ E_{i}\right\} \) and the lowest-order counterparts in \(\left\{ E_{i}^{(n)}\right\} \)

For GOE, GUE and Poisson ensemble, we perform simulations from a real physical system, that is, the 1D Heisenberg chain with random external fields, which is the canonical model to study many-body localization [44]. The Hamiltonian reads

$$\begin{aligned} H=J\sum _{i=1}^{L}{\mathbf {S}}_{i}\cdot {\mathbf {S}}_{i+1}+\sum _{i=1}^{L}\sum _{ \alpha =x,y,z}h^{\alpha }\varepsilon _{i}^{\alpha }S_{i}^{\alpha }, \end{aligned}$$
(20)

where \({\mathbf {S}}_{i}\) is the spin-1/2 operator. The anti-ferromagnetic coupling strength J is set to unity, and the \(\varepsilon _{i}^{\alpha }\) are random numbers within the range \(\left[ -1,1\right] \). The \(h^{\alpha }\) is referred to as the randomness strength. We consider two sets of \(h^{\alpha }\): (i) \( h^{x}=h^{z}=h\ne 0\) and \(h^{y}=0\), for which the model is orthogonal and belongs to GOE; (ii) \(h^{x}=h^{y}=h^{z}=h\ne 0\), for which the model is unitary and belongs to GUE. This model undergoes a thermal-MBL transition at roughly \(h_{c}\simeq 3\) (2.5) in the orthogonal (unitary) case, where the level spacing distribution evolves from GOE (GUE) to Poisson [9, 10].

We choose an \(L=12\) system to perform simulations and prepare 500 samples of the energy spectrum at \(h=1\) in both the orthogonal and unitary models, representing GOE and GUE, respectively. We also prepare 500 samples at \(h=5\) in the orthogonal model to represent the Poisson ensemble. For each sample of the energy spectrum, we construct the reduced energy spectrum \(\big \{ E_{i}^{\left( n\right) }\big \} \) with \(n=2,3,4\), compute the corresponding nearest level spacing and gap ratio distributions and compare them to the formulas in Eqs. (8)–(11); the results are displayed in Fig. 3. As can be seen, the fits are quite good, confirming the correspondence between the nearest level spacing/gap ratio in \(\big \{ E_{i}^{\left( n\right) }\big \} \) and the nth-order counterparts in \(\big \{ E_{i}\big \} \).

For ensembles with general \(\beta \), we perform numerical simulations using model random matrices. It was proven in Ref. [45] that the eigenvalues of the following tridiagonal matrix ensemble

$$\begin{aligned} M_{\beta }=\frac{1}{\sqrt{2}}\left( \begin{array}{ccccc} x_{1} & y_{1} & & & \\ y_{1} & x_{2} & y_{2} & & \\ & \ddots & \ddots & \ddots & \\ & & y_{N-2} & x_{N-1} & y_{N-1} \\ & & & y_{N-1} & x_{N} \end{array} \right) \end{aligned}$$
(21)

will follow the distribution in Eq. (2) with continuous parameter \(\beta \in \left( 0,\infty \right) \) provided the diagonals \(x_{i}\) (\(i=1,2,\ldots ,N\)) follow the normal distribution \( N \left( 0,2\right) \) with zero mean and variance 2, that is

$$\begin{aligned} P\left( x_{i}\right) =\frac{1}{2\sqrt{\pi }}e^{-x_{i}^{2}/4}, i=1,2,\ldots ,N, \end{aligned}$$
(22)

and \(y_{k}\) (\(k=1,2,\ldots ,N-1\)) follows the \(\chi \) distribution with parameter \(\left( N-k\right) \beta \), that is

$$\begin{aligned} P\left( y_{k}\right) =\left\{ \begin{array}{ll} \frac{2}{2^{\left( N-k\right) \beta /2}\Gamma \left( \left( N-k\right) \beta /2\right) }y_{k}^{\left( N-k\right) \beta -1}e^{-y_{k}^{2}/2}, &{}\quad y_{k}\ge 0 \\ 0, &{}\quad y_{k}<0 \end{array} \right. . \end{aligned}$$
(23)

By virtue of this remarkable construction, we can efficiently generate energy spectra with any desired \(\beta \) [46].
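A minimal sketch of how such a spectrum could be generated with the Python standard library is given below. It assumes that \(N\left( 0,2\right) \) denotes zero mean and variance 2, draws each \(y_{k}\) as the square root of a chi-square variable, and finds the eigenvalues by Sturm-sequence bisection; the function names, seed and matrix size are illustrative, and a linear-algebra library would normally replace the hand-written eigenvalue routine.

```python
import math
import random

def sample_tridiagonal(num_levels, beta, rng):
    """Sample (diagonal, off-diagonal) of M_beta in Eq. (21), prefactor included."""
    scale = 1.0 / math.sqrt(2.0)
    # x_i: normal with zero mean and variance 2 (assumed reading of N(0, 2)).
    diag = [scale * rng.gauss(0.0, math.sqrt(2.0)) for _ in range(num_levels)]
    # y_k: chi variable with (N - k) * beta degrees of freedom, i.e. the square
    # root of a Gamma(shape = dof / 2, scale = 2) variable.
    off = [scale * math.sqrt(rng.gammavariate((num_levels - k) * beta / 2.0, 2.0))
           for k in range(1, num_levels)]
    return diag, off

def count_below(diag, off, x):
    """Sturm-sequence count of eigenvalues smaller than x."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - x - (off[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = -1e-300   # avoid division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def eigenvalues(diag, off, iterations=60):
    """Sorted eigenvalues of a symmetric tridiagonal matrix by bisection."""
    n = len(diag)
    pad = [0.0] + list(off) + [0.0]
    # Gershgorin bounds enclose the whole spectrum.
    lower = min(diag[i] - pad[i] - pad[i + 1] for i in range(n))
    upper = max(diag[i] + pad[i] + pad[i + 1] for i in range(n))
    result = []
    for k in range(n):
        lo, hi = lower, upper
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if count_below(diag, off, mid) > k:
                hi = mid
            else:
                lo = mid
        result.append(0.5 * (lo + hi))
    return result

rng = random.Random(2)                      # fixed seed, illustrative choice
diag, off = sample_tridiagonal(40, 1.0 / 3.0, rng)
levels = eigenvalues(diag, off)
```

The cost of sampling is linear in N because only the \(2N-1\) nonzero entries are drawn, which is what makes non-integer \(\beta \) accessible at large dimension.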

In accordance with Sect. 3, we choose to simulate the cases with \( \beta =3,\frac{1}{3},\frac{\sqrt{5}-1}{2}\). For each \(\beta \), we generate 500 samples of energy spectra by using Eq. (21), with the number of energy levels in the original spectrum \(\big \{ E_{i}\big \} \) fixed at 600. Then, we construct the corresponding reduced energy spectra \(\big \{ E_{i}^{\left( n\right) }\big \} \) with \(n=2,3,4\) and determine the distributions of nearest level spacings/gap ratios in them; the results are shown in Fig. 4. As can be seen, the fits are quite satisfactory in all cases.

Fig. 4

Distribution of nearest level spacing and gap ratio in the reduced energy spectrum \(\big \{ E_{i}^{\left( n\right) }\big \}\) with \(n=2,3,4\) from ensembles with \(\beta =3\) (a, d), \(\frac{1}{3}\) (b, e) and \(\frac{\sqrt{5}-1}{2}\) (c, f). The reference curves with index \(\gamma \) correspond to Eqs. (8) and (9) for the level spacing and gap ratio, respectively. These fits confirm the higher-order spacing distributions in general \(\beta \) ensembles

Up to now, we have verified the coincidence between the nearest level spacing/gap ratio in \(\big \{ E_{i}^{\left( n\right) }\big \} \) and the nth-order ones in \(\left\{ E_{i}\right\} \) for general Gaussian \(\beta \) ensembles. As a result, we have generalized the scaling behavior for \( P\left( s^{\left( n\right) }\right) \) and \(P\left( r^{\left( n\right) }\right) \) in Eqs. (8) and (9) to cases beyond GOE, GUE and GSE. The Gaussian ensemble with non-integer \(\beta \) has been used to describe the spacing distributions along the thermal-MBL transition [27], while its efficiency in describing long-range level correlations is under debate [28]. Our results thus provide a numerical criterion for such studies.

5 Higher-order reduced energy spectra

The results in Sect. 3 verify the scaling of \(\big \{ E_{i}^{\left( n\right) }\big \} \) in cases with two and three levels, which is sufficient for applying the Wigner surmise to \(\big \{ E_{i}^{\left( n\right) }\big \} \) regarding the level spacing and gap ratio, and the numerical simulations in Sect. 4 provide strong support for the cases where \(\big \{ E_{i}^{\left( n\right) }\big \} \) has more levels. We can go further and construct the 2nd-order reduced energy spectra, i.e. the “reduced energy spectrum of the reduced energy spectrum”, whose nearest level spacing/gap ratio correspond to the higher-order ones in \(\big \{ E_{i}^{\left( n\right) }\big \} \) and rescale in a similar manner. Denoting the mth-order level spacing and gap ratio in \(\big \{ E_{i}^{\left( n\right) }\big \} \) as \(s_{i}^{\left( n,m\right) }\) and \( r_{i}^{\left( n,m\right) }\), that is,

$$\begin{aligned} s_{i}^{\left( n,m\right) }=E_{i+m}^{\left( n\right) }-E_{i}^{\left( n\right) },\, r_{i}^{\left( n,m\right) }=\frac{E_{i+2m}^{\left( n\right) }-E_{i+m}^{\left( n\right) }}{E_{i+m}^{\left( n\right) }-E_{i}^{\left( n\right) }}, \end{aligned}$$
(24)

it is straightforward to write down their expected probability distributions

$$\begin{aligned}&P\left( \beta ,s^{\left( n,m\right) }\right) =P\left( \delta ,s\right) , \,P\left( \beta ,r^{\left( n,m\right) }\right) =P\left( \delta ,r\right) \nonumber \\&\delta =\frac{m(m+1)}{2}\gamma +m-1, \gamma =\frac{n(n+1)}{2}\beta +n-1. \end{aligned}$$
(25)

The same procedure can be continued for higher-order reduced energy spectra.

Of course, such a construction is artificial; nevertheless, it reveals that a hierarchy of energy spectra can emerge from a single spectrum, all of which (to lowest order) bear the same form of probability distribution. Moreover, by taking a closer look at the scaling expression \(\gamma =\frac{n\left( n+1\right) }{2}\beta +n-1\), we immediately recognize an infinite number of coincidence relations between \(\big \{ E_{i}^{\left( n\right) }\big \} \) from different ensembles. For example, \(\big \{ E_{i}^{\left( 2\right) }\big \} \) in \(\beta =1\) (GOE) has the same structure as \(\big \{ E_{i}\big \} \) in \(\beta =4\) (GSE), a result known before [43, 47], and both of them coincide with \(\big \{ E_{i}^{\left( 3\right) }\big \} \) in \(\beta =\frac{1}{3}\). In fact, since \(\beta >0\) requires \(\gamma >n-1\), it is easy to verify that for \(\gamma \in (k,k+1]\) there exist \(k+1\) different sets of \(\left( \beta ,n\right) \), with \(n=1,\ldots ,k+1\), that have equal \(\gamma \), and their lower-order level statistics are expected to be identical.
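Such coincidences can be enumerated by inverting Eq. (1) for \(\beta \) at fixed \(\gamma \) and keeping the solutions with \(\beta >0\). The short sketch below does this with exact rational arithmetic; the function name is an illustrative choice.

```python
from fractions import Fraction

def equivalent_ensembles(gamma):
    """All (beta, n) with beta > 0 whose rescaled index from Eq. (1) equals gamma."""
    gamma = Fraction(gamma)
    pairs = []
    n = 1
    while gamma - n + 1 > 0:
        # Inverse of Eq. (1): beta = 2 (gamma - n + 1) / (n (n + 1)).
        beta = 2 * (gamma - n + 1) / (n * (n + 1))
        pairs.append((beta, n))
        n += 1
    return pairs

pairs_for_gamma_4 = equivalent_ensembles(4)
```

For \(\gamma =4\) the list contains \(\left( \beta ,n\right) =\left( 4,1\right) ,\left( 1,2\right) ,\left( \frac{1}{3},3\right) \) and \(\left( \frac{1}{10},4\right) \), the first three being the coincidences named in the text.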

6 Conclusion and discussion

We studied the reduced energy spectrum \(\big \{ E_{i}^{\left( n\right) }\big \} \), constructed by picking one level from every n levels in the original spectrum \(\big \{ E_{i}\big \} \). We verified that the distribution of \(\big \{ E_{i}^{\left( n\right) }\big \} \) (to leading order) bears the same form as that of \(\left\{ E_{i}\right\} \), with the Dyson index rescaled from \( \beta \) to \(\gamma =\frac{n\left( n+1\right) }{2}\beta +n-1\). We then demonstrated that the nearest level spacing and gap ratio in \(\big \{ E_{i}^{\left( n\right) }\big \} \) correspond to the nth-order ones in \( \left\{ E_{i}\right\} \), which simultaneously explains the distributions of the latter found recently in Ref. [39] and Ref. [33].

Moreover, we find the rescaling of reduced energy spectrum holds for Gaussian ensembles that go beyond the standard GOE, GUE and GSE and establish the distributions of higher-order level spacings and gap ratios in these ensembles. We also confirmed such correspondences in the Poisson ensemble and discovered the distribution of nth-order gap ratios, as expressed in Eq. (11).

We note that the reduced energy spectrum has been studied for the Poisson ensemble [42] and for some Gaussian ensembles, namely the special ones with \(\beta =2/k\) (k being a positive integer) [43], which include the well-known coincidence between \(\left\{ E_{i}^{\left( 2\right) }\right\} \) in GOE and \(\left\{ E_{i}\right\} \) in GSE. Our work is thus a natural extension of these studies.

The significance of our work is threefold. First, we explained the distributions of higher-order level spacings and gap ratios, both of which are widely used in the study of MBL and whose distributions were found separately in recent studies, by a single common mechanism: the reduced energy spectrum. Second, we generalized the higher-order spacing distributions in Eqs. (8) and (9) to general \(\beta \) ensembles, which may be beneficial for studying systems that go beyond the standard Gaussian ensembles [27, 41, 48,49,50]. Third, by constructing the reduced energy spectra, our results reveal a rich set of structures hidden in the energy spectrum.

Last but not least, in our numerical simulations in Sect. 4, we employed the model matrix of the general \(\beta \) ensemble as expressed in Eq. (21). It is thus natural and interesting to ask whether this “parent matrix” corresponds to a real quantum system, and what the properties of such a system are if it does. Given that the physically relevant GOE, GUE and GSE are incorporated in Eq. (21), we conjecture that such a “parent Hamiltonian” does exist, whose construction is left for a future study.