1 Introduction

Genes switch among different states due to the regulation of transcription factors and synthesise proteins at a state-dependent rate. This paper considers self-regulatory genes with positive feedback, in which the transcription factors are synthesised by the regulated gene itself and in turn help the gene switch to a state with a relatively large synthesis rate. This may lead to a copy-number distribution with multiple modes, which, from a biological point of view, correspond to multiple phenotypic states of a living cell continuously exchanging materials and energy with its surroundings (Choi et al. 2008; Gupta et al. 2011; Ozbudak et al. 2004). Transitions among phenotypic states induced by intrinsic stochasticity are advantageous for cells to survive in fluctuating environments (Kussell and Leibler 2005; Acar et al. 2008).

Early mathematical works use the reaction rate equation to model the expression of a gene with two (Babloyantz and Sanglier 1972) or multiple (Santillán 2008) states, which neglects the randomness. The chemical-master-equation (CME) model, on the contrary, describes a random dynamics inside a single cell (Delbrück 1940; Gillespie 1977), and has been applied to study lots of gene-regulatory networks (Samad et al. 2005; Berg 1978; Thattai and van Oudenaarden 2001; Paulsson 2005; Jia et al. 2018; Newby and Chapman 2014; Newby 2015). Although the CME captures both types of randomness, i.e. stochastic gene-state switchings and low copy numbers of chemical species inside a single cell (Li and Xie 2011; Taniguchi et al. 2010; Eldar and Elowitz 2010), its mathematical computation is usually complex, and the exact solution is only available in simple cases (Hornos et al. 2005; Ramos et al. 2011).

Therefore, many algorithms have been proposed to numerically solve the probability distributions of the CME accurately for problems of interest, without Monte Carlo simulations, which are usually computationally expensive. For example, the Finite State Projection (FSP) algorithm (Munsky and Khammash 2006) and its improved versions (Peleš et al. 2006; MacNamara et al. 2008; Kazeev et al. 2014; Hegland et al. 2008) utilize efficient projections in the vector-based state space of the CME. Recently, the FSP has been further developed to efficiently estimate the stationary distribution and the parameter sensitivities of the CME (Gupta et al. 2017; Dürrenberger et al. 2019). Other examples include the on-the-fly variant of the uniformisation technique, which improves the original algorithm at the cost of a small approximation error (Mateescu et al. 2010), and the method of conditional moments (MCM), which employs a discrete stochastic description for low-copy-number species and a moment-based description for medium/high-copy-number species (Hasenauer et al. 2014).

Alternatively, various simplified mechanistic models rather than just numerical algorithms have also been proposed to approximate the CME and investigate the mechanism of single-cell dynamics. Their mathematical foundation is the limit behavior of general CME under different scales of reaction rate, species abundance, and time (Crudu et al. 2012; Kang and Kurtz 2013). Specifically, for gene-regulatory networks, simplified models are applicable under different parameter regions based on the relationship between the gene-state switchings and the birth–death kinetics of proteins. Most previous works assumed that gene-state switchings are extremely slow (Karmakar and Bose 2004; Artyomov et al. 2007; Qian et al. 2009; To and Maheshri 2010; Feng et al. 2011; Ochab-Marcinek and Tabaka 2010) or extremely rapid (Ge and Qian 2009; Wang et al. 2010; Zhou et al. 2012; Qian 2014; Lu et al. 2014; Hufton et al. 2019a, b) to avoid the mathematical difficulty in subsequent analyses. However, at least in bacteria, the single-cell gene-state switchings are neither extremely slow nor extremely rapid (Li and Xie 2011; Taniguchi et al. 2010; Choi et al. 2008; Gupta et al. 2011; Ozbudak et al. 2004). The relative stability of phenotypic states and the transition rates among them in such an intermediate region are far from quantitatively understood.

Recently, Ge et al. (2015) have proposed a so-called fluctuating-rate (FR) model for the more realistic intermediate region, which neglects the randomness caused by the low copy number of proteins but retains the randomness caused by gene-state switchings. The FR model is much more accessible for mathematical analyses than the full CME model because its mathematical prototype, i.e. the piecewise deterministic Markov process (PDMP), has been well studied (Davis 1984, 1993). A PDMP is a Markov process whose randomness comes only from the jumps among different deterministic dynamics. PDMPs have appeared in several previous studies of gene-regulatory networks (Kepler and Elston 2001; Newby 2012; Hufton et al. 2016, 2018). These works are all restricted to the case of a specific number (two or three) of gene states. In addition, Lin and Doering (2016) studied two-state cases theoretically, but only provided numerical results for multiple-state ones.

Actually, it has already been proved that, after taking the limit of large active synthesis rates of proteins, the full CME model approaches the FR model, and if the switching rates between discrete gene states further tend to infinity, the FR model finally converges to the deterministic mean-field dynamics (DMFD) (Crudu et al. 2012; Faggionato et al. 2010). Studying the relative stability of phenotypic states and the transition rates among them requires going one step beyond this law of large numbers, namely to the large deviation principle (LDP), especially the Freidlin–Wentzell-type one (Freidlin and Wentzell 2014). Although the general dynamic LDP theory of PDMP models has already been derived (Kifer 2009; Faggionato et al. 2009, 2010), explicit expressions of the LDP rate function of the steady-state distribution as well as the proof of its Lyapunov property for the DMFD are still lacking for gene-regulatory FR models with multiple gene states. The case with only two gene states has been solved via the WKB expansion in Ge et al. (2015). The connection between LDP and WKB methods is justified in Bressloff and Faugeras (2017). One might expect the scenario with multiple gene states to share the same mathematical results as the case with only two gene states, and the multiple-state FR model has indeed already been applied to investigate the stochastic kinetics of the lac operon (Ge et al. 2018), but the mathematical methods in Ge et al. (2015) cannot be generalized straightforwardly.

Besides the rigorous LDP for PDMPs, there are a variety of application techniques in solving the first-passage time problems of stochastic hybrid systems, which include WKB approximations and matched asymptotics (Bressloff and Newby 2014b; Keener and Newby 2011; Newby 2012; Newby and Keener 2011; Newby et al. 2013), and path-integrals (Bressloff and Newby 2014a; Bressloff 2015). But the calculation of the LDP rate function by these techniques is still difficult for the multiple-state case even numerically. That is why most applications of the PDMP so far are still restricted to the case of two or three gene states.

We focus on the LDP rate function of the stationary distribution, which is a quasi-potential of the FR model and describes the leading order of the fluctuations of the protein abundance. We find that its derivative with respect to the continuous variable can be formulated as the unique nontrivial solution of a dominant generalised eigenvalue problem in the case with an arbitrary finite number of gene states, which generalises the results in Ge et al. (2015). The main mathematical tools are the Perron–Frobenius theorem (Frobenius 1912) and the convexity of the dominant eigenvalue of an essentially nonnegative matrix in its diagonal elements (Deutsch and Neumann 1984). Such a detailed investigation of dominant generalised eigenvalue problems leads to a new numerical algorithm for obtaining the LDP rate function of the steady-state distribution in the FR models. We further prove the Lyapunov property of the LDP rate function with respect to the DMFD, based on the above analysis. Whereas the results of Faggionato et al. (2009) are restricted to the case with a unique fixed point, our result on the Lyapunov property is general. The prefactor of the FR model, which provides the next order of fluctuations, is also proved to be continuous and positive.

We use a tristable example to numerically show that the rate function of the FR model correctly predicts the transition rates between phenotypic states, i.e. different attractors of DMFD, in the intermediate region based on the Freidlin–Wentzell-type LDP. Moreover, the local fluctuations, i.e. asymptotic variance, of each phenotypic state in the intermediate region of gene-state switchings are highly different from those in the rapid region, even if their DMFDs are the same.

This paper is organised as follows. Heuristic derivation of the FR model with multiple gene states as well as the associated LDP rate function is given in Sect. 2. Mathematically rigorous proof of the LDP rate function as the unique nontrivial solution of a dominant generalised eigenvalue problem and the Lyapunov property with respect to DMFD are given in Sect. 3. In the same section, we propose a new numerical algorithm for the calculation of LDP rate function. In Sect. 4, a tristable example is analysed in detail to further justify the main results. The conclusions and remarks are presented in Sect. 5.

2 Approximate the full CME model with multiple gene states

We briefly describe the full CME model of protein syntheses in Sect. 2.1. In Sect. 2.2, we reduce the full CME model to the FR model through the rapid limit of protein synthesis. Then as the gene-state switching rates further approach infinity, the DMFD of the FR model and the LDP for its stationary distribution are given in Sects. 2.3 and 2.4, respectively. We briefly introduce the reduced CME model in Sect. 2.5, which approximates the full CME model through the rapid limit of gene-state switchings.

Fig. 1 Diagram of an example of the full CME model with three gene states

2.1 Full CME model

We assume the total number of gene states is G. A gene switches from state i to state j by the rate \(h_{i,j}\left( n\right) \), which depends on the protein copy number n. A gene in state i synthesises proteins by the rate \(k_i\), and a protein degrades by rate \(\gamma \). Without loss of generality, assume that \(k_1>k_2>\cdots >k_G\). Figure 1 is the diagram of an example of the full CME model with three gene states. The state of a single cell is characterized by the gene state and copy number of protein molecules. Define \(p_i\left( n,t\right) \) as the probability of the cell state in which the gene state is i and there are n protein molecules at the moment t. The CME is (Delbrück 1940; Grima et al. 2012; Gillespie 1976, 1977)

$$\begin{aligned}&\partial _t p_i\left( n,t\right) =k_ip_i\left( n-1,t\right) +\gamma \left( n+1\right) p_i\left( n+1,t\right) +\sum _{j\ne i}h_{j,i}\left( n\right) p_j\left( n,t\right) \nonumber \\&\quad -\,k_ip_i\left( n,t\right) -\gamma n p_i\left( n,t\right) -\sum _{j\ne i}h_{i,j}\left( n\right) p_i\left( n,t\right) . \end{aligned}$$
(1)
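As a minimal sketch (not the authors' implementation), Eq. (1) can be sampled with the standard Gillespie algorithm. The function name `gillespie_full_cme`, the callable `h(i, j, n)` and all rate values are illustrative assumptions, and gene states are indexed from 0 rather than 1.

```python
import random

def gillespie_full_cme(k, gamma, h, n0=0, i0=0, t_end=50.0, seed=0):
    """Simulate one trajectory of the full CME model (Eq. 1).

    k[i]       : synthesis rate in gene state i
    gamma      : per-molecule degradation rate
    h(i, j, n) : switching rate from gene state i to j at copy number n
    Returns the (gene state, copy number) reached at time t_end.
    """
    rng = random.Random(seed)
    G = len(k)
    t, i, n = 0.0, i0, n0
    while True:
        # Propensities: synthesis, degradation, then the switches i -> j.
        targets = [j for j in range(G) if j != i]
        rates = [k[i], gamma * n] + [h(i, j, n) for j in targets]
        total = sum(rates)
        t += rng.expovariate(total)
        if t > t_end:
            return i, n
        # Pick one reaction with probability proportional to its rate.
        u, acc, r = rng.random() * total, 0.0, 0
        for r, rate in enumerate(rates):
            acc += rate
            if u < acc:
                break
        if r == 0:
            n += 1
        elif r == 1:
            n -= 1
        else:
            i = targets[r - 2]

def h(i, j, n):
    # Hypothetical positive feedback on a chain 0 <-> 1 <-> 2:
    # moves towards state 0 (largest synthesis rate) speed up with n.
    if abs(i - j) != 1:
        return 0.0
    return 0.05 * n if j < i else 1.0

state, copies = gillespie_full_cme(k=[20.0, 5.0, 1.0], gamma=1.0, h=h)
```

Averaging many such trajectories approximates \(p_i\left( n,t\right) \); the degradation propensity vanishes at \(n=0\), so the copy number stays nonnegative.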

2.2 FR model

Let \(h_{i,j}\left( n\right) \) and \(\gamma \) be fixed, and define \(n_{\max }=\frac{k_1}{\gamma }\). Denote \(k_i=n_{\max }k_i^0\) with \(k_i^0\) being fixed, in which \(k_1^0=\gamma \). We give a heuristic derivation of the FR model which approximates the full CME model as \(n_{\max }\rightarrow \infty \). The FR model is actually a PDMP. Rigorous definition and proof of the convergence of the full CME model to PDMP as \(n_{\max }\rightarrow \infty \) are given by Theorem 3.1 in Crudu et al. (2012).

Define \(x:=n/n_{\max }\), \({\widetilde{p}}_i\left( x,t\right) :=p_i\left( n_{\max }\cdot x,t\right) \) and \({\widetilde{h}}_{i,j}\left( x\right) :=h_{i,j}\left( n_{\max }\cdot x\right) \). Then Eq. (1) becomes

$$\begin{aligned}&\partial _t {\widetilde{p}}_i\left( x,t\right) =n_{\max }k_i^0{\widetilde{p}}_i\left( x-\frac{1}{n_{\max }},t\right) +\gamma n_{\max }\left( x+\frac{1}{n_{\max }}\right) {\widetilde{p}}_i\left( x+\frac{1}{n_{\max }},t\right) \nonumber \\&\quad +\sum _{j\ne i}{\widetilde{h}}_{j,i}\left( x\right) {\widetilde{p}}_j\left( x,t\right) -n_{\max }k_i^0{\widetilde{p}}_i\left( x,t\right) -\gamma n_{\max }x{\widetilde{p}}_i\left( x,t\right) -\sum _{j\ne i}{\widetilde{h}}_{i,j}\left( x\right) {\widetilde{p}}_i\left( x,t\right) .\nonumber \\ \end{aligned}$$
(2)

Substitute

$$\begin{aligned}&{\widetilde{p}}_i\left( x-\frac{1}{n_{\max }},t\right) \approx {\widetilde{p}}_i\left( x,t\right) -\frac{1}{n_{\max }}\partial _x {\widetilde{p}}_i\left( x,t\right) ,\\&\quad \left( x+\frac{1}{n_{\max }}\right) {\widetilde{p}}_i\left( x+\frac{1}{n_{\max }},t\right) \approx x{\widetilde{p}}_i\left( x,t\right) +\frac{1}{n_{\max }}\partial _x \left[ x{\widetilde{p}}_i\left( x,t\right) \right] , \end{aligned}$$

into Eq. (2), resulting in

$$\begin{aligned} \partial _t {\widetilde{p}}_i\left( x,t\right)= & {} -\partial _x\left\{ \left[ k_i^0-\gamma x\right] {\widetilde{p}}_i\left( x,t\right) \right\} \nonumber \\&+\,\sum _{j\ne i}{\widetilde{h}}_{j,i}\left( x\right) {\widetilde{p}}_j\left( x,t\right) -\sum _{j\ne i}{\widetilde{h}}_{i,j}\left( x\right) {\widetilde{p}}_i\left( x,t\right) . \end{aligned}$$
(3)

It is exactly the Fokker–Planck equation of a PDMP, called the FR model for single-cell dynamics (Ge et al. 2015). The gene switches stochastically among different states, while in each gene state the fluctuation of the protein kinetics is eliminated by taking the limit of a large active synthesis rate of proteins, leaving the rescaled protein number x to follow deterministic dynamics.
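A path of the FR model can be sketched as follows, assuming a fixed time step: between jumps the flow \(dx/dt=k_i^0-\gamma x\) is integrated exactly, while jumps are drawn with the first-order probability \({\widetilde{h}}_{i,j}\left( x\right) dt\). The function name, the step size and the rates are hypothetical, and the jump rule is only an approximation for small `dt`.

```python
import math
import random

def simulate_fr(k0, gamma, h, x0=0.5, i0=0, t_end=20.0, dt=1e-3, seed=0):
    """Approximate one path of the FR model (a PDMP) with a fixed time step."""
    rng = random.Random(seed)
    G = len(k0)
    x, i = x0, i0
    path = [(0.0, i, x)]
    decay = math.exp(-gamma * dt)
    for step in range(1, int(round(t_end / dt)) + 1):
        # Exact flow of dx/dt = k0[i] - gamma*x over one step.
        x_star = k0[i] / gamma
        x = x_star + (x - x_star) * decay
        # First-order jump probabilities h(i, j, x)*dt (assumes they sum to < 1).
        u, acc = rng.random(), 0.0
        for j in range(G):
            if j == i:
                continue
            acc += h(i, j, x) * dt
            if u < acc:
                i = j
                break
        path.append((step * dt, i, x))
    return path

def h(i, j, x):
    # Hypothetical rates on a chain 0 <-> 1 <-> 2 with positive feedback.
    if abs(i - j) != 1:
        return 0.0
    return 5.0 * x if j < i else 2.0

k0, gamma = [1.0, 0.4, 0.1], 1.0
path = simulate_fr(k0, gamma, h)
```

Because each deterministic flow relaxes towards \(x_i^*=k_i^0/\gamma \), a path started inside \(\left[ x_G^*,x_1^*\right] \) never leaves that interval.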

2.3 DMFD of the FR model

Define the negative transition rate matrix as

$$\begin{aligned} {\widetilde{H}}\left( x\right) :=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} {\widetilde{h}}_{1,\cdot }\left( x\right) &{} -{\widetilde{h}}_{2,1}\left( x\right) &{} \cdots &{} -{\widetilde{h}}_{G,1}\left( x\right) \\ -{\widetilde{h}}_{1,2}\left( x\right) &{} {\widetilde{h}}_{2,\cdot }\left( x\right) &{} \cdots &{} -{\widetilde{h}}_{G,2}\left( x\right) \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ -{\widetilde{h}}_{1,G}\left( x\right) &{} -{\widetilde{h}}_{2,G}\left( x\right) &{} \cdots &{} {\widetilde{h}}_{G,\cdot }\left( x\right) \\ \end{array} \right) , \end{aligned}$$

where \({\widetilde{h}}_{i,\cdot }\left( x\right) :=\sum _{j\ne i}{\widetilde{h}}_{i,j}\left( x\right) \). In the FR model of Eq. (3), let \(k_i^0\) and \(\gamma \) be fixed, and \({\widetilde{H}}\left( x\right) ={\mathcal {H}}\widehat{{\widetilde{H}}}\left( x\right) \) (\({\widetilde{h}}_{i,j}\left( x\right) ={\mathcal {H}}\widehat{{\widetilde{h}}}_{i,j}\left( x\right) \)) with \(\widehat{{\widetilde{H}}}\left( x\right) \) being fixed and \({\mathcal {H}}\rightarrow \infty \). If \(\widehat{{\widetilde{H}}}\left( x\right) \) is irreducible, the Markov chain with the transition rate matrix \(-{\mathcal {H}}\widehat{{\widetilde{H}}}\left( x\right) \) has the unique stationary probability \({\widetilde{\zeta }}_i\left( x\right) \) for gene state i. We give a heuristic derivation of the DMFD of the FR model. The rigorous counterpart is the averaging principle of PDMP given by Theorem 2.2 in Faggionato et al. (2010).

As \({\mathcal {H}}\rightarrow \infty \), the characteristic time-scale of the gene switching becomes much faster than the dynamics of the protein abundance x in the FR model, and \({\widetilde{p}}_i\left( x,t\right) \propto {\widetilde{\zeta }}_i\left( x\right) \) approximately. Thus, we define

$$\begin{aligned} g\left( x\right) :=\sum _{i=1}^G{\widetilde{\zeta }}_i\left( x\right) k_i^0. \end{aligned}$$
(4)

Let \({\widetilde{p}}\left( x,t\right) :=\sum _{i=1}^G{\widetilde{p}}_i\left( x,t\right) \). Then summing Eq. (3) over i, and substituting Eq. (4), we have

$$\begin{aligned} \partial _t {\widetilde{p}}\left( x,t\right) =-\partial _x\left\{ \left[ g\left( x\right) -\gamma x\right] {\widetilde{p}}\left( x,t\right) \right\} . \end{aligned}$$
(5)

Equation (5) is exactly the Liouville equation, describing the evolution of probability along the DMFD

$$\begin{aligned} \frac{dx}{dt}=g\left( x\right) -\gamma x. \end{aligned}$$
(6)
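The DMFD in Eq. (6) can be explored numerically once \({\widetilde{\zeta }}_i\left( x\right) \) is known. The sketch below assumes a linear chain of gene states (only neighbouring states exchange), for which the stationary probabilities follow from detailed balance; this topology, the function names and the rates are illustrative assumptions, and states are indexed from 0.

```python
def zeta_chain(x, down, up):
    """Stationary probabilities of a chain 0 <-> 1 <-> ... <-> G-1.

    down[i](x) : rate from state i to state i+1
    up[i](x)   : rate from state i+1 to state i (must be positive)
    """
    w = [1.0]
    for d, u in zip(down, up):
        # Detailed balance: zeta[i+1] * up[i] = zeta[i] * down[i].
        w.append(w[-1] * d(x) / u(x))
    s = sum(w)
    return [v / s for v in w]

def g(x, k0, down, up):
    """Mean rescaled synthesis rate of Eq. (4)."""
    return sum(z * k for z, k in zip(zeta_chain(x, down, up), k0))

def dmfd_path(x0, k0, gamma, down, up, t_end=30.0, dt=1e-3):
    """Forward-Euler integration of dx/dt = g(x) - gamma*x (Eq. 6)."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * (g(x, k0, down, up) - gamma * x)
    return x

# Hypothetical rates: switching towards high-synthesis states grows with x.
k0, gamma = [1.0, 0.4, 0.1], 1.0
down = [lambda x: 1.0, lambda x: 1.0]
up = [lambda x: 8.0 * x * x + 0.1, lambda x: 8.0 * x * x + 0.1]

x_final = dmfd_path(0.1, k0, gamma, down, up)
residual = g(x_final, k0, down, up) - gamma * x_final
```

A vanishing `residual` indicates that the trajectory has settled on a fixed point of Eq. (6); scanning the sign of \(g\left( x\right) -\gamma x\) on a grid would locate all fixed points.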

2.4 LDP for the stationary distribution of the FR model

Except for the particular cases in Propositions 3.4 and 3.5 of Faggionato et al. (2009), obtaining the exact stationary solution of the FR model with more than two gene states is generally difficult. We alternatively give a heuristic derivation of the LDP rate function for the stationary distribution of the FR model.

Assume Eq. (3) has the unique stationary distribution \({\widetilde{p}}_i^{ss}\left( x\right) \), which is absolutely continuous w.r.t. the Lebesgue measure. Then

$$\begin{aligned} 0=-\partial _x\left\{ \left[ k_i^0-\gamma x\right] {\widetilde{p}}_i^{ss}\left( x\right) \right\} +\sum _{j\ne i}{\mathcal {H}}\widehat{{\widetilde{h}}}_{j,i}\left( x\right) {\widetilde{p}}_j^{ss}\left( x\right) -\sum _{j\ne i}{\mathcal {H}}\widehat{{\widetilde{h}}}_{i,j}\left( x\right) {\widetilde{p}}_i^{ss}\left( x\right) .\nonumber \\ \end{aligned}$$
(7)

The pathwise LDP has been proved for many mesoscopic models approaching the macroscopic ones (including diffusion processes) (Freidlin and Wentzell 2014; Feng and Kurtz 2015; Olivieri and Vares 2005; Touchette 2009). For PDMP, the LDP of the path density is proved by Theorem 2.3 in Faggionato et al. (2010). Then by the same classic techniques used in Freidlin–Wentzell theory (Freidlin and Wentzell 2014), one can prove that the limit \(\lim _{{\mathcal {H}}\rightarrow \infty }-\frac{1}{{\mathcal {H}}}\log \sum _i{\widetilde{p}}_i^{ss}\left( x\right) \) exists. We further assume that the limit \(\lim _{{\mathcal {H}}\rightarrow \infty }-\frac{1}{{\mathcal {H}}}\log {\widetilde{p}}_i^{ss}\left( x\right) \) exists for each i and is independent of i, which is denoted by \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \).

In other words, the stationary distribution of the FR model satisfies \({\widetilde{p}}_i^{ss}\left( x\right) =C_i\left( x,{\mathcal {H}}\right) \exp \left( -{\mathcal {H}}\widehat{{\widetilde{\varPhi }}}\left( x\right) \right) \) scaled by the gene switching intensity \({\mathcal {H}}\), where \(C_i\left( x,{\mathcal {H}}\right) \) is the prefactor satisfying \(\lim _{{\mathcal {H}}\rightarrow \infty }-\frac{1}{{\mathcal {H}}}\log C_i\left( x,{\mathcal {H}}\right) =0\).

We have assumed that \(\widehat{{\widetilde{\varPhi }}}_i\left( x\right) =\lim _{{\mathcal {H}}\rightarrow \infty }-\frac{1}{{\mathcal {H}}}\log {\widetilde{p}}_i^{ss}\left( x\right) \) is independent of i. This is based on the following heuristic argument. Substitute \({\widetilde{p}}_i^{ss}\left( x\right) =C_i\left( x,{\mathcal {H}}\right) \exp \left( -{\mathcal {H}}\widehat{{\widetilde{\varPhi }}}_i\left( x\right) \right) \) into Eq. (7), and divide both sides by \(\exp \left( -{\mathcal {H}}\widehat{{\widetilde{\varPhi }}}_i\left( x\right) \right) \).

$$\begin{aligned} 0= & {} -\partial _x\left\{ \left[ k_i^0-\gamma x\right] C_i\left( x,{\mathcal {H}}\right) \right\} +{\mathcal {H}}\left[ k_i^0-\gamma x\right] C_i\left( x,{\mathcal {H}}\right) \partial _x\widehat{{\widetilde{\varPhi }}}_i\left( x\right) \nonumber \\&+\,\sum _{j\ne i}{\mathcal {H}}\widehat{{\widetilde{h}}}_{j,i}\left( x\right) C_j\left( x,{\mathcal {H}}\right) \exp \left[ -{\mathcal {H}}\left( \widehat{{\widetilde{\varPhi }}}_j\left( x\right) -\widehat{{\widetilde{\varPhi }}}_i\left( x\right) \right) \right] \nonumber \\&-\,\sum _{j\ne i}{\mathcal {H}}\widehat{{\widetilde{h}}}_{i,j}\left( x\right) C_i\left( x,{\mathcal {H}}\right) . \end{aligned}$$
(8)

The third term on the right hand side of Eq. (8) will be exponentially large unless its exponential parts vanish. Thus,

$$\begin{aligned} \widehat{{\widetilde{\varPhi }}}_1\left( x\right) =\widehat{{\widetilde{\varPhi }}}_2\left( x\right) =\widehat{{\widetilde{\varPhi }}}_3\left( x\right) =\cdots =\widehat{{\widetilde{\varPhi }}}_G\left( x\right) :=\widehat{{\widetilde{\varPhi }}}\left( x\right) . \end{aligned}$$

Define \(C\left( x,{\mathcal {H}}\right) :=\left[ C_1\left( x,{\mathcal {H}}\right) ,C_2\left( x,{\mathcal {H}}\right) ,\cdots ,C_G\left( x,{\mathcal {H}}\right) \right] ^T\). Assume that

$$\begin{aligned}&\lim _{{\mathcal {H}}\rightarrow +\infty }C(x,{\mathcal {H}})/\Vert C(x,{\mathcal {H}})\Vert _2={\widetilde{C}}(x),\\&\lim _{{\mathcal {H}}\rightarrow \infty }\frac{1}{{\mathcal {H}}}\partial _x \left[ \log C_i(x,{\mathcal {H}})\right] =0. \end{aligned}$$

Then divide both sides of Eq. (8) by \({\mathcal {H}}\Vert C(x,{\mathcal {H}})\Vert _2\), neglect the first term in Eq. (8) as \({\mathcal {H}}\rightarrow \infty \), and substitute \(x_i^*:=k_i^0/\gamma \), we have

$$\begin{aligned} 0=\gamma \left( x_i^*-x\right) {\widetilde{C}}_i\left( x\right) \partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) +\sum _{j\ne i}\widehat{{\widetilde{h}}}_{j,i}\left( x\right) {\widetilde{C}}_j\left( x\right) -\sum _{j\ne i}\widehat{{\widetilde{h}}}_{i,j}\left( x\right) {\widetilde{C}}_i\left( x\right) . \end{aligned}$$
(9)

Define

$$\begin{aligned} A\left( x\right) :=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} \gamma \left( x_1^*-x\right) &{} 0 &{} \cdots &{} 0 \\ 0 &{} \gamma \left( x_2^*-x\right) &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} 0 &{} \gamma \left( x_G^*-x\right) \\ \end{array} \right) , \end{aligned}$$
(10)

then Eq. (9) can be rewritten in the matrix form:

$$\begin{aligned} \left[ \widehat{{\widetilde{H}}}\left( x\right) -\partial _x \widehat{{\widetilde{\varPhi }}}\left( x\right) A\left( x\right) \right] {\widetilde{C}}\left( x\right) =0. \end{aligned}$$
(11)

\(\exists {\widetilde{C}}\left( x\right) \ne 0\) satisfying Eq. (11) iff

$$\begin{aligned} \det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\partial _x \widehat{{\widetilde{\varPhi }}}\left( x\right) A\left( x\right) \right] =0. \end{aligned}$$
(12)
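For \(G=2\), Eq. (12) can be expanded by hand. Writing \(a_i:=\gamma \left( x_i^*-x\right) \), the determinant equals \(\lambda \left[ \lambda a_1a_2-\left( \widehat{{\widetilde{h}}}_{1,2}a_2+\widehat{{\widetilde{h}}}_{2,1}a_1\right) \right] \), so the roots are \(0\) and \(\widehat{{\widetilde{h}}}_{1,2}/a_1+\widehat{{\widetilde{h}}}_{2,1}/a_2\). The following sketch checks this numerically and also checks that the nonzero root and the drift \(g\left( x\right) -\gamma x\) never share a sign; the rates and function names are hypothetical.

```python
def two_state_root(x, k0, gamma, h12, h21):
    """Nonzero root of det[H(x) - lam*A(x)] = 0 for G = 2 (direct expansion)."""
    a1 = k0[0] - gamma * x      # gamma*(x_1^* - x), positive inside (x_2^*, x_1^*)
    a2 = k0[1] - gamma * x      # gamma*(x_2^* - x), negative inside (x_2^*, x_1^*)
    return h12(x) / a1 + h21(x) / a2

def two_state_det(lam, x, k0, gamma, h12, h21):
    """Determinant of H(x) - lam*A(x) with H = [[h12, -h21], [-h12, h21]]."""
    a1 = k0[0] - gamma * x
    a2 = k0[1] - gamma * x
    return (h12(x) - lam * a1) * (h21(x) - lam * a2) - h12(x) * h21(x)

def drift(x, k0, gamma, h12, h21):
    """g(x) - gamma*x with the two-state stationary probabilities."""
    z1 = h21(x) / (h12(x) + h21(x))
    return z1 * k0[0] + (1.0 - z1) * k0[1] - gamma * x

# Hypothetical rates: switching towards the high-synthesis state 1 grows with x.
k0, gamma = [1.0, 0.2], 1.0
h12 = lambda x: 1.0
h21 = lambda x: 0.2 + 6.0 * x * x

checks = []
for x in [0.25, 0.4, 0.6, 0.8, 0.95]:
    lam = two_state_root(x, k0, gamma, h12, h21)
    checks.append((x, lam,
                   two_state_det(lam, x, k0, gamma, h12, h21),
                   drift(x, k0, gamma, h12, h21)))
```

The sign relation follows from \(\widehat{{\widetilde{h}}}_{1,2}a_2+\widehat{{\widetilde{h}}}_{2,1}a_1=\left( \widehat{{\widetilde{h}}}_{1,2}+\widehat{{\widetilde{h}}}_{2,1}\right) \left[ g\left( x\right) -\gamma x\right] \) together with \(a_1a_2<0\) on \(\left( x_2^*,x_1^*\right) \).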

What we have used is the WKB method. The connections between LDP and less rigorous WKB methods for Markov chains have been established for a long time (Dykman et al. 1994; Hanggi et al. 1984; Knessl et al. 1985; Vellela and Qian 2008). Similar results for PDMP are recently proved in Bressloff and Faugeras (2017). Such a LDP rate function also provides the transition-rate formula between different attractors of DMFD, which takes the general Arrhenius/Kramers form (Keener and Newby 2011; Ge et al. 2015). In literature, people sometimes call the LDP rate function \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) the nonequilibrium landscape function (NLF) (Feng et al. 2011; Ge et al. 2015), which is an analog of the equilibrium landscape function established in the field of protein folding (Frauenfelder et al. 1991; Onuchic et al. 1997).

2.5 Reduced CME model

Let \(k_i\) and \(\gamma \) be fixed, and \(H\left( n\right) ={\mathcal {H}}{\widehat{H}}\left( n\right) \) with \({\widehat{H}}\left( n\right) \) being fixed and \({\mathcal {H}}\rightarrow \infty \). Resembling Ge et al. (2015), we give a heuristic derivation of the reduced CME model which averages the full CME model over the stationary distribution of the gene state as \({\mathcal {H}}\rightarrow \infty \). The rigorous averaging principle of the full CME model is given by Theorem 5.1 in Kang and Kurtz (2013).

Define \(p\left( n,t\right) :=\sum _{i=1}^Gp_i\left( n,t\right) \). Let \(\zeta _i\left( n\right) \) be the stationary probability for gene state i of the Markov chain with transition rate matrix \(-{\mathcal {H}}{\widehat{H}}\left( n\right) \). As \({\mathcal {H}}\rightarrow \infty \), \(p_i\left( n,t\right) \propto \zeta _i\left( n\right) \) approximately. Define \({\bar{k}}\left( n\right) :=\sum _{i=1}^G\zeta _i\left( n\right) k_i\). Sum Eq. (1) over i.

$$\begin{aligned} \partial _t p\left( n,t\right)= & {} {\bar{k}}\left( n-1\right) p\left( n-1,t\right) +\gamma \left( n+1\right) p\left( n+1,t\right) \nonumber \\&-\,{\bar{k}}\left( n\right) p\left( n,t\right) -\gamma np\left( n,t\right) . \end{aligned}$$
(13)

Equation (13) is called the reduced CME model.

Note that the reduced CME is actually a full CME with only one gene state. Thus, the FR approximation of it as in Sect. 2.2 directly leads to the DMFD. Use the same rate scales as in Sect. 2.2, and define \(\widetilde{{\bar{k}}}\left( x\right) :={\bar{k}}\left( n\right) \). By Eq. (4), \(\widetilde{{\bar{k}}}\left( x\right) =n_{\max }g\left( x\right) \). Then Eq. (13) becomes

$$\begin{aligned} \partial _t {\widetilde{p}}\left( x,t\right)= & {} n_{\max }g\left( x-\frac{1}{n_{\max }}\right) {\widetilde{p}}\left( x-\frac{1}{n_{\max }},t\right) \nonumber \\&+\,\gamma n_{\max }\left( x+\frac{1}{n_{\max }}\right) {\widetilde{p}}\left( x+\frac{1}{n_{\max }},t\right) \nonumber \\&-\,n_{\max }g\left( x\right) {\widetilde{p}}\left( x,t\right) -\gamma n_{\max } x{\widetilde{p}}\left( x,t\right) . \end{aligned}$$
(14)

Substitute

$$\begin{aligned}&g\left( x-\frac{1}{n_{\max }}\right) {\widetilde{p}}\left( x-\frac{1}{n_{\max }},t\right) \approx g\left( x\right) {\widetilde{p}}\left( x,t\right) -\frac{1}{n_{\max }}\partial _x\left[ g\left( x\right) {\widetilde{p}}\left( x,t\right) \right] ,\\&\left( x+\frac{1}{n_{\max }}\right) {\widetilde{p}}\left( x+\frac{1}{n_{\max }},t\right) \approx x{\widetilde{p}}\left( x,t\right) +\frac{1}{n_{\max }}\partial _x\left[ x{\widetilde{p}}\left( x,t\right) \right] , \end{aligned}$$

into Eq. (14), one arrives at

$$\begin{aligned}&\partial _t {\widetilde{p}}\left( x,t\right) =-\partial _x\left\{ \left[ g\left( x\right) -\gamma x\right] {\widetilde{p}}\left( x,t\right) \right\} , \end{aligned}$$

which is exactly Eq. (5). Therefore, the FR model and the reduced CME model share the same DMFD in Eq. (6).

Define \(p^{ss}\left( n\right) \) as the stationary distribution of \(p\left( n,t\right) \). By Eq. (13), for \(n\ge 1\),

$$\begin{aligned} 0={\bar{k}}\left( n-1\right) p^{ss}\left( n-1\right) +\gamma \left( n+1\right) p^{ss}\left( n+1\right) -{\bar{k}}\left( n\right) p^{ss}\left( n\right) -\gamma np^{ss}\left( n\right) . \end{aligned}$$

\(0=\gamma p^{ss}\left( 1\right) -{\bar{k}}\left( 0\right) p^{ss}\left( 0\right) \) for \(n=0\). Therefore,

$$\begin{aligned}&p^{ss}\left( xn_{\max }\right) =p^{ss}\left( 0\right) \prod _{i=0}^{xn_{\max }-1}\frac{{\bar{k}}\left( i\right) }{\gamma \left( i+1\right) }=p^{ss}\left( 0\right) \exp \left[ \sum _{i=0}^{xn_{\max }-1}\log \left( \frac{{\bar{k}}\left( i\right) }{\gamma \left( i+1\right) }\right) \right] \nonumber \\&\quad =p^{ss}\left( 0\right) \exp \left[ n_{\max }\sum _{i=0}^{xn_{\max }-1}\frac{1}{n_{\max }}\log \left( \frac{{\bar{k}}\left( i\right) }{\gamma \left( i+1\right) }\right) \right] \nonumber \\&\quad =p^{ss}\left( 0\right) \exp \left[ n_{\max }\sum _{i=0}^{xn_{\max }-1}\frac{1}{n_{\max }}\log \left( \frac{g\left( \frac{i}{n_{\max }}\right) }{\gamma \frac{i+1}{n_{\max }}}\right) \right] \nonumber \\&\quad \approx p^{ss}\left( 0\right) \exp \left[ n_{\max }\int _0^x\log \left( \frac{g\left( y\right) }{\gamma y}\right) dy\right] . \end{aligned}$$
(15)

Define \(\widehat{{\widetilde{\varPhi }}}^R\left( x\right) :=\lim _{n_{\max }\rightarrow \infty }-\frac{1}{n_{\max }}\log \left[ p^{ss}\left( xn_{\max }\right) \right] =-\int _0^x\log \left( \frac{g\left( y\right) }{\gamma y}\right) dy\). Then

$$\begin{aligned} \partial _x \widehat{{\widetilde{\varPhi }}}^R\left( x\right) =-\log \left( \frac{g\left( x\right) }{\gamma x}\right) . \end{aligned}$$
(16)
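Equations (15) and (16) can be checked numerically. The sketch below evaluates \(\log p^{ss}\left( n\right) \) from the product formula on a truncated range and compares its finite-difference slope with \(-\log \left[ g\left( x\right) /\left( \gamma x\right) \right] \). The function \(g\), the truncation `n_top` and the value of `n_max` are illustrative assumptions.

```python
import math

def reduced_cme_log_pss(g, gamma, n_max, n_top):
    """log p^ss(n) for n = 0..n_top from the product formula, normalised."""
    logp = [0.0]
    for n in range(n_top):
        kbar = n_max * g(n / n_max)          # bar{k}(n) = n_max * g(n / n_max)
        logp.append(logp[-1] + math.log(kbar / (gamma * (n + 1))))
    # Normalise with a log-sum-exp to avoid overflow for large n_max.
    m = max(logp)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logp))
    return [v - log_norm for v in logp]

gamma, n_max = 1.0, 200
g = lambda x: 0.1 + 0.9 * x * x / (0.25 + x * x)   # hypothetical g with values in [0.1, 1)
logp = reduced_cme_log_pss(g, gamma, n_max, n_top=2 * n_max)

n = n_max // 2                                      # corresponds to x = 0.5
# Finite difference of -(1/n_max) * log p^ss over a step dx = 1/n_max.
slope = -(logp[n + 1] - logp[n])
target = -math.log(g(n / n_max) / (gamma * n / n_max))
```

The two numbers differ by \(\log \left[ \left( n+1\right) /n\right] \), which vanishes as \(n_{\max }\rightarrow \infty \) at fixed \(x>0\).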

By Eqs. (6) and (16), \(\widehat{{\widetilde{\varPhi }}}^R\left( x\right) \) has the Lyapunov property, since \(\log \left( a/b\right) \left( a-b\right) \ge 0\) for all \(a,b>0\):

$$\begin{aligned} \frac{d\widehat{{\widetilde{\varPhi }}}^R\left( x\right) }{dt}=\partial _x \widehat{{\widetilde{\varPhi }}}^R\left( x\right) \frac{dx}{dt}=-\log \left( \frac{g\left( x\right) }{\gamma x}\right) \left[ g\left( x\right) -\gamma x\right] \le 0. \end{aligned}$$

3 Rigorous analysis of the rate function and the prefactor of the FR model

Based on the heuristic arguments in the previous section, we propose several basic assumptions for the FR models:

Assumption 1

The stationary distribution \({\widetilde{p}}_i^{ss}\left( x\right) \) of the FR model can be expressed as

$$\begin{aligned} {\widetilde{p}}_i^{ss}\left( x\right) =C_i\left( x,{\mathcal {H}}\right) \exp \left( -{\mathcal {H}}\widehat{{\widetilde{\varPhi }}}\left( x\right) \right) , \end{aligned}$$

in which \(\lim _{{\mathcal {H}}\rightarrow \infty }-\frac{1}{{\mathcal {H}}}\log C_i\left( x,{\mathcal {H}}\right) =0\) and \(\lim _{{\mathcal {H}}\rightarrow \infty }\frac{1}{{\mathcal {H}}}\partial _x \left[ \log C_i(x,{\mathcal {H}})\right] =0\).

Furthermore, \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) is continuously-differentiable, and \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \) satisfies Eq. (12).

Assumption 2

\(C(x,{\mathcal {H}})\) in Assumption 1 satisfies \(\lim _{{\mathcal {H}}\rightarrow +\infty }C(x,{\mathcal {H}})/\Vert C(x,{\mathcal {H}})\Vert _2={\widetilde{C}}(x)\), and \({\widetilde{C}}(x)\) satisfies Eq. (11).

Assumption 3

The negative transition rate matrices \(\widehat{{\widetilde{H}}}\left( x\right) \) are irreducible and continuous in \(\left( x_G^*,x_1^*\right) \).

The generalised eigenvalue (GE) proposed in previous works (Chu 1987; Ikramov 1993) and the dominant generalised eigenvalue (DGE) proposed in this work are the two most important concepts in this section. The classical eigenvalue problem of a matrix Z is to solve the equation \(\det \left( \lambda I-Z\right) =0\). Replacing the identity matrix I by another matrix D, roots of \(\det \left( \lambda D-Z\right) =0\) are called GEs of the matrix Z on D, which form the generalised spectrum. Assume Z is a Z-matrix, i.e. all the off-diagonal elements are nonpositive; then we can define the classical dominant eigenvalue problem for Z. For \(\lambda \) large enough, \(\lambda I-Z\) is a nonnegative matrix with a real positive dominant eigenvalue \(\lambda _0\) due to the Perron–Frobenius theorem. Then \(\lambda -\lambda _0\) is a real eigenvalue with the smallest real part among all eigenvalues of Z, which is called the classical dominant eigenvalue of Z and denoted by \(r\left( Z\right) \). Obviously, such a definition does not depend on the choice of \(\lambda \). Note that \(Z-r\left( Z\right) I\) is still a Z-matrix, and the eigenvalue of \(Z-r\left( Z\right) I\) with the smallest real part is exactly 0. Thus, \(r\left( Z-r\left( Z\right) I\right) =0\). Replacing I by another diagonal matrix D, any solution of \(r\left( Z-\lambda D\right) =0\) (see Definition A1) is called a DGE of Z on D.

By Assumption 1 and Definition A1, \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \) is a GE of \(\widehat{{\widetilde{H}}}\left( x\right) \) on \(A\left( x\right) \). By Assumption 3, Corollary A1 and Theorem A1, \(\widehat{{\widetilde{H}}}\left( x\right) \) has at most two DGEs on \(A\left( x\right) \) for \(x\in \left( x_G^*,x_1^*\right) \). One is always 0. The other, if it exists, is defined as the nontrivial DGE \(\mu \left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \). Otherwise, define \(\mu \left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) :=0\) (see Definition A4). For simplicity, define \(\mu \left( x\right) :=\mu \left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \) for \(x\in \left( x_G^*,x_1^*\right) \).
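The definitions of \(r\left( Z\right) \) and of the nontrivial DGE translate into a simple numerical procedure, sketched here under stated assumptions and not to be read as the algorithm proposed later in this section. `dominant_eigenvalue` applies power iteration to the nonnegative matrix \(sI-Z\). `nontrivial_dge` uses the fact that \(\lambda \mapsto r\left( Z-\lambda D\right) \) is concave and vanishes at \(\lambda =0\) to bracket and bisect the other root. Function names, tolerances and the test matrix are hypothetical.

```python
def dominant_eigenvalue(Z, iters=1000):
    """r(Z) for an irreducible Z-matrix: the real eigenvalue of smallest real part."""
    G = len(Z)
    s = max(Z[i][i] for i in range(G)) + 1.0   # s*I - Z is nonnegative, positive diagonal
    B = [[(s if i == j else 0.0) - Z[i][j] for j in range(G)] for i in range(G)]
    v = [1.0 / G] * G
    lam0 = 0.0
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(G)) for i in range(G)]
        lam0 = sum(w)                          # Perron-root estimate, since sum(v) == 1
        v = [c / lam0 for c in w]
    return s - lam0

def nontrivial_dge(H, a, eps=1e-4, tol=1e-10):
    """Nonzero root of lam -> r(H - lam*diag(a)); returns 0.0 if none is detected."""
    G = len(H)

    def f(lam):
        Z = [[H[i][j] - (lam * a[i] if i == j else 0.0) for j in range(G)]
             for i in range(G)]
        return dominant_eigenvalue(Z)

    # f is concave with f(0) = 0, so f > 0 strictly between 0 and the other root.
    if f(eps) > 0:
        sign = 1.0
    elif f(-eps) > 0:
        sign = -1.0
    else:
        return 0.0
    lo, hi = eps, 2.0 * eps
    while f(sign * hi) > 0:                    # expand until the root is bracketed
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol * max(1.0, hi):        # bisection
        mid = 0.5 * (lo + hi)
        if f(sign * mid) > 0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)

# Two-state check with hypothetical values: h12 = 1, h21 = 2, a = (0.5, -0.3).
H = [[1.0, -2.0], [-1.0, 2.0]]
a = [0.5, -0.3]
mu = nontrivial_dge(H, a)
```

For this two-state matrix the result can be compared with the hand-computed root \(1/0.5+2/\left( -0.3\right) \) of the \(2\times 2\) determinant. The bracket expansion terminates because the diagonal of \(A\left( x\right) \) has entries of both signs on \(\left( x_G^*,x_1^*\right) \).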

3.1 The relation between \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \) and \(\mu \left( x\right) \)

First, we justify the continuity of \(\mu (x)\), especially at those \(x_i^*\)’s with degenerate \(A\left( x\right) \). For \(x\ne x_i^*\), \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =0\) is a polynomial equation in \(\lambda \) of order G and has G roots (GEs). However, \(\det \left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -\lambda A\left( x_i^*\right) \right] =0\) is of order \(G-1\) and has only \(G-1\) roots (GEs). The following theorem shows that one of the G GEs for \(x\ne x_i^*\), denoted by \(\lambda _{\infty }(x)\), diverges at \(x_i^*\), and the other \(G-1\) GEs for \(x\ne x_i^*\) tend to the \(G-1\) GEs at \(x_i^*\). Furthermore, for \(x\ne x_i^*\), the nontrivial DGE is not \(\lambda _{\infty }(x)\) near \(x_i^*\), and converges at \(x_i^*\) to the nontrivial DGE at \(x_i^*\).

Theorem 1

Under Assumption 3, we have for \(i\in [2,G-1]\)

  1.

    There exists a single root \(\lambda _{\infty }\left( x\right) \) of \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =0\) near \(x_i^*\) such that \(\widehat{{\widetilde{h}}}_{i,\cdot }\left( x_i^*\right) =\lim _{x\rightarrow x_i^*}\gamma \left( x_i^*-x\right) \lambda _{\infty }\left( x\right) \).

  2.

    \(\forall \lambda \),

    $$\begin{aligned} \lim _{x\rightarrow x_i^*}p\left( x\right) \left( \lambda \right) =\det \left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -\lambda A\left( x_i^*\right) \right] \Bigg /\left[ \widehat{{\widetilde{h}}}_{i,\cdot }\left( x_i^*\right) \prod _{i'\ne i}\gamma \left( x_i^*-x_{i'}^*\right) \right] , \end{aligned}$$

    where for \(x\in \bigcup _{i'=1}^{G-1}\left( x_{i'+1}^*,x_{i'}^*\right) \),

    $$\begin{aligned} p\left( x\right) \left( \lambda \right) :=\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] \Bigg /\left\{ \left[ \lambda -\lambda _{\infty }\left( x\right) \right] \prod _{i'=1}^G\gamma \left( x-x_{i'}^*\right) \right\} . \end{aligned}$$
    (17)
  3.

    \(\lim _{x\rightarrow x_i^*}\mu \left( x\right) =\mu \left( x_i^*\right) \).

Proof

For \(x\in \bigcup _{i'=1}^{G-1}\left( x_{i'+1}^*,x_{i'}^*\right) \), \(A\left( x\right) \) is nonsingular, thereby

$$\begin{aligned} \det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =\det \left[ \widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) -\lambda I\right] \det \left[ A\left( x\right) \right] , \end{aligned}$$

and the GEs of \(\widehat{{\widetilde{H}}}\left( x\right) \) on \(A\left( x\right) \) are the eigenvalues of \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \). \(\forall \lambda \),

$$\begin{aligned}&\lim _{x\rightarrow x_i^*}\det \left[ \gamma \left( x_i^*-x\right) \widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) -\lambda I\right] =\left( -1\right) ^G\lambda ^{G-1}\left[ \lambda -\widehat{{\widetilde{h}}}_{i,\cdot }\left( x_i^*\right) \right] , \end{aligned}$$

so as \(x\rightarrow x_i^*\), \(G-1\) eigenvalues (counting multiplicity) of \(\gamma \left( x_i^*-x\right) \widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) tend to 0, and the remaining one tends to \(\widehat{{\widetilde{h}}}_{i,\cdot }\left( x_i^*\right) \). The eigenvalues of \(\gamma \left( x_i^*-x\right) \widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) are just those of \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) multiplied by \(\gamma \left( x_i^*-x\right) \), which proves statement 1.

Multiplying both sides of Eq. (17) by \(\left[ \lambda -\lambda _{\infty }\left( x\right) \right] \prod _{i'=1}^G\gamma \left( x-x_{i'}^*\right) \) gives

$$\begin{aligned}&\left[ \prod _{i'\ne i}\gamma \left( x-x_{i'}^*\right) \right] \gamma \left( x-x_i^*\right) \left[ \lambda -\lambda _\infty \left( x\right) \right] p\left( x\right) \left( \lambda \right) =\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] . \end{aligned}$$

Because \(\lim _{x\rightarrow x_i^*}\gamma \left( x_i^*-x\right) \lambda _{\infty }\left( x\right) =\widehat{{\widetilde{h}}}_{i,\cdot }\left( x_i^*\right) \), for any fixed \(\lambda \),

$$\begin{aligned}&\widehat{{\widetilde{h}}}_{i,\cdot }\left( x_i^*\right) \left[ \prod _{i'\ne i}\gamma \left( x_i^*-x_{i'}^*\right) \right] \lim _{x\rightarrow x_i^*}p\left( x\right) \left( \lambda \right) \\&\quad =\lim _{x\rightarrow x_i^*}\left[ \prod _{i'\ne i}\gamma \left( x-x_{i'}^*\right) \right] \gamma \left( x-x_i^*\right) \left[ \lambda -\lambda _\infty \left( x\right) \right] p\left( x\right) \left( \lambda \right) \\&\quad =\lim _{x\rightarrow x_i^*}\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =\det \left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -\lambda A\left( x_i^*\right) \right] . \end{aligned}$$

This is the statement 2.

Because \(\lim _{x\rightarrow x_i^*}\gamma \left( x_i^*-x\right) \lambda _{\infty }\left( x\right) =\widehat{{\widetilde{h}}}_{i,\cdot }\left( x_i^*\right) >0\), we have

$$\begin{aligned} \lim _{x\rightarrow x_i^*-}\lambda _{\infty }\left( x\right) =+\infty ,\quad \lim _{x\rightarrow x_i^*+}\lambda _{\infty }\left( x\right) =-\infty . \end{aligned}$$

Denote the number of positive and negative diagonal elements of \(A\left( x\right) \) by \(m\left[ A\left( x\right) \right] \) and \(n\left[ A\left( x\right) \right] \), respectively. Because \(m[A(x_i^*)],n[A(x_i^*)]>0\), some diagonal element of \(\widehat{{\widetilde{H}}}\left( x\right) -\lambda _{\infty }\left( x\right) A\left( x\right) \) tends to \(-\infty \) as \(x\rightarrow x_i^*\). By the Gershgorin circle theorem (Gershgorin 1931), \(\lim _{x\rightarrow x_i^*}r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda _{\infty }\left( x\right) A\left( x\right) \right] =-\infty \). Since \(r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\mu \left( x\right) A\left( x\right) \right] =0\) by the definition of a DGE, \(\mu \left( x\right) \ne \lambda _{\infty }\left( x\right) \) for x sufficiently close to \(x_i^*\). By statement 2, the roots of \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =0\) except \(\lambda _\infty \left( x\right) \) tend to the \(G-1\) roots of \(\det \left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -\lambda A\left( x_i^*\right) \right] =0\) as \(x\rightarrow x_i^*\), and are therefore continuous at \(x_i^*\) and bounded near \(x_i^*\); thus, \(-\infty<\varliminf _{x\rightarrow x_i^*}\mu \left( x\right) \le \varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) <+\infty \). Let \(\{y_j\}\) be a sequence with \(\lim _{j\rightarrow +\infty }y_j=x_i^*\) such that \(\lim _{j\rightarrow +\infty }\mu \left( y_j\right) =\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) \). Moreover, \(\widehat{{\widetilde{h}}}_{i,\cdot }\left( y_j\right) >0\) because \(\widehat{{\widetilde{H}}}(x)\) is irreducible by Assumption 3. Then

$$\begin{aligned} r\left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) A\left( x_i^*\right) \right] =\lim _{j\rightarrow +\infty }r\left[ \widehat{{\widetilde{H}}}\left( y_j\right) -\mu \left( y_j\right) A\left( y_j\right) \right] =0. \end{aligned}$$

Thus, \(\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) \) is a DGE of \(\widehat{{\widetilde{H}}}\left( x_i^*\right) \) on \(A\left( x_i^*\right) \), and so is \(\varliminf _{x\rightarrow x_i^*}\mu \left( x\right) \).

We now prove that \(\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) =\varliminf _{x\rightarrow x_i^*}\mu \left( x\right) \) by contradiction. If not, then \(\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) \) and \(\varliminf _{x\rightarrow x_i^*}\mu \left( x\right) \) are the only two DGEs of \(\widehat{{\widetilde{H}}}\left( x_i^*\right) \) on \(A\left( x_i^*\right) \) by Corollary A1, and one must be 0 by Theorem A1. Assume without loss of generality that \(0=\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) >\varliminf _{x\rightarrow x_i^*}\mu \left( x\right) \). Let \({\bar{\mu }}:=\varliminf _{x\rightarrow x_i^*}\mu \left( x\right) /2<0\). By Corollary A1, \(r\left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -{\bar{\mu }} A\left( x_i^*\right) \right] >0\) since \(\varliminf _{x\rightarrow x_i^*}\mu \left( x\right)<{\bar{\mu }}<\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) \). For j large enough, \({\bar{\mu }}<\mu \left( y_j\right) \). Then \(r\left[ \widehat{{\widetilde{H}}}\left( y_j\right) -{\bar{\mu }}A\left( y_j\right) \right] <0\) by Corollary A1 since \({\bar{\mu }}<0\) and \({\bar{\mu }}<\mu (y_j)\). Letting \(j\rightarrow \infty \) gives \(r\left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -{\bar{\mu }} A\left( x_i^*\right) \right] \le 0\), a contradiction. Therefore, \(\varlimsup _{x\rightarrow x_i^*}\mu \left( x\right) =\varliminf _{x\rightarrow x_i^*}\mu \left( x\right) \).

Now we show that \(\lim _{x\rightarrow x_i^*}\mu \left( x\right) =\mu \left( x_i^*\right) \).

  1.

    If \(\mu \left( x_i^*\right) =0\), then by Definition A4, 0 is the only DGE of \(\widehat{{\widetilde{H}}}\left( x_i^*\right) \) on \(A\left( x_i^*\right) \). Then \(\lim _{x\rightarrow x_i^*}\mu \left( x\right) =0=\mu \left( x_i^*\right) \) because \(\lim _{x\rightarrow x_i^*}\mu \left( x\right) \) is a DGE.

  2.

    If \(\mu \left( x_i^*\right) >0\), then following Definition A3, denote the number (counting multiplicity) of GEs with positive (negative) real parts of \(\widehat{{\widetilde{H}}}\left( x\right) \) on \(A\left( x\right) \) by \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \) (\(n\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \)). By symmetry, it is enough to prove \(\lim _{x\rightarrow x_i^*-}\mu \left( x\right) =\mu \left( x_i^*\right) \).

    By Theorem A1, \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x_i^*\right) =m\left[ A\left( x_i^*\right) \right] =i-1\). By continuity, \(\exists \delta >0\) such that for \(x\in \left( x_i^*-\delta ,x_i^*\right) \), the \(i-1\) roots of \(\det \left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -\lambda A\left( x_i^*\right) \right] =0\) with positive real parts at \(x_i^*\) still have positive real parts at x. By \(\lim _{x\rightarrow x_i^*-}\lambda _\infty \left( x\right) =+\infty \), it is possible to decrease \(\delta \) such that \(\lambda _\infty \left( x\right) >0\), thereby \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \ge i\) for \(x\in \left( x_i^*-\delta ,x_i^*\right) \). Conversely, for \(x\in \left( x_{i+1}^*,x_i^*\right) \), \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \le m\left[ A\left( x\right) \right] =i\) by Theorem A1. Therefore, \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =m\left[ A\left( x\right) \right] =i\) for \(x\in \left( x_i^*-\delta ,x_i^*\right) \). By Theorem A1, \(\mu \left( x\right) >0\) for \(x\in \left( x_i^*-\delta ,x_i^*\right) \). Denote the roots of \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =0\) other than \(\lambda _{\infty }(x)\) and \(\mu (x)\) by \(\{\lambda _j(x)\}_{j=1}^{G-2}\). Assume without loss of generality that \(\{\lambda _j(x)\}_{j=1}^{i-2}\) have positive real parts in \(\left( x_i^*-\delta ,x_i^*\right) \).

    By statement 2, \(\mu (x)\) and \(\{\lambda _j(x)\}_{j=1}^{G-2}\) tend to the roots of \(\det \left[ \widehat{{\widetilde{H}}}\left( x_i^*\right) -\lambda A\left( x_i^*\right) \right] =0\). If \(\lim _{x\rightarrow x_i^*-}\mu \left( x\right) =\mu \left( x_i^*\right) \) does not hold, then \(\lim _{x\rightarrow x_i^*-}\mu \left( x\right) =0\). Also, since \(\{\lambda _j\}_{j=i-1}^{G-2}\) have nonpositive real parts in \(\left( x_i^*-\delta ,x_i^*\right) \), \(\lim _{x\rightarrow x_i^*-}\mathfrak {R}[\lambda _j(x)]\le 0\) for \(j\in [i-1,G-2]\). Thus, \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x_i^*\right) \le i-2<i-1=m\left[ A\left( x_i^*\right) \right] \), which contradicts Theorem A1.

  3.

    The case \(\mu \left( x_i^*\right) <0\) is proved in the same way as the case \(\mu \left( x_i^*\right) >0\).

\(\square \)
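Statement 1 of Theorem 1 can be illustrated numerically. The sketch below is a toy check under stated assumptions: the constant matrix `H`, the values in `x_star` and `gamma = 1` are hypothetical stand-ins (in the model, \(\widehat{{\widetilde{H}}}\) depends on x), and the diagonal entry `H[i, i]` plays the role of \(\widehat{{\widetilde{h}}}_{i,\cdot }\). It tracks the GE of largest modulus as x approaches \(x_2^*\) from the left and rescales it by \(\gamma \left( x_2^*-x\right) \).

```python
import numpy as np

gamma = 1.0
x_star = np.array([3.0, 2.0, 1.0])   # hypothetical x_1^* > x_2^* > x_3^*
H = np.array([[ 2.0, -1.0, -1.0],    # hypothetical constant stand-in for H(x);
              [-1.0,  2.0, -1.0],    # its diagonal entries play the role of h_{i,.}
              [-1.0, -1.0,  2.0]])

def ges(x):
    """GEs of H on A(x) = diag(gamma*(x_j^* - x)), valid for x != x_j^*."""
    A_inv = np.diag(1.0 / (gamma * (x_star - x)))
    return np.linalg.eigvals(H @ A_inv)

i = 1                                 # zero-based index of x_2^*
scaled = []
for d in (1e-2, 1e-4, 1e-6):          # approach x_2^* from the left
    lam = ges(x_star[i] - d)
    lam_inf = lam[np.argmax(np.abs(lam))].real   # the diverging GE
    scaled.append(gamma * d * lam_inf)           # gamma*(x_i^* - x)*lambda_inf(x)
    print(d, lam_inf, scaled[-1])
```

In this toy setting the diverging GE is positive on the left of \(x_2^*\), and the rescaled values approach `H[1, 1]`, in line with statement 1.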

Theorems A2 and 1 imply Corollary 1.

Corollary 1

Under Assumption 3, \(\mu \left( x\right) \) is continuous in \(\left( x_G^*,x_1^*\right) \).

Remark 1

A generalisation of Theorem 1 is given in Theorem A3 under a weaker assumption than Assumption 3 that for \(i\in \left[ 2,G-1\right] \), \(\widehat{{\widetilde{H}}}\left( x\right) \) is continuous at \(x_i^*\), \(\lim _{x\rightarrow x_i^*}\frac{\widehat{{\widetilde{h}}}_{i,\cdot }\left( x\right) }{|x-x_i^*|}=+\infty \), and \(\lim _{x\rightarrow x_i^*}\frac{\widehat{{\widetilde{h}}}_{i,j}\left( x\right) }{\widehat{{\widetilde{h}}}_{i,\cdot }\left( x\right) }\) exists \(\forall j\ne i\).

The following lemma is essential for the relation between \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \) and \(\mu \left( x\right) \).

Lemma 1

Under Assumption 3, any real continuous GE \(\lambda \left( x\right) \) in \(\left( x_G^*,x_1^*\right) \) satisfies \(\lambda \left( x\right) \in \left\{ 0,\mu \left( x\right) \right\} \), \(\forall x\in \left( x_G^*,x_1^*\right) \).

Proof

By Corollary 1, \(\mu \left( x\right) \) is continuous in \(\left( x_G^*,x_1^*\right) \). By Lemma A1, \(\mu \left( x\right) \) or 0 is not a single root of \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =0\) iff \(\mu \left( x\right) =0\). So \(\mu \left( x\right) \) and 0 do not intersect any other GE of \(\widehat{{\widetilde{H}}}\left( x\right) \) on \(A\left( x\right) \). Thus, if \(\lambda \left( x\right) \in \left\{ 0,\mu \left( x\right) \right\} \) for some \(x\in \left( x_G^*,x_1^*\right) \), then \(\lambda \left( x\right) \in \left\{ 0,\mu \left( x\right) \right\} \) for all \(x\in \left( x_G^*,x_1^*\right) \). So it is enough to prove the former. By Lemma A6, \(0<\lambda \left( x\right) \) for \(x\in \left( x_G^*,x_{G-1}^*\right) \), and \(0>\lambda \left( x\right) \) for \(x\in \left( x_2^*,x_1^*\right) \). By continuity of \(\lambda \left( x\right) \), \(\exists x\in \left( x_G^*,x_1^*\right) \) such that \(\lambda \left( x\right) =0\in \left\{ 0,\mu \left( x\right) \right\} \). \(\square \)

By Assumption 1 and Lemma 1, \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \in \left\{ 0,\mu \left( x\right) \right\} \). However, this still does not determine \(\partial _x\widehat{{\widetilde{\varPhi }}}\) uniquely. Indeed, if \(\mu \left( x\right) \) has two roots \(x_G^*<x'<x''<x_1^*\), one can let \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) =0\) in \(\left( x',x''\right) \) and \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) =\mu \left( x\right) \) in \(\left( x_G^*,x'\right) \cup \left( x'',x_1^*\right) \). Then \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \) is still continuous and violates neither Assumption 1 nor Lemma 1. Therefore, we add another assumption to exclude such trivial scenarios.

Assumption 4

If \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \in \left\{ 0,\mu \left( x\right) \right\} \), then \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) =\mu \left( x\right) \).

3.2 The Lyapunov property of \(\widehat{{\widetilde{\varPhi }}}(x)\) for DMFD

Theorem 2 proves the Lyapunov property of \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \).

Theorem 2

Under Assumptions 1, 3 and 4, in \(\left( x_G^*,x_1^*\right) \):

  1.

    \(\mu (x)=0\) iff \(g(x)-\gamma x=0\);

  2.

    \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) satisfies

    $$\begin{aligned} \frac{d\widehat{{\widetilde{\varPhi }}}\left( x\right) }{dt}=\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \frac{dx}{dt}=\mu \left( x\right) \left[ g\left( x\right) -\gamma x\right] \le 0, \end{aligned}$$
    (18)

    and the equality holds iff \(g(x)-\gamma x=0\).

Proof

By Eq. (4) and Lemma A8,

$$\begin{aligned} g(x)-\gamma x=\frac{\sum _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \right] \gamma (x_j^*-x)}{\sum _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \right] }. \end{aligned}$$
(19)

\(\sum \limits _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \right] >0\) because \(\widehat{{\widetilde{H}}}\left( x\right) \) is irreducible. So

$$\begin{aligned} g(x)-\gamma x=0 \Leftrightarrow \sum _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \right] \gamma (x_j^*-x)=0, \end{aligned}$$
(20)

where \(\widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \) is the submatrix of \(\widehat{{\widetilde{H}}}\left( x\right) \) obtained by deleting the jth row and column. For \(x\in \bigcup _{j=1}^{G-1}\left( x_{j+1}^*,x_j^*\right) \), denote the G roots of \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] \) by \(\{\lambda _j\}_{j=1}^G\), and for \(x=x_i^*\) for some \(i\in [2,G-1]\), denote the \(G-1\) roots by \(\{\lambda _j\}_{j=1}^{G-1}\). Assume without loss of generality that \(\lambda _1\left( x\right) \equiv 0\). Comparing the coefficients of \(\lambda \) in \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] \), we have for \(x\in \bigcup _{j=1}^{G-1}\left( x_{j+1}^*,x_j^*\right) \),

$$\begin{aligned} \frac{\sum _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \right] \gamma (x_j^*-x)}{\prod _{j=1}^G\gamma \left( x_j^*-x\right) }=\sum _{j=1}^G\prod _{j'\ne j}\lambda _{j'}\left( x\right) =\prod _{j=2}^G\lambda _j\left( x\right) , \end{aligned}$$
(21)

and for \(x=x_i^*\),

$$\begin{aligned} \frac{\sum _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x_i^*\right) \right] \gamma (x_j^*-x_i^*)}{\widehat{{\widetilde{h}}}_{i,\cdot }(x_i^*)\prod _{j\ne i}\gamma \left( x_j^*-x_i^*\right) }=\sum _{j=1}^{G-1}\prod _{j'\ne j}\lambda _{j'}\left( x_i^*\right) =\prod _{j=2}^{G-1}\lambda _j\left( x_i^*\right) . \end{aligned}$$

By Lemma A1, \(\mu \left( x\right) \) or 0 is not a single root of \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] =0\) iff \(\mu \left( x\right) =0\). In other words, \(\prod _{j=2}^G\lambda _j\left( x\right) =0\) (or \(\prod _{j=2}^{G-1}\lambda _j\left( x\right) =0\) for \(x=x_i^*\)) is equivalent to \(\mu (x)=0\) and hence, by the two identities above, to \(\sum _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \right] \gamma (x_j^*-x)=0\). Together with Eq. (20), this proves statement 1.

By Eqs. (19) and (21), and Assumption 4, for \(x\in \bigcup _{j=1}^{G-1}\left( x_{j+1}^*,x_j^*\right) \),

$$\begin{aligned} \frac{d\widehat{{\widetilde{\varPhi }}}\left( x\right) }{dt}= & {} \partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \frac{dx}{dt}=\mu \left( x\right) \left[ g\left( x\right) -\gamma x\right] \\= & {} \frac{\mu \left( x\right) }{\sum _{j=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \right] }\left[ \prod _{j=2}^G\lambda _j\left( x\right) \right] \prod _{j=1}^G \gamma \left( x_j^*-x\right) . \end{aligned}$$

By Lemma A6, exactly one of the following occurs.

  1.

    \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =m\left[ A\left( x\right) \right] \), \(n\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =n\left[ A\left( x\right) \right] -1\), \(\mu \left( x\right) >0\).

  2.

    \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =m\left[ A\left( x\right) \right] -1\), \(n\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =n\left[ A\left( x\right) \right] \), \(\mu \left( x\right) <0\).

  3.

    \(m\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =m\left[ A\left( x\right) \right] -1\), \(n\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =n\left[ A\left( x\right) \right] -1\), \(\mu \left( x\right) =0\).

The signs of \(\left[ \prod _{j=2}^G\lambda _j\left( x\right) \right] \) and \(\prod _{j=1}^G \gamma \left( x_j^*-x\right) \) are determined by \(n\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \) and \(n\left[ A\left( x\right) \right] \), respectively. In case 1, they have opposite signs, and \(\mu (x)>0\). In case 2, they have the same sign, and \(\mu (x)<0\). In case 3, \(\mu (x)=0\). Since \(\sum _{i=1}^G\det \left[ \widehat{{\widetilde{H}}}^{{\bar{i}}}\left( x\right) \right] >0\), \(\frac{d\widehat{{\widetilde{\varPhi }}}\left( x\right) }{dt}\le 0\) in all cases. Because \(\mu \left( x\right) \) and \(g\left( x\right) \) are continuous in \(\left( x_G^*,x_1^*\right) \), \(\frac{d\widehat{{\widetilde{\varPhi }}}\left( x\right) }{dt}\) is also continuous by Eq. (18). Thus, \(\frac{d\widehat{{\widetilde{\varPhi }}}\left( x\right) }{dt}\le 0\) for \(x=x_i^*\) with \(i\in \left[ 2,G-1\right] \). The necessary and sufficient condition for the equality is implied by the statement 1. Hence the statement 2 is proved. \(\square \)
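The coefficient identity in Eq. (21) is purely linear-algebraic and can be checked numerically. The following sketch does so for a hypothetical constant matrix `H` and hypothetical values of `x_star`, `gamma` and `x`; none of these numbers come from the model, and the helper name `weighted_minor_sum` is introduced here only for illustration.

```python
import numpy as np

def weighted_minor_sum(H, a):
    """Sum over j of det(H with row and column j deleted) times a[j]."""
    G = H.shape[0]
    total = 0.0
    for j in range(G):
        keep = [k for k in range(G) if k != j]
        total += np.linalg.det(H[np.ix_(keep, keep)]) * a[j]
    return total

gamma, x = 1.0, 1.4                   # hypothetical parameter and evaluation point
x_star = np.array([3.0, 2.0, 1.0])    # hypothetical x_1^* > x_2^* > x_3^*
H = np.array([[ 2.0, -1.0, -1.0],     # hypothetical singular irreducible Z-matrix
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])

a = gamma * (x_star - x)              # diagonal of A(x)
lhs = weighted_minor_sum(H, a) / np.prod(a)

lam = np.linalg.eigvals(H @ np.diag(1.0 / a))    # the G GEs of H on A(x)
lam = np.delete(lam, np.argmin(np.abs(lam)))     # drop the trivial GE lambda_1 = 0
rhs = np.prod(lam).real                          # product of the remaining GEs

print(lhs, rhs)
```

For this toy input both sides are negative while \(\prod _{j=1}^G\gamma \left( x_j^*-x\right) \) is also negative, which corresponds to case 2 in the proof.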

Proposition 1

Under Assumptions 2, 3 and 4, \({\widetilde{C}}_i\left( x\right) \) is continuous and positive in \(\left( x_G^*,x_1^*\right) \).

Proof

By definition, \(r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\mu \left( x\right) A\left( x\right) \right] =0\) for \(x\ne x_i^*\). By continuity, \(r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\mu \left( x\right) A\left( x\right) \right] =0\) \(\forall x\in \left( x_G^*,x_1^*\right) \). By Assumption 4, \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) =\mu (x)\). Then by Eq. (11), \({\widetilde{C}}\left( x\right) \) is the eigenvector of \(\widehat{{\widetilde{H}}}\left( x\right) -\mu \left( x\right) A\left( x\right) \) corresponding to the dominant eigenvalue \(r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\mu \left( x\right) A\left( x\right) \right] =0\). Because \(\widehat{{\widetilde{H}}}\left( x\right) -\mu \left( x\right) A\left( x\right) \) is irreducible, by the Perron–Frobenius theorem, \({\widetilde{C}}_i\left( x\right) >0\) \(\forall i\), and \({\widetilde{C}}_i\left( x\right) \) is continuous since \(r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\mu \left( x\right) A\left( x\right) \right] =0\) is a single eigenvalue. \(\square \)

Proposition 2

Under Assumptions 1, 3 and 4, for \(\epsilon >0\),

$$\begin{aligned} \int _{x_G^*}^{x_G^*+\epsilon }\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) dx=-\infty ,\quad \int _{x_1^*-\epsilon }^{x_1^*}\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) dx=+\infty . \end{aligned}$$

Proof

Statement 1 in Theorem 1 is also true for \(x_1^*\) and \(x_G^*\). So

$$\begin{aligned} \int _{x_G^*}^{x_G^*+\epsilon }\lambda _{\infty }\left( x\right) dx=-\infty ,\quad \int _{x_1^*-\epsilon }^{x_1^*}\lambda _{\infty }\left( x\right) dx=+\infty . \end{aligned}$$

By Assumption 4, \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) =\mu (x)\). It is enough to prove that \(\mu \left( x\right) =\lambda _{\infty }\left( x\right) \) in \(\left( x_G^*,x_G^*+\epsilon \right) \) and \(\left( x_1^*-\epsilon ,x_1^*\right) \) for \(\epsilon \) small enough. Since \(\lim _{x\rightarrow x_G^*+}\lambda _{\infty }\left( x\right) =-\infty \), for \(\epsilon <x_{G-1}^*-x_G^*\) small enough, \(\lambda _{\infty }\left( x\right) <0\) in \(\left( x_G^*,x_G^*+\epsilon \right) \). Thus, \(n\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) \ge 1=n\left[ A\left( x\right) \right] \). By Theorem A1, \(n\left( \widehat{{\widetilde{H}}}|A\right) \left( x\right) =n\left[ A\left( x\right) \right] \) (\(\lambda _{\infty }\left( x\right) \) is the only GE with negative real part) and \(\mu \left( x\right) <0\). Thus, \(\mu \left( x\right) =\lambda _{\infty }\left( x\right) \) in \(\left( x_G^*,x_G^*+\epsilon \right) \). The case for \(x_1^*\) can be proved similarly. \(\square \)

By Proposition 2,

$$\begin{aligned} \lim _{x\rightarrow x_G^*+}\widehat{{\widetilde{\varPhi }}}\left( x\right) =+\infty ,\quad \lim _{x\rightarrow x_1^*-}\widehat{{\widetilde{\varPhi }}}\left( x\right) =+\infty . \end{aligned}$$

It is appropriate to define \(\widehat{{\widetilde{\varPhi }}}\left( x\right) =+\infty \) for \(x\in \left[ 0,x_G^*\right] \cup \left[ x_1^*,+\infty \right) \) because the stationary distribution of the FR model vanishes outside \(\left( x_G^*,x_1^*\right) \). In fact, once x enters \(\left( x_G^*,x_1^*\right) \), it never leaves, owing to the piecewise-deterministic dynamics of the FR model.

3.3 The numerical algorithm for \(\widehat{{\widetilde{\varPhi }}}(x)\)

We need the following proposition to support our numerical algorithm.

Proposition 3

Under Assumption 3, if \(\mu \left( x\right) \ne 0\), then no real GE of \(\widehat{{\widetilde{H}}}\left( x\right) \) on \(A\left( x\right) \) lies strictly between \(\mu \left( x\right) \) and 0.

Proof

Assume without loss of generality that \(\mu \left( x\right) >0\). By Corollary A1, \(r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] >0\) for \(0<\lambda <\mu \left( x\right) \). Recall that \(r\left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] \) is the eigenvalue of \(\widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \) with the smallest real part. Thus, all eigenvalues of \(\widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \) have positive real parts. Therefore, \(\det \left[ \widehat{{\widetilde{H}}}\left( x\right) -\lambda A\left( x\right) \right] >0\). So any \(\lambda \in \left( 0,\mu \left( x\right) \right) \) cannot be a GE.\(\square \)

Proposition 3 together with Theorem A1 can be used to calculate \(\mu \left( x\right) \) in \(\left( x_G^*,x_1^*\right) \) numerically, and hence \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \) under Assumption 4. Then, since the global minimum of the LDP rate function is always zero (Touchette 2009), one can numerically obtain \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) by integrating \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x\right) \) and shifting the result so that its minimum is zero.
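A minimal sketch of this last step is given below, assuming that values of \(\mu \) are already available on a grid. The trapezoidal rule, the function name `rate_function_from_slope` and the linear test slope are choices made here for illustration and are not taken from the paper.

```python
import numpy as np

def rate_function_from_slope(xs, mu_vals):
    """Integrate the slope mu on the grid xs (trapezoidal rule) and
    shift the result so that its global minimum on the grid is zero."""
    increments = 0.5 * (mu_vals[1:] + mu_vals[:-1]) * np.diff(xs)
    phi = np.concatenate(([0.0], np.cumsum(increments)))
    return phi - phi.min()

# Hypothetical slope with a single zero at x = 1 (not a model output).
xs = np.linspace(0.5, 1.5, 201)
mu_vals = xs - 1.0
phi = rate_function_from_slope(xs, mu_vals)
print(phi.min(), xs[np.argmin(phi)])
```

For a slope with several zeros, the same shift places the deepest well at zero, and the other local minima keep their relative heights.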

The rest of this subsection is the numerical algorithm for obtaining the nontrivial DGE at any fixed \(x\in \bigcup _{i=1}^{G-1}\left( x_{i+1}^*,x_i^*\right) \). Since A is nonsingular, the GEs of \(\widehat{{\widetilde{H}}}\) on A are the eigenvalues of \(\widehat{{\widetilde{H}}}A^{-1}\), which can be solved by existing numerical methods. We first assume the ideal case that the numerically solved spectrum is completely accurate. Then \(\mu \) can be obtained based on Theorem A1 and Proposition 3, following the four steps:

  1.

    Calculate \(m\left( \widehat{{\widetilde{H}}}|A\right) \) and \(n\left( \widehat{{\widetilde{H}}}|A\right) \) by the G eigenvalues of \(\widehat{{\widetilde{H}}}A^{-1}\).

  2.

    If \(m\left( \widehat{{\widetilde{H}}}|A\right) =m\left( A\right) \), let \(\mu \) be the smallest real positive eigenvalue in the spectrum.

  3.

    If \(n\left( \widehat{{\widetilde{H}}}|A\right) =n\left( A\right) \), let \(\mu \) be the largest real negative eigenvalue in the spectrum.

  4.

    If \(m\left( \widehat{{\widetilde{H}}}|A\right) =m\left( A\right) -1\) and \(n\left( \widehat{{\widetilde{H}}}|A\right) =n\left( A\right) -1\), let \(\mu =0\).

However, in real applications, the computed eigenvalues are subject to round-off error, so the above procedure has to be modified as follows:

  1.

    Set the eigenvalue with the smallest absolute value to 0.

  2.

    Calculate \(m\left( \widehat{{\widetilde{H}}}|A\right) \) and \(n\left( \widehat{{\widetilde{H}}}|A\right) \) by the G eigenvalues.

  3.

    If \(m\left( \widehat{{\widetilde{H}}}|A\right) \ge m\left( A\right) \), then

    • find the eigenvalues with the smallest absolute imaginary part among all the eigenvalues with positive real parts;

    • let \(\mu \) be the smallest real part among the eigenvalues found in the above step.

  4.

    If \(n\left( \widehat{{\widetilde{H}}}|A\right) \ge n\left( A\right) \), do symmetrically as the case when \(m\left( \widehat{{\widetilde{H}}}|A\right) \ge m\left( A\right) \).

  5.

    Otherwise, let \(\mu =0\).
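The modified procedure above can be sketched as follows. This is an illustrative reading of the five steps rather than the authors' implementation: the function name `select_mu`, the tie tolerance `tol` for comparing imaginary parts, the guards against empty candidate sets, and the toy inputs `H`, `x_star`, `gamma` and `x` are all assumptions made here.

```python
import numpy as np

def select_mu(eigs, m_A, n_A, tol=1e-12):
    """Pick a candidate nontrivial DGE from numerically computed GEs."""
    eigs = np.array(eigs, dtype=complex)
    eigs[np.argmin(np.abs(eigs))] = 0.0              # step 1: smallest modulus -> 0
    pos = eigs[eigs.real > 0]                        # step 2: count by sign of real part
    neg = eigs[eigs.real < 0]
    if pos.size > 0 and pos.size >= m_A:             # step 3
        near_real = pos[np.abs(pos.imag) <= np.abs(pos.imag).min() + tol]
        return float(near_real.real.min())           # smallest real part among them
    if neg.size > 0 and neg.size >= n_A:             # step 4: symmetric case
        near_real = neg[np.abs(neg.imag) <= np.abs(neg.imag).min() + tol]
        return float(near_real.real.max())           # closest to zero from below
    return 0.0                                       # step 5

# Hypothetical toy input: constant H and A(x) = diag(gamma*(x_j^* - x)).
gamma, x = 1.0, 1.4
x_star = np.array([3.0, 2.0, 1.0])
H = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])
A = np.diag(gamma * (x_star - x))
m_A = int(np.sum(np.diag(A) > 0))                    # positive diagonal entries of A
n_A = int(np.sum(np.diag(A) < 0))                    # negative diagonal entries of A
mu = select_mu(np.linalg.eigvals(H @ np.linalg.inv(A)), m_A, n_A)
print(mu)
```

The returned value is only a candidate; whether it is a DGE still has to be confirmed through the sign of \(r\left( \widehat{{\widetilde{H}}}-\lambda A\right) \) near it.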

One may expect the above modification to yield an accurate \(\mu \) if the round-off error is small. Nevertheless, no theory guarantees its correctness. Thus, one still needs to check the obtained \(\mu \). Corollary A1 provides a possible method, which implies that

  1.

    If \(r\left( \widehat{{\widetilde{H}}}-\lambda A\right) \ge 0\), then both \((-\infty ,\lambda ]\) and \([\lambda ,+\infty )\) contain DGE.

  2.

    If \(\lambda >0\) and \(r\left( \widehat{{\widetilde{H}}}-\lambda A\right) \le 0\), then \((\lambda ,+\infty )\) contains no DGE.

  3.

    If \(\lambda <0\) and \(r\left( \widehat{{\widetilde{H}}}-\lambda A\right) \le 0\), then \((-\infty ,\lambda )\) contains no DGE.

We assume that the above numerical algorithm is accurate enough such that the following statements are valid for some small error tolerances \(\tau>\epsilon >0\):

  1.

    If \(r\left( \widehat{{\widetilde{H}}}-\lambda A\right) \ge 0\), then both \((-\infty ,\lambda +\epsilon ]\) and \([\lambda -\epsilon ,+\infty )\) contain DGE.

  2.

    If \(\lambda \ge \tau \) and \(r\left( \widehat{{\widetilde{H}}}-\lambda A\right) \le 0\), then \((\lambda +\epsilon ,+\infty )\) contains no DGE.

  3.

    If \(\lambda \le -\tau \) and \(r\left( \widehat{{\widetilde{H}}}-\lambda A\right) \le 0\), then \((-\infty ,\lambda -\epsilon )\) contains no DGE.

Denote the exact value of \(\mu \) by \(\mu ^*\). The following procedure checks whether \(\mu \) is a good approximation of \(\mu ^*\). If it is not, the check still narrows the possible range in which to search for \(\mu ^*\).

  1.

    If \(|\mu |<2\tau \), then

    • If \(r\left( \widehat{{\widetilde{H}}}+2\tau A\right) \le 0\) and \(r\left( \widehat{{\widetilde{H}}}-2\tau A\right) \le 0\), then \(\mu ^*\in [-2\tau -\epsilon ,2\tau +\epsilon ]\), thereby \(\mu \) is a good approximation of \(\mu ^*\).

    • If \(r\left( \widehat{{\widetilde{H}}}+2\tau A\right) > 0\) and \(r\left( \widehat{{\widetilde{H}}}-2\tau A\right) \le 0\), then \((-\infty ,-2\tau +\epsilon ]\) contains DGE. Since \(0\notin (-\infty ,-2\tau +\epsilon ]\), \(\mu ^*\in (-\infty ,-2\tau +\epsilon ]\).

    • If \(r\left( \widehat{{\widetilde{H}}}+2\tau A\right) \le 0\) and \(r\left( \widehat{{\widetilde{H}}}-2\tau A\right) > 0\), then \(\mu ^*\in [2\tau -\epsilon ,+\infty )\) symmetrically.

    • If \(r\left( \widehat{{\widetilde{H}}}+2\tau A\right) > 0\) and \(r\left( \widehat{{\widetilde{H}}}-2\tau A\right) > 0\), then \(\mu ^*\in (-\infty ,-2\tau +\epsilon ]\cap [2\tau -\epsilon ,+\infty )=\emptyset \), which is a contradiction.

  2.

    If \(\mu \ge 2\tau \), then

    • If \(r\left( \widehat{{\widetilde{H}}}-(\mu -\tau ) A\right) \ge 0\) and \(r\left( \widehat{{\widetilde{H}}}-(\mu +\tau ) A\right) \le 0\), then \(\mu ^*\in [\mu -\tau -\epsilon ,\mu +\tau +\epsilon ]\), thereby \(\mu \) is a good approximation of \(\mu ^*\).

    • If \(r\left( \widehat{{\widetilde{H}}}-(\mu -\tau ) A\right) < 0\) and \(r\left( \widehat{{\widetilde{H}}}-(\mu +\tau ) A\right) \le 0\), then \(\mu ^*\in (-\infty ,\mu -\tau +\epsilon ]\).

    • If \(r\left( \widehat{{\widetilde{H}}}-(\mu -\tau ) A\right) \ge 0\) and \(r\left( \widehat{{\widetilde{H}}}-(\mu +\tau ) A\right) >0\), then \(\mu ^*\in [\mu +\tau -\epsilon ,+\infty )\).

    • If \(r\left( \widehat{{\widetilde{H}}}-(\mu -\tau ) A\right) <0\) and \(r\left( \widehat{{\widetilde{H}}}-(\mu +\tau ) A\right) >0\), then \(\mu ^*\in (-\infty ,\mu -\tau +\epsilon ]\cap [\mu +\tau -\epsilon ,+\infty )=\emptyset \), which is a contradiction.

  3.

    If \(\mu \le -2\tau \), do symmetrically as the case when \(\mu \ge 2\tau \).
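The check procedure above can be sketched as a single function returning whether \(\mu \) is accepted together with the possible range of \(\mu ^*\). This is a hedged illustration, not the authors' code: the names `r` and `check_mu`, the toy matrices and the tolerance values are assumptions, and the case \(\mu \le -2\tau \) is handled by mirroring, using the fact that the GEs of \(\widehat{{\widetilde{H}}}\) on \(-A\) are the negatives of those on A.

```python
import numpy as np

def r(M):
    """Dominant eigenvalue: smallest real part among the eigenvalues of M."""
    return float(np.linalg.eigvals(M).real.min())

def check_mu(H, A, mu, tau, eps):
    """Return (is_good, lo, hi); [lo, hi] is the possible range of mu*."""
    inf = float("inf")
    if mu <= -2 * tau:                        # mirror onto the case mu >= 2*tau
        good, lo, hi = check_mu(H, -A, -mu, tau, eps)
        return good, -hi, -lo
    if abs(mu) < 2 * tau:
        left_pos = r(H + 2 * tau * A) > 0     # r at lambda = -2*tau
        right_pos = r(H - 2 * tau * A) > 0    # r at lambda = +2*tau
        if not left_pos and not right_pos:
            return True, -2 * tau - eps, 2 * tau + eps
        if left_pos and not right_pos:
            return False, -inf, -2 * tau + eps
        if right_pos and not left_pos:
            return False, 2 * tau - eps, inf
        raise ValueError("contradictory signs of r")
    inner_ok = r(H - (mu - tau) * A) >= 0     # r at lambda = mu - tau
    outer_ok = r(H - (mu + tau) * A) <= 0     # r at lambda = mu + tau
    if inner_ok and outer_ok:
        return True, mu - tau - eps, mu + tau + eps
    if not inner_ok and outer_ok:
        return False, -inf, mu - tau + eps
    if inner_ok and not outer_ok:
        return False, mu + tau - eps, inf
    raise ValueError("contradictory signs of r")

# Hypothetical toy input (not taken from the model).
gamma, x = 1.0, 1.4
x_star = np.array([3.0, 2.0, 1.0])
H = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])
A = np.diag(gamma * (x_star - x))
eigs = np.linalg.eigvals(H @ np.linalg.inv(A)).real   # real for this toy pencil
tau, eps = 1e-3, 1e-4
result_neg = check_mu(H, A, eigs.min(), tau, eps)     # candidate: the negative GE
result_pos = check_mu(H, A, eigs.max(), tau, eps)     # candidate: the positive GE
print(result_neg, result_pos)
```

In this toy case the negative GE is accepted, whereas the positive GE is rejected and only yields an upper bound on \(\mu ^*\), which then restricts the next candidate.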

Unless \(\mu \) is already a good approximation of \(\mu ^*\), one should

  1.

    remove the eigenvalues whose real parts equal \(\mu \) or lie outside the possible range of \(\mu ^*\) obtained from the above steps;

  2.

    find one eigenvalue with the smallest absolute imaginary part among the remaining eigenvalues, and let \(\mu \) be the real part of it;

  3.

    check whether the new \(\mu \) is a good approximation of \(\mu ^*\), and if not, further narrow the possible range of \(\mu ^*\).

Repeat the above procedure until one of the following cases is encountered.

  1.

    If \(\mu \) is a good approximation of \(\mu ^*\), then return \(\mu \).

  2.

    Otherwise, if the possible range of \(\mu ^*\) is small enough, then return the mid-point of the range.

  3.

    If all eigenvalues are removed, and the possible range of \(\mu ^*\) is still not small enough, then apply a modified dichotomic search in the possible range.

In summary, we have Algorithm 3.1, where \(\mathfrak {R}\) and \(\mathfrak {I}\) mean the real and imaginary parts, respectively, and \([a,b]\) is the possible range of \(\mu ^*\). The modified dichotomic search converges because \(b-a\le (b'-a')/2+\tau \), where \([a',b']\) is the possible range of \(\mu ^*\) in the previous step. The value of \(\mu \) at \(x_i^*\) for \(i\in [2,G-1]\) can be obtained by continuity.

Algorithm 3.1

4 Local fluctuations and transition rates between phenotypic states in a tristable example

We apply Algorithm 3.1 and Eq. (16) to a tristable example. The LDP rate functions in the intermediate and rapid regions of gene-state switchings are calculated, and the local fluctuations obtained from the rate functions are compared. Here, the local fluctuations are characterised by the second derivative of the LDP rate function at each local minimum, which corresponds to a stable fixed point of the DMFD. To be more precise, we can expand the rate function \(\phi (x)\) near any local minimum \(x^*\), i.e.

$$\begin{aligned} \phi \left( x\right) = \phi \left( x^*\right) + \frac{\phi ''\left( x^*\right) }{2}\left( x-x^*\right) ^2 + \cdots . \end{aligned}$$
(22)

Then the asymptotic Gaussian variance of \(p^{ss}\left( x\right) \) close to \(x^*\) is \(({\mathcal {H}}\phi ''(x^*))^{-1}\).
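The relation between curvature and variance can be illustrated numerically. The sketch below is not part of the paper's algorithm: the function names, the central-difference step `h` and the quartic `toy_phi` are assumptions made only for illustration. It estimates \(\phi ''(x^*)\) by a central difference and returns \(({\mathcal {H}}\phi ''(x^*))^{-1}\).

```python
def second_derivative(phi, x, h=1e-4):
    """Central-difference estimate of phi''(x) (h is an illustrative choice)."""
    return (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / (h * h)


def gaussian_variance(phi, x_star, H, h=1e-4):
    """Asymptotic Gaussian variance 1 / (H * phi''(x_star)) near a local minimum."""
    curvature = second_derivative(phi, x_star, h)
    if curvature <= 0.0:
        raise ValueError("x_star does not look like a local minimum of phi")
    return 1.0 / (H * curvature)


def toy_phi(x):
    """Hypothetical rate function with a minimum at x = 0.3 and phi''(0.3) = 4."""
    return 2.0 * (x - 0.3) ** 2 + (x - 0.3) ** 4


variance = gaussian_variance(toy_phi, 0.3, H=50.0)
print(variance)
```

For the toy function the curvature at the minimum is 4, so with \({\mathcal {H}}=50\) the variance is close to \(1/200\).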

In this section, we also show that the transition rates between phenotypic states in the intermediate region are correctly predicted by the rate formula of the FR model based on the Freidlin–Wentzell LDP (Freidlin and Wentzell 2014), and a direct comparison with the stationary distribution of the full CME model further supports the validity of the numerically obtained rate function of the FR model.

4.1 Tristable example

Let \(G=3\). By the definitions of \(g\left( x\right) \) and \(A\left( x\right) \) in Eqs. (4) and (10), the fixed points (phenotypic states) of the DMFD in Eq. (6) are the roots of

$$\begin{aligned}&\det \left[ \widehat{{\widetilde{H}}}^{{\bar{1}}}\left( x\right) \right] \gamma \left( x_1^*-x\right) +\det \left[ \widehat{{\widetilde{H}}}^{{\bar{2}}}\left( x\right) \right] \gamma \left( x_2^*-x\right) \nonumber \\&\quad +\det \left[ \widehat{{\widetilde{H}}}^{{\bar{3}}}\left( x\right) \right] \gamma \left( x_3^*-x\right) =0, \end{aligned}$$
(23)

where \(\widehat{{\widetilde{H}}}^{{\bar{j}}}\left( x\right) \) is the submatrix of \(\widehat{{\widetilde{H}}}\left( x\right) \) obtained by deleting the \(j\)th row and column. Let

$$\begin{aligned}&\widehat{{\widetilde{h}}}_{1,3}\left( x\right) \equiv \widehat{{\widetilde{h}}}_{3,1}\left( x\right) \equiv 0,\quad \widehat{{\widetilde{h}}}_{1,2}\left( x\right) :=\widehat{{\widetilde{h}}}_{1,2}^c\equiv 1,\quad \widehat{{\widetilde{h}}}_{2,3}\left( x\right) :=\widehat{{\widetilde{h}}}_{2,3}^c>0,\nonumber \\&\widehat{{\widetilde{h}}}_{2,1}\left( x\right) :=\widehat{{\widetilde{h}}}_{2,1}^cx^2>0,\quad \widehat{{\widetilde{h}}}_{3,2}\left( x\right) :=\widehat{{\widetilde{h}}}_{3,2}^cx^2>0. \end{aligned}$$
(24)

Then,

$$\begin{aligned} \det \left[ \widehat{{\widetilde{H}}}^{{\bar{1}}}\left( x\right) \right]= & {} \widehat{{\widetilde{h}}}_{2,1}\left( x\right) \widehat{{\widetilde{h}}}_{3,2}\left( x\right) ,\nonumber \\ \det \left[ \widehat{{\widetilde{H}}}^{{\bar{2}}}\left( x\right) \right]= & {} \widehat{{\widetilde{h}}}_{1,2}\left( x\right) \widehat{{\widetilde{h}}}_{3,2}\left( x\right) ,\nonumber \\ \det \left[ \widehat{{\widetilde{H}}}^{{\bar{3}}}\left( x\right) \right]= & {} \widehat{{\widetilde{h}}}_{1,2}\left( x\right) \widehat{{\widetilde{h}}}_{2,3}\left( x\right) . \end{aligned}$$
(25)
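Eq. (25) can be checked numerically. The sketch below assumes that the gene-state switching matrix has the usual generator structure (off-diagonal entries \(\widehat{{\widetilde{h}}}_{i,j}\), diagonal entries equal to minus the row sums); this structure, the function names and the numerical values are assumptions made for illustration. Since the minors are \(2\times 2\), they are unchanged if the whole matrix is multiplied by \(-1\) or transposed.

```python
def generator(h12, h21, h23, h32):
    """3x3 switching matrix with h13 = h31 = 0 (assumed generator structure)."""
    return [
        [-h12, h12, 0.0],
        [h21, -(h21 + h23), h23],
        [0.0, h32, -h32],
    ]


def principal_minor(matrix, j):
    """Determinant of the 2x2 submatrix left after deleting row j and column j."""
    keep = [i for i in range(3) if i != j]
    a = matrix[keep[0]][keep[0]]
    b = matrix[keep[0]][keep[1]]
    c = matrix[keep[1]][keep[0]]
    d = matrix[keep[1]][keep[1]]
    return a * d - b * c


# Hypothetical rates evaluated at x = 0.5, following the pattern of Eq. (24).
x = 0.5
h12, h21, h23, h32 = 1.0, 3.0 * x**2, 0.2, 2.0 * x**2
H = generator(h12, h21, h23, h32)
minors = [principal_minor(H, j) for j in range(3)]
expected = [h21 * h32, h12 * h32, h12 * h23]
print(minors, expected)
```

Each minor reduces to a product of two rates because the chain \(1\leftrightarrow 2\leftrightarrow 3\) has no direct transition between states 1 and 3.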

Eq. (24) together with \(k_1>k_2>k_3\) describes a positive feedback regulation with proteins attaching to the gene as dimers.

Substituting Eqs. (24) and (25) into Eq. (23) gives

$$\begin{aligned} \widehat{{\widetilde{h}}}_{2,1}^c\widehat{{\widetilde{h}}}_{3,2}^c\left( k_1^0-\gamma x\right) x^4+\widehat{{\widetilde{h}}}_{1,2}^c\widehat{{\widetilde{h}}}_{3,2}^c\left( k_2^0-\gamma x\right) x^2+\widehat{{\widetilde{h}}}_{1,2}^c\widehat{{\widetilde{h}}}_{2,3}^c\left( k_3^0-\gamma x\right) =0.\nonumber \\ \end{aligned}$$
(26)

Defining \(u_1:=\widehat{{\widetilde{h}}}_{1,2}^c/\widehat{{\widetilde{h}}}_{2,1}^c\) and \(u_2:=\widehat{{\widetilde{h}}}_{2,3}^c/\widehat{{\widetilde{h}}}_{3,2}^c\), and dividing Eq. (26) by \(\widehat{{\widetilde{h}}}_{2,1}^c\widehat{{\widetilde{h}}}_{3,2}^c\), we obtain

$$\begin{aligned} f\left( x,u_1,u_2\right) :=\left( k_1^0-\gamma x\right) x^4+u_1\left( k_2^0-\gamma x\right) x^2+u_1u_2\left( k_3^0-\gamma x\right) =0. \end{aligned}$$

We plot a curve

$$\begin{aligned} \left\{ \begin{array}{c} f\left( x,u_1,u_2\right) =0, \\ \partial _x f\left( x,u_1,u_2\right) =0, \\ \end{array} \right. \end{aligned}$$
(27)

in the phase plane of \(u_1\) and \(1/u_2\) in Fig. 2a for \(k_2^0/k_1^0=0.1283\) and \(k_3^0/k_1^0=0.0078\) (recall that \(k_1^0\equiv \gamma \)).

Fig. 2

a Curve in the phase plane of \(u_1\) and \(1/u_2\) given by Eq. (27) with \(G=3\), \(k_2^0/k_1^0=0.1283\), \(k_3^0/k_1^0=0.0078\). As a function of x, the number of roots of \(f\left( x,u_1,u_2\right) =0\) changes with \(u_1\) and \(u_2\). In each region divided by the curve, we label the number of roots of \(f\left( x,u_1,u_2\right) =0\) (i.e. the fixed points of the DMFD). b Phase diagram with the same parameters as in a and \(u_1=0.3329\). It shows how the fixed points of the DMFD vary with \(u_2\). Dashed lines are the unstable fixed points, denoted by \(x^{\left( 1.5\right) }\) and \(x^{\left( 2.5\right) }\), while solid lines are the stable ones, denoted by \(x^{\left( 1\right) }\), \(x^{\left( 2\right) }\) and \(x^{\left( 3\right) }\). c The rate function \(\widehat{{\widetilde{\varPhi }}}^R\left( x\right) \) of the reduced CME model, and the Gaussian approximations around \(x^{\left( i\right) }\) with \(\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 1\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 2\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 3\right) }\right) =1:0.0056:0.0027\). d The rate function \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) of the FR model, and the Gaussian approximations around \(x^{\left( i\right) }\) with \(\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 1\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 2\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 3\right) }\right) =1:0.0163:0.0170\). In c, d, \(G=3\), \(k_2^0/k_1^0=0.1283\), \(k_3^0/k_1^0=0.0078\), \(u_1=0.3329\), \(u_2=0.0054\). In d, \(\widehat{{\widetilde{h}}}^c_{3,2}/\widehat{{\widetilde{h}}}^c_{2,1}=1\) and \(k_1^0\equiv \gamma =0.01\)

The number of roots in each region divided by the curve is labeled in Fig. 2a. In one region, \(f\left( x,u_1,u_2\right) =0\) has five roots. Fixing \(u_1=0.3329\), we plot \(1/u_2\) as a function of x using \(f\left( x,u_1,u_2\right) =0\) in Fig. 2b. At \(u_2=0.0054\) (vertical dotted line), \(f\left( x,u_1,u_2\right) =0\) has five roots \(x^{\left( 1\right) }\), \(x^{\left( 1.5\right) }\), \(x^{\left( 2\right) }\), \(x^{\left( 2.5\right) }\), \(x^{\left( 3\right) }\). These roots are the fixed points of the DMFD (an ODE). The three stable fixed points \(x^{\left( 1\right) }\), \(x^{\left( 2\right) }\) and \(x^{\left( 3\right) }\) are interlaced with the two unstable ones \(x^{\left( 1.5\right) }\) and \(x^{\left( 2.5\right) }\), which indicates that the FR model is tristable.
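The five fixed points can be located with a few lines of code. The sketch below is an illustration rather than the procedure used to produce Fig. 2: it divides \(f\) by \(\gamma \) (using \(k_1^0\equiv \gamma \)), scans a uniform grid for sign changes and refines each bracket by bisection. The grid size, the tolerance and the function names are assumptions. Stability is judged from the sign change of \(f\), assuming that the DMFD velocity has the same sign as \(f\), which is suggested by Eq. (23) when the determinants are positive. The roots are returned in increasing order, which need not coincide with the labelling \(x^{\left( i\right) }\) used in the text.

```python
K2, K3 = 0.1283, 0.0078   # k_2^0 / k_1^0 and k_3^0 / k_1^0, with k_1^0 = gamma
U1, U2 = 0.3329, 0.0054   # u_1 and u_2 used in Fig. 2b-d


def f(x, u1=U1, u2=U2):
    """f(x, u1, u2) divided by gamma, using k_1^0 = gamma."""
    return (1.0 - x) * x**4 + u1 * (K2 - x) * x**2 + u1 * u2 * (K3 - x)


def bisect(func, lo, hi, tol=1e-12):
    """Bisection on a bracket [lo, hi] over which func changes sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def fixed_points(n_grid=2000):
    """Roots of f between x = K3 and x = 1, found from sign changes on a grid."""
    lo, hi = K3, 1.0
    grid = [lo + (hi - lo) * i / n_grid for i in range(n_grid + 1)]
    return [bisect(f, a, b) for a, b in zip(grid, grid[1:]) if f(a) * f(b) < 0.0]


roots = fixed_points()
# Stable if f changes from positive to negative across the root (assumed criterion).
stable = [f(r - 1e-6) > 0.0 > f(r + 1e-6) for r in roots]
for r, s in zip(roots, stable):
    print(f"x = {r:.6f}  {'stable' if s else 'unstable'}")
```

Because \(f\) is a quintic in x, five sign changes on the grid account for all of its roots.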

4.2 FR model exhibits different local fluctuations from those of the reduced CME model

In our tristable example, \(\partial _x\widehat{{\widetilde{\varPhi }}}\left( x^{\left( i\right) }\right) =\partial _x\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( i\right) }\right) =0\) at the stable fixed point \(x^{\left( i\right) }\) of the DMFD by Theorem 2 and Eq. (16). Therefore, around \(x^{\left( i\right) }\), we have the Gaussian approximations

$$\begin{aligned}&\widehat{{\widetilde{\varPhi }}}\left( x\right) \approx \widehat{{\widetilde{\varPhi }}}\left( x^{\left( i\right) }\right) +\frac{\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( i\right) }\right) }{2}\left( x-x^{\left( i\right) }\right) ^2,\\&\widehat{{\widetilde{\varPhi }}}^R\left( x\right) \approx \widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( i\right) }\right) +\frac{\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( i\right) }\right) }{2}\left( x-x^{\left( i\right) }\right) ^2. \end{aligned}$$

The local fluctuations around the fixed point \(x^{\left( i\right) }\) for the FR model and the reduced CME model are indicated by \(\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( i\right) }\right) \) and \(\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( i\right) }\right) \), respectively. The larger the second derivative is, the weaker the local fluctuation will be, illustrated by the ratio \(\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 1\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 2\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 3\right) }\right) \) (or \(\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 1\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 2\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 3\right) }\right) \)) of the local fluctuations over all fixed points.

In Fig. 2c, we plot \(\widehat{{\widetilde{\varPhi }}}^R\left( x\right) \) of the tristable example in Sect. 4.1. Further fixing \(\widehat{{\widetilde{h}}}^c_{3,2}/\widehat{{\widetilde{h}}}^c_{2,1}=1\) and \(k_1^0\equiv \gamma =0.01\), we plot \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) of the same example in Fig. 2d. The Gaussian approximations given above around \(x^{\left( i\right) }\) are plotted as dashed lines in Fig. 2c, d. The local fluctuations of \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) are different from those of \(\widehat{{\widetilde{\varPhi }}}^R\left( x\right) \) even though the two models share the same DMFD. Actually, we have

$$\begin{aligned}&\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 1\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 2\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}^R\left( x^{\left( 3\right) }\right) =1:0.0056:0.0027,\\&\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 1\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 2\right) }\right) :\partial _x^2\widehat{{\widetilde{\varPhi }}}\left( x^{\left( 3\right) }\right) =1:0.0163:0.0170. \end{aligned}$$

Thus, relative to those at \(x^{\left( 1\right) }\), the local fluctuations at \(x^{\left( 2\right) }\) and \(x^{\left( 3\right) }\) are stronger in the rapid regime (the reduced CME model) of gene-state switchings than in the intermediate one (the FR model).

4.3 Transition rates between phenotypic states

The mean first-passage time \(T_{ij}^c\) from \(x^{\left( i\right) }\) to \(x^{\left( j\right) }\) of the full CME model is numerically calculated by the conventional first-passage theory (Redner 2007). In Fig. 3, \(T_{ij}^c\) is plotted on a logarithmic scale as a function of \({\mathcal {H}}\), with the markers \(\circ \), \(\times \), \(+\) corresponding to different \(n_{\max }\).

\(T_{ij}^c\) is insensitive to \(n_{\max }\) in the intermediate region \(n_{\max }k_i\gg {\mathcal {H}}\widehat{{\widetilde{h}}}_{i,j}\left( x\right) \gg \gamma \) of gene-state switchings. The FR model is simulated by the Doob–Gillespie method to obtain the mean first-passage time \(T_{ij}^f\) (Gillespie 1976, 1977). In Fig. 3, \(T_{ij}^f\) (solid line) agrees well with \(T_{ij}^c\) for different \(n_{\max }\) (\(\circ \), \(\times \), \(+\)).

Following the discussion of the links between large deviations and WKB for the PDMP (Bressloff and Faugeras 2017), the transition rate \(k_{ij}\) from \(x^{\left( i\right) }\) to \(x^{\left( j\right) }\) is approximated by the general Arrhenius form

$$\begin{aligned} k_{ij}\approx k_{ij}^0\exp \left( -{\mathcal {H}}\varDelta \widehat{{\widetilde{\varPhi }}}_{ij}\right) , \end{aligned}$$
(28)

where \(\varDelta \widehat{{\widetilde{\varPhi }}}_{ij}:=\widehat{{\widetilde{\varPhi }}}\left[ x^{\left( \frac{i+j}{2}\right) }\right] -\widehat{{\widetilde{\varPhi }}}\left[ x^{\left( i\right) }\right] \) is the barrier height from \(x^{\left( i\right) }\) to \(x^{\left( j\right) }\), and \(k_{ij}^0\) is a prefactor such that \(\lim _{{\mathcal {H}}\rightarrow \infty }-\frac{1}{{\mathcal {H}}}\log k_{ij}^0=0\). Substituting \(k_{ij}=1/T_{ij}^c\) into Eq. (28) gives

$$\begin{aligned} \log \left( T_{ij}^c\right) \approx \log \left( 1/k_{ij}^0\right) +{\mathcal {H}}\varDelta \widehat{{\widetilde{\varPhi }}}_{ij}. \end{aligned}$$
(29)

The linear relationship between \(\log \left( T_{ij}^c\right) \) and \({\mathcal {H}}\) in Eq. (29) is observed in Fig. 3. Fitting this relationship gives the slope \(\varDelta \widehat{{\widetilde{\varPhi }}}_{ij}\) for different \(n_{\max }\), which is plotted in the insets of Fig. 3 by dashed lines. The solid lines, which are calculated directly from \(\varDelta \widehat{{\widetilde{\varPhi }}}_{ij}:=\widehat{{\widetilde{\varPhi }}}\left[ x^{\left( \frac{i+j}{2}\right) }\right] -\widehat{{\widetilde{\varPhi }}}\left[ x^{\left( i\right) }\right] \), match the dashed lines. In conclusion, the LDP of the FR model correctly predicts the transition rates between phenotypic states in the intermediate region of gene-state switchings.
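The fit of Eq. (29) amounts to an ordinary least-squares line through the points \(\left( {\mathcal {H}},\log T_{ij}^c\right) \). The sketch below illustrates this on synthetic data; the barrier height 0.02, the prefactor 0.5 and the values of \({\mathcal {H}}\) are hypothetical and are not taken from Fig. 3, and the function name is an assumption.

```python
import math


def fit_barrier(h_values, mean_fpt):
    """Least-squares fit of log(T) = intercept + slope * H.

    Returns (slope, intercept); by Eq. (29) the slope estimates the barrier
    height and the intercept estimates log(1 / k0).
    """
    ys = [math.log(t) for t in mean_fpt]
    n = len(h_values)
    mean_h = sum(h_values) / n
    mean_y = sum(ys) / n
    sxx = sum((h - mean_h) ** 2 for h in h_values)
    sxy = sum((h - mean_h) * (y - mean_y) for h, y in zip(h_values, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_h


# Synthetic mean first-passage times generated from the Arrhenius form itself.
true_barrier, true_prefactor = 0.02, 0.5
hs = [100.0, 200.0, 300.0, 400.0]
ts = [math.exp(true_barrier * h) / true_prefactor for h in hs]

barrier, intercept = fit_barrier(hs, ts)
print(barrier, math.exp(-intercept))
```

With real data the prefactor \(k_{ij}^0\) may itself depend weakly on \({\mathcal {H}}\), so the fitted slope should be read as an estimate of the barrier height that becomes reliable only for large \({\mathcal {H}}\).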

Fig. 3

Logarithmic mean first-passage time of the full CME model as a function of \({\mathcal {H}}\) for different \(n_{\max }\) (\(\circ \), \(\times \), \(+\)), and that of the FR model (solid lines). \(k_1^0\equiv \gamma =0.01\), \(u_1=0.3329\), \(u_2=0.0054\), \(\widehat{{\widetilde{h}}}^c_{3,2}/\widehat{{\widetilde{h}}}^c_{2,1}=1\), \(k_1^0:k_2^0:k_3^0=1:0.1283:0.0078\). The insets show the barrier heights \(\varDelta \widehat{{\widetilde{\varPhi }}}_{ij}\) calculated by fitting Eq. (29) for different \(n_{\max }\) (dashed lines), and those directly calculated by \(\varDelta \widehat{{\widetilde{\varPhi }}}_{ij} \,{=}\, \widehat{{\widetilde{\varPhi }}}\left[ x^{\left( \frac{i+j}{2}\right) }\right] -\widehat{{\widetilde{\varPhi }}}\left[ x^{\left( i\right) }\right] \) (solid lines)

4.4 Comparison with the stationary distribution of the full CME model

Using the same parameters as in Fig. 2d, we plot the stationary distribution \(p^{ss}_c\left( x\right) \) (\(x=n/n_{\max }\)) of the full CME model for \(n_{\max }=1{,}000{,}000\) and \({\mathcal {H}}=50\) in Fig. 4a. As expected, \(-\frac{1}{{\mathcal {H}}}\log p^{ss}_c\left( x\right) \) (solid line) and \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) (dashed line) are quite close (Fig. 4b).

Fig. 4

Parameters are the same as in Fig. 2d. a Stationary distribution \(p^{ss}_c\left( x\right) \) (\(x=n/n_{\max }\)) of the full CME model for \(n_{\max }=1{,}000{,}000\) and \({\mathcal {H}}=50\). b \(-\frac{1}{{\mathcal {H}}}\log p^{ss}_c\left( x\right) \) (solid line) and \(\widehat{{\widetilde{\varPhi }}}\left( x\right) \) obtained by our numerical algorithm (dashed line)
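The comparison in Fig. 4b relies on the transformation \(-\frac{1}{{\mathcal {H}}}\log p^{ss}_c\left( x\right) \). The sketch below shows this transformation on a synthetic distribution; the double-well `toy_phi`, the grid and the function names are assumptions for illustration, and shifting the result so that its minimum is zero is a convention chosen here to remove the normalisation constant, not necessarily the one used for the figure.

```python
import math


def empirical_rate_function(p_ss, H):
    """Return -(1/H) * log(p) for each entry, shifted so that the minimum is zero."""
    values = [-math.log(p) / H for p in p_ss]
    offset = min(values)
    return [v - offset for v in values]


def toy_phi(x):
    """Hypothetical double-well rate function with minima at x = 0.2 and x = 0.8."""
    return (x - 0.2) ** 2 * (x - 0.8) ** 2


H = 50.0
xs = [i / 100 for i in range(101)]
weights = [math.exp(-H * toy_phi(x)) for x in xs]
total = sum(weights)
p_ss = [w / total for w in weights]  # synthetic stationary distribution

recovered = empirical_rate_function(p_ss, H)
max_error = max(abs(r - toy_phi(x)) for r, x in zip(recovered, xs))
print(max_error)
```

In this synthetic case the rate function is recovered up to rounding error because the distribution was built from it exactly; for the full CME model the two curves agree only asymptotically in \({\mathcal {H}}\).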

5 Conclusions and remarks

A living cell usually has multiple phenotypic states to cope with fluctuating environments (Kussell and Leibler 2005; Acar et al. 2008). The FR model proposed in Ge et al. (2015) for single-cell dynamics with two gene states quantitatively describes the stabilities of the phenotypic states and the transition rates among them in the intermediate regime, which is the case in E. coli but has rarely been studied before. Considering that genes often have more than two states due to the combinatorial nature of transcriptional regulations (Zhu et al. 2005), we generalise the FR model, together with the LDP rate function of the steady state, to the scenario with more than two gene states. We find that the derivative of the rate function is just the unique nontrivial DGE of the negative gene-state transition-rate matrix on the protein birth–death matrix.

Given that several GEs exist, a method that determines the rate function from GEs is necessary. Under appropriate assumptions, we prove that the nontrivial DGE is continuous, the derivative of the rate function equals the nontrivial DGE, and the rate function satisfies the Lyapunov property of the DMFD. The numerical results of a tristable example support the equivalence of the nontrivial DGE and the LDP rate function.

The result in Corollary 1 is interesting because under Assumption 3 the matrix \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) is discontinuous at each \(x_i^*\) but always has a continuous nontrivial real eigenvalue \(\mu (x)\) in the whole interval \((x_G^*,x_1^*)\). In contrast, Example 1 shows that even if \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) is continuous in \((x_G^*,x_1^*)\), it may have no continuous nontrivial real eigenvalue on the whole interval.

Example 1

Let \(x_i^*=\left( 5-i\right) /4\) for \(i\in \left[ 1,4\right] \), \(\gamma =1\),

$$\begin{aligned}&J:=\left( \begin{matrix} 0 &{}\quad 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 0 &{}\quad 1 &{}\quad 0 \\ 0 &{}\quad 1 &{}\quad 0 &{}\quad 0 \\ 1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \end{matrix}\right) ,\qquad F:=\left( \begin{matrix} -\,11 &{}\quad 8 &{}\quad 0 &{}\quad 8 \\ 4 &{}\quad -\,16 &{}\quad 10 &{}\quad 1 \\ 5 &{}\quad 4 &{}\quad -\,17 &{}\quad 0 \\ 2 &{}\quad 4 &{}\quad 7 &{}\quad -\,9 \end{matrix}\right) ,\\&E_2\left( x\right) :=\left( \begin{matrix} 1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 64\left( x-3/4\right) ^2 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 1 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 1 \end{matrix}\right) ,\qquad E_3\left( x\right) :=\left( \begin{matrix} 1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 64\left( x-1/2\right) ^2 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 1 \end{matrix}\right) ,\\&\widehat{{\widetilde{H}}}\left( x\right) =\left\{ \begin{array}{ll} F|A|\left( 3/8\right) , &{} x\in \left( 0,3/8\right] , \\ F|A|\left( 3/8\right) E_3\left( x\right) , &{} x\in \left( 3/8,7/12\right] , \\ 12\left( 2/3-x\right) F|A|\left( 3/8\right) E_3\left( 7/12\right) &{} \\ +12\left( x-7/12\right) JFJ|A|\left( 7/8\right) E_2\left( 2/3\right) , &{} x\in \left( 7/12,2/3\right] , \\ JFJ|A|\left( 7/8\right) E_2\left( x\right) , &{} x\in \left( 2/3,7/8\right] , \\ JFJ|A|\left( 7/8\right) , &{} x\in \left( 7/8,+\infty \right) . \end{array}\right. \end{aligned}$$

It is not hard to see that the limit of \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) at each \(x_i^*\) is finite, hence the matrix \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) is continuous in the whole \(\left( x_4^*,x_1^*\right) \). The eigenvalues of \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) given in Fig. 5a–c are also continuous in \(\left( x_4^*,x_1^*\right) \) as expected, but none of them is always real.

Fig. 5

a–c GEs of \(\widehat{{\widetilde{H}}}\left( x\right) \) on \(A\left( x\right) \), or equivalently, eigenvalues of \(\widehat{{\widetilde{H}}}\left( x\right) A^{-1}\left( x\right) \) in Example 1 for \(x\in \left( x_4^*,x_1^*\right) \). a, b Real and imaginary parts of GEs. c Corresponding 3D plot

The realness of the nontrivial DGE is guaranteed by Definition A1, whereas nondominant GEs may be complex. The nontrivial DGE \(\mu \left( x\right) \) of \(\widehat{{\widetilde{H}}}\left( x\right) \) on \(A\left( x\right) \) indicated by red lines in Fig. 5b, c, which would be continuous if Assumption 3 held, is now discontinuous at \(x_3^*\) and \(x_2^*\) because \(\widehat{{\widetilde{H}}}\left( x\right) \) becomes reducible.

In this paper, we have assumed a linear A(x) in Eq. (10). However, this is not necessary. In general, denote the ith diagonal element of A(x) by \(A_i(x)\). If no two diagonal elements \(A_i(x)\) vanish at the same x, then the proof of Theorem 1 remains valid after simply replacing \(\gamma (x_i^*-x)\) by \(A_i(x)\). With slight modifications, the other main results of this paper still hold.

So far, we have considered only the one-dimensional case (a single self-regulatory gene). For higher-dimensional cases, Eq. (12) becomes \(\det \left[ \widehat{{\widetilde{H}}}\left( {{\varvec{x}}}\right) -{\mathbb {F}}\left( {{\varvec{x}}}\right) \right] =0\), where

$$\begin{aligned} {\mathbb {F}}\left( {{\varvec{x}}}\right) :=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} \nabla \widehat{{\widetilde{\varPhi }}}\left( {{\varvec{x}}}\right) \cdot {\mathcal {F}}_1\left( {{\varvec{x}}}\right) &{} 0 &{} \cdots &{} 0 \\ 0 &{} \nabla \widehat{{\widetilde{\varPhi }}}\left( {{\varvec{x}}}\right) \cdot {\mathcal {F}}_2\left( {{\varvec{x}}}\right) &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} 0 &{} \nabla \widehat{{\widetilde{\varPhi }}}\left( {{\varvec{x}}}\right) \cdot {\mathcal {F}}_G\left( {{\varvec{x}}}\right) \\ \end{array} \right) , \end{aligned}$$
(30)

and \({\mathcal {F}}_i\left( {{\varvec{x}}}\right) \) is the vector field at gene state i. To apply the methods in this paper to higher-dimensional cases, many concepts must be generalised since the derivative \(\nabla \widehat{{\widetilde{\varPhi }}}\left( {{\varvec{x}}}\right) \) is now a vector rather than a scalar. The most crucial step should be to properly distinguish \(\nabla \widehat{{\widetilde{\varPhi }}}\left( {{\varvec{x}}}\right) \) from the other solutions of the underdetermined system \(\det \left[ \widehat{{\widetilde{H}}}\left( {{\varvec{x}}}\right) -{\mathbb {F}}\left( {{\varvec{x}}}\right) \right] =0\). In one-dimensional cases, it is the unique nontrivial DGE.