1 Introduction

Many mechanical systems are subjected to noisy or stochastic forcing inputs, or have inherent noise [1, 2]. For nonlinear systems where the response is highly sensitive to initial conditions, low-intensity noise can have dramatic effects on the system behavior [3, 4]. Exploring the dynamics of nonlinear systems subjected to noise is of interest for various purposes and applications, including the design of wide-band nonlinear energy harvesters [5, 6], mechanical reliability of turbo-machinery blades [7], and noise-induced synchronization between nonlinear oscillators [8, 9]. Statistical jump rates around a subcritical pitchfork bifurcation have been investigated for a bifurcation-based MEMS sensor device [10, 11]. In a nonlinear system, noise can induce transitions between stable solutions [12], drive the response to a chaotic attractor [13], and cause trajectories to jump between basins of attraction [14]. This powerful influence of noise has not only led to studies on noise reduction in mechanical systems [15], but has also inspired studies that make use of noise to alter a system’s response. Examples include using noise to drive a Duffing oscillator response from a high-amplitude attractor to a low-amplitude attractor [16], and suppressing energy localizations in a circular oscillator array with noise [17].

Predicting noise-induced transitions has been of interest to various efforts across different disciplines, such as population dynamics [18], vegetation ecosystems [19], biological information processing [20], and chemical reactions under external perturbations [21]. To study noise-influenced dynamics of nonlinear systems, numerical techniques such as the Euler–Maruyama integration scheme [22] and the Path Integral Method for solving Fokker–Planck equations [23] are widely used. Numerical schemes often involve discretization of the transition probabilities over the phase space [24]. Ulam’s method is widely used to calculate escape rates in nonlinear systems [25]. Hsu established a representation of cell-to-cell mapping by using Markov chains to study transition probability densities [26]. Sun et al. [27] also used the generalized cell mapping method to calculate transitions over short periods of time, for which the transition probabilities can be approximated with a Gaussian distribution.

In a nonlinear system, estimating the transition probabilities and the average escape times from basins allows us to quantify the effects of noise on the system response [28]. Trajectories leaving a basin of attraction due to noise can end up in another basin of attraction, or diverge to infinity [29]. One way to find the rates of these transitions and the escape times from one basin to another under low-intensity noise is to simulate individual trajectories under noise by using numerical integration schemes such as the Euler–Maruyama technique [30], and to average the resulting escape times. However, for low noise intensities, the escape times are long, and running Monte Carlo simulations on hundreds of trajectories until they escape the basin is computationally expensive, and in some cases, impractical. Neglecting escape rates due to low-intensity noise may result in an inaccurate assessment of the dynamics of a nonlinear system. It is therefore important to develop a technique for estimating escape times that is applicable to systems subjected to low-intensity noise.

Here, we outline an approach to estimate escape rates and escape times of responses when the dynamical system is subjected to Gaussian noise. This approach is applicable to systems under low-intensity noise where “small” escape rates, and “long” escape times occur. We estimate escape times up to \(10^{13}\) periods without relying on computationally expensive simulations. The approach can be summarized as follows: First, we divide the solution domain into a grid, and calculate transition probability maps between grid points for only one period. This can be done by using a numerical technique such as the Euler–Maruyama or Path Integral Method. Then, we iterate this map until we find a constant escape rate from a basin of attraction. We use this escape rate to estimate an average escape time. We demonstrate this procedure on a one-dimensional nonlinear map with a single attractor, namely a cubic map, and then extend it to a two-dimensional continuous system with two attractors, a non-autonomous Duffing oscillator system.

2 Escape time estimations for a cubic map under noise

We study dynamical systems that have an attractor A with a basin of attraction (BoA) B. In the presence of noise with a normal distribution and standard deviation \(\sigma \), every trajectory will eventually escape from B [31]. Our goal is to determine the mean escape time and how it depends on \(\sigma \), for low \(\sigma \) values and long escape times. For this system, the mean escape times are estimated to be longer than 1000 periods for noise intensities \(\sigma \le 0.2\), and we refer to these noise levels as low-intensity noise throughout this section.

We begin by considering the cubic map with the equation

$$\begin{aligned} x_{n+1} = ax_n + (1-a)x_n^3. \end{aligned}$$
(1)

The system has an attractor at \(x = 0\) with the basin of attraction (\(-1,1\)), and two repellers (unstable fixed points) \(x = \pm 1\) (see Fig. 1).

Fig. 1
figure 1

The cubic map with \(a = 0.6\). (See Eq.(1)). There is an attracting fixed point at \(x = 0\) and repelling fixed points at \(x = \pm 1\). The basin of attraction (BoA) for \(x = 0\) is (-1,1)

The system we consider is a stochastically perturbed cubic map with the equation:

$$\begin{aligned} x_{n+1} = ax_n + (1-a)x_n^3 + \sigma \eta _{n}, \end{aligned}$$
(2)

where \(\eta _{n}\) is a random number chosen from a normal distribution with zero mean and standard deviation 1. The noise added to the system causes trajectories to eventually escape from the BoA \((-1,1)\), for an arbitrarily small \(\sigma > 0\). However, trajectories can cross the basin’s boundary at \(\pm 1\) and then almost immediately return. We consider the escape time as the time it takes to finally leave the region and not return. When a trajectory is far from \([-1, 1]\), the dynamics of the map (Eq. (1)) strongly force the trajectory to move away from the region. As a result, we compute the escape time from a larger interval \([-L, L]\) rather than \([-1, 1]\). Escape times as a function of the noise intensity are shown in Fig. 2. We also used Monte Carlo simulations to validate the estimated escape times for large noise intensities (\(\sigma \ge 0.2\)). Although the escape times depend on the interval width (L) when \(L-1 < \sigma \), the escape time results in Fig. 2 converge for \(L-1 \gg \sigma \).
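For large noise intensities, the mean escape time from \([-L, L]\) can be estimated directly by iterating Eq. (2) in a Monte Carlo fashion. A minimal sketch is given below; the function name, trajectory count, and iteration cap are our own choices, not specified in the paper.

```python
import numpy as np

def mean_escape_time_mc(a=0.6, sigma=0.5, L=1.5, n_traj=500,
                        max_iter=1_000_000, seed=0):
    """Monte Carlo estimate of the mean escape time from [-L, L] for the
    noisy cubic map x_{n+1} = a*x_n + (1 - a)*x_n**3 + sigma*eta_n.
    Only practical for large sigma, where escapes happen quickly."""
    rng = np.random.default_rng(seed)
    times = []
    for _ in range(n_traj):
        x = 0.0  # start each trajectory at the attractor x = 0
        for n in range(1, max_iter + 1):
            x = a * x + (1 - a) * x**3 + sigma * rng.standard_normal()
            if abs(x) > L:          # left the enlarged interval [-L, L]
                times.append(n)
                break
    return float(np.mean(times))
```

For small \(\sigma \) the inner loop would run for an impractically long time, which is precisely the limitation that motivates the eigenvalue-based estimate developed in this section.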

Fig. 2
figure 2

Estimated escape time vs \(\sigma \) (noise intensity) for the cubic map. For large-intensity noise (\(\sigma \ge 0.2\)), we run Monte Carlo simulations for validation. For \(\sigma < 0.08\), the escape rates go below \(10^{-16}\), and we hit a computational limit due to the floating point resolution we use. Here, \(a = 0.6\), \(L = 1.5\), the grid size \(g = 0.002\), and \(N = 1500\)

When \(\sigma \) is large, the average escape time can be determined through Monte Carlo simulations, in which solution trajectories are iterated under different noise samples and the escape times are averaged. However, for small \(\sigma \), the average escape times are very long, and computing them by integrating the system for long periods of time may not be practical.

As an alternative, we discretize the probability distribution by dividing the interval into a uniform grid which extends from -L to L, where \(L > 1\), and which has size g. For a function \(\pi \) on the grid, we write \(\text {sum}(\pi )= \sum _k\pi (k)\), where k is the grid number (\(k = 1, 2, \ldots ,\frac{2L}{g}\)).

We represent the probabilities of position \(x_n\) as a probability distribution \(\pi _n\), which depends upon time n. We start with an initial distribution \(\pi _0(k)\), where \(\text {sum}(\pi _{0})= 1.\) The probability of being inside the interval \([-L, L]\) at time n is

$$\begin{aligned} \text {sum}(\pi _{n})= \sum _k\pi _{n}(k) \le 1. \end{aligned}$$
(3)

2.1 The map P on probability densities \(\pi _n\)

We write \(P_{jk}\) for the probability that a point x in grid interval j will move to grid interval k in one time step, applying Eq. (2) once. To simplify the notation, we ignore the effect of the rather small variation of the probability within each grid box, since \(g \ll \sigma \). This P satisfies

$$\begin{aligned} \sum _kP_{jk} \le 1 \text { for each } j, \end{aligned}$$
(4)

and

$$\begin{aligned} \pi _{n+1} = P\pi _n \end{aligned}$$
(5)

Let \(\phi (y, \mu , \sigma )\) be the cumulative distribution function for the normal distribution with standard deviation \(\sigma \) and mean \(\mu \). Specifically, the probability that the point x at the center of grid box j maps to an interval (b,c) is \(\phi (c,f(x),\sigma ) - \phi (b,f(x),\sigma )\), which can be represented in terms of the normal distribution as

$$\begin{aligned}&\int _{-\infty }^c \left( \frac{1}{\sigma \sqrt{2\pi }}e^{-0.5\left( \frac{\xi -f(x)}{\sigma }\right) ^2}\right) d\xi \\&-\int _{-\infty }^b \left( \frac{1}{\sigma \sqrt{2\pi }}e^{-0.5\left( \frac{\xi -f(x)}{\sigma }\right) ^2}\right) d\xi , \end{aligned}$$

where \(f(x) = ax + (1-a)x^3\).

If x is the center of grid box j, then x maps to f(x), and the points in grid box j map to a normal probability distribution with center f(x) and standard deviation \(\sigma \). Then, we can calculate

$$\begin{aligned} P_{jk} = \phi (kg - L,f(x(j)),\sigma ) - \phi ((k-1)g - L,f(x(j)),\sigma ), \end{aligned}$$
(6)

where \(j, k = 1, 2, \ldots ,\frac{2L}{g}\). For example, for \(L = 1.5\) and \(g = 0.05\), a trajectory starting at the midpoint of grid \(j = 40\) (\(x_j = 0.475\)) is mapped to \(f(x_j) = 0.328\) in the next iteration in the absence of noise. In the presence of noise, the probability distribution is centered at \(f(x_j) = 0.328\) (see Fig. 3). Then, to find the probability in grid k, we can use Eq. (6). For example, for a system under noise with intensity \(\sigma = 0.5\), the probability of a trajectory starting at \(x_{40}\) being mapped into grid \(k = 40\) can be found as

$$\begin{aligned} P_{40,40} = \phi (0.50,0.328,0.5) -\phi (0.45,0.328,0.5). \end{aligned}$$
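The construction of P via Eq. (6) can be sketched as follows. This is our own illustrative implementation (the helper names are not from the paper); the normal CDF \(\phi \) is built from `math.erf`.

```python
import math
import numpy as np

def phi(y, mu, sigma):
    """CDF of the normal distribution with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def transition_matrix(a=0.6, sigma=0.5, L=1.5, g=0.05):
    """P[j, k]: probability that the midpoint of grid box j is mapped into
    grid box k in one step of Eq. (2). Indices are 0-based here, so the
    paper's box j = 40 is row 39; box k covers (-L + k*g, -L + (k+1)*g)."""
    N = round(2 * L / g)
    P = np.zeros((N, N))
    for j in range(N):
        x = -L + (j + 0.5) * g           # midpoint of grid box j
        fx = a * x + (1 - a) * x**3      # deterministic image f(x)
        for k in range(N):
            P[j, k] = (phi(-L + (k + 1) * g, fx, sigma)
                       - phi(-L + k * g, fx, sigma))
    return P

P = transition_matrix()
print(P[39, 39])   # the paper's P_{40,40}, approximately 0.038
```

Each row of P sums to less than 1, since the probability that falls outside \([-L, L]\) is not assigned to any grid box.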
Fig. 3
figure 3

Probability distribution for \(j = 40\). The distribution has a mean \(f(x_{40}) = 0.328\). The black region shows the probability \(P_{40,40}\). Here, \(L = 1.5\) and \(g = 0.05\), and \(\sigma = 0.5\)

The probability mapping matrices (P) for \(\sigma = 0.1\) and \(\sigma = 0.5\) are shown in Fig. 4. For a solution inside grid j at time n, the probability of getting mapped into the grid k at the next time step (\(n + 1\)) is visualized with the color map.

Fig. 4
figure 4

P matrices for \(\sigma = 0.1\) and \(\sigma = 0.5\). The color map shows \(P_{jk}\), the probability of a solution in grid j getting mapped into grid k. Here, \(L = 1.5\) and \(g = 0.01\), and the interval is divided into 300 grid boxes

2.2 Finding the largest eigenvalue and its eigenvector of P

The matrix P has an eigenvector \(\lambda \) where for each grid box k, \(\lambda (k)\ge 0\), and \(P\lambda = \alpha \lambda \) where \(1\ge \alpha >0\). This eigenvector is dominant, and the associated eigenvalue is the largest according to the Frobenius–Perron Theorem [32, 33]. Inequality (4) implies \(\alpha \le 1\); since points can escape the region \([-L, L]\), we in fact have \(\alpha < 1\). For \(\alpha < 1\), the trajectories keep escaping the region at a constant rate, and the normalized distribution of the solutions staying in the region converges to a constant distribution, which is the eigenvector associated with \(\alpha \).

For a normalized eigenvector where \(\text {sum}(\lambda ) = 1\), the probability distribution at the next iteration satisfies \(\text {sum}(P\lambda ) = \text {sum}(\alpha \lambda ) = \alpha \text {sum}(\lambda ) = \alpha \). Therefore, the asymptotic (large n) rate of escape from \([-L, L]\) is \(1-\alpha \), and at each step \((1 - \alpha )S_j\) of the trajectories escape, where \(S_j\) is the total number of trajectories inside the region at time j.

For a system that initially has \(S_0\) trajectories in the \([-L, L]\) region, \((1-\alpha )S_0\) trajectories escape after one period, and \(S_1 = \alpha S_0\) remain in the region. In the second period, \((1-\alpha )S_1\) trajectories escape, and \(S_2 = \alpha S_1 = \alpha ^2 S_0\) remain in the region. As such, at the \(j^{th}\) period, \((1-\alpha )\alpha ^{(j-1)}S_0\) solutions escape, and \(S_j = \alpha ^j S_0\) remain in the region. To calculate the mean escape time, we multiply the number of solutions escaping in each period by the number of periods passed until that time, sum these products, and divide by the total number of solutions \(S_0\):

$$\begin{aligned} t_\textrm{esc} = \frac{1}{S_0}\sum _{j=1}^{\infty }{j(1-\alpha )\alpha ^{(j-1)}S_0} = (1-\alpha )\sum _{j=1}^{\infty }{j\alpha ^{(j-1)}} = \frac{1 - \alpha }{(1 - \alpha )^2} = \frac{1}{1-\alpha }. \end{aligned}$$
(7)

The infinite sum in Eq. (7) converges to \(\frac{1}{(1-\alpha )^2}\) for \(0< \alpha < 1\), and we can refer to the mean “escape time” as \(\frac{1}{ 1-\alpha }\).
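The closed form in Eq. (7) can be checked numerically by truncating the infinite sum at a large index; the value of \(\alpha \) below is arbitrary and chosen for illustration only.

```python
# Mean of the escape-time distribution: sum_j j*(1-alpha)*alpha^(j-1).
# Truncating at a large index approximates the closed form 1/(1-alpha).
alpha = 0.999                      # illustrative value, not from the paper
truncated = sum(j * (1 - alpha) * alpha**(j - 1) for j in range(1, 200_000))
print(truncated, 1.0 / (1.0 - alpha))   # both are close to 1000
```

The truncation error is negligible here because \(\alpha ^{200{,}000}\) underflows to zero long before the cutoff.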

Both the matrix P and \(\alpha \) depend on \(\sigma ,L\), and g. For a fixed noise intensity \(\sigma \), \(\alpha \) converges to a number \(\alpha (\sigma )\) as L increases and g decreases. Our goal is to establish the limiting value of \(\alpha = \alpha (\sigma )\). We first compute the eigenvector \(\lambda \), and show its convergence.

To compute the positive eigenvector of P, we obtain an upper and a lower bound. We start with two initial distributions: a constant initial probability distribution \(\gamma \), where \(\gamma (k)\) is constant for all k, and an initial “spike distribution” \(\delta \), which has probability 0 in every grid box except the one containing \(x = 0\), where all of the probability is concentrated.

Let \(\lambda \) be the normalized eigenvector associated with the dominant eigenvalue \(\alpha \), such that

$$\begin{aligned} P\lambda = \alpha \lambda \text { and }\text {sum}(\lambda ) = 1. \end{aligned}$$

By the Frobenius–Perron Theorem, as \(n\rightarrow \infty \) the “normalized” densities converge:

$$\begin{aligned} P^n_{1}(\gamma ):=&\frac{P^n(\gamma )}{\text {sum}(P^n(\gamma ))}\rightarrow \lambda \text { and }\end{aligned}$$
(8)
$$\begin{aligned} P^n_{1}(\delta ):=&\frac{P^n(\delta )}{\text {sum}(P^n(\delta ))}\rightarrow \lambda \end{aligned}$$
(9)

The associated eigenvalue \(\alpha \) can be estimated as \(\alpha = \text {sum}(P\lambda )\). For both the uniform (\(\gamma \)) and spike (\(\delta \)) initial distributions, and for \(\sigma \ge 0.09\), the normalized distributions are effectively the same, with a difference

$$\begin{aligned} |P^{2048}_{1}(\gamma )(k)-P^{2048}_{1}(\delta )(k)| <10^{-29} \end{aligned}$$

after 2048 iterates. Figure 5 shows how the densities converge. There is an equilibrium distribution \(\lambda \): \(P^n_{1}(\delta )\) converges to \(\lambda \) from below, while \(P^n_{1}(\gamma )\) converges from above. They all have the same sums since they are all normalized. The spike (\(\delta \)) distribution converges to \(\lambda \) faster than the uniform (\(\gamma \)) distribution. For lower \(\sigma \) values, the probability densities converge more slowly. To speed up the computation of powers of the matrix P, we compute only \(P^{2^M}\), that is, \(P^{2} = PP\), \(P^{4} = P^{2}P^{2}\), and so on up to \(P^{2048}\). In Fig. 5, we only plot the distributions for our smallest \(\sigma \) value, \(\sigma = 0.09\), because its convergence is slower compared with larger \(\sigma \) values, and the number of iterations used for its convergence is sufficient for those associated with larger \(\sigma \).
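The repeated-squaring iteration can be sketched as below. To keep the example fast we use a coarser grid (\(g = 0.01\), \(\sigma = 0.2\)) than the figures; the builder follows Eq. (6), probabilities are propagated as row vectors, and all names are our own.

```python
import math
import numpy as np

def build_P(a=0.6, sigma=0.2, L=1.5, g=0.01):
    """One-step transition matrix of Eq. (6) on an N = 2L/g grid."""
    N = round(2 * L / g)
    edges = -L + g * np.arange(N + 1)                  # grid box edges
    x = -L + (np.arange(N) + 0.5) * g                  # box midpoints
    fx = a * x + (1 - a) * x**3
    z = (edges[None, :] - fx[:, None]) / (sigma * math.sqrt(2.0))
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z))      # normal CDF at edges
    return cdf[:, 1:] - cdf[:, :-1]                    # P[j, k]

P = build_P()
N = P.shape[0]
gamma = np.full(N, 1.0 / N)        # uniform initial distribution
PM = P.copy()
for _ in range(11):                # P^2, P^4, ..., P^2048 by squaring
    PM = PM @ PM
# propagate the row vector: pi_n(k) = sum_j pi_0(j) * (P^n)[j, k]
pi = gamma @ PM                    # distribution after 2048 iterations
lam = pi / pi.sum()                # normalized: approximates the eigenvector
alpha = (lam @ P).sum()            # dominant eigenvalue estimate
print(alpha, 1.0 / (1.0 - alpha))  # escape rate 1 - alpha, escape time
```

Eleven squarings replace 2048 matrix-vector iterations, which is the speed-up described above.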

Fig. 5
figure 5

The normalized probability distribution \(P^N_{1}(\delta )\) and \(P^N_{1}(\gamma )\) (see (8),(9)) after various numbers of iterations for both spike (\(\delta \)) and uniform (\(\gamma \)) initial distributions. Here, \(\sigma = 0.09\), \(L = 1.5\), and \(g = 0.002\)

In order to find a suitable number of grid boxes \(N = 2L/g\) and interval boundary L for accurate escape time calculations, we compute the estimated escape times by using various N and L values, and calculate the fractional errors in the escape times (see Figs. 6 and 7). For the noise intensities tested (\(\sigma = 0.1\), \(\sigma = 0.2\), and \(\sigma = 0.5\)), the escape times converge around \(L = 1.5\) and \(N = 1200\). The fractional error in escape time (\(t_\textrm{esc}\)) is calculated by using

$$\begin{aligned} \textrm{FE}(L) = \frac{t_\textrm{esc}(L) - t_\textrm{esc}(L_\textrm{max})}{t_\textrm{esc}(L_\textrm{max})}, \end{aligned}$$

where \(L_\textrm{max} = 1.7\) for Fig. 6 and

$$\begin{aligned} \textrm{FE}(N) = \frac{t_\textrm{esc}(N) - t_\textrm{esc}(N_\textrm{max})}{t_\textrm{esc}(N_\textrm{max})}, \end{aligned}$$

where \(N_\textrm{max} = 1500\) for Fig. 7. The “ideal” interval length (L) and the number of grid boxes (\(N = 2L/g\)) depend on the system parameters and the noise intensities used. For \(N = 1500\), and \(L = 1.7\), the eigenvalues \(\alpha \) are listed in Table 1 for various noise intensities (\(\sigma \)).

Fig. 6
figure 6

Fractional error in escape time vs L, plotted for \(\sigma = 0.1,\,0.2,\,0.5\) and \(N = 1500\): Errors are calculated relative to the \(L_{max} = 1.7\) case. For smaller \(\sigma \) values, a smaller L is sufficient. However for larger \(\sigma \) values, larger L is required to accurately calculate the escape time

Fig. 7
figure 7

Fractional error in escape time vs g, plotted for \(\sigma = 0.1,\,0.2,\,0.5\) and \(L = 1.5\): Errors are calculated relative to the \(N_{max} = 1500\) case. For larger \(\sigma \) values, a smaller number of grid boxes N is sufficient. However for smaller \(\sigma \) values, a finer grid (i.e., larger N) is required to accurately calculate the escape time

3 Escape time estimations for a Duffing oscillator under noise

We extend the approach to a forced, hardening Duffing oscillator influenced by noise with the stochastic differential equation

$$\begin{aligned} \ddot{y} + \delta _c \dot{y} + \alpha _1 y + \beta y^3 = F \cos (\omega t) + \sigma \dot{W}, \end{aligned}$$
(10)

where \(\sigma \) is the intensity of an additive Gaussian noise, W(t) is a Wiener process, and \(\dot{W}(t)\) is a representation of its time derivative. The parameters used for this study are \(\delta _c = 0.1\), \(\alpha _1 = 1\), \(\beta = 0.3\), \(F = 0.4\), \(\omega = 1.4\). This is a non-autonomous, nondeterministic, weakly damped, nonlinear system, and analyzing the stochastic behavior of such systems is an active area of research [34, 35]. For \(\sigma = 0\), the system is deterministic and has at least two attracting stable periodic solutions and one unstable periodic solution of saddle type. These solutions and their respective basins of attraction are shown as fixed points of a Poincaré section in Fig. 8, wherein the Poincaré section is sampled once per period, \(T = 2\pi / \omega \).

Table 1 The largest eigenvalues of the P matrices for various \(\sigma \) values
Fig. 8
figure 8

Equilibrium points of the hardening Duffing equation and the basins of attraction. The circle D with radius R = 0.5 is shown as the neighborhood of the low amplitude attractor. The escape rates are calculated from the HAA (\(A_0\)) to LAA (\(A_f\))

As shown in Fig. 8, there is a high-amplitude attractor (HAA) and a low-amplitude attractor (LAA).

Under the influence of noise, transitions can occur from one stable mode to another. Such a transition begins at \(A_0\), the location of the initial attractor, and ends at \(A_f\), the location of the final attractor. The transition paths of the trajectories moving between the two attractors pass through the saddle point on the boundary between the two basins [36].

In Sect. 2, we used L to identify a region extending a distance \(L-1\) beyond the basin of the initial attractor. For systems that escape to a final attractor (or attractors), it suffices to specify a domain D that completely surrounds the final attractor(s). For example, we define D as a circle around \(A_f\), with radius R. For small enough R, D is far enough from the basin boundary that solutions entering D have a high probability of remaining in D. Therefore, we define the escape time based on the time it takes for a solution to enter D.

The escape time \(t_{D}\) is the time it takes the system to first transition into D, defined as

$$\begin{aligned} t_{D} \equiv \inf \{{t > 0 \,|\, \varvec{x}(t) \in D, \varvec{x}(t_0) = A_0}\} \end{aligned}$$
(11)

Here, \(\varvec{x}(t)\) is the Poincaré section of \((y(t),\dot{y}(t))\) sampled once a period. The expected escape time is

$$\begin{aligned} \tau _{D} = E[t_{D}] \end{aligned}$$
(12)

When we estimate the escape time from the HAA to LAA, D is a circle that surrounds the LAA. For \(R = 0.5\), the circle D is shown in Fig. 8.

3.1 Computing the map P as a matrix for the Duffing oscillator

The probability transition matrix of this system is computed by integrating nonlinear moment differential equations derived by using the Fokker–Planck equation. This procedure is commonly used in the Path Integral Method to generate transition probability estimates. We note that this procedure requires a fine mesh when the noise intensity is small, and does not scale well with dimension, but it is adequate for this two-dimensional (2D) example. Furthermore, this procedure assumes that for short time intervals the transition probability density functions are Gaussian. Narayanan and Kumar and Kumar and Narayanan [37, 38] have provided a recipe for using a non-Gaussian expansion based on Hermite polynomial expansions, but for these parameters, and for the duration of a period of integration, the errors due to the Gaussian assumption were found to be small in prior work [39].

The moment differential equations for the forced Duffing oscillator are available in [39] in terms of the moments \({m}_{ij}\). The moment time derivatives, \(\dot{m}_{ij}\), are defined from

$$\begin{aligned} \dot{m}_{ij} = \dot{E}[ x^i\dot{x}^j ] = \int _{{\mathbb {R}}^2} x^i \dot{x}^j \frac{\partial p(x,\dot{x},t)}{\partial t}dx d\dot{x} \end{aligned}$$
(13)

and \(\frac{\partial p(x,\dot{x},t)}{\partial t}\) can be derived from the Fokker–Planck equation. Since a Gaussian assumption was applied, the nonlinear moment differential equations were closed using Gaussian Moment Closure as described in [39]. Integrating the moment equations forward in time, one can define forward mapped means, \(\varvec{\mu }(t,\varvec{x}_0,t_0)\), and a covariance matrix \(\varvec{K}(t,\varvec{x}_0,t_0)\) of a Gaussian transition probability density function

$$\begin{aligned} \begin{aligned}&q(\varvec{x},t\,|\,\varvec{x}_0,t_o) \\&\quad =\frac{1}{2 \pi \sqrt{|\varvec{K}|}} \, \textrm{exp}\left( -\frac{1}{2}(\varvec{x}-\varvec{\mu })\varvec{K}^{-1}(\varvec{x}-\varvec{\mu })^T\right) \end{aligned} \end{aligned}$$
(14)

Here, \(\varvec{x} \equiv (x,\dot{x})\). We divide the two-dimensional region \(x \in [-5,5]\), \(\dot{x} \in [-5,5]\) into a grid with N grid boxes (cells) of equal size. Each cell has midpoint \(\varvec{x}^{(j)}\) with index \(j\in [1,2,..., N]\). A transition probability density matrix \(\varvec{q}\) has dimension \(N\times N\) and has components

$$\begin{aligned} \begin{aligned} q_{jk} = q(\varvec{x}^{(j)},t\,|\,\varvec{x}^{(k)}_0,t_o) \quad \forall \, j,k\in [1,2, ..., N] \end{aligned} \end{aligned}$$
(15)

A matrix of transition probabilities can be assembled from \(\varvec{q}\) by integrating the transition probability density functions over the domain of each cell as

$$\begin{aligned} \begin{aligned} Q_{jk} =&\int _{\Omega _k} q(\varvec{x},t\,|\,\varvec{x}^{(j)}_0,t_o) d\varvec{x} \\&\quad \forall \, j,k\in [1,2, ..., N] \end{aligned} \end{aligned}$$
(16)

Here, \(\Omega _k\) is the domain of cell with index k and \(Q_{jk}\) represents the probability that a midpoint \(\varvec{x}^{(j)}\) of a cell with index j will move to the cell with index k after one period. Finally, we remark that to get a P matrix analogous to the P matrix in Sect. 2, we need to remove cells that contain midpoints in the region D. Hence, a P matrix is derived from Q as follows:

$$\begin{aligned} P_{jk} = {\left\{ \begin{array}{ll} 0 &{} \varvec{x}^{(j)} \in D \\ 0 &{} \varvec{x}^{(k)} \in D \\ \frac{Q_{kj}}{\sum _{i = 1}^N Q_{ji}} &{} \text {otherwise} \\ \end{array}\right. } \end{aligned}$$
(17)

Note that \(\varvec{P}\) is the transpose of \(\varvec{Q}\), since in calculating the eigenvector as in Sect. 2 we multiply by \(\varvec{P}\) from the left.
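The assembly in Eq. (17) can be sketched as follows. The helper name and the toy 4-cell Q below are ours, purely for illustration; in practice Q comes from integrating the moment equations over one forcing period.

```python
import numpy as np

def build_P_from_Q(Q, in_D):
    """Assemble P per Eq. (17): transpose Q, normalize by the row sums of Q,
    and zero out rows/columns whose cell midpoints lie in the region D."""
    row_sums = Q.sum(axis=1)          # sum_i Q[j, i]
    P = Q.T / row_sums[:, None]       # P[j, k] = Q[k, j] / sum_i Q[j, i]
    P[in_D, :] = 0.0                  # midpoint x^(j) in D
    P[:, in_D] = 0.0                  # midpoint x^(k) in D
    return P

# Toy example: 4 cells, cell 3 lies inside the region D.
Q = np.array([[0.7, 0.2, 0.1, 0.0],
              [0.1, 0.6, 0.2, 0.1],
              [0.0, 0.3, 0.6, 0.1],
              [0.0, 0.1, 0.2, 0.7]])
in_D = np.array([False, False, False, True])
P = build_P_from_Q(Q, in_D)
print(P)
```

Zeroing the rows and columns associated with D makes trajectories that enter D count as escaped, so probability mass drains out of \(\varvec{P}\) just as it drains out of \([-L, L]\) in Sect. 2.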

The probability distribution of solutions obtained after \(2^{11} = 2048\) periods with \(\sigma = 0.08\) is shown in Fig. 9. We iterate the map \(Q^T\) for \(2^{11} = 2048\) periods. Then, we apply this \((Q^T)^{2048}\) map to a uniform distribution (\(\gamma (k) =\) constant). The solutions have a high probability in the vicinity of the two attractors, consistent with the literature [40]. We study the transition times from one region to the other to estimate the escape time from one attractor to the other.

Fig. 9
figure 9

Probability distribution for \(\sigma = 0.08\) after \(2^{12} = 4096\) periods. Here \(A_0\) is the HAA and \(A_f\) is the LAA. Only the boxes with probabilities greater than 0.0001 are plotted in red, where the total probability density of the red regions around the two attractors is \(99.4\%\). The probabilities in transition paths between the two attractors are small compared to the probabilities around the attractors, since the jumps happen quickly

Applying the procedure outlined in the previous section, we iterate the \(\varvec{P}\) matrix and compute the \(\varvec{P}^{2^M}\) powers. We then use Eq. (8) and (9) to calculate \(\varvec{P}\)’s dominant eigenvalue \(\alpha \) by using both uniform (\(\gamma \)) and spike (\(\delta \)) initial distributions. Similar to the cubic map, the distributions converge to a \(\lambda \) distribution after 2048 iterations (see Fig. 10), and \(\alpha \) can be estimated as \(\alpha = \text {sum}(\varvec{P}\lambda )\). We then express the transition rate into D as \(1-\alpha \), and estimate the escape time as \(1/(1-\alpha )\).

Fig. 10
figure 10

The normalized probability distributions after \(N = 2048\) and \(N = 4096\) iterations calculated for \(\sigma = 0.08\). Here \(A_0\) is the HAA and \(A_f\) is the LAA. The distribution converges to the eigenvector \(\lambda \) of the P matrix associated with the dominant eigenvalue \(\alpha \). Only the boxes with probabilities greater than 0.0001 are plotted in red, where the total probability density of the red region around the HAA is \(99.7\%\)

The escape times from HAA to LAA and from LAA to HAA are shown in Fig. 11, where we estimate escape times up to \(10^{13}\) periods. We do not calculate the escape times for \(\sigma \) values smaller than 0.07, since the escape rates become smaller than \(10^{-16}\), at which point the method requires computations with more than 64-bit float precision.

Fig. 11
figure 11

Expected escape time vs \(\sigma \) (noise intensity) for the Duffing oscillator. The oscillator parameters are \(\delta _c = 0.1\), \(\alpha _1 = 1\), \(\beta = 0.3\), \(F = 0.4\), and \(\omega = 1.4\). The escape times are calculated by using \(N = 25600\), and \(R = 0.5\)

For the Duffing system, we show the effect of the radius (R) of the circular region D and the number of grid boxes (N) on the estimated escape time in Figs. 12 and 13, respectively. For \( 0.2< R < 1.2\), the fractional error in escape time is \(<10^{-6}\), since the time it takes for a trajectory to leave the basin is much longer than the time it takes to jump to the new attractor, as we show in Figs. 14 and 15. Therefore, the size of the circular region does not affect the escape time estimations as long as \(R<1.2\), and for the calculations used to generate Fig. 11, we pick \(R = 0.5\).

Fig. 12
figure 12

Expected escape time vs R. The fractional error in the escape time is \(<10^{-6}\) for \(R<1.2\). The error in escape times is greater for \(R>1.4\). As the circle D gets close to the basin boundary (or even intersects the boundary), some of the trajectories crossing into the circle get back into the original basin due to noise

Fig. 13
figure 13

Fractional error vs N. \(N = 25600\) is the reference for the comparison

Fig. 14
figure 14

Poincaré section of the Euclidean norm of the stochastic response of a forced Duffing oscillator. The Poincaré section is plotted for a single trajectory under noise with intensity \(\sigma = 0.14\)

Fig. 15
figure 15

Sudden transitions between modes. Left: Transition from the high attractor to the low attractor. Right: Transition from the low attractor to the high attractor. Note that while the transition at period 230,000 takes approximately 10 periods, the system remains near the low attractor for approximately the next 3000 periods, until approximately period 233,100. Each dot represents the distance between the response and the low attractor at integer periods for \(\sigma = 0.14\). Consecutive dots are connected with straight lines for clarity

For the Duffing system, the estimated mean escape times are longer than 1000 periods for noise intensities \(\sigma < 0.18\), and we consider these noise levels as low-intensity noise. To validate the escape rates calculated by using the largest eigenvalue of P, we use Monte Carlo simulations with Euler–Maruyama integration for high noise intensities, \(\sigma = 0.19\) and \(\sigma = 0.2\). We limit these simulations to high noise intensities simply because integrating hundreds of trajectories and calculating their average escape times is not practical for low noise intensities. We run simulations for a total of 50000 periods, where the initial condition is set to the HAA. Then, as soon as the trajectory falls into D, we restart the simulation with the initial conditions set to the HAA again, as shown in Fig. 16. We observed an average escape time from the HAA to LAA of around 450 periods for \(\sigma = 0.2\), and 550 periods for \(\sigma = 0.19\). For both cases, the escape times calculated with the Euler–Maruyama simulations are within one standard deviation of the mean escape time estimated using the largest eigenvalue of P.
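The validation runs can be reproduced with a standard Euler–Maruyama scheme for Eq. (10). A sketch is given below; the step size, the initial condition, and the function name are our own choices, not specified in the paper.

```python
import numpy as np

def euler_maruyama_poincare(y0, v0, sigma=0.2, periods=50,
                            steps_per_period=1000, delta_c=0.1, alpha1=1.0,
                            beta=0.3, F=0.4, omega=1.4, seed=0):
    """Integrate Eq. (10) as a first-order system with Euler-Maruyama:
        dy = v dt
        dv = (-delta_c*v - alpha1*y - beta*y**3 + F*cos(omega*t)) dt
             + sigma dW
    and return the Poincare section sampled once per forcing period."""
    rng = np.random.default_rng(seed)
    T = 2.0 * np.pi / omega
    dt = T / steps_per_period
    y, v, t = y0, v0, 0.0
    section = []
    for _ in range(periods):
        for _ in range(steps_per_period):
            dW = np.sqrt(dt) * rng.standard_normal()
            y, v = (y + v * dt,
                    v + (-delta_c * v - alpha1 * y - beta * y**3
                         + F * np.cos(omega * t)) * dt + sigma * dW)
            t += dt
        section.append((y, v))          # sample once per period
    return np.array(section)

sec = euler_maruyama_poincare(1.0, 0.0)  # (1.0, 0.0) is an arbitrary start
```

An escape-time experiment would additionally check, once per period, whether the sampled point has entered D, record the period count, and restart the trajectory at the HAA.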

Fig. 16
figure 16

Euler–Maruyama simulations to estimate the average escape times from the HAA to LAA. We start the simulations at the HAA, and as soon as the trajectory enters the D region (e.g., at t = 255 and t = 555 for the first and the second plots), we stop the simulation. Then we restart the simulation at the HAA again, and repeat the procedure

When extending this approach to systems with higher dimensions, the dimensions of the distribution vectors and transition probability matrices grow rapidly. While the number of grid boxes (N) is proportional to (1/g) for the one-dimensional cubic map, it is proportional to \((1/g^2)\) for the Duffing equation. For a system with m dimensions, the number of cells grows with \((1/g^m)\). The size of the transition probability matrix P is then proportional to \((1/g^m)\times (1/g^m)\), since the size of P is \(N\times N\). Therefore, for systems with higher dimensions, the computational power needed to compute the powers of P grows rapidly, especially for lower noise intensities, where a finer grid is needed to resolve the very small transition probabilities and escape rates. In this paper, we limit our examples to one-dimensional and two-dimensional systems, and leave the challenges of extending this approach to higher-dimensional systems for future work.
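The growth of storage with dimension can be made concrete with a quick estimate; the grid sizes below are illustrative, not the values used in the paper.

```python
def p_matrix_gib(g, m):
    """Memory (GiB) for a dense double-precision P matrix when an
    m-dimensional phase space is split into 1/g boxes per dimension."""
    n_cells = (1.0 / g) ** m          # N grows like (1/g)^m
    return 8.0 * n_cells**2 / 2**30   # P is N x N, 8 bytes per entry

print(p_matrix_gib(0.01, 1))  # 1D: N = 100, a trivially small matrix
print(p_matrix_gib(0.01, 2))  # 2D: N = 10,000, already ~0.75 GiB dense
print(p_matrix_gib(0.01, 3))  # 3D: N = 10^6, thousands of GiB -- impractical
```

Sparse storage and matrix-free iteration would mitigate this somewhat, but the squaring step used here produces dense powers of P, so the quadratic growth in N is the binding constraint.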

4 Concluding remarks

In this study, we have outlined an approach to estimate the escape times of responses from basins of attraction in the presence of noise. Our approach has the advantage of being suitable for estimating long escape times when the system is subjected to low-intensity Gaussian noise. We presented a numerical technique wherein we discretized the solution interval into a grid, and calculated probability maps between grid points in consecutive periods. Then, we iterated these maps to find the constant escape rates from the basins of attraction, and estimated the average escape times. We applied this approach to a cubic map and a non-autonomous Duffing oscillator, where we determined the probability maps between the grids by using a Path Integral Method. Our approach can also be used with other techniques for computing the probability maps, such as the Euler–Maruyama integration or a finite-element based technique. The probability maps can then be iterated with our approach to calculate the escape times. We validated our results by comparing the escape times to those calculated by using Monte Carlo simulations with the Euler–Maruyama integration. We estimated escape times as long as \(10^{13}\) periods without relying on computationally expensive simulations. We noted that the approach works well for low-dimensional systems, but computational challenges may arise when extending it to high-dimensional systems.