1 Introduction

From physical sciences to social sciences, many phenomena are modeled by noisy dynamical systems. In many such systems, several widely separated time scales are present. The system obtained in the homogenization limit, in which the fast time scales go to zero, is simpler than the original one, while often retaining the essential features of its dynamics [1,2,3,4]. On the other hand, the different fast time scales may compete and this competition is reflected in the homogenized equations.

Of particular interest is the model of a Brownian particle interacting with the environment [5]. The usual model for such a system neglects the memory effects, representing the interaction of the particle with the environment as a sum of an instantaneous damping force and a white noise. Although such an idealized model generally gives a good approximate description of the dynamics of the particle, there are situations where the memory effects play an important role, for instance when the particle is subject to a hydrodynamic interaction [6], or when the particle is an atom embedded in a condensed-matter heat bath [7]. In addition, the stochastic forcing introduced by the environment is often more accurately modeled by a colored noise than by white noise.

In this paper, we study a class of generalized Langevin equations (GLEs), with state-dependent drift and diffusion coefficients, driven by a colored noise. They provide a realistic description of the dynamics of a classical Brownian particle in an inhomogeneous environment; their solutions are not Markov processes. We are interested in the limiting behavior of the particle when the characteristic time scales become small and in how competition of the time scales, as well as inhomogeneity of the environment, impact its limiting dynamics. The main mathematical result of this paper is Theorem 2, in which we derive the homogenization limit for a general class of non-Markovian systems. Special cases are studied in some detail to obtain more explicit results. Their physical relevance is illustrated by an application to thermophoresis models.

The paper is organized as follows. In Sect. 2, we introduce and discuss a class of GLEs, as well as its two sub-classes, to be studied in this paper. In Sect. 3, we revisit the Smoluchowski–Kramers limit for a class of SDEs with state-dependent drift and diffusion coefficients, under a weaker assumption on the spectrum of the damping matrix than that used in earlier work [8]. Using this result of Sect. 3, we study homogenization for the GLEs in Sect. 4. We specialize the study to two sub-classes of models in Sect. 5. In Sect. 6, we apply the results obtained in the previous sections to study thermophoresis of a Brownian particle in a non-equilibrium heat bath. We end the paper by giving the conclusions and final remarks in Sect. 7. The appendices provide some technical results used in the main paper, as well as physical motivation for the form of the GLEs studied here. In Appendix A we provide a variant of a (heuristic) derivation of the equations studied in this paper from a Hamiltonian model of a particle interacting with a system of harmonic oscillators. Appendix B contains a sketch of the proof of Theorem 1.

2 Generalized Langevin Equations (GLEs)

2.1 GLEs as Non-Markovian Models

We consider a class of non-Markovian Langevin equations, with state-dependent coefficients, that describe the dynamics of a particle moving in a force field and interacting with the environment. Let \(\varvec{x}_{t} \in \mathbb {R}^{d}\), \(t \ge 0\), be the position of the particle. The evolution of position, \(\varvec{x}_{t}\), is given by the solution to the following stochastic integro-differential equation (SIDE):

$$\begin{aligned} m \ddot{\varvec{x}}_{t} = \varvec{F}(\varvec{x}_{t}) - \varvec{g}(\varvec{x}_{t}) \int _{0}^{t} \varvec{\kappa }(t-s) \varvec{h}(\varvec{x}_{s}) \dot{\varvec{x}}_{s} ds + \varvec{\sigma }(\varvec{x}_{t}) \varvec{\xi }_{t}, \end{aligned}$$
(1)

with the initial conditions (here the initial time is chosen to be \(t=0\)):

$$\begin{aligned} \varvec{x}_{0} = \varvec{x}, \ \ \dot{\varvec{x}}_{0} = \varvec{v}. \end{aligned}$$
(2)

The initial conditions \(\varvec{x}\) and \(\varvec{v}\) are random variables independent of the process \(\{\varvec{\xi }_{t}: \ t \ge 0\}\). Our motivation to study the SIDE (1) is that the study of microscopic dynamics leads naturally to equations of this form (see Appendix A).

Here and throughout the paper, an overdot denotes the derivative with respect to time t, the superscript \(^*\) denotes conjugate transposition of matrices or vectors and E denotes expectation. In the SIDE (1), \(m > 0\) is the mass of the particle, the matrix-valued functions \(\varvec{g}: \mathbb {R}^{d} \rightarrow \mathbb {R}^{d \times q}\), \(\varvec{h} : \mathbb {R}^{d} \rightarrow \mathbb {R}^{q \times d}\) and \(\varvec{\sigma }: \mathbb {R}^{d} \rightarrow \mathbb {R}^{d \times r}\) are the state-dependent coefficients of the equation, and \(\varvec{F} :\mathbb {R}^{d} \rightarrow \mathbb {R}^{d}\) is a force field acting on the particle. Here d, q and r are, possibly distinct, positive integers. The second term on the right-hand side of (1) represents the drag experienced by the particle and the last term models the noise.

The matrix-valued function \(\varvec{\kappa }: \mathbb {R}\rightarrow \mathbb {R}^{q \times q}\) is a memory function which is Bohl, i.e. the matrix elements of \(\varvec{\kappa }(t)\) are finite linear combinations of the functions of the form \(t^k e^{\alpha t} \cos (\omega t)\) and \(t^k e^{\alpha t} \sin (\omega t)\), where k is a nonnegative integer and \(\alpha \) and \(\omega \) are real numbers. For properties of Bohl functions, we refer to Chapter 2 of [9]. The noise process \(\varvec{\xi }_{t}\) is an r-dimensional mean zero stationary real-valued Gaussian vector process having a Bohl covariance function, \(\varvec{R}(t):=E \varvec{\xi }_t \varvec{\xi }_0^* = \varvec{R}^*(-t) \), and, therefore, its spectral density, \(\varvec{S}(\omega ) := \int _{-\infty }^{\infty } \varvec{R}(t) e^{-i\omega t} dt\), is a rational function [10].
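As a simple illustration of the last statement, consider a scalar noise (\(r=1\)) with the exponential covariance \(R(t) = \frac{a}{2} e^{-a|t|}\), where \(a > 0\) is a constant introduced here only for this example. Splitting the integral at \(t=0\), one finds

$$\begin{aligned} S(\omega ) = \frac{a}{2}\int _{-\infty }^{\infty } e^{-a|t|} e^{-i\omega t} dt = \frac{a}{2}\left( \frac{1}{a+i\omega } + \frac{1}{a-i\omega }\right) = \frac{a^2}{a^2+\omega ^2}, \end{aligned}$$

which is indeed a rational function of \(\omega \). Note that \(S(\omega ) \rightarrow 1\) for every fixed \(\omega \) as \(a \rightarrow \infty \), i.e. the spectrum becomes flat in this limit.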

The SIDE (1) is a non-Markovian Langevin equation, since its solution at time t depends on the entire past. Two of its terms are different from those in the usual Langevin equations. One of them is the drag term, which here involves an integral over the particle’s past velocities with a memory kernel \(\varvec{\kappa }(t-s)\). It describes the state-dependent dissipation which comprises the back-action effects of the environment up to current time. The other term, involving a Gaussian colored noise \(\varvec{\xi }_{t}\), is a multiplicative noise term, also arising from interaction of the particle with the environment. Therefore, (1) is a generalized Langevin equation (GLE), which in its most basic form was first introduced by Mori in [11] and subsequently used to model many systems in statistical physics [12,13,14].

As remarked by van Kampen in [15], “Non-Markov is the rule, Markov is the exception”. Therefore, it is not surprising that non-Markovian equations (including those of form (1)) find numerous applications and thus have been studied widely in the mathematical, physical and engineering literature. See, for instance, [16, 17] for surveys of non-Markovian processes, [18,19,20] for physical applications and [21] for asymptotic analysis.

Note that the Gaussian process \(\varvec{\xi }_t\) which drives the SIDE (1) is not assumed to be Markov. The assumptions we made on its covariance will allow us to present it as a projection of a Markov process in a (typically higher-dimensional) space. This approach, which originated in stochastic control theory [22], is called stochastic realization. We describe it in detail below.

Let \(\varvec{\varGamma }_1 \in \mathbb {R}^{d_1 \times d_1}\), \(\varvec{M}_1 \in \mathbb {R}^{d_1 \times d_1}\), \(\varvec{C}_1 \in \mathbb {R}^{q \times d_1}\), \(\varvec{\varSigma }_1 \in \mathbb {R}^{d_1 \times q_1}\), \(\varvec{\varGamma }_2 \in \mathbb {R}^{d_2 \times d_2}\), \(\varvec{M}_2 \in \mathbb {R}^{d_2 \times d_2}\), \(\varvec{C}_2 \in \mathbb {R}^{r \times d_2}\), \(\varvec{\varSigma }_2 \in \mathbb {R}^{d_2 \times q_2}\) be constant matrices, where \(d_1,d_2,q_1,q_2\), q and r are positive integers. In this paper, we study the class of SIDE (1), with the memory function defined in terms of the triple \((\varvec{\varGamma }_1,\varvec{M}_1,\varvec{C}_1)\) of matrices as follows:

$$\begin{aligned} \varvec{\kappa }(t)=\varvec{C}_1e^{-\varvec{\varGamma _1}|t|}\varvec{M}_1\varvec{C}_1^*. \end{aligned}$$
(3)

The noise process is the mean zero, stationary Gaussian vector process, whose covariance will be expressed in terms of the triple \((\varvec{\varGamma }_2,\varvec{M}_2,\varvec{C}_2)\). More precisely, we define it as:

$$\begin{aligned} \varvec{\xi }_t = \varvec{C}_2 \varvec{\beta }_t, \end{aligned}$$
(4)

where \(\varvec{\beta }_t\) is the solution to the Itô SDE:

$$\begin{aligned} d\varvec{\beta }_t = -\varvec{\varGamma }_2\varvec{\beta }_t dt + \varvec{\varSigma }_2 d\varvec{W}^{(q_2)}_t, \end{aligned}$$
(5)

with the initial condition, \(\varvec{\beta }_0\), normally distributed with zero mean and covariance \(\varvec{M}_2\). Here, \(\varvec{W}^{(q_2)}_t\) denotes a \(q_2\)-dimensional Wiener process and is independent of \(\varvec{\beta }_0\). Throughout the paper the dimension of the Wiener process will be specified by the superscript.

For \(i=1,2\), the matrix \(\varvec{\varGamma }_i\) is positive stable, i.e. all its eigenvalues have positive real parts and \(\varvec{M}_i = \varvec{M}_i^* > 0\) satisfies the following Lyapunov equation:

$$\begin{aligned} \varvec{\varGamma }_i \varvec{M}_i+\varvec{M}_i \varvec{\varGamma }_i^*=\varvec{\varSigma }_i \varvec{\varSigma }_i^*. \end{aligned}$$
(6)

It follows from positive stability of \(\varvec{\varGamma }_i\) that this equation indeed has a unique solution [23]. The covariance matrix, \(\varvec{R}(t) \in \mathbb {R}^{r \times r}\), of the noise process is therefore expressed in terms of the matrices \((\varvec{\varGamma }_2,\varvec{M}_2,\varvec{C}_2)\) as follows:

$$\begin{aligned} \varvec{R}(t)=\varvec{C}_2e^{-\varvec{\varGamma _2}|t|}\varvec{M}_2\varvec{C}_2^*, \end{aligned}$$
(7)

and therefore the triple \((\varvec{\varGamma }_2,\varvec{M}_2,\varvec{C}_2)\) completely specifies the probability distribution of \(\varvec{\xi }_t\). It is worth mentioning that the triples that specify the memory function in (3) and the noise process in (4) are only unique up to the following transformations:

$$\begin{aligned} (\varvec{\varGamma }'_i=\varvec{T}_i \varvec{\varGamma }_i \varvec{T}^{-1}_i, \varvec{M}_i' = \varvec{T}_i \varvec{M}_i \varvec{T}_i^{*}, \varvec{C}'_i = \varvec{C}_i \varvec{T}_i^{-1}), \end{aligned}$$
(8)

where \(i=1,2\) and the \(\varvec{T}_i\) are invertible matrices of appropriate dimensions.
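The construction above can be checked numerically. The following is a minimal sketch, assuming real matrices and a diagonalizable \(\varvec{\varGamma }_2\); the helper names (`solve_lyapunov`, `expm_diag`, `covariance`) and the matrix entries are hypothetical choices made only for illustration. It solves the Lyapunov equation (6) by vectorization, evaluates the covariance (7), and confirms that a transformation of the form (8) leaves the covariance unchanged.

```python
import numpy as np

def solve_lyapunov(Gamma, Sigma):
    """Solve Gamma M + M Gamma^T = Sigma Sigma^T for M (real case of (6))."""
    n = Gamma.shape[0]
    I = np.eye(n)
    # With row-major vec: vec(Gamma M) = (Gamma kron I) vec(M),
    # and vec(M Gamma^T) = (I kron Gamma) vec(M).
    L = np.kron(Gamma, I) + np.kron(I, Gamma)
    Q = Sigma @ Sigma.T
    return np.linalg.solve(L, Q.reshape(-1)).reshape(n, n)

def expm_diag(A, t):
    """exp(A t) via eigendecomposition; assumes A is diagonalizable."""
    w, V = np.linalg.eig(A)
    return ((V * np.exp(w * t)) @ np.linalg.inv(V)).real

def covariance(Gamma, M, C, t):
    """R(t) = C exp(-Gamma |t|) M C^T, cf. (7)."""
    return C @ expm_diag(-Gamma, abs(t)) @ M @ C.T

# Illustrative triple: Gamma has eigenvalues 1 and 3, hence is positive stable.
Gamma = np.array([[1.0, 2.0], [0.0, 3.0]])
Sigma = np.array([[1.0, 0.0], [0.5, 1.0]])
C = np.array([[1.0, -1.0]])
M = solve_lyapunov(Gamma, Sigma)

# Transformed triple as in (8), with an arbitrary invertible T.
T = np.array([[2.0, 1.0], [1.0, 1.0]])
Tinv = np.linalg.inv(T)
Gamma_p, M_p, C_p = T @ Gamma @ Tinv, T @ M @ T.T, C @ Tinv

R = covariance(Gamma, M, C, 0.7)
R_p = covariance(Gamma_p, M_p, C_p, 0.7)
```

Here `R` and `R_p` agree up to rounding, which is the numerical counterpart of the non-uniqueness expressed by (8). For matrices that are not diagonalizable, `expm_diag` would have to be replaced by a general matrix exponential.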

The triple \((\varvec{\varGamma }_2,\varvec{M}_2,\varvec{C}_2)\) above is called a (weak) stochastic realization of the covariance matrix \(\varvec{R}(t)\) in the well-established theory of stochastic realization, which is concerned with solving the inverse problem of stationary covariance generation (see [24, 25]). Any zero mean stationary Gaussian process, \(\varvec{\xi }'_t\), having a Bohl covariance function, can be realized as a projection of a Gaussian Markov process in the above way. Let us remark that Gaussian processes with Bohl covariance functions are precisely those with rational spectral density [10].

Our approach allows us to consider the most general Gaussian noises that can be realized in a finite-dimensional state space in the above way (i.e. as a linear transformation of a Gaussian Markov process). In fact, the condition on the covariance function to have entries in the Bohl class is necessary and sufficient for solvability of the problem of stochastic realization of stationary Gaussian processes. We refer to the propositions and theorems on pages 303–308 of [10] for a brief exposition of stochastic realization problems.

Remark 1

Physically, the choice of the matrices \(\varvec{\varGamma }_2,\varvec{M}_2,\varvec{C}_2\) specifies the characteristic time scales (eigenvalues of \(\varvec{\varGamma }_2^{-1}\)) present in the environment, introduces the initial state of a stationary Markovian Gaussian noise and selects the parts of the prepared Markovian noise that are (partially) observed, respectively. In other words, we have assumed that the noise in the SIDE (1) is realized or “experimentally prepared” by the above triple of matrices.

For our homogenization study of Eq. (1) we need the effective damping constant,

$$\begin{aligned} \varvec{K}_1 := \int _0^{\infty } \varvec{\kappa }(t) dt = \varvec{C}_1 \varvec{\varGamma }_1^{-1} \varvec{M}_1 \varvec{C}_1^* \in \mathbb {R}^{q \times q}, \end{aligned}$$
(9)

and the effective diffusion constant,

$$\begin{aligned} \varvec{K}_2 := \int _0^{\infty } \varvec{R}(t) dt = \varvec{C}_2 \varvec{\varGamma }_2^{-1} \varvec{M}_2 \varvec{C}_2^* \in \mathbb {R}^{r \times r}, \end{aligned}$$
(10)

to be invertible (see Sect. 2.2). This is equivalent to the matrices \(\varvec{C}_i\) having full rank. Homogenization for a class of systems with vanishing effective damping and/or diffusion constant [26] will be explored in our future work.
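The closed-form expressions in (9) and (10) follow from positive stability of \(\varvec{\varGamma }_i\): since \(e^{-\varvec{\varGamma }_i t} \rightarrow 0\) as \(t \rightarrow \infty \) and \(\frac{d}{dt}\left( -\varvec{\varGamma }_i^{-1} e^{-\varvec{\varGamma }_i t}\right) = e^{-\varvec{\varGamma }_i t}\), we have

$$\begin{aligned} \int _0^{\infty } e^{-\varvec{\varGamma }_i t} dt = \varvec{\varGamma }_i^{-1}, \quad \text {so that} \quad \int _0^{\infty } \varvec{C}_i e^{-\varvec{\varGamma }_i t} \varvec{M}_i \varvec{C}_i^* dt = \varvec{C}_i \varvec{\varGamma }_i^{-1} \varvec{M}_i \varvec{C}_i^*. \end{aligned}$$

For instance, in the scalar case with \(C_1 = 1\) and \(\varGamma _1 = M_1 = a > 0\) (the constant a is used here only for illustration), the memory function is \(\kappa (t) = a e^{-a|t|}\) and the formula gives \(K_1 = 1\).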

With the above definitions of memory kernel and noise process, the SIDE (1) becomes:

$$\begin{aligned} m \ddot{\varvec{x}}_{t} = \varvec{F}(\varvec{x}_{t}) - \varvec{g}(\varvec{x}_{t}) \int _{0}^{t} \varvec{C}_1 e^{-\varvec{\varGamma }_1(t-s)} \varvec{M}_1\varvec{C}_1^* \varvec{h}(\varvec{x}_{s}) \dot{\varvec{x}}_{s} ds + \varvec{\sigma }(\varvec{x}_{t}) \varvec{C}_2 \varvec{\beta }_{t}, \end{aligned}$$
(11)

where \(\varvec{\beta }_t\) is the solution to the SDE (5). To illustrate the results of the general study in important special cases (which will also be used later in applications), we consider two representative classes of SIDE (1). The driving Gaussian colored noise is Markovian in the first class and non-Markovian in the second. We set \(d=d_1=d_2=q_1=q_2=q=r\) in the following examples.

  1. (i)

    Example of a SIDE driven by a Markovian colored noise. The memory kernel is given by an exponential function, i.e.

    $$\begin{aligned} \varvec{\kappa }(t-s) = \varvec{\kappa }_{1}(t-s) := \varvec{A} e^{-\varvec{A}|t-s|}, \end{aligned}$$
    (12)

    where \(\varvec{A} \in \mathbb {R}^{d \times d}\) is a constant diagonal matrix with positive eigenvalues. The driving noise is the Ornstein–Uhlenbeck (OU) process, \(\varvec{\xi }_{t} = \varvec{\eta }_{t} \in \mathbb {R}^d\), i.e. a mean zero stationary Gaussian process which is the solution to the SDE:

    $$\begin{aligned} d\varvec{\eta }_{t} = -\varvec{A} \varvec{\eta }_{t} dt + \varvec{A} d\varvec{W}^{(d)}_{t}. \end{aligned}$$
    (13)

    In order for the process \(\varvec{\eta }_{t}\) to be stationary, the initial condition has to be distributed according to the (unique) stationary measure of the Markov process defined by the above equation, i.e. \(\varvec{\eta }_{0} = \varvec{\eta }\) is normally distributed with zero mean and covariance \(\varvec{A}/2\). The mean and the covariance of \(\varvec{\eta }_{t}\) are given by:

    $$\begin{aligned} E[\varvec{\eta }_{t}] = 0, \ \ E[\varvec{\eta }_{t} \varvec{\eta }_{s}^{*}] = \frac{1}{2}\varvec{\kappa }_{1}(t-s), \ \ s,t \ge 0. \end{aligned}$$
    (14)

    The resulting SIDE reads:

    $$\begin{aligned} m \ddot{\varvec{x}}_{t} = \varvec{F}(\varvec{x}_{t}) - \varvec{g}(\varvec{x}_{t})\int _{0}^{t} \varvec{\kappa }_{1}(t-s) \varvec{h}(\varvec{x}_{s}) \dot{\varvec{x}}_{s} ds + \varvec{\sigma }(\varvec{x}_{t}) \varvec{\eta }_{t}. \end{aligned}$$
    (15)

    Let us note that Ornstein–Uhlenbeck processes are the only stationary, ergodic, Gaussian, Markov processes with continuous covariance functions [27]. When all diagonal entries of \(\varvec{A}\) go to infinity, the OU process approaches the white noise. For details on OU processes, see for instance [27] and Sect. 2 of [28].

  2. (ii)

    Example of a SIDE driven by a non-Markovian colored noise. The memory kernel is given by an oscillatory function whose amplitude is exponentially decaying, i.e. \( \varvec{\kappa }(t-s) := \varvec{\kappa }_2(t-s)\), a diagonal matrix with the diagonal entries:

    $$\begin{aligned} (\varvec{\kappa }_{2})_{ii}(t-s) := {\left\{ \begin{array}{ll} \frac{1}{\tau _{ii}} e^{-\omega _{ii}^2 \frac{|t-s|}{2 \tau _{ii}}}\left[ \cos \left( \frac{\omega ^0_{ii}}{\tau _{ii}} (t-s) \right) + \frac{\omega ^1_{ii}}{2} \sin \left( \frac{\omega ^0_{ii}}{\tau _{ii}} |t-s| \right) \right] , &{} \text {if } |\omega _{ii}|<2 \\ \frac{1}{\tau _{ii}} e^{-\omega _{ii}^2 \frac{|t-s|}{2 \tau _{ii}}}\left[ \cosh \left( \frac{\tilde{\omega }^0_{ii}}{\tau _{ii}}(t-s) \right) + \frac{\tilde{\omega }^1_{ii}}{2} \sinh \left( \frac{\tilde{\omega }^0_{ii}}{\tau _{ii}} |t-s| \right) \right] , &{} \text {if } |\omega _{ii}|>2, \end{array}\right. } \end{aligned}$$
    (16)

    where for \(i=1,\dots ,d\), \(\tau _{ii}\) is a positive constant, \(\omega _{ii}\) is a real constant, \(\omega ^0_{ii} := \omega _{ii}\sqrt{1-\omega _{ii}^2/4}\), \(\tilde{\omega }^0_{ii} := \omega _{ii}\sqrt{\omega _{ii}^2/4-1}\), \(\omega _{ii}^1 := \omega _{ii}/\sqrt{1-\omega _{ii}^2/4}\), and \(\tilde{\omega }_{ii}^1 := \omega _{ii}/\sqrt{\omega _{ii}^2/4-1}\).

    Let \(\varvec{\tau }\) be the constant diagonal matrix with the positive eigenvalues \((\tau _{jj})_{j=1}^d\), \(\varvec{\varOmega }\) be the constant diagonal matrix with the real eigenvalues \((\omega _{jj})_{j=1}^{d}\), \(\varvec{\varOmega }_{0}\) be the constant \(d \times d\) diagonal matrix with the eigenvalues \(\omega _{jj}\sqrt{1-\omega _{jj}^2/4}\) (if \(|\omega _{jj}| < 2\)) and \(i \omega _{jj}\sqrt{\omega _{jj}^2/4-1}\) (if \(|\omega _{jj}|>2\)), and \(\varvec{\varOmega }_{1}\) be the constant \(d \times d\) diagonal matrix with the eigenvalues \(\omega _{jj}/\sqrt{1-\omega _{jj}^2/4}\) (if \(|\omega _{jj}|<2\)) and \(-i\omega _{jj}/\sqrt{\omega _{jj}^2/4-1}\) (if \(|\omega _{jj}|>2\)), where i is the imaginary unit.

    The driving noise is given by the harmonic noise process, \(\varvec{\xi }_{t}=\varvec{h}_{t} \in \mathbb {R}^d\), i.e. a mean zero stationary Gaussian process which is the solution to the SDE system:

    $$\begin{aligned} \varvec{\tau } d\varvec{h}_{t}&= \varvec{u}_{t} dt, \end{aligned}$$
    (17)
    $$\begin{aligned} \varvec{\tau } d\varvec{u}_{t}&= -\varvec{\varOmega }^2 \varvec{u}_{t} dt - \varvec{\varOmega }^2 \varvec{h}_{t} dt + \varvec{\varOmega }^2 d\varvec{W}^{(d)}_{t}, \end{aligned}$$
    (18)

    with the initial conditions, \(\varvec{h}_0\) and \(\varvec{u}_0\), distributed according to the (unique) stationary measure of the above SDE system. The mean and the covariance of \(\varvec{h}_{t}\) are given by:

    $$\begin{aligned} E[\varvec{h}_{t}] = \varvec{0}, \ \ E[\varvec{h}_{t} \varvec{h}_{s}^{*}] = \frac{1}{2} \varvec{\kappa }_{2}(t-s),\ \ s, t \ge 0. \end{aligned}$$
    (19)

    Note that \(\varvec{h}_t\) is not a Markov process (but the process \((\varvec{h}_t, \varvec{u}_t)\) is).

    The resulting SIDE reads:

    $$\begin{aligned} m \ddot{\varvec{x}}_{t} = \varvec{F}(\varvec{x}_{t}) - \varvec{g}(\varvec{x}_{t})\int _{0}^{t} \varvec{\kappa }_{2}(t-s) \varvec{h}(\varvec{x}_{s}) \dot{\varvec{x}}_{s} ds + \varvec{\sigma }(\varvec{x}_{t}) \varvec{h}_{t}. \end{aligned}$$
    (20)

    The harmonic noise is an approximation of the white noise, smoother than the Ornstein–Uhlenbeck process. It can be shown that in the limit \(\omega _{ii} \rightarrow \infty \) (for all i) the process \(\varvec{h}_{t}\) converges to the Ornstein–Uhlenbeck process whose ith component process has correlation time \(\tau _{ii}\), whereas in the limit \(\tau _{ii} \rightarrow 0\) (for all i) the process \(\varvec{h}_{t}\) converges to the white noise. For detailed properties of the harmonic noise process, see for instance [29] or the Appendix in [30]. We remark that the harmonic noise is one of the simplest examples of a non-Markovian process and its use as the driving noise in the SIDE (1) is a natural choice that models the environment as a bath of damped harmonic oscillators [31].
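The covariance formula (19) can be cross-checked against the SDE system (17), (18) in the scalar, underdamped case \(|\omega | < 2\). The sketch below is an illustration under stated assumptions: the function names and the parameter values are hypothetical, and the stationary covariance of \((h_t, u_t)\) is taken to be \(\mathrm {diag}(1/(2\tau ), \omega ^2/(2\tau ))\), which one can verify by solving the associated Lyapunov equation. For a linear SDE with drift matrix B and stationary covariance P, the covariance at lag \(t \ge 0\) is \(e^{Bt}P\).

```python
import numpy as np

def kappa2(t, tau, omega):
    """Diagonal entry of kappa_2 in (16), underdamped case |omega| < 2."""
    w0 = omega * np.sqrt(1.0 - omega**2 / 4.0)
    w1 = omega / np.sqrt(1.0 - omega**2 / 4.0)
    s = abs(t)
    envelope = np.exp(-omega**2 * s / (2.0 * tau)) / tau
    return envelope * (np.cos(w0 * s / tau) + 0.5 * w1 * np.sin(w0 * s / tau))

def harmonic_cov(t, tau, omega):
    """E[h_t h_0] obtained from the linear system (17)-(18), scalar h and u."""
    # Drift matrix of d(h, u) = B (h, u) dt + (0, omega^2 / tau) dW.
    B = np.array([[0.0, 1.0 / tau],
                  [-omega**2 / tau, -omega**2 / tau]])
    # Stationary covariance of (h, u); see the lead-in for this assumption.
    P = np.diag([1.0 / (2.0 * tau), omega**2 / (2.0 * tau)])
    w, V = np.linalg.eig(B)
    expBt = ((V * np.exp(w * abs(t))) @ np.linalg.inv(V)).real
    return (expBt @ P)[0, 0]

tau, omega = 0.5, 1.2  # illustrative values with |omega| < 2
ts = np.linspace(0.0, 3.0, 7)
lhs = np.array([harmonic_cov(t, tau, omega) for t in ts])
rhs = np.array([0.5 * kappa2(t, tau, omega) for t in ts])
max_gap = np.max(np.abs(lhs - rhs))
```

The quantity `max_gap` is at the level of rounding error, in agreement with (19). The overdamped case \(|\omega | > 2\) can be treated in the same way with the hyperbolic branch of (16).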

Remark 2

Note that in the SIDEs for the above two sub-classes, the dimension of the driving Wiener process, \(\varvec{W}_t^{(d)}\), is the same as that of the colored noise processes \(\varvec{\eta }_t\) and \(\varvec{h}_t\), as well as the processes, \(\varvec{x}_t\) and \(\varvec{v}_t\). One could as well consider realizing the noise processes using a driving Wiener process of different dimension. Our choice of working with the same dimensions is for the sake of convenience as it will help to simplify the exposition.

Remark 3

Without loss of generality (due to (8)), we have taken \(\varvec{A}\) and \(\varvec{\varOmega }\) to be diagonal.

Remark 4

In cases of particular interest in statistical physics, the triples \((\varvec{\varGamma }_i, \varvec{M}_i, \varvec{C}_i)\) coincide, up to the transformations in (8) (for \(i = 1, 2\)), \(\varvec{h} = \varvec{g}^*\) and \(\varvec{g}\) is proportional to \(\varvec{\sigma }\), with the proportionality factor equal to \(k_B T\), where \(k_B\) denotes the Boltzmann constant and \(T > 0\) is the temperature of the environment (see Appendix A). In this case, we have \(d_1=d_2\) and \(q=r\). In particular, for the two sub-classes above we have

$$\begin{aligned} E[\varvec{\eta }_{t}^{0} (\varvec{\eta }_{s}^{0})^{*}] = \frac{1}{2} \varvec{\kappa }_1(t-s) \end{aligned}$$
(21)

for the first sub-class and

$$\begin{aligned} E[\varvec{h}_{t}^{0} (\varvec{h}_{s}^{0})^{*}] = \frac{1}{2} \varvec{\kappa }_{2}(t-s) \end{aligned}$$
(22)

for the second sub-class. In such cases, the SIDEs describe a particle interacting with an equilibrium heat bath at a temperature T, whose dynamics satisfy the fluctuation–dissipation relation [13, 32].

2.2 Homogenization of SIDEs: Discussion and Statement of the Problem

There are three characteristic time scales defining the non-Markovian dynamics described by the SIDE (1):

  1. (i)

    the inertial time scale, \(\lambda _{m}\), proportional to m, whose physical significance is the relaxation time of the particle velocity process \(\varvec{v}_{t} := \dot{\varvec{x}}_{t}\). The limit \(\lambda _{m} \rightarrow 0\) is equivalent to the limit \(m \rightarrow 0\);

  2. (ii)

    the memory time scale, \(\lambda _{\kappa }\), defined as the inverse rate of exponential decay of the memory kernel \(\varvec{\kappa }(t-s)\);

  3. (iii)

    the noise correlation time scale, \(\lambda _{\xi }\).

For the purpose of general multiscale analysis, we set \(m = m_{0} \epsilon ^{\mu }\), \(\lambda _{\kappa } = \tau _{\kappa } \epsilon ^{\theta }\) and \(\lambda _{\xi } = \tau _{\xi } \epsilon ^{\nu }\), where \(\epsilon > 0\) is a parameter which will be taken to zero, \(m_0\), \(\tau _\kappa \), \(\tau _\xi \) are (fixed) proportionality constants, and \(\mu , \theta , \nu \) are positive constants (exponents), specifying the orders at which the time scales \(\lambda _{m}, \lambda _{\kappa }, \lambda _{\xi }\) vanish respectively. We consider a family of SIDEs, parametrized by \(\epsilon \), with the inertial time scale \(\lambda _m\) proportional to \(m_0 \epsilon ^\mu \), memory time scale \(\lambda _{\kappa } = \tau _\kappa \epsilon ^\theta \) and noise correlation time scale \(\lambda _{\xi } = \tau _{\xi } \epsilon ^{\nu }\), to be defined in the following.

We replace m with \(m_0 \epsilon ^\mu \), \(\varvec{\varGamma }_1\) with \(\varvec{\varGamma }_1/(\tau _{\kappa } \epsilon ^{\theta })\), \(\varvec{M}_1\) with \(\varvec{M}_1/(\tau _{\kappa } \epsilon ^{\theta })\), and \(\varvec{x}_t\) with \(\varvec{x}_t^\epsilon \) in (11). Also, we substitute \(\varvec{\varGamma }_2\) with \(\varvec{\varGamma }_2/(\tau _{\xi } \epsilon ^{\nu })\), \(\varvec{\varSigma }_2\) with \(\varvec{\varSigma }_2/(\tau _{\xi } \epsilon ^{\nu })\), and \(\varvec{\beta }_t\) with \(\varvec{\beta }_t^\epsilon \) in (5). The SIDE (11) then becomes:

$$\begin{aligned} m_0 \epsilon ^{\mu } \ddot{\varvec{x}}^{\epsilon }_{t} = \varvec{F}(\varvec{x}^\epsilon _{t}) - \frac{\varvec{g}(\varvec{x}^\epsilon _{t})}{\tau _{\kappa } \epsilon ^{\theta }} \int _{0}^{t} \varvec{C}_1e^{-\frac{\varvec{\varGamma }_1}{\tau _{\kappa }\epsilon ^{\theta }}(t-s)} \varvec{M}_1 \varvec{C}_1^* \varvec{h}(\varvec{x}^\epsilon _{s}) \dot{\varvec{x}}^{\epsilon }_{s} ds + \varvec{\sigma }(\varvec{x}^\epsilon _{t}) \varvec{C}_2 \varvec{\beta }^\epsilon _{t}, \end{aligned}$$
(23)

with the initial conditions, \(\varvec{x}^\epsilon _0 = \varvec{x}\) and \(\varvec{v}^\epsilon _0 = \varvec{v}\), where \(\varvec{\beta }^\epsilon _t\) is the Ornstein–Uhlenbeck process, with correlation time \(\tau _{\xi } \epsilon ^\nu \), satisfying the SDE:

$$\begin{aligned} d\varvec{\beta }^\epsilon _t = -\frac{\varvec{\varGamma }_2}{\tau _{\xi } \epsilon ^\nu } \varvec{\beta }^\epsilon _t dt + \frac{\varvec{\varSigma }_2}{\tau _{\xi } \epsilon ^\nu } d\varvec{W}^{(q_2)}_t, \end{aligned}$$
(24)

with the initial condition, \(\varvec{\beta }^\epsilon _0\), normally distributed with zero mean and covariance of \(\varvec{M}_2/(\tau _{\xi } \epsilon ^\nu )\).
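The form of this covariance can be read off from the Lyapunov equation (6). Writing \(c := \tau _{\xi } \epsilon ^{\nu }\) for brevity (a shorthand used only here), the rescaled pair \((\varvec{\varGamma }_2/c, \varvec{\varSigma }_2/c)\) and the candidate \(\varvec{M}_2/c\) satisfy

$$\begin{aligned} \frac{\varvec{\varGamma }_2}{c} \frac{\varvec{M}_2}{c} + \frac{\varvec{M}_2}{c} \frac{\varvec{\varGamma }_2^*}{c} = \frac{1}{c^2}\left( \varvec{\varGamma }_2 \varvec{M}_2 + \varvec{M}_2 \varvec{\varGamma }_2^*\right) = \frac{\varvec{\varSigma }_2}{c} \frac{\varvec{\varSigma }_2^*}{c}, \end{aligned}$$

so that, by uniqueness of the solution, \(\varvec{M}_2/c\) is the stationary covariance of \(\varvec{\beta }^\epsilon _t\). Consequently the covariance function of \(\varvec{C}_2 \varvec{\beta }^\epsilon _t\) is \(\frac{1}{c} \varvec{C}_2 e^{-\varvec{\varGamma }_2 |t|/c} \varvec{M}_2 \varvec{C}_2^*\), whose integral over \([0,\infty )\) equals \(\varvec{K}_2\) for every \(\epsilon > 0\).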

We will also perform similar analysis on the two sub-classes of SIDE, in which case:

  1. (i)

    the SIDE (15) becomes (with \(m:=m_0 \epsilon ^\mu \), \(\varvec{A}\) in the formula for \(\varvec{\kappa }_{1}\) replaced by \(\varvec{A}/(\tau _{\kappa } \epsilon ^{\theta })\), \(\varvec{A}\) in (13) replaced by \(\varvec{A}/(\tau _{\eta }\epsilon ^{\nu })\), \(\varvec{x}_t\) replaced by \(\varvec{x}_t^\epsilon \) and \(\varvec{\eta }_{t}\) replaced by \(\varvec{\eta }_{t}^\epsilon \)):

    $$\begin{aligned} m_{0} \epsilon ^{\mu } \ddot{\varvec{x}}^{\epsilon }_{t} = \varvec{F}(\varvec{x}^\epsilon _{t}) - \frac{\varvec{g}(\varvec{x}^\epsilon _{t})}{\tau _{\kappa } \epsilon ^{\theta }} \int _{0}^{t} \varvec{A} e^{-\frac{\varvec{A}}{\tau _{\kappa } \epsilon ^{\theta }} (t-s)} \varvec{h}(\varvec{x}^\epsilon _{s}) \dot{\varvec{x}}^{\epsilon }_{s} ds + \varvec{\sigma }(\varvec{x}^\epsilon _{t}) \varvec{\eta }^\epsilon _{t}, \end{aligned}$$
    (25)

    where \(\varvec{\eta }^\epsilon _{t}\) is the Ornstein–Uhlenbeck process with the correlation time \(\tau _{\eta } \epsilon ^{\nu }\), i.e. it is a process satisfying the SDE:

    $$\begin{aligned} d\varvec{\eta }^\epsilon _{t} = -\frac{\varvec{A}}{\tau _{\eta } \epsilon ^{\nu }} \varvec{\eta }^\epsilon _{t} dt + \frac{\varvec{A}}{\tau _{\eta } \epsilon ^{\nu }} d \varvec{W}^{(d)}_{t}. \end{aligned}$$
    (26)
  2. (ii)

    the SIDE (20) becomes (with \(m:=m_0 \epsilon ^\mu \), \(\tau _{ii} := \tau _{\kappa } \epsilon ^{\theta }\) in the formula for (\(\varvec{\kappa }_{2})_{ii}\) in (16), \(\varvec{x}_t\) replaced by \(\varvec{x}^\epsilon _t\), \(\varvec{h}_{t}\), \(\varvec{u}_{t}\) replaced by \(\varvec{h}^\epsilon _{t}\), \(\varvec{u}^\epsilon _{t}\) respectively and \(\varvec{\tau } := \tau _{h} \epsilon ^{\nu } \varvec{I}\) in (17), (18)):

    $$\begin{aligned}&m_{0} \epsilon ^{\mu } \ddot{\varvec{x}}^\epsilon _{t} = \varvec{F}(\varvec{x}^\epsilon _{t}) - \frac{\varvec{g}(\varvec{x}^\epsilon _{t})}{\tau _{\kappa } \epsilon ^{\theta }} \int _{0}^{t} e^{-\varvec{\varOmega }^2\frac{(t-s)}{2\tau _{\kappa }\epsilon ^{\theta }}}\left[ \cos \left( \frac{\varvec{\varOmega }_{0}}{\tau _{\kappa }\epsilon ^{\theta }}(t-s) \right) \right. \nonumber \\&\qquad \left. + \frac{\varvec{\varOmega }_{1}}{2} \sin \left( \frac{\varvec{\varOmega }_{0}}{\tau _{\kappa }\epsilon ^{\theta }}(t-s) \right) \right] \varvec{h}(\varvec{x}^\epsilon _{s}) \dot{\varvec{x}}^{\epsilon }_{s} ds + \varvec{\sigma }(\varvec{x}^\epsilon _{t}) \varvec{h}^\epsilon _{t}, \end{aligned}$$
    (27)

    where \(\varvec{h}^\epsilon _{t}\) is the harmonic noise process with the correlation time \(\tau _{h} \epsilon ^{\nu }\), i.e. it is a process satisfying the SDE system:

    $$\begin{aligned} d\varvec{h}^\epsilon _{t}&= \frac{1}{\tau _{h}\epsilon ^{\nu }} \varvec{u}^\epsilon _{t} dt, \end{aligned}$$
    (28)
    $$\begin{aligned} d\varvec{u}^\epsilon _{t}&= -\frac{\varvec{\varOmega }^2}{\tau _{h} \epsilon ^{\nu }} \varvec{u}^\epsilon _{t} dt - \frac{\varvec{\varOmega }^2}{\tau _{h} \epsilon ^{\nu }} \varvec{h}^\epsilon _{t} dt + \frac{\varvec{\varOmega }^2}{\tau _{h} \epsilon ^{\nu }} d\varvec{W}^{(d)}_{t}. \end{aligned}$$
    (29)

    Both SIDEs have the initial conditions \(\varvec{x}^\epsilon _{0} = \varvec{x}, \ \dot{\varvec{x}}^\epsilon _{0} = \varvec{v}\). The initial conditions \(\varvec{\eta }^\epsilon _{0}\) (respectively, \(\varvec{h}^\epsilon _{0}\) and \(\varvec{u}^\epsilon _{0}\)) are distributed according to the stationary measure of the SDE that the process \(\varvec{\eta }^\epsilon _{t}\) (respectively, \(\varvec{h}^\epsilon _{t}\) and \(\varvec{u}^\epsilon _{t}\)) satisfies.
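To make the first sub-class concrete, the following sketch integrates a scalar version of (25), (26) numerically. It relies on the observation that, for an exponential kernel, the memory integral \(y_t := \frac{A}{\tau _{\kappa }\epsilon ^{\theta }} \int _0^t e^{-\frac{A}{\tau _{\kappa }\epsilon ^{\theta }}(t-s)} h(x_s) \dot{x}_s ds\) satisfies the differential equation \(\dot{y}_t = \frac{A}{\tau _{\kappa }\epsilon ^{\theta }}\left( h(x_t)\dot{x}_t - y_t\right) \) with \(y_0 = 0\), so that the SIDE can be rewritten as a Markovian system for \((x, v, y, \eta )\). The explicit Euler–Maruyama scheme, the function name `simulate_side`, the choice \(\mu = \theta = \nu = 1\), and the default coefficients are all assumptions made for illustration, not a scheme taken from the analysis in this paper.

```python
import numpy as np

def simulate_side(eps, T=1.0, dt=1e-4, x0=0.0, v0=0.0, m0=1.0, a=1.0,
                  tau_k=1.0, tau_eta=1.0, seed=0,
                  F=lambda x: -x, g=lambda x: 1.0,
                  h=lambda x: 1.0, sigma=lambda x: 1.0):
    """Euler-Maruyama sketch for the scalar SIDE (25)-(26), mu=theta=nu=1."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    m = m0 * eps                 # rescaled mass
    rk = a / (tau_k * eps)       # inverse memory time scale
    rn = a / (tau_eta * eps)     # inverse noise correlation time scale
    x, v, y = x0, v0, 0.0        # y is the memory integral, y_0 = 0
    eta = rng.normal(0.0, np.sqrt(rn / 2.0))  # stationary law of (26)
    xs = np.empty(n + 1)
    xs[0] = x
    for i in range(n):
        dW = rng.normal(0.0, np.sqrt(dt))
        x_new = x + v * dt
        v_new = v + (F(x) - g(x) * y + sigma(x) * eta) * dt / m
        y_new = y + rk * (h(x) * v - y) * dt
        eta_new = eta - rn * eta * dt + rn * dW
        x, v, y, eta = x_new, v_new, y_new, eta_new
        xs[i + 1] = x
    return xs

# Example: one sample path of x on [0, 0.2] for eps = 0.1.
path = simulate_side(0.1, T=0.2)
```

Since the scheme is explicit, the step `dt` has to be small compared with \(\epsilon \) for the iteration to remain stable; smaller values of \(\epsilon \) call for a smaller step or an implicit scheme.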

In this paper we set \(\mu = \theta = \nu \), which is the case when all the characteristic time scales are of comparable magnitude in the limit as \(\epsilon \rightarrow 0\). Our main goal is to derive a limiting equation for the (slow) \(\varvec{x}^\epsilon \)-component of the process solving Eqs. (23), (24), including the special cases (25), (26) and (27)–(29), in the limit as \(\epsilon \rightarrow 0\), in a strong pathwise sense.

We explain the motivation behind the above rescalings. The rescaling of the memory kernels \(\varvec{\kappa }(t-s)\), \(\varvec{\kappa }_{1}(t-s)\), \(\varvec{\kappa }_{2}(t-s)\) is such that in the limit \(\epsilon \rightarrow 0\) the rescaled memory kernels converge to \(\varvec{K}_1 \delta (t) \) formally, where \(\delta (t)\) is the Dirac-delta function and \(\varvec{K}_1\) is the effective damping constant defined in (9). On the other hand, the noise processes \(\varvec{\xi }^\epsilon _t = \varvec{C}_2 \varvec{\beta }^\epsilon _{t}\), \(\varvec{\eta }^\epsilon _{t}\) and \(\varvec{h}^\epsilon _{t}\) converge to white noise processes in the limit \(\epsilon \rightarrow 0\).
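The concentration of the rescaled kernel can be seen in the simplest scalar case, \(\kappa ^\epsilon (t) = \frac{a}{\tau _{\kappa }\epsilon ^{\theta }} e^{-\frac{a}{\tau _{\kappa }\epsilon ^{\theta }} t}\) for \(t \ge 0\), for which the effective damping constant equals 1. The sketch below (with hypothetical function names and illustrative parameter values) computes the portion of the total integral carried by a short interval \([0, \delta ]\), both in closed form and by the trapezoidal rule.

```python
import numpy as np

def kernel_mass(eps, delta, a=2.0, tau_k=1.0, theta=1.0):
    """Closed-form integral over [0, delta] of the rescaled scalar kernel."""
    rate = a / (tau_k * eps**theta)
    return 1.0 - np.exp(-rate * delta)

def kernel_mass_numeric(eps, delta, a=2.0, tau_k=1.0, theta=1.0, n=200001):
    """Same integral by the composite trapezoidal rule, as a cross-check."""
    rate = a / (tau_k * eps**theta)
    t = np.linspace(0.0, delta, n)
    f = rate * np.exp(-rate * t)
    step = t[1] - t[0]
    return step * (f.sum() - 0.5 * (f[0] + f[-1]))

# Fraction of the total integral located in [0, 0.01] for decreasing eps.
fractions = [kernel_mass(eps, 0.01) for eps in (1.0, 0.1, 0.01, 0.001)]
```

The total integral over \([0,\infty )\) stays equal to 1 for every \(\epsilon \), while the entries of `fractions` increase towards 1 as \(\epsilon \) decreases, which is the sense in which the rescaled kernel approaches a multiple of the delta function.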

3 Smoluchowski–Kramers Limit of SDE’s Revisited

Let \((\varvec{x}_{t}^m, \varvec{v}^m_{t}) \in \mathbb {R}^{n} \times \mathbb {R}^n\), where \(t \in [0,T]\), be a family of solutions (parametrized by a positive constant m) to the following SDEs:

$$\begin{aligned} d\varvec{x}^m_{t}&= \varvec{v}^m_{t} dt, \end{aligned}$$
(30)
$$\begin{aligned} m d\varvec{v}^m_{t}&= \varvec{F}(\varvec{x}^m_{t}) dt -\varvec{\gamma }(\varvec{x}^m_{t}) \varvec{v}^m_{t} dt + \varvec{\sigma }(\varvec{x}^m_{t}) d\varvec{W}^{(k)}_{t}. \end{aligned}$$
(31)

In the SDEs above, \(m > 0\) is the mass of the particle, \(\varvec{F}: \mathbb {R}^{n} \rightarrow \mathbb {R}^{n}\), \(\varvec{\gamma }: \mathbb {R}^{n} \rightarrow \mathbb {R}^{n \times n}\), \(\varvec{\sigma }: \mathbb {R}^{n} \rightarrow \mathbb {R}^{n \times k}\), and \(\varvec{W}^{(k)}\) is a k-dimensional Wiener process on a filtered probability space \((\varOmega , \mathcal {F}, \mathcal {F}_{t}, \mathbb {P})\) satisfying the usual conditions [33]. The initial conditions are given by \(\varvec{x}^m_{0} = \varvec{x}, \ \varvec{v}^m_{0} = \varvec{v}^m\). The above SDE system models diffusive phenomena in cases where the damping coefficient \(\varvec{\gamma }\) and diffusion coefficient \(\varvec{\sigma }\) are state-dependent.

The Smoluchowski–Kramers limit (or the small mass limit) of the system (30), (31) has been studied in [8, 34,35,36,37]. The main result in [37] says that, under certain assumptions, the \(\varvec{x}^m\)-component of the solution to (30), (31) converges (in a strong pathwise sense), as \(m \rightarrow 0\), to the solution of a homogenized SDE that contains in particular the so-called noise-induced drift, not present in the pre-limit SDEs (see Theorem 1 for a precise statement). The presence of such noise-induced drift in the homogenized equation is a consequence of the state-dependence of the damping coefficient (and therefore also the diffusion coefficient if the system satisfies a fluctuation–dissipation relation). For an overview of the noise-induced drift phenomena, we refer to the review article [38].
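In the simplest setting the small mass limit can be observed directly. The sketch below is only a sanity check under strong simplifying assumptions that are ours: one dimension, no noise (\(\sigma = 0\)), constant damping \(\gamma \), and a linear force \(F(x) = -kx\). In this case no noise-induced drift arises and the limiting equation is \(\gamma \dot{x} = -kx\), with solution \(x_0 e^{-kt/\gamma }\). The function name and parameter values are hypothetical.

```python
import numpy as np

def small_mass_path(m, T=1.0, dt=1e-4, x0=1.0, v0=0.0, k=1.0, gamma=1.0):
    """Explicit Euler for m x'' = -k x - gamma x' (noise switched off)."""
    n = int(round(T / dt))
    x, v = x0, v0
    for _ in range(n):
        # Both updates use the values of x and v from the previous step.
        x, v = x + v * dt, v + (-k * x - gamma * v) * dt / m
    return x

x_small_mass = small_mass_path(m=0.01)   # position at T = 1 for small mass
x_large_mass = small_mass_path(m=1.0)    # position at T = 1 for m = 1
x_limit = 1.0 * np.exp(-1.0)             # solution of gamma x' = -k x at T = 1
```

With these values `x_small_mass` lies close to `x_limit`, whereas `x_large_mass` does not. The state-dependent, noisy case, in which the additional drift term appears, is the subject of Theorem 1.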

In all the works mentioned previously, the spectral assumption made on the matrix \(\varvec{\gamma }\) was that the symmetrized damping matrix \(\frac{1}{2}(\varvec{\gamma } + \varvec{\gamma }^{*})\) is uniformly positive definite (i.e. its smallest eigenvalue is bounded below by a positive constant, uniformly in \(\varvec{x}\)). The same results can be obtained under the weaker assumption that the matrix \(\varvec{\gamma }\) is uniformly positive stable, i.e. the real parts of all eigenvalues of \(\varvec{\gamma }\) are bounded below by a positive constant, uniformly in \(\varvec{x}\) [39].
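To see that positive stability is genuinely weaker than positive definiteness of the symmetrized matrix, the following minimal sketch may help. The constant matrix used here is a hypothetical example chosen for illustration only (it is not taken from the models of this paper), and the helper `eig2x2` is likewise introduced just for this check.

```python
import cmath

def eig2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from its trace and determinant."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2

# Hypothetical damping matrix gamma = [[1, 4], [0, 1]] (illustration only).
gamma = (1.0, 4.0, 0.0, 1.0)

# Symmetrized matrix (gamma + gamma^T) / 2.
a, b, c, d = gamma
sym = (a, (b + c) / 2, (b + c) / 2, d)

gamma_eigs = eig2x2(*gamma)
sym_eigs = eig2x2(*sym)

# Positive stable: all eigenvalues of gamma have positive real part.
positive_stable = all(z.real > 0 for z in gamma_eigs)
# Symmetrized part positive definite: all of its eigenvalues are positive.
sym_positive_definite = all(z.real > 0 for z in sym_eigs)

print("eigenvalues of gamma:", gamma_eigs)
print("eigenvalues of symmetrized gamma:", sym_eigs)
print("positive stable:", positive_stable)
print("symmetrized part positive definite:", sym_positive_definite)
```

For this matrix both eigenvalues of \(\varvec{\gamma }\) equal 1, while the symmetrized matrix has a negative eigenvalue, so only the weaker assumption is satisfied.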

Notation Here and in the following, we use Einstein summation convention on repeated indices. The Euclidean norm of a vector \(\varvec{w}\) is denoted by \(| \varvec{w} |\) and the (induced operator) norm of a matrix \(\varvec{A}\) by \(\Vert \varvec{A} \Vert \). For an \(\mathbb {R}^{n_2 \times n_3}\)-valued function \(\varvec{f}(\varvec{y}):=([f]_{jk}(\varvec{y}))_{j=1,\dots ,n_2; k=1,\dots , n_3}\), \(\varvec{y} := ([y]_1, \dots , [y]_{n_1}) \in \mathbb {R}^{n_1}\), we denote by \((\varvec{f})_{\varvec{y}}(\varvec{y})\) the \(n_1 n_2 \times n_3\) matrix:

$$\begin{aligned} (\varvec{f})_{\varvec{y}}(\varvec{y}) = (\varvec{\nabla }_{\varvec{y}}[f]_{jk}(\varvec{y}))_{j=1,\dots , n_2; k=1,\dots ,n_3}, \end{aligned}$$
(32)

where \(\varvec{\nabla }_{\varvec{y}}[f]_{jk}(\varvec{y})\) denotes the gradient vector \((\frac{\partial [f]_{jk}(\varvec{y})}{\partial [y]_1}, \dots , \frac{\partial [f]_{jk}(\varvec{y})}{\partial [y]_{n_1}}) \in \mathbb {R}^{n_1}\) for every jk. The symbol \(\mathbb {E}\) denotes expectation with respect to \(\mathbb {P}\).

We make the following assumptions.

Assumption 1

For every \(\varvec{x} \in \mathbb {R}^n\), the functions \(\varvec{F}(\varvec{x})\) and \(\varvec{\sigma }(\varvec{x})\) are continuous, bounded and Lipschitz in \(\varvec{x}\), whereas the functions \(\varvec{\gamma }(\varvec{x})\) and \((\varvec{\gamma })_{\varvec{x}}(\varvec{x})\) are continuously differentiable, bounded and Lipschitz in \(\varvec{x}\). Moreover, \((\varvec{\gamma })_{\varvec{x} \varvec{x}}(\varvec{x})\) is bounded for every \(\varvec{x} \in \mathbb {R}^n\).

Assumption 2

The matrix \(\varvec{\gamma }\) is uniformly positive stable, i.e. all real parts of the eigenvalues of \(\varvec{\gamma }(\varvec{x})\) are bounded below by some constant \(2\kappa > 0\), uniformly in \(\varvec{x}\in \mathbb {R}^n\).

Assumption 3

The initial condition \(\varvec{x}^m_0 = \varvec{x}\) is a random variable independent of m and has finite moments of all orders, i.e. \(\mathbb {E} |\varvec{x}|^{p} < \infty \) for all \(p > 0\). The initial condition \(\varvec{v}^m_0 = \varvec{v}^m\) is a random variable that possibly depends on m and we assume that for every \(p>0\), \(\mathbb {E} |m \varvec{v}^m|^p = O(m^\alpha )\) as \(m \rightarrow 0\), for some \(\alpha \ge p/2\).

Assumption 4

The global solutions, defined on [0, T], to the pre-limit SDEs (30), (31) and to the limiting SDE (33) a.s. exist and are unique for all \(m > 0\) (i.e. there are no explosions).

We now state the result.

Theorem 1

Suppose that the SDE system (30), (31) satisfies Assumptions 1–4. Let \((\varvec{x}^m_{t}, \varvec{v}^m_{t}) \in \mathbb {R}^{n} \times \mathbb {R}^{n}\) be its solution, with the initial condition \((\varvec{x}, \varvec{v}^m)\). Let \(\varvec{X}_{t} \in \mathbb {R}^{n}\) be the solution to the following Itô SDE with initial position \(\varvec{X}_{0} = \varvec{x}\):

$$\begin{aligned} d\varvec{X}_{t} = [\varvec{\gamma }^{-1}(\varvec{X}_{t}) \varvec{F}(\varvec{X}_{t}) + \varvec{S}(\varvec{X}_{t})] dt + \varvec{\gamma }^{-1}(\varvec{X}_{t}) \varvec{\sigma }(\varvec{X}_{t}) d \varvec{W}^{(k)}_{t}, \end{aligned}$$
(33)

where \(\varvec{S}(\varvec{X}_{t})\) is the noise-induced drift whose ith component is given by

$$\begin{aligned} S_{i}(\varvec{X}) = \frac{\partial }{\partial X_{l}}[(\gamma ^{-1})_{ij}(\varvec{X})] J_{jl}(\varvec{X}), \ \ i,j,l = 1, \dots , n, \end{aligned}$$
(34)

and \(\varvec{J}\) is the unique matrix solving the Lyapunov equation

$$\begin{aligned} \varvec{J} \varvec{\gamma }^{*} + \varvec{\gamma } \varvec{J} = \varvec{\sigma } \varvec{\sigma }^{*}. \end{aligned}$$
(35)

Then the process \(\varvec{x}^m_{t}\) converges, as \(m \rightarrow 0\), to the solution \(\varvec{X}_{t}\), of the Itô SDE (33), in the following sense: for all finite \(T>0\),

$$\begin{aligned} \sup _{t \in [0,T]} |\varvec{x}_t^m - \varvec{X}_t| \rightarrow 0 \end{aligned}$$
(36)

in probability, in the limit as \(m \rightarrow 0\).
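
To illustrate the statement, consider the scalar case \(n = k = 1\) with \(\gamma (x) > 0\); this worked example is a sketch under that simplifying assumption and is not part of the theorem. The Lyapunov equation (35) reduces to \(2 \gamma J = \sigma ^2\), so that

$$\begin{aligned} J = \frac{\sigma ^2}{2\gamma }, \qquad S(X) = \frac{d}{dX}\left[ \frac{1}{\gamma (X)}\right] \frac{\sigma ^2(X)}{2 \gamma (X)} = -\frac{\gamma '(X) \sigma ^2(X)}{2 \gamma ^3(X)}, \end{aligned}$$

and the limiting SDE (33) becomes

$$\begin{aligned} dX_t = \left[ \frac{F(X_t)}{\gamma (X_t)} - \frac{\gamma '(X_t) \sigma ^2(X_t)}{2 \gamma ^3(X_t)}\right] dt + \frac{\sigma (X_t)}{\gamma (X_t)} dW_t. \end{aligned}$$

In particular, the noise-induced drift vanishes when \(\gamma \) is constant.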

We end this section with a few remarks concerning the statements in Theorem 1.

Remark 5

  1. (i)

    Because of the relaxed spectral assumption on \(\varvec{\gamma }\), a new idea has to be used to prove decay estimates for solutions of the velocity equation. Once this is done, Theorem 1 can be proven using the technique of [37] (note that Assumption 1 is essentially the same as the assumptions in Appendix A of [37], specialized to the present case). In Appendix B we give a sketch of the proof of Theorem 1, pointing out the necessary modifications. The reader is referred to [37] for more details.

  2. (ii)

    Our assumption on the initial variable \(\varvec{v}_0^m\) implies that the initial average kinetic energy, \(K(\varvec{v}^m) := \mathbb {E} \frac{1}{2} m |\varvec{v}^m|^2\), does not blow up (but can possibly vanish) as \(m \rightarrow 0\). This is analogous to Assumption 2.4 in [37] and is more general than the corresponding assumption in [8].

  3. (iii)

    With slightly more work and additional assumptions, one could prove the statement in Assumption 4 from Assumptions 1–3. See Appendix C in [37] for a result along these lines. However, such an existence and uniqueness result is not the focus of this paper and, therefore, we choose to take existence and uniqueness for granted in Assumption 4.

  4. (iv)

    We make no claim that Assumptions 2–4 are as weak or as general as possible. In particular, the boundedness assumption on the coefficients of the SDEs could be relaxed (for instance, using the techniques in [35]) and the initial condition \(\varvec{x}\) could have some dependence on m (see, for instance [37]) at the cost of more technicalities.

    The main focus of our revisit here is to point out that the result in [8] still holds with a relaxed spectral assumption on the matrix \(\varvec{\gamma }\) and with the initial condition \(\varvec{v}_0^m\) possibly dependent on m—this will be important for applications in later sections (see also Remark 11).

4 Homogenization for Generalized Langevin Dynamics

In this section, we study homogenization for the system of equations (23), (24) (with \(\mu = \theta = \nu \)) by taking the limit as \(\epsilon \rightarrow 0\), under appropriate assumptions.

Without loss of generality, we set \(\mu = \theta = \nu = 1\). We cast (23), (24) as the system of SDEs for the Markov process \((\varvec{x}^\epsilon _{t}, \varvec{v}^\epsilon _{t}, \varvec{z}^\epsilon _{t}, \varvec{y}^\epsilon _{t}, \varvec{\zeta }^\epsilon _{t}, \varvec{\beta }^\epsilon _{t})\) on the state space \(\mathbb {R}^{d}\times \mathbb {R}^d \times \mathbb {R}^{d_1} \times \mathbb {R}^{d_1} \times \mathbb {R}^{d_2} \times \mathbb {R}^{d_2}\):

$$\begin{aligned} d\varvec{x}^\epsilon _{t}&= \varvec{v}^\epsilon _{t} dt, \end{aligned}$$
(37)
$$\begin{aligned} m_{0} \epsilon d\varvec{v}^\epsilon _{t}&= - \varvec{g}(\varvec{x}^\epsilon _{t}) \varvec{C}_1 \varvec{y}^\epsilon _{t} dt + \varvec{\sigma }(\varvec{x}^\epsilon _{t}) \varvec{C}_2 \varvec{\beta }^\epsilon _{t} dt +\varvec{F}(\varvec{x}^\epsilon _{t})dt, \end{aligned}$$
(38)
$$\begin{aligned} d \varvec{z}^\epsilon _{t}&= \varvec{y}^\epsilon _{t} dt, \end{aligned}$$
(39)
$$\begin{aligned} \tau _{\kappa } \epsilon d\varvec{y}^\epsilon _{t}&= -\varvec{\varGamma }_1 \varvec{y}^\epsilon _{t} dt + \varvec{M}_1 \varvec{C}_1^* \varvec{h}(\varvec{x}^\epsilon _{t}) \varvec{v}^\epsilon _{t} dt, \end{aligned}$$
(40)
$$\begin{aligned} d\varvec{\zeta }^\epsilon _{t}&= \varvec{\beta }^\epsilon _{t} dt, \end{aligned}$$
(41)
$$\begin{aligned} \tau _{\xi } \epsilon d\varvec{\beta }^\epsilon _{t}&= -\varvec{\varGamma }_2 \varvec{\beta }^\epsilon _{t} dt + \varvec{\varSigma }_2 d\varvec{W}^{(q_2)}_{t}, \end{aligned}$$
(42)

where we have defined the auxiliary process

$$\begin{aligned} \varvec{y}^\epsilon _{t} := \frac{1}{\tau _{\kappa } \epsilon } \int _{0}^{t} e^{-\frac{\varvec{\varGamma }_1}{\tau _{\kappa } \epsilon }(t-s)} \varvec{M}_1 \varvec{C}_1^* \varvec{h}(\varvec{x}^\epsilon _{s}) \varvec{v}^\epsilon _{s} ds. \end{aligned}$$
(43)

Here, the initial conditions \(\varvec{x}^\epsilon _0 = \varvec{x}\), \(\varvec{v}^\epsilon _0 = \varvec{v}\), \(\varvec{z}^\epsilon _0 = \varvec{z}\) and \(\varvec{\zeta }^\epsilon _0 = \varvec{\zeta }\) are random variables. Note that \(\varvec{y}^\epsilon _0 = \varvec{0}\), and \(\varvec{\beta }^\epsilon _0\) is a zero mean Gaussian random variable with covariance \(\varvec{M}_2/\tau _\xi \epsilon \).

Let \(\varvec{W}^{(q_2)}_{t}\) be an \(\mathbb {R}^{q_2}\)-valued Wiener process on a filtered probability space \((\varOmega , \mathcal {F}, \mathcal {F}_{t}, \mathbb {P})\) satisfying the usual conditions [33], and let \(\mathbb {E}\) denote expectation with respect to \(\mathbb {P}\).

We use the notation introduced in Sect. 3 and make the following assumptions.

Assumption 5

For every \(\varvec{x} \in \mathbb {R}^{d}\), the vector-valued function \(\varvec{F}(\varvec{x})\) is continuous, bounded and Lipschitz in \(\varvec{x}\), whereas the matrix-valued functions \(\varvec{g}(\varvec{x})\), \(\varvec{h}(\varvec{x})\), \(\varvec{\sigma }(\varvec{x})\), \((\varvec{g})_{\varvec{x}}(\varvec{x})\), \((\varvec{h})_{\varvec{x}}(\varvec{x})\) and \((\varvec{\sigma })_{\varvec{x}}(\varvec{x})\) are continuously differentiable, bounded and Lipschitz in \(\varvec{x}\). Moreover, \((\varvec{g})_{\varvec{x}\varvec{x}}(\varvec{x})\), \((\varvec{h})_{\varvec{x}\varvec{x}}(\varvec{x})\) and \((\varvec{\sigma })_{\varvec{x}\varvec{x}}(\varvec{x})\) are bounded for every \(\varvec{x} \in \mathbb {R}^d\).

Assumption 6

The initial conditions \(\varvec{x}\), \(\varvec{v}\), \(\varvec{z}\), \(\varvec{\zeta }\) are random variables independent of \(\epsilon \). We assume that they have finite moments of all orders, i.e. \(\mathbb {E}|\varvec{x}|^{p}, \ \mathbb {E}|\varvec{v}|^{p}, \ \mathbb {E}|\varvec{z}|^p, \ \mathbb {E}|\varvec{\zeta }|^p < \infty \) for all \(p>0\).

Assumption 7

There are no explosions, i.e. almost surely, for any value of the parameter \(\epsilon \) there exists a unique solution on the compact time interval [0, T] to the pre-limit Eqs. (23), (24), and also to the limiting Eq. (45).

The following convergence theorem is the main result of this paper. It provides a homogenized SDE for the particle’s position in the limit as the inertial time scale, the memory time scale and the noise correlation time scale go to zero at the same rate in the case when the pre-limit dynamics are described by the family of Eqs. (23), (24) (with \(\mu = \theta = \nu = 1\)), or equivalently by the SDEs (37)–(42). In the following, \((\varvec{D})_{ij}\) denotes the (ij)-entry of the matrix \(\varvec{D}\).

Theorem 2

Let \(\varvec{x}^\epsilon _{t} \in \mathbb {R}^{d}\) be the solution to the SDEs (37)–(42). Suppose that Assumptions 5–7 are satisfied and the effective damping and diffusion (constant) matrices, \(\varvec{K}_1\), \(\varvec{K}_2\), defined in (9) and (10) respectively, are invertible. Moreover, we assume that for every \(\varvec{x} \in \mathbb {R}^d\),

$$\begin{aligned} \varvec{B}_\lambda (\varvec{x}) := \varvec{I} + \varvec{g}(\varvec{x}) \tilde{\varvec{\kappa }}(\lambda \tau _{\kappa }) \varvec{h}(\varvec{x})/\lambda m_0 \end{aligned}$$
(44)

is invertible for all \(\lambda \) in the right half plane \(\{\lambda \in \mathbb {C}: Re(\lambda )>0\}\), where \(\tilde{\varvec{\kappa }}(z) := \varvec{C}_1(z\varvec{I} + \varvec{\varGamma }_1)^{-1}\varvec{M}_1 \varvec{C}_1^*\), for \(z \in \mathbb {C}\), is the Laplace transform of the memory function.

Denote \(\varvec{\theta }(\varvec{X}) := \varvec{g}(\varvec{X})\varvec{K}_1 \varvec{h}(\varvec{X}) \in \mathbb {R}^{d \times d}\) for \(\varvec{X} \in \mathbb {R}^d\). Then as \(\epsilon \rightarrow 0\), the process \(\varvec{x}^\epsilon _{t}\) converges to the solution, \(\varvec{X}_{t}\), of the following Itô SDE:

$$\begin{aligned} d\varvec{X}_{t} = \varvec{S}(\varvec{X}_{t}) dt + \varvec{\theta }^{-1}(\varvec{X}_t) \varvec{F}(\varvec{X}_{t}) dt + \varvec{\theta }^{-1}(\varvec{X}_t) \varvec{\sigma }(\varvec{X}_{t}) \varvec{C}_2 \varvec{\varGamma }_2^{-1} \varvec{\varSigma }_2 d\varvec{W}^{(q_2)}_{t}, \end{aligned}$$
(45)

with \(\varvec{S}(\varvec{X}_{t}) = \varvec{S}^{(1)}(\varvec{X}_{t}) + \varvec{S}^{(2)}(\varvec{X}_{t}) + \varvec{S}^{(3)}(\varvec{X}_{t}),\) where the \(\varvec{S}^{(k)}\) are the noise-induced drifts whose ith components are given by

$$\begin{aligned} S^{(1)}_{i}&= m_{0} \frac{\partial }{\partial X_{l}}\left[ (\varvec{\theta }^{-1})_{ij}(\varvec{X})\right] (\varvec{J}_{11})_{jl}(\varvec{X}), \ \ i,j,l = 1, \dots , d, \end{aligned}$$
(46)
$$\begin{aligned} S^{(2)}_{i}&= -\tau _{\kappa } \frac{\partial }{\partial X_{l}}\left[ (\varvec{\theta }^{-1} \varvec{g})_{ij}(\varvec{X})\right] (\varvec{C}_1 \varvec{\varGamma }_1^{-1} \varvec{J}_{21})_{jl}(\varvec{X}), \ \ i,l = 1, \dots , d; \ j = 1,\dots ,q, \end{aligned}$$
(47)
$$\begin{aligned} S^{(3)}_{i}&= \tau _{\xi } \frac{\partial }{\partial X_{l}}\left[ (\varvec{\theta }^{-1}\varvec{\sigma } )_{ij}(\varvec{X}) \right] (\varvec{C}_2 \varvec{\varGamma }_2^{-1} \varvec{J}_{31})_{jl}(\varvec{X}), \ \ i,l = 1, \dots , d; \ j=1,\dots ,r. \end{aligned}$$
(48)

Here \(\varvec{J}_{11} = \varvec{J}_{11}^* \in \mathbb {R}^{d\times d}\), \(\varvec{J}_{21}=\varvec{J}_{12}^* \in \mathbb {R}^{d_1 \times d}\) and \(\varvec{J}_{31} = \varvec{J}_{13}^* \in \mathbb {R}^{d_2 \times d}\) satisfy the following system of five matrix equations:

$$\begin{aligned} \varvec{g} \varvec{C}_1 \varvec{J}_{12}^* + \varvec{J}_{12} \varvec{C}_1^* \varvec{g}^*&= \varvec{\sigma } \varvec{C}_2 \varvec{J}_{13}^* + \varvec{J}_{13} \varvec{C}_2^* \varvec{\sigma }^*, \end{aligned}$$
(49)
$$\begin{aligned} m_0 \varvec{J}_{11} \varvec{h}^* \varvec{C}_1 \varvec{M}_1 + \tau _{\kappa } \varvec{\sigma } \varvec{C}_2 \varvec{J}_{23}^*&= \tau _{\kappa } \varvec{g} \varvec{C}_1 \varvec{J}_{22} + m_0 \varvec{J}_{12} \varvec{\varGamma }_1^*,\end{aligned}$$
(50)
$$\begin{aligned} \tau _{\xi } \varvec{g} \varvec{C}_1 \varvec{J}_{23} + m_0 \varvec{J}_{13} \varvec{\varGamma }_2^*&= \varvec{\sigma } \varvec{C}_2 \varvec{M}_2, \end{aligned}$$
(51)
$$\begin{aligned} \varvec{M}_1 \varvec{C}_1^* \varvec{h} \varvec{J}_{12} + \varvec{J}_{12}^* \varvec{h}^* \varvec{C}_1 \varvec{M}_1&= \varvec{\varGamma }_1 \varvec{J}_{22} + \varvec{J}_{22} \varvec{\varGamma }_1^*, \end{aligned}$$
(52)
$$\begin{aligned} \tau _{\xi } \varvec{M}_1 \varvec{C}_1^* \varvec{h} \varvec{J}_{13}&= \tau _{\xi } \varvec{\varGamma }_1 \varvec{J}_{23} + \tau _{\kappa } \varvec{J}_{23} \varvec{\varGamma }_2^*. \end{aligned}$$
(53)

The convergence is obtained in the following sense: for all finite \(T>0\),

$$\begin{aligned} \sup _{t \in [0,T]}|\varvec{x}_t^\epsilon - \varvec{X}_t| \rightarrow 0 \end{aligned}$$
(54)

in probability, in the limit as \(\epsilon \rightarrow 0\).

Remark 6

Invertibility of the matrices \(\varvec{B}_\lambda (\varvec{x})\) is a technical condition which will be used in the proof of the theorem. We are going to verify it in the special cases and applications discussed later (see Corollaries 3 and 5). Note that \(\varvec{B}_\lambda (\varvec{x})\) is singular precisely when \(-\lambda m_0\) is an eigenvalue of \(\varvec{g}(\varvec{x})\tilde{\varvec{\kappa }}(\lambda \tau _{\kappa })\varvec{h}(\varvec{x})\). In particular, invertibility follows from the stronger spectral condition, namely that \(\varvec{g}(\varvec{x})\tilde{\varvec{\kappa }}(\mu )\varvec{h}(\varvec{x})\) has no spectrum in the open left half plane for any \(\mu \) with \(Re(\mu ) > 0\). See also Remark 8.

Remark 7

It is straightforward to extend Theorems 1 and 2 to the case when the coefficients in the pre-limit equations depend explicitly on time, i.e. when \(\varvec{\gamma }=\varvec{\gamma }(\varvec{x}_t^m, t)\), \(\varvec{\sigma }=\varvec{\sigma }(\varvec{x}_t^m,t)\) and \(\varvec{F}=\varvec{F}(\varvec{x}_t^m,t)\) in (30), (31), as well as when \(\varvec{g}=\varvec{g}(\varvec{x}_t^\epsilon , t)\), \(\varvec{h}=\varvec{h}(\varvec{x}_t^\epsilon , t)\), \(\varvec{\sigma }=\varvec{\sigma }(\varvec{x}_t^\epsilon ,t)\) and \(\varvec{F} = \varvec{F}(\varvec{x}_t^\epsilon ,t)\) in (37)–(42). In this case, we expect similar results to hold under additional assumptions, analogous to those studied in [37].

Proof

We denote \(\varvec{\hat{x}}^\epsilon _{t} := (\varvec{x}^\epsilon _{t}, \varvec{z}^\epsilon _{t}, \varvec{\zeta }^\epsilon _{t}) \in \mathbb {R}^{d+d_1+d_2}\) and \(\varvec{\hat{v}}^\epsilon _{t} := (\varvec{v}^\epsilon _{t}, \varvec{y}^\epsilon _{t}, \varvec{\beta }^\epsilon _{t}) \in \mathbb {R}^{d+d_1+d_2} \) and rewrite the SDE system (37)–(42) in the form (30), (31):

$$\begin{aligned} d\varvec{\hat{x}}^{\epsilon }_{t}&= \varvec{\hat{v}}^{\epsilon }_{t} dt , \end{aligned}$$
(55)
$$\begin{aligned} \epsilon d\varvec{\hat{v}}^\epsilon _{t}&= - \varvec{\hat{\gamma }}(\varvec{x}^\epsilon _{t}) \hat{\varvec{v}}^\epsilon _{t} dt + \varvec{\hat{F}}(\varvec{x}^\epsilon _{t})dt + \varvec{\hat{\sigma }} d\varvec{W}^{(q_2)}_{t}, \end{aligned}$$
(56)

with

$$\begin{aligned} \varvec{\hat{\gamma }}(\varvec{x}^\epsilon _{t}) = \left[ \begin{array}{ccc} \varvec{0} &{} \frac{\varvec{g}(\varvec{x}^\epsilon _{t}) \varvec{C}_1}{m_{0}} &{} -\frac{\varvec{\sigma }(\varvec{x}^\epsilon _{t}) \varvec{C}_2}{m_{0}} \\ -\frac{\varvec{M}_1 \varvec{C}_1^* \varvec{h}(\varvec{x}^\epsilon _{t}) }{\tau _{\kappa }} &{} \frac{\varvec{\varGamma }_1}{\tau _{\kappa }} &{} \varvec{0} \\ \varvec{0} &{} \varvec{0} &{} \frac{\varvec{\varGamma }_2}{\tau _{\xi }} \end{array} \right] , \ \ \ \varvec{\hat{F}}(\varvec{x}^\epsilon _{t}) = \begin{bmatrix} \frac{\varvec{F}(\varvec{x}^\epsilon _{t})}{m_{0}} \\ \varvec{0} \\ \varvec{0} \\ \end{bmatrix}, \ \ \varvec{\hat{\sigma }}= \left[ \begin{array}{c} \varvec{0} \\ \varvec{0} \\ \frac{\varvec{\varSigma }_2}{\tau _{\xi }} \end{array} \right] , \end{aligned}$$
(57)

where \(\varvec{\hat{\gamma }} \in \mathbb {R}^{(d+d_1+d_2) \times (d+d_1+d_2)}\) is a 3 by 3 block matrix with each block a matrix of appropriate dimension, \(\varvec{\hat{F}} \in \mathbb {R}^{d+d_1+d_2}\), \(\varvec{\hat{\sigma }} \in \mathbb {R}^{(d+d_1+d_2)\times q_2}\) and the \(\varvec{0}\) appearing in \(\varvec{\hat{\gamma }}\), \(\varvec{\hat{F}}\) and \(\varvec{\hat{\sigma }}\) is a zero vector or matrix of appropriate dimension.

We now want to apply Theorem 1 (with \(m:=\epsilon \), \(n:=d+d_1+d_2\), \(\varvec{\gamma }\) replaced by \(\varvec{\hat{\gamma }}\), \(\varvec{F}\) replaced by \(\varvec{\hat{F}}\), \(\varvec{\sigma }\) replaced by \(\varvec{\hat{\sigma }}\), etc.) to (55), (56).

It is straightforward to see that Assumption 5 implies Assumption 1 and that Assumption 7 implies Assumption 4.

To verify Assumption 3, note again that \(\varvec{y}^\epsilon _0 = \varvec{0}\) and so by Assumption 6, we only need to show that for every \(p>0\), \(\mathbb {E}|\epsilon \varvec{\beta }^\epsilon _0|^p = O(\epsilon ^\alpha )\) as \(\epsilon \rightarrow 0\), for some constant \(\alpha \ge p/2\). To show this, we use the fact that for a mean zero Gaussian random variable, \(X \in \mathbb {R}\), with variance \(\sigma ^2\),

$$\begin{aligned} \mathbb {E} |X|^p = \sigma ^p \frac{2^{p/2} \varGamma \left( \frac{p+1}{2}\right) }{\sqrt{\pi }}, \end{aligned}$$
(58)

for every \(p>0\), where \(\varGamma \) denotes the gamma function [40]. Applying this to \(\varvec{\beta }^\epsilon _0\), we obtain, for every \(p>0\), \(\mathbb {E} |\varvec{\beta }_0^\epsilon |^p = O(1/\epsilon ^{p/2})\) as \(\epsilon \rightarrow 0\) and so \(\mathbb {E}|\epsilon \varvec{\beta }_0^\epsilon |^p = O(\epsilon ^{p/2})\) as \(\epsilon \rightarrow 0\). Therefore, Assumption 3 is verified.
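The scaling argument above can be checked numerically. The sketch below is an illustration only: it treats a scalar \(\beta ^\epsilon _0\) with variance \(c/\epsilon \), where the constant `c` (standing in for \(M_2/\tau _\xi \)) and the function names are assumptions made for this example. It evaluates formula (58) and confirms that \(\mathbb {E}|\epsilon \beta ^\epsilon _0|^p / \epsilon ^{p/2}\) does not depend on \(\epsilon \).

```python
import math

def abs_moment(sigma, p):
    """E|X|^p for X ~ N(0, sigma^2), using formula (58)."""
    return sigma**p * 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)

def scaled_moment(eps, p, c=1.0):
    """E|eps * beta_0|^p for a scalar beta_0 ~ N(0, c / eps)."""
    sigma = math.sqrt(c / eps)
    return eps**p * abs_moment(sigma, p)

# Sanity checks against the familiar second and fourth moments.
second = abs_moment(2.0, 2)  # should equal sigma^2
fourth = abs_moment(2.0, 4)  # should equal 3 * sigma^4

# The ratio E|eps * beta_0|^p / eps^(p/2) should be the same for every eps.
p = 3.0
ratios = [scaled_moment(eps, p) / eps ** (p / 2) for eps in (1e-1, 1e-2, 1e-3)]
print(second, fourth, ratios)
```

A constant ratio corresponds to \(\alpha = p/2\) in Assumption 3, which is the borderline case allowed there.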

It remains to verify Assumption 2, i.e. that \(\varvec{\hat{\gamma }}\) is positive stable. Note that \(\varvec{\varGamma }_2\) is positive stable by assumption and the triangular-block structure of \(\varvec{\hat{\gamma }}\) implies that one only needs to verify that the upper left 2 by 2 block matrix of \(\varvec{\hat{\gamma }}\):

$$\begin{aligned} \varvec{L}(\varvec{x}) = \left[ \begin{array}{cc} \varvec{0} &{} \varvec{g}(\varvec{x}) \varvec{C}_1/m_0 \\ -\varvec{M}_1 \varvec{C}_1^* \varvec{h}(\varvec{x})/\tau _{\kappa } &{} \varvec{\varGamma }_1/\tau _{\kappa } \end{array} \right] \end{aligned}$$
(59)

is positive stable, where \(\varvec{x} \in \mathbb {R}^d\).

We thus need to show that the resolvent set of \(-\varvec{L}(\varvec{x})\), \(\rho (-\varvec{L}(\varvec{x})):=\{\lambda \in \mathbb {C}: (\lambda \varvec{I} + \varvec{L}(\varvec{x}))^{-1} \text { exists}\}\), contains the right half plane \(\{\lambda \in \mathbb {C}: Re(\lambda )>0\}\) for every \(\varvec{x} \in \mathbb {R}^d\).

Let \(\lambda \in \mathbb {C}\) such that \(Re(\lambda ) > 0\). We will use the following formula for blockwise inversion of a block matrix: provided that \(\varvec{S}\) and \(\varvec{P}-\varvec{Q}\varvec{S}^{-1} \varvec{R}\) are nonsingular, we have

$$\begin{aligned} \begin{bmatrix} \varvec{P}&\varvec{Q} \\ \varvec{R}&\varvec{S} \\ \end{bmatrix}^{-1} = \begin{bmatrix} (\varvec{P}-\varvec{Q}\varvec{S}^{-1}\varvec{R})^{-1}&-(\varvec{P}-\varvec{Q}\varvec{S}^{-1}\varvec{R})^{-1} \varvec{Q} \varvec{S}^{-1} \\ -\varvec{S}^{-1}\varvec{R}(\varvec{P}-\varvec{Q}\varvec{S}^{-1}\varvec{R})^{-1}&\varvec{S}^{-1} + \varvec{S}^{-1}\varvec{R}(\varvec{P}-\varvec{Q}\varvec{S}^{-1}\varvec{R})^{-1} \varvec{Q} \varvec{S}^{-1}\\ \end{bmatrix} , \end{aligned}$$
(60)

where \(\varvec{P}\), \(\varvec{Q}\), \(\varvec{R}\), \(\varvec{S}\) are matrix sub-blocks of arbitrary dimension.

Since the matrices \(\varvec{A}_{\lambda } := \varvec{\varGamma }_1/\tau _{\kappa } + \lambda \varvec{I}\) and \(\varvec{B}_{\lambda }(\varvec{x}) := \varvec{I} + \varvec{g}(\varvec{x}) \tilde{\varvec{\kappa }}(\lambda \tau _{\kappa }) \varvec{h}(\varvec{x})/\lambda m_0\) are invertible for all \(\lambda \) in the right half plane by assumption, \(\lambda \varvec{I} + \varvec{L}(\varvec{x})\) is indeed invertible for every \(\varvec{x}\) and in fact using the above formula for the inverse of a block matrix, we have:

$$\begin{aligned}&(\lambda \varvec{I} + \varvec{L}(\varvec{x}))^{-1} \nonumber \\&\quad = \left[ \begin{array}{cc} \varvec{B}_{\lambda }^{-1}(\varvec{x})/\lambda &{} -\varvec{B}_{\lambda }^{-1}(\varvec{x}) \varvec{g}(\varvec{x}) \varvec{C}_1 \varvec{A}_{\lambda }^{-1}/\lambda m_0 \\ \varvec{A}_{\lambda }^{-1} \varvec{M}_1 \varvec{C}_1^* \varvec{h}(\varvec{x}) \varvec{B}_{\lambda }^{-1}(\varvec{x})/\lambda \tau _{\kappa } &{} \ \varvec{A}_{\lambda }^{-1} (\varvec{I} - \varvec{M}_1 \varvec{C}_1^* \varvec{h}(\varvec{x})\varvec{B}_{\lambda }^{-1}(\varvec{x}) \varvec{g}(\varvec{x}) \varvec{C}_1 \varvec{A}_{\lambda }^{-1}/\lambda m_0 \tau _{\kappa }) \end{array} \right] .\nonumber \\ \end{aligned}$$
(61)
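
Formula (61) can be sanity-checked in the scalar case \(d = d_1 = 1\), where every block is a number. The sketch below is an illustration only: all parameter values are hypothetical, and the check simply multiplies \(\lambda \varvec{I} + \varvec{L}(\varvec{x})\) by the claimed inverse at one point of the right half plane.

```python
# Hypothetical scalar parameters (d = d_1 = 1), chosen only for illustration.
g, h = 1.3, 0.7
C1, Gamma1, M1 = 0.9, 2.0, 1.5
m0, tau_kappa = 0.5, 0.8
lam = 0.4 + 1.1j  # any point with Re(lam) > 0

def kappa_tilde(z):
    """Laplace transform of the memory function in the scalar case."""
    return C1 * M1 * C1 / (z + Gamma1)

A = Gamma1 / tau_kappa + lam
B = 1 + g * kappa_tilde(lam * tau_kappa) * h / (lam * m0)

# Entries of lam*I + L(x), read off from (59).
P, Q = lam, g * C1 / m0
R, S = -M1 * C1 * h / tau_kappa, A

# Entries of the claimed inverse (61).
i11 = 1 / (B * lam)
i12 = -g * C1 / (B * A * lam * m0)
i21 = M1 * C1 * h / (A * B * lam * tau_kappa)
i22 = (1 - M1 * C1 * h * g * C1 / (B * A * lam * m0 * tau_kappa)) / A

# Product (lam*I + L) * inverse; it should be the 2x2 identity.
prod = [
    [P * i11 + Q * i21, P * i12 + Q * i22],
    [R * i11 + S * i21, R * i12 + S * i22],
]
residual = max(
    abs(prod[0][0] - 1), abs(prod[0][1]), abs(prod[1][0]), abs(prod[1][1] - 1)
)
print("max deviation from identity:", residual)
```

The check relies on the identity \(\varvec{P} - \varvec{Q}\varvec{S}^{-1}\varvec{R} = \lambda \varvec{B}_\lambda (\varvec{x})\), which is what makes \(\varvec{B}_\lambda \) appear in (61).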

Therefore, \(\varvec{\hat{\gamma }}\) is invertible and one can compute:

$$\begin{aligned} \varvec{\hat{\gamma }}^{-1} = \left[ \begin{array}{ccc} m_0 \varvec{\theta }^{-1} &{} -\tau _{\kappa } \varvec{\theta }^{-1}\varvec{g} \varvec{C}_1 \varvec{\varGamma }_1^{-1} &{} \tau _{\xi } \varvec{\theta }^{-1} \varvec{\sigma } \varvec{C}_2 \varvec{\varGamma }_2^{-1} \\ m_{0} \varvec{\varGamma }_1^{-1} \varvec{M}_1 \varvec{C}_1^* \varvec{h} \varvec{\theta }^{-1} &{} \ \tau _{\kappa } \varvec{\varGamma }_1^{-1}(\varvec{I}- \varvec{M}_1 \varvec{C}_1^* \varvec{h}\varvec{\theta }^{-1} \varvec{g} \varvec{C}_1 \varvec{\varGamma }_1^{-1}) &{} \ \ \tau _{\xi }\varvec{\varGamma }_1^{-1}\varvec{M}_1\varvec{C}_1^* \varvec{h} \varvec{\theta }^{-1} \varvec{\sigma }\varvec{C}_2 \varvec{\varGamma }_2^{-1} \\ \varvec{0} &{} \varvec{0} &{} \tau _{\xi } \varvec{\varGamma }_2^{-1} \end{array} \right] ,\nonumber \\ \end{aligned}$$
(62)

where \(\varvec{\theta } = \varvec{g} \varvec{K}_1 \varvec{h}\). The result follows by applying Theorem 1 to the SDE system (55), (56). In particular, a rewriting of the resulting Lyapunov equation (35) gives the system of matrix equations (49)–(53). \(\square \)

Remark 8

In the above proof, the condition of invertibility of \(\varvec{B}_\lambda (\varvec{x})\) is only used to guarantee the positive stability of the matrix \(\hat{\varvec{\gamma }}\). Therefore, the conclusion of the theorem holds also when the latter can be established in another way. This can indeed be done in a number of concrete examples (see, for instance, the matrix \(\varvec{\gamma }\) in Eq. (83), or the line (90) and the sentence below the line).

Remark 9

Our SIDEs belong to a special class of non-Markovian equations, the so-called quasi-Markovian Langevin equations [41]. For these equations one can introduce a finite number of auxiliary variables in such a way that the evolution of the particle’s position and velocity, together with these auxiliary variables, is described by a usual SDE system and is thus Markovian. We remark that such a “Markovianization” procedure works here because the colored noise can be generated by a linear system of SDEs and the memory kernel satisfies a linear system of ordinary differential equations, both with constant coefficients. If, on the other hand, the memory kernel decays as a power, then there is no finite dimensional extension of the space which would make the solution process Markovian [16].
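
The equivalence between the memory integral (43) and the auxiliary equation (40) can be illustrated numerically. The following is a minimal sketch under simplifying assumptions: all quantities are scalar, the parameter values are hypothetical, and the velocity is replaced by a prescribed signal (here `math.cos`) rather than a solution of (37)–(42). It compares a quadrature of (43) with a numerical solution of (40) started from \(y_0 = 0\).

```python
import math

# Hypothetical scalar parameters and a prescribed "velocity" signal.
Gamma1, M1, C1, h = 2.0, 1.5, 0.9, 1.0
tau_eps = 0.5        # plays the role of tau_kappa * epsilon
velocity = math.cos  # stands in for v_s; not a solution of (37)-(42)

def y_from_integral(t, n=4000):
    """Evaluate the memory integral (43) by the composite trapezoidal rule."""
    ds = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * ds
        w = 0.5 if i in (0, n) else 1.0
        total += w * math.exp(-Gamma1 * (t - s) / tau_eps) * M1 * C1 * h * velocity(s)
    return total * ds / tau_eps

def y_from_ode(t, n=4000):
    """Integrate the auxiliary equation (40) with classical RK4, y(0) = 0."""
    def rhs(s, y):
        return (-Gamma1 * y + M1 * C1 * h * velocity(s)) / tau_eps
    dt, y = t / n, 0.0
    for i in range(n):
        s = i * dt
        k1 = rhs(s, y)
        k2 = rhs(s + dt / 2, y + dt * k1 / 2)
        k3 = rhs(s + dt / 2, y + dt * k2 / 2)
        k4 = rhs(s + dt, y + dt * k3)
        y += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

t_final = 3.0
y_int = y_from_integral(t_final)
y_ode = y_from_ode(t_final)
print(y_int, y_ode, abs(y_int - y_ode))
```

The two values agree up to discretization error because the exponential kernel satisfies a linear ODE with constant coefficients; for a power-law kernel no such finite-dimensional reformulation is available.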

The following corollary uses a linear change of variables in a given SIDE, to arrive at an alternative form of the corresponding homogenized SDE of the form (45).

Corollary 1

For \(i=1,2\), let \(\varvec{T}_i\) be an arbitrary \(d_i \times d_i\) constant invertible matrix, where \(d_1,d_2\) are positive integers. For \(t \ge 0\), denote \(\varvec{\varGamma }'_i=\varvec{T}_i \varvec{\varGamma }_i \varvec{T}^{-1}_i\), \(\varvec{M}_i' = \varvec{T}_i \varvec{M}_i \varvec{T}_i^{*}\), \(\varvec{C}'_i =\varvec{C}_i \varvec{T}_i^{-1}\), \((\varvec{\beta }^\epsilon _t)'=\varvec{T}_2 \varvec{\beta }^\epsilon _t\), \(\varvec{\varSigma }_i' = \varvec{T}_i \varvec{\varSigma }_i\) and consider the equations:

$$\begin{aligned} m_0 \epsilon ^{\mu } \ddot{\varvec{x}}^\epsilon _{t}&= \varvec{F}(\varvec{x}^\epsilon _{t}) - \frac{\varvec{g}(\varvec{x}^\epsilon _{t})}{\tau _{\kappa } \epsilon ^{\theta }} \int _{0}^{t} \varvec{C}'_1e^{-\frac{\varvec{\varGamma }'_1}{\tau _{\kappa }\epsilon ^{\theta }}(t-s)} \varvec{M}'_1 (\varvec{C}'_1)^* \varvec{h}(\varvec{x}^\epsilon _{s}) \dot{\varvec{x}}^\epsilon _{s} ds + \varvec{\sigma }(\varvec{x}^\epsilon _{t}) \varvec{C}'_2 (\varvec{\beta }^\epsilon _{t})', \end{aligned}$$
(63)
$$\begin{aligned} \tau _{\xi }\epsilon ^\nu d(\varvec{\beta }^\epsilon _t)'&= -\varvec{\varGamma }'_2 (\varvec{\beta }^\epsilon _t)' dt + \varvec{\varSigma }'_2 d\varvec{W}'_t, \end{aligned}$$
(64)

where \(\varvec{W}'_t\) is a \(q_2\)-dimensional Wiener process and the initial condition, \((\varvec{\beta }^\epsilon _0)'\), is normally distributed with zero mean and covariance of \(\varvec{M}'_2/(\tau _{\xi } \epsilon ^\nu )\).

Suppose that Assumptions 5–7 are satisfied and the effective damping and diffusion constants, \(\varvec{K}'_i = \varvec{C}'_i (\varvec{\varGamma }_i')^{-1} \varvec{M}_i' (\varvec{C}_i')^{*} = \varvec{K}_i\) (\(i=1,2\)), are invertible. Moreover, we assume that \(\varvec{I} + \varvec{g}(\varvec{x}) \tilde{\varvec{\kappa }'}(\lambda \tau _{\kappa }) \varvec{h}(\varvec{x})/\lambda m_0\) is invertible for all \(\lambda \) in the right half plane \(\{\lambda \in \mathbb {C}: Re(\lambda )>0\}\) and \(\varvec{x} \in \mathbb {R}^d\), where \(\tilde{\varvec{\kappa }'}(z) := \varvec{C}'_1(z\varvec{I} + \varvec{\varGamma }'_1)^{-1}\varvec{M}'_1 (\varvec{C}'_1)^* = \varvec{\tilde{\kappa }}(z)\) for \(z \in \mathbb {C}\).

Let \(\mu = \theta = \nu \) in (63), (64). Then as \(\epsilon \rightarrow 0\), the process \(\varvec{x}^\epsilon _t\) converges, in the same sense as in Theorem 2, to \(\varvec{X}_t\), where \(\varvec{X}_{t}\) is the solution of the SDE (45) with the \(\varvec{C}_i\), \(\varvec{\varGamma }_i\), \(\varvec{M}_i\), \(\varvec{\varSigma }_i\) replaced by \(\varvec{C}'_i\), \(\varvec{\varGamma }'_i\), \(\varvec{M}'_i\), \(\varvec{\varSigma }'_i\) respectively, and the driving Wiener process \(\varvec{W}^{(q_2)}_t\) replaced by \(\varvec{W}_t'\).

Corollary 1 is an easy consequence of Theorem 2.
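
The key observation, written out here as a short sketch, is that the primed coefficients leave the effective matrices and the Laplace transform of the memory function unchanged. Using \((\varvec{\varGamma }_i')^{-1} = \varvec{T}_i \varvec{\varGamma }_i^{-1} \varvec{T}_i^{-1}\) and \((\varvec{C}_i')^{*} = (\varvec{T}_i^{-1})^{*} \varvec{C}_i^{*}\), we have

$$\begin{aligned} \varvec{K}'_i = \varvec{C}_i \varvec{T}_i^{-1} (\varvec{T}_i \varvec{\varGamma }_i^{-1} \varvec{T}_i^{-1}) (\varvec{T}_i \varvec{M}_i \varvec{T}_i^{*}) (\varvec{T}_i^{-1})^{*} \varvec{C}_i^{*} = \varvec{C}_i \varvec{\varGamma }_i^{-1} \varvec{M}_i \varvec{C}_i^{*} = \varvec{K}_i, \end{aligned}$$

since \(\varvec{T}_i^{*} (\varvec{T}_i^{-1})^{*} = \varvec{I}\). Similarly, \(z\varvec{I} + \varvec{\varGamma }'_1 = \varvec{T}_1 (z\varvec{I} + \varvec{\varGamma }_1) \varvec{T}_1^{-1}\), so that

$$\begin{aligned} \tilde{\varvec{\kappa }'}(z) = \varvec{C}_1 \varvec{T}_1^{-1} \varvec{T}_1 (z\varvec{I} + \varvec{\varGamma }_1)^{-1} \varvec{T}_1^{-1} \varvec{T}_1 \varvec{M}_1 \varvec{T}_1^{*} (\varvec{T}_1^{-1})^{*} \varvec{C}_1^{*} = \tilde{\varvec{\kappa }}(z). \end{aligned}$$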

Next, we discuss a particular, but very important, case when a fluctuation–dissipation relation holds. This is, for instance, the case when the pre-limit dynamics are (heuristically) derived from Hamiltonian dynamics (see Appendix A). We will further explore similar cases of fluctuation–dissipation relations for the two sub-classes.

Corollary 2

Let \(\varvec{x}^\epsilon _{t} \in \mathbb {R}^{d}\) be the solution to the SDEs (37)–(42). Suppose that the assumptions of Theorem 2 hold. Moreover, we assume that:

$$\begin{aligned} \tau _{\kappa } = \tau _{\xi } = \tau , \ \ \varvec{\sigma } = \varvec{g}, \ \ \varvec{h} = \varvec{g}^*, \end{aligned}$$
(65)

where \(\tau \) is a positive constant, and

$$\begin{aligned} \varvec{C}_1 = \varvec{C}_2 := \varvec{C}, \ \ \varvec{\varGamma }_1 = \varvec{\varGamma }_2 := \varvec{\varGamma }, \ \ \varvec{M}_1 = \varvec{M}_2 := \varvec{M}, \ \ \varvec{\varSigma }_1 = \varvec{\varSigma }_2 := \varvec{\varSigma }, \end{aligned}$$
(66)

(so that \(q=r\) and \(d_1 = d_2\)). Denote \(\varvec{K} := \varvec{C} \varvec{\varGamma }^{-1} \varvec{M} \varvec{C}^*\). Then as \(\epsilon \rightarrow 0\), the process \(\varvec{x}^\epsilon _{t}\) converges to the solution, \(\varvec{X}_{t}\), of the following Itô SDE:

$$\begin{aligned} d\varvec{X}_{t} = \varvec{S}(\varvec{X}_{t}) dt + [\varvec{g}(\varvec{X}_t) \varvec{K} \varvec{g}^*(\varvec{X}_t)]^{-1} \varvec{F}(\varvec{X}_{t}) dt + [\varvec{g}(\varvec{X}_t) \varvec{K} \varvec{g}^*(\varvec{X}_t)]^{-1} \varvec{g}(\varvec{X}_t) \varvec{C} \varvec{\varGamma }^{-1} \varvec{\varSigma } d\varvec{W}^{(q_2)}_{t}, \end{aligned}$$
(67)

where \(\varvec{S}(\varvec{X}_t)\) is the noise-induced drift whose ith component is given by

$$\begin{aligned} S_i(\varvec{X}) = m_0 \frac{\partial }{\partial X_{l}}\left[ ((\varvec{g} \varvec{K}\varvec{g}^*)^{-1})_{ij}(\varvec{X})\right] (\varvec{J}_{11})_{jl}(\varvec{X}), \ \ i,j,l = 1, \dots , d, \end{aligned}$$
(68)

where \(\varvec{J}_{11}\) solves the following system of three matrix equations:

$$\begin{aligned} m_0 \varvec{J}_{11} \varvec{g} \varvec{C} \varvec{M} + \tau \varvec{g} \varvec{C}(\varvec{J}_{23} + \varvec{J}_{23}^*)&= \tau \varvec{g} \varvec{C} \varvec{J}_{22}+\varvec{g} \varvec{C} \varvec{M}, \end{aligned}$$
(69)
$$\begin{aligned} \varvec{M} \varvec{C}^* \varvec{g}^* \varvec{g} \varvec{C} \varvec{M} (\varvec{\varGamma }^{-1})^*&= \tau \varvec{M} \varvec{C}^* \varvec{g}^* \varvec{g} \varvec{C} \varvec{J}_{23} (\varvec{\varGamma }^{-1})^* \nonumber \\&\quad + m_0 (\varvec{\varGamma } \varvec{J}_{23} + \varvec{J}_{23} \varvec{\varGamma }^*), \end{aligned}$$
(70)
$$\begin{aligned} \varvec{M} \varvec{C}^* \varvec{g}^*\varvec{g} \varvec{C} \varvec{M} (\varvec{\varGamma }^{-1})^* + \varvec{\varGamma }^{-1} \varvec{M} \varvec{C}^* \varvec{g}^* \varvec{g} \varvec{C} \varvec{M}&= \tau (\varvec{M} \varvec{C}^* \varvec{g}^* \varvec{\varGamma }^{-1} \varvec{J}_{23}^* \varvec{C}^* \varvec{g}^* \nonumber \\&\quad + \varvec{\varGamma }^{-1} \varvec{J}_{23}^* \varvec{C}^* \varvec{g}^* \varvec{g} \varvec{C} \varvec{M}) \nonumber \\&\quad + m_0 (\varvec{\varGamma } \varvec{J}_{22} + \varvec{J}_{22} \varvec{\varGamma }^*). \end{aligned}$$
(71)

The convergence is obtained in the same sense as in Theorem 2.

Equations (65) and (66) are a form of fluctuation–dissipation relation familiar from non-equilibrium statistical mechanics [13]. As stationary measures of systems satisfying fluctuation–dissipation relations are in equilibrium with respect to the underlying dynamics, this result is relevant for describing equilibrium properties of such systems in the small mass limit.

Remark 10

Therefore, if the fluctuation–dissipation relation holds, the noise-induced drift in the limiting SDE reduces to a single term (later we will see how this term simplifies in some special cases). This result may have interesting implications for nanoscale systems in equilibrium. We remark that the conditions for the fluctuation–dissipation relation in Corollary 2 can be written in other equivalent forms, up to the transformations in (8) and multiplication by a constant.

Proof

The above corollary follows from applying Theorem 2. Indeed, by assumptions of the corollary, (49) simplifies to:

$$\begin{aligned} \varvec{g} \varvec{C} (\varvec{J}_{12}-\varvec{J}_{13})^* + (\varvec{J}_{12}-\varvec{J}_{13}) (\varvec{g} \varvec{C})^* = \varvec{0}. \end{aligned}$$
(72)

This implies that \(\varvec{J}_{12} = \varvec{J}_{13}\) and, therefore, \(\varvec{S}^{(2)}\) and \(\varvec{S}^{(3)}\) cancel. Rewriting the resulting system of matrix equations in (49)–(53) gives (69)–(71). \(\square \)

5 Homogenization for Models of the Two Sub-classes

We now return to the two sub-classes of SIDEs (11) introduced in Sect. 2. In this section, we study the effective dynamics described by SIDEs (25) and (27) in the limit as \(\epsilon \rightarrow 0\). By specializing to these two sub-classes, the convergence result of Theorem 2, in particular the expressions in (45)–(53), can be made more explicit under certain assumptions on the matrix-valued coefficients and therefore the limiting equation obtained may be useful for modeling purposes.

5.1 SIDEs Driven by a Markovian Colored Noise

The following convergence result provides a homogenized SDE for the particle’s position in the limit as the inertial time scale, the memory time scale and the noise correlation time scale vanish at the same rate in the case when the pre-limit dynamics are driven by an Ornstein–Uhlenbeck noise.

Corollary 3

Let \(d=d_1=d_2=q_1=q_2=q=r\). We set, in the SDEs (37)–(42): \(\varvec{\beta }^\epsilon _t = \varvec{\eta }^\epsilon _t\), \(\tau _{\xi } = \tau _{\eta }\), \(\varvec{W}_t^{(q_2)} = \varvec{W}^{(d)}_t := \varvec{W}_t\) and

$$\begin{aligned} (\varvec{\varGamma }_1, \varvec{M}_1, \varvec{C}_1) = (\varvec{A}, \varvec{A}, \varvec{I}), \ \ (\varvec{\varGamma }_2, \varvec{M}_2, \varvec{C}_2) = (\varvec{A}, \varvec{A}/2, \varvec{I}), \end{aligned}$$
(73)

to obtain SDEs equivalent to Eqs. (25) and (26) with \(\mu = \theta = \nu = 1\). Let \(\varvec{x}^\epsilon _{t} \in \mathbb {R}^{d}\) be the solution to these equations, with the matrices \(\varvec{g}(\varvec{x})\) and \(\varvec{h}(\varvec{x})\) positive definite for every \(\varvec{x} \in \mathbb {R}^d\). Suppose that Assumptions 5–7 are satisfied and, moreover, that \(\varvec{g}(\varvec{x})\), \(\varvec{h}(\varvec{x})\) and the diagonal matrix \(\varvec{A}\) commute.

Then as \(\epsilon \rightarrow 0\), the process \(\varvec{x}^\epsilon _{t}\) converges to the solution, \(\varvec{X}_{t}\), of the following Itô SDE:

$$\begin{aligned} d\varvec{X}_{t} = \varvec{S}(\varvec{X}_{t}) dt + (\varvec{g} \varvec{h})^{-1}(\varvec{X}_{t})\varvec{F}(\varvec{X}_{t}) dt + (\varvec{g}\varvec{h})^{-1}(\varvec{X}_{t}) \varvec{\sigma }(\varvec{X}_{t}) d\varvec{W}_{t}, \end{aligned}$$
(74)

with \(\varvec{S}(\varvec{X}_{t}) = \varvec{S}^{(1)}(\varvec{X}_{t}) + \varvec{S}^{(2)}(\varvec{X}_{t}) + \varvec{S}^{(3)}(\varvec{X}_{t}),\) where the \(\varvec{S}^{(k)}\) are the noise-induced drifts whose ith components are given by

$$\begin{aligned} S^{(1)}_{i}(\varvec{X})&= m_{0} \frac{\partial }{\partial X_{l}}[((\varvec{g}\varvec{h})^{-1})_{ij}(\varvec{X})] (\varvec{J}_{11})_{jl}(\varvec{X}), \ \ i,j,l = 1, \dots , d, \end{aligned}$$
(75)
$$\begin{aligned} S^{(2)}_{i}(\varvec{X})&= -\tau _{\kappa } \frac{\partial }{\partial X_{l}}[((\varvec{A} \varvec{h})^{-1})_{ij}(\varvec{X})] (\varvec{J}_{21})_{jl}(\varvec{X}), \ \ i,j,l = 1, \dots , d, \end{aligned}$$
(76)
$$\begin{aligned} S^{(3)}_{i}(\varvec{X})&= \tau _{\eta } \frac{\partial }{\partial X_{l}}[((\varvec{g}\varvec{h})^{-1} \varvec{\sigma } \varvec{A}^{-1} )_{ij}(\varvec{X})] (\varvec{J}_{31})_{jl}(\varvec{X}), \ \ i,j,l = 1, \dots , d. \end{aligned}$$
(77)

Here \(\varvec{J}_{11} = \varvec{J}_{11}^*\), \(\varvec{J}_{21}=\varvec{J}_{12}^*\) and \(\varvec{J}_{31} = \varvec{J}_{13}^*\) are d by d block matrices satisfying the following system of matrix equations:

$$\begin{aligned} \tau _{\eta } \varvec{g} \varvec{J}_{23} + m_0 \varvec{J}_{13} \varvec{A}&= \varvec{\sigma } \varvec{A}/2, \end{aligned}$$
(78)
$$\begin{aligned} \tau _{\eta } \varvec{A} \varvec{h} \varvec{J}_{13}&= \tau _{\eta } \varvec{A} \varvec{J}_{23} + \tau _{\kappa } \varvec{J}_{23} \varvec{A}, \end{aligned}$$
(79)
$$\begin{aligned} \varvec{A} \varvec{h} \varvec{J}_{12} + \varvec{J}_{12}^* \varvec{h} \varvec{A}&= \varvec{A} \varvec{J}_{22} + \varvec{J}_{22} \varvec{A}, \end{aligned}$$
(80)
$$\begin{aligned} \varvec{g} \varvec{J}_{12}^* + \varvec{J}_{12} \varvec{g}&= \varvec{\sigma }\varvec{J}_{13}^* + \varvec{J}_{13} \varvec{\sigma }^*, \end{aligned}$$
(81)
$$\begin{aligned} m_0 \varvec{J}_{11} \varvec{h} \varvec{A} + \tau _{\kappa } \varvec{\sigma } \varvec{J}_{23}^*&= \tau _{\kappa } \varvec{g} \varvec{J}_{22} + m_0 \varvec{J}_{12} \varvec{A}. \end{aligned}$$
(82)

The convergence is obtained in the same sense as in Theorem 2.

Proof

We will apply Theorem 2. As \(\varvec{K}_1=2 \varvec{K}_2 = \varvec{I}\), clearly they are invertible. Also, being positive definite, \(\varvec{g}\) and \(\varvec{h}\) are invertible.

Since \(\varvec{g}\), \(\varvec{h}\) and \(\varvec{A}\) are positive definite and commuting matrices, the matrix \(\varvec{B}_{\lambda }(\varvec{x})\), defined in (44), is invertible for all \(\lambda \) such that \(Re(\lambda ) > 0\). Indeed, in this case \(\varvec{B}_{\lambda }(\varvec{x}) = \varvec{I} + \varvec{g}(\varvec{x})(\lambda \tau _{\kappa } \varvec{I} + \varvec{A})^{-1} \varvec{A} \varvec{h}(\varvec{x})/(\lambda m_0)\). Being positive definite and commuting, \(\varvec{g}\), \(\varvec{h}\) and \(\varvec{A}\) have positive eigenvalues and can be simultaneously diagonalized. Therefore, all the eigenvalues of \(\varvec{B}_{\lambda }(\varvec{x})\) are nonzero for every \(\lambda \) with \(Re(\lambda ) > 0\) and \(\varvec{x} \in \mathbb {R}^d\), so the invertibility condition is verified. Consequently, the block matrix:

$$\begin{aligned} \varvec{\hat{\gamma }}(\varvec{x}^\epsilon _{t}) = \left[ \begin{array}{ccc} \varvec{0} &{} \frac{\varvec{g}(\varvec{x}^\epsilon _{t})}{m_{0}} &{} -\frac{\varvec{\sigma }(\varvec{x}^\epsilon _{t})}{m_{0}} \\ -\frac{\varvec{A} \varvec{h}(\varvec{x}^\epsilon _{t}) }{\tau _{\kappa }} &{} \frac{\varvec{A}}{\tau _{\kappa }} &{} \varvec{0} \\ \varvec{0} &{} \varvec{0} &{} \frac{\varvec{A}}{\tau _{\eta }} \end{array} \right] , \end{aligned}$$
(83)

is positive stable (see Remark 8). The result then follows by applying Theorem 2. \(\square \)
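The invertibility step in the proof above can be checked numerically on a single common eigenvector. The following Python sketch assumes that \(\varvec{g}\), \(\varvec{h}\) and \(\varvec{A}\) act on that eigenvector with positive eigenvalues \(g\), \(h\) and \(a\) (sample values chosen only for illustration), so that the corresponding eigenvalue of \(\varvec{B}_{\lambda }\) is \(1 + gah/(\lambda m_0(\lambda \tau _{\kappa } + a))\). This eigenvalue vanishes only at the roots of \(m_0 \tau _{\kappa } \lambda ^2 + m_0 a \lambda + gah = 0\), and since all coefficients of this quadratic are positive, both roots should lie in the open left half-plane.

```python
import cmath

def b_lambda_eigenvalue(lam, g, h, a, m0, tau_kappa):
    """Eigenvalue of B_lambda on a common eigenvector of g, h and A."""
    return 1.0 + g * a * h / ((lam * tau_kappa + a) * lam * m0)

def zeros_of_eigenvalue(g, h, a, m0, tau_kappa):
    """Roots of m0*tau_kappa*lam^2 + m0*a*lam + g*a*h = 0."""
    root = cmath.sqrt((m0 * a) ** 2 - 4.0 * m0 * tau_kappa * g * a * h)
    return [(-m0 * a + root) / (2.0 * m0 * tau_kappa),
            (-m0 * a - root) / (2.0 * m0 * tau_kappa)]

# Illustrative eigenvalues and constants (assumptions chosen only for this check).
sample = dict(g=0.8, h=1.2, a=1.3, m0=0.5, tau_kappa=0.7)
zeros = zeros_of_eigenvalue(**sample)

# Smallest modulus of the eigenvalue over a grid in the open right half-plane.
grid = [complex(0.05 * i, 0.25 * j) for i in range(1, 41) for j in range(-20, 21)]
min_modulus = min(abs(b_lambda_eigenvalue(lam, **sample)) for lam in grid)
```

For these sample values the two zeros have negative real parts, and the eigenvalue stays away from zero on the sampled part of the right half-plane, in line with the argument of the proof.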

For special one-dimensional systems, the form of the limiting equation can be made even more explicit.

Corollary 4

In the one-dimensional case, we drop the boldface and write \(\varvec{X}_{t} := X_{t} \in \mathbb {R}, \ \varvec{g}(\varvec{X}) := g(X),\) with \(g: \mathbb {R}\rightarrow \mathbb {R}\), etc. We assume that \(h = g\) and \(\varvec{A} := \alpha > 0\) is a constant. The homogenized equation is given by:

$$\begin{aligned} dX_{t} = S(X_{t}) dt + g^{-2}(X_{t}) F(X_{t}) dt + g^{-2}(X_{t}) \sigma (X_{t}) dW_{t}, \end{aligned}$$
(84)

with \(S(X_{t}) = S^{(1)}(X_{t}) + S^{(2)}(X_{t}) + S^{(3)}(X_{t}),\) where the noise-induced drift terms \(S^{(k)}(X_{t})\) have the following explicit expressions that depend on the parameters \(m_{0}, \tau _{\kappa }\) and \(\tau _{\eta }\):

$$\begin{aligned} S^{(1)}(X_{t})&= \left( \frac{1}{g^2(X_{t})} \right) ' \frac{ \sigma (X_{t})^2}{2 g^2(X_{t})} \left[ \frac{\tau _{\kappa }^2 g^2(X_{t})+m_{0}\alpha (\tau _{\kappa }+\tau _{\eta })}{\tau _{\eta }^2 g^2(X_{t})+m_{0}\alpha (\tau _{\kappa }+\tau _{\eta })} \right] , \end{aligned}$$
(85)
$$\begin{aligned} S^{(2)}(X_{t})&= - \left( \frac{1}{g(X_{t})} \right) ' \frac{ \sigma (X_{t})^2 \tau _{\kappa }(\tau _{\kappa }+\tau _{\eta })}{2g(X_{t})[\tau _{\eta }^2 g^2(X_{t})+m_{0}\alpha (\tau _{\kappa }+\tau _{\eta })]}, \end{aligned}$$
(86)
$$\begin{aligned} S^{(3)}(X_{t})&= \left( \frac{\sigma (X_{t})}{g^2(X_{t})} \right) ' \frac{ \sigma (X_{t})\tau _{\eta } (\tau _{\kappa }+\tau _{\eta })}{2[\tau _{\eta }^2 g^2(X_{t})+m_{0}\alpha (\tau _{\kappa }+\tau _{\eta })]}, \end{aligned}$$
(87)

where the prime \('\) denotes derivative with respect to \(X_t\).

Proof

With \(\varvec{x}^\epsilon _{t} := (x^\epsilon _{t}, z^\epsilon _{t}, \zeta ^\epsilon _{t}) \in \mathbb {R}^{3}\) and \(\varvec{v}^\epsilon _{t} := (v^\epsilon _{t}, y^\epsilon _{t}, \eta ^\epsilon _{t}) \in \mathbb {R}^{3}\), SDEs (55), (56) become:

$$\begin{aligned} d\varvec{x}^\epsilon _{t}&= \varvec{v}^\epsilon _{t} dt, \end{aligned}$$
(88)
$$\begin{aligned} \epsilon d\varvec{v}^\epsilon _{t}&= - \varvec{\gamma }(\varvec{x}^\epsilon _{t}) \varvec{v}^\epsilon _{t} dt + \varvec{F}(\varvec{x}^\epsilon _{t})dt + \varvec{\sigma } d\varvec{W}_{t}, \end{aligned}$$
(89)

where

$$\begin{aligned} \varvec{\gamma }(\varvec{x}^\epsilon _{t}) = \left[ \begin{array}{ccc} 0 &{} \frac{g(x^\epsilon _{t})}{m_{0}} &{} -\frac{\sigma (x^\epsilon _{t})}{m_{0}} \\ -\frac{\alpha }{\tau _{\kappa }} g(x^\epsilon _{t}) &{} \frac{\alpha }{\tau _{\kappa }} &{} 0 \\ 0 &{} 0 &{} \frac{\alpha }{\tau _{\eta }} \end{array} \right] , \ \ \varvec{F}(\varvec{x}^\epsilon _{t}) = \begin{bmatrix} \frac{F(x^\epsilon _{t})}{m_{0}} \\ 0\\ 0 \\ \end{bmatrix}, \ \ \varvec{\sigma } = \left[ \begin{array}{c} 0 \\ 0 \\ \frac{\alpha }{\tau _{\eta }} \end{array} \right] . \end{aligned}$$
(90)

It follows from Corollary 3 that the matrix \(\varvec{\gamma }\) is positive stable; one can also calculate its eigenvalues explicitly and see that their real parts are positive. The eigenvalues of \(\varvec{\gamma }\) are

$$\begin{aligned} \frac{\alpha }{\tau _{\eta }}, \ \frac{\alpha }{2 \tau _{\kappa }} \pm \frac{1}{2} \sqrt{\frac{\alpha ^2 m_{0}-4 \alpha g(x^\epsilon _{t})^2 \tau _{\kappa }}{m_{0} \tau _{\kappa }^2}}, \end{aligned}$$
(91)

and so their real parts are indeed positive.
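As a quick sanity check of (91), the following Python sketch evaluates the three eigenvalues for one arbitrary set of parameter values (the numbers are illustrative assumptions, not values from the text). It verifies that the conjugate pair satisfies the characteristic polynomial of the upper-left \(2 \times 2\) block of \(\varvec{\gamma }\), namely \(\lambda ^2 - (\alpha /\tau _{\kappa })\lambda + \alpha g^2/(m_0 \tau _{\kappa })\), and that all real parts are positive.

```python
import cmath

def gamma_eigenvalues(alpha, g, m0, tau_kappa, tau_eta):
    """Eigenvalues of the 3x3 matrix gamma in (90), following formula (91)."""
    disc = (alpha**2 * m0 - 4.0 * alpha * g**2 * tau_kappa) / (m0 * tau_kappa**2)
    root = cmath.sqrt(disc)  # complex square root handles a negative discriminant
    lam_plus = alpha / (2.0 * tau_kappa) + 0.5 * root
    lam_minus = alpha / (2.0 * tau_kappa) - 0.5 * root
    return [complex(alpha / tau_eta), lam_plus, lam_minus]

def block_char_poly(lam, alpha, g, m0, tau_kappa):
    """Characteristic polynomial of the upper-left 2x2 block of gamma."""
    return lam**2 - (alpha / tau_kappa) * lam + alpha * g**2 / (m0 * tau_kappa)

# Illustrative parameter values (assumptions chosen only for this check).
alpha, g, m0, tau_kappa, tau_eta = 1.3, 0.8, 0.5, 0.7, 1.1
eigs = gamma_eigenvalues(alpha, g, m0, tau_kappa, tau_eta)
residuals = [abs(block_char_poly(lam, alpha, g, m0, tau_kappa)) for lam in eigs[1:]]
```

The positivity is not specific to these numbers: when the discriminant is negative the pair has real part \(\alpha /(2\tau _{\kappa })\), and when it is nonnegative the square root is strictly smaller than \(\alpha /\tau _{\kappa }\).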

On the other hand, the solution, \(\varvec{J} \in \mathbb {R}^{3 \times 3}\), to the Lyapunov equation,

$$\begin{aligned} \varvec{\gamma } \varvec{J} + \varvec{J} \varvec{\gamma }^{*} = \varvec{\sigma } \varvec{\sigma }^{*}, \end{aligned}$$
(92)

can be computed (using Mathematica\(^{\textregistered }\)) to be:

$$\begin{aligned} \varvec{J} = \left[ \begin{array}{ccc} \frac{ \sigma ^2}{2m_{0} g^2}\left[ \frac{\tau _{\kappa }^2 g^2 + m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }) }{\tau _{\eta }^2 g^2 + m_{0} \alpha (\tau _{\kappa }+\tau _{\eta })}\right] &{} \frac{\alpha \sigma ^2(\tau _{\kappa }+\tau _{\eta }) }{2g(\tau _{\eta }^2 g^2+m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }))} &{} \frac{\alpha \sigma (\tau _{\kappa }+\tau _{\eta })}{2(\tau _{\eta }^2 g^2 + m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }))} \\ \frac{\alpha \sigma ^2(\tau _{\kappa }+\tau _{\eta }) }{2g(\tau _{\eta }^2 g^2+m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }))} &{} \frac{\alpha \sigma ^2 (\tau _{\kappa }+\tau _{\eta })}{2(\tau _{\eta }^2 g^2 +m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }))} &{} \frac{\tau _{\eta } \alpha \sigma g}{2 (\tau _{\eta }^2 g^2+ m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }))} \\ \frac{\alpha \sigma (\tau _{\kappa }+\tau _{\eta })}{2(\tau _{\eta }^2 g^2 + m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }))} &{} \frac{\tau _{\eta } \alpha \sigma g}{2 (\tau _{\eta }^2 g^2 + m_{0} \alpha (\tau _{\kappa }+\tau _{\eta }))} &{} \frac{\alpha }{2 \tau _{\eta }} \end{array} \right] . \end{aligned}$$
(93)

The result then follows from Corollary 3. \(\square \)
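The closed-form solution (93) can be checked independently of a computer algebra system by substituting it back into the Lyapunov equation (92). The Python sketch below does this entrywise for one arbitrary parameter set (the numerical values, and the short names `tk` and `te` for \(\tau _{\kappa }\) and \(\tau _{\eta }\), are assumptions made for illustration) and returns the largest absolute residual.

```python
def lyapunov_residual(alpha, g, sigma, m0, tk, te):
    """Largest entry of |gamma J + J gamma^T - s s^T| for gamma, s in (90) and J in (93)."""
    den = te**2 * g**2 + m0 * alpha * (tk + te)
    j11 = sigma**2 / (2 * m0 * g**2) * (tk**2 * g**2 + m0 * alpha * (tk + te)) / den
    j12 = alpha * sigma**2 * (tk + te) / (2 * g * den)
    j13 = alpha * sigma * (tk + te) / (2 * den)
    j22 = alpha * sigma**2 * (tk + te) / (2 * den)
    j23 = te * alpha * sigma * g / (2 * den)
    j33 = alpha / (2 * te)
    J = [[j11, j12, j13], [j12, j22, j23], [j13, j23, j33]]
    gam = [[0.0, g / m0, -sigma / m0],
           [-alpha * g / tk, alpha / tk, 0.0],
           [0.0, 0.0, alpha / te]]
    s = [0.0, 0.0, alpha / te]
    worst = 0.0
    for i in range(3):
        for j in range(3):
            gj = sum(gam[i][k] * J[k][j] for k in range(3))  # (gamma J)_{ij}
            jg = sum(J[i][k] * gam[j][k] for k in range(3))  # (J gamma^T)_{ij}
            worst = max(worst, abs(gj + jg - s[i] * s[j]))
    return worst

# Illustrative parameter values (assumptions chosen only for this check).
residual = lyapunov_residual(alpha=1.3, g=0.8, sigma=0.6, m0=0.5, tk=0.7, te=1.1)
```

A residual at the level of floating-point round-off indicates that the six independent entries of (93) are mutually consistent with (90) and (92).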

Remark 11

Note that here the matrix \(\varvec{\gamma }\) in (90) is not symmetric and the smallest eigenvalue of its symmetric part can be negative. Moreover, the initial condition \(\varvec{v}^\epsilon _0\) depends on \(\epsilon \) through the component \(\eta ^\epsilon _0\) (which is a zero mean Gaussian random variable with variance \(\frac{\alpha }{2\epsilon \tau _\eta }\)). Thus, we cannot apply the main results in [8] to obtain the convergence result. This is our main motivation to revisit the Smoluchowski–Kramers limit of SDEs in Sect. 3 under a weakened spectral assumption on the matrix \(\varvec{\gamma }\) (or \(\varvec{\hat{\gamma }}\) in the multidimensional case) and a relaxed assumption concerning the \(\epsilon \) dependence of \(\varvec{v}^\epsilon _0\) (or \(\hat{\varvec{v}}^\epsilon _0\) in the multidimensional case).

Remark 12

In the important case when the fluctuation–dissipation relation (i.e. \(\tau _{\kappa } = \tau _{\eta }\), \(h = g\) and g is proportional to \(\sigma \)) holds for the one-dimensional models of the first sub-class, the correction drift terms \(S^{(2)}\) and \(S^{(3)}\) cancel each other. The resulting single noise-induced drift term coincides with that obtained in the limit as \(m \rightarrow 0\) of the systems with no memory driven by a white noise, to which Theorem 1 applies directly. However, when the relation fails, we obtain three different drift corrections induced by the vanishing of all time scales. Again, the presence of these correction terms may have significant consequences for the dynamics of the systems (see Sect. 6).
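The cancellation described in Remark 12 can be illustrated numerically from (85)–(87). The following Python sketch is only an illustration under stated assumptions: the coefficient \(g\), the two choices of \(\sigma \), the evaluation point and all parameter values are hypothetical, and derivatives are approximated by central finite differences.

```python
import math

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def drift_terms(g, sigma, x, m0, alpha, tk, te):
    """Noise-induced drift terms S1, S2, S3 of (85)-(87) evaluated at x."""
    gx, sx = g(x), sigma(x)
    den = te**2 * gx**2 + m0 * alpha * (tk + te)
    ratio = (tk**2 * gx**2 + m0 * alpha * (tk + te)) / den
    s1 = derivative(lambda y: 1.0 / g(y)**2, x) * sx**2 / (2 * gx**2) * ratio
    s2 = -derivative(lambda y: 1.0 / g(y), x) * sx**2 * tk * (tk + te) / (2 * gx * den)
    s3 = derivative(lambda y: sigma(y) / g(y)**2, x) * sx * te * (tk + te) / (2 * den)
    return s1, s2, s3

def g(x):
    return 1.0 + 0.5 * math.sin(x)      # hypothetical positive coefficient

def sigma_prop(x):
    return 0.7 * g(x)                   # sigma proportional to g

def sigma_other(x):
    return 0.7 + 0.1 * x**2             # sigma not proportional to g

# Equal time scales tau_kappa = tau_eta, with and without proportionality.
s1, s2, s3 = drift_terms(g, sigma_prop, 0.4, m0=0.5, alpha=1.3, tk=0.9, te=0.9)
t1, t2, t3 = drift_terms(g, sigma_other, 0.4, m0=0.5, alpha=1.3, tk=0.9, te=0.9)
```

In the first case `s2 + s3` vanishes up to discretization error, while in the second case `t2 + t3` is visibly nonzero, so all three corrections contribute to the drift.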

5.2 SIDEs Driven by a Non-Markovian Colored Noise

The following corollary provides a homogenized SDE for the particle position in the limit, in which the inertial time scale, the memory time scale and the noise correlation time scale vanish at the same rate in the case when the pre-limit dynamics are driven by the harmonic noise. We emphasize that in this case the original system is driven by a noise which is not a Markov process.

Corollary 5

Let \(d=d_1=d_2=q_1=q_2=q=r\). We set, in the SDEs (37)–(42): \(\tau _{\xi } = \tau _{h}\), \(\varvec{W}_t^{(q_2)} = \varvec{W}^{(d)}_t = \varvec{W}_t\) and

$$\begin{aligned} \varvec{\varGamma }_2= & {} \begin{bmatrix} \varvec{0}&-\varvec{I} \\ \varvec{\varOmega }^2&\varvec{\varOmega }^2 \end{bmatrix}, \ \ \ \varvec{\varGamma }_1 = \frac{1}{2} \begin{bmatrix} \varvec{\varOmega }^2&4\varvec{I}-\varvec{\varOmega }^2 \\ -\varvec{\varOmega }^2&\varvec{\varOmega }^2 \end{bmatrix} =: \varvec{T} \varvec{\varGamma }_2 \varvec{T}^{-1}, \end{aligned}$$
(94)
$$\begin{aligned} \varvec{M}_2= & {} \frac{1}{2} \begin{bmatrix} \varvec{I}&\varvec{0} \\ \varvec{0}&\varvec{\varOmega }^2 \end{bmatrix}, \ \ \ \varvec{M}_1 = 2\varvec{T} \varvec{M}_2 \varvec{T}^*, \end{aligned}$$
(95)
$$\begin{aligned} \varvec{C}_2= & {} [\varvec{I} \ \ \varvec{0}], \ \ \varvec{C}_1 = \varvec{C}_2 \varvec{T}^{-1}, \end{aligned}$$
(96)

to obtain SDEs equivalent to equations (27)–(29) with \(\mu = \theta = \nu = 1\). Let \(\varvec{x}^\epsilon _{t} \in \mathbb {R}^{d}\) be the solution to the SDEs (37)–(42), with the matrices \(\varvec{g}(\varvec{x})\) and \(\varvec{h}(\varvec{x})\) positive definite for every \(\varvec{x} \in \mathbb {R}^d\). Assume, moreover, that \(\varvec{g}(\varvec{x})\), \(\varvec{h}(\varvec{x})\) and the diagonal matrix \(\varvec{\varOmega }^2\) commute. Suppose that Assumptions 5–7 are satisfied.

Then as \(\epsilon \rightarrow 0\), the process \(\varvec{x}^\epsilon _{t}\) converges to the solution, \(\varvec{X}_{t}\), of the following Itô SDE

$$\begin{aligned} d\varvec{X}_{t} = \varvec{S}(\varvec{X}_{t}) dt + (\varvec{g}\varvec{h})^{-1}(\varvec{X}_{t})\varvec{F}(\varvec{X}_{t}) dt + (\varvec{g} \varvec{h})^{-1}(\varvec{X}_{t}) \varvec{\sigma }(\varvec{X}_{t}) d\varvec{W}_{t}, \end{aligned}$$
(97)

with \(\varvec{S}(\varvec{X}_{t}) = \varvec{S}^{(1)}(\varvec{X}_{t}) + \varvec{S}^{(2)}(\varvec{X}_{t}) + \varvec{S}^{(3)}(\varvec{X}_{t}),\) where the \(\varvec{S}^{(k)}\) are the noise-induced drift terms whose ith components are given by the expressions

$$\begin{aligned} S^{(1)}_{i}(\varvec{X})&= m_{0} \frac{\partial }{\partial X_{l}}[((\varvec{g}\varvec{h})^{-1})_{ij}(\varvec{X})] (\varvec{J}_{11})_{jl}(\varvec{X}), \end{aligned}$$
(98)
$$\begin{aligned} S^{(2)}_{i}(\varvec{X})&= -\tau _{\kappa }\left( \frac{\partial }{\partial X_{l}}[( \varvec{h}^{-1})_{ij}(\varvec{X})] (\varvec{J}_{21})_{jl}(\varvec{X}) + \frac{\partial }{\partial X_{l}}[( \varvec{h}^{-1} (\varvec{I}-2\varvec{\varOmega }^{-2}))_{ij}(\varvec{X})] (\varvec{J}_{31})_{jl}(\varvec{X}) \right) , \end{aligned}$$
(99)
$$\begin{aligned} S^{(3)}_{i}(\varvec{X})&= \tau _{h} \left( \frac{\partial }{\partial X_{l}}[((\varvec{g}\varvec{h})^{-1} \varvec{\sigma })_{ij}(\varvec{X})] (\varvec{J}_{41})_{jl}(\varvec{X}) + \frac{\partial }{\partial X_{l}}[((\varvec{g}\varvec{h})^{-1} \varvec{\sigma } \varvec{\varOmega }^{-2} )_{ij}(\varvec{X})] (\varvec{J}_{51})_{jl}(\varvec{X}) \right) , \ \ \end{aligned}$$
(100)

where \(i,j,l = 1, \dots , d\). In the above,

$$\begin{aligned} \varvec{\hat{J}} := \begin{bmatrix} \varvec{J}_{11}&\dots&\varvec{J}_{15} \\ \vdots&\ddots&\vdots \\ \varvec{J}_{51}&\dots&\varvec{J}_{55} \end{bmatrix} \in \mathbb {R}^{5d \times 5d}, \ \text { where } \varvec{J}_{kl} \in \mathbb {R}^{d \times d}, \ \ k,l = 1,\dots ,5, \end{aligned}$$
(101)

is the block matrix solving the Lyapunov equation

$$\begin{aligned} \varvec{\hat{J}} \varvec{\hat{\gamma }}^{*} + \varvec{\hat{\gamma }} \varvec{\hat{J}} = \varvec{\hat{\sigma }} \varvec{\hat{\sigma }}^{*}, \end{aligned}$$
(102)

and

$$\begin{aligned} \varvec{\hat{\gamma }} = \left[ \begin{array}{ccccc} \varvec{0} &{} \frac{\varvec{g}(\varvec{X}) }{m_{0}} &{} \frac{\varvec{g}(\varvec{X})}{m_{0}} &{} -\frac{\varvec{\sigma }(\varvec{X}) }{m_{0}} &{} \varvec{0} \\ -\frac{\varvec{h}(\varvec{X}) }{\tau _{\kappa }} &{} \frac{\varvec{\varOmega }^2}{2 \tau _{\kappa }} &{} \frac{2 \varvec{I}}{\tau _{\kappa }} - \frac{\varvec{\varOmega }^2}{2 \tau _{\kappa }} &{} \varvec{0} &{} \varvec{0} \\ \varvec{0} &{} -\frac{\varvec{\varOmega }^2}{2 \tau _{\kappa }} &{} \frac{\varvec{\varOmega }^2}{2 \tau _{\kappa }} &{} \varvec{0} &{} \varvec{0} \\ \varvec{0} &{} \varvec{0} &{} \varvec{0} &{} \varvec{0} &{} -\frac{1}{\tau _{h}} \varvec{I} \\ \varvec{0} &{} \varvec{0} &{} \varvec{0} &{} \frac{\varvec{\varOmega }^2}{\tau _{h}} &{} \frac{\varvec{\varOmega }^2}{\tau _{h}} \end{array} \right] , \ \ \ \ \varvec{\hat{\sigma }} = \left[ \begin{array}{c} \varvec{0} \\ \varvec{0} \\ \varvec{0} \\ \varvec{0} \\ \frac{\varvec{\varOmega }^2}{\tau _{h}} \end{array} \right] . \end{aligned}$$
(103)

In the above, \(\varvec{\hat{\gamma }} \in \mathbb {R}^{5d \times 5d}\) is a 5 by 5 block matrix with each block an \(\mathbb {R}^{d \times d}\)-valued matrix, \( \varvec{\hat{\sigma }} \in \mathbb {R}^{5d \times d}\) is a 5 by 1 block matrix with each block a \(\mathbb {R}^{d \times d}\)-valued matrix, \(\varvec{I}\) is a \(d \times d\) identity matrix, \(\varvec{0}\) in \(\varvec{\hat{\gamma }}\) and \(\varvec{\hat{\sigma }}\) is a \(d \times d\) zero matrix, and \(\varvec{W}\) is a d-dimensional Wiener process. The convergence is obtained in the same sense as in Theorem 2.

Note that the oscillatory nature of the covariance function of the harmonic noise in the pre-limit SIDE makes the noise-induced drift in the resulting limiting SDE more complicated (there are more terms) than in the case of the OU process in the first sub-class. Therefore, we write the system of matrix equations that the \(\varvec{J}_{kl}\) satisfy in the form of a matrix Lyapunov equation in Corollary 5, without breaking it up into equations for individual blocks. This could of course be done, leading to a (more complicated) analog of (78)–(82). The proof of Corollary 5 is essentially identical to the proof of Corollary 3, so we omit it.

Again, for special one-dimensional systems, we are going to make the result more explicit.

Corollary 6

In the one-dimensional case, we drop the boldface and write \(\varvec{X}_{t} := X_{t} \in \mathbb {R}, \ \varvec{g}(\varvec{x}) := g(x),\) with \(g: \mathbb {R}\rightarrow \mathbb {R}\), etc. We assume that \(h=g\) and \(\varvec{\varOmega } := \varOmega \) is a real constant. The homogenized equation is given by:

$$\begin{aligned} dX_{t} = S(X_{t}) dt + g^{-2}(X_{t}) F(X_{t}) dt + g^{-2}(X_{t}) \sigma (X_{t}) dW_{t}, \end{aligned}$$
(104)

with \(S(X_{t}) = S^{(1)}(X_{t}) + S^{(2)}(X_{t}) + S^{(3)}(X_{t}),\) where the noise-induced drift terms \(S^{(k)}(X_{t})\) have the following explicit expressions (computed using Mathematica\(^{\textregistered }\)) that depend on the parameters \(m_{0}, \tau _{\kappa }\) and \(\tau _{h}\):

$$\begin{aligned} S^{(1)}(X)&= m_{0}\left( \frac{1}{g^2(X)}\right) 'J_{11}(X),\end{aligned}$$
(105)
$$\begin{aligned} S^{(2)}(X)&= -\tau _{\kappa }\left( \frac{1}{g(X)}\right) ' \left( J_{21}(X)+\left( 1-\frac{2}{\varOmega ^2} \right) J_{31}(X) \right) , \end{aligned}$$
(106)
$$\begin{aligned} S^{(3)}(X)&= \tau _{h} \left( \frac{\sigma (X)}{g^2(X)}\right) ' \left( J_{41}(X)+\frac{1}{\varOmega ^2} J_{51}(X) \right) , \end{aligned}$$
(107)

where the prime \('\) denotes derivative with respect to X and the \(J_{kl}(X)\) are given by:

$$\begin{aligned} J_{11}(X)&= \frac{\sigma ^2}{2m_{0} g^2 R(X)} \bigg ( g^4 \tau _{\kappa }^4(\tau _{\kappa }^2+\tau _{\kappa } \tau _{h}\varOmega ^2 + \tau _{h}^2 \varOmega ^2) + m_{0}^2 \varOmega ^4 (\tau _{\kappa }+\tau _{h})^2 (\tau _{\kappa }^2 + \tau _{h}^2 \nonumber \\&\quad + \tau _{\kappa } \tau _{h} (\varOmega ^2-2)) + m_{0} \varOmega ^2 g^2 (\tau _{\kappa } + \tau _{h}) [\tau _{h}^4 + \tau _{\kappa }^2\tau _{h}^2 (\varOmega ^2-2)+\tau _{\kappa }^4(\varOmega ^2-1) \nonumber \\&\quad + \tau _{\kappa }^3 \tau _{h}(2-3\varOmega ^2+\varOmega ^4)] \bigg ) \end{aligned}$$
(108)
$$\begin{aligned} J_{21}(X)&= \frac{\sigma ^2(\tau _{\kappa } + \tau _{h}) \varOmega ^2}{4g R(X)} \bigg ( m_{0}\varOmega ^2 (\tau _{\kappa } + \tau _{h})(\tau _{\kappa }^2 + \tau _{h}^2 + \tau _{\kappa } \tau _{h} (\varOmega ^2-2)) \nonumber \\&\quad + g^2(\tau _{\kappa }^4+\tau _{\kappa }^2 \tau _{h}^2 + \tau _{h}^4+\tau _{\kappa }^3 \tau _{h}(\varOmega ^2-1)) \bigg ), \end{aligned}$$
(109)
$$\begin{aligned} J_{31}(X)&= -\frac{\sigma ^2(\tau _{\kappa } + \tau _{h}) \varOmega ^2}{4g R(X)} \bigg (-m_{0}\varOmega ^2 (\tau _{\kappa } + \tau _{h})(\tau _{\kappa }^2 + \tau _{h}^2 + \tau _{\kappa } \tau _{h} (\varOmega ^2-2)) \nonumber \\&\quad + g^2(\tau _{\kappa }^4+\tau _{\kappa }^2 \tau _{h}^2 - \tau _{h}^4+\tau _{\kappa }^3 \tau _{h}(\varOmega ^2-1)) \bigg ), \end{aligned}$$
(110)
$$\begin{aligned} J_{41}(X)&= \frac{1}{2R(X)} \bigg (\sigma \varOmega ^2 (\tau _{\kappa }+\tau _{h}) [g^2 \tau _{h}^4 + m_{0} \varOmega ^2 (\tau _{\kappa }+\tau _{h})(\tau _{\kappa }^2+\tau _{h}^2+\tau _{\kappa }\tau _{h}(\varOmega ^2-2))] \bigg ), \end{aligned}$$
(111)
$$\begin{aligned} J_{51}(X)&= -\frac{1}{2R(X)} \bigg (\sigma \varOmega ^2 (\tau _{\kappa }+\tau _{h}) [m_{0} \varOmega ^2 (\tau _{\kappa }+\tau _{h})(\tau _{\kappa }^2+\tau _{h}^2+\tau _{\kappa }\tau _{h}(\varOmega ^2-2)) \nonumber \\&\quad -g^2\tau _{\kappa }\tau _{h}^2(\tau _{\kappa }+\tau _{h}(\varOmega ^2-1))] \bigg ), \end{aligned}$$
(112)

where \(g=g(X)\), \(\sigma = \sigma (X)\) and

$$\begin{aligned} R(X)&= g^4 \tau _{h}^4 (\tau _{\kappa }^2 + \tau _{\kappa } \tau _{h} \varOmega ^2+\tau _{h}^2 \varOmega ^2) + m_{0}^2 \varOmega ^4 (\tau _{\kappa }+\tau _{h})^2(\tau _{\kappa }^2+\tau _{h}^2+\tau _{\kappa } \tau _{h} (\varOmega ^2-2)) \nonumber \\&\quad +g^2m_{0}\tau _{h}^2 \varOmega ^2[\tau _{h}^3\varOmega ^2+\tau _{\kappa }^3(\varOmega ^2-2)+\tau _{\kappa }^2 \tau _{h}\varOmega ^2(\varOmega ^2-2)+\tau _{\kappa }\tau _{h}^2(2-2\varOmega ^2+\varOmega ^4)]. \end{aligned}$$
(113)

Note that if we send \(\varOmega \rightarrow \infty \) in the expressions for the \(S^{(i)}(X)\) \((i=1,2,3)\) above, we recover the corresponding expressions given in Corollary 4 (with \(\alpha =1\)). This is not surprising, since in this limit the harmonic noise becomes an OU process (with \(\alpha = 1\)).

Moreover, when \(\tau _{\kappa } = \tau _{h} = \tau \), the noise-induced drift becomes \(S(X) = S^{(1)}(X)+S^{(2)}(X)+S^{(3)}(X),\) where

$$\begin{aligned} S^{(1)}&= \frac{1}{2}\left( \frac{1}{g^2}\right) '\frac{\sigma ^2}{g^2}, \end{aligned}$$
(114)
$$\begin{aligned} S^{(2)}&= -\frac{2\tau \varOmega ^2 \sigma ^2}{g} \left( \frac{1}{g}\right) ' \left( \frac{g^2 \tau + m_{0}\varOmega ^2(\varOmega ^2-1)}{4 m_{0}^2 \varOmega ^6+2 g^2 m_{0} \tau \varOmega ^4(\varOmega ^2-1)+g^4 \tau ^2 (1+2\varOmega ^2)} \right) , \end{aligned}$$
(115)
$$\begin{aligned} S^{(3)}&= 2 \tau \varOmega ^2 \sigma \left( \frac{\sigma }{g^2}\right) ' \left( \frac{g^2 \tau + m_{0}\varOmega ^2(\varOmega ^2-1)}{4 m_{0}^2 \varOmega ^6+2 g^2 m_{0} \tau \varOmega ^4(\varOmega ^2-1)+g^4 \tau ^2 (1+2\varOmega ^2)} \right) . \end{aligned}$$
(116)

Again, in the case when the fluctuation–dissipation relation holds we see that the noise-induced drift coincides with that obtained in the limit as \(m \rightarrow 0\) of the Markovian model in Sect. 3.
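The consistency between the equal-time-scale formulas (115), (116) and Corollary 4 in the limit \(\varOmega \rightarrow \infty \) can be illustrated numerically. Both (115) and (116) contain the common factor \(2\tau \varOmega ^2\) times the rational expression in parentheses, which should approach \(\tau /(\tau g^2 + 2 m_0)\), the corresponding factor in (86) and (87) with \(\tau _{\kappa } = \tau _{\eta } = \tau \) and \(\alpha = 1\). The Python sketch below compares the two for increasing \(\varOmega ^2\); the parameter values are illustrative assumptions.

```python
def harmonic_factor(g, m0, tau, omega2):
    """2*tau*Omega^2 times the rational factor shared by (115) and (116); omega2 = Omega^2."""
    num = g**2 * tau + m0 * omega2 * (omega2 - 1.0)
    den = (4.0 * m0**2 * omega2**3
           + 2.0 * g**2 * m0 * tau * omega2**2 * (omega2 - 1.0)
           + g**4 * tau**2 * (1.0 + 2.0 * omega2))
    return 2.0 * tau * omega2 * num / den

def ou_factor(g, m0, tau, alpha=1.0):
    """Matching factor in (86) and (87) with tau_kappa = tau_eta = tau."""
    return tau * (2.0 * tau) / (2.0 * (tau**2 * g**2 + m0 * alpha * 2.0 * tau))

# Illustrative parameter values (assumptions chosen only for this check).
g, m0, tau = 0.8, 0.5, 0.9
target = ou_factor(g, m0, tau)
relative_errors = [abs(harmonic_factor(g, m0, tau, w2) - target) / target
                   for w2 in (1e2, 1e4, 1e6)]
```

For these values the relative error decreases as \(\varOmega ^2\) grows, as expected if the harmonic noise approaches an OU process in this limit.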

6 Application to the Study of Thermophoresis

6.1 Introduction

We revisit the dynamics of a free Brownian particle immersed in a heat bath where a temperature gradient is present. This was previously studied in [42]. It was found there that the particle experiences a drift in response to the temperature gradient, due to the interplay between the inertial time scale and the noise correlation time scale. This phenomenon is called thermophoresis. We refer to [42, 43] and the references therein for further descriptions of this phenomenon, including references to experiments.

Here, we will study the dynamics of the particle in a non-equilibrium heat bath, where a generalized fluctuation–dissipation relation holds, in which both the diffusion coefficient and the temperature of the heat bath vary with the position. In contrast to [42], we take into account also the memory time scale (in addition to the inertial time scale and the noise correlation time scale) and model the position of the particle as the solution to a SIDE of the form (11). Unlike the model used in [42], the model can be derived heuristically from microscopic dynamics by an argument very similar to that of Appendix A.

For a spherical particle of radius R immersed in a fluid of viscosity \(\mu \), which in general is a function of the temperature \(T = T(x)\) (and thus depends on x as well), the friction (or damping) coefficient \(\gamma \) satisfies the Stokes law [13]:

$$\begin{aligned} \gamma (x) = 6 \pi \mu (T) R. \end{aligned}$$
(117)

On the other hand, the damping coefficient \(\gamma (x)\) and the noise coefficient \(\sigma (x)\) are expressed in terms of the diffusion coefficient D(x) and the temperature T(x) as follows:

$$\begin{aligned} \gamma (x) = \frac{k_{B}T(x)}{D(x)}, \ \ \sigma (x) = \frac{k_{B}T(x) \sqrt{2}}{\sqrt{D(x)}}. \end{aligned}$$
(118)

In the following, we study two one-dimensional non-Markovian models of thermophoresis. The first model is driven by a Markovian colored noise and the second model by a non-Markovian one.

6.2 A Thermophoresis Model with Ornstein–Uhlenbeck Noise

In this section we model the evolution of the position, \(x_{t} \in \mathbb {R}\), of a particle by the following SIDE:

$$\begin{aligned} m \ddot{x}_{t} = - \sqrt{\gamma (x_{t})} \int _{0}^{t} \alpha e^{-\alpha (t-s)} \sqrt{\gamma (x_{s})} \dot{x}_{s} ds + \sigma (x_{t}) \eta _{t}, \end{aligned}$$
(119)

where \(\eta _{t}\) is a stationary process, satisfying the SDE:

$$\begin{aligned} d\eta _{t} = -\alpha \eta _{t} dt + \alpha dW_{t}. \end{aligned}$$
(120)

The above equations are obtained by setting \(d=1\), \(\varvec{F} = 0\), \(\varvec{h} = \varvec{g} = g := \sqrt{\gamma }\), \(\varvec{\sigma } = \sigma \) in (15) and \(\varvec{A} = \alpha \) in (13), where \(\gamma \) and \(\sigma \) are given by (118). Note that the noise correlation function is proportional to the memory kernel in the SIDE (119), i.e.

$$\begin{aligned} E[\eta _{t} \eta _{s}] = \frac{\alpha }{2} e^{-\alpha |t-s|} = \frac{1}{2}\kappa _{1}(t-s), \ s,t \ge 0 \end{aligned}$$
(121)

as in (21). Together with (118), this implies that (119) satisfies the generalized fluctuation–dissipation relation (see the statement of Corollary 2 and Remark 12). Note also that g is a constant multiple of \(\sigma \) if and only if T is position-independent.

We now consider the effective dynamics of the particle in the limit when all the three characteristic time scales vanish at the same rate. In the following, the prime \('\) denotes derivative with respect to the argument of the function.

Corollary 7

Let \(\epsilon > 0\) be a small parameter and let the particle’s position, \(x^\epsilon _t \in \mathbb {R}\) (\(t \ge 0\)), satisfy the following rescaled version of (119), (120):

$$\begin{aligned} dx^\epsilon _t&= v^\epsilon _t dt, \end{aligned}$$
(122)
$$\begin{aligned} m_0 \epsilon d v^\epsilon _{t}&= \sigma (x^\epsilon _{t}) \eta ^\epsilon _{t} dt - \sqrt{\gamma (x^\epsilon _{t})} \left( \int _{0}^{t} \frac{\alpha }{\tau \epsilon } e^{-\frac{\alpha }{\tau \epsilon }\left( t-s\right) } \sqrt{\gamma (x^\epsilon _{s})} v^\epsilon _{s} ds \right) dt, \end{aligned}$$
(123)
$$\begin{aligned} \tau \epsilon d\eta ^\epsilon _t&= -\alpha \eta ^\epsilon _t dt + \alpha dW_t, \end{aligned}$$
(124)

where \(m_0\), \(\alpha \), \(\tau \) are positive constants, and \((W_t)\) is a one-dimensional Wiener process. The initial conditions are random variables \(x^\epsilon _0 = x\), \(v^\epsilon _0 = v\), independent of \(\epsilon \) and (statistically) independent of \((W_t)\), and \(\eta ^\epsilon _0\) is distributed according to the invariant distribution of the SDE (124).

Assume that the assumptions of Corollary 3 are satisfied (in particular, \(\gamma (x) > 0\) for every \(x \in \mathbb {R}\)). Then, in the limit as \(\epsilon \rightarrow 0\), \(x^\epsilon _t\) converges (in the same sense as in Corollary 3) to the process \(X_{t} \in \mathbb {R}\), satisfying the SDE:

$$\begin{aligned} dX_{t} = b_{1}(X_{t}) dt + \sqrt{2 D(X_{t})} dW_{t}, \end{aligned}$$
(125)

with the noise-induced drift, \(b_{1}(X) = S^{(1)}(X) + S^{(2)}(X) + S^{(3)}(X)\), where

$$\begin{aligned} S^{(1)}(X)&= D'(X)-\frac{D(X)T'(X)}{T(X)},\end{aligned}$$
(126)
$$\begin{aligned} S^{(2)}(X)&= \left[ -\frac{k_{B}T(X)D'(X)}{D(X)}+ k_{B}T'(X) \right] \cdot \left[ \frac{ \tau D(X)}{\tau k_{B}T(X)+2m_{0}\alpha D(X)} \right] , \end{aligned}$$
(127)
$$\begin{aligned} S^{(3)}(X)&= \left[ \frac{k_{B}T(X)D'(X)}{D(X)} \right] \cdot \left[ \frac{ \tau D(X)}{\tau k_{B}T(X)+2m_{0}\alpha D(X)} \right] . \end{aligned}$$
(128)

Proof

The corollary follows from Corollary 4. In particular, the expressions for \(S^{(1)}\), \(S^{(2)}\) and \(S^{(3)}\) follow from applying Corollary 4 to the present system (see (85)–(87)). \(\square \)
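The substitution behind this proof can be checked numerically. The Python sketch below evaluates (85)–(87) with \(g = \sqrt{\gamma }\), \(\tau _{\kappa } = \tau _{\eta } = \tau \) and \(\gamma \), \(\sigma \) given by (118), and compares the result with the closed forms (126)–(128). The temperature and diffusion profiles, the nondimensional choice \(k_B = 1\) and all parameter values are assumptions made only for this illustration, and derivatives are approximated by central finite differences.

```python
import math

K_B = 1.0  # Boltzmann constant in nondimensional units (assumption for this check)

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def T(x):
    return 2.0 + 0.5 * math.sin(x)     # hypothetical temperature profile

def D(x):
    return 1.0 + 0.3 * math.cos(x)     # hypothetical diffusion coefficient

def gamma(x):
    return K_B * T(x) / D(x)           # damping coefficient, (118)

def sigma(x):
    return K_B * T(x) * math.sqrt(2.0) / math.sqrt(D(x))  # noise coefficient, (118)

def g(x):
    return math.sqrt(gamma(x))

def drifts_from_corollary_4(x, m0, alpha, tau):
    """(85)-(87) with g = sqrt(gamma) and tau_kappa = tau_eta = tau."""
    gx, sx = g(x), sigma(x)
    den = tau**2 * gx**2 + 2.0 * m0 * alpha * tau
    s1 = derivative(lambda y: 1.0 / g(y)**2, x) * sx**2 / (2.0 * gx**2)
    s2 = -derivative(lambda y: 1.0 / g(y), x) * sx**2 * 2.0 * tau**2 / (2.0 * gx * den)
    s3 = derivative(lambda y: sigma(y) / g(y)**2, x) * sx * 2.0 * tau**2 / (2.0 * den)
    return s1, s2, s3

def drifts_from_corollary_7(x, m0, alpha, tau):
    """Closed forms (126)-(128)."""
    Tx, Dx = T(x), D(x)
    dT, dD = derivative(T, x), derivative(D, x)
    factor = tau * Dx / (tau * K_B * Tx + 2.0 * m0 * alpha * Dx)
    s1 = dD - Dx * dT / Tx
    s2 = (-K_B * Tx * dD / Dx + K_B * dT) * factor
    s3 = (K_B * Tx * dD / Dx) * factor
    return s1, s2, s3

from_cor4 = drifts_from_corollary_4(0.3, m0=0.5, alpha=1.3, tau=0.9)
from_cor7 = drifts_from_corollary_7(0.3, m0=0.5, alpha=1.3, tau=0.9)
```

The two triples agree up to finite-difference error for these sample profiles, which is what the proof asserts in general.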

We close this subsection with some remarks on, and a discussion of, the contents of Corollary 7.

Remark 13

We see that in this case a part of \(S^{(2)}\) cancels \(S^{(3)}\) and therefore the noise-induced drift simplifies to:

$$\begin{aligned} b_{1}(X) = D'(X)-\frac{2m_{0}\alpha D^2(X)}{\tau k_{B}T(X)+2m_{0}\alpha D(X)}\frac{T'(X)}{T(X)}. \end{aligned}$$
(129)

Using the Stokes law (117) which gives

$$\begin{aligned} D(X) = \frac{k_{B}}{6 \pi R} \frac{T(X)}{\mu (T)}, \end{aligned}$$
(130)

where \(\mu (T) = \mu (T(X))\), we have

$$\begin{aligned} b_{1}(X) = k_{B} T'(X) \left( \frac{\tau }{2(\alpha m_{0} + 3 \pi R \tau \mu (T))} - \frac{\mu '(T) T(X)}{6 \pi R \mu ^2(T)} \right) . \end{aligned}$$
(131)

Equation (129) gives the thermophoretic drift in the limit when the three characteristic time scales vanish. Since it arises in the absence of an external force acting on the particle, it is a “spurious drift” caused by the presence of the temperature gradient and the state-dependence of the diffusion coefficient. Compared to Eq. (101) in [8], the drift term derived here contains a correction term due to the temperature profile.

Discussion We discuss some physical implications of the thermophoretic drift given in (129). As discussed in [42], the sign of \(b_{1}(X)\) determines the direction in which the particle is expected to travel. The particle will eventually reach some boundaries, which can be either absorbing or reflecting. We are going to consider the reflecting boundaries case.

In the reflecting boundaries case, the position of the particle, satisfying the SDE (125), reaches a steady-state distribution \(\rho _{\infty }(X)\) in the limit \(t \rightarrow \infty \). Assuming that the particle is confined to the interval \((a,b)\), \(a<b\), one can compute the stationary density:

$$\begin{aligned} \rho _{\infty }(X) = C \exp {\left( -\int _{a}^{X} \frac{2\alpha }{r \gamma (y) + 2\alpha } \frac{T'(y)}{T(y)} dy \right) }, \end{aligned}$$
(132)

where, in terms of the original parameters of the model, \(r := \tau /m_{0} > 0\), and C is a normalizing constant. In particular, in the absence of a temperature gradient (\(T'(y) = 0\)), the particle is equally likely to be found anywhere in \((a,b)\), whereas when a temperature gradient is present, the distribution of the particle’s position is not uniform. In the limit \(r \rightarrow \infty \), the particle’s position is again distributed uniformly on \((a,b)\). On the other hand, in the limit \(r \rightarrow 0\) the stationary density is inversely proportional to the temperature, i.e. \(\rho _{\infty }(X) = \tilde{C} T(X)^{-1},\) where \(\tilde{C}\) is a normalizing constant. Thus, the particle is more likely to be found in the colder region. In the special case when D(X) is proportional to T(X), so that \(\gamma \) is independent of X, we have

$$\begin{aligned} \rho _{\infty }(X) = \tilde{C} T(X)^{-\frac{2\alpha }{2\alpha +r \gamma }}, \end{aligned}$$
(133)

where \(\tilde{C}\) is a normalizing constant, so the particle is more likely to be found in the colder region, with the likelihood decreasing as r increases.
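As an illustration of how (132) reduces to (133) when \(\gamma \) is constant, the sketch below evaluates the integral in (132) by Simpson's rule and compares it with the power law (133). All numerical values, the quadratic temperature profile `T`, and the normalization \(\rho _{\infty }(a) = 1\) are hypothetical choices made only for this example.

```python
import math

# Hypothetical, dimensionless values chosen only for illustration.
ALPHA, R_RATIO, GAMMA = 1.0, 2.0, 3.0   # alpha, r = tau/m0, constant gamma
A, B = 0.0, 1.0                          # endpoints of the interval (a, b)

def T(y):
    """Hypothetical temperature profile."""
    return 1.0 + y * y

def dT(y):
    """Derivative of the profile above."""
    return 2.0 * y

def integrand(y):
    """Integrand of Eq. (132) for constant gamma."""
    return 2.0 * ALPHA / (R_RATIO * GAMMA + 2.0 * ALPHA) * dT(y) / T(y)

def rho_quadrature(x, n=2000):
    """Unnormalized density from Eq. (132) (C = 1), via composite Simpson's rule."""
    h = (x - A) / n
    total = integrand(A) + integrand(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(A + i * h)
    return math.exp(-total * h / 3.0)

def rho_closed_form(x):
    """Eq. (133), with the constant chosen so that rho(a) = 1."""
    p = 2.0 * ALPHA / (2.0 * ALPHA + R_RATIO * GAMMA)
    return (T(x) / T(A)) ** (-p)

for x in (A, 0.3, B):
    print(x, rho_quadrature(x), rho_closed_form(x))
```

For this increasing profile both expressions decrease from \(a\) to \(b\), so the density is largest at the colder end of the interval.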

Next, we study the sign of the thermophoretic drift directly using (129) (this is in contrast to the approach in [42], where \(\mu (T)\) is expanded around a fixed temperature). We find that \(b_{1}(X) > 0\) if and only if \(r > r_c\), where \(r_c\) is the critical value of the ratio \(\tau /m_0\), given by:

$$\begin{aligned} r_c = \frac{\alpha }{3 \pi R \mu (T)} \left( \frac{\mu '(T) T(X)}{\mu (T) -\mu '(T)T(X)} \right) , \end{aligned}$$
(134)

where \(\mu (T) = \mu (T(X))\) is obtained from the Stokes law. For \(r = r_c\), the stationary density (132) reduces to:

$$\begin{aligned} \rho ^{c}_{\infty }(X) = C \frac{\mu (T(X))}{T(X)}, \end{aligned}$$
(135)

where C is a normalizing constant. Importantly, note that the drift does not change sign if T is independent of X.
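The critical ratio (134) can be illustrated with a short numerical sketch. It assumes a hypothetical viscosity law `mu` that increases with temperature, chosen so that \(\mu '(T) > 0\) and \(\mu (T) - \mu '(T)T > 0\), and it writes the factor multiplying \(k_{B}T'(X)\) in (131) in terms of \(r = \tau /m_{0}\). The numerical values of `ALPHA` and `R` are likewise illustrative.

```python
import math

# Hypothetical, dimensionless values chosen only for illustration.
ALPHA, R = 1.3, 0.5

def mu(t):
    """Hypothetical viscosity increasing with T, so that mu - mu' T = 1 > 0."""
    return 1.0 + 0.2 * t

def dmu(t):
    """Derivative mu'(T) of the viscosity above."""
    return 0.2

def r_c(t):
    """Critical ratio of tau/m0 from Eq. (134)."""
    return ALPHA / (3.0 * math.pi * R * mu(t)) * dmu(t) * t / (mu(t) - dmu(t) * t)

def drift_factor(r, t):
    """Factor multiplying k_B T'(X) in Eq. (131), rewritten with r = tau/m0."""
    first = r / (2.0 * (ALPHA + 3.0 * math.pi * R * r * mu(t)))
    second = dmu(t) * t / (6.0 * math.pi * R * mu(t) ** 2)
    return first - second

t = 1.5
rc = r_c(t)
print("r_c =", rc)
print("factor below, at, above r_c:",
      drift_factor(0.5 * rc, t), drift_factor(rc, t), drift_factor(2.0 * rc, t))
```

In this sketch the factor is negative below \(r_c\), vanishes at \(r_c\) and is positive above it, so for \(T'(X) > 0\) the drift changes sign exactly at the critical ratio.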

We now discuss a special case. When \(\mu (T)=\mu _{0} > 0\) is a constant (so that \(\gamma (X)\) is a constant), the thermophoretic drift is given by:

$$\begin{aligned} b_{1}(X) = \frac{k_{B}T'(X)}{6 \pi R \mu _{0}} \left[ 1 - \frac{\alpha }{\alpha + 3 \pi r R \mu _{0}} \right] . \end{aligned}$$
(136)

In agreement with the result in [42], \(b_{1}(X)\) has the same sign as \(T'(X)\), leading to a flow towards the hotter region. The steady-state density is

$$\begin{aligned} \rho _{\infty }(X) = C T(X)^{-\frac{\alpha }{\alpha + 3 \pi r R \mu _{0}}}, \end{aligned}$$
(137)

where C is a normalizing constant, and the particle is more likely to be found in the colder region for all \(r > 0\), even though the thermophoretic drift actually directs the particle towards the hotter regions. This effect is in agreement with experiments, and is explained by the presence of reflecting boundary conditions.
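The coexistence of a drift towards the hotter region with a density concentrated in the colder region can be traced through the zero-flux condition. The following is a sketch rather than part of the proof. It assumes that the limiting SDE (125) is an Itô equation of the same form as (142), with \(b_{1}\) in place of \(b_{2}\). The symbols \(J\) (the stationary probability flux), \(c := 3 \pi r R \mu _{0}\) and the constants \(C_{1}\), \(C_{2}\) are introduced here only for illustration. With reflecting boundaries the stationary flux vanishes:

$$\begin{aligned} J = b_{1}(X) \rho _{\infty }(X) - \left( D(X) \rho _{\infty }(X) \right) ' = 0 \quad \Longrightarrow \quad \rho _{\infty }(X) = \frac{C_{1}}{D(X)} \exp {\left( \int _{a}^{X} \frac{b_{1}(y)}{D(y)} dy \right) }. \end{aligned}$$

For constant \(\mu _{0}\), the Stokes law gives \(D'(X) = k_{B}T'(X)/(6 \pi R \mu _{0})\), so (136) reads \(b_{1}(X) = \frac{c}{\alpha + c} D'(X)\) and therefore

$$\begin{aligned} \rho _{\infty }(X) = \frac{C_{1}}{D(X)} \left( \frac{D(X)}{D(a)} \right) ^{\frac{c}{\alpha + c}} = C_{2} D(X)^{-\frac{\alpha }{\alpha + c}}, \end{aligned}$$

which agrees with (137) because \(D(X)\) is proportional to \(T(X)\). The exponential factor increases with \(T\), reflecting the drift towards the hotter region, but it is outweighed by the prefactor \(1/D(X)\), which is largest where diffusion is slowest, that is, in the colder region.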

Remark 14

Strictly speaking, reflecting boundary conditions should be first considered for the positive-epsilon version of the process. For instance, at the moment of hitting a boundary point, the velocity can be required to change its sign instantaneously. One has to ask whether such a modification might have an effect on the limiting process. One could, in principle, resolve this problem by adopting the following strategy. Instead of reflecting from the boundaries, the particle may continue going in the same direction, with the coefficients of the equation obtained by reflection in the boundary point (i.e. \(\sigma (b + x_t^\epsilon ) = \sigma (b - x_t^\epsilon )\), where b is one of the boundary points etc.). The above asymptotic analysis may be conducted for the resulting system. The coefficients of this system are periodic (with the period \(2(b-a)\)) and continuous, but nondifferentiable at isolated points, so this would involve additional technical work. Also, one would have to prove that putting the limiting system back on a bounded interval indeed gives rise to a process reflecting at the boundaries. We do not pursue the details here.

6.3 A Thermophoresis Model with Non-Markovian (Harmonic) Noise

We repeat the analysis of the previous subsection in the case when the colored noise is a harmonic noise. We set \(d=1\), \(\varvec{F} = 0\), \(\varvec{h} = \varvec{g} = g := \sqrt{\gamma }\), \(\varvec{\sigma } = \sigma \), \(\varvec{\varOmega } = \varOmega \), \(\varvec{\varOmega }_0 = \varOmega _0 := \varOmega \sqrt{1-\varOmega ^2/4}\), \(\varvec{\varOmega }_1 = \varOmega _1 := \varOmega /\sqrt{1-\varOmega ^2/4}\) (where \(|\varOmega |<2\)) in the SIDE (20) and study the effective dynamics of the resulting system as before. The case where \(|\varOmega |>2\) can be studied analogously. The following result follows from Corollary 6.

Corollary 8

Let \(\epsilon > 0\) be a small parameter and the particle’s position, \(x^\epsilon _t \in \mathbb {R}\ (t \ge 0)\), satisfy the following rescaled SDEs:

$$\begin{aligned} dx^\epsilon _t&= v^\epsilon _t dt, \end{aligned}$$
(138)
$$\begin{aligned} m_0 \epsilon dv^\epsilon _{t}&= -\frac{\sqrt{\gamma (x^\epsilon _{t})}}{\tau \epsilon } \left( \int _{0}^{t} e^{-\frac{\varOmega ^2}{2\tau \epsilon }\left( t-s\right) }\left[ \cos \left( \frac{\varOmega _{0}}{\tau \epsilon }(t-s) \right) \right. \right. \nonumber \\&\quad \left. \left. + \frac{\varOmega _{1}}{2} \sin \left( \frac{\varOmega _{0}}{\tau \epsilon }(t-s) \right) \right] \sqrt{\gamma (x^\epsilon _s)} v^\epsilon _{s} ds \right) dt + \sigma (x^\epsilon _{t}) h^\epsilon _{t} dt, \end{aligned}$$
(139)
$$\begin{aligned} \tau \epsilon dh^\epsilon _t&= u^\epsilon _t dt, \end{aligned}$$
(140)
$$\begin{aligned} \tau \epsilon du^\epsilon _t&= -\varOmega ^2 u^\epsilon _t dt - \varOmega ^2 h^\epsilon _t dt + \varOmega ^2 dW_t, \end{aligned}$$
(141)

where \(m_0\) and \(\tau \) are positive constants, \(\varOmega \), \(\varOmega _0\) and \(\varOmega _1\) are constants defined as before, and \((W_t)\) is a one-dimensional Wiener process. The initial conditions are given by the random variables \(x^\epsilon _0 = x\), \(v^\epsilon _0 = v\), independent of \(\epsilon \), and \((h^\epsilon _0, u^\epsilon _0)\) are distributed according to the invariant measure of the SDEs (140), (141).

Assume that the assumptions in Corollary 5 are satisfied. Then, in the limit as \(\epsilon \rightarrow 0\), the process \(x^\epsilon _t\) converges (in the same sense as Corollary 5) to the process \(X_{t} \in \mathbb {R}\), satisfying the SDE:

$$\begin{aligned} dX_{t} = b_{2}(X_{t}) dt + \sqrt{2D(X_{t})} dW_{t}, \end{aligned}$$
(142)

where the noise-induced drift term is given by:

$$\begin{aligned} b_{2}(X)&= D'(X) \end{aligned}$$
(143)
$$\begin{aligned}&\quad - \frac{(4m_{0}^2 \varOmega ^6 D^2(X)+\tau ^2 (k_{B}T(X))^2)D(X)}{4m_{0}^2 \varOmega ^6 D^2(X)+2k_{B}T(X)m_{0}\tau \varOmega ^4(\varOmega ^2-1)D(X)+\tau ^2(1+2\varOmega ^2)(k_{B}T(X))^2}\frac{T'(X)}{T(X)}. \end{aligned}$$
(144)

We next discuss the contents of Corollary 8.

Remark 15

Note that \(b_{2}(X)\) differs from \(b_{1}(X)\) obtained previously and \(b_{2}(X) \rightarrow b_{1}(X)\), with \(\alpha =1\) in the expression for \(b_{1}(X)\), in the limit \(\varOmega \rightarrow \infty \).
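The limit stated in this remark can be checked numerically. The sketch below compares the coefficient of \(T'(X)/T(X)\) in (144) with the corresponding coefficient in (129) for \(\alpha = 1\), for increasing values of \(\varOmega \). The numerical values of \(D\), \(k_{B}T\), \(m_{0}\) and \(\tau \) are hypothetical and dimensionless, and the function names are introduced only for this example.

```python
def coeff_b1(D, kT, m0, tau, alpha=1.0):
    """Coefficient of T'(X)/T(X) in Eq. (129)."""
    return 2.0 * m0 * alpha * D**2 / (tau * kT + 2.0 * m0 * alpha * D)

def coeff_b2(D, kT, m0, tau, omega):
    """Coefficient of T'(X)/T(X) in Eq. (144)."""
    w2 = omega**2
    num = (4.0 * m0**2 * w2**3 * D**2 + tau**2 * kT**2) * D
    den = (4.0 * m0**2 * w2**3 * D**2
           + 2.0 * kT * m0 * tau * w2**2 * (w2 - 1.0) * D
           + tau**2 * (1.0 + 2.0 * w2) * kT**2)
    return num / den

# Hypothetical, dimensionless values chosen only for illustration.
D, kT, m0, tau = 0.3, 1.2, 1.0, 0.7

target = coeff_b1(D, kT, m0, tau, alpha=1.0)
for omega in (10.0, 100.0, 1000.0, 10000.0):
    print(omega, abs(coeff_b2(D, kT, m0, tau, omega) - target))
```

The printed differences shrink as \(\varOmega \) grows, which is consistent with the convergence \(b_{2}(X) \rightarrow b_{1}(X)\), since both drifts share the term \(D'(X)\).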

Discussion In the reflecting boundaries case, the stationary distribution of the particle’s position is

$$\begin{aligned}&\rho _{\infty }(X) \nonumber \\&\quad = C \exp {\left( -\int _{a}^{X} \frac{D(y)(4\varOmega ^6D^2(y)+r^2 (k_{B}T(y))^2)}{4 \varOmega ^6 D^2(y)+2r\varOmega ^4(\varOmega ^2-1)D(y)k_{B}T(y)+r^2(1+2\varOmega ^2)(k_{B}T(y))^2 }\frac{T'(y)}{T(y)} dy \right) }, \end{aligned}$$
(145)

where \(r := \tau /m_{0} > 0\) and C is a normalizing constant. Similarly to the previous model, in the absence of a temperature gradient (i.e. when T is a constant), the particle is equally likely to be found anywhere in \((a,b)\). When a temperature gradient is present, the distribution of the particle’s position is not uniform. However, in contrast to the previous model, in the limit \(r \rightarrow \infty \) the particle is not distributed uniformly on \((a,b)\) and in the limit \(r \rightarrow 0\) the stationary density is no longer inversely proportional to the temperature. Both distributions depend on the diffusion coefficient D(X) as well as on the temperature profile T(X).

We can also study the sign of the thermophoretic drift. In this case there can be up to two critical ratios, \(r_{c}\), at which \(b_{2}(X)\) changes sign, as the equation \(b_{2}(X) = 0\) is a quadratic equation in r. In the special case when \(\mu (T)=\mu _{0} > 0\) is a constant (and thus so is \(\gamma (X)\)), the thermophoretic drift is given by:

$$\begin{aligned} b_{2}(X) = \frac{k_{B}T'(X)}{6 \pi R \mu _{0}} \left[ 1 - \frac{\varOmega ^6+9\pi ^2R^2r^2\mu ^{2}_{0}}{\varOmega ^6+3 \pi R r \varOmega ^4(\varOmega ^2-1)\mu _{0} +9\pi ^2 R^2 r^2(1+2\varOmega ^2)\mu ^2_{0}} \right] . \end{aligned}$$
(146)

In contrast to the result for the previous model, \(b_{2}(X)\) has the same sign as \(T'(X)\) provided that

$$\begin{aligned} r > \frac{\varOmega ^2(1-\varOmega ^2)}{6\pi R \mu _{0}}. \end{aligned}$$
(147)

Thus, \(b_{2}(X)\) and \(T'(X)\) do not share the same sign for all \(r>0\), unless \( |\varOmega | \ge 1.\) According to this model, the presence of a temperature gradient allows us to tune the parameters \((m_{0}, \tau , \varOmega )\) to control the direction in which the particle travels. The steady-state density in this case is
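The threshold (147) can be illustrated numerically. The sketch below evaluates the bracket in (146) on either side of the threshold for one value with \(|\varOmega | < 1\) and checks that it stays positive for a value with \(|\varOmega | \ge 1\). The values of `R` and `MU0`, and the function names, are hypothetical choices made only for this example.

```python
import math

# Hypothetical, dimensionless values chosen only for illustration.
R, MU0 = 0.5, 1.0

def bracket(r, omega):
    """Bracket multiplying k_B T'(X) / (6 pi R mu_0) in Eq. (146)."""
    c = 3.0 * math.pi * R * r * MU0
    w2 = omega**2
    num = w2**3 + c**2
    den = w2**3 + c * w2**2 * (w2 - 1.0) + c**2 * (1.0 + 2.0 * w2)
    return 1.0 - num / den

def r_threshold(omega):
    """Right-hand side of Eq. (147)."""
    return omega**2 * (1.0 - omega**2) / (6.0 * math.pi * R * MU0)

omega = 0.5
rt = r_threshold(omega)
print("threshold:", rt)
print("bracket below, at, above:",
      bracket(0.5 * rt, omega), bracket(rt, omega), bracket(2.0 * rt, omega))

# For |Omega| >= 1 the threshold is non-positive, so every r > 0 satisfies (147).
print([bracket(r, 1.5) > 0.0 for r in (0.01, 0.1, 1.0, 10.0)])
```

For \(\varOmega = 0.5\) the bracket is negative below the threshold and positive above it, so the direction of the drift relative to \(T'(X)\) depends on \(r\), whereas for \(\varOmega = 1.5\) it is positive at all sampled values of \(r\).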

$$\begin{aligned} \rho _{\infty }(X) = C T(X)^{-\frac{\varOmega ^6+9\pi ^2R^2r^2\mu ^{2}_{0}}{\varOmega ^6+3 \pi R r \varOmega ^4(\varOmega ^2-1)\mu _{0} +9\pi ^2 R^2 r^2(1+2\varOmega ^2)\mu ^2_{0}}}, \end{aligned}$$
(148)

where C is a normalizing constant. The particle is more likely to be found in the colder region for all \(r>0\) if \(|\varOmega | \ge 1\), whereas this might not be true for all \(r>0\) if \(|\varOmega | < 1\).

7 Conclusions and Final Remarks

We have studied homogenization of a class of GLEs in the limit when three characteristic time scales, i.e. the inertial time, the characteristic memory time in the damping term, and the correlation time of colored noise driving the equations, vanish at the same rate. We have derived effective equations, which are simpler in three respects:

  1. The velocity variables have been homogenized. As a result, the number of degrees of freedom is reduced and there are no fast variables left.

  2. The equations become regular SDEs, since the memory time has been taken to zero.

  3. The system is driven by a white noise.

Importantly, noise-induced drifts are present in the limiting equations, resulting from the dependence of the coefficients of the original model on the state of the system. We have applied the general results to a study of thermophoretic drift, correcting the formulae obtained in an earlier work [42]. In systems satisfying a fluctuation–dissipation relation, the noise-induced drifts in the limiting SDEs for the particle’s position reduce to a single term, and for special cases the limiting SDEs coincide with those of [8]. However, in the more general case, new terms appear that are absent in the case without memory. To prove the main theorem, we have employed the main result of [8], proven here in a different version under a relaxed assumption on the damping matrix and the initial conditions.

Homogenization of other specific non-Markovian models can also be studied using the methods of this paper. An example is a system with exponentially decaying memory kernel, driven by white noise in the limit as the inertial and memory time scales vanish at the same rate. In this case the noise-induced drift in the limiting equation will consist of two terms, not three, as in the case studied here. Moreover, one could also study the case when the time scales of the system do not vanish at the same rate, along the lines of [3].

The colored noises considered in this paper have correlations decaying exponentially (short-range memory). It would be interesting to study cases where the GLE is driven by other colored noises such as fractional Gaussian noises, with covariances decaying as a power, relevant for modeling anomalous diffusion phenomena in fields ranging from biology to finance [44]. As mentioned in Sect. 2, we will explore homogenization for GLEs with vanishing effective damping and diffusion constant in a future work.