Glossary

Linear:

Here, linear with respect to a probability density.

Nonlinear:

Here, nonlinear with respect to a probability density.

Markov process:

Process for which it is sufficient to have information about the present in order to make best predictions about the future. Additional information about the past will not improve the predictions.

Martingale process \( \hat{Z} \):

Process for which the future mean value \( {\left\langle Z\left(t+\Delta t\right)\right\rangle}_{Z(t)=z} \) of the set of realizations Z(i) that pass at the present time t through a certain common state z equals that state: \( {\left\langle Z\left(t+\Delta t\right)\right\rangle}_{Z(t)=z}=z \). Additional information about states z′ visited at times t′ prior to t is irrelevant.

Definition of the Subject

Let \( \hat{X} \) denote a stochastic process defined on the space Ω and the time interval [t0, ∞), where t0 denotes the initial time of the process. We assume that the process \( \hat{X} \) can be described in terms of a random variable X ∈ Ω. More precisely, let X(t) denote the time-dependent evolution of the random variable X for t ≥ t0. Then, we assume that the process \( \hat{X} \) can be described in terms of the infinitely large set of realizations X(i)(t) of X(t) with i = 1, 2, …. The realizations i = 1, 2, … constitute a statistical ensemble. At every time t the probability density P of the process \( \hat{X} \) can be computed from the realizations X(i)(t), that is, from the ensemble by means of

$$ P\left(x,t\right)=\left\langle \delta \left(x-X(t)\right)\right\rangle, $$
(1)

where 〈⋅〉 denotes ensemble averaging and δ(·) is the delta function. We assume that at time t0 the process is distributed like u. That is, the function u(x) describes the initial probability density of the random variable X and we have P(x,t0) = u(x). In general, the evolution of P depends on how the process is distributed at initial time t0. In order to emphasize this point, we will use in what follows the notation P(x,t;u). That is, we interpret Eq. (1) as a conditional probability density with the constraint given by the initial distribution u:

$$ P\left(x,t;u\right)={\left\langle \delta \left(x-X(t)\right)\right\rangle}_{\left\langle \delta \left(x-X\left({t}_0\right)\right)\right\rangle =u(x)}. $$
(2)

We may also say that we study a family of stochastic processes (Frank 2005b). Each family member has a label or name which is given by u. For example, consider three experiments in which the evolution of dust particles in the air is observed for Gaussian, Lévy, and Cauchy initial distributions, respectively. It is known that dust particles perform a so-called Brownian random walk. So we would distinguish the three members \( {\hat{X}}_1,{\hat{X}}_2,{\hat{X}}_3 \) of our family of Brownian walk processes by the names of their initial distributions: Gauss, Lévy, and Cauchy.
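For a finite ensemble, the average in Eq. (1) can be approximated by a histogram over the realizations X(i)(t). The following sketch illustrates the ensemble definition only; the free Brownian dynamics, the Gaussian initial distribution, and all parameter values are assumptions made for this example:

```python
import numpy as np

# Hedged sketch: estimate P(x, t; u) of Eq. (1) from a finite ensemble
# of Brownian realizations X^(i)(t) started from a Gaussian initial
# distribution u(x).  All names and parameter values are illustrative.
rng = np.random.default_rng(0)

n_real = 200_000           # ensemble size (finite stand-in for i = 1, 2, ...)
t0, t, Q = 0.0, 1.0, 0.5   # initial time, observation time, noise amplitude

# initial ensemble distributed like u(x) = N(0, 0.1)
x0 = rng.normal(0.0, np.sqrt(0.1), n_real)

# free Brownian motion: X(t) = X(t0) + sqrt(2 Q (t - t0)) * xi
xt = x0 + np.sqrt(2.0 * Q * (t - t0)) * rng.standard_normal(n_real)

# the delta function in Eq. (1) becomes a histogram for finite ensembles
hist, edges = np.histogram(xt, bins=200, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# the exact density is Gaussian with variance 0.1 + 2 Q (t - t0)
var = 0.1 + 2.0 * Q * (t - t0)
exact = np.exp(-centers**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
print(np.max(np.abs(hist - exact)))  # small for large ensembles
```

As the ensemble size grows and the bin width shrinks, the histogram converges to the probability density defined by Eq. (1).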

Let us consider a stochastic process \( \hat{X} \) whose evolution of its probability density P is defined by a partial differential equation of the form

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=\left[-\frac{\partial }{\partial x}{D}_1\left(x,t\right)+\frac{\partial^2}{\partial {x}^2}{D}_2\left(x,t\right)\right]P\left(x,t;u\right), $$
(3)

where D1 and D2 are functions of state x and time t. The functions D1 and D2 are referred to as drift and diffusion coefficients and constitute the Fokker-Planck operator

$$ {L}^0\left(x,t\right)=-\frac{\partial }{\partial x}{D}_1\left(x,t\right)+\frac{\partial^2}{\partial {x}^2}{D}_2\left(x,t\right). $$
(4)

The evolution equation (3) is linear with respect to P. In this sense, Eq. (3) is a linear partial differential equation. Irrespective of this feature, the coefficients D1 and D2 may depend in a highly nonlinear fashion on the state x. For example, we may have \( {D}_1=-x+{x}^3 \).

For appropriately chosen coefficients D1 and D2, Eq. (3) describes the probability density P of a Markov process. In this case, Eq. (3) is referred to as a Fokker-Planck equation. More precisely, if \( \hat{X} \) is a Markov diffusion process whose probability density P is defined by Eq. (3), then Eq. (3) is called a Fokker-Planck equation. Note that roughly speaking, a Markov diffusion process is a Markov process characterized by a partial differential operator that can be truncated after the second-order partial derivative (see section “Kramers-Moyal Expansion”). In order to distinguish between linear and nonlinear Fokker-Planck equations, we will use the phrase “linear Fokker-Planck equation” instead of “Fokker-Planck equation.”

Let us generalize Eq. (3) by assuming that the drift and diffusion coefficients depend on the probability density P. In this case, Eq. (3) becomes

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=\left[-\frac{\partial }{\partial x}{D}_1\left(x,t,P\left(x,t;u\right)\right)+\frac{\partial^2}{\partial {x}^2}{D}_2\left(x,t,P\left(x,t;u\right)\right)\right]P\left(x,t;u\right). $$
(5)

Likewise, the operator (4) is generalized to

$$ L\left(x,t,P\left(x,t;u\right)\right)=-\frac{\partial }{\partial x}{D}_1\left(x,t,P\left(x,t;u\right)\right)+\frac{\partial^2}{\partial {x}^2}{D}_2\left(x,t,P\left(x,t;u\right)\right). $$
(6)

Equation (5) is nonlinear with respect to P(x, t; u). Since the structure of the differential operator in the bracket of Eq. (5) is equivalent to the structure of the differential operator (4), evolution equations of the form (5) are frequently called nonlinear Fokker-Planck equations. In this context, it is important to realize that the phrase “nonlinear Fokker-Planck equation” does not necessarily imply that we are dealing with a Markov process. The phrase “nonlinear Fokker-Planck equation” simply means that we are dealing with a nonlinear partial differential equation involving a partial differential operator that exhibits the structure of a Fokker-Planck operator.

Note again that if an evolution equation of the form (3) is referred to as a Fokker-Planck equation, then it is tacitly assumed that there exists a stochastic process defined by that equation and that this process is a Markov process. Table 1 summarizes how to define linear and nonlinear Fokker-Planck equations by means of structure, existence of solutions, and Markov property.

Table 1 Definition of linear and nonlinear Fokker-Planck equations based on structure, existence of solutions, and Markov property

Linear Fokker-Planck equations are an indispensable tool to describe stochastic processes in a variety of disciplines; see Fig. 1. The theoretical concept of Markov diffusion processes related to linear Fokker-Planck equations is well established. Researchers, applied scientists, technicians, research and development engineers in general, and financial engineers in particular are usually aware that the particular linear Fokker-Planck model they are using belongs to the class of Markov models. That is, the world of linear Fokker-Planck equations is closed and connected.

Fig. 1

Connected and disconnected applications of linear and nonlinear Fokker-Planck equations

Nonlinear Fokker-Planck equations are used in a variety of fields that are as diverse as the application fields of linear Fokker-Planck equations. Unfortunately, so far, there is no well-established theory connecting all kinds of nonlinear Fokker-Planck equations. There is not even an academic consensus on how to define them at all. This is why in Table 1 we used a very general and minimally constraining definition for nonlinear Fokker-Planck equations. Concepts of nonlinear Fokker-Planck equations are often developed for particular purposes and are not put into other contexts. That is, theoretical results and other achievements are often tailored to serve special needs and are not discussed in a larger framework. Even worse, so far, a well-established link between linear and nonlinear Fokker-Planck equations that applies to the variety of nonlinearities found in the literature does not exist. In sum, the world of nonlinear Fokker-Planck equations is disconnected. Different types of nonlinear Fokker-Planck equations and different application fields of nonlinear Fokker-Planck equations are often not related to each other, and nonlinear Fokker-Planck equations are only loosely connected with their linear “relatives”; see Fig. 1.

Therefore, there is a need for developing a unifying approach to nonlinear Fokker-Planck equations that involves the concept of linear Fokker-Planck equations and applies to all types of nonlinearities and, in doing so, to all kinds of scientific disciplines. Some first efforts in this regard have been made previously (Acebron et al. 2005; Chavanis 2003, 2004; Frank 2001b, 2002a, b, 2005b; Frank and Daffertshofer 1999, 2001a, b; Kaniadakis 2001a, b; Shiino 2002b, 2003). In the following sections, we will review these efforts, present them in a consistent way, and in doing so make a further effort in this direction.

Introduction

Linear and nonlinear Fokker-Planck equations are widely used to describe stochastic phenomena; see Fig. 1.

Linear Fokker-Planck equations (Gardiner 1997; Haken 2004; Risken 1989) have been introduced by Fokker (1914) and Planck (1917). In physics, linear Fokker-Planck equations have been used, for example, to describe Brownian motion, that is, the diffusion of dust particles in air or fluid layers (Reif 1965). Linear Fokker-Planck equations have been applied in engineering sciences, for example, to describe fluctuations in electronic circuits (Gardiner 1997). Linear Fokker-Planck equations have been frequently used in chemistry to model stochastic aspects of chemical reactions (Hänggi et al. 1990; van Kampen 1981). In finance, one of the most important applications of the Fokker-Planck theory is option pricing by means of the so-called Black-Scholes model (Paul and Baschnagel 1999). Applications of linear Fokker-Planck equations to biological systems (Goel and Dyn 1974) have been concerned, for example, with so-called Brownian motors (Hänggi et al. 2005; Reimann 2002). Population diffusion (Okubo and Levin 2001) and group behavior (Ebeling and Sokolov 2004; Mikhailov and Zanette 1999; Schweitzer 2003) in ecological systems and stochastic neuronal processing (Holden 1976) are further examples of application fields of linear Fokker-Planck equations. In psychology, linear Fokker-Planck models have been proposed for decision making (Bogacz et al. 2006; Ratcliff et al. 2004) and group behavior (Schwämmle et al. 2007b).

Many applications of nonlinear Fokker-Planck equations are related to several benchmark models: the Desai-Zwanzig model (Desai and Zwanzig 1978), the liquid crystal model (Doi and Edwards 1988; Hess 1976), the Kuramoto-Shinomoto-Sakaguchi model (Acebron et al. 2005; Kuramoto 1984; Strogatz 2000), the Vlasov model, and the nonlinear diffusion equation (Aronson 1986; Peletier 1981). Let us highlight some of these benchmark models.

Desai-Zwanzig Model

The Desai-Zwanzig model

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=-\frac{\partial }{\partial x}\left[h(x)-\kappa \left(x-\int xP\left(x,t;u\right)\;\mathrm{d}x\right)\right]P\left(x,t;u\right)+Q\frac{\partial^2}{\partial {x}^2}P\left(x,t;u\right) $$
(7)

for κ , Q > 0 has been proposed by Desai and Zwanzig (1978) and Kometani and Shimizu (1975) to study collective phenomena in self-organizing systems.

  • A Lyapunov functional approach to the Desai-Zwanzig model has been introduced by Shiino (1985, 1987) and since then has found several generalizations (Chavanis 2003, 2004; Dawson and Gärtner 1989; Frank 2001a, 2005b; Frank and Daffertshofer 2001b; Frank et al. 2001; Kaniadakis 2001a, b; Kharchenko and Kharchenko 2005; Schwämmle et al. 2007a; Shiino 2001, 2002a, b; 2003). With such a Lyapunov functional at hand, the stability of stationary probability densities, collective phenomena, and bifurcations can be studied by means of Lyapunov’s direct method.

  • The original Desai-Zwanzig model and various modifications of it have been discussed (Dawson 1983; Li et al. 1998; Lo 2005).

  • The additive noise term in Eq. (7) has been replaced by a multiplicative noise term (Horsthemke and Lefever 1984) in order to study the interplay between the nonlinearity and the multiplicative noise (Birner et al. 2002; Müller et al. 1997; Zaikin et al. 2002).

  • Fluctuation-dissipation theorems for stochastic processes described by the Desai-Zwanzig model have been derived (Drozdov and Morillo 1996; Frank 2004c; Morillo et al. 1995).

  • The Desai-Zwanzig model has frequently been used as a mean field approximation of spatially distributed systems with diffusive coupling. By means of such a mean field approximation, analytical results have been derived and compared with numerical simulations (Garcia-Ojalvo et al. 1996; Garcia-Ojalvo and Sancho 1999; van den Broeck et al. 1994a, b, 1997).
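Stochastic trajectories consistent with Eq. (7) can be generated by simulating the corresponding mean-field coupled Langevin system, in which the integral ∫xP dx is replaced by the ensemble mean over finitely many realizations. The following Euler-Maruyama sketch is an illustration only; the bistable drift h(x) = x − x³ and all parameter values are assumptions made for this example:

```python
import numpy as np

# Particle-level sketch of the Desai-Zwanzig equation (7): N coupled
# Euler-Maruyama walkers with mean-field attraction replace the
# self-consistent term  int x P(x,t;u) dx.  The bistable drift
# h(x) = x - x^3 and all parameter values are illustrative choices.
rng = np.random.default_rng(1)

n, kappa, Q = 20_000, 2.0, 0.25
dt, n_steps = 1e-3, 3_000

x = rng.normal(1.0, 0.2, n)          # ensemble prepared near one well
for _ in range(n_steps):
    mean = x.mean()                  # finite-N estimate of int x P dx
    drift = (x - x**3) - kappa * (x - mean)
    x += drift * dt + np.sqrt(2.0 * Q * dt) * rng.standard_normal(n)

print(x.mean())   # a nonzero mean signals a collective (ordered) state
```

In the parameter regime sketched here the ensemble remains trapped near one well, so the mean acts as an order parameter for the collective state.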

Liquid Crystal Model

The nonlinear Fokker-Planck equation proposed by Hess (Hess 1976) and Doi and Edwards (1988) reads

$$ \frac{\partial }{\partial t}P\left(\mathbf{x},t;u\right)={D}_r\mathbf{L}\cdot \left\{\mathbf{L}+\frac{1}{kT}\left[\mathbf{L}\;e\left(\mathbf{x},P\right)\right]\right\}P\left(\mathbf{x},t;u\right) $$
(8)

with Dr , k , T > 0 and L = x × ∂/∂x. The function e(x, P) describes the self-consistent potential of the Maier-Saupe mean field force. For processes \( \hat{X} \) that exhibit cylindrical symmetry, e(x, P) reads

$$ e\left(\theta, P\right)=-{U}_0 kT\frac{3\cos^2\theta -1}{2}\left\langle \frac{3\cos^2\theta -1}{2}\right\rangle, $$
(9)

where θ is related to the unit vector x by x = (sinθ cos φ, sinθ sin φ, cosθ). Equation (8) and generalizations of it have been extensively studied in the literature (Felderhof 2003; Fialkowiski and Hess 2000; Hütter et al. 2003; Ilg et al. 1999, 2005; Larson and Öttinger 1991) (see also Öttinger (1996) in general and Sect. 6.3.2 in Öttinger (1996) in particular). We will return to this model in section “Liquid Crystal Model.”

Winfree and Kuramoto Model

Winfree’s seminal studies on synchronization among animal populations (Winfree 1967, 2001) supported the interest in the nonlinear Fokker-Planck equation

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=\left\{-\frac{\partial }{\partial x}\left[h(x)-\kappa \cdot \int \sin \left(x-y\right)P\left(y,t;u\right)\;\mathrm{d}y\right]+Q\frac{\partial^2}{\partial {x}^2}\right\}P\left(x,t;u\right), $$
(10)

that has been proposed by Kuramoto and co-workers (Kuramoto 1984). In Eq. (10), h(x) is a 2π-periodic function and κ , Q > 0.

  • While the Kuramoto-Shinomoto-Sakaguchi model involves an interaction term ∫a(x, y)P(y, t)  dy, the model originally proposed by Winfree exhibits a coupling term of the form ∫a(y)P(y, t)  dy. Models of this latter kind have also been addressed in Ariaratnam and Strogatz (2001), Li and Hänggi (2001), and Quinn et al. (2007).

  • The Kuramoto-Shinomoto-Sakaguchi model describes an ensemble of phase oscillators. The eigenfrequencies of the phase oscillators do not occur in Eq. (10) because Eq. (10) describes an ensemble of phase oscillators exhibiting the same eigenfrequency ω. In this case, the common eigenfrequency ω can then be eliminated by means of a variable transformation into a rotating frame (Frank 2005b). However, in general, we may think of ensembles of coupled phase oscillators with different eigenfrequencies. In this context, the question arises as to what extent oscillators with different eigenfrequencies synchronize their behavior (Acebron et al. 1998; Arenas and Perez Vincente 1994; Bonilla et al. 1993; Crawford 1995; Kostur et al. 2002; Pikovsky et al. 2001; Sakaguchi 1988; Strogatz and Mirollo 1991).

  • Coupled phase oscillator models of the form (10) have been used to describe associative memories (Yamana et al. 1999; Yoshioka and Shiino 2000).

  • Just as for the Desai-Zwanzig model, the interplay between multiplicative noise and the nonlinearity of the Kuramoto-Shinomoto-Sakaguchi model has been investigated in several studies (Kim et al. 1997; Park and Kim 1996; Reimann et al. 1999a, b).

  • The sine-coupling term in Eq. (10) has been replaced by higher-order coupling functions sin(2z), sin(3z), … (Aonishi and Okada 2002; Daido 1996a, b; Hansel et al. 1993b; Kuramoto 1984). In this context, Daido proposed the so-called order function (Daido 1996a, b) that generalizes the notion of cluster phases and cluster amplitudes (Kuramoto 1984). This order function has also been related to experimental data (Zhai et al. 2005).

  • The Kuramoto-Shinomoto-Sakaguchi model has found clinical applications in the context of Parkinson's disease (Tass 2001, 2003, 2006; see also Tass 1999).
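The synchronization transition behind Eq. (10) can be reproduced with a particle-level simulation. The sketch below works in the rotating frame (h = 0, identical eigenfrequencies eliminated) and uses the fact that the sine coupling can be rewritten through the order parameter; the parameter values, chosen above the synchronization threshold κ = 2Q of the noisy identical-frequency model, are assumptions made for this illustration:

```python
import numpy as np

# Particle-level sketch of Eq. (10) in the rotating frame (h = 0).
# With the order parameter  r e^{i psi} = <e^{i theta}>, the sine
# coupling reduces to
#   d theta_i = kappa * r * sin(psi - theta_i) dt + sqrt(2 Q) dW_i.
# All parameter values are illustrative choices (kappa > 2 Q).
rng = np.random.default_rng(2)

n, kappa, Q = 20_000, 4.0, 0.5
dt, n_steps = 1e-3, 4_000

theta = 0.2 * rng.uniform(0.0, 2 * np.pi, n)   # clustered initial phases
for _ in range(n_steps):
    z = np.exp(1j * theta).mean()              # complex order parameter
    r, psi = np.abs(z), np.angle(z)
    theta += kappa * r * np.sin(psi - theta) * dt \
        + np.sqrt(2.0 * Q * dt) * rng.standard_normal(n)

r_final = np.abs(np.exp(1j * theta).mean())
print(r_final)   # r close to 1 indicates synchronization
```

Below the threshold (κ < 2Q) the same simulation relaxes to r ≈ 0, i.e., to the incoherent uniform phase distribution.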

Vlasov-Fokker-Planck Model

Vlasov-Fokker-Planck models frequently describe particle systems with electromagnetic interactions between charged particles. A typical example of a Vlasov-Fokker-Planck equation is shown here (Balescu 1975; Klimontovich 1986):

$$ \frac{\partial }{\partial t}P\left(\mathbf{v},t;u\right)=-\sum \limits_{i=1}^3\frac{\partial }{\partial {v}_i}{D}_i\left(\mathbf{v},P\right)P+\sum \limits_{i,k=1}^3\frac{\partial^2}{\partial {v}_i\partial {v}_k}{D}_{ik}\left(\mathbf{v},P\right)P. $$
(11)

Equation (11) involves the drift and diffusion coefficients

$$ {\displaystyle \begin{array}{l}{D}_i\left(\mathbf{v},P\right)=a\frac{\partial }{\partial {v}_i}{\int}_{\Omega}\frac{P\left({\mathbf{v}}^{\prime },t;u\right)}{\mid \mathbf{v}-{\mathbf{v}}^{\prime}\mid }{\mathrm{d}}^3{v}^{\prime },\\ {}{D}_{ik}\left(\mathbf{v},P\right)=b\frac{\partial^2}{\partial {v}_i\partial {v}_k}{\int}_{\Omega}\mid \mathbf{v}-{\mathbf{v}}^{\prime}\mid P\left({\mathbf{v}}^{\prime },t;u\right){\mathrm{d}}^3{v}^{\prime }.\end{array}} $$
(12)

Time-dependent solutions (Allen and Victory 1994; MacDonald et al. 1957; Nicholson 1983; Rosenbluth et al. 1957; Takai et al. 1981) of Eq. (11) and generalizations of Eq. (11) that account for additional drift forces (Bychenkov et al. 1995; Epperlein et al. 1988) have been studied. In particular, numerical methods using short-time propagators (see section “Short-Time Propagator”) have been developed for Vlasov-Fokker-Planck equations of the form (11) (Donoso and Salgado 2006; Donoso et al. 2005; Soler et al. 1992). Such nonlinear Vlasov-Fokker-Planck equations play important roles in plasma physics (Balescu 1975; Klimontovich 1986; Nicholson 1983) and astrophysics (Binney and Tremaine 1987; Lancellotti and Kiessling 2001). In general, astrophysical problems often require a stochastic description in terms of nonlinear Fokker-Planck equations (Chavanis et al. 2002; Shiino 2003; Sopik et al. 2006). Finally, note that Vlasov-Fokker-Planck models have been used in accelerator physics and accelerator engineering to examine instabilities in particle beams (Frank 2003a, 2006; Heifets 2001, 2003; Shobuda and Hirata 2001; Stupakov et al. 1997; Venturini and Warnock 2002).

Nonlinear Diffusion Equation, Nonextensive Thermostatistics, and Semiclassical Descriptions of Quantum Systems

The nonlinear diffusion equation (Aronson 1986; Peletier 1981) reads

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=-\frac{\partial }{\partial x}h(x)P\left(x,t;u\right)+\frac{\partial^2}{\partial {x}^2}D\left(P\left(x,t;u\right)\right), $$
(13)

where D(P) is a diffusion coefficient that depends on the probability density P(x,t) of X(t). In the original version of the nonlinear diffusion equation, the drift coefficient h(x) vanishes, and the diffusion coefficient is proportional to a power of P. In general, there might be a more complicated dependence of D on P (Crank 1975; Daly and Porporato 2004).

  • Since fluid flow through porous materials is an important application of the nonlinear diffusion equation, nonlinear diffusion plays a crucial role in soil sciences (Barenblatt et al. 1990). In biology, nonlinear diffusion equations of the form (13) seem to capture particular aspects of population diffusion (Gurtin and MacCamy 1977; Okubo and Levin 2001).

  • The nonlinear diffusion Eq. (13) provides a link to stochastic processes subjected to the nonextensive thermostatistics introduced by Tsallis (Abe and Okamoto 2001; Tsallis 1988, 1997, 2004). For \( D(P)\propto {P}^q \), Eq. (13) becomes

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=-\frac{\partial }{\partial x}h(x)P\left(x,t;u\right)+Q\frac{\partial^2}{\partial {x}^2}{P}^q\left(x,t;u\right). $$
(14)

Plastino and Plastino showed that stationary distributions of Eq. (14) correspond to canonical distributions that can be derived in a nonextensive framework (Plastino and Plastino 1995). Equation (14) has turned out to be a testbed for various analytical and numerical studies (Borland 1998; Chavanis 2003, 2004; Compte and Jou 1996; Drazer et al. 2000; Shiino 2003; Tsallis and Bukman 1996). Alternative nonlinear Fokker-Planck equations related to the Tsallis statistics have been derived from master equations in Curado and Nobre (2003) and Nobre et al. (2004). In addition, Eq. (14) has more recently been discussed in finance in the context of a generalized Black-Scholes model for option pricing (Borland 2002, 2008; Borland and Bouchaud 2004; Vellekoop and Nieuwenhuis 2007) and fat tail distributions (Cortines and Riera 2007; Michael and Johnson 2003).

  • The nonlinear diffusion Eq. (13) is also related to semiclassical descriptions of quantum mechanical systems. For an appropriate choice of D, nonlinear Fokker-Planck equations for Fermi-Bose and Einstein-Dirac systems have been derived from Eq. (13) (Frank 2005b; Frank and Daffertshofer 1999). Alternative forms of nonlinear Fokker-Planck equations have been derived from quantum mechanical Boltzmann equations (Kaniadakis 2001a; Kaniadakis and Quarati 1993, 1994), on the basis of energy balance equations (Tsekov 1995, 2001), and by means of other techniques (Frank 2004b, 2005a; Kadanoff 2000; Sopik et al. 2006). We will return to semiclassical quantum mechanical descriptions in section “Semiclassical Description of Quantum Systems.”
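The stationary behavior of Eq. (14) can be checked numerically. The following finite-difference sketch is an illustration of my own; the drift h(x) = −x, the exponent q = 2, and all parameter values are assumptions chosen so that the stationary solution of Plastino-Plastino type is available in closed form:

```python
import numpy as np

# Finite-difference sketch (illustrative) of Eq. (14) with the assumed
# drift h(x) = -x, exponent q = 2 and Q = 0.5.  The zero-flux condition
# -x P = Q d/dx P^2 then yields the cut-off parabola
#   P_st(x) = max(0, a - x^2/(4 Q)),  a = (3/(8 sqrt(Q)))**(2/3).
x = np.linspace(-3.0, 3.0, 121)
dx = x[1] - x[0]
dt, n_steps, Q = 5e-4, 8_000, 0.5

P = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # Gaussian initial density u
for _ in range(n_steps):
    LP = -np.gradient(-x * P, dx)                          # -d/dx [h P]
    LP[1:-1] += Q * (P[2:]**2 - 2*P[1:-1]**2 + P[:-2]**2) / dx**2
    P = np.maximum(P + dt * LP, 0.0)   # clip tiny negative front values

a = (3.0 / (8.0 * np.sqrt(Q)))**(2.0 / 3.0)
P_st = np.maximum(0.0, a - x**2 / (4.0 * Q))
print(P.sum() * dx, P[len(x)//2])   # norm stays near 1; center near a
```

The compact support of the numerical solution, in contrast to the Gaussian tails of the linear case, is the characteristic signature of the degenerate diffusion coefficient D(P) ∝ P^q with q > 1.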

In addition, nonlinear Fokker-Planck equations have turned out to be useful models to describe stochastic aspects of Josephson arrays (Hadley et al. 1988; Wiesenfeld et al. 1996), Landau damping (Strogatz et al. 1992), arrays of semiconductor lasers (Kozyreff et al. 2000), charge density waves (Bonilla 1987), and neurons coupled by Hodgkin-Huxley equations (Han et al. 1995; Hansel et al. 1993a).

Stochastic systems composed of different kinds of interacting subsystems or species have been modeled in terms of multivariate nonlinear Fokker-Planck equations (Dano et al. 2001; Gang et al. 1996; Ichiki et al. 2007). For example, the collective behavior of coupled relaxation oscillators has been studied (Yamaguchi and Shimizu 1984). Networks of neural oscillators as defined by the Wilson-Cowan model (Schuster and Wagner 1990), the two-dimensional Morris-Lecar system (Han et al. 1995), the FitzHugh-Nagumo equations (Hasegawa 2003; Kanamaru et al. 2001; Park et al. 2004), and the Hindmarsh-Rose equations (Rosenblum and Pikovsky 2004) have also been investigated.

The dynamics of mean field coupled phase oscillators under the impact of inertia effects (Acebron and Spigler 1998) and related models have attracted considerable attention. Bridge vibrations induced by pedestrian walking have been discussed in this context recently (Eckhardt et al. 2007; Strogatz et al. 2005). Models for circadian rhythms have been examined (Daido 2001).

Solutions of the Kardar-Parisi-Zhang equation have been examined by means of nonlinear Fokker-Planck equations (Giada and Marsili 2000; Marsili and Bray 1996). In doing so, the growth of surfaces and roughening phenomena have been studied.

Wetting processes (de los Santos et al. 2003), interacting Brownian motors (Becker and Engel 2007; Savel’ev et al. 2003), and spatially distributed phase oscillators (Kawamura 2007; Kawamura et al. 2007) have been analyzed by means of the nonlinear Fokker-Planck perspective.

In the mathematical literature, a seminal study on nonlinear Fokker-Planck equations of the Burgers equation type was due to McKean Jr. (1969). In particular, the convergence of stochastic processes described by multivariate linear Fokker-Planck equations to processes described by nonlinear Fokker-Planck equations (Cepa and Lepingle 1997; Dawson 1983, 1993; Ding 1994; Djehiche and Kaj 1995; Fontbona 2003; Gärtner 1988; Greven 2005; Jourdain 2000; McKean 1969; Meleard 1996; Meleard and Coppoletta 1987; Oelschläger 1989; Overbeck 1996; Pilipenko 2005; Rogers and Shi 1993) and martingales of stochastic processes defined by nonlinear Fokker-Planck equations have been addressed (Djehiche and Kaj 1995; Fontbona 2003; Gärtner 1988; Graham 1990; Greven 2005; Jourdain 2000; Meleard 1996; Meleard and Coppoletta 1987; Overbeck 1996). Moreover, the propagation of molecular chaos has been studied (Bonilla 1987; Meleard 1996; Meleard and Coppoletta 1987). The convergence of transient solutions of nonlinear Fokker-Planck equations to stationary ones has been examined by means of functionals that are similar to the Lyapunov functionals introduced by Shiino (see above) (Arnold et al. 1996, 2000, 2001; Carillo et al. 2001, 2008). In addition, from a purely mathematical perspective, nonlinear Fokker-Planck equations should be considered as nonlinear parabolic partial differential equations that have been discussed in several textbooks (Friedman 1969).

In what follows, we will show that there is a common theoretical framework that unifies most of the aforementioned studies on nonlinear Fokker-Planck equations and includes the theory of linear Fokker-Planck equations as a special case. This common theoretical framework is rooted in the notion of Markov processes and martingales.

Time-Dependent Solutions and First-Order Statistics

Linear Case

Equation (3) defines the evolution of P given an initial distribution u(x). The norm of the probability density P equals unity for all times provided that the norm of u(x) equals unity. That is, if ∫Ωu(x) dx = 1 holds, we have ∫ΩP(x, t; u)dx = 1 for t ≥ t0. We can see this by integrating Eq. (3) with respect to x. For appropriate boundary conditions, it can be shown by partial integration that the right-hand side vanishes which implies that d[∫ΩP(x, t; u) dx]/dt = 0 holds. The formal solution of Eq. (3) reads

$$ P\left(x,t;u\right)=\exp \left\{{\int}_{t_0}^t\mathrm{d}z\;{L}^0\left(x,z\right)\right\}u(x), $$
(15)

where L0 is defined in Eq. (4). Equation (15) can be used to solve Eq. (3) numerically (see Vol. 1, Sect. 6.5 in Haken 2004). Let tn denote a discrete time point tn = t0 + nΔt with n = 0, 1, 2, …, where Δt is the interval of a single time step and should be small. Let us define Pn(x; u) = P(x, tn; u). Then, we have

$$ {P}_{n+1}\left(x;u\right)=\left\{1+\Delta t\,{L}^0\left(x,{t}_n\right)\right\}{P}_n\left(x;u\right) $$
(16)

with P0 = u(x). If \( \hat{X} \) corresponds to an autonomous process, then the coefficients D1 and D2 do not depend on t. In this case, Pn can be expressed in terms of u as

$$ {P}_n\left(x;u\right)={\left[1+\Delta t\,{L}^0(x)\right]}^nu(x). $$
(17)

Numerical solutions converge to exact solutions in the limit Δt → 0.
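The iteration scheme (16) is easy to put into practice. The sketch below is an illustration only; the Ornstein-Uhlenbeck coefficients D1(x) = −γx and D2 = Q, together with all parameter values, are assumptions made for this example, chosen because the stationary density is then a Gaussian with variance Q/γ:

```python
import numpy as np

# Numerical sketch of the iteration (16) for the linear equation (3),
# with the Ornstein-Uhlenbeck choice D1(x) = -gamma*x, D2 = Q
# (an illustrative assumption, not fixed by the text).
x = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
dt, n_steps, gamma, Q = 2e-4, 30_000, 1.0, 0.5

def L0(P):
    """Finite-difference Fokker-Planck operator (4) applied to P."""
    drift_term = -np.gradient(-gamma * x * P, dx)       # -d/dx [D1 P]
    diff_term = np.zeros_like(P)                        # +d^2/dx^2 [D2 P]
    diff_term[1:-1] = Q * (P[2:] - 2*P[1:-1] + P[:-2]) / dx**2
    return drift_term + diff_term

P = np.exp(-(x - 1.5)**2 / 0.2) / np.sqrt(0.2 * np.pi)  # initial density u
for _ in range(n_steps):
    P = P + dt * L0(P)              # P_{n+1} = {1 + dt L0} P_n, Eq. (16)

# stationary density: Gaussian with variance Q / gamma
P_st = np.exp(-x**2 / (2 * Q / gamma)) / np.sqrt(2 * np.pi * Q / gamma)
print(np.max(np.abs(P - P_st)))     # transient has relaxed to P_st
```

Besides the limit Δt → 0 mentioned above, an explicit scheme of this kind also requires the usual stability bound Δt ≲ Δx²/(2D2) for the spatial discretization.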

Nonlinear Case

For appropriately chosen drift and diffusion coefficients, Eq. (5) exhibits time-dependent solutions P. By analogy with the linear case, these solutions are normalized to unity provided that appropriate boundary conditions hold and that the initial probability density is normalized to unity. Solutions of Eq. (5) are formally defined by

$$ P\left(x,t;u\right)=\exp \left\{{\int}_{t_0}^t\mathrm{d}z\;L\Big(x,z,P\left(x,z;u\right)\Big)\right\}u(x). $$
(18)

The time-dependent solutions P can be computed numerically by analogy to the linear case discussed above. That is, the probability densities Pn(x; u) = P(x, tn; u) on the discrete time grid t0 , t0 + Δt , t0 + 2Δt , … can be computed from

$$ {P}_{n+1}\left(x;u\right)=\left\{1+\Delta tL\left(x,{t}_n,{P}_n\left(x;u\right)\right)\right\}{P}_n\left(x;u\right) $$
(19)

with P0 = u(x) and n = 0 , 1 , 2 , …. If drift and diffusion coefficients do not explicitly depend on t, we find that the operator L still depends implicitly on t because it depends on the time-dependent solution P that in turn depends on t. Consequently, it is not trivial to generalize Eq. (17) to the nonlinear case. If the drift and diffusion coefficients do not depend explicitly on time t and the process converges to a stationary one, then the nonlinear Fokker-Planck operator L does not depend on time. This implies that the stationary probability density Pst satisfies

$$ {P}_{\mathrm{st}}=\left\{1+\Delta tL\left(x,{P}_{\mathrm{st}}\right)\right\}{P}_{\mathrm{st}}+O\left(\Delta {t}^2\right). $$
(20)

Finally, note that we do not necessarily need to define the formal solution with respect to the initial probability density u as in Eq. (18). We can solve the nonlinear Fokker-Planck equation on the time interval [t0, t] by splitting this interval into the two subintervals [t0, t′] and [t′, t]. Then, we obtain

$$ P\left(x,t;u\right)=\exp \left\{{\int}_{t^{\prime}}^t\mathrm{d}z\;L\Big(x,z,P\left(x,z;u\right)\Big)\right\}P\left(x,{t}^{\prime };u\right). $$
(21)

Equation (21) can be solved iteratively by means of Eq. (19) yielding a mapping \( {T}_{\Delta t}:P\left(x,t;u\right)={T}_{t-{t}^{\prime }}\left[P\left(x,{t}^{\prime };u\right)\right] \) with Δt = t − t′.

Markov Property: Second-Order and Higher-Order Statistics

Conditional Probability Densities

Let p(x) define the probability density of the time-dependent random variable X at time t given that X assumed at earlier times t′, t″, t‴, … with t ≥ t′ > t″ > t‴ > … the particular values x′, x″, x‴, …. Then p is defined by

$$ p={\left\langle \delta \left(x-X(t)\right)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },X\left({t}^{{\prime\prime}}\right)={x}^{{\prime\prime} },X\left({t}^{{\prime\prime\prime}}\right)={x}^{{\prime\prime\prime} },\dots }. $$
(22)

In order to point out the information that we need to compute p, we write

$$ p=p\left(x,t|{x}^{\prime },{t}^{\prime };{x}^{{\prime\prime} },{t}^{{\prime\prime} };{x}^{{\prime\prime\prime} },{t}^{{\prime\prime\prime} },\dots \right). $$
(23)

A conditional probability density is a relation that gives us estimates about future events and tells us what we need to know in order to be able to calculate these estimates. In our example given by Eq. (23), we see that we need the information about the random variable X at the times t′ > t″ > t‴ > … in order to make a prediction about the statistics or probability density of X at time t. Alternatively, we may say that the conditional probability density depends on a list of variables. In the context of Markov processes, this alternative viewpoint however gives rise to a problem that will be discussed below.

If \( \hat{X} \) is a Markov process, then the information about the stochastic process available at one particular time t′ is sufficient to make predictions about the future t ≥ t′. Adding more information about how the process evolved in the past before t′ does not improve these predictions. That is, the information about the events at time t′ is sufficient to make statistical estimates about events at time t ≥ t′. An alternative definition of a Markov process is that a Markov process exhibits a conditional probability density p (x, t |·) that depends only on one time point prior to t. That is, according to the first definition, we look from time t′ into the future, whereas according to the second definition, we look in the opposite direction: we look from time t into the past.

For example, in order to describe the probability density p(T) of the temperature T in Boston (USA) on December 1st, 2007, given that on November 1st, 2007, the temperature was 2 °C and on October 1st, 2007, the temperature was 3 °C, we would define the conditional probability density p(T, Dec 1st 2007 | T = 2, Nov 1st 2007; T = 3, Oct 1st 2007). If the temperature T as a function of time t is a Markov process, it is sufficient to know the temperature on November 1st in order to compute the probability density p(T) on December 1st. For example, we would obtain the same function p(T) for the conditions (i) and (ii) with (i) T = 2 °C on Nov 1st and T = 3 °C on Oct 1st and (ii) T = 2 °C on Nov 1st and T = 5 °C on Oct 1st. That is, we would have

$$ {\displaystyle \begin{array}{l}\begin{array}{l}p\Big(T,\mathrm{Dec}\;1\mathrm{st}\;2007\mid T=2,\mathrm{Nov}\;1\mathrm{st}\;2007;\\ {}T=3,\mathrm{Oct}\;1\mathrm{st}\;2007\Big)\end{array}\\ {}\begin{array}{l}=p\Big(T,\mathrm{Dec}\;1\mathrm{st}\;2007\mid T=2,\mathrm{Nov}\;1\mathrm{st}\;2007;\\ {}T=5,\mathrm{Oct}\;1\mathrm{st}\;2007\Big)\end{array}\\ {}=p\left(T,\mathrm{Dec}\;1\mathrm{st}\;2007\mid T=2,\mathrm{Nov}\;1\mathrm{st}\;2007\right).\end{array}} $$
(24)

The information about the October temperature is irrelevant. In this sense, the conditional probability density would depend on the November temperature but would not depend on the October temperature.

A problem that arises in the context of the definition of Markov process is as follows. Suppose that there is a purely deterministic dynamical aspect involved in a stochastic process. In our example about Boston temperatures, we may think of the annual periodic changes of the temperature that are related to the annual changes in distance and declination angle between the earth and sun. Let us assume that distance and declination angle change periodically in a purely deterministic fashion such that the distance and declination angle at November 1st can be computed from the distance and declination angle at January 1st by a simple one-to-one mapping. Then the question arises: does the temperature in Boston on December 1st depend on the distance and declination angle of November 1st, as suggested by p(T, Dec 1st 2007 ∣ T = 25, Nov 1st 2007), or does it depend on the distance and declination angle of January 1st? In the former case, we have a Markov conditional probability density. In the latter case, we would need to write p like

$$ {\displaystyle \begin{array}{c}p\Big(T,\mathrm{Dec}\;1\mathrm{st}\;2007\mid T=25,\mathrm{Nov}\;1\mathrm{st}\;2007;\\ {}\mathrm{distance}\kern0.17em \mathrm{and}\kern0.17em \mathrm{angle},\mathrm{Jan}\;1\mathrm{st}\;2007\Big)\end{array}} $$

indicating that we are dealing with a non-Markovian process. The situation becomes even worse if we take into consideration that the earth-sun distance and the declination angle at January 1, 2008, can be computed from the information known at November 1 using our simple one-to-one mapping. Therefore, we may say that the temperature estimate for December 1st, 2007, depends on a future event, namely, the earth-sun distance and declination angle given at January 1st, 2008. The conditional probability density would assume the form

$$ {\displaystyle \begin{array}{c}p\Big(T,\mathrm{Dec}\;1\mathrm{st}\;2007\mid T=25,\mathrm{Nov}\;1\mathrm{st}\;2007;\\ {}\mathrm{distance}\kern0.17em \mathrm{and}\kern0.17em \mathrm{angle},\mathrm{Jan}\;1\mathrm{st}\;2008\Big)\end{array}} $$

which would suggest again that we are dealing with a non-Markovian and – to a certain extent – noncausal process. We can solve this problem by realizing that purely deterministic relationships in time that represent external driving forces are irrelevant for the distinction between Markov and non-Markov processes. We can completely determine such an external driving force by a parameter set {t0, A1, A2, …} that describes the initial state of the driving forces. Although this initial state is related to the initial time t0 of the Markov process, the conditional probability density does not actually depend (i.e., it does not explicitly depend) on t0. Likewise the conditional probability density does not actually depend on the parameters {A1, A2, …}. The information that we have at time t′ includes the information about the driving force at time t′ and therefore the information about the driving force at all times t ∈ [t0 ,  ∞ ). Consequently, the information at time t′ is sufficient to predict how the driving force will evolve in the future at times t ≥ t′. There is no need to assess information about events prior to t′ or information about events that will happen in the future at times larger than t′ in order to determine the evolution of the deterministic driving force.

Let us summarize. A stochastic process \( \hat{X} \) is called a Markov process if information about the process at time t′ is sufficient to make predictions about future events. This implies that the conditional probability density p defined in Eq. (23) can be simplified like

$$ p\left(x,t|{x}^{\prime },{t}^{\prime };{x}^{{\prime\prime} },{t}^{{\prime\prime} };{x}^{{\prime\prime\prime} },{t}^{{\prime\prime\prime} };\dots \right)=p\left(x,t|{x}^{\prime },{t}^{\prime}\right). $$
(25)

Note that we may say that p depends only on the state x′ related to the time t′ in the sense that the information at time t′ is sufficient to predict how X will be distributed at time t ≥ t′. We may say it is sufficient to best predict future events, where best refers to the fact that adding additional information about the past does not improve our predictions. Let us illustrate this issue by another example. Let p(x, t ∣ X = θ) denote the probability density of X at time t given that X equals the function θ in the interval [t0, t′] with t′ ≤ t. If X describes a Markov process, we have

$$ {\displaystyle \begin{array}{ll}p\left(x,t|X=\theta \right)& =p\left(x,t|\theta \left({t}^{\prime}\right),{t}^{\prime}\right)\\ {}& =p\left(x,t|{x}^{\prime },{t}^{\prime}\right)\end{array}} $$
(26)

with x′ = θ(t′).

Linear Fokker-Planck Equations

As mentioned in section “Definition of the Subject,” linear Fokker-Planck equations describe Markov processes (Gardiner 1997; Risken 1989). Markov processes related to linear Fokker-Planck equations of the form (4) have conditional probability densities defined by

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime}\right)={L}^0\left(x,t\right)p\left(x,t|{x}^{\prime },{t}^{\prime}\right) $$
(27)

with \( {\lim}_{t\to {t}^{\prime }}p\left(x,t|{x}^{\prime },{t}^{\prime}\right)=\delta \left(x-{x}^{\prime}\right) \). The conditional probability density p is also called the fundamental solution or Green’s function of the Fokker-Planck equation (3). In general, a stochastic process \( \hat{X} \) is completely defined in terms of the joint probability density

$$ P\left({x}_n,{t}_n;{x}_{n-1},{t}_{n-1};\dots; {x}_0,{t}_0\right)=\left\langle \delta \left({x}_n-X\left({t}_n\right)\right)\cdot \delta \left({x}_{n-1}-X\left({t}_{n-1}\right)\right)\cdots \delta \left({x}_0-X\left({t}_0\right)\right)\right\rangle, $$
(28)

where n can assume arbitrarily large integer numbers. In particular, if \( \hat{X} \) is a Markov process, then this joint probability density can be computed from p and u like

$$ P\left(\cdot \right)=p\left({x}_n,{t}_n|{x}_{n-1},{t}_{n-1}\right)\cdot p\left({x}_{n-1},{t}_{n-1}|{x}_{n-2},{t}_{n-2}\right)\cdots p\left({x}_1,{t}_1|{x}_0,{t}_0\right)u\left({x}_0\right). $$
(29)

Consequently, the linear Fokker-Planck equation (3) defines completely a Markov process via the associated evolution equation (27) and the initial distribution u. In particular, the time-dependent probability densities P(x, t; u) and P(x, t′; u) with t ≥ t′ are related to each other by means of a linear functional

$$ P\left(x,t;u\right)={\int}_{\Omega}p\left(x,t|{x}^{\prime },{t}^{\prime}\right)P\left({x}^{\prime },{t}^{\prime };u\right)\;\mathrm{d}{x}^{\prime }. $$
(30)

That is, the Green’s function p induces a functional that is linear with respect to P(x′, t′; u).
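The linear functional (30) can be checked numerically for a simple special case. The sketch below assumes an Ornstein-Uhlenbeck process with drift D1 = −γx and diffusion D2 = Q (a choice made purely for illustration; its Gaussian Green's function is standard). It propagates a Gaussian initial density through Eq. (30) and compares the resulting moments with the exact values.

```python
import numpy as np

# Numerical check of the linear functional (30). As an illustrative
# assumption we use an Ornstein-Uhlenbeck process (D1 = -gamma*x, D2 = Q),
# whose Green's function p(x,t|x',t') is the Gaussian coded below.
gamma, Q = 1.0, 0.5

def green(x, xp, tau):
    """Green's function p(x, t | x', t') of the OU process, tau = t - t'."""
    m = xp * np.exp(-gamma * tau)
    v = (Q / gamma) * (1.0 - np.exp(-2.0 * gamma * tau))
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

x = np.linspace(-8.0, 8.0, 1601)
dx = x[1] - x[0]

# Gaussian initial density u with mean m0 and variance v0
m0, v0 = 1.0, 0.25
P0 = np.exp(-(x - m0) ** 2 / (2.0 * v0)) / np.sqrt(2.0 * np.pi * v0)

# Eq. (30): P(x,t;u) = int p(x,t|x',t') P(x',t';u) dx'
tau = 0.7
P1 = green(x[:, None], x[None, :], tau) @ P0 * dx

mean_num = (x * P1).sum() * dx
var_num = ((x - mean_num) ** 2 * P1).sum() * dx
mean_exact = m0 * np.exp(-gamma * tau)
var_exact = v0 * np.exp(-2 * gamma * tau) + (Q / gamma) * (1 - np.exp(-2 * gamma * tau))
```

The propagated moments agree with the exact values, illustrating that the Green's function acts as a linear map on P(x′, t′; u).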

Langevin Equations of Linear Fokker-Planck Equations

The stochastic trajectories X(t) of the Markov process \( \hat{X} \) defined by Eq. (3) can be computed from the Ito-Langevin equation (Coffey et al. 2004; Gardiner 1997; Risken 1989)

$$ \frac{\mathrm{d}}{\mathrm{d}t}X(t)={D}_1\left(X(t),t\right)+\sqrt{D_2\left(X(t),t\right)}\Gamma (t), $$
(31)

where Γ(t) denotes a Langevin force normalized to the delta function like 〈Γ(t)Γ(t′)〉 = 2δ(t − t′). From the Langevin equation (31), it follows again that we are dealing with a Markov process. Information about one reference time t′ is sufficient to compute the future behavior of the trajectory X(t) with t ≥ t′. On a discrete time grid the stochastic trajectories or realizations of \( \hat{X} \) can be computed iteratively like (Kloeden and Platen 1992; Risken 1989)

$$ {X}_{n+1}={X}_n+\Delta t\;{D}_1\left({X}_n,{t}_n\right)+\sqrt{\Delta t\;{D}_2\left({X}_n,{t}_n\right)}{\upepsilon}_n $$
(32)

with X(tn) = Xn , tn = t0 + nΔt and n = 0, 1, 2, … Here, ϵn are independent Gaussian distributed random numbers with vanishing mean and variance 2. That is, we have 〈ϵn〉 = 0 and 〈ϵnϵm〉 = 2δnm, where δnm is the Kronecker symbol. Finally, the probability density W of ϵn at every step n is given by

$$ W\left({\upepsilon}_n\right)=\frac{1}{\sqrt{4\pi }}\exp \left\{-\frac{\upepsilon_n^2}{4}\right\}. $$
(33)

Note that we do not necessarily need to start the iterative map at t0. The scheme (32) can be started at any time step n. Moreover, in order to compute the subsequent time steps, it is sufficient to have information about the random variable X at time tn. Consequently, the sequence Xn, Xn+1, Xn+2, … computed from Eq. (32) describes a trajectory of a Markov process.
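The iteration scheme (32) is straightforward to implement. The following sketch assumes the linear coefficients D1(x, t) = −γx and D2(x, t) = Q (an Ornstein-Uhlenbeck process; the coefficients, parameter values, and variable names are illustrative assumptions). Note that the random numbers ϵn carry variance 2, in line with Eq. (33).

```python
import numpy as np

# Sketch of the iteration scheme (32). The coefficients D1(x,t) = -gamma*x
# and D2(x,t) = Q (an Ornstein-Uhlenbeck process) and all parameter values
# are illustrative assumptions, not part of the general scheme.
rng = np.random.default_rng(0)
gamma, Q = 1.0, 0.5
dt, n_steps, n_real = 0.01, 1500, 20000

def D1(x, t):
    return -gamma * x

def D2(x, t):
    return Q * np.ones_like(x)

X = np.zeros(n_real)   # ensemble of realizations X_0^(i), all started at x = 0
t = 0.0
for n in range(n_steps):
    # eps_n: independent Gaussian numbers with zero mean and variance 2,
    # in accordance with Eq. (33)
    eps = np.sqrt(2.0) * rng.standard_normal(n_real)
    X = X + dt * D1(X, t) + np.sqrt(dt * D2(X, t)) * eps
    t += dt

# After a long time the ensemble variance approaches the stationary
# value Q/gamma of the Ornstein-Uhlenbeck process.
ensemble_variance = X.var()
```

Running the ensemble long enough, the sample variance settles near Q/γ, which provides a simple consistency check of the scheme.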

Strongly Nonlinear Fokker-Planck Equations

In section “Definition of the Subject,” we pointed out that there is some kind of inconsistency in the definition of linear and nonlinear Fokker-Planck equations. While a linear Fokker-Planck equation defines a stochastic process, a nonlinear Fokker-Planck equation defines at best the evolution of a probability density P(x,t). That is, if solutions of Eq. (5) exist for u ∈ U, then Eq. (5) defines the evolution of first-order statistical properties of a stochastic process \( \hat{X} \) such as the time-dependent probability density, the mean, and the variance of the process \( \hat{X} \). In any case, Eq. (5) does not define second- and higher-order statistical quantities such as correlation functions and conditional probability densities (Frank 2004d). In particular, the time-dependent solutions P of Eq. (5) in general cannot be used to construct Green’s functions of Markov processes because they do not necessarily correspond to Green’s functions of Markov processes (Frank 2003b). Note that this is not a peculiarity of stochastic processes defined by nonlinear Fokker-Planck equations. In fact, time-dependent solutions P of linear Fokker-Planck equations Eq. (3) involving explicitly time-dependent coefficients D1 and D2 do not necessarily correspond to Green’s functions. Mathematically speaking, let P(x, t; u = δ(x – x0)) denote the probability density of a process \( \hat{X} \) defined by a nonautonomous linear Fokker-Planck equation or by a nonlinear Fokker-Planck equation and let p(x, t| x′, t′) denote the conditional probability density of that process \( \hat{X} \), then we have (Frank 2003b)

$$ {\displaystyle \begin{array}{c}P{\left(x,t;u=\delta \left(x-{x}^{\prime}\right)\right)}_{t_0={t}^{\prime }}\\ {}\mathrm{is}\kern0.17em \mathrm{not}\kern0.17em \mathrm{necessarily}\kern0.17em \mathrm{equivalent}\kern0.17em \mathrm{to}\;p\left(x,t|{x}^{\prime },{t}^{\prime}\right),\end{array}} $$
(34)

where \( P{\left(x,t;u=\delta \left(x-x\prime \right)\right)}_{t_0={t}^{\prime }} \) means that we take the time-dependent solution P(x, t; u = δ(x – x0)) and replace in this expression x0 by x′ and t0 by t′.

Let us return to the issue how to define a stochastic process \( \hat{X} \) on the basis of a nonlinear Fokker-Planck equation (5). In order to do so, we need to define appropriate constraints such that out of all possible stochastic processes that exhibit a time-dependent probability density P defined by Eq. (5), one particular process is selected. In what follows, we will discuss one particular set of constraints (Frank 2004d). As we will see, the stochastic processes thus defined exhibit the Markov property.

Let U denote a set of initial probability densities u. That is, U is a set of functions or a space of functions. Let P(x, t; u) denote the solution of the nonlinear Fokker-Planck equation

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=L\left(x,t,P\left(x,t;u\right)\right)P\left(x,t;u\right) $$
(35)

with

$$ L\left(x,t,P\left(x,t;u\right)\right)=-\frac{\partial }{\partial x}{D}_1\left(x,t,P\left(x,t;u\right)\right)+\frac{\partial^2}{\partial {x}^2}{D}_2\left(x,t,P\left(x,t;u\right)\right) $$
(36)

for an initial distribution u ∈ U. Let us introduce the associated drift and diffusion coefficients \( {\tilde{D}}_1 \) and \( {\tilde{D}}_2 \) by

$$ {\tilde{D}}_1\left(x,t;u\right)={D}_1\left(x,t;P\left(x,t;u\right)\right),{\tilde{D}}_2\left(x,t;u\right)={D}_2\left(x,t;P\left(x,t;u\right)\right). $$
(37)

That is, for any u ∊ U, Eq. (35) is solved analytically or by the numerical iteration (19). The solution is substituted into the drift and diffusion coefficients D1 and D2. The coefficients thus obtained are the functions \( {\tilde{D}}_1 \) and \( {\tilde{D}}_2 \) associated with D1 and D2. Let us assume that for all u ∊ U, the evolution equation

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime}\right)=\left[-\frac{\partial }{\partial x}{\tilde{D}}_1\left(x,t;u\right)+\frac{\partial^2}{\partial {x}^2}{\tilde{D}}_2\left(x,t;u\right)\right]\cdot p\left(x,t|{x}^{\prime },{t}^{\prime}\right) $$
(38)

has a fundamental solution or Green’s function. Then, this solution p and its corresponding initial distribution u define a Markov process. In (Frank 2004d) nonlinear Fokker-Planck equations that induce evolution equations (38) with fundamental solutions were called strongly nonlinear Fokker-Planck equations. Note that nonlinear Fokker-Planck equations (5) do not necessarily exhibit the property of being strongly nonlinear. Note also that in some applications, it might be worthwhile to define carefully the set U of initial probability densities u such that a nonlinear Fokker-Planck equation under consideration becomes strongly nonlinear.

As indicated above, the time-dependent probability density P of a nonlinear Fokker-Planck equation depends on the initial distribution u. Likewise, the associated coefficients \( {\tilde{D}}_1 \) and \( {\tilde{D}}_2 \) depend on u. As a result, the conditional probability density p(x, t| x′,t′ ) depends on u as well. For this reason, the notation p(x, t| x′,t′; u) has been suggested. Unfortunately, this notation is likely to cause confusion because one might think that p depends not only on the time t′ but also on the initial time t0 which seems to be incompatible with the notion of a Markov conditional probability density (Frank 2007; McCauley et al. 2006). In fact, this confusion results from the second alternative way to define Markov processes that has been discussed above. The evolution of the function P(x, t; u) is a purely deterministic one. That is, P(x, t; u) represents a deterministic driving force for the purpose of computing the conditional probability density. The distribution u is just a parameter which determines the initial value of this driving force. In this context, note again that the conditional probability density of a Markov process in general depends on parameters and in particular can depend on the initial time t0 and other parameters {A1, A2, …} that define the initial state of a driving force. Consequently, the notation p(x, t| x′,t′; u) does not imply a contradiction with the notion of a Markov process. For example, the nonautonomous Langevin equation

$$ \frac{\mathrm{d}}{\mathrm{d}t}X(t)=-\upgamma X(t)-A\cos \left(\omega \left(t-{t}_0\right)\right)+\sqrt{Q}\Gamma (t) $$
(39)

with γ, A, ω, Q > 0 defines a Markov process that is driven by a harmonic force −A cos(ω(t – t0)). That is, the harmonic force has amplitude A at the beginning of the process. The conditional probability density of that process depends on the parameters γ, ω, Q but also on the parameters t0 and A which correspond to the initial amplitude and time. We have (see Sect. 3.7.3 in Frank 2005b)

$$ p=p\left(x,t|{x}^{\prime },{t}^{\prime },\gamma, \omega, Q,A,{t}_0\right). $$
(40)

Nevertheless, in what follows, we will develop a slightly different notation for conditional probability densities p of strongly nonlinear Fokker-Planck equations that is in line with the first definition of Markov processes discussed above and will be helpful to elucidate that the functions p reflect indeed Markov processes.

Let us exploit first the fact that if Eq. (35) is a strongly nonlinear Fokker-Planck equation, then time-dependent solutions P(x, t; u) of Eq. (35) exist for u ∈ U and are related to their initial probability densities u by a one-to-one mapping Tt. That is, for every t we have P(x, t; u) = Tt [u(x)]. For an explicit construction of the map Tt, see, for example, Eq. (18). Likewise, we have \( P\left(x,{t}^{\prime };u\right)={T}_{t^{\prime }}\left[u(x)\right] \). Using the inverse of \( {T}_{t^{\prime }} \), we can map P back to u like \( u(x)={T}_{t^{\prime}}^{-1}\left[P\left(x,{t}^{\prime };u\right)\right] \). Substituting this expression into p(x, t| x′,t′; u), we obtain \( p\left(x,t\mid {x}^{\prime },{t}^{\prime };{T}_{t^{\prime}}^{-1}\left[P\left(x,{t}^{\prime };u\right)\right]\right) \). This result demonstrates that the information about the stochastic process \( \hat{X} \) at time t′ is sufficient to predict the future at t > t′. We can regard the conditional probability density p as a function that does not depend explicitly on u but depends explicitly on the state of the driving force P at time t′. In line with this remark, we introduce conditional probability densities of the form p (x, t | x′, t′, P(x′, t′;u)).

Let us dwell on the interpretation of a conditional probability density p (x, t | x′, t′, P(x′, t′; u)). To this end, we need to discuss briefly the notion of a particular conditional averaging that is important in this context and will become important later on as well. Let us assume that we make observations of realizations of a stochastic process \( \hat{X} \) for which the following two conditions hold: (i) X (t′) = x′ and (ii) the ensemble of all realizations is distributed like P at time t′. Next, we average across all observations that we make under these conditions. In doing so, we average under the constraints

$$ X\left({t}^{\prime}\right)={x}^{\prime}\;\mathrm{and}\;\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle =P\left({x}^{\prime },{t}^{\prime };u\right). $$
(41)

In order to indicate that such a structured constraint should hold, we will use the notation

$$ {\left\langle \cdot \right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle =P\left({x}^{\prime },{t}^{\prime };u\right)}. $$
(42)

In words, Eq. (42) is the instruction to take out of an ensemble with probability density P at time t′ only those realizations that assume the value x′ at time t′. On the one hand, this constraint induces a trivial situation. We know for sure that X(t′) = x′ and consequently can replace the random variable X(t′) by x′. On the other hand, the averaging procedure may involve the random variable X(t) at a time point t different from t′. Although X(t′) is fixed at x′, the random variable X(t) can assume different values at t for different realizations of the process \( \hat{X} \). The conditional probability density p (x, t | x′, t′, P(x′, t′; u)) is a special case in which the delta function is averaged under the constraint (41). We have

$$ p\left(x,t|x^{\prime },t^{\prime },P\left(x^{\prime },t^{\prime };u\right)\right)={\left\langle \delta \left(x-X(t)\right)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle =P\left({x}^{\prime },{t}^{\prime };u\right)}. $$
(43)

Summarizing the results we have derived so far, we see that strongly nonlinear Fokker-Planck equations define Markov processes whose

  • Time-dependent probability densities P(x, t; u) are defined by Eq. (35)

  • Conditional probability densities p (x, t|x′, t′, P(x′, t′; u)) are defined by

$$ \frac{\partial }{\partial t}p\left(x,t|x^{\prime },t^{\prime },{P}^{\prime}\right)=L\left(x,t,P\right)p\left(x,t|x^{\prime },t^{\prime },{P}^{\prime}\right) $$
(44)

with L given by Eq. (36), P = P(x, t; u), and P′ = P(x′, t′; u). Note that by multiplying Eq. (44) by P(x′, t′; u) and integrating with respect to x′, we get Eq. (35), which in turn defines the evolution of P(x, t; u). Consequently, Eq. (44) defines both the evolution of P(x, t; u) and p(x, t| x′, t′, P′). Note also that the solution of Eq. (44) formally reads

$$ p\left(x,t|x^{\prime },t^{\prime },P\left(x^{\prime },t^{\prime };u\right)\right)=\exp \left\{{\int}_{t^{\prime}}^t\mathrm{d}z\;L\Big(x,z,P\left(x,z;u\right)\Big)\right\}\delta \left(x-{x}^{\prime}\right) $$
(45)

and depends on the evolution of P(x, z; u) for z ∈ [t′, t]. In fact, as indicated above, p depends only on P(x′, t′; u). To see this, recall that the formal solution (21) can be obtained by means of the iterative method (19) such that we can write \( P\left(x,z;u\right)={T}_{z-{t}^{\prime }}\left[P\left(x,{t}^{\prime };u\right)\right] \). Substituting this solution into Eq. (45), we get

$$ p\left(x,t|x^{\prime },t^{\prime },P\left(x^{\prime },t^{\prime };u\right)\right)=\exp \left\{{\int}_{t^{\prime}}^t\mathrm{d}z\;L\left(x,z,{T}_{z-{t}^{\prime }}\left[P\left(x,{t}^{\prime };u\right)\right]\right)\right\}\delta \left(x-{x}^{\prime}\right). $$
(46)

In addition, we find that the solution (45) does not explicitly depend on u.

We arrive at the following conclusion: conditional probability densities p(x, t|⋅) of Markov processes described by strongly nonlinear Fokker-Planck equations depend only on the value of individual realizations at one prior time point t′ ≤ t and on the probability density P defined by all realizations at the very same prior time point t′.

Equation (45) can be simplified for stationary Markov processes with operators L that do not depend explicitly on time t. Then the conditional probability density in the stationary case can be computed from

$$ p\left(x,t|{x}^{\prime },{t}^{\prime },{P}_{\mathrm{st}}\left({x}^{\prime}\right)\right)=\exp \left\{\left(t-{t}^{\prime}\right)L\left(x,{P}_{\mathrm{st}}(x)\right)\right\}\delta \left(x-{x}^{\prime}\right). $$
(47)

where Pst(x) denotes a stationary probability density out of a set of stationary probability densities defined by LPst = 0. Note that in this context, Pst plays the role of an initial distribution u.
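Equation (47) expresses p as an operator exponential acting on a delta function. The following sketch evaluates this expression by discretizing the operator L with central finite differences and computing a matrix exponential. The time-independent linear coefficients D1 = −γx and D2 = Q are an illustrative assumption; for them L does not depend on Pst at all, so the construction reduces to the linear case with a known Gaussian Green's function.

```python
import numpy as np
from scipy.linalg import expm

# Sketch of Eq. (47): p(x,t|x',t',Pst) = exp{(t-t')L} delta(x-x'), evaluated
# by discretizing L = -d/dx D1 + d^2/dx^2 D2 with central finite differences.
# The coefficients D1 = -gamma*x, D2 = Q are an illustrative assumption.
gamma, Q = 1.0, 0.5
x = np.linspace(-6.0, 6.0, 121)
dx = x[1] - x[0]
M = len(x)

# First- and second-derivative matrices (central differences)
Dx = (np.diag(np.ones(M - 1), 1) - np.diag(np.ones(M - 1), -1)) / (2.0 * dx)
Dxx = (np.diag(np.ones(M - 1), 1) - 2.0 * np.eye(M)
       + np.diag(np.ones(M - 1), -1)) / dx ** 2

# (L P)(x) = -d/dx [D1(x) P(x)] + d^2/dx^2 [Q P(x)]
L = -Dx @ np.diag(-gamma * x) + Q * Dxx

j = np.argmin(np.abs(x - 1.0))     # initial condition delta(x - x') at x' = 1
delta = np.zeros(M)
delta[j] = 1.0 / dx

tau = 0.5                          # elapsed time t - t'
p = expm(tau * L) @ delta          # Green's function p(x, t | x'=1, t', Pst)

mean_p = (x * p).sum() * dx        # exact OU value: x' * exp(-gamma*tau)
```

For the assumed linear coefficients, the first moment of the computed Green's function can be compared with the exact Ornstein-Uhlenbeck value x′e^{−γτ}, which serves as a check of the discretization.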

Just as in the linear case, the conditional probability density p in combination with the initial distribution u completely defines the stochastic process \( \hat{X} \). In particular, the joint probability density P(xn, tn; xn−1, tn−1; … x0, t0 ) can be computed from p and u like

$$ {\displaystyle \begin{array}{c}P\left(\cdot \right)=p\left({x}_n,{t}_n|{x}_{n-1},{t}_{n-1},{P}_{n-1}\right)\\ {}\cdot p\left({x}_{n-1},{t}_{n-1}|{x}_{n-2},{t}_{n-2},{P}_{n-2}\right)\cdots \\ {}\cdots p\left({x}_1,{t}_1|{x}_0,{t}_0,u\right)u\left({x}_0\right),\end{array}} $$
(48)

with Pn−1 = P(xn−1, tn−1; u), Pn−2 = P(xn−2, tn−2; u) and so on.

In particular, the time-dependent probability densities P(x, t; u) and P(x, t′; u) with t ≥ t′ are related to each other by means of a nonlinear functional

$$ P\left(x,t;u\right)={\int}_{\Omega}p\left(x,t|{x}^{\prime },{t}^{\prime },P\left({x}^{\prime },{t}^{\prime };u\right)\right)P\left({x}^{\prime },{t}^{\prime };u\right)\;\mathrm{d}{x}^{\prime }, $$
(49)

where p is defined by Eq. (46). That is, the Green’s function p induces a functional that is nonlinear with respect to P(x′, t′; u).

Langevin Equations of Strongly Nonlinear Fokker-Planck Equations

The stochastic trajectories X(t) of the Markov process \( \hat{X} \) defined by Eq. (44) can be computed from two-layered Langevin equations (see Sect. 3.4 in Frank 2005b) or alternatively from the self-consistent Ito-Langevin equation

$$ \frac{\mathrm{d}}{\mathrm{d}t}X(t)={D}_1\left(X(t),t,P\left(X(t),t;u\right)\right)+\sqrt{D_2\left(X(t),t,P\left(X(t),t;u\right)\right)}\Gamma (t), $$
(50)

where Γ(t) denotes the Langevin force introduced earlier. Note that the expression P(X(t), t; u) means that the function P(x, t; u) is evaluated at the state x that is given by the random variable X at time t. That is, we may write P(X(t), t; u) = P(x, t; u)|x=X(t). From the Langevin equation (50), we can read off that we are dealing with a Markov process. Information about one reference time t′ in terms of the state X(t′) of a realization and the distribution of the ensemble as given by the probability density P(x, t′; u) is sufficient to compute the future behavior of the trajectory X(t) with t ≥ t′.

The Langevin equation (50) may be implemented on a computer using the iterative map

$$ {X}_{n+1}={X}_n+\Delta t\;{D}_1\left({X}_n,{t}_n,P\left({X}_n,{t}_n;u\right)\right)+\sqrt{\Delta t\;{D}_2\left({X}_n,{t}_n,P\left({X}_n,{t}_n;u\right)\right)}{\upepsilon}_n $$
(51)

with X(tn) = Xn, tn = t0 + nΔt, n = 0, 1, 2, … and ϵn given as statistically independent Gaussian distributed random numbers with vanishing mean and variance 2 (see above). The expression P(Xn, tn; u) can be computed from the realizations generated by the iterative map (51). Let \( {X}_n^{(i)} \) denote the ith realization at time step n. Then, the stochastic trajectories X(t) can numerically be computed by simulating an ensemble of realizations i = 1, …, N like

$$ {X}_{n+1}^{(i)}={X}_n^{(i)}+\Delta t\;{D}_1\left({X}_n^{(i)},{t}_n,{P}_n\left({X}_n^{(i)}\right)\right)+\sqrt{\Delta t\;{D}_2\left({X}_n^{(i)},{t}_n,{P}_n\left({X}_n^{(i)}\right)\right)}{\upepsilon}_n^{(i)}, $$
(52)

where \( {\upepsilon}_n^{(i)} \) are statistically independent Gaussian random numbers with respect to both indices n and i and Pn is computed from the set \( \left\{{X}_n^{(1)},\dots, {X}_n^{(N)}\right\} \) of realizations using standard kernel estimators. For example, we may use

$$ {P}_n(x)=\frac{1}{Ns\sqrt{2\pi }}\sum \limits_{i=1}^N\exp \left\{-\frac{{\left(x-{X}_n^{(i)}\right)}^2}{2{s}^2}\right\}, $$
(53)

with s = N−1/5σe(tn) where σe(tn) is the standard deviation of the empirical ensemble \( \left\{{X}_n^{(1)},\dots, {X}_n^{(N)}\right\} \) (Frank 2005b, 2008; Silverman 1986). Just as in the case of Langevin equations of linear Fokker-Planck equations, the map (52) can be started at any time step n provided that we have information about Pn and Xn. In particular, if we start at a step n > 0, we see that the information about the initial distribution is irrelevant. Consequently, the sequence Xn, Xn+1, Xn+2, … computed from the time-discrete Langevin equation (52) related to the nonlinear Fokker-Planck equation (44) describes a trajectory of a Markov process.
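The ensemble scheme (52) together with the kernel estimator (53) can be sketched as follows. The coefficients D1(x, t, P) = −γx and D2(x, t, P) = QP are an illustrative assumption that makes the underlying Fokker-Planck equation nonlinear (a porous-medium-type diffusion coefficient); all parameter values are arbitrary.

```python
import numpy as np

# Sketch of the ensemble scheme (52) combined with the kernel estimator (53)
# and the bandwidth s = N^(-1/5)*sigma_e. The coefficients D1(x,t,P) = -gamma*x
# and D2(x,t,P) = Q*P are an illustrative assumption; they turn the underlying
# Fokker-Planck equation into a porous-medium-type nonlinear equation.
rng = np.random.default_rng(1)
gamma, Q = 1.0, 1.0
dt, n_steps, N = 0.01, 200, 500

def kernel_density(points, X, s):
    """Gaussian kernel estimator (53) of P_n, evaluated at the given points."""
    z = (points[:, None] - X[None, :]) / s
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (N * s * np.sqrt(2.0 * np.pi))

X = rng.normal(0.0, 1.0, N)        # realizations X_n^(i), i = 1, ..., N
for n in range(n_steps):
    s = N ** (-0.2) * X.std()      # Silverman-type bandwidth from the ensemble
    Pn = kernel_density(X, X, s)   # P_n at the position of each realization
    eps = np.sqrt(2.0) * rng.standard_normal(N)   # variance-2 noise, Eq. (33)
    X = X + dt * (-gamma * X) + np.sqrt(dt * Q * Pn) * eps

# Density estimate of the final ensemble on a grid
grid = np.linspace(-5.0, 5.0, 501)
P_final = kernel_density(grid, X, N ** (-0.2) * X.std())
```

Note the self-consistent structure: at every step, the ensemble itself supplies the density estimate Pn that enters the drift and diffusion coefficients of each realization.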

Finally, note that self-consistent Langevin equations can be evaluated analytically in order to determine second-order statistical properties of a stochastic process defined by a strongly nonlinear Fokker-Planck equation (Borland 1998; Kharchenko and Kharchenko 2005).

Short-Time Propagator

The Green’s function for short time intervals is frequently called the short-time propagator and can be derived from the time-discrete Ito-Langevin equation (51). Equation (51) relates the random variable ϵn that is distributed like W(ϵn) (see Eq. (33)) to the random variable Xn+1. In general, if Xn+1 is a function of ϵn, then the probability density W′(xn+1) of Xn+1 is given by

$$ {W}^{\prime}\left({x}_{n+1}\right)=W\left({\upepsilon}_n\right)\frac{\mathrm{d}{\upepsilon}_n}{\mathrm{d}{x}_{n+1}}. $$
(54)

In particular, if Xn+1 is computed from ϵn for a particular value xn and probability density P, then we obtain the short-time conditional probability density

$$ {p}_s\left({x}_{n+1}|{x}_n,P\left({x}_n,{t}_n;u\right)\right)=W\left({\upepsilon}_n\right)\frac{\mathrm{d}{\upepsilon}_n}{\mathrm{d}{x}_{n+1}}. $$
(55)

Equation (51) can be transformed into

$$ {\upepsilon}_n=\frac{X_{n+1}-{X}_n-\Delta t\;{D}_1\left({X}_n,{t}_n,P\left({X}_n,{t}_n;u\right)\right)}{\sqrt{\Delta t\;{D}_2\left({X}_n,{t}_n,P\left({X}_n,{t}_n;u\right)\right)}}. $$
(56)

Substituting Eq. (56) into Eq. (55), we obtain

$$ {p}_s\left({x}_{n+1}|{x}_n,P\left({x}_n,{t}_n;u\right)\right)=\frac{\exp \left\{-\frac{{\left[{x}_{n+1}-{x}_n-\Delta t\;{D}_1\left({x}_n,{t}_n,P\left({x}_n,{t}_n;u\right)\right)\right]}^2}{4\Delta t\;{D}_2\left({x}_n,{t}_n,P\left({x}_n,{t}_n;u\right)\right)}\right\}}{\sqrt{4\uppi \Delta t\;{D}_2\left({x}_n,{t}_n,P\left({x}_n,{t}_n;u\right)\right)}}. $$
(57)

Using the time-continuous framework and the replacements tn → t′, xn → x′, tn+1 → t = t′ + Δt, xn+1 → x and likewise P(xn, tn; u) → P(x′,t′; u) = P′, we obtain the short-time propagator (see Sect. 2.8.1 in Frank 2005b)

$$ {p}_s\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\frac{\exp \left\{-\frac{{\left[x-{x}^{\prime }-\Delta t\;{D}_1\left({x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\right]}^2}{4\Delta t\;{D}_2\left({x}^{\prime },{t}^{\prime },{P}^{\prime}\right)}\right\}}{\sqrt{4\pi \Delta t\;{D}_2\left({x}^{\prime },{t}^{\prime },{P}^{\prime}\right)}}. $$
(58)

The short-time propagator was originally proposed by Wehner and Wolfer (1987) and can be used to solve nonlinear Fokker-Planck equations numerically (Donoso and Salgado 2006; Donoso et al. 2005; Soler et al. 1992). To this end, the short-time propagator is substituted into Eq. (48), and subsequently Eq. (48) is integrated over all variables xn−1, …, x0. Thus we obtain P(x, tn; u) for tn = nΔt. In the context of stochastic processes described by linear Fokker-Planck equations, the construction of solutions by means of short-time propagators is referred to as the path integral approach (Gardiner 1997; Haken 2004; Risken 1989). We will return to a similar path integral approach in section “Semiclassical Description of Quantum Systems.” Expectation values of functions f can be computed from (58) like

$$ {\left\langle f\left(X(t)\right)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle =P\left({x}^{\prime },{t}^{\prime };u\right)}={\int}_{\Omega }f(x){p}_s\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\mathrm{d}x, $$
(59)

which holds for small intervals Δt = t − t′. The short-time propagator illustrates again the Markov property of solutions of the strongly nonlinear Fokker-Planck equation (44). The information about x′ and P′ at time t′ is sufficient to make predictions in terms of expectation values about the values that the stochastic process will assume at time t = t′ + Δt.
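Substituting the short-time propagator into the joint probability density and integrating over the intermediate variables, as described above, becomes a repeated matrix-vector product on a grid. The sketch below assumes the linear coefficients D1 = −γx and D2 = Q (an illustrative choice for which the exact moments are known); for a strongly nonlinear equation one would additionally update the density entering D1 and D2 at each step.

```python
import numpy as np

# Solving a Fokker-Planck equation with the short-time propagator: P at
# time t' + dt is obtained by applying the Gaussian propagator p_s to P at
# time t', which on a grid becomes a matrix-vector product. The linear
# coefficients D1 = -gamma*x and D2 = Q are an illustrative assumption.
gamma, Q = 1.0, 0.5
x = np.linspace(-6.0, 6.0, 241)
dx = x[1] - x[0]
dt, n_steps = 0.01, 200

# p_s(x | x'): Gaussian in x centered at x' + dt*D1(x') with variance 2*dt*D2
center = x[None, :] + dt * (-gamma * x[None, :])
prop = np.exp(-(x[:, None] - center) ** 2 / (4.0 * dt * Q))
prop /= np.sqrt(4.0 * np.pi * dt * Q)

m0, v0 = 1.0, 0.25
P = np.exp(-(x - m0) ** 2 / (2 * v0)) / np.sqrt(2 * np.pi * v0)
for n in range(n_steps):
    P = prop @ P * dx
    P /= P.sum() * dx        # renormalize against small discretization loss

mean_num = (x * P).sum() * dx
var_num = ((x - mean_num) ** 2 * P).sum() * dx
# Exact Ornstein-Uhlenbeck moments at t = n_steps*dt, for comparison
T = n_steps * dt
mean_exact = m0 * np.exp(-gamma * T)
var_exact = v0 * np.exp(-2 * gamma * T) + (Q / gamma) * (1 - np.exp(-2 * gamma * T))
```

For the assumed linear coefficients, the propagated mean and variance agree with the exact Ornstein-Uhlenbeck values up to the first-order time-discretization error of the scheme.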

Chapman-Kolmogorov Equation, Kramers-Moyal Expansion, and Drift-Diffusion Estimates

Linear Fokker-Planck equations can be derived using the Kramers-Moyal expansion of the Chapman-Kolmogorov equation (Gardiner 1997; Risken 1989). The definition of the expansion coefficients in turn can be used to estimate the Kramers-Moyal coefficients in general and the drift and diffusion coefficients of linear Fokker-Planck equations in particular from experimental data (Friedrich and Peinke 1997; Friedrich et al. 2000). We will show in this section that if a stochastic process defined by a nonlinear Fokker-Planck equation can be embedded into a Markov process using the concept of strongly nonlinear Fokker-Planck equations, then we can proceed as in the linear case. Taking a slightly different perspective, we may say that there are Markov processes that involve conditional probability densities of the form p (x, t | x′, t′, P(x′, t′; u)) and can be characterized in terms of generalized Kramers-Moyal expansion coefficients.

Chapman-Kolmogorov Equation

Let \( \hat{X} \) denote a stochastic Markov process with conditional probability density p (x, t | x′, t′, P(x′, t′; u)). Then, as discussed in the previous section, the joint probability density P(x, t; x′, t′; x″, t″; u) can be expressed by

$$ P\left(x,t;x^{\prime },t^{\prime };x^{{\prime\prime} },t^{{\prime\prime} };u\right)=p\left(x,t|x^{\prime },t^{\prime },P\left(x^{\prime },t^{\prime };u\right)\right)\cdot p\left({x}^{\prime },{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },P\left({x}^{{\prime\prime} },{t}^{{\prime\prime} };u\right)\right)P\left({x}^{{\prime\prime} },{t}^{{\prime\prime} };u\right). $$
(60)

Integrating with respect to x′ and dividing by P(x″, t″; u) yields the generalized Chapman-Kolmogorov equation

$$ p\left(x,t|{x}^{{\prime\prime} },{t}^{{\prime\prime} },P\left({x}^{{\prime\prime} },{t}^{{\prime\prime} };u\right)\right)={\int}_{\Omega}p\left(x,t|x^{\prime },t^{\prime },P\left(x^{\prime },t^{\prime };u\right)\right)\cdot p\left({x}^{\prime },{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },P\left({x}^{{\prime\prime} },{t}^{{\prime\prime} };u\right)\right)\;\mathrm{d}{x}^{\prime }. $$
(61)

Note that in what follows, we will use the notation

$$ {\displaystyle \begin{array}{c}P=P\left(x,t;u\right),\\ {}{P}^{\prime }=P\left({x}^{\prime },{t}^{\prime };u\right),\\ {}{P}^{{\prime\prime} }=P\left({x}^{{\prime\prime} },{t}^{{\prime\prime} };u\right).\end{array}} $$
(62)

If we need to express probability densities P different from those listed in Eq. (62), we will write their arguments explicitly. For example, we will write P(x, t′; u) to express the probability density 〈δ(x − X(t′))〉 for a stochastic process \( \hat{X} \) with initial distribution u.

Using the notation of Eq. (62), we can write the joint probability density (60) as

$$ P\left(x,t;x^{\prime },t^{\prime };x^{{\prime\prime} },t^{{\prime\prime} };u\right)=p\left(x,t|x^{\prime },t^{\prime },{P}^{\prime}\right)p\left({x}^{\prime },{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right){P}^{{\prime\prime} } $$
(63)

and the generalized Chapman-Kolmogorov Eq. (61) becomes

$$ p\left(x,t|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)={\int}_{\Omega}p\left(x,t|x^{\prime },t^{\prime },{P}^{\prime}\right)p\left({x}^{\prime },{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)\;\mathrm{d}{x}^{\prime }. $$
(64)

Kramers-Moyal Expansion

In this section, the Kramers-Moyal expansion for linear Fokker-Planck equations as discussed in Risken (1989) will be generalized to the nonlinear case. Consider the conditional probability density p(x, t|x′, t′, P′) for t = t′ + Δt. Then, we have

$$ p\left(x,{t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)={\int}_{\Omega}\delta \left(y-x\right)p\left(y,{t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\;\mathrm{d}y. $$
(65)

The variables x and x′ denote arbitrary states in Ω. Next, let us consider states y that are close to x′ such that ϵ = y − x′ is small. Using y − x = ϵ + x′ − x, we obtain

$$ p\left(x,{t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)={\int}_{\Omega}\delta \left({x}^{\prime }-x+\upepsilon \right)\cdot p\left({x}^{\prime }+\upepsilon, {t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\;\mathrm{d}\upepsilon . $$
(66)

Next, use the Taylor expansion of the delta function

$$ \delta \left({x}^{\prime }-x+\upepsilon \right)=\delta \left({x}^{\prime }-x\right)+\sum \limits_{n=1}^{\infty}\frac{\upepsilon^n}{n!}{\left(\frac{\partial }{\partial {x}^{\prime }}\right)}^n\delta \left({x}^{\prime }-x\right). $$
(67)

Then, Eq. (66) becomes

$$ p\left(x,{t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\delta \left({x}^{\prime }-x\right)+\sum \limits_{n=1}^{\infty }{\int}_{\Omega}\mathrm{d}\upepsilon \frac{\upepsilon^n}{n!}\cdot p\left({x}^{\prime }+\upepsilon, {t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right){\left(\frac{\partial }{\partial {x}^{\prime }}\right)}^n\delta \left({x}^{\prime }-x\right). $$
(68)

Multiplying Eq. (68) with p(x′, t′ | x″, t″, P″) and integrating with respect to x′ yields on the left-hand side

$$ {\displaystyle \begin{array}{llll}\mathrm{LHS} & = & {\int}_{\Omega}p\left(x,{t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)p\left({x}^{\prime },{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{\prime\prime} \right)\mathrm{d}{{x}^{\prime}}\\ & & \,\, =p\left(x,{{t}^{\prime}}+\Delta t|{x}^{\prime\prime},{t}^{\prime\prime},{P}^{{\prime\prime}}\right)\end{array}} $$
(69)

and on the right-hand side

$$ {\displaystyle \begin{array}{ll}\mathrm{RHS}& =p\left(x,{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)+\sum \limits_{n=1}^{\infty }{\int}_{\Omega}\mathrm{d}{x}^{\prime }{\int}_{\Omega}\mathrm{d}\upepsilon \frac{\upepsilon^n}{n!}p\left({x}^{\prime }+\upepsilon, {t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)p\left({x}^{\prime },{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)\frac{\partial^n\delta \left({x}^{\prime }-x\right)}{\partial {x}^{\prime n}}\\ {}& =p\left(x,{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)+\sum \limits_{n=1}^{\infty }{\int}_{\Omega}\mathrm{d}{x}^{\prime}\delta \left({x}^{\prime }-x\right){\left(-1\right)}^n\frac{\partial^n}{\partial {x}^{\prime n}}{\int}_{\Omega}\mathrm{d}\upepsilon \frac{\upepsilon^n}{n!}p\left({x}^{\prime }+\upepsilon, {t}^{\prime }+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)p\left({x}^{\prime },{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)\\ {}& =p\left(x,{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)+\sum \limits_{n=1}^{\infty }{\left(-\frac{\partial }{\partial x}\right)}^n{\int}_{\Omega}\mathrm{d}\upepsilon \frac{\upepsilon^n}{n!}p\left(x+\upepsilon, {t}^{\prime }+\Delta t|x,{t}^{\prime },P\left(x,{t}^{\prime };u\right)\right)p\left(x,{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right).\end{array}} $$
(70)

Note that we used the Chapman-Kolmogorov Eq. (61) in order to evaluate the left-hand side (69) and we used Eq. (61) as well as partial integration in order to evaluate the right-hand side (70). Let us define the moments Mn(x, t, Δt, P(x, t; u)) by

$$ {M}_n\left(x,t,\Delta t,P\right)={\int}_{\Omega}\mathrm{d}\upepsilon \frac{\upepsilon^n}{n!}p\left(x+\upepsilon, t+\Delta t|x,t,P\right) $$
(71)

or, substituting z = x + ϵ, by

$$ {M}_n\left(x,t,\Delta t,P\right)={\int}_{\Omega}\mathrm{d}z\frac{{\left(z-x\right)}^n}{n!}p\left(z,t+\Delta t|x,t,P\right). $$
(72)

Combining the left- and right-hand sides given by Eqs. (69) and (70), respectively, we obtain

$$ p\left(x,{t}^{\prime }+\Delta t|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)=p\left(x,{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right)+\sum \limits_{n=1}^{\infty }{\left(-\frac{\partial }{\partial x}\right)}^n{M}_n\left(x,{t}^{\prime },\Delta t,P\left(x,{t}^{\prime };u\right)\right)p\left(x,{t}^{\prime }|{x}^{{\prime\prime} },{t}^{{\prime\prime} },{P}^{{\prime\prime}}\right). $$
(73)

To improve readability, let us replace t′ by t and subsequently t′′ by t′. Thus, we obtain

$$ p\left(x,t+\Delta t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)=p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)+\sum \limits_{n=1}^{\infty }{\left(-\frac{\partial }{\partial x}\right)}^n{M}_n\left(x,t,\Delta t,P\right)p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right). $$
(74)

This is the time-discrete version of the Kramers-Moyal expansion of the generalized Chapman-Kolmogorov Eq. (64). Note that Mn depends on P(x, t; u), whereas p depends on P(x′, t′; u). Next, we define the Kramers-Moyal coefficients

$$ {D}_n\left(x,t,P\right)=\underset{\Delta t\to 0}{\lim}\frac{M_n}{\Delta t}=\underset{\Delta t\to 0}{\lim}\frac{1}{\Delta t}{\int}_{\Omega}\mathrm{d}z\frac{{\left(z-x\right)}^n}{n!}p\left(z,t+\Delta t|x,t,P\right). $$
(75)

Dividing Eq. (74) by Δt and taking the limiting case Δt → 0, Eq. (74) becomes the time-continuous generalized Kramers-Moyal expansion

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\sum \limits_{n=1}^{\infty }{\left(-\frac{\partial }{\partial x}\right)}^n{D}_n\left(x,t,P\right)p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right). $$
(76)

Note that by generalizing the Kramers-Moyal expansion to the nonlinear case, we found immediately that the coefficients Dn depend on P(x, t; u), whereas the conditional probability density p depends on P(x′, t′; u). Note also that in the special case Dn = 0 for n ≥ 3, the Kramers-Moyal expansion (76) reduces to the nonlinear Fokker-Planck equation (44). Finally, note that since Mn(Δt = 0) = 0 holds for all n, the Kramers-Moyal coefficients can also be defined by

$$ {D}_n\left(x,t,P\right)={\left.\frac{\partial {M}_n}{\partial \Delta t}\right|}_{\Delta t=0}={\int}_{\Omega}\mathrm{d}z\frac{{\left(z-x\right)}^n}{n!}{\left.\frac{\partial }{\partial s}p\left(z,s|x,t,P\right)\right|}_{s=t}. $$
(77)
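The limits in Eq. (75) can be probed numerically for the Gaussian short-time propagator of the linear Ornstein-Uhlenbeck case, where the dependence on P drops out and the exact coefficients D1 = −γx, D2 = Q are known. All parameter values below are illustrative:

```python
import numpy as np

gamma, Q, x = 1.0, 0.5, 0.7
dtau = 1e-4                                  # small but finite stand-in for the limit
# Gaussian short-time propagator p(z, t + dtau | x, t) of the OU process
z = np.linspace(x - 1.0, x + 1.0, 20001)     # narrow grid around x suffices
mean = x * np.exp(-gamma * dtau)
var = (Q / gamma) * (1.0 - np.exp(-2.0 * gamma * dtau))
p = np.exp(-(z - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
dz = z[1] - z[0]
# moment-based coefficients, Eq. (75) with n = 1, 2
D1 = np.sum((z - x) * p) * dz / dtau             # close to -gamma * x = -0.7
D2 = np.sum((z - x) ** 2 / 2.0 * p) * dz / dtau  # close to Q = 0.5
```

Shrinking `dtau` further drives both estimates toward the exact Kramers-Moyal coefficients, mirroring the limit Δt → 0 in Eq. (75).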

Drift-Diffusion Estimates

The definition of the Kramers-Moyal coefficients can be exploited to extract the drift and diffusion coefficients of nonlinear Fokker-Planck equations from time series data. Accordingly, the drift coefficient D1 and the diffusion coefficient D2 are defined by

$$ {\displaystyle \begin{array}{l}{D}_1\left(x,t,P\right)=\underset{\Delta t\to 0}{\lim}\frac{1}{\Delta t}{\int}_{\Omega}\mathrm{d}z\left(z-x\right)p\left(z,t+\Delta t|x,t,P\right),\\ {}{D}_2\left(x,t,P\right)=\underset{\Delta t\to 0}{\lim}\frac{1}{2\Delta t}{\int}_{\Omega}\mathrm{d}z{\left(z-x\right)}^2p\left(z,t+\Delta t|x,t,P\right).\end{array}} $$
(78)

The limit Δt → 0 may be approximated by the smallest time step that is available in the data set:

$$ {\displaystyle \begin{array}{l}{D}_1\left(x,t,P\right)\approx \frac{1}{\Delta t}{\int}_{\Omega}\mathrm{d}z\left(z-x\right)p\left(z,t+\Delta t|x,t,P\right),\\ {}{D}_2\left(x,t,P\right)\approx \frac{1}{2\Delta t}{\int}_{\Omega}\mathrm{d}z{\left(z-x\right)}^2p\left(z,t+\Delta t|x,t,P\right),\end{array}} $$
(79)

where Δt denotes now the sampling interval between two data points. Note that on the basis of the alternative definition (77), higher-order approximations can also be defined (Patanarapeelert et al. 2006). The conditional averages can be approximated by empirical conditional averages computed from a finite set of realizations X(1)(t), X(2)(t), …, X(N)(t). Thus, we obtain

$$ {\displaystyle \begin{array}{l}{D}_1\left(x,t,P\right)\approx \frac{1}{\Delta t}\frac{1}{\sum_{i\in I\left(t,x\right)}1}\cdot \sum \limits_{i\in I\left(t,x\right)}\left[{X}^{(i)}\left(t+\Delta t\right)-{X}^{(i)}(t)\right],\\ {}{D}_2\left(x,t,P\right)\approx \frac{1}{2\Delta t}\frac{1}{\sum_{i\in I\left(t,x\right)}1}\cdot \sum \limits_{i\in I\left(t,x\right)}{\left[{X}^{(i)}\left(t+\Delta t\right)-{X}^{(i)}(t)\right]}^2,\end{array}} $$
(80)

where I(t,x) is the set of indices i for which X(i)(t) ≈ x. In the case of Markov processes described by linear Fokker-Planck equations, the argument P in the coefficients can be dropped and the above drift-diffusion estimates reduce to the estimates proposed in Friedrich and Peinke (1997) and Friedrich et al. (2000) that have recently found many applications (Bödeker et al. 2003; Jafari et al. 2002; Sura and Barsugli 2002; Waechter et al. 2004). For Markov processes described by nonlinear Fokker-Planck equations, we need to compute the conditional averages for different probability densities P. To this end, we may vary the initial distribution u of a stochastic process. For a stochastic process with a particular distribution of X at time t, we will obtain the coefficients D1 and D2 only for that particular distribution. Using the kernel estimate method mentioned above, we obtain

$$ {D}_1\left(x,t,P\approx \frac{1}{Ns\sqrt{2\pi }}\sum \limits_{i=1}^N\exp \left\{-\frac{{\left(x-{X}^{(i)}(t)\right)}^2}{2{s}^2}\right\}\right)\approx \frac{1}{\Delta t}\frac{1}{\sum_{i\in I\left(t,x\right)}1}\sum \limits_{i\in I\left(t,x\right)}\left[{X}^{(i)}\left(t+\Delta t\right)-{X}^{(i)}(t)\right] $$
(81)

and

$$ {D}_2\left(x,t,P\approx \frac{1}{Ns\sqrt{2\pi }}\sum \limits_{i=1}^N\exp \left\{-\frac{{\left(x-{X}^{(i)}(t)\right)}^2}{2{s}^2}\right\}\right)\approx \frac{1}{2\Delta t}\frac{1}{\sum_{i\in I\left(t,x\right)}1}\sum \limits_{i\in I\left(t,x\right)}{\left[{X}^{(i)}\left(t+\Delta t\right)-{X}^{(i)}(t)\right]}^2 $$
(82)

with s = N−1/5σe(t), where σe(t) is the standard deviation of the empirical ensemble {X(1)(t), …, X(N)(t)}. Note that in general the Kramers-Moyal coefficients of Markov processes induced by conditional probability densities of the form p(x, t|x′, t′, P′) can be estimated using

$$ {D}_n\left(x,t,P\approx \frac{1}{Ns\sqrt{2\pi }}\sum \limits_{i=1}^N\exp \left\{-\frac{{\left(x-{X}^{(i)}(t)\right)}^2}{2{s}^2}\right\}\right)\approx \frac{1}{n!\Delta t}\frac{1}{\sum_{i\in I\left(t,x\right)}1}\sum \limits_{i\in I\left(t,x\right)}{\left[{X}^{(i)}\left(t+\Delta t\right)-{X}^{(i)}(t)\right]}^n. $$
(83)
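As a sketch of how the estimates (80), (81), and (82) operate in practice, consider the linear Ornstein-Uhlenbeck special case, where the argument P can be dropped and the true coefficients D1 = −γx, D2 = Q are known exactly. The parameters, state value, bin half-width, and ensemble size are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, Q = 1.0, 0.5                     # OU test process: D1 = -gamma*x, D2 = Q
dt, N = 0.01, 1_000_000                 # sampling interval and ensemble size
X = rng.normal(0.0, 1.0, N)             # ensemble {X^(i)(t)} at time t
# one Euler-Maruyama step gives the ensemble at time t + dt
Xn = X - gamma * X * dt + np.sqrt(2.0 * Q * dt) * rng.normal(size=N)

x0, h = 0.5, 0.05                       # estimate at state x ~ 0.5; h is the bin half-width
mask = np.abs(X - x0) < h               # index set I(t, x): realizations with X^(i)(t) ~ x
dX = Xn[mask] - X[mask]
D1 = dX.mean() / dt                     # drift estimate, Eq. (80); exact value -0.5
D2 = (dX ** 2).mean() / (2.0 * dt)      # diffusion estimate, Eq. (80); exact value 0.5

# Gaussian kernel estimate of P(x0, t) with bandwidth s = N^(-1/5) * sigma_e,
# as it enters Eqs. (81)-(83); exact standard normal density at 0.5 is ~0.352
s = N ** (-1.0 / 5.0) * X.std()
P_hat = np.mean(np.exp(-(x0 - X) ** 2 / (2.0 * s ** 2))) / (s * np.sqrt(2.0 * np.pi))
```

For a genuinely nonlinear process, the same conditional averages would have to be recorded together with the kernel estimate of P, since the coefficients then depend on the instantaneous distribution.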

Alternatively, parametric estimate methods may be used. For example, we may be interested in estimating the exponent q of a Markov process defined by the Plastino-Plastino model (see section “Nonextensive Systems” below)

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\left[\frac{\partial }{\partial x}\upgamma x+Q\frac{\partial^2}{\partial {x}^2}P{\left(x,t;u\right)}^{q-1}\right]p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right) $$
(84)

with γ, Q, q > 0. The diffusion coefficient D2(P) = QPq−1 then involves the parameters Q and q. Using Eq. (82) and taking the logarithm, we get

$$ \ln Q+\left(q-1\right)\ln \left\{\frac{1}{Ns\sqrt{2\pi }}\sum \limits_{i=1}^N\exp \left\{-\frac{{\left(x-{X}^{(i)}(t)\right)}^2}{2{s}^2}\right\}\right\}\approx \ln \left\{\frac{1}{2\Delta t}\frac{1}{\sum_{i\in I\left(t,x\right)}1}\sum \limits_{i\in I\left(t,x\right)}{\left[{X}^{(i)}\left(t+\Delta t\right)-{X}^{(i)}(t)\right]}^2\right\}. $$
(85)

For example, at a particular time t, Eq. (85) can be evaluated for different states xi. In that case, Eq. (85) assumes the form ln Q + (q − 1) A1(xi) = A2(xi). The expressions ln Q and q − 1 (and thereby the parameters Q and q) can then be estimated by linear regression (Frank and Friedrich 2005).
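The regression step can be sketched as follows. Since running the full estimation pipeline is beyond a short example, the values A1(xi) and A2(xi) are generated synthetically here from assumed parameters Q = 0.8, q = 1.5 and then recovered by least squares:

```python
import numpy as np

rng = np.random.default_rng(2)
Q_true, q_true = 0.8, 1.5
# A1(x_i) = ln of the kernel density estimate and A2(x_i) = ln of the D2 estimate,
# as in Eq. (85); generated synthetically with a small noise term for illustration
A1 = np.linspace(-3.0, 0.0, 20)
A2 = np.log(Q_true) + (q_true - 1.0) * A1 + 0.01 * rng.normal(size=20)
slope, intercept = np.polyfit(A1, A2, 1)    # fit A2 = ln Q + (q - 1) * A1
q_est = slope + 1.0                          # recovers q ~ 1.5
Q_est = np.exp(intercept)                    # recovers Q ~ 0.8
```

With real data, A1 and A2 would come from Eqs. (81) and (82) evaluated at the sample states xi.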

Martingales

Let Z(t) denote a functional of a stochastic process \( \hat{X} \) defined for t ≥ t0. In what follows, we will put t0 = 0. Then, Z is a martingale of \( \hat{X} \) if

$$ {\left\langle Z(t)\right\rangle}_{X=\theta }=Z\left({t}^{\prime}\right) $$
(86)

holds for t ≥ t′, where θ is a realization of the random variable X on the interval [0, t′] (see Sect. 1.3 in Karlin and Taylor 1975). The constraint X = θ means that X(s) = θ(s) holds for s ∈ [0, t′]; that is, on the interval [0, t′] the trajectory of X is fixed. Roughly speaking, a martingale is a random variable for which the best predictor of its future mean value is the present value. With regard to Eq. (86), the prediction of the future mean value is 〈Z(t)〉, whereas the present value of Z is Z(t′). Alternatively, we may say that information at a single time t′ about the value of the martingale Z is sufficient to predict the mean value of Z for future times t ≥ t′. Note that this alternative point of view is closely related to the first definition of Markov processes discussed in the previous section.

For linear Fokker-Planck equations, there is a close link between martingales and the Markov property. Accordingly, a stochastic process is a Markov process defined by a linear Fokker-Planck equation with drift and diffusion coefficients D1 and D2 if and only if a particular random variable Z that involves the Fokker-Planck operator is a martingale (see Sect. 15.1 in Karlin and Taylor 1981). In the mathematical literature, this link has also been studied in the context of nonlinear Fokker-Planck equations (Djehiche and Kaj 1995; Fontbona 2003; Gärtner 1988; Graham 1990; Greven 2005; Jourdain 2000; Meleard 1996; Meleard and Coppoletta 1987; Overbeck 1996).

Our aim in this section is to make the martingale approach more accessible to scientists working in physics, applied mathematics, and related disciplines. To this end, we will in what follows illustrate this link between martingales and Markov processes defined by strongly nonlinear Fokker-Planck equations by means of standard techniques frequently used in physics.

Theorem 1

Let \( \hat{X} \) be a stochastic process with initial probability density u(x) and conditional probability density

$$ p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right)={\left\langle \delta \left(x-X(t)\right)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle =P\left({x}^{\prime },{t}^{\prime };u\right)}. $$
(87)

Then, \( \hat{X} \) is a Markov process defined by the nonlinear Fokker-Planck equation

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right)=L\left(x,t,P\right)p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right) $$
(88)

with

$$ L\left(x,t,P\right)=-\frac{\partial }{\partial x}{D}_1\left(x,t,P\right)+\frac{\partial^2}{\partial {x}^2}{D}_2\left(x,t,P\right) $$
(89)

if and only if Z(t) defined by

$$ Z(t)=f\left(X(t)\right)-{\int}_0^t{L}_Bf\left[X(z),z,P\right]\;\mathrm{d}z $$
(90)

with

$$ {L}_B\left(x,t,P\right)={D}_1\left(x,t,P\right)\frac{\partial }{\partial x}+{D}_2\left(x,t,P\right)\frac{\partial^2}{\partial {x}^2} $$
(91)

is a martingale of \( \hat{X} \) for smooth functions f. In the context of linear Fokker-Planck equations, the operator LB is the Fokker-Planck backward operator (Gardiner 1997; Risken 1989). Note that in our context, we refer to f as a smooth function if it has continuous second-order derivatives. Note also that above and in what follows, we will frequently use the notation (54). Note finally that in the above theorem the notation LB f[X(z), z, P] should be interpreted as

$$ {\displaystyle \begin{array}{ll}{L}_Bf\left[X(z),z,P\right]& ={L}_B\left(X(z),z,P\right)f\left(X(z)\right)\\ {}& ={\left\{{L}_B\left(x,t,P\right)f(x)\right\}}_{x=X(z),t=z}.\end{array}} $$
(92)

That is, we first carry out the differentiations defined by the operator LB. Subsequently, we replace in the result the state variable x by the value of the random variable X at time z; moreover, we replace t by z. Let us prove the theorem in two parts.

From Strongly Nonlinear Fokker-Planck Equations to Martingales

Let us prove in this section that a Markov process defined by a strongly nonlinear Fokker-Planck equation exhibits the martingale Z. To this end, we first compute the conditional mean of the random variable Z defined in Eq. (90). Thus, we obtain

$$ {\displaystyle \begin{array}{ll}{\left\langle Z(t)\right\rangle}_{X=\theta }& ={\left\langle f\left(X(t)\right)\right\rangle}_{X=\theta }-{\int}_0^t\mathrm{d}z{\left\langle {L}_Bf\left[X(z),z,P\right]\right\rangle}_{X=\theta}\\ {}& ={\left\langle f\left(X(t)\right)\right\rangle}_{X=\theta }-{\int}_{t^{\prime}}^t\mathrm{d}z{\left\langle {L}_Bf\left[X(z),z,P\right]\right\rangle}_{X=\theta }-{\int}_0^{t^{\prime }}\mathrm{d}s\;{L}_Bf\left[\theta (s),s,P\right].\end{array}} $$
(93)

The Markov property implies that the constraints can be relaxed. That is, for every functional g(t) of X(t) with t ≥ t′, we have 〈g(t)〉X=θ = 〈g(t)〉X(t′)=x′; P′, where x′ is given by x′ = θ(t′). We have indicated here that the average may depend on how the process is distributed at time t′. Consequently, Eq. (93) becomes

$$ {\displaystyle \begin{array}{l}{\left\langle Z(t)\right\rangle}_{X=\theta }={\left\langle f\left(X(t)\right)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle ={P}^{\prime }}\\ {}-{\int}_{t^{\prime}}^t\mathrm{d}z{\left\langle {L}_Bf\left[X(z),z,P\right]\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle ={P}^{\prime }}\\ {}-{\int}_0^{t^{\prime }}\mathrm{d}{sL}_Bf\left[\theta (s),s,P\right].\end{array}} $$
(94)

Multiplying the Fokker-Planck equation (88) with f(x) and integrating with respect to x, we obtain

$$ \frac{\partial }{\partial t}{\int}_{\Omega}f(x)p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right)\;\mathrm{d}x={\int}_{\Omega}f(x)L\left(x,t,P\right)p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right)\;\mathrm{d}x. $$
(95)

By means of partial integration, we find that ∫Ωf(x)L(x, t, P)p(x, t|x′, t′; P′) dx = ∫Ωp(x, t|x′, t′; P′)LB(x, t, P)f(x) dx. As a result, Eq. (95) can be transformed into

$$ {\displaystyle \begin{array}{l}\frac{\partial }{\partial t}{\int}_{\Omega}f(x)p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\;\mathrm{d}x\\ {}={\int}_{\Omega}p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right){L}_B\left(x,t,P\right)f(x)\;\mathrm{d}x\\ {}={\left\langle {L}_B\left(x,t,P\right)f(x)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };{P}^{\prime }}.\end{array}} $$
(96)

Using Eq. (96), we obtain

$$ {\left\langle {L}_Bf\left[X(z),z,P\right]\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };\left\langle \delta \left({x}^{\prime }-X\left({t}^{\prime}\right)\right)\right\rangle ={P}^{\prime }}=\frac{\partial }{\partial z}{\int}_{\Omega}f(x)p\left(x,z|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\;\mathrm{d}x. $$
(97)

Consequently, the following integral transformation holds

$$ {\displaystyle \begin{array}{ll}I& ={\int}_{t^{\prime}}^t\mathrm{d}z{\left\langle {L}_Bf\left[X(z),z,P\right]\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };{P}^{\prime }}\\ {}& ={\int}_{t^{\prime}}^t\mathrm{d}z\frac{\partial }{\partial z}{\int}_{\Omega}\mathrm{d}x\;f(x)p\left(x,z|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\\ {}& ={\left\langle f\left(X(t)\right)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime };{P}^{\prime }}-f\left({x}^{\prime}\right).\end{array}} $$
(98)

Substituting Eq. (98) into Eq. (94), we get

$$ {\left\langle Z(t)\right\rangle}_{X=\theta }=f\left({x}^{\prime}\right)-{\int}_0^{t^{\prime }}\mathrm{d}s\;{L}_Bf\left[\theta (s),s,P\right]. $$
(99)

By definition, the value Z(t′) for X(s) = θ(s) given on s ∈ [0, t′] reads

$$ Z\left({t}^{\prime}\right)=f\left({x}^{\prime}\right)-{\int}_0^{t^{\prime }}\mathrm{d}s\;{L}_Bf\left[\theta (s),s,P\right]. $$
(100)

Consequently, we have our final result

$$ {\left\langle Z(t)\right\rangle}_{X=\theta }=Z\left({t}^{\prime}\right) $$
(101)

and the proof is completed.
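The statement just proven can be illustrated numerically in the linear special case D1 = −γx, D2 = Q with f(y) = y, for which LB f[x, t, P] = −γx. Starting all realizations from a common state x′ at t′ = 0, the ensemble average of Z(t) should stay at Z(0) = f(x′). All parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
gamma, Q = 1.0, 0.5
dt, steps, N = 1e-3, 2000, 50_000
x0 = 1.2                                  # common state x' at t' = 0
X = np.full(N, x0)
integral = np.zeros(N)                    # accumulates int_0^t L_B f[X(s), s, P] ds
for _ in range(steps):
    drift = -gamma * X                    # L_B f = D1 * df/dx = -gamma * x for f(y) = y
    integral += drift * dt
    X += drift * dt + np.sqrt(2.0 * Q * dt) * rng.normal(size=N)
Z = X - integral                          # Eq. (90) with f(y) = y
# martingale property (86): the ensemble mean of Z(t) stays near Z(0) = x0,
# even though <X(t)> itself has decayed toward zero
```

Per realization, each increment of Z is a pure noise increment, which is exactly why its conditional mean is preserved; the same construction applies with the P-dependent drift of a strongly nonlinear equation.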

From Martingales to Strongly Nonlinear Fokker-Planck Equations

Let us prove next that the martingale (90) defines a Markov process of a strongly nonlinear Fokker-Planck equation. Evaluating Eq. (90) by analogy to Eq. (93) gives us

$$ {\left\langle Z(t)\right\rangle}_{X=\theta }={\left\langle f\left(X(t)\right)\right\rangle}_{X=\theta }-{\int}_{t^{\prime}}^t\mathrm{d}z{\left\langle {L}_Bf\left[X(z),z,P\right]\right\rangle}_{X=\theta }-{\int}_0^{t^{\prime }}\mathrm{d}s\;{L}_Bf\left[\theta (s),s,P\right]. $$
(102)

Substituting this result into Eq. (86) and substituting Eq. (100) into Eq. (86), we see that Eq. (86) becomes

$$ {\left\langle f\left(X(t)\right)\right\rangle}_{X=\theta }=f\left({x}^{\prime}\right)+{\int}_{t^{\prime}}^t\mathrm{d}z{\left\langle {L}_Bf\left[X(z),z,P\right]\right\rangle}_{X=\theta }. $$
(103)

Equation (103) can equivalently be written as

$$ {\int}_{\Omega}f(x)p\left(x,t|X=\theta \right)\;\mathrm{d}x=f\left({x}^{\prime}\right)+{\int}_{t^{\prime}}^t\mathrm{d}z{\int}_{\Omega}\mathrm{d}x\;p\left(x,z|X=\theta \right){L}_Bf\left[x,z,P\right] $$
(104)

with p(x, z|X = θ) = ⟨δ(x − X(z))⟩X=θ. Using partial integration, we can show that the operator LB and the differential operator L are related to each other like

$$ {\int}_{\Omega}\mathrm{d}x\;p\left(x,z|X=\theta \right)\;{L}_Bf\left[x,z,P\right]={\int}_{\Omega}\mathrm{d}x\;f(x)L\left(x,z,P\right)p\left(x,z|X=\theta \right). $$
(105)

Substituting this result into Eq. (104) yields

$$ 0={\int}_{\Omega}\mathrm{d}x\;f(x)\left\{p\left(x,t|X=\theta \right)-\delta \left(x-{x}^{\prime}\right)-{\int}_{t^{\prime}}^t\mathrm{d}z\;L\left(x,z,P\right)p\left(x,z|X=\theta \right)\right\}. $$
(106)

Since Eq. (106) holds for arbitrary smooth functions f, the expression in the brackets {⋅} must vanish, and we obtain

$$ p\left(x,t|X=\theta \right)=\delta \left(x-{x}^{\prime}\right)+{\int}_{t^{\prime}}^t\mathrm{d}z\;L\left(x,z,P\right)p\left(x,z|X=\theta \right). $$
(107)

Differentiating Eq. (107) with respect to t gives us

$$ \frac{\partial }{\partial t}p\left(x,t|X=\theta \right)=L\left(x,t,P\right)p\left(x,t|X=\theta \right). $$
(108)

Multiplying with the probability density P(X = θ) and performing a functional integration with respect to the path θ, we obtain

$$ \frac{\partial }{\partial t}P\left(x,t\right)=L\left(x,t,P\right)P\left(x,t\right). $$
(109)

The formal solutions of Eqs. (108) and (109) read

$$ p\left(x,t|X=\theta \right)=\exp \left\{{\int}_{t^{\prime}}^t\mathrm{d}z\;L\left(x,z,P\right)\right\}\delta \left(x-{x}^{\prime}\right) $$
(110)

and

$$ P\left(x,t\right)=\exp \left\{{\int}_{t^{\prime}}^t\mathrm{d}z\;L\left(x,z,P\right)\right\}P\left(x,{t}^{\prime}\right). $$
(111)

We see that a solution of Eq. (108) under the initial condition p(x, t|X = θ) = δ(x − x′) for t → t′ with x′ = θ(t′) depends only on θ(t′) and does not depend on θ(s) for s < t′. Consequently, X is a Markov process. However, L depends on P. From Eq. (110), it is clear that the conditional probability density p depends on the time-dependent probability density P for z ∈ [t′, t]. Since the probability density P(x,t;u) for t ≥ t′ can be computed from P(x,t′;u) as shown in Eq. (111), we conclude that p depends only on P(x,t′;u) and does not depend on the evolution of P on the whole interval [t′, t]. Therefore, we have p(x, t|X = θ) = p(x, t|x′, t′, P′). Substituting this result into Eq. (108), we see that Eq. (108) becomes a strongly nonlinear Fokker-Planck equation

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right)=L\left(x,t,P\right)p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right). $$
(112)

Examples

Shimizu-Yamada Model

The Shimizu-Yamada model (Shimizu 1974; Shimizu and Yamada 1972) corresponds to the Desai-Zwanzig model (7) for a linear single-particle force h(x) = −γx. The evolution of the conditional probability density p is defined by

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\left[\frac{\partial }{\partial x}\left(\upgamma x+\kappa \left(x-{\int}_{\Omega} xP\left(x,t;u\right)\;\mathrm{d}x\right)\right)+Q\frac{\partial^2}{\partial {x}^2}\right]\cdot p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right) $$
(113)

with Ω = ℝ and γ, κ, Q > 0. Multiplying Eq. (113) with P(x′, t′; u) and integrating with respect to x′ yields the evolution equation for P(x, t; u):

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=\left[\frac{\partial }{\partial x}\left(\upgamma x+\kappa \left(x-{\int}_{\Omega} xP\left(x,t;u\right)\;\mathrm{d}x\right)\right)+Q\frac{\partial^2}{\partial {x}^2}\right]\cdot P\left(x,t;u\right). $$
(114)

See also Frank (2004d) and Sect. 3.10 in Frank (2005b). From Eq. (114), it follows that the mean value m(t) = ∫Ω xP(x, t; u) dx decays exponentially like

$$ m(t)=m\left({t}_0\right)\exp \left\{-\upgamma \left(t-{t}_0\right)\right\} $$
(115)

with m(t0) = ∫Ω xu(x) dx. Substituting Eq. (115) into Eqs. (113) and (114), we realize that a solution P(x, t; u) and a Green's function p exist for any initial probability density u(x). Therefore, the Shimizu-Yamada model is a strongly nonlinear Fokker-Planck equation and describes a Markov process.
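The decay law (115) can be reproduced by a self-consistent ensemble simulation of the Shimizu-Yamada model, in which each realization is coupled to the empirical ensemble mean. Parameter values and the initial distribution are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
gamma, kappa, Q = 1.0, 2.0, 0.25
dt, steps, N = 1e-3, 1000, 50_000
X = rng.normal(2.0, 0.5, N)          # initial distribution u with m(t0) ~ 2
m0 = X.mean()
for _ in range(steps):
    m = X.mean()                     # empirical ensemble mean as the mean-field force
    X += -(gamma * X + kappa * (X - m)) * dt + np.sqrt(2.0 * Q * dt) * rng.normal(size=N)
t = steps * dt                       # elapsed time t - t0 = 1
m_theory = m0 * np.exp(-gamma * t)   # Eq. (115): decay with rate gamma, independent of kappa
```

Note that the coupling term κ(X − m) cancels in the mean, so the simulated ensemble mean tracks the exponential decay (115) regardless of the coupling strength κ.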

It can be shown that the conditional probability density p(x, t|x′, t′; u) reads (see Frank (2004d) and Sect. 3.10 in Frank (2005b))

$$ p\left(x,t|{x}^{\prime },{t}^{\prime };u\right)=\frac{\exp \left\{-\frac{{\left[x-g\left(t,{t}^{\prime },{t}_0,u\right)-{x}^{\prime }m\left(t,{t}^{\prime}\right)\right]}^2}{2K\left(t,{t}^{\prime}\right)}\right\}}{\sqrt{2\pi K\left(t,{t}^{\prime}\right)}} $$
(116)

with

$$ m\left(t,{t}^{\prime}\right)=\exp \left\{-\left(\upgamma +\kappa \right)\left(t-{t}^{\prime}\right)\right\}, $$
(117)
$$ K\left(t,{t}^{\prime}\right)=\frac{Q}{\upgamma +\kappa}\left[1-\exp \left\{-2\left(\upgamma +\kappa \right)\left(t-{t}^{\prime}\right)\right\}\right], $$
(118)

and

$$ g=\left[\exp \left\{-\upgamma \left(t-{t}_0\right)\right\}-\exp \left\{-\left(\upgamma +\kappa \right)t+\upgamma {t}_0+\kappa {t}^{\prime}\right\}\right]\cdot {\int}_{\Omega} xu(x)\;\mathrm{d}x. $$
(119)

The mean value m(t) acts as a self-organized driving force of the stochastic process. Since there is a one-to-one mapping of m(t) to m(t′) with t′ < t, we can eliminate the parameter u in p(x, t|x′, t′; u) as argued in section “Strongly Nonlinear Fokker-Planck Equations.” Substituting Eq. (115) into Eq. (119), we obtain

$$ g\left(t,{t}^{\prime },P\left(x,{t}^{\prime };u\right)\right)=\exp \left\{-\upgamma \left(t-{t}^{\prime}\right)\right\}\cdot \left[1-\exp \left\{-\kappa \left(t-{t}^{\prime}\right)\right\}\right]{\int}_{\Omega} xP\left(x,{t}^{\prime };u\right)\kern0.24em \mathrm{d}x $$
(120)

or

$$ g\left(t,{t}^{\prime },\left\langle X\left({t}^{\prime}\right)\right\rangle \right)=\exp \left\{-\upgamma \left(t-{t}^{\prime}\right)\right\}\cdot \left[1-\exp \left\{-\kappa \left(t-{t}^{\prime}\right)\right\}\right]\;\left\langle X\left({t}^{\prime}\right)\right\rangle . $$
(121)

Consequently, the conditional probability density p(x, t|x′, t′, P′) reads

$$ p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right)=\frac{\exp \left\{-\frac{{\left[x-g\left(t,{t}^{\prime },\left\langle X\left({t}^{\prime}\right)\right\rangle \right)-{x}^{\prime }m\left(t,{t}^{\prime}\right)\right]}^2}{2K\left(t,{t}^{\prime}\right)}\right\}}{\sqrt{2\pi K\left(t,{t}^{\prime}\right)}}. $$
(122)

Dynamic Takatsuji Model

The dynamic Takatsuji model for the conditional probability density p is defined by

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right)=\left[\frac{\partial }{\partial x}\left(\left(\upgamma +c\right)x-\sqrt{c}\tanh \left(\sqrt{c}{\int}_{\Omega} xP\left(x,t;u\right)\;\mathrm{d}x\right)\right)+Q\frac{\partial^2}{\partial {x}^2}\right]\cdot p\left(x,t|{x}^{\prime },{t}^{\prime };{P}^{\prime}\right) $$
(123)

with x ∈ Ω = ℝ, c, Q > 0, and γ ∈ ℝ. Likewise, the probability density P(x, t; u) satisfies

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=\left[\frac{\partial }{\partial x}\left(\left(\upgamma +c\right)x-\sqrt{c}\tanh \left(\sqrt{c}{\int}_{\Omega} xP\left(x,t;u\right)\;\mathrm{d}x\right)\right)+Q\frac{\partial^2}{\partial {x}^2}\right]\cdot P\left(x,t;u\right). $$
(124)

For details, see Frank (2004e) and Takatsuji (1975). From Eq. (124), it follows that the first moment M1(t) = ⟨X⟩ can be computed from

$$ \frac{\mathrm{d}}{\mathrm{d}t}{M}_1(t)=-\left(\upgamma +c\right){M}_1+\sqrt{c}\tanh \left[\sqrt{c}{M}_1(t)\right]. $$
(125)

For arbitrary initial distributions u, solutions M1(t) exist and are smooth functions of t. Substituting these solutions into Eqs. (123) and (124), we see that solutions of Eqs. (123) and (124) in terms of Green's functions p and probability densities P exist as well. Consequently, the dynamic Takatsuji model belongs to the class of strongly nonlinear Fokker-Planck equations and describes a Markov process.
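Since Eq. (125) is a closed one-dimensional ODE for the mean, it can be integrated directly. The parameter choice γ = −0.5, c = 1 (an illustrative assumption) makes the mean bistable:

```python
import numpy as np

gamma, c = -0.5, 1.0                 # gamma < 0 makes Eq. (125) bistable
dt, steps = 1e-3, 40_000
M1 = 0.1                             # initial mean, determined by u
for _ in range(steps):
    # Euler step of Eq. (125): dM1/dt = -(gamma + c) M1 + sqrt(c) tanh(sqrt(c) M1)
    M1 += (-(gamma + c) * M1 + np.sqrt(c) * np.tanh(np.sqrt(c) * M1)) * dt
# M1 relaxes to a nonzero fixed point M* solving tanh(M*) = 0.5 M* here,
# i.e., M* ~ 1.92; the symmetric fixed point M = 0 is unstable
```

For γ > 0 the same integration relaxes to M1 = 0, reflecting the loss of the symmetry-breaking solutions.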

Since p(x, t|x′, t′, P′) depends on P′, the conditional mean value of X(t) for realizations that assume the value x′ at time t′ depends on the distribution of the ensemble at time t′. Let us illustrate this issue. The conditional mean value under consideration reads

$$ {\left\langle X(t)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}=\int xp\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)\;\mathrm{d}x. $$
(126)

Multiplying Eq. (123) with x and integrating with respect to x, we obtain

$$ \frac{\mathrm{d}}{\mathrm{d}t}{\left\langle X(t)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}=-\left(\upgamma +c\right){\left\langle X(t)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}+\sqrt{c}\tanh \left[\sqrt{c}{M}_1(t)\right]. $$
(127)

The solution reads

$$ {\left\langle X(t)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}={x}^{\prime}\exp \left\{-\left(\upgamma +c\right)\left(t-{t}^{\prime}\right)\right\}+\sqrt{c}{\int}_{t^{\prime}}^t\exp \left\{-\left(\upgamma +c\right)\left(t-z\right)\right\}\tanh \left[\sqrt{c}{M}_1(z)\right]\;\mathrm{d}z, $$
(128)

where M1(z) is the solution of Eq. (125) for the initial value M1(t′) = ∫Ω xP(x, t′; u) dx. Let I denote the integral \( I=\sqrt{c}{\int}_{t^{\prime}}^t\exp \left\{-\left(\upgamma +c\right)\left(t-z\right)\right\}\tanh \left[\sqrt{c}{M}_1(z)\right]\;\mathrm{d}z \). Then, I depends on M1(t′), c, γ, t′, and t: I = I(t, t′, M1(t′), c, γ). Consequently, Eq. (128) can be cast into the form

$$ {\left\langle X(t)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}={x}^{\prime}\exp \left\{-\left(\upgamma +c\right)\left(t-{t}^{\prime}\right)\right\}+I\left(t,{t}^{\prime },\int {x}^{\prime }P\left({x}^{\prime },{t}^{\prime };u\right)\;\mathrm{d}{x}^{\prime },c,\upgamma \right) $$
(129)

or

$$ {\left\langle X(t)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}={x}^{\prime}\exp \left\{-\left(\upgamma +c\right)\left(t-{t}^{\prime}\right)\right\}+I\left(t,{t}^{\prime },\left\langle X\left({t}^{\prime}\right)\right\rangle, c,\upgamma \right). $$
(130)

Equation (130) illustrates that in order to predict future conditional mean values of a Takatsuji process \( \hat{X} \) at times t, it is sufficient to have at one time t′ ≤ t information about the state value x′ of a realization of \( \hat{X} \) and the mean value ⟨X(t′)⟩ of all realizations of \( \hat{X} \).
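Equations (125) and (127) form a closed pair of ODEs for the ensemble mean M1(t) and the conditional mean. The sketch below integrates both with an Euler scheme (illustrative parameter values of ours, not taken from the text) and shows that realizations started at very different states x′ forget x′ at the rate γ + c and approach the same asymptotic mean:

```python
# Euler integration of the coupled pair (125) and (127): m1(t) is the
# ensemble mean, m(t) the conditional mean of realizations with X(t') = x'.
import math

def conditional_mean(x_prime, m1_0, gamma, c, dt=1e-3, t_end=50.0):
    """Return <X(t)>_{X(t')=x',P'} for t = t' + t_end via Eqs. (125), (127)."""
    m1, m = m1_0, x_prime
    for _ in range(int(t_end / dt)):
        drive = math.sqrt(c) * math.tanh(math.sqrt(c) * m1)
        m += dt * (-(gamma + c) * m + drive)    # Eq. (127)
        m1 += dt * (-(gamma + c) * m1 + drive)  # Eq. (125)
    return m

# Two realizations with different initial states x' but the same initial
# ensemble mean M1(t') = 0.1 (illustrative parameters of ours):
a = conditional_mean(x_prime=-2.0, m1_0=0.1, gamma=-0.5, c=1.0)
b = conditional_mean(x_prime=3.0, m1_0=0.1, gamma=-0.5, c=1.0)
```

Both runs converge to the same value, the asymptotic fixed point of Eq. (125), illustrating that the dependence on x′ decays exponentially while the dependence on the ensemble distribution persists.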

Note that the trajectories Z(t) of martingale processes \( \hat{Z} \) induced by the Takatsuji process \( \hat{X} \) are given by

$$ Z(t)=f\left(X(t)\right)-{\int}_0^t\mathrm{d}s\left\{\left[-\left(\upgamma +c\right)X(s)+\sqrt{c}\tanh \left(\sqrt{c}\left\langle X(s)\right\rangle \right)\right]\frac{\partial f}{\partial X(s)}+Q\frac{\partial^2f}{\partial {X}^2(s)}\right\} $$
(131)

for arbitrary smooth functions f. We can exploit these martingale processes in order to compute conditional expectations. For example, for f(y) = y, it follows from the martingale property (86) that

$$ {\left\langle X(t)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}={x}^{\prime }+{\int}_{t^{\prime}}^t\mathrm{d}s\left\{-\left(\upgamma +c\right){\left\langle X(s)\right\rangle}_{X\left({t}^{\prime}\right)={x}^{\prime },{P}^{\prime }}+\sqrt{c}\tanh \left[\sqrt{c}\left\langle X(s)\right\rangle \right]\right\}. $$
(132)

Differentiating this relation with respect to t, we obtain Eq. (127) again and so we can compute the conditional expectation (130).

Liquid Crystal Model

Liquid crystals exhibit nematic-isotropic phase transitions (Chandrasekhar 1977; de Gennes and Prost 1993; de Jeu 1980). At high temperatures, the liquid crystal macromolecules exhibit an orientational disorder. The liquid crystal is said to be in the isotropic phase. Below a critical temperature, the macromolecules show some degree of orientational order. The degree of orientational order is often measured by the Maier-Saupe order parameter S (Maier and Saupe 1958).

A nonlinear Fokker-Planck equation that describes the stochastic behavior of the liquid crystal in the isotropic and nematic phases and to a certain extent also describes the order-disorder phase transition was proposed by Doi and Edwards (Doi and Edwards 1988) and Hess (Hess 1976) and is shown above in Eq. (8). Equation (8) describes the random walk of the orientation of liquid crystal molecules, where the orientation is given by a vector x that points to the surface of a unit sphere. If we are dealing with rod-like molecules, then the orientation corresponds to the primary axis of the molecules along the rod. In particular, for liquid crystals with an axial symmetry, the liquid crystal model can be simplified. The simplified model describes the random walk of the alignment of the molecules with the symmetry axis. The random variable X is defined on Ω = [0, 1]. For the sake of simplicity, we will extend the range of definition to the interval Ω = [−1, 1] and require that distributions are symmetric. For X = 0, the molecule has an orientation perpendicular to the symmetry axis. If X = 1 or X = −1, the molecule points exactly in the direction of the symmetry axis. In this symmetric case, the probability density P of X satisfies (Felderhof 2003)

$$ \frac{\partial }{\partial t}P\left(x,t;u\right)=\frac{\partial }{\partial x}\left(1-{x}^2\right)\cdot \left[-\frac{9}{2}\kappa x\left(\int {x}^2P\left(x,t;u\right)\;\mathrm{d}x-\frac{1}{3}\right)+{D}_r\frac{\partial }{\partial x}\right]P\left(x,t;u\right) $$
(133)

with κ, Dr > 0. Equation (133) as well as the original Eq. (8) are regarded as descriptions for an ensemble of macromolecules that perform rotational Brownian motion (Doi and Edwards 1988). Since Brownian motion is a Markov process, it is reasonable to construct on the basis of Eq. (133) a model for a many-body system that exhibits a Markov process. In line with our discussion in section “Markov Property: Second-Order and Higher-Order Statistics,” we assume that the conditional probability density p satisfies (Frank 2005c)

$$ \frac{\partial }{\partial t}p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\frac{\partial }{\partial x}\left(1-{x}^2\right)\cdot \left[-\frac{9}{2}\kappa x\left(\int {x}^2P\left(x,t;u\right)\;\mathrm{d}x-\frac{1}{3}\right)+{D}_r\frac{\partial }{\partial x}\right]\cdot p\left(x,t|{x}^{\prime },{t}^{\prime },{P}^{\prime}\right). $$
(134)

Note that the expression in the bracket (⋅) is related to the Maier-Saupe order parameter, which in the symmetric case reads

$$ S(t)=\frac{1}{2}\left(3\int {x}^2P\left(x,t;u\right)\;\mathrm{d}x-1\right). $$
(135)

Since the random variable is confined to X ∈ [−1, 1], the order parameter S and consequently the bracket (⋅) are bounded. This implies that solutions P and p of Eqs. (133) and (134) exist and that the liquid crystal model (133)–(134) describes a Markov process. The self-consistent Ito-Langevin equation of this Markov process reads (Frank 2005c)

$$ \frac{\mathrm{d}}{\mathrm{d}t}X(t)=\frac{9\kappa }{2}\left(1-X{(t)}^2\right)X(t)\left(\left\langle X{(t)}^2\right\rangle -\frac{1}{3}\right)-2{D}_rX(t)+\sqrt{D_r\left(1-X{(t)}^2\right)}\Gamma (t). $$
(136)
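A minimal ensemble simulation of the self-consistent Langevin equation (136) can be sketched with an Euler-Maruyama scheme. We assume the noise convention ⟨Γ(t)Γ(t′)⟩ = 2δ(t − t′); the parameter values, ensemble size, and the clipping of X to [−1, 1] at the boundaries are illustrative choices of ours, not prescriptions from the text:

```python
# Euler-Maruyama sketch of the self-consistent Ito-Langevin equation (136),
# assuming <Gamma(t) Gamma(t')> = 2 delta(t - t'); parameters, ensemble size,
# and the clip at the boundaries |X| = 1 are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
kappa, D_r, dt = 10.0, 0.2, 1e-3
X = np.full(5000, 0.9)                    # strongly ordered initial ensemble

for _ in range(2000):
    m2 = np.mean(X ** 2)                  # self-consistent mean field <X^2>
    drift = 4.5 * kappa * (1 - X ** 2) * X * (m2 - 1 / 3) - 2 * D_r * X
    noise = np.sqrt(2 * D_r * (1 - X ** 2) * dt) * rng.standard_normal(X.size)
    X = np.clip(X + drift * dt + noise, -1.0, 1.0)

S = 0.5 * (3 * np.mean(X ** 2) - 1)       # Maier-Saupe order parameter (135)
```

For a strongly ordered initial ensemble and large κ, the order parameter S computed from Eq. (135) remains clearly positive, i.e., the nematic branch is reproduced.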

Trajectories Z(t) of martingale processes \( \hat{Z} \) of the liquid crystal model are defined by (Frank 2007)

$$ Z(t)=f\left(X(t)\right)-{\int}_0^t\mathrm{d}s\left[\cdot \right]f\left(X(s)\right) $$
(137)

with

$$ \left[\cdot \right]=\left[\left(\frac{9\kappa }{2}\left(1-X{(s)}^2\right)X(s)\left(\left\langle {\left[X(s)\right]}^2\right\rangle -\frac{1}{3}\right)-2{D}_rX(s)\right)\frac{\partial }{\partial X(s)}+{D}_r\frac{\partial^2}{\partial {X}^2(s)}\right]. $$
(138)

In the stationary case, the short-time autocorrelation function C(Δt) = ⟨X(t)X(t + Δt)⟩st reads (Frank 2005c)

$$ C\left(\Delta t\right)=\frac{2S+1}{3}-\frac{2{D}_r\left(1-S\right)}{3}\Delta t+O\left(\Delta {t}^2\right), $$
(139)

where S denotes the order parameter (see above) in the stationary case. That is, we have S = (3⟨X2⟩st − 1)/2. Consequently, C depends on S. This has important implications for the hysteresis loop of the nematic-isotropic phase transition. Let us assume that if we decrease the temperature of a liquid crystal, we find the transition from the isotropic to the nematic phase with S = 0 → S > 0 at the critical temperature Tc,low. In contrast, if we increase the temperature of a liquid crystal, we find the transition from the nematic to the isotropic phase with S > 0 → S = 0 at the slightly higher critical temperature Tc,high. Then, in the temperature interval [Tc,low, Tc,high], the liquid crystal exhibits two autocorrelation functions

$$ {C}_{\mathrm{isotrope}}\left(\Delta t\right)=\frac{1}{3}-\frac{2{D}_r}{3}\Delta t, $$
(140)
$$ {C}_{\mathrm{nem}}\left(\Delta t\right)=\frac{2S(T)+1}{3}-\frac{2{D}_r\left(1-S(T)\right)}{3}\Delta t $$
(141)

which hold up to terms of order Δt2. Equations (140)–(141) illustrate that we are dealing with a system that exhibits two kinds of Markov processes that we may label “isotropic” and “nematic,” respectively. The modeling approach by means of strongly nonlinear Fokker-Planck equations indicates that these Markov processes are just different members of a family of Markov processes that naturally emerge in the self-organized liquid crystal. That is, the two Markov processes are not related to two different systems but they represent two different “states” of the same self-organizing many-body system.
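The two branches (140)–(141) can be compared directly. The short helper functions below are a minimal sketch; the function names are ours, and the expressions are simply Eqs. (140) and (141) truncated at first order in Δt:

```python
# Short-time autocorrelation branches (140)-(141), valid up to O(dt**2) terms;
# the function names are illustrative.
def c_isotropic(dt, D_r):
    return 1 / 3 - 2 * D_r * dt / 3                      # Eq. (140)

def c_nematic(dt, D_r, S):
    return (2 * S + 1) / 3 - 2 * D_r * (1 - S) * dt / 3  # Eq. (141)
```

For S → 0 the nematic branch reduces to the isotropic one, so the two expressions join continuously; for S > 0 the nematic branch starts higher and decays more slowly.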

Let us compute the conditional mean value of molecules that are perpendicular to the symmetry axis. To this end, we consider the random walk of the orientation angle ϕ defined by X(t) = sin ϕ(t). Using the Stratonovich-Langevin equation of Eq. (134) (see Frank 2005c), we obtain a self-consistent Langevin equation for ϕ:

$$ \frac{\mathrm{d}}{\mathrm{d}t}\phi =\frac{9}{4}\kappa \sin \left(2\phi (t)\right)\left(\left\langle {\sin}^2\left(\phi (t)\right)\right\rangle -\frac{1}{3}\right)-{D}_r\tan \phi (t)+\sqrt{D_r}\Gamma (t). $$
(142)

For short time intervals Δt = t − t′ and appropriately small noise amplitudes Dr, we assume that ϕ(t) ≈ 0 if ϕ(t′) ≈ 0. Linearizing Eq. (142) at ϕ = 0 yields

$$ \frac{\mathrm{d}}{\mathrm{d}t}\phi (t)=\left\{\frac{9}{4}\kappa \left(\left\langle {\sin}^2\left(\phi (t)\right)\right\rangle -\frac{1}{3}\right)-{D}_r\right\}\phi (t)+\sqrt{D_r}\Gamma (t). $$
(143)

The conditional expectation value ∫ϕp(ϕ, t|ϕ′, t′, P′)dϕ for short time intervals Δt can then be computed from Eq. (143) by averaging both sides of Eq. (143) under the constraint that ϕ(t′) = ϕ′ and that ϕ(t′) is distributed like P′. Thus, we obtain

$$ {\left\langle \phi (t)\right\rangle}_{\phi \left({t}^{\prime}\right)={\phi}^{\prime },{P}^{\prime }}={\phi}^{\prime}\left[1+\Delta t\left\{\frac{9}{4}\kappa \left(\left\langle {\sin}^2\left(\phi \left({t}^{\prime}\right)\right)\right\rangle -\frac{1}{3}\right)-{D}_r\right\}\right] $$
(144)

or

$$ {\left\langle \phi (t)\right\rangle}_{\phi \left({t}^{\prime}\right)={\phi}^{\prime },{P}^{\prime }}={\phi}^{\prime}\left[1+\Delta t\left\{\frac{3}{2}\kappa S\left({t}^{\prime}\right)-{D}_r\right\}\right]. $$
(145)

These estimates hold for small intervals Δt, sufficiently small noise amplitudes Dr, and orientation angles ϕ′ ≈ 0. Again, in line with our general discussion in the preceding sections, we see that the conditional expectation ⟨ϕ(t)⟩ϕ(t′) = ϕ′,P′ can be computed provided that for t′ < t the distribution of ϕ(t′), or at least the order parameter S(t′), is known and the angle ϕ′ is selected.

Semiclassical Description of Quantum Systems

A stochastic treatment of semiclassical quantum systems by means of nonlinear Fokker-Planck equations that can be cast into the form of Eqs. (5) and (6) has been proposed and analyzed in several studies (Carrillo et al. 2008; Chavanis 2003; Frank and Daffertshofer 1999; Kadanoff 2000; Kaniadakis 2001a; Kaniadakis and Quarati 1993, 1994). Accordingly, a Fermi or Bose particle with mass 1 that moves in a one-dimensional space with velocity v exhibits in the stationary case a Fermi-Dirac or Bose-Einstein distribution of the kinetic energy Ekin = v2/2. The free diffusion of the particle can be described by the nonlinear Fokker-Planck equations (Frank and Daffertshofer 1999)

$$ \frac{\partial }{\partial t}P\left(v,t;u\right)=\frac{\partial }{\partial v}\upgamma v\left[1\mp P\left(v,t;u\right)\right]P\left(v,t;u\right)+Q\frac{\partial^2}{\partial {v}^2}P\left(v,t;u\right), $$
(146)

where the upper sign holds for Fermi particles, the lower for Bose particles. The parameters γ and Q represent damping and fluctuation strength and are related to the temperature T by the fluctuation-dissipation theorem Q/γ = kBT, where kB is the Boltzmann constant. The stationary probability density Pst(v) of Eq. (146) reads

$$ {P}_{\mathrm{st}}(v)=\frac{1}{\exp \left\{\left({E}_{\mathrm{kin}}-\mu \right)/\left({k}_BT\right)\right\}\pm 1}, $$
(147)

where μ is a normalization constant that can be interpreted as chemical potential. The transient solution P(v, t; u) can be obtained by solving the integral equation (Frank 2007; Meleard and Coppoletta 1987)

$$ P\left(v,t;u\right)={\int}_{\Omega}\mathrm{d}{v}_0{G}_B\left(t,{t}_0,v,{v}_0\right)u\left({v}_0,{t}_0\right)+{\int}_{t_0}^t\mathrm{d}s\cdot {\int}_{\Omega}\mathrm{d}{v}^{\prime }{G}_B\left(t,s,v,{v}^{\prime}\right)\upgamma \frac{\partial }{\partial {v}^{\prime }}\left\{{v}^{\prime}\left[1\mp P\left({v}^{\prime },s;u\right)\right]P\left({v}^{\prime },s;u\right)\right\} $$
(148)

with

$$ {G}_B\left(t,{t}^{\prime },v,{v}^{\prime}\right)=\frac{1}{\sqrt{2\pi Q\left(t-{t}^{\prime}\right)}}\exp \left\{-\frac{{\left(v-{v}^{\prime}\right)}^2}{2Q\left(t-{t}^{\prime}\right)}\right\}, $$
(149)

where GB is the Gaussian propagator of Brownian motion. In the limit t → ∞, the transient solution P(v, t; u) approaches Pst(v) (Frank and Daffertshofer 2001b; Kaniadakis 2001a). Equation (148) is a useful description for numerical approaches. Using partial integration, Eq. (148) can be written in the form

$$ P\left(v,t;u\right)={\int}_{\varOmega}\mathrm{d}{v}_0{G}_B\left(t,{t}_0,v,{v}_0\right)u\left({v}_0,{t}_0\right)+{\int}_{t_0}^t\mathrm{d}s{\int}_{\varOmega}\mathrm{d}{v}^{\prime }{G}_B\left(t,s,v,{v}^{\prime}\right)\gamma {v}^{\prime}\frac{{v}^{\prime }-v}{Q\left(t-s\right)}\cdot \left[1\mp P\left({v}^{\prime },s;u\right)\right]P\left({v}^{\prime },s;u\right). $$
(150)

This integral relation can be solved iteratively. In contrast to the iterative procedure discussed in section “Time-Dependent Solutions and First Order Statistics,” there is no need to compute derivatives. That is, we are dealing with some kind of path integral approach here that is similar to the numerical path integral approach involving short-time propagators; see section “Short-Time Propagator.”

In order to describe quantum particles that exhibit a Markov process, we may exploit the approach outlined in section “Markov Property: Second-Order and Higher-Order Statistics.” Accordingly, the Markov conditional probability density of the quantum particle satisfies

$$ \frac{\partial }{\partial t}p\left(v,t|{v}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\left\{\frac{\partial }{\partial v}\upgamma v\left[1\mp P\left(v,t;u\right)\right]+Q\frac{\partial^2}{\partial {v}^2}\right\}\cdot p\left(v,t|{v}^{\prime },{t}^{\prime },{P}^{\prime}\right), $$
(151)

and the self-consistent Langevin equation reads

$$ \frac{\mathrm{d}}{\mathrm{d}t}v(t)=-\upgamma v(t)\left(1\mp P\left(v(t),t;u\right)\right)+\sqrt{Q}\Gamma (t). $$
(152)

From a martingale perspective, we see that stochastic trajectories v(t) induce for arbitrary smooth functions f the martingale \( \hat{Z} \) with trajectories

$$ Z(t)=f\left(v(t)\right)-{\int}_0^t{\left.\mathrm{d}s\cdot \left[-\upgamma v(s)\left[1\mp P\left(v(s),s;u\right)\right]\frac{\partial }{\partial v}+Q\frac{\partial^2}{\partial {v}^2}\right]f(v)\right|}_{v=v(s)}. $$
(153)

Moreover, as far as the Markov short-time propagator is concerned, for small time intervals t = t′ + Δt, the propagator reads

$$ p\left(v,t|{v}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\sqrt{\frac{1}{2\pi Q\Delta t}}\exp \left\{-\frac{{\left[v-{v}^{\prime }+\Delta t\upgamma {v}^{\prime}\left[1\mp {P}^{\prime}\right]\right]}^2}{2Q\Delta t}\right\}, $$
(154)

and can be computed from the information about the distribution P′ of v(t′) and the state v′ that was observed for particular realizations of the process \( \hat{v} \). In the stationary case, the propagator p reads for small time intervals Δt = t − t′

$$ p\left(v,t|{v}^{\prime },{t}^{\prime },{P}_{\mathrm{st}}\left({v}^{\prime}\right)\right)=\sqrt{\frac{1}{2\pi Q\Delta t}}\exp \left\{-\frac{{\left[v-{v}^{\prime }+\Delta t\upgamma {v}^{\prime}\left[1\mp {P}_{\mathrm{st}}\left({v}^{\prime}\right)\right]\right]}^2}{2Q\Delta t}\right\} $$
(155)

with Pst defined by Eq. (147).
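The short-time propagator (154) also suggests a simple numerical scheme: the density is advanced by repeatedly convolving it with the Gaussian propagator, in the spirit of the path integral approach mentioned above. The sketch below treats the Fermi case (upper sign); the grid, time step, initial density, and renormalization step are illustrative choices of ours:

```python
# Grid-based evolution of P(v,t) for the Fermi case (upper sign) by repeated
# application of the Gaussian short-time propagator (154).
import numpy as np

v = np.linspace(-5.0, 5.0, 201)
dv = v[1] - v[0]
gam, Q, dt = 1.0, 0.2, 0.05

P = np.exp(-(v - 1.5) ** 2)      # initial density u(v), an off-center Gaussian
P /= np.sum(P) * dv              # normalize on the grid

for _ in range(200):
    # drifted mean of the propagator for each source point v'
    mean = v - dt * gam * v * (1.0 - P)
    kernel = np.exp(-(v[:, None] - mean[None, :]) ** 2 / (2.0 * Q * dt))
    kernel /= np.sqrt(2.0 * np.pi * Q * dt)
    P = kernel @ (P * dv)        # P(v,t+dt) = int p(v,t+dt|v',t,P') P(v') dv'
    P /= np.sum(P) * dv          # counteract discretization leakage

mean_v = np.sum(v * P) * dv      # relaxes toward 0 (symmetric stationary state)
```

The asymmetric initial density relaxes toward a symmetric profile around v = 0, consistent with the symmetric stationary solution (147).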

Nonextensive Systems

Nonextensive thermostatistical systems have been related to the Tsallis entropy (Abe and Okamoto 2001; Tsallis 1988)

$$ {S}_q=\frac{1}{q-1}{\int}_{\Omega}\left[P(v)-P{(v)}^q\right]\;\mathrm{d}v, $$
(156)

where q measures the degree of nonextensivity. Diffusion processes in nonextensive thermostatistical systems can be regarded as generalized Ornstein-Uhlenbeck processes that satisfy the nonlinear Fokker-Planck equation (Plastino and Plastino 1995) (see also Borland 1998; Chavanis 2003, 2004; Compte and Jou 1996; Drazer et al. 2000; Frank and Daffertshofer 1999, 2000; Shiino 2003; Tsallis and Bukman 1996)

$$ \frac{\partial }{\partial t}P\left(v,t;u\right)=\frac{\partial }{\partial v}\upgamma vP\left(v,t;u\right)+Q\frac{\partial^2}{\partial {v}^2}P{\left(v,t;u\right)}^q, $$
(157)

where v is the velocity of a particle with mass 1 that moves in one spatial dimension. In the asymptotic domain, P(v, t; u) approaches a stationary Tsallis distribution

$$ {P}_{\mathrm{st}}(v)=\frac{D_{\mathrm{st}}}{{\left[1+\upgamma \left(1-q\right){v}^2/\left[2{qQD}_{\mathrm{st}}^{q-1}\right]\right]}^{1/\left(1-q\right)}} $$
(158)

for q ∈ (1/3, 1) with \( {D}_{\mathrm{st}}={\left[\upgamma /\left(2 qQ\right){z}_q^2\right]}^{1/\left(1+q\right)} \) and \( {z}_q=\sqrt{\pi /\left(1-q\right)}\Gamma \left[\left(1+q\right)/\left[2\left(1-q\right)\right]\right]/\Gamma \left[1/\left(1-q\right)\right] \) (Frank and Daffertshofer 2000). The process described by Eq. (157) is said to evolve in a nonextensive thermodynamic framework because its stationary probability density (158) maximizes the entropy measure (156). As discussed in section “Markov Property: Second-Order and Higher-Order Statistics,” the Markov conditional probability density p satisfies

$$ \frac{\partial }{\partial t}p\left(v,t|{v}^{\prime },{t}^{\prime },{P}^{\prime}\right)=\frac{\partial }{\partial v}\upgamma v\;p\left(v,t|{v}^{\prime },{t}^{\prime },{P}^{\prime}\right)+Q\frac{\partial^2}{\partial {v}^2}P{\left(v,t;u\right)}^{q-1}p\left(v,t|{v}^{\prime },{t}^{\prime },{P}^{\prime}\right). $$
(159)

The self-consistent Langevin equation of the Markov diffusion process reads (see also Borland 1998)

$$ \frac{\mathrm{d}}{\mathrm{d}t}v(t)=-\upgamma v(t)+\sqrt{QP{\left(v(t),t;u\right)}^{q-1}}\Gamma (t). $$
(160)

Any stochastic path v(t) computed from Eq. (160) yields for arbitrary smooth functions f the martingale

$$ Z(t)=f\left(v(t)\right)-{\int}_0^t\mathrm{d}s\left[-\upgamma v(s)\frac{\partial }{\partial v(s)}+QP{\left(v(s),s;u\right)}^{q-1}\frac{\partial^2}{\partial v{(s)}^2}\right]f\left(v(s)\right). $$
(161)

The autocorrelation C = ⟨v(t)v(t′)⟩ in the transient domain for u(v) = δ(v − v0) reads (Frank 2004a)

$$ C\left(t,{t}^{\prime },{v}_0,{t}_0\right)={M}_2\left({t}^{\prime },{t}_0,{v}_0\right)\exp \left\{-\upgamma \left(t-{t}^{\prime}\right)\right\} $$
(162)

with

$$ {\displaystyle \begin{array}{ll}{M}_2\left({t}^{\prime },{t}_0,{v}_0\right)& =K\left({t}^{\prime },{t}_0\right)+{M}_1^2\left({t}^{\prime },{t}_0,{v}_0\right),\\ {}K\left({t}^{\prime },{t}_0\right)& =\frac{1}{3q-1}{\left[\frac{2 qQ{\left[{z}_q\right]}^{\left(1-q\right)}}{\upgamma}\cdot \left(1-\exp \left\{-\left(1+q\right)\upgamma \left({t}^{\prime }-{t}_0\right)\right\}\right)\right]}^{2/\left(1+q\right)},\\ {}{M}_1\left({t}^{\prime },{t}_0,{v}_0\right)& ={v}_0\exp \left\{-\upgamma \left({t}^{\prime }-{t}_0\right)\right\}.\end{array}} $$
(163)

The autocorrelation function C depends on t0. This is not in contradiction with the Markov property of the underlying process as discussed in section “Markov Property: Second-Order and Higher-Order Statistics.” In particular, we may eliminate the initial condition. Then, Eq. (162) reads

$$ C\left(t,{t}^{\prime}\right)=\left\langle {v}^2\left({t}^{\prime}\right)\right\rangle \exp \left\{-\upgamma \left(t-{t}^{\prime}\right)\right\} $$
(164)

and holds for arbitrary initial probability densities u.
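The stationary Tsallis density (158) can be checked numerically. The sketch below reads the normalization constant as Dst = [γ/(2qQ zq²)]^{1/(1+q)} (the grouping that normalizes Pst), evaluates Pst on a grid, and verifies that it is normalized and annihilates the right-hand side of Eq. (157); the parameter values are illustrative choices of ours:

```python
# Numerical check that the Tsallis density (158) is stationary under Eq. (157).
import numpy as np
from math import gamma as Gamma, pi, sqrt

q, gam, Q = 0.5, 1.0, 0.5
z_q = sqrt(pi / (1 - q)) * Gamma((1 + q) / (2 * (1 - q))) / Gamma(1 / (1 - q))
D_st = (gam / (2 * q * Q * z_q ** 2)) ** (1 / (1 + q))  # normalizing constant

v = np.linspace(-60.0, 60.0, 24001)
dv = v[1] - v[0]
P_st = D_st / (1 + gam * (1 - q) * v ** 2
               / (2 * q * Q * D_st ** (q - 1))) ** (1 / (1 - q))

norm = np.sum(P_st) * dv
# right-hand side of Eq. (157) evaluated at P_st; it should vanish
rhs = np.gradient(gam * v * P_st, v) + Q * np.gradient(np.gradient(P_st ** q, v), v)
```

For q = 1/2 the density has power-law tails ∝ v⁻⁴, so the truncation of the velocity axis introduces only a small normalization error.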

Linear Nonequilibrium Thermodynamics

Linear and nonlinear Fokker-Planck equations alike can be approached from the principles of linear nonequilibrium thermodynamics (de Groot and Mazur 1962; Glansdorff and Prigogine 1971; Kondepudi and Prigogine 1998). For stochastic processes to which linear nonequilibrium thermodynamics applies, the probability density P(x, t; u) of a process evolves such that the free energy functional F[P] decreases as a function of time t. More precisely, following a study by Compte and Jou (Compte and Jou 1996), it has been proposed that P satisfies nonlinear Fokker-Planck equations of the form (Chavanis 2004; Frank 2002a, 2005b; Scarfone and Wada 2007)

$$ \frac{\partial }{\partial t}P=\frac{\partial }{\partial x}P\tilde{M}\frac{\partial }{\partial x}\frac{\delta F}{\delta P}, $$
(165)

where \( \tilde{M} \) is an appropriately defined mobility coefficient and δF/δP denotes the variational derivative of F. Note that this thermodynamic approach is closely related to the GENERIC approach developed in Espanol et al. (1999), Jelic et al. (2006), Öttinger (2005, 2007), and Öttinger and Grmela (1997).

For example, the Desai-Zwanzig model can be expressed in terms of Eq. (165) for \( \tilde{M}=1 \) and (Frank 2005b; Shiino 1987)

$$ F=\left\langle V\right\rangle +{U}_{\mathrm{MF}}-{QS}_{\mathrm{BGS}}. $$
(166)

Here, V is the potential of the force h (i.e., we have V(x) = − ∫h(x) dx), UMF is the mean field energy given by UMF = −κσ2/2 (where σ2 is the variance of the process), and SBGS is the Boltzmann-Gibbs-Shannon entropy

$$ {S}_{\mathrm{BGS}}=-{\int}_{\Omega}P\left(x,t;u\right)\ln P\left(x,t;u\right)\;\mathrm{d}x, $$
(167)

where we have put the Boltzmann constant equal to unity. The liquid crystal model (8) can be written as Eq. (165) with (Frank 2005c)

$$ F=-\frac{\kappa }{2}{S}^2-{D}_r{S}_{\mathrm{BGS}}, $$
(168)

where S is the Maier-Saupe order parameter (135). We have \( \tilde{M}=1-{x}^2 \). Moreover, the expression −κS2/2 is the Maier-Saupe mean field energy. The Kuramoto-Shinomoto-Sakaguchi model (10) can equivalently be expressed in terms of Eq. (165) with \( \tilde{M}=1 \) (see Sect. 5.4 in Frank 2005b) using

$$ F=\left\langle V\right\rangle -\frac{\kappa }{2}{r}^2-{QS}_{\mathrm{BGS}}, $$
(169)

where r is the cluster amplitude defined by r = |⟨exp{−iX(t)}⟩|. Here, the expression −κr2/2 is a measure for the mean field energy among the phase oscillators described by the model. The Takatsuji model (124) involves a constant mobility coefficient \( \tilde{M}=1 \) and the free energy functional (Frank 2005b)

$$ F=\frac{\upgamma +c}{2}\left\langle {X}^2\right\rangle -\ln \cosh \left(\sqrt{c}\left\langle X\right\rangle \right)-{QS}_{\mathrm{BGS}}. $$
(170)
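As a consistency check, the mean-field part of the variational derivative of the free energy (170) should reproduce the drift of the Takatsuji equation (124). A short sympy sketch, where M stands for ⟨X⟩ and the entropy term (which yields the diffusion term) is omitted:

```python
# sympy check: the mean-field part of dF/dP for the free energy (170)
# reproduces the Takatsuji drift -(gamma + c) x + sqrt(c) tanh(sqrt(c) M).
import sympy as sp

x, M, c, g = sp.symbols('x M c gamma', positive=True)
# <X^2> contributes x**2 to the variational derivative; the term
# -ln cosh(sqrt(c) <X>) contributes -sqrt(c) tanh(sqrt(c) M) * x
dF_dP = (g + c) * x ** 2 / 2 - sp.diff(sp.log(sp.cosh(sp.sqrt(c) * M)), M) * x
drift = -sp.diff(dF_dP, x)
expected = -(g + c) * x + sp.sqrt(c) * sp.tanh(sp.sqrt(c) * M)
```

The symbolic difference between `drift` and `expected` simplifies to zero, confirming that the ln cosh term generates the tanh mean-field force.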

The Plastino-Plastino model (14) related to the nonextensive Tsallis entropy (156) is given by Eq. (165) and \( \tilde{M}=1 \) with (Frank 2005b)

$$ F=\left\langle V\right\rangle -{QS}_q, $$
(171)

where V is the potential of the gradient force h. For an appropriate choice of \( \tilde{M} \), the quantum mechanical nonlinear Fokker-Planck equations (146) can be cast into the form Eq. (165) with

$$ F=\left\langle V\right\rangle -{QS}_{\mathrm{FD},\mathrm{BE}}, $$
(172)

where SFD,BE is the quantum mechanical entropy of the Fermi-Dirac or Bose-Einstein statistics and V is the potential of the function h(x) again. For details, see Frank (2005b) and Frank and Daffertshofer (1999). From the perspective of linear nonequilibrium thermodynamics, linear and nonlinear Fokker-Planck equations can be distinguished by means of the thermodynamic flux (Compte and Jou 1996; Frank 2002a, 2005b, 2007)

$$ J=-\tilde{M}P\frac{\partial }{\partial x}\frac{\delta F}{\delta P}. $$
(173)

Note that in this approach, the thermodynamic flux is equivalent to the probability current (Frank 2005b). As can be seen from Eq. (173), on the one hand, the flux is associated with the free energy F. On the other hand, from the evolution equation (165), it follows that

$$ \frac{\partial }{\partial t}P=-\frac{\partial }{\partial x}J. $$
(174)

If J is linear with respect to P, then the corresponding Fokker-Planck equation is linear with respect to P as well. If J is nonlinear with respect to P, then we are dealing with a nonlinear Fokker-Planck equation. The question whether J is linear or not is answered by nature herself (Frank 2007). For the Brownian particle motion, we have \( \tilde{M}=\upgamma \) and

$$ F=\frac{1}{2}\left\langle {v}^2\right\rangle -{QS}_{\mathrm{BGS}}, $$
(175)

which yields

$$ J=-\upgamma vP-Q\frac{\partial }{\partial v}P. $$
(176)

J is linear and the corresponding Fokker-Planck equation is linear as well. For a self-organizing system, frequently it is found that J is nonlinear because F involves a mean field energy term that is nonlinear with respect to P. For examples, see Eqs. (166), (168), (169), and (170). Likewise, for quantum and nonextensive systems, we find that J is nonlinear because F involves quantum and nonextensive entropies such as SFD,BE and Sq. For examples, see Eqs. (171) and (172).
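The linearity criterion can be made concrete symbolically. In the sketch below, P and ∂P/∂x at a fixed point x are treated as independent symbols p and dp, and Vx stands for V′(x); the flux derived from a free energy with the Boltzmann-Gibbs-Shannon entropy scales linearly under (p, dp) → λ(p, dp), whereas the flux derived from the Tsallis entropy (156) does not:

```python
# sympy sketch of the linearity criterion for the flux (173): P and dP/dx at a
# point are treated as independent symbols p and dp; Vx stands for V'(x).
import sympy as sp

p, dp, Vx, Q, q, lam = sp.symbols('p dp Vx Q q lam', positive=True)

# F = <V> - Q*S_BGS gives d/dx(dF/dP) = Vx + Q*dp/p, hence J = -p*Vx - Q*dp
J_bgs = -p * (Vx + Q * dp / p)
# F = <V> - Q*S_q gives d/dx(dF/dP) = Vx + Q*q*p**(q-2)*dp
J_tsallis = -p * (Vx + Q * q * p ** (q - 2) * dp)

scale = {p: lam * p, dp: lam * dp}
bgs_residual = sp.simplify(J_bgs.subs(scale) - lam * J_bgs)
tsallis_residual = sp.simplify(J_tsallis.subs(scale) - lam * J_tsallis)
```

The residual vanishes in the Boltzmann-Gibbs-Shannon case (a linear flux and hence a linear Fokker-Planck equation) but not in the Tsallis case, in line with the nonlinear diffusion term of Eq. (157).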

Summary and Future Directions

From the previous discussion in section “Linear Nonequilibrium Thermodynamics,” it is clear that modeling approaches based on nonlinear Fokker-Planck equations are rooted in the theory of collective phenomena and self-organization, on the one hand, and in the theory of quantum mechanical and nonextensive systems, on the other. In contrast, linear Fokker-Planck equations are tailored to address the stochastic properties of systems composed of noninteracting subsystems when equating the material subsystem ensemble with the ensemble of statistical realizations. Note that – of course – linear Fokker-Planck equations can also be applied to discuss stochastic properties of self-organizing systems. However, in such cases, either the stochastic behavior of order parameters is discussed by means of low-dimensional linear Fokker-Planck equations (Haken 2004), or high-dimensional or even functional linear Fokker-Planck equations are involved (Gardiner 1997).

We showed in section “Markov Property: Second-Order and Higher-Order Statistics” that both linear and nonlinear Fokker-Planck equations exhibit Green’s functions and Langevin equations. The fact that a nonlinear evolution equation can give rise to a Green’s function may be counterintuitive because Green’s functions are associated with linearity. In fact, the evolution equation of the Green’s function p is linear with respect to p. The nonlinearity is in the evolution equation for the time-dependent probability density P but not in the evolution equation of the Green’s function p. In this context, we would like to reiterate what we pointed out in section “Strongly Nonlinear Fokker-Planck Equations”: time-dependent solutions P do not necessarily correspond to Green’s functions p (Frank 2003b).

In the mathematical literature, the theory of Markov processes that involve conditional probability densities of the form p(x, t|x′, t′, P′) has been discussed for several decades (see references in section “Definition of the Subject”). In line with these studies, we suggest referring to Markov processes with conditional probability densities of the form p(x, t|x′, t′, P′) as nonlinear Markov processes or nonlinear families of Markov processes (Frank 2004d). Likewise, we suggest referring to Markov processes whose conditional probability densities do not depend on P′ as linear Markov processes or linear families of Markov processes. Using this terminology, we would say that strongly nonlinear Fokker-Planck equations describe nonlinear Markov diffusion processes, and vice versa nonlinear Markov diffusion processes can be expressed in terms of strongly nonlinear Fokker-Planck equations.

In physics and related disciplines, the relevance of nonlinear Markov processes has to be explored in the future. That is, except for research primarily reported in the mathematical literature, the theory of Markov processes constructed from conditional probability measures of the form p(x, t|x′, t′, P′) is still in its infancy. The Chapman-Kolmogorov equation and the Kramers-Moyal expansion presented in section “Markov Property: Second-Order and Higher-Order Statistics” provide promising departure points for future studies in this regard.

In the present study, we pointed out that there are a few overarching concepts that apply to linear and nonlinear Fokker-Planck equations alike: the concepts of Markov diffusion processes, martingales, and linear nonequilibrium thermodynamics. Therefore, future studies may change the state of the art illustrated in Fig. 1 into a scenario as shown in Fig. 2. In doing so, a closely connected world of linear and nonlinear Fokker-Planck equations that is governed by a small set of powerful principles could emerge.

Fig. 2
figure 2

Connected applications of linear and nonlinear Fokker-Planck equations