1 Introduction

Under a Markov approximation, the evolution of an open quantum system \(\mathcal {S}\) in interaction with an environment \(\mathcal {E}\) is described by the Gorini–Kossakowski–Sudarshan–Lindblad (GKSL) master equation [23, 30]. More precisely, assuming that the system is described by the Hilbert space \({\mathbb {C}}^k\), the set of its states is defined as the set \(\mathcal {D}_k\) of density matrices, i.e., positive semidefinite matrices with trace one:

$$\begin{aligned} \mathcal {D}_k=\{\rho \in \mathrm {M}_{k}({\mathbb {C}}) \text{ s.t. } \rho \ge 0,{\text {tr}}\rho =1\}. \end{aligned}$$

The evolution \(t\in {\mathbb {R}}_+\mapsto {\bar{\rho }}_t\in \mathcal {D}_k\) of states of the system is then determined by the GKSL equation (also called quantum master equation):

$$\begin{aligned} \mathrm {d}{\bar{\rho }}_t=\mathcal {L}({\bar{\rho }}_t)\,\mathrm {d}t,\quad {\bar{\rho }}_0\in \mathcal {D}_k, \end{aligned}$$
(1.1)

where \(\mathcal {L}\) is a linear operator on \(\mathrm {M}_{k}({\mathbb {C}})\) of the form

$$\begin{aligned} \mathcal {L}: \rho \mapsto -\mathrm {i}[H,\rho ]+\sum _{i\in I}\big ( V_i\rho V_i^*-\genfrac{}{}{}1{1}{2}\{V_i^*V_i,\rho \}\big ), \end{aligned}$$
(1.2)

with I a finite set, \(H\in \mathrm {M}_{k}({\mathbb {C}})\) self-adjoint, and \(V_i\in \mathrm {M}_{k}({\mathbb {C}})\) for each \(i\in I\) (\([\cdot ,\cdot ]\) and \(\{\cdot ,\cdot \}\) are, respectively, the commutator and anticommutator). Such an \(\mathcal {L}\) is called a Lindblad operator.
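For readers who wish to experiment numerically, \(\mathcal {L}\) can be represented as a \(k^2\times k^2\) matrix acting on vectorized density matrices. The following Python sketch (ours; the qubit example and the row-major vectorization convention are our choices, not part of the formalism) checks two spectral facts recalled in the next paragraph: \({\text {tr}}\mathcal {L}(\rho )=0\) for all \(\rho \), and \({\text {spec}}\,\mathcal {L}\) lies in the closed left half-plane with \(0\in {\text {spec}}\,\mathcal {L}\).

```python
import numpy as np

def lindblad_matrix(H, Vs):
    """k^2 x k^2 matrix of the Lindblad operator (1.2) acting on row-major
    vectorized density matrices, using vec(A rho B) = (A kron B^T) vec(rho)."""
    k = H.shape[0]
    I = np.eye(k)
    L = -1j * (np.kron(H, I) - np.kron(I, H.T))           # -i[H, .]
    for V in Vs:
        VdV = V.conj().T @ V
        L += np.kron(V, V.conj())                          # V . V*
        L -= 0.5 * (np.kron(VdV, I) + np.kron(I, VdV.T))   # -(1/2){V*V, .}
    return L

# qubit example: H = sigma_z, one decay channel V = sigma_-
H = np.array([[1.0, 0.0], [0.0, -1.0]])
V = np.array([[0.0, 1.0], [0.0, 0.0]])
Lmat = lindblad_matrix(H, [V])
```

Here `np.eye(2).reshape(-1)` is the vectorized identity, so trace preservation reads `vec(Id) @ Lmat == 0`.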

Since \(\mathcal {L}\) is linear, \(t\mapsto {\bar{\rho }}_t\) is given by \({\bar{\rho }}_t=\mathrm {e}^{t\mathcal {L}}({\bar{\rho }}_0)\). The flow is therefore a semigroup \((\mathrm {e}^{t\mathcal {L}})_t\), which consists of completely positive, trace-preserving maps (see [38]). In particular, \(\mathcal {L}\) is the generator of a semigroup of contractions, thus \({\text {spec}}\mathcal {L}\subset \{\lambda \in {\mathbb {C}}\ \text{ s.t. }\ {\text {Re}}\lambda \le 0\}\). Since \(\mathrm {e}^{t\mathcal {L}}\) is trace preserving, \(0\in {\text {spec}}\mathcal {L}\). The following assumption is equivalent to the simplicity of the eigenvalue 0 [38, Proposition 7.6]:

(\(\mathcal {L}\)-erg)::

There exists a unique nonzero minimal orthogonal projection \(\pi \) such that \(\mathcal {L}(\pi \mathrm {M}_{k}({\mathbb {C}})\pi )\subset \pi \mathrm {M}_{k}({\mathbb {C}})\pi \).

Assumption (\(\mathcal {L}\)-erg) implies directly that there exists a unique \(\rho _{\mathrm {inv}}\in \mathcal {D}_k\) such that \(\mathcal {L}\rho _{\mathrm {inv}}=0\). Moreover, one can show that (\(\mathcal {L}\)-erg) implies the existence of \(\lambda >0\) such that for any \(\rho \in \mathcal {D}_k\), \(\mathrm {e}^{t\mathcal {L}}(\rho )=\rho _{\mathrm {inv}}+O(\mathrm {e}^{-\lambda t})\) (see [38, Proposition 7.5]).

The above framework generalizes that of continuous-time Markov semigroups on a finite number of sites: density matrices \(\rho \) over \({\mathbb {C}}^k\) generalize probability distributions over k classical states, while Lindbladians \(\mathcal {L}\) generalize generators of Markov jump processes. In Sect. 6.4, we show how a classical finite-state Markov jump process can be encoded in the present formalism.
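To anticipate that encoding, here is a minimal sketch (ours; the encoding of Sect. 6.4 may differ in its details) in which each classical transition \(n\rightarrow m\) with rate \(Q_{nm}\) is represented by the jump operator \(\sqrt{Q_{nm}}\,|m\rangle \langle n|\) and \(H=0\); on diagonal states, \(\mathcal {L}\) then reproduces the classical forward equation \(\dot{p}=Q^{\mathsf {T}}p\).

```python
import numpy as np

def apply_lindblad(H, Vs, rho):
    """Evaluate L(rho) as in (1.2)."""
    out = -1j * (H @ rho - rho @ H)
    for V in Vs:
        out += V @ rho @ V.conj().T \
             - 0.5 * (V.conj().T @ V @ rho + rho @ V.conj().T @ V)
    return out

def classical_jump_operators(Q):
    """One jump operator sqrt(Q[n, m]) |m><n| per transition n -> m of a
    classical generator Q (nonnegative off-diagonal rates, zero row sums)."""
    k = Q.shape[0]
    Vs = []
    for n in range(k):
        for m in range(k):
            if m != n and Q[n, m] > 0:
                V = np.zeros((k, k))
                V[m, n] = np.sqrt(Q[n, m])
                Vs.append(V)
    return Vs

# two-state chain with rates 2 (state 0 -> 1) and 3 (state 1 -> 0)
Q = np.array([[-2.0, 2.0], [3.0, -3.0]])
Vs = classical_jump_operators(Q)
p = np.array([0.4, 0.6])
# on the diagonal state diag(p), L(diag(p)) = diag(Q^T p)
Lrho = apply_lindblad(np.zeros((2, 2)), Vs, np.diag(p).astype(complex))
```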

The family \(({\bar{\rho }}_t)_t\) describes the reduced evolution of the system \(\mathcal {S}\) when coupled to an environment \(\mathcal {E}\) in a conservative manner. This evolution can be derived by considering the full Hamiltonian of \(\mathcal {S}+\mathcal {E}\) in relevant limiting regimes, e.g., the weak coupling or fast repeated interactions regimes, and tracing out the environment degrees of freedom (see [17, 18] and [1], respectively). It can also be described by a stochastic unraveling, i.e., a stochastic process \((\rho _t)_t\) with values in \(\mathcal {D}_k\) such that the expectation \({\overline{\rho }}_t\) of \(\rho _t\) satisfies (1.1); this method was developed in [4,5,6]. One possible choice of a stochastic unraveling is described by the following stochastic differential equation (SDE), called a stochastic master equation:

$$\begin{aligned} \begin{aligned} \mathrm {d}\rho _t=&\mathcal {L}(\rho _{t-})\,\mathrm {d}t\\&+\quad \sum _{i\in I_b}\Big (L_i\rho _{t-}+\rho _{t-} L_i^*-{\text {tr}}\big (\rho _{t-}(L_i+L_i^*)\big )\rho _{t-}\Big )\,\mathrm {d}B_i(t)\\&+\quad \sum _{j\in I_p} \Big (\frac{C_j\rho _{t-}C_j^*}{{\text {tr}}(C_j\rho _{t-}C_j^*)}-\rho _{t-}\Big )\Big (\mathrm {d}N_j(t)-{\text {tr}}(C_j\rho _{t-} C_j^*)\,\mathrm {d}t\Big ), \end{aligned} \end{aligned}$$
(1.3)

where

  • \(I=I_b\cup I_p\) is a partition of I such that \(L_i=V_i\) for \(i\in I_b\) and \(C_j=V_j\) for \(j\in I_p\),

  • each \(B_i\) is a Brownian motion,

  • each \(N_j\) is a counting process with stochastic intensity \(t\mapsto {\text {tr}}(C_j\rho _{t-}C_j^*)\), i.e., with compensator \(t\mapsto \int _0^t {\text {tr}}(C_j\rho _{s-}C_j^*)\,\mathrm {d}s\).
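A naive way to simulate Eq. (1.3) is an Euler scheme in which \(\mathrm {d}B_i\) is a centered Gaussian of variance \(\mathrm {d}t\) and \(\mathrm {d}N_j\) is approximated by a Bernoulli variable of parameter \({\text {tr}}(C_j\rho C_j^*)\,\mathrm {d}t\) (thinning). The sketch below is ours (operators and step size are arbitrary choices, and this is not a structure-preserving integrator); note that each increment is traceless and Hermitian, so the scheme preserves trace and Hermiticity exactly, though not positivity.

```python
import numpy as np

def sme_euler_step(rho, H, Ls, Cs, dt, rng):
    """One naive Euler step of the stochastic master equation (1.3)."""
    Vs = Ls + Cs
    drho = -1j * (H @ rho - rho @ H)
    for V in Vs:
        drho += V @ rho @ V.conj().T \
              - 0.5 * (V.conj().T @ V @ rho + rho @ V.conj().T @ V)
    drho *= dt                                   # L(rho) dt
    for L in Ls:                                 # diffusive channels
        v = np.trace((L + L.conj().T) @ rho).real
        dB = rng.normal(0.0, np.sqrt(dt))
        drho += (L @ rho + rho @ L.conj().T - v * rho) * dB
    for C in Cs:                                 # jump channels
        n = np.trace(C @ rho @ C.conj().T).real
        dN = float(rng.random() < n * dt)        # Bernoulli thinning of dN
        if n > 1e-12:
            drho += (C @ rho @ C.conj().T / n - rho) * (dN - n * dt)
    return rho + drho

rng = np.random.default_rng(0)
H = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
Ls = [np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)]   # sigma_-
Cs = [np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)]  # sigma_z
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
for _ in range(200):
    rho = sme_euler_step(rho, H, Ls, Cs, 0.005, rng)
```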

Remark 1

The processes \(\big (B_j(t)\big )_t\) and \(\left( N_j(t)-\int _0^t {\text {tr}}(C_j\rho _{s-}C_j^*)\mathrm {d}s\right) _t\) are actually martingales. Then, assuming that (1.3) admits a solution, it is easy to check that for any \(t\ge 0\), the expectation of \(\rho _t\) is equal to \({\bar{\rho }}_t\) whenever \(\rho _0={\bar{\rho }}_0\).

Proper definitions of these Poisson processes and proofs of existence and uniqueness of the solution to (1.3) can be found in [5, 6, 33,34,35]. A solution \((\rho _t)_t\) of Equation (1.3) is called a quantum trajectory.

Equations of the form (1.3) are used to model experiments in quantum optics (photodetection, heterodyne or homodyne interferometry), particularly for measurement and control (see [15, 24, 37]). They were also introduced as stochastic collapse models (see [19, 22]) and as numerical tools to compute \({\overline{\rho }}_t\) (see [16]). Here, we are interested in the fact that they model the evolution of the system \(\mathcal {S}\) when continuous measurements are done on the environment \(\mathcal {E}\). This can be shown starting from quantum stochastic differential equations using quantum filtering [3, 10, 13, 21, 25]. An approach using the notion of a priori and a posteriori states has also been developed using “classical” stochastic calculus (see the reference book by Barchielli and Gregoratti [5], and references therein). Continuous-time limits of discrete-time models can also be considered, see [33,34,35].

Equation (1.3) has the property that if \(\rho _0\) is an extreme point of \(\mathcal {D}_k\), then \({\rho }_t\) is almost surely an extreme point of \(\mathcal {D}_k\) for any \(t\in {\mathbb {R}}_+\). Since we will extensively use this property, let us make it explicit. The extreme points of \(\mathcal {D}_k\) are the rank 1 orthogonal projectors of \({\mathbb {C}}^k\); for any \(x\in {\mathbb {C}}^k{\setminus }\{0\}\), let \({\hat{x}}\) be its equivalence class in \(\mathrm {P}{\mathbb {C}}^{k}\), the projective space of \({\mathbb {C}}^k\). For \({\hat{x}}\in \mathrm {P}{\mathbb {C}}^{k}\), let \(\pi _{{\hat{x}}}\) be the orthogonal projector onto \({\mathbb {C}}x\). Then, \({\hat{x}}\in \mathrm {P}{\mathbb {C}}^{k}\mapsto \pi _{{\hat{x}}}\) is a bijective map from \(\mathrm {P}{\mathbb {C}}^{k}\) to the set of extreme points. Assume now that \(\rho _0=\pi _{{\hat{x}}_0}\) for some \({\hat{x}}_0\in \mathrm {P}{\mathbb {C}}^{k}\). Then, it is easy to check that \(\rho _t=\pi _{{\hat{x}}_t}\) almost surely for any \(t\in {\mathbb {R}}_+\), with \(t\mapsto x_t\) the unique solution to the following SDE, called a stochastic Schrödinger equation:

$$\begin{aligned} \begin{aligned} \mathrm {d}x_t&=D(x_{t-})x_{t-}\,\mathrm {d}t\\&\quad +\sum _{i\in I_b} \big (L_i-\genfrac{}{}{}1{1}{2}v_i(t-)\,\mathrm {Id}\big )x_{t-}\,\mathrm {d}B_i(t)\\&\quad +\sum _{j\in I_p} \Big (\frac{C_j}{\sqrt{n_j(t-)}}-\mathrm {Id}\Big ) x_{t-}\,\mathrm {d}N_j(t), \end{aligned} \end{aligned}$$
(1.4)

for \(x_0\in {\hat{x}}_0\) of norm 1, where the operator \(D(x_{t-})\) is defined as

$$\begin{aligned} D(x_{t-})= & {} -\big (\mathrm {i}H+\frac{1}{2}\sum _{i\in I_b} L_i^*L_i+\frac{1}{2}\sum _{j\in I_p} C_j^*C_j\big )\\&+\frac{1}{2}\sum _{i\in I_b} v_i(t-)\big (L_i-\genfrac{}{}{}1{1}{4}\,v_i(t-)\,\mathrm {Id}\big )+\frac{1}{2}\sum _{j\in I_p} n_j(t-)\,\mathrm {Id}, \end{aligned}$$

with

$$\begin{aligned} v_i(t-)=\langle x_{t-},(L_i+L_i^*)x_{t-}\rangle ,\qquad n_j(t-)=\langle x_{t-},C_j^*C_j x_{t-} \rangle =\Vert C_j x_{t-}\Vert ^2. \end{aligned}$$

The brackets \(\langle \cdot , \cdot \rangle \) denote the scalar product in \({\mathbb {C}}^k\). When no confusion is possible, a solution \((x_t)_t\) will also be called a quantum trajectory. Remark that \(\Vert x_0\Vert =1\) implies \(\Vert x_t\Vert =1\) almost surely for any \(t\in {\mathbb {R}}_+\); remark also that the numerical computation of \(\rho _t\) involves only matrix–vector products and no matrix–matrix products. (This is the motivation for the use of quantum trajectories as numerical tools mentioned above.)
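As a consistency check of the drift in (1.4) (ours, in the jump-only case \(I_b=\emptyset \)): switching off the noise increments (\(\mathrm {d}N_j=0\)), a first-order step of (1.4), once projected, must match the corresponding step of (1.3), whose deterministic part with \(\mathrm {d}N_j=0\) is \(\mathcal {L}(\rho )-\sum _j C_j\rho C_j^*+\sum _j n_j\rho \). Note also that \(D(x)x\) indeed requires only matrix–vector products. The matrices below are arbitrary choices of ours.

```python
import numpy as np

H = np.array([[1.0, 0.3], [0.3, -1.0]], dtype=complex)   # self-adjoint
C = np.array([[0.0, 1.2], [0.4, 0.0]], dtype=complex)    # one jump channel

def D_of_x(x):
    """Drift operator of (1.4) with I_b empty: K + (1/2) n(x) Id."""
    n = np.vdot(C @ x, C @ x).real                       # n(x) = ||C x||^2
    K = -1j * H - 0.5 * (C.conj().T @ C)
    return K + 0.5 * n * np.eye(2)

x = np.array([1.0, 1.0j]) / np.sqrt(2.0)
dt = 1e-5

# Euler step of (1.4) with dN = 0, then project onto the state space
x1 = x + dt * (D_of_x(x) @ x)
proj = np.outer(x1, x1.conj()) / np.vdot(x1, x1).real

# matching step of (1.3) with dN = 0: drho = (L(rho) - C rho C* + n rho) dt
rho = np.outer(x, x.conj())
n = np.trace(C @ rho @ C.conj().T).real
Lrho = -1j * (H @ rho - rho @ H) + C @ rho @ C.conj().T \
       - 0.5 * (C.conj().T @ C @ rho + rho @ C.conj().T @ C)
rho1 = rho + dt * (Lrho - C @ rho @ C.conj().T + n * rho)
```

The two one-step updates agree up to \(O(\mathrm {d}t^2)\), and the norm of \(x_1\) stays 1 to the same order.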

In the physics literature, extreme points of \(\mathcal {D}_k\) are called pure states. In particular, the preceding paragraph shows that the evolution dictated by Eq. (1.3) preserves pure states. It also has the property that quantum trajectories (solutions of (1.3)) tend to “purify.” This has been formalized by Maassen and Kümmerer in [31] for discrete-time quantum trajectories and extended to the continuous-time case by Barchielli and Paganoni in [7]. Purification is related to the following assumption (here, \(A\propto B\) means there exists \(\lambda \in {\mathbb {C}}\) such that \(A=\lambda B\) or \(\lambda A=B\); in particular, \(\lambda =0\) is allowed).

(Pur)::

Any nonzero orthogonal projector \(\pi \) such that for all \(i\in I_b\), \(\pi (L_i+L_i^*)\pi \propto \pi \) and for all \(j\in I_p\), \(\pi C_j^*C_j\pi \propto \pi \) has rank 1.
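Assumption (Pur) quantifies over all orthogonal projectors, but for \(k=2\) the only projector of rank at least 2 is the identity, so (Pur) reduces to: some \(L_i+L_i^*\) or some \(C_j^*C_j\) is not proportional to \(\mathrm {Id}\). A small numerical helper for this special case (ours; checking (Pur) for general k requires more work):

```python
import numpy as np

def proportional(A, B, tol=1e-10):
    """A ∝ B in the sense of the text: A = lam*B or lam*A = B for some
    complex lam (lam = 0 allowed), i.e., the pair spans dimension <= 1."""
    stacked = np.stack([A.ravel(), B.ravel()])
    return np.linalg.matrix_rank(stacked, tol=tol) <= 1

def pur_holds_qubit(Ls, Cs):
    """(Pur) for k = 2: it fails iff every L + L* and every C*C is ∝ Id."""
    I = np.eye(2)
    return not (all(proportional(L + L.conj().T, I) for L in Ls)
                and all(proportional(C.conj().T @ C, I) for C in Cs))

sigma_minus = np.array([[0.0, 1.0], [0.0, 0.0]])
```

For instance, a single diffusive channel \(L=\sigma _-\) satisfies (Pur) since \(\sigma _-+\sigma _+=\sigma _x\not \propto \mathrm {Id}\), while a single jump channel \(C=\mathrm {Id}\) does not.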

As shown in [7], (Pur) implies that for any \(\rho _0\in \mathcal {D}_k\)

$$\begin{aligned} \lim _{t\rightarrow \infty } \inf _{{\hat{y}}\in \mathrm {P}{\mathbb {C}}^{k}}\Vert \rho _t -\pi _{{\hat{y}}}\Vert =0\quad \text {almost surely.} \end{aligned}$$
(1.5)

The main goal of this article is to show how the exponential convergence of the solution \(({\overline{\rho }}_t)_t\) of Eq. (1.1), induced by (\(\mathcal {L}\)-erg), translates to its stochastic unraveling \((\rho _t)_t\), solution of Eq. (1.3). We prove uniqueness of the invariant measure for continuous-time quantum trajectories assuming both (\(\mathcal {L}\)-erg) and (Pur). By (1.5), under these assumptions, any invariant measure is concentrated on pure states, so we only need to prove uniqueness of the invariant measure for the process \(({\hat{x}}_t)_t\) of equivalence classes of the solution \(( x_t)_t\) of (1.4) (since \(\mathrm {P}{\mathbb {C}}^{k}\) is compact and the process is Feller, the existence of an invariant measure is obvious). The difficulty of the proof lies in the failure of usual techniques such as \(\varphi \)-irreducibility. This question has already been partially addressed in the literature, but essentially only for diffusive equations, i.e., equations for which Eq. (1.3) or (1.4) contains no jump term (in our notation, \(I_p=\emptyset \)). The results of [7] were, to our knowledge, the most advanced ones so far. In that article, algebraic conditions on the vector fields defining the stochastic differential equation are imposed to obtain the uniqueness of the invariant measure. This allows the authors to apply standard results from the analysis of stochastic differential equations directly. Unfortunately, their assumptions are hard to check for a given family of matrices \((L_i)_{i\in I_b}\).

The main result of the present paper is the following theorem.

Theorem 1.1

Assume that (Pur) and (\(\mathcal {L}\)-erg) hold. Then, the Markov process \(({\hat{x}}_t)_t\) has a unique invariant probability measure \(\mu _{\mathrm {inv}}\), and there exist \(C>0\) and \(\lambda >0\) such that for any initial distribution \(\mu \) of \({\hat{x}}_0\) over \(\mathrm {P}{\mathbb {C}}^{k}\), for all \(t\ge 0\), the distribution \(\mu _t\) of \({\hat{x}}_t\) satisfies

$$\begin{aligned} W_1(\mu _t,\mu _{\mathrm {inv}})\le C\mathrm {e}^{-\lambda t} \end{aligned}$$

where \(W_1\) is the Wasserstein distance of order 1.

This theorem is more general than previous similar results in several ways. First, we consider stochastic Schrödinger equations involving both Poisson and Wiener processes. Second, our assumptions are standard for quantum trajectories and are easy to check for a given family of operators \(\big (H,(L_i)_{i\in I_b},(C_j)_{j\in I_p}\big )\). Finally, we prove exponential convergence toward the invariant measure. As a by-product, we also provide a simple proof of the purification expressed in Eq. (1.5) (see Proposition 2.5). To complete the picture, assuming only (Pur), we show that (\(\mathcal {L}\)-erg) is necessary. We also provide a complete characterization of the set of invariant measures of \(({\hat{x}}_t)_t\) whenever (\(\mathcal {L}\)-erg) does not hold (see Proposition 4.2). Arguments in Sects. 3 and 4 are adaptations of [11], where similar results are proved for discrete-time quantum trajectories.
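The distance underlying \(W_1\) in Theorem 1.1 is the metric on \(\mathrm {P}{\mathbb {C}}^{k}\) introduced in (2.1) below. Between uniform empirical measures of equal size, the Kantorovich problem is solved by an optimal assignment, which gives a simple way to compare simulated trajectory ensembles; a sketch (ours):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def proj_dist(x, y):
    """Distance (2.1) on projective space, for arbitrary nonzero representatives."""
    x = x / np.linalg.norm(x)
    y = y / np.linalg.norm(y)
    return np.sqrt(max(0.0, 1.0 - abs(np.vdot(x, y)) ** 2))

def w1_empirical(xs, ys):
    """W1 between two uniform empirical measures of the same size: for such
    measures an optimal coupling can be taken to be an assignment."""
    cost = np.array([[proj_dist(x, y) for y in ys] for x in xs])
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum() / len(xs)

e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])
```

For instance, samples forming the same multiset are at \(W_1\)-distance 0, while two point masses at orthogonal states are at distance 1.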

The paper is structured as follows. In Sect. 2, we give a precise description of the model of quantum trajectories with a proper definition of the underlying probability space. In particular, we introduce a new martingale which is central to our proofs. In Sect. 3, we prove Theorem 1.1. In Sect. 4, we derive the full set of invariant measures assuming only (Pur). In Sect. 5, we show that (Pur) is not necessary even if (\(\mathcal {L}\)-erg) holds. In Sect. 6, we provide some examples of explicit invariant measures. In Sect. 6.4, we provide an encoding of any classical finite-state Markov jump process into a stochastic master equation.

2 Construction of the Model

2.1 Construction of Quantum Trajectories

In this section, we fix the notation and introduce the probability space we use to study \(({\hat{x}}_t)_t\). First, for an element \(x\ne 0\) of \({\mathbb {C}}^k\), and for an operator A with \(Ax\ne 0\), we denote

$$\begin{aligned} A\cdot {\hat{x}}={\widehat{Ax}}. \end{aligned}$$

We consider the following distance on \(\mathrm {P}{\mathbb {C}}^{k}\):

$$\begin{aligned} d({\hat{x}},{\hat{y}})=\sqrt{1- |\langle x,y\rangle |^2\,}, \end{aligned}$$
(2.1)

for all \({\hat{x}},{\hat{y}}\in \mathrm {P}{\mathbb {C}}^{k}\), where x and y are norm 1 representatives of \({\hat{x}}\) and \({\hat{y}}\), respectively. We equip \(\mathrm {P}{\mathbb {C}}^{k}\) with the associated Borel \(\sigma \)-algebra denoted by \({\mathcal {B}}\).

Now we introduce a stochastic process with values in \(\mathrm {M}_{k}({\mathbb {C}})\). Let \(\big (\Omega ,(\mathcal {F}_t)_t,{\mathbb {P}}\big )\) be a filtered probability space carrying standard Brownian motions \(W_i\) for \(i\in I_b\) and standard Poisson processes \(N_j\) for \(j\in I_p\), such that the full family \(\big (W_i, N_j; \,i\in I_b, j\in I_p \big )\) is independent. The filtration \((\mathcal {F}_t)_t\) is assumed to satisfy the usual conditions, and we write \(\mathcal {F}=\mathcal {F}_{\infty }\). The processes \(\big (W_i(t)\big )_t\) and \(\big (N_j(t)-t\big )_t\) are \((\mathcal {F}_t)_t\)-martingales under \({\mathbb {P}}\). We denote by \({\mathbb {E}}\) the expectation with respect to \({\mathbb {P}}\).

On \(\big (\Omega ,(\mathcal {F}_t)_t,{\mathbb {P}}\big )\), for \(s\in {\mathbb {R}}_+\), let \((S_t^s)_{t\in [s,\infty )}\) be the solution to the following SDE:

$$\begin{aligned} \mathrm {d}S_{t}^s= & {} \big (K+{\textstyle {\frac{\# I_p}{2}}}\,{\mathrm {Id}}\big )S_{t-}^{s}\,\mathrm {d}t\nonumber \\&+\sum _{i\in I_b} L_iS_{t-}^{s}\,\mathrm {d}W_i(t)+\sum _{j\in I_p}(C_j-\mathrm {Id})S_{t-}^{s}\,\mathrm {d}N_j(t),\qquad S_s^s=\mathrm {Id}\end{aligned}$$
(2.2)

(\(\# I_p\) is the cardinality of \(I_p\)), where

$$\begin{aligned} K=-\mathrm {i}H-\frac{1}{2}\Big (\sum _{i\in I_b} L_i^*L_i+\sum _{j\in I_p} C_j^*C_j\Big ). \end{aligned}$$

Since standard Cauchy–Lipschitz conditions are fulfilled, the SDE defining \((S_t^s)_t\) indeed has a unique (strong) solution. We denote \(S_t:=S_t^0\). Note that for fixed s, the process \((S_t^s)_t\) is independent of \(\mathcal {F}_s\), and that for all \(0\le r\le s\le t\)

$$\begin{aligned} S_t^sS_s^r=S_t^r. \end{aligned}$$

In addition, for any \(\rho \in \mathcal {D}_k\), let \((Z_t^\rho )_t\) be the positive real-valued process defined by

$$\begin{aligned} Z_t^\rho ={\text {tr}}(S_t^*S_t\rho ), \end{aligned}$$

and let \((\rho _t)_{t}\) be the \(\mathcal {D}_k\)-valued process defined by

$$\begin{aligned} \rho _t=\frac{S_t\rho S_t^*}{{\text {tr}}(S_t\rho S_t^*)} \end{aligned}$$

if \(Z_t^\rho \ne 0\), taking an arbitrarily fixed value whenever \(Z_t^\rho =0\) (this value will always appear with probability zero in the sequel).

The following results on the properties of \((Z_t^\rho )_t\) were proven in [6]. We give short proofs adapted to our restricted setting where the Hilbert space is finite-dimensional, and \(I=I_b\cup I_p\) is a finite set.

Lemma 2.1

For any \(\rho \in \mathcal {D}_k\), the stochastic process \((Z_t^\rho )_t\) is the unique solution of the SDE

$$\begin{aligned} \mathrm {d}Z_t^\rho= & {} Z_{t-}^\rho \Big (\sum _{i\in I_b}{\text {tr}}\big ((L_i+L_i^*)\rho _{t-}\big )\mathrm {d}W_i(t)\\&+\sum _{j\in I_p} \big ({\text {tr}}(C_j^*C_j\rho _{t-})-1\big )\big (\mathrm {d}N_j(t)-\mathrm {d}t\big )\Big ),\quad Z_0^\rho =1. \end{aligned}$$

Moreover, \((Z_t^\rho )_{t}\) is a nonnegative martingale under \({\mathbb {P}}\).

Proof

The fact that \((Z_t^\rho )_t\) satisfies the given SDE is a direct application of the Itô formula. Since \((\rho _t)_t\) takes its values in the compact space \(\mathcal {D}_k\), that SDE satisfies standard Cauchy–Lipschitz conditions, ensuring the uniqueness of the solution. Since the processes \(\big (W_i(t)\big )_t\) and \(\big (N_j(t)-t\big )_t\) are \({\mathbb {P}}\)-martingales, it follows that \((Z_t^\rho )_t\) is a \({\mathbb {P}}\)-local martingale. Since \({\text {tr}}(C_j^*C_j\rho )\ge 0\) for any \(j\in I_p\) and \(\rho \in \mathcal {D}_k\), and \((\rho _t)_t\) takes values in the compact space \(\mathcal {D}_k\), it follows from [27, Theorem 12] that \((Z_t^\rho )_{t\in [0,T]}\) is a nonnegative \({\mathbb {P}}\)-martingale for all T. \(\square \)

For any \(\rho \in \mathcal {D}_k\), we define a probability \({\mathbb {P}}^{\rho }_t\) on \((\Omega ,\mathcal {F}_t)\):

$$\begin{aligned} \mathrm {d}{\mathbb {P}}^{\rho }_t=Z_t^\rho \,\mathrm {d}{\mathbb {P}}|_{\mathcal {F}_t}. \end{aligned}$$
(2.3)

Since \((Z_t^\rho )_{t}\) is a \({\mathbb {P}}\)-martingale from Lemma 2.1, the family \(({\mathbb {P}}^{\rho }_t)_t\) is consistent, that is, \({\mathbb {P}}^{\rho }_t(E)={\mathbb {P}}^{\rho }_s(E)\) for \(t\ge s\) and \(E\in \mathcal {F}_s\). Kolmogorov’s extension theorem defines a unique probability on \((\Omega ,\mathcal {F}_\infty )\), which we denote by \({\mathbb {P}}^{\rho }\). We will denote by \({\mathbb {E}}^{\rho }\) the expectation with respect to \({\mathbb {P}}^{\rho }\).

The following proposition makes explicit the relationship between \({\mathbb {P}}\) and \({\mathbb {P}}^{\rho }\). It follows from a direct application of Girsanov’s change of measure theorem (see [26, Theorems III.3.24 and III.5.19]). For all \(i\in I_b\) and \(t\in {\mathbb {R}}_+\), let

$$\begin{aligned} B_i^{\rho }(t)=W_i(t)-\int _0^t{\text {tr}}\big ((L_i+L_i^*)\rho _{s-}\big )\,\mathrm {d}s. \end{aligned}$$

Proposition 2.2

Let \(\rho \in \mathcal {D}_k\). Then, with respect to \({\mathbb {P}}^{\rho }\), the processes \(\{B_i^\rho \}_{i\in I_b}\) are independent Wiener processes and the processes \(\{N_j\}_{j\in I_p}\) are point processes with respective stochastic intensities \(\{t\mapsto {\text {tr}}(C_j^*C_j\rho _{t-})\}_{j\in I_p}\).

The process \((\rho _t)_t\) considered under \({\mathbb {P}}^{\rho }\) models the evolution of a Markov open quantum system subject to indirect measurements. We refer the reader to [5, 14, 15] and references therein for a more detailed discussion of this interpretation.

From Itô calculus, \((\rho _t)_t\) is a solution of the SDE

$$\begin{aligned} \begin{aligned} \mathrm {d}{\rho }_t&=\mathcal {L}(\rho _{t-})\mathrm {d}t\\&\quad +\sum _{i\in I_b}\Big (L_i\rho _{t-}+\rho _{t-} L_i^*-{\text {tr}}\big (\rho _{t-}(L_i+L_i^*)\big )\rho _{t-}\Big )\mathrm {d}B_i^\rho (t)\\&\quad +\sum _{j\in I_p} \Big (\frac{C_j\rho _{t-}C_j^*}{{\text {tr}}(C_j\rho _{t-}C_j^*)}-\rho _{t-}\Big )\Big (\mathrm {d}N_j(t)-{\text {tr}}(C_j\rho _{t-} C_j^*) \,\mathrm {d}t\Big ). \end{aligned} \end{aligned}$$
(2.4)

Proposition 2.2 then implies that with respect to \({\mathbb {P}}^{\rho }\), the process \((\rho _t)_t\) is indeed the unique solution of (1.3) with \(\rho _0=\rho \). Similarly, if \(\rho _0=\pi _{\hat{x}}\) for some \({\hat{x}}\in \mathrm {P}{\mathbb {C}}^{k}\), then with respect to \({\mathbb {P}}^{\pi _{{\hat{x}}}}\), the process \(\big (\frac{S_t {x}}{\Vert S_t{x}\Vert }\big )_t\) is the solution of (1.4) with x any norm 1 representative of \(\hat{x}\).

Remark also that, for any \(\rho \in {\mathcal {D}}_k\), using (2.3) and Remark 1, one has

$$\begin{aligned} {\mathbb {E}}(S_t\rho S_t^*) = {\mathbb {E}}(\rho _t Z_t^\rho )={\mathbb {E}}^{\rho }(\rho _t)=\mathrm {e}^{t\mathcal {L}}(\rho ). \end{aligned}$$
(2.5)
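Identity (2.5) can be checked by Monte Carlo: simulate \(S_t\) under \({\mathbb {P}}\) with a naive Euler scheme for (2.2) (Bernoulli thinning for the standard Poisson increments) and average \(S_t\rho S_t^*\) over trajectories. A sketch (ours; a single jump channel with \(H=0\), and all numerical choices are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

C = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)  # single jump channel, H = 0
K = -0.5 * (C.conj().T @ C)
A = K + 0.5 * np.eye(2)                                # drift in (2.2), #I_p = 1
rho0 = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)

rng = np.random.default_rng(1)
n_traj, T, dt = 4000, 1.0, 0.002
S = np.broadcast_to(np.eye(2, dtype=complex), (n_traj, 2, 2)).copy()
for _ in range(int(T / dt)):
    S = S + dt * (A @ S)                   # deterministic part
    jump = rng.random(n_traj) < dt         # dN of a standard Poisson process
    S[jump] = C @ S[jump]                  # jump part (C - Id) S dN
mc = np.mean(S @ rho0 @ S.conj().transpose(0, 2, 1), axis=0)

# exact right-hand side of (2.5) via the vectorized Lindbladian (row-major vec)
I = np.eye(2)
CdC = C.conj().T @ C
Lmat = np.kron(C, C.conj()) - 0.5 * (np.kron(CdC, I) + np.kron(I, CdC.T))
exact = (expm(T * Lmat) @ rho0.reshape(-1)).reshape(2, 2)
```

With these parameters the Monte Carlo average agrees with \(\mathrm {e}^{T\mathcal {L}}(\rho )\) up to statistical and discretization error of a few percent.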

Our strategy of proof is based on the study of the joint distribution of \(S_t\) and a random initial state \({\hat{x}}\). To this end, we consider the product space \(\Omega \times \mathrm {P}{\mathbb {C}}^{k}\) equipped with the filtration \((\mathcal {F}_t\otimes {\mathcal {B}})_t\) and the full \(\sigma \)-algebra \(\mathcal {F}\otimes {\mathcal {B}}\). For any probability measure \(\mu \) on \(\mathrm {P}{\mathbb {C}}^{k}\), and for all \(E\in \mathcal {F}\) and \(A\in {\mathcal {B}}\), let

$$\begin{aligned} {\mathbb {Q}}_{\mu }(E\times A)=\int {\mathbb {P}}^{\pi _{{\hat{x}}}}(E)\,\mathbb {1}_{{\hat{x}} \in A}\,\mathrm {d}\mu ({\hat{x}}). \end{aligned}$$

We will denote by \({\mathbb {E}}_{\mu }\) the expectation with respect to \({\mathbb {Q}}_{\mu }\). Note that \(\mathrm {d}{\mathbb {P}}^{\pi _{{\hat{x}}}}_t = \Vert S_t x\Vert ^2 \,\mathrm {d}{\mathbb {P}}|_{\mathcal {F}_t}\) for any norm 1 representative \(x\in {\hat{x}}\), so that \({\mathbb {P}}^{\pi _{{\hat{x}}}}(\{S_tx=0\})=0\) for all \(x\in {\hat{x}}\). Therefore,

$$\begin{aligned} {\mathbb {Q}}_{\mu }\big (\{S_tx=0\}\big )=0 \end{aligned}$$

and there exists a process \(({\hat{x}}_t)_t\) for which

$$\begin{aligned} {\hat{x}}_t=S_t\cdot {\hat{x}} \end{aligned}$$

holds almost surely. It has the same distribution as the image under the map \(x\mapsto {\hat{x}}\) of the solution \((x_t)_t\) to (1.4) with \(x_0\in {\hat{x}}\), \(\Vert x_0\Vert =1\).

The following proposition shows that the law of any \(\mathcal {F}\)-measurable random variable is given by a marginal of \({\mathbb {Q}}_{\mu }\). For a probability measure \(\mu \) on \(\mathrm {P}{\mathbb {C}}^{k}\), we define

$$\begin{aligned} \rho _\mu :={\mathbb {E}}_\mu (\pi _{{\hat{x}}}). \end{aligned}$$

Proposition 2.3

Let \(\mu \) be a probability measure on \(\mathrm {P}{\mathbb {C}}^{k}\). Then, \(\rho _\mu \in \mathcal {D}_k\) and, for any \(E\in \mathcal {F}\),

$$\begin{aligned} {\mathbb {Q}}_{\mu }(E\times \mathrm {P}{\mathbb {C}}^{k})={\mathbb {P}}^{\rho _\mu }(E). \end{aligned}$$

Proof

The fact that \(\rho _\mu \in \mathcal {D}_k\) follows from the positivity and linearity of the expectation. Concerning the second part, let \(t\ge 0\) and \(E\in \mathcal {F}_t\), then

$$\begin{aligned} {\mathbb {Q}}_{\mu }(E\times \mathrm {P}{\mathbb {C}}^{k})=\int {\mathbb {P}}^{\pi _{{\hat{x}}}}(E)\, \mathrm {d}\mu ({\hat{x}})=\int \!\int _{E}{\text {tr}}( S_t^*S_t \pi _{{\hat{x}}}) \,\mathrm {d}{\mathbb {P}}\,\mathrm {d}\mu ({\hat{x}}). \end{aligned}$$

Fubini’s Theorem implies

$$\begin{aligned} {\mathbb {Q}}_{\mu }(E\times \mathrm {P}{\mathbb {C}}^{k})=\int _E{\text {tr}}(S_t^*S_t\rho _\mu )\,\mathrm {d}{\mathbb {P}}=\int _E Z_t^{\rho _\mu }\,\mathrm {d}{\mathbb {P}}={\mathbb {P}}^{\rho _\mu }_t(E). \end{aligned}$$

The uniqueness of the extended measure in Kolmogorov’s extension theorem yields the proposition. \(\square \)

Remark 2

Any \(\mathcal {F}\)-measurable random variable X can be extended canonically to an \(\mathcal {F}\otimes {\mathcal {B}}\)-measurable random variable by setting \(X(\omega ,{\hat{x}})=X(\omega )\). Proposition 2.3 then implies that the distribution of an \(\mathcal {F}\)-measurable random variable under \({\mathbb {Q}}_{\mu }\) depends on \(\mu \) only through \(\rho _\mu \). The central idea of our proof is that assumption (Pur) will allow us to find an \(\mathcal {F}\)-measurable process approximating \(({\hat{x}}_t)_t\). The \(\mathcal {F}\)-measurability of this process will then imply that it inherits some ergodicity properties from assumption (\(\mathcal {L}\)-erg).

Remark 3

If \(\mu _{\mathrm {inv}}\) is an invariant measure for the Markov process \(({\hat{x}}_t)_t\), then, with the above notation, \(\rho _{\mu _{\mathrm {inv}}}\) is an invariant state for \((\mathrm {e}^{t\mathcal {L}})_t\). In particular, if (\(\mathcal {L}\)-erg) holds, then \(\rho _{\mu _{\mathrm {inv}}}=\rho _{\mathrm {inv}}\). This follows from the identities

$$\begin{aligned} \mathrm {e}^{t\mathcal {L}}(\rho _{\mu _{\mathrm {inv}}}) = \int \mathrm {e}^{t\mathcal {L}}(\pi _{{\hat{x}}})\, \mathrm {d}\mu _{\mathrm {inv}}({\hat{x}})= \int \!\int S_t \pi _{{\hat{x}}} S_t^* \, \mathrm {d}{\mathbb {P}}\, \mathrm {d}\mu _{\mathrm {inv}}({\hat{x}})= \int \pi _{{\hat{x}}}\, \mathrm {d}\mu _{\mathrm {inv}}({\hat{x}})= \rho _{\mu _{\mathrm {inv}}} \end{aligned}$$

where the second identity uses (2.5).

2.2 Key Martingale

The following process is the key to constructing an \(\mathcal {F}\)-measurable process approximating \(({\hat{x}}_t)_t\). For any \(t\ge 0\), let

$$\begin{aligned} M_t=\frac{S_t^*S_t}{{\text {tr}}(S_t^*S_t)}, \end{aligned}$$
(2.6)

whenever \({\text {tr}}(S_t^*S_t)\ne 0\), and give \(M_t\) a fixed arbitrary value whenever \({\text {tr}}(S_t^*S_t)=0\). Since, by definition, for any \(\rho \in \mathcal {D}_k\), \({\mathbb {P}}^{\rho }\big (\{{\text {tr}}S_t^*S_t=0\}\big )=0\), the arbitrary definition of \(M_t\) on this set of vanishing probability is irrelevant. It turns out that with respect to \({\mathbb {P}}^{\mathrm {Id}/k}\), \((M_t)_t\) is a martingale. For convenience, we write \({\mathbb {P}}^{\mathrm {ch}}={\mathbb {P}}^{\mathrm {Id}/k}\) and similarly for any other \(\rho \)-dependent object, whenever \(\rho =\mathrm {Id}/k\).

Theorem 2.4

With respect to \({\mathbb {P}}^{\mathrm {ch}}\), the stochastic process \((M_t)_t\) is a bounded martingale. Therefore, it converges \({\mathbb {P}}^{\mathrm {ch}}\)-almost surely and in \(\mathrm {L}^{1}\) to a random variable \(M_\infty \). Moreover, for any \(\rho \in \mathcal {D}_k\),

$$\begin{aligned} \mathrm {d}{\mathbb {P}}^{\rho }=k\,{\text {tr}}(\rho M_{\infty })\, \mathrm {d}{\mathbb {P}}^{\mathrm {ch}}, \end{aligned}$$

and \((M_t)_t\) converges almost surely and in \(\mathrm {L}^{1}\) to \(M_{\infty }\) with respect to \({\mathbb {P}}^{\rho }\).

Proof

Expressing \((S_t)_t\) in terms of \(B_i^{\mathrm {ch}}\) for \(i\in I_b\), we have that

$$\begin{aligned} \mathrm {d}S_{t}= & {} \Big (K+{\textstyle {\frac{\# I_p}{2}}}\,{\mathrm {Id}}+\sum _{i\in I_b} \frac{{\text {tr}}\big (S_{t-}^*(L_i+L_i^*)S_{t-}\big )}{{\text {tr}}(S_{t-}^*S_{t-})}L_i\Big )S_{t-}\,\mathrm {d}t\\&+\sum _{i\in I_b} L_iS_{t-}\,\mathrm {d}B_i^{\mathrm {ch}}(t)+\sum _{j\in I_p}(C_j-\mathrm {Id})S_{t-}\,\mathrm {d}N_j(t). \end{aligned}$$

Recall that the distributions of the \(B_i^{\mathrm {ch}}\) and \(N_j\) under \({\mathbb {P}}^{\mathrm {ch}}\) are given by Proposition 2.2.

Since \({\text {tr}}(S_t^*S_t)\) is \({\mathbb {P}}^{\mathrm {ch}}\)-almost surely nonzero, we can define \(R_t\) by \(R_t=S_t/\sqrt{{\text {tr}}(S_t^*S_t)}\) almost surely for \({\mathbb {P}}^{\mathrm {ch}}\), and therefore for \({\mathbb {P}}^{\rho }\) and \({\mathbb {Q}}_{\mu }\). The Itô formula implies

$$\begin{aligned} \mathrm {d}M_t=&\sum _{i\in I_b} \Big (R_{t-}^*(L_i+L_i^*)R_{t-}-M_{t-}{\text {tr}}\big (R_{t-}^*(L_i+L_i^*)R_{t-}\big )\Big )\,\mathrm {d}B^{\mathrm {ch}}_i(t)\\&+\sum _{j\in I_p} \Big (\frac{R_{t-}^*C_j^*C_j R_{t-}}{{\text {tr}}(R_{t-}^*C_j^*C_j R_{t-})}- M_{t-}\Big )\big (\mathrm {d}N_j(t)-{\text {tr}}(R_{t-}^* C_j^*C_jR_{t-})\,\mathrm {d}t\big ). \end{aligned}$$

Hence, with respect to \({\mathbb {P}}^{\mathrm {ch}}\), \((M_t)_t\) is a local martingale. By definition, it is positive semidefinite, and it is also bounded since \({\text {tr}}(M_t)=1\) almost surely. Thus, \((M_t)_t\) is a martingale, and standard martingale convergence theorems yield the almost sure and \(\mathrm {L}^{1}\) convergence.

By direct computation, we get \(\mathrm {d}{\mathbb {P}}^{\rho }|_{\mathcal {F}_t}=k{\text {tr}}(\rho M_t)\, \mathrm {d}{\mathbb {P}}^{\mathrm {ch}}|_{\mathcal {F}_t}\). The \(\mathrm {L}^{1}\) convergence of \((M_t)_t\) with respect to \({\mathbb {P}}^{\mathrm {ch}}\) then implies \(\mathrm {d}{\mathbb {P}}^{\rho }=k{\text {tr}}(\rho M_\infty )\,\mathrm {d}{\mathbb {P}}^{\mathrm {ch}}\). Finally, the inequality \({\text {tr}}(AB)\le \Vert A\Vert {\text {tr}}(B)\) for any two positive semidefinite matrices implies \({\mathbb {P}}^{\rho }\le k\,{\mathbb {P}}^{\mathrm {ch}}\), which yields the \(\mathrm {L}^{1}\) and almost sure convergence with respect to \({\mathbb {P}}^{\rho }\). \(\square \)
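The convergence of \((M_t)_t\) can be observed numerically; in the example below (ours: a single diagonal jump channel with \(H=0\), simulated under \({\mathbb {P}}\) by Euler steps with Bernoulli thinning of the Poisson increments), the limit is moreover nearly rank 1, anticipating Proposition 2.5.

```python
import numpy as np

C = np.diag([2.0, 1.0]).astype(complex)   # (Pur) holds for k = 2: C*C not prop. to Id
K = -0.5 * (C.conj().T @ C)
A = K + 0.5 * np.eye(2)                   # drift of (2.2) with #I_p = 1

rng = np.random.default_rng(7)
dt, T = 0.01, 30.0
S = np.eye(2, dtype=complex)
for _ in range(int(T / dt)):
    S = S + dt * (A @ S)                  # deterministic part
    if rng.random() < dt:                 # standard Poisson increment (thinning)
        S = C @ S
M = S.conj().T @ S / np.trace(S.conj().T @ S).real   # M_t as in (2.6)
```

At \(t=30\) the matrix M has unit trace, is positive semidefinite, and its smallest eigenvalue is numerically negligible, i.e., M is close to a rank 1 projector.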

Now we are in a position to show that, under assumption (Pur), the limit \(M_\infty \) is a rank 1 projector. To this end, let us introduce the polar decomposition of \((S_t)_t\): there exists a process \((U_t)_t\) with values in the set of \(k\times k\) unitary matrices such that for all \(t\ge 0\)

$$\begin{aligned} S_t=\sqrt{{\text {tr}}(S_t^*S_t)}\, U_tM_t^{1/2}. \end{aligned}$$
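Numerically, this polar decomposition is provided directly by SciPy's right polar decomposition; the sketch below (ours) verifies the displayed relation on a random matrix standing in for \(S_t\): the positive factor P equals \(\sqrt{{\text {tr}}(S_t^*S_t)}\,M_t^{1/2}\), i.e., \(P^2={\text {tr}}(S_t^*S_t)\,M_t\).

```python
import numpy as np
from scipy.linalg import polar

rng = np.random.default_rng(3)
S = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # stand-in for S_t

U, P = polar(S, side='right')            # S = U P, U unitary, P >= 0
tr = np.trace(S.conj().T @ S).real
M = S.conj().T @ S / tr                  # M_t from (2.6)
```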

Proposition 2.5

Assume that (Pur) holds. Then, for any \(\rho \in \mathcal {D}_k\), \({\mathbb {P}}^{\rho }\)-almost surely, the random variable \(M_\infty \) is a rank 1 orthogonal projector on \({\mathbb {C}}^k\).

Proof

First, since \({\mathbb {P}}^{\rho }\) is absolutely continuous with respect to \({\mathbb {P}}^{\mathrm {ch}}\), proving the result with \(\rho =\mathrm {Id}/k\) is sufficient. To achieve this, remark that the \({\mathbb {P}}^{\mathrm {ch}}\)-almost sure convergence of \((M_t)_t\) and the \({\mathbb {P}}^{\mathrm {ch}}\)-almost sure bound \(\sup _{t\ge 0}\Vert M_t\Vert \le 1\) imply the convergence of \({\mathbb {E}}^{\mathrm {ch}}(M_t^2)\). Now recall that \(R_t=S_t/\sqrt{{\text {tr}}(S_t^*S_t)}\). The Itô isometry implies

$$\begin{aligned} {\mathbb {E}}^{\mathrm {ch}}(M_t^2)=&M_0^2+\sum _{i\in I_b}\int _0^t{\mathbb {E}}^{\mathrm {ch}}\Big ( R_{s}^*(L_i+L_i^*)R_{s}-M_{s}{\text {tr}}\big (R_{s}^*(L_i+L_i^*)R_s\big )\Big )^2\,\mathrm {d}s\\&+\sum _{j\in I_p}\int _0^t{\mathbb {E}}^{\mathrm {ch}}\bigg (\Big (\frac{R_{s}^*C_j^*C_jR_{s}}{{\text {tr}}(R_s^*C_j^*C_jR_s)}-M_{s}\Big )^2{\text {tr}}(R_{s}^*C_j^*C_jR_s)\bigg )\,\mathrm {d}s. \end{aligned}$$

Therefore, the convergence of \({\mathbb {E}}^{\mathrm {ch}}(M_t^2)\) to \({\mathbb {E}}^{\mathrm {ch}}(M_\infty ^2)\) implies that

$$\begin{aligned} \int _0^\infty {\mathbb {E}}^{\mathrm {ch}}\Big ( R_{s}^*(L_i+L_i^*)R_{s}-M_{s}{\text {tr}}\big (R_{s}^*(L_i+L_i^*)R_s\big )\Big )^2\,\mathrm {d}s<\infty \end{aligned}$$

for all \(i\in I_b\) and

$$\begin{aligned} \int _0^{\infty }{\mathbb {E}}^{\mathrm {ch}}\bigg (\Big (\frac{R_{s}^*C_j^*C_jR_{s}}{{\text {tr}}(R_s^*C_j^*C_jR_s)}-M_{s}\Big )^2{\text {tr}}(R_{s}^*C_j^*C_jR_s)\bigg )\,\mathrm {d}s<\infty \end{aligned}$$

for all \(j\in I_p\). Since the integrands are nonnegative, their limits inferior at infinity are 0. Hence, there exists an unbounded increasing sequence \((t_n)_n\) such that for any \(i\in I_b\),

$$\begin{aligned} \lim _n {\mathbb {E}}^{\mathrm {ch}}\Big ( R_{t_n}^*(L_i+L_i^*)R_{t_n}-M_{t_n}{\text {tr}}\big (R_{t_n}^*(L_i+L_i^*)R_{t_n}\big )\Big )^2=0 \end{aligned}$$

and for any \(j\in I_p\),

$$\begin{aligned} \lim _n {\mathbb {E}}^{\mathrm {ch}}\bigg (\Big (\frac{R_{t_n}^*C_j^*C_jR_{t_n}}{{\text {tr}}(R_{t_n}^*C_j^*C_jR_{t_n})}-M_{t_n}\Big )^2{\text {tr}}(R_{t_n}^*C_j^*C_jR_{t_n})\bigg )=0. \end{aligned}$$

Since convergence in \(\mathrm {L}^{1}\) implies the almost sure convergence of a subsequence, there exists an unbounded increasing sequence, which we again denote by \((t_n)_n\), such that \({\mathbb {P}}^{\mathrm {ch}}\)-almost surely,

$$\begin{aligned} \lim _{n\rightarrow \infty } \Big ( R_{t_n}^*(L_i+L_i^*)R_{t_n}-M_{t_n}{\text {tr}}(R_{t_n}^*(L_i+L_i^*)R_{t_n}) \Big )=0 \end{aligned}$$

and

$$\begin{aligned} \lim _{n\rightarrow \infty } \bigg (\Big (\frac{R_{t_n}^*C_j^*C_jR_{t_n}}{{\text {tr}}(R_{t_n}^*C_j^*C_jR_{t_n})}-M_{t_n}\Big )^2{\text {tr}}(R_{t_n}^*C_j^*C_jR_{t_n})\bigg )=0 \end{aligned}$$

for all \(i\in I_b\) and \(j\in I_p\).

Now and for the rest of this paragraph, fix a realization (i.e., an element of \(\Omega \)) such that \((M_{t_n})_n\) converges to \(M_\infty \). The polar decomposition of \(R_t\) is \(R_t=U_t\sqrt{M_t}\). Since the set of \(k\times k\) unitary matrices is compact, there exists a subsequence \((s_n)_n\) of \((t_n)_n\) such that \((U_{s_n})_n\) converges to \(U_{\infty }\). We therefore have

$$\begin{aligned} \sqrt{M_{\infty }}U_\infty ^*(L_i+L_i^*)U_\infty \sqrt{M_\infty }-M_\infty {\text {tr}}(M_{\infty }U_\infty ^*(L_i+L_i^*)U_\infty )=0 \end{aligned}$$

and

$$\begin{aligned} \sqrt{M_{\infty }}U_\infty ^*C_j^*C_jU_\infty \sqrt{M_\infty }-M_\infty {\text {tr}}(M_{\infty }U_\infty ^*C_j^*C_jU_\infty )=0, \end{aligned}$$

for all \(i\in I_b\) and \(j\in I_p\). Denoting by \(P_\infty \) the orthogonal projector onto the range of \(M_\infty \), it follows that there exist real numbers \((\alpha _i)_{i\in I_b}\) and \((\beta _j)_{j\in I_p}\) such that

$$\begin{aligned} U_\infty P_\infty U_\infty ^* (L_i+L_i^*) U_\infty P_\infty U_\infty ^*=\alpha _i U_\infty P_\infty U_\infty ^* \end{aligned}$$

and

$$\begin{aligned} U_\infty P_\infty U_\infty ^* C_j^*C_j U_\infty P_\infty U_\infty ^*=\beta _j U_\infty P_\infty U_\infty ^*. \end{aligned}$$

Assumption (Pur) implies that the orthogonal projector \(U_\infty P_\infty U_\infty ^*\) has rank 1, thus so does \(P_\infty \). Since \({\text {tr}}(M_\infty )=1\), \(M_\infty \) is a rank 1 orthogonal projector.

Since \((M_{t_n})_n\) converges \({\mathbb {P}}^{\mathrm {ch}}\)-almost surely, the above paragraph and the absolute continuity of \({\mathbb {P}}^{\rho }\) with respect to \({\mathbb {P}}^{\mathrm {ch}}\) show that \(M_\infty \) is \({\mathbb {P}}^{\rho }\)-almost surely a rank one orthogonal projector. \(\square \)
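The polar decomposition \(R_t=U_t\sqrt{M_t}\) used in the proof can be illustrated numerically. The following minimal sketch (not part of the proof; it assumes NumPy and SciPy, and a random matrix stands in for a realization of \(S_t\)) checks that the right polar factor of \(R_t\) is exactly \(\sqrt{M_t}\):

```python
import numpy as np
from scipy.linalg import polar, sqrtm

rng = np.random.default_rng(0)
k = 4
# A random matrix standing in for S_t.
S = rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k))
R = S / np.sqrt(np.trace(S.conj().T @ S).real)   # R_t = S_t / sqrt(tr(S_t^* S_t))
M = R.conj().T @ R                               # M_t = R_t^* R_t, with tr(M_t) = 1

# Right polar decomposition: R = U P with U unitary and P = sqrt(R^* R).
U, P = polar(R, side="right")
assert np.allclose(U @ P, R)
assert np.allclose(U.conj().T @ U, np.eye(k))    # U is unitary
assert np.allclose(P, sqrtm(M), atol=1e-8)       # P equals sqrt(M_t)
assert np.isclose(np.trace(M).real, 1.0)
```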

3 Invariant Measure and Exponential Convergence in Wasserstein Distance

This section is devoted to the main result of the paper, which concerns the exponential convergence to the invariant measure for the Markov process \(({\hat{x}}_t)_t\). We first show a convergence result for \({\mathcal {F}}\)-measurable random variables. The following theorem is a transcription of [38, Proposition 7.5].

Theorem 3.1

Assume that (\(\mathcal {L}\)-erg) holds. Then, there exist two constants \(C>0\) and \(\lambda >0\) such that for any \(\rho \in \mathcal {D}_k\) and any \(t\ge 0\),

$$\begin{aligned} \left\| \mathrm {e}^{t\mathcal {L}}(\rho )-\rho _{\mathrm {inv}}\right\| \le C\mathrm {e}^{-\lambda t} \end{aligned}$$

Our next proposition requires the introduction of a shift semigroup. From now on, we assume that \(\big (\Omega ,(\mathcal {F}_t)_t,{\mathbb {P}}\big )\) is a canonical realization of the processes \(W_i\) and \(N_j\), in particular \(\Omega \) is a subset of \(({\mathbb {R}}^{I_b\cup I_p})^{{\mathbb {R}}_+}\). We can then define for every \(t\ge 0\) the map \(\theta ^t\) on \(\Omega \) by

$$\begin{aligned} \big (\theta ^t\omega \big )(s)=\omega (s+t)-\omega (t). \end{aligned}$$

From the previous theorem, we deduce the following proposition for \(\mathcal {F}\)-measurable random variables.

Proposition 3.2

Assume (\(\mathcal {L}\)-erg) holds. Then, there exist two constants \(C>0\) and \(\lambda >0\) such that for any \(\mathcal {F}\)-measurable, essentially bounded function \(f:\Omega \rightarrow {\mathbb {C}}\) with essential bound \(\Vert f\Vert _{\infty }\), any \(t\ge 0\) and any \(\rho \in \mathcal {D}_k\),

$$\begin{aligned} \big \vert {\mathbb {E}}^{\rho }(f\circ \theta ^t)-{\mathbb {E}}^{\rho _{\mathrm {inv}}}(f)\big \vert \le \Vert f\Vert _{\infty } C\mathrm {e}^{-\lambda t}. \end{aligned}$$
(3.1)

Proof

Recall that by definition \({\mathbb {P}}\) is the law of processes with independent increments. It follows that if g is \(\mathcal {F}_t\)-measurable and h is \(\mathcal {F}\)-measurable, then \({\mathbb {E}}(h\circ \theta ^t g)={\mathbb {E}}(h\circ \theta ^t){\mathbb {E}}(g)\). Assume first that f is \(\mathcal {F}_s\)-measurable for some \(s\ge 0\); the general case then follows by a monotone class argument. By definition of \({\mathbb {P}}^{\rho }\),

$$\begin{aligned} {\mathbb {E}}^{\rho }(f\circ \theta ^t)={\mathbb {E}}(f\circ \theta ^tZ_{t+s}^{\rho }). \end{aligned}$$

Since \(Z_{t+s}^{\rho }={\text {tr}}(S_{t+s}^{t\,*}S_{t+s}^t S_t\rho S_t^*)\), where \(S_t\rho S_t^*\) is \(\mathcal {F}_t\)-measurable and \(S_{t+s}^{t\,*}S_{t+s}^t=S_{s}^*S_{s}\circ \theta ^t\) by (2.2),

$$\begin{aligned} {\mathbb {E}}^{\rho }(f\circ \theta ^t)={\mathbb {E}}\Big (f\circ \theta ^t{\text {tr}}\big (S_{t+s}^{t\,*}S_{t+s}^t {\mathbb {E}}(S_t\rho S_t^*)\big )\Big ). \end{aligned}$$

Then, relation (2.5), the \(\theta \)-invariance of \({\mathbb {P}}\), and the definition of the measures \({\mathbb {P}}^{\rho }\) yield

$$\begin{aligned} {\mathbb {E}}^{\rho }(f\circ \theta ^t)={\mathbb {E}}^{{\bar{\rho }}_t}(f) \end{aligned}$$

with \({\bar{\rho }}_t=\mathrm {e}^{t\mathcal {L}}(\rho )\). It follows from Theorem 2.4 that

$$\begin{aligned} {\mathbb {E}}^{\rho }(f\circ \theta ^t)-{\mathbb {E}}^{\rho _{\mathrm {inv}}}(f)={\mathbb {E}}^{\mathrm {ch}}\Big ( f {\text {tr}}\big (M_\infty (\mathrm {e}^{t\mathcal {L}}(\rho )-\rho _{\mathrm {inv}})\big )\Big ). \end{aligned}$$

For any matrix A, denoting \(\Vert A\Vert _1\) its trace norm, \(\big |{\text {tr}}(M_\infty A)\big |\le \Vert A\Vert _1\). Therefore,

$$\begin{aligned} \big |{\mathbb {E}}^{\rho }(f\circ \theta ^t)-{\mathbb {E}}^{\rho _{\mathrm {inv}}}(f)\big |\le \Vert f\Vert _\infty \Vert \mathrm {e}^{t\mathcal {L}}(\rho ) -\rho _{\mathrm {inv}}\Vert _1. \end{aligned}$$

Theorem 3.1 then yields the proposition. \(\square \)

The main strategy to show Theorem 1.1 is to construct an \(\mathcal {F}\)-measurable process \(({\hat{y}}_t)_t\) approximating the process \(({\hat{x}}_t)_t\). Let \(({\hat{z}}_t)_t\) be the maximum likelihood process:

$$\begin{aligned} {\hat{z}}_{t}=\mathop {\mathrm {argmax}}\limits _{{\hat{x}}\in \mathrm {P}{\mathbb {C}}^{k}}\,\Vert S_t x\Vert \end{aligned}$$
(3.2)

where x is a norm 1 representative of \({\hat{x}}\). If the largest eigenvalue of \(S_t^*S_t\) is not simple, the choice of \({\hat{z}}_t\) may not be unique. However, we can always choose an appropriate \({\hat{z}}_t\) in an \((\mathcal {F}_t)_t\)-adapted way. If (Pur) holds, Proposition 2.5 ensures that the definition of \({\hat{z}}_t\) is almost surely unambiguous for large enough t: it is the equivalence class of eigenvectors of \(M_t\) corresponding to its largest eigenvalue.

Let now \(({\hat{y}}_t)_t\) be the evolution of this maximum likelihood estimate:

$$\begin{aligned} {\hat{y}}_t=S_t\cdot {\hat{z}}_t. \end{aligned}$$
(3.3)

We shall also use the notation \({\hat{z}}_{t+s}^s:={\hat{z}}_{t}\circ \theta ^s\) and \({\hat{y}}_{t+s}^s:={\hat{y}}_t\circ \theta ^s\), that is, processes defined in the same fashion but substituting \(S_{t+s}^s\) for \(S_t\). It is worth noticing that these processes are all \(\mathcal {F}\)-measurable.

Our proof that \(({\hat{y}}_t)_t\) is an exponentially good approximation of \(({\hat{x}}_t)_t\) relies in part on the use of the exterior product of \({\mathbb {C}}^k\). We recall briefly the relevant definitions: for \(x_1, x_2\in {\mathbb {C}}^k\), we denote by \(x_1\wedge x_2\) the alternating bilinear form

$$\begin{aligned} x_1\wedge x_2:(y_1,y_2)\mapsto \det \begin{pmatrix} \langle x_1, y_1\rangle & \langle x_1, y_2\rangle \\ \langle x_2, y_1\rangle & \langle x_2, y_2\rangle \end{pmatrix}. \end{aligned}$$

Then, the set of all \(x_1\wedge x_2\) is a generating family for the set \(\wedge ^2{\mathbb {C}}^k\) of alternating bilinear forms on \({\mathbb {C}}^k\). We equip it with a complex inner product by

$$\begin{aligned} \langle x_1\wedge x_2, y_1\wedge y_2\rangle = \det \begin{pmatrix} \langle x_1, y_1\rangle & \langle x_1, y_2\rangle \\ \langle x_2, y_1\rangle & \langle x_2, y_2\rangle \end{pmatrix}, \end{aligned}$$

and denote by \(\Vert x_1\wedge x_2\Vert \) the associated norm (there should be no confusion with the norm on vectors). It is immediate to verify that our metric \(d(\cdot ,\cdot )\) on \(\mathrm {P}{\mathbb {C}}^{k}\) satisfies

$$\begin{aligned} d({\hat{x}},{\hat{y}})=\frac{\Vert x\wedge y\Vert }{\Vert x\Vert \Vert y\Vert }. \end{aligned}$$
(3.4)
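Identity (3.4) can be checked directly: by the Gram-determinant formula, \(\Vert x\wedge y\Vert ^2=\Vert x\Vert ^2\Vert y\Vert ^2-|\langle x, y\rangle |^2\). A short numerical sketch (assuming NumPy; it is an illustration, not part of the text's argument):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

# Gram determinant: ||x ^ y||^2 = det [[<x,x>, <x,y>], [<y,x>, <y,y>]].
gram = np.array([[np.vdot(x, x), np.vdot(x, y)],
                 [np.vdot(y, x), np.vdot(y, y)]])
wedge_norm = np.sqrt(np.linalg.det(gram).real)

d = wedge_norm / (np.linalg.norm(x) * np.linalg.norm(y))
# Equivalent closed form: d^2 = 1 - |<x,y>|^2 / (||x||^2 ||y||^2).
d_alt = np.sqrt(1 - abs(np.vdot(x, y))**2 /
                (np.linalg.norm(x)**2 * np.linalg.norm(y)**2))
assert np.isclose(d, d_alt)
assert 0 <= d <= 1          # consistent with sup d = 1 used in Section 3
```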

For \(A\in \mathrm {M}_{k}({\mathbb {C}})\), we write \(\wedge ^2 A\) for the operator on \(\wedge ^2{\mathbb {C}}^k\) defined by

$$\begin{aligned} \big (\wedge ^2 A\big ) \,(x_1\wedge x_2)=Ax_1\wedge Ax_2. \end{aligned}$$
(3.5)

It follows that \(\wedge ^2 (AB)=\wedge ^2 A\,\wedge ^2 B\), so that \(\Vert \wedge ^2\! (AB)\Vert \le \Vert \wedge ^2 A\Vert \Vert \wedge ^2 B\Vert \). There exists a useful relationship between the operator norm on \(\wedge ^2\mathrm {M}_{k}({\mathbb {C}})\) and singular values of matrices. From, e.g., Chapter XVI of [32],

$$\begin{aligned} \Vert \wedge ^2 A\Vert =a_1(A)\,a_2(A), \end{aligned}$$
(3.6)

where \(a_1(A)\ge a_2(A)\) are the first two singular values of A, i.e., the square roots of the two largest eigenvalues of \(A^* A\). We recall that the operator norm satisfies \(\Vert A\Vert =a_1(A)\).
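Relation (3.6) is easy to verify numerically by building the matrix of \(\wedge ^2A\) in the basis \(\{e_i\wedge e_j\}_{i<j}\), whose entries are the \(2\times 2\) minors of A (the basis ordering below is a convention of this sketch, not of the text; NumPy assumed):

```python
import numpy as np
from itertools import combinations

def second_compound(A):
    """Matrix of wedge^2 A in the basis {e_i ^ e_j : i < j}: the 2x2 minors of A."""
    k = A.shape[0]
    idx = list(combinations(range(k), 2))
    C = np.empty((len(idx), len(idx)), dtype=A.dtype)
    for a, (i, j) in enumerate(idx):
        for b, (p, q) in enumerate(idx):
            C[a, b] = A[i, p] * A[j, q] - A[i, q] * A[j, p]
    return C

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

s = np.linalg.svd(A, compute_uv=False)      # singular values a_1 >= a_2 >= ...
op_norm = np.linalg.svd(second_compound(A), compute_uv=False)[0]
assert np.isclose(op_norm, s[0] * s[1])     # ||wedge^2 A|| = a_1(A) a_2(A)

# Multiplicativity wedge^2(AB) = wedge^2 A  wedge^2 B (Cauchy-Binet):
assert np.allclose(second_compound(A @ B),
                   second_compound(A) @ second_compound(B))
```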

The exponential decrease of \(d({\hat{x}}_t,{\hat{y}}_t)\) is derived from the exponential decay of the following function:

$$\begin{aligned} f:t\mapsto {\mathbb {E}}\big (\Vert \wedge ^2S_t\Vert \big ). \end{aligned}$$

Lemma 3.3

Assume that (Pur) holds. Then, there exist two constants \(C>0\) and \(\lambda >0\) such that for all \(t\ge 0\)

$$\begin{aligned} f(t)\le C\mathrm {e}^{-\lambda t} \end{aligned}$$

Proof

First, we show that f converges to zero as t goes to \(\infty \). To this end, recall that \(R_t=S_t/\sqrt{kZ_t^{\mathrm {ch}}}\), so that

$$\begin{aligned} {\mathbb {E}}\big (\Vert \wedge ^2 S_{t}\Vert \big )={\mathbb {E}}^{\mathrm {ch}}\big (k\Vert \wedge ^2R_{t}\Vert \big ). \end{aligned}$$

Furthermore, since \(R_t^*R_t=M_t\), we have from Theorem 2.4 and Proposition 2.5 that

$$\begin{aligned} \lim _{t\rightarrow \infty }\Vert \wedge ^2R_{t}\Vert =\lim _{t\rightarrow \infty }a_1(R_t)\,a_2(R_t)=0. \end{aligned}$$

Indeed, since \(a_1(R_t)\) and \(a_2(R_t)\) are the largest two eigenvalues of \(\sqrt{M_t}\), the fact that \(M_t\) converges to a rank 1 projector implies that \(a_1(R_t)\) converges to 1 and \(a_2(R_t)\) to zero. The inequality \(\Vert S_t\Vert ^2\le {\text {tr}}(S_t^*S_t)\) implies \(\Vert \wedge ^2 R_t\Vert \le 1\) almost surely. Then, Lebesgue’s dominated convergence theorem yields \(\lim _{t\rightarrow \infty }f(t)=0\).

Second, we show f is submultiplicative. By the semigroup property, \(S_{t+s}=S_{t+s}^sS_s\) for all \(t,s\ge 0\). Using that the norm is submultiplicative, for any \(t,s\ge 0\),

$$\begin{aligned} \Vert \wedge ^2 S_{t+s}\Vert \le \Vert \wedge ^2 S_{t+s}^s\Vert \Vert \wedge ^2 S_{s}\Vert \end{aligned}$$

Since \({\mathbb {P}}\) has independent increments, \(\Vert \wedge ^2 S_{t+s}^s\Vert =\Vert \wedge ^2S_t\Vert \circ \theta ^s\) is independent of \(\mathcal {F}_s\), while \(\Vert \wedge ^2S_s\Vert \) is \(\mathcal {F}_s\)-measurable. Hence,

$$\begin{aligned} {\mathbb {E}}\big (\Vert \wedge ^2S_{t+s}\Vert \big ) \le {\mathbb {E}}\big (\Vert \wedge ^2 S_{t+s}^s\Vert \big ) {\mathbb {E}}\big (\Vert \wedge ^2 S_{s}\Vert \big ) \end{aligned}$$

The measure \({\mathbb {P}}\) being shift-invariant,

$$\begin{aligned} {\mathbb {E}}\big (\Vert \wedge ^2S_{t+s}\Vert \big ) \le {\mathbb {E}}\big (\Vert \wedge ^2 S_{t}\Vert \big ) {\mathbb {E}}\big (\Vert \wedge ^2 S_{s}\Vert \big ) \end{aligned}$$

which yields that f is submultiplicative.

Since f is measurable, submultiplicative and \(0\le f(t)\le k\) for all t, Fekete’s subadditive lemma ensures that there exists \(\lambda \in (-\infty ,\infty ]\) such that

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t}\log f(t)=\inf _{t>0}\frac{1}{t}\log f(t)=-\lambda . \end{aligned}$$

Since f converges toward 0, this \(\lambda \) belongs to \((0,\infty ]\). This yields the lemma. \(\square \)

Proposition 3.4

Assume that (Pur) holds. Then, there exist two constants \(C>0\) and \(\lambda >0\) such that for any \(s,t\in {\mathbb {R}}_+\) and for any probability measure \(\mu \) on \((\mathrm {P}{\mathbb {C}}^{k},{\mathcal {B}})\),

$$\begin{aligned} {\mathbb {E}}_{\mu }\big (d({\hat{x}}_{t+s},{\hat{y}}_{t+s}^s)\big )\le C\mathrm {e}^{-t\lambda }. \end{aligned}$$
(3.7)

Proof

Recall that \({\mathbb {E}}_\mu \) is the expectation with respect to \({\mathbb {Q}}_{\mu }\). Using the Markov property, we have

$$\begin{aligned} {\mathbb {E}}_{\mu }\big (d({\hat{x}}_{t+s},{\hat{y}}_{t+s}^s)\big )={\mathbb {E}}_{\mu _s}\big (d({\hat{x}}_{t},{\hat{y}}_t)\big ) \end{aligned}$$
(3.8)

with \(\mu _s\) the distribution of \({\hat{x}}_s\) conditioned on \({\hat{x}}_0\sim \mu \). Then, it is sufficient to prove the proposition for \(s=0\). For any \(t\ge 0\), using the fact that \(\Vert S_tz_t\Vert =\Vert S_t\Vert \) for \(z_t\) a norm 1 representative of \({\hat{z}}_t\),

$$\begin{aligned} d({\hat{x}}_{t},{\hat{y}}_t)&=\frac{\Vert x_t\wedge y_t\Vert }{\Vert x_t\Vert \Vert y_t\Vert } =\frac{\Vert S_t x_0\wedge S_t z_t\Vert }{\Vert S_t x_0\Vert \Vert S_t z_t\Vert } \le \frac{\Vert \wedge ^2 S_t\Vert }{\Vert S_t x_0\Vert \Vert S_t\Vert } \le \frac{\Vert \wedge ^2 S_t\Vert }{\Vert S_t x_0\Vert ^2} \end{aligned}$$

Using this inequality and the fact that \(\mathrm {d}{\mathbb {Q}}_{\mu }{}|_{\mathcal {F}_t\otimes \mathcal {B}}=\Vert S_t x_0\Vert ^2\,\mathrm {d}{\mathbb {P}}\,\mathrm {d}\mu ({\hat{x}}_0)\),

$$\begin{aligned} {\mathbb {E}}_{\mu }\big (d({\hat{x}}_{t},{\hat{y}}_t)\big )\le \int {\mathbb {E}}\big (\Vert \wedge ^2 S_t\Vert \big )\, \mathrm {d}\mu ({\hat{x}}_0)\le f(t). \end{aligned}$$

Finally, Lemma 3.3 yields the proposition. \(\square \)

We turn to the proof of our main theorem, Theorem 1.1. The speed of convergence is expressed in terms of the Wasserstein distance \(W_1\). Let us recall the definition of this distance for compact metric spaces: for X a compact metric space equipped with its Borel \(\sigma \)-algebra, the Wasserstein distance of order 1 between two probability measures \(\sigma \) and \(\tau \) on X can be defined using the Kantorovich–Rubinstein duality theorem as

$$\begin{aligned} W_1(\sigma ,\tau )=\sup _{f\in \mathrm {Lip}_1(X)}\Big | \int _{X} f\,\mathrm {d}\sigma - \int _X f\, \mathrm {d}\tau \Big |, \end{aligned}$$

where \(\mathrm {Lip}_1(X)=\{f:X\rightarrow {\mathbb {R}} \ \mathrm {s.t.}\ \vert f(x)-f(y)\vert \le d(x,y)\ \text{ for } \text{ all } x,y\in X\}\) is the set of Lipschitz continuous functions with constant 1, and d is the metric on X. Here, we use this for \(X=\mathrm {P}{\mathbb {C}}^{k}\) and d defined in (2.1) (see also (3.4)).
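For discrete measures, the supremum in the Kantorovich–Rubinstein dual equals the optimal transport cost, which can be computed by linear programming. A small sketch (assuming SciPy; the two measures live on three points of the real line, an illustration only):

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import wasserstein_distance

# Two discrete probability measures on the points {0, 1, 2} of the real line.
pts = np.array([0.0, 1.0, 2.0])
sigma = np.array([0.5, 0.5, 0.0])
tau   = np.array([0.0, 0.5, 0.5])

# Primal Kantorovich problem: minimize <d, gamma> over couplings gamma
# with marginals sigma and tau (equal, by duality, to the sup over Lip_1 f).
n = len(pts)
cost = np.abs(pts[:, None] - pts[None, :]).ravel()
A_eq = np.zeros((2 * n, n * n))
for i in range(n):
    A_eq[i, i * n:(i + 1) * n] = 1      # row sums of gamma = sigma
    A_eq[n + i, i::n] = 1               # column sums of gamma = tau
res = linprog(cost, A_eq=A_eq, b_eq=np.concatenate([sigma, tau]),
              bounds=(0, None))

assert np.isclose(res.fun, 1.0)         # each half-unit of mass moves by 1
assert np.isclose(res.fun, wasserstein_distance(pts, pts, sigma, tau))
```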

We recall our main theorem before proving it.

Theorem 3.5

Assume that (Pur) and (\(\mathcal {L}\)-erg) hold. Then, the Markov process \(({\hat{x}}_t)_t\) has a unique invariant probability measure \(\mu _{\mathrm {inv}}\), and there exist \(C>0\) and \(\lambda >0\) such that for any initial distribution \(\mu \) of \({\hat{x}}_0\) over \(\mathrm {P}{\mathbb {C}}^{k}\), for all \(t\ge 0\), the distribution \(\mu _t\) of \({\hat{x}}_t\) satisfies

$$\begin{aligned} W_1(\mu _t,\mu _{\mathrm {inv}})\le C\mathrm {e}^{-\lambda t} \end{aligned}$$

where \(W_1\) is the Wasserstein distance of order 1.

Proof

Let \(f\in \mathrm {Lip}_1(\mathrm {P}{\mathbb {C}}^{k})\). From the definition of Wasserstein distance, we can restrict ourselves to functions f that vanish at some point. Remark that since \(\sup _{{\hat{x}},{\hat{y}}\in \mathrm {P}{\mathbb {C}}^{k}}d({\hat{x}},{\hat{y}})=1\), restricting to this set of functions implies \(\Vert f\Vert _\infty \le 1\). Let \(\mu _{\mathrm {inv}}\) be an invariant probability measure for \(({\hat{x}}_t)_t\). We will prove the exponential convergence of \((\mu _t)_t\) toward \(\mu _{\mathrm {inv}}\) for any initial \(\mu _0\), which will imply that \(({\hat{x}}_t)_t\) admits a unique invariant probability measure. Let \(t\ge 0\), and recall that \({\hat{y}}_{t}^{t/2} = {\hat{y}}_{t/2} \circ \theta ^{t/2}\). We have

$$\begin{aligned} {\mathbb {E}}_{\mu }\big (f({\hat{x}}_t)\big )-{\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (f({\hat{x}}_t)\big )&={\mathbb {E}}_{\mu }\big (f({\hat{x}}_t)\big )-{\mathbb {E}}_{\mu }\big (f({\hat{y}}_t^{t/2})\big )\nonumber \\&\quad + {\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (f({\hat{y}}_t^{t/2})\big )-{\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (f({\hat{x}}_t)\big ) \nonumber \\&\quad + {\mathbb {E}}_{\mu }\big (f({\hat{y}}_t^{t/2})\big )-{\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (f({\hat{y}}_t^{t/2})\big )\nonumber \\&\le {\mathbb {E}}_{\mu }\big (d({\hat{x}}_t,{\hat{y}}_t^{t/2})\big )+ {\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (d({\hat{x}}_t,{\hat{y}}_t^{t/2})\big ) \end{aligned}$$
(3.9)
$$\begin{aligned}&\quad + {\mathbb {E}}_{\mu }\big (f({\hat{y}}_t^{t/2})\big )-{\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (f({\hat{y}}_t^{t/2})\big ). \end{aligned}$$
(3.10)

The two terms on the right-hand side of line (3.9) are bounded using Proposition 3.4. Using Proposition 2.3, the difference on line (3.10) satisfies

$$\begin{aligned} {\mathbb {E}}_{\mu }\big (f({\hat{y}}_t^{t/2})\big )-{\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (f({\hat{y}}_t^{t/2})\big )= {\mathbb {E}}^{\rho _\mu }\big (f({\hat{y}}_t^{t/2})\big )-{\mathbb {E}}^{\rho _{\mathrm {inv}}}\big (f({\hat{y}}_t^{t/2})\big ). \end{aligned}$$

Then, bounding the right-hand side using Proposition 3.2, it follows that there exist \(C>0\) and \(\lambda >0\) such that

$$\begin{aligned} \Big |{\mathbb {E}}_{\mu }\big (f({\hat{x}}_t)\big )-{\mathbb {E}}_{\mu _{\mathrm {inv}}}\big (f({\hat{x}}_t)\big )\Big |\le 3C \mathrm {e}^{-\lambda t/2 }. \end{aligned}$$

Adapting the two constants yields the theorem. \(\square \)

4 Set of Invariant Measures Under (Pur)

The results and proofs of this section are a direct translation of [11, Appendix B]. We reproduce the proofs for the reader’s convenience.

Whenever (\(\mathcal {L}\)-erg) does not hold, \(\dim \ker \mathcal {L}>1\) and the semigroup \((\mathrm {e}^{t \mathcal {L}})_t\) admits more than one fixed point in \(\mathcal {D}_k\). The convex set of invariant states can be explicitly classified given the matrices \((L_i)_{i\in I_b}\) and \((C_j)_{j\in I_p}\). Following [9, Theorem 7] (alternatively see Theorem 7.2 and Proposition 7.6 in [38], and [36]), there exists a decomposition

$$\begin{aligned} {\mathbb {C}}^k \simeq {\mathbb {C}}^{n_1} \oplus \cdots \oplus {\mathbb {C}}^{n_d} \oplus {\mathbb {C}}^{D}, \quad k = n_1 + \ldots + n_d + D \end{aligned}$$

with the following properties:

  1. (1)

    The range of any invariant state is a subspace of \(V = {\mathbb {C}}^{n_1} \oplus \cdots \oplus {\mathbb {C}}^{n_d} \oplus \{0\}\);

  2. (2)

    The restriction of the operators \(L_i\) and \(C_j\) to \( {\mathbb {C}}^{n_1} \oplus \cdots \oplus {\mathbb {C}}^{n_d}\) is block-diagonal, with

    $$\begin{aligned} \begin{aligned} L_i&= L_{1,i} \oplus \cdots \oplus L_{d,i},&i\in I_b,\\ C_j&= C_{1,j} \oplus \cdots \oplus C_{d,j},&j\in I_p; \end{aligned} \end{aligned}$$
    (4.1)
  3. (3)

    For each \(\ell =1,\ldots ,d\), there exist a decomposition \({\mathbb {C}}^{n_\ell } = {\mathbb {C}}^{k_\ell } \otimes {\mathbb {C}}^{m_\ell }, \, n_\ell = k_\ell \times m_\ell \), a unitary matrix \(U_\ell \) on \({\mathbb {C}}^{n_\ell }\) and matrices \(\{{\hat{L}}_{\ell ,i}\}_{i\in I_b}\) and \(\{{\hat{C}}_{\ell ,j}\}_{j\in I_p}\) on \({\mathbb {C}}^{k_\ell }\) such that

    $$\begin{aligned} \begin{aligned} L_{\ell ,i}&= U_\ell ({\hat{L}}_{\ell ,i} \otimes \mathrm {Id}_{{\mathbb {C}}^{m_\ell }}) U_\ell ^*,&i\in I_b,\\ C_{\ell ,j}&= U_\ell ({\hat{C}}_{\ell ,j} \otimes \mathrm {Id}_{{\mathbb {C}}^{m_\ell }}) U_\ell ^*,&j\in I_p; \end{aligned} \end{aligned}$$
    (4.2)
  4. (4)

    There exists a positive definite matrix \(\rho _\ell \) on \({\mathbb {C}}^{k_\ell }\) such that

    $$\begin{aligned} 0 \oplus \cdots \oplus U_\ell (\rho _\ell \otimes \mathrm {Id}_{{\mathbb {C}}^{m_\ell }}) U_\ell ^* \oplus \cdots \oplus 0 \end{aligned}$$
    (4.3)

    is a fixed point of \((\mathrm {e}^{t\mathcal {L}})_t\).

Then, the set of fixed points for \((\mathrm {e}^{t\mathcal {L}})\) is

$$\begin{aligned} U_1\big (\rho _1\otimes M_{m_1}({\mathbb {C}})\big )U_1^*\oplus \ldots \oplus U_d\big (\rho _d\otimes M_{m_d}({\mathbb {C}})\big )U_d^*\oplus 0_{M_D({\mathbb {C}})}. \end{aligned}$$

The decomposition simplifies under the purification assumption.

Proposition 4.1

Assume that (Pur) holds. Then, there exist a set \(\{\rho _\ell \}_{\ell =1}^d\) of positive definite matrices and an integer D such that the set of fixed points of \((\mathrm {e}^{t\mathcal {L}})_t\) is

$$\begin{aligned} {\mathbb {C}}\rho _1\oplus \cdots \oplus {\mathbb {C}}\rho _d\oplus 0_{M_D({\mathbb {C}})}. \end{aligned}$$

Proof

The statement follows from the discussion preceding the proposition if we show that (Pur) implies \(m_1 = \ldots = m_d =1\). Assume that one of the \(m_\ell \), say \(m_1\), is greater than 1. Let x be a norm 1 vector in \({\mathbb {C}}^{k_1}\). Then, \(\pi = U_1(\pi _{\hat{x}} \otimes \mathrm {Id}_{{\mathbb {C}}^{m_1}}) U_1^*\oplus 0 \oplus \cdots \oplus 0\) is an orthogonal projection of rank \(m_1>1\), and

$$\begin{aligned} \pi (L_i+L_i^* )\pi= & {} \langle x, ({\hat{L}}_{1,i}+{\hat{L}}_{1,i}^*) x\rangle \, \pi \quad \text{ and }\quad \\ \pi (C_j^*C_j )\pi= & {} \Vert {\hat{C}}_{1,j} x\Vert ^2 \,\pi \quad \text{ for } \text{ all } i\in I_b, j\in I_p, \end{aligned}$$

and this contradicts (Pur). \(\square \)

It is clear from Proposition 4.1 that to each extremal fixed point \(0 \oplus \cdots \oplus \rho _\ell \oplus \cdots \oplus 0\) corresponds a unique invariant measure \(\mu _\ell \) supported on its range \(\mathop {\mathrm{ran}}\nolimits \rho _\ell \). The converse is the subject of the next proposition.

Proposition 4.2

Assume (Pur) holds. Then, any invariant probability measure of \(({\hat{x}}_t)_t\) is a convex combination of the measures \(\mu _\ell \), \(\ell =1,\ldots ,d\).

Proof

Let \(\mu \) be an invariant probability measure for \(({\hat{x}}_t)_t\) and f be a continuous function on \(\mathrm {P}{\mathbb {C}}^{k}\). Proposition 3.4 implies that

$$\begin{aligned} \int f \,\mathrm {d}\mu ={\mathbb {E}}_{\mu }\big (f({\hat{x}}_0)\big )={\mathbb {E}}_{\mu }\big (f({\hat{x}}_t)\big )=\lim _{t\rightarrow \infty }{\mathbb {E}}_{\mu }\big (f( {\hat{y}}_t)\big ). \end{aligned}$$

Since \(({\hat{y}}_t)_t\) is \(\mathcal {F}\)-measurable, Proposition 2.3 implies \(\int f \,\mathrm {d}\mu =\lim _{t\rightarrow \infty }{\mathbb {E}}^{\rho _\mu }\big (f( {\hat{y}}_t)\big )\), and by Remark 3, \(\rho _\mu \in \mathcal {D}_k\) is a fixed point of \((\mathrm {e}^{t\mathcal {L}})_t\). Proposition 4.1 ensures that there exist nonnegative numbers \(p_1,\ldots ,p_d\) summing up to one such that \(\rho _\mu =p_1\,\rho _1\oplus \cdots \oplus p_d\,\rho _d\oplus 0_{M_D({\mathbb {C}})}\). From the definition of \({\mathbb {P}}^{\rho _\mu }\),

$$\begin{aligned} {\mathbb {P}}^{\rho _\mu }=p_1\,{\mathbb {P}}^{\rho _1}+\cdots +p_d\,{\mathbb {P}}^{\rho _d} \end{aligned}$$

with the abuse of notation \(\rho _\ell := 0\oplus \cdots \oplus \rho _\ell \oplus \cdots \oplus 0\), so that

$$\begin{aligned} \int f \, \mathrm {d}\mu =\lim _{t\rightarrow \infty } p_1\,{\mathbb {E}}^{\rho _1}\big (f({\hat{y}}_t)\big )+\cdots +p_d\,{\mathbb {E}}^{\rho _d}\big (f({\hat{y}}_t )\big ). \end{aligned}$$

The same argument gives \(\int f \, \mathrm {d}\mu _\ell = \lim _{t\rightarrow \infty } {\mathbb {E}}^{\rho _\ell }\big (f({\hat{y}}_t)\big )\), and we have \(\mu =p_1\,\mu _1+\ldots + p_d\,\mu _d\). \(\square \)

5 (Pur) is Not Necessary for Purification

As shown by the following example, the condition (Pur) is sufficient but not necessary for (1.5) to hold.

Let \(k=3\) and fix an orthonormal basis \(\{e_1,e_2,e_3\}\) of \({\mathbb {C}}^3\). Let \(I_b=\{0,1\}\), \(I_p=\{2\}\), \(u=(e_1+e_2+e_3)/\sqrt{3}\) and \(v=(e_1+e_3)/\sqrt{2}\). Let

$$\begin{aligned} H=0,\qquad V_0=L_0=e_1u^*, \qquad V_1=L_1=2 vv^*+e_2e_2^*, \qquad V_2=C_2=u e_1^*. \end{aligned}$$
(5.1)

Proposition 5.1

Let \(\mathcal {L}\) be the Lindblad operator given by (1.2) with H, \(V_0\), \(V_1\), \(V_2\) defined in (5.1). Then, (\(\mathcal {L}\)-erg) holds and the unique invariant state \(\rho _{\mathrm {inv}}\) is positive definite.

Proof

Using [38, Proposition 7.6], it is sufficient to prove that if \(\pi \) is a non-null orthogonal projector such that \((\mathrm {Id}-\pi ) L_0\pi =(\mathrm {Id}-\pi ) L_1\pi = (\mathrm {Id}-\pi )C_2\pi =0\), then \(\pi =\mathrm {Id}\). Assume \({\text {rank}}\pi <3\). Since \(\pi \in M_3({\mathbb {C}})\), there exists \({\hat{x}}\in \mathrm {P}{\mathbb {C}}^{3}\) such that either \(\pi =\pi _{{\hat{x}}}\) or \(\pi =\mathrm {Id}-\pi _{{\hat{x}}}\). If the first alternative holds, \((\mathrm {Id}-\pi ) L_0\pi =(\mathrm {Id}-\pi ) L_1\pi = (\mathrm {Id}-\pi )C_2\pi =0\) implies that \({\hat{x}}\) is the equivalence class of a common eigenvector of \(L_0\), \(L_1\) and \(C_2\). If the second alternative holds, \({\hat{x}}\) is the equivalence class of a common eigenvector of \(L_0^*\), \(L_1^*\) and \(C_2^*\). The only common eigenvectors of \(L_0\) and \(C_2\), or of \(L_0^*\) and \(C_2^*\), are the elements of \({\mathbb {C}}(e_2-e_3)\). Since \(L_1\) is self-adjoint and \({\mathbb {C}}(e_2-e_3)\) is not an eigenspace of \(L_1\), neither alternative can hold, and the proposition follows. \(\square \)

In the orthonormal basis \(\{e_1,e_2,e_3\}\),

$$\begin{aligned} \begin{aligned} L_0^*+L_0=\frac{1}{\sqrt{3}}\begin{pmatrix} 2&1&1\\ 1&0&0\\ 1&0&0 \end{pmatrix},\quad L_1^*+L_1=2\begin{pmatrix} 1&0&1\\ 0&1&0\\ 1&0&1 \end{pmatrix} \quad \text{ and }\quad C_2^*C_2=\begin{pmatrix} 1&0&0\\ 0&0&0\\ 0&0&0 \end{pmatrix}. \end{aligned} \end{aligned}$$

Taking \(\pi \) the orthogonal projector onto the subspace spanned by \(\{e_2,e_3\}\), it follows that (Pur) does not hold. Yet we have the following proposition.
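This failure of (Pur) is elementary to confirm numerically. In the sketch below (NumPy assumed), \(\pi \) is the rank 2 projector onto \(\mathrm {span}\{e_2,e_3\}\), and each of the three compressed operators is a scalar multiple of \(\pi \):

```python
import numpy as np

e1, e2, e3 = np.eye(3)
u = (e1 + e2 + e3) / np.sqrt(3)
v = (e1 + e3) / np.sqrt(2)

L0 = np.outer(e1, u.conj())                 # L_0 = e_1 u^*
L1 = 2 * np.outer(v, v.conj()) + np.outer(e2, e2)
C2 = np.outer(u, e1.conj())                 # C_2 = u e_1^*

pi = np.outer(e2, e2) + np.outer(e3, e3)    # projector onto span{e2, e3}, rank 2

# pi A pi is proportional to pi for all three operators entering (Pur):
assert np.allclose(pi @ (L0 + L0.conj().T) @ pi, 0 * pi)
assert np.allclose(pi @ (L1 + L1.conj().T) @ pi, 2 * pi)
assert np.allclose(pi @ (C2.conj().T @ C2) @ pi, 0 * pi)
# Since rank(pi) = 2 > 1, condition (Pur) fails.
```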

Proposition 5.2

Consider the family of processes \((\rho _t)_t\) defined by (1.3) with H, \(L_0\), \(L_1\), \(C_2\) defined in (5.1). Then, for any \(\rho \in \mathcal {D}_k\),

$$\begin{aligned} \lim _{t\rightarrow \infty } \inf _{{\hat{y}}\in \mathrm {P}{\mathbb {C}}^{k}}\Vert \rho _t -\pi _{{\hat{y}}}\Vert =0\quad {\mathbb {P}}^{\rho }\text {-almost surely.} \end{aligned}$$

Proof

Proposition 5.1 implies that \(\rho _{\mathrm {inv}}\), the unique element of \(\mathcal {D}_k\) invariant by \((\mathrm {e}^{t\mathcal {L}})_t\), is positive definite. Then, \({\text {tr}}(C_2^*C_2\rho _{\mathrm {inv}})>0\). The results of [29] thus ensure that for any \(\rho \in \mathcal {D}_k\),

$$\begin{aligned} \lim _{t\rightarrow \infty } N_2(t)/t={\text {tr}}(C_2^*C_2\,\rho _{\mathrm {inv}}),\quad {\mathbb {P}}^{\rho }\text {-almost surely.} \end{aligned}$$

Let \(T=\inf \{t\ge 0 : N_2(t)\ge 1\}\). Then, \({\mathbb {P}}^{\rho }(T<\infty )=1\) and from the definition of \(C_2\),

$$\begin{aligned} \begin{aligned} \rho _T=\pi _{{\hat{u}}} \quad \text{ and }\quad \rho _t=\pi _{S_t^T\cdot {\hat{u}}} \text{ for } \text{ any } t\ge T. \end{aligned} \end{aligned}$$

Hence, \(\inf _{{\hat{y}}\in \mathrm {P}{\mathbb {C}}^{k}}\Vert \rho _t -\pi _{{\hat{y}}}\Vert =0\) for any \(t\ge T\), which together with \({\mathbb {P}}^{\rho }(T<\infty )=1\) yields the proposition. \(\square \)

Corollary 5.3

Consider the process \(({\hat{x}}_t)_t\) defined by (1.4) with H, \(L_0\), \(L_1\), \(C_2\) defined in (5.1). Then, \(({\hat{x}}_t)_t\) admits a unique invariant probability measure \(\mu _{\mathrm {inv}}\) and there exist \(C>0\) and \(\lambda >0\) such that for any initial distribution \(\mu \) of \({\hat{x}}_0\) over \(\mathrm {P}{\mathbb {C}}^{3}\), for all \(t\ge 0\), the distribution \(\mu _t\) of \({\hat{x}}_t\) satisfies

$$\begin{aligned} W_1(\mu _t,\mu _{\mathrm {inv}})\le C e^{-\lambda t}. \end{aligned}$$

Proof

The proof is a direct adaptation of our proof of Theorem 1.1. Indeed, Theorem 1.1 holds if one substitutes the conclusion of Proposition 2.5 for (Pur). Taking \(\rho _0=\mathrm {Id}/3\) in the latter proposition yields \(\rho _t=\frac{S_t S_t^*}{{\text {tr}}(S_t S_t^*)}\) and \(M_t=\frac{S_t^* S_t}{{\text {tr}}(S_t S_t^*)}\). Therefore, \(\rho _t\) and \(M_t\) are unitarily equivalent. Following the arguments and notation of the proof of Proposition 5.2, we see that \({\mathbb {P}}^{\mathrm {ch}}\)-almost surely, \(M_T\) has rank one, and so does \(M_t\) for any \(t\ge T\). Hence, the conclusion of Proposition 2.5 holds and the corollary is proven. \(\square \)

Following the proofs of the discrete-time results of [11], we can prove that the implication in Proposition 2.5 is an equivalence if (Pur) is replaced by

(NSC-Pur)::

Any nonzero orthogonal projector \(\pi \) that satisfies \(\pi S_t^*S_t\pi \propto \pi \) \({\mathbb {P}}\)-almost surely for any \(t\ge 0\) has rank one.

Alas, in practice, such a condition is hard to check.

6 Examples

In the following examples, \(k=2\). We recall the definition of the Pauli matrices:

$$\begin{aligned} \sigma _x:=\begin{pmatrix} 0&1\\ 1&0 \end{pmatrix},\quad \sigma _y:=\begin{pmatrix} 0&-\mathrm {i}\\ \mathrm {i}&0 \end{pmatrix}\quad \text{ and }\quad \sigma _z:=\begin{pmatrix} 1&0\\ 0&-1 \end{pmatrix}. \end{aligned}$$

A standard orthonormal basis of \(M_2({\mathbb {C}})\) equipped with the Hilbert–Schmidt inner product is

$$\begin{aligned} \left( \tfrac{1}{\sqrt{2}}\mathrm {Id},\tfrac{1}{\sqrt{2}}\sigma _x,\tfrac{1}{\sqrt{2}}\sigma _y,\tfrac{1}{\sqrt{2}}\sigma _z\right) . \end{aligned}$$

In the basis of Pauli matrices, any projection \(\pi _{{\hat{x}}}\) can be written in a unique way as

$$\begin{aligned} \pi _{{\hat{x}}}=\tfrac{1}{2}\big (\mathrm {Id}+ {\mathcal {X}}\sigma _x + {\mathcal {Y}}\sigma _y+{\mathcal {Z}}\sigma _z\big ) \end{aligned}$$

where

$$\begin{aligned} {\mathcal {X}}= {\text {tr}}(\pi _{{\hat{x}}}\sigma _x), \quad {\mathcal {Y}}= {\text {tr}}(\pi _{{\hat{x}}}\sigma _y), \quad {\mathcal {Z}}= {\text {tr}}(\pi _{{\hat{x}}}\sigma _z). \end{aligned}$$

We denote, in particular, by \({\mathcal {X}}_t,{\mathcal {Y}}_t,{\mathcal {Z}}_t\), respectively, the coordinates associated with \(\pi _{{\hat{x}}_t}\).
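The correspondence between rank 1 projectors and points of the Bloch sphere \({\mathcal {X}}^2+{\mathcal {Y}}^2+{\mathcal {Z}}^2=1\) can be checked on an example (NumPy assumed; the vector x below is an arbitrary choice):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

x = np.array([1.0, 1.0 + 1.0j])
x /= np.linalg.norm(x)
pi = np.outer(x, x.conj())                     # rank 1 projector pi_x

# Bloch coordinates X = tr(pi sigma_x), etc.
X, Y, Z = (np.trace(pi @ s).real for s in (sx, sy, sz))
recon = 0.5 * (np.eye(2) + X * sx + Y * sy + Z * sz)

assert np.allclose(recon, pi)                  # the expansion reproduces pi_x
assert np.isclose(X**2 + Y**2 + Z**2, 1.0)     # pi_x lies on the Bloch sphere
```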

6.1 Unitarily Perturbed Non-Demolition Diffusive Measurement

Our first example consists of a spin-\(\tfrac{1}{2}\) (or qubit) in a magnetic field oriented along the y-axis and subject to indirect non-demolition measurement along the z-axis. This is a typical experimental situation in quantum optics (see, for example, [20]). In terms of the parameters defining the related quantum trajectories, we get \(H=\sigma _y\), \(I_b=\{0\}\), \(I_p=\emptyset \) and \(L_0=\sqrt{\gamma }\,\sigma _z\) with \(\gamma >0\). Then, \((\pi _{{\hat{x}}_t})_t\) conditioned on \({\hat{x}}_0\) is the solution of

$$\begin{aligned} \mathrm {d}\pi _{{\hat{x}}_t}= & {} \big (-\mathrm {i}[\sigma _y,\pi _{{\hat{x}}_t}] +\gamma (\sigma _z\pi _{{\hat{x}}_t}\sigma _z-\pi _{{\hat{x}}_t})\big )\,\mathrm {d}t \nonumber \\&+ \sqrt{\gamma }\big (\sigma _z\pi _{{\hat{x}}_t}+\pi _{{\hat{x}}_t}\sigma _z-2{\text {tr}}(\sigma _z\pi _{{\hat{x}}_t})\pi _{{\hat{x}}_t}\big )\,\mathrm {d}B_t. \end{aligned}$$
(6.1)

For this quantum trajectory, it is immediate to verify (Pur), and solving \(\mathcal {L}(\rho )=0\) shows that \(\rho _{\mathrm {inv}}=\genfrac{}{}{}1{1}{2}\mathrm {Id}\) is the unique invariant state, so that (\(\mathcal {L}\)-erg) holds. Hence, by Theorem 1.1, \(({\hat{x}}_t)_t\) has a unique invariant measure. In the following, we derive an explicit expression for this invariant measure.

The next lemma allows us to restrict the state space.

Lemma 6.1

If \(\mu ({\mathcal {Y}}_0=0)=1\), then \({\mathbb {Q}}_{\mu }({\mathcal {Y}}_t=0)=1\) for all t in \({\mathbb {R}}_+\).

Proof

From equation (6.1), \(({\mathcal {Y}}_t)_t\) is the solution of \(\mathrm {d}{\mathcal {Y}}_t=-2{\mathcal {Y}}_t(\gamma \,\mathrm {d}t-\sqrt{\gamma } \,{\mathcal {Z}}_t\,\mathrm {d}B_t)\). It is therefore a Doléans-Dade exponential:

$$\begin{aligned} \begin{aligned} {\mathcal {Y}}_t={\mathcal {Y}}_0\ e^{-2\gamma t} \exp \big (-2\gamma \int _0^t {\mathcal {Z}}_s^2\, \mathrm {d}s +2\sqrt{\gamma }\int _0^t {\mathcal {Z}}_s \, \mathrm {d}B_s\big ) \end{aligned} \end{aligned}$$

and the conclusion follows. \(\square \)

Now we prove that the invariant measure admits a rotational symmetry.

Lemma 6.2

Assume that the distribution \(\mu \) of \({\hat{x}}_0\) is invariant with respect to the mapping \({\hat{x}}\mapsto \sigma _y\cdot {\hat{x}}\). Then, \(\mu _t\) is invariant with respect to the same mapping.

Proof

Since \(\sigma _y\) is unitary and self-adjoint, we have \(\pi _{\sigma _y\cdot {\hat{x}}}=\sigma _y \pi _{{\hat{x}}}\sigma _y\). Since \(\sigma _y\sigma _z=-\sigma _z\sigma _y\), it follows from (6.1) that

$$\begin{aligned} \mathrm {d}(\sigma _y\pi _{{\hat{x}}_t}\sigma _y)=&\big (-\mathrm {i}[\sigma _y,\sigma _y\pi _{{\hat{x}}_t}\sigma _y]+\gamma (\sigma _z\sigma _y\pi _{{\hat{x}}_t}\sigma _y\sigma _z- \sigma _y\pi _{{\hat{x}}_t}\sigma _y)\big )\,\mathrm {d}t\\&-\sqrt{\gamma }\big (\sigma _z\sigma _y\pi _{{\hat{x}}_t}\sigma _y+\sigma _y\pi _{{\hat{x}}_t}\sigma _y\sigma _z-2{\text {tr}}(\sigma _z\sigma _y\pi _{{\hat{x}}_t}\sigma _y)\sigma _y\pi _{{\hat{x}}_t}\sigma _y\big )\,\mathrm {d}B_t. \end{aligned}$$

Then, it follows from \(\sigma _y \cdot {\hat{x}}_0\sim {\hat{x}}_0\) and \((B_t)_t\sim (-B_t)_t\) that \(({\hat{x}}_t)_t\) and \((\sigma _y\cdot {\hat{x}}_t)_t\) are both weak solutions of the same SDE with the same initial distribution. Since this SDE has a unique solution, the two processes have the same distribution. \(\square \)

Proposition 6.3

Let \(({\hat{x}}_t)_t\) be the process defined by (6.1). Then, its unique invariant measure is the normalized image measure by

$$\begin{aligned} \iota :\theta \mapsto \genfrac{}{}{}1{1}{2}\big (\mathrm {Id}+ \sin \theta \,\sigma _x +\cos \theta \,\sigma _z\big ) \end{aligned}$$
(6.2)

of the measure \(\tau (\theta )\,\mathrm {d}\theta \) on \((-\pi ,\pi ]\) with

$$\begin{aligned} \tau (\theta )=\int _\theta ^\pi \exp \frac{\cot x-\cot \theta }{\gamma } \frac{\sin x}{\sin ^3\theta }\,\mathrm {d}x \end{aligned}$$

for \(\theta \in [0,\pi ]\) and \(\tau (\theta )=\tau (\theta +\pi )\) for \(\theta \in (-\pi ,0]\).

Proof

The convergence results in Theorem 1.1 and Lemma 6.1 imply that the invariant measure \(\mu _{\mathrm {inv}}\) is the image by \(\iota \) of a probability measure \(\uptau \) on \((-\pi ,\pi ]\). Let \((\theta _t)_t\) be the solution of

$$\begin{aligned} \mathrm {d}\theta _t=2(1-\gamma \cos \theta _t\sin \theta _t)\,\mathrm {d}t -2\sqrt{\gamma }\sin \theta _t \,\mathrm {d}B_t \end{aligned}$$
(6.3)

with initial condition \(\theta _0\). Remark that \((\theta _t)_t\) is \(2\pi \)-periodic with respect to its initial condition, namely \((\theta _t+2\pi )_t\) is a solution of (6.3) with initial condition \(\theta _0+2\pi \). Now, using the Itô formula,

$$\begin{aligned} (\cos \theta _t,\sin \theta _t)_t\sim \big ({\text {tr}}(\pi _{{\hat{x}}_t}\sigma _z),{\text {tr}}(\pi _{{\hat{x}}_t}\sigma _x)\big )_t \end{aligned}$$

for \((\pi _{{\hat{x}}_t})_t\) solution of (6.1) with initial condition \({\hat{x}}_0=\genfrac{}{}{}1{1}{2}\big (\mathrm {Id}+ \sin \theta _0\,\sigma _x +\cos \theta _0 \,\sigma _z\big )\). Hence, \(\big (\iota (\theta _t)\big )_t\) has the same distribution as \(({\hat{x}}_t)_t\). Therefore, \(\uptau \) is an invariant measure for the diffusion defined by (6.3); in addition, Theorem 1.1 shows that this invariant measure is unique, and Lemma 6.2 shows that it is \(\pi \)-periodic. Following standard methods (see [28]), one shows that the restriction of \(\uptau \) to \([0,\pi )\) has a density of the form \(\tau (\theta )=C_1\tau _1(\theta )+C_2\tau _2(\theta )\) with \(C_1,C_2\in {\mathbb {R}}\) and

$$\begin{aligned} \begin{aligned} \tau _1(\theta )=\frac{\int _\theta ^\pi \sin x \, \exp (\genfrac{}{}{}1{1}{\gamma }\cot x) \, \mathrm {d}x}{\sin ^3\theta \exp (\frac{1}{\gamma }\cot \theta )}, \qquad \tau _2(\theta )=\frac{1}{ \sin ^3\theta \exp (\genfrac{}{}{}1{1}{\gamma }\cot \theta )}. \end{aligned} \end{aligned}$$

Now, straightforward analysis shows that \(\int _0^\pi \tau _1(\theta )\,\mathrm {d}\theta <\infty \) while \(\int _0^\pi \tau _2(\theta )\,\mathrm {d}\theta =\infty \). Therefore, \(\tau \) is proportional to \(\tau _1\) and the result follows. \(\square \)

Remark 4

For \(\gamma \rightarrow \infty \), the invariant measure \(\mu _{\mathrm {inv}}\) of Proposition 6.3 concentrates on the two points \(\iota (0)\) and \(\iota (\pi )\). To describe the scaling for \(\gamma \) large, we embed \(\tau (\theta )\) into \(L^{1}({\mathbb {R}})\) by defining it to be zero outside the interval \((-\pi , \pi ]\). Then, on the positive half line, in the \(L^1\) norm,

$$\begin{aligned} \lim _{\gamma \rightarrow \infty }\frac{1}{2 \gamma ^3} \tau \left( \frac{\theta }{\gamma }\right) = \frac{1}{\theta ^3} \exp \left( -\frac{1}{\theta }\right) . \end{aligned}$$

Hence, for large \(\gamma \), the stationary probability distribution has two peaks of width of order \(1/\gamma \), located at a distance of order \(1/\gamma \) (in radians) clockwise from the limit points 0 and \(\pi \). Furthermore, the probability of finding the particle in a neighborhood of the limit points themselves is exponentially suppressed.
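The scaling limit above can be checked by quadrature. The Python sketch below compares \(\genfrac{}{}{}1{1}{2\gamma ^3}\tau (\theta /\gamma )\) with \(\theta ^{-3}\mathrm {e}^{-1/\theta }\) at a sample point for moderately large \(\gamma \); it is illustrative only, and the tolerance is arbitrary:

```python
import math

def tau(theta, gamma, n=200000):
    """Unnormalized density of Proposition 6.3, by Simpson quadrature."""
    h = (math.pi - theta) / n
    def f(x):
        return math.exp((1 / math.tan(x) - 1 / math.tan(theta)) / gamma) * math.sin(x)
    s = f(theta) + f(math.pi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(theta + k * h)
    return (h / 3) * s / math.sin(theta) ** 3

theta = 0.5  # sample point on the positive half line
limit = math.exp(-1 / theta) / theta ** 3
for gamma in (50.0, 200.0):
    scaled = tau(theta / gamma, gamma) / (2 * gamma ** 3)
    assert abs(scaled - limit) < 0.05 * limit  # agreement improves with gamma
```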

The strong noise limit, \(\gamma \rightarrow \infty \), was recently studied in various models [2, 8, 12]. This is the first model that allows for an explicit calculation of the shape of the stationary probability measure. The density of the invariant probability distribution is plotted in Fig. 1 for three values of \(\gamma \), and for \(\theta \in [0,\pi ]\).

Fig. 1 Restriction to \([0,\pi ]\) of the density of the invariant probability distribution in Example 6.1

6.2 Thermal Qubit, Diffusive Case

Our second example corresponds to the evolution of a qubit interacting weakly with the electromagnetic field at a fixed temperature. The emission and absorption of photons by the qubit are stimulated by a resonant coherent field (laser). In the limit of a strong stimulating laser, the measurement of emitted photons results in a diffusive signal whose drift depends on the instantaneous average value of the raising and lowering operators of the qubit (see [37, §4.4] for a more detailed physical derivation). We obtain an analytically solvable model if we assume that the unitary rotation of the qubit is compensated for and thus frozen. In terms of the parameters defining the related quantum trajectories, we get \(H=0\), \(I=I_b=\{0,1\}\), \(L_0=\sqrt{a} \,\sigma _+\) and \(L_1=\sqrt{b}\,\sigma _-\) with \(a,b\in {\mathbb {R}}_+{\setminus }\{0\}\) and \(\sigma _\pm =\genfrac{}{}{}1{1}{2}(\sigma _x\pm \mathrm {i}\sigma _y)\), so that \(\sigma _+= \begin{pmatrix} 0 & 1 \\ 0 & 0\end{pmatrix}\) and \(\sigma _-= \begin{pmatrix} 0 & 0 \\ 1 & 0\end{pmatrix}\).

The stochastic master equation satisfied by \(\pi _{{\hat{x}}_t}\) is

$$\begin{aligned} \begin{aligned} \mathrm {d}\pi _{{\hat{x}}_t}&=\quad a\Big (\sigma _+\pi _{{\hat{x}}_t}\sigma _--\genfrac{}{}{}1{1}{4}\big ((\mathrm {Id}-\sigma _z)\pi _{{\hat{x}}_t}+\pi _{{\hat{x}}_t}(\mathrm {Id}-\sigma _z)\big )\Big )\,\mathrm {d}t\\&\quad +b\Big (\sigma _-\pi _{{\hat{x}}_t}\sigma _+-\genfrac{}{}{}1{1}{4}\big ((\mathrm {Id}+\sigma _z)\pi _{{\hat{x}}_t}+\pi _{{\hat{x}}_t}(\mathrm {Id}+\sigma _z)\big )\Big )\,\mathrm {d}t\\&\quad + \sqrt{a}\Big (\sigma _+\pi _{{\hat{x}}_t}+\pi _{{\hat{x}}_t}\sigma _--{\text {tr}}(\sigma _x\pi _{{\hat{x}}_t})\pi _{{\hat{x}}_t}\Big )\,\mathrm {d}B_0(t)\\&\quad +\sqrt{b}\Big (\sigma _-\pi _{{\hat{x}}_t}+\pi _{{\hat{x}}_t}\sigma _+-{\text {tr}}(\sigma _x\pi _{{\hat{x}}_t})\pi _{{\hat{x}}_t}\Big )\, \mathrm {d}B_1(t) \end{aligned} \end{aligned}$$
(6.4)

Again, it is immediate to verify (Pur), and solving \(\mathcal {L}(\rho )=0\) shows that (\(\mathcal {L}\)-erg) holds.

Lemma 6.4

If \(\mu ({\mathcal {Y}}_0=0)=1\), then \({\mathbb {Q}}_{\mu }({\mathcal {Y}}_t=0)=1\) for all t in \({\mathbb {R}}_+\).

Proof

From (6.4), the process \(({\mathcal {Y}}_t)_t\) satisfies

$$\begin{aligned} \mathrm {d}{\mathcal {Y}}_t=-{\mathcal {Y}}_t\Big (\genfrac{}{}{}1{1}{2}(a+b)\,\mathrm {d}t+{\mathcal {X}}_t\big (\sqrt{a}\,\mathrm {d}B_0(t)+\sqrt{b}\,\mathrm {d}B_1(t)\big )\Big ). \end{aligned}$$

Therefore, if one defines

$$\begin{aligned} M_t=\exp \Big (-\frac{1}{2}\int _0^t (a+b){\mathcal {X}}_s^2\, \mathrm {d}s -\int _0^t {\mathcal {X}}_s\big (\sqrt{a}\,\mathrm {d}B_0(s)+\sqrt{b}\,\mathrm {d}B_1(s)\big )\Big ) \end{aligned}$$

then one has \({\mathcal {Y}}_t={\mathcal {Y}}_0 \,\mathrm {e}^{-\genfrac{}{}{}1{1}{2}(a+b)t}M_t\), and this proves Lemma 6.4. \(\square \)

Proposition 6.5

Let \(({\hat{x}}_t)_t\) be the process defined by (6.4). Then, its unique invariant measure is the normalized image measure by \(\iota \) (defined by (6.2)) of the measure \(\tau (\theta ) \, \mathrm {d}\theta \) on \((-\pi ,\pi ]\) with

$$\begin{aligned} \tau (\theta )=\frac{e^{\varsigma z\arctan \big (\varsigma (\cos \theta -z)\big )}}{\big (\cos ^2\theta +1-2z\cos \theta \big )^{3/2}}, \end{aligned}$$

with \(z=\frac{a-b}{a+b}\) and \(\varsigma =\frac{a+b}{2\sqrt{ab}}.\)

Proof

As in the proof of Proposition 6.3, Theorem 1.1 and Lemma 6.4 imply that the invariant measure \(\mu _{\mathrm {inv}}\) is the image by \(\iota \) of a probability measure \(\uptau \) on \((-\pi ,\pi ]\). Let \((\theta _t)_t\) be the solution of

$$\begin{aligned} \mathrm {d}\theta _t&= \big ((b-a)\sin \theta _t + \genfrac{}{}{}1{1}{2}(a+b) \cos \theta _t\sin \theta _t\big ) \, \mathrm {d}t \nonumber \\&\quad + \sqrt{a} \,(\cos \theta _t-1)\,\mathrm {d}B_0(t) + \sqrt{b} \,(\cos \theta _t +1)\,\mathrm {d}B_1(t). \end{aligned}$$
(6.5)

The Itô formula implies once again

$$\begin{aligned} (\cos \theta _t,\sin \theta _t)_t\sim \big ({\text {tr}}(\pi _{{\hat{x}}_t}\sigma _z),{\text {tr}}(\pi _{{\hat{x}}_t}\sigma _x)\big )_t \end{aligned}$$

for \((\pi _{{\hat{x}}_t})_t\) solution of (6.4) with initial condition \({\hat{x}}_0=\genfrac{}{}{}1{1}{2}\big (\mathrm {Id}+ \sin \theta _0\,\sigma _x +\cos \theta _0 \,\sigma _z\big )\). Hence, \(\big (\iota (\theta _t)\big )_t\) has the same distribution as \(({\hat{x}}_t)_t\). As in the proof of Proposition 6.3, standard techniques show that the unique invariant distribution for (6.5) has density proportional to the function \(\tau \) above. \(\square \)

The density of the invariant probability distribution for three values of the pair (ab) is plotted in Fig. 2.

Fig. 2 Density of the invariant probability distribution for Example 6.2

6.3 Thermal Qubit, Jump Case

Our third example is a variant of the second in which the stimulating coherent field has a relatively small amplitude and is filtered out. Then, the signal is composed only of the photons absorbed or emitted by the qubit. The resulting trajectory involves only jumps related to these events. The parameters defining the model are then \(H=0\), \(I_b=\emptyset \) and \(I_p=\{0,1\}\), \(C_0=\sqrt{a}\,\sigma _+\) and \(C_1=\sqrt{b}\,\sigma _-\) with \(a,b\in {\mathbb {R}}_+{\setminus }\{0\}\).

The process \((\pi _{{\hat{x}}_t})_t\) is the solution of

$$\begin{aligned} \begin{aligned} \mathrm {d}\pi _{{\hat{x}}_t}=&a\Big (\sigma _+\pi _{{\hat{x}}_{t-}}\sigma _--\genfrac{}{}{}1{1}{4}\big ((\mathrm {Id}-\sigma _z)\pi _{{\hat{x}}_{t-}}+\pi _{{\hat{x}}_{t-}}(\mathrm {Id}-\sigma _z)\big )\Big )\mathrm {d}t\\&+b\Big (\sigma _-\pi _{{\hat{x}}_{t-}}\sigma _+-\genfrac{}{}{}1{1}{4}\big ((\mathrm {Id}+\sigma _z)\pi _{{\hat{x}}_{t-}}+\pi _{{\hat{x}}_{t-}}(\mathrm {Id}+\sigma _z)\big )\Big )\mathrm {d}t\\&+\Big (\frac{\sigma _+\pi _{{\hat{x}}_{t-}}\sigma _-}{{\text {tr}}(\sigma _-\sigma _+\pi _{{\hat{x}}_{t-}})}-\pi _{{\hat{x}}_{t-}}\Big )\Big (\mathrm {d}N_0(t)-a{\text {tr}}(\sigma _-\sigma _+\pi _{{\hat{x}}_{t-}})\mathrm {d}t\Big )\\&+\Big (\frac{\sigma _-\pi _{{\hat{x}}_{t-}}\sigma _+}{{\text {tr}}(\sigma _+\sigma _-\pi _{{\hat{x}}_{t-}})}-\pi _{{\hat{x}}_{t-}}\Big )\Big (\mathrm {d}N_1(t)-b{\text {tr}}(\sigma _+\sigma _-\pi _{{\hat{x}}_{t-}})\mathrm {d}t\Big ) \end{aligned} \end{aligned}$$
(6.6)

where \(N_0\) and \(N_1\) are counting processes with respective stochastic intensities

$$\begin{aligned}t\mapsto \int _0^t a{\text {tr}}(\sigma _-\sigma _+\pi _{{\hat{x}}_{s-}})\, \mathrm {d}s\quad \text {and}\quad t\mapsto \int _0^t b{\text {tr}}(\sigma _+\sigma _-\pi _{{\hat{x}}_{s-}})\,\mathrm {d}s.\end{aligned}$$

Assumptions (Pur) and (\(\mathcal {L}\)-erg) hold as in Example 6.2.

Proposition 6.6

Let \(\{e_1,e_2\}\) denote the canonical basis of \({\mathbb {C}}^2\). The invariant measure for Equation (6.6) is

$$\begin{aligned} \mu _{\mathrm {inv}}=\frac{a}{a+b}\,\delta _{\pi _{{\hat{e}}_1}}+\frac{b}{a+b}\,\delta _{\pi _{{\hat{e}}_2}}. \end{aligned}$$

Proof

It is enough to check from (6.6) that, if \({{\hat{x}}_0}\) is either \({\hat{e}}_1\) or \({\hat{e}}_2\), then \((\pi _{{\hat{x}}_t})_t\) is a jump process on \(\{\pi _{{\hat{e}}_1},\pi _{{\hat{e}}_2}\}\) with intensity b for the jumps from \(\pi _{{\hat{e}}_1}\) to \(\pi _{{\hat{e}}_2}\) and intensity a for the reverse jumps. \(\square \)
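The stationary distribution of this two-state jump process can be confirmed by elementary linear algebra. The short Python sketch below, assuming NumPy and arbitrary rates, checks that \(\big (\genfrac{}{}{}1{a}{a+b},\genfrac{}{}{}1{b}{a+b}\big )\) is invariant for the generator with rate \(b\) out of the first state and rate \(a\) out of the second:

```python
import numpy as np

a, b = 1.3, 0.4  # arbitrary positive rates
# Generator of the two-state chain on {pi_e1, pi_e2}: rate b for the jump
# e1 -> e2 and rate a for the reverse jump (row convention)
Q = np.array([[-b, b],
              [a, -a]])
p_inv = np.array([a, b]) / (a + b)
assert np.allclose(p_inv @ Q, 0)     # stationarity
assert abs(p_inv.sum() - 1) < 1e-12  # it is a probability vector
```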

6.4 Finite State Space Markov Process Embedding

In this last example, we show how to recover any finite state space continuous-time Markov chain from suitably chosen quantum trajectories.

Let \(\{e_\ell \}_{\ell =1}^k\) be an orthonormal basis of \({\mathbb {C}}^k\), and \((X_t)_t\) a \(\{e_1,\ldots ,e_k\}\)-valued Markov process with generator Q (we recall that Q is a \(k\times k\) real matrix such that \({\mathbb {E}}(\langle v,X_t\rangle |X_0)=\langle v, e^{tQ}X_0\rangle \) for any vector \(v\in {\mathbb {C}}^k\)). Let H be diagonal in the basis \(\{e_\ell \}_{\ell =1}^k\), let \(I_b=\emptyset \) and \(I_p=\{ (i,j); i\ne j \text{ in } 1,\ldots ,k\}\) and for any \((i,j)\in I_p\) let \(C_{i,j}=\sqrt{Q_{i,j}} \,e_j e_i^*\).
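As an illustration of this construction, the following Python sketch (assuming NumPy; the rates are arbitrary) checks that, with \(C_{i,j}=\sqrt{Q_{i,j}}\,e_je_i^*\) and the row convention in which \(Q_{i,j}\), \(i\ne j\), is the jump rate from \(e_i\) to \(e_j\), the dissipative part of the GKSL generator maps diagonal states to diagonal states and reproduces the classical master equation \(\dot{p}=p\,Q\):

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
# Hypothetical generator: Q[i, j] >= 0 for i != j, rows summing to zero
Q = rng.uniform(0.1, 1.0, size=(k, k))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))

E = np.eye(k)
C = [np.sqrt(Q[i, j]) * np.outer(E[j], E[i])
     for i in range(k) for j in range(k) if i != j]

def lindblad(rho):
    """Dissipative part of the GKSL generator built from the C_{i,j}."""
    out = np.zeros_like(rho)
    for V in C:
        K = V.T @ V
        out = out + V @ rho @ V.T - 0.5 * (K @ rho + rho @ K)
    return out

p = rng.uniform(size=k)
p /= p.sum()                  # a classical probability vector
image = lindblad(np.diag(p))  # action on the diagonal state diag(p)
# L preserves diagonal states and reproduces the classical master equation
assert np.allclose(image, np.diag(np.diag(image)))
assert np.allclose(np.diag(image), p @ Q)
```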

Proposition 6.7

Let \(({\hat{x}}_t)_t\) be the quantum trajectory defined by Equation (1.4) and the above parameters. Then, assumption (Pur) holds. In addition,

  1. (i)

    Let \(T=\inf \big \{t\ge 0: {\hat{x}}_t\in \{{\hat{e}}_1,\ldots ,{\hat{e}}_k\}\big \}\). If for all i there exists j with \(Q_{i,j}>0\), then for any probability measure \(\mu \) over \(\mathrm {P}{\mathbb {C}}^{k}\), \({\mathbb {P}}_\mu (T<\infty )=1\).

  2. (ii)

    Conditionally on \({\hat{x}}_0\in \{{\hat{e}}_\ell \}_{\ell =1}^k\), the process \(({\hat{x}}_t)_t\) has the same distribution as the image by \(x\mapsto {\hat{x}}\) of \((X_t)_t\).

  3. (iii)

    The assumption (\(\mathcal {L}\)-erg) holds if and only if \((X_t)_t\) admits a unique invariant measure. In that case, the unique invariant measure \(\nu _{\mathrm {inv}}\) for \(({\hat{x}}_t)_t\) is the image by \(x\mapsto {\hat{x}}\) of the unique invariant measure for \((X_t)_t\).

Proof

Note first that \(C_{i,j}^*C_{i,j}=Q_{i,j}\,e_i e_i^*\) for every \((i,j)\in I_p\), so that (Pur) holds trivially.

To prove (i), let \(T_1=\inf \{t>0; \ \exists (i,j)\in I_p \text{ such } \text{ that } N_{i,j}(t)>0\}\). Remark that because \({\text {tr}}(C_{i,j}\pi _{{\hat{x}}_{s-}}C_{i,j}^*)=Q_{i,j} |\langle e_i, x_{s-}\rangle |^2\), the sum \(\sum _{i,j}N_{i,j}\) of independent Poisson processes has intensity

$$\begin{aligned} \sum _{i,j}\int _0^t Q_{i,j} |\langle e_i, x_{s-}\rangle |^2\, \mathrm {d}s\ge t \,\min _i Q_i \end{aligned}$$

where \(Q_i=\sum _{j\ne i} Q_{i,j}\) is positive by assumption, so that \(T_1\) is almost surely finite. Now consider the almost surely unique \((i,j)\) in \(I_p\) such that \(N_{i,j}(T_1)>0\); necessarily \({\text {tr}}(C_{i,j} \pi _{{\hat{x}}_{T_1-}} C_{i,j}^*)>0\), and then \(\frac{C_{i,j} \pi _{{\hat{x}}_{T_1-}} C_{i,j}^*}{{\text {tr}}(C_{i,j} \pi _{{\hat{x}}_{T_1-}} C_{i,j}^*)}=\pi _{{\hat{e}}_j}\), so that \(T\le T_1\). This proves (i).

Now, to prove (ii), remark that Eq. (1.3) can be rewritten in the form

$$\begin{aligned} \begin{aligned} \mathrm {d}\pi _{{\hat{x}}_t}=&\,\sum _{(i,j)\in I_p}\big ({\text {tr}}(C_{i,j}\pi _{{\hat{x}}_{t-}}C_{i,j}^*)\pi _{{\hat{x}}_{t-}}-\frac{1}{2}\{C_{i,j}^*C_{i,j},\pi _{{\hat{x}}_{t-}}\}\big )\,\mathrm {d}t\\&+\sum _{(i,j)\in I_p}\Big (\frac{C_{i,j}\pi _{{\hat{x}}_{t-}}C_{i,j}^*}{{\text {tr}}(C_{i,j}\pi _{{\hat{x}}_{t-}}C_{i,j}^*)}-\pi _{{\hat{x}}_{t-}}\Big )\,\mathrm {d}N_{i,j}(t). \end{aligned} \end{aligned}$$

Let \(T_1\) be defined as above; then, for \(t<T_1\) the process \((\pi _{{\hat{x}}_t})_t\) satisfies

$$\begin{aligned} \pi _{{\hat{x}}_t}=\pi _{{\hat{x}}_0}+\sum _{(i,j)\in I_p} \int _0^t \big ({\text {tr}}(C_{i,j}\pi _{{\hat{x}}_{s-}}C_{i,j}^*)\pi _{{\hat{x}}_{s-}}-\frac{1}{2}\{C_{i,j}^*C_{i,j},\pi _{{\hat{x}}_{s-}}\}\big )\,\mathrm {d}s. \end{aligned}$$
(6.7)

Starting with an initial condition \({\hat{x}}_0\in \{{\hat{e}}_\ell \}_{\ell =1}^k\), one proves easily that the integrand is zero, which means that \(\pi _{{\hat{x}}_t}=\pi _{{\hat{x}}_0}\) for \(t<T_1\). This shows in addition that for \(t<T_1\), the intensity of \(N_{i,j}\) is

$$\begin{aligned} \int _0^t {\text {tr}}(C_{i,j}\pi _{{\hat{x}}_{s-}}C_{i,j}^*)\,\mathrm {d}s= \left\{ \begin{array}{cl} Q_{i,j}\, t &{} \text{ if } x_0=e_i\\ 0 &{} \text{ otherwise. } \end{array} \right. \end{aligned}$$

Therefore, conditionally on \(x_0=e_i\), \(T_1=\inf \{t>0; \ \exists j\ne i \text{ such } \text{ that } N_{i,j}(t)>0\}\) and there exists an almost surely unique j such that \(N_{i,j}(T_1)>0\). One then has

$$\begin{aligned} \pi _{{\hat{x}}_{T_1}}=\frac{C_{i,j} \pi _{{\hat{x}}_{T_1-}} C_{i,j}^*}{{\text {tr}}(C_{i,j} \pi _{{\hat{x}}_{T_1-}} C_{i,j}^*)}=\pi _{\hat{e}_j}. \end{aligned}$$

This shows that for \(t\in [0,T_1]\) the process \(({\hat{x}}_t)_t\) has the same distribution as the process \(({\hat{X}}_t)_t\) of equivalence classes of \((X_t)_t\). This extends to all \(t\) by the Markov property of the Poisson processes. This proves (ii).

Points (i) and (ii) show that for \(t>T_1\), the process \(({\hat{x}}_t)_t\) has the same distribution as \((X_t)_t\) with initial condition \(X_{T_1}\) satisfying \({\hat{X}}_{T_1}={\hat{x}}_{T_1}\). Therefore, any invariant measure for \(({\hat{x}}_t)_t\) is the image by \(x\mapsto {\hat{x}}\) of an invariant measure for \((X_t)_t\). Theorem 1.1 and Sect. 4 show that \(({\hat{x}}_t)_t\) admits at least one invariant measure and that the invariant measure is unique if and only if (\(\mathcal {L}\)-erg) holds. This implies that \((X_t)_t\) has a unique invariant measure if and only if (\(\mathcal {L}\)-erg) holds. \(\square \)