Invariant measure for quantum trajectories

Benoist, T.; Fraas, M.; Pautrat, Y.; Pellegrini, C.

doi:10.1007/s00440-018-0862-9

Published: 20 July 2018

Volume 174, pages 307–334, (2019)

Probability Theory and Related Fields

T. Benoist¹,
M. Fraas²,
Y. Pautrat³ &
C. Pellegrini ORCID: orcid.org/0000-0001-8072-4284¹

Abstract

We study a class of Markov chains that model the evolution of a quantum system subject to repeated measurements. Each Markov chain in this class is defined by a measure on the space of matrices, and is then given by a random product of correlated matrices taken from the support of the defining measure. We give natural conditions on this support that imply that the Markov chain admits a unique invariant probability measure. We moreover prove the geometric convergence towards this invariant measure in the Wasserstein metric. Standard techniques from the theory of products of random matrices cannot be applied under our assumptions, and new techniques are developed, such as maximum likelihood-type estimations.

1 Introduction

We consider a complex vector space $\mathbb {C}^k$ and its projective space ${\mathrm P}(\mathbb {C}^k)$ equipped with its Borel $\sigma $-algebra $\mathcal {B}$. For a nonzero vector $x\in {\mathbb {C}}^k$, we denote by $\hat{x}$ the corresponding equivalence class of $x$ in ${\mathrm P}({\mathbb {C}}^k)$. For a linear map $v\in \mathrm {M}_k({\mathbb {C}})$ we denote by $v\cdot \hat{x} $ the element of the projective space represented by $v\,x$ whenever $v\,x\ne 0$. We equip $\mathrm {M}_k({\mathbb {C}})$ with its Borel $\sigma $-algebra and let $\mu $ be a measure on $\mathrm {M}_k({\mathbb {C}})$ with a finite second moment, $\int _{\mathrm {M}_k({\mathbb {C}})} \Vert v\Vert ^2\,\mathrm{d}\mu (v)<\infty $, that satisfies the stochasticity condition

$$\begin{aligned} \int _{\mathrm {M}_k({\mathbb {C}})} v^* v \,\mathrm {d} \mu (v) = \mathrm{Id}_{{{\mathbb {C}}}^k} \end{aligned}$$

(1)

(we discuss this condition below).
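As an illustration of condition (1), the following minimal sketch assumes that $\mu $ is the counting measure on a finite set of matrices, so that the integral becomes a finite sum. The helper names `random_kraus` and `stochasticity_defect` are hypothetical and not part of the article; the matrices are obtained as the $k\times k$ blocks of a matrix with orthonormal columns, which is one convenient way (among many) to enforce (1).

```python
import numpy as np

def random_kraus(k, m, seed=0):
    """Return m matrices v_1..v_m in M_k(C) with sum_i v_i^* v_i = Id (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    g = rng.normal(size=(m * k, k)) + 1j * rng.normal(size=(m * k, k))
    q, _ = np.linalg.qr(g)              # reduced QR: q has orthonormal columns, q^* q = Id_k
    return [q[i * k:(i + 1) * k, :] for i in range(m)]

def stochasticity_defect(ops):
    """Operator norm of sum_i v_i^* v_i - Id; it vanishes iff condition (1) holds."""
    k = ops[0].shape[0]
    total = sum(v.conj().T @ v for v in ops)
    return np.linalg.norm(total - np.eye(k), ord=2)

ops = random_kraus(k=3, m=4)
print(stochasticity_defect(ops))        # of the order of machine precision
```

The block construction works because the sum $\sum _i v_i^*v_i$ equals $q^*q$ when the $v_i$ are the stacked blocks of $q$.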

In this article we are interested in particular Markov chains $(\hat{x}_n)$ on ${\mathrm P}(\mathbb {C}^k)$, defined by

$$\begin{aligned} \hat{x}_{n+1}=V_n\cdot \hat{x}_{n}, \end{aligned}$$

where $V_n$ is an $\mathrm {M}_k({\mathbb {C}})$-valued random variable with a probability density $ \Vert v x_n\Vert ^2/\Vert x_n\Vert ^2 \mathrm {d} \mu (v).$ Condition (1) ensures this is a probability density for any $x_n\ne 0$. More precisely, such a Markov chain is associated with the transition kernel given for a set $S\in \mathcal {B}$ and $\hat{x}\in {\mathrm P}({\mathbb {C}}^k)$ by

$$\begin{aligned} \Pi (\hat{x},S)=\int _{\mathrm {M}_k({{\mathbb {C}}})} \mathbf {1}_{S}\left( v\cdot \hat{x}\right) \Vert v x\Vert ^2 \mathrm {d} \mu (v), \end{aligned}$$

(2)

where x is an arbitrary normalized vector representative of $\hat{x}$. Moreover, the event $\{vx=0\}$ always has probability 0, hence the Markov chain is well-defined on ${\mathrm P}({{\mathbb {C}}}^k)$. We recall that for any probability measure $\nu $, $\nu \Pi $ is the probability measure defined by

$$\begin{aligned} \nu \Pi (S)=\int _{{\mathrm P}\left( {{\mathbb {C}}}^k\right) }\Pi \left( \hat{x},S\right) \mathrm{d}\nu (\hat{x}) \end{aligned}$$

for any $S\in \mathcal {B}$. A measure $\nu $ is called invariant if $\nu \Pi = \nu $.
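The transition described by the kernel (2) can be sketched numerically. The example below assumes a finitely supported $\mu $ (counting measure on two $2\times 2$ matrices, chosen here as an amplitude-damping pair with a hypothetical parameter `g`); the function name `chain_step` is an illustration, not the authors' notation.

```python
import numpy as np

def chain_step(x, ops, rng):
    """Draw v with probability ||v x||^2 and return (normalized v x, probabilities)."""
    x = x / np.linalg.norm(x)
    images = [v @ x for v in ops]
    probs = np.array([np.vdot(y, y).real for y in images])   # sums to 1 under (1)
    i = rng.choice(len(ops), p=probs / probs.sum())          # renormalize against rounding
    return images[i] / np.linalg.norm(images[i]), probs

# Hypothetical support of mu: v1^* v1 + v2^* v2 = Id for any 0 <= g <= 1.
g = 0.3
ops = [np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - g)]]),
       np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]

rng = np.random.default_rng(0)
x0 = np.array([1.0, 1.0]) / np.sqrt(2.0)
x1, probs = chain_step(x0, ops, rng)
```

A matrix $v$ with $vx=0$ has weight zero and is never drawn, which mirrors the remark that the event $\{vx=0\}$ has probability 0.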

We are interested in the large-time distribution of $(\hat{x}_n)$. Note that $\hat{x}_n$ can be written as

$$\begin{aligned} \hat{x}_n = V_{n}\ldots V_{1}\cdot \hat{x}_0, \end{aligned}$$

so that the study of $\hat{x}_n$ can be formulated in terms of random products of matrices. Markov chains associated to random products of matrices have been studied in a more general setting where the weight appearing in the transition kernel (2) is proportional to $\Vert v x\Vert ^s$ for some $s \ge 0$, instead of $\Vert v x\Vert ^2$. The classical case of products of independent, identically distributed random matrices pioneered by Kesten, Furstenberg and Guivarc’h corresponds to $s = 0$. In that case, for i.i.d. invertible random matrices $Y_1,Y_2,\ldots $, denoting $S_n=Y_n\ldots Y_1$, one is usually interested in the asymptotic properties of

$$\begin{aligned} \log \Vert S_n x\Vert , \end{aligned}$$

for any $x\ne 0$. In particular, a law of large numbers, a central limit theorem and a large deviation principle have been obtained for this quantity, under contractivity and strong irreducibility assumptions [8, 11, 16]. Such results are closely linked to the uniqueness of the invariant measure of the Markov chain

$$\begin{aligned} \hat{x}_n=S_n\cdot \hat{x}. \end{aligned}$$

These results were generalized to the case $s >0$ in [10]. Our framework corresponds to the case $s=2$; in this case, and with the additional assumption (1), we provide a new method to study this Markov chain, and use it to derive the above results without assuming invertibility of the matrices, and with an optimal irreducibility assumption. We compare our approach with that of [10] at the end of this section.

The method that we employ is motivated by an interpretation of this process as statistics of a quantum system being repeatedly indirectly measured. Let us expand on this as we introduce more notation and terminology. The set of states of a quantum system described by a finite dimensional Hilbert space ${{\mathbb {C}}}^k$ is the set of density matrices ${\mathcal {D}}_k:=\{\rho \in \mathrm {M}_k({{\mathbb {C}}})\ |\ \rho \ge 0,\ {{\text {tr}}}\,\rho =1\}$. This set is convex and the set of its extreme points is called the set of pure states. This latter set is in one to one correspondence with the projective space ${{\mathrm P}({{\mathbb {C}}}^k)}$ by the bijection ${{\mathrm P}({{\mathbb {C}}}^k)}\ni \hat{x}\mapsto \pi _{\hat{x}}\in {\mathcal {D}}_k$ with $\pi _{\hat{x}}$ the orthogonal projector on the corresponding ray in ${{\mathbb {C}}}^k$. The time evolution of the system conditioned on a measurement outcome is encoded in a matrix v that updates the state of the system. The support of $\mu $ is endowed with the meaning of the possible updates, and the system is updated according to v with a probability density ${{\text {tr}}}(v \rho v^*)\, \mathrm{d}\mu (v)$. Given v, a state $\rho $ is mapped to a state $v \rho v^*/{{\text {tr}}}(v \rho v^*)$. Iterating this procedure defines a random sequence $(\rho _n)$ in ${\mathcal {D}}_k$ called a quantum trajectory: after n measurements with resulting matrices $v_1,\dots ,v_n$ the state of the system becomes

$$\begin{aligned} \rho _n=\frac{v_{n}\ldots v_{1}\rho _0 v_{1}^* \ldots v_{n}^*}{{{\text {tr}}}\left( v_{n}\ldots v_{1}\rho _0 v_{1}^* \ldots v_{n}^*\right) } \end{aligned}$$

(3)

where $(v_1,\ldots ,v_n)$ has probability density ${{\text {tr}}}(v_{n} \dots v_{1} \rho _0 v_{1}^*\ldots v_{n}^*)\,\mathrm{d}\mu ^{\otimes n}(v_1,\ldots ,v_n)$. In other words, the process Eq. (3) describes an evolution of a repeatedly measured quantum system.

A key result in the theory of quantum trajectories is the purification theorem obtained by Kümmerer and Maassen [17] showing that quantum trajectories $(\rho _n)$ defined on $\mathcal {D}_k$ almost surely approach the set of pure states (which are the extreme points of $\mathcal {D}_k$) if and only if the following purification condition is satisfied:

(Pur): Any orthogonal projector $\pi $ such that for any $n\in {{\mathbb {N}}}$, $\pi v_1^*\ldots v_n^* v_n\ldots v_1 \pi \propto \pi $ for $\mu ^{\otimes n}$-almost all $(v_1,\ldots ,v_n)$, is of rank one

(we write $X \propto Y$ for X, Y two operators if there exists $\lambda \in {{\mathbb {C}}}$ such that $X=\lambda Y$).

Under this assumption, the long-time behavior of the Markov chain is essentially dictated by its form on the set of pure states, i.e. for $\rho _0=\pi _{\hat{x}_0}$. It is an immediate observation that

$$\begin{aligned} {{\text {tr}}}\left( v\pi _{\hat{x}_0}v^*\right) = \Vert v x_0\Vert ^2, \quad \frac{v\pi _{\hat{x}_0} v^*}{{{\text {tr}}}\left( v\pi _{\hat{x}_0}v^*\right) } = \pi _{v\cdot \hat{x}_0}, \end{aligned}$$

(4)

for all $v\in \mathrm {M}_k({\mathbb {C}})$. This way our Markov chain $(\hat{x}_n)$ corresponds to the quantum trajectory $(\rho _n)$ described above when $\rho _0$ is a pure state $\pi _{\hat{x}_0}$.
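The two identities in (4) can be checked numerically on a random instance. This is only a sanity-check sketch: the dimension, the seed and the helper name `projector` are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3
x = rng.normal(size=k) + 1j * rng.normal(size=k)
x = x / np.linalg.norm(x)                                   # unit representative of x-hat
v = rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k))  # arbitrary matrix in M_k(C)

def projector(y):
    """Orthogonal projector onto the ray spanned by the nonzero vector y."""
    y = y / np.linalg.norm(y)
    return np.outer(y, y.conj())

pi_x = projector(x)
weight = np.trace(v @ pi_x @ v.conj().T).real               # tr(v pi_x v^*)
updated = v @ pi_x @ v.conj().T / weight                    # state after the update by v

weight_err = abs(weight - np.linalg.norm(v @ x) ** 2)       # first identity in (4)
state_err = np.linalg.norm(updated - projector(v @ x))      # second identity in (4)
```

Both `weight_err` and `state_err` should be at the level of rounding error.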

Although ideas underlying our method are based on the connection of $(\hat{x}_n)$ with this physical problem, we will not explicitly use it in the paper. The notion of quantum trajectory originates in quantum optics [6], and Haroche’s Nobel prize winning experiment [9] is arguably the most prominent example of a system described by the above formalism. The reader interested in the involved mathematical structures might consult for example the review book [13] or the pioneering articles [14, 15, 17].

We will show that under the condition (Pur), the set of all invariant measures of the Markov chain (3) can be completely classified, depending on the operator $\phi $ on $\mathcal {D}_k$ describing the average evolution:

$$\begin{aligned} \phi (\rho )=\int _{\mathrm {M}_k({\mathbb {C}})} v \rho v^* \, \mathrm{d}\mu (v). \end{aligned}$$

(5)

The map $\phi $ on ${\mathcal {D}}_k$ is completely positive and trace-preserving.^{Footnote 1} Such a map is often called a quantum channel (see e.g. [22]). It has in particular the property of mapping states to states. Brouwer’s fixed point theorem shows that there exists an invariant state, i.e. $\rho \in {\mathcal {D}}_k$ such that $\phi (\rho )=\rho $. A necessary and sufficient algebraic condition for uniqueness of this invariant state is (see e.g. [5, 7, 22])

($\phi $-Erg): There exists a unique minimal non trivial subspace E of ${{\mathbb {C}}}^k$ such that $\forall v\in {\text {supp}}\mu $, $vE\subset E$.

If ($\phi $-Erg) holds with $E={{\mathbb {C}}}^k$, then $\phi $ is said to be irreducible. We chose the name ($\phi $-Erg) to avoid confusion with the notion of irreducibility for Markov chains. We moreover emphasize that we call this assumption ($\phi $-Erg) because it relies only on $\phi $ and not on the different operators v in the support of $\mu $: an equivalent statement of ($\phi $-Erg) is that there exists a unique minimal nonzero orthogonal projector $\pi $ such that $\phi (\pi )\le \lambda \pi $ for some $\lambda \ge 0$ (see e.g. [20]).
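For a finitely supported $\mu $, an invariant state of $\phi $ can be computed as an eigenvector of the matrix representing $\phi $ on vectorized density matrices. The sketch below is one possible approach and assumes that the eigenvalue 1 is simple (as is the case under ($\phi $-Erg)); the function names and the amplitude-damping example with parameter `g` are hypothetical.

```python
import numpy as np

def channel_matrix(ops):
    """Matrix of rho -> sum_i v_i rho v_i^* acting on the row-major vectorization of rho."""
    return sum(np.kron(v, v.conj()) for v in ops)

def phi(rho, ops):
    """The average evolution (5) for a finitely supported mu."""
    return sum(v @ rho @ v.conj().T for v in ops)

def invariant_state(ops):
    """Fixed point of phi, assuming the eigenvalue 1 is simple."""
    k = ops[0].shape[0]
    vals, vecs = np.linalg.eig(channel_matrix(ops))
    idx = np.argmin(abs(vals - 1.0))            # eigenvalue closest to 1
    rho = vecs[:, idx].reshape(k, k)
    rho = rho / np.trace(rho)                   # fix the scale and the phase
    return (rho + rho.conj().T) / 2             # remove rounding asymmetry

# Hypothetical example: the only minimal invariant subspace is spanned by e_0.
g = 0.3
ops = [np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - g)]]),
       np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]
rho_inv = invariant_state(ops)
```

In this example $\phi $ is not irreducible, since $E\ne {{\mathbb {C}}}^2$, yet the invariant state is unique and equal to the projector on the first basis vector.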

We now state the main result of the paper:

Theorem 1.1

Assume that $\mu $ satisfies assumptions (Pur) and ($\phi $-Erg). Then, the transition kernel $\Pi $ has a unique invariant probability measure $\nu _{\mathrm {inv}}$ and there exist $m\in \{1,\ldots ,k\}$, $C>0$ and $0<\lambda <1$ such that for any probability measure $\nu $ over $\big ({{\mathrm P}({{\mathbb {C}}}^k)},{\mathcal {B}}\big )$,

$$\begin{aligned} W_1\left( \frac{1}{m}\sum _{r=0}^{m-1} \nu \Pi ^{mn+r}, \nu _{\mathrm {inv}}\right) \le C \lambda ^n, \end{aligned}$$

(6)

where $W_1$ is the Wasserstein metric of order 1.

The Wasserstein metric is constructed with respect to a natural metric on the complex projective space. This metric is defined, for $\hat{x},\,\hat{y}$ in ${{\mathrm P}({{\mathbb {C}}}^k)}$, by

$$\begin{aligned} d\left( \hat{x},\hat{y}\right) = \left( 1-|\langle x,y\rangle |^2\right) ^{\frac{1}{2}}, \end{aligned}$$

(7)

where $x,\,y$ are unit length representative vectors of $\hat{x}$, $\hat{y}$, and $\langle \,\cdot \,,\,\cdot \,\rangle $ is the canonical hermitian inner product on $\mathbb {C}^k$.
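The metric (7) is straightforward to evaluate on representatives. The following sketch (the name `proj_dist` is hypothetical) normalizes its arguments, so any nonzero representatives may be passed.

```python
import numpy as np

def proj_dist(x, y):
    """Distance (7) between the rays spanned by the nonzero vectors x and y."""
    x = x / np.linalg.norm(x)
    y = y / np.linalg.norm(y)
    overlap = abs(np.vdot(x, y)) ** 2            # |<x, y>|^2, independent of phases
    return np.sqrt(max(0.0, 1.0 - overlap))      # clip tiny negative rounding errors

e0 = np.array([1.0, 0.0], dtype=complex)
e1 = np.array([0.0, 1.0], dtype=complex)
plus = (e0 + e1) / np.sqrt(2.0)

d_orth = proj_dist(e0, e1)                       # orthogonal rays
d_same = proj_dist(plus, 3.0 * np.exp(0.7j) * plus)   # same ray, different representative
d_half = proj_dist(e0, plus)
```

Orthogonal rays are at the maximal distance 1, and the value does not depend on the choice of representatives.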

Let us now compare our results to those of the article [10] of Guivarc’h and Le Page. They consider a probability distribution $\mu $ with support in $\mathrm {GL}_k({\mathbb {C}})$, without requiring the normalization condition (1), and study the transition kernel on ${{\mathrm P}({{\mathbb {C}}}^k)}$ given, for $S\in {\mathcal {B}}$, by

$$\begin{aligned} \Pi _{s}\left( \hat{x},S\right) \propto \int _{\mathrm {M}_k({\mathbb {C}})}\mathbf {1}_S\left( v\cdot \hat{x}\right) \Vert v x\Vert ^s \mathrm {d} \mu (v). \end{aligned}$$

In the case $s=2$, Theorem A of [10] implies the conclusions of Theorem 1.1 under two assumptions:

strong irreducibility, in the sense that there is no non-trivial finite union of proper subspaces of ${{\mathbb {C}}}^k$ left invariant by all $v\in \mathrm {supp}\,\mu $,
contractivity, in the sense that there exists a sequence $(a_n)$ in $T_\mu $, the smallest closed sub-semigroup of $\mathrm {GL}_k({{\mathbb {C}}})$ containing ${\text {supp}}\mu $, such that $\lim _{n\rightarrow \infty }a_n/\Vert a_n\Vert $ exists and is of rank one.

It is, however, immediate that strong irreducibility of $\mu $ implies ($\phi $-Erg) with $E={{\mathbb {C}}}^k$. In addition, if we assume ${\text {supp}}\mu \subset \mathrm {GL}_k({\mathbb {C}})$ and ${\text {supp}}\mu $ is strongly irreducible, the equivalence

$$\begin{aligned} {\mathbf{(Pur)}}\iff T_\mu \text{ is } \text{ contracting } \end{aligned}$$

holds (see “Appendix A”). Our results therefore offer a strong refinement of [10] in the restricted framework of $s=2$ with $\int v^* v \,\mathrm{d}\mu (v)=\mathrm{Id}_{{{\mathbb {C}}}^k}$. This assumption, although mathematically restrictive, is automatically verified in the framework of repeated (indirect) quantum measurements as described earlier in this section.

The article is structured as follows. Section 2 is devoted to the first part of Theorem 1.1, that is the uniqueness of the invariant measure. In Sect. 3 we show the geometric convergence towards the invariant measure with respect to the 1-Wasserstein metric. In Sect. 4 we discuss the Lyapunov exponents of the process and relate them to the convergence between the Markov chain and an estimate of the chain used in our proofs.

Notation For $x\in {{\mathbb {C}}}^k\setminus \{0\}$, $\hat{x}$ is its equivalence class in ${\mathrm P}({{\mathbb {C}}}^k)$ and, for $\hat{x}$ in ${\mathrm P}({{\mathbb {C}}}^k)$, x is an arbitrary norm one vector representative of $\hat{x}$. If e.g. ${{\mathbb {P}}}_\nu $ (resp. ${\mathbb {P}}^\rho $) is a probability measure (depending on some a priori object $\nu $ (resp. $\rho $)) then ${{\mathbb {E}}}_\nu $ (resp. ${{\mathbb {E}}}^\rho $) is the expectation with respect to ${{\mathbb {P}}}_\nu $ (resp. ${{\mathbb {P}}}^\rho $). ${{\mathbb {N}}}$ represents the set of positive integers $\{1,2,\ldots \}$.

2 Uniqueness of the invariant measure

This section concerns essentially the first part of Theorem 1.1. More precisely, under ($\phi $-Erg) and (Pur) we show that the Markov chain has a unique invariant measure. Note that an invariant measure always exists since ${\mathrm P}({\mathbb {C}}^k)$ is compact. We start by introducing a probability space describing both the state $\hat{x}\in {\mathrm P}({{\mathbb {C}}}^k)$ and the sequence of matrices $(v_1,v_2,\ldots )$ such that $(v_n\ldots v_1\cdot \hat{x})$ has the same distribution as the Markov chain $(\hat{x}_n)$. Then, in Proposition 2.1, we show that the marginal on the matrix sequence is the same for any $\Pi $-invariant probability measure as long as ($\phi $-Erg) holds. In Proposition 2.2 and Lemma 2.3 we show that $(\hat{x}_n)$ is asymptotically a function of $(v_1,v_2,\ldots )$. We conclude on the uniqueness of the invariant measure in Corollary 2.4.

We now proceed to introduce some additional notation. We consider the space of infinite sequences $\Omega :=\mathrm {M}_k({{\mathbb {C}}})^{{\mathbb {N}}}$, write $\omega = (v_1,v_2, \dots )$ for any such infinite sequence, and denote by $\pi _n$ the canonical projection on the first n components, $\pi _n(\omega )=(v_1,\ldots ,v_n)$. Let ${\mathcal {M}}$ be the Borel $\sigma $-algebra on $\mathrm {M}_k({{\mathbb {C}}})$. For $n\in {{\mathbb {N}}}$, let $\mathcal {O}_n$ be the $\sigma $-algebra on $\Omega $ generated by the n-cylinder sets, i.e. $\mathcal {O}_n = \pi _n^{-1}({\mathcal {M}}^{\otimes n})$. We equip the space $\Omega $ with the smallest $\sigma $-algebra $\mathcal {O}$ containing $\mathcal {O}_n$ for all $n\in {{\mathbb {N}}}$. We let ${\mathcal {B}}$ be the Borel $\sigma $-algebra on ${\mathrm P}({\mathbb {C}}^k)$, and denote

$$\begin{aligned} \mathcal {J}_n={\mathcal {B}}\otimes \mathcal {O}_n,\qquad \mathcal {J}={\mathcal {B}}\otimes \mathcal {O}. \end{aligned}$$

This makes $\big ({\mathrm P}({\mathbb {C}}^k)\times \Omega ,\mathcal {J}\big )$ a measurable space. With a small abuse of notation we denote the sub-$\sigma $-algebra $\{\emptyset ,{\mathrm P}({\mathbb {C}}^k)\}\times \mathcal {O}$ by $\mathcal {O}$, and equivalently identify any $\mathcal {O}$-measurable function f with the $\mathcal {J}$-measurable function f satisfying $f(\hat{x},\omega ) = f(\omega )$.

For $i\in {\mathbb {N}}$, we consider the random variables $V_i : \Omega \rightarrow \mathrm {M}_k({\mathbb {C}})$,

$$\begin{aligned} V_i(\omega ) = v_i \quad \text{ for } \quad \omega =(v_1,v_2,\ldots ), \end{aligned}$$

(8)

and we introduce ${\mathcal {O}}_n$-measurable random variables $(W_n)$ defined for all $n\in {\mathbb {N}}$ as

$$\begin{aligned} W_n=V_{n}\ldots V_{1}. \end{aligned}$$

With a small abuse of notation we identify cylinder sets and their bases, and extend this identification to several associated objects. In particular we identify $O_n\in {\mathcal {M}}^{\otimes n}$ with $\pi _n^{-1}(O_n)$, a function f on ${\mathcal {M}}^{\otimes n}$ with $f \circ \pi _n$ and a measure $\mu ^{\otimes n}$ with the measure $\mu ^{\otimes n} \circ \pi _n$. Since $\mu $ is not necessarily finite, we cannot extend $(\mu ^{\otimes n})$ into a measure on $\Omega $.

Let $\nu $ be a probability measure over $({{\mathrm P}({{\mathbb {C}}}^k)},{\mathcal {B}})$. We extend it to a probability measure $\mathbb {P}_\nu $ over $({{\mathrm P}({{\mathbb {C}}}^k)}\times \Omega ,\mathcal {J})$ by letting, for any $S\in {\mathcal {B}}$ and any cylinder set $O_n \in \mathcal {O}_n$,

$$\begin{aligned} \mathbb {P}_\nu (S \times O_n):=\int _{S\times O_n} \Vert W_n(\omega )x\Vert ^2 \mathrm{d}\nu (\hat{x}) \mathrm{d}\mu ^{\otimes n}(\omega ). \end{aligned}$$

(9)

From relation (1), it is easy to check that the expression (9) defines a consistent family of probability measures and, by Kolmogorov’s theorem, this defines a unique probability measure ${{\mathbb {P}}}_\nu $ on ${{\mathrm P}({{\mathbb {C}}}^k)}\times \Omega $. In addition, the restriction of $\mathbb {P}_\nu $ to ${\mathcal {B}}\otimes \{\emptyset ,\Omega \}$ is by construction $\nu $.
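The consistency check can be spelled out in one line; the following is a short sketch that uses only (1) and the relation $W_{n+1}=V_{n+1}W_n$. For a fixed unit vector x and fixed $(v_1,\ldots ,v_n)$, write $w=v_n\ldots v_1$. Then

$$\begin{aligned} \int _{\mathrm {M}_k({{\mathbb {C}}})} \Vert v\,w\,x\Vert ^2 \,\mathrm{d}\mu (v) = \left\langle w x, \left( \int _{\mathrm {M}_k({{\mathbb {C}}})} v^* v \,\mathrm{d}\mu (v)\right) w x\right\rangle = \Vert w x\Vert ^2, \end{aligned}$$

so integrating the density in (9) at level $n+1$ over the last matrix recovers the density at level n. The same computation with $w=\mathrm{Id}_{{{\mathbb {C}}}^k}$ shows that the total mass equals $\int \Vert x\Vert ^2\,\mathrm{d}\nu (\hat{x})=1$.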

We now define the random process $(\hat{x}_n)$. For $(\hat{x}, \omega )\in {{\mathrm P}({{\mathbb {C}}}^k)}\times \Omega $ we define $\hat{x}_0(\hat{x}, \omega )=\hat{x}$. Note that for any n, the definition (9) of ${{\mathbb {P}}}_\nu $ imposes

$$\begin{aligned} {{\mathbb {P}}}_\nu \left( W_n x_0 = 0\right) =0. \end{aligned}$$

This allows us to define a sequence $(\hat{x}_n)$ of $(\mathcal {J}_n)$-adapted random variables on the probability space $({{\mathrm P}({{\mathbb {C}}}^k)}\times \Omega ,\mathcal {J}, {{\mathbb {P}}}_\nu )$ by letting

$$\begin{aligned} \hat{x}_n:= W_n\cdot \hat{x} \end{aligned}$$

(10)

whenever the expression makes sense, i.e. for any $\omega $ such that $W_n(\omega ) x\ne 0$, and extending it arbitrarily to the whole of $\Omega $. The process $(\hat{x}_n)$ on $({\mathrm P}({\mathbb {C}}^k)\times \Omega ,\mathcal {J}, \mathbb {P}_\nu )$ has the same distribution as the Markov chain defined by $\Pi $ and initial probability measure $\nu $.

Let us highlight the relation between ${{\mathbb {P}}}_\nu $ and density matrices. To that end, let

$$\begin{aligned} \rho _\nu := {\mathbb {E}}_\nu \left( \pi _{\hat{x}}\right) . \end{aligned}$$

(11)

By linearity and positivity of the expectation, $\rho _\nu \in {\mathcal {D}}_k$. Note that, conversely, for a given $\rho \in {\mathcal {D}}_k$ there exists $\nu $ (in general non-unique) such that $\rho _\nu = \rho $. For example, if a spectral decomposition of $\rho $ is $\rho =\sum _j p_j \pi _{x_j}$ then necessarily $\sum _j p_j=1$, so that $\nu = \sum _j p_j \delta _{\hat{x}_j}$ is a probability measure on ${{\mathrm P}({{\mathbb {C}}}^k)}$, and it satisfies the desired relation (11).

This relation motivates the following definition of probability measures over $(\Omega ,\mathcal {O})$. For $\rho \in \mathcal {D}_k$ and any cylinder set $O_n \in \mathcal {O}_n$, let

$$\begin{aligned} {{\mathbb {P}}}^{\rho }(O_n):= \int _{O_n}{{\text {tr}}}\big (W_n(\omega ) \rho W_n^*(\omega )\big ) \mathrm {d} \mu ^{\otimes n}(\omega ). \end{aligned}$$

(12)

In particular, for any $S\in \mathcal {B}$ and $A\in \mathcal {O}$,

$$\begin{aligned} {{\mathbb {P}}}_\nu (S\times A)=\int _{S}{{\mathbb {P}}}^{\pi _{\hat{x}}}(A)\, \mathrm{d}\nu (\hat{x}). \end{aligned}$$

(13)

The following proposition elucidates further the connection between ${{\mathbb {P}}}_\nu $ and ${{\mathbb {P}}}^{\rho _\nu }$.

Proposition 2.1

The marginal of $\mathbb {P}_\nu $ on $\mathcal {O}$ is the probability measure ${\mathbb {P}}^{\rho _\nu }$. Moreover, if ($\phi $-Erg) holds, ${{\mathbb {P}}}^{\rho _{\nu _a}}={{\mathbb {P}}}^{\rho _{\nu _b}}$ for any two $\Pi $-invariant probability measures $\nu _a$ and $\nu _b$.

Proof

By construction it is sufficient to check the equality of the measures on cylinder sets. Let $O_n \in \mathcal {O}_n$; from the definition of $\mathbb {P}_\nu $, and the linearity of the trace and the integral, we have

$$\begin{aligned} \mathbb {P}_\nu \left( {{\mathrm P}({{\mathbb {C}}}^k)}\times O_n\right)&=\int _{{{\mathrm P}({{\mathbb {C}}}^k)}\times O_n} {{\text {tr}}}\big (W^*_n(\omega )W_n(\omega )\pi _{\hat{x}}\big )\, \mathrm{d}\nu (\hat{x}) \mathrm{d}\mu ^{\otimes n}(\omega )\\&=\int _{O_n} {{\text {tr}}}\left( W_n^*(\omega )W_n(\omega )\int _{{{\mathrm P}({{\mathbb {C}}}^k)}} \pi _{\hat{x}} \,\mathrm{d}\nu (\hat{x})\right) \,\mathrm{d}\mu ^{\otimes n}(\omega )\\&=\int _{O_n} {{\text {tr}}}\big (W_n^*(\omega )W_n(\omega )\rho _\nu \big ) \,\mathrm{d}\mu ^{\otimes n}(\omega ). \end{aligned}$$

The equality between the marginal of ${{\mathbb {P}}}_\nu $ on $\mathcal {O}$ and ${{\mathbb {P}}}^{\rho _\nu }$ follows.

If $\nu $ is an invariant measure, on the one hand

$$\begin{aligned} {\mathbb {E}}_\nu \left( \pi _{\hat{x}_1}\right) ={\mathbb {E}}_\nu \left( \pi _{\hat{x}_0}\right) =\rho _\nu . \end{aligned}$$

On the other hand,

$$\begin{aligned} {\mathbb {E}}_\nu \left( \pi _{\hat{x}_1}\right) =&\int _{{{\mathrm P}({{\mathbb {C}}}^k)}\times \mathrm {M}_k({{\mathbb {C}}})} \frac{v\pi _{\hat{x}_0} v^*}{\Vert vx_0\Vert ^2}\Vert vx_0\Vert ^2\mathrm{d}\nu \left( \hat{x}_0\right) \mathrm{d}\mu (v)\\ =&\int _{\mathrm {M}_k({{\mathbb {C}}})} v\,{{\mathbb {E}}}_\nu \left( \pi _{\hat{x}_0}\right) \,v^*\mathrm{d}\mu (v)\\ =&\phi (\rho _\nu ), \end{aligned}$$

so that $\rho _\nu $ is a fixed point of $\phi $. Hence if ($\phi $-Erg) holds, $\rho _\nu $ is the unique fixed point of $\phi $ in ${\mathcal {D}}_k$. Hence $\rho _{\nu _a}=\rho _{\nu _b}$ and ${{\mathbb {P}}}^{\rho _{\nu _a}}={{\mathbb {P}}}^{\rho _{\nu _b}}$ holds. $\square $
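Both computations in this proof can be reproduced numerically when $\mu $ and $\nu $ are finitely supported. The sketch below is an illustration under these assumptions (random matrices satisfying (1), a two-atom $\nu $ with hypothetical weights `ps`); it compares the one-step marginal of ${{\mathbb {P}}}_\nu $ with that of ${{\mathbb {P}}}^{\rho _\nu }$, and ${\mathbb {E}}_\nu (\pi _{\hat{x}_1})$ with $\phi (\rho _\nu )$, for a $\nu $ that need not be invariant.

```python
import numpy as np

rng = np.random.default_rng(2)
k, m = 3, 4

# Hypothetical finitely supported mu: blocks of a matrix with orthonormal columns.
g = rng.normal(size=(m * k, k)) + 1j * rng.normal(size=(m * k, k))
q, _ = np.linalg.qr(g)
ops = [q[i * k:(i + 1) * k, :] for i in range(m)]

# Hypothetical nu = sum_j p_j delta_{x_j} with two atoms.
xs = [rng.normal(size=k) + 1j * rng.normal(size=k) for _ in range(2)]
xs = [x / np.linalg.norm(x) for x in xs]
ps = [0.25, 0.75]

def projector(y):
    y = y / np.linalg.norm(y)
    return np.outer(y, y.conj())

rho_nu = sum(p * projector(x) for p, x in zip(ps, xs))      # definition (11)

# One-step marginal: sum_j p_j ||v x_j||^2 versus tr(v rho_nu v^*).
marg_chain = [sum(p * np.linalg.norm(v @ x) ** 2 for p, x in zip(ps, xs)) for v in ops]
marg_state = [np.trace(v @ rho_nu @ v.conj().T).real for v in ops]

# E_nu(pi_{x_1}) computed along the chain versus phi(rho_nu).
mean_next = sum(p * np.linalg.norm(v @ x) ** 2 * projector(v @ x)
                for v in ops for p, x in zip(ps, xs))
phi_rho = sum(v @ rho_nu @ v.conj().T for v in ops)
```

The identity ${\mathbb {E}}_\nu (\pi _{\hat{x}_1})=\phi (\rho _\nu )$ holds for any $\nu $; invariance of $\nu $ is only used to identify the left-hand side with $\rho _\nu $.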

In the following we use the measure ${{\mathbb {P}}}^{\mathrm {ch}}={{\mathbb {P}}}^{\frac{1}{k}\mathrm{Id}_{{{\mathbb {C}}}^k}}$ associated to the operator $\mathrm{Id}_{{\mathbb {C}}^k}/k\in {\mathcal {D}}_k$ as a reference measure. Since for any $\rho \in {\mathcal {D}}_k$ there exists a constant c such that $\rho \le c \,\frac{\mathrm{Id}_{{\mathbb {C}}^k}}{k}$ (one may take $c=k$, since $\rho \le \mathrm{Id}_{{\mathbb {C}}^k}$), the measure ${{\mathbb {P}}}^\rho $ is absolutely continuous w.r.t. ${{\mathbb {P}}}^{\mathrm {ch}}$. We will denote absolute continuity between measures with the symbol $\ll $, so that we have here

$$\begin{aligned} {{\mathbb {P}}}^\rho \ll {{\mathbb {P}}}^{\mathrm {ch}}, \end{aligned}$$

for all $\rho \in {\mathcal {D}}_k$. The Radon–Nikodym derivative will be made explicit in Proposition 2.2. To that end, we use a particular $(\mathcal {O}_n)$-adapted process. We define a sequence of matrix-valued random variables:

$$\begin{aligned} M_n:=\frac{W_n^*W_n}{{{\text {tr}}}\left( W_n^*W_n\right) } \quad \text{ if }\,\, {{\text {tr}}}\left( W_n^*W_n\right) \ne 0 \end{aligned}$$

and extend the definition arbitrarily whenever ${{\text {tr}}}(W_n^*W_n)=0$. The latter alternative appears with probability 0: indeed, ${{\mathbb {P}}}^{\mathrm {ch}}\big ({{\text {tr}}}(W_n^*W_n)=0\big )=0$ and then by the absolute continuity of ${{\mathbb {P}}}^\rho $ with respect to ${{\mathbb {P}}}^{\mathrm {ch}}$ we have ${{\mathbb {P}}}_\nu \big ({{\text {tr}}}(W_n^*W_n)=0\big )={{\mathbb {P}}}^{\rho _\nu }\big ({{\text {tr}}}(W_n^*W_n)=0\big )=0$ for any measure $\nu $. The key property of $M_n$, that we establish in the proof of Proposition 2.2, is that it is an $(\mathcal {O}_n)$-martingale with respect to ${{\mathbb {P}}}^{\mathrm {ch}}$.

From the existence of a polar decomposition for $W_n$, for each n, there exists a unitary matrix-valued random variable $U_n$ such that

$$\begin{aligned} W_n=U_n\sqrt{{{\text {tr}}}\left( W_n^*W_n\right) }M_n^{\frac{1}{2}}. \end{aligned}$$

(14)

This process $(U_n)$ can be chosen to be $(\mathcal {O}_n)$-adapted.
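One concrete way to obtain the factors in (14) for a given matrix is through a singular value decomposition. The sketch below is only one possible choice of the unitary factor (it is not unique when the matrix is singular), and the function name `polar_factors` is hypothetical.

```python
import numpy as np

def polar_factors(w):
    """Return (u, t, m_sqrt) with w = u * sqrt(t) * m_sqrt, where t = tr(w^* w),
    u is unitary and m_sqrt is the positive square root of w^* w / t."""
    a, s, bh = np.linalg.svd(w)                  # w = a @ diag(s) @ bh
    u = a @ bh                                   # unitary factor of the polar decomposition
    t = float(np.sum(s ** 2))                    # tr(w^* w) is the sum of squared singular values
    m_sqrt = bh.conj().T @ np.diag(s / np.sqrt(t)) @ bh
    return u, t, m_sqrt

rng = np.random.default_rng(5)
w = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
u, t, m_sqrt = polar_factors(w)

# The construction also applies to a singular (here rank one) matrix.
w_sing = np.array([[0.0, 1.0], [0.0, 0.0]])
u_s, t_s, m_sqrt_s = polar_factors(w_sing)
```

Here `m_sqrt @ m_sqrt` plays the role of $M_n$ and `u` that of $U_n$ for the single matrix `w`.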

The key technical results about $M_n$ needed for our proofs are summarized in the following proposition. Recall that any $\mathcal {O}$-measurable function f is extended to a $\mathcal {J}$-measurable function by setting $f(\hat{x},\omega )=f(\omega )$ for any $(\hat{x},\omega )\in {\mathrm P}({{\mathbb {C}}}^k)\times \Omega $.

Proposition 2.2

For any probability measure $\nu $ over $({{\mathrm P}({{\mathbb {C}}}^k)},{\mathcal {B}})$, $(M_n)$ converges $\mathbb {P}_\nu {\text {-}}\mathrm {a.s.}$ and in $L^1$-norm to an $\mathcal {O}$-measurable random variable $M_\infty $. The change of measure formula

$$\begin{aligned} \frac{\mathrm{d}{\mathbb {P}}^\rho }{\mathrm{d}{{\mathbb {P}}}^{\mathrm {ch}}}=k\,{{\text {tr}}}(\rho M_\infty ) \end{aligned}$$

(15)

holds true for all $\rho \in {\mathcal {D}}_k$.

Moreover, the measure $\mu $ verifies (Pur) if and only if $M_\infty $ is ${{\mathbb {P}}}_\nu {\text {-}}\mathrm {a.s.}$ a rank one projection for any probability measure $\nu $ over $({{\mathrm P}({{\mathbb {C}}}^k)},{\mathcal {B}})$.
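The statement can be explored by simulation. The sketch below samples the matrices under ${{\mathbb {P}}}^{\mathrm {ch}}$ for a hypothetical, randomly generated finitely supported $\mu $ (for which (Pur) is expected, but not proved here, to hold) and tracks the largest eigenvalue of $M_n$. The product $W_n$ is rescaled at every step, which leaves $M_n$ unchanged and avoids numerical underflow.

```python
import numpy as np

rng = np.random.default_rng(4)
k, m, n_steps = 3, 4, 300

# Hypothetical finitely supported mu satisfying (1).
g = rng.normal(size=(m * k, k)) + 1j * rng.normal(size=(m * k, k))
q, _ = np.linalg.qr(g)
ops = [q[i * k:(i + 1) * k, :] for i in range(m)]

w = np.eye(k, dtype=complex) / np.sqrt(k)        # rescaled W_0, so that tr(W^* W) = 1
for _ in range(n_steps):
    # Under P^ch, v is drawn with conditional probability tr(W^* v^* v W) / tr(W^* W);
    # np.linalg.norm of a matrix is the Frobenius norm, whose square is that trace.
    weights = np.array([np.linalg.norm(v @ w) ** 2 for v in ops])
    i = rng.choice(m, p=weights / weights.sum())
    w = ops[i] @ w
    w = w / np.linalg.norm(w)                    # rescaling does not change M_n

m_n = w.conj().T @ w                             # equals M_n since tr(W^* W) = 1
top_eigenvalue = np.linalg.eigvalsh(m_n)[-1]     # eigvalsh returns ascending eigenvalues
```

A value of `top_eigenvalue` close to 1 indicates that $M_n$ is numerically close to a rank one projection along the sampled sequence.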

Proof

We start the proof by showing that $M_n$ is a ${{\mathbb {P}}}^{\mathrm {ch}}$-martingale. Recall that for all $n\in {\mathbb {N}}$ and all $O_n\in \mathcal {O}_n$,

$$\begin{aligned} {{\mathbb {P}}}^{\mathrm {ch}}(O_n)=\frac{1}{k} \int _{O_n} {{\text {tr}}}\big (W_{n}^*(\omega )W_n(\omega )\big ) \, \mathrm{d}\mu ^{\otimes n}(\omega ). \end{aligned}$$

From the definition of $W_n$, Eq. (8),

$$\begin{aligned} M_{n+1} = \frac{W^*_{n}V^*_{{n+1}}V_{{n+1}} W_n}{{{\text {tr}}}\left( W^*_n W_n\right) }\, \frac{{{\text {tr}}}\big (W^*_n W_n\big )}{{{\text {tr}}}\big (W^*_{n+1} W_{n+1}\big )}. \end{aligned}$$

(16)

This implies that for an arbitrary $\mathcal {O}_n$-measurable random variable Y

$$\begin{aligned} {{\mathbb {E}}}^{\mathrm {ch}}\left( Y M_{n+1}\right)&= \frac{1}{k} \int _{\mathrm {M}_k({{\mathbb {C}}})^{n+1}} \frac{W^*_{n}V_{{n+1}}^*V_{{n+1}} W_n}{{{\text {tr}}}\left( W^*_n W_n\right) } \, Y \, {{\text {tr}}}\big (W^*_n W_n\big ) \, \mathrm{d}\mu ^{\otimes n+1} \\&=\frac{1}{k} \int _{\mathrm {M}_k({{\mathbb {C}}})^n} \frac{W^*_{n} W_n}{{{\text {tr}}}\big (W^*_n W_n\big )}\, Y \, {{\text {tr}}}\big (W^*_n W_n\big )\, \mathrm{d}\mu ^{\otimes n}\\&= {{\mathbb {E}}}^{\mathrm {ch}}(Y M_n), \end{aligned}$$

where the second equality follows from the stochasticity condition (1), $\int v^* v \mathrm{d} \mu (v) = \mathrm{Id}_{{{\mathbb {C}}}^k}$. This shows that $(M_n)$ is an $(\mathcal {O}_n)$-martingale w.r.t. ${{\mathbb {P}}}^{\mathrm {ch}}$. Since the sequence $(M_n)$ is composed of positive semidefinite matrices of trace one, its coordinates are a.s. uniformly bounded by 1. Therefore, the martingale property implies the $L^1$ and a.s. convergence of $(M_n)$ to an $\mathcal {O}$-measurable random variable $M_\infty $. Now note that for any $\rho \in {\mathcal {D}}_k$,

$$\begin{aligned} {{\text {tr}}}\left( W_n^* W_n\rho \right) ={{\text {tr}}}(M_n\rho )\,k\,{{\text {tr}}}\left( W_n^* W_n \,\frac{\mathrm{Id}_{{\mathbb {C}}^k}}{k}\right) . \end{aligned}$$

This way, the convergence of $(M_n)$ implies the change of measure formula.
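The martingale identity used above can be verified exactly for a finitely supported $\mu $, where the conditional expectation given the past is a finite sum. In this sketch the matrices, the chosen past $W_3$ and the helper name `m_of` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
k, m = 3, 4

# Hypothetical finitely supported mu satisfying (1).
g = rng.normal(size=(m * k, k)) + 1j * rng.normal(size=(m * k, k))
q, _ = np.linalg.qr(g)
ops = [q[i * k:(i + 1) * k, :] for i in range(m)]

def m_of(w):
    """The normalized matrix W^* W / tr(W^* W)."""
    a = w.conj().T @ w
    return a / np.trace(a).real

w = ops[2] @ ops[0] @ ops[1]                     # an arbitrary past W_n with n = 3
tr_w = np.trace(w.conj().T @ w).real

cond_exp = np.zeros((k, k), dtype=complex)       # will hold E^ch(M_{n+1} | O_n)
total_prob = 0.0
for v in ops:
    w_next = v @ w
    # Conditional P^ch-probability of the next matrix v given the past.
    p = np.trace(w_next.conj().T @ w_next).real / tr_w
    cond_exp += p * m_of(w_next)
    total_prob += p
```

The conditional probabilities sum to 1 and `cond_exp` coincides with `m_of(w)`, both as consequences of (1).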

We now prove the last part of the proposition. Using the martingale property one can see that for all $n\in {\mathbb {N}}$, and any fixed $p \in \mathbb {N}$,

$$\begin{aligned} V_n^p\ :=\ \sum _{k=0}^{p-1}{{\mathbb {E}}}^{\mathrm {ch}}\left( M_{k+n+1}^2 - M_k^2\right)&= \sum _{k=0}^n{{\mathbb {E}}}^{\mathrm {ch}}\left( M_{k+p}^2\right) -\sum _{k=0}^n{{\mathbb {E}}}^{\mathrm {ch}}\left( M_k^2\right) \\&= \sum _{k=0}^n{{\mathbb {E}}}^{\mathrm {ch}}\left( \left( M_{k+p}-M_k\right) ^2\right) \\&= {{\mathbb {E}}}^{\mathrm {ch}}\left( \sum _{k=0}^n{{\mathbb {E}}}^{\mathrm {ch}}\left( \left( M_{k+p}-M_k\right) ^2\vert {\mathcal {O}}_k\right) \right) . \end{aligned}$$

(17)

Since $(M_n)$ is bounded and almost surely convergent, applying Lebesgue’s dominated convergence theorem to each ${{\mathbb {E}}}^{\mathrm {ch}}(M_{k+n+1}^2)$, $k=0,\ldots ,p-1$ as $n\rightarrow \infty $ implies that the term $V_n^p$ is convergent when n goes to infinity. Then, using the monotone convergence theorem in the last line of (17), we get that

$$\begin{aligned} \lim _{n\rightarrow \infty }V_n^p={{\mathbb {E}}}^{\mathrm {ch}}\left( \sum _{k=0}^\infty {{\mathbb {E}}}^{\mathrm {ch}}\left( \left( M_{k+p}-M_k\right) ^2 \vert {\mathcal {O}}_k\right) \right) . \end{aligned}$$

It implies that the series $\sum _{k=0}^\infty {{\mathbb {E}}}^{\mathrm {ch}}\big ((M_{k+p}-M_k)^2\vert {\mathcal {O}}_k\big )$ is almost surely finite. This yields that

$$\begin{aligned} \lim _{n\rightarrow \infty }{{\mathbb {E}}}^{\mathrm {ch}}\left( \left( M_{n+p}-M_n\right) ^2\vert {\mathcal {O}}_n\right) =0,\quad {{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}\end{aligned}$$

Since all the norms are equivalent in finite dimension, Jensen’s inequality implies

$$\begin{aligned} \lim _{n\rightarrow \infty }{{\mathbb {E}}}^{\mathrm {ch}}\left( \left\| M_{n+p}-M_n\right\| |\mathcal {O}_n\right) =0,\quad {{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}\end{aligned}$$

(18)

At this stage we use the polar decomposition of $(W_n)$, Eq. (14), to write

$$\begin{aligned} M_{n+p} = \frac{M_n^{\frac{1}{2}}U_n^*V_{n+1}^*\ldots V_{n+p}^*V_{n+p}\ldots V_{n+1}U_nM_n^{\frac{1}{2}}}{{{\text {tr}}}\left( M_n^{\frac{1}{2}}U_n^*V_{n+1}^*\ldots V_{n+p}^*V_{n+p}\ldots V_{n+1}U_nM_n^{\frac{1}{2}}\right) }. \end{aligned}$$

Then we get an expression for the conditional expectation, see the first part of the proof,

$$\begin{aligned} {{\mathbb {E}}}^{\mathrm {ch}}\big (\left\| M_{n+p}-M_n\right\| \vert {\mathcal {O}}_n\big )=&\int \left\| \frac{M_n^{\frac{1}{2}}U_n^*v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_nM_n^{\frac{1}{2}}}{{{\text {tr}}}\left( M_n^{\frac{1}{2}}U_n^*v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_nM_n^{\frac{1}{2}}\right) }-M_n\right\| \\&\qquad {{\text {tr}}}\left( v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_nM_nU_n^*\right) \,\mathrm{d}\mu ^{\otimes p}(v_1,\ldots ,v_p)\\ =&\int \left\| M_n^{\frac{1}{2}}U_n^*v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_nM_n^{\frac{1}{2}}\right. \\&\left. \qquad - M_n{{\text {tr}}}\left( v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_nM_nU_n^*\right) \right\| \\&\qquad \mathrm{d}\mu ^{\otimes p}(v_1,\ldots ,v_p). \end{aligned}$$

We used non-negativity of ${{\text {tr}}}(v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_nM_nU_n^*)$ to get the second equality. The above equation holds for ${{\mathbb {P}}}^{\mathrm {ch}}$-almost all realizations $\big (U_n(\omega )\big )$ of $(U_n)$. Since the group of unitary matrices is compact, for any fixed $\omega $ there exists a subsequence along which $\big (U_n(\omega )\big )$ converges to a unitary matrix $U_\infty (\omega )$. Taking the limit along this subsequence in the above expression yields (we drop $\omega $ for notational simplicity):

$$\begin{aligned}&\int _{\mathrm {M}_k({{\mathbb {C}}})^p}\left\| M_\infty ^{\frac{1}{2}}U_\infty ^*v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_\infty M_\infty ^{\frac{1}{2}} - M_\infty {{\text {tr}}}\left( v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_\infty M_\infty U_\infty ^*\right) \right\| \qquad \\&\qquad \qquad \mathrm{d}\mu ^{\otimes p}(v_1,\ldots ,v_p)=0. \end{aligned}$$

This implies that

$$\begin{aligned} M_\infty ^{\frac{1}{2}}U_\infty ^*v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_\infty M_\infty ^{\frac{1}{2}} = {{\text {tr}}}\left( v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_\infty M_\infty U_\infty ^*\right) M_\infty , \end{aligned}$$

for $\mu ^{\otimes p}$-almost all $(v_1,\ldots ,v_p)$.

Denoting by $\pi _\infty $ the orthogonal projector onto the range of $M_\infty $, the above condition is equivalent to $\pi _\infty U_\infty ^*v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_\infty \pi _\infty =\lambda \pi _\infty $ with $\lambda ={{\text {tr}}}(v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_\infty M_\infty U_\infty ^*)$. Finally, it follows that

$$\begin{aligned} U_\infty \pi _\infty U_\infty ^*v_{1}^*\ldots v_{p}^*v_{p}\ldots v_{1}U_\infty \pi _\infty U_\infty ^*\propto U_\infty \pi _\infty U_\infty ^*, \end{aligned}$$

for $\mu ^{\otimes p}$-almost all $(v_1,\ldots ,v_p)$. Since $U_\infty \pi _\infty U_\infty ^*$ is an orthogonal projector, the condition (Pur) implies (reintroducing $\omega $) that ${\text {rank}}(M_\infty (\omega ))={\text {rank}}(U_\infty (\omega ) \pi _\infty (\omega ) U_\infty ^*(\omega ))=1$. Since $M_\infty (\omega )$ is a trace one, positive semidefinite matrix this means that $M_\infty (\omega )$ is a rank one projector. Since this conclusion holds true for ${{\mathbb {P}}}^{\mathrm {ch}}$-almost all $\omega $ this establishes that the condition (Pur) implies that $M_\infty $ is ${{\mathbb {P}}}^{\mathrm {ch}}$-almost surely a rank 1 projection.

For the converse implication, assume that $M_\infty $ is ${{\mathbb {P}}}^{\mathrm {ch}}$-almost surely a rank one projection but that (Pur) does not hold. Then there exists $\pi $, a rank two orthogonal projector, such that for all $n\in {{\mathbb {N}}}$,

$$\begin{aligned} \pi W_n^*W_n\pi \propto \pi , \end{aligned}$$

$\mu ^{\otimes n}$-almost everywhere. Since $\mu ^{\otimes n}$-almost everywhere $M_n\propto W_n^*W_n$, we get that

$$\begin{aligned} \pi M_n \pi \propto \pi , \end{aligned}$$

$\mu ^{\otimes n}$-almost everywhere. Thus, $\pi M_\infty \pi \propto \pi $ and, under our assumption that ${\text {rank}}M_\infty =1\;{{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}$ and ${\text {rank}}\pi =2$, this implies that $\pi M_\infty \pi =0$, ${{\mathbb {P}}}^{\mathrm {ch}}$-almost surely. On the other hand, since $M_n$ has trace one, for all $n\in {\mathbb {N}}$ we have ${{\mathbb {E}}}^{\mathrm {ch}}(M_n)=\frac{1}{k}\mathrm{Id}_{{{\mathbb {C}}}^k}$, and the $L^1$ convergence implies that ${{\mathbb {E}}}^{\mathrm {ch}}(M_\infty )=\frac{1}{k}\mathrm{Id}_{{{\mathbb {C}}}^k}$. Then ${{\mathbb {E}}}^{\mathrm {ch}}(\pi M_\infty \pi )=\frac{1}{k}\pi \ne 0$, which contradicts $\pi M_\infty \pi =0\;{{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}$ $\square $

By the polar decomposition, the rank of $W_n$ is equal to the rank of $M_n$ and the proposition thus implies that $W_n \rho _0 W_n^*/{{\text {tr}}}(W_n \rho _0 W_n^*)$ approaches the set of pure states for any $\rho _0\in {\mathcal {D}}_k$ if and only if (Pur) holds. This is the result of Maassen and Kümmerer [17] mentioned in the introduction. Though $M_n$ is not used in [17], the proof relies on similar ideas.
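The purification statement can be observed numerically on a toy model. The sketch below is an illustration only and is not taken from the article: it assumes $k=2$, takes $\mu$ to be the counting measure on two hypothetical matrices `V1`, `V2` chosen so that $V_1^*V_1+V_2^*V_2=\mathrm{Id}$, and samples the outcomes under ${{\mathbb {P}}}^{\mathrm {ch}}$. Since $M_n$ has trace one, $\det M_n$ is the product of its two eigenvalues, and it tends to zero exactly when $M_n$ approaches a rank one projector.

```python
import math
import random

def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dag(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return (a[0][0] + a[1][1]).real

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Hypothetical toy model (an assumption of this sketch): mu is the counting
# measure on {V1, V2}, with V1* V1 + V2* V2 = Id.
A, B = 0.9, 0.5
V1 = [[A, 0.0], [0.0, B]]
V2 = [[0.0, math.sqrt(1 - B * B)], [math.sqrt(1 - A * A), 0.0]]
KRAUS = [V1, V2]

def det_M_along_trajectory(n_steps, seed=0):
    """Sample V_1, V_2, ... under P^ch and return det(M_n) for n = 1..n_steps."""
    rng = random.Random(seed)
    # W is kept rescaled so that tr(W* W) = 1; M_n does not depend on the scale.
    w = [[1 / math.sqrt(2), 0.0], [0.0, 1 / math.sqrt(2)]]
    dets = []
    for _ in range(n_steps):
        candidates = [mul(v, w) for v in KRAUS]
        # Conditional probability of each outcome given the past outcomes.
        probs = [trace(mul(dag(c), c)) for c in candidates]
        i = 0 if rng.random() < probs[0] else 1
        scale = math.sqrt(probs[i])
        w = [[x / scale for x in row] for row in candidates[i]]
        # With tr(W* W) = 1 we have det(M_n) = |det W|^2.
        dets.append(abs(det(w)) ** 2)
    return dets

dets = det_M_along_trajectory(200)
```

In this model `V1` is not proportional to a unitary matrix, so for $k=2$ the condition (Pur) holds and `dets` is expected to decay to zero along almost every sampled trajectory.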

We are now in the position to show that the Markov chain $(\hat{x}_n)$ is asymptotically an $\mathcal {O}$-measurable process. This is expressed in the following lemma. Whenever (Pur) holds, we denote by $\hat{z} \in {{\mathrm P}({{\mathbb {C}}}^k)}$ the $\mathcal {O}$-measurable random variable defined by

$$\begin{aligned} M_\infty = \pi _{\hat{z}}. \end{aligned}$$

Recall that $d(\cdot ,\cdot )$, defined by Eq. (7), is our metric on ${{\mathrm P}({{\mathbb {C}}}^k)}$.

Lemma 2.3

Assume (Pur) holds. Then for any probability measure $\nu $ on $({{\mathrm P}({{\mathbb {C}}}^k)},{\mathcal {B}})$,

$$\begin{aligned} \lim _{n\rightarrow \infty } d\left( \hat{x}_n,U_n \cdot \hat{z}\right) =0\quad \mathbb {P}_\nu {\text {-}}\mathrm {a.s.}\end{aligned}$$

Proof

We start the proof by showing that for any $\nu $

$$\begin{aligned} \lim _{n\rightarrow \infty } M_n^{\frac{1}{2}} \cdot \hat{x}=\hat{z}\quad {\mathbb {P}}_\nu {\text {-}}\mathrm {a.s.}\end{aligned}$$

(19)

Let $\hat{x}$ be fixed and recall from Proposition 2.2 that (Pur) implies $M_\infty =\pi _{\hat{z}}$. Since $M_\infty x=\langle z,x\rangle z$, in order to show (19), it is enough to show that $\hat{x}$ is ${{\mathbb {P}}}_\nu $-almost surely not orthogonal to $\hat{z}$. From Eq. (13) and the change of measure formula in Proposition 2.2,

$$\begin{aligned} \mathrm{d}{{\mathbb {P}}}_{\nu }\left( \hat{x}, \omega \right) = k\,{{\text {tr}}}\left( \pi _{\hat{x}} \pi _{\hat{z}(\omega )}\right) \, \mathrm{d}\big (\nu (\hat{x})\otimes {{\mathbb {P}}}^{\mathrm {ch}}(\omega )\big ). \end{aligned}$$

Hence the event $\{{{\text {tr}}}(\pi _{\hat{x}}\pi _{\hat{z}})=|\langle z,x\rangle |^2=0\}$ has ${{\mathbb {P}}}_{\nu }$-measure 0. This proves the required claim, and (19) follows from the almost sure convergence of $M_n$ to $\pi _{\hat{z}}$.

Now using the polar decomposition, Eq. (14), and the fact that proportionality of vectors amounts to equality of their equivalence classes in ${\mathrm P}({\mathbb {C}}^k)$, we have

$$\begin{aligned} \hat{x}_n = U_n M_n^{\frac{1}{2}}\cdot \hat{x}_0. \end{aligned}$$

The first part of the proof then yields

$$\begin{aligned} \lim _{n\rightarrow \infty } d\left( \hat{x}_n,U_n \cdot \hat{z}\right) =0,\quad \mathbb {P}_\nu {\text {-}}\mathrm {a.s.}\end{aligned}$$

$\square $

The uniqueness of the invariant measure, which is the first part of Theorem 1.1, follows as a corollary.

Corollary 2.4

Assume (Pur) and ($\phi $-Erg). Then the Markov kernel $\Pi $ admits a unique invariant probability measure.

Proof

For an invariant measure $\nu $, the random variable $\hat{x}_n$ is $\nu $-distributed for all $n \in \mathbb {N}$. In particular, $ \mathbb {E}_\nu \big (f(\hat{x}_n)\big )$ is constant for any continuous function f. On the other hand Lemma 2.3 and Lebesgue’s dominated convergence theorem imply that

$$\begin{aligned} \lim _{n\rightarrow \infty } {\mathbb {E}}_{\nu }\big (f\left( \hat{x}_n\right) -f\left( U_n\cdot \hat{z}\right) \big ) =0. \end{aligned}$$

Hence

$$\begin{aligned} \lim _{n\rightarrow \infty } {\mathbb {E}}_{\nu }\big ( f\left( U_n\cdot \hat{z}\right) \big ) = \mathbb {E}_{\nu } \big (f\left( \hat{x}_0\right) \big ). \end{aligned}$$

(20)

Assume now that there exist two invariant measures $\nu _a$ and $\nu _b$. Since $U_n\cdot \hat{z}$ is $\mathcal {O}$-measurable, Proposition 2.1 implies

$$\begin{aligned} {\mathbb {E}}_{\nu _a}\big (f\left( U_n\cdot \hat{z}\right) \big )={\mathbb {E}}_{\nu _b} \big (f\left( U_n\cdot \hat{z}\right) \big ). \end{aligned}$$

Then Eq. (20) applied with $\nu =\nu _a$, resp. $\nu = \nu _b$ gives

$$\begin{aligned} {\mathbb {E}}_{\nu _a}\big (f\left( \hat{x}_0\right) \big )={\mathbb {E}}_{\nu _b}\big (f\left( \hat{x}_0\right) \big ) \end{aligned}$$

which means that $\nu _a=\nu _b$ and the uniqueness is proved. $\square $

Assuming only (Pur) we can actually completely characterize the set of invariant measures.

Proposition 2.5

Assuming (Pur) there exists a set $\{F_j\}_{j=1}^d$ of mutually orthogonal subspaces of ${{\mathbb {C}}}^k$ such that for each $j\in \{1,\ldots ,d\}$ there exists a unique $\Pi $-invariant probability measure $\nu _j$ supported on ${\mathrm P}(F_j)$, and the set of $\Pi $-invariant probability measures is the convex hull of $\{\nu _j\}_{j=1}^d$.

The subspaces $F_j$ are the ranges of the extremal fixed points of $\phi $ in ${\mathcal {D}}_k$. This is shown in the proof of Proposition 2.5, which we give in “Appendix B”.

Remark 2.6

Assuming ($\phi $-Erg) only, the chain might or might not have a unique invariant probability measure. Indeed, if ${\text {supp}}\mu \subset \mathrm {SU}(k)$, Assumption (Pur) trivially fails and, as proved in “Appendix C”, the uniqueness of the invariant measure depends on the smallest closed subgroup of $\mathrm {SU}(k)$ containing ${\text {supp}}\mu $. To illustrate this point, in the same appendix, we study two examples with $\mu $ supported on and giving equiprobability to two elements of $\mathrm {SU}(2)$ such that ($\phi $-Erg) holds. In the first example $\Pi $ has a unique invariant probability measure whereas in the second example $\Pi $ has uncountably many mutually singular invariant probability measures.

3 Convergence

We now turn to the proof of the second part of Theorem 1.1, namely the geometric convergence in Wasserstein distance of the process $(\hat{x}_n)$ towards the invariant measure. We first recall a definition of this distance for compact metric spaces: for X a compact metric space equipped with its Borel $\sigma $-algebra, the Wasserstein distance of order 1 between two probability measures $\sigma $ and $\tau $ on X can be defined using the Kantorovich–Rubinstein duality theorem as

$$\begin{aligned} W_1(\sigma ,\tau )=\sup _{f\in \mathrm{Lip}_1(X)}\left| \int _{X} f\,\mathrm{d}\sigma - \int _X f\, \mathrm{d}\tau \right| , \end{aligned}$$

where $\mathrm{Lip}_1(X)=\{f:X\rightarrow {\mathbb {R}} \ \mathrm {s.t.}\ \vert f(x)-f(y)\vert \le d(x,y)\ \text {for all}\ x,y\in X\}$ is the set of Lipschitz continuous functions with constant one, and $d$ is the metric on X.
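As a sanity check of this definition (a worked example added for illustration), take two Dirac masses $\sigma =\delta _x$ and $\tau =\delta _y$ with $x,y\in X$. Every $f\in \mathrm{Lip}_1(X)$ satisfies $\vert f(x)-f(y)\vert \le d(x,y)$, and the function $f=d(\cdot ,y)$, which is 1-Lipschitz by the triangle inequality, attains this bound. Hence

$$\begin{aligned} W_1(\delta _x,\delta _y)=\sup _{f\in \mathrm{Lip}_1(X)}\left| f(x)-f(y)\right| =d(x,y). \end{aligned}$$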

The proof of Eq. (6) consists of three parts. In the first part we show a geometric convergence in total variation of ${{\mathbb {P}}}^\rho $ to ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}$ under the shift $\theta (v_1,v_2,\ldots )=(v_2,v_3,\ldots )$. In the second one we show a geometric convergence of the chain $(\hat{x}_n)$ towards an $\mathcal {O}$-measurable process $(\hat{y}_n)$. Finally, we combine these results to prove Eq. (6).

3.1 Convergence for $\mathcal {O}$-measurable random variables

Let us first discuss the origin of the integer m in Eq. (6). Let E be a subspace of ${{\mathbb {C}}}^k$ such that $vE\subset E$ for any $v\in {\text {supp}}\mu $. Let $(E_1,\ldots ,E_\ell )$ be an orthogonal partition of E, i.e. $E=E_1\oplus \cdots \oplus E_\ell $. We say that $(E_1,\ldots ,E_\ell )$ is an $\ell $-cycle of $\phi $ if $vE_j\subset E_{j+1}$ for $\mu $-a.e. v (with the convention $E_{\ell +1}=E_1$).^{Footnote 2} The set of $\ell \in {{\mathbb {N}}}$ for which there exists an $\ell $-cycle is non-empty (as it contains 1) and bounded (as necessarily $\ell \le k$).

Definition 3.1

The largest $\ell \in {{\mathbb {N}}}$ such that there exists an $\ell $-cycle of $\phi $ is called the period of $\phi $. We denote this period by m.
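To illustrate the definition, here is a simple example with period two. It is added for illustration and does not appear in the original text; the matrices $v_1,v_2$ and the parameter $p\in (0,1)$ are hypothetical. Let $k=2$ and let $\mu $ be the counting measure on $\{v_1,v_2\}$ with

$$\begin{aligned} v_1=\sqrt{p}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix},\qquad v_2=\sqrt{1-p}\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix}. \end{aligned}$$

Both matrices are multiples of unitaries, so $v_1^*v_1+v_2^*v_2=p\,\mathrm{Id}+(1-p)\,\mathrm{Id}=\mathrm{Id}$. With $E={{\mathbb {C}}}^2$, $E_1={{\mathbb {C}}}e_1$ and $E_2={{\mathbb {C}}}e_2$, each $v_i$ maps $E_1$ into $E_2$ and $E_2$ into $E_1$, so $(E_1,E_2)$ is a 2-cycle. Since $\ell \le k=2$ for any $\ell $-cycle, the period in this example is $m=2$.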

Remark 3.2

The above definition for the period of $\phi $ is similar to that of the period of a $\varphi $-irreducible Markov chain. It is obvious that if $(E_1,\ldots ,E_\ell )$ is an $\ell $-cycle of $\phi $ then it is also an $\ell $-cycle of $\Pi $. However, the Markov chain defined by $\Pi $ is not $\varphi $-irreducible in general. Hence the results of [19] on the period of $\varphi $-irreducible Markov chains do not apply and the characterization of the period of $\Pi $ remains an open problem.
The above definition shows that the union $\bigcup _{j=1}^m E_j$ is invariant under $\mu $-a.e. v. Hence, the strong irreducibility assumption discussed at the end of the introduction implies that $m=1$.

The following result is a reformulation of the Perron–Frobenius theorem of Evans and Høegh-Krohn (see [7]). The original formulation in [7] makes the additional assumption that $E={{\mathbb {C}}}^k$ in ($\phi $-Erg). For the present extension see e.g. [22]. In the following statement, and in the rest of the article, for X an operator on ${{\mathbb {C}}}^k$ we denote $\Vert X\Vert _1={{\text {tr}}}|X|$ (all statements are identical with a different norm, but this choice will spare us a few irrelevant constants).

Theorem 3.3

Assume that ($\phi $-Erg) holds. Then there exists a unique $\phi $-invariant element $\rho _{\mathrm {inv}}$ of ${\mathcal {D}}_k$ with range equal to the minimal invariant subspace E. In addition, there exist two positive constants c and $\lambda <1$ such that, with m defined in Definition 3.1, for any $\rho \in {\mathcal {D}}_k$ and for any $n\in {{\mathbb {N}}}$,

$$\begin{aligned} \left\| \frac{1}{m}\sum _{r=0}^{m-1} \phi ^{mn+r}(\rho )-\rho _{\mathrm {inv}}\right\| _1\le c\lambda ^n. \end{aligned}$$

(21)

Proof

Theorem 4.2 in [7] implies that $\rho _{\mathrm {inv}}$ is the unique $\phi $-invariant element of ${\mathcal {D}}_k$, that the eigenvalues of $\phi $ of modulus one are exactly the m-th roots of unity, and that they are all simple. The statement follows, with $\lambda $ any quantity strictly larger than the modulus of the largest non-peripheral eigenvalue. $\square $
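The geometric convergence in Eq. (21) can be observed on a small example. The following sketch is not part of the original argument: it assumes $k=2$, a hypothetical two-point measure $\mu $ (the counting measure on `V1`, `V2`), and uses the form $\phi (\rho )=\sum _v v\rho v^*$ that appears in the proof of Proposition 3.4. In this model the period is $m=1$, so no averaging over $r$ is needed, and $\rho _{\mathrm {inv}}$ is approximated by iterating $\phi $.

```python
import math

def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dag(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

# Hypothetical toy model: counting measure on {V1, V2}, V1* V1 + V2* V2 = Id.
A, B = 0.9, 0.5
V1 = [[A, 0.0], [0.0, B]]
V2 = [[0.0, math.sqrt(1 - B * B)], [math.sqrt(1 - A * A), 0.0]]
KRAUS = [V1, V2]

def phi(rho):
    """Apply rho -> sum_v v rho v* for the two-point measure above."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for v in KRAUS:
        term = mul(mul(v, rho), dag(v))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

def trace_norm_diff(r1, r2):
    """||r1 - r2||_1 for two 2x2 density matrices.

    The difference is Hermitian and traceless, so its eigenvalues are +s, -s
    with s = sqrt(d00^2 + |d01|^2), and the trace norm equals 2 s.
    """
    d00 = r1[0][0] - r2[0][0]
    d01 = r1[0][1] - r2[0][1]
    return 2 * math.sqrt(abs(d00) ** 2 + abs(d01) ** 2)

# Approximate rho_inv by iterating phi from the maximally mixed state.
rho_inv = [[0.5, 0.0], [0.0, 0.5]]
for _ in range(2000):
    rho_inv = phi(rho_inv)

# distances[n] = || phi^n(rho) - rho_inv ||_1 for a pure initial state rho.
rho = [[0.5, 0.5], [0.5, 0.5]]
distances = []
for n in range(60):
    distances.append(trace_norm_diff(rho, rho_inv))
    rho = phi(rho)
```

For these particular matrices the successive ratios `distances[n + 1] / distances[n]` stay below a constant smaller than one, which is the behaviour described by Eq. (21) with $m=1$.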

Recall that $\theta $ is the left shift operator on $\Omega $, i.e.

$$\begin{aligned} \theta (v_1,v_2,\ldots )=(v_2,v_3,\ldots ). \end{aligned}$$

The main result of this section is the following proposition. As announced it concerns the speed of convergence in total variation (expressed in terms of expectation values).

Proposition 3.4

Assume ($\phi $-Erg) holds. Then there exist two positive constants C and $\lambda <1$ such that for any $\mathcal {O}$-measurable function f with essential bound $\Vert f\Vert _\infty $, any $\rho \in {\mathcal {D}}_k$ and all $n\in {{\mathbb {N}}}$,

$$\begin{aligned} \left| {{\mathbb {E}}}^\rho \left( \frac{1}{m}\sum _{r=0}^{m-1}f\circ \theta ^{mn+r}\right) -{{\mathbb {E}}}^{\rho _{\mathrm {inv}}}(f)\right| \le C\Vert f\Vert _\infty \lambda ^n. \end{aligned}$$

(22)

Proof

We claim that for any bounded $\mathcal {O}$-measurable function f,

$$\begin{aligned} {{\mathbb {E}}}^\rho (f \circ \theta ) = {{\mathbb {E}}}^{\phi (\rho )}(f). \end{aligned}$$

(23)

It suffices to prove this relation for all $\mathcal {O}_l$-measurable functions for any integer l. Thus, let l be an integer and f an $\mathcal {O}_l$-measurable function. Then,

$$\begin{aligned} {{\mathbb {E}}}^\rho (f\circ \theta )=&\int _{\mathrm {M}_k({{\mathbb {C}}})^{l+1}} f(v_{2},\ldots ,v_{l+1}){{\text {tr}}}\left( v_{l+1}\ldots v_{1}\rho v_1^*\ldots v_{l+1}^*\right) \, \mathrm{d}\mu ^{\otimes (l+1)}(v_1,\ldots ,v_{l+1})\\ =&\int _{\mathrm {M}_k({{\mathbb {C}}})^{l}} f(v_{1},\ldots ,v_{l}){{\text {tr}}}\left( v_{l}\ldots v_{1}\phi (\rho )v_{1}^*\ldots v_{l}^*\right) \, \mathrm{d}\mu ^{\otimes l}(v_{1},\ldots ,v_{l}), \end{aligned}$$

which is equal to ${{\mathbb {E}}}^{\phi (\rho )}(f)$, as claimed.

Applying Eq. (23) multiple times and using the change of measure of Proposition 2.2 we obtain

$$\begin{aligned} {{\mathbb {E}}}^\rho \left( \frac{1}{m}\sum _{r=0}^{m-1} f\circ \theta ^{mn+r}\right)&= \frac{1}{m} \sum _{r=0}^{m-1} {{\mathbb {E}}}^{\phi ^{nm+r} (\rho )}(f) \\&= k \frac{1}{m} \sum _{r=0}^{m-1} {{\mathbb {E}}}^{\mathrm {ch}}\Big (f\, {{\text {tr}}}\big ( M_\infty \phi ^{nm+r} (\rho )\big )\Big ), \end{aligned}$$

for any $\mathcal {O}$-measurable function f. Using $|{{\text {tr}}}(M_\infty A)|\le \Vert A\Vert _1$ for $A=A^*$ (remark that $M_\infty \in {\mathcal {D}}_k$ by construction) we then obtain

$$\begin{aligned} \left| {{\mathbb {E}}}^\rho \left( \frac{1}{m}\sum _{r=0}^{m-1} f\circ \theta ^{mn+r}\right) -{{\mathbb {E}}}^{\rho _{\mathrm {inv}}}(f)\right| \le \Vert f\Vert _\infty \, k\left\| \frac{1}{m}\sum _{r=0}^{m-1}\phi ^{mn+r}(\rho )-\rho _{\mathrm {inv}}\right\| _1 \end{aligned}$$

and Theorem 3.3 yields the proposition with $C=ck$. $\square $
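The shift identity of Eq. (23) can be checked directly when $\mu $ has finite support. The sketch below is an illustration under assumptions that are not in the article: $k=2$, $\mu $ is the counting measure on two hypothetical matrices `V1`, `V2`, and $f$ depends on the first outcome only ($l=1$), so that it is encoded by the list `f_vals`.

```python
import math

def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dag(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return (a[0][0] + a[1][1]).real

# Hypothetical toy model: counting measure on {V1, V2}, V1* V1 + V2* V2 = Id.
A, B = 0.9, 0.5
V1 = [[A, 0.0], [0.0, B]]
V2 = [[0.0, math.sqrt(1 - B * B)], [math.sqrt(1 - A * A), 0.0]]
KRAUS = [V1, V2]

def phi(rho):
    """rho -> sum_v v rho v*."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for v in KRAUS:
        term = mul(mul(v, rho), dag(v))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

def expect(rho, f_vals):
    """E^rho(f) when f is a function of the first outcome only."""
    return sum(f_vals[j] * trace(mul(mul(v, rho), dag(v)))
               for j, v in enumerate(KRAUS))

def expect_shifted(rho, f_vals):
    """E^rho(f o theta): after the shift, f reads the second outcome."""
    total = 0.0
    for v1 in KRAUS:
        for j, v2 in enumerate(KRAUS):
            w = mul(v2, v1)
            total += f_vals[j] * trace(mul(mul(w, rho), dag(w)))
    return total

rho = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]  # an arbitrary density matrix
f_vals = [1.0, -2.0]                          # arbitrary values f(V1), f(V2)
lhs = expect_shifted(rho, f_vals)
rhs = expect(phi(rho), f_vals)
```

Up to rounding, `lhs` and `rhs` agree, which is Eq. (23) in this finite setting.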

3.2 Convergence to an ${\mathcal {O}}$-measurable process

Let us introduce two relevant processes: for all $n\in {{\mathbb {N}}}$, let

$$\begin{aligned} \hat{z}_{n}(\omega )=\mathop {\mathrm {argmax}}_{\hat{x}\in {{\mathrm P}({{\mathbb {C}}}^k)}}\,\Vert W_n x\Vert ^2 \end{aligned}$$

(24)

and

$$\begin{aligned} \hat{y}_n=W_n\cdot \hat{z}_n. \end{aligned}$$

(25)

Both random variables $\hat{y}_n$ and $\hat{z}_n$ are $\mathcal {O}_n$-measurable.

The random variable $\hat{z}_n$ corresponds to the maximum likelihood estimator of $\hat{x}_0$. Note that the ${\text {argmax}}$ may not be uniquely defined. We can, however, define it in an $\mathcal {O}_n$-measurable way. The following results will not be affected by such a consideration, and we will not discuss such questions in the sequel. It follows from the definition of $\hat{z}_n$ that

$$\begin{aligned} \left( W_n^*W_n\right) ^{\frac{1}{2}}\, z_n=\Vert W_n\Vert z_n, \quad {{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}\end{aligned}$$

(26)

We recall that $z_n$ is a vector representative of the class $\hat{z}_n$.
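For $k=2$ the maximiser in Eq. (24) can be written in closed form, since it is an eigenvector of $W_n^*W_n$ for its largest eigenvalue. The sketch below is illustrative only: the matrix `W` is an arbitrary stand-in for $W_n$ and the function names are hypothetical. It checks the consequence $\Vert W z\Vert =\Vert W\Vert $ of Eq. (26).

```python
import math
import random

def top_right_singular_vector(w):
    """Return (z, lam): a unit eigenvector z of W* W for its largest eigenvalue lam."""
    # H = W* W = [[p, q], [conj(q), r]]
    p = abs(w[0][0]) ** 2 + abs(w[1][0]) ** 2
    r = abs(w[0][1]) ** 2 + abs(w[1][1]) ** 2
    q = w[0][0].conjugate() * w[0][1] + w[1][0].conjugate() * w[1][1]
    lam = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + abs(q) ** 2)
    z = [q, lam - p]                     # solves (H - lam) z = 0 when nonzero
    if abs(z[0]) + abs(z[1]) < 1e-14:
        z = [lam - r, q.conjugate()]     # alternative solution of the same system
    if abs(z[0]) + abs(z[1]) < 1e-14:
        z = [1.0, 0.0]                   # H is a multiple of Id: any vector works
    norm = math.sqrt(abs(z[0]) ** 2 + abs(z[1]) ** 2)
    return [z[0] / norm, z[1] / norm], lam

def apply(w, x):
    return [w[0][0] * x[0] + w[0][1] * x[1], w[1][0] * x[0] + w[1][1] * x[1]]

def norm_sq(x):
    return abs(x[0]) ** 2 + abs(x[1]) ** 2

W = [[1.0, 2.0j], [0.5, 1.0]]            # arbitrary example matrix
z, lam = top_right_singular_vector(W)
max_value = norm_sq(apply(W, z))         # equals lam = ||W||^2 up to rounding

# No other unit vector does better.
rng = random.Random(0)
samples = []
for _ in range(1000):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    n = math.sqrt(norm_sq(x))
    samples.append(norm_sq(apply(W, [x[0] / n, x[1] / n])))
```

When the largest eigenvalue is degenerate the maximiser is not unique, which is the situation alluded to above; the code then simply returns one admissible choice.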

Concerning $\hat{y}_n$, it can be seen as an estimator of $\hat{x}_n$ given the maximum likelihood estimation of $\hat{x}_0$. The following proposition establishes the consistency of this estimator: we show that the distance between $(\hat{x}_n)$ and $(\hat{y}_n)$ contracts geometrically in the mean. In fact we prove the slightly more general statement that the estimator based on the first n outcomes can be replaced by an estimator based on the outcomes between l and $l+n$. We will prove the almost-sure contraction in Proposition 4.4.

Proposition 3.5

Assume (Pur) holds. Then there exist two positive constants C and $\lambda <1$ such that for any probability measure $\nu $ over $({{\mathrm P}({{\mathbb {C}}}^k)},{\mathcal {B}})$,

$$\begin{aligned} {{\mathbb {E}}}_{\nu }\left( d\left( \hat{x}_{n+l}, \hat{y}_n\circ \theta ^l\right) \right) \le C\lambda ^n, \end{aligned}$$

(27)

holds for all non-negative integers l and n.

In order to prove Proposition 3.5 we study the largest two singular values of $W_n$. As is customary in the study of products of random matrices, we make use of exterior products. We recall briefly the relevant definitions: for $p\in {{\mathbb {N}}}$ and p vectors $x_1, \ldots , x_p$ in ${{\mathbb {C}}}^k$ we denote by $x_1\wedge \cdots \wedge x_p$ the alternating multilinear form $(y_1,\ldots , y_p)\mapsto \det \big (\langle x_i, y_j\rangle \big )_{i,j=1}^p$. Then, the set of all $x_1\wedge \cdots \wedge x_p$ is a generating family for the set $\wedge ^p{{\mathbb {C}}}^k$ of alternating p-linear forms on ${{\mathbb {C}}}^k$, and we can define a hermitian inner product by

$$\begin{aligned} \left\langle x_1\wedge \cdots \wedge x_p, y_1\wedge \cdots \wedge y_p\right\rangle = \det \left( \langle x_i, y_j\rangle \right) _{i,j=1}^p, \end{aligned}$$

and denote by $\Vert x_1\wedge \cdots \wedge x_p\Vert $ the associated norm. It is immediate to verify that our metric $d$, defined by (7), satisfies

$$\begin{aligned} d(\hat{x},\hat{y})=\frac{\Vert x\wedge y\Vert }{\Vert x\Vert \Vert y\Vert }. \end{aligned}$$

(28)
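For the reader's convenience, here is the short computation behind Eq. (28). It is a sketch which assumes that Eq. (7), not reproduced in this section, defines $d(\hat{x},\hat{y})=\big (1-\vert \langle x,y\rangle \vert ^2\big )^{\frac{1}{2}}$ for unit vectors $x,y$. By the definition of the inner product on $\wedge ^2{{\mathbb {C}}}^k$,

$$\begin{aligned} \Vert x\wedge y\Vert ^2=\det \begin{pmatrix} \langle x,x\rangle & \langle x,y\rangle \\ \langle y,x\rangle & \langle y,y\rangle \end{pmatrix} =\Vert x\Vert ^2\Vert y\Vert ^2-\vert \langle x,y\rangle \vert ^2, \end{aligned}$$

so that dividing by $\Vert x\Vert ^2\Vert y\Vert ^2$ and taking square roots gives $\frac{\Vert x\wedge y\Vert }{\Vert x\Vert \Vert y\Vert }=\Big (1-\frac{\vert \langle x,y\rangle \vert ^2}{\Vert x\Vert ^2\Vert y\Vert ^2}\Big )^{\frac{1}{2}}$, which is the assumed expression for $d(\hat{x},\hat{y})$ evaluated on normalized representatives.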

For an operator A on ${\mathbb {C}}^k$, we write $\wedge ^p A$ for the operator on $\wedge ^p{{\mathbb {C}}}^k$ defined by

$$\begin{aligned} \wedge ^p A \,(x_1\wedge \cdots \wedge x_p)=Ax_1\wedge \cdots \wedge Ax_p. \end{aligned}$$

(29)

Obviously $\wedge ^p (AB)=\wedge ^p A\wedge ^p B$, so that $\Vert \wedge ^p (AB)\Vert \le \Vert \wedge ^p A\Vert \Vert \wedge ^p B\Vert $. From e.g. Chapter XVI of [18] or Lemma III.5.3 of [4], we have in addition for $1\le p\le k$

$$\begin{aligned} \left\| \wedge ^p A\right\| =a_1(A)\ldots a_p(A), \end{aligned}$$

(30)

where $a_1(A)\ge \cdots \ge a_k(A)$ are the singular values of A, i.e. the square roots of eigenvalues of $A^* A$, labelled in decreasing order.
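In dimension $k=2$ the space $\wedge ^2{{\mathbb {C}}}^2$ is one-dimensional and $\wedge ^2 A$ acts as multiplication by $\det A$, so Eq. (30) with $p=2$ reduces to $\vert \det A\vert =a_1(A)a_2(A)$. The sketch below checks this numerically; the matrix `A_MAT` is an arbitrary example and the helper names are hypothetical.

```python
import math

def singular_values(a):
    """Singular values a_1 >= a_2 of a 2x2 matrix, from the eigenvalues of A* A."""
    # A* A = [[p, q], [conj(q), r]]
    p = abs(a[0][0]) ** 2 + abs(a[1][0]) ** 2
    r = abs(a[0][1]) ** 2 + abs(a[1][1]) ** 2
    q = a[0][0].conjugate() * a[0][1] + a[1][0].conjugate() * a[1][1]
    mean = (p + r) / 2
    gap = math.sqrt(((p - r) / 2) ** 2 + abs(q) ** 2)
    # max(..., 0.0) guards against a tiny negative value caused by rounding.
    return math.sqrt(mean + gap), math.sqrt(max(mean - gap, 0.0))

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

A_MAT = [[1.0, 2.0j], [0.5, 1.0]]   # arbitrary example matrix
a1, a2 = singular_values(A_MAT)
wedge2_norm = abs(det(A_MAT))        # ||wedge^2 A|| when k = 2
```

Here `a1 * a2` and `wedge2_norm` coincide up to rounding, and `a1` is the operator norm of `A_MAT`, which is the case $p=1$ of Eq. (30).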

Our strategy to prove Proposition 3.5 is to bound the left hand side of Eq. (27) by a submultiplicative function $f : \mathbb {N} \rightarrow \mathbb {R}_+$ and then use Fekete’s lemma. We will show that the function

$$\begin{aligned} f(n)=\int _{\mathrm {M}_k({{\mathbb {C}}})^n} \left\| \wedge ^2 v_n\ldots v_1\right\| \,\mathrm{d}\mu ^{\otimes n}(v_1,\ldots ,v_n) \end{aligned}$$

(31)

has the desired properties. The following lemma establishes an exponential decay of this function.

Lemma 3.6

Assume (Pur). Then there exist two positive constants C and $\lambda <1$ such that

$$\begin{aligned} f(n)\le C\lambda ^n. \end{aligned}$$

Proof

First, we prove that $\lim _{n\rightarrow \infty }f(n)=0$. To this end, we express the function f(n) using the process $W_n$ as

$$\begin{aligned} f(n)={{\mathbb {E}}}^{\mathrm {ch}}\left( k\frac{\Vert \wedge ^2 W_n\Vert }{{{\text {tr}}}\left( W_n^*W_n\right) }\right) . \end{aligned}$$

(32)

By definition the eigenvalues of $M_n^{\frac{1}{2}}$ are the singular values of $W_n/\sqrt{{{\text {tr}}}(W_n^*W_n)}$. Since by Proposition 2.2, $M_n$ converges ${{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}$ to a rank one projection,

$$\begin{aligned} \lim _{n\rightarrow \infty }a_1\left( \frac{W_n}{\sqrt{{{\text {tr}}}\left( W_n^*W_n\right) }}\right) a_2\left( \frac{W_n}{\sqrt{{{\text {tr}}}\left( W_n^*W_n\right) }}\right) =0\quad {{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}\end{aligned}$$

Using Eq. (30) we then conclude that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\left\| \wedge ^2W_n\right\| }{{{\text {tr}}}\left( W_n^*W_n\right) }=0 \quad {{\mathbb {P}}}^{\mathrm {ch}}{\text {-}}\mathrm {a.s.}\end{aligned}$$

(33)

Since $\Vert \wedge ^2 W_n\Vert \le \Vert W_n\Vert ^2\le {{\text {tr}}}(W_n^*W_n)$, the expression (32) and Lebesgue’s dominated convergence theorem imply $\lim _{n\rightarrow \infty }f(n)=0$.

Second, remark that the function f is submultiplicative. Indeed, for $p,q\in {\mathbb {N}}$ we have

$$\begin{aligned} \left\| \wedge ^2\left( v_{p+q}\ldots v_{1}\right) \right\| \le \left\| \wedge ^2\left( v_{p+q}\ldots v_{p+1}\right) \right\| \left\| \wedge ^2\left( v_{p}\ldots v_{1}\right) \right\| \end{aligned}$$

and the submultiplicativity follows.

By Fekete’s subadditive Lemma, $\frac{\log f(n)}{n}$ converges to $\inf _{n\in {{\mathbb {N}}}} \frac{\log f(n)}{n}$, which is (strictly) negative (and possibly equal to $-\infty $) since $f(n)\rightarrow 0$. Then there exists $0<\lambda <1$ such that $f(n)\le \lambda ^n$ for large enough n, and the conclusion follows. $\square $
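The constants can also be made explicit by an elementary argument. This is a sketch, and the notation $n_0$, $K$ is introduced here only for illustration. Since $f(n)\rightarrow 0$, fix $n_0$ with $f(n_0)<1$. If $f(n_0)=0$ then $f(n)=0$ for all $n\ge n_0$ by submultiplicativity; otherwise set $\lambda =f(n_0)^{1/n_0}\in (0,1)$ and $K=\max \big (1,\max _{1\le r<n_0}f(r)\big )$. Writing $n=qn_0+r$ with $0\le r<n_0$, submultiplicativity gives

$$\begin{aligned} f(n)\le f(n_0)^q\,K = K\lambda ^{n-r}\le K\lambda ^{-n_0}\lambda ^n, \end{aligned}$$

where the last inequality uses $\lambda <1$ and $r<n_0$. Hence the bound of the lemma holds for all n with $C=K\lambda ^{-n_0}$.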

We are now in a position to prove Proposition 3.5.

Proof of Proposition 3.5

The Markov property of $(\hat{x}_n)$ implies that

$$\begin{aligned} {{\mathbb {E}}}_{\nu }\left( d\left( \hat{x}_{n+l}, \hat{y}_n\circ \theta ^l\right) \right) = {{\mathbb {E}}}_{\nu \Pi ^l}\left( d\left( \hat{x}_{n}, \hat{y}_n\right) \right) . \end{aligned}$$

Provided inequality (27) is established for $l=0$, the right hand side of the previous equality can be bounded by $C \lambda ^n$. It is hence sufficient to prove the inequality for $l=0$.

The case $l=0$ follows from Lemma 3.6 if for any $n \in {{\mathbb {N}}}$ and any probability measure $\nu $,

$$\begin{aligned} {\mathbb {E}}_{\nu }\left( d\left( \hat{x}_n, \hat{y}_n\right) \right) \le f(n). \end{aligned}$$

(34)

To obtain this inequality, note that from the definitions of $\hat{x}_n$, $\hat{y}_n$ and $\hat{z}_n$, we have that

$$\begin{aligned} d\left( \hat{x}_n, \hat{y}_n\right)&=\frac{\left\| \wedge ^2 W_n\,(x_0\wedge z_n)\right\| }{\Vert W_n x_0\Vert \Vert W_n z_n\Vert }\\&\le \frac{\left\| \wedge ^2 W_n\right\| }{\left\| W_n x_0\right\| ^2}\frac{\left\| W_n x_0\right\| }{\Vert W_n\Vert }\\&\le \frac{\left\| \wedge ^2 W_n\right\| }{\left\| W_n x_0\right\| ^2}, \end{aligned}$$

${{\mathbb {P}}}_\nu $-almost surely. To get the first inequality we used $\Vert W_n z_n\Vert = \Vert W_n\Vert $ and $\Vert x_0\wedge z_n \Vert =d(\hat{x}_0,\hat{z}_n)\le 1$; the second one follows from $\Vert W_n x_0\Vert \le \Vert W_n\Vert $. In addition, by definition of ${{\mathbb {P}}}_\nu $,

$$\begin{aligned} {{\mathbb {E}}}_{\nu }\left( \frac{\left\| \wedge ^2 W_n\right\| }{\Vert W_n x_0\Vert ^2}\right)&= \int _{{\mathrm P}({{\mathbb {C}}}^k)\times \mathrm {M}_k({{\mathbb {C}}})^n} \frac{\left\| \wedge ^2W_n\right\| }{\Vert W_n x_0\Vert ^2}\,\Vert W_n x_0\Vert ^2\,\mathrm{d}\mu ^{\otimes n}\, \mathrm{d}\nu (\hat{x}_0) \\&= \int _{\mathrm {M}_k({{\mathbb {C}}})^n} {\left\| \wedge ^2W_n\right\| }\,\mathrm{d}\mu ^{\otimes n}(v_1,\ldots , v_n), \end{aligned}$$

which is f(n). Therefore (34) holds and Lemma 3.6 yields the proof. $\square $
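The function f of Eq. (31) can be computed exactly in a toy case. The sketch below is an illustration under assumptions not made in the article: $k=2$ and $\mu $ is the counting measure on two hypothetical matrices `V1`, `V2`, so the integral in Eq. (31) becomes a finite sum over words. For $k=2$ we have $\Vert \wedge ^2 W\Vert =\vert \det W\vert $, which is multiplicative, so in this special case $f(n)=f(1)^n$.

```python
import itertools
import math

def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Hypothetical toy model: counting measure on {V1, V2}, V1* V1 + V2* V2 = Id.
A, B = 0.9, 0.5
V1 = [[A, 0.0], [0.0, B]]
V2 = [[0.0, math.sqrt(1 - B * B)], [math.sqrt(1 - A * A), 0.0]]
KRAUS = [V1, V2]

def f(n):
    """f(n) of Eq. (31) for the counting measure on KRAUS, with k = 2."""
    total = 0.0
    for word in itertools.product(KRAUS, repeat=n):
        w = [[1.0, 0.0], [0.0, 1.0]]
        for v in word:
            w = mul(v, w)          # builds v_n ... v_1
        # wedge^2 C^2 is one-dimensional and wedge^2 W is multiplication by det W.
        total += abs(det(w))
    return total

values = [f(n) for n in range(1, 7)]
```

Here $f(1)=\vert \det V_1\vert +\vert \det V_2\vert <1$, so `values` decays geometrically, in line with Lemma 3.6. By the Cauchy–Schwarz inequality, $f(1)=1$ for matrices of this form would require $A=B$, in which case both matrices are proportional to unitaries and (Pur) fails.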

3.3 Convergence in Wasserstein metric

The remainder of Sect. 3 is devoted to the proof of the second part of Theorem 1.1.

Proof of Eq. (6)

We need to prove that

$$\begin{aligned} W_1\left( \frac{1}{m}\sum _{r=0}^{m-1} \nu \Pi ^{mn+r}, \nu _{\mathrm {inv}}\right) = \sup _{f\in \mathrm{Lip}_1({\mathrm P}({\mathbb {C}}^k))} \left| \mathbb {E}_\nu \left( \frac{1}{m}\sum _{r=0}^{m-1} f\left( \hat{x}_{mn+r}\right) \right) - \mathbb {E}_{\nu _{\mathrm {inv}}}\left( f\left( \hat{x}_0\right) \right) \right| \end{aligned}$$

is exponentially decaying in n. The expression in the supremum on the right hand side is unchanged by adding an arbitrary constant to f. This freedom allows us to restrict the supremum to functions bounded by 1, i.e. $\Vert f\Vert _\infty \le 1$.

Let $f\in \mathrm{Lip}_1({\mathrm P}({\mathbb {C}}^k))$ be such a function. Our strategy is to approximate $\hat{x}_{mn+r}$ by $\hat{y}_{mp} \circ \theta ^{mq+r}$ with $p=\lfloor \frac{n}{2} \rfloor $ and $q=\lceil \frac{n}{2}\rceil $ so that in particular $p+q =n$. Using telescopic estimates and the invariance of $\nu _{\mathrm {inv}}$ we then have

$$\begin{aligned}&\left| {{\mathbb {E}}}_\nu \left( \frac{1}{m} \sum _{r=0}^{m-1} f\left( \hat{x}_{mn +r}\right) \right) - {{\mathbb {E}}}_{\nu _{\mathrm {inv}}}\left( f\left( \hat{x}_0\right) \right) \right| \\&\quad \le \frac{1}{m} \sum _{r=0}^{m-1} \left| {{\mathbb {E}}}_\nu \left( f\left( \hat{x}_{m(p+q)+r}\right) \right) - {{\mathbb {E}}}_\nu \left( f\left( \hat{y}_{mp} \circ \theta ^{mq+r}\right) \right) \right| \\&\qquad + \frac{1}{m} \sum _{r=0}^{m-1} \left| {{\mathbb {E}}}_{\nu _{\mathrm {inv}}} \left( f\left( \hat{y}_{mp} \circ \theta ^{mq+r}\right) \right) -{{\mathbb {E}}}_{\nu _{\mathrm {inv}}} \left( f\left( \hat{x}_{m(p+q)+r}\right) \right) \right| \\&\qquad + \left| \frac{1}{m} \sum _{r=0}^{m-1}{{\mathbb {E}}}_\nu \left( f\left( \hat{y}_{mp} \circ \theta ^{mq+r}\right) \right) - {{\mathbb {E}}}_{\nu _{\mathrm {inv}}} \left( f\left( \hat{y}_{mp}\right) \right) \right| . \end{aligned}$$

We bound the terms on the right hand side using Proposition 3.5 for the first two terms and Proposition 3.4 for the last term. To this end let C and $\lambda < 1$ be such that the bounds in both these propositions hold true. Since f is 1-Lipschitz continuous we have

$$\begin{aligned} \left| f\left( \hat{x}_{m(p+q)+r}\right) - f\left( \hat{y}_{mp} \circ \theta ^{mq+r}\right) \right| \le d\left( \hat{x}_{m(p+q)+r},\hat{y}_{mp} \circ \theta ^{mq+r}\right) . \end{aligned}$$

Proposition 3.5 then implies that

$$\begin{aligned} \left| {{\mathbb {E}}}_\nu \left( f\left( \hat{x}_{m(p+q)+r}\right) \right) - {{\mathbb {E}}}_\nu \left( f\left( \hat{y}_{mp} \circ \theta ^{mq+r}\right) \right) \right| \le C\lambda ^{mp}, \end{aligned}$$

and similarly with $\nu $ replaced by $\nu _{\mathrm {inv}}$. Regarding the last term in the above telescopic estimate we have by Proposition 3.4,

$$\begin{aligned} \left| \frac{1}{m} \sum _{r=0}^{m-1}{{\mathbb {E}}}_\nu \left( f\left( \hat{y}_{mp} \circ \theta ^{mq+r}\right) \right) - {{\mathbb {E}}}_{\nu _{\mathrm {inv}}} \left( f\left( \hat{y}_{mp}\right) \right) \right| \le C \lambda ^{q}, \end{aligned}$$

where we used the constraint $\Vert f\Vert _\infty \le 1$ discussed at the beginning of the proof.

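Putting these estimates together, and using $\lambda ^{mp}\le \lambda ^{p}$ (as $m\ge 1$) and $\lambda ^{q}\le \lambda ^{p}$ (as $q\ge p$) with $p=\lfloor \frac{n}{2}\rfloor $, we get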

$$\begin{aligned} \left| {{\mathbb {E}}}_\nu \left( \frac{1}{m} \sum _{r=0}^{m-1} f\left( \hat{x}_{mn+r}\right) \right) - {{\mathbb {E}}}_{\nu _{\mathrm {inv}}}\left( f\left( \hat{x}_0\right) \right) \right| \le 3C\lambda ^{\left\lfloor \frac{n}{2} \right\rfloor } \end{aligned}$$

and this concludes the proof of Eq. (6) and therefore of Theorem 1.1. $\square $

4 Lyapunov exponents

In this section, we study the almost sure stability exponents. The main results of this section will assume ($\phi $-Erg) with the additional assumption that $E={{\mathbb {C}}}^k$.

Remark 4.1

Assuming $E={{\mathbb {C}}}^k$ amounts to saying that $\phi $ has no transient part. Without this assumption, we would have to take into account the almost sure Lyapunov exponent corresponding to the escape from the transient part. See [3] for a precise account of these ideas.

The relevance of this assumption will stem from the following straightforward inequalities: if $\rho $ is any element of ${\mathcal {D}}_k$ then one has

$$\begin{aligned} \frac{\mathrm{d}{{\mathbb {P}}}^\rho _{\vert {\mathcal {O}}_n}}{\mathrm{d}\mu ^{\otimes n}} \le \Vert W_n\Vert ^2, \end{aligned}$$

and if $\rho $ is faithful (i.e. positive definite), then

$$\begin{aligned} \frac{\mathrm{d}{{\mathbb {P}}}^\rho _{\vert {\mathcal {O}}_n}}{\mathrm{d}\mu ^{\otimes n}} \ge \Vert \rho ^{-1}\Vert ^{-1}\Vert W_n\Vert ^2. \end{aligned}$$

In particular, if ($\phi $-Erg) holds with $E={{\mathbb {C}}}^k$, then $\rho _{\mathrm {inv}}>0$ and, for any $\rho \in {\mathcal {D}}_k$, we have

$$\begin{aligned} {{\mathbb {P}}}^\rho \ll {{\mathbb {P}}}^{\rho _{\mathrm {inv}}}. \end{aligned}$$

(35)

Let us start by proving the following lemma, which concerns the ergodicity of $\theta $ with respect to the measure ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}$.

Lemma 4.2

Assume that ($\phi $-Erg) holds. Then the shift $\theta $ on $(\Omega ,\mathcal {O})$ is ergodic with respect to the probability measure ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}$.

Proof

Let A, $A'$ in $\mathcal {O}_l$. From the definition of ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}$, for j large enough, ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}\big (A \cap \theta ^{-j}(A')\big )$ equals

$$\begin{aligned} \int _{A\times A'} {{\text {tr}}}\Big (v_{l}'\ldots v_{1}' \phi ^{j-l}\big (v_{l}\ldots v_{1} \rho _{\mathrm {inv}}v_{1}^*\ldots v_{l}^*\big ){v_{1}'}^*\ldots {v_{l}'}^* \Big ) \,\mathrm{d}\mu ^{\otimes l}(v_1,\ldots ,v_l)\, \mathrm{d}\mu ^{\otimes l}\left( v_1',\ldots ,v_l'\right) , \end{aligned}$$

and the Perron–Frobenius Theorem 3.3 implies

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n} \sum _{j=0}^{n-1} \phi ^j\big (v_{l}\ldots v_{1} \rho _{\mathrm {inv}}v_{1}^*\ldots v_{l}^*\big ) = {{\text {tr}}}\big (v_{l}\ldots v_{1} \rho _{\mathrm {inv}}v_{1}^*\ldots v_{l}^*\big )\, \rho _{\mathrm {inv}}\end{aligned}$$

for $\mu ^{\otimes l}$-almost all $(v_1,\ldots ,v_l)$ so that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n} \sum _{j=0}^{n-1} {{\mathbb {P}}}^{\rho _{\mathrm {inv}}}\big (A \cap \theta ^{-j}\left( A'\right) \big ) ={{\mathbb {P}}}^{\rho _{\mathrm {inv}}}(A)\,{{\mathbb {P}}}^{\rho _{\mathrm {inv}}}\left( A'\right) , \end{aligned}$$

which proves the ergodicity. $\square $

Now we can state our result concerning Lyapunov exponents.

Proposition 4.3

Assume that ($\phi $-Erg) holds with $E={{\mathbb {C}}}^k$, and that (Pur) holds. Assume $\int \Vert v\Vert ^2\log \Vert v\Vert \,\mathrm{d}\mu (v)< \infty $. Then there exist numbers

$$\begin{aligned} \infty >\gamma _1\ge \gamma _2\ge \cdots \ge \gamma _k\ge -\infty \end{aligned}$$

such that for any probability measure $\nu $ over $({\mathrm P}({\mathbb {C}}^k),{\mathcal {B}})$:

(1) for any $p\in \{1,\ldots ,k\}$,

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\log \left\| \wedge ^p W_n\right\| =\sum _{j=1}^p \gamma _j,\quad {{\mathbb {P}}}_\nu {\text {-}}\mathrm {a.s.}, \end{aligned}$$

(36)

(2) $\gamma _2-\gamma _1<0$ with $\gamma _2-\gamma _1$ understood as the limit of $\frac{1}{n}\log \frac{\Vert \wedge ^2 W_n\Vert }{\Vert W_n\Vert ^2}$ whenever $\gamma _1=-\infty $,

(3) one has the convergence

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{n}(\log \Vert W_n x_0\Vert -\log \Vert W_n\Vert )=0\quad {{\mathbb {P}}}_\nu {\text {-}}\mathrm {a.s.}\end{aligned}$$

(37)

Proof

Let us start by proving (1). Note that $n\mapsto \log \Vert \wedge ^p W_n\Vert $ is subadditive by definition. The existence of the ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}{\text {-}}\mathrm {a.s.}$ limits $\lim _{n\rightarrow \infty }\frac{1}{n}\log \Vert \wedge ^p W_n\Vert $ then follows from ${{\mathbb {E}}}^{\rho _{\mathrm {inv}}}(\log \Vert V\Vert ^2)\le \int \Vert v\Vert ^2\log \Vert v\Vert ^2d\mu (v)<\infty $, ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}\circ \theta ^{-1}={{\mathbb {P}}}^{\rho _{\mathrm {inv}}}$ and a direct application of Kingman’s subadditive ergodic theorem (see e.g. [21]). The fact that these limits are ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}{\text {-}}\mathrm {a.s.}$ constant comes from the $\theta $-ergodicity of ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}$ proved in Lemma 4.2. Since by Eq. (35) any ${{\mathbb {P}}}^\rho $ is absolutely continuous with respect to ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}$, Proposition 2.1 and the $\mathcal {O}$-measurability of $\Vert \wedge ^pW_n\Vert $ imply the convergence holds ${{\mathbb {P}}}_\nu {\text {-}}\mathrm {a.s.}$ The numbers $\gamma _j$ are uniquely defined, by defining $\sum _{j=1}^p \gamma _j$ as the ${{\mathbb {P}}}^{\rho _{\mathrm {inv}}}{\text {-}}\mathrm {a.s.}$ limit $\lim _{n\rightarrow \infty }\frac{1}{n}\log \Vert \wedge ^p W_n\Vert $ and imposing the rule that $\gamma _{j+1}=-\infty $ if $\gamma _{j}=-\infty $. This convention and (30) impose that $\gamma _{j+1}\le \gamma _j$ for $j=1,\ldots ,k-1$.

Concerning (2), recall the quantity f(n) defined in Eq. (31). Then Eq. (32) and the inequality ${{\text {tr}}}\,W_n^*W_n \le k \Vert W_n\Vert ^2$ give

$$\begin{aligned} f(n)\ge {{\mathbb {E}}}^{\mathrm {ch}}\left( \frac{\left\| \wedge ^2 W_n\right\| }{\Vert W_n\Vert ^2} \right) . \end{aligned}$$

Jensen’s inequality implies

$$\begin{aligned} \frac{1}{n} \log f(n)\ge {{\mathbb {E}}}^{\mathrm {ch}}\left( \frac{1}{n} \log \frac{\left\| \wedge ^2 W_n\right\| }{\Vert W_n\Vert ^2} \right) \end{aligned}$$

so that by Lemma 3.6 and Fatou’s lemma, $\log \lambda \ge \gamma _2-\gamma _1$ with $\lambda \in (0,1)$.

Finally for (3), from Proposition 2.2, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\Vert W_n x_0\Vert }{\Vert W_n\Vert }=\lim _{n\rightarrow \infty }\frac{\left\| M_n^{\frac{1}{2}}x_0\right\| }{\left\| M_{n}^{\frac{1}{2}}\right\| }=|\langle x_0, z\rangle |\quad {{\mathbb {P}}}_\nu {\text {-}}\mathrm {a.s.}\end{aligned}$$

Since $|\langle x_0, z\rangle |>0$ ${{\mathbb {P}}}_\nu {\text {-}}\mathrm {a.s.}$, the logarithm of this ratio converges to a finite limit, and dividing by n yields (37). $\square $
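As a numerical illustration of items (1) to (3), the following sketch simulates one quantum trajectory in dimension $k=2$. The two matrices `V[0]`, `V[1]` are a hypothetical example (not taken from the paper), chosen only so that $v_0^*v_0+v_1^*v_1=\mathrm{Id}$; the function name and parameters are likewise illustrative. The sketch estimates $\gamma _1$ by $\frac{1}{n}\log \Vert W_nx_0\Vert $, as permitted by (37), and $\gamma _1+\gamma _2$ by $\frac{1}{n}\log |\det W_n|$, which is (36) with $p=2$ because $\Vert \wedge ^2W_n\Vert =|\det W_n|$ when $k=2$.

```python
import math
import random

# Hypothetical example (an assumption, not from the paper):
# v0 = diag(a, b) and v1 anti-diagonal, with v0* v0 + v1* v1 = Id.
A2, B2 = 0.9, 0.2
V = [
    ((math.sqrt(A2), 0.0), (0.0, math.sqrt(B2))),
    ((0.0, math.sqrt(1.0 - B2)), (math.sqrt(1.0 - A2), 0.0)),
]


def apply(v, x):
    """Matrix-vector product for a 2x2 matrix stored as a tuple of rows."""
    return (v[0][0] * x[0] + v[0][1] * x[1], v[1][0] * x[0] + v[1][1] * x[1])


def det(v):
    return v[0][0] * v[1][1] - v[0][1] * v[1][0]


def estimate_exponents(n_steps, x0=(0.6, 0.8), seed=0):
    """Estimate (gamma_1, gamma_2) along one simulated trajectory.

    At each step the matrix v_i is drawn with probability ||v_i x||^2,
    and x is replaced by v_i x / ||v_i x||.
    """
    rng = random.Random(seed)
    x = x0
    log_norm = 0.0  # accumulates log ||W_n x0||
    log_det = 0.0   # accumulates log |det W_n| = log ||wedge^2 W_n||
    for _ in range(n_steps):
        images = [apply(v, x) for v in V]
        probs = [y[0] ** 2 + y[1] ** 2 for y in images]  # sums to 1 for unit x
        i = 0 if rng.random() < probs[0] else 1
        r = math.sqrt(probs[i])
        x = (images[i][0] / r, images[i][1] / r)
        log_norm += math.log(r)
        log_det += math.log(abs(det(V[i])))
    gamma1 = log_norm / n_steps
    gamma2 = log_det / n_steps - gamma1
    return gamma1, gamma2


if __name__ == "__main__":
    g1, g2 = estimate_exponents(20000)
    print("gamma_1 ~", g1, " gamma_2 ~", g2, " gap ~", g2 - g1)
```

The estimates depend on the seed and on the number of steps; the only feature the sketch is meant to exhibit is the strictly negative gap $\gamma _2-\gamma _1$.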

From this proposition we deduce the following almost sure convergence rate for the distance between the Markov chain $(\hat{x}_n)$ and the $(\mathcal {O}_n)$-adapted process $(\hat{y}_n)$.

Proposition 4.4

Assume (Pur) holds and ($\phi $-Erg) holds with $E={{\mathbb {C}}}^k$. Then, for any probability measure $\nu $ on $({{\mathrm P}({{\mathbb {C}}}^k)},{\mathcal {B}})$,

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{1}{n}\log \big (d\left( \hat{x}_n, \hat{y}_n\right) \big )\le -(\gamma _1-\gamma _2)<0,\quad {{\mathbb {P}}}_\nu {\text {-}}\mathrm {a.s.}\end{aligned}$$

Proof

Identity (28) and the definition of $\hat{z}_n$ imply

$$\begin{aligned} {d(\hat{x}_n,\hat{y}_n)}=\frac{\left\| \wedge ^2W_n\, x_0\wedge z_n\right\| }{\Vert W_nx_0\Vert \Vert W_n z_n\Vert } \le \frac{\left\| \wedge ^2W_n\right\| }{\Vert W_nx_0\Vert \Vert W_n\Vert }. \end{aligned}$$

Proposition 4.3 then yields the result. $\square $
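The inequality used in this proof can be checked numerically for $k=2$. The sketch below rests on assumptions that should be compared with the definitions earlier in the paper: $d(\hat{x},\hat{y})=\Vert x\wedge y\Vert /(\Vert x\Vert \Vert y\Vert )$, $z_n$ is a unit vector maximizing $\Vert W_nz\Vert $, and $\hat{y}_n=W_n\cdot \hat{z}_n$. The random Gaussian factors and all function names are illustrative only.

```python
import math
import random


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(a, x):
    return [a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1]]


def norm(x):
    return math.sqrt(abs(x[0]) ** 2 + abs(x[1]) ** 2)


def distance(x, y):
    # In dimension 2, ||x ^ y|| = |x1 y2 - x2 y1|.
    return abs(x[0] * y[1] - x[1] * y[0]) / (norm(x) * norm(y))


def top_singular_pair(w):
    """Return (||w||, z), z a unit eigenvector of w* w for its largest eigenvalue."""
    m11 = abs(w[0][0]) ** 2 + abs(w[1][0]) ** 2
    m22 = abs(w[0][1]) ** 2 + abs(w[1][1]) ** 2
    m12 = w[0][0].conjugate() * w[0][1] + w[1][0].conjugate() * w[1][1]
    lam = (m11 + m22) / 2 + math.sqrt(((m11 - m22) / 2) ** 2 + abs(m12) ** 2)
    # Two equivalent formulas for the eigenvector; keep the larger one.
    c1 = [m12, lam - m11]
    c2 = [lam - m22, m12.conjugate()]
    z = c1 if norm(c1) >= norm(c2) else c2
    nz = norm(z)
    return math.sqrt(lam), [z[0] / nz, z[1] / nz]


def check_bound(n_factors=5, seed=0):
    """Return (d(x_n, y_n), ||wedge^2 W_n|| / (||W_n x0|| ||W_n||))."""
    rng = random.Random(seed)

    def rand_c():
        return complex(rng.gauss(0, 1), rng.gauss(0, 1))

    w = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for _ in range(n_factors):
        v = [[rand_c(), rand_c()], [rand_c(), rand_c()]]
        w = matmul(v, w)  # W_n = v_n ... v_1
    x0 = [rand_c(), rand_c()]
    n0 = norm(x0)
    x0 = [x0[0] / n0, x0[1] / n0]  # unit initial vector
    op_norm, z = top_singular_pair(w)
    wx, wz = apply(w, x0), apply(w, z)
    wedge2 = abs(w[0][0] * w[1][1] - w[0][1] * w[1][0])  # |det W_n|
    return distance(wx, wz), wedge2 / (norm(wx) * op_norm)


if __name__ == "__main__":
    for s in range(3):
        print(check_bound(seed=s))
```

In dimension 2 the two sides differ exactly by the factor $\Vert x_0\wedge z_n\Vert \le 1$, which is why the bound is an inequality rather than an identity.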

Notes

Complete positivity is stronger than positivity; namely by definition $\phi $ is completely positive iff $\phi \otimes \mathrm{Id}_{M_n({{\mathbb {C}}})}$ is positive for all $n\in {{\mathbb {N}}}$.
As suggested by its name, the notion of cycle for $\phi $ depends only on $\phi $ and not on the specific measure $\mu $ leading to $\phi $ [20, 22].

References

1. Applebaum, D.: Probability on Compact Lie Groups, Volume 70 of Probability Theory and Stochastic Modelling. Springer, Berlin (2014)
2. Baumgartner, B., Narnhofer, H.: The structures of state space concerning quantum dynamical semigroups. Rev. Math. Phys. 24(02), 1250001 (2012)
3. Benoist, T., Pellegrini, C., Ticozzi, F.: Exponential stability of subspaces for quantum stochastic master equations. Ann. Henri Poincaré 18, 2045–2074 (2017)
4. Bougerol, P., Lacroix, J.: Products of Random Matrices with Applications to Schrödinger Operators, Volume 8 of Progress in Probability and Statistics. Birkhäuser Boston, Inc., Boston (1985)
5. Carbone, R., Pautrat, Y.: Irreducible decompositions and stationary states of quantum channels. Rep. Math. Phys. 77(3), 293–313 (2016)
6. Carmichael, H.: An Open Systems Approach to Quantum Optics: Lectures Presented at the Université Libre de Bruxelles, October 28 to November 4, 1991. Springer, Berlin (1993)
7. Evans, D.E., Høegh-Krohn, R.: Spectral properties of positive maps on $C^*$-algebras. J. Lond. Math. Soc. (2) 17(2), 345–355 (1978)
8. Furstenberg, H., Kesten, H.: Products of random matrices. Ann. Math. Stat. 31(2), 457–469 (1960)
9. Guerlin, C., Bernu, J., Deleglise, S., Sayrin, C., Gleyzes, S., Kuhr, S., Brune, M., Raimond, J.-M., Haroche, S.: Progressive field-state collapse and quantum non-demolition photon counting. Nature 448(7156), 889–893 (2007)
10. Guivarc’h, Y., Le Page, É.: Spectral gap properties for linear random walks and Pareto’s asymptotics for affine stochastic recursions. Ann. Inst. H. Poincaré Probab. Stat. 52(2), 503–574 (2016)
11. Guivarc’h, Y., Raugi, A.: Frontière de Furstenberg, propriétés de contraction et théorèmes de convergence. Probab. Theory Relat. Fields 69(2), 187–242 (1985)
12. Guivarc’h, Y., Raugi, A.: Products of random matrices: convergence theorems. In: Cohen, J.E., Kesten, H., Newman, C.M. (eds.) Random Matrices and Their Applications (Brunswick, Maine, 1984), Volume 50 of Contemporary Mathematics, pp. 31–54. American Mathematical Society, Providence (1986)
13. Holevo, A.: Statistical Structure of Quantum Theory. Springer, Berlin (2001)
14. Kümmerer, B., Maassen, H.: An ergodic theorem for quantum counting processes. J. Phys. A 36(8), 2155 (2003)
15. Kümmerer, B., Maassen, H.: A pathwise ergodic theorem for quantum trajectories. J. Phys. A 37(49), 11889–11896 (2004)
16. Le Page, É.: Théorèmes limites pour les produits de matrices aléatoires. In: Heyer, H. (ed.) Probability Measures on Groups. Lecture Notes in Mathematics, vol. 928. Springer, Berlin, Heidelberg (1982)
17. Maassen, H., Kümmerer, B.: Purification of quantum trajectories. Lect. Notes Monogr. Ser. 48, 252–261 (2006)
18. Mac Lane, S., Birkhoff, G.: Algebra, 3rd edn. Chelsea Publishing Co., New York (1988)
19. Meyn, S., Tweedie, R.L.: Markov Chains and Stochastic Stability, 2nd edn. Cambridge University Press, Cambridge (2009)
20. Schrader, R.: Perron-Frobenius theory for positive maps on trace ideals. In: Mathematical Physics in Mathematics and Physics (Siena, 2000), vol. 30, pp. 361–378 (2001)
21. Walters, P.: An Introduction to Ergodic Theory. Graduate Texts in Mathematics, vol. 79. Springer, Berlin (1982)
22. Wolf, M.M.: Quantum channels & operations: guided tour. http://www-m5.ma.tum.de/foswiki/pub/M5/Allgemeines/MichaelWolf/QChannelLecture.pdf (2012). Lecture notes based on a course given at the Niels–Bohr Institute. Accessed 28 Feb 2017

Acknowledgements

T.B. and C.P. would like to thank Y. Guivarc’h for his useful comments at an early stage of this work. Y.P. and C.P. would like to thank P. Bougerol for enlightening discussions about random products of matrices. Y.P. and C.P. would like to thank L. Miclo for relevant discussions regarding Markov chains. The research of T.B. has been supported by ANR-11-LABX-0040-CIMI within the program ANR-11-IDEX-0002-02. The research of T.B., Y.P. and C.P. has been supported by the ANR project StoQ ANR-14-CE25-0003-01 and CNRS InFIniTi project MISTEQ.

Author information

Authors and Affiliations

Institut de Mathématiques de Toulouse, UMR5219, Université de Toulouse, CNRS, UPS, 31062, Toulouse Cedex 9, France
T. Benoist & C. Pellegrini
Instituut voor Theoretische Fysica, KU Leuven, 3001, Leuven, Belgium
M. Fraas
Laboratoire de Mathématiques d’Orsay, Univ. Paris-Sud, CNRS, Université Paris-Saclay, 91405, Orsay, France
Y. Pautrat

Corresponding author

Correspondence to C. Pellegrini.

Appendices

Appendix A: Equivalence of (Pur) and contractivity

We assume ${\text {supp}}\mu \subset \mathrm {GL}_k({{\mathbb {C}}})$. Recall that $T_\mu $ is the smallest closed sub-semigroup of $\mathrm {GL}_k({{\mathbb {C}}})$ that contains ${\text {supp}}\mu $. It is said to be contracting if there exists a sequence $(a_n)_{n\in {\mathbb {N}}}\subset T_\mu $ such that $\lim _{n\rightarrow \infty } a_n/\Vert a_n\Vert $ exists and is a rank one matrix.

Proposition A.1

Assume ${\text {supp}}\mu \subset \mathrm {GL}_k({\mathbb {C}})$ and $T_\mu $ is strongly irreducible. Then $\mu $ verifies (Pur) if and only if $T_\mu $ is contracting.

Proof

By Proposition 2.2 the implication (Pur)$\Rightarrow $ contractivity follows by taking for $(a_n)$ a convergent subsequence of $(W_n(\omega ))$ for $\omega \in {\text {supp}}{{\mathbb {P}}}^{\mathrm {ch}}$.

We prove the opposite implication by contradiction. Following [12, Lemma 3], under the assumptions of the proposition, $T_\mu $ is contracting if and only if, for any two $\hat{x}, \hat{y}\in {{\mathrm P}({{\mathbb {C}}}^k)}$ there exists a sequence of matrices $(a_n)\subset T_\mu $ such that

$$\begin{aligned} \lim _{n\rightarrow \infty }d\left( a_n\cdot \hat{x},a_n\cdot \hat{y}\right) =0. \end{aligned}$$

Now, assume that contractivity holds but (Pur) does not. Namely, that $T_\mu $ is contracting but there exists an orthogonal projector $\pi $ of rank $\ge 2$, such that for any $a\in T_\mu $,

$$\begin{aligned} \pi a^*a\pi \propto \pi . \end{aligned}$$

Let x, y in the range of $\pi $ be orthonormal vectors. Then $\langle ax,ay\rangle =\langle x,\pi a^*a\pi y\rangle \propto \langle x,y\rangle =0$, and $\Vert ax\Vert , \Vert ay\Vert $ are nonzero since a is invertible, so that $d(a\cdot \hat{x},a\cdot \hat{y})=1$. As this is true for any a in $T_{\mu }$, contractivity cannot hold. This contradiction yields the proposition. $\square $

Appendix B: Set of invariant measures under assumption (Pur)

A quantum channel is a map $\phi $ on $\mathrm {M}_k({{\mathbb {C}}})$ of the form

$$\begin{aligned} \phi (\rho ) = \int _{\mathrm {M}_k({{\mathbb {C}}})} v \rho v^* \mathrm{d}\mu (v), \end{aligned}$$

where $\mu $ is a measure satisfying the normalization condition (1). The decomposition of quantum channels into irreducible components was derived in [2, 5, 22]. The space $\mathbb {C}^k$ is decomposed into orthogonal subspaces: one subspace is transient, and on each of the others the map has a canonical tensor product structure. We recall these results.

There exists a decomposition

$$\begin{aligned} \mathbb {C}^k \simeq \mathbb {C}^{n_1} \oplus \dots \oplus \mathbb {C}^{n_d} \oplus \mathbb {C}^{D}, \quad k = n_1 + \dots + n_d + D \end{aligned}$$

with the following properties. We denote by $v^{(j)}$ the restriction of v to $\mathbb {C}^{n_j}$.

(e1)
All invariant states are supported in the subspace $L = \mathbb {C}^{n_1} \oplus \dots \oplus \mathbb {C}^{n_d} \oplus 0$,
(e2)
The restriction of v to this subspace is block diagonal,
$$\begin{aligned} v|_L = v^{(1)} \oplus \cdots \oplus v^{(d)}\oplus 0 \quad \mu {\text {-}}\mathrm {a.e.}\end{aligned}$$
(38)
(e3)
For each $j=1, \dots ,d$ there is a decomposition $\mathbb {C}^{n_j} = \mathbb {C}^{k_j} \otimes \mathbb {C}^{m_j}, \, n_j = k_j m_j$, a unitary matrix $U_j$ on $\mathbb {C}^{n_j}$ and a matrix $\tilde{v}^{(j)}$ on $\mathbb {C}^{k_j}$ such that
$$\begin{aligned} v^{(j)} = U_j \left( \tilde{v}^{(j)} \otimes \mathrm{Id}_{{{\mathbb {C}}}^{m_j}}\right) U_j^* \quad \mu {\text {-}}\mathrm {a.e.}\end{aligned}$$
(39)
(e4)
There exists a full rank positive matrix $\rho _j$ on $\mathbb {C}^{k_j}$ such that
$$\begin{aligned} 0 \oplus \cdots \oplus U_j \left( \rho _j \otimes \mathrm{Id}_{{{\mathbb {C}}}^{m_j}}\right) U_j^* \oplus \cdots \oplus 0 \end{aligned}$$
(40)
is a fixed point of $\phi $.

It follows from (e3) and (e4) that the set of fixed points for $\phi $ is

$$\begin{aligned} U_1\big (\rho _1\otimes M_{m_1}({{\mathbb {C}}})\big )U_1^*\oplus \cdots \oplus U_d\big (\rho _d\otimes M_{m_d}({{\mathbb {C}}})\big )U_d^*\oplus 0_{M_D({{\mathbb {C}}})}. \end{aligned}$$

The decomposition simplifies under the purification assumption.

Proposition B.1

Assume (Pur) holds. Then there exists a set $\{\rho _j\}_{j=1}^d$ of positive definite matrices and an integer D such that the set of $\phi $ fixed points is

$$\begin{aligned} {{\mathbb {C}}}\rho _1\oplus \cdots \oplus {{\mathbb {C}}}\rho _d\oplus 0_{M_D({{\mathbb {C}}})}. \end{aligned}$$

Proof

The statement follows from the discussion preceding the proposition if we show that (Pur) implies $m_1 = \dots = m_d =1$. Assume that one of the $m_j$, e.g. $m_1$, is greater than 1. Let x be a norm one vector in $\mathbb {C}^{k_1}$. Then $\pi = U_1\big (\pi _{\hat{x}} \otimes \mathrm{Id}_{\mathbb {C}^{m_1}}\big ) U_1^*\oplus 0 \oplus \dots \oplus 0$ is a projection of rank $m_1>1$, and by Eq. (39) we have, in the notation of (38) and (39),

$$\begin{aligned} \pi v_1^*\ldots v_n^* v_n\ldots v_1 \pi = \left\| \tilde{v}^{(1)}_n\ldots \tilde{v}^{(1)}_1 x\right\| ^2 \pi \end{aligned}$$

for $\mu ^{\otimes n}$-almost all $v_1,\ldots ,v_n$. This contradicts (Pur). $\square $
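The computation behind this identity can be spelled out as follows. The shorthand $\tilde{W}_n:=\tilde{v}^{(1)}_n\ldots \tilde{v}^{(1)}_1$ is introduced here for convenience and is not the paper's notation; $\pi _{\hat{x}}$ is taken to be the orthogonal projector onto the line $\hat{x}$. Since $\pi $ is supported in the first block, Eqs. (38) and (39) give

$$\begin{aligned} \pi v_1^*\ldots v_n^* v_n\ldots v_1 \pi&= U_1\Big (\pi _{\hat{x}}\tilde{W}_n^*\tilde{W}_n\pi _{\hat{x}} \otimes \mathrm{Id}_{\mathbb {C}^{m_1}}\Big ) U_1^*\oplus 0 \oplus \dots \oplus 0\\&= \langle x,\tilde{W}_n^*\tilde{W}_n x\rangle \, U_1\Big (\pi _{\hat{x}} \otimes \mathrm{Id}_{\mathbb {C}^{m_1}}\Big ) U_1^*\oplus 0 \oplus \dots \oplus 0 = \Vert \tilde{W}_n x\Vert ^2\, \pi , \end{aligned}$$

where the second equality uses $\pi _{\hat{x}}A\pi _{\hat{x}}=\langle x,Ax\rangle \pi _{\hat{x}}$ for any matrix A and any unit vector x.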

It is clear from Eq. (38) that to each extremal fixed point $0 \oplus \dots \oplus \rho _j \oplus \dots \oplus 0$ corresponds a unique invariant measure $\nu _j$ supported on its range $F_j$. The converse is the subject of the next proposition.

Proposition B.2

Assume (Pur) holds. Then any $\Pi $-invariant probability measure is a convex combination of the measures $\nu _j$, $j=1,\ldots ,d$.

Proof

Let $\nu $ be a $\Pi $-invariant probability measure. Let f be a continuous function. From Lemma 2.3,

$$\begin{aligned} {{\mathbb {E}}}_\nu (f)=\lim _{n\rightarrow \infty }{{\mathbb {E}}}_\nu \big (f\left( U_n\cdot \hat{z}\right) \big ). \end{aligned}$$

Proposition 2.1 implies

$$\begin{aligned} {{\mathbb {E}}}_\nu (f)=\lim _{n\rightarrow \infty }{{\mathbb {E}}}^{\rho _\nu }\big (f\left( U_n\cdot \hat{z}\right) \big ) \end{aligned}$$

with $\rho _\nu \in {\mathcal {D}}_k$ a fixed point of $\phi $. By Proposition B.1, (Pur) implies that there exist non negative numbers $t_1,\ldots ,t_d$ summing up to one such that $\rho _\nu =t_1\rho _1\oplus \cdots \oplus t_d\rho _d\oplus 0_{M_D({{\mathbb {C}}})}$. From the definition of ${{\mathbb {P}}}^{\rho _\nu }$,

$$\begin{aligned} {{\mathbb {P}}}^{\rho _\nu }=t_1{{\mathbb {P}}}^{\rho _1}+\cdots +t_d{{\mathbb {P}}}^{\rho _d} \end{aligned}$$

where we used the abuse of notation $\rho _j\equiv 0\oplus \cdots \oplus \rho _j\oplus \cdots \oplus 0$. Using Proposition 2.1, it follows that

$$\begin{aligned} {{\mathbb {E}}}_\nu (f)=\lim _{n\rightarrow \infty } t_1{{\mathbb {E}}}_{\nu _1}\big (f\left( U_n\cdot \hat{z}\right) \big )+\cdots +t_d{{\mathbb {E}}}_{\nu _d}\big (f\left( U_n\cdot \hat{z}\right) \big ). \end{aligned}$$

Then Lemma 2.3 and the $\Pi $-invariance of each measure $\nu _j$ yield the proposition. $\square $

Appendix C: Products of special unitary matrices

Proposition C.1

Assume ${\text {supp}}\mu \subset \mathrm {SU}(k)$. Let G be the smallest closed subgroup of $\mathrm {SU}(k)$ such that ${\text {supp}}\mu \subset G$. For any $\hat{x}\in {{\mathrm P}({{\mathbb {C}}}^k)}$, let $[\hat{x}]_G$ be the orbit of $\hat{x}$ with respect to G and the action $G\times {{\mathrm P}({{\mathbb {C}}}^k)}\ni (v,\hat{x})\mapsto v\cdot \hat{x}$. Namely, $[\hat{x}]_G:=\{\hat{y}\in {{\mathrm P}({{\mathbb {C}}}^k)}\ |\ \exists v\in G \text{ s.t. } \hat{y}=v\cdot \hat{x}\}$. Then, for any $\hat{x}$, there exists a unique $\Pi $-invariant probability measure supported on $[\hat{x}]_G$, and this unique invariant measure is uniform in the sense that for any $v\in G$ it is invariant by the map $\hat{x}\mapsto v\cdot \hat{x}$.

Corollary C.2

With the assumption and definitions of the last proposition, if $G=\mathrm {SU}(k)$, $\Pi $ has a unique invariant probability measure and this probability is the uniform one on ${{\mathrm P}({{\mathbb {C}}}^k)}$.

Proof

The corollary being a trivial consequence of $G=\mathrm {SU}(k)\Rightarrow [\hat{x}]_G={{\mathrm P}({{\mathbb {C}}}^k)}\ \forall \hat{x}\in {{\mathrm P}({{\mathbb {C}}}^k)}$, we are left with proving the proposition.

Let $P_\mu $ be the Markov kernel on G defined by the left multiplication: $P_\mu f(v)=\int _G f(uv)d\mu (u)$. Since G is compact as a closed subset of $\mathrm {SU}(k)$, following [1, Proposition 4.8.1, Theorem 4.8.2], the unique $P_\mu $-invariant probability measure $\mu _G$ on G is the normalized Haar measure on G. Since G is compact, Prokhorov’s theorem implies that for any $u\in G$,

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\sum _{k=1}^n\delta _uP_\mu ^k=\mu _G\quad \text{ weakly. } \end{aligned}$$

(41)

Let $\hat{x}\in {{\mathrm P}({{\mathbb {C}}}^k)}$. Since ${\text {supp}}\mu \subset G$, for any $\hat{y}\in [\hat{x}]_G$, $\Pi (\hat{y}, [\hat{x}]_G)=1$. Then, $[\hat{x}]_G$ being compact, there exists a $\Pi $-invariant measure $\nu $ supported on $[\hat{x}]_G$.

Let f be a continuous function on $[\hat{x}]_G$. Then,

$$\begin{aligned} \nu (f)=\frac{1}{n}\sum _{k=1}^n\nu \Pi ^k f=\frac{1}{n}\sum _{k=1}^n\int _{G^k\times [\hat{x}]_G} f\left( v_k \ldots v_1\cdot \hat{y}\right) \mathrm{d}\mu ^{\otimes k}(v_1,\ldots ,v_k)\mathrm{d}\nu (\hat{y}). \end{aligned}$$

For each $\hat{y}\in [\hat{x}]_G$ let $u_y\in G$ be such that $\hat{y}=u_y\cdot \hat{x}$. The map $v\mapsto vu_y\cdot \hat{x}$ being continuous, setting $u=u_y$, the weak convergence (41) and Lebesgue’s dominated convergence theorem imply,

$$\begin{aligned} \nu (f)=\int _{G}f\left( v\cdot \hat{x}\right) \mathrm{d}\mu _G(v). \end{aligned}$$

It follows that $\nu $ is the image measure of $\mu _G$ under the map $v\mapsto v\cdot \hat{x}$. The left multiplication invariance of the Haar measure $\mu _G$ yields the invariance of $\nu $ under the map $\hat{x}\mapsto v\cdot \hat{x}$ for any $v\in G$. $\square $
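The Cesàro convergence (41) can be illustrated on a finite group, where the Haar measure is the uniform distribution. The sketch below uses the two matrices of Example C.4, builds the group they generate, and averages the laws $\delta _uP_\mu ^k$ of the left-multiplication walk for $u=\mathrm{Id}$. Function names and the number of steps are illustrative choices, not part of the paper.

```python
def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def key(a):
    # Entries lie in {0, +-1, +-i}; rounding to integers removes signed zeros.
    return tuple((round(c.real), round(c.imag)) for row in a for c in row)


IDENTITY = ((1 + 0j, 0j), (0j, 1 + 0j))
V1 = ((1j, 0j), (0j, -1j))   # matrices of Example C.4
V2 = ((0j, 1j), (1j, 0j))


def generate_group(generators):
    """Closure of the generators under left multiplication, as {key: matrix}."""
    elements = {key(IDENTITY): IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for v in generators:
            h = matmul(v, g)
            if key(h) not in elements:
                elements[key(h)] = h
                frontier.append(h)
    return elements


def cesaro_average(n, elements):
    """Return ((1/n) sum_{k=1..n} delta_Id P^k, delta_Id P^n) as dicts over the group."""
    dist = {k: 0.0 for k in elements}
    dist[key(IDENTITY)] = 1.0
    avg = {k: 0.0 for k in elements}
    for _ in range(n):
        new = {k: 0.0 for k in elements}
        for k, p in dist.items():
            for v in (V1, V2):       # mu = (delta_v1 + delta_v2) / 2
                new[key(matmul(v, elements[k]))] += 0.5 * p
        dist = new
        for k in avg:
            avg[k] += dist[k] / n
    return avg, dist


if __name__ == "__main__":
    G = generate_group((V1, V2))
    avg, last = cesaro_average(1000, G)
    print(len(G), max(abs(p - 1 / len(G)) for p in avg.values()))
```

In this example the laws $\delta _uP_\mu ^k$ themselves do not converge: the walk is periodic, so at every step some group elements carry zero mass. Only the Cesàro averages approach the uniform distribution, which is why (41) is stated for averages.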

Example C.3

Let $\mu =\frac{1}{2}(\delta _{v_1}+\delta _{v_2})$ with,

$$\begin{aligned} v_1=\begin{pmatrix} e^i&{}\quad 0\\ 0&{}\quad e^{-i} \end{pmatrix}\quad \text{ and }\quad v_2=\begin{pmatrix} \cos 1&{}\quad i\sin 1\\ i\sin 1 &{}\quad \cos 1 \end{pmatrix}. \end{aligned}$$

Then $G=\mathrm {SU}(2)$ and the uniform measure on ${\mathrm P}({{\mathbb {C}}}^2)$ is the unique $\Pi $-invariant probability measure.

Proof

Following Proposition C.1, it is sufficient to prove that any element of $\mathrm {SU}(2)$ is the limit of a sequence of products of $v_1$ and $v_2$.

Let $\sigma _1,\sigma _2,\sigma _3$ be the usual Pauli matrices:

$$\begin{aligned} \sigma _1:=\begin{pmatrix} 0&{}\quad 1\\ 1&{}\quad 0 \end{pmatrix},\quad \sigma _2:=\begin{pmatrix} 0&{}\quad -i\\ i&{}\quad 0 \end{pmatrix}\quad \text{ and }\quad \sigma _3=\begin{pmatrix} 1&{}\quad 0\\ 0&{}\quad -1 \end{pmatrix}. \end{aligned}$$

The Pauli matrices being generators of $\mathrm {SU}(2)$ in its fundamental representation, for any $u\in \mathrm {SU}(2)$, there exist three reals $\theta _1,\theta _2,\theta _3\in {{\mathbb {R}}}$ such that

$$\begin{aligned} u=\exp (i(\theta _1\sigma _1+\theta _2\sigma _2+\theta _3\sigma _3)). \end{aligned}$$

In particular, $v_1=\exp (i\sigma _3)$ and $v_2=\exp (i\sigma _1)$. Since for any $j=1,2,3$, $\exp (i\theta _j\sigma _j)=\exp (i(\theta _j+2\pi )\sigma _j)$ and $1/(2\pi )$ is irrational, the powers $v_1^n=\exp (in\sigma _3)$ are dense in $\{\exp (i\theta \sigma _3):\theta \in {{\mathbb {R}}}\}$, and similarly for $v_2$. Hence, taking limits of sequences of powers of $v_1$ or $v_2$, for any $\theta \in {{\mathbb {R}}}$, both

$$\begin{aligned} e^{i\theta \sigma _1}\quad \text{ and }\quad e^{i\theta \sigma _3} \end{aligned}$$

are elements of G. It remains to show that any $u\in \mathrm {SU}(2)$ is a product of elements equal to $\exp (i\theta \sigma _1)$ or $\exp (i\theta \sigma _3)$ with $\theta $ real.
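The approximation of $e^{i\theta \sigma _3}$ by powers of $v_1$ can be made concrete. Since $v_1^n=\mathrm{diag}(e^{in},e^{-in})$, the operator norm of $v_1^n-e^{i\theta \sigma _3}$ equals $|e^{in}-e^{i\theta }|$, and the density of the integers modulo $2\pi $ guarantees that this can be made arbitrarily small. The following sketch searches for a good exponent; the function name and the search range are illustrative assumptions.

```python
import cmath


def closest_power(theta, max_power=2000):
    """Return (n, err) minimizing err = |e^{in} - e^{i theta}| over 1 <= n <= max_power.

    err is the operator-norm distance between v_1^n and exp(i theta sigma_3).
    """
    target = cmath.exp(1j * theta)
    best_n, best_err = 1, abs(cmath.exp(1j) - target)
    for n in range(2, max_power + 1):
        err = abs(cmath.exp(1j * n) - target)
        if err < best_err:
            best_n, best_err = n, err
    return best_n, best_err


if __name__ == "__main__":
    for theta in (0.5, 2.0, -1.3):
        print(theta, closest_power(theta))
```

The same search applies to $v_2$ and $e^{i\theta \sigma _1}$, because $v_2^n=\exp (in\sigma _1)$ is unitarily equivalent to $v_1^n$.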

Fix $(\theta _1,\theta _2,\theta _3)\in {{\mathbb {R}}}^3$. Then using spherical coordinates in ${{\mathbb {R}}}^3$, there exist $r\in {{\mathbb {R}}}_+$, $\theta \in [0,\pi ]$ and $\varphi \in [0,2\pi )$ such that $\theta _1=r\cos \theta $, $\theta _2=r\sin \theta \cos \varphi $ and $\theta _3=r\sin \theta \sin \varphi $. Then by direct computation,

$$\begin{aligned} e^{i(\theta _1\sigma _1+\theta _2\sigma _2+\theta _3\sigma _3)} =e^{-i\frac{\varphi }{2}\sigma _1}e^{-i\frac{\theta }{2}\sigma _3}e^{ir\sigma _1} e^{i\frac{\theta }{2}\sigma _3}e^{i\frac{\varphi }{2}\sigma _1}. \end{aligned}$$

It follows that as a product of elements of G, $e^{i(\theta _1\sigma _1+\theta _2\sigma _2+\theta _3\sigma _3)}\in G$, hence $G=\mathrm {SU}(2)$ and the example holds. $\square $

Example C.4

Let $\mu =\frac{1}{2}(\delta _{v_1}+\delta _{v_2})$ with,

$$\begin{aligned} v_1=\begin{pmatrix} i&{}\quad 0\\ 0&{}\quad -i \end{pmatrix}\quad \text{ and }\quad v_2=\begin{pmatrix} 0&{}\quad i\\ i &{}\quad 0 \end{pmatrix}. \end{aligned}$$

Then $G=\{\pm \mathrm{Id}_{{{\mathbb {C}}}^2}, \pm v_1, \pm v_2, \pm v_1v_2\}$. For $z\in {{\mathbb {C}}}$, let $e_z=(1,z)^\mathsf {T}$ and $e_\infty =(0,1)^\mathsf {T}$. With the conventions $\infty ^{-1}=0$, $0^{-1}=\infty $ and $-\infty =\infty $, for any $z\in {{\mathbb {C}}}\cup \{\infty \}$, $[\hat{e}_z]_G=\{\hat{e}_z, \hat{e}_{z^{-1}}, \hat{e}_{-z}, \hat{e}_{-z^{-1}}\}$ and the measure $\frac{1}{4}(\delta _{\hat{e}_z}+\delta _{\hat{e}_{-z}}+\delta _{\hat{e}_{z^{-1}}}+\delta _{\hat{e}_{-z^{-1}}})$ is a $\Pi $-invariant probability measure.

The proof of this example is obtained by an explicit computation.
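A sketch of that computation is given below. It assumes the transition kernel has the form $\Pi (\hat{x},\cdot )=\int \Vert vx\Vert ^2\delta _{v\cdot \hat{x}}\,\mathrm{d}\mu (v)$ for unit x, which for the two unitary matrices of this example reduces to $\frac{1}{2}(\delta _{v_1\cdot \hat{x}}+\delta _{v_2\cdot \hat{x}})$. Points of ${\mathrm P}({{\mathbb {C}}}^2)$ are labelled by $z\in {{\mathbb {C}}}\cup \{\infty \}$ as in the statement; the helper names and the label `INF` are illustrative.

```python
V1 = ((1j, 0), (0, -1j))
V2 = ((0, 1j), (1j, 0))
INF = "inf"  # label of the point e_infinity = (0, 1)^T


def act(v, z):
    """Return the label w such that v e_z is proportional to e_w."""
    x = (0, 1) if z == INF else (1, z)
    y0 = v[0][0] * x[0] + v[0][1] * x[1]
    y1 = v[1][0] * x[0] + v[1][1] * x[1]
    return INF if y0 == 0 else y1 / y0


def same(a, b, tol=1e-12):
    if a == INF or b == INF:
        return a == b
    return abs(a - b) < tol


def merge(measure):
    """Add up the weights of coinciding points in a list of [point, weight]."""
    out = []
    for z, w in measure:
        for entry in out:
            if same(entry[0], z):
                entry[1] += w
                break
        else:
            out.append([z, w])
    return out


def push(measure):
    """nu Pi for a finitely supported nu, with mu = (delta_v1 + delta_v2) / 2."""
    return [[act(v, z), w / 2] for z, w in measure for v in (V1, V2)]


def orbit_measure(z):
    """Equal weights on z, -z, 1/z, -1/z, written via the action of v1 and v2."""
    points = [z, act(V1, z), act(V2, z), act(V1, act(V2, z))]
    return [[p, 0.25] for p in points]


def is_invariant(measure):
    nu, nu_pi = merge(measure), merge(push(measure))
    return len(nu) == len(nu_pi) and all(
        any(same(z, y) and abs(w - u) < 1e-12 for y, u in nu_pi) for z, w in nu
    )


if __name__ == "__main__":
    z = 2 + 1j
    print(act(V1, z), act(V2, z), is_invariant(orbit_measure(z)))
```

The action of $v_1$ is $z\mapsto -z$ and that of $v_2$ is $z\mapsto z^{-1}$; both permute the orbit, which is why the equal-weight measure on it is preserved.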

About this article

Cite this article

Benoist, T., Fraas, M., Pautrat, Y. et al. Invariant measure for quantum trajectories. Probab. Theory Relat. Fields 174, 307–334 (2019). https://doi.org/10.1007/s00440-018-0862-9

Received: 02 May 2017
Revised: 19 June 2018
Published: 20 July 2018
Issue Date: 01 June 2019
DOI: https://doi.org/10.1007/s00440-018-0862-9
