1 Random Matrices

Random matrix theory consists of choosing an N × N matrix at random and looking at natural properties of that matrix, notably its eigenvalues. Typically, interesting results are obtained only for large random matrices, that is, in the limit as N tends to infinity. The subject began with the work of Wigner [43], who was studying energy levels in large atomic nuclei. The subject took on new life with the discovery that the eigenvalues of certain types of large random matrices resemble the energy levels of quantum chaotic systems—that is, quantum mechanical systems for which the underlying classical system is chaotic. (See, e.g., [20] or [39].) There is also a fascinating conjectural agreement, due to Montgomery [35], between the statistical behavior of zeros of the Riemann zeta function and the eigenvalues of random matrices. See also [30] or [6].

We will review briefly some standard results in the subject, which may be found in textbooks such as those by Tao [40] or Mehta [33].

1.1 The Gaussian Unitary Ensemble

The first example of a random matrix is the Gaussian unitary ensemble (GUE) introduced by Wigner [43]. Let H N denote the real vector space of N × N Hermitian matrices, that is, those with X∗ = X, where X∗ is the conjugate transpose of X. We then consider a Gaussian measure on H N given by

$$\displaystyle \begin{aligned} d_{N}e^{-N\mathrm{trace}(X^{2})/2}~dX,\quad X\in H_{N}, {} \end{aligned} $$
(1)

where dX denotes the Lebesgue measure on H N and where d N is a normalizing constant. If X N is a random matrix having this measure as its distribution, then the diagonal entries are normally distributed real random variables with mean zero and variance 1∕N. The off-diagonal entries are normally distributed complex random variables, again with mean zero and variance 1∕N. Finally, the entries are as independent as possible given that they are constrained to be Hermitian, meaning that the entries on and above the diagonal are independent (and then the entries below the diagonal are determined by those above the diagonal). The factor of N in the exponent in (1) is responsible for making the variance of the entries of order 1∕N. This scaling of the variances, in turn, guarantees that the eigenvalues of the random matrix X N do not blow up as N tends to infinity.
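For readers who wish to experiment, here is a minimal Python sketch (using numpy; the function name sample_gue is ours, not from the text) of how one might sample from the distribution (1), with exactly the variances described above:

```python
import numpy as np

def sample_gue(N, rng=None):
    """Sample an N x N GUE matrix with the normalization in (1):
    real diagonal entries of variance 1/N, complex off-diagonal entries
    of variance 1/N, independent on and above the diagonal."""
    rng = np.random.default_rng() if rng is None else rng
    # A has i.i.d. complex entries with E|A_jk|^2 = 1/N
    A = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
    return (A + A.conj().T) / np.sqrt(2)  # Hermitian, with the variances above
```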

In order to state the first main result of random matrix theory, we introduce the following notation.

Definition 1

For any N × N matrix X, the empirical eigenvalue distribution of X is the probability measure on \(\mathbb {C}\) given by

$$\displaystyle \begin{aligned} \frac{1}{N}\sum_{j=1}^{N}\delta_{\lambda_{j}}, \end{aligned}$$

where {λ 1, …, λ N} are the eigenvalues of X, listed with their algebraic multiplicity.

We now state Wigner’s semicircle law.

Theorem 2

Let X N be a sequence of independently chosen N × N random matrices, each chosen according to the probability distribution in (1). Then as N → ∞, the empirical eigenvalue distribution of X N converges almost surely in the weak topology to Wigner’s semicircle law, namely, the measure supported on [−2, 2] and given there by

$$\displaystyle \begin{aligned} \frac{1}{2\pi}\sqrt{4-x^{2}}~dx,\quad -2\leq x\leq2. {} \end{aligned} $$
(2)

Figure 1 shows a simulation of the Gaussian unitary ensemble for N = 2,000, plotted against the semicircular density in (2). One notable aspect of Theorem 2 is that the limiting eigenvalue distribution (i.e., the semicircular measure in (2)) is nonrandom. That is to say, we are choosing a matrix at random, so that its eigenvalues are random, but in the large-N limit, the randomness in the bulk eigenvalue distribution disappears—it is always semicircular. Thus, if we were to select another GUE matrix with N = 2,000 and plot its eigenvalues, the histogram would (with high probability) look very much like the one in Figure 1.

Fig. 1 A histogram of the eigenvalues of a GUE random variable with N = 2,000, plotted against a semicircular density
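As a numerical illustration of Theorem 2, one can compare a histogram of the eigenvalues with the semicircular density (2). The following rough sketch reuses the sample_gue function from the snippet above:

```python
import numpy as np

N = 2000
eigs = np.linalg.eigvalsh(sample_gue(N))   # real eigenvalues of the Hermitian matrix

hist, edges = np.histogram(eigs, bins=40, range=(-2, 2), density=True)
centers = (edges[:-1] + edges[1:]) / 2
semicircle = np.sqrt(4 - centers**2) / (2 * np.pi)
print(np.max(np.abs(hist - semicircle)))   # small when N is large
```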

It is important to note, however, that if one zooms in with a magnifying glass so that one can see the individual eigenvalues of a large GUE matrix, the randomness in the eigenvalues will persist. The behavior of these individual eigenvalues is of considerable interest, because they are supposed to resemble the energy levels of a “quantum chaotic system” (i.e., a quantum mechanical system whose classical counterpart is chaotic). Nevertheless, in this article, I will deal only with the bulk properties of the eigenvalues.

1.2 The Ginibre Ensemble

We now discuss the non-Hermitian counterpart to the Gaussian unitary ensemble, known as the Ginibre ensemble [15]. We let \(M_{N}(\mathbb {C})\) denote the space of all N × N matrices, not necessarily Hermitian. We then define a measure on \(M_{N}(\mathbb {C})\) using a formula similar to the one in the Hermitian case:

$$\displaystyle \begin{aligned} f_{N}~e^{-N\mathrm{trace}(Z^{\ast}Z)}~dZ,\quad Z\in M_{N}(\mathbb{C}), {} \end{aligned} $$
(3)

where dZ denotes the Lebesgue measure on \(M_{N}(\mathbb {C})\) and where f N is a normalizing constant. In this case, the eigenvalues need not be real, and they follow the circular law.

Theorem 3

Let Z N be a sequence of independently chosen N × N random matrices, each chosen according to the probability distribution in (3). Then as N → ∞, the empirical eigenvalue distribution of Z N converges almost surely in the weak topology to the uniform measure on the unit disk.

Figure 2 shows the eigenvalues of a random matrix chosen from the Ginibre ensemble with N = 2,000. As in the GUE case, the bulk eigenvalue distribution becomes deterministic in the large-N limit. One can also zoom in with a magnifying glass on the eigenvalues of a Ginibre matrix until the individual eigenvalues become visible; the local behavior of these eigenvalues is an interesting problem, but one that will not be discussed in this article.

Fig. 2 A plot of the eigenvalues of a Ginibre matrix with N = 2,000
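Sampling from the Ginibre ensemble is even simpler than in the GUE case, since there is no Hermiticity constraint. The following sketch (variable names ours) checks the circular law by comparing the fraction of eigenvalues of modulus at most r with the uniform-disk value r²:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
# i.i.d. complex entries with E|Z_jk|^2 = 1/N, matching the measure (3)
Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
eigs = np.linalg.eigvals(Z)

for r in (0.5, 0.8, 1.0):
    print(r, np.mean(np.abs(eigs) <= r), r**2)  # empirical mass vs. uniform-disk mass
```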

1.3 The Ginibre Brownian Motion

In this article, I will discuss a certain approach to analyzing the behavior of the eigenvalues in the Ginibre ensemble. The main purpose of this analysis is not so much to obtain the circular law, which can be proved by various other methods. The main purpose is rather to develop tools that can be used to study a more complex random matrix model in the group of invertible N × N matrices. The Ginibre case then represents a useful prototype for this more complicated problem.

It is then useful to introduce a time parameter into the description of the Ginibre ensemble, which we can do by studying the Ginibre Brownian motion. Specifically, in any finite-dimensional real inner product space V , there is a natural notion of Brownian motion. The Ginibre Brownian motion is obtained by taking V  to be \(M_{N}(\mathbb {C})\), viewed as a real vector space of dimension 2N 2, and using the (real) inner product \(\left \langle \cdot ,\cdot \right \rangle _{N}\) given by

$$\displaystyle \begin{aligned} \left\langle X,Y\right\rangle _{N}:=N\operatorname{Re}(\mathrm{trace}(X^{\ast }Y)). \end{aligned}$$

We let \(C_{t}^{N}\) denote this Brownian motion, assumed to start at the origin.

At any one fixed time, the distribution of \(C_{t}^{N}\) is just the same as \(\sqrt {t}Z^{N},\) where Z N is distributed as the Ginibre ensemble. The joint distribution of the process \(C_{t}^{N}\) for various values of t is determined by the following property: For any collection of times 0 = t 0 < t 1 < t 2 < ⋯ < t k, the “increments”

$$\displaystyle \begin{aligned} C_{t_{1}}^{N}-C_{t_{0}}^{N},C_{t_{2}}^{N}-C_{t_{1}}^{N},\ldots,C_{t_{k}} ^{N}-C_{t_{k-1}}^{N} {} \end{aligned} $$
(4)

are independent, with the jth increment distributed as \(\sqrt {t_{j}-t_{j-1}}Z^{N}.\)
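These increment properties translate directly into a simulation. Here is a minimal numpy sketch (function names ours) that builds the path of the Ginibre Brownian motion from independent, appropriately scaled Ginibre increments:

```python
import numpy as np

def ginibre(N, rng):
    """A Ginibre matrix: i.i.d. complex entries with E|Z_jk|^2 = 1/N."""
    return (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)

def ginibre_bm(N, times, rng=None):
    """Sample C_t at the given increasing times, with C_0 = 0: the increment
    over [s, t] is independent of the past, distributed as sqrt(t - s) * Ginibre."""
    rng = np.random.default_rng() if rng is None else rng
    C, t_prev, path = np.zeros((N, N), dtype=complex), 0.0, []
    for t in times:
        C = C + np.sqrt(t - t_prev) * ginibre(N, rng)
        path.append(C.copy())
        t_prev = t
    return path
```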

2 Large-N Limits in Random Matrix Theory

Results in random matrix theory are typically expressed by first computing some quantity (e.g., the empirical eigenvalue distribution) associated to an N × N random matrix and then letting N tend to infinity. It is nevertheless interesting to ask whether there is some sort of limiting object that captures the large-N limit of the entire random matrix model. In this section, we discuss one common approach to constructing such a limiting object.

2.1 Limit in ∗-Distribution

Suppose we have a matrix-valued random variable X, not necessarily normal. We can then speak about the ∗-moments of X, which are expressions like

$$\displaystyle \begin{aligned} \mathbb{E}\left\{ \frac{1}{N}\mathrm{trace}(X^{2}(X^{\ast})^{3}X^{4}X^{\ast })\right\} . \end{aligned}$$

Generally, suppose p(a, b) is a polynomial in two noncommuting variables, that is, a linear combination of words involving products of a’s and b’s in all possible orders. We may then consider

$$\displaystyle \begin{aligned} \mathbb{E}\left\{ \frac{1}{N}\mathrm{trace}[p(X,X^{\ast})]\right\} . \end{aligned}$$

If, as usual, we have a family X N of N × N random matrices, we may consider the limits of such ∗-moments (if the limits exist):

$$\displaystyle \begin{aligned} \lim_{N\rightarrow\infty}\mathbb{E}\left\{ \frac{1}{N}\mathrm{trace} [p(X^{N},(X^{N})^{\ast})]\right\} . {} \end{aligned} $$
(5)
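As a concrete illustration, one can estimate such limits by Monte Carlo for a Ginibre matrix. In the sketch below (our naming conventions: the string word encodes a ∗-monomial, with 'x' standing for X and 's' for X∗):

```python
import numpy as np

def star_moment(word, N=500, samples=20, rng=None):
    """Estimate E{(1/N) trace[word(X, X*)]} for an N x N Ginibre matrix X."""
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0 + 0.0j
    for _ in range(samples):
        X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
        M = np.eye(N, dtype=complex)
        for ch in word:
            M = M @ (X if ch == 'x' else X.conj().T)
        total += np.trace(M) / N
    return total / samples

# E{tr(X X*)} -> 1 and E{tr(X^2)} -> 0 as N grows
print(star_moment('xs'), star_moment('xx'))
```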

2.2 Tracial von Neumann Algebras

Our goal is now to find some sort of limiting object that can encode all of the limits in (5). Specifically, we will try to find the following objects: (1) an operator algebra \(\mathcal {A}\), (2) a “trace” \(\tau :\mathcal {A}\rightarrow \mathbb {C},\) and (3) an element x of \(\mathcal {A},\) such that for each polynomial p in two noncommuting variables, we have

$$\displaystyle \begin{aligned} \lim_{N\rightarrow\infty}\mathbb{E}\left\{ \frac{1}{N}\mathrm{trace} [p(X^{N},(X^{N})^{\ast})]\right\} =\tau\lbrack p(x,x^{\ast})]. {} \end{aligned} $$
(6)

We now explain in more detail what these objects should be. First, we generally take \(\mathcal {A}\) to be a von Neumann algebra, that is, an algebra of operators that contains the identity, is closed under taking adjoints, and is closed under taking weak operator limits. Second, the “trace” τ is not actually computed by taking the trace of elements of \(\mathcal {A},\) which are typically not of trace class. Rather, τ is a linear functional that has properties similar to the properties of the normalized trace \(\frac {1}{N}\mathrm {trace}(\cdot )\) for matrices. Specifically, we require the following properties:

  • τ(1) = 1, where on the left-hand side, 1 denotes the identity operator,

  • τ(a∗a) ≥ 0, with equality only if a = 0,

  • τ(ab) = τ(ba), and

  • τ should be continuous with respect to the weak-∗ topology on \(\mathcal {A}.\)

Last, x is a single element of \(\mathcal {A}.\)

We will refer to the pair \((\mathcal {A},\tau )\) as a tracial von Neumann algebra. We will not discuss here the methods used for actually constructing interesting examples of tracial von Neumann algebras. Instead, we will simply accept as a known result that certain random matrix models admit large-N limits as operators in a tracial von Neumann algebra. (The interested reader may consult the work of Biane and Speicher [5], who use a Fock space construction to find tracial von Neumann algebras of the sort we will be using in this article.)

Let me emphasize that although X N is a matrix-valued random variable, x is not an operator-valued random variable. Rather, x is a single operator in the operator algebra \(\mathcal {A}.\) This situation reflects a typical property of random matrix models, examples of which we have already seen in Sections 1.1 and 1.2: certain random quantities become nonrandom in the large-N limit. In the present context, it is often the case that we have a stronger statement than (6), as follows: If we sample the X N’s independently for different N’s, then with probability one, we will have

$$\displaystyle \begin{aligned} \lim_{N\rightarrow\infty}\frac{1}{N}\mathrm{trace}[p(X^{N},(X^{N})^{\ast })]=\tau\lbrack p(x,x^{\ast})]. \end{aligned}$$

That is to say, in many cases, the random quantity \(\frac {1} {N}\mathrm {trace}[p(X^{N},(X^{N})^{\ast })]\) converges almost surely to the single, deterministic number τ[p(x, x∗)] as N tends to infinity.

2.3 Free Independence

In random matrix theory, it is often convenient to construct random matrices as sums or products of other random matrices, which are frequently assumed to be independent of one another. The appropriate notion of independence in the large-N limit—that is, in a tracial von Neumann algebra—is the notion of “freeness” or “free independence.” This concept was introduced by Voiculescu [41, 42] and has become a powerful tool in random matrix theory. (See also the monographs [36] by Nica and Speicher and [34] by Mingo and Speicher.) Given an element a in a tracial von Neumann algebra \((\mathcal {A},\tau )\) and a polynomial p, we may form the element p(a). We also let \(\dot {p}(a)\) denote the corresponding “centered” element, given by

$$\displaystyle \begin{aligned} \dot{p}(a)=p(a)-\tau(p(a)) \end{aligned}$$

We then say that elements a 1, …, a k are freely independent (or, more concisely, free) if the following condition holds. Let j 1, …, j n be any sequence of indices taken from {1, …, k}, with the property that each j l is distinct from j l+1. Let \(p_{j_{1}},\ldots ,p_{j_{n}}\) be any sequence of polynomials. Then we should have

$$\displaystyle \begin{aligned} \tau(\dot{p}_{j_{1}}(a_{j_{1}})\dot{p}_{j_{2}}(a_{j_{2}})\cdots\dot{p}_{j_{n} }(a_{j_{n}}))=0. \end{aligned}$$

Thus, for example, if a and b are freely independent, then

$$\displaystyle \begin{aligned} \tau\lbrack(a^{2}-\tau(a^{2}))(b^{2}-\tau(b^{2}))(a-\tau(a))]=0. \end{aligned}$$

The concept of freeness allows us, in principle, to disentangle traces of arbitrary words in freely independent elements, thereby reducing the computation to the traces of powers of individual elements. As an example, let us do a few computations with two freely independent elements a and b. We form the corresponding centered elements a − τ(a) and b − τ(b) and start applying the definition:

$$\displaystyle \begin{aligned} 0 & =\tau\lbrack(a-\tau(a))(b-\tau(b))]\\ & =\tau\lbrack ab]-\tau\lbrack\tau(a)b]-\tau\lbrack a\tau(b)]+\tau\lbrack \tau(a)\tau(b)]\\ & =\tau\lbrack ab]-\tau(a)\tau(b)-\tau(a)\tau(b)+\tau(a)\tau(b)\\ & =\tau\lbrack ab]-\tau(a)\tau(b), \end{aligned} $$

where we have used that scalars can be pulled outside the trace and that τ(1) = 1. We conclude, then, that

$$\displaystyle \begin{aligned} \tau(ab)=\tau(a)\tau(b). \end{aligned}$$

A similar computation shows that τ(a 2 b) = τ(a 2)τ(b) and that τ(ab 2) = τ(a)τ(b 2).

The first really interesting case comes when we compute τ(abab). We start with

$$\displaystyle \begin{aligned} 0=\tau\lbrack(a-\tau(a))(b-\tau(b))(a-\tau(a))(b-\tau(b))] \end{aligned}$$

and expand out the right-hand side as τ(abab) plus a sum of fifteen terms, all of which reduce to previously computed quantities. Sparing the reader the details of this computation, we find that

$$\displaystyle \begin{aligned} \tau(abab)=\tau(a^{2})\tau(b)^{2}+\tau(a)^{2}\tau(b^{2})-\tau(a)^{2} \tau(b)^{2}. \end{aligned}$$
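By a theorem of Voiculescu [41, 42], independent GUE matrices are asymptotically free, so formulas like this one can be tested numerically. The sketch below (reusing the sample_gue function from the snippet in Section 1.1) shifts a by the identity so that the right-hand side is not trivially zero:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1500
a = sample_gue(N, rng) + np.eye(N)   # tau(a) ~ 1, tau(a^2) ~ 2
b = sample_gue(N, rng)               # tau(b) ~ 0, tau(b^2) ~ 1
tr = lambda M: (np.trace(M) / N).real

lhs = tr(a @ b @ a @ b)
rhs = tr(a @ a) * tr(b)**2 + tr(a)**2 * tr(b @ b) - tr(a)**2 * tr(b)**2
print(lhs, rhs)   # both approximately 1 for large N
```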

Although the notion of free independence will not explicitly be used in the rest of this article, it is certainly a key concept that is always lurking in the background.

2.4 The Circular Brownian Motion

If Z N is a Ginibre random matrix (Section 1.2), then the ∗-moments of Z N converge to those of a “circular element” c in a certain tracial von Neumann algebra \((\mathcal {A},\tau ).\) The ∗-moments of c can be computed in an efficient combinatorial way (e.g., Example 11.23 in [36]). We have, for example, τ(c∗c) = 1 and τ(c k) = 0 for all positive integers k.

More generally, we can realize the large-N limit of the entire Ginibre Brownian motion \(C_{t}^{N},\) for all t > 0, as a family of elements c t in a tracial von Neumann algebra \((\mathcal {A},\tau ).\) In the limit, the ordinary independence of the increments of \(C_{t}^{N}\) (Section 1.3) is replaced by the free independence of the increments of c t. That is, for all 0 = t 0 < t 1 < ⋯ < t k, the elements

$$\displaystyle \begin{aligned} c_{t_{1}}-c_{t_{0}},c_{t_{2}}-c_{t_{1}},\ldots,c_{t_{k}}-c_{t_{k-1}} \end{aligned}$$

are freely independent, in the sense described in the previous subsection. For any t > 0, the ∗-distribution of c t is the same as the ∗-distribution of \(\sqrt {t}c_{1}.\)

3 Brown Measure

3.1 The Goal

Recall that if A is an N × N matrix with eigenvalues λ 1, …, λ N, the empirical eigenvalue distribution μ A of A is the probability measure on \(\mathbb {C}\) assigning mass 1∕N to each eigenvalue:

$$\displaystyle \begin{aligned} \mu_{A}=\frac{1}{N}\sum_{j=1}^{N}\delta_{\lambda_{j}}. \end{aligned}$$

Goal 4

Given an arbitrary element x in a tracial von Neumann algebra \((\mathcal {A} ,\tau ),\) construct a probability measure μ x on \(\mathbb {C}\) analogous to the empirical eigenvalue distribution of a matrix.

If \(x\in \mathcal {A}\) is normal, then there is a standard way to construct such a measure. The spectral theorem allows us to construct a projection-valued measure γ x [23, Section 10.3] associated to x. For each Borel set E, the projection γ x(E) will, again, belong to the von Neumann algebra \(\mathcal {A},\) and we may therefore define

$$\displaystyle \begin{aligned} \mu_{x}(E)=\tau\lbrack\gamma_{x}(E)]. {} \end{aligned} $$
(7)

We refer to μ x as the distribution of x (relative to the trace τ). If x is not normal, we need a different construction—but one that we hope will agree with the above construction in the normal case.

3.2 A Motivating Computation

If A is an N × N matrix, define a function \(s:\mathbb {C} \rightarrow \mathbb {R\cup \{-\infty \}}\) by

$$\displaystyle \begin{aligned} s(\lambda)=\log(\left\vert \det(A-\lambda)\right\vert ^{2/N}), \end{aligned}$$

where the logarithm takes the value −∞ when \(\det (A-\lambda )=0.\) Note that s is computed from the characteristic polynomial \(\det (A-\lambda )\) of A. We can express s in terms of the eigenvalues λ 1, …, λ N of A (taken with their algebraic multiplicity) as

$$\displaystyle \begin{aligned} s(\lambda)=\frac{2}{N}\sum_{j=1}^{N}\log\left\vert \lambda-\lambda _{j}\right\vert . {} \end{aligned} $$
(8)

See Figure 3 for a plot of (the negative of) s(λ).

Fig. 3 A plot of the function − s(λ) for a matrix with five eigenvalues. The function is harmonic except at the singularities

We then recall that the function \(\log \left \vert \lambda \right \vert \) is a multiple of the Green’s function for the Laplacian on the plane, meaning that the function is harmonic away from the origin and that

$$\displaystyle \begin{aligned} \Delta\log\left\vert \lambda\right\vert ={2\pi}\delta_{0}(\lambda), \end{aligned}$$

where δ 0 is a δ-measure at the origin. Thus, if we take the Laplacian of s(λ), with an appropriate normalizing factor, we get the following nice result.

Proposition 5

The Laplacian, in the distribution sense, of the function s(λ) in (8) satisfies

$$\displaystyle \begin{aligned} \frac{1}{4\pi}\Delta s(\lambda)=\frac{1}{N}\sum_{j=1}^{N}\delta_{\lambda_{j} }(\lambda), \end{aligned}$$

where \(\delta _{\lambda _{j}}\) is a δ-measure at λ j. That is to say, \(\frac {1}{4\pi }\Delta s\) is the empirical eigenvalue distribution of A (Definition 1 ).

Recall that if B is a strictly positive self-adjoint matrix, then we can take the logarithm of B, which is the self-adjoint matrix obtained by keeping the eigenvectors of B fixed and taking the logarithm of the eigenvalues.

Proposition 6

The function s in (8) can also be computed as

$$\displaystyle \begin{aligned} s(\lambda)=\frac{1}{N}\mathrm{trace}[\log((A-\lambda)^{\ast}(A-\lambda))] {} \end{aligned} $$
(9)

or as

$$\displaystyle \begin{aligned} s(\lambda)=\lim_{\varepsilon\rightarrow0^{+}}\frac{1}{N}\mathrm{trace} [\log((A-\lambda)^{\ast}(A-\lambda)+\varepsilon)]. {} \end{aligned} $$
(10)

Here the logarithm is the self-adjoint logarithm of a positive self-adjoint matrix.

Note that in (9), the logarithm is undefined when λ is an eigenvalue of A. In (10), inserting ε > 0 guarantees that the logarithm is well defined for all λ, but a singularity of s(λ) at each eigenvalue still arises in the limit as ε approaches zero.

Proof

An elementary result [24, Theorem 2.12] says that for any matrix X, we have \(\det (e^{X})=e^{\mathrm {trace}(X)}\). If P is a strictly positive matrix, we may apply this result with \(X=\log P\) (so that e X = P) to get

$$\displaystyle \begin{aligned} \det(P)=e^{\mathrm{trace}(X)} \end{aligned}$$

or

$$\displaystyle \begin{aligned} \mathrm{trace}(\log P)=\log[\det P]. \end{aligned}$$

Let us now apply this identity with P = (A − λ)∗(A − λ), whenever λ is not an eigenvalue of A, to obtain

$$\displaystyle \begin{aligned} \frac{1}{N}\mathrm{trace}[\log((A-\lambda)^{\ast}(A-\lambda))] & =\frac {1}{N}\log[\det((A-\lambda)^{\ast}(A-\lambda))]\\ & =\frac{1}{N}\log[\det(A-\lambda)^{\ast}\det(A-\lambda)]\\ & =\log(\left\vert \det(A-\lambda)\right\vert ^{2/N}), \end{aligned} $$

where this last expression is the definition of s(λ).

Continuity of the matrix logarithm then establishes (10). □
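Propositions 5 and 6 are easy to test numerically. In the sketch below (variable names ours), we compute s(λ) once from the eigenvalues via (8) and once from the regularized trace-log formula (10) with a small ε:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
lam, eps = 0.3 + 0.2j, 1e-10

# s(lambda) from the eigenvalues, as in (8)
s_eigs = (2.0 / N) * np.sum(np.log(np.abs(np.linalg.eigvals(A) - lam)))

# s(lambda) from the regularized trace-log formula (10)
M = (A - lam * np.eye(N)).conj().T @ (A - lam * np.eye(N)) + eps * np.eye(N)
s_reg = np.sum(np.log(np.linalg.eigvalsh(M))) / N

print(s_eigs, s_reg)   # agree up to the small regularization
```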

3.3 Definition and Basic Properties

To define the Brown measure of a general element x in a tracial von Neumann algebra \((\mathcal {A},\tau ),\) we use the obvious generalization of (10). We refer to Brown’s original paper [7] along with Chapter 11 of [34] for general references on the material in this section.

Theorem 7

Let \((\mathcal {A},\tau )\) be a tracial von Neumann algebra and let x be an arbitrary element of \(\mathcal {A}.\) Define

$$\displaystyle \begin{aligned} S(\lambda,\varepsilon)=\tau\lbrack\log((x-\lambda)^{\ast}(x-\lambda )+\varepsilon)] {} \end{aligned} $$
(11)

for all \(\lambda \in \mathbb {C}\) and ε > 0. Then

$$\displaystyle \begin{aligned} s(\lambda):=\lim_{\varepsilon\rightarrow0^{+}}S(\lambda,\varepsilon) {} \end{aligned} $$
(12)

exists as an almost-everywhere-defined subharmonic function. Furthermore, the quantity

$$\displaystyle \begin{aligned} \frac{1}{4\pi}\Delta s, {} \end{aligned} $$
(13)

where the Laplacian is computed in the distribution sense, is represented by a probability measure on the plane. We call this measure the Brown measure of x and denote it by μ x.

The Brown measure of x is supported on the spectrum σ(x) of x and has the property that

$$\displaystyle \begin{aligned} \int_{\sigma(x)}\lambda^{k}~d\mu_{x}(\lambda)=\tau(x^{k}) {} \end{aligned} $$
(14)

for all non-negative integers k.

See the original article [7] or Chapter 11 of the monograph [34] of Mingo and Speicher. We also note that the quantity s(λ) is the logarithm of the Fuglede–Kadison determinant of x − λ; see [13, 14]. It is important to emphasize that, in general, the moment condition (14) does not uniquely determine the measure μ x. After all, σ(x) is an arbitrary nonempty compact subset of \(\mathbb {C},\) which could, for example, be a closed disk. To uniquely determine the measure, we would need to know the value of \(\int _{\sigma (x)}\lambda ^{k}\bar {\lambda }^{l}~d\mu _{x}(\lambda )\) for all non-negative integers k and l. There is not, however, any simple way to compute the value of \(\int _{\sigma (x)}\lambda ^{k}\bar {\lambda }^{l}~d\mu _{x}(\lambda )\) in terms of the operator x. In particular, unless x is normal, this integral need not be equal to τ[x k(x )l]. Thus, to compute the Brown measure of a general operator \(x\in \mathcal {A},\) we actually have to work with the rather complicated definition in (11), (12), and (13).
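To make the definition concrete, here is a rough finite-N numerical sketch (our naming; the parameters eps and h are ad hoc choices). It approximates (13) by applying a five-point finite-difference Laplacian in λ to the regularized function (11). For a large Ginibre matrix, the output should fluctuate around the circular-law density 1∕π ≈ 0.318 inside the unit disk:

```python
import numpy as np

def S_reg(A, lam, eps):
    """Finite-N version of (11): (1/N) trace log((A - lam)*(A - lam) + eps)."""
    N = A.shape[0]
    M = (A - lam * np.eye(N)).conj().T @ (A - lam * np.eye(N)) + eps * np.eye(N)
    return np.sum(np.log(np.linalg.eigvalsh(M))) / N

def brown_density(A, lam, eps=1e-2, h=1e-2):
    """(1/4pi) * Laplacian of S in lambda, via a five-point stencil."""
    lap = (S_reg(A, lam + h, eps) + S_reg(A, lam - h, eps)
           + S_reg(A, lam + 1j * h, eps) + S_reg(A, lam - 1j * h, eps)
           - 4 * S_reg(A, lam, eps)) / h**2
    return lap / (4 * np.pi)

rng = np.random.default_rng(3)
N = 400
Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
print(brown_density(Z, 0.2 + 0.1j), 1 / np.pi)
```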

We note two important special cases.

  • Suppose \(\mathcal {A}\) is the space of all N × N matrices and τ is the normalized trace, \(\tau \lbrack x]=\frac {1}{N}\mathrm {trace}(x).\) Then the Brown measure of any \(x\in \mathcal {A}\) is simply the empirical eigenvalue distribution of x, which puts mass 1∕N at each eigenvalue of x.

  • If x is normal, then the Brown measure μ x of x agrees with the measure defined in (7) using the spectral theorem.

3.4 Brown Measure in Random Matrix Theory

Suppose one has a family of N × N random matrix models X N and one wishes to determine the large-N limit of the empirical eigenvalue distribution of X N. (Recall Definition 1.) One may naturally use the following three-step process.

Step 1. Construct a large-N limit of X N as an operator x in a tracial von Neumann algebra \((\mathcal {A},\tau ).\)

Step 2. Determine the Brown measure μ x of x.

Step 3. Prove that the empirical eigenvalue distribution of X N converges almost surely to μ x as N tends to infinity.

It is important to emphasize that Step 3 in this process is not automatic. Indeed, this can be a difficult technical problem. Nevertheless, this article is concerned exclusively with Step 2 in the process (in situations where Step 1 has been carried out). For Step 3, the main tool is the Hermitization method developed in Girko’s pioneering paper [16] and further refined by Bai [1]. (Although neither of these authors explicitly uses the terminology of Brown measure, the idea is lurking there.)

There exist certain pathological examples where the limiting eigenvalue distribution does not coincide with the Brown measure. In light of a result of Śniady [38], we can say that such examples are associated with spectral instability, that is, matrices where a small change in the matrix produces a large change in the eigenvalues. Śniady shows that if we add to X N a small amount of random Gaussian noise, then the eigenvalue distribution of the perturbed matrices will converge to the Brown measure of the limiting object. (See also the papers [19] and [12], which obtain similar results by very different methods.) Thus, if the original random matrices X N are somehow “stable,” adding this noise should not change the eigenvalues of X N by much, and the eigenvalues of the original and perturbed matrices should be almost the same. In such a case, we should get convergence of the eigenvalues of X N to the Brown measure of the limiting object.

The canonical example in which instability occurs is the case in which X N = nilN, the deterministic N × N matrix having 1s just above the diagonal and 0s elsewhere. Then of course nilN is nilpotent, so all of its eigenvalues are zero. We note, however, that both \(\mathrm {nil}_{N}^{\ast }\mathrm {nil}_{N}\) and \(\mathrm {nil}_{N}\mathrm {nil} _{N}^{\ast }\) are diagonal matrices whose diagonal entries have N − 1 values of 1 and only a single value of 0. Thus, when N is large, nilN is “almost unitary,” in the sense that \(\mathrm {nil}_{N}^{\ast }\mathrm {nil}_{N}\) and \(\mathrm {nil}_{N}\mathrm {nil} _{N}^{\ast }\) are close to the identity. Furthermore, for any positive integer k, we have that \(\mathrm {nil}_{N}^{k}\) is again nilpotent, so that \(\mathrm {trace}[\mathrm {nil}_{N}^{k}]=0.\) Using these observations, it is not hard to show that the limiting object is a “Haar unitary,” that is, a unitary element u of a tracial von Neumann algebra satisfying τ(u k) = 0 for all positive integers k. The Brown measure of a Haar unitary is the uniform probability measure on the unit circle, while of course the eigenvalue distribution of X N is entirely concentrated at the origin.

In Figure 4 we see that even under a quite small perturbation (adding 10−6 times a Ginibre matrix), the spectrum of the nilpotent matrix X N changes quite a lot. After the perturbation, the spectrum clearly resembles a uniform distribution over the unit circle. In Figure 5, by contrast, we see that even under a much larger perturbation (adding 10−1 times a Ginibre matrix), the spectrum of a GUE matrix changes only slightly. (Note the vertical scale in Figure 5.)

Fig. 4 Spectra of the nilpotent matrix nilN (left) and of nilN + ε(Ginibre) with ε = 10−5 (right), with N = 2,000

Fig. 5 Spectrum of a GUE matrix X (left) and X + ε(Ginibre) with ε = 10−1 (right), with N = 2,000
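The instability is easy to reproduce numerically. In the sketch below (our naming), a tiny Ginibre perturbation moves the spectrum of nilN from the origin to, approximately, the unit circle:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 2000
nil = np.diag(np.ones(N - 1), k=1)   # 1s just above the diagonal, 0s elsewhere
G = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)

eigs = np.linalg.eigvals(nil + 1e-6 * G)
print(np.abs(eigs).min(), np.abs(eigs).max())   # moduli cluster near 1
```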

3.5 The Case of the Circular Brownian Motion

We now record the Brown measure of the circular Brownian motion.

Proposition 8

For any t > 0, the Brown measure of c t is the uniform probability measure on the disk of radius \(\sqrt {t}\) centered at the origin.

Now, as we noted in Section 2.4, the ∗-distribution of the circular Brownian motion at any time t > 0 is the same as the ∗-distribution of \(\sqrt {t}c_{1}.\) Thus, the proposition will follow if we know that the Brown measure of a circular element c is the uniform probability measure on the unit disk. This result, in turn, is well known; see, for example, Section 11.6.3 of [34].

4 PDE for the Circular Law

In this article, I present a different proof of Proposition 8 using the PDE method developed in [10]. The significance of this method is not so much that it gives another computation of the Brown measure of a circular element. Rather, it is a helpful warm-up case on the path to tackling the much more complicated problem in [10], namely, the computation of the Brown measure of the free multiplicative Brownian motion. In this section and the two that follow, I will show how the PDE method applies in the case of the circular Brownian motion. Then in the last section, I will describe the case of the free multiplicative Brownian motion.

The reader may also consult the recent preprint [29], which extends the results of [10] to the case of the free multiplicative Brownian motion with arbitrary unitary initial distribution. Section 3 of that paper also analyzes the case of the free circular Brownian motion (with an arbitrary Hermitian initial distribution) using PDE methods.

We let c t be the circular Brownian motion (Section 2.4 ). Then, following the construction of the Brown measure in Theorem 7, we define, for each \(\lambda \in \mathbb {C},\) a function S λ given by

$$\displaystyle \begin{aligned} S^{\lambda}(t,\varepsilon)=\tau\lbrack\log((c_{t}-\lambda)^{\ast}(c_{t}-\lambda)+\varepsilon)] {} \end{aligned} $$
(15)

for all t > 0 and ε > 0. The Brown measure of c t will then be obtained by letting ε tend to zero, taking the Laplacian with respect to λ, and dividing by 4π. Our first main result is that, for each λ, S λ(t, ε) satisfies a PDE in t and ε.

Theorem 9

For each \(\lambda \in \mathbb {C},\) the function S λ satisfies the first-order, nonlinear differential equation

$$\displaystyle \begin{aligned} \frac{\partial S^{\lambda}}{\partial t}=\varepsilon\left( \frac{\partial S^{\lambda} }{\partial \varepsilon}\right) ^{2} {} \end{aligned} $$
(16)

subject to the initial condition

$$\displaystyle \begin{aligned} S^{\lambda}(0,\varepsilon)=\log(\left\vert \lambda\right\vert ^{2}+\varepsilon). \end{aligned}$$

We now see the motivation for making λ a parameter rather than a variable for S: since λ does not appear in the PDE (16), we can think of solving the same equation for each different value of λ, with the dependence on λ entering only through the initial conditions.

On the other hand, we see that the regularization parameter ε plays a crucial role here as one of the variables in our PDE. Of course, we are ultimately interested in letting ε tend to zero, but since derivatives with respect to ε appear, we cannot merely set ε = 0 in the PDE.

Of course, the reader will point out that, formally, setting ε = 0 in (16) gives ∂S λ(t, 0)∕∂t = 0, because of the leading factor of ε on the right-hand side. This conclusion, however, is not actually correct, because ∂S λ∂ε can blow up as ε approaches zero. Actually, it will turn out that S λ(t, 0) is independent of t when \(\left \vert \lambda \right \vert >\sqrt {t},\) but not in general.

4.1 The Finite-N Equation

In this subsection, we give a heuristic argument for the PDE in Theorem 9. Although the argument is not rigorous as written, it should help explain what is going on. In particular, the computations that follow should make it clear why the PDE is only valid after taking the large-N limit.

4.1.1 The Result

We introduce a finite-N analog of the function S λ in Theorem 9 and compute its time derivative. Let \(C_{t}^{N}\) denote the Ginibre Brownian motion introduced in Section 1.3.

Proposition 10

For each N, let

$$\displaystyle \begin{aligned} S^{\lambda,N}(t,\varepsilon)=\mathbb{E}\{\mathrm{tr}[\log((C_{t}^{N}-\lambda)^{\ast }(C_{t}^{N}-\lambda)+\varepsilon)]\}. \end{aligned}$$

Then we have the following results.

  (1)

    The time derivative of S λ, N may be computed as

    $$\displaystyle \begin{aligned} \frac{\partial S^{\lambda,N}}{\partial t}=\varepsilon\mathbb{E}\{(\mathrm{tr} [((C_{t}^{N}-\lambda)^{\ast}(C_{t}^{N}-\lambda)+\varepsilon)^{-1}])^{2}\}. {} \end{aligned} $$
    (17)
  (2)

    We also have

    $$\displaystyle \begin{aligned} \frac{\partial}{\partial \varepsilon}\mathrm{tr}[\log((C_{t}^{N}-\lambda)^{\ast} (C_{t}^{N}-\lambda)+\varepsilon)]=\mathrm{tr}[((C_{t}^{N}-\lambda)^{\ast}(C_{t}^{N} -\lambda)+\varepsilon)^{-1}]. {} \end{aligned} $$
    (18)
  (3)

    Therefore, if we set

    $$\displaystyle \begin{aligned} T^{\lambda,N}=\mathrm{tr}[((C_{t}^{N}-\lambda)^{\ast}(C_{t}^{N}-\lambda )+\varepsilon)^{-1}], \end{aligned}$$

    we may rewrite the formula for ∂S λ, N∂t as

    $$\displaystyle \begin{aligned} \frac{\partial S^{\lambda,N}}{\partial t}=\varepsilon\left( \frac{\partial S^{\lambda,N}}{\partial \varepsilon}\right) ^{2}+\mathrm{Cov}, {} \end{aligned} $$
    (19)

    where Cov is a “covariance term” given by

    $$\displaystyle \begin{aligned} \mathrm{Cov}=\mathbb{E}\{(T^{\lambda,N})^{2}\}-(\mathbb{E}\{T^{\lambda ,N}\})^{2}. \end{aligned}$$

The key point to observe here is that in the formula (17) for ∂S λ, N∂t, we have the expectation value of the square of a trace. On the other hand, if we computed (∂S λ, N∂ε)2 by taking the expectation value of both sides of (18) and squaring, we would have the square of the expectation value of a trace. Thus, there is no PDE for S λ, N—we get an unavoidable covariance term on the right-hand side of (19).

On the other hand, the Ginibre Brownian motion \(C_{t}^{N}\) exhibits a concentration phenomenon for large N. Specifically, let us consider a family {Y N} of random variables of the form

$$\displaystyle \begin{aligned} Y^{N}=\mathrm{tr}[\text{word in }C_{t}^{N}\text{ and }(C_{t}^{N})^{\ast}]. \end{aligned}$$

(Thus, e.g., we might have \(Y^{N}=\mathrm {tr}[C_{t}^{N}(C_{t} ^{N})^{\ast }C_{t}^{N}(C_{t}^{N})^{\ast }].\)) Then it is known that (1) the large-N limit of \(\mathbb {E}\{Y^{N}\}\) exists, and (2) the variance of Y N goes to zero. That is to say, when N is large, Y N will be, with high probability, close to its expectation value. It then follows that \(\mathbb {E}\{(Y^{N})^{2}\}\) will be close to \((\mathbb {E}\{Y^{N}\})^{2}.\) (This concentration phenomenon was established by Voiculescu in [42] for the analogous case of the “GUE Brownian motion.” The case of the Ginibre Brownian motion is similar.)
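This concentration is easy to observe numerically. In the following sketch (our naming), the sample variance of the word tr[C t(C t)∗ C t(C t)∗] at t = 1 shrinks rapidly as N grows:

```python
import numpy as np

rng = np.random.default_rng(5)
for N in (50, 200, 800):
    vals = []
    for _ in range(40):
        C = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
        vals.append((np.trace(C @ C.conj().T @ C @ C.conj().T) / N).real)
    print(N, np.mean(vals), np.var(vals))   # mean ~ 2, variance -> 0
```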

Now, although the quantity

$$\displaystyle \begin{aligned} ((C_{t}^{N}-\lambda)^{\ast}(C_{t}^{N}-\lambda)+\varepsilon)^{-1} \end{aligned}$$

is not a word in \(C_{t}^{N}\) and \((C_{t}^{N})^{\ast },\) it is expressible—at least for large ε—as a power series in such words. It is therefore reasonable to expect—this is not a proof!—that the variance of T λ, N will go to zero as N goes to infinity, so that the covariance term in (19) will vanish in the limit.

4.1.2 Setting Up the Computation

We view \(M_{N}(\mathbb {C})\) as a real vector space of dimension 2N 2 and we use the following real-valued inner product \(\left \langle \cdot ,\cdot \right \rangle _{N}\):

$$\displaystyle \begin{aligned} \left\langle X,Y\right\rangle _{N}=N\operatorname{Re}(\mathrm{trace}(X^{\ast }Y)). {} \end{aligned} $$
(20)

The distribution of \(C_{t}^{N}\) is the Gaussian measure of variance t∕2 with respect to this inner product:

$$\displaystyle \begin{aligned} d\gamma_{t}(C)=d_{t}e^{-\left\langle C,C\right\rangle /t}~dC, \end{aligned}$$

where d t is a normalization constant and dC is the Lebesgue measure on \(M_{N}(\mathbb {C}).\) This measure is a heat kernel measure. If we let \(\mathbb {E}_{t}\) denote the expectation value with respect to γ t, then we have, for any “nice” function f,

$$\displaystyle \begin{aligned} \frac{d}{dt}\mathbb{E}_{t}\{f\}=\frac{1}{4}\mathbb{E}_{t}\{\Delta f\}, {} \end{aligned} $$
(21)

where Δ is the Laplacian on \(M_{N}(\mathbb {C})\) with respect to the inner product (20).

To compute more explicitly, we choose an orthonormal basis for \(M_{N} (\mathbb {C})\) over \(\mathbb {R}\) consisting of \(X_{1},\ldots ,X_{N^{2}}\) and \(Y_{1},\ldots ,Y_{N^{2}},\) where \(X_{1},\ldots ,X_{N^{2}}\) are skew-Hermitian and where Y j = iX j. We then introduce the directional derivatives \(\tilde {X}_{j}\) and \(\tilde {Y}_{j}\) defined by

$$\displaystyle \begin{aligned} (\tilde{X}_{j}f)(a)=\left. \frac{d}{ds}f(a+sX_{j})\right\vert {}_{s=0} ;\quad (\tilde{Y}_{j}f)(a)=\left. \frac{d}{ds}f(a+sY_{j})\right\vert {}_{s=0}. \end{aligned}$$

Then the Laplacian Δ is given by

$$\displaystyle \begin{aligned} \Delta=\sum_{j=1}^{N^{2}}\left( (\tilde{X}_{j})^{2}+(\tilde{Y}_{j} )^{2}\right). \end{aligned}$$

We also introduce the corresponding complex derivatives, Z j and \(\bar {Z}_{j}\) given by

$$\displaystyle \begin{aligned} Z_{j} & =\frac{1}{2}(\tilde{X}_{j}-i\tilde{Y}_{j});\\ \bar{Z}_{j} & =\frac{1}{2}(\tilde{X}_{j}+i\tilde{Y}_{j}), \end{aligned} $$

which give

$$\displaystyle \begin{aligned} \frac{1}{4}\Delta=\sum_{j=1}^{N^{2}}\bar{Z}_{j}Z_{j}. \end{aligned}$$

We now let C denote a matrix-valued variable ranging over \(M_{N} (\mathbb {C}).\) We may easily compute the following basic identities:

$$\displaystyle \begin{aligned} Z_{j}(C) & =X_{j};\quad Z_{j}(C^{\ast})=0;\\ \bar{Z}_{j}(C) & =0;\quad \bar{Z}_{j}(C^{\ast})=-X_{j}. {} \end{aligned} $$
(22)

(Keep in mind that X j is skew-Hermitian.) We will also need the following elementary but crucial identity

$$\displaystyle \begin{aligned} \sum_{j=1}^{N^{2}}X_{j}AX_{j}=-\mathrm{tr}(A), {} \end{aligned} $$
(23)

where tr(⋅) is the normalized trace, given by

$$\displaystyle \begin{aligned} \mathrm{tr}(A)=\frac{1}{N}\mathrm{trace}(A). \end{aligned}$$

See, for example, Proposition 3.1 in [9]. When applied to a function involving a normalized trace, this identity will produce a second trace.
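The identity (23) can be checked directly in code. The sketch below (our construction) builds an orthonormal basis of the skew-Hermitian matrices with respect to the inner product (20) and verifies that ∑ j X j AX j equals −tr(A) times the identity:

```python
import numpy as np

N = 3
basis = []   # orthonormal w.r.t. <X,Y> = N Re trace(X* Y)
for j in range(N):
    E = np.zeros((N, N), dtype=complex); E[j, j] = 1j
    basis.append(E / np.sqrt(N))
for j in range(N):
    for k in range(j + 1, N):
        E = np.zeros((N, N), dtype=complex); E[j, k], E[k, j] = 1.0, -1.0
        basis.append(E / np.sqrt(2 * N))
        F = np.zeros((N, N), dtype=complex); F[j, k], F[k, j] = 1j, 1j
        basis.append(F / np.sqrt(2 * N))

A = np.random.default_rng(6).standard_normal((N, N)) + 0j
lhs = sum(X @ A @ X for X in basis)
print(np.allclose(lhs, -(np.trace(A) / N) * np.eye(N)))   # True
```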

Finally, we need the following formulas for differentiating matrix-valued functions of a real variable:

$$\displaystyle \begin{aligned} \frac{d}{ds}A(s)^{-1} & =-A(s)^{-1}\frac{dA}{ds}A(s)^{-1}{} \end{aligned} $$
(24)
$$\displaystyle \begin{aligned} \frac{d}{ds}\mathrm{tr}[\log A(s)] & =\mathrm{tr}\left[ A(s)^{-1}\frac {dA}{ds}\right] . {} \end{aligned} $$
(25)

The first of these is standard and can be proved by differentiating the identity A(s)A(s)−1 = I. The second identity is Lemma 1.1 in [7]; it is important to emphasize that this second identity does not hold as written without the trace. One may derive (25) by using an integral formula for the derivative of the logarithm without the trace (see, e.g., Equation (11.10) in [27]) and then using the cyclic invariance of the trace, at which point the integral can be computed explicitly.

4.1.3 Proof of Proposition 10

We continue to let \(\mathbb {E}_{t}\) denote the expectation value with respect to the measure γ t, which is the distribution at time t of the Ginibre Brownian motion \(C_{t}^{N},\) so that

$$\displaystyle \begin{aligned} S^{\lambda,N}(t,\varepsilon)=\mathbb{E}_{t}\{\mathrm{tr}[\log((C-\lambda)^{\ast }(C-\lambda)+\varepsilon)]\}, \end{aligned}$$

where the variable C ranges over \(M_{N}(\mathbb {C}).\) We apply the derivative Z j using (25) and (22), giving

$$\displaystyle \begin{aligned} Z_{j}S^{\lambda,N}(t,\varepsilon)=\mathbb{E}_{t}\{\mathrm{tr}[((C-\lambda)^{\ast }(C-\lambda)+\varepsilon)^{-1}(C-\lambda)^{\ast}X_{j}]\}. \end{aligned}$$

We then apply the derivative \(\bar {Z}_{j}\) using (24) and (22), giving

$$\displaystyle \begin{aligned} & \bar{Z}_{j}Z_{j}S^{\lambda,N}(t,\varepsilon)=-\mathbb{E}_{t}\{\mathrm{tr} [((C-\lambda)^{\ast}(C-\lambda)+\varepsilon)^{-1}X_{j}^{2}]\}\\ & {+}\mathbb{E}_{t}\{\mathrm{tr}[((C-\lambda)^{\ast}(C-\lambda){+}\varepsilon)^{-1} X_{j}(C{-}\lambda)((C{-}\lambda)^{\ast}(C{-}\lambda){+}\varepsilon)^{-1}(C-\lambda)^{\ast} X_{j}]\}. \end{aligned} $$

We now sum on j and apply the identity (23). After applying the heat equation (21) with \(\Delta =\sum _{j}\bar {Z}_{j}Z_{j},\) we obtain

$$\displaystyle \begin{aligned} & \frac{d}{dt}S^{\lambda,N}(t,\varepsilon)\\ & =\sum_{j}\bar{Z}_{j}Z_{j}S^{\lambda,N}(t,\varepsilon)\\ & =\mathbb{E}_{t}\{\mathrm{tr}[((C-\lambda)^{\ast}(C-\lambda)+\varepsilon)^{-1} ]\}-\mathbb{E}_{t}\{\mathrm{tr}[((C-\lambda)^{\ast}(C-\lambda)+\varepsilon)^{-1} ]\times\\ & \mathrm{tr}[(C-\lambda)^{\ast}(C-\lambda)((C-\lambda)^{\ast}(C-\lambda )+\varepsilon)^{-1}]\}. {} \end{aligned} $$
(26)

But then

$$\displaystyle \begin{aligned} & (C-\lambda)^{\ast}(C-\lambda)((C-\lambda)^{\ast}(C-\lambda)+\varepsilon)^{-1}\\ & =((C-\lambda)^{\ast}(C-\lambda)+\varepsilon-\varepsilon)((C-\lambda)^{\ast}(C-\lambda )+\varepsilon)^{-1}\\ & =1-\varepsilon((C-\lambda)^{\ast}(C-\lambda)+\varepsilon)^{-1}. \end{aligned} $$

Thus, there is a cancellation between the two terms on the right-hand side of (26), giving

$$\displaystyle \begin{aligned} \frac{\partial S^{\lambda,N}}{\partial t}=\varepsilon\mathbb{E}_{t}\{(\mathrm{tr} [((C-\lambda)^{\ast}(C-\lambda)+\varepsilon)^{-1}])^{2}\}, \end{aligned}$$

as claimed in Point 1 of the proposition.

Meanwhile, we may use again the identity (25) to compute

$$\displaystyle \begin{aligned} \frac{\partial}{\partial \varepsilon}\mathrm{tr}[\log((C_{t}^{N}-\lambda)^{\ast} (C_{t}^{N}-\lambda)+\varepsilon)] \end{aligned}$$

to verify Point 2. Point 3 then follows by simple algebra.

4.2 A Derivation Using Free Stochastic Calculus

4.2.1 Ordinary Stochastic Calculus

In this section, I will describe briefly how the PDE in Theorem 9 can be derived rigorously, using the tools of free stochastic calculus. We begin by recalling a little bit of ordinary stochastic calculus, for the ordinary, real-valued Brownian motion. To avoid notational conflicts, we will let x t denote Brownian motion in the real line. This is a random continuous path satisfying the properties proposed by Einstein in 1905, namely, that for any 0 = t 0 < t 1 < ⋯ < t k, the increments

$$\displaystyle \begin{aligned} x_{t_{1}}-x_{t_{0}},~x_{t_{2}}-x_{t_{1}},\ldots,~x_{t_{k}}-x_{t_{k-1}} \end{aligned}$$

should be independent normal random variables with mean zero and variance t j − t j−1. At a rigorous level, Brownian motion is described by the Wiener measure on the space of continuous paths.

It is a famous result that, with probability one, the path x t is nowhere differentiable. This property has not, however, deterred people from developing a theory of “stochastic calculus” in which one can take the “differential” of x t, denoted dx t. (Since x t is not differentiable, we should not attempt to rewrite this differential as \(\frac {dx_{t}}{dt}dt.\)) There is then a theory of “stochastic integrals,” in which one can compute, for example, integrals of the form

$$\displaystyle \begin{aligned} \int_{a}^{b}f(x_{t})~dx_{t}, \end{aligned}$$

where f is some smooth function.

A key difference between ordinary and stochastic integration is that (dx t)2 is not negligible compared to dt. To understand this assertion, recall that the increments of Brownian motion have variance t j − t j−1—and therefore standard deviation \(\sqrt {t_{j}-t_{j-1}}.\) This means that in a short time interval Δt, the Brownian motion travels distance roughly \(\sqrt {\Delta t}.\) Thus, if Δx t = x t+ Δt − x t, we may say that ( Δx t)2 ≈ Δt. Thus, if f is a smooth function, we may use a Taylor expansion to claim that

$$\displaystyle \begin{aligned} f(x_{t+\Delta t}) & \approx f(x_{t})+f^{\prime}(x_{t})\Delta x_{t}+\frac {1}{2}f^{\prime\prime}(x_{t})(\Delta x_{t})^{2}\\ & \approx f(x_{t})+f^{\prime}(x_{t})\Delta x_{t}+\frac{1}{2}f^{\prime\prime }(x_{t})\Delta t. \end{aligned} $$

We may express the preceding discussion heuristically by saying

$$\displaystyle \begin{aligned} (dx_{t})^{2}=dt. \end{aligned}$$

Rigorously, this line of reasoning lies behind the famous Itô formula, which says that

$$\displaystyle \begin{aligned} df(x_{t})=f^{\prime}(x_{t})~dx_{t}+\frac{1}{2}f^{\prime\prime}(x_{t})~dt. \end{aligned}$$

The formula means, more precisely, that (after integration)

$$\displaystyle \begin{aligned} f(x_{b})-f(x_{a})=\int_{a}^{b}f^{\prime}(x_{t})~dx_{t}+\frac{1}{2}\int_{a} ^{b}f^{\prime\prime}(x_{t})~dt, \end{aligned}$$

where the first integral on the right-hand side is a stochastic integral and the second is an ordinary Riemann integral.

If we take, for example, f(x) = x 2∕2, then we find that

$$\displaystyle \begin{aligned} \frac{1}{2}(x_{b}^{2}-x_{a}^{2})=\int_{a}^{b}x_{t}~dx_{t}+\frac{1}{2}(b-a) \end{aligned}$$

so that

$$\displaystyle \begin{aligned} \int_{a}^{b}x_{t}~dx_{t}=\frac{1}{2}(x_{b}^{2}-x_{a}^{2})-\frac{1}{2}(b-a). \end{aligned}$$

This formula differs from what we would get if x t were smooth by the −(b − a)∕2 term on the right-hand side.
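This computation can be checked by simulation. The following sketch approximates the stochastic integral by a left-endpoint Riemann sum (the Itô convention), with the path started at 0 so that x a = 0:

```python
import numpy as np

rng = np.random.default_rng(7)
a, b, n = 0.0, 1.0, 200_000
dt = (b - a) / n
x = np.concatenate([[0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(n))])

ito = np.sum(x[:-1] * np.diff(x))                    # left endpoints: Ito convention
print(ito, (x[-1]**2 - x[0]**2) / 2 - (b - a) / 2)   # agree up to a small discretization error
```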

4.2.2 Free Stochastic Calculus

We now turn to the case of the circular Brownian motion c t. Since c t is a limit of ordinary Brownian motion in the space of N × N matrices, we expect that (dc t)2 will be non-negligible compared to dt. The rules are as follows; see [31, Lemma 2.5, Lemma 4.3]. Suppose g t and h t are processes “adapted to c t,” meaning that g t and h t belong to the von Neumann algebra generated by the operators c s with 0 < s < t. Then we have

$$\displaystyle \begin{aligned} dc_{t}\,g_{t}\,dc_{t}^{\ast} & =dc_{t}^{\ast}\,g_{t}\,dc_{t}=\tau (g_{t})\,dt{} \end{aligned} $$
(27)
$$\displaystyle \begin{aligned} dc_{t}\,g_{t}\,dc_{t} & =dc_{t}^{\ast}\,g_{t}\,dc_{t}^{\ast}=0{} \end{aligned} $$
(28)
$$\displaystyle \begin{aligned} \tau(g_{t}\,dc_{t}\,h_{t}) & =\tau(g_{t}\,dc_{t}^{\ast}\,h_{t})=0. {} \end{aligned} $$
(29)

In addition, we have the following Itô product rule: if \(a_{t}^{1} ,\ldots ,a_{t}^{n}\) are processes adapted to c t, then

$$\displaystyle \begin{aligned} d(a_{t}^{1}\cdots a_{t}^{n}) & =\sum_{j=1}^{n}(a_{t}^{1}\cdots a_{t} ^{j-1})\,da_{t}^{j}\,(a_{t}^{j+1}\cdots a_{t}^{n}){} \end{aligned} $$
(30)
$$\displaystyle \begin{aligned} & +\sum_{1\leq j<k\leq n}(a_{t}^{1}\cdots a_{t}^{j-1})\,da_{t}^{j} \,(a_{t}^{j+1}\cdots a_{t}^{k-1})\,da_{t}^{k}\,(a_{t}^{k+1}\cdots a_{t}^{n}). {} \end{aligned} $$
(31)

Finally, the differential “d” can be moved inside the trace τ.

Suppose, for example, we wish to compute \(d\tau \lbrack c_{t}^{\ast }c_{t}].\) We start by applying the product rule in (30) and (31). But by (29), there will be no contribution from the first line (30) in the product rule. We then use the second line (31) of the product rule together with (27) to obtain

$$\displaystyle \begin{aligned} d\tau\lbrack c_{t}^{\ast}c_{t}]=\tau\lbrack dc_{t}^{\ast}dc_{t}]=\tau (1)~dt=dt. \end{aligned}$$

Thus,

$$\displaystyle \begin{aligned} \frac{d}{dt}\tau\lbrack c_{t}^{\ast}c_{t}]=1. \end{aligned}$$

Since, also, c 0 = 0, we find that \(\tau \lbrack c_{t}^{\ast }c_{t}]=t.\)
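The finite-N analogue of this computation is immediate to check: \(C_{t}^{N}\) is distributed as \(\sqrt {t}\) times a Ginibre matrix, and tr[(C t N)∗ C t N] concentrates near t for large N. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(8)
N, t = 1000, 0.7
C = np.sqrt(t) * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
print((np.trace(C.conj().T @ C) / N).real)   # close to t = 0.7
```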

4.2.3 The Proof

In the proof that follows, the Itô formula (27) plays the same role as the identity (23) plays in the heuristic argument in Section 4.1. We begin with a lemma whose proof is an exercise in using the rules of free stochastic calculus.

Lemma 11

For each \(\lambda \in \mathbb {C},\) let us use the notation

$$\displaystyle \begin{aligned} c_{t,\lambda}:=c_{t}-\lambda. \end{aligned}$$

Then for each positive integer n, we have

$$\displaystyle \begin{aligned} \frac{d}{dt}\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{n}]=n\sum _{j=0}^{n-1}\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{j}]\tau \lbrack(c_{t,\lambda}c_{t,\lambda}^{\ast})^{n-j-1}]. \end{aligned}$$

Proof

We first note that dc t,λ = dc t and \(dc_{t,\lambda }^{\ast } =dc_{t}^{\ast },\) since λ is a constant. We then compute \(d\tau \lbrack (c_{t,\lambda }^{\ast }c_{t,\lambda })^{n}]\) by moving the d inside the trace and then applying the product rule in (30) and (31). By (29), the terms arising from (30) will not contribute. Furthermore, by (28), the only terms from (31) that contribute are those where one d goes on a factor of c t,λ and one goes on a factor of \(c_{t,\lambda }^{\ast }.\)

By choosing all possible factors of c t,λ and all possible factors of \(c_{t,\lambda }^{\ast },\) we get n 2 terms. In each term, after putting the d inside the trace, we can cyclically permute the factors until, say, the dc t,λ factor is at the end. There are then only n distinct terms that occur, each of which occurs n times. By (27), each distinct term is computed as

$$\displaystyle \begin{aligned} & \tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{j}~dc_{t}^{\ast }c_{t,\lambda}(c_{t,\lambda}^{\ast}c_{t,\lambda})^{n-j-2}c_{t,\lambda}^{\ast }~dc_{t}]\\ & =\tau\lbrack c_{t,\lambda}(c_{t,\lambda}^{\ast}c_{t,\lambda})^{n-j-2} c_{t,\lambda}^{\ast}]\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{j}]~dt\\ & =\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{j}]\tau\lbrack(c_{t,\lambda}c_{t,\lambda}^{\ast})^{n-j-1}]~dt. \end{aligned} $$

Since each distinct term occurs n times, we obtain

$$\displaystyle \begin{aligned} d\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{n}]=n\sum_{j=0}^{n-1} \tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{j}]\tau\lbrack(c_{t,\lambda }c_{t,\lambda}^{\ast})^{n-j-1}]~dt, \end{aligned}$$

which is equivalent to the claimed formula. □

We are now ready to give a rigorous argument for the PDE.

Proof of Theorem 9

We continue to use the notation c t,λ := c t − λ. We first compute, using the operator version of (25), that

$$\displaystyle \begin{aligned} \frac{\partial S}{\partial \varepsilon} & =\frac{\partial}{\partial \varepsilon}\tau\lbrack \log(c_{t,\lambda}^{\ast}c_{t,\lambda}+\varepsilon)]\\ & =\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda}+\varepsilon)^{-1}]. {} \end{aligned} $$
(32)

We note that the definition of S in (15) actually makes sense for all \(\varepsilon \in \mathbb {C}\) with \(\operatorname {Re}(\varepsilon )>0,\) using the standard branch of the logarithm function. We note that for \(\left \vert \varepsilon \right \vert >\left \vert z\right \vert ,\) we have

$$\displaystyle \begin{aligned} \frac{1}{z+\varepsilon} & =\frac{1}{\varepsilon\left( 1-\left( -\frac{z}{\varepsilon}\right) \right) }\\ & =\frac{1}{\varepsilon}\left[ 1-\frac{z}{\varepsilon}+\frac{z^{2}}{\varepsilon^{2}}-\frac{z^{3}}{\varepsilon^{3} }+\cdots\right] . {} \end{aligned} $$
(33)

Integrating with respect to z gives

$$\displaystyle \begin{aligned} \log(z+\varepsilon)=\log \varepsilon+\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}\left( \frac{z} {\varepsilon}\right) ^{n}. \end{aligned}$$

Thus, for \(\left \vert \varepsilon \right \vert >\left \Vert c_{t,\lambda }^{\ast }c_{t,\lambda }\right \Vert ,\) we have

$$\displaystyle \begin{aligned} \tau\lbrack\log(c_{t,\lambda}^{\ast}c_{t,\lambda}+\varepsilon)]=\log \varepsilon+\,\sum _{n=1}^{\infty}\frac{(-1)^{n-1}}{n\varepsilon^{n}}\tau\lbrack(c_{t,\lambda}^{\ast }c_{t,\lambda})^{n}]. {} \end{aligned} $$
(34)

Assume for the moment that it is permissible to differentiate (34) term by term with respect to t. Then by Lemma 11, we have

$$\displaystyle \begin{aligned} \frac{\partial S}{\partial t}=\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{\varepsilon^{n}} \sum_{j=0}^{n-1}\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda})^{j}]\tau \lbrack(c_{t,\lambda}c_{t,\lambda}^{\ast})^{n-j-1}]. {} \end{aligned} $$
(35)

Now, by [5, Proposition 3.2.3], the map tc t is continuous in the operator norm topology; in particular, \(\left \Vert c_{t}\right \Vert \) is a locally bounded function of t. From this observation, it is easy to see that the right-hand side of (35) converges locally uniformly in t. Thus, a standard result about interchange of limit and derivative (e.g., Theorem 7.17 in [37]) shows that the term-by-term differentiation is valid.

Now, in (35), we let k = j and l = n − j − 1, so that n = k + l + 1. Then k and l range from 0 to ∞, and we get

$$\displaystyle \begin{aligned} \frac{\partial S}{\partial t}=\varepsilon\left( \frac{1}{\varepsilon}\sum_{k=0}^{\infty} \frac{(-1)^{k}}{\varepsilon^{k}}\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda} )^{k}]\right) \left( \frac{1}{\varepsilon}\sum_{l=0}^{\infty}\frac{(-1)^{l}}{\varepsilon^{l} }\tau\lbrack(c_{t,\lambda}c_{t,\lambda}^{\ast})^{l}]\right) . \end{aligned}$$

(We may check that the power of ε in the denominator is k + l + 1 = n and that the power of − 1 is k + l = n − 1.) Thus, moving the sums inside the traces and using (33), we obtain that

$$\displaystyle \begin{aligned} \frac{\partial S}{\partial t}=\varepsilon(\tau\lbrack(c_{t,\lambda}^{\ast}c_{t,\lambda }+\varepsilon)^{-1}])^{2}, {} \end{aligned} $$
(36)

which reduces to the claimed PDE for S, by (32).

We have now established the claimed formula for ∂S∕∂t for ε in the right half-plane, provided \(\left \vert \varepsilon \right \vert \) is sufficiently large, depending on t and λ. Since, also, \(S(0,\lambda ,\varepsilon )=\log (\left \vert \lambda \right \vert ^{2}+\varepsilon ),\) we have, for sufficiently large \(\left \vert \varepsilon \right \vert ,\)

$$\displaystyle \begin{aligned} S(t,\lambda,\varepsilon)=\log(\left\vert \lambda\right\vert ^{2}+\varepsilon)+\int_{0}^{t} \varepsilon\tau\lbrack(c_{s,\lambda}^{\ast}c_{s,\lambda}+\varepsilon)^{-1}]\tau\lbrack (c_{s,\lambda}c_{s,\lambda}^{\ast}+\varepsilon)^{-1}]~ds. {} \end{aligned} $$
(37)

We now claim that both sides of (37) are well-defined, holomorphic functions of ε, for ε in the right half-plane. This claim is easily established from the standard power-series representation of the inverse:

$$\displaystyle \begin{aligned} (A+\varepsilon+h)^{-1} & =(A+\varepsilon)^{-1}(1+h(A+\varepsilon)^{-1})^{-1}\\ & =(A+\varepsilon)^{-1}\sum_{n=0}^{\infty}(-1)^{n}h^{n}(A+\varepsilon)^{-n}, \end{aligned} $$

and a similar power-series representation of the logarithm. Thus, (37) actually holds for all ε in the right half-plane. Differentiating with respect to t then establishes the desired formula (36) for ∂S∕∂t for all ε in the right half-plane. □

5 Solving the Equation

5.1 The Hamilton–Jacobi Method

The PDE (16) in Theorem 9 is a first-order, nonlinear equation of Hamilton–Jacobi type. “Hamilton–Jacobi type” means that the right-hand side of the equation involves only ε and ∂S∂ε, and not S itself. The reader may consult Section 3.3 of the book [11] of Evans for general information about equations of this type. In this subsection, we describe the general version of this method. In the remainder of this section, we will then apply the general method to the PDE (16).

The Hamilton–Jacobi method for analyzing solutions to equations of this type is a generalization of the method of characteristics. In the method of characteristics, one finds certain special curves along which the solution is constant. For a general equation of Hamilton–Jacobi type, the method of characteristics is not applicable. Nevertheless, we may hope to find certain special curves along which the solution varies in a simple way, allowing us to compute the solution along these curves in a more-or-less explicit way.

We now explain the representation formula for solutions of equations of Hamilton–Jacobi type. A self-contained proof of the following result is given as the proof of Proposition 6.3 in [10].

Proposition 12

Fix a function H(x, p) defined for x in an open set \(U\subset \mathbb {R}^{n}\) and p in \(\mathbb {R}^{n}.\) Consider a smooth function S(t, x) on [0, ∞) × U satisfying

$$\displaystyle \begin{aligned} \frac{\partial S}{\partial t}=-H(\mathbf{x},\nabla_{\mathbf{x}}S) {} \end{aligned} $$
(38)

for x ∈ U and t > 0. Now suppose (x(t), p(t)) is a curve in \(U\times \mathbb {R}^{n}\) satisfying Hamilton’s equations:

$$\displaystyle \begin{aligned} \frac{dx_{j}}{dt}=\frac{\partial H}{\partial p_{j}}(\mathbf{x}(t),\mathbf{p} (t));\quad \frac{dp_{j}}{dt}=-\frac{\partial H}{\partial x_{j}}(\mathbf{x} (t),\mathbf{p}(t)) \end{aligned}$$

with initial conditions

$$\displaystyle \begin{aligned} \mathbf{x}(0)={\mathbf{x}}_{0};\quad \mathbf{p}(0)={\mathbf{p}}_{0}:=(\nabla _{\mathbf{x}}S)(0,{\mathbf{x}}_{0}). {} \end{aligned} $$
(39)

Then we have

$$\displaystyle \begin{aligned} S(t,\mathbf{x}(t))=S(0,{\mathbf{x}}_{0})-H({\mathbf{x}}_{0},{\mathbf{p}}_{0} )~t+\int_{0}^{t}\mathbf{p}(s)\cdot\frac{d\mathbf{x}}{ds}~ds {} \end{aligned} $$
(40)

and

$$\displaystyle \begin{aligned} (\nabla_{\mathbf{x}}S)(t,\mathbf{x}(t))=\mathbf{p}(t). {} \end{aligned} $$
(41)

We emphasize that we are not using the Hamilton–Jacobi formula to construct a solution to the equation (38); rather, we are using the method to analyze a solution that is assumed ahead of time to exist. Suppose we want to use the method to compute, as explicitly as possible, the value of S(t, x) for some fixed x. We then need to try to choose the initial position x 0 in (39)—which determines the initial momentum p 0 = (∇x S)(0, x 0)—so that x(t) = x. We then use (40) to get an in-principle formula for S(t, x(t)) = S(t, x).

5.2 Solving the Equations

The equation for S λ in Theorem 9 is of Hamilton–Jacobi form with n = 1, with Hamiltonian given by

$$\displaystyle \begin{aligned} H(\varepsilon,p)=-\varepsilon p^{2}. {} \end{aligned} $$
(42)

Since S λ(t, ε) is only defined for ε > 0, we take the open set U in Proposition 12 to be (0, ∞). That is to say, the Hamilton–Jacobi formula (40) is only valid if the curve ε(s) remains positive for 0 ≤ s ≤ t.

Hamilton’s equations for this Hamiltonian then take the explicit form

$$\displaystyle \begin{aligned} \frac{d\varepsilon}{dt} & =\frac{\partial H}{\partial p}=-2\varepsilon p{} \end{aligned} $$
(43)
$$\displaystyle \begin{aligned} \frac{dp}{dt} & =-\frac{\partial H}{\partial \varepsilon}=p^{2}. {} \end{aligned} $$
(44)

Following the general method, we take an arbitrary initial position ε 0, with the initial momentum p 0 given by

$$\displaystyle \begin{aligned} p_{0} & =\left. \frac{\partial}{\partial \varepsilon}\log(\left\vert \lambda \right\vert ^{2}+\varepsilon)\right\vert {}_{\varepsilon=\varepsilon_{0}}\\ & =\frac{1}{\left\vert \lambda\right\vert ^{2}+\varepsilon_{0}}. {} \end{aligned} $$
(45)

Theorem 13

For any ε 0 > 0, the solution (ε(t), p(t)) to (43) and (44) with initial momentum \(p_{0}=1/(\left \vert \lambda \right \vert ^{2}+\varepsilon _{0})\) exists for \(0\leq t<\left \vert \lambda \right \vert ^{2}+\varepsilon _{0}.\) On this time interval, we have

$$\displaystyle \begin{aligned} \varepsilon(t)=\varepsilon_{0}\left( 1-\frac{t}{\left\vert \lambda\right\vert ^{2}+\varepsilon_{0}}\right) ^{2}. {} \end{aligned} $$
(46)

The general Hamilton–Jacobi formula (40) then takes the form

$$\displaystyle \begin{aligned} & S^{\lambda}\left( t,\varepsilon_{0}\left( 1-\frac{t}{\left\vert \lambda\right\vert ^{2}+\varepsilon_{0}}\right) ^{2}\right) \\ & =\log(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})-\frac{\varepsilon_{0}t}{(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})^{2}},\quad 0\leq t<\left\vert \lambda\right\vert ^{2}+\varepsilon_{0}. {} \end{aligned} $$
(47)

Proof

Since the equation (44) for dp∕dt does not involve ε(t), we may easily solve it for p(t) as

$$\displaystyle \begin{aligned} p(t)=\frac{p_{0}}{1-p_{0}t}. \end{aligned}$$

We may then plug the formula for p(t) into the equation (43) for dε∕dt, giving

$$\displaystyle \begin{aligned} \frac{d\varepsilon}{dt}=-2\varepsilon\frac{p_{0}}{1-p_{0}t} \end{aligned}$$

so that

$$\displaystyle \begin{aligned} \frac{1}{\varepsilon}d\varepsilon=-2\frac{p_{0}}{1-p_{0}t}~dt. \end{aligned}$$

Thus,

$$\displaystyle \begin{aligned} \log \varepsilon=2\log(1-p_{0}t)+c_{1} \end{aligned}$$

so that

$$\displaystyle \begin{aligned} \varepsilon=c_{2}(1-p_{0}t)^{2}. \end{aligned}$$

Plugging in t = 0 gives c 2 = ε 0. Recalling the expression (45) for p 0 gives the claimed formula for ε(t).

Assuming ε 0 > 0, the solution to the system (43)–(44) continues to exist with ε(t) > 0 until p(t) blows up, which occurs at time \(t=1/p_{0}=\left \vert \lambda \right \vert ^{2}+\varepsilon _{0}.\)

Finally, we work out the general Hamilton–Jacobi formula (40) in the case at hand. We note from (42) and (43) that \(p(s)\frac {d\varepsilon }{ds}=-2\varepsilon (s)p(s)^{2}=2H(s).\) Since the Hamiltonian is always a conserved quantity in Hamilton’s equations, we find that

$$\displaystyle \begin{aligned} p(s)\frac{d\varepsilon}{ds}=2H(0)=-2\varepsilon_{0}p_{0}^{2}. \end{aligned}$$

Thus, (40) reduces to

$$\displaystyle \begin{aligned} S^{\lambda}(t,\varepsilon(t)) & =S(0,\varepsilon_{0})+H(0)t\\ & =\log(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})-\varepsilon_{0}p_{0}^{2}t. \end{aligned} $$

Using the formula (45) for p 0 gives the claimed formula (47). □
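As a sanity check on Theorem 13, one can integrate the system (43)–(44) numerically and compare with the closed-form expression (46). A standalone sketch (ours), with arbitrary test values of |λ|² and ε 0:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Arbitrary test values for |lambda|^2 and epsilon_0 (ours, for illustration).
lam2, eps0 = 1.5, 0.3
p0 = 1.0 / (lam2 + eps0)          # initial momentum, Eq. (45)
t_final = 0.9 * (lam2 + eps0)     # stays inside the lifetime t < |lambda|^2 + eps0

# Hamilton's equations (43)-(44) for H(eps, p) = -eps p^2.
rhs = lambda t, y: [-2.0 * y[0] * y[1], y[1] ** 2]
sol = solve_ivp(rhs, (0.0, t_final), [eps0, p0], rtol=1e-11, atol=1e-13)

eps_closed = eps0 * (1.0 - t_final / (lam2 + eps0)) ** 2   # Eq. (46)
print(abs(sol.y[0, -1] - eps_closed))                      # small: the formulas agree
```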

6 Letting ε Tend to Zero

Recall that the Brown measure is obtained by first evaluating

$$\displaystyle \begin{aligned} s_{t}(\lambda):=\lim_{\varepsilon\rightarrow0^{+}}S^{\lambda}(t,\varepsilon) \end{aligned}$$

and then taking 1∕(4π) times the Laplacian (in the distribution sense) of s t(λ). We record the result here and will derive it in the remainder of this section.

Theorem 14

We have

$$\displaystyle \begin{aligned} s_{t}(\lambda)=\left\{ \begin{array} [c]{cc} \log(\left\vert \lambda\right\vert ^{2}) & \left\vert \lambda\right\vert \geq\sqrt{t}\\ \log t-1+\frac{\left\vert \lambda\right\vert ^{2}}{t} & \left\vert \lambda\right\vert <\sqrt{t} \end{array} \right. . {} \end{aligned} $$
(48)

The Brown measure is then absolutely continuous with respect to the Lebesgue measure, with density W t(λ) given by

$$\displaystyle \begin{aligned} W_{t}(\lambda)=\left\{ \begin{array} [c]{cc} 0 & \left\vert \lambda\right\vert \geq\sqrt{t}\\ \frac{1}{\pi t} & \left\vert \lambda\right\vert <\sqrt{t} \end{array} \right. . {} \end{aligned} $$
(49)

That is to say, the Brown measure is the uniform probability measure on the disk of radius \(\sqrt {t}\) centered at the origin. The functions s t(λ) and W t(λ) are plotted for t = 1 in Figure 6. On the left-hand side of the figure, the dashed line indicates the boundary of the unit disk.

Fig. 6

Plot of s t(λ) := S λ(t, 0+) (left) and \( \frac {1}{4\pi }\Delta s_{t}(\lambda )\) (right) for t = 1
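Theorem 14 invites a quick numerical check. The sketch below (ours, for illustration) uses the natural finite-N model for c t, namely \(\sqrt{t}\) times an N × N Ginibre matrix, and compares the empirical fraction of eigenvalues within radius r of the origin to the prediction r²∕t coming from the density (49):

```python
import numpy as np

# Finite-N model: sqrt(t) times a Ginibre matrix (iid complex Gaussian
# entries of variance 1/N).  By (49), the fraction of eigenvalues within
# radius r of the origin should be close to r^2/t for r <= sqrt(t).
N, t = 1000, 1.0
rng = np.random.default_rng(0)
G = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
eigs = np.linalg.eigvals(np.sqrt(t) * G)

for r in (0.25, 0.5, 0.75, 1.0):
    print(r, np.mean(np.abs(eigs) <= r * np.sqrt(t)), r ** 2 / t)
```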

6.1 Letting ε Tend to Zero: Outside the Disk

Our goal is to compute \(s_{t}(\lambda ):=\lim _{\varepsilon \rightarrow 0^{+}}S^{\lambda }(t,\varepsilon )\). Thus, in the Hamilton–Jacobi formalism, we want to try to choose ε 0 so that the quantity

$$\displaystyle \begin{aligned} \varepsilon(t)=\varepsilon_{0}\left( 1-\frac{t}{\left\vert \lambda\right\vert ^{2}+\varepsilon_{0}}\right) ^{2} {} \end{aligned} $$
(50)

will be very close to zero. Since there is a factor of ε 0 on the right-hand side of the above formula, an obvious strategy is to take ε 0 itself very close to zero. There is, however, a potential difficulty with this strategy: If ε 0 is small, the lifetime of the solution may be smaller than the time t we are interested in. To see when the strategy works, we take the formula for the lifetime of the solution—namely, \(\left \vert \lambda \right \vert ^{2}+\varepsilon _{0}\)—and take the limit as ε 0 tends to zero.

Definition 15

For each \(\lambda \in \mathbb {C},\) we define T(λ) to be the lifetime of solutions to the system (43)–(44), in the limit as ε 0 approaches zero. Thus, explicitly,

$$\displaystyle \begin{aligned} T(\lambda) & =\lim_{\varepsilon_{0}\rightarrow0^{+}}(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})\\ & =\left\vert \lambda\right\vert ^{2}. \end{aligned} $$

Thus, if the time t we are interested in is larger than \(T(\lambda )=\left \vert \lambda \right \vert ^{2},\) our simple strategy of taking ε 0 ≈ 0 will not work. After all, if t > T(λ) and ε 0 ≈ 0, then the lifetime of the path is less than t and the Hamilton–Jacobi formula (47) is not applicable. On the other hand, if the time t we are interested in is at most \(T(\lambda )=\left \vert \lambda \right \vert ^{2},\) the simple strategy does work. Figure 7 illustrates the situation.

Fig. 7

If ε 0 is small and positive, ε(s) will remain small and positive up to time t, provided that \(t\leq T(\lambda )= \left \vert \lambda \right \vert ^{2}\)

Conclusion 16

The simple strategy of letting ε 0 approach zero works precisely when \(t\leq T(\lambda )=\left \vert \lambda \right \vert ^{2}.\) Equivalently, the simple strategy works when \(\left \vert \lambda \right \vert \geq \sqrt {t},\) that is, when λ is outside the open disk of radius \(\sqrt {t}\) centered at the origin.

In the case that λ is outside the disk, we may then simply let ε 0 approach zero in the Hamilton–Jacobi formula, giving the following result.

Proposition 17

Suppose \(\left \vert \lambda \right \vert \geq \sqrt {t},\) that is, λ is outside the open disk of radius \(\sqrt {t}\) centered at 0. Then we may let ε 0 tend to zero in the Hamilton–Jacobi formula (47) to obtain

$$\displaystyle \begin{aligned} \lim_{\varepsilon\rightarrow0^{+}}S^{\lambda}(t,\varepsilon) & =\lim_{\varepsilon_{0}\rightarrow0}\left( \log(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})-\frac{\varepsilon_{0}t}{(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})^{2}}\right) \\ & =\log(\left\vert \lambda\right\vert ^{2}). {} \end{aligned} $$
(51)

Since the right-hand side of (51) is harmonic, we conclude that

$$\displaystyle \begin{aligned} \Delta s_{t}(\lambda)=0,\quad \left\vert \lambda\right\vert >\sqrt{t}. \end{aligned}$$

That is to say, the Brown measure of c t is zero outside the disk of radius \(\sqrt {t}\) centered at 0.

6.2 Letting ε Tend to Zero: Inside the Disk

We now turn to the case in which the time t we are interested in is greater than the small-ε 0 lifetime T(λ) of the solutions to (43)–(44). This case corresponds to \(t>T(\lambda )=\left \vert \lambda \right \vert ^{2},\) that is, \(\left \vert \lambda \right \vert <\sqrt {t}.\) We still want to choose ε 0 so that ε(t) will approach zero, but we cannot let ε 0 tend to zero, or else the lifetime of the solution will be less than t. Instead, we allow the second factor in the formula (46) for ε(t) to approach zero. To make this factor approach zero, we make \(\left \vert \lambda \right \vert ^{2}+\varepsilon _{0}\) approach t, that is, ε 0 should approach \(t-\left \vert \lambda \right \vert ^{2}.\) Note that since we are now assuming that \(\left \vert \lambda \right \vert <\sqrt {t},\) the quantity \(t-\left \vert \lambda \right \vert ^{2}\) is positive. This strategy is illustrated in Figure 8: When \(\varepsilon _{0}=t-\left \vert \lambda \right \vert ^{2},\) we obtain ε(t) = 0, and if ε 0 approaches \(t-\left \vert \lambda \right \vert ^{2}\) from above, the value of ε(t) approaches 0 from above.

Fig. 8

If \( \left \vert \lambda \right \vert < \sqrt {t}\) and we let ε 0 approach \(t- \left \vert \lambda \right \vert ^{2}\) from above, ε(s) will remain positive until time t, and ε(t) will approach zero

Proposition 18

Suppose \(\left \vert \lambda \right \vert \leq \sqrt {t},\) that is, λ is inside the closed disk of radius \(\sqrt {t}\) centered at 0. Then in the Hamilton–Jacobi formula (47), we may let ε 0 approach \(t-\left \vert \lambda \right \vert ^{2}\) from above, and we get

$$\displaystyle \begin{aligned} \lim_{\varepsilon\rightarrow0^{+}}S^{\lambda}(t,\varepsilon)=\log t-1+\frac{\left\vert \lambda\right\vert ^{2}}{t},\quad \left\vert \lambda\right\vert \leq\sqrt{t}. \end{aligned}$$

For \(\left \vert \lambda \right \vert <\sqrt {t}\) , we may then compute

$$\displaystyle \begin{aligned} \frac{1}{4\pi}\Delta s_{t}(\lambda)=\frac{1}{\pi t}. \end{aligned}$$

Thus, inside the disk of radius \(\sqrt {t},\) the Brown measure has a constant density of 1∕(πt).

Proof

We use the Hamilton–Jacobi formula (47). Since the lifetime of our solution is \(\left \vert \lambda \right \vert ^{2}+\varepsilon _{0},\) if we let ε 0 approach \(t-\left \vert \lambda \right \vert ^{2}\) from above, the lifetime will always be at least t. In this limit, the formula (46) for ε(t) approaches zero from above. Thus, we may take the limit \(\varepsilon _{0}\rightarrow (t-\left \vert \lambda \right \vert ^{2})^{+}\) in (47) to obtain

$$\displaystyle \begin{aligned} \lim_{\varepsilon\rightarrow0^{+}}S^{\lambda}(t,\varepsilon) & =\lim_{\varepsilon_{0}\rightarrow (t-\left\vert \lambda\right\vert ^{2})^{+}}\left[ \log(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})-\frac{\varepsilon_{0}t}{(\left\vert \lambda\right\vert ^{2}+\varepsilon_{0})^{2}}\right] \\ & =\log t-\frac{(t-\left\vert \lambda\right\vert ^{2})t}{t^{2}}, \end{aligned} $$

which simplifies to the claimed formula. □

6.3 On the Boundary

Note that if \(\left \vert \lambda \right \vert ^{2}=t,\) both approaches are valid—and the two values of \(s_{t}(\lambda ):=\lim _{\varepsilon \rightarrow 0^{+} }S^{\lambda }(t,\varepsilon )\) agree, with a common value of \(\log t=\log \left \vert \lambda \right \vert ^{2}.\) Furthermore, the radial derivatives of s t(λ) agree on the boundary: 2∕r on the outside and 2r∕t on the inside, which have a common value of \(2/\sqrt {t}\) at \(r=\sqrt {t}.\) Of course, the angular derivatives of s t(λ) are identically zero, inside, outside, and on the boundary.

Since the first derivatives of s t are continuous up to the boundary, we may take the distributional Laplacian by taking the ordinary Laplacian inside the disk and outside the disk and ignoring the boundary. (See the proof of Proposition 7.13 in [10].) Thus, we may compute the Laplacian of the two formulas in (48) to obtain the formula (49) for the Brown measure of c t.
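These boundary-matching computations are easy to verify symbolically. A short sketch (ours) using the sympy library checks the radial derivatives and the Laplacian calculation behind (49):

```python
import sympy as sp

r, t = sp.symbols('r t', positive=True)
outside = sp.log(r**2)                # s_t for |lambda| >= sqrt(t), Eq. (48)
inside = sp.log(t) - 1 + r**2 / t     # s_t for |lambda| <  sqrt(t)

# Radial derivatives: 2/r outside and 2r/t inside, agreeing at r = sqrt(t):
print(sp.diff(outside, r), sp.diff(inside, r))
print(sp.simplify((sp.diff(outside, r) - sp.diff(inside, r)).subs(r, sp.sqrt(t))))  # 0

# Laplacian of the (radially symmetric) inside formula: s'' + s'/r = 4/t,
# so that (1/(4 pi)) Delta s_t = 1/(pi t), as in Eq. (49).
print(sp.simplify(sp.diff(inside, r, 2) + sp.diff(inside, r) / r))  # 4/t
```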

7 The Case of the Free Multiplicative Brownian Motion

7.1 Additive and Multiplicative Models

The standard GUE and Ginibre ensembles are given by Gaussian measures on the relevant space of matrices (Hermitian matrices for GUE and all matrices for the Ginibre ensemble). In light of the central limit theorem, these ensembles can be approximated by adding together large numbers of small, independent random matrices. We may therefore refer to these Gaussian ensembles as “additive” models.

It is natural to consider also “multiplicative” random matrix models, which can be approximated by multiplying together large numbers of independent matrices that are “small” in the multiplicative sense, that is, close to the identity. Specifically, if Z add is a random matrix with a Gaussian distribution, we will consider a multiplicative version \(Z_{t}^{\mathrm {mult}},\) where the distribution of \(Z_{t}^{\mathrm {mult}}\) may be approximated as

$$\displaystyle \begin{aligned} Z_{t}^{\mathrm{mult}}\sim\prod_{j=1}^{k}\left( I+i\sqrt{\frac{t}{k}} Z_{j}^{\mathrm{add}}-\frac{t}{k}\mathrm{It}\hat{\mathrm{o}}\right) ,\quad k\text{ large.} {} \end{aligned} $$
(52)

Here t is a positive parameter, the \(Z_{j}^{\mathrm {add}}\)s are independent copies of Z add, and “Itô” is an Itô correction term. This correction term is a fixed multiple of the identity, independent of t and k. (In the next paragraph, we will identify the Itô term in the main cases of interest.) Since the factors in (52) are independent and identically distributed, the order of the factors does not affect the distribution of the product.

The two main cases we will consider are those in which Z is distributed according to the Gaussian unitary ensemble or the Ginibre ensemble. In the case that Z is distributed according to the Gaussian unitary ensemble, the Itô term is \(\mathrm {It}\hat {\mathrm {o}}=\frac {1}{2}I.\) In this case, the resulting multiplicative model may be described as Brownian motion in the unitary group U(N), which we write as \(U_{t}^{N}.\) The Itô correction is essential in this case to ensure that \(Z_{t}^{\mathrm {mult}}\) actually lives in the unitary group. In the case that Z is distributed according to the Ginibre ensemble, the Itô term is zero. In this case, the resulting multiplicative model may be described as Brownian motion in the general linear group \(\mathsf {GL}(N;\mathbb {C})\), which we write as \(B_{t}^{N}.\)

7.2 The Free Unitary and Free Multiplicative Brownian Motions

The large-N limits of the Brownian motions \(U_{t}^{N}\) and \(B_{t}^{N}\) were constructed by Biane [3]. The limits are the free unitary Brownian motion and the free multiplicative Brownian motion, respectively, which we write as u t and b t. The qualifier “free” indicates that the increments of these Brownian motions—computed in the multiplicative sense as \(u_{s}^{-1}u_{t}\) or \(b_{s}^{-1}b_{t}\)—are freely independent in the sense of Section 2.3. In the case of b t, the convergence of \(B_{t}^{N}\) to b t was conjectured by Biane [3] and proved by Kemp [31]. In both cases, we take the limiting object to be an element of a tracial von Neumann algebra \((\mathcal {A},\tau ).\)

Since u t is unitary, we do not need to use the machinery of Brown measure, but can rather use the spectral theorem as in (7) to compute the distribution of u t, denoted ν t. We emphasize that ν t is, in fact, the Brown measure of u t, but it is easier to describe ν t using the spectral theorem than to use the general Brown measure construction. The measure ν t is a probability measure on the unit circle describing the large-N limit of Brownian motion in the unitary group U(N). Biane computed the measure ν t in [3] and established the following support result.

Theorem 19

For t < 4, the measure ν t is supported on a proper subset of the unit circle:

$$\displaystyle \begin{aligned} \mathrm{supp}(\nu_{t})=\left\{ \left. e^{i\theta}\right\vert ~\left\vert \theta\right\vert \leq\frac{1}{2}\sqrt{t(4-t)}+\cos^{-1}\left( 1-\frac{t} {2}\right) \right\} ,\quad t<4. \end{aligned}$$

By contrast, for all t ≥ 4, the closed support of ν t is the whole unit circle.

In the physics literature, the change in behavior of the support of ν t at t = 4 is called a topological phase transition, indicating that the topology of supp(ν t) changes from a closed interval to a circle.

The remainder of this article is devoted to recent results of the author with Driver and Kemp regarding the Brown measure of the free multiplicative Brownian motion b t. We expect that the Brown measure of b t will be the limiting empirical eigenvalue distribution of the Brownian motion \(B_{t}^{N}\) in the general linear group \(\mathsf {GL}(N;\mathbb {C})\). Now, when t is small, we may take k = 1 in (52), so that (since the Itô correction is zero in this case)

$$\displaystyle \begin{aligned} B_{t}^{N}\sim I+i\sqrt{t}Z,\quad t\text{ small.} \end{aligned}$$

Thus, when t is small and N is large, the eigenvalues of \(B_{t}^{N}\) resemble a scaled and shifted version of the circular law. Specifically, the eigenvalue distribution should resemble a uniform distribution on the disk of radius \(\sqrt {t}\) centered at 1.
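This prediction is easy to test directly; the following sketch (ours; the values of N and t are arbitrary) takes k = 1 in (52) and counts how many eigenvalues land in the expected disk:

```python
import numpy as np

# Small-t illustration: with k = 1 (and no Ito term), B_t^N ~ I + i sqrt(t) Z
# with Z a Ginibre matrix, so the eigenvalues should roughly fill the disk of
# radius sqrt(t) centered at 1.
N, t = 1000, 0.1
rng = np.random.default_rng(1)
Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
eigs = np.linalg.eigvals(np.eye(N) + 1j * np.sqrt(t) * Z)

print(np.mean(np.abs(eigs - 1.0) <= np.sqrt(t)))  # close to 1
```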

Figure 9 shows the eigenvalues of \(B_{t}^{N}\) with t = 0.1 and N = 2, 000. The eigenvalue distribution bears a clear resemblance to the just-described picture, with \(\sqrt {t}=\sqrt {0.1}\approx 0.316.\) Nevertheless, we can already see some deviation from the small-t picture: The region into which the eigenvalues are clustering looks like a disk, but not quite centered at 1, and the distribution within the region is slightly higher at the left-hand side of the region than at the right. Figures 10 and 11, meanwhile, show the eigenvalue distribution of \(B_{t}^{N}\) for several larger values of t. The region into which the eigenvalues cluster becomes more complicated as t increases, and the distribution of eigenvalues in the region becomes less and less uniform. We expect that the Brown measure of the limiting object b t will be supported on the domain into which the eigenvalues are clustering.

Fig. 9

The eigenvalues of \(B_{t}^{N}\) with t = 0.1 and N = 2, 000

Fig. 10

Eigenvalues of \(B_{t}^{N}\) for t = 2 (left) and t = 3.9 (right), with N = 2, 000

Fig. 11

Eigenvalues of \(B_{t}^{N}\) for t = 4 (left) and t = 4.1 (right), with N = 2, 000

7.3 The Domains Σt

We now describe certain domains Σt in the plane, as introduced by Biane in [4, pp. 273–274]. It will turn out that the Brown measure of b t is supported on Σt. We use here the new description of Σt given in Section 4 of [10]. For all nonzero \(\lambda \in \mathbb {C},\) we define

$$\displaystyle \begin{aligned} T(\lambda)=\left\vert \lambda-1\right\vert ^{2}\frac{\log(\left\vert \lambda\right\vert ^{2})}{\left\vert \lambda\right\vert ^{2}-1}. {} \end{aligned} $$
(53)

If \(\left \vert \lambda \right \vert ^{2}=1,\) we interpret \(\log (\left \vert \lambda \right \vert ^{2})/(\left \vert \lambda \right \vert ^{2}-1)\) as having the value 1, in accordance with the limit

$$\displaystyle \begin{aligned} \lim_{r\rightarrow1}\frac{\log r}{r-1}=1. \end{aligned}$$

See Figure 12 for a plot of this function.

Fig. 12

A plot of the function T(λ). The function has a minimum at λ = 1, a saddle point at λ = −1, and a singularity at λ = 0

We then define the domains Σt as follows.

Definition 20

For each t > 0, we define

$$\displaystyle \begin{aligned} \Sigma_{t}=\left\{ \left. \lambda\in\mathbb{C}\right\vert T(\lambda )<t\right\} . \end{aligned}$$

Several examples of these domains were plotted already in Figures 9, 10, and 11. The domain Σt is simply connected for t ≤ 4 and doubly connected for t > 4. The change in behavior at t = 4 occurs because T has a saddle point at λ = −1 and because T(−1) = 4. We note that a change in the topology of the region occurs at t = 4, which is the same value of t at which the topology of the support of Biane’s measure changes (Theorem 19).
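The function T and the membership test in Definition 20 take only a few lines to code. The sketch below (ours, for illustration) also confirms the saddle value T(−1) = 4 responsible for the change of topology:

```python
import numpy as np

def T(lam):
    """The function T(lambda) of Eq. (53), with the removable singularity
    at |lambda| = 1 filled in by continuity."""
    r2 = abs(lam) ** 2
    ratio = 1.0 if np.isclose(r2, 1.0) else np.log(r2) / (r2 - 1.0)
    return abs(lam - 1.0) ** 2 * ratio

in_sigma = lambda lam, t: T(lam) < t   # membership test of Definition 20

print(T(-1.0))                                   # 4.0, the saddle value
print(in_sigma(-1.0, 3.9), in_sigma(-1.0, 4.1))  # False, True: topology changes at t = 4
```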

7.4 The Support of the Brown Measure of b t

As we have noted, the domains Σt were introduced by Biane in [4]. Two subsequent works in the physics literature, the article [18] by Gudowska-Nowak, Janik, Jurkiewicz, and Nowak and the article [32] by Lohmayer, Neuberger, and Wettig, then argued, using nonrigorous methods, that the eigenvalues of \(B_{t}^{N}\) should concentrate into Σt for large N. The first rigorous result in this direction was obtained by the author with Kemp [26], where we proved that the Brown measure of b t is supported on the closure of Σt.

Now, we have already noted that Σt is simply connected for t ≤ 4 but doubly connected for t > 4. Thus, the support of the Brown measure of the free multiplicative Brownian motion undergoes a “topological phase transition” at precisely the same value of the time parameter as the distribution of the free unitary Brownian motion (Theorem 19).

The methods of [26] explain this apparent coincidence, using the “free Hall transform” \(\mathcal {G}_{t}\) of Biane [4]. Biane constructed this transform using methods of free probability as an infinite-dimensional analog of the Segal–Bargmann transform for U(N), which was developed by the author in [21]. More specifically, Biane’s definition of \(\mathcal {G}_{t}\) draws on the stochastic interpretation of the transform in [21] given by Gross and Malliavin [17]. Biane conjectured (with an outline of a proof) that \(\mathcal {G}_{t}\) is actually the large-N limit of the transform in [21]. This conjecture was then verified in independent works of Cébron [8] and of the author with Driver and Kemp [9]. (See also the expository article [25].)

Recall from Section 7.2 that the distribution of the free unitary Brownian motion is Biane’s measure ν t on the unit circle, the support of which is described in Theorem 19. A key ingredient in [26] is the function f t given by

$$\displaystyle \begin{aligned} f_{t}(\lambda)=\lambda e^{\frac{t}{2}\frac{1+\lambda}{1-\lambda}}. {} \end{aligned} $$
(54)

This function maps the complement of the closure of Σt conformally to the complement of the support of Biane’s measure:

$$\displaystyle \begin{aligned} f_{t}:\mathbb{C}\setminus\overline{\Sigma}_{t}\rightarrow\mathbb{C} \setminus\mathrm{supp}(\nu_{t}). {} \end{aligned} $$
(55)

(This map f t will also play a role in the results of Section 7.5; see Theorem 23.)
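The mapping property (55) can be spot-checked numerically: at a boundary point of Σt, where T(λ) = t, one should find |f t(λ)| = 1. A sketch (ours; the root-finding brackets are chosen ad hoc for these sample rays):

```python
import numpy as np
from scipy.optimize import brentq

def T(lam):
    # Eq. (53), with the removable singularity at |lambda| = 1 filled by continuity
    r2 = abs(lam) ** 2
    ratio = 1.0 if np.isclose(r2, 1.0) else np.log(r2) / (r2 - 1.0)
    return abs(lam - 1.0) ** 2 * ratio

t = 2.0
f_t = lambda lam: lam * np.exp((t / 2.0) * (1.0 + lam) / (1.0 - lam))   # Eq. (54)

# Locate boundary points of Sigma_t along a few sample rays:
for theta in (0.0, 0.5, 1.0):
    r = brentq(lambda rr: T(rr * np.exp(1j * theta)) - t, 1.0, 50.0)
    print(abs(f_t(r * np.exp(1j * theta))))   # ~ 1.0 in each case
```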

The key computation in [26] is that for λ outside \(\overline {\Sigma }_{t},\) we have

$$\displaystyle \begin{aligned} \mathcal{G}_{t}^{-1}\left( \frac{1}{z-\lambda}\right) =\frac{f_{t}(\lambda )}{\lambda}\frac{1}{u-f_{t}(\lambda)},\quad \lambda\notin\overline{\Sigma}_{t}. {} \end{aligned} $$
(56)

See Theorem 6.8 in [26]. Properties of the free Hall transform then imply that for λ outside \(\overline {\Sigma }_{t},\) the operator b t − λ has an inverse. Indeed, the noncommutative L 2 norm of (b t − λ)−1 equals the norm in L 2(S 1, ν t) of the function on the right-hand side of (56). This norm, in turn, is finite because f t(λ) is outside the support of ν t whenever λ is outside \(\overline {\Sigma }_{t}.\) The existence of an inverse to b t − λ then shows that λ must be outside the support of \(\mu _{b_{t}}.\)

An interesting aspect of the paper [26] is that we not only compute the support of \(\mu _{b_{t}}\) but also connect it to the support of Biane’s measure ν t, using the transform \(\mathcal {G}_{t}\) and the conformal map f t.

We note, however, that none of the papers [18], [32], or [26] says anything about the distribution of \(\mu _{b_{t}}\) within Σt; they are only concerned with identifying the region Σt. The actual computation of \(\mu _{b_{t}}\) (not just its support) was done in [10].

7.5 The Brown Measure of b t

We now describe the main results of [10]. Many of these results have been extended by Ho and Zhong [29] to the case of the free multiplicative Brownian motion with an arbitrary unitary initial distribution.

The first key result in [10] is the following formula for the Brown measure of b t (Theorem 2.2 of [10]).

Theorem 21

For each t > 0, the Brown measure \(\mu _{b_{t}}\) is zero outside the closure of the region Σ t. In the region Σ t, the Brown measure has a density W t with respect to Lebesgue measure. This density has the following special form in polar coordinates:

$$\displaystyle \begin{aligned} W_{t}(r,\theta)=\frac{1}{r^{2}}w_{t}(\theta),\quad re^{i\theta}\in\Sigma_{t}, \end{aligned}$$

for some positive continuous function w t. The function w t is determined entirely by the geometry of the domain and is given as

$$\displaystyle \begin{aligned} w_{t}(\theta)=\frac{1}{4\pi}\left( \frac{2}{t}+\frac{\partial}{\partial \theta}\frac{2r_{t}(\theta)\sin\theta}{r_{t}(\theta)^{2}+1-2r_{t}(\theta )\cos\theta}\right) , \end{aligned}$$

where r t(θ) is the “outer radius” of the region Σ t at angle θ.

See Figure 13 for the definition of r t(θ), Figure 14 for plots of the function w t(θ), and Figure 15 for a plot of W t. The simple explicit dependence of W t on r is a major surprise of our analysis. See Corollary 22 for a notable consequence of the form of W t.

Fig. 13

The quantity r t(θ) is the larger of the two radii at which the ray of angle θ intersects the boundary of Σt

Fig. 14

Plots of w t(θ) for t = 2, 3.5, 4, and 7

Fig. 15

Plot of the density W t for t = 1

Using implicit differentiation, it is possible to compute dr t(θ)∕dθ explicitly as a function of r t(θ). This computation yields the following formula for w t, which does not involve differentiation:

$$\displaystyle \begin{aligned} w_{t}(\theta)=\frac{1}{2\pi t}\omega(r_{t}(\theta),\theta), \end{aligned}$$

where

$$\displaystyle \begin{aligned} \omega(r,\theta)=1+h(r)\frac{\alpha(r)\cos\theta+\beta(r)}{\beta(r)\cos \theta+\alpha(r)}, {} \end{aligned} $$
(57)

and

$$\displaystyle \begin{aligned} h(r)=r\frac{\log(r^{2})}{r^{2}-1};\quad \alpha(r)=r^{2}+1-2rh(r);\quad \beta(r)=(r^{2}+1)h(r)-2r. \end{aligned}$$

See Proposition 2.3 in [10].
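This derivative-free formula is convenient numerically. The following sketch (ours) locates the outer radius r t(θ) by root-finding and then evaluates w t(θ) via (57); the bracket for the root-finder is chosen ad hoc:

```python
import numpy as np
from scipy.optimize import brentq

def T(lam):
    # Eq. (53), with the removable singularity at |lambda| = 1 filled by continuity
    r2 = abs(lam) ** 2
    ratio = 1.0 if np.isclose(r2, 1.0) else np.log(r2) / (r2 - 1.0)
    return abs(lam - 1.0) ** 2 * ratio

def w_t(theta, t):
    """w_t(theta) via the derivative-free formula (57); the outer radius
    r_t(theta) is located by root-finding."""
    r = brentq(lambda rr: T(rr * np.exp(1j * theta)) - t, 1.0, 100.0)
    h = r * np.log(r ** 2) / (r ** 2 - 1.0)
    alpha = r ** 2 + 1.0 - 2.0 * r * h
    beta = (r ** 2 + 1.0) * h - 2.0 * r
    omega = 1.0 + h * (alpha * np.cos(theta) + beta) / (beta * np.cos(theta) + alpha)
    return omega / (2.0 * np.pi * t)

print(w_t(0.0, 2.0))   # sample value of the angular density at theta = 0
```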

We expect that the Brown measure of b t will coincide with the limiting empirical eigenvalue distribution of the Brownian motion \(B_{t}^{N}\) in \(\mathsf {GL}(N;\mathbb {C}).\) This expectation is supported by simulations; see Figure 16.

Fig. 16

The density W t (left) and a histogram of the eigenvalues of \(B_{t}^{N}\) (right), for t = 1 and N = 2, 000

We note that the Brown measure (inside Σt) can also be written as

$$\displaystyle \begin{aligned} d\mu_{b_{t}} & =\frac{1}{r^{2}}w_{t}(\theta)~r~dr~d\theta\\ & =w_{t}(\theta)~\frac{1}{r}~dr~d\theta\\ & =w_{t}(\theta)~d\log r~d\theta. \end{aligned} $$

Since the complex logarithm is given by \(\log (re^{i\theta })=\log r+i\theta ,\) we obtain the following consequence of Theorem 21.

Corollary 22

The push-forward of the Brown measure \(\mu _{b_{t}}\) under the complex logarithm has a density that is constant in the horizontal direction and given by w t in the vertical direction.

In light of this corollary, we expect that for large N, the logarithms of the eigenvalues of \(B_{t}^{N}\) should be approximately uniformly distributed in the horizontal direction. This expectation is confirmed by simulations, as in Figure 17.

Fig. 17

The eigenvalues of \(B_{t}^{N}\) for t = 4.1 and N = 2, 000 (left) and the logarithms thereof (right). The density of points on the right-hand side of the figure is approximately constant in the horizontal direction
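Such pictures are straightforward to reproduce from the product approximation (52); the sketch below (ours, with modest N and k so that it runs quickly) generates an approximate \(B_{t}^{N}\) and examines the real parts of the log-eigenvalues in a thin horizontal band, as suggested by Corollary 22:

```python
import numpy as np

# Approximate B_t^N by the product (52) with k factors (Ito term zero here);
# N and k are kept modest so the sketch runs quickly.
N, t, k = 400, 4.1, 100
rng = np.random.default_rng(2)
B = np.eye(N, dtype=complex)
for _ in range(k):
    Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
    B = B @ (np.eye(N) + 1j * np.sqrt(t / k) * Z)

logs = np.log(np.linalg.eigvals(B))   # principal branch of the logarithm
# In a thin horizontal band of the log picture, the real parts should be
# spread out roughly uniformly (Corollary 22):
band = logs[np.abs(logs.imag) < 0.2].real
print(np.histogram(band, bins=4)[0])  # roughly equal bin counts
```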

We conclude this section by describing a remarkable connection between the Brown measure \(\mu _{b_{t}}\) and the distribution ν t of the free unitary Brownian motion. Recall the holomorphic function f t in (54) and (55). This map takes the boundary of Σt to the unit circle. We may then define a map

$$\displaystyle \begin{aligned} \Phi_{t}:\overline{\Sigma}_{t}\rightarrow S^{1} \end{aligned}$$

by requiring (a) that Φt should agree with f t on the boundary of Σt and (b) that Φt should be constant along each radial segment inside \(\overline {\Sigma }_{t},\) as in Figure 18. (This specification makes sense because f t has the same value at the two boundary points on each radial segment.) We then have the following result, which may be summarized by saying that the distribution ν t of free unitary Brownian motion is a “shadow” of the Brown measure of b t.

Fig. 18

The map Φt maps \(\overline {\Sigma }_{t}\) to the unit circle by mapping each radial segment in \(\overline {\Sigma }_{t}\) to a single point in S 1

Theorem 23

The push-forward of the Brown measure of b t under the map Φ t is Biane’s measure ν t on S 1. Indeed, the Brown measure of b t is the unique measure μ on \(\overline {\Sigma }_{t}\) with the following two properties: (1) the push-forward of μ by Φ t is ν t , and (2) μ is absolutely continuous with respect to Lebesgue measure with a density W having the form

$$\displaystyle \begin{aligned} W(r,\theta)=\frac{1}{r^{2}}g(\theta) \end{aligned}$$

in polar coordinates, for some continuous function g.

This result is Proposition 2.6 in [10]. Figure 19 shows the eigenvalues for \(B_{t}^{N}\) after applying the map Φt, plotted against the density of Biane’s measure ν t. We emphasize that we have computed the eigenvalues of the Brownian motion \(B_{t}^{N}\) in \(\mathsf {GL}(N;\mathbb {C})\) (in the two-dimensional region Σt) and then mapped these points to the unit circle. The resulting histogram, however, looks precisely like a histogram of the eigenvalues of the Brownian motion in U(N).

Fig. 19

The eigenvalues of \(B_{t}^{N}\), mapped to the unit circle by Φt, plotted against the density of Biane’s measure ν t. Shown for t = 2 and N = 2, 000

7.6 The PDE and Its Solution

We conclude this article by briefly outlining the methods used to obtain the results in the previous subsection.

7.6.1 The PDE

Following the definition of the Brown measure in Theorem 7, we consider the function

$$\displaystyle \begin{aligned} S(t,\lambda,\varepsilon):=\tau\lbrack\log((b_{t}-\lambda)^{\ast}(b_{t}-\lambda)+\varepsilon)]. {} \end{aligned} $$
(58)

We then record the following result [10, Theorem 2.8].

Theorem 24

The function S in (58) satisfies the following PDE:

$$\displaystyle \begin{aligned} \frac{\partial S}{\partial t}=\varepsilon\frac{\partial S}{\partial \varepsilon}\left( 1+(\left\vert \lambda\right\vert ^{2}-\varepsilon)\frac{\partial S}{\partial \varepsilon} -a\frac{\partial S}{\partial a}-b\frac{\partial S}{\partial b}\right) ,\quad \lambda=a+ib, {} \end{aligned} $$
(59)

with the initial condition

$$\displaystyle \begin{aligned} S(0,\lambda,\varepsilon)=\log(\left\vert \lambda-1\right\vert ^{2}+\varepsilon). {} \end{aligned} $$
(60)

Recall that in the case of the circular Brownian motion (the PDE in Theorem 9), the complex number λ enters only into the initial condition and not into the PDE itself. By contrast, the right-hand side of the PDE (59) involves differentiation with respect to the real and imaginary parts of λ.

On the other hand, the PDE (59) is again of Hamilton–Jacobi type. Thus, following the general Hamilton–Jacobi method in Section 5.1, we define a Hamiltonian function H from (the negative of) the right-hand side of (59), replacing each derivative of S by a corresponding momentum variable:

$$\displaystyle \begin{aligned} H(a,b,\varepsilon,p_{a},p_{b},p_{\varepsilon})=-\varepsilon p_{\varepsilon}(1+(a^{2}+b^{2})p_{\varepsilon}-\varepsilon p_{\varepsilon}-ap_{a}-bp_{b}). {} \end{aligned} $$
(61)

We then consider Hamilton’s equations for this Hamiltonian:

$$\displaystyle \begin{aligned} \frac{da}{dt} & =\frac{\partial H}{\partial p_{a}};\quad ~~\frac{db} {dt}=\frac{\partial H}{\partial p_{b}};\quad ~~~\frac{d\varepsilon}{dt}=\frac{\partial H}{\partial p_{\varepsilon}};\\ \frac{dp_{a}}{dt} & =-\frac{\partial H}{\partial a};\quad \frac{dp_{b}} {dt}=-\frac{\partial H}{\partial b};\quad \frac{dp_{\varepsilon}}{dt}=-\frac{\partial H}{\partial \varepsilon}. {} \end{aligned} $$
(62)

Then, after a bit of simplification, the general Hamilton–Jacobi formula (40) takes the form

$$\displaystyle \begin{aligned} S(t,\lambda(t),\varepsilon(t)) & =\log(\left\vert \lambda_{0}-1\right\vert ^{2} +\varepsilon_{0})-\frac{\varepsilon_{0}t}{(\left\vert \lambda_{0}-1\right\vert ^{2}+\varepsilon_{0})^{2} }\\ & +\log\left\vert \lambda(t)\right\vert -\log\left\vert \lambda _{0}\right\vert \text{.} {} \end{aligned} $$
(63)

(See Theorem 6.2 in [10].)

The analysis in [10] then proceeds along broadly similar lines to those in Sections 5 and 6. The main structural difference is that because λ is now a variable in the PDE, the ODE’s in (62) involve both ε and λ, along with the associated momenta. (That is to say, the vector x in Proposition 12 is equal to \((\lambda ,\varepsilon ) \in \mathbb {C}\times \mathbb {R} \cong \mathbb {R}^3.\)) The first key result is that the system of ODE’s associated to (59) can be solved explicitly; see Section 6.3 of [10]. Solving the ODE’s gives an implicit formula for the solution to (59) with the initial conditions (60).
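To illustrate, the following sketch (ours; the test point is arbitrary and is kept well within the lifetime of the solution) integrates Hamilton's equations (62) for the Hamiltonian (61) numerically, accumulates the integral appearing in (40), and compares the result with the simplified formula (63):

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(s, y):
    # y = (a, b, eps, pa, pb, pe, I), where I accumulates p . dx/ds as in (40)
    a, b, eps, pa, pb, pe, _ = y
    Q = 1.0 + (a * a + b * b) * pe - eps * pe - a * pa - b * pb  # so H = -eps*pe*Q
    da, db = a * eps * pe, b * eps * pe
    de = -eps * (Q + pe * (a * a + b * b - eps))
    dpa = eps * pe * (2.0 * a * pe - pa)
    dpb = eps * pe * (2.0 * b * pe - pb)
    dpe = pe * Q - eps * pe * pe
    return [da, db, de, dpa, dpb, dpe, pa * da + pb * db + pe * de]

# Arbitrary test point (ours); t is chosen within the lifetime of the solution.
a0, b0, eps0, t = 0.4, 0.3, 0.5, 0.5
m = (a0 - 1.0) ** 2 + b0 ** 2 + eps0            # |lambda_0 - 1|^2 + eps_0
pa0, pb0, pe0 = 2.0 * (a0 - 1.0) / m, 2.0 * b0 / m, 1.0 / m   # from (39) and (60)

sol = solve_ivp(rhs, (0.0, t), [a0, b0, eps0, pa0, pb0, pe0, 0.0],
                rtol=1e-11, atol=1e-13)
a, b = sol.y[0, -1], sol.y[1, -1]

H0 = -eps0 * pe0 * (1.0 + (a0 * a0 + b0 * b0) * pe0 - eps0 * pe0 - a0 * pa0 - b0 * pb0)
S_40 = np.log(m) - H0 * t + sol.y[6, -1]                       # formula (40)
S_63 = (np.log(m) - eps0 * t / m ** 2
        + 0.5 * np.log(a * a + b * b) - 0.5 * np.log(a0 * a0 + b0 * b0))
print(abs(S_40 - S_63))   # small: the two formulas agree along the flow
```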

We then evaluate the solution in the limit as ε tends to zero. We follow the strategy in Section 6. Given a time t and a complex number λ, we attempt to choose initial conditions ε 0 and λ 0 so that ε(t) will be very close to zero and λ(t) will equal λ. (Recall that the initial momenta in the system of ODE’s are determined by the positions by (39).)

7.6.2 Outside the Domain

As in the case of the circular Brownian motion, we use different approaches for λ outside Σt and for λ in Σt. For λ outside Σt, we allow the initial condition ε 0 in the ODE’s to approach zero. As it turns out, when ε 0 is small and positive, ε(t) remains small and positive for as long as the solution to the system exists. Furthermore, when ε 0 is small and positive, λ(t) is approximately constant. Thus, our strategy will be to take ε 0 ≈ 0 and λ 0 ≈ λ.

A key result is the following.

Proposition 25

In the limit as ε 0 tends to zero, the lifetime of the solution to (62) with initial conditions λ 0 and ε 0 —and initial momenta determined by (39)—approaches T(λ 0), where T is the same function (53) that enters into the definition of the domain Σ t.

This result is Proposition 6.13 in [10]. Thus, the strategy in the previous paragraph will work—meaning that the solution continues to exist up to time t—provided that T(λ 0) ≈ T(λ) is greater than t. The condition for success of the strategy is, therefore, T(λ) > t. In light of the characterization of Σt in Definition 20, we have the following conclusion.

Conclusion 26

The simple strategy of taking ε 0 ≈ 0 and λ 0 ≈ λ is successful precisely if T(λ) > t or, equivalently, if λ is outside \(\overline {\Sigma }_{t}.\)

When this strategy works, we obtain a simple expression for \(\lim _{\varepsilon \rightarrow 0^{+}}S(t,\lambda ,\varepsilon )\) by letting ε 0 approach zero and λ 0 approach λ in (63). Since λ(t) approaches λ in this limit [10, Proposition 6.11], we find that

$$\displaystyle \begin{aligned} \lim_{\varepsilon\rightarrow0^{+}}S(t,\lambda,\varepsilon)=\log(\left\vert \lambda-1\right\vert ^{2}),\quad \lambda\notin\overline{\Sigma}_{t}. {} \end{aligned} $$
(64)

This function is harmonic (except at λ = 1, which is always in the domain Σt), so we conclude that the Brown measure of b t is zero outside \(\overline {\Sigma }_{t}.\) See Section 7.2 in [10] for more details.

7.6.3 Inside the Domain

For λ inside Σt, the simple approach in the previous subsection does not work, because when λ is inside Σt and ε 0 is small, the solutions to the ODE’s (62) will cease to exist prior to time t (Proposition 25). Instead, we must prove a “surjectivity” result: For each t > 0 and λ ∈ Σt, there exist—in principle—\(\lambda _{0} \in \mathbb {C}\) and ε 0 > 0 giving λ(t) = λ and ε(t) = 0. See Figure 20. Actually, the proof shows that λ 0 again belongs to the domain Σt; see Section 6.5 in [10].

Fig. 20

For each λ in Σt, there exists ε 0 > 0 and λ 0 ∈ Σt such that with these initial conditions, we have ε(t) = 0 and λ(t) = λ

We then make use of the second Hamilton–Jacobi formula (41), which allows us to compute the derivatives of S directly, without having to attempt to differentiate the formula (63) for S. Working in logarithmic polar coordinates, \(\rho =\log \left \vert \lambda \right \vert \) and \(\theta =\arg \lambda ,\) we find an amazingly simple expression for the quantity

$$\displaystyle \begin{aligned} \frac{\partial s_{t}}{\partial\rho}=\lim_{\varepsilon\rightarrow0^{+}}\frac{\partial S}{\partial\rho}(t,\lambda,\varepsilon), \end{aligned}$$

inside Σt, namely,

$$\displaystyle \begin{aligned} \frac{\partial s_{t}}{\partial\rho}=\frac{2\rho}{t}+1,\quad \lambda\in \Sigma_{t}. {} \end{aligned} $$
(65)

(See Corollary 7.6 in [10].) This result is obtained using a certain constant of motion of the system of ODE’s, namely, the quantity

$$\displaystyle \begin{aligned} \Psi=\varepsilon p_{\varepsilon}+\frac{1}{2}(ap_{a}+bp_{b}) \end{aligned}$$

in [10, Proposition 6.5].

If we evaluate this constant of motion at a time t when ε(t) = 0, the εp ε term vanishes. But if ε(t) = 0, the second Hamilton–Jacobi formula (41) tells us that

$$\displaystyle \begin{aligned} \left( a\frac{\partial S}{\partial a}+b\frac{\partial S}{\partial b}\right) (t,\lambda(t),0)=a(t)p_{a}(t)+b(t)p_{b}(t). \end{aligned}$$

Furthermore, \(a\frac {\partial S}{\partial a}+b\frac {\partial S}{\partial b}\) is just ∂S∕∂ρ, computed in rectangular coordinates. A bit of algebraic manipulation yields an explicit formula for \(a\frac {\partial S}{\partial a}+b\frac {\partial S}{\partial b},\) as in [10, Theorem 6.7], explaining the formula (65). To complete the proof of (65), it still remains to address certain regularity issues of S(t, λ, ε) near ε = 0, as in Section 7.3 of [10].

Once (65) is established, we note that the formula for ∂s t∕∂ρ in (65) is independent of θ. It follows that

$$\displaystyle \begin{aligned} \frac{\partial}{\partial\rho}\frac{\partial s_{t}}{\partial\theta} =\frac{\partial}{\partial\theta}\frac{\partial s_{t}}{\partial\rho}=0, \end{aligned}$$

that is, that ∂s t∕∂θ is independent of ρ inside Σt. Writing the Laplacian in logarithmic polar coordinates, we then find that

$$\displaystyle \begin{aligned} \Delta s_{t}(\lambda) & =\frac{1}{r^{2}}\left( \frac{\partial^{2}s_{t} }{\partial\rho^{2}}+\frac{\partial^{2}s_{t}}{\partial\theta^{2}}\right) \\ & =\frac{1}{r^{2}}\left( \frac{2}{t}+\frac{\partial}{\partial\theta}\left( \frac{\partial s_{t}}{\partial\theta}\right) \right) ,\quad \lambda\in \Sigma_{t}, {} \end{aligned} $$
(66)

where the 2∕t term in the expression comes from differentiating (65) with respect to ρ. Since ∂s t∕∂θ is independent of ρ, we can understand the structure of the formula in Theorem 21.

The last step in the proof of Theorem 21 is to compute ∂s t∕∂θ. Since ∂s t∕∂θ is independent of ρ—or, equivalently, independent of \(r=\left \vert \lambda \right \vert \)—inside Σt, the value of ∂s t∕∂θ at a point λ in Σt is the same as its value as we approach the boundary of Σt along the radial segment through λ. We show that ∂s t∕∂θ is continuous over the whole complex plane, even at the boundary of Σt. (See Section 7.4 of [10].) Thus, on the boundary of Σt, the function ∂s t∕∂θ will agree with the angular derivative of \(\log (\left \vert \lambda -1\right \vert ^{2})\), namely

$$\displaystyle \begin{aligned} \frac{\partial}{\partial\theta}\log(\left\vert \lambda-1\right\vert ^{2}) & =\frac{2\operatorname{Im}\lambda}{\left\vert \lambda-1\right\vert ^{2} }\\ & =\frac{2r\sin\theta}{r^{2}+1-2r\cos\theta}. {} \end{aligned} $$
(67)

Thus, to compute ∂s t∕∂θ at a point λ in Σt, we simply evaluate (67) at either of the two points where the radial segment through λ intersects the boundary of Σt. (We get the same value at either point.)

One such boundary point is the point with argument \(\theta =\arg \lambda \) and radius r t(θ), as in Figure 13. Thus, inside Σt, we have

$$\displaystyle \begin{aligned} \frac{\partial s_{t}}{\partial\theta}=\frac{2r_{t}(\theta)\sin\theta} {r_{t}(\theta)^{2}+1-2r_{t}(\theta)\cos\theta}. \end{aligned}$$

Plugging this expression into (66) gives the claimed formula in Theorem 21.
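As a closing numerical consistency check (sketch ours; the root-finding brackets are ad hoc), one can verify both the claim that (67) takes the same value at the two boundary points on a radial segment and, granting the boundary regularity established in Section 7.4 of [10], that (65) at ρ = log r t(θ) matches the logarithmic radial derivative of log(|λ−1|²) computed from outside Σt:

```python
import numpy as np
from scipy.optimize import brentq

def T(lam):
    # Eq. (53), with the removable singularity at |lambda| = 1 filled by continuity
    r2 = abs(lam) ** 2
    ratio = 1.0 if np.isclose(r2, 1.0) else np.log(r2) / (r2 - 1.0)
    return abs(lam - 1.0) ** 2 * ratio

# Angular derivative (67) as a function of r and theta:
ang = lambda r, th: 2.0 * r * np.sin(th) / (r * r + 1.0 - 2.0 * r * np.cos(th))

t, theta = 2.0, 0.5
f = lambda rr: T(rr * np.exp(1j * theta)) - t
r_in, r_out = brentq(f, 1e-3, 1.0), brentq(f, 1.0, 50.0)   # the two boundary radii

# (67) gives the same value at both boundary points on the ray:
print(ang(r_in, theta), ang(r_out, theta))

# (65) at the outer boundary matches r d/dr of log|lambda - 1|^2 from outside:
inside = 2.0 * np.log(r_out) / t + 1.0
outside = 2.0 * r_out * (r_out - np.cos(theta)) / (r_out ** 2 + 1.0 - 2.0 * r_out * np.cos(theta))
print(inside, outside)   # the two values agree
```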