Fusion Frame Homotopy and Tightening Fusion Frames by Gradient Descent

Needham, Tom; Shonkwiler, Clayton

doi:10.1007/s00041-023-10028-0

Published: 28 July 2023

Volume 29, article number 51 (2023)

Journal of Fourier Analysis and Applications


Abstract

Finite frames, or spanning sets for finite-dimensional Hilbert spaces, are a ubiquitous tool in signal processing. There has been much recent work on understanding the global structure of collections of finite frames with prescribed properties, such as spaces of unit norm tight frames. We extend some of these results to the more general setting of fusion frames—a fusion frame is a collection of subspaces of a finite-dimensional Hilbert space with the property that any vector can be recovered from its list of projections. The notion of tightness extends to fusion frames, and we consider the following basic question: is the collection of tight fusion frames with prescribed subspace dimensions path connected? We answer (a generalization of) this question in the affirmative, extending the analogous result for unit norm tight frames proved by Cahill, Mixon and Strawn. We also extend a result of Benedetto and Fickus, who defined a natural functional on the space of unit norm frames (the frame potential), showed that its global minimizers are tight, and showed that it has no spurious local minimizers, meaning that gradient descent can be used to construct unit-norm tight frames. We prove the analogous result for the fusion frame potential of Casazza and Fickus, implying that, when tight fusion frames exist for a given choice of dimensions, they can be constructed via gradient descent. Our proofs use techniques from symplectic geometry and Mumford’s geometric invariant theory.

1 Introduction

A frame is a spanning collection of vectors in a Hilbert space which satisfies a certain Parseval-like condition [28]. Frames are important in the context of signal processing, where a signal is modeled as a vector in a Hilbert space and is encoded by the list of its inner products with the vectors in the frame [26]. A frame is generally overcomplete (i.e., linearly dependent), a property which is useful in applications because the resulting signal representations are, by virtue of their redundancy, more robust to noise than representations in a basis. For practical reasons, there has been increased interest in finite frames, that is, spanning sets of vectors for finite-dimensional Hilbert spaces; see [24] and [59] for general surveys. For the rest of the paper, we will only consider finite frames and will therefore work under the simplifying convention that our Hilbert space is $\mathbb {K}^d$, $\mathbb {K}= \mathbb {R}$ or $\mathbb {C}$, endowed with the standard inner product $\langle \cdot , \cdot \rangle $, with standard norm denoted $\Vert \cdot \Vert $.

Typically, one requires a frame to satisfy certain norm or spectral constraints. For example, a frame $\{f_1,\ldots ,f_N\}$ for $\mathbb {K}^d$ is called tight if the frame operator $v \mapsto \sum _{i=1}^N \langle v, f_i \rangle f_i$ is a constant multiple of the identity map on $\mathbb {K}^d$; equivalently, a frame is tight if the spectrum of its frame operator is constant. Tight frames are of particular interest, since they guarantee optimal robustness under certain noise models [20, 34]. The collection of all length-N frames for $\mathbb {K}^d$ with prescribed frame vector norms and frame operator spectrum defines a complicated algebraic variety, and topological and geometrical properties of these varieties have been the focus of a growing body of recent research [14, 16, 29, 52, 53, 54].

The goal of this paper is to extend results on spaces of frames to the setting of a more general signal processing tool: a fusion frame is an ordered collection $(\mathcal {S}_1,\ldots ,\mathcal {S}_N)$ of subspaces of $\mathbb {K}^d$ such that the frame operator $ v \mapsto \sum _{i=1}^N P_i v $ is invertible (and hence necessarily positive definite), where $P_i:\mathbb {K}^d \rightarrow \mathcal {S}_i$ is the orthogonal projection. Observe that if all subspaces are 1-dimensional, this essentially reduces to the definition of a (classical) frame. Fusion frames were introduced by Casazza and Kutyniok in [21] as a hierarchical approach to the construction of large frames with desirable properties. Fusion frames were subsequently developed into a general tool for distributed data processing [22, 23, 44]—the basic idea is that factors such as hardware limitations may require a collection of local vector-valued signal measurements to be coherently and robustly fused into a global measurement. Fusion frames have more recently been applied to compressed sensing for structured signals or low quality measurement modalities [2, 6, 60].

In both the original definition and much of the subsequent literature, fusion frames allow for positive weights to be attached to the subspaces. In what follows we will either generalize to operator-valued frames (which include all weighted fusion frames) or specialize to unweighted fusion frames, so in this paper (as in [44]), we only give the definition in the unweighted case and when we say “fusion frames” we really mean “unweighted fusion frames.”

As in the case of (classical) frames, there is typically a focus on fusion frames with extra structure. For instance, the notion of a tight frame generalizes to that of a tight fusion frame—this is a fusion frame such that the frame operator is a multiple of the identity map. We study the topology and geometry of spaces of tight fusion frames, as well as spaces of fusion frames with more general prescribed data. We give precise formulations of our results in subsections 1.1 and 1.2 below, but our main contributions are described informally as follows.

Fusion Frame Homotopy Theorem (Theorem 1.9). We show that the space of tight fusion frames in a complex Hilbert space with a prescribed sequence of subspace dimensions is path connected. This gives a generalization of the (complex) Frame Homotopy Theorem, which says that the space of length-N tight frames for a complex Hilbert space, whose frame vectors are all unit norm, is path connected. This was proved by Cahill, Mixon and Strawn in [14], affirming a conjecture of Larson from 2002 (although the conjecture was first published in [29]). The Frame Homotopy Theorem was generalized by the authors of the present paper to spaces of frames with more general prescribed spectral and norm data using techniques from symplectic geometry [52] and to spaces of quaternionic frames using the theory of isoparametric submanifolds [55]. We once again apply symplectic techniques to prove the analogous result for fusion frames—in fact, our techniques work in much greater generality, and we are able to prove a connectivity result for spaces of operator-valued frames. Our result is described in detail below in Sect. 1.1.

Benedetto–Fickus Theorem for Fusion Frames (Theorem 1.14). In [7], Benedetto and Fickus introduced the frame potential, a natural energy functional on the space of frames consisting of a fixed number of unit vectors in a fixed Hilbert space. They showed that the global minimizers of the frame potential are tight frames and proved the surprising result that the frame potential has no spurious local minimizers, meaning that tight frames can reliably be generated via gradient descent. We refer to this result ([7, Theorem 7.1], also stated below as Theorem 1.12) as the Benedetto–Fickus theorem. Casazza and Fickus defined a more general functional on the space of fusion frames, called the fusion frame potential, and characterized its minimizers [18]. We extend the Benedetto–Fickus theorem to give general conditions which guarantee that the fusion frame potential has no spurious local minimizers (Theorem 1.14), which implies that if tight fusion frames exist in a given space of fusion frames, they can always be reached by gradient descent (Corollary 1.15). Together with Mixon and Villar, we gave several strengthenings of the Benedetto–Fickus theorem in [49], one of which was proved using ideas from Mumford’s geometric invariant theory (GIT) [51]. Theorem 1.14 is proved by extending this application of GIT to the fusion frame setting. We precisely state and further contextualize our result below in Sect. 1.2.

The structure of the paper is as follows. The remaining subsections of the introduction pin down exact definitions, set notation, and give precise statements of our main results. Section 2 provides necessary background on symplectic geometry, in preparation for the proof of the Fusion Frame Homotopy Theorem (Theorem 1.9), which is then proved in Sect. 3. The Benedetto–Fickus Theorem for Fusion Frames (Theorem 1.14) is proved in Sect. 4, after introducing the main ideas of GIT. We remark that the exposition about symplectic geometry and GIT in Sects. 2 and 4, respectively, is intended to be accessible to non-experts in these fields. The paper concludes with a discussion of open problems and future directions in Sect. 5.

1.1 Fusion Frame Homotopy

Recall that a tight fusion frame (TFF) is a fusion frame $(\mathcal {S}_1,\ldots ,\mathcal {S}_N)$ such that the frame operator $\sum _i P_i$ is a multiple of the identity. If $k_i = \dim (\mathcal {S}_i) = {\text {rk}}(P_i)$ then, since all nonzero eigenvalues of an orthogonal projector are equal to 1,

$$\begin{aligned} {\text {tr}}\left( \sum _{i=1}^N P_i\right) = \sum _{i=1}^N {\text {tr}}(P_i) = \sum _{i=1}^N k_i =:n, \end{aligned}$$

so it must be the case that a TFF has frame operator equal to $\frac{n}{d} \mathbb {I}_d$, where $\mathbb {I}_d$ is the identity operator on $\mathbb {K}^d$. Since the $P_i$ uniquely determine and are uniquely determined by the $\mathcal {S}_i$, we will also call $(P_1, \dots , P_N)$ a (tight) fusion frame when the corresponding $(\mathcal {S}_1, \dots , \mathcal {S}_N)$ is.
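As a concrete illustration of the trace computation above, the following minimal sketch checks numerically that the three coordinate planes of $\mathbb {R}^3$ form a tight fusion frame with frame operator $\frac{n}{d} \mathbb {I}_d = 2\mathbb {I}_3$. The example and the variable names (`P`, `ranks`, `is_tight`) are chosen here for illustration and do not come from the paper.

```python
import numpy as np

d = 3
# Orthogonal projectors onto the three coordinate planes of R^3
# (an illustrative example, not taken from the paper).
P = [
    np.diag([0.0, 1.0, 1.0]),  # the yz-plane
    np.diag([1.0, 0.0, 1.0]),  # the xz-plane
    np.diag([1.0, 1.0, 0.0]),  # the xy-plane
]

# For an orthogonal projector, rank equals trace: k_i = tr(P_i).
ranks = [int(round(np.trace(Pi))) for Pi in P]
n = sum(ranks)

# Frame operator S = sum_i P_i; tightness means S = (n/d) * identity.
S = sum(P)
is_tight = np.allclose(S, (n / d) * np.eye(d))
print(n, is_tight)
```

Here $n = 6$ and $d = 3$, so the only possible multiple of the identity is $2\mathbb {I}_3$, in agreement with the trace argument.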

With the frame homotopy conjecture [29] (resolved by Cahill, Mixon, and Strawn in 2017 [14]) in mind, the following is a very natural question:

Question 1.1

Is the space of TFFs in $\mathbb {K}^d$ with given ranks $(k_1, \dots , k_N)$ path-connected?

The first goal of this paper is to show that the answer to Question 1.1 is always “yes” for complex fusion frames (i.e., when $\mathbb {K}= \mathbb {C}$). In fact, we will prove a much more general theorem about spaces of operator-valued frames with fixed spectral data.

To motivate the definition given below, notice that, for a fusion frame $(P_1, \dots , P_N)$, each projector $P_i$ has a full-rank square root $A_i: \mathbb {K}^d \rightarrow \mathbb {K}^{k_i}$ so that $A_i^*A_i = P_i$. This square root is unique up to composing with a unitary transformation of the codomain:

Proposition 1.2

([36, Theorem 7.3.11]) Suppose $A: \mathbb {K}^d \rightarrow \mathbb {K}^k$ and $B: \mathbb {K}^d \rightarrow \mathbb {K}^k$ are linear maps. Then $A^*A = B^*B$ if and only if there exists a unitary transformation $U \in {\text {U}}(k)$ so that $B = UA$.

Hence, up to this indeterminacy, we can also think of a fusion frame as a collection of operators $(A_1, \dots , A_N)$ so that $A_i^*A_i$ is an orthogonal projector for each $i=1,\dots , N$ and so that $A_1^*A_1 + \dots + A_N^*A_N$ is positive definite. This is the definition we will generalize by relaxing the condition on the individual $A_i^*A_i$:

Definition 1.3

Let $d, N, k_1,\dots ,k_N$ be positive integers. Let $\varvec{k}:=(k_1, \dots , k_N)$. Then an operator-valued frame of type $(d, \varvec{k})$ is an N-tuple $\varvec{A}:=(A_1, \dots , A_N)$ of linear maps $A_i:\mathbb {K}^d \rightarrow \mathbb {K}^{k_i}$ so that the frame operator

$$\begin{aligned} S_{\varvec{A}}:= \sum _{i=1}^N A_i^*A_i \end{aligned}$$

is positive definite. The space of operator-valued frames of type $(d, \varvec{k})$ will be denoted $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$.

This definition is essentially the specialization to finite dimensions of Kaftal, Larson, and Zhang’s definition [37] of an operator-valued frame (see also [9]). As with fusion frames, we will define $P_i:= A_i^*A_i$, so that the frame operator is $\sum _i P_i$. In practice, we will make the simplifying assumption ${\text {rk}}(A_i) = k_i$, by restricting the codomain of $A_i$ to its image if necessary.
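The indeterminacy described in Proposition 1.2 can be checked numerically. The sketch below is restricted to the real case and assumes ${\text {rk}}(A) = k$, as in the simplifying assumption above. It builds $B = UA$ for a random orthogonal $U$, confirms $A^*A = B^*B$, and then recovers $U$ from the pair $(A, B)$ using the pseudoinverse. The recovery formula $U = BA^{+}$ is valid only under the full-rank assumption, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d = 2, 4

# A generic k x d real matrix has full row rank k (assumed here).
A = rng.standard_normal((k, d))

# Random orthogonal U in O(k), obtained from a QR factorization.
U, _ = np.linalg.qr(rng.standard_normal((k, k)))
B = U @ A

# Forward direction: B = U A implies A^* A = B^* B.
same_gram = np.allclose(A.T @ A, B.T @ B)

# Converse, in the full-rank case: A @ pinv(A) = I_k, so B @ pinv(A) = U.
U_rec = B @ np.linalg.pinv(A)
print(same_gram, np.allclose(U_rec, U))
```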

The feature that distinguishes fusion frames among the operator-valued frames is that the $P_i$ are orthogonal projectors of rank $k_i$, which means precisely that the $k_i$ nonzero eigenvalues of $P_i$ are all equal to 1. More generally, we can consider spaces of operator-valued frames with fixed spectral data:

Definition 1.4

Let $d, N, k_1, \dots , k_N$ be positive integers and $\varvec{k}:=(k_1, \dots , k_N)$. For each i, let $\varvec{r}_i =(r_{i1}, \dots , r_{ik_i})$ with $r_{i1} \ge \dots \ge r_{ik_i} > 0$, and let $\varvec{r}= (\varvec{r}_1, \dots , \varvec{r}_N)$. $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}(\varvec{r})$ will denote the space of operator-valued frames $(A_1, \dots , A_N)$ of type $(d,\varvec{k})$ for which $P_i = A_i^*A_i$ has nonzero eigenvalues equal to $\varvec{r}_i$. Equivalently, each $r_{ij} = \sigma _{ij}^2$, where the $\sigma _{ij}$ are the singular values of $A_i$.
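The equivalence between the nonzero eigenvalues of $P_i = A_i^*A_i$ and the squared singular values of $A_i$ can be illustrated as follows. This is a small sketch assuming ${\text {rk}}(A_i) = k_i$; the helper name `spectral_data` is hypothetical and introduced only for this example.

```python
import numpy as np

def spectral_data(A):
    """Squared singular values of A, in decreasing order.

    Assuming A has full row rank, these are the nonzero
    eigenvalues of P = A^* A.
    """
    sigma = np.linalg.svd(A, compute_uv=False)  # returned in decreasing order
    return sigma ** 2

# A 2 x 3 example with singular values 2 and 1.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
r = spectral_data(A)

# Compare with the k largest eigenvalues of P = A^* A
# (eigvalsh returns eigenvalues in increasing order).
P = A.conj().T @ A
eig = np.linalg.eigvalsh(P)[::-1][: A.shape[0]]
print(r, eig)
```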

Example 1.5

Let $\varvec{k}= (k_1,\ldots ,k_N)$ and let $\varvec{r}= (\varvec{r}_1,\ldots ,\varvec{r}_N)$ such that every $\varvec{r}_i$ is a list of $k_i$ ones. Then $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}(\varvec{r})$ is equivalent to the space of fusion frames with prescribed ranks $\varvec{k}$. Indeed, it is the space of $\varvec{A} = (A_1, \dots , A_N)$ so that $P_i = A_i^*A_i$ is a rank-$k_i$ orthogonal projector and the frame operator $S_{\varvec{A}} = \sum _i P_i$ is positive-definite. Since this space is of particular interest, we use the specialized notation $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}:= \mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}(\varvec{r})$. As a special case, if $k_i = 1$ for all i then $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ is equivalent to the space of unit-norm frames of length N in $\mathbb {K}^d$.

Tight fusion frames are also distinguished among fusion frames by spectral data since multiples of the identity are uniquely determined by their spectra: $\lambda \mathbb {I}_d$ is the only operator with spectrum $(\lambda , \dots , \lambda )$. Fixing the spectral data of the frame operator is more natural than fixing the frame operator itself, since we can always diagonalize the frame operator by precomposing the $A_i: \mathbb {K}^d \rightarrow \mathbb {K}^{k_i}$ by a common unitary transformation of the domain. Hence, a natural generalization of tight frames is the collection of operator-valued frames with fixed spectrum of their frame operator:

Definition 1.6

Let $d, N, k_1, \dots , k_N$ be positive integers and $\varvec{k}:=(k_1, \dots , k_N)$. Let $\varvec{\lambda }= (\lambda _1, \dots , \lambda _d)$ with $\lambda _1 \ge \dots \ge \lambda _d > 0$. $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }}$ will denote the space of operator-valued frames $(A_1, \dots , A_N)$ whose frame operator $S_{\varvec{A}} = \sum _i A_i^*A_i$ has spectrum $\varvec{\lambda }$.

Example 1.7

Let $\varvec{k}= (k_1,\ldots ,k_N)$ and let $\varvec{\lambda }= (1,\ldots ,1)$ be the list of d ones. Then $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }}$ could reasonably be called the space of Parseval operator-valued frames of type $(d, \varvec{k})$, by analogy with the case $k_i = 1$ for all i, when $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }}$ is equivalent to the space of Parseval frames of length N in $\mathbb {K}^d$ (that is, frames whose frame operator is the identity).

Of course, the definition of a tight fusion frame includes both fixed spectral data of the $P_i$ and fixed spectral data of the frame operator, so it involves intersecting two of the spaces defined above. In that spirit, define

$$\begin{aligned} \mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r}):= \mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }} \cap \mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}(\varvec{r}). \end{aligned}$$

Hence, the operator-valued generalization of Question 1.1 is the following:

Question 1.8

For given $d,N,\varvec{k}, \varvec{r}, \varvec{\lambda }$, is the space $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ path-connected?

In the case of complex classical frames (i.e., $\mathbb {K}= \mathbb {C}$ and $\varvec{k}= (1, \dots , 1)$, but general $\varvec{\lambda }$ and $\varvec{r}$) we answered this question in the affirmative using symplectic geometry [52]. Here, we give an affirmative answer for general complex operator-valued frames:

Theorem 1.9

The space $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is always path-connected.

Since the empty set is trivially path-connected, the substantive content of this theorem is that $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is path-connected whenever it is non-empty. For discussion of when it is empty, see Remark 1.16.

1.2 Benedetto–Fickus Theorem for Fusion Frames

We now specialize back to fusion frames, but again work over $\mathbb {K}= \mathbb {R}$ or $\mathbb {C}$. Recall from Example 1.5 that, for d and N positive and $\varvec{k}= (k_1, \dots , k_N)$, $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ denotes the space of (square roots of) fusion frames with prescribed ranks $\varvec{k}$.

It is easy to generate random elements of $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$: for each i, let $B_i$ be a $k_i \times d$ matrix with standard Gaussian entries, and let $A_i$ be the result of orthogonalizing the rows of $B_i$. Given their usefulness in applications, it is desirable to generate not just fusion frames, but tight fusion frames. Following the lead of Benedetto and Fickus [7], a plausible strategy for doing so is to define a potential function on $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ whose global minima are exactly the set of TFFs, and then flow along the negative gradient direction of this potential. A natural candidate for such a potential is the fusion frame potential defined by Casazza and Fickus [18], generalizing Benedetto and Fickus’ frame potential. In the definition below, and at times throughout the rest of the paper, we will abuse notation and also use $\langle \cdot , \cdot \rangle $ and $\Vert \cdot \Vert $ to denote, respectively, the Frobenius inner product and norm on a space of operators; the meaning should always be clear from context.
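The random construction described at the start of this paragraph can be sketched as follows, in the real case. The orthogonalization is done here with a QR factorization, which is one reasonable choice; the text does not specify a method. The function name `random_fusion_frame` is hypothetical, and the sketch assumes $k_i \le d$.

```python
import numpy as np

def random_fusion_frame(d, ks, rng):
    """Return a list of k_i x d matrices A_i with orthonormal rows.

    Each A_i comes from a Gaussian matrix B_i by orthogonalizing its
    rows (here via QR of the transpose; assumes k_i <= d).
    """
    frames = []
    for k in ks:
        B = rng.standard_normal((k, d))
        Q, _ = np.linalg.qr(B.T)  # Q is d x k with orthonormal columns
        frames.append(Q.T)        # rows of A_i are orthonormal
    return frames

rng = np.random.default_rng(0)
frames = random_fusion_frame(4, (2, 1, 2), rng)

# Each P_i = A_i^T A_i is then a rank-k_i orthogonal projector, and the
# tuple is a fusion frame when the frame operator is positive definite.
S = sum(A.T @ A for A in frames)
is_frame = np.linalg.eigvalsh(S).min() > 1e-10
print(is_frame)
```

Positive definiteness of the frame operator is not forced by the construction, so the sketch checks it explicitly. For Gaussian inputs with $\sum _i k_i \ge d$ it holds with probability 1.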

Definition 1.10

Let $d, N, k_1, \dots , k_N$ be positive integers. The fusion frame potential ${\text {FFP}}:\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}\rightarrow \mathbb {R}$ is defined by

$$\begin{aligned} {\text {FFP}}(\varvec{A}):= \left\| S_{\varvec{A}}\right\| ^2. \end{aligned}$$
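A minimal numerical sketch of this definition follows, with hypothetical helper names `frame_operator` and `fusion_frame_potential`. It also compares the potential with the elementary Cauchy–Schwarz bound $\Vert S_{\varvec{A}}\Vert ^2 \ge ({\text {tr}}\, S_{\varvec{A}})^2/d = n^2/d$, where $n = \sum _i k_i$, which is attained exactly when $S_{\varvec{A}}$ is a multiple of the identity. The precise form of the bound proved in the paper is not shown in this excerpt, so treat the comparison as illustrative.

```python
import numpy as np

def frame_operator(frames):
    """S_A = sum_i A_i^* A_i for a list of k_i x d matrices."""
    return sum(A.conj().T @ A for A in frames)

def fusion_frame_potential(frames):
    """FFP(A) = squared Frobenius norm of the frame operator."""
    return np.linalg.norm(frame_operator(frames), "fro") ** 2

# Coordinate planes in R^3, encoded by matrices with orthonormal rows.
yz = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
xz = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
xy = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

d, n = 3, 6
lower_bound = n ** 2 / d                            # (tr S)^2 / d
ffp_tight = fusion_frame_potential([yz, xz, xy])    # S = 2 I
ffp_other = fusion_frame_potential([xy, xy, yz])    # S = diag(2, 3, 1)
print(lower_bound, ffp_tight, ffp_other)
```

The tight configuration attains the bound, while the second configuration, which is still a fusion frame, lies strictly above it.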

Note that the fusion frame potential could be generalized to arbitrary spaces of operator-valued frames, though we will not do so here. As Casazza and Fickus showed (see also Proposition 4.1), the fusion frame potential satisfies a Welch-type lower bound which is achieved exactly at the TFFs. Hence, when they exist, TFFs are exactly the global minima of ${\text {FFP}}$, and it is natural to ask whether we can get to these global minima by negative gradient flow:

Question 1.11

When they exist, can we construct TFFs by flowing along the negative gradient of ${\text {FFP}}$? That is, are all local minima of ${\text {FFP}}$ also global in this setting?

An affirmative answer to this question is essentially a conjecture of Massey, Ruiz, and Stojanoff [47, Conjecture 4.3.3]. For unit-norm tight frames, Benedetto and Fickus showed that there are no spurious local minima of the frame potential, completely resolving Question 1.11 in this case:

Theorem 1.12

(Benedetto–Fickus theorem [7]) Let d and N be positive and $\varvec{k}=(1,\dots , 1)$. Then ${\text {FFP}}: \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}\rightarrow \mathbb {R}$ has no spurious local minimizers.

In [49] we gave three new proofs of this result, one of which we now intend to generalize to fusion frames. The exposition here is self-contained, but parallels that in [49, §4].

To state our theorem, we need to define a suitable notion of genericity for fusion frames.

Definition 1.13

Let $\varvec{A} \in \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ and recall that, for each $i=1,\dots , N$, the image of the orthogonal projector $P_i = A_i^*A_i$ is a $k_i$-dimensional subspace $\mathcal {S}_i \subset \mathbb {K}^d$. We say that $\varvec{A}$ has property ${\mathscr {S}}$ if, for all proper linear subspaces $\mathcal {Q} \subset \mathbb {K}^d$,

$$\begin{aligned} \frac{1}{\dim \mathcal {Q}} \sum _{i=1}^N \dim (\mathcal {S}_i \cap \mathcal {Q}) \le \frac{1}{d} \sum _{i=1}^N k_i = \frac{n}{d}. \end{aligned}$$

Roughly speaking, this condition says that no subspace of $\mathbb {K}^d$ intersects too many of the $\mathcal {S}_i$. For example, in the classical frames case $\varvec{k}= (1, \dots , 1)$, property ${\mathscr {S}}$ says the fraction of frame vectors lying in any given r-dimensional subspace is no more than $\frac{r}{d}$. In particular, this is a much weaker condition than being full spark.
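To spell out the classical case as a worked instance, write $\mathcal {S}_i = {\text {span}}(f_i)$ for frame vectors $f_i$ and $r = \dim \mathcal {Q}$ (notation introduced here only for illustration). Then $\dim (\mathcal {S}_i \cap \mathcal {Q})$ equals 1 if $f_i \in \mathcal {Q}$ and 0 otherwise, and $n = N$, so the defining inequality becomes

$$\begin{aligned} \frac{1}{r}\, \#\{i : f_i \in \mathcal {Q}\} \le \frac{N}{d} \quad \Longleftrightarrow \quad \frac{\#\{i : f_i \in \mathcal {Q}\}}{N} \le \frac{r}{d}. \end{aligned}$$

For example, when $d = 2$ and $N = 3$, at most one frame vector may lie on any given line, since $\frac{1}{3} \le \frac{1}{2}$ but $\frac{2}{3} > \frac{1}{2}$.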

More generally, in the case of equal-rank fusion frames—i.e., each $P_i$ is rank k—property ${\mathscr {S}}$ is weaker than the condition $\sum _{i=1}^N \dim (\mathcal {S}_i \cap \mathcal {Q}) \le \dim \mathcal {Q}$ for all proper subspaces $\mathcal {Q} \subset \mathbb {K}^d$. Fusion frames satisfying this latter condition are relevant to the problem of compressed sensing with block sparsity [11, 30] since they provide unique reconstructions for the largest possible class of block-sparse signals, much as classical full spark frames are optimal for traditional compressed sensing [3, 27].

Theorem 1.14

Let $d, N, k_1, \dots , k_N$ be positive integers. Consider the negative gradient flow $\Gamma : \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}\times [0,\infty ) \rightarrow \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ defined by the differential equation

$$\begin{aligned} \Gamma (\varvec{A}_0,0) = \varvec{A}_0, \quad \frac{d}{dt}\Gamma (\varvec{A}_0,t) = - {\text {grad}}{\text {FFP}}(\Gamma (\varvec{A}_0,t)) \end{aligned}$$

where ${\text {grad}}$ is the Riemannian gradient and $\varvec{A}_0 = (A_1, \dots , A_N) \in \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ is an initial frame.

If $\varvec{A}_0$ has property ${\mathscr {S}}$, then $\varvec{A}_\infty := \lim _{t \rightarrow \infty } \Gamma (\varvec{A}_0,t) \in \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ is a global minimum of ${\text {FFP}}$.

As mentioned above, for parameters d and $\varvec{k}$ which admit TFFs, the TFFs are exactly the global minima of ${\text {FFP}}$, so for those parameters this theorem implies that fusion frames with property ${\mathscr {S}}$ always flow to tight fusion frames. Moreover, these are exactly the parameters for which $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ contains fusion frames with property ${\mathscr {S}}$ (Corollary 4.14), which turn out to be dense in $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ (Corollary 4.7). This implies that there are fusion frames arbitrarily close to any non-minimal critical point of ${\text {FFP}}$ which flow to a global minimum, so non-minimal critical points of ${\text {FFP}}$ cannot be basins of attraction of the gradient flow. In turn, ${\text {FFP}}$ is a polynomial defined on an analytic submanifold of Euclidean space, so it will have a Łojasiewicz exponent (cf. [10, Corollary 4.2]), and an argument analogous to [1, Theorem 3] shows that local minima must be basins of attraction. Hence, non-minimal critical points cannot be local minima.

Corollary 1.15

When $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ contains TFFs, all local minima of ${\text {FFP}}$ are global minima.

This generalizes Benedetto and Fickus’ result to fusion frames and completely answers Question 1.11. See also work of Heineken, Llarena, and Morillas [32], which gives a similar answer for a restriction of ${\text {FFP}}$ to a subset of $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$, and of Massey, Ruiz, and Stojanoff [47], who proved an analogous result with respect to a different notion of distance on fusion frames.

Even in some situations where there cannot be any TFFs—for example, when $d = N = 3$ and $\varvec{k}=(1,1,2)$—the negative gradient flow of ${\text {FFP}}$ empirically seems to avoid spurious local minima, so there is hope that the conclusion of Corollary 1.15 follows from weaker hypotheses.
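The following is a rough numerical sketch of how the flow in Theorem 1.14 might be discretized. It is not the authors' implementation and makes several assumptions: it works over $\mathbb {R}$ and treats each $A_i$ as a matrix with orthonormal rows. It uses the metric inherited from the ambient Euclidean space, for which the tangential gradient of ${\text {FFP}}$ with respect to $A_i$ works out to $4 A_i S_{\varvec{A}} (\mathbb {I}_d - P_i)$. Finally, it replaces the continuous flow with fixed-step descent followed by re-orthogonalization of rows. The step size, iteration count, and function names are illustrative choices, and a discrete scheme of this kind does not automatically inherit the guarantees of the theorem.

```python
import numpy as np

def orthonormalize_rows(B):
    """Orthonormalize the rows of B via QR of the transpose."""
    Q, _ = np.linalg.qr(B.T)
    return Q.T

def ffp(frames):
    """Fusion frame potential: squared Frobenius norm of sum_i A_i^T A_i."""
    S = sum(A.T @ A for A in frames)
    return np.linalg.norm(S, "fro") ** 2

def descend(frames, step=0.02, iters=3000):
    """Fixed-step projected gradient descent on FFP (illustrative only)."""
    d = frames[0].shape[1]
    eye = np.eye(d)
    for _ in range(iters):
        S = sum(A.T @ A for A in frames)
        # Euclidean gradient 4 A_i S, projected to the tangent space:
        # 4 A_i S (I - P_i), with P_i = A_i^T A_i.
        grads = [4.0 * A @ S @ (eye - A.T @ A) for A in frames]
        # Step against the gradient, then restore orthonormal rows.
        frames = [orthonormalize_rows(A - step * G)
                  for A, G in zip(frames, grads)]
    return frames

rng = np.random.default_rng(1)
d, ks = 3, (2, 2, 2)  # tight fusion frames exist here (coordinate planes)
start = [orthonormalize_rows(rng.standard_normal((k, d))) for k in ks]
end = descend(start)

n = sum(ks)
gap = ffp(end) - n ** 2 / d  # distance above the bound n^2 / d
print(ffp(start), ffp(end), gap)
```

The parameters $d = 3$ and $\varvec{k} = (2,2,2)$ are used because a tight fusion frame visibly exists there (the three coordinate planes), so the gap to $n^2/d = 12$ is a meaningful measure of progress.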

Remark 1.16

(Admissibility) In light of Corollary 1.15, it would be useful to know when the space of TFFs is non-empty. More generally, we can ask whether $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is non-empty, a question that logically precedes whether it is path-connected. In the context of classical frames (i.e., $\varvec{k}= (1, \dots , 1)$), this is sometimes called the admissibility problem, and its resolution follows easily from the Schur–Horn theorem [36, 56] (see also [4, 25] and Cahill, Fickus, Mixon, Poteet, and Strawn’s constructive solution [13]): in this setting, $\varvec{r}= (r_1, \dots , r_N)$ is a single list of positive numbers, and $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is nonempty precisely when $\varvec{\lambda }$ majorizes $\varvec{r}$ [46], meaning that

$$\begin{aligned} \sum _{i=1}^d \lambda _i = \sum _{i=1}^N r_i \qquad \text {and} \qquad \sum _{i=1}^k \lambda _i \ge \sum _{i=1}^k r_i \quad \text {for all } k=1,\dots , d, \end{aligned}$$

where we assume $\varvec{\lambda }$ and $\varvec{r}$ are sorted in decreasing order. In particular, the space $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\left( \frac{N}{d}, \dots , \frac{N}{d}\right) }(1, \dots , 1)$ of unit-norm tight frames is always nonempty when $N \ge d$.
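The majorization condition for classical frames is easy to test directly. The sketch below uses a hypothetical function `majorizes` with an ad hoc floating-point tolerance. It sorts both lists in decreasing order, compares the totals, and then checks the first $d$ partial sums. It returns `False` when $N < d$, an added convention reflecting the fact that fewer than $d$ vectors cannot span $\mathbb {K}^d$.

```python
from itertools import accumulate

def majorizes(lam, r, tol=1e-12):
    """Check whether lam (length d) majorizes r (length N >= d)."""
    lam = sorted(lam, reverse=True)
    r = sorted(r, reverse=True)
    d = len(lam)
    if len(r) < d:
        # Added convention: fewer than d vectors cannot form a frame.
        return False
    if abs(sum(lam) - sum(r)) > tol:
        # The totals must agree (trace of the frame operator).
        return False
    lam_partial = list(accumulate(lam))
    r_partial = list(accumulate(r))
    # Partial sums of lam dominate those of r for k = 1, ..., d.
    return all(lam_partial[k] >= r_partial[k] - tol for k in range(d))

# Unit-norm tight frames of N = 5 vectors in dimension d = 3.
print(majorizes([5 / 3] * 3, [1.0] * 5))
# A tight spectrum cannot accommodate one overly long vector.
print(majorizes([1.5, 1.5], [2.0, 1.0]))
```

The first call corresponds to the unit-norm tight frame case with $N \ge d$ mentioned above; the second fails at $k = 1$ because $1.5 < 2$.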

More generally, admissibility has been completely resolved for all spaces of tight fusion frames (therefore precisely determining the scope of Theorem 1.14 and Corollary 1.15): by Casazza, Fickus, Mixon, Wang, and Zhou [19] when all $P_i$ have the same rank, and in general by Bownik, Luoto, and Richmond [12]. Even for tight fusion frames, the conditions on d and $\varvec{k}$ which ensure non-emptiness are quite complicated, boiling down to non-vanishing of certain Littlewood–Richardson coefficients.

However, this is not really surprising, given that in general admissibility for operator-valued frames is equivalent to the following question: what $\varvec{\lambda }, \varvec{r}_1, \dots , \varvec{r}_N$ can be the eigenvalues of $d \times d$ Hermitian matrices $M, P_1, \dots , P_N$ so that $M=P_1 + \dots + P_N$? In 1962, Horn conjectured necessary and sufficient conditions involving a complicated system of inequalities between $\varvec{\lambda }$ and the $\varvec{r}_i$ [35]. This became known as the Horn conjecture, which was eventually proved in the late 1990s by Klyachko [41] and Knutson and Tao [42, 43]; see Fulton’s survey [31] for more, and Berenstein and Sjamaar’s paper [8] for a generalization. The answer does not depend on the base field: if $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is non-empty, then so is $\mathcal{O}\mathcal{F}^{\mathbb {R}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ (see, e.g., [31, Theorem 20]).

Thus, the admissibility problem for operator-valued frames is at least implicitly resolved by the proof of the Horn conjecture (see also [47]), and more explicit solutions exist in the most common cases of interest, namely frames and tight fusion frames.

2 Symplectic Machinery

As in our earlier work on frames [52], our strategy for proving Theorem 1.9 involves symplectic geometry. First, we will review some standard concepts from symplectic geometry, which will help both to provide a quick reference and to establish our notation and sign conventions. Our main references for symplectic geometry are McDuff and Salamon [48] and Cannas da Silva [17].

2.1 Definitions

A symplectic manifold is a pair $(M,\omega )$, where M is a (smooth, real) manifold and $\omega $ is a closed, nondegenerate 2-form on M. For each point $p \in M$ and for each pair of tangent vectors $X,Y \in T_pM$, we write $\omega _p(X,Y) \in \mathbb {R}$ for the evaluation of $\omega $ at the point p on the pair $(X,Y)$. Being closed means that $d\omega $ is identically zero, where d is the exterior derivative on M, and being nondegenerate means that, for each $p \in M$ and each nonzero $X \in T_pM$, there exists $Y \in T_pM$ so that $\omega _p(X,Y) \ne 0$. Nondegeneracy implies that a symplectic manifold must be even-dimensional over $\mathbb {R}$.

Example 2.1

The simplest example of a symplectic manifold is $\mathbb {C}^n \approx \mathbb {R}^{2n}$. For $p \in \mathbb {C}^n$ there is a natural isomorphism $T_p \mathbb {C}^n \approx \mathbb {C}^n$, and complex coordinates $(x_1 + \sqrt{-1} y_1, \dots , x_n + \sqrt{-1}y_n)$ for $\mathbb {C}^n$ correspond to real coordinates $(x_1, \dots , x_n, y_1, \dots , y_n)$, with respect to which the standard symplectic form on $\mathbb {C}^n$ is given by

$$\begin{aligned} \omega = dx_1 \wedge dy_1 + \dots + dx_n \wedge dy_n. \end{aligned}$$

We can rewrite this in complex coordinates as follows: given $p \in \mathbb {C}^n$ and $Z=(z_1, \dots , z_n), W = (w_1, \dots , w_n) \in T_p\mathbb {C}^n \approx \mathbb {C}^n$,

$$\begin{aligned} \omega _p(Z,W) = -{\text {Im}}({\overline{w}}_1 z_1 + \dots + {\overline{w}}_n z_n) = -{\text {Im}}(W^*Z) = -{\text {Im}}\langle Z, W \rangle . \end{aligned}$$

In fact, Darboux’s theorem [48, Theorem 3.2.2] implies that every point in a symplectic manifold has a neighborhood on which the symplectic structure looks like the standard one on $\mathbb {C}^n$.
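The agreement between the real-coordinate and complex-coordinate expressions for the standard symplectic form in Example 2.1 can be verified on a small example. In this sketch the function names `omega_real` and `omega_complex` are hypothetical, and tangent vectors are represented as lists of Python complex numbers.

```python
def omega_real(Z, W):
    """sum_j dx_j ^ dy_j evaluated on (Z, W), using real and imaginary parts."""
    return sum(z.real * w.imag - z.imag * w.real for z, w in zip(Z, W))

def omega_complex(Z, W):
    """-Im(W^* Z) = -Im(sum_j conj(w_j) z_j)."""
    return -sum(w.conjugate() * z for z, w in zip(Z, W)).imag

Z = [1 + 2j, 3 - 1j]
W = [0.5 - 1j, 2 + 4j]

print(omega_real(Z, W), omega_complex(Z, W))

# Pairing Z with iZ gives the squared norm of Z, which is nonzero
# whenever Z is nonzero: a direct view of nondegeneracy.
iZ = [1j * z for z in Z]
print(omega_real(Z, iZ))
```

Both expressions evaluate to the same number on this pair, and the pairing of $Z$ with $\sqrt{-1}Z$ returns $\Vert Z\Vert ^2$.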

Symplectic geometry grew out of Hamiltonian mechanics, and a big emphasis both historically and currently is on the interactions between symplectic structures and symmetries; that is, group actions on symplectic manifolds. In general, if G is a Lie group with Lie algebra ${\mathfrak {g}}$ and G acts on a manifold M, then each $\xi \in {\mathfrak {g}}$ determines a vector field $Y_\xi $ on M as follows. For $p \in M$ and $g \in G$, let $g \cdot p$ denote the action of g on p. Then we define

$$\begin{aligned} \left. Y_\xi \right| _p:= \left. \frac{d}{d\varepsilon } \right| _{\varepsilon =0} \exp (\varepsilon \xi ) \cdot p, \end{aligned}$$

where $\exp : {\mathfrak {g}} \rightarrow G$ is the exponential map of G.

Now, suppose G acts on a symplectic manifold $(M,\omega )$. If ${\mathfrak {g}}^*$ is the dual of ${\mathfrak {g}}$, a map $\Phi : M \rightarrow {\mathfrak {g}}^*$ is called a momentum map for the G-action if it satisfies the following conditions.

First, let $D\Phi (p): T_pM \rightarrow T_{\Phi (p)}{\mathfrak {g}}^*$ be the derivative of $\Phi $ at $p \in M$. Since ${\mathfrak {g}}^*$ is a vector space, there is a natural isomorphism $T_{\Phi (p)}{\mathfrak {g}}^*\approx {\mathfrak {g}}^*$, so we can interpret $D\Phi (p)$ as a map to ${\mathfrak {g}}^*$. Hence, for each $X \in T_p M$, $D\Phi (p)(X)$ is an element of the dual space of ${\mathfrak {g}}$; that is, $D\Phi (p)(X): {\mathfrak {g}} \rightarrow \mathbb {R}$. For $\Phi $ to be a momentum map we require this map to satisfy the compatibility condition

$$\begin{aligned} D\Phi (p)(X)(\xi ) = \omega _p(\left. Y_\xi \right| _p,X) \end{aligned}$$

for all $\xi \in {\mathfrak {g}}$.

Also, we require a momentum map $\Phi $ to be equivariant with respect to the given action of G on M and the natural coadjoint action of G on ${\mathfrak {g}}^*$. More explicitly, the adjoint action of G on ${\mathfrak {g}}$ is the linearization at the identity of the conjugation action of G on itself; that is, for each $g \in G$ the map ${\text {Ad}}_g: {\mathfrak {g}} \rightarrow {\mathfrak {g}}$ is the derivative at the identity of the map $h \mapsto g h g^{-1}$. In turn, the coadjoint action of G on ${\mathfrak {g}}^*$ gives a map ${\text {Ad}}_g^*: {\mathfrak {g}}^*\rightarrow {\mathfrak {g}}^*$ for each $g \in G$ which is defined by ${\text {Ad}}_g^*(\chi )(\xi ):= \chi ({\text {Ad}}_{g^{-1}}(\xi ))$. When G is a matrix group, both ${\text {Ad}}_g$ and ${\text {Ad}}_g^*$ can be interpreted as conjugation by g. Now, the equivariance condition on momentum maps is that, for each $g \in G$ and each $p \in M$,

$$\begin{aligned} {\text {Ad}}_g^*(\Phi (p)) = \Phi (g \cdot p). \end{aligned}$$

When a G-action admits a momentum map, the action is called Hamiltonian and the tuple $(M, \omega , G, \Phi )$ is a Hamiltonian G-space.

Proposition 2.2

(cf. [48, Exercise 5.3.16]) Suppose G is a Lie group and $(M_i,\omega _i,G,\Phi _i)$ are Hamiltonian G-spaces for $i=1,\dots ,n$. Then the diagonal action of G on $M_1 \times \dots \times M_n$ is Hamiltonian with momentum map

$$\begin{aligned} \Phi (p_1, \dots , p_n) = \Phi _1(p_1) + \dots + \Phi _n(p_n). \end{aligned}$$

Proof

The standard symplectic form on a product $\prod _{i=1}^n M_i$ of symplectic manifolds is $\pi _1^*\omega _1 + \dots + \pi _n^*\omega _n$, where ${\pi _k:\prod _{i=1}^n M_i\rightarrow M_k}$ is projection onto the kth factor and $\pi _k^*$ is the induced pullback operator on forms.

Any tangent vector $X \in T_{(p_1, \dots , p_n)} \prod _{i=1}^n M_i$ is of the form $X = (X_1, \dots , X_n)$ with $X_i \in T_{p_i}M_i$ for all $i=1, \dots , n$. In particular, if $\xi \in {\mathfrak {g}}$, then the associated vector field is

$$\begin{aligned} \left. Y_\xi \right| _{(p_1, \dots , p_n)} = (\left. (Y_1)_\xi \right| _{p_1}, \dots , \left. (Y_n)_\xi \right| _{p_n}), \end{aligned}$$

where each $\left. (Y_i)_\xi \right| _{p_i} = (d\pi _i)_{(p_1, \dots , p_n)} \left. Y_\xi \right| _{(p_1, \dots , p_n)}$ is the vector field on $M_i$ associated to $\xi $.

Let $\Phi :\prod _{i=1}^n M_i \rightarrow {\mathfrak {g}}^*$ be defined as in the statement of the proposition and let $X = (X_1, \dots , X_n) \in T_{(p_1, \dots , p_n)} \prod _{i=1}^n M_i$. Then

$$\begin{aligned} D\Phi (p_1,\dots ,p_n)(X_1,\dots , X_n)(\xi )&= \sum _{i=1}^n D\Phi _i(p_i)(X_i)(\xi ) = \sum _{i=1}^n (\omega _i)_{p_i}(\left. (Y_i)_\xi \right| _{p_i},X_i)\\&\quad =\sum _{i=1}^n(\omega _i)_{p_i}((d\pi _i)_{(p_1, \dots , p_n)} \left. Y_\xi \right| _{(p_1, \dots , p_n)},X_i)\\&\quad = \sum _{i=1}^n(\pi _i^*\omega _i)_{(p_1, \dots , p_n)}(\left. Y_\xi \right| _{(p_1, \dots , p_n)},(X_1,\dots , X_n))\\&\quad = \omega _{(p_1,\dots ,p_n)}(\left. Y_\xi \right| _{(p_1, \dots ,p_n)},(X_1,\dots , X_n)), \end{aligned}$$

where we’ve used linearity in various places. So $\Phi $ satisfies the appropriate compatibility condition with $\omega $.

Moreover, if $g \in G$ and $(p_1, \dots , p_n) \in \prod _{i=1}^n M_i$, then

$$\begin{aligned} {\text {Ad}}_g^*(\Phi (p_1, \dots , p_n))= & {} {\text {Ad}}_g^*\left( \sum _{i=1}^n \Phi _i(p_i)\right) = \sum _{i=1}^n {\text {Ad}}_g^*(\Phi _i(p_i))\\= & {} \sum _{i=1}^n \Phi _i(g \cdot p_i) = \Phi (g \cdot p_1, \dots , g \cdot p_n) = \Phi (g \cdot (p_1, \dots , p_n)) \end{aligned}$$

using the linearity of ${\text {Ad}}_g^*$ and the G-equivariance of the $\Phi _i$, so $\Phi $ is G-equivariant, and hence is a momentum map for the G-action. $\square $

2.2 Coadjoint Orbits

An important class of Hamiltonian spaces are coadjoint orbits, which we now describe in some detail, loosely following [48, Example 5.3.11].

Let G be a Lie group with Lie algebra ${\mathfrak {g}}$ and dual Lie algebra ${\mathfrak {g}}^*$. Let $\chi \in {\mathfrak {g}}^*$ and let $\mathcal {O}_\chi $ be the coadjoint orbit through $\chi $; that is,

$$\begin{aligned} \mathcal {O}_\chi := \left\{ {\text {Ad}}_g^*(\chi ) \,|\, g \in G \right\} . \end{aligned}$$

It is a standard fact that $\mathcal {O}_\chi $ has a natural symplectic form called the Kirillov–Kostant–Souriau (KKS) form [5, II.3.c], denoted $\omega ^{\text {KKS}}$, defined as follows. The tangent space to $\mathcal {O}_\chi $ at $\chi $ consists of vectors of the form ${\text {ad}}_\xi ^*\chi $, where $\xi \in {\mathfrak {g}}$ and ${\text {ad}}_\xi ^*$ is the coadjoint representation of ${\mathfrak {g}}$ on ${\mathfrak {g}}^*$; that is, the derivative of the coadjoint representation ${\text {Ad}}^*:G \rightarrow {\text {Aut}}({\mathfrak {g}}^*)$ at the identity. Then

$$\begin{aligned} \omega ^{\text {KKS}}_\chi ({\text {ad}}_\xi ^*(\chi ),{\text {ad}}_{\zeta }^*(\chi )):= \chi ([\xi ,\zeta ]), \end{aligned}$$

where $[\cdot ,\cdot ]$ is the Lie bracket on ${\mathfrak {g}}$.

The action of G on $\mathcal {O}_\chi $ is Hamiltonian, with momentum map $\Phi : \mathcal {O}_\chi \rightarrow {\mathfrak {g}}^*$ simply being the inclusion map. To see this, first notice that the vector field $Y_\xi $ on $\mathcal {O}_\chi $ induced by $\xi \in {\mathfrak {g}}$ is

$$\begin{aligned} \left. Y_\xi \right| _\chi = \left. \frac{d}{d\varepsilon }\right| _{\varepsilon =0} {\text {Ad}}_{\exp \varepsilon \xi }^*(\chi ) = {\text {ad}}_\xi ^*\chi . \end{aligned}$$

If $\Phi $ is the inclusion of $\mathcal {O}_\chi $ into ${\mathfrak {g}}^*$, then its derivative $D\Phi (\chi ): T_\chi \mathcal {O}_\chi \rightarrow {\mathfrak {g}}^*$ is also an inclusion, so

$$\begin{aligned} D\Phi (\chi )({\text {ad}}_\zeta ^*(\chi ))(\xi )= & {} {\text {ad}}_\zeta ^*(\chi )(\xi ) = \chi (-{\text {ad}}_\zeta (\xi )) = \chi (-[\zeta ,\xi ]) = \chi ([\xi ,\zeta ]) \\= & {} \omega ^{\text {KKS}}({\text {ad}}_\xi ^*(\chi ),{\text {ad}}_\zeta ^*(\chi )) = \omega ^{\text {KKS}}(\left. Y_\xi \right| _\chi ,{\text {ad}}_\zeta ^*(\chi )). \end{aligned}$$

Since $\Phi $ is obviously equivariant, this shows that it is a momentum map for the G-action on $\mathcal {O}_\chi $.

Combining the above discussion with Proposition 2.2 yields the following corollary:

Corollary 2.3

Let G be a Lie group with dual Lie algebra ${\mathfrak {g}}^*$. Let $\mathcal {O}_1, \dots , \mathcal {O}_n \subset {\mathfrak {g}}^*$ be coadjoint orbits with their KKS forms $\omega _i^{\text {KKS}}$. Then $\left( \prod _{i=1}^n\mathcal {O}_i, \sum _{i=1}^n \pi _i^*\omega _i^{\text {KKS}}\right) $ is symplectic and the diagonal coadjoint action of G is Hamiltonian with momentum map

$$\begin{aligned} \Phi (\chi _1, \dots , \chi _n) = \chi _1 + \dots + \chi _n, \end{aligned}$$

where on the right hand side we have identified each $\chi _i \in \mathcal {O}_i$ with its inclusion into ${\mathfrak {g}}^*$.

2.3 Level Sets of Momentum Maps

We will shortly identify the operators $P_i$ with fixed spectrum with points on a coadjoint orbit of ${\text {U}}(d)$. Fixing the spectra of $P_1, \dots , P_N$ then corresponds to choosing coadjoint orbits of ${\text {U}}(d)$ and taking their Cartesian product, and by Corollary 2.3 the momentum map of the diagonal ${\text {U}}(d)$ action is the frame operator. This will all be explained in the next section; the point is that the tight operator-valued frames whose $P_i$ have given spectra form precisely a level set of the momentum map, so the frame homotopy problem in this case boils down to showing that this level set is connected. Fortunately for us, there are powerful results showing that level sets of momentum maps are usually connected.

While one can prove somewhat more general results, we will only consider Hamiltonian G-spaces $(M,\omega ,G,\Phi )$ where both M and G are compact. Since we intend to apply these results to the diagonal action of ${\text {U}}(d)$ on a product of its coadjoint orbits, this will suffice for our purposes.

Fix a G-invariant inner product on the Lie algebra ${\mathfrak {g}}$. This induces a vector space isomorphism of ${\mathfrak {g}}^*$ with ${\mathfrak {g}}$, and hence determines an inner product and norm on the codomain ${\mathfrak {g}}^*$ of the momentum map $\Phi $. Kirwan [39] showed that, while it is not quite a Morse–Bott function, the norm-squared map $\Vert \Phi \Vert ^2$ has many of the desirable properties of such functions.

Theorem 2.4

(Kirwan [39, Theorem 4.16]) The set of critical points for $\Vert \Phi \Vert ^2$ is a finite disjoint union of closed subsets $\{C_\beta : \beta \in \mathcal {B}\}$ on each of which $\Vert \Phi \Vert ^2$ takes a constant value (here the indexing set $\mathcal {B}$ is a finite subset of the positive Weyl chamber ${\mathfrak {t}}_+$ of ${\mathfrak {g}} \simeq {\mathfrak {g}}^*$).

Moreover, there is a smooth stratification $\{S_\beta : \beta \in \mathcal {B}\}$ of M, where $p \in S_\beta $ if and only if the limit set of its image under the flow of $-{\text {grad}}\Vert \Phi \Vert ^2$ (for an appropriate choice of G-invariant Riemannian metric on M) is contained in $C_\beta $.

Finally, for each $\beta $ the inclusion $C_\beta \hookrightarrow S_\beta $ is an equivalence of Čech cohomology and also G-equivariant cohomology.

Moreover, Kirwan showed [39, 4.18] that the $S_\beta $ are all locally closed, even-dimensional submanifolds of M. Since it is impossible to disconnect a manifold by removing submanifolds of codimension $\ge 2$, each stratum must be connected. In particular, the stratum $S_0$ of points which flow to $\Phi ^{-1}(0)$ is connected; by the equivalence of Čech cohomology, the level set $\Phi ^{-1}(0) = C_0$ is also connected:

Theorem 2.5

(Kirwan [40, (3.1)]; see also [57, Remark 5.8]) Let $(M,\omega ,G,\Phi )$ be a Hamiltonian G-space with M and G compact. Then $\Phi ^{-1}(0)$ is connected.

In general, we will be interested not just in $\Phi ^{-1}(0)$, but also in $\Phi ^{-1}(\mathcal {O})$, where $\mathcal {O} \subset {\mathfrak {g}}^*$ is a coadjoint orbit. Fortunately, the “shifting trick” will allow us to easily translate Theorem 2.5 to this more general setting:

Corollary 2.6

Let $(M,\omega ,G,\Phi )$ be a Hamiltonian G-space with M and G compact and let $\mathcal {O} \subset {\mathfrak {g}}^*$ be a coadjoint orbit. Then $\Phi ^{-1}(\mathcal {O})$ is connected.

Proof

The goal is to build a new Hamiltonian G-space $({\overline{M}},{\overline{\omega }},G,{\overline{\Phi }})$ so that ${\overline{\Phi }}^{-1}(0) \approx \Phi ^{-1}(\mathcal {O})$. This is exactly what the shifting trick (see [17, §24.4] or [48, Proof of Proposition 5.4.15]) was designed to do.

Specifically, we know that $(\mathcal {O},\omega ^{\text {KKS}})$ is symplectic, and hence so is $(\mathcal {O},-\omega ^{\text {KKS}})$. Let ${\overline{M}} = M \times \mathcal {O}$ with symplectic form ${\overline{\omega }} = \pi _1^*\omega + \pi _2^*(-\omega ^{\text {KKS}})$, where as usual $\pi _i$ is projection onto the ith factor. Let the G action on ${\overline{M}}$ be the diagonal action. Then Proposition 2.2 and the discussion in Sect. 2.2 imply that

$$\begin{aligned} {\overline{\Phi }}(p,\chi ):= \Phi (p) - \chi \end{aligned}$$

is a momentum map for this action.

Now,

$$\begin{aligned} {\overline{\Phi }}^{-1}(0) = \{(p,\Phi (p))\,|\,p \in M, \Phi (p) \in \mathcal {O}\} \end{aligned}$$

is connected by Theorem 2.5. Since this space is certainly homeomorphic to $\Phi ^{-1}(\mathcal {O})$, we conclude that $\Phi ^{-1}(\mathcal {O})$ is also connected. $\square $

3 The Symplectic Geometry of Spaces of Operator-Valued Frames

We now relate the general machinery of the previous section to operator-valued frames. Throughout this section we will fix positive integers $d,N,k_1, \dots , k_N$. We will also fix $\varvec{r}= (\varvec{r}_1,\dots , \varvec{r}_N)$, where $\varvec{r}_i = (r_{i1},\dots , r_{ik_i})$ with $r_{i1} \ge \dots \ge r_{ik_i} > 0$.

Given this data, consider the space $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r})$ of all operator-valued frames $(A_1, \dots , A_N)$, where $A_i: \mathbb {C}^d \rightarrow \mathbb {C}^{k_i}$ is linear and $P_i:= A_i^*A_i$ has spectrum $\varvec{r}_i$. The space $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r})$ is not obviously symplectic, but our first goal is to show that the quotient $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r})/({\text {U}}(k_1) \times \dots \times {\text {U}}(k_N))$ is symplectic, and in fact is essentially a product of coadjoint orbits of ${\text {U}}(d)$.

To start, we recall that the space $\mathcal {H}(d)$ of $d \times d$ Hermitian matrices can be identified with the dual ${\mathfrak {u}}(d)^*$ of the Lie algebra of ${\text {U}}(d)$ by the isomorphism

$$\begin{aligned} \alpha : \mathcal {H}(d)&\rightarrow {\mathfrak {u}}(d)^*\\ \xi&\mapsto \left( \eta \mapsto \frac{\sqrt{-1}}{2} {\text {tr}}(\eta ^*\xi ) = \left\langle \frac{\sqrt{-1}}{2}\xi , \eta \right\rangle =: \alpha _\xi (\eta )\right) . \end{aligned}$$

We collect relevant lemmas from our previous paper [52, §2.2.1 and §2.2.2] in the following proposition:

Proposition 3.1

Under this identification, the coadjoint action of ${\text {U}}(d)$ on ${\mathfrak {u}}(d)^*$ corresponds to the conjugation action of ${\text {U}}(d)$ on $\mathcal {H}(d)$, and hence coadjoint orbits of ${\text {U}}(d)$ can be identified with collections of Hermitian matrices with fixed spectrum $\varvec{\mu }$, which we will denote $\mathcal {O}_{\varvec{\mu }} \subset \mathcal {H}(d)$.

For each $i=1, \dots , N$, the collection of $P_i \in \mathcal {H}(d)$ with spectrum $\varvec{r}_i$ is exactly $\mathcal {O}_{\varvec{r}_i}$, and so

$$\begin{aligned} \mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r}):= & {} \{(P_1, \dots , P_N) \in \mathcal {H}(d)^N \,|\, {\text {spec}}(P_1) = \varvec{r}_1,\dots , {\text {spec}}(P_N) = \varvec{r}_N\}\nonumber \\= & {} \mathcal {O}_{\varvec{r}_1} \times \dots \times \mathcal {O}_{\varvec{r}_N} \end{aligned}$$

(1)

is a product of coadjoint orbits. Corollary 2.3 now has the following immediate consequence:

Corollary 3.2

The momentum map $\Phi $ for the diagonal conjugation action of ${\text {U}}(d)$ on $\mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r})$ is precisely the frame operator:

$$\begin{aligned} \Phi (P_1, \dots , P_N) = P_1 + \dots + P_N. \end{aligned}$$
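The equivariance property required of a momentum map can be checked numerically for the frame operator. The sketch below (Python with NumPy) builds Hermitian matrices with prescribed nonzero spectra, conjugates all of them by one random unitary matrix, and confirms that the frame operator is conjugated in the same way. The helper names `random_unitary`, `random_hermitian_with_spectrum` and `frame_operator`, as well as the particular spectra, are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(d):
    """Unitary matrix from the QR factorization of a complex Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

def random_hermitian_with_spectrum(spectrum, d):
    """Hermitian d x d matrix whose nonzero eigenvalues are `spectrum`."""
    U = random_unitary(d)
    eigs = np.zeros(d)
    eigs[: len(spectrum)] = spectrum
    return U @ np.diag(eigs) @ U.conj().T

def frame_operator(Ps):
    """Phi(P_1, ..., P_N) = P_1 + ... + P_N."""
    return sum(Ps)

d = 4
spectra = [(2.0, 1.0), (3.0,), (1.0, 1.0, 0.5)]  # illustrative r_1, r_2, r_3
P = [random_hermitian_with_spectrum(r, d) for r in spectra]

g = random_unitary(d)
gP = [g @ Pi @ g.conj().T for Pi in P]  # diagonal conjugation action

# Equivariance: Phi(g . P) = g Phi(P) g^*.
assert np.allclose(frame_operator(gP), g @ frame_operator(P) @ g.conj().T)

# Conjugation keeps each P_i on its orbit: the spectrum is unchanged.
for Pi, gPi in zip(P, gP):
    assert np.allclose(np.linalg.eigvalsh(Pi), np.linalg.eigvalsh(gPi))
```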

Fix $\varvec{\lambda }= (\lambda _1, \dots , \lambda _d)$ with $\lambda _1 \ge \dots \ge \lambda _d > 0$. Define

$$\begin{aligned} \mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r}):= & {} \{(P_1, \dots , P_N) \in \mathcal {H}(d)^N \,|\, {\text {spec}}(P_1) = \varvec{r}_1,\dots ,\nonumber \\{} & {} {\text {spec}}(P_N) = \varvec{r}_N, {\text {spec}}(P_1 + \dots + P_N) = \varvec{\lambda }\}. \end{aligned}$$

(2)

In other words, this is the collection of $P_i$ with both the correct individual spectra and the correct spectrum of the frame operator. Then Corollaries 3.2 and 2.6 imply:

Proposition 3.3

$\mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r}) = \Phi ^{-1}(\mathcal {O}_{\varvec{\lambda }}) \subset \mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r})$ is connected.

Note that, if we think of fusion frames in the usual way as a collection of subspaces or, equivalently, orthogonal projectors, this already gives an affirmative answer to Question 1.1 in the case $\mathbb {K}= \mathbb {C}$.

In the more general operator-valued frame case, the question is how $\mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ relates to $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$, which is the space we want to show is connected. As previously discussed, while $A_i: \mathbb {C}^d \rightarrow \mathbb {C}^{k_i}$ uniquely determines $P_i = A_i^*A_i$, the operator $A_i$ cannot be uniquely determined from $P_i$. Indeed, if $U \in {\text {U}}(k_i)$, then

$$\begin{aligned} (UA_i)^*(UA_i) = A_i^*U^*U A_i = A_i^*A_i = P_i, \end{aligned}$$

so composing with a unitary transformation of the codomain leaves $P_i$ invariant. Proposition 1.2 says that this is the only indeterminacy, and hence the set of $P_i$ with given spectrum is precisely the collection of cosets of the (left) unitary action on the set of $A_i$ with given singular values:

$$\begin{aligned} \{P_i: \mathbb {C}^d \rightarrow \mathbb {C}^d \,|\, {\text {spec}}(P_i) = \varvec{r}_i\} \approx \{A_i:\mathbb {C}^d \rightarrow \mathbb {C}^{k_i} \,|\, {\text {spec}}(A_i^*A_i) = \varvec{r}_i\}/{\text {U}}(k_i). \end{aligned}$$

In turn, this implies that

$$\begin{aligned} \mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r}) \approx \mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}(\varvec{r})/({\text {U}}(k_1) \times \dots \times {\text {U}}(k_N)), \end{aligned}$$

(3)

and hence that

$$\begin{aligned} \mathcal{P}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r}) \approx \mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})/({\text {U}}(k_1) \times \dots \times {\text {U}}(k_N)). \end{aligned}$$

Here, we are using the fact that the operations of taking a level set and taking a quotient commute with one another. That is, suppose that X is a topological space with an action by a group G and $\Psi :X \rightarrow Y$ is a G-invariant map, so that the induced quotient map ${\widetilde{\Psi }}:X/G \rightarrow Y$ taking an equivalence class [x] to $\Psi (x)$ is well defined; then ${\widetilde{\Psi }}^{-1}(y) = \Psi ^{-1}(y)/G$ for any $y \in Y$. Indeed, $[x] \in {\widetilde{\Psi }}^{-1}(y)$ if and only if $x \in \Psi ^{-1}(y)$, or equivalently $[x] \in \Psi ^{-1}(y)/G$.
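The unitary indeterminacy $(UA_i)^*(UA_i) = A_i^*A_i$ that underlies these quotients can be illustrated numerically. The following sketch (Python with NumPy; the dimensions and variable names are arbitrary illustrative choices) checks that left multiplication by a unitary matrix leaves $P_i$ unchanged and that the nonzero spectrum of $P_i$ consists of the squared singular values of $A_i$.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 5, 3

# A generic linear map A : C^d -> C^k and a random unitary U in U(k).
A = rng.normal(size=(k, d)) + 1j * rng.normal(size=(k, d))
U, _ = np.linalg.qr(rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k)))

P = A.conj().T @ A
P_rotated = (U @ A).conj().T @ (U @ A)

# Composing with a unitary transformation of the codomain leaves P invariant.
assert np.allclose(P, P_rotated)

# The nonzero eigenvalues of P are the squared singular values of A.
squared_singular_values = np.sort(np.linalg.svd(A, compute_uv=False) ** 2)
top_eigenvalues = np.sort(np.linalg.eigvalsh(P))[-k:]
assert np.allclose(squared_singular_values, top_eigenvalues)
```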

Therefore, the following lemma (with $X = \mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ and $G = {\text {U}}(k_1) \times \dots \times {\text {U}}(k_N)$) combined with Proposition 3.3 implies that $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is connected. Since $\mathcal{O}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is a real algebraic set in $\mathbb {C}^{k_1\times d} \times \dots \times \mathbb {C}^{k_N \times d}\approx \mathbb {R}^{2(k_1 + \dots + k_N)d}$, it is locally path-connected, so that connectivity implies path-connectivity, completing the proof of Theorem 1.9.

Lemma 3.4

Let X be a topological space and let G be a connected topological group acting continuously on X. If X/G is connected, then X is connected.

The lemma follows from standard point-set topology arguments (see, e.g., [45, Exercise 5.5]), since connectedness of G implies the fibers of the quotient map are connected.

4 Tightening Fusion Frames

We now specialize to fusion frames, but relax our assumption on the base field, so that $\mathbb {K}= \mathbb {R}$ or $\mathbb {C}$. Recall the definition of the fusion frame potential ${\text {FFP}}: \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}\rightarrow \mathbb {R}$:

$$\begin{aligned} {\text {FFP}}(\varvec{A}):= \left\| S_{\varvec{A}}\right\| ^2. \end{aligned}$$

One of Casazza and Fickus’ first results about the fusion frame potential was a Welch-type lower bound:

Proposition 4.1

([18, Proposition 1]) Let $d,N,k_1, \dots , k_N$ be positive integers and define $n:= k_1 + \cdots + k_N$. Then for $\varvec{A} = (A_1, \dots , A_N) \in \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$,

$$\begin{aligned} {\text {FFP}}(\varvec{A}) \ge \frac{1}{d}\left( \sum _{i=1}^N k_i\right) ^2 = \frac{n^2}{d} \end{aligned}$$

with equality if and only if $\varvec{A}$ is a tight fusion frame.

Let $\Lambda = \frac{n}{d}$ and define $\varvec{\Lambda }=(\Lambda , \dots , \Lambda )$. If they exist, the TFFs must have frame operator $\Lambda \mathbb {I}_d$, so that $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}_{\varvec{\Lambda }} \subset \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ is exactly the collection of TFFs. Hence, Proposition 4.1 says that if this collection of TFFs is nonempty then it is exactly the set of global minimizers of ${\text {FFP}}$ in $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$.
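The lower bound of Proposition 4.1 can be observed on examples. The sketch below (Python with NumPy) evaluates the fusion frame potential, using the Frobenius norm, on a randomly generated fusion frame and on a simple tight one built from coordinate subspaces. The helpers `random_fusion_frame` and `ffp` and the chosen dimensions are illustrative assumptions, not objects defined in the text.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_fusion_frame(d, ks):
    """List of k_i x d matrices with orthonormal rows (requires k_i <= d)."""
    frames = []
    for k in ks:
        Q, _ = np.linalg.qr(rng.normal(size=(d, k)) + 1j * rng.normal(size=(d, k)))
        frames.append(Q.conj().T)  # rows of Q^* are orthonormal
    return frames

def ffp(frames):
    """Fusion frame potential: squared Frobenius norm of the frame operator."""
    S = sum(A.conj().T @ A for A in frames)
    return np.linalg.norm(S, "fro") ** 2

d, ks = 4, (2, 1, 3)
n = sum(ks)
A = random_fusion_frame(d, ks)

# Welch-type bound: FFP(A) >= n^2 / d.
assert ffp(A) >= n**2 / d - 1e-9

# A tight example: two complementary coordinate planes in C^4, so that
# the frame operator is the identity and n = d = 4.
I = np.eye(4)
tight = [I[:2], I[2:]]
assert np.isclose(ffp(tight), 4**2 / 4)
```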

The hard work of proving Theorem 1.14 is in proving the $\mathbb {K}= \mathbb {C}$ case. The real case then follows immediately since $\mathcal{F}\mathcal{F}^{\mathbb {R}^d,\varvec{k}}\subset \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$ is invariant under the gradient flow of ${\text {FFP}}$. We record this observation in the following proposition:

Proposition 4.2

For given d and $\varvec{k}$, if Theorem 1.14 is true for $\mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$, then it is also true for $\mathcal{F}\mathcal{F}^{\mathbb {R}^d,\varvec{k}}$.

Except where explicitly pointed out below, we will assume $\mathbb {K}=\mathbb {C}$ in what follows. The strategy for proving the complex case of Theorem 1.14 is to show that property ${\mathscr {S}}$ (Definition 1.13) satisfies the following conditions:

(i) gradient flow (and its limit) preserves ${\mathscr {S}}$, but

(ii) no non-minimizing critical point of the fusion frame potential satisfies ${\mathscr {S}}$.

Our argument has roots in Mumford’s geometric invariant theory (GIT) [51] (see [58] for a nice introduction), which we introduce in some generality before specializing.

Let G be a reductive algebraic group that acts linearly on a finite-dimensional complex vector space V. For example, G might be GL(V) or SL(V). A nonzero vector $v \in V$ is unstable under the action of G if the closure $\overline{G \cdot v}$ of the G-orbit of v contains the origin; otherwise v is semi-stable. Notice that the unstable points are precisely those in the vanishing locus of every non-constant G-invariant homogeneous polynomial on V, and hence the semi-stable points are those on which some non-constant G-invariant homogeneous polynomial does not vanish.

As one might expect, semi-stability is a feature of the orbit of v: either the entire orbit consists of semi-stable points, or the entire orbit consists of unstable points.

Proposition 4.3

(see, e.g., [49, Proposition 6]) Given a nonzero $v \in V$ that is semi-stable, every point in $\overline{G \cdot v}$ is also semi-stable.
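As a minimal illustration of these definitions (a standard textbook example, not one of the actions used in this paper), let $G = \mathbb {C}^\times = {\text {GL}}(1)$ act on $V = \mathbb {C}^2$ by $t \cdot (x,y):= (tx, t^{-1}y)$. The invariant polynomials are exactly the polynomials in the product $xy$, and

$$\begin{aligned} \lim _{t \rightarrow 0} t \cdot (x,0) = (0,0), \qquad \lim _{t \rightarrow \infty } t \cdot (0,y) = (0,0), \qquad G \cdot (x,y) = \{(x',y') \,|\, x'y' = xy\} \text { when } xy \ne 0. \end{aligned}$$

Hence the nonzero points on the coordinate axes are unstable, and they are exactly the nonzero points at which the invariant $xy$ vanishes. A point with $xy \ne 0$ has a closed orbit (a hyperbola) which misses the origin, so it is semi-stable, as is every point in its orbit closure.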

4.1 V and the ${\text {SL}}(d)$ Action

In our application of GIT, we will have $G = {\text {SL}}(d)$. To determine the appropriate vector space V, we first recall the Plücker embedding ${\text {Gr}}_{k}(\mathbb {C}^{d}) \rightarrow \mathbb {P}(\bigwedge ^k \mathbb {C}^d)$, defined on the Grassmannian ${\text {Gr}}_{k}(\mathbb {C}^{d})$ of k-dimensional linear subspaces of $\mathbb {C}^d$. We can represent a k-plane by any basis $v_1, \dots , v_k$ for it. Then the Plücker embedding is defined to be the projectivization of the map $\tau : (\mathbb {C}^d)^k \rightarrow \bigwedge ^k \mathbb {C}^d$ defined by

$$\begin{aligned} \tau (v_1, \dots , v_k):= v_1 \wedge \dots \wedge v_k. \end{aligned}$$

When $(u_1, \dots , u_k)$ and $(v_1, \dots , v_k)$ span the same k-dimensional subspace, then $\tau (u_1, \dots , u_k) = \det (h) \tau (v_1, \dots , v_k)$ where $h \in {\text {GL}}(k)$ is the change-of-basis matrix, so both map to the same point in projective space, and the Plücker embedding is well-defined on the Grassmannian. Of course, the standard action of ${\text {SL}}(d)$ on $\mathbb {C}^d$ induces an action on $\bigwedge ^k \mathbb {C}^d$ by

$$\begin{aligned} g \cdot (v_1 \wedge \dots \wedge v_k):= (g v_1) \wedge \dots \wedge (gv_k) \end{aligned}$$

and extending linearly.
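In coordinates, $v_1 \wedge \dots \wedge v_k$ is recorded by the $k \times k$ minors of the $d \times k$ matrix with columns $v_1, \dots , v_k$. The following sketch (Python with NumPy) computes these Plücker coordinates and checks the change-of-basis rule $\tau (u_1, \dots , u_k) = \det (h) \tau (v_1, \dots , v_k)$ on a random example. The function name `plucker` and the ordering of the minors by increasing row subsets are conventions assumed here for illustration.

```python
import numpy as np
from itertools import combinations

def plucker(V):
    """Pluecker coordinates of the column span of the d x k matrix V:
    all k x k minors, indexed by increasing subsets of k rows."""
    d, k = V.shape
    return np.array(
        [np.linalg.det(V[list(rows), :]) for rows in combinations(range(d), k)]
    )

rng = np.random.default_rng(4)
d, k = 4, 2
V = rng.normal(size=(d, k)) + 1j * rng.normal(size=(d, k))
h = rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k))

# The columns of V @ h form another basis of the same subspace
# (h is invertible with probability one).
U_basis = V @ h

# The two wedge products differ by the scalar det(h), so they define
# the same point of projective space.
assert np.allclose(plucker(U_basis), np.linalg.det(h) * plucker(V))
```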

How do we get from fusion frames to Grassmannians, and hence to such a representation of ${\text {SL}}(d)$?

For $\varvec{A} \in \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$, each $P_i = A_i^*A_i$ is a rank-$k_i$ orthogonal projector and the rows $a_{i1}, \dots , a_{ik_i}$ of $A_i$ give an orthonormal basis for the $k_i$-dimensional subspace which is the image of $P_i$. Moreover, $\mathcal {O}_{\varvec{r}_i}$ is the collection of all rank-$k_i$ orthogonal projectors, which is symplectomorphic to the Grassmannian ${\text {Gr}}_{k_i}(\mathbb {C}^{d})$.

Define $\tau _i: A_i \mapsto a_{i1}^*\wedge \dots \wedge a_{ik_i}^*$, the projectivization of which is exactly the Plücker embedding of ${\text {Gr}}_{k_i}(\mathbb {C}^{d})$, and $\tau _i$ is equivariant with respect to the right ${\text {SL}}(d)$ action $g \cdot A_i:= A_i g^*$ on the domain and the ${\text {SL}}(d)$ action described above on the codomain:

$$\begin{aligned} \tau _i(g \cdot A_i) = \tau _i(A_i g^*) = (g a_{i1}^*) \wedge \dots \wedge (g a_{ik_i}^*) = g \cdot \tau _i(A_i) \end{aligned}$$

for any $g \in {\text {SL}}(d)$.

Next, define $\tau : \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}\rightarrow \bigwedge ^{k_1} \mathbb {C}^d \otimes \dots \otimes \bigwedge ^{k_N} \mathbb {C}^d$ by $\tau := \tau _1 \otimes \dots \otimes \tau _N$, so that

$$\begin{aligned} \tau (\varvec{A}) = \tau _1(A_1) \otimes \dots \otimes \tau _N(A_N) = \left( a_{11}^*\wedge \dots \wedge a_{1k_1}^*\right) \otimes \dots \otimes \left( a_{N1}^*\wedge \dots \wedge a_{Nk_N}^*\right) . \end{aligned}$$

In other words, the projectivization of $\tau $ is the Segre embedding of the product of Plücker embeddings of the individual factors.

Finally, then, our vector space is $V = \bigwedge ^{k_1} \mathbb {C}^d \otimes \dots \otimes \bigwedge ^{k_N} \mathbb {C}^d$, on which $G = {\text {SL}}(d)$ acts by

$$\begin{aligned}&g \cdot \left( (v_{11} \wedge \dots \wedge v_{1k_1}) \otimes \dots \otimes (v_{N1} \wedge \dots \wedge v_{Nk_N})\right) \\&\quad := \left( (g v_{11}) \wedge \dots \wedge (g v_{1k_1})\right) \otimes \dots \otimes \left( (gv_{N1}) \wedge \dots \wedge (gv_{Nk_N})\right) \end{aligned}$$

and extending linearly.

The point of defining property ${\mathscr {S}}$ as we have is the following theorem of Mumford (stated originally in terms of Grassmannians):

Theorem 4.4

(Mumford [51, Proposition 4.3]; see also [50] and [39, §16.3]) $\varvec{A}$ has property ${\mathscr {S}}$ if and only if $\tau (\varvec{A})$ is semi-stable with respect to the ${\text {SL}}(d)$ action.

As pointed out just before Proposition 4.3, the semi-stable points in V are those on which some G-invariant homogeneous polynomial does not vanish. Hence, Theorem 4.4 implies that if $\varvec{A} \in \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$ has property ${\mathscr {S}}$, then there is some G-invariant homogeneous polynomial which does not vanish at $\tau (\varvec{A})$. Since the coordinates of $\tau (\varvec{A})$ are products of $k_i \times k_i$ minors of the $A_i^*$ (one minor for each $i$), and since these minors are themselves polynomials in the entries of $\varvec{A}$, this means that there is some polynomial expression in the coordinates of $\varvec{A}$ which does not vanish. Therefore, the collection of $\varvec{A}$ with property ${\mathscr {S}}$ is Zariski-open in the smooth, connected, real algebraic variety $\mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$, and hence is either empty or dense (see, e.g., [14, Proposition 5.11]). Moreover, the same reasoning applies in $\mathcal{F}\mathcal{F}^{\mathbb {R}^d,\varvec{k}}$. Therefore, we have:

Proposition 4.5

Let $\mathbb {K}= \mathbb {R}$ or $\mathbb {C}$. When it is non-empty, the collection of all fusion frames in $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ with property ${\mathscr {S}}$ is dense.

We also take this opportunity to show that TFFs always have property ${\mathscr {S}}$.

Proposition 4.6

Let $\mathbb {K}= \mathbb {R}$ or $\mathbb {C}$ and suppose $\varvec{A} \in \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ is a TFF. Then $\varvec{A}$ has property ${\mathscr {S}}$.

Proof

Since $\varvec{A}$ is tight, its frame operator $S_{\varvec{A}} = \frac{n}{d}\mathbb {I}_d$. Let $\mathcal {Q}\subset \mathbb {K}^d$ be a proper subspace and let $P_{\mathcal {Q}}$ be orthogonal projection onto $\mathcal {Q}$.

For each $i=1,\dots , N$, any nonzero vector in $\mathcal {Q} \cap \mathcal {S}_i$ is fixed by the product $P_{\mathcal {Q}}P_i$, and hence ${\text {tr}}(P_{\mathcal {Q}}P_i) \ge \dim (\mathcal {Q} \cap \mathcal {S}_i)$, since all eigenvalues of $P_{\mathcal {Q}}P_i$ are real and non-negative. Therefore, since $P_{\mathcal {Q}}S_{\varvec{A}} = \frac{n}{d}P_{\mathcal {Q}}$,

$$\begin{aligned} \frac{n}{d} \dim (\mathcal {Q}) = {\text {tr}}(P_{\mathcal {Q}}S_{\varvec{A}}) = \sum _{i=1}^N {\text {tr}}(P_{\mathcal {Q}}P_i) \ge \sum _{i=1}^N \dim (\mathcal {Q} \cap \mathcal {S}_i), \end{aligned}$$

so $\varvec{A}$ has property ${\mathscr {S}}$. $\square $
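The two inequalities in this proof can be checked on a small tight fusion frame. The sketch below (Python with NumPy) uses three lines in $\mathbb {R}^2$ at mutual angles of $120^\circ $, for which the frame operator is $\frac{3}{2}\mathbb {I}_2$. The helpers `projector` and `dim_intersection`, the rank-based formula for the dimension of an intersection, and the choice of test subspaces are illustrative assumptions and not part of the text.

```python
import numpy as np

def projector(B):
    """Orthogonal projector onto the column span of B (full column rank)."""
    Q, _ = np.linalg.qr(B)
    return Q @ Q.conj().T

def dim_intersection(B1, B2):
    """dim(span B1 ∩ span B2) = rank B1 + rank B2 - rank [B1 B2]."""
    rank = np.linalg.matrix_rank
    return rank(B1) + rank(B2) - rank(np.hstack([B1, B2]))

# Three lines in R^2 at mutual angles of 120 degrees: k = (1, 1, 1), d = 2.
d = 2
angles = [0.0, 2 * np.pi / 3, 4 * np.pi / 3]
lines = [np.array([[np.cos(t)], [np.sin(t)]]) for t in angles]
P = [projector(L) for L in lines]
n = len(lines)

# Tightness: the frame operator equals (n/d) times the identity.
S = sum(P)
assert np.allclose(S, (n / d) * np.eye(d))

# Proper subspaces Q of R^2 are lines; test the frame's own lines and a random one.
rng = np.random.default_rng(8)
for Qb in lines + [rng.normal(size=(d, 1))]:
    PQ = projector(Qb)
    traces = [np.trace(PQ @ Pi).real for Pi in P]
    dims = [dim_intersection(Qb, L) for L in lines]
    # tr(P_Q P_i) >= dim(Q ∩ S_i) for each i.
    assert all(t >= m - 1e-12 for t, m in zip(traces, dims))
    # (n/d) dim(Q) >= sum_i dim(Q ∩ S_i), with dim(Q) = 1 here.
    assert (n / d) * 1 >= sum(dims)
```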

Combining Propositions 4.5 and 4.6 yields the following immediate corollary:

Corollary 4.7

Whenever there are TFFs in $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$, the fusion frames with property ${\mathscr {S}}$ are dense in $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$.

4.2 Property ${\mathscr {S}}$ Satisfies (i) and (ii)

4.2.1 Property ${\mathscr {S}}$ Satisfies (i)

The goal in this subsection is to show that the gradient flow of ${\text {FFP}}$ preserves property ${\mathscr {S}}$.

Notice that, if $\varvec{A} = (A_1, \dots , A_N)$ is a fusion frame, then the rows of each $A_i$ form an orthonormal set, and hence each $a_{i1}^*\wedge \dots \wedge a_{ik_i}^*\in \bigwedge ^{k_i} \mathbb {C}^d$ is a unit vector with respect to the standard inner product on $\bigwedge ^{k_i} \mathbb {C}^d$. In turn, this implies that $\tau (\varvec{A}) \in \left( \bigwedge ^{k_1} \mathbb {C}^d \right) \otimes \dots \otimes \left( \bigwedge ^{k_N} \mathbb {C}^d\right) $ is also a unit vector. In other words:

Lemma 4.8

$\tau \left( \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}\right) $ is contained in the unit sphere, and in particular is bounded away from the origin.
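Lemma 4.8 can be verified in coordinates: by the Cauchy–Binet formula, the squared norm of $a_{i1}^*\wedge \dots \wedge a_{ik_i}^*$ is $\det (A_iA_i^*) = 1$, and norms multiply under tensor products. The following sketch (Python with NumPy) represents $\tau (\varvec{A})$ as a Kronecker product of vectors of minors and checks that it has unit norm. The helper names and the coordinate conventions are assumptions made for this illustration.

```python
import numpy as np
from functools import reduce
from itertools import combinations

rng = np.random.default_rng(5)

def random_fusion_frame(d, ks):
    """List of k_i x d matrices with orthonormal rows (requires k_i <= d)."""
    frames = []
    for k in ks:
        Q, _ = np.linalg.qr(rng.normal(size=(d, k)) + 1j * rng.normal(size=(d, k)))
        frames.append(Q.conj().T)
    return frames

def wedge_coords(A):
    """Coordinates of a_1^* ∧ ... ∧ a_k^* in the standard basis of the
    k-th exterior power of C^d, where a_1, ..., a_k are the rows of A."""
    k, d = A.shape
    cols = A.conj().T  # d x k matrix whose columns are the a_j^*
    return np.array(
        [np.linalg.det(cols[list(rows), :]) for rows in combinations(range(d), k)]
    )

def tau(frames):
    """Coordinates of tau(A) as a Kronecker product of the wedge coordinates."""
    return reduce(np.kron, [wedge_coords(A) for A in frames])

frames = random_fusion_frame(4, (2, 1, 3))
v = tau(frames)

# Each factor is a unit vector, and hence so is the tensor product.
assert all(np.isclose(np.linalg.norm(wedge_coords(A)), 1.0) for A in frames)
assert np.isclose(np.linalg.norm(v), 1.0)
```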

Now we compute the gradient of ${\text {FFP}}$, first by computing the extrinsic gradient of its extension to the entire vector space containing $\mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$:

Lemma 4.9

Define ${\text {EFP}}: \mathbb {C}^{k_1 \times d} \times \dots \times \mathbb {C}^{k_N \times d} \rightarrow \mathbb {R}$ to be the extension of ${\text {FFP}}$ to all of $\mathbb {C}^{k_1 \times d} \times \dots \times \mathbb {C}^{k_N \times d}$ given by ${\text {EFP}}(A_1, \dots , A_N):= \left\| \sum _{i=1}^N A_i^*A_i \right\| ^2$. Its gradient is

$$\begin{aligned} \nabla {\text {EFP}}(\varvec{A}) = (4 A_1 S_{\varvec{A}}, \ldots , 4 A_N S_{\varvec{A}}). \end{aligned}$$

(4)

Proof

Let $\varvec{B} = (B_1,\ldots ,B_N) \in T_{\varvec{A}} (\mathbb {C}^{k_1 \times d} \times \dots \times \mathbb {C}^{k_N \times d}) \approx \mathbb {C}^{k_1 \times d} \times \dots \times \mathbb {C}^{k_N \times d}$ and consider the directional derivative of ${\text {EFP}}$ at $\varvec{A}$ in the direction $\varvec{B}$. In the following, we slightly abuse notation and generically use $\langle \cdot , \cdot \rangle $ for the Frobenius inner product on matrix spaces of various dimensions. We have

$$\begin{aligned} \left. \frac{d}{d\epsilon } \right| _{\epsilon = 0} {\text {EFP}}(\varvec{A} + \epsilon \varvec{B})&= \left. \frac{d}{d\epsilon } \right| _{\epsilon = 0} \left\| \sum _{i=1}^N (A_i + \epsilon B_i)^*(A_i + \epsilon B_i) \right\| ^2 \nonumber \\&\quad = \left. \frac{d}{d\epsilon } \right| _{\epsilon = 0} \left\langle \sum _{i=1}^N (A_i^*A_i + \epsilon (A_i^*B_i + B_i^*A_i) + \epsilon ^2 B_i^*B_i),\right. \nonumber \\&\quad \quad \quad \quad \quad \left. \sum _{i=1}^N (A_i^*A_i + \epsilon (A_i^*B_i + B_i^*A_i) + \epsilon ^2 B_i^*B_i)\right\rangle \nonumber \\&\quad = 2 {\text {Re}} \left\langle \sum _{i=1}^N A_i^*A_i, \sum _{i=1}^N (A_i^*B_i + B_i^*A_i) \right\rangle \nonumber \\&\quad = 4 {\text {Re}} \sum _{i=1}^N \left\langle S_{\varvec{A}}, A_i^*B_i \right\rangle \end{aligned}$$

(5)

$$\begin{aligned}&\quad = {\text {Re}} \sum _{i=1}^N \left\langle 4 A_i S_{\varvec{A}}, B_i\right\rangle = {\text {Re}} \left\langle (4A_1 S_{\varvec{A}},\ldots ,4 A_N S_{\varvec{A}}), (B_1,\ldots ,B_N) \right\rangle . \end{aligned}$$

(6)

The equality (5) makes the replacement $S_{\varvec{A}} = \sum _{i=1}^N A_i^*A_i$, uses linearity in the second coordinate to move the summation out of the inner product and uses properties of the (real part of the) inner product to equate $\textrm{Re}\langle \cdot , A_i^*B_i + B_i^*A_i\rangle = 2 \textrm{Re}\langle \cdot , A_i^*B_i \rangle $. The quantity (6) is the (real part of the) standard inner product for $\mathbb {C}^{k_1 \times d} \times \dots \times \mathbb {C}^{k_N \times d}$ applied to $(4A_1 S_{\varvec{A}},\ldots ,4 A_N S_{\varvec{A}})$ (the claimed formula for the gradient) and $\varvec{B}$. This implies (4). $\square $
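The gradient formula (4) can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it compares a central finite difference of ${\text {EFP}}$ in a random direction $\varvec{B}$ with the real Frobenius pairing of the claimed gradient against $\varvec{B}$. The helper names (`frame_operator`, `efp`, `efp_gradient`, `real_inner`), the dimensions, and the step size `eps` are illustrative assumptions.

```python
import numpy as np

def frame_operator(blocks):
    """S_A = sum_i A_i^* A_i for a list of k_i x d complex matrices."""
    return sum(A.conj().T @ A for A in blocks)

def efp(blocks):
    """EFP(A): squared Frobenius norm of the frame operator."""
    return np.linalg.norm(frame_operator(blocks), "fro") ** 2

def efp_gradient(blocks):
    """Claimed gradient from (4): the i-th block is 4 A_i S_A."""
    S = frame_operator(blocks)
    return [4 * A @ S for A in blocks]

def real_inner(X, Y):
    """Real part of the Frobenius inner product, summed over all blocks."""
    return sum(np.vdot(x, y).real for x, y in zip(X, Y))

# Hypothetical test sizes: d = 4 and block heights k = (1, 2, 3).
rng = np.random.default_rng(0)
d, ks = 4, (1, 2, 3)

def random_block(k):
    return rng.standard_normal((k, d)) + 1j * rng.standard_normal((k, d))

A = [random_block(k) for k in ks]
B = [random_block(k) for k in ks]

# Central finite difference of EFP along the direction B.
eps = 1e-5
plus = efp([a + eps * b for a, b in zip(A, B)])
minus = efp([a - eps * b for a, b in zip(A, B)])
finite_difference = (plus - minus) / (2 * eps)

# Directional derivative predicted by (4).
analytic = real_inner(efp_gradient(A), B)
print(finite_difference, analytic)
```

The two printed numbers should agree up to finite-difference error; the check works on all of $\mathbb {C}^{k_1 \times d} \times \dots \times \mathbb {C}^{k_N \times d}$, so the blocks need not have orthonormal rows.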

And now the intrinsic gradient:

Lemma 4.10

The Riemannian gradient of ${\text {FFP}}: \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}\rightarrow \mathbb {R}$ is

$$\begin{aligned} {\text {grad}}{\text {FFP}}(\varvec{A}) = \left( 4(A_1 S_{\varvec{A}} - (A_1 S_{\varvec{A}} A_1^*) A_1), \ldots , 4(A_N S_{\varvec{A}} - (A_N S_{\varvec{A}} A_N^*) A_N)\right) . \end{aligned}$$

Proof

The Riemannian gradient ${\text {grad}}{\text {FFP}}(\varvec{A})$ is the projection of the extrinsic gradient $\nabla {\text {EFP}}(\varvec{A})$ onto the tangent space to $\mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$ at $\varvec{A}$. This means that on the ith block we need to project $4A_i S_{\varvec{A}}$ onto the orthogonal complement of the row span of $A_i$. This orthogonal projection is accomplished by right-multiplying by $(\mathbb {I}_d - A_i^*A_i)$, so the projection is $4A_i S_{\varvec{A}}(\mathbb {I}_d - A_i^*A_i) = 4\left( A_i S_{\varvec{A}} - \left( A_i S_{\varvec{A}} A_i^*\right) A_i\right) $. The result follows. $\square $
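As an illustration of Lemma 4.10, the sketch below evaluates both forms of the projected gradient on a random tuple with orthonormal rows and checks that they agree and that each block of the result has rows orthogonal to ${\text {row}}(A_i)$. This is an assumption-laden numerical check rather than anything from the original text: the function names and the test dimensions are hypothetical, and random points are produced by a QR factorization.

```python
import numpy as np

def random_orthonormal_rows(k, d, rng):
    """Random k x d complex matrix A with A A^* = I_k (via reduced QR)."""
    Z = rng.standard_normal((d, k)) + 1j * rng.standard_normal((d, k))
    Q, _ = np.linalg.qr(Z)        # Q is d x k with orthonormal columns
    return Q.conj().T

def frame_operator(blocks):
    """S_A = sum_i A_i^* A_i."""
    return sum(A.conj().T @ A for A in blocks)

def riemannian_grad(blocks):
    """Formula of Lemma 4.10: i-th block is 4 (A_i S - (A_i S A_i^*) A_i)."""
    S = frame_operator(blocks)
    return [4 * (A @ S - (A @ S @ A.conj().T) @ A) for A in blocks]

# Hypothetical test sizes.
rng = np.random.default_rng(1)
d, ks = 5, (1, 2, 2)
A = [random_orthonormal_rows(k, d, rng) for k in ks]
S = frame_operator(A)
grad = riemannian_grad(A)

# Same gradient written as the extrinsic gradient times the projector I - A_i^* A_i.
projected = [4 * Ai @ S @ (np.eye(d) - Ai.conj().T @ Ai) for Ai in A]
formula_gap = max(np.abs(g - p).max() for g, p in zip(grad, projected))

# Rows of each gradient block should be orthogonal to the rows of A_i.
tangency_gap = max(np.abs(g @ Ai.conj().T).max() for g, Ai in zip(grad, A))
print(formula_gap, tangency_gap)
```

Both gaps should be at the level of floating-point round-off, reflecting the identity $A_i A_i^* = \mathbb {I}_{k_i}$ used in the proof.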

Proposition 4.11

Suppose $\varvec{A}_0 \in \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$ has property ${\mathscr {S}}$ and that $\Gamma : \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}\times [0,\infty ) \rightarrow \mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ is the gradient flow defined in Theorem 1.12. Then $\varvec{A}_\infty := \lim _{t\rightarrow \infty } \Gamma (\varvec{A}_0,t)$ has property ${\mathscr {S}}$.

Proof

Using Lemma 4.10, we have that the ith block of ${\text {grad}}{\text {FFP}}(\varvec{A})$ is

$$\begin{aligned} 4\left( A_i S_{\varvec{A}} - \left( A_i S_{\varvec{A}} A_i^*\right) A_i\right) = \left. \frac{d}{d \epsilon } \right| _{\epsilon =0} 4\exp (-\epsilon A_i S_{\varvec{A}} A_i^*) A_i \exp (\epsilon S_{\varvec{A}}). \end{aligned} \tag{7}$$
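For readers who want the intermediate step, identity (7) can be verified with the product rule and the fact that $\left. \frac{d}{d\epsilon } \right| _{\epsilon =0} \exp (\epsilon X) = X$ for any square matrix $X$. Writing $M_i := A_i S_{\varvec{A}} A_i^*$ (a shorthand introduced only for this sketch), one gets

$$\begin{aligned} \left. \frac{d}{d \epsilon } \right| _{\epsilon =0} 4\exp (-\epsilon M_i) A_i \exp (\epsilon S_{\varvec{A}})&= 4\left( -M_i A_i + A_i S_{\varvec{A}}\right) \\&= 4\left( A_i S_{\varvec{A}} - \left( A_i S_{\varvec{A}} A_i^*\right) A_i\right) . \end{aligned}$$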

Exponentiating a matrix always yields an invertible matrix, so (7) tells us that ${\text {grad}}{\text {FFP}}(\varvec{A})$ is tangent to the orbit $\left( {\text {GL}}(d) \times \prod _{i=1}^N {\text {GL}}(k_i)\right) \cdot \varvec{A}$, where $(g,(h_1,\dots , h_N)) \in {\text {GL}}(d) \times \prod _{i=1}^N {\text {GL}}(k_i)$ acts on $\prod _{i=1}^N \mathbb {C}^{k_i \times d}$ by

$$\begin{aligned} (g,(h_1,\dots , h_N)) \cdot \textbf{A} = (h_1 A_1 g^*, \dots , h_N A_N g^*). \end{aligned}$$

For $(g,(h_1,\dots , h_N)) \in {\text {GL}}(d) \times \prod _{i=1}^N {\text {GL}}(k_i)$, we normalize g to get something in ${\text {SL}}(d)$ without changing the action by moving a scalar to the other factor. Let $c$ be any $d$th root of $\det g$, so that $\det (c^{-1} g) = c^{-d} \det g = 1$. Then $\left( c^{-1}g, \left( \overline{c}\, h_1, \dots , \overline{c}\, h_N\right) \right) \in {\text {SL}}(d) \times \prod _{i=1}^N {\text {GL}}(k_i)$ and

$$\begin{aligned} (g,(h_1,\dots , h_N)) \cdot \varvec{A} = \left( c^{-1}g, \left( \overline{c}\, h_1, \dots , \overline{c}\, h_N\right) \right) \cdot \varvec{A}, \end{aligned}$$

so ${\text {grad}}{\text {FFP}}(\varvec{A})$ is actually in the tangent space to $\left( {\text {SL}}(d) \times \prod _{i=1}^N {\text {GL}}(k_i)\right) \cdot \varvec{A}$ at $\varvec{A}$. Therefore,

$$\begin{aligned} \Gamma (\varvec{A}_0,t) \in \left( {\text {SL}}(d) \times \prod _{i=1}^N {\text {GL}}(k_i)\right) \cdot \varvec{A}_0 \quad \text { for all } t \ge 0. \end{aligned}$$

If $(g,(h_1,\dots , h_N)) \in {\text {SL}}(d) \times \prod _{i=1}^N {\text {GL}}(k_i)$, then

$$\begin{aligned} \tau ((g,(h_1,\dots , h_N)) \cdot \varvec{A})&= \left( \left( g a_{11}^*h_1^*\right) \wedge \dots \wedge \left( g a_{1k_1}^*h_1^*\right) \right) \otimes \dots \otimes \\&\quad \left( \left( g a_{N1}^*h_N^*\right) \wedge \dots \wedge \left( g a_{Nk_N}^*h_N^*\right) \right) \\&\quad = \left( \det (h_1^*) \left( g a_{11}^*\right) \wedge \dots \wedge \left( g a_{1k_1}^*\right) \right) \otimes \dots \otimes \\&\quad \left( \det (h_N^*) \left( g a_{N1}^*\right) \wedge \dots \wedge \left( g a_{Nk_N}^*\right) \right) \\&\quad = \prod _{i=1}^N \det (h_i^*)\left[ \left( \left( g a_{11}^*\right) \wedge \dots \wedge \left( g a_{1k_1}^*\right) \right) \otimes \dots \otimes \right. \\&\quad \left. \left( \left( g a_{N1}^*\right) \wedge \dots \wedge \left( g a_{Nk_N}^*\right) \right) \right] \in \left( {\text {SL}}(d) \times \mathbb {C}^\times \right) \cdot \tau (\varvec{A}), \end{aligned}$$

where $(g,a) \in {\text {SL}}(d) \times \mathbb {C}^\times $ acts on $\left( \bigwedge ^{k_1} \mathbb {C}^d \right) \otimes \dots \otimes \left( \bigwedge ^{k_N} \mathbb {C}^d\right) $ by

$$\begin{aligned}{} & {} (g,a) \cdot \left[ \left( v_{11} \wedge \dots \wedge v_{1k_1}\right) \otimes \dots \otimes \left( v_{N1} \wedge \dots \wedge v_{Nk_N}\right) \right] \\{} & {} \quad := a \left[ \left( \left( g v_{11}\right) \wedge \dots \wedge \left( g v_{1k_1}\right) \right) \otimes \dots \otimes \left( \left( g v_{N1} \right) \wedge \dots \wedge \left( g v_{Nk_N}\right) \right) \right] . \end{aligned}$$

This implies that $\tau (\Gamma (\varvec{A}_0,t)) \in \left( {\text {SL}}(d) \times \mathbb {C}^\times \right) \cdot \tau (\varvec{A}_0)$ for all $t \ge 0$. Since $\tau (\Gamma (\varvec{A}_0,t))$ is a unit vector for all t by Lemma 4.8, so is the limit $\tau (\varvec{A}_\infty )$.

Since everything is bounded away from the origin and since rescaling a vector by a nonzero scalar does not affect its semistability with respect to the ${\text {SL}}(d)$-action, Proposition 4.3 implies that the entire gradient flow line, including $\tau (\varvec{A}_\infty )$, is semi-stable, and hence $\varvec{A}_\infty $ has property ${\mathscr {S}}$ by Theorem 4.4. $\square $

4.2.2 Property ${\mathscr {S}}$ Satisfies (ii)

Finally, we need to show that critical points which are not global minima do not satisfy property ${\mathscr {S}}$. We do this by showing that, if $\varvec{A}$ is a non-minimizing critical point, then $\tau (\varvec{A})$ is not semi-stable with respect to the ${\text {SL}}(d)$ action. Semi-stability is defined in terms of the full group orbit, but this is typically much too big to be tractable. Instead, it is preferable to work with one-parameter subgroups, which remarkably turn out to be sufficient.

We briefly return to discussing a general reductive group G acting linearly on a vector space V. A one-parameter subgroup of G is a homomorphism of algebraic groups $\lambda : \mathbb {C}^\times \rightarrow G$. Any such homomorphism induces a decomposition $V = \bigoplus _{i \in I} V_i$ and integer weights $w: I \rightarrow \mathbb {Z}$ so that, for every $i \in I$, $v \in V_i$, and $t \in \mathbb {C}^\times $,

$$\begin{aligned} \lambda (t) \cdot v = t^{w(i)}v. \end{aligned}$$

It follows immediately from the definition that a nonzero vector $v \in V$ is unstable under the action of G if there exists a one-parameter subgroup $\lambda $ so that

$$\begin{aligned} \lim _{t\rightarrow 0} \lambda (t) \cdot v = 0. \end{aligned}$$

Much less obvious is that the converse holds:

Theorem 4.12

(Hilbert–Mumford criterion [33, 51]) $v \in V\backslash \{0\}$ is unstable under the action of G if and only if there exists a one-parameter subgroup $\lambda $ of G so that

$$\begin{aligned} \lim _{t \rightarrow 0} \lambda (t) \cdot v = 0. \end{aligned}$$

This will be our key tool in proving that property ${\mathscr {S}}$ satisfies (ii).
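As a small illustration of the criterion (an example added here, not taken from the original text), let $G = {\text {SL}}(2)$ act on $V = \mathbb {C}^2$ in the standard way and consider the one-parameter subgroup

$$\begin{aligned} \lambda (t) = \begin{bmatrix} t &{} 0 \\ 0 &{} t^{-1} \end{bmatrix}, \qquad V = {\text {span}}\{e_1\} \oplus {\text {span}}\{e_2\}, \qquad \lambda (t) \cdot e_1 = t^{1} e_1, \quad \lambda (t) \cdot e_2 = t^{-1} e_2. \end{aligned}$$

The weights are $1$ and $-1$, so $\lim _{t \rightarrow 0} \lambda (t) \cdot e_1 = 0$ and $e_1$ is unstable. For $e_2$ this particular $\lambda $ gives no information, since $\lambda (t) \cdot e_2$ diverges as $t \rightarrow 0$, but the subgroup $t \mapsto \lambda (t^{-1})$ sends $e_2$ to $0$. The criterion only asks for the existence of some destabilizing one-parameter subgroup, which is why the proof below is free to choose $\lambda $ adapted to the eigenspaces of the frame operator.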

Proposition 4.13

Suppose $\varvec{A} \in \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$ is a critical point of ${\text {FFP}}$ which is not tight. Then $\varvec{A}$ does not have property ${\mathscr {S}}$. In particular, if $\varvec{A}$ is a critical point which is not a global minimum, then it does not have property ${\mathscr {S}}$.

Proof

Let $\varvec{A} \in \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$ be a critical point of ${\text {FFP}}$. Then ${\text {grad}}{\text {FFP}}(\varvec{A}) = 0$; by Lemma 4.10, this implies that, for each $i = 1, \dots , N$,

$$\begin{aligned} 0 = A_i S_{\varvec{A}} - A_i S_{\varvec{A}} A_i^*A_i = A_i S_{\varvec{A}}( \mathbb {I}_d - A_i^*A_i ). \end{aligned}$$

The operator $\mathbb {I}_d - A_i^*A_i$ is orthogonal projection onto the orthogonal complement of ${\text {row}}(A_i)$, the row space of $A_i$ (since $\varvec{A}$ is a fusion frame, the rows of $A_i$ are orthonormal). The above equation then says that the rows of $A_i S_{\varvec{A}}$ lie in ${\text {row}}(A_i)$. In other words, ${\text {row}}(A_i)$ is an invariant subspace for the frame operator $S_{\varvec{A}}$, and hence has an orthonormal basis of eigenvectors of $\left. S_{\varvec{A}}\right| _{{\text {row}}(A_i)}$, so there exists $U_i \in {\text {U}}(k_i)$ so that the rows of ${\widetilde{A}}_i:= U_i A_i$ are eigenvectors of $\left. S_{\varvec{A}}\right| _{{\text {row}}(A_i)}$, and hence also of $S_{\varvec{A}} = S_{\widetilde{\varvec{A}}}$. So far, this is not new: the conclusion of the previous sentence is exactly Casazza and Fickus’ characterization of the critical points of the fusion frame potential [18, Theorem 4].

If $\varvec{A}$ is not tight, then the frame operator $S_{\varvec{A}}$ has at least two distinct eigenvalues. Let $\lambda $ be the largest eigenvalue, with corresponding eigenspace $E_\lambda $ of dimension $\ell $ and orthogonal complement $E_\lambda ^\bot $ of dimension $d-\ell $. Since the average of the eigenvalues of $S_{\varvec{A}}$ is

$$\begin{aligned} \frac{1}{d} {\text {tr}}(S_{\varvec{A}}) = \frac{1}{d} {\text {tr}}\left( \sum _{i=1}^N A_i^*A_i \right) = \frac{1}{d}\sum _{i=1}^N {\text {tr}}\left( A_i^*A_i \right) = \frac{1}{d}\sum _{i=1}^N k_i = \frac{n}{d} \end{aligned}$$

and the eigenvalues aren’t all equal, we know that the largest eigenvalue $\lambda > \frac{n}{d}$.

Up to conjugating $S_{\varvec{A}}$ by $U \in {\text {U}}(d)$ (corresponding to right-multiplying each $A_i$ by $U^*$), we can make the simplifying assumption that the frame operator is diagonal: $S_{\varvec{A}} = \begin{bmatrix} \lambda \mathbb {I}_\ell &{} 0 \\ 0 &{} S' \end{bmatrix}$, where $S'$ is a diagonal (but not necessarily scalar) matrix. Hence, $E_\lambda = {\text {span}}\{e_1, \dots , e_\ell \}$ and $E_\lambda ^\bot = {\text {span}}\{e_{\ell +1}, \dots , e_d\}$.

If, for each $i=1,\dots , N$, ${\tilde{a}}_{i1},\dots , {\tilde{a}}_{ik_i}$ are the rows of ${\widetilde{A}}_i$, then, since each ${\tilde{a}}_{ij}$ is an eigenvector of $S_{\varvec{A}}$ and distinct eigenspaces are orthogonal, the ${\tilde{a}}_{ij}$ split into two perpendicular groups: those in $E_\lambda $ and those in $E_\lambda ^\bot $. Let ${\tilde{b}}_1, \dots , {\tilde{b}}_m$ be the collection contained in $E_\lambda $. Then

$$\begin{aligned} \sum _{i=1}^m {\tilde{b}}_i^*{\tilde{b}}_i = \begin{bmatrix} \lambda \mathbb {I}_\ell &{} 0 \\ 0 &{} 0 \end{bmatrix} \end{aligned}$$

so that ${\tilde{b}}_1^*, \dots , {\tilde{b}}_m^*$ are a $\lambda $-tight frame for $E_\lambda $. Since the ${\tilde{b}}_i$ are unit vectors,

$$\begin{aligned} \ell \lambda = {\text {tr}}\begin{bmatrix} \lambda \mathbb {I}_\ell &{} 0 \\ 0 &{} 0 \end{bmatrix} = {\text {tr}}\left( \sum _{i=1}^m {\tilde{b}}_i^*{\tilde{b}}_i\right) = {\text {tr}}\left( \sum _{i=1}^m {\tilde{b}}_i{\tilde{b}}_i^*\right) = m, \end{aligned}$$

it follows that $\lambda = \frac{m}{\ell }$ and hence that $\frac{m}{\ell } > \frac{n}{d}$; equivalently, $md-n\ell > 0$.

We’re now ready to show that $\varvec{A}$ does not have property ${\mathscr {S}}$. To see this, consider the 1-parameter subgroup $\lambda : \mathbb {C}^\times \rightarrow {\text {SL}}(d)$ given as a block matrix by

$$\begin{aligned} \lambda (t) = \begin{bmatrix} t^{d-\ell }\mathbb {I}_\ell &{} 0 \\ 0 &{} t^{-\ell }\mathbb {I}_{d-\ell } \end{bmatrix}. \end{aligned}$$

Since ${\tilde{a}}_{i1}^*\wedge \dots \wedge {\tilde{a}}_{ik_i}^*= \det (U_i^*) a_{i1}^*\wedge \dots \wedge a_{ik_i}^*$, it follows that $\tau (\varvec{A}) = \rho \tau (\widetilde{\varvec{A}})$ for unimodular $\rho = \prod _{i=1}^N \det (U_i)$, and hence

$$\begin{aligned} \lambda (t) \cdot \tau (\varvec{A})= & {} \rho \lambda (t) \cdot \tau (\widetilde{\varvec{A}}) = \rho \left( \left( \lambda (t) {\tilde{a}}_{11}^*\right) \wedge \dots \wedge \left( \lambda (t) {\tilde{a}}_{1k_1}^*\right) \right) \otimes \dots \otimes \\{} & {} \left( \left( \lambda (t) {\tilde{a}}_{N1}^*\right) \wedge \dots \wedge \left( \lambda (t) {\tilde{a}}_{Nk_N}^*\right) \right) \\= & {} t^{m(d-\ell )-(n-m)\ell }\rho \left( {\tilde{a}}_{11}^*\wedge \dots \wedge {\tilde{a}}_{1k_1}^*\right) \otimes \dots \otimes \\{} & {} \left( {\tilde{a}}_{N1}^*\wedge \dots \wedge {\tilde{a}}_{Nk_N}^*\right) = t^{md-n\ell } \rho \tau (\widetilde{\varvec{A}}), \end{aligned}$$

which goes to zero as $t\rightarrow 0$ since $md-n\ell > 0$.

Therefore, $\tau (\varvec{A})$ is not semi-stable, and hence, by Theorem 4.4, $\varvec{A}$ does not have property ${\mathscr {S}}$.

Finally, if $\varvec{A}$ is not a global minimum, then it cannot be tight by Proposition 4.1, so we see that non-minimum critical points cannot have property ${\mathscr {S}}$. $\square $
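The weight computation in this proof can be traced on a toy example. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $d = 2$ and $\varvec{k} = (1,1,1)$ with lines spanned by $e_1, e_1, e_2$, a critical point with $S_{\varvec{A}} = \mathrm{diag}(2,1)$ that is not tight. Because every $k_i = 1$, no wedge products are needed and $\tau (\varvec{A})$ reduces to a tensor product, which is modeled here with `np.kron`. All variable names are hypothetical.

```python
import numpy as np
from functools import reduce

# Rows a_1, a_2, a_3 of the three 1 x 2 blocks (real, so a_i^* is just a_i as a column).
rows = [np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
d, n = 2, len(rows)

# Frame operator and its largest eigenvalue.
S = sum(np.outer(a, a) for a in rows)
eigvals = np.linalg.eigvalsh(S)
lam = eigvals.max()
ell = int(np.sum(np.isclose(eigvals, lam)))                # dim of top eigenspace
m = sum(1 for a in rows if np.allclose(S @ a, lam * a))    # rows lying in E_lambda

def one_param(t):
    """lambda(t) = diag(t^(d-ell) I_ell, t^(-ell) I_(d-ell)); here E_lambda = span(e_1)."""
    return np.diag([t ** (d - ell)] * ell + [t ** (-ell)] * (d - ell))

# tau(A) for rank-one blocks: the tensor product of the a_i^*.
tau = reduce(np.kron, rows)

t = 0.5
acted = reduce(np.kron, [one_param(t) @ a for a in rows])
exponent = m * d - n * ell
print(lam, m / ell, exponent, np.allclose(acted, t ** exponent * tau))
```

In this example $\lambda = m/\ell = 2 > n/d = 3/2$ and the exponent $md - n\ell $ equals $1$, so $\lambda (t) \cdot \tau (\varvec{A}) = t\, \tau (\varvec{A}) \rightarrow 0$ as $t \rightarrow 0$, matching the general formula.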

If d, N, and $\varvec{k}$ are such that $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ contains no TFFs (see Remark 1.16 for conditions on when this occurs), then any global minimum $\varvec{A}$ of ${\text {FFP}}$ cannot be tight, so Proposition 4.13 shows that $\varvec{A}$ cannot have property ${\mathscr {S}}$. By the contrapositive of Proposition 4.11, then, nothing in $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ that flows to $\varvec{A}$ under the negative gradient flow of ${\text {FFP}}$ can have property ${\mathscr {S}}$, either. Since this is true for all global minima, there is some open set containing the global minima which completely avoids the fusion frames with property ${\mathscr {S}}$. Therefore, the set of fusion frames in $\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ with property ${\mathscr {S}}$ cannot be dense, and hence, by Proposition 4.5, must be empty. In other words:

Corollary 4.14

$\mathcal{F}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}$ contains fusion frames with property ${\mathscr {S}}$ if and only if it contains TFFs.

4.2.3 Completing the Proof of Theorem 1.14

We now have all the tools we need to prove that gradient descent limits to a global minimizer.

Proof of Theorem 1.14

If $\varvec{A}_0 \in \mathcal{F}\mathcal{F}^{\mathbb {C}^d,\varvec{k}}$ has property ${\mathscr {S}}$, the limit $\varvec{A}_\infty := \lim _{t \rightarrow \infty } \Gamma (\varvec{A}_0, t)$ has property ${\mathscr {S}}$ by Proposition 4.11. Since $\varvec{A}_\infty $ is a limit point of the gradient flow, it must be a critical point of ${\text {FFP}}$. Since it has property ${\mathscr {S}}$, Proposition 4.13 implies that $\varvec{A}_\infty $ is a global minimizer of ${\text {FFP}}$.

This proves Theorem 1.14 when $\mathbb {K}= \mathbb {C}$. The real case then follows immediately by Proposition 4.2. $\square $

5 Discussion

There are choices of d, N, and $\varvec{k}$ for which there are no fusion frames with property ${\mathscr {S}}$: for example, $d=N=3$ and $\varvec{k}= (1,1,2)$. Elements $(A_1, A_2, A_3) \in \mathcal{F}\mathcal{F}^{\mathbb {R}^3,(1,1,2)}$ will determine two lines $\ell _1 = {\text {row}}(A_1)$ and $\ell _2 = {\text {row}}(A_2)$ and a plane $\mathcal {S} = {\text {row}}(A_3)$. If $\mathcal {Q}$ is a plane containing $\ell _1$ and $\ell _2$, then it must intersect $\mathcal {S}$ at least in a line, so

$$\begin{aligned}{} & {} \frac{1}{\dim \mathcal {Q}}(\dim (\ell _1 \cap \mathcal {Q}) + \dim (\ell _2 \cap \mathcal {Q}) + \dim (\mathcal {S} \cap \mathcal {Q}))\\{} & {} \quad \ge \frac{3}{2} > \frac{4}{3} = \frac{\dim \ell _1 + \dim \ell _2 + \dim \mathcal {S}}{3}. \end{aligned}$$

Hence, nothing in $\mathcal{F}\mathcal{F}^{\mathbb {R}^3,(1,1,2)}$ has property ${\mathscr {S}}$, so Theorem 1.14 tells us nothing. Moreover, by Proposition 4.6, there are no TFFs in $\mathcal{F}\mathcal{F}^{\mathbb {R}^3,(1,1,2)}$.

Nonetheless, running gradient descent from random starting points in $\mathcal{F}\mathcal{F}^{\mathbb {R}^3,(1,1,2)}$ seems to always find minimizers of ${\text {FFP}}$ in practice. Figure 1 shows the value of ${\text {FFP}}$ rapidly decreasing to the global minimum value $\frac{11}{2}$, which is greater than the value $\frac{16}{3}$ that a TFF would have. As observed by Casazza and Fickus [18, p. 17], the minimum is achieved by fusion frames of the form shown on the right of Fig. 1, where $\ell _1$ and $\ell _2$ lie in a plane $\mathcal {Q}$ perpendicular to $\mathcal {S}$ and the lines $\ell _1$, $\ell _2$, and $\mathcal {Q} \cap \mathcal {S}$ correspond to a tight Mercedes–Benz frame for $\mathcal {Q}$.
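The experiment described above can be approximated with a few lines of NumPy. This is a rough sketch under explicit assumptions, not the authors' code: the continuous gradient flow is replaced by fixed-step descent, rows are re-orthonormalized after each step with a QR factorization (a retraction chosen here for convenience), and the step size, iteration count, and all function names are hypothetical. The script also builds the minimizer described in the text, with $\mathcal {S} = {\text {span}}\{e_1, e_2\}$, $\mathcal {Q} = {\text {span}}\{e_1, e_3\}$, and the lines $\ell _1$, $\ell _2$ at angles $60^\circ $ and $120^\circ $ from $\mathcal {Q} \cap \mathcal {S}$.

```python
import numpy as np

def frame_operator(blocks):
    """S_A = sum_i A_i^T A_i for real k_i x d blocks."""
    return sum(A.T @ A for A in blocks)

def ffp(blocks):
    """Fusion frame potential: squared Frobenius norm of S_A."""
    return np.linalg.norm(frame_operator(blocks), "fro") ** 2

def riemannian_grad(blocks):
    """Real version of the formula in Lemma 4.10."""
    S = frame_operator(blocks)
    return [4 * (A @ S - (A @ S @ A.T) @ A) for A in blocks]

def retract(M):
    """Re-orthonormalize the rows of M via QR (an assumed retraction)."""
    Q, _ = np.linalg.qr(M.T)
    return Q.T

def descend(blocks, step=0.01, iters=5000):
    """Fixed-step descent; returns the final point and the FFP history."""
    history = [ffp(blocks)]
    for _ in range(iters):
        grads = riemannian_grad(blocks)
        blocks = [retract(A - step * G) for A, G in zip(blocks, grads)]
        history.append(ffp(blocks))
    return blocks, history

# Explicit configuration from the text: Mercedes-Benz lines in Q, with Q perpendicular to S.
c, s = np.cos(np.pi / 3), np.sin(np.pi / 3)
minimizer = [
    np.array([[c, 0.0, s]]),                       # line l_1 in Q
    np.array([[-c, 0.0, s]]),                      # line l_2 in Q
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),  # plane S
]
minimizer_value = ffp(minimizer)
minimizer_grad_norm = max(np.abs(G).max() for G in riemannian_grad(minimizer))

# Descent from a random starting point in FF^{R^3,(1,1,2)}.
rng = np.random.default_rng(0)
start = [retract(rng.standard_normal((k, 3))) for k in (1, 1, 2)]
final, history = descend(start)
print(minimizer_value, minimizer_grad_norm, history[0], history[-1])
```

For the explicit configuration the frame operator is $\mathrm{diag}(3/2, 1, 3/2)$, so its potential is $\frac{9}{4} + 1 + \frac{9}{4} = \frac{11}{2}$ and its Riemannian gradient vanishes. The descent run is only an experiment: nothing proved in this paper guarantees that it reaches $\frac{11}{2}$ for these parameters.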

This suggests that the fusion frame Benedetto–Fickus theorem may hold even for parameters where there are no fusion frames with property ${\mathscr {S}}$. We also expect our approach to proving Theorem 1.14 will extend to more general spaces $\mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}(\varvec{r})$ of operator-valued frames whose $P_i$ have a given spectrum (including weighted fusion frames), though the details seem more complicated. Hence, we pose the following conjecture:

Conjecture 5.1

Let $d,N,k_1, \dots , k_N$ be positive integers and fix $\varvec{r}$. Let ${\text {OFP}}: \mathcal{O}\mathcal{F}^{\mathbb {K}^d,\varvec{k}}(\varvec{r}) \rightarrow \mathbb {R}$ be the obvious generalization of ${\text {FFP}}$ to operator-valued frames. Then all local minima of ${\text {OFP}}$ are global minima.

We expect that, as in the case of classical frames [53], $\mathcal{O}\mathcal{F}^{\mathbb {H}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ is path-connected, where $\mathbb {H}$ is the skew-field of quaternions. However, $\mathcal{O}\mathcal{F}^{\mathbb {R}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ cannot always be connected. For example, translating a result of Kapovich and Millson [38, Theorem 1] to our setting and notation implies that $\mathcal{O}\mathcal{F}^{\mathbb {R}^2,(1,1,1,1)}_{(5,5)}(3,3,3,1)$ is not connected. On the other hand, Cahill, Mixon, and Strawn [15] proved that the space $\mathcal{O}\mathcal{F}^{\mathbb {R}^d,(1,\dots ,1)}_{\left( \frac{N}{d}, \dots , \frac{N}{d}\right) }(1, \dots , 1)$ of real unit-norm tight frames is connected for all $d \ge 2$ and $N \ge d+2$, so there is some interesting characterization of when the $\mathcal{O}\mathcal{F}^{\mathbb {R}^d,\varvec{k}}_{\varvec{\lambda }}(\varvec{r})$ are connected still waiting to be discovered.

Cahill, Mixon, and Strawn’s proof of the Frame Homotopy Theorem relied heavily on the use of eigensteps, which are the eigenvalues of the partial sums of the $P_i$. While eigensteps can be similarly defined for fusion frames and even operator-valued frames, it is not clear whether they would be a useful tool for studying connectedness in the real case. Eigensteps give good coordinates for classical frame spaces because they are action coordinates—that is, they are the coordinates of a momentum map for a Hamiltonian action of a half-dimensional torus [54]. This means that not only is the image a convex polytope, but the fibers of the eigenstep map are reasonably simple and well-understood. For dimension-counting reasons it seems unlikely that eigensteps could give action coordinates for fusion frames or operator-valued frames, but it is desirable to find similarly useful coordinates in this more general setting.

Notes

Remember our simplifying assumption that the $A_i$ should be full rank.

References

Absil, Pierre-Antoine, Kurdyka, Krzysztof: On the stable equilibrium points of gradient systems. Systems & Control Letters 55(7), 573–577 (2006)
Aceska, Roza, Bouchot, Jean-Luc, Li, Shidong: Local sparsity and recovery of fusion frame structured signals. Signal Processing 174, 107615 (2020)
Alexeev, Boris, Cahill, Jameson, Mixon, Dustin G.: Full spark frames. Journal of Fourier Analysis and Applications 18(6), 1167–1194 (2012)
Antezana, Jorge, Massey, Pedro G., Ruiz, Mariano A., Stojanoff, Demetrio: The Schur-Horn theorem for operators and frames with prescribed norms and frame operator. Illinois Journal of Mathematics 51(2), 537–560 (2007)
Audin, Michèle: Torus Actions on Symplectic Manifolds, volume 93 of Progress in Mathematics. Springer, Basel, second revised edition, (2012)
Ayaz, Ulaş, Dirksen, Sjoerd, Rauhut, Holger: Uniform recovery of fusion frame structured sparse signals. Applied and Computational Harmonic Analysis 41(2), 341–361 (2016)
Benedetto, John J., Fickus, Matthew: Finite normalized tight frames. Advances in Computational Mathematics 18(2–4), 357–385 (2003)
Berenstein, Arkady, Sjamaar, Reyer: Coadjoint orbits, moment polytopes, and the Hilbert-Mumford criterion. Journal of the American Mathematical Society 13(2), 433–466 (2000)
Bodmann, Bernhard G.: Optimal linear transmission by loss-insensitive packet encoding. Applied and Computational Harmonic Analysis 22(3), 274–285 (2007)
Bodmann, Bernhard G., Haas, John: Frame potentials and the geometry of frames. Journal of Fourier Analysis and Applications 21(6), 1344–1383 (2015)
Boufounos, Petros, Kutyniok, Gitta, Rauhut, Holger: Compressed sensing for fusion frames. In: Goyal, Vivek K., Papadakis, Manos, Van De Ville, Dimitri (eds.) Wavelets XIII. volume 7446, pp. 360–370. International Society for Optics and Photonics, SPIE (2009)
Bownik, Marcin, Luoto, Kurt, Richmond, Edward: A combinatorial characterization of tight fusion frames. Pacific Journal of Mathematics 275(2), 257–294 (2015)
Cahill, Jameson, Fickus, Matthew, Mixon, Dustin G., Poteet, Miriam J., Strawn, Nate: Constructing finite frames of a given spectrum and set of lengths. Applied and Computational Harmonic Analysis 35(1), 52–73 (2013)
Cahill, Jameson, Mixon, Dustin G., Strawn, Nate: Connectivity and irreducibility of algebraic varieties of finite unit norm tight frames. SIAM Journal on Applied Algebra and Geometry 1(1), 38–72 (2017)
Cahill, Jameson, Mixon, Dustin G., Strawn, Nate: Connectivity and irreducibility of algebraic varieties of finite unit norm tight frames. SIAM Journal on Applied Algebra and Geometry 1(1), 38–72 (2017)
Cahill, Jameson, Strawn, Nate: Algebraic geometry and finite frames. In: Casazza, Peter G., Kutyniok, Gitta (eds.) Finite Frames. Applied and Numerical Harmonic Analysis, pp. 141–170. Birkhäuser, Boston, MA, USA (2013)
Cannas da Silva, Ana: Lectures on Symplectic Geometry. Lecture Notes in Mathematics, vol. 1764. Springer-Verlag, Berlin, Heidelberg (2001)
Casazza, Peter G., Fickus, Matthew: Minimizing fusion frame potential. Acta Applicandae Mathematicae 107(1–3), 7–24 (2009)
Casazza, Peter G., Fickus, Matthew, Mixon, Dustin G., Wang, Yang, Zhou, Zhengfang: Constructing tight fusion frames. Applied and Computational Harmonic Analysis 30(2), 175–187 (2011)
Casazza, Peter G., Kovačević, Jelena: Equal-norm tight frames with erasures. Advances in Computational Mathematics 18(2–4), 387–430 (2003)
Casazza, Peter G., Kutyniok, Gitta: Frames of subspaces. In Christopher Heil, Palle E. T. Jorgensen, and David R. Larson, editors, Wavelets, Frames and Operator Theory, number 345 in Contemporary Mathematics, pages 87–113. American Mathematical Society, Providence, RI, USA, (2004)
Casazza, Peter G., Kutyniok, Gitta, Li, Shidong: Fusion frames and distributed processing. Applied and Computational Harmonic Analysis 25(1), 114–132 (2008)
Casazza, Peter G., Kutyniok, Gitta, Li, Shidong, Rozell, Christopher J.: Modeling sensor networks with fusion frames. In: Van De Ville, Dimitri, Goyal, Vivek K., Papadakis, Manos (eds.) Wavelets XII, volume 6701, page 67011M. International Society for Optics and Photonics, SPIE (2007)
Casazza, Peter G., Kutyniok, Gitta, Philipp, Friedrich: Introduction to finite frame theory. In: Casazza, Peter G., Kutyniok, Gitta (eds.) Finite Frames. Applied and Numerical Harmonic Analysis, pp. 1–53. Birkhäuser, Boston, MA, USA (2013)
Casazza, Peter G., Leon, Manuel T.: Existence and construction of finite frames with a given frame operator. International Journal of Pure and Applied Mathematics 63(2), 149–157 (2010)
Daubechies, Ingrid, Grossmann, Alexander, Meyer, Yves: Painless nonorthogonal expansions. Journal of Mathematical Physics 27(5), 1271–1283 (1986)
Donoho, David L., Elad, Michael: Optimally sparse representation in general (nonorthogonal) dictionaries via $\ell ^1$ minimization. Proceedings of the National Academy of Sciences of the United States of America 100(5), 2197–2202 (2003)
Duffin, Richard James, Schaeffer, Albert Charles: A class of nonharmonic Fourier series. Transactions of the American Mathematical Society 72(2), 341–366 (1952)
Dykema, Ken, Strawn, Nate: Manifold structure of spaces of spherical tight frames. International Journal of Pure and Applied Mathematics 28(2), 217–256 (2006)
Eldar, Yonina C., Kuppinger, Patrick, Bolcskei, Helmut: Block-sparse signals: Uncertainty relations and efficient recovery. IEEE Transactions on Signal Processing 58(6), 3042–3054 (2010)
Fulton, William: Eigenvalues, invariant factors, highest weights, and Schubert calculus. Bulletin of the American Mathematical Society 37(3), 209–249 (2000)
Heineken, Sigrid B., Llarena, Juan P., Morillas, Patricia M.: On the minimizers of the fusion frame potential. Mathematische Nachrichten 291(4), 669–681 (2018)
Hilbert, David: Ueber die vollen Invariantensysteme. Mathematische Annalen 42(3), 313–373 (1893)
Holmes, Roderick B., Paulsen, Vern I.: Optimal frames for erasures. Linear Algebra and its Applications 377, 31–51 (2004)
Horn, Alfred: Eigenvalues of sums of Hermitian matrices. Pacific Journal of Mathematics 12(1), 225–241 (1962)
Horn, Roger A., Johnson, Charles R.: Matrix Analysis. Cambridge University Press, Cambridge, UK (2013)
Kaftal, Victor, Larson, David R., Zhang, Shuang: Operator-valued frames. Transactions of the American Mathematical Society 361(12), 6349–6385 (2009)
Kapovich, Michael, Millson, John J.: On the moduli space of polygons in the Euclidean plane. Journal of Differential Geometry 42(2), 430–464 (1995)
Kirwan, Frances: Cohomology of Quotients in Symplectic and Algebraic Geometry. Mathematical Notes, vol. 31. Princeton University Press, Princeton, NJ, USA (1984)
Kirwan, Frances: Convexity properties of the moment mapping. III. Inventiones Mathematicae 77(3), 547–552 (1984)
Klyachko, Alexander A.: Stable bundles, representation theory and Hermitian operators. Selecta Mathematica 4(3), 419–445 (1998)
Knutson, Allen, Tao, Terence: The honeycomb model of $GL_n({\mathbb{C} })$ tensor products I: Proof of the saturation conjecture. Journal of the American Mathematical Society 12(4), 1055–1090 (1999)
Knutson, Allen, Tao, Terence, Woodward, Christopher: The honeycomb model of $GL_n({\mathbb{C} })$ tensor products II: Puzzles determine facets of the Littlewood-Richardson cone. Journal of the American Mathematical Society 17(1), 19–48 (2004)
Kutyniok, Gitta, Pezeshki, Ali, Calderbank, Robert, Liu, Taotao: Robust dimension reduction, fusion frames, and Grassmannian packings. Applied and Computational Harmonic Analysis 26(1), 64–76 (2009)
Manetti, Marco: Topology. Unitext, vol. 91. Springer, Cham (2015)
Marshall, Albert W., Olkin, Ingram, Arnold, Barry C.: Inequalities: Theory of Majorization and Its Applications, 2nd edn. Springer Series in Statistics. Springer, New York, NY, USA (2011)
Massey, Pedro G., Ruiz, Mariano A., Stojanoff, Demetrio: The structure of minimizers of the frame potential on fusion frames. Journal of Fourier Analysis and Applications 16(4), 514–543 (2009)
McDuff, Dusa, Salamon, Dietmar: Introduction to Symplectic Topology, volume 27 of Oxford Graduate Texts in Mathematics. Oxford University Press, Oxford, UK, third edition, (2017)
Mixon, Dustin G., Needham, Tom, Shonkwiler, Clayton, Villar, Soledad: Three proofs of the Benedetto–Fickus theorem. Preprint, arXiv:2112.02916 [math.MG], (2021)
Mumford, David: Projective invariants of projective structures and applications. In Proceedings of the International Congress of Mathematicians, Stockholm, 1962, pages 526–530. Almqvist & Wiksells, Uppsala, Sweden, (1963)
Mumford, David, Fogarty, John, Kirwan, Frances: Geometric Invariant Theory. Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 34. Springer-Verlag, Berlin (1994)
Needham, Tom, Shonkwiler, Clayton: Symplectic geometry and connectivity of spaces of frames. Advances in Computational Mathematics 47, 5 (2021)
Needham, Tom, Shonkwiler, Clayton: Admissibility and frame homotopy for quaternionic frames. Linear Algebra and its Applications 645, 237–255 (2022)
Needham, Tom, Shonkwiler, Clayton: Toric symplectic geometry and full spark frames. Applied and Computational Harmonic Analysis 61, 254–287 (2022)
Palais, Richard S., Terng, Chuu-Lian: Critical Point Theory and Submanifold Geometry. Lecture Notes in Mathematics, vol. 1353. Springer-Verlag, Berlin (1988)
Schur, Issai: Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie. Sitzungsberichte der Berliner Mathematischen Gesellschaft 22, 9–20 (1923)
Sjamaar, Reyer, Lerman, Eugene: Stratified symplectic spaces and reduction. The Annals of Mathematics, Second Series 134(2), 375–422 (1991)
Thomas, Richard P.: Notes on GIT and symplectic reduction for bundles and varieties. Surveys in Differential Geometry 10(1), 221–273 (2005)
Waldron, Shayne F. D.: An Introduction to Finite Tight Frames. Applied and Numerical Harmonic Analysis. Birkhäuser, New York, NY, USA, (2018)
Xia, Yu, Li, Song: Nonuniform recovery of fusion frame structured sparse signals. Analysis and Applications 15(03), 333–352 (2017)

Acknowledgements

We are grateful to Dustin Mixon and Soledad Villar for discussions over the past year. Our joint chapter [49] served as an important catalyst for this paper. This work was supported by grants from the National Science Foundation (DMS–2107808, Tom Needham; DMS–2107700, Clayton Shonkwiler).

Author information

Authors and Affiliations

Department of Mathematics, Florida State University, Tallahassee, FL, USA
Tom Needham
Department of Mathematics, Colorado State University, Fort Collins, CO, USA
Clayton Shonkwiler

Corresponding author

Correspondence to Tom Needham.

Additional information

Communicated by Pete Casazza.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Needham, T., Shonkwiler, C. Fusion Frame Homotopy and Tightening Fusion Frames by Gradient Descent. J Fourier Anal Appl 29, 51 (2023). https://doi.org/10.1007/s00041-023-10028-0

Received: 06 September 2022
Revised: 07 March 2023
Accepted: 01 June 2023
Published: 28 July 2023
DOI: https://doi.org/10.1007/s00041-023-10028-0
