Abstract
Ranked data sets, where m judges/voters specify a preference ranking of n objects/candidates, are increasingly prevalent in contexts such as political elections, computer vision, recommender systems, and bioinformatics. The vote counts for each ranking can be viewed as an n!-dimensional data vector lying on the permutahedron, which is a Cayley graph of the symmetric group with vertices labeled by permutations and an edge when two permutations differ by an adjacent transposition. Leveraging combinatorial representation theory and recent progress in signal processing on graphs, we investigate a novel, scalable transform method to interpret and exploit structure in ranked data. We represent data on the permutahedron using an overcomplete dictionary of atoms, each of which captures both smoothness information about the data (typically the focus of spectral graph decomposition methods in graph signal processing) and structural information about the data (typically the focus of symmetry decomposition methods from representation theory). These atoms have a more naturally interpretable structure than any known basis for signals on the permutahedron, and they form a Parseval frame, ensuring beneficial numerical properties such as energy preservation. We develop specialized algorithms and open software that take advantage of the symmetry and structure of the permutahedron to improve the scalability of the proposed method, making it more applicable to the high-dimensional ranked data found in applications.
1 Introduction
Ranked data consist of m judges/voters specifying a preference ranking of n objects/candidates. While methods for analyzing such rankings date back to the late 18th century in the context of social choice theory (e.g., [4, 11, 23]), ranked data are increasingly prevalent in contexts such as computer vision [44, 45], recommender systems [86, 98], image processing [8, 99], crowdsourced subjective labeling [9, 16, 92], peer grading [78], metasearch [1], sports analytics [27], computational geometry [48], and bioinformatics [12, 26, 61, 96] (see [90, Sect. 2.2] for an excellent, comprehensive overview of application areas). Moreover, an increasing number of cities, states, colleges and universities, organizations, and corporations are using ranked choice voting for elections [35].
The vote counts for the n! possible rankings of n objects form an n!-dimensional data vector in \({\mathbb {R}}^{n!}\). We view this vector as lying on the permutahedron, denoted \({\mathbb {P}}_n\) and also referred to by some as the permutation polytope [94]. The permutahedron has vertices labeled by permutations and an edge when two permutations differ by transposing adjacent entries in the permutation. For example, in \({\mathbb {P}}_5\), the permutation 25134 corresponds to ranking candidate 2 first, candidate 5 second, candidate 1 third, and so on, and it is connected by an edge to each of 52134, 21534, 25314, and 25143. The permutahedron \({\mathbb {P}}_n\) is the Cayley graph of the symmetric group \({\mathbb {S}}_n\) induced by the generating set of adjacent transpositions (see Sect. 3.1), and a signal on the permutahedron \({\mathbb {P}}_n\) is a function \(f: {\mathbb {S}}_n \rightarrow {\mathbb {R}}\). In this context, \(f(\sigma )\) equals the number of votes for the permutation \(\sigma \).
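The adjacency structure just described is easy to generate for small n. The following is a minimal Python sketch (our own illustration, not the authors' software; the function name `permutahedron_edges` is hypothetical), labeling vertices by one-line permutation strings:

```python
# Sketch: build the permutahedron P_n with vertices labeled by permutations
# and edges between permutations differing by an adjacent transposition.
from itertools import permutations

def permutahedron_edges(n):
    """Return the vertex list and edge set of P_n (one-line notation labels)."""
    verts = [''.join(map(str, p)) for p in permutations(range(1, n + 1))]
    edges = set()
    for v in verts:
        for i in range(n - 1):
            # swap the entries in positions i+1 and i+2, i.e. right
            # multiplication by the adjacent transposition (i+1, i+2)
            w = v[:i] + v[i + 1] + v[i] + v[i + 2:]
            edges.add(frozenset((v, w)))
    return verts, edges

verts, edges = permutahedron_edges(5)
# P_5 has 5! = 120 vertices, each of degree n-1 = 4, hence 120*4/2 = 240 edges
```

For instance, the neighbors of 25134 produced this way are exactly 52134, 21534, 25314, and 25143, matching the example above.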
To deal with the scale of this data (factorial in the number of candidates), it is critical to construct efficient and meaningful representations that highlight salient features of the ranking tallies. Specifically, we follow the common signal processing approach of constructing a dictionary of atoms and representing a signal on the permutahedron as a linear combination of these atoms. For audio signals and images, as well as signals residing on more general weighted graphs, Fourier, time-frequency, curvelet, shearlet, bandlet, and other dictionaries have led to resounding successes in visual analysis of data, statistical analysis of data, compression, and as regularizers in machine learning and ill-posed inverse problems such as inpainting, denoising, and classification (see, e.g., [82, Sect. II] for an excellent historical overview of dictionary design methods and signal transforms).
In general, desirable properties when designing dictionaries include: (i) the dictionaries comprise an orthonormal basis or tight frame for the signal space, ensuring that the contribution of each atom can be computed via an inner product with the signal, and the energy of the signal is equal to the energy of the transform coefficients; (ii) the atoms have an interpretable structure, so that the inner products between the signal and each atom are informative; (iii) it is numerically efficient to apply the dictionary analysis and synthesis operators (forward and inverse transforms); and (iv) signals of certain mathematical classes can be represented exactly or approximately as sparse linear combinations of a subset of the dictionary atoms [87].
The main contributions of this work are as follows. First, we leverage techniques and ideas from both signal processing on graphs and combinatorial representation theory to propose a novel dictionary construction that can be used to transform high-dimensional ranked data in order to find, interpret, and exploit structural patterns in the rankings. Each of the atoms in the overcomplete dictionaries we propose captures both smoothness information about the data (typically the focus of spectral graph decomposition methods in graph signal processing) and structural information about the data (typically the focus of symmetry decomposition methods from representation theory). Second, we prove that the proposed dictionaries comprise tight Parseval frames and therefore preserve the energy of the signal (Theorems 1 and 2 in Sect. 4). Third, we demonstrate the application of the proposed transform methods and show how the interpretable structure of the atoms can lead to insights on real ranked data sets (Sect. 5). Fourth, we investigate numerical challenges, and propose novel algorithms that take advantage of the symmetry and structure of the permutahedron to enhance the scalability of our implementations for applying the analysis operators arising from our proposed dictionaries (Sect. 6). Fifth, we relate the proposed transform methods to related methods from combinatorial representation theory, graph signal processing, and statistical modeling for ranked data (Sects. 3 and 7).
2 Example Data Sets
We use the following three ranked data sets in running examples throughout this article.
2.1 1980 American Psychological Association Presidential Election
In Fig. 1 (left), on the permutahedron \({\mathbb {P}}_5\) we plot the vote tallies of the 5738 American Psychological Association (APA) members who ranked all five candidates for APA president in 1980 (out of the 15,449 total ballots cast) [14, 29, Table 1]. Under the instant runoff (Hare) voting system in which the votes for the candidate with the fewest first place votes in each iteration are transferred to the next ranked candidate on those ballots, candidate 1 was the winner.
2.2 2017 Minneapolis City Council Ward 3 Election
In Fig. 1 (right), on the permutahedron \({\mathbb {P}}_4\) we plot the vote tallies of the 5055 voters who ranked at least three of the four candidates for the Minneapolis City Council Ward 3 seat in 2017 (out of 9578 total valid votes cast) [73]. If a voter ranked three candidates, we assume that the unranked candidate was the voter’s fourth choice. Candidates 1 to 4 are Ginger Jentzen (Socialist-Alternative), Samantha Pree-Stinson (Green), Steve Fletcher (Democratic-Farmer-Labor), and Tim Bildsoe (Democratic-Farmer-Labor), respectively. Pree-Stinson began as a Democratic-Farmer-Labor (DFL) candidate, but was later endorsed by the Green Party. Fletcher was endorsed by the DFL Party. Jentzen received the most first place votes, but Fletcher won the election under the instant runoff (Hare) voting system utilized by Minneapolis. A discussion of the candidates’ views and an interesting analysis of the voting results by geographical regions within the ward are contained in [101].
2.3 Sushi Preference Data
For a data set with more candidates (\(n=10\)), we use Kamishima’s type A set of sushi preferences [50, 51], in which 5000 people provide complete preference rankings for the ten different types of sushi listed in Fig. 2a. Since plotting a signal on the permutahedron \({\mathbb {P}}_{10}\) (a 9-dimensional object) is not particularly informative, we show in Fig. 2b the projection of the data into a two-dimensional space via Gabriel’s biplot [22, 39, 67, Sect. 2.2], which is similar to a principal components projection except with a different choice of center. Chen [15, Sect. 5.3] notes that there is a full Condorcet ranking of the ten sushi items.
That is, in a preference comparison of any pair of two sushi items, the majority of voters would prefer the item that falls earlier in this ranking; i.e., the majority of voters prefer fatty tuna to any other item, the majority of voters prefer tuna to any item besides fatty tuna, and so forth.
3 Related Work
Analysis of ranked data has a long history in the mathematical psychology and statistics literatures. Approaches have largely focused on parametric statistical models including order statistic models, distance-based models, and pairwise comparison models (see, e.g., [62, 63, 67, 90, Sect. 2.5], and [100] for excellent overviews of these models). We focus our attention here on linear transforms for ranked data that attempt to identify structure in the data by taking inner products between the n!-dimensional vote tally (signal on the permutahedron) and building block signals that have some interpretable structure.
3.1 The Fourier Analysis on the Symmetric Group Approach
The vertices of the permutahedron \({\mathbb {P}}_n\) are labeled by the symmetric group \({\mathbb {S}}_n\) of permutations of \(\{1, \ldots ,n\}\). For example \(\sigma = \left( {\begin{matrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 1 & 3 & 4 \end{matrix}}\right) = 25134\in {\mathbb {S}}_5\) is the bijective function \(1 \mapsto 2, 2 \mapsto 5,\) etc. We let (i, j) denote the transposition (in cycle notation) that exchanges i and j and fixes the other entries, and the adjacent transpositions are \((i,i+1), 1 \le i \le n-1\). The group operation in \({\mathbb {S}}_n\) is composition of functions, and right multiplication by an adjacent transposition \((i,i+1)\) exchanges the candidates in positions i and \(i+1\). For example, \(\left( {\begin{matrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 1 & 3 & 4 \end{matrix}}\right) (2,3) = \left( {\begin{matrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{matrix}}\right) \). In this context, the permutahedron \({\mathbb {P}}_n\) is the Cayley graph with vertices labeled by \({\mathbb {S}}_n\) and edges \(\{ (\sigma , \sigma s) \mid \sigma \in {\mathbb {S}}_n, s \in S\}\), where \(S = \{(i,i+1) \mid 1 \le i \le n-1\}\). Each transposition is its own inverse, so the generating set S is closed under inverses and \({\mathbb {P}}_n\) is a simple graph that is \((n-1)\)-regular, since \(|S| = n-1\).
Using right multiplication by transpositions is natural when studying ranked data, because two permutations are adjacent if and only if they differ by transposing adjacent candidates in the ranking. Under left multiplication by an adjacent transposition \((i,i+1)\), two rankings are adjacent if and only if they differ by swapping the ranking positions of candidates i and \(i+1\), wherever those two candidates appear. For example, \((2,3) \cdot \left( {\begin{matrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 1 & 3 & 4 \end{matrix}}\right) = \left( {\begin{matrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 5 & 1 & 2 & 4 \end{matrix}}\right) \), which exchanges the positions of candidates 2 and 3. The adjacent transpositions S satisfy the Coxeter relations [83, 2.12.10] and give \({\mathbb {S}}_n\) the structure of a reflection group. If one uses the full set of transpositions \(\{(i,j) \mid 1 \le i < j \le n\}\), then the generating set is a full conjugacy class in \({\mathbb {S}}_n\). In that case, the Cayley graph is said to be quasi-abelian, and the Laplacian eigenvalues and eigenvectors are especially nice, but they are less interpretable in the context of ranked data as we discuss in Sect. 3.2.
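The difference between the two actions is easy to see computationally. Here is a short sketch of our own (the helper `compose` is a hypothetical name), treating permutations as 1-indexed tuples in one-line notation:

```python
# Sketch: composition of permutations, illustrating right vs. left
# multiplication by the adjacent transposition (2,3).
def compose(a, b):
    """Function composition (a o b)(i) = a(b(i)); 1-indexed tuples."""
    return tuple(a[b[i] - 1] for i in range(len(a)))

sigma = (2, 5, 1, 3, 4)          # the ranking 25134
t23   = (1, 3, 2, 4, 5)          # the adjacent transposition (2,3)

right = compose(sigma, t23)      # swaps the candidates in positions 2 and 3
left  = compose(t23, sigma)      # swaps the positions of candidates 2 and 3
# right is (2, 1, 5, 3, 4), i.e. 21534; left is (3, 5, 1, 2, 4), i.e. 35124
```

These reproduce the two worked examples above: \(\sigma (2,3) = 21534\) and \((2,3)\sigma = 35124\).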
Data on \({\mathbb {P}}_n\) lives in the underlying real vector space \({\mathbb {R}}[ {\mathbb {S}}_n]\) (or, equivalently \({\mathbb {R}}^{n!}\)) spanned by the symmetric group \({\mathbb {S}}_n\), which is called the group algebra of \({\mathbb {S}}_n\) and has a canonical basis \(\{ {\mathbf {e}}_\sigma \}_{\sigma \in {\mathbb {S}}_n}\).Footnote 1 A signal on \({\mathbb {S}}_n\) is a function \(f: {\mathbb {S}}_n \rightarrow {\mathbb {R}}\), which we view as a vector in \({\mathbb {R}}[{\mathbb {S}}_n]\) by \(\mathbf{f} = \sum _{\sigma \in {\mathbb {S}}_n} f(\sigma ) {\mathbf {e}}_\sigma \). Non-commutative harmonic analysis attempts to find structure in ranked data by decomposing them into subspaces that are set-wise invariant under the relabeling (left multiplication) and/or re-ranking (right multiplication) of the candidates [29]. Specifically, a permutation \(\sigma \in {\mathbb {S}}_n\) acts on the right and on the left, respectively, of a signal \(\mathbf{f}\), by

\(\mathbf{f}\sigma = \sum _{\pi \in {\mathbb {S}}_n} f(\pi )\, {\mathbf {e}}_{\pi \sigma } \qquad \text {and} \qquad \sigma \mathbf{f} = \sum _{\pi \in {\mathbb {S}}_n} f(\pi )\, {\mathbf {e}}_{\sigma \pi }.\)  (1)
The vector space \({\mathbb {R}}[{\mathbb {S}}_n]\) decomposes into the direct sum of orthogonal subspaces, called isotypic components,

\({\mathbb {R}}[{\mathbb {S}}_n] = \bigoplus _{\gamma \vdash n} W_\gamma ,\)  (2)
where the sum is over all integer partitions \(\gamma =[\gamma _1,\gamma _2,\ldots ,\gamma _\ell ]\) of n, denoted by \(\gamma \vdash n\). The subspaces \(W_\gamma \) are invariant under both relabeling and reindexing by \({\mathbb {S}}_n\); that is, if \({\mathbf {w}}\in W_\gamma \), then \({\mathbf {w}}\sigma \in W_\gamma \) and \(\sigma {\mathbf {w}}\in W_\gamma \) for any permutation \(\sigma \in {\mathbb {S}}_n\). Thus, \(W_\gamma \) is both a left and a right submodule of \({\mathbb {R}}[{\mathbb {S}}_n]\). We refer to the integer partition \(\gamma \) as the shape or symmetry type (we use these terms interchangeably) of \(W_\gamma \).
The isotypic components further decompose into a direct sum of irreducible left and right \({\mathbb {S}}_n\)-submodules, respectively, as

\(W_\gamma = \bigoplus _{i=1}^{d_\gamma } V_{\gamma ,i} = \bigoplus _{i=1}^{d_\gamma } V_{\gamma ,i}^*,\)  (3)
where \(V_{\gamma ,i} \cong V_{\gamma ,j}\) and \(V_{\gamma ,i}^*\cong V_{\gamma ,j}^*\) are isomorphic as left and right \({\mathbb {S}}_n\)-modules, respectively, and \(V_{\gamma ,i} \not \cong V_{\rho ,j}\) and \(V_{\gamma ,i}^*\not \cong V_{\rho ,j}^*\) if \(\gamma \not = \rho \). The modules \(V_{\gamma ,i}\) are left invariant, meaning \(\sigma {\mathbf {v}}\in V_{\gamma ,i}\) for all \({\mathbf {v}}\in V_{\gamma ,i}\) and \(\sigma \in {\mathbb {S}}_n\). The modules \(V_{\gamma ,i}^*=\{ f: V_{\gamma ,i} \rightarrow {\mathbb {R}}\}\) are the dual vector spaces of linear functionals on \(V_{\gamma ,i}\) and are invariant under the right action \((f\sigma )({\mathbf {v}}) = f(\sigma {\mathbf {v}})\). The left action of \(\sigma \in {\mathbb {S}}_n\) on permutations replaces i with \(\sigma (i)\) and the right action replaces the entry in the ith position with the entry in the \(\sigma (i)\)th position, so the left modules \(V_{\gamma ,i}\) are invariant under relabeling the candidates and the right modules are invariant under re-indexing the candidates. For shorthand, we write \(\bigoplus _{i = 1}^{d_\gamma } V_{\gamma ,i}\) as \(V_{\gamma }^{\oplus d_\gamma }\) and \(\bigoplus _{i = 1}^{d_\gamma } V_{\gamma ,i}^*\) as \((V_{\gamma }^*)^{\oplus d_\gamma }\). The famous isomorphism in (3) was proved around 1900 in the work of Frobenius and Burnside. It was extended in the 1920s to hold for topological groups in the Peter–Weyl theorem. This decomposition has special properties: (i) the dimension \(d_\gamma = \dim (V_{\gamma })\) equals the multiplicity of \(V_{\gamma }\) in \(W_\gamma \); (ii) \(d_\gamma \) also equals the number of standard Young tableaux of shape \(\gamma \) and can be computed using the hook formula (see, e.g., [83, 3.10]); (iii) \(\dim (W_\gamma ) = d_\gamma ^2\); and (iv) the only left submodules of \({\mathbb {R}}[{\mathbb {S}}_n]\) isomorphic to \(V_{\gamma }\) appear in \(W_\gamma \).
The same properties hold for the right action with the dual spaces.
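Properties (ii) and (iii) are easy to check numerically. The sketch below is our own illustration (the names `hook_dim` and `partitions` are hypothetical): it computes \(d_\gamma \) via the hook formula and verifies that the dimensions \(d_\gamma ^2\) of the isotypic components sum to \(n! = \dim {\mathbb {R}}[{\mathbb {S}}_n]\).

```python
# Sketch: d_gamma via the hook length formula, plus a dimension check.
from math import factorial

def hook_dim(gamma):
    """d_gamma = n! / (product of hook lengths of the shape gamma)."""
    n = sum(gamma)
    prod = 1
    for i, row in enumerate(gamma):
        for j in range(row):
            arm = row - j - 1                              # cells to the right
            leg = sum(1 for r in gamma[i + 1:] if r > j)   # cells below
            prod *= arm + leg + 1
    return factorial(n) // prod

def partitions(n, maxpart=None):
    """All integer partitions of n, parts weakly decreasing."""
    maxpart = n if maxpart is None else maxpart
    if n == 0:
        yield ()
        return
    for k in range(min(n, maxpart), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

assert hook_dim((3, 2)) == 5                       # so dim W_[3,2] = 25
assert sum(hook_dim(g) ** 2 for g in partitions(5)) == factorial(5)
```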
In Fig. 3, we display the APA data and Minneapolis City Council election data from Fig. 1 as the sum of projections onto the orthogonal subspaces \(W_{\gamma }\).
How can this approach be used to find structure in ranked data? The isotypic decomposition (2) has a close relation with marginal statistics. The first order marginals of ranked data, shown in Fig. 4 for an election with four candidates, capture how many voters placed candidate i in ranking position j. There are two types of second order marginals. The unordered type, shown in Fig. 5a for the same election, capture how many voters placed candidates i and \(i^\prime \) in ranking positions j and \(j^\prime \), in either order. The ordered type, shown in Fig. 5b, capture how many voters placed candidate i in ranking position j and candidate \(i^\prime \) in ranking position \(j^\prime \). Higher order marginals that capture how many voters placed \(k>2\) specific candidates in k specific ranking positions may be totally unordered, totally ordered, or partially ordered (e.g., candidates i and \(i^{\prime }\) are in positions j and \(j^\prime \) in either order, and candidate \(i^{\prime \prime }\) is in position \(j^{\prime \prime }\)). In all of these marginal tables, each row and each column sum to the total number of votes (5055 in Figs. 4, 5).
Spectral analysis through projections onto the isotypic components captures order effects related to a specific marginal, net of the structure found in the “less complicated” marginals [29, 67, Sect. 2.6.1]. For example, Fig. 4 shows that of all first order effects, two of the most significant are that Steve Fletcher was far more likely than would be the case under a uniform distribution to be listed as the second choice, and far less likely to be listed as the fourth choice. These first order effects also appear in the second order marginals of Fig. 5, as well as higher order marginals. However, the projection of the signal \(\mathbf{g}\) from Fig. 1 onto \(W_{[2,2]}\) captures information about the second order unordered marginals net of the number of voters (zero order) and first order marginals. Similarly, the projection of \(\mathbf{g}\) onto \(W_{[2,1,1]}\) captures information about the second order ordered marginals, net of the zero and first order marginals and the second order unordered marginals.
More generally, the indices of orthogonal subspaces \(W_\gamma \) in (2) have a natural partial ordering, referred to as dominance ordering. The shape \(\nu =[\nu _1,\nu _2,\ldots ,\nu _{\ell }]\) strictly dominates shape \(\gamma = [\gamma _1,\gamma _2,\ldots ,\gamma _{{\ell }^\prime }] \ne \nu \), denoted by \(\nu \vartriangleright \gamma \), if \(\sum _{i=1}^j \nu _i \ge \sum _{i=1}^j \gamma _i\) for each \(j=1,2,\ldots ,\max \{{\ell },{\ell }^{\prime }\}\). For example, with \(n=6\), the shape [4, 2] dominates both [3, 3] and [4, 1, 1], but the latter two shapes are incomparable (neither one dominates the other). The projection of a signal onto an isotypic component \(W_\gamma \) captures information about the marginals corresponding to shape \(\gamma \), net of the marginals corresponding to all shapes strictly preceding \(\gamma \) in dominance order.
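The dominance comparison is just a partial-sum test, as in this short sketch (our own code; `dominates` is a hypothetical name):

```python
# Sketch: weak dominance order on partitions via partial sums,
# padding the shorter shape with zeros.
def dominates(nu, gamma):
    """True if every partial sum of nu is >= the matching partial sum of gamma."""
    L = max(len(nu), len(gamma))
    nu = list(nu) + [0] * (L - len(nu))
    gamma = list(gamma) + [0] * (L - len(gamma))
    s, t = 0, 0
    for a, b in zip(nu, gamma):
        s += a
        t += b
        if s < t:
            return False
    return True

# with n = 6: [4,2] dominates both [3,3] and [4,1,1],
# but [3,3] and [4,1,1] are incomparable
assert dominates((4, 2), (3, 3)) and dominates((4, 2), (4, 1, 1))
assert not dominates((3, 3), (4, 1, 1)) and not dominates((4, 1, 1), (3, 3))
```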
With this justification, returning to the APA example in Fig. 3, Diaconis [29, Sect. 2C] interprets the relatively large contribution on the isotypic component \(W_{[3,2]}\), corresponding to the two-row shape [3, 2], as the contribution of unordered pair (second order) effects, and he argues that this is related to the fact that the APA divides into two groups: academics and clinicians. This Fourier analysis approach has also found application in “Q-sort” data in psychology [28, 5B], balanced incomplete block designs [80], multiple object tracking [58], finding graph invariants [56], the quadratic assignment problem [54], computational geometry [48], genomic data analysis [96], and sports analytics [27].
What are the limitations of this approach? As shown in Fig. 6, two signals with different structure and support can have exactly the same energy decomposition into isotypic components, limiting the amount of information that can be extracted from this energy decomposition alone.
A more refined approach is to use the Fourier transform of \(\mathbf{f}\) on the symmetric group, which is defined as the set of matrices \(\{{\hat{f}}(\gamma ) \mid \gamma \text{ a } \text{ partition } \text{ of } n\}\), where for each integer partition \(\gamma \), \({\hat{f}}(\gamma )\) is defined as the sum \(\sum _{\sigma \in {\mathbb {S}}_n} f(\sigma ) \rho _\gamma (\sigma )\) [28]. Here, \(\rho _\gamma (\sigma )\) is the matrix of the permutation \(\sigma \) as a linear transformation on the \({\mathbb {S}}_n\)-invariant subspace \(V_{\gamma ,i}\) (for any i). This set \(\{{\hat{f}}(\gamma ) \mid \gamma \text { a partition of } n\}\) has a total of n! matrix entries, which are viewed as the Fourier coefficients of \(\mathbf{f}\). Unfortunately, there is no natural choice of basis of the irreducible components \(V_{\gamma ,i}\) (see for example Diaconis [29, p. 955]), and therefore ad hoc methods are used to interpret these values. The standard choices are Young’s seminormal basis or Young’s orthogonal basis [28, 8A], [19, 30, 57], which are used largely because they are well-adapted to fast computation, but they lack interpretability.
Alternatively, [29, Sect. 2C] applies Mallows’ method: construct an overcomplete spanning set for \(W_\gamma \) by projecting interpretable functions that capture kth order effects \(\left( k=\sum _{i=2}^\ell \gamma _i\right) \) onto \(W_\gamma \), and then take inner products between these projections and the signal. In Fig. 7, we show 2 of the 36 interpretable second order functions and their projections in the overcomplete spanning set for \(W_{[2,2]}\). In general, there are \(m_\gamma ^2\) of these spanning vectors for the \(d_\gamma ^2\)-dimensional space \(W_\gamma \), where \(m_\gamma =\frac{n!}{\prod _{l=1}^{\ell } \gamma _l !}\).
For this particular isotypic component \(W_{[2,2]}\), nine projected functions each appear four times in the 36 spanning vectors; one such set of identical projections is listed in (4).
For the 2017 Minneapolis City Council Ward 3 election data \(\mathbf{g}\) shown in Fig. 1, the largest inner products in (5) are the ones between the signal and the identical projections listed in (4). This can be seen visually by thinking about the inner products in the form listed in the right-hand side of (5) and then examining the projection of \(\mathbf{g}\) onto \(W_{[2,2]}\) in Fig. 3. Again, an interpretation of these inner products is that net of zero and first order marginal effects (i.e., starting from the projection of \(\mathbf{g}\) onto \(W_{[2,2]}\)), the pairs of candidates \(\{1,2\}\) and \(\{3,4\}\) are likely to appear together in ranking positions \(\{1,2\}\) or \(\{3,4\}\). See [96] for an additional application of this method to genomic data.
The new approach we propose in Sect. 4 also yields an overcomplete spanning set, but we directly construct each spanning vector as an interpretable function in \(V_{\gamma ,i}\) (and in fact in a more precisely defined eigenspace that is a strict subset of \(V_{\gamma ,i}\)).
Finally, it is noteworthy that the non-commutative Fourier analysis approach does not make any direct use of the permutahedron or any other underlying graph structure for that matter. Rather, it is completely independent of the choice of generating set of the symmetric group. We return to this point at the end of Sect. 3.2.
3.2 The Graph Signal Processing Approach
Within the last ten years, researchers in the field of graph signal processing [75, 88] have developed new methods to identify and exploit structure in data residing on the vertices of weighted or unweighted graphs. Signals on the unweighted permutahedron, as shown in Fig. 1, fit into this framework. While, to our knowledge, such ranked data on the permutahedron have not been analyzed with graph signal processing techniques, the natural first method to apply would be the graph Fourier transform. The graph Laplacian matrix is defined as \({\varvec{\mathcal {L}}}:=\mathbf{D}-\mathbf{A}\), where \(\mathbf{A}\) is the adjacency matrix of the permutahedron and the diagonal degree matrix \(\mathbf{D}\) is equal to \((n-1)\mathbf{I}_{n!}\), where \(\mathbf{I}_k\) is a \(k \times k\) identity matrix, since the permutahedron is an \((n-1)\)-regular graph. The matrix \({\varvec{\mathcal {L}}}\) is symmetric with nonnegative eigenvalues, and each eigenvalue \(\lambda \) is associated with an orthogonal eigenspace \(U_\lambda \). The signal space \({\mathbb {R}}[{\mathbb {S}}_n]\) decomposes into a direct sum of these orthogonal eigenspaces,

\({\mathbb {R}}[{\mathbb {S}}_n] = \bigoplus _{\lambda \in \Lambda } U_\lambda ,\)  (6)

where \(\Lambda \) denotes the set of distinct eigenvalues of \({\varvec{\mathcal {L}}}\).
A common definition of the graph Fourier transform is \(\hat{f}(\lambda _{{\ell }}):=|\langle \mathbf{f},\mathbf{u}_{{\ell }} \rangle |\), where \(\mathbf{u}_{{\ell }}\) is the eigenvector associated with the \({\ell }\)th eigenvalue of \({\varvec{\mathcal {L}}}\) [88]. Since this definition depends on the particular choice of the Laplacian eigenvectors in the case of repeated eigenvalues (which occur in \({\mathbb {P}}_n\)), we define the graph Fourier transform here as \(\hat{f}(\lambda ):=\Vert \mathbf{f}_\lambda \Vert \), where \(\mathbf{f}_\lambda \) is the orthogonal projection of \(\mathbf{f}\) onto the eigenspace \(U_\lambda \). In the absence of repeated eigenvalues, these two definitions coincide.
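The eigenspace-projection definition can be prototyped directly for a small case; \({\mathbb {P}}_3\) is a 6-cycle, so the computation below (our own illustrative code, not the paper's software) is tiny:

```python
# Sketch: graph Fourier transform on P_3 via projections onto Laplacian
# eigenspaces, well defined even with repeated eigenvalues.
import numpy as np
from itertools import permutations

verts = [''.join(p) for p in permutations('123')]
idx = {v: i for i, v in enumerate(verts)}
N, n = len(verts), 3

A = np.zeros((N, N))
for v in verts:
    for i in range(n - 1):
        w = v[:i] + v[i + 1] + v[i] + v[i + 2:]   # adjacent transposition
        A[idx[v], idx[w]] = 1
L = (n - 1) * np.eye(N) - A                       # L = D - A, D = (n-1) I

evals, U = np.linalg.eigh(L)
f = np.random.default_rng(0).random(N)            # arbitrary toy signal

# group eigenvectors into eigenspaces and record ||f_lambda|| per eigenvalue
gft = {}
for lam in np.unique(np.round(evals, 8)):
    cols = U[:, np.isclose(evals, lam)]
    gft[lam] = np.linalg.norm(cols.T @ f)

# Parseval: the eigenspace energies sum to the signal energy
assert np.isclose(sum(c ** 2 for c in gft.values()), np.linalg.norm(f) ** 2)
```

For the 6-cycle the distinct Laplacian eigenvalues are 0, 1, 3, and 4, with the middle two repeated, so the projection-based definition matters even in this smallest nontrivial case.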
How can the graph Fourier transform and other graph signal processing techniques be used to find structure in ranked data? For each unit norm Laplacian eigenvector \(\mathbf{u}_\lambda \) associated with eigenvalue \(\lambda \), we have

\(\lambda = \mathbf{u}_\lambda ^{\top } {\varvec{\mathcal {L}}}\, \mathbf{u}_\lambda = \sum _{(\sigma ,\pi ) \in {{\mathcal {E}}}} \left( u_\lambda (\sigma ) - u_\lambda (\pi ) \right) ^2,\)  (7)
where \({{\mathcal {E}}}\) are the edges of the permutahedron. Therefore, the eigenvectors associated with lower Laplacian eigenvalues are smoother in the sense that the values vary less across neighboring vertices, as shown in Fig. 9. The graph Fourier transform provides a decomposition of the energy of the signal into the energy in each Laplacian eigenspace, yielding information about the signal’s smoothness, as shown in Fig. 8.
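The Laplacian quadratic form identity underlying this notion of smoothness can be verified numerically on any small graph (a sketch of our own, with an arbitrary toy graph):

```python
# Sketch: check that u^T L u equals the sum over edges of squared
# differences of u's values, for an arbitrary small graph and vector.
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # toy graph, not P_n
N = 4
L = np.zeros((N, N))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1                     # degrees on the diagonal
    L[i, j] -= 1; L[j, i] -= 1                     # minus the adjacency

u = np.array([0.5, -1.0, 2.0, 0.25])
lhs = u @ L @ u
rhs = sum((u[i] - u[j]) ** 2 for i, j in edges)
assert np.isclose(lhs, rhs)
```

Applied to a unit-norm eigenvector of the permutahedron Laplacian, the left-hand side is the eigenvalue \(\lambda \), recovering (the displayed identity for) smoothness.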
What are the limitations of applying the standard graph Fourier transform to signals on the permutahedron? First, because the graph Laplacian eigenvalues and eigenvectors of the permutahedron are not known in closed form, it is computationally intensive to compute the graph Fourier transform [\({{\mathcal {O}}}((n!)^3)\)], and not tractable for ranked data with more than seven or eight candidates. Second, there is not a natural orthonormal basis for each eigenspace that preserves the structure and symmetry of the graph. All but the first and last Laplacian eigenvalues of the permutahedron are repeated multiple times, and thus there are infinitely many choices of orthonormal bases for the associated eigenspaces \(\{U_\lambda \}_{\lambda \notin \{0,2(n-1)\}}\). As shown in Fig. 9 for two Laplacian eigenspaces of \({\mathbb {P}}_4\), the numerical computation of a basis is not guaranteed to preserve any sort of symmetry, leading to less interpretable basis vectors. Third, different isotypic components may contain the same Laplacian eigenvalue, and thus it is not even guaranteed that the numerically computed basis vectors live in a single isotypic component.
Why is the permutahedron the right graph to represent the underlying data domain? An alternative choice is the Cayley graph of the symmetric group induced by the generating set of all transpositions (not just neighboring transpositions), which is shown in Fig. 10 and which we denote by \(\Upgamma _n\). This graph has some nice mathematical properties: (i) the isotypic components \(W_\gamma \) are each spanned by eigenvectors associated with a single Laplacian eigenvalue of the Cayley graph \(\Upgamma _n\) [58, Proposition 1]; (ii) the Laplacian eigenvalues and eigenvectors of \(\Upgamma _n\) are known in closed form [81, Theorem 1.1], [58, Proposition 2], [40, Theorem III.1]; and, (iii) moreover, the Laplacian eigenvalues to which the isotypic components correspond increase according to dominance ordering. So if \(\nu \vartriangleright \gamma \), then any vector \(\mathbf{f}_\nu \) in the isotypic component \(W_\nu \) is smoother with respect to the Cayley graph \(\Upgamma _n\) [recall the definition of smoothness in (7)] than any vector \(\mathbf{f}_\gamma \) in \(W_\gamma \) with the same norm as \(\mathbf{f}_\nu \), since \( \frac{\mathbf{f}_\nu ^{\top } {\varvec{\mathcal {L}}}_{\Upgamma _n} \mathbf{f}_\nu }{\mathbf{f}_\nu ^{\top }\mathbf{f}_\nu }=\lambda _{\nu }<\lambda _\gamma = \frac{\mathbf{f}_\gamma ^{\top } {\varvec{\mathcal {L}}}_{\Upgamma _n} \mathbf{f}_\gamma }{\mathbf{f}_\gamma ^{\top }\mathbf{f}_\gamma }\). In this sense, the isotypic components provide a notion of frequency, with vectors residing in isotypic components later in the dominance ordering representing “more complex” (less smooth) functions, with respect to the Cayley graph \(\Upgamma _n\) induced by the generating set of all transpositions [58].
Despite these nice mathematical properties, the permutahedron is the more appropriate domain on which to develop new techniques for analyzing the structure of most ranked data sets, due to the different notions of distance the two underlying graphs capture [55]. The structure of \(\Upgamma _n\) captures an appropriate notion of distance between the permutations in applications such as multi-object tracking, where, e.g., the object trajectories (slots) are continuously visible on radar or camera, but the object identities (candidates) associated with each trajectory are only revealed at certain time instances (e.g., pilot reports by radio, observations captured by a security camera) [58]. In this situation, there is not necessarily a physically-meaningful linear order to the trajectories (slots), and therefore it may be likely that objects jump from one trajectory to a crossing trajectory whose label is not adjacent or even similar. This information is typically captured by a corresponding noise model (e.g., [58, Sect. 3.1]). However, in most ranked data applications, the ranking positions represent a linear ordering, and therefore permutations that swap the candidates in the first and last ranking positions, e.g., are not close from a voter’s viewpoint. To illustrate this with a specific example, 1234 and 4231 are adjacent in the Cayley graph of Fig. 10, but they are far apart from a voter’s perspective and they are far apart in the permutahedron \({\mathbb {P}}_4\), whereas 1234 and 2134 are adjacent in both graphs. Kondor [55, Sect. 3] distinguishes these cases in terms of invariance, with our notion being only right-invariant and the other notion being bi-invariant (right-invariant and left-invariant). As shown in Fig. 11, the graph structures induced by these two different distance metrics (generating sets) yield different notions of signal smoothness, as captured by the respective graph Fourier transforms. 
To summarize the key takeaways, when we choose an underlying graph data domain, we are defining a notion of distance between permutations; and the distance induced by the permutahedron structure is most appropriate in ranking applications where permutations should be considered closest if the candidate swap occurs across neighboring ranking slots.
3.3 Other Related Transforms for Ranked Data
We briefly mention other related linear transforms for ranked data. Nested orthogonal contrasts [66, 67] compare the rankings of two or more groups of candidates, ignoring the relative ranks within each group of candidates. Inversions [42, 67, 70, 71] project the data onto linear subspaces based on the relative rankings of subsets (pairs, triplets, etc.) of the candidates, net of the effects of lower order subsets.
Other ranked data transforms consider the underlying graph to be a quasi-abelian Cayley graph; i.e., the set of generators of the Cayley graph is the union of conjugacy classes. Rockmore et al. [81] investigate fast Fourier transforms for data on quasi-abelian Cayley graphs. Ghandehari et al. [40] extend this work by developing tight windowed Fourier frames for data residing on quasi-abelian Cayley graphs. Since the full set of transpositions is a conjugacy class in \({\mathbb {S}}_n\), the Cayley graph \(\Upgamma _n\) induced by the generating set of all transpositions is quasi-abelian, but the permutahedron \({\mathbb {P}}_n\) does not fall into this class (except for \({\mathbb {P}}_2\)). Kondor’s left-invariant coset-based multiresolution analysis [57] yields an orthogonal basis of wavelet and scaling atoms that are localized in the vertex and spectral domains of \(\Upgamma _n\), but not necessarily in either domain when the underlying graph is taken to be the permutahedron \({\mathbb {P}}_n\).
Finally, taking inspiration more from one-dimensional signal processing than from the literature on signal processing on graphs, Kakarala [49] further decomposes the coefficient matrices \(\hat{f}(\gamma )\) of the Fourier transform on the symmetric group into the product of a positive semidefinite “magnitude” matrix and an orthogonal “phase” matrix.
4 Tight Spectral Frames for Ranked Data
In this section, we present a new approach to generate dictionaries for ranked data by combining the symmetry decomposition method (3) from combinatorial representation theory with the spectral graph decomposition method (6) from graph signal processing, in order to capture two different kinds of information about the data.
These two approaches are connected in the following way, which is at the crux of our method. The graph Laplacian matrix is in the regular representation of the symmetric group algebra, because it can be written as the linear combination
$$\begin{aligned} {\varvec{\mathcal {L}}} = (n-1)\, \mathbf {I}_{n!} - \sum _{s \in S} \rho _R(s) = \sum _{s \in S} \left( \rho _R(\mathbf {1}) - \rho _R(s) \right) , \end{aligned}$$
(8)
where \(\rho _R(s)\) is the matrix of \(s \in S\) acting as a linear transformation on the right of \({\mathbb {R}}[{\mathbb {S}}_n]\) and \(\mathbf{1} \in {\mathbb {S}}_n\) is the identity element so that \(\rho _R(\mathbf{1}) = \mathbf {I}_{n!}\). Since our generators are involutions (\(s^2 = 1\)), the Laplacian is symmetric, and (8) is also true if the right regular representation is replaced with the left regular representation and \({\varvec{\mathcal {L}}}\) acts on the left. As a result of (8), the isotypic components in (3) decompose into Laplacian eigenspaces, and therefore the group algebra decomposes as follows.
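The identity (8) can be checked numerically on a small example. The sketch below (our own code, not the paper's accompanying software) builds the right regular representation matrices of the two adjacent transpositions for \(n = 3\) and compares them against the graph Laplacian of the permutahedron:

```python
import itertools
import numpy as np

n = 3
perms = list(itertools.permutations(range(1, n + 1)))
index = {p: i for i, p in enumerate(perms)}

def rho_R(i):
    """Matrix of right multiplication by s_i = (i+1, i+2), 0-indexed i."""
    M = np.zeros((len(perms), len(perms)))
    for p in perms:
        q = list(p); q[i], q[i + 1] = q[i + 1], q[i]   # p * s_i swaps slots i, i+1
        M[index[tuple(q)], index[p]] = 1.0
    return M

# permutahedron: vertices = permutations, edges sigma ~ sigma * s_i
A = sum(rho_R(i) for i in range(n - 1))                # adjacency matrix
L = (n - 1) * np.eye(len(perms)) - A                   # graph Laplacian
# identity (8): L equals the sum over generators of (rho_R(1) - rho_R(s))
assert np.allclose(L, sum(np.eye(len(perms)) - rho_R(i) for i in range(n - 1)))
assert np.allclose(L, L.T)                             # generators are involutions
assert np.allclose(L @ np.ones(len(perms)), 0)         # constants have eigenvalue 0
```

The symmetry assertion reflects the remark above: since each generator is an involution, \(\rho _R(s)\) is a symmetric permutation matrix, and so is the Laplacian.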
Proposition 1
$$\begin{aligned} {\mathbb {R}}[{\mathbb {S}}_n] = \bigoplus _{\gamma \vdash n} \bigoplus _{\lambda \in \Lambda _\gamma } Z_{\gamma ,\lambda }, \qquad \text {where}\quad Z_{\gamma ,\lambda } := W_\gamma \cap U_\lambda \end{aligned}$$
and \(\Lambda _\gamma \) is the set of Laplacian eigenvalues \(\lambda \) such that \(W_\gamma \cap U_\lambda \not = \emptyset \).
Proof
The group algebra decomposes into a direct sum of vector spaces in two ways,
$$\begin{aligned} \bigoplus _{\lambda \in \Lambda } U_\lambda = {\mathbb {R}}[{\mathbb {S}}_n] = \bigoplus _{\gamma \vdash n} W_\gamma . \end{aligned}$$
On the left we use the fact that \({\varvec{\mathcal {L}}}\) is symmetric to decompose \({\mathbb {R}}[{\mathbb {S}}_n]\) into a direct sum of Laplacian eigenspaces \(U_\lambda , \lambda \in \Lambda \), where \(\Lambda \) is the set of eigenvalues of \({\varvec{\mathcal {L}}}\). On the right is the decomposition (3) into isotypic components for \({\mathbb {S}}_n\). The isotypic components are closed under both the left and right action of \({\mathbb {S}}_n\) and \({\varvec{\mathcal {L}}}\) is a linear combination (8) of group elements, so the isotypic components \(W_\gamma \) are closed under the action of \({\varvec{\mathcal {L}}}\).
If \(\mathbf{f} \in {\mathbb {R}}[{\mathbb {S}}_n]\), then there is a unique decomposition \(\mathbf{f} = \sum _{\lambda \in \Lambda } \mathbf{f}_\lambda \) with \(\mathbf{f}_\lambda \in U_\lambda \); namely, \(\mathbf{f}_\lambda \) is the orthogonal projection of \(\mathbf{f}\) onto \(U_\lambda \). For each \(\lambda \in \Lambda \), there is a unique decomposition, \(\mathbf{f}_\lambda = \sum _{\{\gamma \vdash n: \lambda \in \Lambda _\gamma \}} \mathbf{f}_{\gamma ,\lambda }\), of \(\mathbf{f}_\lambda \) into vectors \(\{\mathbf{f}_{\gamma ,\lambda }\}\) in the corresponding isotypic components \(\{W_\gamma \}\). Multiplying \(\mathbf{f}_\lambda \) by \(\lambda \) gives
$$\begin{aligned} \lambda \mathbf {f}_\lambda = \sum _{\{\gamma \vdash n :\, \lambda \in \Lambda _\gamma \}} \lambda \, \mathbf {f}_{\gamma ,\lambda }, \end{aligned}$$
(10)
and multiplying \(\mathbf{f}_\lambda \) by \({\varvec{\mathcal {L}}}\) gives
$$\begin{aligned} {\varvec{\mathcal {L}}} \mathbf {f}_\lambda = \lambda \mathbf {f}_\lambda = \sum _{\{\gamma \vdash n :\, \lambda \in \Lambda _\gamma \}} {\varvec{\mathcal {L}}} \mathbf {f}_{\gamma ,\lambda }. \end{aligned}$$
(11)
Since \(W_\gamma \) is closed under multiplication by \({\varvec{\mathcal {L}}}\), each term \({\varvec{\mathcal {L}}}\mathbf{f}_{\gamma ,\lambda }\) in the summation in (11) is in \(W_\gamma \). By the uniqueness of the decomposition of \(\mathbf{f}\) into isotypic components, we conclude from (10) and (11) that \({\varvec{\mathcal {L}}}\mathbf{f}_{\gamma ,\lambda } = \lambda \mathbf{f}_{\gamma ,\lambda }\) for each \(\gamma \vdash n\) such that \(\lambda \in \Lambda _\gamma \). Thus, \(\mathbf{f}_{\gamma ,\lambda }\) is a Laplacian eigenvector of eigenvalue \(\lambda \) and \(\mathbf{f}_{\gamma ,\lambda }\in Z_{\gamma ,\lambda }=W_\gamma \cap U_\lambda \). In summary, for any \(\mathbf{f} \in {\mathbb {R}}[{\mathbb {S}}_n]\), there is a unique decomposition \(\mathbf{f}=\sum _{\gamma \vdash n} \sum _{\lambda \in \Lambda _\gamma } \mathbf{f}_{\gamma ,\lambda }\), with \(\mathbf{f}_{\gamma ,\lambda }\in Z_{\gamma ,\lambda }\). \(\square \)
If \(\lambda \in \Lambda _\gamma \), we say that the eigenvalue \(\lambda \) has symmetry type or shape \(\gamma \). Eigenvalues may have multiple symmetry types. For example, the Laplacian eigenvalue \(\lambda =3\) on \({\mathbb {P}}_6\) is repeated 15 times (i.e., \(U_3\) is a 15-dimensional space). This eigenvalue appears 5 times in one isotypic component and 10 times in another (so the corresponding space \(Z_{\gamma ,3}\) is 10-dimensional), and therefore \(\lambda =3\) has two symmetry types. On the other hand, \(\lambda =2\) has a single symmetry type, as it only appears in one isotypic component.
Our objective is to find a spanning set \(\{{\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi }\}\) of dictionary atoms for each space \(Z_{\gamma , \lambda }\) such that
-
(i)
the overall dictionary analysis operator preserves the energy in the signal; that is, \(\sum _{\gamma ,\lambda ,k,\pi } |\langle \mathbf{f}, {\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi } \rangle |^2 = \Vert \mathbf{f}\Vert _2^2,\) or equivalently, \(\Vert {\varvec{\Phi }}^{\top }\mathbf{f}\Vert _2=\Vert \mathbf{f}\Vert _2\), where the atoms \(\{{\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi }\}\) comprise the columns of the matrix \({\varvec{\Phi }}\);
-
(ii)
the atoms are interpretable (i.e., they have a particular structure that makes the inner products useful in identifying structure in the data); and
-
(iii)
we can efficiently compute the inner products between these dictionary atoms and the signal on the permutahedron.
4.1 Preliminaries: Schreier Graphs and Equitable Partitions
We start by detailing (i) how to construct, for each integer partition \(\gamma \) of n, a graph \({\mathbb {P}}_\gamma \), called a Schreier graph, which is isomorphic to a quotient of the permutahedron \({\mathbb {P}}_n\) [38, 84]; and (ii) the relation between the spectral decompositions of the Schreier graphs and the spectral decomposition of the permutahedron.
If \(\gamma = [\gamma _1, \ldots , \gamma _\ell ]\) is an integer partition of n, then a set partition \(\pi =\{C_1, \ldots , C_\ell \}\) has shape \(\gamma \) if its blocks have size \(|C_i| = \gamma _i\). There are \(m_\gamma =\frac{n!}{\prod _{i=1}^{\ell } \gamma _i !}\) different ordered set partitions \(\pi \) of \(\{1,2,\ldots ,n\}\) of shape \(\gamma \), and we refer to this collection of ordered partitions as \(\Pi _\gamma \). A permutation \(\sigma \in {\mathbb {S}}_n\) acts on an ordered set partition \(\pi \in \Pi _\gamma \) by permuting the entries of \(\pi \). For example if \(\sigma = \left( {\begin{matrix} 1 &{} 2 &{} 3 &{} 4 &{} 5 \\ 2 &{} 5 &{} 4 &{} 3 &{} 1 \end{matrix}}\right) \) and \(\pi = \{ \{2,4,5\}, \{1,3\}\}\) then \(\sigma (\pi ) = \{ \{1,3,5\}, \{2,4\}\}\).
Definition 1
The Schreier graph \({\mathbb {P}}_\gamma \) is the graph with vertex set \(\Pi _\gamma \) and edge set \(\{(\pi ,s \pi ) \mid \pi \in \Pi _\gamma , s\in S\}\), where \(S \subseteq {\mathbb {S}}_n\) is the subset of adjacent transpositions defined in Sect. 1.
Each Schreier graph (i) is undirected since adjacent transpositions are involutions; (ii) is \((n-1)\)-regular since \(|S| = n-1\); and (iii) has a self-loop at vertex \(\pi \) for each pair \(i,i+1\) that is in the same block of \(\pi \), since then the transposition \(s= (i,i+1)\) fixes \(\pi \). Two ordered set partitions \(\pi _1 \not = \pi _2\) are connected by the edge labeled by \(s = (i,i+1)\) if and only if \(\pi _1\) and \(\pi _2\) are identical except with i and \(i+1\) switched. Thus, non-loop edge weights are equal to 1. We view the vertices of \({\mathbb {P}}_\gamma \) as representing groupings of rankings (first place, second place, etc.), not specific candidates. Figure 12 shows the example of \({\mathbb {P}}_\gamma \) with \(\gamma = [3,2]\).
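The three properties above can be confirmed computationally for the example of Fig. 12. The sketch below (our own helper names, not the paper's software) enumerates \(\Pi _\gamma \) for \(\gamma = [3,2]\) and \(n = 5\) and checks the regularity and self-loop counts:

```python
import itertools

n, gamma = 5, (3, 2)

def ordered_partitions(gamma, n):
    """All ordered set partitions of {1..n} with block sizes gamma."""
    out = set()
    for p in itertools.permutations(range(1, n + 1)):
        blocks, i = [], 0
        for size in gamma:
            blocks.append(frozenset(p[i:i + size])); i += size
        out.add(tuple(blocks))
    return sorted(out, key=lambda part: [sorted(b) for b in part])

def act(i, part):
    """The adjacent transposition s_i = (i, i+1) permuting the entries of part."""
    swap = {i: i + 1, i + 1: i}
    return tuple(frozenset(swap.get(x, x) for x in block) for block in part)

Pi = ordered_partitions(gamma, n)
assert len(Pi) == 10                          # m_gamma = 5!/(3! 2!) = 10 vertices
for part in Pi:
    loops = sum(1 for i in range(1, n) if act(i, part) == part)
    moves = sum(1 for i in range(1, n) if act(i, part) != part)
    assert loops + moves == n - 1             # (n-1)-regular once loops are counted
    pairs = sum(1 for i in range(1, n) if any({i, i + 1} <= b for b in part))
    assert loops == pairs                     # one self-loop per pair i,i+1 in a block
```

Since the generators are involutions, every non-loop edge found here is traversed symmetrically, confirming that the graph is undirected.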
Next, we show that each ordered set partition \(\pi \) induces an equivalence relation on \({\mathbb {S}}_n\), the vertices of the permutahedron \({\mathbb {P}}_n\), and under this equivalence relation, the Schreier graphs are isomorphic to quotient graphs of the permutahedron \({\mathbb {P}}_n\).
Definition 2
Let \(\pi \in \Pi _\gamma \) be an ordered set partition of shape \(\gamma \), and let \(\sigma , \tau \in {\mathbb {S}}_n\). The equivalence relation \(\sim _\pi \) is given by identifying \(\sigma \sim _\pi \tau \) if and only if \(\sigma ^{-1}( \pi ) =\tau ^{-1}(\pi )\). The equivalence classes under \(\sim _\pi \) are the sets \({{\mathcal {V}}}_{\pi ,\mu } = \{ \sigma \in {\mathbb {S}}_n \mid \sigma (\mu ) = \pi \}\) for each \(\mu \in \Pi _\gamma .\) These are the permutations that place candidates \(\pi \) in the positions given by \(\mu .\)
For example, if \(\pi = \{\{2,4,5\},\{1,3\}\}\) (or \(\{245|13\}\) for shorter notation), then \(34512 \sim _\pi 12435\) because each group of candidates, \(\{2,4,5\}\) and \(\{1,3\}\), is in the same set of positions in the two permutations: \(3{\underline{45}}1{\underline{2}}\) and \(1{\underline{24}}3{\underline{5}}\). Furthermore, the equivalence class containing these two permutations is the set \(\mathcal{V}_{\{245|13\}, \{235|14\}}\) consisting of all permutations with \(\{2,4,5\}\) and \(\{1,3\}\) in positions \(\{2,3,5\}\) and \(\{1,4\}\), respectively.
Proposition 2
The partition of the vertices of \({\mathbb {P}}_n\) into equivalence classes \(\{{{\mathcal {V}}}_{\pi ,\mu } \mid \mu \in \Pi _\gamma \}\) induced by \(\sim _\pi \) is an equitable partition, meaning that for every pair of (not necessarily distinct) ordered set partitions \(\mu , \nu \in \Pi _\gamma ,\) there is a nonnegative integer \(\mathbf{K}_\pi (\mu ,\nu )\) such that each vertex \(\sigma \in \mathcal{V}_{\pi ,\mu }\) has exactly \(\mathbf{K}_\pi (\mu ,\nu )\) neighbors in \(\mathcal{V}_{\pi ,\nu }\) [41, Sect. 9.3]. In fact, when \(\mu \not = \nu ,\) \(\mathbf{K}_\pi (\mu ,\nu ) = 1\) if there exists \(s \in S\) such that \(s(\mu ) = \nu \) and equals 0 otherwise, and \(\mathbf{K}_\pi (\mu ,\mu )\) equals the number of \(s \in S\) such that \(s(\mu ) = \mu \).
Proof
When the equivalence class \({{\mathcal {V}}}_{\pi ,\mu } = \{ \sigma \in {\mathbb {S}}_n \mid \sigma (\mu ) = \pi \}\) is right multiplied by \(\tau \in {\mathbb {S}}_n\), we have \({{\mathcal {V}}}_{\pi ,\mu } \tau = \{ \sigma \tau \in {\mathbb {S}}_n \mid \sigma (\mu ) = \pi \} = \{ \sigma \tau \in {\mathbb {S}}_n \mid \sigma \tau (\tau ^{-1}(\mu )) = \pi \} = {{\mathcal {V}}}_{\pi ,\tau ^{-1}(\mu )}\). It follows that if \(s \in S\) and \(\sigma \in {{\mathcal {V}}}_{\pi ,\mu }\) then \(\sigma s \in {{\mathcal {V}}}_{\pi ,s(\mu )}\). Multiplication by a group element s is a bijection, so each element in \({{\mathcal {V}}}_{\pi ,\mu }\) is connected by an edge in \({\mathbb {P}}_n\) to exactly one element in \(\mathcal{V}_{\pi ,s(\mu )}\). If there is not an adjacent transposition s such that \(s(\mu ) = \nu \), then there are no edges in \({\mathbb {P}}_n\) between the vertices in \({{\mathcal {V}}}_{\pi ,\mu }\) and those in \({{\mathcal {V}}}_{\pi ,\nu }\). \(\square \)
Proposition 3
For each ordered set partition \(\pi \in \Pi _\gamma ,\) the quotient graph \({\mathbb {P}}_n /{\sim _\pi }\) is isomorphic to \({\mathbb {P}}_\gamma \).
Proof
Since \(\sim _\pi \) induces an equitable partition of \({\mathbb {P}}_n\) with equivalence classes \({{\mathcal {V}}}_{\pi ,\mu }\), the quotient \({\mathbb {P}}_n/\sim _\pi \) is a well-defined graph with vertices \(\{{{\mathcal {V}}}_{\pi ,\mu } \mid \mu \in \Pi _\gamma \}\) and edges \(\{ ({{\mathcal {V}}}_{\pi ,\mu }, {{\mathcal {V}}}_{\pi ,s(\mu )}) \mid \mu \in \Pi _\gamma , s \in S\}\). Comparing Proposition 2 with Definition 1 shows that these graphs are isomorphic by identifying the vertex \({{\mathcal {V}}}_{\pi ,\mu }\) in \({\mathbb {P}}_n/\sim _\pi \) with \(\mu \) in \({\mathbb {P}}_\gamma \). \(\square \)
Definition 3
The characteristic matrix of an equitable partition \(\pi \) of shape \(\gamma \), denoted by \(\mathbf{B}_{\pi }\), is the \(n! \times m_\gamma \) (0, 1)-matrix whose \((\sigma ,\mu )\)th element, for \(\sigma \in {\mathbb {S}}_n\) and \(\mu \in \Pi _\gamma \), is equal to 1 if and only if \(\sigma (\mu ) = \pi \); that is \(\sigma \) places the candidates from the jth block of \(\pi \) into the positions corresponding to the jth block of \(\mu \), for each j.
Each row of \(\mathbf{B}_{\pi }\) contains exactly one 1, and each column contains exactly \(\frac{n!}{m_\gamma }\) 1s. Therefore \(\mathbf{B}_{\pi }^\top \mathbf{B}_{\pi } = \left( \frac{n!}{m_\gamma }\right) \mathbf{I}_{m_\gamma }\). This is illustrated in Fig. 14. Furthermore, \(\mathbf{B}_{\pi }\) has the following symmetry property, which we use in our frame construction (15) and Theorem 1.
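These counting properties are easy to verify on a small case. The sketch below (our own code) builds \(\mathbf {B}_\pi \) for \(n = 3\) and \(\gamma = [2,1]\) per Definition 3 and checks the row sums, column sums, and the identity \(\mathbf {B}_{\pi }^\top \mathbf {B}_{\pi } = (n!/m_\gamma )\mathbf {I}_{m_\gamma }\):

```python
import itertools
import numpy as np

n = 3
perms = list(itertools.permutations(range(1, n + 1)))

def apply_perm(sigma, part):                  # sigma(mu): relabel the entries of mu
    return tuple(frozenset(sigma[x - 1] for x in block) for block in part)

# ordered set partitions of {1,2,3} of shape gamma = (2,1)
Pi = sorted({(frozenset(p[:2]), frozenset(p[2:])) for p in perms},
            key=lambda part: [sorted(b) for b in part])
m = len(Pi)                                   # m_gamma = 3!/(2! 1!) = 3

pi = Pi[0]                                    # the partition {12|3}
B = np.zeros((len(perms), m))
for a, sigma in enumerate(perms):
    for b, mu in enumerate(Pi):
        if apply_perm(sigma, mu) == pi:       # condition of Definition 3
            B[a, b] = 1.0

assert (B.sum(axis=1) == 1).all()               # exactly one 1 per row
assert (B.sum(axis=0) == len(perms) / m).all()  # n!/m_gamma ones per column
assert np.allclose(B.T @ B, (len(perms) / m) * np.eye(m))
```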
Proposition 4
For each \(\sigma \in {\mathbb {S}}_n\) and \(\pi \in \Pi _\gamma \) we have \(\rho _L(\sigma ) \mathbf{B}_{\pi } = \mathbf{B}_{\sigma (\pi )},\) where \(\rho _L(\sigma )\) is the matrix of \(\sigma \) in the left regular representation of \({\mathbb {S}}_n\) on \({\mathbb {R}}[{\mathbb {S}}_n]\).
Proof
For \(\sigma ,\tau \in {\mathbb {S}}_n\) and \(\mu \in \Pi _\gamma \), the \((\tau ,\mu )\) entry of \(\rho _L(\sigma ) \mathbf{B}_{\pi }\) equals 1 if and only if \(\sigma ^{-1} \tau (\mu ) = \pi \), which is true if and only if \(\tau (\mu ) = \sigma (\pi )\), and this is exactly the condition for the \((\tau ,\mu )\) entry of \(\mathbf{B}_{\sigma (\pi )}\). \(\square \)
A key property of characteristic matrices of equitable partitions is that they can be used to lift eigenvectors of the Schreier graphs \({\mathbb {P}}_\gamma \) to eigenvectors of the permutahedron \({\mathbb {P}}_n\).
Proposition 5
If \(\pi \in \Pi _\gamma \) and \(\mathbf{v}_{\gamma , \lambda }\) is a graph Laplacian eigenvector of the Schreier graph \({\mathbb {P}}_\gamma \) with eigenvalue \(\lambda ,\) then \(\mathbf{w} = \mathbf{B}_{\pi } \mathbf{v}_{\gamma , \lambda }\) is a graph Laplacian eigenvector of the permutahedron \({\mathbb {P}}_n\) with the same eigenvalue \(\lambda \).
Proof
Let \(\mathbf{A}_{{\mathbb {P}}_n}\) be the adjacency matrix of the permutahedron, and \(\mathbf{A}_{{\mathbb {P}}_\gamma } = \mathbf{A}_{{\mathbb {P}}_n/{\sim _\pi }}\) be the adjacency matrix of the Schreier graph \({\mathbb {P}}_\gamma \). Since \(\pi \) induces an equitable partition \(\sim _\pi \) of \({\mathbb {P}}_n\), \(\mathbf{A}_{{\mathbb {P}}_n}{} \mathbf{B}_{\pi } = \mathbf{B}_{\pi } \mathbf{A}_{{\mathbb {P}}_\gamma }\) [41, Lemma 9.3.1], and thus,
$$\begin{aligned} {\varvec{\mathcal {L}}} \mathbf {B}_{\pi } = \left( (n-1)\mathbf {I}_{n!} - \mathbf {A}_{{\mathbb {P}}_n}\right) \mathbf {B}_{\pi } = \mathbf {B}_{\pi } \left( (n-1)\mathbf {I}_{m_\gamma } - \mathbf {A}_{{\mathbb {P}}_\gamma }\right) = \mathbf {B}_{\pi } {\varvec{\mathcal {L}}}_\gamma , \end{aligned}$$
(12)
as both \({\mathbb {P}}_n\) and \({\mathbb {P}}_\gamma \) are \((n-1)\)-regular graphs. Thus,
$$\begin{aligned} {\varvec{\mathcal {L}}} \mathbf {w} = \mathbf {B}_{\pi } {\varvec{\mathcal {L}}}_\gamma \mathbf {v}_{\gamma ,\lambda } = \lambda \, \mathbf {B}_{\pi } \mathbf {v}_{\gamma ,\lambda } = \lambda \mathbf {w}, \end{aligned}$$
where the first equality follows from (12). \(\square \)
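Proposition 5 can be exercised numerically. The sketch below (our own construction, not the paper's software) takes \(n = 4\) and \(\gamma = [3,1]\), builds both Laplacians, lifts every Schreier eigenvector through \(\mathbf {B}_\pi \), and confirms the eigenvalue is preserved on the 24-vertex permutahedron:

```python
import itertools
import numpy as np

n = 4
perms = list(itertools.permutations(range(1, n + 1)))
pidx = {p: i for i, p in enumerate(perms)}

def apply_perm(sigma, part):
    return tuple(frozenset(sigma[x - 1] for x in block) for block in part)

def inverse(sigma):
    inv = [0] * n
    for i, v in enumerate(sigma):
        inv[v - 1] = i + 1
    return tuple(inv)

S = []                                          # adjacent transpositions
for i in range(n - 1):
    s = list(range(1, n + 1)); s[i], s[i + 1] = s[i + 1], s[i]
    S.append(tuple(s))

# ordered set partitions of shape gamma = [3,1]
Pi = sorted({(frozenset(p[:3]), frozenset(p[3:])) for p in perms},
            key=lambda part: [sorted(b) for b in part])
midx = {part: j for j, part in enumerate(Pi)}
m = len(Pi)                                     # m_gamma = 4!/3! = 4

# permutahedron Laplacian (edges sigma ~ sigma s, swapping ranking slots)
L_P = (n - 1) * np.eye(len(perms))
for p in perms:
    for s in S:
        q = tuple(p[s[k] - 1] for k in range(n))
        L_P[pidx[p], pidx[q]] -= 1.0

# Schreier-graph Laplacian (edges pi ~ s(pi); fixed points give self-loops)
L_S = (n - 1) * np.eye(m)
for part in Pi:
    for s in S:
        L_S[midx[part], midx[apply_perm(s, part)]] -= 1.0

# characteristic matrix B_pi: B[sigma, mu] = 1 iff sigma(mu) = pi
pi = Pi[0]
B = np.zeros((len(perms), m))
for a, sigma in enumerate(perms):
    B[a, midx[apply_perm(inverse(sigma), pi)]] = 1.0

lam, V = np.linalg.eigh(L_S)
for k in range(m):                              # lift every Schreier eigenvector
    w = B @ V[:, k]
    assert np.allclose(L_P @ w, lam[k] * w)     # same eigenvalue on P_n
print(np.round(lam, 4))                         # 0, 2-sqrt(2), 2, 2+sqrt(2)
```

Here \({\mathbb {P}}_{[3,1]}\) is a path on four vertices (with self-loops), so the lifted eigenvalues are the familiar path-graph Laplacian eigenvalues.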
For each shape \(\gamma \) we view signals on the Schreier graph \({\mathbb {P}}_\gamma \) as vectors in the vector space \({\mathbb {R}}[\Pi _\gamma ]\) with canonical basis \(\{ {\mathbf {e}}_\pi \}_{\pi \in \Pi _\gamma }\). This vector space has a natural right \({\mathbb {S}}_n\)-action defined on a basis element \({\mathbf {e}}_\pi \) and a permutation \(\sigma \in {\mathbb {S}}_n\) by \({\mathbf {e}}_\pi \sigma = {\mathbf {e}}_{\sigma ^{-1}(\pi )}\), where permutations act on set partitions in \(\Pi _\gamma \) by permuting the entries as described just before Definition 2. The \({\mathbb {S}}_n\)-module \({\mathbb {R}}[\Pi _\gamma ]\) is known as the (right) "permutation module" for \({\mathbb {S}}_n\) and is often denoted by \(M^\gamma \) (see for example [83, Chap. 2]).
For each shape \(\gamma \vdash n\), the permutation module \({\mathbb {R}}[\Pi _\gamma ]\) decomposes into irreducible (right) \({\mathbb {S}}_n\)-submodules according to
$$\begin{aligned} {\mathbb {R}}[\Pi _\gamma ] \cong V_\gamma ^* \oplus \bigoplus _{\nu \vartriangleright \gamma } K_{\gamma ,\nu }\, V_\nu ^*, \end{aligned}$$
(13)
where again \(\nu \vartriangleright \gamma \) means that \(\nu \) strictly dominates \(\gamma \). Thus \({\mathbb {R}}[\Pi _\gamma ]\) contains exactly one copy of the irreducible \(V_\gamma ^*\) and \(K_{\gamma ,\nu }\) copies of each irreducible module \(V_\nu ^*\) that comes before it in dominance order on partitions. The multiplicities \(K_{\gamma ,\nu }\) are known as Kostka numbers (see [83, Sect. 2.11] and Fig. 29). Furthermore, \(V_\gamma ^*\) does not appear as a submodule of \({\mathbb {R}}[\Pi _\nu ]\) for shapes \(\nu \) that come before \(\gamma \) in dominance order, which is beneficial for the computational algorithms that we explore in Sect. 6.
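The multiplicities in this decomposition can be recovered by standard character theory: the multiplicity of \(V_\nu ^*\) in \({\mathbb {R}}[\Pi _\gamma ]\) is the inner product of the permutation character (the number of ordered set partitions fixed by each \(\sigma \)) with the irreducible character \(\chi _\nu \). A sketch, hardcoded for \(n = 3\) (our own code; the closed-form characters of \({\mathbb {S}}_3\) are standard):

```python
import itertools

n = 3
perms = list(itertools.permutations(range(1, n + 1)))

def sign(sigma):
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

# irreducible characters of S_3, keyed by shape
chi = {(3,):      lambda s: 1,                                         # trivial
       (2, 1):    lambda s: sum(s[i] == i + 1 for i in range(n)) - 1,  # standard
       (1, 1, 1): sign}                                                # sign

def ordered_partitions(gamma):
    cuts, i = [], 0
    for size in gamma:
        cuts.append((i, size)); i += size
    return sorted({tuple(frozenset(p[a:a + s]) for a, s in cuts) for p in perms},
                  key=lambda part: [sorted(b) for b in part])

def apply_perm(sigma, part):
    return tuple(frozenset(sigma[x - 1] for x in block) for block in part)

for gamma in [(3,), (2, 1), (1, 1, 1)]:
    Pi = ordered_partitions(gamma)
    # permutation character of R[Pi_gamma]: number of fixed ordered set partitions
    fixed = {s: sum(apply_perm(s, part) == part for part in Pi) for s in perms}
    mult = {nu: sum(fixed[s] * chi[nu](s) for s in perms) // len(perms) for nu in chi}
    print(gamma, mult)
```

The output confirms (13) for \(n = 3\): \(M^{(2,1)}\) contains one trivial and one standard copy, while \(M^{(1,1,1)}\) (the regular representation) contains each \(V_\nu ^*\) with multiplicity \(d_\nu \).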
The following proposition says that when a vector in \({\mathbb {R}}[\Pi _\gamma ]\) that lives entirely in the submodule \(V_\gamma ^*\) is lifted to the permutahedron by the characteristic matrix of an equitable partition of shape \(\gamma \), the resulting vector resides in the single isotypic component \(W_\gamma \subseteq {\mathbb {R}}[{\mathbb {S}}_n]\) defined in (2).
Proposition 6
If \(\pi \in \Pi _\gamma \) and \(\mathbf{x} \in V_\gamma ^*\subseteq {\mathbb {R}}[\Pi _\gamma ],\) then \(\mathbf{B}_{\pi } \mathbf{x} \in W_\gamma \).
Proof
The map \({\mathbb {R}}[\Pi _\gamma ] \rightarrow {\mathbb {R}}[{\mathbb {S}}_n]\) given by \(\mathbf{x} \mapsto \mathbf{B}_{\pi } \mathbf{x}\) is an injective, right \({\mathbb {S}}_n\)-module homomorphism (i.e., it commutes with the right \({\mathbb {S}}_n\) action on the two spaces). It is injective, since \( \mathbf{B}_{\pi }\) has rank \(m_\gamma \), and is an \({\mathbb {S}}_n\)-module homomorphism, since for \(\tau \in {\mathbb {S}}_n\) and \(\pi ,\mu \in \Pi _\gamma \) we have
$$\begin{aligned} \left( \mathbf {B}_{\pi } {\mathbf {e}}_\mu \right) \tau = \sum _{\sigma \in {{\mathcal {V}}}_{\pi ,\mu }} {\mathbf {e}}_{\sigma \tau } = \sum _{\sigma \in {{\mathcal {V}}}_{\pi ,\tau ^{-1}(\mu )}} {\mathbf {e}}_{\sigma } = \mathbf {B}_{\pi } {\mathbf {e}}_{\tau ^{-1}(\mu )} = \mathbf {B}_{\pi } \left( {\mathbf {e}}_\mu \tau \right) . \end{aligned}$$
Upon restriction to the irreducible submodule \(V_\gamma ^*\), by Schur’s lemma, the map must be an isomorphism. Thus the image of \(V_\gamma ^*\) under \(\mathbf{x} \mapsto \mathbf{B}_{\pi } \mathbf{x}\) is an isomorphic copy of \(V_\gamma ^*\). The isotypic component \(W_\gamma \) contains all copies of \(V_\gamma ^*\) in \({\mathbb {R}}[{\mathbb {S}}_n]\) so \(\mathbf{B}_{\pi } \mathbf{x} \in W_\gamma \). \(\square \)
An important implication of Propositions 5 and 6 is that we can compute, visualize, and interpret Laplacian eigenvectors on the lower-dimensional Schreier graphs, and then lift them up to the higher-dimensional permutahedron graph in different manners—assigning different groups of candidates to groups of ranking slots—in order to generate vectors that reside in specific spaces \(Z_{\gamma ,\lambda }\) (i.e., have certain symmetry types and smoothness levels). Next, we show that scaled versions of vectors generated in this manner constitute a tight frame for the space \({\mathbb {R}}[{\mathbb {S}}_n]\) of all possible signals on the permutahedron \({\mathbb {P}}_n\).
4.2 Tight Frame Construction
Our strategy is to construct a tight Parseval frame for each nonempty space \(Z_{\gamma , \lambda }\), and then let the dictionary \({\varvec{\Phi }}\) be the union of these tight frames, so that \({\varvec{\Phi }}\) is a tight Parseval frame for \({\mathbb {R}}[{\mathbb {S}}_n]\). A set of vectors \(\{{\varvec{\varphi }}_j\}\) is a frame for a Hilbert space \({{\mathcal {H}}}\) if there exist frame bounds (constants) \(A, B > 0\) such that \( A \Vert \mathbf{f}\Vert _2^2 \le \sum _j | \langle \mathbf{f}, {\varvec{\varphi }}_j \rangle |^2 \le B \Vert \mathbf{f}\Vert _2^2,~\forall \mathbf{f} \in {{\mathcal {H}}}.\) A frame is said to be tight if \(A=B\), and a Parseval frame if \(A=B=1\). For finite dimensional Hilbert spaces (such as our \(Z_{\gamma , \lambda }\) spaces), a (finite) frame is simply a set of spanning vectors for the space. A frame is a group frame if there exists a finite group G that acts as linear transformations on \({{\mathcal {H}}}\) such that \(\{{\varvec{\varphi }}_j\} = \{ g \varvec{\varphi }_1\}_{g \in G}\); that is, the frame is generated by rotating a single frame vector \({\varvec{\varphi }}_1\) by the group G. For more background on frames and group frames, and their use in signal processing and machine learning, see [17, 59, 60, 97].
Analogous to Eq. (8), the Laplacian matrix \({\varvec{\mathcal {L}}}_\gamma \) on the Schreier graph \({\mathbb {P}}_\gamma \) equals
$$\begin{aligned} {\varvec{\mathcal {L}}}_\gamma = \sum _{s \in S} \left( \rho _\gamma (\mathbf {1}) - \rho _\gamma (s) \right) , \end{aligned}$$
(14)
where \(\rho _\gamma \) is the representation of \({\mathbb {S}}_n\) on \({\mathbb {R}}[\Pi _\gamma ]\), which decomposes as in (13). It follows that \(V_\gamma ^*\) is closed under multiplication by \({\varvec{\mathcal {L}}}_\gamma \), and therefore, by the same argument as in the proof of Proposition 1, \(V_\gamma ^*\) decomposes into eigenspaces for \({\varvec{\mathcal {L}}}_\gamma \).
For each shape \(\gamma \vdash n\) and eigenvalue \(\lambda \in \Lambda _\gamma \), to generate a tight Parseval frame \(\varPhi _{\gamma ,\lambda }\) for \(Z_{\gamma , \lambda }\), we
-
(i)
construct an orthonormal basis \(\left\{ \mathbf{v}_{\gamma ,\lambda ,k}\right\} _{k=1}^{\kappa _{\gamma ,\lambda }}\) for the graph Laplacian eigenspace of the Schreier graph \({\mathbb {P}}_\gamma \) that is associated with the eigenvalue \(\lambda \) in \(V_\gamma ^*\), and then
-
(ii)
lift each eigenvector \(\mathbf{v}_{\gamma ,\lambda ,k}\) in the basis back to the permutahedron \({\mathbb {P}}_n\) in \(m_\gamma \) different ways.
Specifically, we define \({\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi }:=c_{\gamma }{} \mathbf{B}_{\pi } \mathbf{v}_{\gamma , \lambda , k}\), and
$$\begin{aligned} \varPhi _{\gamma ,\lambda } := \left\{ {\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi } \right\} _{1 \le k \le \kappa _{\gamma ,\lambda },\ \pi \in \Pi _\gamma }, \end{aligned}$$
(15)
where the constant \(c_\gamma := \sqrt{\frac{d_\gamma }{n!}}\) with \(d_\gamma = \dim (V_\gamma )\). It may be the case that some of the liftings \(\mathbf{B}_\pi {\mathbf {v}}_{\gamma ,\lambda ,k}\) are equal for different \(\pi \in \Pi _\gamma \), but we keep these multiple copies in \(\Phi _{\gamma ,\lambda }\) (viewing it as a multiset), so that \(|\Phi _{\gamma ,\lambda }| = \kappa _{\gamma ,\lambda } m_\gamma \). We remove some of these redundancies in the reduced frame \(\bar{\Phi }_{\gamma ,\lambda }\) in (22) below.
Remark 1
It is often but not always the case that \(\kappa _{\gamma ,\lambda }=1\), so that the basis \(\left\{ \mathbf{v}_{\gamma ,\lambda ,k}\right\} _{k=1}^{\kappa _{\gamma ,\lambda }}\) consists of a single vector. When \(\kappa _{\gamma ,\lambda }=1\), we shorten the notation from \(\mathbf{v}_{\gamma ,\lambda ,k}\) to \(\mathbf{v}_{\gamma ,\lambda }\), and from \({\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi }\) to \({\varvec{\varphi }}_{\gamma ,\lambda ,\pi }\). To make interpretations more consistent, in our implementations, we always choose the eigenvectors of the Schreier graphs to have norm 1 and a positive coefficient on the vertex associated with the ordered set partition \(\mu \) that is last in lexicographic order.
Theorem 1
For \(\gamma \vdash n\) and \(\lambda \in \Lambda _\gamma ,\) the collection of atoms \(\varPhi _{\gamma ,\lambda }\) defined in (15) is the union of \(\kappa _{\gamma ,\lambda }\) orthogonal tight Parseval frames and, as such, is a tight Parseval frame for \(Z_{\gamma , \lambda }\). The set of atoms \({{\mathcal {D}}}:=\bigcup _{\gamma \vdash n} \bigcup _{\lambda \in \Lambda _\gamma } \varPhi _{\gamma ,\lambda }\) is a tight Parseval frame for \({\mathbb {R}}[{\mathbb {S}}_n]\).
Proof
Let \(\mathbf{v} \in V_\gamma ^*\subseteq {\mathbb {R}}[\Pi _\gamma ]\) be a unit Laplacian eigenvector of \({\mathbb {P}}_\gamma \) of eigenvalue \(\lambda \) and symmetry type \(\gamma \). For \(\pi \in \Pi _\gamma \), the lifted vector \(\mathbf{w}=\mathbf{B}_\pi \mathbf{v} \in {\mathbb {R}}[{\mathbb {S}}_n]\) is a Laplacian eigenvector of \({\mathbb {P}}_n\) of eigenvalue \(\lambda \) by Proposition 5, and \(\mathbf{w} \in W_\gamma \) (i.e., it has symmetry type \(\gamma \)) by Proposition 6. Define the left \({\mathbb {S}}_n\)-module
$$\begin{aligned} V_{\gamma ,{\mathbf {v}}} := {\mathbb {R}}[{\mathbb {S}}_n]\, {\mathbf {w}} = {\text {span}}\left\{ \rho _L(\sigma ) {\mathbf {w}} \mid \sigma \in {\mathbb {S}}_n \right\} , \end{aligned}$$
(16)
where \(\rho _L(\sigma )\) is the matrix of \(\sigma \) in the left regular representation of \({\mathbb {S}}_n\). Therefore, \(\left\{ \rho _L(\sigma ) {\mathbf {w}}\right\} _{\sigma \in {\mathbb {S}}_n}\) is a group frame for \(V_{\gamma ,{\mathbf {v}}}\) if we view \(\left\{ \rho _L(\sigma ) {\mathbf {w}}\right\} _{\sigma \in {\mathbb {S}}_n}\) as a multiset of size n! with (potentially many) repetitions.
Since the left and right actions of \({\mathbb {S}}_n\) commute and the Laplacian \({\varvec{\mathcal {L}}}\) is constructed (8) using the right representation, \(V_{\gamma ,{\mathbf {v}}}\) is a space of Laplacian eigenvectors of \({\mathbb {P}}_n\) of eigenvalue \(\lambda \). Moreover, by the double centralizer theorem (see, for example, [34, Theorem 5.18.1]), \(W_\gamma \cong V_\gamma \otimes V_\gamma ^*\), as an \(({\mathbb {S}}_n,{\mathbb {S}}_n)\) bimodule, where \(V_\gamma \) and \(V_\gamma ^*\) are irreducible left and right \({\mathbb {S}}_n\) modules indexed by \(\gamma \), respectively. Using this isomorphism, we write \(\mathbf{w}= w_1 \otimes w_2 \in V_\gamma \otimes V_\gamma ^*\), and then \(V_{\gamma ,{\mathbf {v}}} = {\mathbb {R}}[{\mathbb {S}}_n] {\mathbf {w}}= ({\mathbb {R}}[{\mathbb {S}}_n] w_1) \otimes w_2 \cong V_\gamma \otimes w_2 \cong V_\gamma \), isomorphic as left \({\mathbb {S}}_n\)-modules. Therefore, \(V_{\gamma ,{\mathbf {v}}}\) is an irreducible \({\mathbb {S}}_n\)-module and, by [97, Theorem 10.5], \(\left\{ \rho _L(\sigma ) {\mathbf {w}}\right\} _{\sigma \in {\mathbb {S}}_n}\) is a tight group frame.
We remove some of the repetition in \(\left\{ \rho _L(\sigma ) {\mathbf {w}}\right\} _{\sigma \in {\mathbb {S}}_n}\) by using Proposition 4:
$$\begin{aligned} \left\{ \rho _L(\sigma ) {\mathbf {w}}\right\} _{\sigma \in {\mathbb {S}}_n} = \left\{ \rho _L(\sigma ) \mathbf {B}_{\pi } {\mathbf {v}}\right\} _{\sigma \in {\mathbb {S}}_n} = \left\{ \mathbf {B}_{\sigma (\pi )} {\mathbf {v}}\right\} _{\sigma \in {\mathbb {S}}_n} = \left\{ \mathbf {B}_{\mu } {\mathbf {v}}\right\} _{\mu \in \Pi _\gamma }, \end{aligned}$$
(17)
where the third equality comes from the fact that \({\mathbb {S}}_n\) acts transitively on \(\Pi _\gamma \). Equation (17) tells us that \(V_{\gamma ,{\mathbf {v}}}\) is independent of the specific lifting \(\pi \) that we use. Let \({\mathbb {S}}_\pi = \{\sigma \in {\mathbb {S}}_n \mid \sigma (\pi ) = \pi \}\) be the stabilizer subgroup of \(\pi \in \Pi _\gamma \), and for \(\mu \in \Pi _\gamma \), let \(\tau _\mu \in {\mathbb {S}}_n\) be a permutation such that \(\tau _\mu (\pi ) = \mu \). Then the left coset \(\tau _\mu {\mathbb {S}}_\pi \) consists of all permutations that send \(\pi \) to \(\mu \). By [97, Theorem 10.5], for any \(\mathbf{f} \in V_{\gamma ,{\mathbf {v}}}\), we have
$$\begin{aligned} \sum _{\mu \in \Pi _\gamma } \Big | \Big \langle \mathbf {f}, \sqrt{\tfrac{d_\gamma }{n!}}\, \mathbf {B}_\mu {\mathbf {v}} \Big \rangle \Big |^2 = \frac{d_\gamma }{n!\, |{\mathbb {S}}_\pi |} \sum _{\sigma \in {\mathbb {S}}_n} \left| \left\langle \mathbf {f}, \rho _L(\sigma ) {\mathbf {w}} \right\rangle \right| ^2 = \frac{d_\gamma }{n!\, |{\mathbb {S}}_\pi |} \cdot \frac{n!\, \langle \mathbf {B}_\pi {\mathbf {v}}, \mathbf {B}_\pi {\mathbf {v}}\rangle }{d_\gamma } \, \Vert \mathbf {f}\Vert _2^2 = \Vert \mathbf {f}\Vert _2^2, \end{aligned}$$
(18)
where the last equality follows from \(|{\mathbb {S}}_\pi | = n!/m_\gamma \) and \(\langle \mathbf {B}_\pi {\mathbf {v}}, \mathbf {B}_\pi {\mathbf {v}}\rangle = {\mathbf {v}}^\top \mathbf {B}_\pi ^\top \mathbf {B}_\pi {\mathbf {v}}= (n!/m_\gamma )\, {\mathbf {v}}^\top {\mathbf {v}}= n!/m_\gamma \). It follows that \(\Phi _{\gamma ,{\mathbf {v}}} := \{ \sqrt{d_\gamma /n!}\, \mathbf {B}_\mu {\mathbf {v}}\mid \mu \in \Pi _\gamma \}\) is a tight Parseval frame for \(V_{\gamma ,{\mathbf {v}}}\), again viewing \(\Phi _{\gamma ,{\mathbf {v}}}\) as a multiset that can have repetition (as seen in Lemma 1).
Now suppose that \(\{{\mathbf {v}}_i\}_{i=1}^{d_\gamma }\) is an orthonormal basis for \(V_\gamma ^*\subseteq {\mathbb {R}}[\Pi _\gamma ]\). Then \(\Phi _{\gamma ,{\mathbf {v}}_i}\) is a tight Parseval frame for \(V_{\gamma ,{\mathbf {v}}_i}\) for each \(1 \le i \le d_\gamma \). Moreover, \(V_{\gamma ,{\mathbf {v}}_i}\) and \(V_{\gamma ,{\mathbf {v}}_j}\) are orthogonal subspaces for \(i \not = j\). To see this, identify \(V_{\gamma ,{\mathbf {v}}_i}\) with \(V_{\gamma } \otimes {\mathbf {v}}_i\) in \(W_\gamma \cong V_\gamma \otimes V_\gamma ^*\). Since \(V_\gamma \) and \(V_\gamma ^*\) are irreducible, they carry a unique (up to scalar multiple) \({\mathbb {S}}_n\)-invariant inner product, and therefore an \({\mathbb {S}}_n\)-invariant inner product on \(W_\gamma \subseteq {\mathbb {R}}[{\mathbb {S}}_n]\) equals \(\langle {\mathbf {w}}_1 \otimes {\mathbf {v}}_i, {\mathbf {w}}_2 \otimes {\mathbf {v}}_j \rangle = \langle {\mathbf {w}}_1, {\mathbf {w}}_2\rangle \langle {\mathbf {v}}_i, {\mathbf {v}}_j \rangle \), up to a scalar. The orthogonality of the spaces \(\{V_{\gamma ,{\mathbf {v}}_i}\}_{i=1}^{d_\gamma }\) follows from the orthogonality of \(\{{\mathbf {v}}_i\}_{i=1}^{d_\gamma }\).
Finally, if \(\lambda \in \Lambda _\gamma \) and \(\left\{ \mathbf{v}_{\gamma ,\lambda ,k}\right\} _{k=1}^{\kappa _{\gamma ,\lambda }}\) is an orthonormal basis for the graph Laplacian eigenspace of \(V_{\gamma }^*\subseteq {\mathbb {R}}[\Pi _\gamma ]\) of eigenvalue \(\lambda \), then \(\Phi _{\gamma ,\lambda } = \cup _{k = 1}^{\kappa _{\gamma ,\lambda }} \Phi _{\gamma ,\mathbf{v}_{\gamma ,\lambda ,k}}\) is a union of orthogonal tight Parseval frames, and therefore is a tight Parseval frame for \(Z_{\gamma ,\lambda }\). Furthermore, \(\Phi _\gamma = \cup _{\lambda \in \Lambda _\gamma } \Phi _{\gamma ,\lambda }\) is a union of orthogonal tight Parseval frames, and therefore is a tight Parseval frame for the isotypic component \(W_\gamma \). Since isotypic components are orthogonal (e.g., [97, Theorem 10.7]), \(\mathcal {D} = \cup _{\gamma \vdash n} \Phi _\gamma \) is a tight Parseval frame for \({\mathbb {R}}[{\mathbb {S}}_n]\). \(\square \)
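The entire construction of Theorem 1 can be exercised end-to-end for \(n = 3\). The sketch below (our own code and helper names, not the paper's accompanying software) builds every atom \({\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi } = c_\gamma \mathbf {B}_\pi \mathbf {v}_{\gamma ,\lambda ,k}\), using the central idempotent with hardcoded irreducible characters of \({\mathbb {S}}_3\) to carve \(V_\gamma ^*\) out of \({\mathbb {R}}[\Pi _\gamma ]\), and then checks the Parseval property \({\varvec{\Phi }}{\varvec{\Phi }}^\top = \mathbf {I}_{n!}\):

```python
import itertools
import math
import numpy as np

n = 3
perms = list(itertools.permutations(range(1, n + 1)))
N = len(perms)                                    # n! = 6

def apply_perm(sigma, part):
    return tuple(frozenset(sigma[x - 1] for x in block) for block in part)

def sign(sigma):
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def character(gamma, sigma):                      # irreducible characters of S_3
    fixed = sum(sigma[i] == i + 1 for i in range(n))
    return {(3,): 1, (2, 1): fixed - 1, (1, 1, 1): sign(sigma)}[gamma]

def ordered_partitions(gamma):
    cuts, i = [], 0
    for size in gamma:
        cuts.append((i, size)); i += size
    return sorted({tuple(frozenset(p[a:a + s]) for a, s in cuts) for p in perms},
                  key=lambda part: [sorted(b) for b in part])

gens = []                                         # adjacent transpositions
for i in range(n - 1):
    s = list(range(1, n + 1)); s[i], s[i + 1] = s[i + 1], s[i]
    gens.append(tuple(s))

atoms = []
for gamma, d in [((3,), 1), ((2, 1), 2), ((1, 1, 1), 1)]:
    Pi = ordered_partitions(gamma)
    m = len(Pi)
    idx = {part: j for j, part in enumerate(Pi)}
    def act(sigma):                               # sigma permuting R[Pi_gamma]
        M = np.zeros((m, m))
        for j, part in enumerate(Pi):
            M[idx[apply_perm(sigma, part)], j] = 1.0
        return M
    L = sum(np.eye(m) - act(s) for s in gens)     # Schreier Laplacian
    # central idempotent: projects R[Pi_gamma] onto its single copy of V_gamma^*
    P = (d / N) * sum(character(gamma, s) * act(s) for s in perms)
    U, svals, _ = np.linalg.svd(P)
    Q = U[:, svals > 0.5]                         # orthonormal basis of V_gamma^*
    lam, E = np.linalg.eigh(Q.T @ L @ Q)          # eigenvectors inside V_gamma^*
    V = Q @ E
    c = math.sqrt(d / N)
    for pi in Pi:                                 # lift in m_gamma different ways
        B = np.zeros((N, m))                      # B[sigma, mu] = 1 iff sigma(mu) = pi
        for a, s in enumerate(perms):
            for b, mu in enumerate(Pi):
                if apply_perm(s, mu) == pi:
                    B[a, b] = 1.0
        for k in range(V.shape[1]):
            atoms.append(c * (B @ V[:, k]))

Phi = np.column_stack(atoms)
print(Phi.shape)                                  # (6, 13) atoms in the dictionary
assert np.allclose(Phi @ Phi.T, np.eye(N))        # tight Parseval frame
```

The atom count is \(\sum _{\gamma \vdash 3} m_\gamma \dim (V_\gamma ^*) = 1 \cdot 1 + 3 \cdot 2 + 6 \cdot 1 = 13\), an overcomplete dictionary for the 6-dimensional space \({\mathbb {R}}[{\mathbb {S}}_3]\).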
Remark 2
(Frames for subspaces of \({\mathbb {R}}[{\mathbb {S}}_n]\)) From the proof of Theorem 1 we see that the proposed method can be leveraged to construct a tight Parseval frame for any union of the \(Z_{\gamma ,\lambda }\) subspaces, not just \({\mathbb {R}}[{\mathbb {S}}_n]\). This property can be beneficial for computational reasons, and is explored further in Sect. 6.4. In particular, for \(\gamma \vdash n, \lambda \in \Lambda _\gamma \), and \(1 \le k \le \kappa _{\gamma ,\lambda }\), we have:
-
1.
\(\Phi _{\gamma ,\lambda ,k} = \{ \varvec{\varphi }_{\gamma ,\lambda ,k,\pi }\}_{\pi \in \Pi _\gamma }\) is a tight Parseval frame for \({\mathbb {R}}[{\mathbb {S}}_n] \varvec{\varphi }_{\gamma ,\lambda ,k,\pi } \cong V_\gamma \) (for any \(\pi \in \Pi _\gamma \)).
-
2.
\(\Phi _{\gamma ,\lambda } = \{ \varvec{\varphi }_{\gamma ,\lambda ,k,\pi }\}_{1 \le k \le \kappa _{\gamma ,\lambda }, \pi \in \Pi _\gamma }\) is a tight Parseval frame for the subspace \(Z_{\gamma , \lambda }\).
-
3.
\(\Phi _{\gamma } = \{ \varvec{\varphi }_{\gamma ,\lambda ,k,\pi }\}_{\lambda \in \Lambda _\gamma ,1 \le k \le \kappa _{\gamma ,\lambda }, \pi \in \Pi _\gamma }\) is a tight Parseval frame for the isotypic component \(W_{\gamma }\).
Remark 3
(Equal norms) Since the frame \(\Phi _{\gamma ,\lambda ,k} = \{ \varvec{\varphi }_{\gamma ,\lambda ,k,\pi }\}_{\pi \in \Pi _\gamma }\) is generated by a group action, the frame vectors have equal norms. In fact, for any \(\varvec{\varphi }_{\gamma ,\lambda ,k,\pi } \in \Phi _{\gamma ,\lambda ,k}\),
$$\begin{aligned} \Vert \varvec{\varphi }_{\gamma ,\lambda ,k,\pi }\Vert _2^2 = c_\gamma ^2\, \mathbf {v}_{\gamma ,\lambda ,k}^\top \mathbf {B}_{\pi }^\top \mathbf {B}_{\pi } \mathbf {v}_{\gamma ,\lambda ,k} = \frac{d_\gamma }{n!} \cdot \frac{n!}{m_\gamma } = \frac{d_\gamma }{m_\gamma }. \end{aligned}$$
Remark 4
(Frame angles) The Gram matrix (or Gramian) of \(\Phi _{\gamma ,\lambda ,k}\) is the matrix of inner products,
$$\begin{aligned} \mathbf {G}_{\Phi _{\gamma ,\lambda ,k}} := \left[ \langle {\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi }, {\varvec{\varphi }}_{\gamma ,\lambda ,k,\mu } \rangle \right] _{\pi , \mu \in \Pi _\gamma }. \end{aligned}$$
Since \(\Phi _{\gamma ,\lambda ,k}\) is a tight Parseval frame for the irreducible module \({\mathbb {R}}[{\mathbb {S}}_n] {\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi } \cong V_\gamma \), the Gram matrix \(\mathbf{G}_{\Phi _{\gamma ,\lambda ,k}}\) equals the \(m_\gamma \times m_\gamma \) matrix that projects \({\mathbb {R}}[\Pi _\gamma ]\) onto \(V_\gamma \) (see [97, Corollary 10.2, Theorem 13.1]). This projection has a well-known description (e.g., [97, (13.19)]) as the matrix of the following operator in the center of the group algebra \({\mathbb {R}}[{\mathbb {S}}_n]\):
$$\begin{aligned} p_\gamma = \frac{d_\gamma }{n!} \sum _{\sigma \in {\mathbb {S}}_n} \chi _\gamma (\sigma ^{-1})\, \sigma , \end{aligned}$$
where the coefficients \(\chi _\gamma (\sigma ^{-1}) \) are given by the irreducible character \(\chi _\gamma \) corresponding to \(\gamma \). Applying \(p_\gamma \) to the basis \(\{{\mathbf {e}}_\mu \}_{\mu \in \Pi _\gamma }\) of \({\mathbb {R}}[\Pi _\gamma ]\) gives
$$\begin{aligned} p_\gamma ({\mathbf {e}}_\mu ) = \frac{d_\gamma }{n!} \sum _{\sigma \in {\mathbb {S}}_n} \chi _\gamma (\sigma ^{-1})\, {\mathbf {e}}_{\mu } \sigma = \frac{d_\gamma }{n!} \sum _{\pi \in \Pi _\gamma } \bigg ( \sum _{\sigma \in {{\mathcal {V}}}_{\pi ,\mu }} \chi _\gamma (\sigma ) \bigg ) {\mathbf {e}}_\pi . \end{aligned}$$
For \(\pi ,\mu \in \Pi _\gamma \), the entry \(\langle {\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi }, {\varvec{\varphi }}_{\gamma ,\lambda ,k,\mu }\rangle \) of \(\mathbf{G}_{\Phi _{\gamma ,\lambda ,k}}\) is the coefficient of \({\mathbf {e}}_\pi \) in \(p_\gamma ({\mathbf {e}}_{\mu })\); namely,
$$\begin{aligned} \langle {\varvec{\varphi }}_{\gamma ,\lambda ,k,\pi }, {\varvec{\varphi }}_{\gamma ,\lambda ,k,\mu }\rangle = \frac{d_\gamma }{n!} \sum _{\sigma \in {{\mathcal {V}}}_{\pi ,\mu }} \chi _\gamma (\sigma ), \end{aligned}$$
(19)
where \({{\mathcal {V}}}_{\pi ,\mu } = \{ \sigma \in {\mathbb {S}}_n \mid \sigma (\mu ) = \pi \}\) is one of the equivalence classes of Definition 2 and is a right coset of the stabilizer \({{\mathcal {V}}}_{\pi ,\pi } = \{ \sigma \in {\mathbb {S}}_n \mid \sigma (\pi ) = \pi \}\). The characters of the symmetric group are integers, so, \( \sum _{\sigma \in {{\mathcal {V}}}_{\pi ,\mu } } \chi _\gamma (\sigma ) \in {\mathbb {Z}}\). Frame vectors corresponding to different values of \(\lambda \) and k are orthogonal (as seen in the proof of Theorem 1). Therefore the Gram matrix \(\mathbf{G}_{\Phi _\gamma } = \bigoplus _{\lambda \in \Lambda _\gamma } \bigoplus _{k = 1}^{\kappa _{\gamma ,\lambda }} \mathbf{G}_{\Phi _{\gamma ,\lambda ,k}}\) for the isotypic component \(W_\gamma \) is the direct sum of \(d_\gamma \) matrices, each of the form (19).
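The character sums in (19) are quick to evaluate on a small case. The sketch below (our own code) computes the Gram matrix for \(n = 3\) and \(\gamma = [2,1]\) directly from the cosets \({{\mathcal {V}}}_{\pi ,\mu }\) and confirms that it is the projection of \({\mathbb {R}}[\Pi _\gamma ]\) onto \(V_\gamma \), with diagonal entries \(d_\gamma /m_\gamma \) as in Remark 3:

```python
import itertools
import numpy as np

n, d = 3, 2                                   # gamma = (2,1), d_gamma = 2
perms = list(itertools.permutations(range(1, n + 1)))

def chi(sigma):                               # chi_{(2,1)}(sigma) = (#fixed points) - 1
    return sum(sigma[i] == i + 1 for i in range(n)) - 1

def apply_perm(sigma, part):
    return tuple(frozenset(sigma[x - 1] for x in block) for block in part)

Pi = sorted({(frozenset(p[:2]), frozenset(p[2:])) for p in perms},
            key=lambda part: [sorted(b) for b in part])
m = len(Pi)                                   # m_gamma = 3

# G[pi, mu] = (d_gamma / n!) * sum of chi over the coset V_{pi,mu}, as in (19)
G = np.zeros((m, m))
for a, pi in enumerate(Pi):
    for b, mu in enumerate(Pi):
        G[a, b] = d * sum(chi(s) for s in perms if apply_perm(s, mu) == pi) / len(perms)

assert np.allclose(G @ G, G)                  # G is the projection onto V_gamma
assert np.isclose(np.trace(G), d)             # rank d_gamma = 2
assert np.allclose(np.diag(G), d / m)         # diagonal = d_gamma / m_gamma
```

For this example \(\mathbf {G} = \mathbf {I}_3 - \frac{1}{3}\mathbf {J}_3\), the projection onto the complement of the constant vector, so each off-diagonal character sum is the integer \(-1\).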
For shapes \(\gamma \) with multiple blocks of the same size, there is redundancy in the frame \(\varPhi _{\gamma ,\lambda }\) that can be removed. Let \(\gamma = [\gamma _1, \ldots , \gamma _\ell ]\) and suppose that two parts of \(\gamma \) are equal; that is, \(\gamma _i = \gamma _j\). For \(\pi \in \Pi _\gamma \), let \(\pi ' \in \Pi _\gamma \) be the ordered set partition obtained from \(\pi \) by swapping row i and row j. For example, if \(\gamma = [4,3,3,2]\), then
the two ordered set partitions \(\pi \) and \(\pi '\) displayed in (20), which differ by swapping rows 2 and 3, satisfy this condition. The liftings from \(V_\gamma \subseteq {\mathbb {R}}[\Pi _\gamma ]\) to \({\mathbb {R}}[{\mathbb {S}}_n]\) via \(\pi \) and \(\pi '\) are related according to the following lemma.
Lemma 1
Let \(\gamma = [\gamma _1, \ldots , \gamma _\ell ] \vdash n\) with \(t= \gamma _i = \gamma _j\) and let \(\pi , \pi ' \in \Pi _\gamma \) be equal after swapping rows i and j in \(\pi \). If \({\mathbf {v}}\in V_\gamma \subseteq {\mathbb {R}}[\Pi _\gamma ],\) then \(\mathbf{B}_\pi {\mathbf {v}}= (-1)^t \mathbf{B}_{\pi '} {\mathbf {v}}\).
Proof
Let \(\gamma = [\gamma _1, \ldots , \gamma _\ell ]\) with \(t =\gamma _i = \gamma _j\). For \(\rho \in \Pi _\gamma \), let \(\rho '\) be the same set partition as \(\rho \) except with rows i and j swapped, as illustrated in (20). Suppose that \({\mathbf {v}}\in {\mathbb {R}}[\Pi _\gamma ]\) is expressed in the canonical basis as \({\mathbf {v}}= \sum _{\rho \in \Pi _\gamma } c_\rho {\mathbf {e}}_\rho \). Suppose further that for each \(\rho \) we have \(c_\rho = (-1)^t c_{\rho '}\), which we call the symmetry property. Then,
since \(\sigma (\rho ) = \pi \) if and only if \(\sigma (\rho ') = \pi '\). Thus, the lemma is proved if we show that every \({\mathbf {v}}\) in the submodule \(V_\gamma \subseteq {\mathbb {R}}[\Pi _\gamma ]\) has the symmetry property \(c_\rho = (-1)^t c_{\rho '}\).
The submodule \(V_\gamma ^*\subseteq {\mathbb {R}}[\Pi _\gamma ]\) is spanned by the following set of vectors, called polytabloids (see [83, 2.3]),
\(q_{\pi } := \sum _{\beta \in C_{\pi }} \mathsf {sign}(\beta )\, {\mathbf {e}}_{\beta (\pi )}, \qquad \pi \in \Pi _\gamma ,\)
where \(C_\pi \subseteq {\mathbb {S}}_n\) is the column group of \(\pi \), that is, the permutations that stabilize the columns of \(\pi \), and \(\mathsf {sign}(\beta )\) is the sign of the permutation \(\beta \). Let \(\tau _\pi \in C_\pi \) be the permutation that is the product of the t disjoint transpositions (not necessarily adjacent) that swap an entry in row i of \(\pi \) with the corresponding entry in row j. For example, in (20), \(\tau _\pi = (2,3)(5,6)(8,11)\). Then \(\mathsf {sign}(\tau _\pi ) = (-1)^t\) and \(\pi ' = \tau _\pi (\pi )\).
Since \(\tau _\pi \in C_\pi \), we have \(\tau _\pi ^{-1} C_\pi \tau _\pi = C_\pi \) and \(\mathsf {sign} (\tau _\pi ^{-1} \beta \tau _\pi ) = \mathsf {sign} (\beta )\). Moreover, \(\tau _\pi ^{-1} = \tau _\pi \), so we have
\(q_{\pi '} = \sum _{\beta \in C_{\pi }} \mathsf {sign}(\beta )\, {\mathbf {e}}_{\beta \tau _\pi (\pi )} = (-1)^t \sum _{\beta \in C_{\pi }} \mathsf {sign}(\beta \tau _\pi )\, {\mathbf {e}}_{\beta \tau _\pi (\pi )} = (-1)^t q_{\pi },\)
and thus \(q_{\pi } = (-1)^t q_{\pi '}\). It follows that \(q_{\pi }\) has the symmetry property \(c_\rho = (-1)^t c_{\rho '}\). Since this is true of each vector of the spanning set \(q_\pi , \pi \in \Pi _\gamma \), it is true for all of \(V_\gamma ^*\), and the result is proved. \(\square \)
Lemma 1 tells us that, in the case where \(\gamma \) has repeated parts, many of the atoms in (15) are identical or are the negatives of others. We then can lift fewer vectors to generate a tight Parseval frame for \(Z_{\gamma , \lambda }\), which leads to a more computationally efficient implementation without sacrificing any interpretability. Define \(z_{\gamma }:=\frac{m_\gamma }{\prod _i k_i!}\), where if \(\gamma =[\gamma _1,\ldots ,\gamma _{\ell }]\), \(k_i\) is the multiplicity of i in \(\gamma \). For example, if \(\gamma =[4,2,2,2,1]\), \(m_\gamma =\frac{11!}{4!2!2!2!1!}\) and \(z_\gamma =\frac{m_\gamma }{3!}\), as \(i=2\) appears three times in \(\gamma \). Identifying ordered set partitions in \(\Pi _{\gamma }\) that feature the same groupings of candidates yields a smaller set of \(z_\gamma \) (unordered) set partitions, which we denote by \(\bar{\Pi }_\gamma \). For example, the ordered set partitions \(\{\{1,2,3,4\},\{5,6\},\{7,8\},\{9,10\},\{11\}\}\), \(\{\{1,2,3,4\},\{7,8\},\{5,6\},\{9,10\},\{11\}\}\), and four others are all identified to a single set partition in \(\bar{\Pi }_{[4,2,2,2,1]}\). For \({\bar{\pi }}\in \bar{\Pi }_\gamma \), define \({\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}} := {\bar{c}}_{\gamma }{} \mathbf{B}_{{\bar{\pi }}} \mathbf{v}_{\gamma , \lambda , k}\), and define the reduced frame
\(\bar{\varPhi }_{\gamma ,\lambda } := \left\{ {\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}}\right\} _{1\le k\le \kappa _{\gamma ,\lambda },\, {\bar{\pi }}\in \bar{\Pi }_\gamma }, \qquad (22)\)
where the constant \({\bar{c}}_\gamma :=\sqrt{\frac{d_\gamma m_\gamma }{n! z_\gamma }}\). In Theorem 2 we show that \(\bar{\varPhi }_{\gamma ,\lambda }\) remains a tight Parseval frame for \(Z_{\gamma ,\lambda }\).
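The counts \(m_\gamma \) and \(z_\gamma \) used above are straightforward to compute; a minimal sketch (plain Python, helper names ours):

```python
from math import factorial
from collections import Counter

def m_gamma(gamma):
    """Number of ordered set partitions of shape gamma: n! / prod(gamma_i!)."""
    out = factorial(sum(gamma))
    for part in gamma:
        out //= factorial(part)
    return out

def z_gamma(gamma):
    """Number of unordered set partitions of shape gamma: divide m_gamma
    by k_i! for each part size i occurring k_i times in gamma."""
    out = m_gamma(gamma)
    for mult in Counter(gamma).values():
        out //= factorial(mult)
    return out
```

For \(\gamma =[4,2,2,2,1]\) this gives \(m_\gamma =207{,}900\) and \(z_\gamma =34{,}650\), and for \(\gamma =[2,2]\) it recovers \(m_\gamma =6\) and \(z_\gamma =3\).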
Theorem 2
For \(\gamma \vdash n\) and \(\lambda \in \Lambda _\gamma ,\) the collection of atoms \(\bar{\varPhi }_{\gamma ,\lambda }\) defined in (22) is a tight Parseval frame for \(Z_{\gamma , \lambda },\) and the set of atoms \(\bar{{{\mathcal {D}}}}:=\bigcup _{\gamma \vdash n} \bigcup _{\lambda \in \Lambda _\gamma } \bar{\varPhi }_{\gamma ,\lambda }\) is a tight Parseval frame for \({\mathbb {R}}[{\mathbb {S}}_n]\).
Proof
By Lemma 1, \(\Phi _{\gamma ,\lambda } = \bigcup _{i = 1}^{m_\gamma /z_\gamma } \pm \bar{\Phi }_{\gamma ,\lambda }\). Therefore, from Theorem 1, for any \(\mathbf{f} \in {\mathbb {R}}[{\mathbb {S}}_n]\), we have
By observing that \({\bar{c}}_\gamma =\sqrt{\frac{d_\gamma m_\gamma }{n! z_\gamma }} = \sqrt{\frac{m_\gamma }{z_\gamma }} c_\gamma ,\) we see that \(\bar{\Phi }_{\gamma ,\lambda }\) is a tight Parseval frame for \(Z_{\gamma ,\lambda }\). \(\square \)
In Fig. 15, for \(\gamma =[2,2]\), we show two different eigenvectors of the Schreier graph \({\mathbb {P}}_\gamma \) lifted back to the permutahedron according to the \(z_\gamma =3\) different ordered set partitions in Fig. 13, yielding tight frames \(\bar{\varPhi }_{[2,2],1.2679}\) and \(\bar{\varPhi }_{[2,2],4.7321}\) with three vectors each for the two-dimensional spaces \(Z_{[2,2],1.2679}\) and \(Z_{[2,2],4.7321}\), respectively. Importantly, compared to the orthonormal bases for the same spaces in Fig. 9, the frame vectors in Fig. 15 maintain interpretable symmetry properties.
In this case, were we to include all \(m_\gamma =6\) liftings generated from the ordered set partitions in \(\Pi _\gamma \), the resulting frames \(\varPhi _{\gamma ,\lambda }\) would have two copies of each of these three atoms, all scaled by the constant factor \(\frac{c_\gamma }{{\bar{c}}_\gamma }=\sqrt{\frac{z_\gamma }{m_\gamma }}\). In cases where the repeated parts have odd length (e.g., tight frames for \(Z_{[8,1,1],\lambda }\) generated by lifting the eigenvector in Fig. 23b), the removed atoms would have the opposite sign in each entry. Unless specified, our default in the remainder of the paper is to use the tight frames with fewer elements defined in (22).
Returning to the three objectives outlined at the beginning of this section, the proposed dictionary atoms comprise a tight Parseval frame, as shown in Theorems 1 and 2, and therefore satisfy the first objective of preserving the energy of the signal. Each atom \({\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}}\) belongs to the space \(Z_{\gamma , \lambda } = W_\gamma \cap U_\lambda \) and therefore inherits known symmetry and smoothness properties from \(W_\gamma \) and \(U_\lambda \), respectively. In the next section, we investigate the interpretation of specific frame analysis coefficients \(\langle \mathbf{f},{\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}}\rangle \) in the context of several example data sets. The third objective of identifying methods to efficiently compute these inner products is the focus of Sect. 6, and this computational question actually gives rise to additional interesting theoretical questions.
5 Interpretation of the Analysis Coefficients
The inner products between a signal on the permutahedron and the atoms of the tight spectral frame \(\bar{{{\mathcal {D}}}}\) from Theorem 2 (or \({{\mathcal {D}}}\) from Theorem 1), referred to as analysis coefficients, are useful in identifying structure in voting data, such as popular candidates, polarizing candidates, and clusters of candidates commonly ranked similarly by subgroups of voters (e.g., political parties). The analysis coefficients have the form:
\(\alpha _{\gamma ,\lambda ,k,{\bar{\pi }}} := \langle \mathbf{f},{\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}}\rangle = \langle \mathbf{f},{\bar{c}}_{\gamma }\mathbf{B}_{{\bar{\pi }}}\mathbf{v}_{\gamma ,\lambda ,k}\rangle = {\bar{c}}_{\gamma }\langle \mathbf{B}_{{\bar{\pi }}}^{\top }\mathbf{f},\mathbf{v}_{\gamma ,\lambda ,k}\rangle . \qquad (23)\)
In some instances, it is beneficial for interpretation purposes to view these analysis coefficients as inner products between the signal \(\mathbf{f}\) and the atoms [the first two terms in (23)]. All of these signals reside on the permutahedron \({\mathbb {P}}_n\) (see Fig. 15 or the second row from the bottom in Fig. 18 for illustrations of example atoms). In other instances, it is helpful to view the same quantity via the last term in (23): a constant times an inner product between the signal projected down to the Schreier graph \({\mathbb {P}}_\gamma \) in a specific manner, and a Laplacian eigenvector of that Schreier graph.
Recalling that the energy of the signal is equal to the energy of the analysis coefficients, i.e.,
\(\Vert \mathbf{f}\Vert ^2 = \sum _{\gamma \vdash n} \sum _{\lambda \in \Lambda _\gamma } \sum _{k = 1}^{\kappa _{\gamma ,\lambda }} \sum _{{\bar{\pi }}\in \bar{\Pi }_\gamma } |\langle \mathbf{f}, {\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}} \rangle |^2, \qquad (24)\)
we first investigate what information can be garnered from the decomposition of the energy of the analysis coefficients on the right-hand side of (24) (i) across shapes \(\gamma \), (ii) across eigenvalues \(\lambda \) within a fixed shape \(\gamma \), and (iii) across atoms within a shape–eigenvalue pair \(\gamma , \lambda \). We then examine the interpretation of specific analysis coefficients.
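The energy-preservation property of a tight Parseval frame is easy to illustrate on a toy example; the sketch below uses the classical three-vector "Mercedes-Benz" frame for \({\mathbb {R}}^2\) rather than the dictionary \(\bar{{{\mathcal {D}}}}\) itself, so all names here are our own:

```python
import math
import random

# A tight Parseval frame for R^2: three equally spaced unit vectors
# scaled by sqrt(2/3).  Like the dictionary, it has more atoms than
# dimensions, yet the analysis coefficients preserve the signal energy.
frame = [(math.sqrt(2 / 3) * math.cos(2 * math.pi * k / 3),
          math.sqrt(2 / 3) * math.sin(2 * math.pi * k / 3)) for k in range(3)]

def analysis_energy(f):
    """Sum of squared analysis coefficients |<f, phi>|^2 over the frame."""
    return sum((f[0] * phi[0] + f[1] * phi[1]) ** 2 for phi in frame)

random.seed(0)
f = (random.uniform(-5, 5), random.uniform(-5, 5))
# Parseval identity: coefficient energy equals signal energy.
assert abs(analysis_energy(f) - (f[0] ** 2 + f[1] ** 2)) < 1e-12
```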
5.1 Energy Decomposition
The squared magnitudes \(|\langle \mathbf{f}, {\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}} \rangle |^2\) in the summand of (24) can be aggregated and plotted in different ways to identify structural patterns in the ranking tallies, \(\mathbf{f}\). First, for each shape \(\gamma \), the sum \(\sum _{\lambda ,k,{\bar{\pi }}} |\langle \mathbf{f}, {\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}} \rangle |^2\) is equal to \(\Vert \mathbf{f}_\gamma \Vert ^2\), the energy of the projection of the signal onto the corresponding isotypic component (compare, e.g., the bottom image in Fig. 3 to the top table in Fig. 18). Each of these quantities can be further decomposed across the eigenspaces associated with the eigenvalues in \(\Lambda _\gamma \) via the sums \(\Vert \mathbf{f}_{\gamma ,\lambda }\Vert ^2=\sum _{k,{\bar{\pi }}} |\langle \mathbf{f}, {\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}} \rangle |^2\). For example, the top table in Fig. 18 tabulates these energies for the 2017 Minneapolis City Council Ward 3 election data \(\mathbf{g}\).
Through plots such as those shown in Figs. 16 and 17, we can visualize the full decomposition of energy into shape–eigenvalue pairs. For example, in Fig. 16 (or the top row of Fig. 18), we see that most of the energy from the 2017 Minneapolis City Council Ward 3 election data \(\mathbf{g}\) shown in Fig. 1 falls in the spaces \(Z_{[4],0}\) (which just conveys information about the total number of voters), \(Z_{[3,1],2}\), \(Z_{[3,1],0.586}\), and \(Z_{[2,2],1.268}\). As discussed in Sect. 3.2, typically occurring rankings are smooth with respect to the underlying permutahedron structure, and we therefore expect to see a decay in the energies \(\{\Vert \mathbf{f}_{\gamma ,\lambda }\Vert ^2\}\) as \(\lambda \) increases, as is the case, e.g., in the sushi data in Fig. 17. While in the examples shown in Figs. 16 and 17 the bar at each eigenvalue consists of a single color representing the corresponding shape, this need not be the case. For example, when \(n=6\), the eigenvalue \(\lambda =3\) appears in two shapes, and therefore the energy of a signal at this eigenvalue would consist of two bars of different colors stacked on top of each other. Additionally, when the graph Laplacian eigenspace of the Schreier graph \({\mathbb {P}}_\gamma \) that is associated with \(\lambda \) in \(V_{\gamma }\) has dimension greater than one (i.e., \(\kappa _{\gamma ,\lambda }>1\)), the energies of the analysis coefficients resulting from all atoms generated from the basis \(\left\{ \mathbf{v}_{\gamma ,\lambda ,k}\right\} _{k=1}^{\kappa _{\gamma ,\lambda }}\) are stacked on top of one another and shown as a single bar. For example, the apparent outliers at \(\lambda =8\) and \(\lambda =10\) in Fig. 17 are only due to the fact that \(\kappa _{\gamma ,\lambda }>1\) for these shape–eigenvalue pairs, as opposed to some structure of interest in the sushi data; a plot of the energies of the analysis coefficients of a pure noise signal yields similar outliers.
We can further decompose the energy in any shape–eigenvalue pair by examining the sequence of energies, \(\{|\langle \mathbf{f}, {\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}} \rangle |^2\}_{1 \le k \le \kappa _{\gamma ,\lambda }, {\bar{\pi }}\in \bar{\Pi }_\gamma },\) associated with each atom in the frame. In particular, for any shape–eigenvalue pair with a significant amount of energy, we can identify the set partitions \(\bar{\pi }\) used to lift the associated eigenvectors to generate the atoms that yield the inner products with the highest magnitudes. For example, for the sushi data signal \(\mathbf{h}\), Fig. 19 shows the shape–eigenvalue-lifting triplets associated with the largest-magnitude analysis coefficients.
5.2 Interpretation of Specific Analysis Coefficients
How can we use the magnitudes of the analysis coefficients to identify structure in the ranked data? The general methodology for each analysis coefficient is to look at the structure of the corresponding eigenvector on the Schreier graph of the identified shape and the projection of the signal onto that Schreier graph via \(\mathbf{B}_{\bar{\pi }}^{\top }\), as the analysis coefficient is the inner product of those two signals on \({\mathbb {P}}_{\gamma }\) [see (23)]. One key advantage of this method is that it allows us to create visualizations of high-dimensional data on much lower-dimensional graphs. Some of these eigenvectors are more easily interpretable than others, but we are fortunate that the most interpretable eigenvectors are often the ones associated with the largest magnitude analysis coefficients, particularly for the smooth signals that commonly arise in ranking applications.
5.2.1 The Shape \(\gamma =[n]\): Number of Votes
The only eigenvalue in \(\Lambda _{[n]}\) is \(\lambda =0\), and there is just a single atom associated with the shape \(\gamma =[n]\): \({\varvec{\varphi }}_{[n],0,\{\{1,2,\ldots ,n\}\}}=\frac{1}{\sqrt{n!}} \mathbf{1}_{n!},\) where \(\mathbf{1}_{n!}\) is a constant vector of ones of length n!. This atom is a basis for the isotypic component \(W_{[n]}\), and the analysis coefficient \(\langle \mathbf{f},{\varvec{\varphi }}_{[n],0,\{\{1,2,\ldots ,n\}\}}\rangle = \hat{f}(0) = \Vert \mathbf{f}_{[n]}\Vert \) is just the total number of votes (rankings) divided by \(\sqrt{n!}\) (cf. Fig. 3).
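A quick numerical sanity check of this coefficient (plain Python; the toy tally \(\mathbf f\) is synthetic, not one of the paper's data sets):

```python
import math
import random
from itertools import permutations

n = 4
rankings = list(permutations(range(n)))          # vertices of P_4, |S_4| = 24
random.seed(1)
f = [random.randrange(0, 50) for _ in rankings]  # a toy tally vector

# The single atom for gamma = [n] is the constant vector 1/sqrt(n!).
atom = [1 / math.sqrt(len(rankings))] * len(rankings)

coef = sum(fi * ai for fi, ai in zip(f, atom))
# The analysis coefficient equals the total vote count divided by sqrt(n!).
assert abs(coef - sum(f) / math.sqrt(24)) < 1e-9
```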
5.2.2 The Shape \(\gamma =[n-1,1]\): Individual Popularity and Polarization
The Laplacian eigenvalues \(\Lambda _{[n-1,1]}\) and associated eigenvectors of the Schreier graph \({\mathbb {P}}_{[n-1,1]}\) are the same as the \(n-1\) nonzero Laplacian eigenvalues and associated eigenvectors of a path graph with n vertices. These are known in closed form.
Lemma 2
[see, e.g., [93]] Let \(Q_n\) denote the path graph on n vertices. The Laplacian eigenvalues of \(Q_n\) are
\(\lambda _{\ell } = 2-2\cos \left( \frac{\pi \ell }{n}\right) , \qquad \ell = 0,1,\ldots ,n-1,\)
and the associated Laplacian eigenvectors are
\(\mathbf{v}_{\lambda _{\ell }}(i) = \frac{2}{\sqrt{n}}\cos \left( \frac{\pi \ell (i-0.5)}{n}\right) , \qquad i = 1,2,\ldots ,n.\)
In Fig. 20, we show the first two eigenvalues in \(\Lambda _{[9,1]}\) and their associated eigenvectors on \({\mathbb {P}}_{[9,1]}\).
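The closed forms in Lemma 2 can be verified numerically; the sketch below (plain Python, helper name ours) applies the path-graph Laplacian directly to the cosine vectors, with normalization omitted since it does not affect the eigenvalue equation:

```python
import math

n = 10  # matches the Schreier graph P_[9,1], whose nonzero spectrum is that of Q_10

def laplacian_apply(v):
    """Apply the combinatorial Laplacian L = D - A of the path graph Q_n to v."""
    out = []
    for i in range(n):
        deg = (1 if i > 0 else 0) + (1 if i < n - 1 else 0)
        s = deg * v[i]
        if i > 0:
            s -= v[i - 1]
        if i < n - 1:
            s -= v[i + 1]
        out.append(s)
    return out

# Closed-form eigenpairs: lambda_l = 2 - 2 cos(pi l / n), with eigenvector
# entries proportional to cos(pi l (i - 0.5) / n) for i = 1, ..., n.
for ell in range(n):
    lam = 2 - 2 * math.cos(math.pi * ell / n)
    v = [math.cos(math.pi * ell * (i + 0.5) / n) for i in range(n)]
    Lv = laplacian_apply(v)
    assert max(abs(Lv[i] - lam * v[i]) for i in range(n)) < 1e-9
```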
Atoms Generated from the Laplacian Eigenvector \(\mathbf{v}_{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) }\): How Popular Is Each Candidate?
For \(z=1,2,\ldots ,n\), let \({\bar{\pi }}_z\) be the set partition that places candidate z in one block and all other candidates in the other block (e.g., with \(n=4\), \({\bar{\pi }}_2=\{\{1,3,4\},\{2\}\}\)). Then the atom \({\varvec{\varphi }}_{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) ,{\bar{\pi }}_z}={\bar{c}}_{[n-1,1]}\mathbf{B}_{{\bar{\pi }}_z} \mathbf{v}_{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) }\) is equal to \({\bar{c}}_{[n-1,1]}\frac{2}{\sqrt{n}}\cos \left( \frac{\pi (i-0.5)}{n}\right) \) on each vertex of the permutahedron \({\mathbb {P}}_n\) associated with a ranking in which candidate z is ranked in place i (see the second row from the bottom in Fig. 18 for illustrations of such atoms). Since the eigenvector entries \(\mathbf{v}_{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) }(i)=\frac{2}{\sqrt{n}}\cos \left( \frac{\pi (i-0.5)}{n}\right) \) decrease from \(i=1\) to \(i=n\) (see Fig. 20), the analysis coefficient \({\alpha }_{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) ,{\bar{\pi }}_z}\) conveys a notion of general favorability of candidate z. As shown in Fig. 18, for the 2017 Minneapolis City Council Ward 3 election data, the largest analysis coefficient in this shape–eigenvalue pair is the 290.8 associated with \({\bar{\pi }}_3\), followed by \({\bar{\pi }}_1\), \({\bar{\pi }}_4\), and \({\bar{\pi }}_2\), indicating that candidate 3 is generally popular whereas candidate 2 is generally not popular. In the sushi preference data, the order of the candidate popularities according to this metric, from most popular to least popular, is 8, 3, 1, 6, 2, 5, 4, 9, 7, 10. This ordering is the same as the Condorcet ranking listed in Sect. 2.3, except with shrimp and salmon roe swapped in third and fourth place.
Remark 5
(Relation with rank aggregation and the Borda count) The problem of mapping ranking data into a single consensus ranking of the candidates is referred to as rank aggregation and dates back to social choice theory in the eighteenth century [11] (see [62, 100, Sect. 5] for more recent surveys). Borda’s original method [11] is to assign n points to each first place vote, \(n-1\) points to each second place vote, and so forth, until 1 point for each last place vote. The consensus ranking is then formed according to the total number of points each candidate receives. The ranking of the candidates according to the analysis coefficients \({\alpha }_{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) ,{\bar{\pi }}_z}\) demonstrated above is equivalent to a Borda count aggregate ranking that uses point values
\(\frac{n+1}{2} + \frac{n-1}{2}\cdot \frac{\cos \left( \frac{\pi (i-0.5)}{n}\right) }{\cos \left( \frac{\pi }{2n}\right) }\)
for ith place instead of the more common linearly spaced point values from n to 1. For example, with 10 candidates, the ranking of the analysis coefficients generated from this eigenvector is equivalent to a weighted Borda count where the points assigned for each vote, from first place to last place, are 10.00, 9.56, 8.72, 7.57, 6.21, 4.79, 3.43, 2.28, 1.44, and 1.00.
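One explicit form of these point values, consistent with the numbers quoted above, is the affine rescaling of the cosine profile \(\cos (\pi (i-0.5)/n)\) so that first place is worth n points and last place 1 point; a sketch (plain Python, helper name ours):

```python
import math

def spectral_borda_points(n):
    """Point values implied by ranking candidates via the analysis
    coefficients for the first nonzero [n-1,1] eigenvector: the cosine
    profile cos(pi (i - 0.5) / n), rescaled affinely so that first place
    is worth n points and last place is worth 1 point."""
    top = math.cos(math.pi * 0.5 / n)  # profile value at i = 1
    pts = []
    for i in range(1, n + 1):
        c = math.cos(math.pi * (i - 0.5) / n)
        pts.append((n + 1) / 2 + (n - 1) / 2 * c / top)
    return pts
```

For \(n=10\), rounding `spectral_borda_points(10)` to two decimals reproduces the values 10.00, 9.56, 8.72, 7.57, 6.21, 4.79, 3.43, 2.28, 1.44, 1.00 quoted above.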
Atoms Generated from the Laplacian Eigenvector \(\mathbf{v}_{[n-1,1],2-2\cos \left( \frac{2\pi }{n}\right) }\): How Polarizing Is Each Candidate?
The eigenvector entries \(\mathbf{v}_{[n-1,1],2-2\cos \left( \frac{2\pi }{n}\right) }(i)=\frac{2}{\sqrt{n}}\cos \left( \frac{2\pi (i-0.5)}{n}\right) \) decrease from \(i=1\) to the middle ranking positions and then increase again up to \(i=n\) (see, e.g., Fig. 20). The atom \({\varvec{\varphi }}_{[n-1,1],2-2\cos \left( \frac{2\pi }{n}\right) ,{\bar{\pi }}_z}={\bar{c}}_{[n-1,1]}\mathbf{B}_{{\bar{\pi }}_z} \mathbf{v}_{[n-1,1],2-2\cos \left( \frac{2\pi }{n}\right) }\)
therefore features positive values on the vertices of \({\mathbb {P}}_n\) associated with rankings in which candidate z is ranked towards the top or bottom, and negative values on vertices associated with rankings in which candidate z is ranked in the middle. Accordingly, the analysis coefficient \({\alpha }_{[n-1,1],2-2\cos \left( \frac{2\pi }{n}\right) ,{\bar{\pi }}_z}\) is large when many voters feel strongly (either positively or negatively) about candidate z. For example, looking at the summary of the analysis coefficients for the 2017 Minneapolis City Council Ward 3 election data in the bottom row of the column for eigenvalue \(\lambda =2-2\cos \left( \frac{2\pi }{4}\right) =2\) in Fig. 18, the largest analysis coefficient is the 318.7 associated with the set partition \(\{\{2,3,4\},\{1\}\}\), indicating that candidate 1 (Ginger Jentzen) is often ranked in either first or last place. The corresponding coefficients for candidates 2 and 3 are negative, indicating they are often ranked in positions 2 and 3, and are less polarizing. For the sushi preference data, the largest positive analysis coefficients in the \(\gamma =[9,1]\), \(\lambda =0.382\) pair (see Fig. 19) indicate items 8 and 5 (fatty tuna and sea urchin) are often ranked very high or very low, while item 9 (tuna roll) is often ranked in the middle. The key takeaway is that the combination of a near-zero value of \(\alpha _{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) ,{\bar{\pi }}_z}\) and a relatively large value of \(\alpha _{[n-1,1],2-2\cos \left( \frac{2\pi }{n}\right) ,{\bar{\pi }}_z}\) indicates that the candidate identified by the singleton in the set partition \({\bar{\pi }}_z\) is highly polarizing.
This is the case in the sushi preference data for the sea urchin item, for which \(\alpha _{[n-1,1],2-2\cos \left( \frac{\pi }{n}\right) ,{\bar{\pi }}_z}=-0.016\) (lowest magnitude of any item) and \(\alpha _{[n-1,1],2-2\cos \left( \frac{2\pi }{n}\right) ,{\bar{\pi }}_z}=0.960\) (second highest of the items), indicating that the voting population is roughly split between strongly liking and strongly disliking sea urchin.
5.2.3 The Shapes \(\gamma =[n-2,2]\) and \(\gamma =[n-2,1,1]\): Pairwise Co-occurrence
Net of their individual popularities, when are two candidates likely to be ranked similarly (either positively or negatively) by voters? Given a voter’s first choice candidate, are there other candidates the voter is likely to feel positively, negatively, or neutral about? These are the types of pairwise co-occurrence questions that can be answered with the second-order marginal information found in the shapes \(\gamma =[n-2,2]\) and \(\gamma =[n-2,1,1]\).
For the 2017 Minneapolis City Council Ward 3 election data, the largest analysis coefficient in this shape, shown in the bottom-right of Fig. 18, is the 239.0 associated with eigenvalue \(\lambda =1.268\) and the set partition \({\bar{\pi }}=\{12|34\}\), indicating candidates 3 and 4 are often ranked together in the first two positions or last two positions. This is not surprising as these candidates belong to the same political party. For a small number of candidates, such as \(n=4\) in this case, it is possible to visually inspect the atoms of the form \({\varvec{\varphi }}_{\gamma ,\lambda ,{\bar{\pi }}}\) shown in Fig. 18 in order to interpret the analysis coefficients.
For larger values of n, however, it is more convenient to think about the inner products defined in (23) as \(\alpha _{\gamma ,\lambda ,{\bar{\pi }}}= {\bar{c}}_{\gamma } \langle \mathbf{B}_{{\bar{\pi }}}^{\top }{} \mathbf{f},\mathbf{v}_{\gamma , \lambda }\rangle \). Here, \(\mathbf{B}_{{\bar{\pi }}}^{\top }{} \mathbf{f}\) is a projection of the signal from \({\mathbb {P}}_n\) down to the Schreier graph \({\mathbb {P}}_\gamma \), the same structure on which the eigenvector \(\mathbf{v}_{\gamma , \lambda }\) resides. In Fig. 21, we show four such projections of \(\mathbf{h}\) onto \({\mathbb {P}}_{[8,2]}\) that capture the joint placement of four different pairs of items: 3 and 8 (two favorites), 8 and 10 (one favorite and one of the least preferable items), 5 and 6 (two more polarizing items that are often ranked together near the top or together at the bottom), and 1 and 2 (two generally well liked items that are often ranked towards the top but not at the top). In each of the projections shown in Fig. 21, the value at each vertex is equal to the number of voters who ranked the selected pair of items in the ranking positions contained in the vertex labels; for example, Fig. 21a shows that 637 of the 5000 voters placed items 3 and 8 (tuna and fatty tuna) in their top two ranking slots.
The other halves of the inner products \(\alpha _{[8,2],\lambda ,{\bar{\pi }}}= {\bar{c}}_{[8,2]} \langle \mathbf{B}_{{\bar{\pi }}}^{\top }{} \mathbf{f},\mathbf{v}_{[8,2], \lambda }\rangle \) are the Laplacian eigenvectors of the module \(V_{[8,2]}\), the first two of which we show in Fig. 22c, d. The coefficients \(\{\alpha _{[8,2],0.2047,{\bar{\pi }}}\}\) associated with the eigenvector in Fig. 22c capture a notion of pairwise proximity. Specifically, a large positive coefficient indicates the two items/candidates grouped by the set partition \({\bar{\pi }}_z\) are likely to be close to the first two or last two ranking positions, suggesting they may have similar features or belong to the same political party in the case of an election. For the sushi preference data, as shown in Fig. 19, the two largest positive coefficients \({\bar{c}}_{[8,2]} \langle \mathbf{B}_{{\bar{\pi }}}^{\top }{} \mathbf{f},\mathbf{v}_{[8,2], \lambda }\rangle \) are the 1.3304 associated with the set partition \({\bar{\pi }}=\{12456790|38\}\), which groups the two overall favorites fatty tuna and tuna together, and the 1.6543 associated with \({\bar{\pi }}=\{12345689|70\}\), which groups the two overall least favorites—and the only two vegetarian options—egg and cucumber roll together. In ranked voting elections where there are more than two candidates in the same political party, we might expect to see large positive coefficients on these \([n-2,2]\) analysis coefficients associated with each of the pairs of candidates from the same party. A negative coefficient with large magnitude, on the other hand, indicates the two items are frequently ranked at opposite ends. 
In the sushi data, the most negative coefficient \({\bar{c}}_{[8,2]} \langle \mathbf{B}_{{\bar{\pi }}}^{\top }{} \mathbf{f},\mathbf{v}_{[8,2], \lambda }\rangle \) is the \(-1.7150\) associated with the set partition \({\bar{\pi }}=\{12345679|80\}\), which groups the overall favorite (fatty tuna) and overall least favorite (cucumber roll) items together.
Let us now consider the set partition \({\bar{\pi }}=\{34567890|12\}\), for which the corresponding projection onto \({\mathbb {P}}_{[8,2]}\) is shown in Fig. 21d. Of the 45 analysis coefficients \(\{\alpha _{[8,2],0.2047,{\bar{\pi }}}\}\), \(\alpha _{[8,2],0.2047,\{34567890|12\}}=-0.0231\) is the third smallest in magnitude, and of the 45 analysis coefficients \(\{\alpha _{[8,2],0.4700,{\bar{\pi }}}\}\), \(\alpha _{[8,2],0.4700,\{34567890|12\}}=-0.0088\) is the smallest in magnitude. There are two reasons these coefficients are closer to 0. First, the energy of the projection \(\mathbf{B}_{\{34567890|12\}}^{\top }{} \mathbf{h}\) in Fig. 21d is more evenly spread across the vertices of \({\mathbb {P}}_{[8,2]}\) than the other projections in Fig. 21a–c even though they all have the same sum; thus, the norm \(\Vert \mathbf{B}_{\{34567890|12\}}^{\top }{} \mathbf{h}\Vert \) is smaller. Second, from a graph signal processing viewpoint, the energy \(\Vert \mathbf{B}_{\{34567890|12\}}^{\top }{} \mathbf{h}\Vert ^2\) decomposes across the Laplacian eigenspaces of the Schreier graph \({\mathbb {P}}_{[8,2]}\). In this case, most of that energy falls into the one-dimensional spaces spanned by the eigenvectors shown in Fig. 22a, b. In addition to being Laplacian eigenvectors of \({\mathbb {P}}_{[8,2]}\), these vectors can be viewed as liftings of eigenvectors from the module \(V_{[9,1]}\) to \({\mathbb {P}}_{[8,2]}\). More formally and generally, if \(\nu \vartriangleright \gamma \), then for any \(\xi \in \bar{\Pi }_\nu \) and \(\pi \in \bar{\Pi }_\gamma \), we define the linear mapping \(\mathbf{T}_{\xi ,\pi }: {\mathbb {R}}[\Pi _\nu ] \rightarrow {\mathbb {R}}[\Pi _\gamma ]\) by \(\mathbf{T}_{\xi ,\pi }{} \mathbf{x}:=\mathbf{B}_{\pi }^{\top }{} \mathbf{B}_{\xi }{} \mathbf{x}\) (see Proposition 8). 
If \(K_{\gamma ,\nu }=1\), then \(\frac{\mathbf{T}_{\xi ,\pi }{} \mathbf{x}}{\Vert \mathbf{T}_{\xi ,\pi }{} \mathbf{x}\Vert }= \pm \frac{\mathbf{T}_{\xi ^\prime ,\pi ^\prime }{} \mathbf{x}}{\Vert \mathbf{T}_{\xi ^\prime ,\pi ^\prime }{} \mathbf{x}\Vert }\) for all \(\xi ,\xi ^\prime \in \bar{\Pi }_\nu \), \(\pi ,\pi ^\prime \in \bar{\Pi }_\gamma \), and \(\mathbf{x} \in V_\nu \), and we identify all of these transformations under the notation \(\mathbf{T}_{\nu ,\gamma }\).
Next, we examine the ordered second-order marginals, discussed in Sect. 3.1 and captured in the \([n-2,1,1]\) isotypic component. Returning to the negative coefficient \(\alpha _{[8,2],0.2047,\{12345679|80\}}=-1.7150\), potential causes for the large magnitude could be that (i) the first selected candidate (8) is commonly ranked first and the second selected candidate (10) is commonly ranked last, (ii) the first selected candidate is commonly ranked last and the second selected candidate is commonly ranked first, or (iii) both candidates are polarizing, but ranked at opposite ends of the spectrum by different sets of voters. Which other coefficients provide structural information that can be combined with \(\alpha _{[8,2],0.2047,\{12345679|80\}}\) to inform which of these scenarios might be occurring? In this case, we know from the coefficients \(\alpha _{[9,1],0.0979,\{123456789|0\}}=-2.1513\) and \(\alpha _{[9,1],0.0979,\{123456790|8\}}=1.9978\) that item 8 (fatty tuna) is the popular one and item 10 (cucumber roll) is the unpopular one. More generally, we can examine the coefficient \(\alpha _{[8,1,1],0.4799,\{12345679|8|0\}}\), which is equal to 1.3471 in this case. From the eigenvector \(\mathbf{v}_{[8,1,1], 0.4799}\) in Fig. 23b, we see that a positive value implies the first selected candidate is more often ranked towards the top (the case here), a negative value implies the second selected candidate is more often ranked towards the top, and a coefficient of small magnitude indicates that both candidates are roughly evenly located at the two ends [scenario (iii) above].
An analysis coefficient associated with the first eigenvector of the module \(V_{[n-2,1,1]}^*\) can also provide some insight into the interpretation of a large positive analysis coefficient associated with the first eigenvector of the module \(V_{[n-2,2]}^*\). Namely, a positive value of the coefficient indicates the second selected candidate is more popular than the first, and vice versa. For example, consider the pairs of items 7/10 and 9/10. From the fact that \(\alpha _{[8,2],0.2047,\{12345689|70\}}=1.6543\) and \(\alpha _{[8,2],0.2047,\{12345678|90\}}=0.7055\) are positive, we conclude these pairs of items often appear together at the top or bottom of the rankings. We know from the fact that the coefficients \(\alpha _{[9,1],0.0979,\{123456789|0\}}=-2.1513\), \(\alpha _{[9,1],0.0979,\{123456890|7\}}=-1.1896\), and \(\alpha _{[9,1],0.0979,\{123456780|9\}}=-0.3733\) are all negative that all three items are unpopular and these pairs therefore appear together more often at the bottom of the rankings. Is one item in each pair more likely to be ranked last? We see from the projections onto \({\mathbb {P}}_{[8,1,1]}\) in Fig. 24b, d that when ranked in the last two slots, item 10 is only slightly more likely than item 7 to be ranked last, but item 9 is much more frequently ranked higher than item 10 in these pairwise frequencies. This difference is reflected in the coefficients \(\alpha _{[8,1,1],0.4799,\{12345689|7|0\}}=-0.4050\) and \(\alpha _{[8,1,1],0.4799,\{12345678|9|0\}}=-0.9136\).
5.2.4 Shapes with \(\gamma _1< n-2\)
For subsequent shapes, we can interpret the analysis coefficients in a similar manner, as inner products between projections of the signal down to the Schreier graph and Laplacian eigenvectors \(\mathbf{v}_{\gamma ,\lambda }\) in \(V_\gamma ^*\). Once again, the most interpretable eigenvectors, and therefore the most informative analysis coefficients, are usually those associated with lower eigenvalues (i.e., the first couple from each new irreducible \(V_\gamma ^*\)).
As a final example, we consider the analysis coefficients \(\alpha _{[7,3],\lambda ,{\bar{\pi }}}\) generated by the values \(\lambda \) equal to 0.3227, 0.5660, and 0.8122 (the associated eigenvectors of which are shown in the bottom row of Fig. 26) and the set partitions \({\bar{\pi }}\) equal to \(\{1234790|568\}\) and \(\{3567890|124\}\). The first projection of the data in Fig. 25, corresponding to \({\bar{\pi }}=\{1234790|568\}\), shows that items 5, 6, and 8 (sea urchin, salmon roe, and fatty tuna) most frequently occur in the top three rankings or rankings 1, 2, and 4, but it is also common that voters ranked two of the three at the top and one at the bottom, or one at the top and two at the bottom. Due to the high concentration of the projection on the vertices of the Schreier graph \({\mathbb {P}}_{[7,3]}\) labeled by 123 and 124, the inner products with the three eigenvectors in Fig. 26d–f have the 2nd, 7th, and 10th largest magnitudes of the 9000 analysis coefficients \(\{\alpha _{[7,3],\lambda ,{\bar{\pi }}}\}_{\lambda \in \Lambda _{[7,3]},{\bar{\pi }}\in {\bar{\Pi }}_{[7,3]}}\), and the 2nd, 1st, and 1st largest magnitudes of the 120 analysis coefficients in each of the subsets \(\{\alpha _{[7,3],0.3227,{\bar{\pi }}}\}_{{\bar{\pi }}\in {\bar{\Pi }}_{[7,3]}}\), \(\{\alpha _{[7,3],0.5660,{\bar{\pi }}}\}_{{\bar{\pi }}\in {\bar{\Pi }}_{[7,3]}}\), and \(\{\alpha _{[7,3],0.8122,{\bar{\pi }}}\}_{{\bar{\pi }}\in {\bar{\Pi }}_{[7,3]}}\), respectively. That is to say, there is a strong third order effect of these three items being ranked together and highly, net of the first order and second order effects, which are captured, e.g., in the lifted eigenvectors in Fig. 26a–c.
The second projection of the data in Fig. 25 captures the joint ranking positions of three sushi items with cooked fish: 1, 2, and 4 (shrimp, sea eel, and squid). Our suspicion that these three items would be ranked similarly by many voters is confirmed by the projection \(\mathbf{B}_{\{3567890|124\}}^{\top }\mathbf{h}\), which shows the items are most commonly ranked in slots 4/5/6, 5/6/7, or 6/7/8. Despite the frequent closeness of the rankings of these three items, the magnitudes of the inner products with the three eigenvectors in Fig. 26d–f are quite small (all less than 0.075 and ranking 88th, 72nd, and 52nd out of the magnitudes of the analysis coefficients associated with 120 different set partitions for the respective eigenvalues). The reason for this is that the projection \(\mathbf{B}_{\{3567890|124\}}^{\top }{} \mathbf{h}\) is quite close (up to a sign change) to \(\mathbf{T}_{[9,1],[7,3]}{} \mathbf{v}_{[9,1], 0.3820}\) in Fig. 26b, and the eigenvectors in Fig. 26d–f are orthogonal to \(\mathbf{T}_{[9,1],[7,3]}{} \mathbf{v}_{[9,1], 0.3820}\). That is, the third order effect of these three cooked fish sushi items being ranked together is weak once the first order effects found in the [9, 1] shape have been accounted for.
6 Computationally Efficient Algorithms
In this section, we develop efficient algorithms for the proposed transform, and discuss details of our openly available software that implements the transform and helps users find structure in ranked data by visualizing the analysis coefficients and/or by projecting the data into lower-dimensional spaces that can be visualized.
What are the initial computational challenges? First, naively computing the eigenvectors of \({\mathbb {P}}_n\) is not feasible for n above 7 or 8, as it has complexity \({{\mathcal {O}}}([n!]^3)\). Second, even naively computing the eigenvectors of the required Schreier graphs is not feasible (complexity \({{\mathcal {O}}}\left( \left[ \frac{n!}{\lceil n/2 \rceil !}\right] ^3\right) \)). Third, any method that explicitly computes and stores all dictionary atoms of length n! on the permutahedron in order to take inner products with the ranked data will quickly run into memory issues as n grows. Fourth, the number of dictionary atoms in \({{\mathcal {D}}}\) from Theorem 1 is \(\sum _{\gamma \vdash n} d_{\gamma } m_{\gamma }\), which grows faster than n!; i.e., the redundancy of the dictionary increases as n increases.
In order to circumvent these issues, we need specialized algorithms (i.e., not standard signal processing or numerical linear algebra techniques) that take advantage of the symmetry and structure present in the permutahedron.
Our general approach is to include as much of the computation as possible into a setup portion that is independent of the data and can therefore be performed just once, offline and ahead of time, for each n. This setup portion consists of three phases: the dynamic construction of the adjacency matrix and the characteristic matrix \(\mathbf{B}_\gamma \) for each Schreier graph \({\mathbb {P}}_\gamma \) (Sect. 6.1), the computation of the Laplacian eigenvalues and eigenvectors of the Schreier graphs (Sect. 6.2), and the computation of a path from each reading-order set partition to every other vertex in the Schreier graph (Sect. 6.3). With all of this information stored for a given n, only the data-dependent analysis portion of the code (Sect. 6.3) needs to be executed for each new ranked data vector \(\mathbf{f}\).
6.1 Construction of the Schreier Adjacency Matrices and a Lifting Matrix from Each Schreier to the Permutahedron
The Schreier graphs for \({\mathbb {S}}_n\) can be constructed in an iterative dynamic manner from those for \({\mathbb {S}}_{n-1}\). If \(\gamma = [\gamma _1, \gamma _2,\ldots , \gamma _\ell ]\vdash n\), then for each \(1\le i\le \ell \), let \(\Pi _{\gamma ,i} \subseteq \Pi _\gamma \) be the subset of ordered set partitions \(\mu \) with n in the ith row of \(\mu \) and let \(\mathbb {P}_{\gamma ,i}\) be the subgraph of \(\mathbb {P}_\gamma \) induced by \(\Pi _{\gamma ,i}\). In this way, \(\mathbb {P}_\gamma \) partitions into a disjoint union of subgraphs \(\{\mathbb {P}_{\gamma ,i}\}_{i=1}^\ell \) with edges between vertices in different subgraphs as follows: \(\mu \in \mathbb {P}_{\gamma ,i}\) is connected to \(\nu \in \mathbb {P}_{\gamma ,j}\) \((i\ne j)\) by an edge labeled by \((n-1,n)\) if and only if \(\mu = (n-1,n)(\nu )\). Note that \(\mathbb {P}_{\gamma ,i}\cong \mathbb {P}_{\gamma '}\) where \(\gamma '\) is the integer partition given by subtracting 1 from \(\gamma _i\) and, if necessary, sorting the parts so they are nonincreasing. For example, Fig. 27 gives the decomposition of \({\mathbb {P}}_{[4,1,1]}\) into \({\mathbb {P}}_{[3,1,1]},\) \({\mathbb {P}}_{[4,0,1]},\) and \({\mathbb {P}}_{[4,1,0]}\) with \({\mathbb {P}}_{[4,0,1]}\cong {\mathbb {P}}_{[4,1,0]}\cong {\mathbb {P}}_{[4,1]}\). This decomposition allows one to construct the Schreier graphs of \(\mathbb {S}_n\) dynamically from those for \(\mathbb {S}_{n-1}\). Since the permutahedron is a Schreier graph, \(\mathbb {P}_n = \mathbb {P}_{[1,1,\ldots ,1]}\), we also construct it dynamically using this method.
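To make this construction concrete, the Schreier graph \({\mathbb {P}}_\gamma \) can also be built directly from its definition for small n. The following standalone Python sketch (illustrative only; our released implementation is in MATLAB and uses the dynamic construction described above) encodes each ordered set partition as the vector of row assignments of the elements and connects two vertices when an adjacent transposition maps one to the other:

```python
import itertools
import numpy as np

def schreier_graph(gamma):
    """Adjacency matrix of the Schreier graph P_gamma, built directly from the
    definition: vertices are ordered set partitions of {1,...,n} with row sizes
    gamma (encoded as the tuple of row indices of elements 1,...,n), with an
    edge when an adjacent transposition (i, i+1) moves the vertex."""
    n = sum(gamma)
    rows = [r for r, size in enumerate(gamma) for _ in range(size)]
    verts = sorted(set(itertools.permutations(rows)))  # row-assignment vectors
    index = {v: j for j, v in enumerate(verts)}
    A = np.zeros((len(verts), len(verts)))
    for v in verts:
        for i in range(n - 1):  # generator (i+1, i+2) swaps the rows of elements i, i+1
            w = list(v)
            w[i], w[i + 1] = w[i + 1], w[i]
            w = tuple(w)
            if w != v:
                A[index[v], index[w]] = 1
    return verts, A

verts, A = schreier_graph([2, 2])
```

For \(\gamma =[2,2]\) this yields the 6-vertex graph \({\mathbb {P}}_{[2,2]}\), and restricting to the three vertices with element 4 in a fixed row recovers the subgraphs \({\mathbb {P}}_{[1,2]}\cong {\mathbb {P}}_{[2,1]}\) from the decomposition described above.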
Let \(\mathbf {B}_\gamma := \mathbf {B}_{\pi _1}\) denote the characteristic matrix corresponding to the reading ordered set partition \(\pi _1\in \Pi _\gamma \), which is a matrix that lifts Schreier eigenvectors from the Schreier \({\mathbb {P}}_\gamma \) to the permutahedron \({\mathbb {P}}_n\). Then \(\mathbf {B}_\gamma \) has a recursive structure that respects the decomposition of \(\mathbb {P}_\gamma \), and, as we show in Sect. 6.3, this is the only lifting matrix that we need to compute for shape \(\gamma \). If \(\gamma = [\gamma _1, \ldots , \gamma _\ell ] \vdash n\), then for \(1 \le i \le n\), let \({\mathbb {S}}_{n,i} := \{\sigma \in {\mathbb {S}}_n \mid \sigma (n) = i\}\). The decomposition of \(\mathbf {B}_\gamma \) is given in Proposition 7 and illustrated in Example 1.
Proposition 7
If \(\gamma = [\gamma _1, \ldots , \gamma _\ell ] \vdash n,\) then the characteristic matrix \(\mathbf {B}_\gamma \) decomposes into block sub-matrices: the block whose rows are indexed by \({\mathbb {S}}_{n,i}\) and whose columns are indexed by \(\Pi _{\gamma ,j}\) equals \(\mathbf {B}_{\gamma (j)}\) if i is in the jth row of \(\pi _1\), and is the zero matrix otherwise, where \(\gamma (j)\) is obtained from \(\gamma \) by subtracting 1 from \(\gamma _j\).
Proof
For \(\sigma \in {\mathbb {S}}_n\) and \(\mu \in \Pi _\gamma \), the \((\sigma ,\mu )\)-entry of \(\mathbf {B}_\gamma \) equals 1 if and only if \(\sigma (\mu ) = \pi _1\). If n is in the jth block of \(\mu \) and \(\sigma (n) = i\), then this entry can be nonzero only if i is in the jth row of \(\pi _1\). The nonzero blocks equal \(\mathbf {B}_{\gamma (j)}\) by definition (after ignoring n). \(\square \)
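The defining condition in the proof is easy to instantiate for a small shape. The following standalone Python sketch (illustrative only; not part of our released MATLAB code) builds \(\mathbf {B}_{[2,2]}\) directly from the condition \(\sigma (\mu )=\pi _1\) and checks two consequences of the definition: each row contains a single 1, and \(\mathbf {B}_\gamma ^\top \mathbf {B}_\gamma \) equals \(|{\mathbb {S}}_{\pi _1}|\) times the identity (here \(|{\mathbb {S}}_{\pi _1}|=2!\,2!=4\)):

```python
import itertools
import numpy as np

n = 4
gamma = [2, 2]
perms = list(itertools.permutations(range(n)))     # sigma[i] = image of element i
rows = [r for r, size in enumerate(gamma) for _ in range(size)]
parts = sorted(set(itertools.permutations(rows)))  # ordered set partitions, as row vectors

def act(sigma, mu):
    # left action on ordered set partitions: element j moves to sigma[j], so
    # the row of k in sigma(mu) is the row of sigma^{-1}(k) in mu
    inv = [0] * n
    for i, s in enumerate(sigma):
        inv[s] = i
    return tuple(mu[inv[k]] for k in range(n))

pi1 = tuple(rows)  # reading-order set partition {12|34}
# the (sigma, mu) entry of B_gamma is 1 iff sigma(mu) = pi1
B = np.array([[1.0 if act(s, mu) == pi1 else 0.0 for mu in parts] for s in perms])
```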
Example 1
In Fig. 28, we show the recursive structure of the characteristic matrix \(\mathbf {B}_{[2,2]}\), which corresponds to the ordered set partition \(\pi _1 = \{12\vert 34\}\) (compare with Fig. 14, which gives this same matrix with the permutations in lexicographic order).
Similarly, the recursive structure of \(\mathbf {B}_{[4,1,1]}\), which corresponds to the ordered set partition \(\pi _1 = \{1234\vert 5\vert 6\}\), is given by
6.2 Computation of the Schreier Eigenvalues and Eigenvectors Via Deflation
Frame vectors \({\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}}\) of eigenvalue \(\lambda \) and shape \(\gamma \) are computed as lifts from the \(\lambda \)-eigenspace of the Schreier graph corresponding to the irreducible component \(V_\gamma ^*\) inside of \({\mathbb {R}}[\Pi _\gamma ]\). The multiplicity of \(V_\gamma ^*\) in \({\mathbb {R}}[\Pi _\gamma ]\) is equal to one, and the other irreducible components \(V_\nu ^*\) that appear in \({\mathbb {R}}[\Pi _\gamma ]\) correspond to shapes \(\nu \) with \(\nu \triangleright \gamma \) in dominance order. See (13) and Fig. 29. By computing the Laplacian eigenvectors in any order that respects dominance, we will have already computed the eigenvectors \(\mathbf{v}_{\nu ,\varrho ,k}\) for \(\nu \triangleright \gamma \). To single out \(V_\gamma ^*\) inside \({\mathbb {R}}[\Pi _\gamma ]\), we lift the Laplacian eigenvectors from \({\mathbb {P}}_\nu \) up to \({\mathbb {P}}_\gamma \), and find the eigenvectors associated with the eigenvalues of \(V_\gamma ^*\) in the orthogonal complement of the span of these lifts.
Interesting new theory is needed here: the multiplicity of \(V_\nu ^*\) in \({\mathbb {R}}[\Pi _\gamma ]\) equals the Kostka number \(K_{\gamma ,\nu }\), and therefore we need to lift the eigenvectors of shape \(\nu \) in \(K_{\gamma ,\nu }\) linearly independent ways to \({\mathbb {R}}[\Pi _\gamma ]\). In Proposition 8, we show that lifting and projecting with set partitions constructed from “column-strict” tableaux agrees with Young’s rule for decomposing \({\mathbb {R}}[\Pi _\gamma ]\) into irreducible modules with multiplicity \(K_{\gamma ,\nu }\). Therefore, using mappings that come from different column-strict tableaux gives us linearly independent images in \({\mathbb {R}}[\Pi _\gamma ]\), as needed.
The Kostka number \(K_{\gamma ,\nu }\) equals the number of column-strict tableaux of shape \(\nu \) and content \(\gamma \); that is, tableaux constructed by filling the boxes of \(\nu \) with \(\gamma _1\) ones, \(\gamma _2\) twos, and so on, such that the rows weakly increase and the columns strictly increase. For example, the \(K_{\gamma ,\nu } = 3\) column-strict tableaux \(T_1, T_2, T_3\) of shape \(\nu =[5,4]\) and content \(\gamma =[4,2,2,1]\) are shown in Fig. 30. Note that from this definition, \(K_{\gamma ,\nu } =0\) if \(\nu \not \trianglerighteq \gamma \). If T is a column-strict tableau of shape \(\nu \) and content \(\gamma \), then define \(\xi _T \in \Pi _\gamma \) to be the set partition with j in row r if the jth box of T, read in reading order, contains r. By reading order, we mean left-to-right across the rows from top to bottom. The set partitions \(\{\xi _{T_1}, \xi _{T_2},\xi _{T_3}\}\) are shown in Fig. 30. For example, \(\xi _{T_2}\) has 5 in row 3 because the 5th box of \(T_2\) contains 3.
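For small shapes, these Kostka numbers can be computed by brute-force enumeration. The following Python sketch (illustrative only; not part of our released MATLAB code) counts column-strict fillings directly and reproduces the value \(K_{\gamma ,\nu }=3\) from the example above:

```python
from itertools import permutations

def kostka(shape, content):
    """Count column-strict tableaux of the given shape and content by brute
    force: rows must weakly increase left to right, and columns must strictly
    increase top to bottom."""
    entries = [r + 1 for r, mult in enumerate(content) for _ in range(mult)]
    count = 0
    for filling in set(permutations(entries)):  # distinct arrangements of the multiset
        rows, start = [], 0
        for size in shape:
            rows.append(filling[start:start + size])
            start += size
        ok = all(row[j] <= row[j + 1] for row in rows for j in range(len(row) - 1))
        ok = ok and all(rows[i][j] < rows[i + 1][j]
                        for i in range(len(rows) - 1)
                        for j in range(len(rows[i + 1])))
        if ok:
            count += 1
    return count
```

For instance, `kostka([5, 4], [4, 2, 2, 1])` returns 3, matching the three tableaux \(T_1, T_2, T_3\) of Fig. 30.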
Proposition 8
If \(\pi _1 \in \Pi _\nu \) is the reading-order set partition and \(\{\xi _i\}_{i=1}^{K_{\gamma ,\nu }} \subseteq \Pi _\gamma \) are the set partitions corresponding to the column-strict tableaux \(\{T_i\}_{i=1}^{K_{\gamma ,\nu }}\) of shape \(\nu \) and content \(\gamma ,\) then the matrices \(\mathbf{T}_{\xi _i,\pi _1} = \mathbf{B}_{\xi _i}^\top \mathbf{B}_{\pi _1},\) \(1 \le i \le K_{\gamma ,\nu },\) give \({\mathbb {S}}_n\)-module isomorphisms between \(V_\nu ^*\subseteq {\mathbb {R}}[\Pi _\nu ]\) and \(K_{\gamma ,\nu }\) linearly independent copies \(\{V_{\nu ,i}^*\}_{i=1}^{K_{\gamma ,\nu }}\) of \(V_\nu ^*\) in \({\mathbb {R}}[\Pi _\gamma ]\).
Proof
The transformation \(\mathbf{T}_{\xi _i,\pi _1}\) is a right \({\mathbb {S}}_n\)-module homomorphism, since both \(\mathbf{B}_{\xi _i}\) and \(\mathbf{B}_{\pi _1}\) are. We will show that \(\mathbf{T}_{\xi _i,\pi _1} = c_i \Theta _{T_i}\) for a nonzero scalar \(c_i\), where \(\Theta _{T_i}\) is the \({\mathbb {S}}_n\)-module isomorphism given by Young’s rule defined in [83, Definition 2.9.3]. These transformations \(\Theta _{T_i}\) are known to give \(K_{\gamma ,\nu }\) linearly independent isomorphisms from \(V_\nu ^*\subseteq {\mathbb {R}}[\Pi _\nu ]\) to \(V_{\nu ,i}^*\subseteq {\mathbb {R}}[\Pi _\gamma ]\) by [83, Theorem 2.11.2].
Since \(\mathbf{T}_{\xi _i,\pi _1}\) is a module homomorphism, and \({\mathbb {S}}_n\) acts transitively on \(\Pi _\nu \), it is sufficient to show that they are equal at \({\mathbf {e}}_{\pi _1}\). Let \({\mathbb {S}}_{\pi _1} = \{ \sigma \in {\mathbb {S}}_n \mid \sigma (\pi _1) = \pi _1\}\) be the stabilizer subgroup of \(\pi _1\). Then
Thus the coefficient of \({\mathbf {e}}_\mu \) in \(\mathbf{B}_{\xi _i}^\top \mathbf{B}_{\pi _1} {\mathbf {e}}_{\pi _1}\) is \(c_{\mu ,\xi _i,\pi _1} = \# \{\sigma \in {\mathbb {S}}_n \mid \sigma (\xi _i) = \mu , \sigma (\pi _1) = \pi _1\}\). This set is a coset of the stabilizer of \((\xi _i,\pi _1)\) under the action of \({\mathbb {S}}_n\) on \(\Pi _\gamma \times \Pi _\nu \), so its cardinality is the same for every \(\mu \) in the orbit; we denote this common value (the size of the stabilizer) by \(c_{\xi _i}\). By comparing with the action of \(\Theta _{T_i}\) in [83, Definition 2.9.3] (identifying ordered set partitions here with “tabloids” there), we have that \(\mathbf{T}_{\xi _i,\pi _1} = c_{\xi _i} \Theta _{T_i}\), where \(\xi _i\) is the set partition filled according to \(T_i\). \(\square \)
To summarize, from the structure of the Schreier graphs, we know that the graph Laplacian of the Schreier graph \({\mathbb {P}}_\gamma \) has the following spectral decomposition:
Thus, when the Laplacian eigenvectors of the Schreier graphs that precede \({\mathbb {P}}_\gamma \) in dominance order have already been computed, to find an orthonormal eigenbasis for \(V_\gamma ^*\), it suffices to diagonalize the rank \(d_\gamma \) matrix
as opposed to the rank \(m_\gamma \) matrix \({\varvec{\mathcal {L}}}_{{\mathbb {P}}_\gamma }\). The complexity of forming and computing the eigendecomposition of the matrix in (25) is \({{\mathcal {O}}}(m_\gamma d_\gamma ^2 + m_\gamma ^2(\sum _{\nu \vartriangleright \gamma } d_\nu K_{\gamma ,\nu }))\).
6.3 Efficient Computation of the Analysis Coefficients
As detailed in (23), the analysis coefficients associated with shape \(\gamma \) can be computed either by lifting each eigenvector \(\mathbf{v}_{\gamma ,\lambda ,k}\) up to the permutahedron in \(z_\gamma \) different ways and taking the inner product of each with the signal \(\mathbf{f}\), or by projecting the signal down to the Schreier graph in \(z_\gamma \) different ways and taking the inner product between each of the projections and each eigenvector \(\mathbf{v}_{\gamma ,\lambda ,k}\). Since the characteristic matrix used to lift or project between the Schreier graph \({\mathbb {P}}_\gamma \) and the permutahedron \({\mathbb {P}}_n\) is a sparse matrix with n! entries equal to 1 (one per row) and the remainder equal to 0, the respective complexities of these two approaches are \({{\mathcal {O}}}(n!d_\gamma z_\gamma )\) and \({{\mathcal {O}}}(z_\gamma (n!+d_\gamma m_\gamma ))\). However, the more problematic issue with both of these approaches is the memory required to store all \(z_\gamma \) characteristic matrices \(\mathbf {B}_{{\bar{\pi }}}\) associated with each shape \(\gamma \), which is \(4n!z_\gamma \) bytes (e.g., storing these \(\mathbf {B}_{{\bar{\pi }}}\) matrices for all \({\bar{\pi }}\in {\bar{\Pi }}_{[4,3,2,1]}\) alone would require approximately 383 GB).
To reduce the required memory, we use the following method which only requires storing the single characteristic matrix \(\mathbf{B}_\gamma =\mathbf{B}_{\pi _1}\) associated with the reading-ordered set partition \(\pi _1\) for each shape \(\gamma \); these matrices are computed dynamically, as detailed in Sect. 6.1. For each \({\bar{\pi }}\in {\bar{\Pi }}_\gamma \), we let \(\sigma \in {\mathbb {S}}_n\) be a permutation such that \(\sigma (\pi _1) = {\bar{\pi }}\). Then we have
where the third equality follows from Proposition 4 and the fourth equality follows from the orthogonality of \(\rho _L(\sigma )\). Therefore, to compute the analysis coefficient \(\langle \mathbf{f},{\varvec{\varphi }}_{\gamma ,\lambda ,k} \rangle \) we can compute instead \(\langle \mathbf{B}^{\top }_{{\pi }_1} \rho _L(\sigma ^{-1}) \mathbf{f}, \mathbf{v}_{\gamma ,\lambda ,k} \rangle \), which amounts to reordering the data vector \(\mathbf{f}\) by \(\sigma ^{-1}\), projecting it down to the Schreier graph using a single matrix \(\mathbf{B}_{\pi _1}\), and then taking the inner product with \(\mathbf{v}_{\gamma ,\lambda ,k}\).
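This reorder-then-project identity can be checked numerically for small n. The following self-contained Python sketch (illustrative only; independent of our released MATLAB code) verifies, for \(n=4\) and \(\gamma =[2,2]\), that projecting with \(\mathbf {B}_{{\bar{\pi }}}\) agrees with reordering the data by \(\sigma ^{-1}\) and then projecting with the single matrix \(\mathbf {B}_{\pi _1}\):

```python
import itertools
import numpy as np

n = 4
perms = list(itertools.permutations(range(n)))     # sigma[i] = image of element i
rows = [0, 0, 1, 1]                                # shape gamma = [2, 2]
parts = sorted(set(itertools.permutations(rows)))  # ordered set partitions

def act(sigma, mu):
    # left action: the row of k in sigma(mu) is the row of sigma^{-1}(k) in mu
    inv = [0] * n
    for i, s in enumerate(sigma):
        inv[s] = i
    return tuple(mu[inv[k]] for k in range(n))

def B(pi):
    # characteristic matrix for the ordered set partition pi
    return np.array([[1.0 if act(s, mu) == pi else 0.0 for mu in parts]
                     for s in perms])

pi1 = tuple(rows)                                  # reading-order set partition
sigma0 = perms[10]                                 # any permutation works here
pibar = act(sigma0, pi1)

rng = np.random.default_rng(0)
f = rng.standard_normal(len(perms))
# reordered data: (rho_L(sigma0^{-1}) f)(tau) = f(sigma0 tau)
reorder = [perms.index(tuple(sigma0[t] for t in tau)) for tau in perms]
f_re = f[reorder]

# projecting with B_pibar equals reordering and projecting with B_pi1
assert np.allclose(B(pibar).T @ f, B(pi1).T @ f_re)
```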
The permutation \(\sigma ^{-1}\) is recorded as the product of a minimal sequence of adjacent transpositions, and the reorderings are computed by sequentially applying these transpositions. Thus, the third and final phase of the setup portion consists of computing \(\sigma ^{-1}\) for each shape \(\gamma \) and every \({\bar{\pi }}\in {\bar{\Pi }}_\gamma \) by constructing a path in \({\mathbb {P}}_\gamma \) from the reading-order set partition \(\pi _1\in {\bar{\Pi }}_\gamma \) to \({\bar{\pi }}\in {\bar{\Pi }}_\gamma \); that is, a sequence (\(\pi _1,\pi _2,\ldots ,\pi _r\)) such that \(\pi _{i+1} = (j_i,j_i+1)(\pi _i)\) for \(i=1,\ldots ,r-1\) and \(\pi _r = {\bar{\pi }}\). Thus, paths give the sequence of adjacent transpositions that transform \(\pi _1\) into \({\bar{\pi }}\). A path is minimal if there is no path in \({\mathbb {P}}_\gamma \) from \(\pi _1\) to \({\bar{\pi }}\) with fewer edges (equivalently, no way to transform \(\pi _1\) into \({\bar{\pi }}\) with fewer swaps). Minimal paths can be constructed via a breadth-first search [10]. One example of a minimal path constructed in this manner is shown in Fig. 27. Moreover, all minimal paths to \({\bar{\pi }} \in {\bar{\Pi }}_\gamma \) live entirely in \({\bar{\Pi }}_\gamma \). To see this, we use the fact that the length of a minimal path to \(\pi \in \Pi _\gamma \) equals the number of inversions in \(\pi \), where an inversion is a pair (i, j) with \(i < j\) and j in a higher row than i in \(\pi \) (see [3] for a proof). Thus, the inversion number increases at each step of a minimal path from \(\pi _1\) to \(\pi \), so a minimal path can never leave \({\bar{\Pi }}_\gamma \) and return to it, as doing so would require the number of inversions to decrease. So for each \({\bar{\pi }}\in \bar{\Pi }_\gamma \), we pre-compute and save the (length n!) permutation that rearranges \(\mathbf{f}\) into \(\rho _L(\sigma ^{-1}) \mathbf{f}\).
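The breadth-first search and the inversion-number characterization of minimal path lengths can be illustrated on a small example. This Python sketch (illustrative only; not our released MATLAB code) runs a BFS on \({\mathbb {P}}_{[2,2]}\) from the reading-order set partition and confirms that the distance to each vertex equals its number of inversions:

```python
import itertools
from collections import deque

n = 4
rows = [0, 0, 1, 1]                                # shape gamma = [2, 2]
verts = sorted(set(itertools.permutations(rows)))  # vertices as row-assignment vectors
pi1 = tuple(rows)                                  # reading-order set partition

def neighbors(v):
    for i in range(n - 1):                         # adjacent transposition (i+1, i+2)
        w = list(v)
        w[i], w[i + 1] = w[i + 1], w[i]
        w = tuple(w)
        if w != v:
            yield w

# breadth-first search from pi1 records a shortest path length to every vertex
dist = {pi1: 0}
queue = deque([pi1])
while queue:
    v = queue.popleft()
    for w in neighbors(v):
        if w not in dist:
            dist[w] = dist[v] + 1
            queue.append(w)

def inversions(v):
    # pairs (i, j) with i < j and element j in a higher (earlier) row than element i
    return sum(1 for i in range(n) for j in range(i + 1, n) if v[j] < v[i])

assert all(dist[v] == inversions(v) for v in verts)
```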
These permutations are constructed by first forming the permutation corresponding to each of the \(n-1\) adjacent swaps. We then perform a tree traversal on the Schreier graph, composing adjacent swaps along the way to obtain the permutation corresponding to each Schreier vertex in \({\bar{\Pi }}_\gamma \). For each Schreier graph \({\mathbb {P}}_\gamma \), the complexity of computing these permutations is \({{\mathcal {O}}}(z_\gamma n!)\).
Finally, in the analysis portion of the code (the only part that is dependent on the data), for each \(\gamma \vdash n\) and each \({\bar{\pi }}\in \bar{\Pi }_\gamma \), the vector \(\mathbf{f}\) is reordered into \(\rho _L(\sigma ^{-1}) \mathbf{f}\) by the stored permutation and projected by \(\mathbf{B}^{\top }_{{\pi }_1}\) to the Schreier graph \({\mathbb {P}}_\gamma \), where its inner product with each of the Laplacian eigenvectors \(\{\mathbf{v}_{\gamma ,\lambda ,k}\}\) is computed and multiplied by the constant \({\bar{c}}_\gamma \). This portion has complexity \({{\mathcal {O}}}(z_\gamma (2n!+m_\gamma d_\gamma ))\) for each shape \(\gamma \).
Remark 6
The implementation as described thus far requires us to store \(z_\gamma \) permutation vectors of length n! for each shape, which can lead to memory issues as n grows (e.g., for \(n=10\), the permutation vectors associated with the liftings in \({\bar{\Pi }}_{[4,3,2,1]}\) require approximately 170 GB of memory). Thus, for larger n, we also implement an alternative version of the code that is more memory efficient. In this second version, we save only the permutations corresponding to the adjacent transpositions and the lists of adjacent transpositions leading from the reading-order set partitions to the other vertices of the Schreier graphs. During the analysis phase, we perform a tree traversal of each Schreier graph, and at each step, the data vector is permuted by the adjacent transposition corresponding to the edge in the tree. The rearranged vectors are then projected down to the corresponding Schreier graph, where the inner products with the Laplacian eigenvectors of \({\mathbb {P}}_\gamma \) are computed. While this variant is more efficient from a memory standpoint, the downside is that the work of computing the permutations that rearrange \(\mathbf{f}\) into each \(\rho _L(\sigma ^{-1}) \mathbf{f}\) from phase 3 of the setup portion now needs to be redone for each new data vector \(\mathbf{f}\).
6.4 Subsampling of the Dictionary Atoms
The total number of dictionary atoms in \({{\mathcal {D}}}\) from Theorem 1 is \(\sum _{\gamma \vdash n} d_{\gamma } m_{\gamma }\). We briefly mention three ways to reduce the number of atoms in order to improve the computational efficiency of applying the analysis operator.
First, we always use the less redundant dictionary \(\bar{{{\mathcal {D}}}}\) from Theorem 2, which provides exactly the same information but avoids identical atoms in order to reduce the total number of atoms to \(\sum _{\gamma \vdash n} d_{\gamma } z_{\gamma }\). However, this quantity still grows faster than n!, meaning that the redundancy of the dictionary \(\bar{{{\mathcal {D}}}}\) also increases as n increases.
Second, in many applications, the most relevant information lies in the isotypic components associated with the first handful of symmetry types and/or the Laplacian eigenspaces of the permutahedron associated with the lowest eigenvalues. In this case, we do not need to compute all of the analysis coefficients, which significantly reduces the overall complexity, as the computational bottlenecks lie in the symmetry types that appear later in the dominance order. If the energy decomposition onto each isotypic component is still desired, we can leverage the bipartite nature of the permutahedron by computing the inner products between the atoms associated with the transpose shape and the element-wise product of the signal and the sign vector of the permutahedron; for example,
where \({\bar{f}}(\sigma )=\mathsf {sign}(\sigma )f(\sigma )\). We denote by \({{\mathcal {F}}}_n\) the set of integer partitions of n that cannot be written as the transpose of a symmetry type that precedes it in lexicographic order (e.g., when \(n=10\), \({{\mathcal {F}}}_n\) contains 22 of the 42 symmetry types, which are shown below in Fig. 33).
Third, and most efficaciously, we can simply compute transform coefficients for atoms associated with the first k symmetry types, which again is often where the most interesting information resides. As an example, if \(n=10\), \(|{{\mathcal {D}}}|=\sum _{\gamma \vdash n} d_{\gamma } m_{\gamma }=419,571,370\) (redundancy factor of 115.6); \(|\bar{{{\mathcal {D}}}}|=\sum _{\gamma \vdash n} d_{\gamma } z_{\gamma }=44,711,456\) (10.7% of \(|{{\mathcal {D}}}|\), redundancy factor of 12.3); \(\sum _{\gamma \in {{\mathcal {F}}}_n} d_{\gamma } z_{\gamma }=18,004,348\) (40.3% of \(|\bar{{{\mathcal {D}}}}|\), redundancy factor of 5.0); there are \(\sum _{\gamma \in {{\mathcal {F}}}_n^8} d_{\gamma } z_{\gamma }=98,866\) atoms generated from the first \(k=8\) shapes (0.55% of \(\sum _{\gamma \in {{\mathcal {F}}}_n} d_{\gamma } z_{\gamma }\), redundancy factor of 0.03); and, finally, there are \(\sum _{\gamma \in {{\mathcal {F}}}_n^8} \min \{2,d_{\gamma }\} z_{\gamma }=1,821\) atoms generated from the first \(k=8\) shapes and up to two eigenvalues from each shape (1.8% of the previous quantity, redundancy factor of 0.0005). These values are shown in Fig. 31 for a wider range of n. Importantly, if we only plan to use atoms generated from the first k shapes in the analysis, we only need to compute the corresponding Schreier adjacency matrices, lifting matrices, paths, and eigendecompositions described in Sects. 6.1 and 6.2 for these shapes. We demonstrate the resulting computational savings in Sect. 6.5. Applications that could benefit from such a reduced transform with the top k shapes include lossy compression and machine learning problems, for which the reduced transform coefficients can serve as low-dimensional feature vectors for the high-dimensional ranked data.
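The dictionary sizes quoted above can be reproduced from the hook length formula for \(d_\gamma \) and the multinomial formula for \(m_\gamma \). The following Python sketch (illustrative only; not part of our released MATLAB code) computes \(|{{\mathcal {D}}}|=\sum _{\gamma \vdash n} d_\gamma m_\gamma \) and checks it against the classical identity \(\sum _{\gamma \vdash n} d_\gamma ^2 = n!\) and the \(n=10\) count reported above:

```python
from math import factorial, prod

def partitions(n, maxpart=None):
    """Generate the integer partitions of n in nonincreasing order."""
    maxpart = n if maxpart is None else maxpart
    if n == 0:
        yield []
        return
    for first in range(min(n, maxpart), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

def d(shape):
    """dim V_gamma = number of standard tableaux, via the hook length formula."""
    n = sum(shape)
    conj = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    hooks = prod((row - j) + (conj[j] - i) - 1        # arm + leg + 1
                 for i, row in enumerate(shape) for j in range(row))
    return factorial(n) // hooks

def m(shape):
    """dim R[Pi_gamma] = number of ordered set partitions of shape gamma."""
    return factorial(sum(shape)) // prod(factorial(g) for g in shape)

assert d([2, 2]) == 2 and d([3, 1]) == 3
assert sum(d(g) ** 2 for g in partitions(6)) == factorial(6)  # classical identity
shapes10 = list(partitions(10))
assert len(shapes10) == 42
assert sum(d(g) * m(g) for g in shapes10) == 419571370        # |D| for n = 10
```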
6.5 Computational Summary
To summarize, when we perform the transform using all atoms associated with the shapes in \({{\mathcal {F}}}_n\), phase 1 of the setup portion is negligible when compared to phases 2 and 3 of the setup and the computation of the analysis coefficients. Figure 32 shows the times required for our MATLAB implementations to perform the proposed transform over all shapes in \({{\mathcal {F}}}_n\), on a 2.3 GHz MacBook Pro laptop with 32GB of RAM. The code is not yet optimized in the sense that we have not precompiled any C/C++ subroutines into MEX functions. As an example, for \(n=9\) candidates (362,880 vertices or possible rankings), the main implementation of the code performs the offline computations in the three phases of the setup portion in approximately 43 s, and then computes the analysis coefficients for each data vector \(\mathbf{f}\) in approximately 26 s.
When we only compute the analysis coefficients associated with the first k symmetry types (\({{\mathcal {F}}}_n^k\)), the two main computational bottlenecks are computing the permutations that rearrange \(\mathbf{f}\) into \(\rho _L(\sigma ^{-1}) \mathbf{f}\) and computing the analysis coefficients, each of which has complexity \({{\mathcal {O}}}(z_\gamma n!)\) for shape \(\gamma \). The overall complexity is therefore \({{\mathcal {O}}}(n^{{\bar{m}}_k}n!)\), where \({\bar{m}}_k\) is defined to be the smallest value of m that results in \(\sum _{i=0}^m p(i) \ge k\), where p(i) is the number of integer partitions of i (with \(p(0)=1\)). For \(k=5,8,15,30\), \({\bar{m}}_k\) is equal to 3, 4, 5, 6, respectively. So, for the example above with the top \(k=8\) shapes, which includes all shapes in lexicographic order up to \([n-4,4]\) (i.e., where most of the useful information resides), the complexity is \({{\mathcal {O}}}(n^4 n!)={{\mathcal {O}}}(n! \log ^4(n!))\). With \(n=10\) and \(k=8\), which includes the computation of all of the largest magnitude coefficients in Fig. 19 for the sushi preference data, both the setup and analysis portions run in under one minute.
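The values of \({\bar{m}}_k\) can be checked with a few lines of Python (illustrative only); note that the running sum of partition numbers here starts at \(p(0)=1\), which reproduces the values of \({\bar{m}}_k\) quoted above:

```python
def npartitions(i):
    """Number of integer partitions p(i), with p(0) = 1, via a standard DP."""
    ways = [1] + [0] * i
    for part in range(1, i + 1):
        for total in range(part, i + 1):
            ways[total] += ways[total - part]
    return ways[i]

def mbar(k):
    """Smallest m with p(0) + p(1) + ... + p(m) >= k."""
    total, m = 0, -1
    while total < k:
        m += 1
        total += npartitions(m)
    return m

assert [mbar(k) for k in (5, 8, 15, 30)] == [3, 4, 5, 6]
```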
Finally, to synthesize a signal from the analysis coefficients (i.e., perform the inverse transform), we need to lift each Schreier Laplacian eigenvector back to the permutahedron once, reorder these vectors in different ways, and then take a linear combination of the reordered vectors:
where \(\alpha _{\gamma ,\lambda ,k,{\bar{\pi }}}=\langle \mathbf{f},{\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}} \rangle \).
7 Related Work, Revisited
7.1 Bases, Frames, and Interpretability
As discussed in Sect. 3.1, Diaconis [29, p. 955] remarks that there is not a natural choice of basis of the irreducible components \(V_{\gamma ,i}\) with interpretable basis elements, and circumvents this issue via Mallows’ method of projecting an overcomplete spanning set of interpretable functions (two of which are shown in Fig. 7) onto the isotypic component \(W_\gamma \). Interestingly, these projections also form a tight frame for \(W_\gamma \).
Proposition 9
For each \(\pi ,\xi \in \Pi _\gamma ,\) define \({\varvec{\delta }}_{\pi ,\xi } \in {\mathbb {R}}[{\mathbb {S}}_n]\) by
that is, \({\varvec{\delta }}_{\pi ,\xi }(\sigma ) = 1\) if and only if \(\sigma \) places the candidates in each block of \(\pi \) in the ranking slots of the corresponding block of \(\xi \). Let \(({\varvec{\delta }}_{\pi ,\xi })_\gamma \) be the orthogonal projection of \({\varvec{\delta }}_{\pi ,\xi }\) onto \(W_\gamma \). Then the collection of vectors \(\{({\varvec{\delta }}_{\pi ,\xi })_\gamma \}_{\pi ,\xi \in \Pi _\gamma }\) is a tight frame for \(W_\gamma \).
Proof
For \(\tau \in {\mathbb {S}}_n\) we have \(\tau {\varvec{\delta }}_{\pi ,\xi } = {\varvec{\delta }}_{\tau ( \pi ),\xi }\) and \({\varvec{\delta }}_{\pi ,\xi } \tau = {\varvec{\delta }}_{ \pi ,\tau ^{-1}(\xi )}\), since \(\sigma (\xi ) = \pi \) if and only if \(\tau \sigma (\xi ) = \tau (\pi )\) if and only if \(\sigma \tau (\tau ^{-1} \xi ) = \pi .\) Furthermore, projection onto the isotypic component \(W_\gamma \) is in the center of the group algebra \({\mathbb {R}}[{\mathbb {S}}_n]\) (see (18) and [97, (13, 19)]), so it commutes with both the left and right action of the group, giving \(\tau ({\varvec{\delta }}_{\pi ,\xi })_\gamma = (\tau {\varvec{\delta }}_{\pi ,\xi })_\gamma = ( {\varvec{\delta }}_{\tau (\pi ),\xi })_\gamma \) and \(({\varvec{\delta }}_{\pi ,\xi })_\gamma \tau = ( {\varvec{\delta }}_{\pi ,\xi }\tau )_\gamma = ( {\varvec{\delta }}_{\pi ,\tau ^{-1}(\xi )})_\gamma \). For any fixed \(\pi , \xi \in \Pi _\gamma \), the irreducible left \({\mathbb {S}}_n\)-module generated by \(({\varvec{\delta }}_{\pi ,\xi })_\gamma \) is given by
where the last equality follows from the fact that \({\mathbb {S}}_n\) acts transitively on \(\Pi _\gamma \). Since this module is irreducible, it follows from [97, Theorem 10.5] that \(\{ ({\varvec{\delta }}_{\tau (\pi ),\xi })_\gamma \mid \tau \in {\mathbb {S}}_n\}\) is a tight group frame for its span, and the span is isomorphic to \(V_\gamma \). The symmetric argument with \({\mathbb {S}}_n\) acting on the right produces a copy of the right irreducible \({\mathbb {S}}_n\)-module \(V_\gamma ^*\) and proves that \(\{({\varvec{\delta }}_{\pi ,\xi })_\gamma \mid \xi \in \Pi _\gamma \}\) is a tight frame for its span. The isotypic component is isomorphic to a tensor product \(W_\gamma \cong V_\gamma \otimes V_\gamma ^*\) of irreducible left and right \({\mathbb {S}}_n\)-modules, respectively. Therefore, by [97, Corollary 5.1], \(\{ ({\varvec{\delta }}_{\pi ,\xi })_\gamma \mid \pi ,\xi \in \Pi _\gamma \}\) is a tight frame for \(W_\gamma \). \(\square \)
On the other hand, we can distinguish a subset of our frame vectors that yield a basis for each \(V_{\gamma ,i}\). An ordered set partition \(\pi \in \Pi _\gamma \) is standard if when the rows are put in increasing order from left to right, the columns also are in increasing order from top to bottom. For example, the five ordered set partitions in the bottom right corner of the Schreier graph in Fig. 12 are standard. Standard ordered set partitions are in bijection with standard tableaux and \(d_\gamma =\dim (V_\gamma )\) equals the number of standard ordered set partitions [83]. Let \(\Pi _\gamma ^\text {std}\) denote the set of standard ordered set partitions. The atoms coming from liftings by standard set partitions provide the desired basis.
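The count of standard ordered set partitions can be verified by brute force for small shapes. The following Python sketch (illustrative only; not part of our released MATLAB code) enumerates the ordered set partitions whose rows increase left to right and whose columns increase top to bottom, and confirms that their number equals \(d_\gamma \) for two small shapes:

```python
from itertools import permutations

def standard_osps(shape):
    """Enumerate ordered set partitions of {1,...,n} with the given row sizes
    whose rows increase left to right and whose columns increase top to bottom."""
    n = sum(shape)
    found = set()
    for perm in permutations(range(1, n + 1)):
        rows, start = [], 0
        for size in shape:
            rows.append(tuple(sorted(perm[start:start + size])))  # rows increasing
            start += size
        # column condition: the entry in row i+1, column j must exceed that in row i
        if all(rows[i + 1][j] > rows[i][j]
               for i in range(len(rows) - 1)
               for j in range(len(rows[i + 1]))):
            found.add(tuple(rows))
    return found

assert len(standard_osps([2, 2])) == 2   # matches d_[2,2] = 2
assert len(standard_osps([3, 2])) == 5   # matches d_[3,2] = 5
```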
Proposition 10
For \(\gamma \vdash n,\) \(\lambda \in \Lambda _\gamma ,\) and \(1 \le k \le \kappa _{\gamma ,\lambda },\) the set \(\{{\varvec{\varphi }}_{\gamma , \lambda , k, \pi } \mid \pi \in \Pi _\gamma ^\text {std}\}\) is a basis for the irreducible \({\mathbb {S}}_n\) module \({\mathbb {R}}[{\mathbb {S}}_n] \varvec{\varphi }_{\gamma ,\lambda ,k,\pi _0} \cong V_\gamma \) (for any \(\pi _0 \in \Pi _\gamma )\).
Proof
The module \(V_\gamma ^*\subseteq {\mathbb {R}}[\Pi _\gamma ]\) is spanned by the polytabloids \(\{q_\pi \mid \pi \in \Pi _\gamma \}\) with symmetric group action \(q_\pi \sigma = q_{\sigma ^{-1}(\pi )}\) [see (21)] and has a basis consisting of standard polytabloids \(\{q_\pi \mid \pi \in \Pi _\gamma ^\text {std}\}\) by [83, Theorem 2.5.2]. We show that this property transfers to \({\mathbb {R}}[{\mathbb {S}}_n] \varvec{\varphi }_{\gamma ,\lambda ,k,\pi _0} = {\mathbb {R}}[{\mathbb {S}}_n] (\mathbf {B}_{\pi _0} {\mathbf {v}}_{\gamma ,\lambda ,k})\). For any \(0 \not = {\mathbf {v}}\in V_\gamma ^*\), the map \({\mathbb {R}}[{\mathbb {S}}_n] (\mathbf {B}_{\pi _0} {\mathbf {v}}) \rightarrow V_\gamma ^*\) given by sending \(\sigma \mathbf {B}_{\pi _0} {\mathbf {v}}\) to \({\mathbf {v}}\sigma ^{-1}\) is an isomorphism (the representing matrices are transposes of one another to account for the change from left to right modules). Moreover, for \(0 \not = {\mathbf {v}},{\mathbf {w}}\in V_\gamma ^*\subseteq {\mathbb {R}}[\Pi _\gamma ]\), the action of the symmetric group on \(({\mathbb {R}}[{\mathbb {S}}_n] \mathbf {B}_{\pi _0}) {\mathbf {v}}\) is identical to the action on \(({\mathbb {R}}[{\mathbb {S}}_n] \mathbf {B}_{\pi _0}) {\mathbf {w}}\). To see this, observe that \(\sigma \mathbf {B}_{\pi _0} {\mathbf {v}}= \mathbf {B}_{\sigma ({\pi _0})} {\mathbf {v}}\) and \(\sigma \mathbf {B}_{\pi _0} {\mathbf {w}}= \mathbf {B}_{\sigma ({\pi _0})} {\mathbf {w}}\). 
Moreover, if \(\sum _{\sigma \in {\mathbb {S}}_n} c_\sigma \sigma \mathbf {B}_{\pi _0} {\mathbf {v}}= \sum _{\sigma \in {\mathbb {S}}_n} c_\sigma \mathbf {B}_{\sigma (\pi _0)} {\mathbf {v}}=0\) is a dependence relation, then there is an \(x \in {\mathbb {R}}[{\mathbb {S}}_n]\) such that \({\mathbf {w}}= {\mathbf {v}}x\), and so \(\sum _{\sigma \in {\mathbb {S}}_n} c_\sigma \mathbf {B}_{\sigma (\pi _0)} {\mathbf {w}}= \sum _{\sigma \in {\mathbb {S}}_n} c_\sigma \sigma \mathbf {B}_{\pi _0} ({\mathbf {v}}x) =\left( \sum _{\sigma \in {\mathbb {S}}_n} c_\sigma \sigma \mathbf {B}_{\pi _0} {\mathbf {v}}\right) x = 0\), since the left and right group actions commute. The converse argument holds, so the dependence relations on \(\{ \mathbf {B}_\pi {\mathbf {v}}\}_{\pi \in \Pi _\gamma }\) and \(\{ \mathbf {B}_\pi {\mathbf {w}}\}_{\pi \in \Pi _\gamma }\) are the same. If \({\mathbf {v}}= q_{\pi _0}\), then \(\mathbf {B}_{\sigma (\pi _0)} q_{\pi _0} = \sigma \mathbf {B}_{\pi _0} q_{\pi _0}\), which maps to \(q_{\pi _0} \sigma ^{-1} = q_{\sigma (\pi _0)}\) under the isomorphism, so for each \(\pi \in \Pi _\gamma \), \(\mathbf {B}_\pi q_{\pi _0}\) maps to \(q_\pi \). It follows that \(\{\mathbf {B}_\pi q_{\pi _0} \mid \pi \in \Pi _\gamma ^\text {std}\}\) is a basis for \({\mathbb {R}}[{\mathbb {S}}_n] (\mathbf {B}_{\pi _0} q_{\pi _0})\). The result holds for \({\mathbb {R}}[{\mathbb {S}}_n] (\mathbf {B}_{\pi _0} {\mathbf {v}}_{\gamma ,\lambda ,k})\) by the fact that the dependence relations are the same for \(\{ \mathbf {B}_\pi q_{\pi _0}\}_{\pi \in \Pi _\gamma }\) and \(\{ \mathbf {B}_\pi {\mathbf {v}}_{\gamma ,\lambda ,k} \}_{\pi \in \Pi _\gamma }\). \(\square \)
To recap, there are two important differences between the data analysis methods we propose here and those presented by Diaconis [29]. First, we directly construct each spanning vector as an interpretable function in \(V_{\gamma ,i}\), whereas the Mallows vectors are constructed as interpretable functions but then projected, during which some of the interpretability may be lost. Second, and probably more importantly, the atoms of the form \({\varvec{\varphi }}_{\gamma ,\lambda ,k,{\bar{\pi }}}\) that we use reside in a single Laplacian eigenspace of the permutahedron, whereas the atoms of the form \(({\varvec{\delta }}_{\pi ,\xi })_\gamma \) from Proposition 9 (and shown in the bottom of Fig. 7) reside in multiple Laplacian eigenspaces of the permutahedron, making the corresponding analysis coefficients less interpretable from a smoothness perspective. Our proposed frame is therefore a more refined representation that enables us to capture both smoothness and symmetry information about the ranking data.
7.2 Smoothness, Notions of Frequency, and Symmetry Types
We mentioned in Sect. 3.2 that the symmetry types carry a notion of frequency: the Laplacian eigenvectors of the Cayley graph induced by the generating set of all transpositions, \(\Upgamma _n\), that reside in isotypic components that occur earlier in the dominance ordering are smoother functions with respect to the graph structure. This notion of frequency and relationship between dominance ordering and smoothness does not carry over directly to the setting of the permutahedron, as all but the first and last isotypic components contain Laplacian eigenvectors of \({\mathbb {P}}_n\) associated with different eigenvalues. However, we believe there is still an interesting relationship between the dominance ordering of the isotypic components and the smoothness with respect to the permutahedron of the smoothest eigenvectors within each isotypic component.
Conjecture 1
For each \(\gamma \vdash n,\) define \({\tilde{\lambda }}_\gamma := \min _{\lambda \in \Lambda _\gamma }\{\lambda \}\). If \(\nu \vartriangleright \gamma ,\) then \({\tilde{\lambda }}_\nu < {\tilde{\lambda }}_\gamma \).
We have numerically verified this conjecture for \(n\le 10\). Figure 33 shows an example of Conjecture 1 with \(n=10\).
7.3 Computational Complexity and Efficient Algorithms
A naive implementation of the Fourier transform on \({\mathbb {S}}_n\) requires \({{\mathcal {O}}}((n!)^2)\) time. The FFT of Clausen and Baum [19] reduces this to \({{\mathcal {O}}}(n^3 n!)\), and Maslen [68] improves it to \({{\mathcal {O}}}(n^2 n!)\). It is conceivable that we could leverage some of the ideas from these works to further improve the complexity of applying our proposed transform. However, these methods also run into space constraints, as they save time by storing many vectors of length n! in their implementations. For example, most transforms on \({\mathbb {S}}_n\) are built recursively from \({\mathbb {S}}_{n-1}\) and the sequences of adjacent swaps \((i,i+1) (i+1,i+2) \cdots (n-1,n)\) for each \(1 \le i \le n-1\) (coset representatives of \({\mathbb {S}}_{n-1}\) in \({\mathbb {S}}_n\)). Partly for these reasons, References [53, 76] implement algorithms that compute the FFT on individual isotypic components.
Another possible computational improvement we have not yet explored is to approximately compute the transform coefficients, avoiding the eigendecompositions of the Schreier Laplacians. This is the approach taken in most scalable graph signal processing algorithms [43, 89].
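As a concrete illustration of this eigendecomposition-free approach, the following sketch applies a spectral filter \(h(\lambda )=e^{-\tau \lambda }\) to a graph signal via a truncated Chebyshev expansion in the style of [43]; the function name and parameter choices here are our own illustrative conventions, not taken from the paper's software.

```python
import numpy as np

def chebyshev_filter(L, f, h, K=40, lmax=None):
    """Approximate h(L) @ f using only matrix-vector products with L,
    avoiding an explicit eigendecomposition (cf. Hammond et al. [43])."""
    if lmax is None:
        lmax = 2 * np.diag(L).max()          # Gershgorin upper bound for a Laplacian
    a = lmax / 2.0
    N = K + 1
    # Chebyshev coefficients of h on [0, lmax], mapped to [-1, 1] via lambda = a(x+1)
    theta = np.pi * (np.arange(N) + 0.5) / N
    c = (2.0 / N) * np.array(
        [np.sum(np.cos(k * theta) * h(a * (np.cos(theta) + 1))) for k in range(N)])
    # three-term recurrence for T_k((L - a I)/a) applied to f
    t_prev, t_cur = f, (L @ f - a * f) / a
    out = 0.5 * c[0] * t_prev + c[1] * t_cur
    for k in range(2, N):
        t_prev, t_cur = t_cur, (2.0 / a) * (L @ t_cur - a * t_cur) - t_prev
        out += c[k] * t_cur
    return out
```

On a small path-graph Laplacian this agrees with the exact filter \(\mathbf{V}\,\mathrm{diag}(h(\lambda ))\,\mathbf{V}^\top \mathbf{f}\) to high accuracy, while requiring only \(K\) matrix-vector products with \(L\).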
7.4 Parametric Distance-Based Models
Of the probability models for ranked data, the most closely related to our framework are distance-based and multistage models that use the Kendall distance metric ([67, Chap. 6], [2, Chap. 8.3], and [100, Sect. 4.3] all contain overviews of distance-based models). For example, Mallows’ \(\phi \)-model [65, 28, Chap. 6] assumes there is a single modal ranking \(\sigma _0\) and the probability of a voter voting for another ranking \(\sigma \) is proportional to \(e^{-\tau d_K(\sigma ,\sigma _0)}\), where \(\tau \) is a single dispersion parameter and the Kendall distance \(d_K(\sigma ,\sigma _0)\) is equal to the number of hops separating \(\sigma \) and \(\sigma _0\) on the permutahedron. Generalizations of this model include mixture models with multiple modes and dispersion rates to represent heterogeneous voting populations [74], generalized Mallows models that allow for different dispersion parameters at each position of the permutation [37], and mixtures of generalized Mallows models [72]. A review and comparison of these models is in [13] and software implementations are detailed in [46, 77].
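For intuition, Mallows' \(\phi \)-model is straightforward to tabulate by brute force for small n; a sketch (function names are illustrative), using the fact that the Kendall distance to \(\sigma _0\) counts pairwise discordances:

```python
import numpy as np
from itertools import permutations

def kendall_distance(sigma, sigma0):
    """Number of pairs that sigma and sigma0 rank in opposite order
    (= number of permutahedron hops between the two rankings)."""
    pos = {c: i for i, c in enumerate(sigma0)}
    s = [pos[c] for c in sigma]
    return sum(1 for i in range(len(s))
               for j in range(i + 1, len(s)) if s[i] > s[j])

def mallows_pmf(n, sigma0, tau):
    """Probabilities proportional to e^{-tau d_K(sigma, sigma0)} over all of S_n."""
    perms = list(permutations(range(n)))
    w = np.exp([-tau * kendall_distance(p, sigma0) for p in perms])
    return perms, w / w.sum()
```

The normalizing constant \(Z = \sum _\sigma e^{-\tau d_K(\sigma ,\sigma _0)}\) factors as \(\prod _{j=1}^{n} (1-e^{-j\tau })/(1-e^{-\tau })\), the q-factorial \([n]_q!\) with \(q = e^{-\tau }\), which is what keeps the model tractable even when n! is far too large to enumerate.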
From a graph signal processing perspective, fitting these models is closely related to blind deconvolution on graphs [47, 79, 85]; e.g., recovering the pair \((\mathbf{x}, h)\) from an observed vote tally of the form \(\mathbf{f} = h({\varvec{\mathcal {L}}})\,\mathbf{x},\)
where \(\mathbf{x}\) is a sparse input signal whose support corresponds to the S modes used in the mixture of Mallows models, h is a diffusion filter which may be a polynomial of small degree or have a parametrized form such as \(h(\lambda _{\ell })=\alpha e^{-\tau \lambda _{\ell }}\), and \({\varvec{\mathcal {L}}}\) is the graph Laplacian of either the (unweighted) permutahedron or a weighted variant that places different edge weights on edges corresponding to swaps of candidates in different adjacent positions in the rankings.
While our proposed approach models the length n! data vector as the linear combination of a much higher number of atoms, \(\mathbf{f} = \sum _{\phi \in \bar{{{\mathcal {D}}}}} \langle \mathbf{f}, \phi \rangle \phi \), the approach can also be used to generate compressed representations with atoms of the same form. This is because ranked data found in applications is typically smooth with respect to the permutahedron, and therefore the magnitudes of most of the coefficients \( \langle \mathbf{f}, \phi \rangle \) are small (see, e.g., Fig. 17). Thus, sparse linear combinations of these atoms can yield effective approximations to high-dimensional ranked data; i.e., \(\mathbf{f} \approx \sum _{\phi \in \bar{{{\mathcal {D}}}}^{\prime }} \alpha _\phi \phi \), where \(\bar{{{\mathcal {D}}}}^{\prime }\) is a subset of the tight frame dictionary \(\bar{{{\mathcal {D}}}}\) with cardinality \(|\bar{{{\mathcal {D}}}}^{\prime }| \ll n!\).
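To illustrate the numerical mechanics (with a toy frame, not the permutahedron frame itself), the following sketch builds the simplest kind of Parseval frame, a union of two orthonormal bases scaled by \(1/\sqrt{2}\); analysis followed by synthesis reproduces the signal exactly, and hard-thresholding the analysis coefficients gives the kind of sparse approximation described above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
# toy Parseval frame: two random orthonormal bases, scaled so Phi @ Phi.T = I
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
Phi = np.hstack([Q1, Q2]) / np.sqrt(2)      # n x 2n tight frame

f = Q1[:, :5] @ rng.standard_normal(5)      # a signal sparse in the first basis
coeffs = Phi.T @ f                          # analysis: <f, phi> for every atom

# Parseval identity: frame coefficients preserve energy ...
assert np.isclose(np.sum(coeffs ** 2), np.sum(f ** 2))
# ... and perfect reconstruction: f = sum_phi <f, phi> phi
assert np.allclose(Phi @ coeffs, f)

# sparse approximation: keep only the largest-magnitude coefficients
keep = np.argsort(-np.abs(coeffs))[:10]
alpha = np.zeros(2 * n)
alpha[keep] = coeffs[keep]
f_approx = Phi @ alpha
# for a Parseval frame, the error is bounded by the discarded coefficient energy
```

The same analyze-threshold-synthesize pattern applies verbatim to the proposed dictionary \(\bar{{{\mathcal {D}}}}\); only the atoms (and their interpretability) differ.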
8 Extensions and Future Directions
We conclude with a brief mention of three lines of future work.
8.1 Closed Form Computation of the Schreier Eigenvalues and Eigenvectors
Availability of closed-form formulas for the Laplacian eigenvalues and eigenvectors of the Schreier graphs could eliminate the need to perform the computations mentioned above altogether. Closed forms are known for Cayley graphs generated by all transpositions [31] and for Cayley graphs generated by transposing i with n for each i (the star graph) [36], but no general, closed formula is known in the case of adjacent transpositions (the permutahedron). Partial results are known [5, 33, 38] when \(\gamma \) is a “hook shape,” i.e., \(\gamma = [n-k,{1,\ldots ,1}]\) for \(0 \le k \le n-1\).
Lemma 3
([5, 38]) Define the k-fold Cartesian product graph \(Q_{n,k} := Q_n \,\square \, Q_n \,\square \, \cdots \,\square \, Q_n\) (\(k\) factors),
which is the k-dimensional cube graph of side length n. For each subset \(I = \{i_1, \ldots , i_k\} \subseteq \{1, \ldots , n-1\}\) of size k, \(\lambda _I:=\sum _{j=1}^k \lambda _{i_j}^{Q_n}\) is a Laplacian eigenvalue of \(Q_{n,k}\), and the k-fold wedge product \(\mathbf{w}_{\gamma ,\lambda _I} := \mathbf{v}_{i_1}^{Q_n} \wedge \cdots \wedge \mathbf{v}_{i_k}^{Q_n},\) where \(\mathbf{v}_{i}^{Q_n}\) denotes a Laplacian eigenvector of \(Q_n\) associated with \(\lambda _{i}^{Q_n},\) is a Laplacian eigenvector of \(Q_{n,k}\) associated with \(\lambda _I\).
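Lemma 3 is easy to verify numerically; here is a sketch for \(n=5\), \(k=2\), assuming \(Q_n\) is the path graph on n vertices (whose Laplacian eigenvalues \(2-2\cos (\pi \ell /n)\) are consistent with Remark 8) and realizing the Cartesian product Laplacian as a Kronecker sum:

```python
import numpy as np
from itertools import product

def path_laplacian(n):
    """Laplacian of the path graph on n vertices."""
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(A.sum(axis=1)) - A

n, k = 5, 2
L1 = path_laplacian(n)
lam, V = np.linalg.eigh(L1)                  # lam[l] = 2 - 2 cos(pi l / n)
assert np.allclose(lam, 2 - 2 * np.cos(np.pi * np.arange(n) / n))

# Laplacian of the Cartesian product Q_{n,2} = Q_n x Q_n (Kronecker sum)
I = np.eye(n)
L2 = np.kron(L1, I) + np.kron(I, L1)

# wedge product of eigenvectors i1, i2: w(a,b) = v_{i1}(a) v_{i2}(b) - v_{i2}(a) v_{i1}(b)
i1, i2 = 1, 3
w = np.array([V[a, i1] * V[b, i2] - V[a, i2] * V[b, i1]
              for a, b in product(range(n), repeat=2)])
lam_I = lam[i1] + lam[i2]
assert np.allclose(L2 @ w, lam_I * w)        # w is an eigenvector for lam_I
```

Note that for \(k=2\) the wedge vector vanishes on every tuple with a repeated entry, which is precisely what allows Proposition 11 to restrict \(\mathbf{w}_{\gamma ,\lambda _I}\) to the k-permutation vertices of the Schreier graph.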
As shown in Fig. 34, the vertices of \(Q_{n,k}\) can be labeled by the set of k-tuples from the alphabet \(\{1, \ldots ,n\}\): \(V(Q_{n,k}) = \{(a_1, \ldots , a_k) \mid a_j \in \{1, \ldots , n\}\},\)
and the vertices of the Schreier graph \({\mathbb {P}}_{[n-k,1,\ldots ,1]}\) can be labeled by the k-permutations of \(\{1, \ldots ,n\}\): \(V({\mathbb {P}}_{[n-k,1,\ldots ,1]}) = \{(a_1, \ldots , a_k) \mid a_j \in \{1, \ldots , n\},\ a_i \ne a_j \text { for } i \ne j\}.\)
Thus \(V({\mathbb {P}}_{[n-k,{1,\ldots ,1}]}) \subseteq V(Q_{n,k})\).
Proposition 11
([5, 38]) Let \(\gamma \) be a “hook shape”; i.e., \(\gamma =[n-k,1^k]=[n-k,1,\ldots ,1]\) for \(0\le k\le n-1\). For each subset \(I = \{i_1, \ldots , i_k\} \subseteq \{1, \ldots , n-1\}\) of size k, \(\lambda _I:=\sum _{j=1}^k \lambda _{i_j}^{Q_n}\) is a Laplacian eigenvalue of \({\mathbb {P}}_\gamma \) with the associated eigenvector given by \(\mathbf{v}_{\gamma ,\lambda _I}:=\mathbf{w}_{\gamma ,\lambda _I} \vert _{V({\mathbb {P}}_\gamma )},\) the restriction of \(\mathbf{w}_{\gamma ,\lambda _I},\) viewed as a function on the vertices of \(Q_{n,k},\) to the vertices of \({\mathbb {P}}_\gamma \). Thus, for any hook shape \(\gamma = [n-k,1^k],\) the complexity of computing each eigenvector of \({\mathbb {P}}_\gamma \) is \({{\mathcal {O}}}(k!n^k),\) and the complexity of computing all eigenvectors of \({\mathbb {P}}_\gamma \) is \({\mathcal O}\left( \frac{n!k!n^k}{(n-k)!}\right) ,\) which is bounded by \({{\mathcal {O}}}(k!n^{2k})\).
Remark 7
Earlier, Edelman and White [33] studied the integer eigenvalues of the permutahedron, determined the integer eigenvalues arising from the hook shapes, and conjectured that every integer eigenvalue appearing in a non-hook shape also appears in a hook shape for the same value of n.
Remark 8
The permutahedron \({\mathbb {P}}_n\) is the same as the Full-Flag Johnson graph FJ(n, 1) studied in [25], in which Dai uses the recursive structure of Full-Flag Johnson graphs to compute a subset of the adjacency spectrum of \({\mathbb {P}}_n\), namely the eigenvalues of the matrix \(M_n\) in [25, Lemma 10]. That matrix \(M_n\) is in fact the adjacency matrix of the Schreier graph \({\mathbb {P}}_{[n-1,1]}\), a path graph with additional self loops that make it regular. Since, as mentioned in Sect. 5.2.2, the Laplacian eigenvalues of \({\mathbb {P}}_{[n-1,1]}\) are known in closed form, an alternative closed form for the eigenvalues of Dai's \(M_n\) matrix is \(\left\{ n-3+2\cos \left( \frac{\pi {\ell }}{n}\right) \right\} _{{\ell }=0,1,\ldots ,n-1} \), and we thus know exactly which subset of the spectrum of \({\mathbb {P}}_n\) is attained from this matrix.
The Schreier graph \({\mathbb {P}}_{[n-2,2]}\) (shown, e.g., in Fig. 12) closely resembles the quartered Aztec diamond, whose adjacency spectrum is studied and specified by Ciucu [18, Eq. (2.4)]. However, the quartered Aztec diamond graph studied by Ciucu does not include the self loops present in \({\mathbb {P}}_{[n-2,2]}\); it is therefore not regular, and its Laplacian eigenvalues do not follow immediately from its adjacency spectrum. We have not yet found a way to adapt the approach of [18] to obtain a closed-form formula for the Laplacian eigenvalues of \({\mathbb {P}}_{[n-2,2]}\).
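The spectral containment described in Remark 8 can also be confirmed by direct computation for small n; a sketch for \(n=4\), building the permutahedron as the graph on \({\mathbb {S}}_4\) whose edges swap adjacent positions:

```python
import numpy as np
from itertools import permutations

n = 4
perms = list(permutations(range(n)))
idx = {p: i for i, p in enumerate(perms)}
N = len(perms)

# permutahedron P_n: connect permutations differing by an adjacent swap
A = np.zeros((N, N))
for p in perms:
    for i in range(n - 1):
        q = list(p)
        q[i], q[i + 1] = q[i + 1], q[i]
        A[idx[p], idx[tuple(q)]] = 1
L = np.diag(A.sum(axis=1)) - A
spec = np.linalg.eigvalsh(L)

# closed-form Laplacian eigenvalues of the Schreier graph (a path on n vertices)
schreier_eigs = 2 - 2 * np.cos(np.pi * np.arange(n) / n)

# each Schreier eigenvalue lifts to an eigenvalue of the permutahedron,
# since functions on the Schreier vertices pull back through the quotient
for mu in schreier_eigs:
    assert np.min(np.abs(spec - mu)) < 1e-9
```

The lifting works because the Schreier graph is a quotient of the Cayley graph by an equitable partition, so its Laplacian eigenvectors pull back to eigenvectors of \({\mathbb {P}}_n\) with the same eigenvalues.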
8.2 Partial and Incomplete Rankings
Our approach to construct tight frames for signals on the permutahedron can be extended to (i) partial rankings, in which voters are allowed to include ties for one or more candidates [6, 24, 29, 52, 55, 94, 95], and (ii) incomplete rankings, in which voters may only rank a subset of the candidates [20, 52, 64, 90, 91]. A common example of partial rankings is when voters list their top k candidates, with the remaining \(n-k\) implicitly tied. An example of incomplete rankings is a balanced incomplete block design, in which each voter is assigned k of the n objects to rank, in such a way that each object is judged by the same number of voters and each pair of objects is presented to the same number of voters [67, Chap. 11], [7, 21, 32]. Such designs are used, for example, when there are too many objects for an individual to compare effectively due to time constraints or cognitive limitations. Our frame vectors may yield a naturally interpretable spanning set for the possible vote tallies in this experimental design. Both extensions merit further investigation given the myriad applications in which partial and incomplete rankings appear.
8.3 Tight Frames for Analyzing Data on Other Cayley Graphs, Groups, and Combinatorial Structures
The frame construction presented here works when the permutahedron is replaced by the Cayley graph \(\Gamma ({\mathbb {S}}_n, S)\) corresponding to any generating set \(S \subseteq {\mathbb {S}}_n\). For example, if S is the set of all transpositions, then the Cayley graph is the graph \(\Upgamma _n\) discussed in Sect. 3.2, and if \(S = \{(i,n)\mid 1 \le i < n\}\), then the Cayley graph is the star graph studied in [36]. In any of these cases, our methods yield a tight spectral frame with respect to the graph, and signals can be projected down to and interpreted on the corresponding Schreier graphs.
Furthermore, these methods extend to any finite group. The equitable partition \(\sim _\pi \) (see Definition 2) on \({\mathbb {S}}_n\) induced by the ordered set partition \(\pi \in \Pi _\gamma \) has equivalence classes equal to the right cosets of the stabilizer subgroup \({\mathbb {S}}_\pi = \{\sigma \in {\mathbb {S}}_n \mid \sigma (\pi ) = \pi \}\), and the Schreier graph \({\mathbb {P}}_\gamma \) is isomorphic to the Cayley graph determined by \({\mathbb {S}}_n\) acting on these cosets. When \({\mathbb {S}}_n\) is replaced by any finite group \({{\mathbb {G}}}\) and \({\mathbb {S}}_\pi \) is replaced by a subgroup \({\mathbb {H}} \le {\mathbb {G}}\), one obtains tight spectral frames (with the same geometric and energy-preserving properties) for studying data on \({\mathbb {G}}\) with respect to any Cayley graph. These methods naturally extend to groups that generalize the symmetric group such as Weyl groups, complex reflection groups, and finite general linear groups. For example, when applied to the hyperoctahedral group (the Weyl group of type B), these methods provide a setting to analyze data on signed permutations. One can further extend these methods to analyze data on matchings, subsets, and set partitions by using the representation theory of the corresponding semisimple algebras with "group-like" structure, the Brauer, rook monoid, and partition algebras. Fourier analysis methods on these algebras are initiated in the recent work [69].
Notes
The results of this paper are true over the complex numbers; however, we use \({\mathbb {R}}[ {\mathbb {S}}_n]\) as ranked data are real valued.
By ordered set partitions, we mean that changing the ordering of the blocks results in a different partition, but changing the ordering within blocks does not. For example, \(\pi =\{\{1,2\},\{3,4\}\}\) and \(\pi ^{\prime }=\{\{3,4\},\{1,2\}\}\) are distinct elements of \(\Pi _{[2,2]}\). However, \(\pi ^{\prime \prime }=\{\{2,1\},\{3,4\}\}\) is equivalent to \(\pi \).
References
Akritidis, L., Katsaros, D., Bozanis, P.: Effective rank aggregation for metasearching. J. Syst. Softw. 84(1), 130–143 (2011)
Alvo, M., Yu, P.L.H.: Statistical Methods for Ranking Data. Springer, New York (2014)
Armon, S., Halverson, T.: Transition matrices between Young’s natural and seminormal representations. Electronic J. Combin. 28(3), 34pp. (2021)
Arrow, K.J.: A difficulty in the concept of social welfare. J. Polit. Econ. 58(4), 328–346 (1950)
Bacher, R.: Valeur propre minimale du laplacien de Coxeter pour le groupe symétrique. J. Algebra 167(2), 460–472 (1994)
Baggerly, K.A.: Visual estimation of structure in ranked data. PhD Thesis, Rice University (1995)
Bailey, R., Diaconis, P., Rockmore, D.N., Rowley, C.: A spectral analysis approach for experimental designs. In: Excursions in Harmonic Analysis, vol. 4, pp. 367–395. Springer, New York (2015)
Basha, T., Moses, Y., Avidan, S.: Photo sequencing. In: Proceedings of ECCV, pp. 654–667. Springer (2012)
Bennett, P.N., Chickering, D.M., Mityagin, A.: Learning consensus opinion: mining data from a labeling game. In: Proceedings of ACM WWW, pp. 121–130 (2009)
Bondy, J.A., Murty, U.S.: Graph Theory. Springer, London (2000)
Borda, J.d.: Mémoire sur les élections au scrutin. Histoire de l’Academie Royale des Sciences pour 1781 (Paris, 1784) (1784)
Breitling, R., Armengaud, P., Amtmann, A., Herzyk, P.: Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Lett. 573(1–3), 83–92 (2004)
Ceberio, J., Irurozki, E., Mendiburu, A., Lozano, J.A.: A review of distances for the Mallows and Generalized Mallows estimation of distribution algorithms. Comput. Optim. Appl. 62(2), 545–564 (2015)
Chamberlin, J.R., Cohen, J.L., Coombs, C.H.: Social choice observed: five presidential elections of the American Psychological Association. J. Polit. 46(2), 479–502 (1984)
Chen, W.: How to order sushi. PhD Dissertation, Harvard University (2014)
Chen, X., Bennett, P.N., Collins-Thompson, K., Horvitz, E.: Pairwise ranking aggregation in a crowdsourced setting. In: Proceedings of ACM WSDM, pp. 193–202 (2013)
Christensen, O.: Frames and Bases. Birkhäuser, Boston (2008)
Ciucu, M.: Symmetry classes of spanning trees of Aztec diamonds and perfect matchings of odd squares with a unit hole. J. Algebr. Comb. 27(4), 493–538 (2008)
Clausen, M., Baum, U.: Fast Fourier transforms for symmetric groups: theory and implementation. Math. Comput. 61(204), 833–847 (1993)
Clémençon, S., Jakubowicz, J., Sibony, E.: Multiresolution analysis of incomplete rankings (2014). arXiv preprint arXiv:1403.1994
Cochran, W.G., Cox, G.: Experimental Designs. Wiley, New York (1957)
Cohen, A., Mallows, C.: Analysis of Ranking Data. Bell Laboratories Memorandum (1980)
Condorcet, N.d.: Essai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix. L’Imprimerie Royale (1785)
Critchlow, D.E.: Metric Methods for Analyzing Partially Ranked Data, vol. 34. Springer, New York (1985)
Dai, I.: Diameter bounds and recursive properties of Full-Flag Johnson graphs. Discrete Math. 341(7), 1932–1944 (2018)
DeConde, R.P., Hawley, S., Falcon, S., Clegg, N., Knudsen, B., Etzioni, R.: Combining results of microarray experiments: a rank aggregation approach. Stat. Appl. Genet. Mol. Biol. (2006). https://doi.org/10.2202/1544-6115.1204
Devlin, S., Uminsky, D.: Identifying group contributions in NBA lineups with spectral analysis. J. Sports Anal. 6(3), 215–234 (2020)
Diaconis, P.: Group Representations in Probability and Statistics. Institute of Mathematical Statistics Lecture Notes-Monograph Series, vol. 11. Institute of Mathematical Statistics, Hayward (1988)
Diaconis, P.: A generalization of spectral analysis with application to ranked data. Ann. Stat. 17(3), 949–979 (1989)
Diaconis, P., Rockmore, D.: Efficient computation of isotypic projections for the symmetric group. DIMACS Series. Discrete Math. Theor. Comput. Sci 11, 87–104 (1993)
Diaconis, P., Shahshahani, M.: Generating a random permutation with random transpositions. Z. Wahrsch. Verw. Geb. 57(2), 159–179 (1981)
Durbin, J.: Incomplete blocks in ranking experiments. Br. J. Stat. Psychol. 4(2), 85–90 (1951)
Edelman, P., White, D.: Codes, transforms and the spectrum of the symmetric group. Pac. J. Math. 143(1), 47–67 (1990)
Etingof, P.I., Golberg, O., Hensel, S., Liu, T., Schwendner, A., Vaintrob, D., Yudovina, E.: Introduction to Representation Theory, vol. 59. American Mathematical Society, Providence (2011)
FairVote: Where is ranked choice voting used? https://www.fairvote.org/rcv#where_is_ranked_choice_voting_used
Flatto, L., Odlyzko, A.M., Wales, D.B.: Random shuffles and group representations. Ann. Probab. 13(1), 154–178 (1985)
Fligner, M.A., Verducci, J.S.: Distance based ranking models. J. R. Stat. Soc. B 48(3), 359–369 (1986)
Friedman, J.: On Cayley graphs on the symmetric group generated by transpositions. Combinatorica 20(4), 505–519 (2000)
Gabriel, K.R.: Biplots. Statistics Reference Online, Wiley StatsRef (2014)
Ghandehari, M., Guillot, D., Hollingsworth, K.: A non-commutative viewpoint on graph signal processing. In: Proceedings of the International Conference on Sampling Theory and Applications (2019)
Godsil, C., Royle, G.F.: Algebraic Graph Theory, vol. 207. Springer, New York (2013)
Grossman, J., Minton, G.: Inversions in ranking data. Discrete Math. 309(20), 6149–6151 (2009)
Hammond, D.K., Vandergheynst, P., Gribonval, R.: Wavelets on graphs via spectral graph theory. Appl. Comput. Harmon. Anal. 30(2), 129–150 (2011)
Huang, J., Guestrin, C., Guibas, L.: Fourier theoretic probabilistic inference over permutations. J. Mach. Learn. Res. 10, 997–1070 (2009)
Huang, J., Guestrin, C., Guibas, L.J.: Efficient inference for distributions on permutations. In: Advances in Neural Information Processing Systems, pp. 697–704 (2008)
Irurozki, E., Calvo, B., Lozano, J.A.: PerMallows: an R package for Mallows and generalized Mallows models. J. Stat. Softw. 71(1), 1–30 (2016)
Iwata, K., Yamada, K., Tanaka, Y.: Graph blind deconvolution with sparseness constraint (2020). arXiv preprint arXiv:2010.14002
Jiang, X., Sun, J., Guibas, L.: A Fourier-theoretic approach for inferring symmetries. Comput. Geom. 47(2), 164–174 (2014)
Kakarala, R.: A signal processing approach to Fourier analysis of ranking data: the importance of phase. IEEE Trans. Signal Process. 59(4), 1518–1527 (2011)
Kamishima, T.: Nantonac collaborative filtering: recommendation based on order responses. In: Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, pp. 583–588 (2003)
Kamishima, T., Akaho, S.: Efficient clustering for orders. In: Zighed, D., Tsumoto, S., Ras, Z., Hacid, H. (eds.) Mining Complex Data, pp. 261–279. Springer, Berlin (2009)
Kidwell, P., Lebanon, G., Cleveland, W.: Visualizing incomplete and partially ranked data. IEEE Trans. Vis. Comput. Graph. 14(6), 1356–1363 (2008)
Kondor, R.: \(S_n\)ob: A C++ toolkit for fast Fourier transforms on the symmetric group (2006)
Kondor, R.: A Fourier space algorithm for solving quadratic assignment problems. In: Proceedings of ACM-SIAM SODA, pp. 1017–1028 (2010)
Kondor, R., Barbosa, M.S.: Ranking with kernels in Fourier space. In: Proceedings of COLT, pp. 451–463 (2010)
Kondor, R., Borgwardt, K.M.: The skew spectrum of graphs. In: Proceedings of the International Conference on Machine Learning, pp. 496–503 (2008)
Kondor, R., Dempsey, W.: Multiresolution analysis on the symmetric group. In: Proceedings of the Neural Information Processing Systems, pp. 1637–1645 (2012)
Kondor, R., Howard, A., Jebara, T.: Multi-object tracking with representations of the symmetric group. In: Artificial Intelligence and Statistics, pp. 211–218 (2007)
Kovačević, J., Chebira, A.: Life beyond bases: the advent of frames (part I). IEEE Signal Process. Mag. 24(4), 86–104 (2007)
Kovačević, J., Chebira, A.: Life beyond bases: the advent of frames (part II). IEEE Signal Process. Mag. 24(5), 115–125 (2007)
Li, X., Wang, X., Xiao, G.: A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications. Brief. Bioinform. 20(1), 178–189 (2017)
Lin, S.: Rank aggregation methods. Wiley Interdiscip. Rev. Comput. Stat. 2(5), 555–570 (2010)
Liu, Q., Crispino, M., Scheel, I., Vitelli, V., Frigessi, A.: Model-based learning from preference data. Annu. Rev. Stat. Appl. 6, 329–354 (2019)
Malandro, M.E.: Inverse semigroup spectral analysis for partially ranked data. Appl. Comput. Harmon. Anal. 35(1), 16–38 (2013)
Mallows, C.L.: Non-null ranking models. I. Biometrika 44(1/2), 114–130 (1957)
Marden, J.I.: Use of nested orthogonal contrasts in analyzing rank data. J. Am. Stat. Assoc. 87(418), 307–318 (1992)
Marden, J.I.: Analyzing and Modeling Rank Data. Chapman and Hall/CRC, London (1995)
Maslen, D.: The efficient computation of Fourier transforms on the symmetric group. Math. Comput. 67(223), 1121–1147 (1998)
Maslen, D., Rockmore, D., Wolff, S.: The efficient computation of Fourier transforms on semisimple algebras. J. Fourier Anal. Appl. 24, 1377–1400 (2018)
McCullagh, P.: Permutations and regression models. In: Probability Models and Statistical Analyses for Ranking Data, pp. 196–215. Springer, New York (1993)
McCullagh, P., Ye, J.: Matched pairs and ranked data. In: Probability Models and Statistical Analyses for Ranking Data, pp. 299–306. Springer, New York (1993)
Meilă, M., Chen, H.: Dirichlet process mixtures of generalized Mallows models. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp. 358–367 (2010)
2017 Minneapolis election results. http://vote.minneapolismn.gov/results/2017/
Murphy, T.B., Martin, D.: Mixtures of distance-based models for ranking data. Comput. Stat. Data Anal. 41(3–4), 645–655 (2003)
Ortega, A., Frossard, P., Kovačević, J., Moura, J.M., Vandergheynst, P.: Graph signal processing: overview, challenges, and applications. Proc. IEEE 106(5), 808–828 (2018)
Plumb, G., Pachauri, D., Kondor, R., Singh, V.: \(S_n\)FFT: a Julia toolkit for Fourier analysis of functions over permutations. J. Mach. Learn. Res. 16(1), 3469–3473 (2015)
Qian, Z., Philip, L.: Weighted distance-based models for ranking data using the R package rankdist. J. Stat. Softw. 90(1), 1–31 (2019)
Raman, K., Joachims, T.: Bayesian ordinal peer grading. In: Proceedings of the ACM Conference on Learning @ Scale, pp. 149–156 (2015)
Ramírez, D., Marques, A.G., Segarra, S.: Graph-signal reconstruction and blind deconvolution for structured inputs. Signal Process. 188, 108180 (2021)
Rockmore, D.: Some applications of generalized FFTs. In: Proceedings of DIMACS Workshop: Groups and Computation, pp. 329–369 (1997)
Rockmore, D., Kostelec, P., Hordijk, W., Stadler, P.F.: Fast Fourier transform for fitness landscapes. Appl. Comput. Harmon. Anal. 12(1), 57–76 (2002)
Rubinstein, R., Bruckstein, A.M., Elad, M.: Dictionaries for sparse representation modeling. Proc. IEEE 98(6), 1045–1057 (2010)
Sagan, B.E.: The Symmetric Group: Representations, Combinatorial Algorithms, and Symmetric Functions, vol. 203. Springer, New York (2013)
Schreier, O.: Die untergruppen der freien gruppen. In: Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, vol. 5, pp. 161–183. Springer (1927)
Segarra, S., Mateos, G., Marques, A.G., Ribeiro, A.: Blind identification of graph filters. IEEE Trans. Signal Process. 65(5), 1146–1159 (2016)
Shani, G., Gunawardana, A.: Evaluating recommendation systems. In: Recommender Systems Handbook, pp. 257–297. Springer, Boston (2011)
Shuman, D.I.: Localized spectral graph filter frames: a unifying framework, survey of design considerations, and numerical comparison. IEEE Signal Process. Mag. 37(6), 43–63 (2020)
Shuman, D.I., Narang, S.K., Frossard, P., Ortega, A., Vandergheynst, P.: The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30(3), 83–98 (2013)
Shuman, D.I., Vandergheynst, P., Kressner, D., Frossard, P.: Distributed signal processing via Chebyshev polynomial approximation. IEEE Trans. Signal Inf. Process. Netw. 4(4), 736–751 (2018)
Sibony, E.: Multiresolution analysis of ranking data. PhD Thesis, Télécom ParisTech (2016)
Sibony, E., Clémençon, S., Jakubowicz, J.: A multiresolution analysis framework for the statistical analysis of incomplete rankings (2016). arXiv preprint arXiv:1601.00399
Stoyanovich, J., Jacob, M., Gong, X.: Analyzing crowd rankings. In: Proceedings of ACM WebDB, pp. 41–47 (2015)
Strang, G.: The discrete cosine transform. SIAM Rev. 41(1), 135–147 (1999)
Thompson, G.: Generalized permutation polytopes and exploratory graphical methods for ranked data. Ann. Stat. 21(3), 1401–1430 (1993)
Ukkonen, A.: Visualizing sets of partial rankings. In: Proceedings of the International Symposium on Intelligent Data Analysis, pp. 240–251 (2007)
Uminsky, D., Banuelos, M., González-Albino, L., Garza, R., Nwakanma, S.A.: Detecting higher order genomic variant interactions with spectral analysis. In: Proceedings of the European Signal Processing Conference, pp. 1–5 (2019)
Waldron, S.F.: An Introduction to Finite Tight Frames. Springer, Boston (2018)
Wang, J., Srebro, N., Evans, J.: Active collaborative permutation learning. In: Proceedings of ACM SIGKDD, pp. 502–511 (2014)
Wong, H.S., Chin, T.J., Yu, J., Suter, D.: Mode seeking over permutations for rapid geometric model fitting. Pattern Recognit. 46(1), 257–271 (2013)
Yu, P.L., Gu, J., Xu, H.: Analysis of ranking data. Wiley Interdiscip. Rev. Comput. Stat. 11(6), e1483 (2019)
Zbikowski, C.: How the 2017 Ward 3 election in Minneapolis foreshadows our local political future. Streets.mn (2019). https://streets.mn/2019/05/20/how-the-2017-ward-3-election-in-minneapolis-foreshadows-our-local-political-future/
Communicated by Isaak Pesenson.
Chen, Y., DeJong, J., Halverson, T. et al. Signal Processing on the Permutahedron: Tight Spectral Frames for Ranked Data Analysis. J Fourier Anal Appl 27, 70 (2021). https://doi.org/10.1007/s00041-021-09878-3