Combinatorial methods for barcode analysis

Jaramillo Rodriguez, Elena

doi:10.1007/s41468-023-00143-8

Combinatorial methods for barcode analysis

Published: 22 November 2023

Volume 8, pages 239–270, (2024)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of Applied and Computational Topology Aims and scope Submit manuscript

Combinatorial methods for barcode analysis

Download PDF

Elena Jaramillo Rodriguez ORCID: orcid.org/0000-0002-4347-1668¹

151 Accesses
1 Citation
Explore all metrics

Abstract

A barcode is a finite multiset of closed intervals on the real line. Barcodes are important objects in topological data analysis, where they serve as summaries of the persistent homology groups of a filtration. We introduce a new combinatorial invariant associated to barcodes by mapping each barcode to a multipermutation, i.e., a permutation of some multiset, which captures the overlapping arrangement of its bars. We call the set all such multipermutations the space of combinatorial barcodes. We define an order on this space and show that the resulting poset is a graded lattice. The cover relations in this lattice can also be used to determine the set of barcode bases of persistence modules. We explore some connections between combinatorial barcodes, trapezoidal words, and Stirling permutations. Finally, we generalize this construction, producing an entire family of multipermutation invariants of barcodes. For a large class of barcodes, these multipermutations provide bounds on the Wasserstein and bottleneck distances between pairs of barcodes, thereby linking combinatorial barcodes to continuous metrics on barcodes.

Combinatorial Results on Barcode Lattices

Article Open access 30 July 2024

Stratifying the space of barcodes using Coxeter complexes

Article Open access 26 December 2022

Palindrome partitions and the Calkin–Wilf tree

Article 20 August 2024

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

A barcode is a finite multiset of closed intervals on the real line, $B = \{ [b_i,d_i]^{m_i}\}_{i=1}^n$, where the exponents $m_i$ indicate multiplicity. Barcodes are important objects in persistent homology, where they serve as summaries of the persistent homology groups of a given filtration (Zomorodian and Carlsson 2005). In this context, each bar in a barcode represents an interval of resolutions in a filtration during which a particular generator is present. By analyzing the lengths and arrangements of the different bars researchers are able to determine significant features in the filtration. This theory has been applied to the study of data, where a filtration is applied to some dataset and its persistent homology groups are used to determine features in the data’s underlying shape (Carlsson 2009; Ghrist 2008).

In this paper, we introduce a new combinatorial invariant for barcodes by mapping each barcode to a multipermutation, i.e., a permutation of some multiset, which captures the overlapping arrangement of its bars. We call the set all such multipermutations the space of combinatorial barcodes, denoted ${\overline{L}}(n,2)$. This space inherits an order relation $\le $ as an induced subposet of the multinomial Newman lattice, L(n, 2). L(n, 2) is the poset of all permutations of the multiset $\{1^2, 2^2, \dots , n^2\}$ ordered by inversions, in a manner analogous to the weak Bruhat order on the symmetric group (Bennett and Birkhoff 1994).

We show that $({\overline{L}}(n,2), \le )$ is actually a principal ideal of L(n, 2) and hence forms a graded lattice. We also show that the order $\le $ can be defined in terms of what we call inversion multisets of multipermutations. We use these inversion multisets to compute several combinatorial properties of $({\overline{L}}(n,2), \le )$. Inversion multisets also have a natural interpretation in terms of the crossing numbers of barcodes, another invariant we introduce. These crossing numbers can be used to describe the set of barcode bases of a large class of persistence modules.

We then generalize our construction, thereby producing an entire family of multipermutations associated to barcodes. Finally, we show that under certain conditions these multipermutations can provide bounds on the Wasserstein and bottleneck distances between a pair of barcodes.

1.1 Motivation and related work

Our work is inspired by a series of papers (Kanari et al. 2020; Curry et al. 2021; Brück and Garin 2022) which developed similar invariants associated to barcodes. In Kanari et al. (2020); Curry et al. (2021) the authors introduced a mapping from the space of barcodes with $n+1$ distinct bars to the symmetric group ${\mathfrak {S}}_n$, defined as follows. Let B be barcode with $n+1$ distinct bars, including one essential bar containing all others. Begin by relabeling the bars according to their birth time, so $b_0< b_1< \dots < b_n$. After relabeling, record the ordering of the death times, excluding the essential bar which is now labeled 0, so as to produce a permutation $\pi _B \in {\mathfrak {S}}_n$. In Brück and Garin (2022) the authors refined this mapping in order to stratify the space of barcodes into regions with the same average and standard deviation in each endpoint and the same permutation type, $\pi _B$.

One of the main contributions of Kanari et al. (2020); Curry et al. (2021) was the discovery that these permutations can be used to describe the fibers of the topological morphology descriptor (${{\,\textrm{TMD}\,}}$). ${{\,\textrm{TMD}\,}}$, first introduced in Kanari et al. (2018), is a method for encoding the spatial structure of metric trees in a barcode. In Kanari et al. (2020); Curry et al. (2021) the authors demonstrated that the cardinality of the fibers of ${{\,\textrm{TMD}\,}}$, when restricted to a certain class of metric trees known as merge trees, can be computed from $\pi _B$ alone. In particular,

$$\begin{aligned} |{{\,\textrm{TMD}\,}}^{-1}(B)|= \prod _{i=1}^n \nu _i(\pi _B), \end{aligned}$$

(1)

where $\nu (\pi _B)$ denotes the left inversion vector of $\pi _B$. This result follows from the observation that inversions in $\pi _B$ correspond to pairs of nested bars in B, i.e., bars $[b_i,d_i], [b_j,d_j]$ satisfying $b_i< b_j< d_j < d_i$.

Note that a pair of non-nested bars may be either disjoint or stepped, i.e., the bars intersect but one is not a subset of the other. This distinction is important for the study of barcode bases of persistence modules. Recall, a barcode basis of a persistence module $(V_\bullet , f_\bullet )$ of the form,

is a choice of ordered bases for each $V_i$ with respect to which the matrices of the maps $f_i$ admit at most a single 1 in each row and column while all other entries equal 0 (Jacquard et al. 2022). Barcode bases are so-called because they allow us to decompose $(V_\bullet , f_\bullet )$ into a direct sum of interval modules which can be summarized by a barcode $B =\{[b_i,d_i]^{m_i}\}_{i=1}^n$ (Zomorodian and Carlsson 2005). Recently, Jacquard et al. (Jacquard et al. 2022) discovered that the set of barcode bases of $(V_\bullet , f_\bullet )$ is in one-to-one correspondence with,

$$\begin{aligned} \prod _{i=1}^n {{\,\textrm{GL}\,}}(m_{i}; {\mathbb {F}}) \times \prod _{\begin{array}{c} i < j: \\ b_i \le b_j \le d_i \le d_j \end{array}} {{\,\textrm{Mat}\,}}(m_i \times m_j; \mathbb F), \end{aligned}$$

(2)

where ${{\,\textrm{GL}\,}}(m; {\mathbb {F}})$ denotes the general linear group of $m\times m$ matrices with entries in ${\mathbb {F}}$. Note that the condition, $b_i \le b_j \le d_i \le d_j$, indicates that a pair of bars are stepped (assuming the endpoints of distinct bars are distinct).

Although one can identify pairs of nested bars in B from their permutation type, $\pi _B$, one cannot distinguish non-nested bars as being either stepped or disjoint from $\pi _B$ alone. Our goal is to develop alternative combinatorial invariants for barcodes which are capable of recording all possible arrangements of bars.

1.2 Outline of our contributions

In Sect. 3, we introduce a map from the space of barcodes with n bars to certain equivalence classes of L(n, 2), the multinomial Newman lattice. This map is defined as follows. Given a barcode B with n distinct bars, write the birth and death times in ascending order then record the sequence of labels. This produces a multipermutation $\sigma _B\in L(n,2)$. To account for potential redundancy from different initial labelings, we consider the invariant $g(B) = [\sigma _B]$, the orbit of $\sigma _B$ under an action by ${\mathfrak {S}}_n$ which identifies two multipermutations if they are equal after relabeling. For each equivalence class $[\sigma _B]$ we let $\overline{\sigma _B}$ be the unique representative in ascending order, i.e., the representative where $1, \dots , i-1$ appear before the first i for all $i\in [n]$. We call these ascending-order representatives combinatorial barcodes and denote the collection of all combinatorial barcodes by ${\overline{L}}(n,2)$.

For example, if B is the strict barcode with 3 bars given by $b_1 = 1.5,\, d_1 = 3.0,\, b_2 = 1.0,\, d_2 = 2.0,\, b_3 = 2.5,\, d_3 = 2.75$, then the birth/death times are ordered, $b_2< b_1< d_2< b_3< d_3 < b_1$, and $\sigma _B =(2~1~2~3~3~1) \in L(3,2)$. If the permutation $(1~2) \in {\mathfrak {S}}_3$, written in cycle notation, acts on $\sigma _B$ the resulting multipermutation is (1 2 1 3 3 2). Hence, both permutations belong to the same equivalence class, $[\sigma _B]$ and $\overline{\sigma _B} = (1~2~1~3~3~2)$.

The combinatorial barcodes in ${\overline{L}}(n,2)$ provide new combinatorial invariants associated to barcodes. In this paper, we seek to answer the following questions:

1.
What is the combinatorial structure of ${\overline{L}}(n,2)$?
2.
How is the structure of this space related to barcodes arising from persistence modules?

To begin answering these questions we consider the poset $({\overline{L}}(n,2), \le )$, where the order $\le $ is inherited from the multinomial Newman lattice L(n, 2). This order is an extension of the weak Bruhat order applied to the permutohedron. We call the poset $({\overline{L}}(n,2),\le )$ the combinatorial barcode poset. In our first main result, Theorem 3.1, we show that $({\overline{L}}(n,2),\le )$ is order-isomorphic to a principal ideal of L(n, 2) and, hence, forms a lattice as well.

We then introduce inversion multisets of combinatorial barcodes. The inversion multiset $s\in {\overline{L}}(n,2)$ is the multiset of pairs ${{\,\textrm{invm}\,}}(s)= \{(j,i)^{a_{ij}}: 1\le i < j \le n\}$ where $a_{ij}$ is equal to the number of pairs of i’s and j’s that appear out of order. In Proposition 3.1 we show that for $s,t \in {\overline{L}}(n,2)$, $s\le t$ if and only if ${{\,\textrm{invm}\,}}(s) \subseteq {{\,\textrm{invm}\,}}(t)$. We then use these inversion multisets to define crossing numbers for barcodes. For any $s \in {\overline{L}}(n,2)$ and $(j,i) \in [n]^2$ with $j>i$, we define the crossing number of i and j in s as the multiplicity of $(j,i) \in {{\,\textrm{invm}\,}}(s)$, which we denote by ${{\,\mathrm{cross\#}\,}}(s, j,i)$, or simply ${{\,\mathrm{cross\#}\,}}(j,i)$. We note that these crossing numbers have a natural interpretation in terms of the bars in a barcode, the crossing number of bars i and j is 0 if the bars are disjoint, 1 if they are stepped, and 2 if they are nested. Moreover, in Theorem 3.2 we show that the rank of a combinatorial barcode s in $({\overline{L}}(n,2)\le )$, denoted $\rho (s)$, is given by

$$\begin{aligned} \rho (\overline{\sigma _B}) = \sum _{i < j} {{\,\mathrm{cross\#}\,}}(j,i). \end{aligned}$$

In Propositions 3.2 and 3.3 we show that the cardinality of the fibers of $TMB^{-1}$ (Eq. 1) and the set of barcode bases of a persistence module (Eq. 2) can also be expressed in terms the crossing numbers of the combinatorial barcode $\overline{\sigma _B}$. Moreover, in Theorem 3.3 we show that if two persistence modules have corresponding multipermutations that differ only in swapping a pair of adjacent entries, then the space of barcode bases for each module differ only up to a factor of the field ${\mathbb {F}}$.

In Sect. 4 we introduce inversion vectors of combinatorial barcodes, which are analogues of classic inversion vectors for permutations. Our next major result, Theorem 4.1, shows that the map which sends a combinatorial barcode to its inversion vector is a bijection of sets. As a corollary, we are able to compute the rank generating function of ${\overline{L}}(n,2)$.

Theorem 4.1

Let $T_n = \prod _{i=1}^n[0,2(n-i)]$. Then the map $J: {\overline{L}}(n,2) \rightarrow T_n$ which sends each combinatorial barcode to its inversion vector is a bijection.

Corollary 4.1

For $k\in [0,2n]$, let $c_k = \#\{s \in {\overline{L}}(n,2): \rho (s) =k\}$, i.e., the number of combinatorial barcodes in ${\overline{L}}(n,2)$ of rank k. Then,

$$\begin{aligned} \sum _{k=0}^{2n}c_k q^k = \prod _{i=1}^n \big (1+q +\dots +q^{2(n-i)}\big ). \end{aligned}$$

In Corollary 4.2 we also show that the number of combinatorial barcodes with k distinct elements in their inversion vectors is given by the second-order Eulerian numbers $C_{n,k}$, defined recursively by $C_{n,k} = kC_{n-1,k} + (2n-k)C_{n-1,k-1}$, which are also known to enumerate trapezoidal words, Stirling permutations, and plane-recursive trees with certain statistics (Riordan 1976; Gessel and Stanley 1978; Janson 2008).

In Sect. 5, we generalize our approach above. For each $k\in {\mathbb {Z}}_{\ge 0}$, we consider a map from the space of barcodes with n bars to an ascending-order multipermutation $\overline{\sigma _k(B)}\in L(n,2^k+1)$, where $L(n,2^k+1)$ denotes the set of all permutations of the multiset $\{1^{2^k+1},2^{2^k+1}, \ldots , n^{2^k+1}\}$. We denote the set of all such representatives by ${\overline{L}}(n,2^k+1)$. This produces an entire family of new combinatorial invariants, denoted $\overline{\sigma _k(B)}$. In Theorem 5.1 we show that $({\overline{L}}(n,2^k+1), \le )$ retains the graded lattice structure found in the $k=0$ case, and so we call the poset the power-k barcode lattice.

We also show that $\overline{\sigma _k(B)}$ contains the invariants $\overline{\sigma _j(B)}$ as a substring for all $j \le k$. It follows that increasing k amounts to refining the information in the invariant. We then prove that as k goes to infinity, this invariant uniquely determines a large family of barcodes up to an affine transformation.

Theorem 5.2

Let $B,B'$ be fundamentally strict barcodes with n bars, where $B =\{[b_i,d_i]\}_{i=1}^n$ and $B' =\{[b_i',d_i')\}_{i=1}^n$ such that $\overline{\sigma _k(B)} = \overline{\sigma _k(B')}$ for all $k\in {\mathbb {N}}$. If $G_B$ (equivalently $G_{B'}$) is connected, then there exist constants $\alpha >0$ and $\delta \in {\mathbb {R}}$ such that $B = \alpha B' +\delta $, where $ \alpha B' +\delta := \{(\alpha b_i'+ \delta ,\alpha d_i' + \delta ): i\in [n]\}$.

We also show that these invariants can provide upper bounds on the bottleneck and q-Wasserstein metrics ($d_\infty $ and $d_q$, respectively) between barcodes, up to an affine transformation.

Theorem 5.3

Let $B,B'$ be k-strict barcodes with n bars such that $\overline{\sigma _k(B)} = \overline{\sigma _k(B)}$. Suppose there exists a bar $[b_*,d_*] \in B$ (or equivalently in $B'$) which contains all others, that is to say $b_* \le b_i$ and $d_* \ge d_i$ for all $i\in [n]$. Then there exist constants $\alpha > 0$ and $\delta \in {\mathbb {R}}$ such that

$$\begin{aligned} d_\infty (B, \alpha B' +\delta )&\le \frac{ d_*-b_*}{2^k}, \text { and} \\ d_q(B, \alpha B' +\delta )&\le (n-1)^{\frac{1}{q}} \frac{ d_*-b_*}{2^k}. \end{aligned}$$

Finally, we show that the barcode lattices form convex polytopes via an embedding into the classic permutohedron. In fact, these polytopes are a special case of Bruhat interval polytopes, studied at length in Tsukerman and Williams (2015). We use some results therein to compute the dimensions of our barcode polytopes (Corollary 5.1).

2 Background

2.1 Barcodes

The language of barcodes used below comes from persistence homology; see (Ghrist 2008) for an introduction and details.

Definition 2.1

A barcode is a finite multiset of closed intervals on the real line, $B = \{ [b_i,d_i]^{m_i}\}_{i=1}^n$, where necessarily $b_i < d_i$ for all $i\in [n]$. Each interval is called a bar; its left endpoint $b_i$ is called its birth (time) and its right endpoint $d_i$ is called its death (time). We denote the set of all barcodes with n distinct bars by ${\mathcal {B}}^n$.

Remark 1

Readers familiar with persistent homology should note that we suppose that the bars corresponding to essential classes have finite values instead of having a death time at infinity. This is reasonable in the context of computing Rips/Čech complexes of a dataset, where there is at most a single essential class and it is assigned a finite death-time in order to display it in a persistence diagram.

Barcodes are often displayed as a stacked set of intervals above the real line as in Fig. 1b. Barcodes are also commonly represented as points $[b_i,d_i]\in {\mathbb {R}}^2$ in a figure known as a persistence diagram. Figure 1c shows the persistence diagram for the barcode in Fig. 1b. Note that the points in Fig. 1c lie above the diagonal since we require that $b_i < d_i$ for all $i \in [n]$.

Definition 2.2

A barcode $B = \{ [b_i,d_i]^{m_i}\}_{i=1}^n$ is called strict if $m_i = 1$ for all $i\in [n]$ and $d_i \ne d_j$ for all $i\ne j$, i.e., if no bars are repeated and no pair of distinct bars share a death time. We denote the set of strict barcodes with n bars by ${\mathcal {B}}^n_{st}$. When we refer to a strict barcode, we often simply write $B = \{ [b_i,d_i]\}_{i=1}^n$ with the multiplicities omitted.

Similar definitions of strict barcodes were introduced in Kanari et al. (2020); Curry et al. (2021). However, these definitions differ slightly from ours in the following aspects: (1) we do not require that there exist an essential bar $[b_0, d_0]$ that contains all the others, and (2) we only require that no death times are repeated, whereas the definition in Curry et al. (2021) requires that no birth times are repeated either, i.e., $b_i \ne b_j $ for all $i\ne j$. We note that our definition is well suited for barcodes arising from Rips/Čech complexes where many generators of the order 0 groups may born at time 0, but it is unlikely that a pair of generators will die at the same time.

One can define a topology on the space of all barcodes, regardless of strictness or the number of bars, by way of a distance function. Two popular and well-studied choices are the bottleneck distance and the q-Wasserstein distance (Edelsbrunner and Harer 2008).

Definition 2.3

Let $B= \{[b_i,d_i]^{m_i}\}_{i=1}^n$ and $B'= \{[b_i',d_i']^{m_i'}\}_{i=1}^m$ be two barcodes. The bottleneck distance between B and $B'$ is

$$\begin{aligned} d_\infty (B,B') = \inf _\gamma \max _{x\in B}\Vert x- \gamma (x)\Vert _\infty , \end{aligned}$$

where $\gamma $ runs over all perfect matchings of the points (bars) $x_i = [b_i,d_i]$ in B and the points (bars) in $B'$, allowing bars to be matched to the diagonal $\Delta = \{(x,x): x\in {\mathbb {R}}\}$. Here $\Vert \cdot \Vert _\infty $ denotes the $\ell ^\infty $-norm on ${\mathbb {R}}^2$.

Intuitively, the bottleneck distance computes the largest distance in $\ell ^\infty $-norm between a pair of matched points on the persistence diagrams of B and $B'$, taking the infimum over all perfect matchings. The q-Wasserstein distance is similar; it can be thought of as the total distance between all matched points, again taking the infimum over all perfect matchings.

Definition 2.4

Let $B= \{[b_i,d_i]^{m_i}\}_{i=1}^n$ and $B'= \{[b_i',d_i']^{m_i'}\}_{i=1}^m$ be two barcodes. The q-Wasserstein distance between B and $B'$ is

$$\begin{aligned} d_q(B,B') = \inf _\gamma \big ( \sum _{x\in B}\Vert x- \gamma (x)\Vert _\infty ^q \big )^{1/q}, \end{aligned}$$

where $\gamma $ runs over all perfect matchings of the points (bars) $x_i = [b_i,d_i]$ in B and the points (bars) in $B'$, allowing bars to be matched to the diagonal $\Delta = \{(x,x): x\in {\mathbb {R}}\}$. Here $\Vert \cdot \Vert _\infty $ denotes the $\ell ^\infty $-norm on ${\mathbb {R}}^2$.

2.2 Persistence modules

For the purposes of this paper, a persistence module is a finite collection $(V_\bullet , f_\bullet )$ of finite-dimensional vector spaces $V_i$ over a field ${\mathbb {F}}$ and ${\mathbb {F}}$-linear maps $f_i$ arranged as below (Jacquard et al. 2022).

The length of $(V_\bullet , f_\bullet )$ is the number $\ell +1$. If $(V_\bullet , f_\bullet )$ and $(W_\bullet , h_\bullet )$ are persistence modules of the same length, their direct sum is defined pointwise, i.e., for every appropriate index i the i-th vector space is $V_i \oplus W_i$ and the i-th map is given by $f_i \oplus h_i$. We say $(V_\bullet , f_\bullet )$ and $(W_\bullet , h_\bullet )$ are isomorphic if there exist linear bijections $\phi _i: V_i \rightarrow W_i$ such that the following diagram commutes for each $i\in [\ell ]$.

Now, let $0 \le i \le j \le \ell $. The interval module of length $\ell +1$ corresponding to i, j is the persistence module ${\textbf{I}}[i,j]_\bullet $ given by,

where $V_i = {\mathbb {F}}$ for all $i \in [i, j]$, maps between adjacent ${\mathbb {F}}$’s are identities, and all other vector spaces are trivial. Despite their apparent simplicity, interval modules are of particular interest due to the following result of Zomorodian and Carlsson.

Theorem 2.1

(Zomorodian and Carlsson 2005) Let $(V_\bullet , f_\bullet )$ be a persistence module of length $\ell +1$. Then, there exists a barcode $B = \{[b_i,d_i]^{m_i}\}_{i=1}^n$ with $b_i,d_i \in \{0, \dots , \ell \}$ for all $i\in [n]$ such that $(V_\bullet , f_\bullet )$ is isomorphic to the following direct sum of interval modules:

$$\begin{aligned} (V_\bullet , f_\bullet ) \simeq \bigoplus _{i=1}^n {\textbf{I}}[b_i,d_i]_\bullet ^{m_i}, \end{aligned}$$

(3)

where ${\textbf{I}}[b_i,d_i]_\bullet ^{m_i}$ denotes the direct sum of $m_i$ copies of the interval module ${\textbf{I}}[b_i,d_i]_\bullet $.

Definition 2.5

We say a persistence module $(V_\bullet , f_\bullet )$ is strict if the barcode B associated to its interval decomposition is strict.

To realize the isomorphism in Theorem 2.1 one must determine an appropriate choice of bases for each of the vector spaces $V_i$. Concretely, let $(V_\bullet , f_\bullet )$ be a persistence module of length $\ell +1$. A basis family of $(V_\bullet , f_\bullet )$ is a collection,

$$\begin{aligned} {\mathcal {U}} = \{U_i \subset V_i:\, 0\le i \le \ell \}, \end{aligned}$$

where $U_i$ is an ordered basis of $V_i$ for each i. Given a fixed basis family, each map $f_i$ can be written as a matrix $A_i$ of size $\dim (V_{i}) \times \dim (V_{i-1})$ with entries in ${\mathbb {F}}$. To determine the interval decomposition of $(V_\bullet , f_\bullet )$, we seek basis families which produce matrices of a particularly nice form, known as barcode form (Jacquard et al. 2022).

Definition 2.6

An $m\times n$ matrix A of rank r is said to be in barcode form if and only if it has at most a single 1 in each row and column, which appear in the first r rows in strictly increasing column order, and the remaining entries are all 0.

We say that a basis family ${\mathcal {U}}$ is a barcode basis for $(V_\bullet , f_\bullet )$ if it produces matrices $A_i$ which are all in barcode form. Note that given a barcode basis ${\mathcal {U}}$, one can recover the decomposition (3) and vice-versa; the positions of the 1’s in each $A_i$ indicate which basis vectors have non-zero image under $f_i$. We denote the set of all ordered barcodes bases of $(V_\bullet , f_\bullet )$ by ${\mathscr {B}}(V_\bullet , f_\bullet )$

The following theorem of Jacquard et al. shows the connection between the space of barcode bases of a persistence module and the intersections of the bars in that modules interval decomposition.

Theorem 2.2

(Jacquard et al. 2022) Let $(V_\bullet , f_\bullet )$ be a persistence module of length $\ell +1$. Let $B = \{[b_i,d_i]^{m_i}\}_{i=1}^n$ be the barcode associated to the interval decomposition of $(V_\bullet , f_\bullet )$ as in Theorem 2.1. Then the set of barcode bases of $(V_\bullet , f_\bullet )$, ${\mathscr {B}}(V_\bullet , f_\bullet )$, admits a bijection,

$$\begin{aligned} {\mathscr {B}}(V_\bullet , f_\bullet ) \cong \prod _{i=1}^n {{\,\textrm{GL}\,}}(m_{i}; {\mathbb {F}}) \times \prod _{\begin{array}{c} i < j: \\ b_i \le b_j \le d_i \le d_j \end{array}} {{\,\textrm{Mat}\,}}(m_i \times m_j; {\mathbb {F}}), \end{aligned}$$

where ${{\,\textrm{GL}\,}}(m; {\mathbb {F}})$ denotes the general linear group of $m\times m$ matrices over ${\mathbb {F}}$.

2.3 Partially ordered sets and double occurrence words

We will assume the reader is familiar with the basic definitions and properties of posets. For a complete introduction to partially ordered sets see (Stanley 2011, Ch.3), for example.

We recall for the reader that a lattice is a poset L for which every pair of elements has a least upper bound and greatest lower bound. We also recall that a principal ideal of a lattice L is a subposet of the form $I = \{s\in L: s \le \alpha \}$ for some $\alpha \in L$; we say I is the principal ideal generated by $\alpha $. It is a well known result that a principal ideal I of a lattice L is a sublattice of L. A congruence on a lattice L is an equivalence relation R on L such that if a R b and c R d then $(a\wedge c)~ R~ (b \wedge d)$ and $(a \vee c)~ R~ (b \vee d)$ (Reading 2016). One can show that an equivalence relation R on a lattice L is a lattice congruence if and only if each equivalence class of R is an interval in LReading (2016).

Lattice congruences are of interest because they allow us to define quotient lattices. If L is a lattice and R is a congruence on L then the quotient lattice L/R is the poset on the R-classes where $C_1 \le C_2$ in L/R if and only if there exists $a\in C_1$ and $b\in C_2$ such that $a\le b$ in L (Reading 2016). One can show that if R is an lattice congruence on a finite lattice L, then L/R is order-isomorphic to the induced subposet of L whose elements are either the minima (or the maxima) of each equivalence class (Reading 2016).

A lattice of particular interest to us is the permutohedron (Caspard et al. 2016). Recall that for $\pi \in {\mathfrak {S}}_n$ an inversion in $\pi $ is a pair $(\pi _i, \pi _j)$ such that $i <j$ and $\pi _i > \pi _j$, i.e., it is a pair of elements that appear out of order. The inversion set of $\pi $, ${{\,\textrm{inv}\,}}(\pi )$, is the set of all inversions in $\pi $. The $inversion number $ of $\pi $ is the cardinality $\#{{\,\textrm{inv}\,}}(\pi )$ of its inversion set. For example, if $\pi = (1~2~5~4~3~6) \in {\mathfrak {S}}_6$ then ${{\,\textrm{inv}\,}}(\pi ) = \{(5,4), (5,3), (4,3)\}$ and $\#{{\,\textrm{inv}\,}}(\pi ) = 3$. Notice that $\pi \ne \sigma \implies {{\,\textrm{inv}\,}}(\pi ) \ne {{\,\textrm{inv}\,}}(\sigma )$, so we can think of ${{\,\textrm{inv}\,}}$ as an injective map from the permutations in ${\mathfrak {S}}_n$ to subsets of $[n]^2$.

The weak Bruhat order (or weak order for short) is the relation $\le $ on ${\mathfrak {S}}_n$ defined by $\pi \le \sigma $ if and only if ${{\,\textrm{inv}\,}}(\pi ) \subseteq {{\,\textrm{inv}\,}}(\sigma )$. Note that $\pi \le \sigma \implies \#{{\,\textrm{inv}\,}}(\pi ) \le \#{{\,\textrm{inv}\,}}(\sigma )$, but the converse need not hold. One can show that $\pi \lessdot \sigma $ if and only if $\#{{\,\textrm{inv}\,}}(\pi ) +1 = \#{{\,\textrm{inv}\,}}(\sigma )$ and $\sigma = (i \,\,\, i+1)\pi $ for some $i \in [n-1]$, which means that $\sigma $ equals $\pi $ after transposing a pair of its adjacent entries.

It is a well-known result that the weak order on ${\mathfrak {S}}_n$ forms the face lattice of a polytope known as the permutohedron (Berge 1971; Caspard et al. 2016); therefore we refer to the poset $({\mathfrak {S}}_n, \le )$ as the permutohedron as well. The weak Bruhat order can also be defined similarly on arbitrary Coxeter systems (see (Björner and Brenti 2005)), but for this paper the definition on the symmetric group is sufficient.

One can generalize the construction of the permutohedron to multiset permutations. For ${\textbf{m}}= (m_1, \ldots ,m_n) \in {\mathbb {N}}^n$, let $L({\textbf{m}})$ denote the set of permutations of the multiset $M = \{1^{m_1}, \ldots , n^{m_n}\}$, with the usual multiset notation where exponents represent multiplicity. An order relation on $L({\textbf{m}})$ can be succinctly defined via the following cover relations. For $s,t \in L({\textbf{m}})$, we say that $s \lessdot t$ if and only if s and t differ only in swapping an adjacent pair of entries, which are in numerical order in s but are reversed in t. For instance, we have that $(1~1~2~3~2) \lessdot (1~2~1~3~2)$ in L(2, 2, 1). The poset $L({\textbf{m}})$ is called the multinomial Newman lattice and was originally introduced by Bennett and Birkhoff in Bennett and Birkhoff (1994).

The multinomial Newman order can also be defined explicitly as follows. Consider the set,

$$\begin{aligned} S=\{1_1, \ldots , 1_{m_1}, \ldots , n_1, \ldots , n_{m_n}\}, \end{aligned}$$

which we endow with the lexicographic total ordering $1_1 \lessdot 1_2 \lessdot \dots \lessdot 1_{m_1} \lessdot \dots \lessdot n_{m_n}$. Note that we may identify a multipermutation $s \in L({\textbf{m}})$ with a unique corresponding permutation $\pi \in {\mathfrak {S}}_S$ where we require that copies of the same elements appear in lexicographic order, that is to say $i_j$ appears before $i_k$ in $\pi $ for all $i\in [n]$ and all $j < k$. For example, the multipermutation $(1~2~1~3~2) \in L(2,2,1)$ is identified with the permutation $(1_1 ~2_1~ 1_2~ 3_1~ 2_2)$. Let $\iota : L({\textbf{m}}) \rightarrow {\mathfrak {S}}_S$ denote this mapping.

One can show that $\iota $ is in fact an order-isomorphism from $L({\textbf{m}})$ to a principal ideal of the permuhatedron ${\mathfrak {S}}_S$; it follows that $L({\textbf{m}})$ is also a lattice (Bennett and Birkhoff 1994; Santocanale and Wehrung 2016; Santocanale 2007). Specifically, the multinomial Newman lattice is isomorphic to the principal ideal generated by $(n_{1} \ldots n_{m_n} \ldots 1_1 \ldots 1_{m_1})$. Thus $L({\textbf{m}})$ has (necessarily unique) minimal and maximal elements, denoted ${\hat{0}}$ and ${\hat{1}}$ respectively. In particular, ${\hat{0}}$ is the identity permutation $(1~1\ldots 1~2~2 \ldots n)$ while ${\hat{1}}$ is the “fully reversed” permutation $(n~n \ldots n ~(n-1)~(n-1) \ldots 1)$.

In Sects. 3 and 4 we will pay close attention to the Newman lattice $L(n,2) = L(2,2,\dots , 2)$, consisting of all permutations of $\{1^2, 2^2, \dots , n^2\}$. The lattice L(n, 2) is closely related to a class of strings known as double occurrence words, defined as follows (Lothaire 2002). A word over [n] is a sequence of symbols $w = (a_1~ a_2~ \ldots ~ a_k)$ with $a_i \in [n]$ for all $i\in [k]$. A double occurrence word, sometimes called a Gauss code, is a word in which every integer $i\in [n]$ occurs either zero or two times in w. For example, (1 2 1 3 3 2) and (4 1 4 3 3 1) are double occurrence words over [4] while (1 2 4 3 3 2) is not. Note that the multipermutations in L(n, 2) are the double occurrence words over [n] where each integer occurs exactly twice. Double occurrence words have been studied extensively the context of knot theory (Burns et al. 2015; Gibson 2011; Turaev 2004), mathematical logic (Courcelle 2008), algebraic combinatorics (Shtylla et al. 2009), enumerative combinatorics (Cruz et al. 2019; Guterman et al. 2020), and genomics (Braun et al. 2018; Burns et al. 2013; Jonoska et al. 2017).

Often it is beneficial to identify two words $w,w'$ if they are equal after some permutation of the integers in [n]. We say w is combinatorially equivalent to $w'$, denoted $w\equiv w'$, if there exists a bijection of symbols $\pi \in {\mathfrak {S}}_n$ such that $\pi (w) = w'$, where $\pi (w)$ indicates that $\pi $ acts element-wise on w. For example, if $w = (1~2~1~3~2~3)$ and $w' = (1~3~1~2~3~2)$ then $w\equiv w'$ because $(2~3) \circ (w) = w'$, where (2 3) is the permutation in ${\mathfrak {S}}_3$ written in cycle notation. One may verify that $\equiv $ defines an equivalence relation on the set of double occurrence words over [n]. The equivalence class of w under $\equiv $ is denoted by [w]. We emphasize that the equivalence relation $\equiv $ is not a lattice congruence on L(n, 2); note, for example, that the equivalence class of (1 1 2 2) in L(2, 2) is $\{(1~1~2~2), (2~2~1~1)\}$, which does not form an interval in L(2, 2).

A double occurrence word w is said to be in ascending order if $1, 2,\dots , i-1$ appear before the first instance of i in w, for all $i\in [n]$, i.e., if the first copy of 1 appears before the first copy of 2, which appears before the first copy of 3, and so on. For example, the word (1 2 1 3 2 3) is in ascending order while the equivalent word (1 3 1 2 3 2) is not. If [w] is an equivalence class of double occurrence words, we let ${\overline{w}}$ denote the unique word in [w] which is in ascending order.

3 The combinatorial barcode lattice

3.1 Barcodes as multiset permutations

Our goal in this section is to develop a new combinatorial invariant associated to strict barcodes by mapping each barcode to certain multipermutations in the multinomial Newman lattice L(n, 2). We begin with the following definition.

Definition 3.1

Let $B = \{[b_i,d_i]\}_{i=1}^n$ be a strict barcode and let $T_n$ be the set of labels $\{x_1, y_1, \ldots , x_n, y_n\}$. Then, the relation $\trianglelefteq _B$ on $T_n$ is given by,

$$\begin{aligned} y_i \trianglelefteq _B y_j&\iff d_i \le d_j, \end{aligned}$$

(4)

$$\begin{aligned} y_i \trianglelefteq _B x_j&\iff d_i < b_j, \end{aligned}$$

(5)

$$\begin{aligned} x_i \trianglelefteq _B y_j&\iff b_i \le d_j, \end{aligned}$$

(6)

$$\begin{aligned} x_i\trianglelefteq _B x_j&\iff (b_i < b_j) \text { or } (b_i= b_j \text { and } d_i \le d_j) . \end{aligned}$$

(7)

In essence, $\trianglelefteq _B$ orders $T_n$ according to the ordering of the endpoints in B, with some rules deciding ties between birth and death times (6) or two birth times (7). Note, it is not possible for two death times to be equal when B is strict.

Lemma 3.1

Let $B = \{[b_i,d_i]\}_{i=1}^n$ be a strict barcode and let $T_n$ be the set of labels $\{x_1, y_1, \ldots , x_n, y_n\}$. Then the set $(T_n, \trianglelefteq _B)$ is totally ordered.

Proof

One can verify that $\trianglelefteq _B$ is reflexive and anti-symmetric. It is also clear that $(T,\trianglelefteq _B)$ is strongly-connected, i.e., any two elements are comparable. Thus, we only have left to show that $\trianglelefteq _B$ is transitive. Let $a,b,c \in T$ with $a\trianglelefteq _B b$ and $b \trianglelefteq _B c$. Now, proceed by cases:

1.
If $a = y_i, b = y_j, c = y_k$, then $d_i \le d_j \le d_k$. Hence, $y_i \trianglelefteq _B y_k$ and $a \trianglelefteq _B c$.
2.
If $a = y_i, b = y_j, c = x_k$, then $b_i \le d_j < b_k$. Hence, $y_i \trianglelefteq _B x_k$ and $a \trianglelefteq _B c$. By the same logic, if exactly one of a, b, c is of the form $x_i$, then again $a \trianglelefteq _B c$.
3.
If $a = y_i, b = x_j, c = x_k$, then $d_i < b_j \le b_k$. Hence, $y_i \trianglelefteq _B x_k$ and $a \trianglelefteq _B c$. By the same logic, if exactly one of a, b, c is of the form $x_i$, then again $a \trianglelefteq _B c$.
4.
If $a = x_i, b = x_j, c = x_k$, then
1. i
  If $b_i < b_j$ or $b_j < b_k$ then $b_i < b_k$. Hence, $x_i \trianglelefteq _B x_k$ and $a \trianglelefteq _B c$..
2. ii
  Else, $b_i = b_j = b_k$ and $d_i \le d_j \le d_k$. Hence, $x_i \trianglelefteq _B x_k$ and $a \trianglelefteq _B c$.

$\square $

Now, let $B = \{[b_i,d_i]\}_{i=1}^n$ be a strict barcode. If we linearly order the elements in $T_n$ according to $\trianglelefteq _B$, then their indices produce a multipermutation (double occurrence word) $\sigma _B \in L(n,2)$. For example, if B is the strict barcode with 3 bars given by $b_1 = 1.0,\, d_1 = 2.0,\, b_2 = 1.0,\, d_2 = 3.0,\, b_3 = 2.5,\, d_3 = 2.75$, then $T_n$ is ordered,

$$\begin{aligned} x_1 \trianglelefteq _B x_2 \trianglelefteq _B y_1 \trianglelefteq _B x_3 \trianglelefteq _B y_3 \trianglelefteq _B y_2. \end{aligned}$$

Thus, $\sigma _B = (1~2~1~3~3~2) \in L(3,2)$.

Let $f: {\mathcal {B}}^n_{st} \rightarrow L(n,2)$ denote the map given by $f(B) =\sigma _B$. The map f provides a new combinatorial invariant on the space of strict barcodes with n bars by associating a double occurrence word to each barcode. However, the map f is highly dependent on the given labeling of the bars. For example, consider the strict barcode $B_2$ given by: $b_2 = 1.0,\, d_2 = 2.0,\, b_1 = 1.0,\, d_1 = 3.0,\, b_3 = 2.5,\, d_3 = 2.75$. Note that $B_2$ is equal to $B_1$ from the prior example, except that the labels of bars 1 and 2 have been swapped. As a result, $f(B_1) = (1~2~1~3~3~2)$ while $f(B_2)= (2~1~2~3~3~1)$. We would like our invariant to be the same for any two barcodes that are equal up to some such relabeling.

To that end, consider the combinatorial equivalence class of $\sigma _B$. Recall, we say two multipermutations $\sigma _1, \sigma _2\in L(n,2)$ are combinatorially equivalent if there exists a permutation $\tau \in {\mathfrak {S}}_n$ such that $\tau \circ \sigma _1 = \sigma _2$, where $\tau \circ \sigma _1$ indicates that $\tau $ acts element-wise on $\sigma _1$. For $\sigma \in L(n,2)$, let $[\sigma ]$ denote it combinatorial equivalence class and let $L(n,2)/{\mathfrak {S}}_n$ denote the set of all such equivalence classes.

By considering the combinatorial equivalence class of $\sigma _B$, we can extend the notion of combinatorial equivalence to strict barcodes. One can verify that the following relation is reflexive, symmetric, and transitive, hence it does define an equivalence relation.

Definition 3.2

Let $B_1, B_2 \in {\mathcal {B}}_{st}^{n}$. Let $g: {\mathcal {B}}^n_{st} \rightarrow L(n,2)/{\mathfrak {S}}_n$ denote the map given by $g(B) = [\sigma _B]$. We say $B_1, B_2$ are combinatorially equivalent if and only if $g(B_1) = g(B_2)$.

3.2 Ordering combinatorial barcodes

Although combinatorial equivalence is not a lattice congruence, we can still identify $L(n,2)/{\mathfrak {S}}_n$ with an induced subposet of L(n, 2), by selecting appropriate representatives of each equivalence class.

Recall that a multipermutation $\sigma \in L(n,2)$ is in ascending order if $1, 2,\dots , i-1$ appear before the first instance of i in $\sigma $, for all $i\in [n]$. Given some $w\in L(n,2)$, one can find an ascending order word ${\overline{w}} \in [w]$ as follows. Let $\tau _w$ be the substring of w given by the first occurrence of each element; when $w =\sigma _B$ for some barcode B, $\tau _w$ this is the string of the labels of the birth times. Now consider $\tau _w$ as a permutation in ${\mathfrak {S}}_n$. Then, the action of $\tau _w^{-1}$ on w relabels w so that the birth times now appear in ascending order and we may take ${\overline{w}} = \tau _w^{-1} \circ w$. For example, if $w = (2~1~4~1~3~3~2~4)\in L(n,2)$ we have that $\tau _w = (2~1~4~3)$, in one-line notation, and so $\tau _w^{-1}\circ w = (1~2~3~2~4~4~1~3)$.

Note that if $s,t \in [w]$, then $\tau _s^{-1}\circ s = \tau _t^{-1}\circ t$. Hence, each equivalence class has a unique member which is in ascending order. Therefore, the map $\psi : L(n,2)/{\mathfrak {S}}_n \rightarrow L(n,2)$ given by $\psi ([s]) = {\overline{s}}$, which sends each equivalence class [s] to its ascending order representative is well-defined. In fact, $\psi $ is a bijection of sets between $L(n,2)/{\mathfrak {S}}_n$ and ${\overline{L}}(n,2)$, the set of all words in L(n, 2) which are in ascending order. For that reason, we may treat the sets $L(n,2)/{\mathfrak {S}}_n$ and ${\overline{L}}(n,2)$ interchangeably.

Definition 3.3

The combinatorial barcode poset is the induced subposet $({\overline{L}}(n,2), \le )$ of the multinomial Newman lattice $(L(n,2),\le )$, where $\le $ denotes the weak order. A multipermutation $s \in {\overline{L}}(n,2)$ is called a combinatorial barcode.

Recall that the relation $\le $ on the multinomial Newman lattice is itself defined using the embedding $\iota : L({\textbf{m}}) \hookrightarrow {\mathfrak {S}}_S$. Hence, for all $s,t\in {\overline{L}}(n,2)$ we have that,

$$\begin{aligned} s\le t \iff {{\,\textrm{inv}\,}}(\iota (s))\subseteq {{\,\textrm{inv}\,}}(\iota (t)). \end{aligned}$$

(8)

The roles of $f,g, \psi , \iota $ and the equivalence relation $\equiv $ are summarized in the diagram (9), below. Here, we let $\Sigma $ denote the quotient map that sends each word $w\in L(n,2)$ to its equivalence class [w]. By abuse of notation we also let $\iota $ denote both the inclusion map from ${\overline{L}}(n,2)$ to L(n, 2) and the inclusion map from L(n, 2) to ${\mathfrak {S}}_S$. We emphasize that this diagram is not commutative, although it is the case that $\Sigma \circ f = g$.

Our first main result is that $({\overline{L}}(n,2), \le )$, is a principal ideal of the multinomial Newman lattice L(n, 2) and hence is a lattice.

Theorem 3.1

The combinatorial barcode poset $({\overline{L}}(n,2), \le )$ is the principal ideal of the multinomial Newman lattice, L(n, 2), generated by the “fully nested” permutation: $(1~2~\ldots ~(n-1)~n~n~(n-1)~\ldots ~2~1)$. Consequently, $({\overline{L}}(n,2), \le )$ is a lattice.

Theorem 5.1 in Sect. 5 is a generalization of Theorem 3.1, above. In the interest of space, we omit the proof of Theorem 3.1 here and refer the reader to the proof of Theorem 5.1.

3.3 Inversion multisets and barcode crossing numbers

A remarkable property of the combinatorial barcode lattice is that it admits an elegant, alternate construction based on inversions that does not require first “translating” to the symmetric group.

Definition 3.4

Let $s\in L(n,2)$ be a double occurrence word. Then, the inversion multiset of s is the multiset of pairs ${{\,\textrm{invm}\,}}(s) = \{(j,i)^{a_{ij}}: 1\le i < j \le n\}$ where $a_{ij}$ is equal to the number of pairs of indices $(k,\ell )$ such that $s_k = i, s_\ell =j$ and $k >\ell $.

Stated simply, the inversion multiset has as elements the pairs (j, i), $i<j$, with multiplicity equal to the number of pairs of i’s and j’s that appear out of order in s. For example, ${{\,\textrm{invm}\,}}((1~2~3~2~4~4~1~3)) = \{(2,1)^2, (3,1)^1, (4,1)^2, (3,2)^1, (4,3)^2 \}$.

Now, for $s,t \in {\overline{L}}(n,2)$, write $s \preceq t$ if ${{\,\textrm{invm}\,}}(s)\subseteq {{\,\textrm{invm}\,}}(t)$; recall, given multisets $A= \{x_1^{a_1}, \ldots , x_n^{a_n}\}$, $B=\{x_1^{b_1}, \ldots , x_n^{b_n}\}$ we say that $A \subseteq B$ if $a_i \le b_i$ for all $i \in [n]$.

Proposition 3.1

For $s,t \in {\overline{L}}(n,2)$, we have that $s \preceq t$ if and only if $s \le t$.

Proof

Let $s \in {\overline{L}}(n,2)$. Let (j, i) be an element in ${{\,\textrm{invm}\,}}(s)$ with multiplicity k. Since s is in ascending-order, the first copy of i appears before the first copy of j for all $i <j$. Therefore, the copies of i, j must appear according to one of three patterns:

$$\begin{aligned} (1) \, i \ldots i \ldots j \ldots j, \,(2)\, i \ldots j \ldots i \ldots j, \,(3)\, i \ldots j \ldots j \ldots i. \end{aligned}$$

It follows that $k\in \{0,1,2\}$, specifically, $k=0$ when s contains pattern (1), $k=1$ when s contains pattern (2), and $k=2$ when s contains pattern (3).

Now, let $A_{ij} = \{(j_1, i_1), (j_2, i_1), (j_1, i_2), (j_2,i_2)\}$ be the set of all possible inversions involving i and j in $\iota (s)$, which is a permutation of the set $S=\{1_1,1_2, 2_1, 2_2, \ldots , n_1, n_2\}$. It follows that,

$$\begin{aligned} {{\,\textrm{inv}\,}}(\iota (s)) \cap A_{ij} = {\left\{ \begin{array}{ll} \emptyset &{},\, k = 0 \\ \{(j_1, i_2) \}&{}, \, k=1 \\ \{(j_1, i_2), (j_2,i_2)\} &{}, \, k=2 \end{array}\right. }. \end{aligned}$$

Now suppose $s \preceq t$. Then, $(j,i) \in {{\,\textrm{invm}\,}}(t)$ with multiplicity $\ell $ and necessarily $\ell \ge k$. As a result, $({{\,\textrm{inv}\,}}(\iota (s)) \cap A_{ij}) \subseteq ({{\,\textrm{inv}\,}}(\iota (t)) \cap A_{ij})$. Applying this argument to all pairs in ${{\,\textrm{invm}\,}}(s)$, it follows that ${{\,\textrm{inv}\,}}(\iota (s)) \subseteq {{\,\textrm{inv}\,}}(\iota (t))$. Thus, $\iota (s) \le \iota (s)$ and hence $s\le t$.

Moreover, this argument is reversible in that we can deduce the multiplicity of $(j,i) \in {{\,\textrm{invm}\,}}(s)$ from ${{\,\textrm{inv}\,}}(\iota (s)) \cap A_{ij}$. Thus, we also have that if $s\le t$, then $s \preceq t$, as desired. $\square $

Remark 2

We note that the key to Proposition 3.1, above, is the fact that the inversion multisets of elements in ${\overline{L}}(n,2)$ are in one-to-one correspondence with the inversion sets of those elements after embedding them in ${\mathfrak {S}}_S$. This is not true for all multipermutations in L(n, 2). For example, (1 2 2 1) and (2 1 1 2) both have the inversion multiset $\{(2,1)^2\}$ but their inversion sets are $\{(2_1,1_2), (2_2,1_2) \}$ and $\{(2_1, 1_1), (2_1, 1_2)\}$, respectively. Hence, we cannot use inversion multisets to define the ordering of L(n, 2); this construction is only valid on the ascending order words in ${\overline{L}}(n,2)$.

The proof of Proposition 3.1 illuminates a simple way of computing the rank of the combinatorial barcode $s\in {\overline{L}}(n,2)$. For each pair $i < j$, define the crossing number of i and j, ${{\,\mathrm{cross\#}\,}}(i,j)$, as the multiplicity of $(j,i) \in {{\,\textrm{invm}\,}}(s)$. If B is a strict barcode, labeled so that $\sigma _B = \overline{\sigma _B}$, then crossing number of a pair of bars $i <j $ in $\sigma _B$ is given by,

$$\begin{aligned} {{\,\mathrm{cross\#}\,}}(j,i)= {\left\{ \begin{array}{ll} \begin{aligned} 0 &{},\,\, d_i< b_j &{}&{}\text { (disjoint)}\\ 1 &{}, \,\, b_i \le b_j \le d_i< d_j &{}&{}\text { (stepped)}\\ 2 &{}, \,\, b_i< b_j< d_j < d_i &{}&{}\text { (nested)} \end{aligned} \end{array}\right. }. \end{aligned}$$

(10)

If B is given as a barcode diagram then the crossing number of a pair of bars is easy to read: it equals 0 when the bars are disjoint, 1 when they are stepped, and 2 when they are nested (see Fig. 2).

Theorem 3.2

Let $s\in {\overline{L}}(n,2)$ be a combinatorial barcode and let $\rho (s)$ denote its rank. Then,

$$\begin{aligned} \rho (s) = \sum _{i < j} {{\,\mathrm{cross\#}\,}}(j,i). \end{aligned}$$

Proof

The result follows immediately from the observation that $|{{\,\textrm{invm}\,}}(s)|= \vert {{\,\textrm{inv}\,}}(\iota (s))|= \rho (s)$, where elements are counted with multiplicity in the inversion multiset. $\square $

3.4 Connections to barcode bases and TMD

We now return to our motivations, which are to use combinatorial barcodes to study the fibers of the TMD and the set of barcodes bases of persistence modules. Observe how we can restate Eq. 1 and Theorem 2.2 in terms of the crossing numbers of a barcode.

Proposition 3.2

Let ${{\,\textrm{TMD}\,}}$ denote the topological morphology descriptor from Kanari et al. (2018), and let $B = \{[b_i,d_i]\}_{i=0}^n$ be a strict barcode with $n+1$ bars, including an essential bar $[b_0,d_0]$ which contains all others. Suppose no birth times in B are repeated and without loss of generality assume that B is labeled canonically. Then,

$$\begin{aligned} |{{\,\textrm{TMD}\,}}^{-1}(B) |= \prod _{j=1}^n |\{0 \le i < j: {{\,\mathrm{cross\#}\,}}(j,i) =2\} |. \end{aligned}$$

(11)

Proof

This follows directly from Eq. 1, after observing that inversions in $\pi _B$ correspond to nested pairs of bars when the birth times $b_i$ are distinct. $\square $

Proposition 3.3

Let $(V_\bullet , f_\bullet )$ be a persistence module of length $\ell +1$. Let $B = \{[b_i,d_i]^{m_i}\}_{i=1}^n$ be the barcode associated to the interval decomposition of $(V_\bullet , f_\bullet )$ as in Theorem 2.1. Suppose $d_i \ne d_j$ for all $i\ne j$ and assume without loss of generality assume that B is labeled so that $\sigma _B = \overline{\sigma _B}$. Then the set of barcode bases of $(V_\bullet , f_\bullet )$, ${\mathscr {B}}(V_\bullet , f_\bullet )$, admits a bijection,

$$\begin{aligned} {\mathscr {B}}(V_\bullet , f_\bullet ) \cong \prod _{i=1}^n {{\,\textrm{GL}\,}}(m_{i}; {\mathbb {F}}) \times \prod _{\begin{array}{c} i < j: \\ {{\,\mathrm{cross\#}\,}}(j,i) = 1 \end{array}} {{\,\textrm{Mat}\,}}(m_i \times m_j; {\mathbb {F}}), \end{aligned}$$

(12)

where ${{\,\textrm{GL}\,}}(m; {\mathbb {F}})$ denotes the general linear group of $m\times m$ matrices over ${\mathbb {F}}$.

Proof

This follows directly from Theorem 2.2, after observing that $b_i \le b_j \le d_i \le d_j$ if and only if ${{\,\mathrm{cross\#}\,}}(j,i) =1$. $\square $

Now, consider a strict persistence module $(V_\bullet , f_\bullet )$ with associated barcode $B = \{[b_i,d_i]\}_{i=1}^n$. Note that the first term on right hand side of Equation (12) reduces to ${{\,\textrm{GL}\,}}(1, {\mathbb {F}})^n \cong ({\mathbb {F}}- \{0\})^n$, while the second reduces to,

$$\begin{aligned} \prod _{\begin{array}{c} i<j: \\ {{\,\mathrm{cross\#}\,}}(j,i) = 1 \end{array}} {{\,\textrm{Mat}\,}}(1; {\mathbb {F}}) \cong \prod _{\begin{array}{c} i<j: \\ {{\,\mathrm{cross\#}\,}}(j,i) = 1 \end{array}} {\mathbb {F}}. \end{aligned}$$

From this observation, we see how the cover relations in $({\overline{L}}(n,2), \le )$ relate to the space of barcode bases for strict persistence modules.

Theorem 3.3

Consider two strict persistence modules $(V_\bullet , f_\bullet ), (W_\bullet , h_\bullet )$. Let $B = \{[b_i,d_i]\}_{i=1}^n, B' = \{[b_i',d_i']\}_{i=1}^n$ be the barcodes associated to the interval decomposition of $(V_\bullet , f_\bullet )$ and $(W_\bullet , h_\bullet )$, respectively. Assume without loss of generality that B is labeled so that $\sigma _B = \overline{\sigma _B}$ (and likewise for $B'$). Suppose $\sigma _B \lessdot \sigma _{B'}$, so $\sigma _B$ and $\sigma _{B'}$ differ only in swapping an adjacent pair of entries $i < j$ which are ordered in $\sigma _B$ but inverted in $\sigma _{B'}$. Then,

$$\begin{aligned} {\mathscr {B}}(V_\bullet , f_\bullet )&\cong {\mathscr {B}}(W_\bullet , h_\bullet ) \times {\mathbb {F}}, \text { if } {{\,\mathrm{cross\#}\,}}(j,i) = 0 \text { in } B, \\ {\mathscr {B}}(W_\bullet , h_\bullet )&\cong {\mathscr {B}}(V_\bullet , f_\bullet ) \times {\mathbb {F}}, \text { otherwise}. \end{aligned}$$

Proof

Swapping i, j produces a new inversion. If the bars i and j were disjoint (had crossing number 0 in B), this produces a new pair of stepped bars. Hence, the product in Equation (12) gains an extra term, ${\mathbb {F}}$. Otherwise, the bars i and j were already stepped, in which case swapping i and j forms a pair of nested bars from the stepped bars. Hence, the product in Equation (12) loses one copy of ${\mathbb {F}}$. $\square $

4 Connections to trapezoidal words, stirling permutations, & Eulerian numbers

In this section we briefly discuss some connections between the space of combinatorial barcodes, ${\overline{L}}(n,2)$, and several well-studied objects in combinatorics.

Definition 4.1

Let $s\in L(n,2)$ be a double occurrence word. The inversion vector of s is the vector $\mu (s) \in {\mathbb {Z}}^n$ where $\mu (s)_i = \sum _{j >i}a_{ij}$, where $a_{ij}$ denotes the multiplicity of (j, i) in ${{\,\textrm{invm}\,}}(s)$.

Put simply, the i-th component of $\mu (s)$ is the number of inversions in s where i is the smaller (right) element.