1 Introduction

A universal cycle for a set \(\mathbf {E}\) of length n stringsFootnote 1 is a circular string of length \(|\mathbf {E}|\) where each string in \(\mathbf {E}\), or an encoding of it, appears exactly once as a substring [6]. Universal cycles generalize de Bruijn sequences, where \(\mathbf {E}\) is the set of k-ary strings of length n [9, 51] (and see [10]). For example, 00010111 is a de Bruijn sequence for \(k=2\) and \(n=3\) since its substrings—000, 001, 010, 101, 011, 111, 110, 100—are precisely the \(k^n = 2^3 = 8\) binary strings of length 3. When visualizing de Bruijn sequences and universal cycles it is common to picture a window moving through the circular string, where the width of the window is equal to the length of the encodings of the objects in \(\mathbf {E}\). See [18, 19] and a new website [40] for recent surveys on the rich history of these objects.

This paper constructs the first fixed-content universal cycle. More precisely, we construct two different universal cycles with content S: \(\mathcal {V}(S)\) and \(\mathcal {U}(S)\). Then we show that the constructions are equivalent (i.e., \(\mathcal {V}(S) = \mathcal {U}(S)\)), and develop efficient algorithms for generating our universal cycle.

Throughout the article, S is a multiset of symbols referred to as the content. By convention, S has k distinct symbols \(1,2, \ldots , k\), and its cardinality (including repetitions) is n. For example, \(k=3\) and \(n=4\) when \(S = \{1,1,2,3\}\). The symbol \(\cdot \) denotes concatenation and is optional, so \(a \cdot b = ab\).

1.1 Fixed-Content Universal Cycles

De Bruijn sequences are examples of universal cycles that store the contents of \(\mathbf {E}\) without any encoding, and additional examples exist for many other interesting sets [6, 11, 25, 26, 30, 31, 47]. But this is not possible for many other natural sets, including the permutations of \(\{1,2,\ldots ,n\}\) in one-line notation, and the n-bit binary strings with weight w (i.e., w copies of 1). To illustrate this point, if the permutations of \(n=3\) (i.e., \(\mathbf {E} = \{123, 132, 213, 231, 312, 321\}\)) had a universal cycle, then 123abc would be such a cycle (as any cycle could be rotated to have 123 as its first substring). Since 23a is a window of the universal cycle, it must be that \(23a \in \mathbf {E}\), and hence \(a=1\). Similarly, \(b = 2\) and \(c = 3\). But \(123abc = 123123\) is not a universal cycle for \(\mathbf {E}\). Likewise, there is no universal cycle for the binary strings of length \(n=4\) and weight \(w=2\) (i.e., \(S = \{0011, 0101, 0110, 1001, 1010, 1100\}\)).

Fortunately, permutations and fixed-weight binary strings have a simple alternate encoding: The shorthand representation of a string omits its final symbol. This is a suitable choice since permutations and fixed-weight strings are determined by their length \(n{-}1\) prefixes, as the final symbol is redundant.

Example 1

Consider the set \(\mathbf {E}_1 = \{ 12, 13, 21, 23, 31, 32 \}\) of shorthand permutations for \(n=3\). Observe that 231321 is a universal cycle for \(\mathbf {E}_1\).

Example 2

Consider the set \(\mathbf {E}_2 = \{ 0001\), 0010, 0100, 1000, 0011, 0110, 0101, 1001, 1010, \(1100\}\) of shorthand fixed-weight strings of length \(n=5\) and weight \(w=2\). Observe that 1010011000 is a universal cycle for \(\mathbf {E}_2\).

In this paper, we consider the natural generalization of permutations and fixed-weight binary strings, namely strings with fixed-content. These objects are also known strings with the same Parikh vector and as multiset permutations. Owing to the latter term, we let \(\varvec{Perm}(S)\) denote the strings with content S. For example, the set of strings with content \(S = \{1,1,2,3\}\) is

$$\begin{aligned} \varvec{Perm}(S)= & {} \{1123,1132,1213,1231,1312,1321,2113,\\&2131,2311,3112,3121,3211\}. \end{aligned}$$

We let \(\varvec{Short}_{}(S)\) denote the shorthand representation of strings with content S. For example, the set of shorthand representations of strings with content \(S = \{1,1,2,3\}\) is

$$\begin{aligned} \varvec{Short}_{}(S) = \{112,113,121,123,131,132,211,213,231,311,312,321\}. \end{aligned}$$

As with permutations and fixed-weight binary strings, the final symbol of a string with content S is redundant, so the shorthand encoding can be used in a universal cycle without any loss of information. Such a cycle can be referred to as universal cycle of \(\varvec{Short}_{}(S)\), or as a shorthand universal cycle of \(\varvec{Perm}(S)\), or simply as fixed-content universal cycle over S. Regardless of the name, the windows provide the strings in \(\varvec{Short}_{}(S)\), which provide a simple encoding of the strings in \(\varvec{Perm}(S)\).

Example 3

Observe that 112312131132 is a universal cycle of \(\varvec{Short}_{}(S)\) with \(S = \{1,1,2,3\}\). It can also be described as a shorthand universal cycle for \(\varvec{Perm}(S)\), or as a fixed-content universal cycle over S.

1.1.1 First Symbol or Missing Symbol

If \(\beta = {b_1} \cdot b_2 \cdots b_{n-1} \in \varvec{Short}_{}(S)\), then \({b_1}\) is \(\beta \)’s first symbol. The missing symbol from \(\beta \) is \({b_n}\) for the unique \({b_n}\) with \(\beta \cdot {b_n} \in \varvec{Perm}(S)\). We also use \({z} = {b_n}\) for the missing symbol to emphasize that it is not included in \(\beta \). For example, if \(S = \{1,1,2,3\}\) and \(\beta = {2}13 \in \varvec{Short}_{}(S)\), then \(\beta \)’s first symbol is \({b_1} = {2}\) and its missing symbol is \({z} = {b_4} = {1}\). When viewed symbol-by-symbol, fixed-content universal cycles repeatedly select between the first symbol and the missing symbol of the current window, as formalized by the following lemma.

Lemma 1

Let \(\mathcal {W}\) be a fixed-content universal cycle over S and \(\beta = {b_1} \cdot b_2 \cdots b_{n-1} \in \varvec{Short}_{}(S)\) with missing symbol z. The window \(\beta \) appears exactly once in \(\mathcal {W}\) and it is immediately followed by its first symbol \({b_1}\) or its missing symbol z. Equivalently, if \(\beta '\) is the window following \(\beta \) in \(\mathcal {W}\), then

  1. Case 0:

    \(\beta ' = b_2 \cdots b_{n-1} \cdot {z}\) or

  2. Case 1:

    \(\beta ' = b_2 \cdots b_{n-1} \cdot {b_1}\).

Proof

Since z is missing from \(\beta \in \varvec{Short}_{}(S)\), we have \(\beta \cdot {z} = {b_1} \cdot b_2 \cdots b_{n-1} \cdot {z} \in \varvec{Perm}(S)\) and \(S = \{{b_1}, b_2, \ldots , b_{n-1}, {z}\}\). Since \(\beta '\) follows \(\beta \), we have \(\beta ' = b_2 \cdots b_{n-1} \cdot x \in \varvec{Short}_{}(S)\) for some symbol \(x \in S\). Thus, the claim holds by the following (where − denotes multiset difference),

$$\begin{aligned} x \in S - \{b_2, \ldots , b_{n-1}\} = \{{b_1}, b_2, \ldots , b_{n-1}, {z}\} - \{b_2, \ldots , b_{n-1}\} = \{{b_1}, {z}\}. \end{aligned}$$

\(\square \)

It is important to note that a window’s first symbol and missing symbol can be equal. For example, if \(S = \{1,1,2,3\}\) then \({b_1} = {z} = 1\) for \(123 \in \varvec{Short}_{}(S)\) and \(132 \in \varvec{Short}_{}(S)\). In these situations, the two cases of Lemma 1 are identical, and there is only one choice for next symbol and window.

1.2 Characterizations Using Graphs

Fixed-content universal cycles can be characterized using several different directed graphs with labeled edges. Each characterization provides its own insights into this new type of universal cycles.

1.2.1 Transition Graphs: Hamilton Cycles and Binary Representation

The graph \(T(S)\) has vertex set \(\varvec{Short}_{}(S)\) and edges uv for \(u = b_1 \cdot b_2 \cdots b_{n-1} \in \varvec{Short}_{}(S)\) and \(v = b_2 \cdot b_3 \cdots b_n\) with label \(b_n\). In other words, the vertices are windows, and edges transition from window to window by shifting in their label. This type of graph is often known as a transition graph within the universal cycle literature, and it leads directly to the characterization in Remark 2, which views universal cycles and Hamilton cycles as cyclic objects (i.e., they are unchanged by rotation).

Remark 2

Universal cycles with fixed-content S are in one-to-one correspondence with the concatenation of the edge labels along the Hamilton cycles in \(T(S)\).

Fig. 1
figure 1

a The transition graph \(T(S)\) with a Hamilton cycle. b The augmented transition graph \(T'(S)\). Transition graphs for \(S = \{1,1,2,3\}\). Our universal cycle 112312131132 follows the Hamilton cycle from vertex 132 in (a). It has binary representation 001000110001 starting from vertrex 112 in (b), where thick straight and thin curved edges are of type 0 and 1, respectively. Its weight of 4 is minimum possible for S

For our purposes, it is helpful to consider an augmented transition graph \(T'(S)\). This graph has vertex set \(\varvec{Short}_{}(S)\) and its edge set is defined based on the first and missing symbols from Lemma 1. That is, if \(u = {b_1} \cdot b_2 \cdots b_{n-1} \in \varvec{Short}_{}(S)\) is missing z, then there are two edges of the form uv where

  1. Edge 0:

    \(v = b_2 \cdots b_{n-1} \cdot {z}\) with label z, and

  2. Edge 1:

    \(v = b_2 \cdots b_{n-1} \cdot {b_1}\) with label \({b_1}\).

This definition ensures that each vertex has out-degree two. Furthermore, each vertex has in-degree two. This is because the edges partition into two vertex-disjoint directed cycle covers: one includes the 0 edges, and the other includes the 1 edges. Figure 1 shows \(T(S)\) and \(T'(S)\) for \(S = \{1,1,2,3\}\).

The augmented transition graph visualizes a simple consequence of Lemma 1: fixed-content universal cycles can be represented in binary. More precisely, a binary representation is an initial window in \(\varvec{Short}_{}(S)\) and a binary string of length \(|\varvec{Short}_{}(S)|\) whose bits follow the cases of Lemma 1 or the edge types of \(T'(S)\). If we use the convention that the initial window is the lexicographically smallest string in \(\varvec{Short}_{}(S)\) (e.g., 112 for \(S = \{1,1,2,3\}\)), then the binary string suffices by itself. A fixed-content universal cycle has weight w if it has a binary representation of weight w, and it has minimum-weight if it has a binary representation of minimum possible weight for its content S.

1.2.2 Arc Digraphs: Eulerian Circuits and Universal Cycle Existence

To prove that universal cycles exist we can model the windows as edges rather than vertices. Let the vertices of \(A(S)\) be the \((n-2)\)-permutations of S (i.e., two symbols are absent from S) with an edge from \(u = {b_1} \cdot b_2 \cdots b_{n-2}\) to \(v = b_2 b_3 \cdots b_{n-1}\) with label \(b_{n-1}\) if \({b_1} \cdot b_2 \cdots b_{n-1} \in \varvec{Short}_{}(S)\). This type of graph is known as an arc digraph within the literature, and it leads to a second characterization.

Remark 3

Universal cycles with fixed-content S are in one-to-one correspondence with the concatenation of the edge labels along the Eulerian circuits in \(A(S)\).

Figure 2 shows \(A(S)\) for \(S = \{1,1,2,3\}\). The fact that \(A(S)\) is always Eulerian is a consequence of our new results; it also provides a nice exercise for active readers. Remark 3 was previously proven for permutations (i.e., \(k = n\)) by Jackson [27] and fixed-weight binary strings (i.e., \(k = 2\)) [35].

Fig. 2
figure 2

The arc digraph \(A(S)\) for \(S = \{1,1,2,3\}\). Our shorthand universal cycle 112312131132 starts at vertex 32 and then travels to 21 and 11

1.2.3 Rotator Graphs: Shift Gray Codes for (Multiset) Permutations

Recall that the vertex set of \(T'(S)\) is \(\varvec{Short}_{}(S)\). By replacing each vertex with its corresponding member of \(\varvec{Perm}(S)\) we obtain our final graph \(R(S)\). Figure 3 illustrates \(R(S)\) for \(S = \{1,1,2,3\}\).

The edges of \(R(S)\) can be understood as applying an operation known as a shift. Given string \(\alpha = b_1 \cdot b_2 \cdots b_n\), let \(\textstyle \mathsf {shift}_{[}(\alpha )]{i,j}\) (or simply \(\textstyle \mathsf {shift}_{i,j}(\))) be the result of moving \(b_i\) into the jth position. In the special case of \(i = 1\) (i.e., the first symbol is moved to the right) we let \(\sigma _{j}\) denote \(\textstyle \mathsf {shift}_{i,j}(\)). If \(u = {b_1} \cdot b_2 \cdots b_{n-1} {b_n} \in \varvec{Perm}(S)\), then \(R(S)\) contains two edges of the form uv where

  1. Edge 0:

    \(v = \textstyle \mathsf {shift}_{[}(u)]{1,n} = b_2 \cdots b_{n-1} \cdot {b_n} \cdot {b_1}\) with label \({\sigma _{n}}\), and

  2. Edge 1:

    \(v = \textstyle \mathsf {shift}_{[}(u)]{1,n-1} = b_2 \cdots b_{n-1} \cdot {b_1} \cdot {b_n}\) with label \({\sigma _{n-1}}\).

In other words, an edge shifts the first symbol into the last or second-last position. Thus, Hamilton cycles of \(R(S)\) provide a shift Gray code of \(\varvec{Perm}(S)\), meaning that each succesive string is obtained by a shift. The Gray codes are also cyclic since a shift transforms the last string into the first string.

Remark 4

Universal cycles with fixed-content S are in one-to-one correspondence with the Hamilton cycles of \(R(S)\). In turn, the Hamilton cycles are in one-to-one correspondence with cyclic Gray codes of \(\varvec{Perm}(S)\) in which each \(\alpha \in \varvec{Perm}(S)\) is followed by \(\textstyle \mathsf {shift}_{[}(\alpha )]{1,n}\) or \(\textstyle \mathsf {shift}_{[}(\alpha )]{1,n{-}1}\).

Fig. 3
figure 3

a The graph \(R(S)\) for \(S = \{1,1,2,3\}\). b Our Hamilton cycle is a shift Gray code of \(\varvec{Perm}(S)\). The graph \(R(S)\) for \(S = \{1,1,2,3\}\) in (a). The Hamilton cycle in \(R(S)\) starting from vertex 1321 corresponds to our universal cycle \(\mathcal {V}(S) = \mathcal {U}(S) = 112312131132\) in (b)

Our main results contribute to the literature on shift Gray codes for (multiset) permutations due to Remark 4. Corbett provided the first such result with a Hamilton cycle in the rotator graph whose vertices are permutations (i.e., \(\varvec{Perm}(S)\) with \(n=k\)) and whose edges apply \(\sigma _i\) for all \(2 \le i \le n\) [8]. The rotator graph is used as a multiprocessor network topology and Corbett’s cycle can be generated by the greedy Gray code algorithm [54]. It is possible to create a Hamilton cycle using only the following three operations [49]: \(\tau = \sigma _2\); \(\sigma _3\); \(\sigma = \sigma _n\). If edges in the opposite direction are also allowed, then \(\sigma \), \(\sigma ^{-1}\), and \(\tau \) are sufficient [7]. When n is odd, \(\sigma \) and \(\tau \) allow for Hamilton paths [43] and cycles [44], but only Hamilton paths are possible when n is even [33, 50]. When S is a multiset rather than a set, then Gray codes do not exist using \(\sigma \), \(\sigma ^{-1}\), and \(\tau \) when \(k=2\) [5]. However, a cyclic Gray code known as cool-lex order had been shown to exist using all \(\sigma _i\) [52]. Cool-lex order was first discovered for fixed-weight binary strings (i.e., \(\varvec{Perm}(S)\) with \(k=2\)) which are also known as combinations, and its name comes from its similarity to co-lexicographic order [36].

1.3 Necklaces and Necklace Cycles

A necklace is a lexicographically smallest string in an equivalence class of strings under rotation. Let \(\varvec{N}(S)\) be the set of necklaces with content S. For example, \(\varvec{N}(S) = \{1123, 1132, 1213\}\) for \(S = \{1,1,2,3\}\). A necklace cycle is a cyclic order of length n strings obtained by repeatedly applying \(\textstyle \mathsf {shift}_{1,n}(\)). In general, the length of a necklace cycle divides n. For example, the necklace cycle containing 132132 has length 3. A necklace cycle contains one necklace, which is its representative. For example, the thick straight edges in Fig. 3a create necklace cycles with representatives 1123, 1213, and 1132 from left-to-right. The aperiodic prefix of a string is its shortest prefix that can be repeated to create it. For example, the aperiodic prefix of 132132 is 132.

1.4 New Results

It is important to note that the graph-based models in Fig. 1.2 require exponential space with respect to n, and so they do not lead to efficient algorithms for generating a single universal cycle. With this point in mind, we now summarize our main results below.

  1. 1.

    The construction of a minimum-weight universal cycle \(\mathcal {V}(S)\) using cycle-joining in Sect. 2. More specifically, necklace cycles are joined together according to a first-inversion tree.

    • A successor rule that generates successive symbols of \(\mathcal {V}(S)\) in O(n)-time.

  2. 2.

    The construction of universal cycle \(\mathcal {U}(S)\) using a necklace concatenation approach in Sect. 4. More specifically, we concatenate \(\varvec{N}(S)\) in reverse cool-lex order (using aperiodic prefixes).

    • A new shift Gray code for fixed-content necklaces with an O(n)-amortized time algorithm.

    • The reversal of \(\mathcal {U}(S)\) can be generated in O(1)-amortized time per symbol using O(n) space.

  3. 3.

    A proof that the two constructions are equivalent in Sect. 5. That is, \(\mathcal {V}(S) = \mathcal {U}(S)\).

Our constructions are implemented in C and are provided in the Appendix. The output can be viewed at http://debruijnsequence.org [40]. Section 6 concludes with future work and open problems.

Prior to this article, universal cycles for shorthand permutations (where \(n=k\)) and shorthand fixed-weight binary strings (where \(k=2\)) were constructed and efficiently generated under slightly different names (see [24, 37] and [35]). Note that these previous works use lexicographically largest representatives for necklaces. Our use of lexicographically smallest representatives is more standard, but it requires an adjustment to the original definition of cool-lex order in Sect. 4. A preliminary version of this paper presented our necklace concatenation construction [45].

1.5 Granddaddy and Cool-Daddy

Besides our main results, we wish to suggest to the reader that our fixed-weight universal cycle is both natural and fundamental. As a point of comparison, we consider the most famous de Bruijn sequence.

The granddaddy de Bruijn sequence \(\mathcal {G}_{k}(n)\) (as coined by Knuth [29]) is the lexicographically smallest de Bruijn sequence for k-ary strings of length n. For example, \(\mathcal {G}_{2}(4) = 0000100110101111\) is the granddaddy for \(n=4\) and \(k=2\). The granddaddy can be constructed in several elegant ways.

  • The granddaddy is constructed by a simple prefer-smallest greedy algorithm (see [32]).

  • The granddaddy is constructed by a simple cycle joining approach. More specifically, necklace cycles are joined together according to a last non-k tree (see [18, 19]); here we use the term last-0 tree when referring to the binary case (i.e., \(k=2\)).

  • The granddaddy is constructed by a simple necklace concatenation approach. More specifically, necklaces are concatenated in lexicographic order (using aperiodic prefixes) (see [15, 16]).

To our knowledge, no simple greedy algorithm constructs our fixed-content universal cycle. However, our contributions match the latter two bullets quite closely, especially in the binary case.

In terms of cycle joining, when constructing the granddaddy \(\mathcal {G}_{2}(6)\), the necklace cycle represented by \(001\underline{0}11\) is joined to the necklace cycle represented by 001111 by complementing the last zero (as underlined). Similarly, when constructing our fixed-content universal cycle with \(S=\{1,1,2,2,3,3\}\), the necklace cycle represented by \(12\underline{31}23\) is joined to the necklace cycle represented by 121323 by swapping the first inversion (as underlined).

In terms of necklace concatenation, when \(n=4\) and \(k=3\), the granddaddy \(\mathcal {G}_{3}(4)\) is equal to

$$\begin{aligned}&0 \cdot 000001 \cdot 000011 \cdot 000101 \cdot 000111 \cdot 001 \cdot 001011 \cdot 001101\nonumber \\&\qquad \cdot 001111 \cdot 01 \cdot 010111 \cdot 011 \cdot 011111 \cdot 1 \end{aligned}$$
(1)

owing to the lexicographic order of binary necklaces of length 6 (and their aperiodic prefixes). Similarly, our fixed-content universal cycle \(\mathcal {U}(S)\) with content \(S=\{1,1,2,2,3,3\}\) is equal to

$$\begin{aligned}&112233 {\cdot }\! 122313 {\cdot }\! 123213 {\cdot }\! 122133 {\cdot }\! 121233 {\cdot }\! 112332 {\cdot }\! 123132 {\cdot }\! 132 {\cdot }\! 121332 {\cdot }\! 113322 \nonumber \\&\qquad {\cdot } \! 131322 {\cdot }\! 113232 {\cdot }\! 112323 {\cdot }\! 123 {\cdot }\! 121323 {\cdot }\! 113223 \end{aligned}$$
(2)

owing to the reverse cool-lex order of necklaces in \(\varvec{N}(S)\) (and their aperiodic prefixes).

These comparisons with the binary granddaddy are summarized in Fig. 4. Due to the similarities, we refer to our fixed-content universal cycle as the cool-daddy.

Fig. 4
figure 4

Comparing the binary granddaddy de Bruijn sequence \(\mathcal {G}_{2}(n)\) for n-bit binary strings, and our fixed-content universal cycle \(\mathcal {U}(S) = \mathcal {V}(S)\) with content S. See Fig. 5 and (1)–(2) for specific examples

2 Cycle Joining feat. the First-Inversion Spanning Tree

We begin this section by outlining the classic cycle-joining approach used to construct de Bruijn sequences and universal cycles. We then apply the approach to derive a simple successor rule for a fixed-content universal cycle. The rule is based on a string’s “first-inversion”, where an inversion in a string \(a_1\cdots a_n\) is a consecutive pair \(a_i, a_{i+1}\) where \(a_i > a_{i+1}\).

2.1 Cycle Joining

Call a string in \(\varvec{Short}_{}(S)\) a state. A function \(f:\varvec{Short}_{}(S) \rightarrow S\) is said to be a feedback function. Given such a feedback function f, let function \(F: \varvec{Short}_{}(S) \rightarrow \varvec{Short}_{}(S)\) map a state \(\beta =b_1b_2\cdots b_{n-1}\) to state \(b_2\cdots b_{n-1} f(\beta )\). A cycle is a sequence of distinct states \(\beta _1, \beta _2, \ldots , \beta _j\) such that \(F(\beta _i) = \beta _{i+1}\) for \(1\le i < j\) and \(F(\beta _j) = \beta _1\). Each cycle can be represented by a single string \(c_1\cdots c_j\) where \(c_i\) corresponds to the first symbol of \(\beta _i\).

Example 4

Consider content \(S= \{1,1,2,2,3,3\}\). Let \(f(\beta ) = z\), recalling \(\beta z\) has content S. That is, \(\beta \in \varvec{Short}_{}(S)\) and z is its missing symbol with respect to S. Then the cycles

$$\begin{aligned} \underline{1}2132, \underline{2}1323, \underline{1}3231, \underline{3}2312, \underline{2}3121, \underline{3}1213 \ \ \ \ \text{ and } \ \ \ \ \underline{1}2312, \underline{2}3123, \underline{3}1231 \end{aligned}$$

can be represented by the strings \(C_1 = 121323\) and \(C_2 = 123\), respectively.

Note that \(C_1\) is a universal cycle for \(\{12132, 21323, 13231, 32312, 23121, 31213\}\) and \(C_2\) is a universal cycle for \(\{12312, 23123, 31231\}\).

If \(\beta _1 = xb_2\cdots b_{n-1}\) and \(\beta _2 = yb_2\cdots b_{n-1}\) are both states where \(x\ne y\) then we say \(\beta _1\) and \(\beta _2\) form a conjugate pair. For each state \(\beta \) there is at most one other state that forms a conjugate pair with \(\beta \) because of the content restrictions. If \(C_1\) and \(C_2\) are disjoint cycles where \(C_1\) contains one state from a conjugate pair and \(C_2\) contains the other, then the two cycles can be “joined” together to form a single cycle by swapping the successors of the conjugate states. This “cycle-joining” process is well known, and formally stated in [13, Thm. 1] and [46, Lemma 3]. The process is a special case of Hierholzer’s cycle joining algorithm for producing Euler cycles [23].

Example 5

Consider the cycles \(C_1\) and \(C_2\) from Example 4 and the conjugate pair \(3\underline{2312}, 1\underline{2312}\); the state 32312 is in \(C_1\) and the state 12312 is in \(C_2\). By swapping the successors of these states we obtain a single cycle for the union of the states from \(C_1\) and \(C_2\):

$$\begin{aligned} 12132, 21323, 13231, 32312, {23123, 31231, 12312,} 23121, 31213 \end{aligned}$$

corresponding to \(C=1213{231}23\).

This cycle-joining process has been formalized in [18, 19] to produce a number of simple successor rules for de Bruijn sequences and universal cycles. Next we apply the cycle-joining approach to construct a universal cycle for \(\varvec{Short}_{}(S)\).

2.2 The First-Inversion Tree

Let \(\beta = b_1b_2 \cdots b_{n-1} \in \varvec{Short}_{}(S)\), where z denotes the missing symbol, and consider the following feedback function

$$\begin{aligned} f(\beta ) = z. \end{aligned}$$
(3)

This function can be used to partition \(\varvec{Short}_{}(S)\) into necklace cycles whose representatives are in \(\varvec{N}(S)\). More specifically, each \(b_1b_2 \cdots b_{n-1} \in \varvec{Short}_{}(S)\) with missing symbol z is followed by \(b_2 \cdots b_{n-1}z \in \varvec{Short}_{}(S)\) in a cycle. For example, the cycle associated with \(112233 \in \varvec{N}(S)\) includes the strings \(11223, 12233, 22331, 23311, 33112, 31122 \in \varvec{Short}_{}(S)\).

Let tail(S) denote the unique non-decreasing string composed of all the elements of S; it is a necklace. The cycles can be joined together into a spanning tree rooted at \({{\,\mathrm{tail}\,}}(S)\) where the parent of every necklace \(\alpha \), not including \({{\,\mathrm{tail}\,}}(S)\), is obtained by transposing the two symbols in the “first inversion” of \(\alpha \). We call the resulting tree the first-inversion treeFootnote 2. For example, see the first-inversion tree in Fig. 5b for content \(S=\{1,1,2,2,3,3\}\). It is easy to see that such a mapping always induces a tree since the parent of each necklace \(\alpha \) is lexicographically smaller than \(\alpha \). In fact, the paths from each node to the root resembles insertion sort (or more accurately, gnome sort [39])

Fig. 5
figure 5

a The last-0 tree for binary strings of length \(n=6\) rooted at the necklace 111111. Parents are obtained by complementing the last zero (underlined). b The first-inversion tree for content \(S = \{1,1,2,2,3,3\}\) rooted at the necklace 112233. Parents are obtained by swapping the first inversion (underlined). Necklace cycles are joined into tree-like structures in both the a (binary) grand-daddy de Bruijn sequence \(\mathcal {G}_{k}(n)\), and b our cool-daddy universal cycle \(\mathcal {V}(S)\). (The non-binary grand-daddy sequences (i.e., with \(k>2\)) are similar, but somewhat more complicated. Specifically, the necklace cycles do not group into conjugate pairs, but rather groups of size k. Thus, the actual cycle joining is more complicated, even though a similar tree-like structure is present.). In both figures, the underlines illustrate the parent rule, and each necklace cycle is denoted by the aperiodic prefix of its representative. For example, in (a) the necklace cycle containing \(0101\underline{0}1\) is denoted \(\underline{0}1\), and in (b) the necklace cycle \(12\underline{31}23\) is denoted \(\underline{1}2\underline{3}\)

2.3 A Simple Successor Rule

When the symbols involved in the first inversion of a necklace are rotated to the front of both the necklace and its parent, then the length \(n-1\) suffixes correspond to conjugate pairs whose successors are swapped during the cycle joining process. For example, given the necklace \(113\underline{32}2\) and its parent \(113\underline{23}2\), the conjugate pairs are 22113 and 32113. Let \(\varvec{X}(S)\) denote the set of all \(2|\mathbf {N}(S)-1|\) states from such conjugate pairs. Repeated application of the cycle-joining approach yields a universal cycle for \(\varvec{Short}_{}(S)\) with the following successor-rule:

$$\begin{aligned} g(\beta ) = \left\{ \begin{array}{ll} b_1 &{} \text{ if } \beta \in \varvec{X}(S)\\ z &{} \text{ otherwise. }\end{array} \right. \end{aligned}$$

Let \(\mathcal {V}(S)\) denote the universal cycle for \(\varvec{Short}_{}(S)\) obtained by starting with \({{\,\mathrm{tail}\,}}(S)\) and repeatedly applying g on the last \(n-1\) symbols. In other words, \(g(\beta )\) is the symbol following \(\beta \) in the universal cycle \(\mathcal {V}(S)\). For example, \(\mathcal {V}(\{1,1,2,2,3,3\})\) =

$$\begin{aligned}&11223312231312321312213312123311233212313213212133211\\&3322131322113232112323123121323113223. \end{aligned}$$

Testing whether or not a state \(\beta \) is in \(\varvec{X}(S)\) can be done in O(n) time as follows. Let \(h(\beta )\) be the rotation of \(\beta \) that takes the longest non-decreasing suffix of \(b_3\cdots b_n\) and rotates it to the front of \(\beta \). For example \(h(12332{1233}) = {1233}12332\).

Lemma 5

Let \(\beta = b_1b_2\cdots b_{n-1}\) be a state with missing symbol z. Then \(\beta \in \varvec{X}(S)\) if and only if

  • \(b_1 > z\) and \(h({b_1z}b_2\cdots b_{n-1})\) is a necklace and \(b_1 \ge b_{n-1}\), or

  • \(z > b_1\) and \(h({zb_1}b_2\cdots b_{n-1})\) is a necklace and \(z \ge b_{n-1}\).

Proof

Let xy denote the elements z and \(b_1\) listed in decreasing order. For \(\beta \) to be in \(\varvec{X}(S)\), the conjugate pair \(zb_1b_2\cdots b_{n-1}\) and \(b_1zb_2\cdots b_{n-1}\) must satisfy the properties that \(x\ne y\) and xy is the first inversion in the necklace representative for \(xyb_2\cdots b_{n-1}\). Since xy is the first inversion, \(h(xyb_2\cdots b_{n-1})\) must be a necklace and \(x \ge b_{n-1}\). \(\square \)

Together, Lemma 5 and Lemma 9 (in Sect. 4.2) imply the following result.

Corollary 6

If \(\beta =b_1\cdots b_{n-1}\) is in \(\varvec{X}(S)\) then both \(h(z\beta )\) and \(h(b_1zb_2\cdots b_{n-1})\) are necklaces.

Example 6

If \(\beta = b_1\cdots b_5 = 32113\) for content \(S = \{1,1,2,2,3,3\}\), then \(z=2\) and \(b_1=3\). Since \(b_1 > z\), \(h({32}2{113}) = {113}322\) is a necklace, and \(b_1\ge b_5\), the state belongs to \(\varvec{X}(S)\).

Based on Lemma 5, the following is a successor rule to construct \(\mathcal {V}(S)\). An illustration is in Fig. 6.

Successor rule for fixed-content universal cycle \(\mathcal {V}(S)\)

Let \(\beta = b_1b_2\cdots b_{n-1}\) be a state where z is the missing symbol. Then

figure a

Since testing if a string is a necklace can be done in O(n) time [2] we obtain the following theorem. The minimum-weight property is a direct consequence of using the feedback function in (3) since the spanning tree joins the necklace cycles (which use weight 0 edges) using as few weight 1 edges as possible.

Theorem 7

\(\mathcal {V}(S)\) is a minimum-weight universal cycle for \(\varvec{Short}_{}(S)\) that can be constructed in O(n) time per symbol.

Fig. 6
figure 6

The fixed-content universal cycle \(\mathcal {V}(S)\) for \(S = \{1,1,2,2,3,3\}\) in column-major order, as generated by the successor rule in (4). More specifically, the columns labeled \(\beta \) show successive states of the universal cycle, and any single column (e.g., \(b_1\) or \(b_5\)) provides the universal cycle. Each state is a member of \(\varvec{Short}_{}(S)\) and hence is shorthand for a member of \(\varvec{Perm}(S)\). The columns labeled case provide the symbol following \(\beta \) in the universal cycle; the next symbol is \(\beta \)’s first symbol \(b_1\) when (4a) or (4b) applies, and is \(\beta \)’s missing symbol \(z = b_6\) when (4a) applies. For example, the first state \(\beta = 11223\) has case (4c) applied. In other words, \(\beta z = 112233\) appears in the universal cycle (as opposed to \(\beta b_1 = 112231\)). As a check, note that this next symbol 3 is the final symbol in the next state 12233. When (4b) applies, the \(\varvec{N}(S)\) column provides the necklace \(h(z b_1 b_2 \cdots b_{n-1})\), and these cases correspond to going downward in the first-inversion tree (see Fig. 5b). Similarly, when (4c) applies, the \(\varvec{N}(S)\) column provides the necklace \(h(b_1 z b_2 \cdots b_{n-1})\), and these cases correspond to going upward in the first-inversion tree. In both of these cases, the necklace comes from the child of the parent–child conjugate pair. For example, 121233 appears twice in the \(\varvec{N}(S)\) column, first going downward from the root 112233 via state 22331, and then going upward to the root 112233 via state 12331 (where 22331 and 12331 are conjugates of each other). The Gray code for \(\varvec{Perm}(S)\) using two operations, as generated by (5), can be obtained from this figure as follows: Add the missing symbol to each \(\beta \), and map the cases (4a)–(4b) to (5a), and (4c) to (5b). For example, \(112233 \in \varvec{Perm}(S)\) is followed by \(122331 \in \varvec{Perm}(S)\) using \(\textstyle \mathsf {shift}_{1,n}(=) \textstyle \mathsf {shift}_{1,6}(\)) in this order because (4c) maps to (5b). Similarly, a minimum-weight binary representation is obtained by using 1 for each (4a) or (4b), and 0 for each (4c)

3 A Shift Gray Code for Multiset Permutations

An immediate consequence of the successor rule described in the previous section is a shift Gray code for \(\varvec{Perm}(S)\) whose successor-rule, defined as follows, shifts the first symbol to either the nth or \((n{-}1)\)st position in the string.

Successor rule for \(\varvec{Perm}(S)\) using two operations

Let \(\alpha = a_1 a_2\cdots a_{n} \in \varvec{Perm}(S)\). Then

figure b

Theorem 8

Starting with any initial string \(\alpha \) in \(\varvec{Perm}(S)\) and repeatedly applying the function \(\textstyle \mathsf {nextMultiPerm}(\alpha )\) a total of \(|\varvec{Perm}(S)-1|\) times produces a cyclic shift Gray code for \(\varvec{Perm}(S)\). Moreover, the Gray code can be generated in O(n)-time per string.

Example 7

Let \(S = \{1,1,2,3\}\) and consider \(\mathcal {V}(S) = 112312131132\). It corresponds to the following shift Gray code for \(\varvec{Perm}(S)\), with the index where the first symbol is shifted to obtain the next string in the listing given in the final column:

Short hand

Shift gray code

Shift index

112

1123

(4)

123

1231

(4)

231

2311

(3)

312

3121

(4)

121

1213

(4)

213

2131

(4)

131

1312

(3)

311

3112

(3)

113

1132

(4)

132

1321

(4)

321

3211

(4)

211

2113

(3)

Note that the first symbol is shifted in to the \(n{-}1\)st position exactly \(2|\varvec{N}(S)-1|\) times.

4 Necklace Concatenations feat: Cool-lex Order

In this section we start by providing a brief background of fixed-content necklaces. We introduce cool-lex order for \(\varvec{Perm}(S)\) and provide a successor rule to produce the corresponding listing; a special focus is given to \(\mathbf {N}(S)\). We then describe a known necklace concatenation approach that is applied to construct a universal cycle for \(\varvec{Short}_{}(S)\). We conclude by providing a recursive description of cool-lex and use it to efficiently list \(\varvec{N}(S)\) in cool-lex order.

4.1 Necklaces with Fixed-Content

Let \(\alpha = a_1a_2\cdots a_n\) be a string. Let \(\alpha ^t\) denote the string composed of t copies of \(\alpha \). The period of \(\alpha \) is the smallest value p such that \(\alpha = (a_1\cdots a_p)^t\) for some integer t. Let \({{\,\mathrm{ap}\,}}(\alpha ) = a_1\cdots a_p\) where p is the period of \(\alpha \); we say \({{\,\mathrm{ap}\,}}(\alpha )\) is the aperiodic prefix of \(\alpha \). If \(\alpha \) has period n we say it is aperiodic; otherwise we say it is periodic.

The number of fixed-content necklaces in \(\varvec{N}(S)\) can be deduced using Pólya theory as discussed in [21]. In the following formula, it is assumed that the content S is composed of \(n_i \ge 1\) occurrences of each symbol i, \(|S| = n\), and \(k \ge 1\):

$$\begin{aligned} |\varvec{N}(S)|= & {} \frac{1}{n} \sum _{j | gcd(n_1,n_2,\ldots ,n_{k})}^{} \phi (j) \frac{(n/j)!}{(n_1/j)!\cdots (n_{k}/j)!} \end{aligned}$$
(6)

where Euler’s totient function \(\phi (j)\) denotes the number of positive integers less than or equal to j that are relatively prime to j.

There exists a O(1)-amortized time algorithm to list \(\varvec{N}(S)\) in reverse lexicographic order [41].

4.2 Cool-lex Order

Cool-lex order for fixed-content strings was introduced in [52]. In the binary case, when \(k=2\), cool-lex order has been well-studied under two natural equivalences [34, 35, 42]. When extending cool-lex to fixed-content strings where \(k>2\), there are a number of equivalent ways to present the ordering. The presentation we give here differs from the original presentation in [52]. In particular, we consider the longest non-decreasing prefix instead of the longest non-increasing prefix of the strings in question. Cool-lex order for \(\varvec{Perm}(S)\) is a shift Gray code, where successive strings differ by the shift of a single symbol. If \(\alpha =a_1a_2\cdots a_n \in \varvec{Perm}(S)\), then recall that

$$\begin{aligned} \textstyle \mathsf {shift}_{[}(\alpha )]{t,s} = a_1\cdots a_{s-1}{a_ta_sa_{s+1}\cdots a_{t-1}}a_{t+1}\cdots a_n, \end{aligned}$$

denotes the operation that shifts \(a_t\) into position s. This operation can be implemented in constant time by using a doubly-linked list data structure, so long as pointers to the symbol and position are provided. Cool-lex order provides a prefix-shift Gray code for \(\varvec{Perm}(S)\) meaning that successive strings differ by a single prefix-shift corresponding to the operation \(\textstyle \mathsf {shift}_{[}(\alpha )]{t,1}\). Moreover, the next value of t can be updated in constant time after each shift. The ordering is also cyclic because a prefix-shift transforms the last string in the order into the first. As an example, the set \(\varvec{Perm}(\{1,1,2,2,3,3\})\) is listed in cool-lex order on the left side of Fig. 7.

One of the most notable features of cool-lex order is that it has a simple successor rule; the prefix-shift that creates the next fixed-content string in the order can be specified by the following rule.

Cool-lex Successor Rule for Fixed-Content Strings

Let \(\alpha = a_1a_2\cdots a_n \in \varvec{Perm}(S)\) and let j denote the length of \(\alpha \)’s longest non-decreasing prefix. The string following \(\alpha \) in cool-lex order, denoted \(\textstyle \mathsf {next}(\alpha )\), is obtained from \(\alpha \) by the prefix-shift in the following cumulative cases

figure c
Fig. 7
figure 7

Cool-lex order for the strings with content \(S = \{1,1,2,2,3,3\}\) (i.e., \(\varvec{Perm}(S)\)) appear to the left of the vertical line in column-major order, as generated by the successor rule in (7). Observe that each of these strings is obtained from the previous by a prefix-shift (i.e., \(\textstyle \mathsf {shift}_{i,1}(\)) for some \(i > 1\)). For example, the third string \(113\underline{2}23\) is transformed into the fourth string \(\underline{2}11323\) by moving the underlined symbol to the left into the first position (or equivalently by rotating the prefix \(113\underline{2}\) one position to the right to obtain \(\underline{2}113\)). The order is also cyclic in this regard, since the last string is transformed into the first by a prefix-shift. The column to the right of the vertical line illustrates the necklaces with content S (i.e., \(\varvec{N}(S)\)) as they appear in cool-lex order. Observe that each necklace is obtained from the previous by a shift given by (8). For example, the first necklace \(113\underline{2}23\) is transformed into the second necklace \(1\underline{2}1323\) by moving the underlined symbol two positions to the left (or equivalently by rotating the substring \(13\underline{2}\) one position to the right to obtain \(\underline{2}13\)). The order is again cyclic in this regard, since the last string is transformed into the first by a shift. The fixed-content universal cycle \(\mathcal {U}(S)\) is obtained by reversing the order of these necklaces, and concatenating their aperiodic prefixes, as shown in Fig. 9. In particular, note that 112233 is last here, and first in Fig. 9

Another benefit of cool-lex order is that its relative order provides shift Gray codes for numerous interesting subsets of \(\varvec{Perm}(S)\). This phenomenon was discussed in the binary case in [34], and more generally in [53]. In particular, this occurs for necklaces, as illustrated in Fig. 7 for \(S = \{1,1,2,2,3,3\}\). By adapting the techniques from [34, 53], we obtain the following successor rule for fixed-content necklaces.Footnote 3

Cool-lex Successor Rule for Fixed-Content Necklaces

Let \(\alpha = a_1a_2\cdots a_n \in \varvec{N}(S)\) and let j denote the length of \(\alpha \)’s longest non-decreasing prefix. When \(j<n-1\), let \(\alpha ' = a_1\cdots a_{j}{a_{j+2}a_{j+1}}a_{j+3}\cdots a_n\); it is \(\alpha \) with \(a_{j+1}\) and \(a_{j+2}\) transposed. The necklace following \(\alpha \) in cool-lex order, denoted \(\textstyle \mathsf {next}(\alpha )\), is obtained from \(\alpha \) by the shift in the following cumulative cases

figure d

where s is the smallest index such that the result of shifting the specified element yields a necklace.

Example 8

Consider the necklace \(\alpha = a_1 a_2 a_3 a_4 a_5 a_6 = 122133\) with longest non-decreasing prefix of length \(j=3\). Since \(a_3 < a_5\) and 122313 is a necklace, \(\textstyle \mathsf {next}(\alpha ) = \textstyle \mathsf {lshift}_{\alpha }(j+2)\). We can determine the result of \(\textstyle \mathsf {lshift}_{\alpha }(5)\) by bubbling the symbol \(a_5 = 3\) to the left, starting from \(\alpha \), as follows:

$$\begin{aligned} a_1 a_2 a_3 a_4 {a_5} a_6= & {} 122133 \text {is a necklace;} \\ a_1 a_2 a_3 {a_5} a_4 a_6= & {} 122313 \text {is a necklace;}\\ a_1 a_2 {a_5} a_3 a_4 a_6= & {} 123213 \text {is a necklace;} \\ a_1 {a_5} a_2 a_3 a_4 a_6= & {} 132213 \text {is {not} a necklace;}\\ {a_5} a_1 a_2 a_3 a_4 a_6 = 312213 \text {is {not} a necklace}. \end{aligned}$$

The result is the last necklace in this list. Hence, \(\textstyle \mathsf {lshift}_{\alpha }(5) = a_1 a_2 a_5 a_3 a_4 a_6 = 123213\).

Observe from the previous example, that the necklaces that result from shifting the specified symbol to the left are all contiguous. This property is formalized in the upcoming Lemma 10. Its proof requires the following technical result.

Lemma 9

If \(a_1a_2\cdots a_n\) is a necklace that contains a smallest index t such that \(a_t > a_{t+1}\), then \(a_1\cdots a_{t-1}{a_{t+1}a_t}a_{t+2}\cdots a_n\) is a necklace.

Proof

Let \(\beta = b_1b_2\cdots b_n = a_1\cdots a_{t-1}{a_{t+1}a_t}a_{t+2}\cdots a_n\). Let \(\beta _j\) denote the rotation of \(\beta \) starting at \(b_j\) and let \(\alpha _j\) denote the rotation of \(\alpha = a_1a_2\cdots a_n\) starting at \(a_j\). If \(\beta \le \beta _j\) for each \(2 \le j \le n\), then \(\beta \) is a necklace. Since \(\alpha \) is a necklace, each \(a_i \ge a_1\) and thus each \(b_i \ge b_1\). Since \(b_1\cdots b_{t-1}\) is non-decreasing it is straightforward to observe that \(\beta _j > \beta \) for \(2 \le j \le {t+1}\). Now consider the prefix of length t for \(\beta _j\) where \(t+2 \le j \le n\). This prefix is the same as the length t prefix of \(\alpha _j\). If this prefix is less than or equal to \(b_1\cdots b_{t}\), then it must be strictly less than \(a_1\cdots a_t\) since \(a_t > b_t\). But this contradicts the fact that \(\alpha \) is a necklace. Thus this prefix must be strictly greater than \(b_1\cdots b_{t}\). Thus \(\beta _j \ge \beta \) for each \(2 \le j \le n\) and hence \(\beta \) is a necklace. \(\square \)

Given \(\alpha = a_1a_2\cdots a_n \in \varvec{N}(S)\), then from (8), let t correspond to the index of the element shifted to position s so

$$\begin{aligned} \textstyle \mathsf {next}(\alpha ) = a_1\cdots a_{s-1}{a_ta_s\cdots a_{t-1}}a_{t+1}\cdots a_n. \end{aligned}$$

Using this notation, the following result illustrates the “bubble property” that necklaces have with respect to cool-lex order.

Lemma 10

For \(s \le i < t\), \(a_1\cdots a_{i-1}{a_ta_i\cdots a_{t-1}}a_{t+1}\cdots a_n\) is a necklace.

Proof

When \(i=s\), \(\textstyle \mathsf {next}(\alpha ) = a_1\cdots a_{s-1}{a_ta_s\cdots a_{t-1}}a_{t+1}\cdots a_n\) is a necklace by definition. For \(i>s\), we step through the cases of (8) and the possible values of t, recalling that j is the length of the longest non-decreasing prefix of \(\alpha \). If \(|\varvec{N}(S)|=1\), then the necklace is either \(1^n\) or \(1^{n-1}2\), and the result clearly holds as part of case (8a). Otherwise there are at least two symbols that are not 1. Case (8a) implies that \(\alpha \) is non-decreasing and \(t=n\). From the definition of a necklace, it is easy to observe that \(s=\lceil n_1/2\rceil +1\), where \(n_1\) is the number of occurrences of the symbol 1. Moreover, \(\textstyle \mathsf {shift}_{[}(\alpha )]{n,i}\) is a necklace for all \(s\le i < n\). Otherwise, if \(t=j+1\) (from case (8b)) then by Lemma 9, \(\textstyle \mathsf {shift}_{[}(\alpha )]{t,t-1}\) is a necklace; if \(t=j+2\) (from case (8c)) then the successor-rule itself requires \(\textstyle \mathsf {shift}_{[}(\alpha )]{t,t-1}\) is a necklace. Now, suppose there exists some r, where \(s< r <t-1\), such that \(\sigma = d_1\cdots d_n= \textstyle \mathsf {shift}_{[}(\alpha )]{t,r} = a_1\cdots a_{r-1}{a_ta_r\cdots a_{t-1}}a_{t+1}\cdots a_n\) is not a necklace. Since \(a_1\cdots a_{t-2}\) is non-decreasing, Lemma 9 implies that \(d_1\cdots d_r\) is non-decreasing. Thus, by the definition of a necklace, there is some suffix \(\sigma '\) of \(d_{r+1}\cdots d_n\) that is less than or equal to \(d_1\cdots d_r\) (namely, a prefix of a rotation that is a necklace). If \(a_t > d_s\), then since \(\textstyle \mathsf {next}(\alpha )\) also has suffix \(\sigma '\), \(a_1\cdots a_{s-1}a_t\) is greater than \(\sigma '\), contradicting the fact that \(\textstyle \mathsf {next}(\alpha )\) is a necklace. Otherwise it must be that each element in \(a_s\cdots a_{r-1}\) is \(a_t\) and hence \(\sigma = \textstyle \mathsf {next}(\alpha )\) which we already stated is a necklace, a contradiction. \(\square \)

Since O(n) time is sufficient for testing whether or not a string is a necklace [2], the necklace successor rule runs in \(O(n^2)\) time. In Sect. 4.4, a recursive description of cool-lex order is provided. When focusing on necklaces, an optimization allows \(\varvec{N}(S)\) to be listed in cool-lex order in O(n)-amortized time per necklace.

4.3 Necklace Concatenation Construction

As mentioned in Sect. 1.5, the most well-known de Bruijn sequence is the so-called grand-daddy de Bruijn sequence; it is the lexicographically smallest k-ary de Bruijn sequence of order n. It can be generated very elegantly using an approach that is often referred to as the FKM construction or FKM algorithm, due to its discoverers [15, 16]. As discussed in [35], the authors of this article prefer to describe the construction using a slightly different approach called the necklace-prefix algorithm. The difference between the algorithms is that the former uses an ordering of Lyndon words (i.e., aperiodic necklaces) whose length divides n, while the latter uses an order of necklaces that is then reduced to the same set of Lyndon words.Footnote 4 For example, the FKM algorithm is based on column (b) of Fig. 8, while the necklace-prefix algorithm uses column (a). When using lexicographic order, the resulting constructions are identical, but this is not true for other orders. For example, the two concatenation schemes give different results when using co-lexicographic order (which orders strings from right-to-left), with the necklace-prefix algorithm creating the grandmama de Bruijn sequence [12] and the FKM algorithm not working. Similarly, we use the necklace-prefix algorithm when working with cool-lex order, since cool-lex order is only defined for strings of the same length.

Formally, the necklace-prefix algorithm takes an order of strings, filters out the non-necklaces, reduces the remaining necklaces to their aperiodic prefix, and concatenates the prefixes. Amazingly, the granddaddy de Bruijn sequence is created by applying the necklace-prefix algorithm to the k-ary strings of length n in lexicographic order. This is illustrated in Fig. 8 for \(n=6\) and \(k=2\). The approach has been generalized to other sets in [47].

Fig. 8
figure 8

The grand-daddy de Bruijn sequence \(\mathcal {G}_{k}(n)\) for \(n=6\) and \(k=2\) is constructed by the necklace-prefix algorithm applied to the binary strings of length 6. The algorithm starts with the lexicographic order of binary strings of length 6 (which are not shown), then reduces the order to the necklaces in column (a), and their aperiodic prefixes in column (b), and concatenates these prefixes to get the grand-daddy de Bruijn sequence in (c). The FKM algorithm yields the same concatenation directly from column (b), since it contains the Lyndon words of length 1, 2, and 3 (i.e., the divisors of \(n=6\)) in lexicographic order

Fig. 9
figure 9

Our cool-daddy fixed-content universal cycle \(\mathcal {U}(S)\) for content \(S = \{1,1,2,2,3,3\}\). The cycle uses the shorthand representation, and is constructed using the necklace-prefix algorithm on reverse cool-lex order. The fixed-content necklaces over S are given in reverse cool-lex order in column (a), they are reduced to their aperiodic prefix in column (b), and their concatenation gives the universal cycle in column (c)

Unfortunately, the magic runs out when we consider fixed-content strings, even in their shorthand representatives. As an illustration, note that the lexicographic order of necklaces with content \(S = \{1,1,2,2,3,3\}\) places the following necklaces consecutively,

$$\begin{aligned} \ldots 113322, 121233, \ldots , \end{aligned}$$

and so, the necklace-prefix algorithm generates \(\cdots 1133{22121}233 \cdots \). The bold substring of length \(n{-}1=5\) is not shorthand for a string with the content S because it has too many 2’s. The cause of the issue is also clear: The leftmost 2 moves several positions to the left from \(1133\underline{2}2\) to \(1\underline{2}1233\). This issue leads us to instead use reverse cool-lex order, since this will ensure that individual symbols move at most one position to the left between successive necklaces.

Let \(\alpha _1, \alpha _2, \ldots , \alpha _m\) denote the necklaces \(\varvec{N}(S)\) listed in reverse cool-lex order. Amazingly, by applying the necklace-prefix algorithm outlined with respect to cool-lex order, we obtain a universal cycle for \(\varvec{Short}_{}(S)\). Let

$$\mathcal {U}(S) = {{\,\mathrm{ap}\,}}(\alpha _1){{\,\mathrm{ap}\,}}(\alpha _2)\cdots {{\,\mathrm{ap}\,}}(\alpha _m).$$

An example of \(\mathcal {U}(S)\) is provided in Fig. 9 for \(S = \{1,1,2,2,3,3\}\).

Theorem 11

\(\mathcal {U}(S)\) is universal cycle for \(\varvec{Short}_{}(S)\). Moreover, \(\mathcal {U}(S) = \mathcal {V}(S)\).

This concatenation construction does not attain the sufficient conditions provided in [17] for when concatenating smaller cycles yields a larger universal cycle. Instead, this theorem is proved in Sect. 5 by demonstrating that the symbol following a given length \(n-1\) substring \(\beta \) in \(\mathcal {U}(S)\) is given by the successor-rule \(g(\beta )\) used to generate \(\mathcal {V}(S)\). By applying an efficient algorithm to list \(\varvec{N}(S)\) in cool-lex order, as presented in the next section, we obtain the following result.

Theorem 12

The reversal of \(\mathcal {U}(S)\) can be generated in O(1)-amortized time per symbol using O(n) space.

4.4 Recursive Generation of Fixed-Content Necklaces in Cool-lex

In [52], a recursive description is given to list all strings with fixed-content S in cool-lex order. In that description, the focus is on strings in reverse lexicographic order, whereas, we will focus on lexicographic order. In this section, we restate this recurrence using the original terminology and then apply it to generate the necklaces \(\varvec{N}(S)\) in cool-lex order.

Recall \({{\,\mathrm{tail}\,}}(S)\) denotes the unique non-decreasing string (and necklace) with content S. A scutFootnote 5 of S is any non-decreasing string \(\alpha \) composed of some of the elements of S such that \(\alpha \) is not a suffix of tail(S), but every proper suffix of \(\alpha \) is a suffix of tail(S). Let \(\alpha _i(S)\) (or simply, \(\alpha _i\)) denote the i-th scut of S when the scuts are listed in decreasing order of the first symbol, then by decreasing length. Let \(R_i\) denote the multiset S with the content of \(\alpha _i(S)\) removed.

Example 9

Consider \(S = \{1,1,2,2,3,3\}\). Then \(tail(S) = 112233\) and the scuts of S in decreasing order of the first symbol, then decreasing length, are:

$$\begin{aligned} 23, \ 2, \ 1233, \ 133, \ 13, \ 1. \end{aligned}$$

Note \(\alpha _4(S) = 133\) and \(R_4 = \{1,1,2,2,3,3\} {\setminus } \{1,3,3\} = \{1,2,2\}\).

If S is a multiset with j scuts, then the following recurrence \(\varvec{\mathcal {C}}(S, \gamma )\) (simplified from Definition 2.4 in [52]) produces a listing for all strings of the form \(\beta \gamma \) where \(\beta \) has content S as they appear in cool-lex order:

$$\begin{aligned} \varvec{\mathcal {C}}(S, \gamma ) = \varvec{\mathcal {C}}(R_1, \alpha _1 \gamma ), \varvec{\mathcal {C}}(R_2, \alpha _2 \gamma ), \ldots , \varvec{\mathcal {C}}(R_j, \alpha _j\gamma ), tail(S)\gamma . \end{aligned}$$

Note that \(\varvec{\mathcal {C}}(S,\epsilon )\) will produce a listing of all strings with fixed-content S. Recall Fig. 7 illustrating the cool-lex order for \(\varvec{Perm}(\{1,1,2,2,3,3\})\). This is the same listing generated by \(\varvec{\mathcal {C}}(\{1,1,2,2,3,3\},\epsilon )\). In particular observe that the strings are ordered by suffixes corresponding to the scuts: 23, 2, 1233, 133, 13, 1.

We now focus on how to modify this recurrence to list \(\varvec{N}(S)\) as they appear in cool-lex order.

Lemma 13

\(\varvec{\mathcal {C}}(S, \gamma )\) contains a necklace if and only if \(tail(S)\gamma \) is a necklace.

Proof

(\(\Leftarrow \)) \(tail(S)\gamma \) is in \(\varvec{\mathcal {C}}(S,\gamma )\) by definition. Thus if \(tail(S)\gamma \) is a necklace then \(\varvec{\mathcal {C}}(S, \gamma )\) contains a necklace. (\(\Rightarrow \)) If \(\varvec{\mathcal {C}}(S, \gamma )\) contains necklace then it must be of the form \(\lambda \gamma \) where \(\lambda \) has content S. If \(\lambda = tail(S)\), then we are done. Otherwise, repeated application of Lemma 9 implies that \(tail(S)\gamma \) is a necklace. \(\square \)

Based on Lemma 13, the recurrence \(\varvec{\mathcal {C}}(S, \gamma )\) can be updated to list only the necklaces as follows (where \(\langle \ \rangle \) denotes an empty list).

$$\begin{aligned} \varvec{\mathcal {N}}(S, \gamma ) = \left\{ \begin{array}{ll} \langle \ \rangle &{} \ \ \ \ \text{ if } tail(S)\gamma \text {is not a necklace;} \\ \varvec{\mathcal {N}}(R_1, \alpha _1 \gamma ), \ldots , \varvec{\mathcal {N}}(R_j, \alpha _j\gamma ), tail(S)\gamma &{} \ \ \ \ \text{ otherwise, } \end{array} \right. \end{aligned}$$

Note that \(\varvec{\mathcal {N}}(S, \epsilon )\) will produce a listing of all necklaces with fixed-content S as they appear in \(\varvec{\mathcal {C}}(S,\epsilon )\). Recall from (8), the corresponding successor-rule which implies that successive necklaces in this ordering differ by a shift.

The function Cool(t) in Algorithm 1 implements the recurrence for \(\varvec{\mathcal {N}}(S, \gamma )\). Given content S, by initializing the global string \(a_1a_2\cdots a_n\) to tail(S), the initial call Cool(n) generates the necklaces \(\varvec{N}(S)\) in cool-lex order. The parameter t passed in the function Cool(t) indicates how the string \(a_1a_2\cdots a_n\) is partitioned into the two pieces based on \(\varvec{\mathcal {N}}(S', \gamma )\): \(a_1a_2\cdots a_t = tail(S')\) and \(a_{t+1} \cdots a_n = \gamma \). Each call Cool(t) corresponding to \(\varvec{\mathcal {N}}(S', \gamma )\) iterates through the scuts of \(S'\) in the proper order. This is done by scanning \(tail(S') = a_1 \cdots a_t\) from right to left until we reach an index i where \(a_i \ne a_{i-1}\) (Line 4). To produce all scuts starting with \(a_{i-1}\), and their corresponding recursive calls if a necklace can be produced, we iteratively shift this symbol through positions \(i, i+1, \ldots , t\) obtaining a new scut for each swap (Lines 5-7). Once all scuts starting with \(a_{i-1}\) have been processed we restore \(a_1\cdots a_t\) to \(tail(S')\) (Line 8). We repeat this approach by continuing to traverse tail(S) from right to left until we reach a symbol that is the same as \(a_1\) (Line 3). The function Visit() outputs the string \(a_1a_2\cdots a_n\), and the function Swap(ij) swaps the symbols at index i and j in \(a_1a_2\cdots a_n\).

figure e

When analyzing this algorithm, if every string tested in Line 7 was a necklace, then the work done by each necklace test can be assigned to the following recursive call. Since each recursive call generates at least one necklace, and since the necklace testing can be done in O(n)-time [2], the overall algorithm runs in O(n)-amortized time per necklace. However, within each recursive call, there can be a number of negative necklace tests. For instance, consider the string \(\alpha = 112233{112233}\) and the call to Cool(6). This results in necklace tests for the following 6 strings, none of which are necklaces since the rotation starting with the suffix 112233 is smaller than string in question:

$$\begin{aligned}&112323{112233}, 112332{112233}, 121233{112233}, \\&122133{112233}, 122313{112233}, 122331{112233}. \end{aligned}$$

Fortunately there exists a simple optimization: once a string tested on Line 7 is not a necklace, then by Lemma 10 none of the following strings tested will be either. This optimization can be applied to Cool(t) by replacing Line 7 with the following fragment:

figure f

This optimization ensures that at most one necklace test is negative per recursive call.

Theorem 14

If \(a_1a_2\cdots a_n\) is initialized to tail(S), then a call to the optimized Cool(n) lists the necklaces \(\varvec{N}(S)\) in cool-lex order in O(n)-amortized time per string.

In the binary case when \(k=2\), \(\varvec{N}(S)\) can be generated in O(1)-amortized time [42].

4.4.1 Application: Efficient Construction of \(\mathcal {U}(S)\) in Reverse

To construct the reverse of the universal cycle \(\mathcal {U}(S)\), which itself is a fixed-content universal cycle over S, we can directly apply the optimized Algorithm 1 to list \(\varvec{N}(S)\) in cool-lex order with a simple modification. Instead of outputting the current necklace \(\alpha =a_1a_2\cdots a_n\), the function Visit( )

  • \(\triangleright \) determines the period p of \(\alpha \) and then

  • \(\triangleright \) outputs \(a_pa_{p-1}\cdots a_1\).

Since the aperiodic prefix of \(\alpha \) can be determined in O(n) (see [2]), the modified algorithm still runs in O(n)-amortized time per necklace. Since the total length of \(\mathcal {U}(S)\) is proportional to \(n|\varvec{N}(S)|\) (see Section 5 in [41] which implies \(|\mathcal {U}(S)| \ge n|\varvec{N}(S)|/2\)) we obtain the result previously stated in Theorem 12.

5 Proof of Theorem 11

If \(|\varvec{N}(S)| = 1\), then the content of S is either all 1 s, or all 1 s and a single 2; the necklaces are \(1^n\) and \(1^{n-1}2\), respectively. In these cases we clearly have \(\mathcal {U}(S) = \mathcal {V}(S)\). Otherwise, let \(\alpha _1,\alpha _2, \ldots , \alpha _m\) denote the necklaces of \(\varvec{N}(S)\) listed in reverse cool-lex order, allowing \(\alpha _{m+1}=\alpha _1\). Recall that

$$\mathcal {U}(S) = {{\,\mathrm{ap}\,}}(\alpha _1){{\,\mathrm{ap}\,}}(\alpha _2)\cdots {{\,\mathrm{ap}\,}}(\alpha _m).$$

We focus on consecutive necklaces \(\alpha _{i}\) and \(\alpha _{i+1}\), where \(1\le i \le m\). Let \(\alpha _{i} = c_1c_2\cdots c_n\) and let \(\alpha _{i+1} = a_1a_2\cdots a_n\). From (8), there exists some s and t such that

$$\alpha _{i} = a_1\cdots a_{s-1}{a_ta_s\cdots a_{t-1}}a_{t+1}\cdots a_n.$$

Applying this notation, we obtain the following result.

Lemma 15

The length \(n-1\) suffix of \({{\,\mathrm{ap}\,}}(\alpha _i){{\,\mathrm{ap}\,}}(\alpha _{i+1})\) is the same as the length \(n-1\) suffix of \(\alpha _{i+1}\).

Proof

Let p be the period of \(\alpha _{i+1}\). If \(p=n\), the result is trivial. Otherwise \(\alpha _{i+1}\) is periodic, and from the definition of a necklace and the assumptions on the content, \(a_1 = a_{p+1} = 1\) and \(a_p > 1\). Since \(\alpha _{i+1}\) is periodic it is a simple exercise to demonstrate that \(\alpha _i\) is aperiodic since it is the cool-lex successor of \(\alpha _{i+1}\). Let \(a_ja_{j+1}\) be the leftmost inversion in \(\alpha _{i+1}\). Clearly \(j\le p\). If \(j<p\), then \(t\le p+1\) and the result follows. If \(j=p\) then \(a_1\cdots a_p\) is non-decreasing. We consider two cases based on (8)b. If \(a_p > a_{p+2} = a_2\), then \(t=p+1\) and again the result follows. Otherwise, it must be that \(a_2\cdots a_p\) are all the same symbol, namely 2. Swapping \(a_{p+1}=1\) and \(a_{p+2}=2\) in \(\alpha _{i+1}\) to obtain \(\gamma \) does not yield a necklace because the rotation starting from position \(p+2\) in \(\gamma \) will be lexicographically smaller than \(\gamma \). Thus based on (8)b, \(t=p+1\), and again the result follows. \(\square \)

We now prove that \(\mathcal {U}(S) = \mathcal {V}(S)\). Let \(\beta =b_1\cdots b_{n-1}\) be a substring of \(\mathcal {U}(S)\). We show (i) \(\beta \) is an element of \(\varvec{Short}_{}(S)\) with missing symbol z and (ii) the symbol following \(\beta \) in \(\mathcal {U}(S)\) is \(g(\beta )\), and hence one of z or \(b_1\).

Suppose \(\beta \) is completely contained in some \({{\,\mathrm{ap}\,}}(\alpha _i)\). Then clearly \(\beta \) is an element of \(\varvec{Short}_{}(S)\) and \({{\,\mathrm{ap}\,}}(\alpha _i) = \alpha _i\), which means \(\alpha _i\) is an aperiodic necklace. Thus \(\beta z = \alpha _i\) or \(z\beta = \alpha _i\). Since each necklace in \(\varvec{N}(S)\) begins with 1, the symbol following \(\beta \) in each case is the missing symbol z. In both cases \(h(z\beta )\) is not a necklace since it is a proper rotation of the aperiodic necklace \(\alpha _i\). Thus by Corollary 6, \(\beta \) is not in \(\varvec{X}(S)\) and hence \(g(\beta ) = z\). For the remaining cases, \(\beta = \sigma _1\sigma _2\) where \(\sigma _2\) is a non-empty prefix of some \({{\,\mathrm{ap}\,}}(\alpha _{i+1})\), and by applying Lemma 15, \(\sigma _1\) is a non-empty suffix of \(\alpha _i\). Let \(|\sigma _2|=x\) and thus \(\sigma _2 = a_1a_1\cdots a_x\) and \(\sigma _1 = c_{x+2}\cdots c_{n}\). Note \(x < n-1\) and the symbol following \(\beta \) in \(\mathcal {U}(S)\) is \(a_{x+1}\). Let p denote the period of \(\alpha _{i+1}\).

Case 1: \(\sigma _2 = {{\,\mathrm{ap}\,}}(\alpha _{i+1})\). This means \(\alpha _{i+1}\) is periodic, i.e., \(\alpha _{i+1} = (a_1\cdots a_p)^j\), for some \(j > 1\). By Lemma 15, \(\beta \) is a suffix of \(\alpha _{i+1}\) which means \(z\beta = \alpha _{i+1}\). Both z and the symbol following \(\beta \) are 1 as they are each the first symbol of some necklace in \(\varvec{N}(S)\). If \(a_1\cdots a_p\) contains an inversion, then \(h(z\beta )\) is rotation of \(\alpha _{i+1}\) that must be lexicographically larger than \(\alpha _{i+1}\) by the definition of p. Thus by Corollary 6, \(g(\beta )=z\). Otherwise, suppose \(a_1\cdots a_p\) is non-decreasing. If \(z=b_1\), \(g(\beta ) = z\) from (4). Otherwise \(b_1 > z\). By the content assumptions, it must be that \(b_1=2\). If \(b_1 < b_{n-1}\), then \(g(x) = z\) from (4). Otherwise, \(a_1\cdots a_p = 12^{p-1}\), and \(h(b_1zb_2\cdots b_{n-1}) = 12^{p-1}{21}2^{p-2}(a_1\cdots a_p)^{j-2}\) is not a necklace. Again \(g(x) = z\) from (4).

Case 2: \(\sigma _2\) is a proper prefix of \({{\,\mathrm{ap}\,}}(\alpha _{i+1})\). Consider three cases for x, noting that \(x<p\):

  • (A): \(t-1 < x\). In this case \(\sigma _1 = a_{x+2} \cdots a_{n}\) and \(\sigma _2 {a_{x+1}} \sigma _1= \alpha _{i+1}\). Clearly, \(\beta \) is in \(\varvec{Short}_{}(S)\) with missing symbol \(z=a_{x+1}\). Since \(t-1< x < n-1\), by the definition of t and (8), \(\sigma _2\) must contain an inversion. Since \(\alpha _{i+1}\) is a necklace with period p, its rotation \(h(z\beta )\) must be strictly larger than \(\alpha _{i+1}\), and hence is not a necklace. Thus, by Corollary 6, \(g(\beta ) = z\).

  • (B): \(s-1 \le x < t-1\). In this case \(\sigma _1 = a_{x+1}\cdots a_{t-1}a_{t+1}\cdots a_{n}\) and thus \(\beta \) is in \(\varvec{Short}_{}(S)\) with missing symbol \(z=a_t\). Note \(a_{x+1} = b_1\). If \(z=b_1\), then clearly \(g(\beta ) = b_1=z\). Otherwise, by the definitions of s and t and Lemma 10, both \(h(zb_1\cdots b_{n-1})\) and \(h(b_1zb_2\cdots b_{n-1})\) are necklaces. Also, by the definition of t, \(a_{x+1} (=b_1)\) is greater than or equal to \(a_{x} (=b_{n-1})\). Thus, the larger of z and \(b_1\) is greater than or equal to \(b_{n-1}\). Thus \(g(\beta ) = b_1\), from Lemma 5.

  • (C): \(x < s-1\). In this case \(\sigma _2 = c_1\cdots c_x\) and \(\sigma _2 {c_{x+1}} \sigma _1 = \alpha _i\). Clearly, \(\beta \) is in \(\varvec{Short}_{}(S)\) and \(z=c_{x+1} = a_{x+1}\). By the definition of s, \(\gamma = c_1\cdots c_{s-2}{c_sc_{s-1}}c_{s+1}\cdots c_{n}\) is not a necklace. If \(x = s-2\), then \(\gamma = h(b_1zb_2\cdots b_{n-1})\) and by Corollary 6\(\beta \) is not in \(\varvec{X}(S)\). Thus \(g(\beta ) = z\). Otherwise \(x < s-2\) and by the definition of t, \(c_{x+1} \le c_{x+2}\). If \(c_{x+1}=c_{x+2}=b_1\), then \(g(\beta )= z=b_1\). Otherwise it is a simple exercise to demonstrate that \(h(b_1zb_2\cdots b_{n-1})\) is not a necklace since \(\gamma \) is not a necklace. Again, by Corollary 6, \(\beta \) is not in \(\varvec{X}(S)\) and thus \(g(\beta ) = z\).

For each case, we have demonstrated that \(\beta \) is in \(\varvec{Short}_{}(S)\) and the symbol following \(\beta \) in \(\mathcal {U}(S)\) is \(g(\beta )\). Thus \(\mathcal {U}(S) = \mathcal {V}(S)\), completing the proof of Theorem 11.

6 Final Remarks

In this paper we presented two algorithms that construct the first fixed-content universal cycle.

  1. 1.

    We developed a successor-rule based on the first-inversion tree of necklaces that runs in O(n) time per symbol using O(n) space.

  2. 2.

    We developed concatenation construction based on cool-lex order of fixed-content necklaces that runs in O(1)-amortized time per symbol using O(n) space.

The first result provides a cyclic shift Gray code for multiset permutations (i.e., string with a given Parikh vector) in which the next multiset permutation is obtained by the shifting the first symbol into the last or second last position. The Gray code can be generated in O(n)-time per string, starting from any string in the set. The second result involved the creation of an O(n)-time per string shift Gray code algorithm for listing necklaces of fixed content in cool-lex order.

6.1 Additional Observations

We conclude with additional observations and avenues for future research, some of which are expanded upon online [40].

Cycle-Joining. The first-inversion tree swaps the leftmost inversion in a necklace, and paths to the root node (i.e., \({{\,\mathrm{tail}\,}}(S)\)) resemble insertion sort or gnome sort [39]. Different fixed-content universal cycles can be created by using other trees. For example, one could instead swap rightmost inversions according to the last-inversion tree. Alternatively, one could focus on the smallest value that is not in sorted order. This change results in paths to the root that follow selection sort, and generalizes the decrementing spanning tree from [24] and its associated bell ringer universal cycle for permutations.

Feedback Functions. Our cycle-joining construction is based on the underlying feedback function \(f(b_1b_2\cdots b_{n-1}) = z\), which returns the missing symbol z and creates necklace cycles. Instead, one could start with the feedback function \(f(b_1b_2\cdots b_{n-1}) = b_1\), which creates initial cycles that are not necklace cycles. Universal cycles resulting from this feedback function would have maximum-weight rather than minimum-weight cycles.

With regard to the choice of feedback function, it is worth noting some connections between this article and foundational work in the area. In the special case when \(k=2\), recall that multiset permutations correspond to fixed-weight binary strings. In this case, the feedback function \(f(\beta ) = z\) corresponds to two of the “simple” feedback functions presented in [22, Ch. 7], depending on the weight w of the strings. The pure summing register (PSR) and the complementing summing register (CSR) apply feedback functions defined as follows, where the operator \(\oplus \) denotes addition modulo 2:

$$\begin{aligned} {{\,\mathrm{PSR}\,}}(\beta ) = b_1\oplus b_2 \oplus \cdots \oplus b_{n-1} \text { and } {{\,\mathrm{CSR}\,}}(\beta ) = b_1\oplus b_2 \oplus \cdots \oplus b_{n-1} \oplus 1. \end{aligned}$$

If w is even, then \(f(\beta ) = {{\,\mathrm{PSR}\,}}(\beta )\); if w is odd, then \(f(\beta ) = {{\,\mathrm{CSR}\,}}(\beta )\). Cycle-joining using these feedback functions has been previously studied in [14]. As we proved in Theorem 11, they also are the underlying feedback function used in the constructions from [35] which were later applied to construct weight-range universal cycles in [46].

Shift Gray Codes and Applications. Our Gray codes using \(\textstyle \mathsf {shift}_{1,n}(\)) and \(\textstyle \mathsf {shift}_{1,n{-}1}(\)) can be used to optimize exhaustive computations for objective function focused on the (ordered) pairs of symbols in a string. For example, consider a directed traveling salesman problem, where each permutation of \(\{1,2,\ldots ,n\}\) represents a Hamilton path in the graph, and each ordered pair of symbols represents a directed edge. Notice that the operation \(\textstyle \mathsf {shift}_{1,n}(\)) changes only one edge in the associated path, while \(\textstyle \mathsf {shift}_{1,n{-}1}(\)) changes two.Footnote 6 Thus, the cost of each successive paths can be updated in O(1)-time. These Gray codes are also helpful when ordering events with repetition (e.g., multiple deliveries along the same route in a stacker crane problem [1]).

Other interesting questions can be asked about the existence of Gray codes using various sets of strings and shifts. For example, there is no Gray code for \(\varvec{Perm}(S)\) using \(\textstyle \mathsf {shift}_{n,1}(\)), \(\textstyle \mathsf {shift}_{1,n}(\)), and \(\textstyle \mathsf {shift}_{1,2}(\)), even for fixed-weight binary strings (i.e., \(k=2\)) [5]. However, the latter two operations are sufficient for permutations (i.e., \(n=k\)) [44] (also see [38]). A specific open question is whether \(\varvec{Perm}(S)\) has a Gray code using \(\textstyle \mathsf {shift}_{1,2}(\)), \(\textstyle \mathsf {shift}_{1,3}(\)), and \(\textstyle \mathsf {shift}_{1,n}(\)) (see [49] when \(n=k\)).

Encodings. Shorthand representation is not the only encoding of \(\varvec{Perm}(S)\) that could be considered for use in universal cycles. In particular, order isomorphism [20, 28], relaxed shorthand [55], and graphical representations [4] have all been considered for permutations.