
1 Introduction

A multiplicative semigroup of nonnegative matrices is called primitive if it possesses at least one strictly positive matrix. Such semigroups were introduced relatively recently and have been intensively studied in the literature due to applications to Markov chains, linear dynamical systems, graph theory, etc. Their relation to synchronizing automata is especially important. There are rather surprising links between primitive semigroups and functional equations with contraction of the argument. Such equations are usually applied to generate fractals and self-similar tilings. Their theory yields short and clear proofs of some known results on primitivity, for example, the characterization of primitive families and the theorem on the existence of a common invariant affine subspace for the matrices of non-synchronizing automata. The new approach is also useful for studying Hurwitz primitive (or m-primitive) semigroups. We discuss the characterization theorem for Hurwitz primitivity, which looks very similar to the one for usual primitivity in spite of the totally different proofs. This leads to the concept of Hurwitz-synchronizing automata and reset m-tuples. We prove polynomial decidability of the existence of reset m-tuples and formulate several open problems.

Throughout the paper we denote vectors by bold letters: \({\varvec{x}}= (x_1, \ldots , x_n) \in {\mathbb R}^n\). The vectors \({\varvec{e}}_i, \, i = 1, \ldots , n\), form the canonical basis of \({\mathbb R}^n\) (all components of \({\varvec{e}}_i\) are zero except the ith, which is one). The norms of vectors and matrices are always Euclidean. By the norm of an affine operator we mean the norm of its linear part. The spectral radius (the maximal modulus of eigenvalues) of a matrix A is denoted by \(\rho (A)\). By a convex body we mean a convex compact set with nonempty interior; \(\, \mathrm{co} \, M \) denotes the convex hull of M. The support of a non-negative vector (matrix) is the set of positions of its strictly positive entries.

2 Contraction Operators and Reachability Theorems

Many facts on reachability in graphs and in automata can be formulated in terms of contraction operators on convex domains. The following result, proved in [17], implies at least two important results on reachability. Let \({\mathcal {B}}\) be an arbitrary family of affine operators acting in \({\mathbb R}^d\). The family is called contractive if for every \(\varepsilon > 0\) there exists a product \(\Pi \) of operators from \({\mathcal {B}}\) (with repetitions permitted) such that \(\Vert \Pi \Vert < \varepsilon \). Clearly, this is equivalent to the existence of a product whose spectral radius (the maximal modulus of eigenvalues) is smaller than one.
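To make the definition concrete, here is a small brute-force check of contractivity (hypothetical helper code, not from [17]): it searches short products for one with spectral radius below one, which by the remark above certifies that the family is contractive.

```python
# Hypothetical brute-force test of contractivity (illustration only): search
# all products of length <= max_len for one with spectral radius < 1.
import itertools
import numpy as np

def has_contracting_product(family, max_len=6):
    """Return the index word of a product with spectral radius < 1, or None."""
    for k in range(1, max_len + 1):
        for word in itertools.product(range(len(family)), repeat=k):
            P = family[word[0]]
            for i in word[1:]:
                P = P @ family[i]
            if max(abs(np.linalg.eigvals(P))) < 1:
                return word
    return None

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # a swap: spectral radius 1
B = np.array([[0.5, 0.0], [0.0, 0.5]])   # a contraction
print(has_contracting_product([A, B]))   # → (1,): the single letter B suffices
```

Of course this search is exponential in the word length; it only illustrates the definition on tiny examples.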

Theorem 1

[17] Let \(G \subset {\mathbb R}^d\) be a convex body and \({\mathcal {B}}\) be a family of affine operators respecting this body, i.e., \(B\,G \subset G\) for all \(B \in {\mathcal {B}}\). Then \({\mathcal {B}}\) is contractive unless all operators of \({\mathcal {B}}\) possess a common invariant affine subspace of some dimension \(q, \ 0 \le q\le d-1,\) that intersects G.

In the next section we show how to prove this fact using tools of functional analysis. Now let us demonstrate two of its corollaries on reachability problems. The first one deals with synchronizing automata and was presented in 2016 by Berlinkov and Szykuła:

Theorem 2

[3] If an automaton is not synchronizing, then its matrices possess a proper invariant common linear subspace.

Let us show how this fact can be deduced from Theorem 1.

Proof

To every matrix A of the automaton we associate the corresponding affine operator \(A|_{V}\) on the affine hyperplane \(V = \{{\varvec{x}}\in {\mathbb R}^n \ | \ \sum _{i=1}^n x_i = 1\}\). For an arbitrary product \(\Pi \) of matrices of the automaton, we have either \(\Vert \Pi |_{V}\Vert = 0\) if \(\Pi \) has a positive row, or \(\Vert \Pi |_{V}\Vert \ge 1\) otherwise. If a product with a positive row exists, then the automaton is synchronizing. Otherwise, the set of matrices of the automaton is not contractive on V. Applying Theorem 1 to the simplex \(\Delta = \mathrm{co}\, \{{\varvec{e}}_1, \ldots , {\varvec{e}}_n\} \subset V\), we conclude that a common invariant affine subspace exists in V. Its linear span with the origin is the desired invariant linear subspace.    \(\square \)
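The positive-row dichotomy in this proof can be illustrated numerically. In the sketch below (the matrix encoding and the sample transitions are our assumptions, not taken from [3]), each action of an automaton is stored as a 0/1 column-stochastic matrix, and synchronization is detected by searching products for a strictly positive row.

```python
# Illustration (encoding assumed for the example): an action with transition
# map delta gives a 0/1 column-stochastic matrix M with M[delta[i], i] = 1,
# and a word is a reset word exactly when the product of its letter matrices
# has a strictly positive row.
import functools
import itertools
import numpy as np

def letter_matrix(delta, n):
    """Column i carries a single 1 in the row the action sends state i to."""
    M = np.zeros((n, n))
    for i, j in enumerate(delta):
        M[j, i] = 1.0
    return M

def is_synchronizing(letters, max_len=8):
    """Brute-force search for a product of letter matrices with a positive row."""
    for k in range(1, max_len + 1):
        for word in itertools.product(letters, repeat=k):
            P = functools.reduce(np.matmul, word)
            if (P > 0).all(axis=1).any():   # some row strictly positive
                return True
    return False

# The Cerny automaton C_3 (transitions assumed for the example):
a = letter_matrix([1, 2, 0], 3)   # cyclic shift of the three states
b = letter_matrix([1, 1, 2], 3)   # merges state 0 into state 1
print(is_synchronizing([a, b]))   # → True
```

The search is exponential in the word length; it is only meant to make the criterion tangible on small examples.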

The next fact that can be derived from Theorem 1 is the criterion of primitivity of a matrix family, proved in 2013 by Protasov and Voinov [16]. Let \({\mathcal {A}}\) be an arbitrary irreducible family of nonnegative \(n\times n\) matrices. Irreducibility means that there is no coordinate subspace of \({\mathbb R}^n\), i.e., a subspace spanned by several vectors of the canonical basis, which is invariant for every matrix from \({\mathcal {A}}\). The family \({\mathcal {A}}\) is called primitive if there exists a strictly positive product of matrices from \({\mathcal {A}}\).
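As an illustration of the definition, primitivity of a small family can be tested by a naive search over boolean supports of products (a sketch for tiny examples only; the polynomial-time algorithms of the cited works are far more efficient):

```python
# A naive primitivity test (not the algorithm from the cited works): do a
# breadth-first search over boolean supports of products until an all-positive
# support appears or no new supports arise. There are finitely many supports,
# so the search terminates.
import numpy as np

def is_primitive(family):
    """Return True iff some product of matrices from `family` is strictly positive."""
    starts = [(m > 0) for m in family]
    seen = set()
    queue = list(starts)
    while queue:
        P = queue.pop(0)
        key = P.tobytes()
        if key in seen:
            continue
        seen.add(key)
        if P.all():
            return True
        for M in starts:                       # extend the product by one factor
            queue.append((P.astype(int) @ M.astype(int)) > 0)
    return False

print(is_primitive([np.array([[0, 1], [1, 1]])]))  # → True  (the square is positive)
print(is_primitive([np.array([[0, 1], [1, 0]])]))  # → False (a permutation matrix)
```

The key observation is that only the supports matter: for nonnegative matrices, the support of a product is the boolean product of the supports.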

The concept of primitivity of matrix families was introduced in [16] and has been studied in the literature due to many applications, see the bibliography in [2, 4, 11]. The importance of this property is explained by the fact that if the family \({\mathcal {A}}\) is finite and all its matrices have neither zero rows nor zero columns, then the primitivity of \({\mathcal {A}}\) implies that almost all long products of matrices from \({\mathcal {A}}\) are strictly positive.

Theorem 3

[16] Let \({\mathcal {A}}\) be an irreducible family of non-negative matrices. Suppose all matrices of \({\mathcal {A}}\) have neither zero rows nor zero columns; then \({\mathcal {A}}\) is not primitive if and only if there exists a partition of the set \(\Omega = \{1, \ldots , n\}\) into \(r\ge 2\) nonempty subsets \(\{\Omega _k\}_{k=1}^r\), on which all the matrices from \({\mathcal {A}}\) act as permutations.

This means that for every \(A \in {\mathcal {A}}\), there is a permutation \(\sigma \) of the set \(\{1, \ldots , r\}\) such that for each \(i \in \Omega _k\), the support of the vector \(A{\varvec{e}}_i\) is contained in \(\Omega _{\sigma (k)}, \, k = 1, \ldots , r\). So, if the family \({\mathcal {A}}\) is not primitive, then there is a partition of the set of basis vectors, common to all matrices from \({\mathcal {A}}\), such that each matrix \(A \in {\mathcal {A}}\) defines a permutation of this partition. If \({\mathcal {A}}\) is primitive, we formally set \(r=1\) and the partition is trivial: \(\Omega _1 = \Omega \). For one matrix, this fact is a part of the Perron–Frobenius theorem; moreover, in this case the permutation is cyclic. For families of matrices, these permutations can be arbitrary.
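The partition condition can be checked mechanically. The helper below (an illustration with 0-based indices; the function name is ours) verifies that every matrix maps each block of a given partition into a single block and that the induced map on blocks is a permutation.

```python
# Checking the structure in Theorem 3 directly: does every matrix map each
# block of a given partition into a single block, and is the induced map on
# blocks a bijection? (Illustrative helper, 0-based indices.)
import numpy as np

def acts_as_permutations(family, blocks):
    """`blocks` is a list of index lists partitioning {0, ..., n-1}."""
    block_of = {i: k for k, blk in enumerate(blocks) for i in blk}
    for A in family:
        images = []
        for blk in blocks:
            # blocks hit by the supports of the columns A e_i, i in blk
            targets = {block_of[j] for i in blk for j in np.nonzero(A[:, i])[0]}
            if len(targets) != 1:
                return False            # a block is smeared over several blocks
            images.append(targets.pop())
        if len(set(images)) != len(blocks):
            return False                # the induced map is not a permutation
    return True

# A matrix exchanging the blocks {0, 1} and {2, 3}:
A = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]])
print(acts_as_permutations([A], [[0, 1], [2, 3]]))  # → True
```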

Despite its simple formulation, the proof of Theorem 3 is surprisingly long and technical. There are now at least five different proofs of this theorem in the literature, based on different ideas. The authors' proof in [16] relies on the geometry of convex polytopes; that work left open the problem of finding a purely combinatorial and possibly simpler proof. The problem seems reasonable in view of the combinatorial nature of the theorem. The first successful responses to this challenge were given by Alpin and Alpina in [1] and by Blondel, Jungers, and Olshevsky in [4]. Then Alpin and Alpina [2] suggested another construction. All those works presented (different!) combinatorial proofs, although still rather long ones. In 2015 Voynov and Protasov [17] noted that Theorem 3 can actually be derived by the same idea of contractive families of affine operators (Theorem 1), which gives the shortest known proof of this result. That proof may be called analytic, since Theorem 1 is established by analytic methods, via functional equations.

Before giving the proof we make a couple of observations. Nothing changes in Theorem 3 if we replace the family \({\mathcal {A}}\) by the family of all column-stochastic matrices with the same supports as the matrices from \({\mathcal {A}}\). We assume that this replacement is already done and keep the same notation for the new family. Thus, all matrices from \({\mathcal {A}}\) generate affine operators on the affine hyperplane V and respect the simplex \(\Delta \subset V\). Under the assumptions of Theorem 3, primitivity of the family \({\mathcal {A}}\) is equivalent to contractivity of this family on V. This fact is rather simple; its proof can be found in [14].

Proof of Theorem 3. Sufficiency is obvious. To prove the necessity we apply Theorem 1 to the family \({\mathcal {A}}|_{V}\) of affine operators on V and to the convex body \(G = \Delta \). If \({\mathcal {A}}\) is non-contractive, then the operators from \({\mathcal {A}}|_{V}\) share a common invariant affine subspace \(L \subset V, \, 0 \, \le \, \mathrm{dim}\, L \, \le \, n-2\), intersecting \(\Delta \). Due to the irreducibility of \({\mathcal {A}}\), the subspace L intersects the interior of \(\Delta \), i.e., contains a positive point \({\varvec{a}}\in \Delta \). Consider the following relation on the set \(\Omega \): \(\, i \sim j\) if the vector \({\varvec{e}}_i - {\varvec{e}}_j\) belongs to \(\tilde{L}\) (the linear part of L). This is an equivalence relation splitting \(\Omega \) into classes \(\Omega _1, \ldots , \Omega _r\). Since \(\mathrm{dim}\, L \, \le \, n-2\), we have \(\, r \ge 2\). Let us show that for every \(i\in \Omega \), the supports of all vectors of the set \(M_i = \{A{\varvec{e}}_i \ | \ A \in {\mathcal {A}}\}\) lie in one set \(\Omega _k\). It suffices to prove that the difference of any two elements of the set \(M_i\) lies in \(\tilde{L}\). Take arbitrary \(A, B \in {\mathcal {A}}\). Let a matrix \(C \in {\mathcal {A}}\) have the same ith column as the matrix A and all other columns as the matrix B. For a positive vector \(\, {\varvec{a}}= \sum _i a_i{\varvec{e}}_i \, \in \, L\), we have

$$ a_i(A{\varvec{e}}_i \, - \, B{\varvec{e}}_i) \ = \ a_i(C{\varvec{e}}_i \, - \, B{\varvec{e}}_i) \ = \ C(a_i{\varvec{e}}_i)\, - \, B(a_i{\varvec{e}}_i) \ = \ C{\varvec{a}}\, - \, B{\varvec{a}}\ \in \ \tilde{L} . $$

Since \(C{\varvec{a}}\in L\) and \(B{\varvec{a}}\in L\), we have \(C{\varvec{a}}- B{\varvec{a}}\, \in \, \tilde{L}\), and hence \(A{\varvec{e}}_i - B{\varvec{e}}_i \, \in \, \tilde{L}\). Thus, the supports of the ith columns of all matrices from \({\mathcal {A}}\) belong to one set \(\Omega _k\). Then the supports of all columns with indices from the equivalence class of the index i (say, \(\Omega _j\)) lie in the same class \(\Omega _k\). Consequently, the matrix A defines a map \(\sigma (j) = k\). Since A does not have zero rows, it follows that \(\sigma \) is a permutation.    \(\square \)

3 Contractive Families and Functional Equations

The proof of Theorem 1 relies on the theory of fractal curves and equations of self-similarity. This idea originated in [17]; here we slightly simplify that proof. Let \({\mathcal {B}}= \{B_1, \ldots , B_m\}\) be a finite family of affine operators. The self-similarity equation is the following equation on a function \({\varvec{v}}: \, [0,1] \rightarrow {\mathbb R}^d\):

$$\begin{aligned} {\varvec{v}}(t) \ = \ B_k \, {\varvec{v}}(mt -k+1)\, , \quad t \in \left[ \frac{k-1}{m}\, , \, \frac{k}{m} \right] \, \qquad k \ = \ 1, \ldots , m . \end{aligned}$$
(1)

This equation plays an exceptional role in the theory of subdivision schemes, compactly supported wavelets, etc.; see [5, 6] and references therein. An advantage of those equations is an existence and uniqueness theorem for solutions. We will restrict ourselves to \(L_2\)-solutions. The solvability and smoothness of the solution are expressed in terms of the so-called \(L_2\)-spectral radius, or in short, 2-radius, of linear or affine operators: \(\, \rho _2({\mathcal {B}})\, = \, \lim _{k\rightarrow \infty }\bigl [\, m^{-k}\sum _{\Pi \in {\mathcal {B}}^k}\Vert \Pi \Vert ^{2} \bigr ]^{1/(2k)}\). Here we denote by \({\mathcal {B}}^k\) the set of products of length k of operators from the family \({\mathcal {B}}\) (repetitions permitted). If there is a convex body G such that \(BG \subset G\) for all \(B \in {\mathcal {B}}\), then, of course, all norms of products \(\Vert \Pi \Vert \) are uniformly bounded, and hence \(\rho _2 \le 1\). It turns out that the family \({\mathcal {B}}\) is contractive precisely when \(\rho _2 <1\). This fact is rather simple; its proof can be found in [17]. Now we formulate the main result on equations of self-similarity.
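For intuition, the 2-radius can be estimated numerically by truncating the limit at a fixed k, i.e., computing \(\bigl (m^{-k}\sum _{\Pi \in {\mathcal {B}}^k}\Vert \Pi \Vert ^2\bigr )^{1/(2k)}\). This is a rough illustration only, since convergence can be slow.

```python
# A rough numeric estimate of the 2-radius (illustration only): truncate the
# limit at a fixed k and average the squared spectral norms of all products
# of length k. Cost grows as m^k, so this is for tiny families only.
import itertools
import numpy as np

def rho2_estimate(family, k=8):
    m = len(family)
    total = 0.0
    for word in itertools.product(family, repeat=k):
        P = word[0]
        for M in word[1:]:
            P = P @ M
        total += np.linalg.norm(P, 2) ** 2   # spectral (Euclidean) norm
    return (total / m ** k) ** (1.0 / (2 * k))

# For a single diagonal matrix the estimate is exact: rho_2 equals rho(A).
print(round(rho2_estimate([np.diag([0.5, 0.25])], k=10), 6))  # → 0.5
```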

Theorem 4

[13] Suppose a finite family of affine operators \({\mathcal {B}}\) does not possess a common invariant affine subspace; then Eq. (1) possesses an \(L_2\)-solution if and only if \(\rho _2({\mathcal {B}}) <1\). In this case the solution is unique and continuously depends on the family \({\mathcal {B}}\).

This theorem implies Theorem 1 on contractive families.

Proof of Theorem 1. We show only the existence of a common invariant subspace; for the proof that this subspace intersects G, see [17]. It suffices to prove the theorem for finite families \({\mathcal {B}}\). Indeed, if \({\mathcal {B}}\) is non-contractive, then so is each of its finite subsets, and consequently every finite subset of \({\mathcal {B}}\) possesses an invariant affine subspace intersecting G. In this case the whole family \({\mathcal {B}}\) possesses such an invariant subspace. Thus, we assume \({\mathcal {B}}= \{B_1, \ldots , B_m\}\) and that this family is not contractive. Take an arbitrary point \({\varvec{a}}\in \mathrm{int}\, G\) and consider the operator \(B_1^{\varepsilon }\) defined as \(B_1^{\varepsilon }{\varvec{x}}\, = \, (1-\varepsilon )B_1{\varvec{x}}\, + \, \varepsilon \, {\varvec{a}}\). For every \(\varepsilon \in (0, 1)\), the operator \(B_1^{\varepsilon }\) maps G to \(\mathrm{int} \, G\), hence \(\rho (B_1^{\varepsilon }) < 1\). Therefore, the family \({\mathcal {B}}^{\varepsilon }\, = \, \{B_1^{\varepsilon }, B_2, \ldots , B_m\}\) is contractive, and consequently \(\rho _2({\mathcal {B}}^{\varepsilon }) < 1\). Hence, the self-similarity Eq. (1) with the family \({\mathcal {B}}^{\varepsilon }\) possesses a unique \(L_2\)-solution \({\varvec{v}}_{\varepsilon }\). We have \({\varvec{v}}_{\varepsilon }(t) \in G\) for almost all \(t \in [0,1]\). Taking an arbitrary sequence \(\varepsilon _k \rightarrow 0\), we obtain a bounded sequence \({\varvec{v}}_{\varepsilon _k}\), which has a weak limit \({\varvec{v}}\) along a subsequence. This limit satisfies Eq. (1) with the initial family of operators \({\mathcal {B}}\). Therefore, if the family \({\mathcal {B}}\) does not have an invariant affine subspace, then \(\rho _2 ({\mathcal {B}}) <1\). Hence, \({\mathcal {B}}\) is contractive, which is a contradiction.    \(\square \)

4 m-Primitivity and m-Synchronizing Automata

In this section we compare the main results on primitive and on m-primitive families and raise the question of whether the analytic method can be extended to the study of m-primitivity. Then we define m-synchronizing automata, prove that the existence of a synchronizing m-tuple can be decided in polynomial time, and leave several open problems: for example, how to find a synchronizing m-tuple, and what is an upper bound on its length?

The concept of m-primitivity was introduced in 1990 by Fornasini [7] and then studied by Fornasini and Valcher [8, 9]. Now there is an extensive literature on this subject; see [10, 12, 15] and references therein. The main application of m-primitivity is to multivariate Markov chains [7, 8], although there are natural generalizations to graph theory, dynamical systems, and large networks. In our notation, m is the number of matrices, which is why we say "m-primitive" rather than "k-primitive" as in some works. It would also be natural to use the term "Hurwitz primitive".

Let us have a finite family of nonnegative \(n\times n\)-matrices \({\mathcal {A}}= \{A_1, \ldots , A_m\}\). For a given m-tuple \({\varvec{\alpha }}= (\alpha _1, \ldots , \alpha _m)\) of nonnegative integers, \(\sum _{i=1}^m \alpha _i = k \, \ge \, 1\), also called a colour vector, we denote by \({\mathcal {A}}^{\,{\varvec{\alpha }}}\) the sum of all products of length k of matrices from \({\mathcal {A}}\) in which every product contains exactly \(\alpha _i\) factors equal to \(A_i\). The number k will be referred to as the length of \({\varvec{\alpha }}\) and denoted by \(|{\varvec{\alpha }}|\). For example, if \(\, {\mathcal {A}}= \{A_1, A_2, A_3\}\), then \({\mathcal {A}}^{\, (1,3,0)}\, = \, A_1A_2^3 + A_2A_1A_2^2 + A_2^2 A_1 A_2 + A_2^3A_1\). Such sums are called Hurwitz products in the literature, although they are not actually products but sums of products. A family of non-negative matrices is m-primitive if there exists a strictly positive Hurwitz product of those matrices. This property is weaker than primitivity: primitivity implies m-primitivity, but not vice versa.
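Hurwitz products can be computed without enumerating all arrangements, using the recursion \({\mathcal {A}}^{\,{\varvec{\alpha }}} = \sum _i A_i\, {\mathcal {A}}^{\,{\varvec{\alpha }}-{\varvec{e}}_i}\) with \({\mathcal {A}}^{\,\varvec{0}} = I\) (each arrangement is counted exactly once, according to its first factor). A minimal sketch, with names of our own choosing:

```python
# A minimal sketch computing a Hurwitz product via the recursion
# A^alpha = sum_i A_i A^(alpha - e_i), A^0 = I, memoised on the colour vector.
from functools import lru_cache
import numpy as np

def hurwitz_product(family, alpha):
    n = family[0].shape[0]

    @lru_cache(maxsize=None)
    def H(a):
        if sum(a) == 0:
            return np.eye(n)
        total = np.zeros((n, n))
        for i, ai in enumerate(a):
            if ai > 0:                  # branch on the first factor A_i
                total = total + family[i] @ H(a[:i] + (ai - 1,) + a[i + 1:])
        return total

    return H(tuple(alpha))

A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.array([[1.0, 1.0], [0.0, 1.0]])
# Cross-check against the definition: A^{(1,1)} = A1 A2 + A2 A1.
print(np.array_equal(hurwitz_product([A1, A2], (1, 1)), A1 @ A2 + A2 @ A1))  # → True
```

Memoisation on the colour vector keeps the cost polynomial in the length \(|{\varvec{\alpha }}|\) for fixed m, in contrast to the exponential number of arrangements.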

This notion has an obvious combinatorial interpretation. Suppose there are n villages, some of which are connected by one-way roads coloured in m colours (two villages may be connected by several roads). The m-primitivity means that there exists a colour vector \({\varvec{\alpha }}= (\alpha _1, \ldots , \alpha _m)\) such that every two villages are connected by a path of length \(|{\varvec{\alpha }}|\) that, for each \(i = 1, \ldots , m\), contains precisely \(\alpha _i\) roads of the ith colour. The criterion of m-primitivity was proved in [15] in 2013:

Theorem 5

[15] Let \({\mathcal {A}}\) be an irreducible family of non-negative matrices. Suppose all matrices of \({\mathcal {A}}\) have no zero columns; then \({\mathcal {A}}\) is not m-primitive if and only if there exists a partition of the set \(\Omega = \{1, \ldots , n\}\) into \(r\ge 2\) nonempty subsets \(\{\Omega _k\}_{k=1}^r\), on which all the matrices from \({\mathcal {A}}\) act as permutations and all those permutations commute.

If \({\mathcal {A}}\) is m-primitive, we formally set \(r=1\) and the partition is trivial: \(\Omega _1 = \Omega \). This criterion almost literally repeats the criterion of primitivity in Theorem 3. The main difference is the following: while for non-primitivity the matrices \(A_i\) can define arbitrary permutations of the sets \(\{\Omega _j\}_{j=1}^r\), for non-m-primitivity those permutations have to commute. This criterion quite naturally shows the common properties of and the difference between those two concepts. Another difference is that the criterion of Theorem 5 does not require the absence of both zero rows and zero columns, as Theorem 3 did, but only of zero columns. This condition is much less restrictive, since it is always satisfied for column-stochastic matrices and for matrices of automata.

As a corollary of Theorem 5 one can show that m-primitivity is polynomially decidable. In [15] an algorithm was presented that finds the partition \(\{\Omega _j\}_{j=1}^r\). If \(r=1\), then the family is m-primitive. The complexity of the algorithm is \(O(mn^{\, 3} + m^{2}n^{\, 2})\).

Now let us formulate three open problems.

Problem 1

Can Theorem 5 be derived by a geometric or functional-analytic argument, similar to Theorem 1?

In spite of the similarity of Theorems 3 and 5, their proofs are totally different. The only known proof of Theorem 5 is combinatorial and has nothing in common with all five known proofs of Theorem 3.

Problem 2

If the family is m-primitive, how can one find a positive Hurwitz product of it within polynomial time?

The algorithm from [15] based on Theorem 5 is not constructive. It decides whether the family is m-primitive by finding the partition \(\{\Omega _j\}_{j=1}^r\), but it does not find any positive Hurwitz product. The greedy algorithm for finding a positive product of a primitive family seems not to work here.

Problem 3

What is a sharp upper bound for the exponent of m-primitivity?

The exponent of m-primitivity is the minimal number \(k = k(n, m)\) such that every m-primitive family of matrices of size n has a positive Hurwitz product with a colour vector of length at most k.
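For tiny families, the exponent can be explored by brute force: iterate over colour vectors in order of increasing length and test whether the Hurwitz product is strictly positive, working with boolean supports. This is an illustrative sketch (with names of our own choosing), not a practical algorithm.

```python
# Brute-force exploration of the exponent of m-primitivity (tiny cases only):
# find the shortest colour vector alpha with a strictly positive Hurwitz
# product, using the recursion A^alpha = sum_i A_i A^(alpha - e_i) on supports.
import itertools
import numpy as np

def hurwitz_support(family, alpha):
    """Boolean support of the Hurwitz product A^alpha."""
    memo = {}
    def H(a):
        if a not in memo:
            if sum(a) == 0:
                memo[a] = np.eye(family[0].shape[0], dtype=bool)
            else:
                out = np.zeros(family[0].shape, dtype=bool)
                for i, ai in enumerate(a):
                    if ai > 0:
                        prev = H(a[:i] + (ai - 1,) + a[i + 1:])
                        out |= (family[i].astype(int) @ prev.astype(int)) > 0
                memo[a] = out
        return memo[a]
    return H(tuple(alpha))

def min_positive_hurwitz(family, max_len=10):
    """Smallest length |alpha| with a strictly positive Hurwitz product, or None."""
    m = len(family)
    for k in range(1, max_len + 1):
        for alpha in itertools.product(range(k + 1), repeat=m):
            if sum(alpha) == k and hurwitz_support(family, alpha).all():
                return k, alpha
    return None

print(min_positive_hurwitz([np.array([[0, 1], [1, 1]])]))  # → (2, (2,))
```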

Conjecture 1

Under the assumption that all matrices of the family \({\mathcal {A}}\) do not have zero columns, the exponent of m-primitivity is polynomial in n and m.

Examples from the works [10, 12] show that the exponent of m-primitivity can be exponential in m. In the example from [12], we have \(k(n,m) \ge C n^{m+1}\). However, in all those examples the matrices have zero columns.

Similarly to m-primitivity, we can define m-synchronizing automata. Consider a deterministic finite automaton with n states and m actions. The automaton is called m-synchronizing if there exists an m-tuple \({\varvec{\alpha }}= (\alpha _1, \ldots , \alpha _m)\) such that for every state there exists a reset word of length \(|{\varvec{\alpha }}|\) with \(\alpha _i\) commands of the ith action, \(i = 1, \ldots , m\). This means that for every starting state there is a word (possibly different for different states) with the m-tuple \({\varvec{\alpha }}\) that brings the automaton to a prescribed particular state, the same for all starting states. Thus the reset words may be different for different states, but they all share the same reset m-tuple.

In terms of graphs, this means that there is a path from each vertex to the particular vertex that has exactly \(\alpha _i\) edges of the ith colour, \(i = 1, \ldots , m\). The synchronizing m-tuple has the following meaning. Assume that realising each action is not free: it requires some resources. Is it possible to leave a stock of resources, \(\alpha _i\) units for the ith action, so that it is always possible to reset the system using this stock? Let us stress that we can control the sequence of actions resetting the system from a given state, but we cannot control the stock: it is the same for all states.

The existence of reset m-tuples can be decided within polynomial time, at least for irreducible automata, i.e., those whose sets of matrices are irreducible.

Theorem 6

There is a polynomial time algorithm to decide if a given automaton is m-synchronising. The algorithm spends less than \(m n^2 \bigl (\log _2 n + \frac{m+4}{2} \bigr )\) arithmetic operations.

Proof

Let \({\mathcal {A}}= \{A_1, \ldots , A_m\}\) be the family of matrices of the automaton. An m-tuple is synchronizing precisely when the corresponding Hurwitz product of the matrices \(A_1, \ldots , A_m\) possesses a positive row. Since all matrices \(A_i\) are column stochastic, we can apply Theorem 5. Using the algorithm from [15], we find the partition \(\{\Omega _j\}_{j=1}^r\). If \(r=1\), then \({\mathcal {A}}\) is m-primitive, and hence the corresponding Hurwitz product is strictly positive and so has a positive row. If \(r\ge 2\), then there is no Hurwitz product with a positive row. Indeed, since all permutations of the set \(\{\Omega _j\}_{j=1}^r\) defined by matrices from \({\mathcal {A}}\) commute, it follows that all products \(\Pi = A_{d_1}\cdots A_{d_k}\) corresponding to the same m-tuple \({\varvec{\alpha }}\) define the same permutation. Therefore, they have the same block structure corresponding to the partition \(\{\Omega _j\}_{j=1}^r\), and hence their sum does not have a positive row. Thus, the automaton is m-synchronizing if and only if \(r=1\). By [15, Theorem 2], the algorithm spends less than \(m n^2 \bigl (2p + \log _2 n + \frac{m}{2} \bigr )\) operations, where p is the maximal number of positive components in the columns of the matrices from \({\mathcal {A}}\). In our case \(p=1\), which completes the proof.    \(\square \)
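The criterion used in this proof, that an m-tuple is a reset m-tuple precisely when the corresponding Hurwitz product has a positive row, can be tested directly on small automata. In the sketch below the matrix encoding and the sample transitions are assumptions for the example.

```python
# Direct test of the positive-row criterion for reset m-tuples: compute the
# Hurwitz product of the 0/1 letter matrices (A^alpha = sum_i A_i A^(alpha-e_i))
# and check for a strictly positive row. Encoding assumed for the example.
import numpy as np

def letter_matrix(delta, n):
    """0/1 column-stochastic matrix of one action: state i goes to delta[i]."""
    M = np.zeros((n, n), dtype=int)
    for i, j in enumerate(delta):
        M[j, i] = 1
    return M

def is_reset_tuple(letters, alpha):
    memo = {}
    def H(a):                                   # Hurwitz recursion, A^0 = I
        if a not in memo:
            if sum(a) == 0:
                memo[a] = np.eye(len(letters[0]), dtype=int)
            else:
                out = np.zeros_like(letters[0])
                for i, ai in enumerate(a):
                    if ai > 0:
                        out = out + letters[i] @ H(a[:i] + (ai - 1,) + a[i + 1:])
                memo[a] = out
        return memo[a]
    P = H(tuple(alpha))
    return bool((P > 0).all(axis=1).any())      # some row strictly positive

# Cerny automaton C_3 (transitions assumed): a shifts states cyclically,
# b merges state 0 into state 1; the reset word baab uses 2 a's and 2 b's.
a = letter_matrix([1, 2, 0], 3)
b = letter_matrix([1, 1, 2], 3)
print(is_reset_tuple([a, b], (2, 2)))  # → True
```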

Thus, for irreducible automata, the existence of a reset m-tuple can be decided by a polynomial algorithm. However, this algorithm does not find the reset m-tuple itself.

Problem 4

If an automaton is m-synchronizing, how can one find a reset m-tuple within polynomial time?

Problem 5

What is a sharp upper bound for the minimal length of the reset m-tuple?

For synchronizing automata this bound is known to be \(O(n^3)\), and there is a long-standing Černý conjecture that the sharp upper bound is actually \((n-1)^2\). What are the bounds for m-synchronizing automata?