Abstract
Due to the idempotency of addition, and unlike in linear algebra, two different maxpolynomials may be identical as functions. This makes the theory of maxpolynomials in max-algebra different from that in linear algebra. It is reflected in the analogue of the fundamental theorem of algebra, which shows that every maxpolynomial can be factorized into linear factors in near-linear time. Maxpolynomial equations can also be solved with significantly less effort than in linear algebra.
On the other hand, the concept of the characteristic maxpolynomial of a matrix is trickier, as there are several reasonable, nonequivalent definitions of this concept. The version studied in this chapter is based on the max-algebraic permanent (assignment problem). It is proved that its greatest corner is the principal eigenvalue. Although it is not clear whether all terms of a characteristic maxpolynomial can be found efficiently, it is shown how to find all essential terms in low-order polynomial time.
The max-algebraic Cayley–Hamilton theorem is presented. It has a two-sided form, reflecting the lack of subtraction. In general it is not easy to find for a given matrix; however, a number of easily solvable special cases are studied.
The aim of this chapter is to study max-algebraic polynomials, that is, expressions of the form
$$p(z)={\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}},$$
(5.1)
where \(c_{r},j_{r}\in\mathbb{R}\). The number \(j_{p}\) is called the degree of p(z) and p+1 is called its length.
We will consider (5.1) both as formal algebraic expressions with z as an indeterminate and as max-algebraic functions of z. We will abbreviate “max-algebraic polynomial” to “maxpolynomial”. Note that the \(j_{r}\) are not restricted to integers and so (5.1) covers expressions such as
In conventional notation p(z) has the form
and if considered as a function, it is piecewise linear and convex.
Each expression \(c_{r}\otimes z^{j_{r}}\) will be called a term of the maxpolynomial p(z). For a maxpolynomial of the form (5.1) we will always assume
where p is a nonnegative integer. If \(c_{p}=0=j_{0}\) then p(z) is called standard. Clearly, every maxpolynomial p(z) can be written as
where q(z) is a standard maxpolynomial. For instance (5.2) is of degree 12.3 and length 3. It can be written as
where q(z) is the standard maxpolynomial
There are many similarities with conventional polynomial algebra; in particular (see Sect. 5.1) there is an analogue of the fundamental theorem of algebra, that is, every maxpolynomial factorizes into linear terms (although these terms do not correspond to “roots” in the conventional terminology). However, there are aspects that make this theory different. As in other parts of max-algebra, this is caused by the idempotency of addition, which for instance yields the formula
$$(a\oplus b)^{k}=a^{k}\oplus b^{k}$$
(5.4)
for all a,b,k∈ℝ. This property has a significant impact on many results. Perhaps the most important feature that makes max-algebraic polynomial theory different is the fact that the functional equality p(z)=q(z) does not imply equality between p and q as formal expressions. For instance \((1\oplus z)^{2}\) is equal by (5.4) to \(2\oplus z^{2}\), but at the same time it expands to \(2\oplus1\otimes z\oplus z^{2}\) by basic arithmetic laws. Hence the expressions \(2\oplus1\otimes z\oplus z^{2}\) and \(2\oplus z^{2}\) are identical as functions. This demonstrates the fact that some terms of maxpolynomials do not actually contribute to the function value; in our example \(1\otimes z\leq2\oplus z^{2}\) for all z∈ℝ. This motivates the following definitions: A term \(c_{s}\otimes z^{j_{s}}\) of a maxpolynomial \({\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}}\) is called inessential if
$$c_{s}\otimes z^{j_{s}}\leq{\sum_{r\neq s}^{\oplus}}c_{r}\otimes z^{j_{r}}$$
holds for every z∈ℝ and essential otherwise. Clearly, an inessential term can be removed from [reinstated in] a maxpolynomial ad lib when this maxpolynomial is considered as a function. Note that the terms \(c_{0}\otimes z^{j_{0}}\) and \(c_{p}\otimes z^{j_{p}}\) are essential in any maxpolynomial \({\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}}\).
Lemma 5.0.1 If the term \(c_{s}\otimes z^{j_{s}}\), 0<s<p, is essential in the maxpolynomial \({\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}}\), then
Proof
Since the term \(c_{s}\otimes z^{j_{s}}\) is essential and the sequence \(\{j_{r}\}_{r=0}^{p}\) is increasing, there is an α∈ℝ such that
and
Hence
□
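To make the distinction concrete, here is a minimal numeric sketch (the function name and sample points are ours, not the chapter's): a maxpolynomial given as (coefficient, exponent) pairs is evaluated as the maximum of the conventional linear functions \(c_{r}+j_{r}z\), and the term 1⊗z of 2⊕1⊗z⊕z² is seen to be inessential.

```python
def eval_maxpoly(terms, z):
    """Evaluate a maxpolynomial given as (c_r, j_r) pairs:
    in conventional notation, max over r of c_r + j_r * z."""
    return max(c + j * z for c, j in terms)

p = [(2, 0), (1, 1), (0, 2)]   # 2 ⊕ 1⊗z ⊕ z^2
q = [(2, 0), (0, 2)]           # 2 ⊕ z^2: the term 1⊗z removed
# The two formal expressions are identical as functions:
assert all(eval_maxpoly(p, z) == eval_maxpoly(q, z)
           for z in [-3.0, 0.0, 1.0, 2.5, 10.0])
```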
We will first analyze general properties of maxpolynomials yielding an analogue of the fundamental theorem of algebra and we will also briefly study maxpolynomial equations. Then we discuss characteristic maxpolynomials of square matrices. Maxpolynomials, including characteristic maxpolynomials, were studied in [8, 20, 62, 65, 71]. The material presented in Sect. 5.1 follows the lines of [65] with kind permission of Academic Press.
5.1 Maxpolynomials and Their Factorization
One of the aims in this section is to seek factorization of maxpolynomials. We will see that unlike in conventional algebra it is always possible to factorize a maxpolynomial as a function (although not necessarily as a formal expression) into linear factors over ℝ with a relatively small computational effort. We will therefore first study expressions of the form
where \(\beta_{r}\in\overline{\mathbb{R}}\) and \(e_{r}\in\mathbb{R}\) (r=1,…,p), and show how they can be multiplied out; this operation will be called evolution. We call expression (5.5) a product form and will assume
The constants β r will be called corners of the product form (5.5). Note that (5.5) in conventional notation reads
Hence a factor \((\varepsilon\oplus z)^{e}\) is the same as the linear function ez of slope e. A factor \((\beta\oplus z)^{e}\), β∈ℝ, is the constant eβ for z≤β and the linear function ez for z≥β. Therefore (5.5) is the function b(z)+f(z)z, where
Every product form is a piecewise linear function with constant slope between any two consecutive corners, as well as for \(z<\beta_{1}\) and \(z>\beta_{p}\). It follows that a product form is convex when all exponents \(e_{r}\) are positive. However, this function may, in general, be nonconvex and therefore we cannot expect each product form to correspond to a maxpolynomial as a function.
Let us first consider product forms
that is, product forms where all exponents are 1 and all \(\beta_{r}\in\mathbb{R}\) (and still \(\beta_{1}<\cdots<\beta_{p}\)). Such product forms will be called simple.
We can multiply out any simple product form using basic arithmetic laws as in conventional algebra. This implies that the coefficient of \(z^{k}\) (k=0,…,p) of the obtained maxpolynomial is
where r=p−k. Note that (5.8) is 0 if r=0. However, due to (5.6) this coefficient significantly simplifies, namely (5.8) is actually the same as
when k<p and 0 when k=p. Hence the maxpolynomial obtained by multiplying out a simple product form (5.7) is of length p+1 and can be found as follows.
The constant term is \(\beta_{1}\otimes\cdots\otimes\beta_{p}\); the term involving \(z^{k}\) (k≥1) is obtained by replacing \(\beta_{k}\) in the term involving \(z^{k-1}\) by z.
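This simplification is easy to check numerically: since the corners are increasing, the maximum over all (p−k)-element subsets in (5.8) is attained by the p−k largest corners. A small sketch (function names and the sample corners are ours):

```python
from itertools import combinations

def coeff_subsets(betas, k):
    """Coefficient of z^k when a simple product form is multiplied out:
    max-plus sum over all (p-k)-element subsets of corners, as in (5.8)."""
    p = len(betas)
    return max(sum(s) for s in combinations(betas, p - k))

def coeff_suffix(betas, k):
    """Simplified coefficient: the sum of the p-k largest corners."""
    return sum(betas[k:])

betas = [1, 3, 5]   # beta_1 < beta_2 < beta_3 (our own sample)
assert all(coeff_subsets(betas, k) == coeff_suffix(betas, k)
           for k in range(len(betas) + 1))
```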
We now generalize this procedure to an algorithm for any product form with positive exponents and finite corners. Product forms with these two properties are called standard .
Algorithm 5.1.1
EVOLUTION
Input: \(\beta_{1},\ldots,\beta_{p},e_{1},\ldots,e_{p}\in\mathbb{R}\) (parameters of a product form).
Output: Terms of the maxpolynomial obtained by multiplying out (5.5).
\(t_{0}:=\beta_{1}^{e_{1}}\otimes\cdots \otimes\beta_{p}^{e_{p}}\)
for r=1,…,p do
\(t_{r}:=t_{r-1}\) after replacing \(\beta_{r}^{e_{r}}\) by \(z^{e_{r}}\)
The general step of this algorithm can also be interpreted as follows:
\(c_{r}:=c_{r-1}\otimes(\beta_{r}^{e_{r}})^{-1}\) and \(j_{r}:=j_{r-1}+e_{r}\), with \(c_{0}:=\beta_{1}^{e_{1}}\otimes\cdots\otimes\beta_{p}^{e_{p}}\) and \(j_{0}=0\).
Alternatively, the sequence of pairs \(\{(e_{r},\beta_{r})\}_{r=1}^{p}\) is transformed into the sequence
where the sum of an empty set is 0 by definition. Note that the algorithm EVOLUTION is formulated for general product forms but its correctness is guaranteed for standard product forms:
Theorem 5.1.2
If the algorithm EVOLUTION is applied to standard product form (5.5) then the maxpolynomial \(f(z)=\sum_{r=0,\ldots,p}^{\oplus}t_{r}\) is standard, has no inessential terms and is the same function as the product form.
Proof
Let \(f(z)={\sum_{r=0,\ldots,p}^{\oplus}}t_{r}\). Then f(z) is standard since all terms involving z have positive exponents and one of the terms (\(t_{0}\)) is constant. The highest order term (\(t_{p}\)) has coefficient zero.
Let r∈{0,1,…,p} and let z be any value satisfying \(\beta_{r}<z<\beta_{r+1}\). Then
and \(f(z)=c_{r}\otimes z^{j_{r}}\) because any other term has either some z’s replaced by some β’s (\(\leq\beta_{r}<z\)) or some β’s (\(\geq\beta_{r+1}>z\)) replaced by z’s, and will therefore be strictly less than \(t_{r}\). At the same time, if \(\beta_{r}<z<\beta_{r+1}\), then the value of (5.5) is \(c_{r}\otimes z^{j_{r}}\) for r=0,1,…,p. We deduce that f(z) and (5.5) are equal for all z∈ℝ and hence f(z) has no inessential terms. □
Example 5.1.3
Let us apply EVOLUTION to the product form (1⊕z)⊗(3⊕z)2. Here
We find
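The recurrence stated after Algorithm 5.1.1 can be transcribed directly; the sketch below (function name ours) reproduces this example, multiplying (1⊕z)⊗(3⊕z)² out to 7⊕6⊗z⊕z³:

```python
def evolution(betas, es):
    """Multiply out a standard product form ⊗_r (beta_r ⊕ z)^{e_r},
    returning the (c_r, j_r) pairs of the resulting maxpolynomial."""
    c = sum(e * b for e, b in zip(es, betas))  # c_0 = beta_1^{e_1} ⊗ ... ⊗ beta_p^{e_p}
    j = 0                                      # j_0 = 0
    terms = [(c, j)]
    for b, e in zip(betas, es):
        c -= e * b       # c_r = c_{r-1} ⊗ (beta_r^{e_r})^{-1}
        j += e           # j_r = j_{r-1} + e_r
        terms.append((c, j))
    return terms

# Example 5.1.3: (1 ⊕ z) ⊗ (3 ⊕ z)^2 multiplies out to 7 ⊕ 6⊗z ⊕ z^3
assert evolution([1, 3], [1, 2]) == [(7, 0), (6, 1), (0, 3)]
```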
For the inverse operation (that will be called resolution) we first notice that if a standard maxpolynomial p(z) was obtained by EVOLUTION then two consecutive terms of p(z) are of the form
By cancelling the common factors we get \(\beta_{r}^{e_{r}}\oplus z^{e_{r}}\) or, alternatively \((\beta_{r}\oplus z)^{e_{r}}\).
Example 5.1.4
Consider the maxpolynomial \(7\oplus6\otimes z\oplus z^{3}\). By cancelling the common factor of the first two terms we find \(1\oplus z\); for the next two terms we get \(6\oplus z^{2}=(3\oplus z)^{2}\). Hence the product form is \((1\oplus z)\otimes(3\oplus z)^{2}\).
This idea generalizes to nonstandard maxpolynomials as they can always be written in the form (5.3).
Example 5.1.5
In fact there is no need to transform a maxpolynomial to a standard one before we apply the idea of cancellation of common factors and we can straightforwardly formulate the algorithm:
Algorithm 5.1.6
RESOLUTION
Input: Maxpolynomial \({\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}}\).
Output: Product form \({\prod_{r=1,\ldots,p}^{\otimes}}(\beta_{r}\oplus z)^{e_{r}}\).
For each r=0,1,…,p−1 cancel a common factor \(c_{r+1}\otimes z^{j_{r}}\) of two consecutive terms \(c_{r}\otimes z^{j_{r}}\) and \(c_{r+1}\otimes z^{j_{r+1}}\) to obtain \(c_{r}\otimes c_{r+1}^{-1}\oplus z^{j_{r+1}-j_{r}}=(\beta_{r+1}\oplus z)^{e_{r+1}}\).
Observe that \(e_{r+1}=j_{r+1}-j_{r}\) and \(\beta_{r+1}=\frac{c_{r}-c_{r+1}}{j_{r+1}-j_{r}}\) for r=0,1,…,p−1. Again, this algorithm is formulated without specific requirements on the input and we need to identify the conditions under which it will work correctly.
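These two formulas give a direct transcription of RESOLUTION (function name ours); applied to 7⊕6⊗z⊕z³ it recovers the corners 1 and 3 with exponents 1 and 2:

```python
def resolution(terms):
    """Recover the product form from a maxpolynomial given as (c_r, j_r) pairs:
    e_{r+1} = j_{r+1} - j_r and beta_{r+1} = (c_r - c_{r+1}) / (j_{r+1} - j_r)."""
    betas, es = [], []
    for (c0, j0), (c1, j1) in zip(terms, terms[1:]):
        es.append(j1 - j0)
        betas.append((c0 - c1) / (j1 - j0))
    return betas, es

# Example 5.1.4: 7 ⊕ 6⊗z ⊕ z^3 resolves to (1 ⊕ z) ⊗ (3 ⊕ z)^2
assert resolution([(7, 0), (6, 1), (0, 3)]) == ([1.0, 3.0], [1, 2])
```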
It will be shown that the algorithm RESOLUTION works correctly if the sequence
is increasing (in which case the sequence \(\{\beta_{r}\}\) is increasing). A maxpolynomial satisfying this requirement is said to satisfy the concavity condition. Before we answer the question of the correctness of the algorithm RESOLUTION, we present an observation that will be useful:
Theorem 5.1.7
The algorithms EVOLUTION and RESOLUTION are mutually inverse.
Proof
EVOLUTION maps
while RESOLUTION maps
Hence EVOLUTION applied to the result of RESOLUTION produces
One can similarly deduce that RESOLUTION applied to the result of EVOLUTION produces (e r ,β r ). □
This theorem finds an immediate use in the following key statement.
Theorem 5.1.8
For a standard maxpolynomial p(z) satisfying the concavity condition the algorithm RESOLUTION finds a standard product form q(z) such that p(z)=q(z) for all z∈ℝ.
Proof
Suppose that the maxpolynomial p(z) satisfies the concavity condition. Then the sequence
is increasing and finite and \(e_{r}>0\), since the \(j_{r}\) are increasing. Hence the product form q(z) produced by RESOLUTION is standard.
By an application of EVOLUTION to q(z) we get a maxpolynomial t(z) and t(z)=q(z) for all z∈ℝ by Theorem 5.1.2. At the same time t(z)=p(z) for all z∈ℝ by Theorem 5.1.7. Hence the statement. □
Note that the computational complexity of RESOLUTION is O(p).
Lemma 5.1.9
Let p(z) and p′(z) be two maxpolynomials such that \(p^{\prime}(z)=c\otimes z^{j}\otimes p(z)\). Then the concavity condition holds for p(z) if and only if it holds for p′(z).
Proof
Let p(z) and p′(z) be two maxpolynomials such that
for some c∈ℝ. Then
If \(p^{\prime}(z)=z^{j}\otimes p(z)\) for some j∈ℝ then
and the statement follows. □
Theorem 5.1.10
A maxpolynomial has no inessential terms if and only if it satisfies the concavity condition.
Proof
Due to Lemma 5.0.1 we only need to prove the “if” part.
By Lemma 5.1.9 we may assume without loss of generality that p(z) is standard. By applying RESOLUTION and then EVOLUTION the result now follows by Theorems 5.1.8, 5.1.2 and 5.1.7. □
It follows from Theorem 5.1.8 that if a standard maxpolynomial p(z) satisfies the concavity condition then the algorithm RESOLUTION applied to p(z) will produce a standard product form equal to p(z) as a function. If p(z) does not satisfy the concavity condition then it contains an inessential term (Theorem 5.1.10). By removing an inessential term, p(z) as a function does not change. Hence by a repeated removal of inessential terms we can find a standard maxpolynomial p′(z) from p(z) such that p′(z) satisfies the concavity condition and p(z)=p′(z) for all z∈ℝ. Formally, this process can be described by the following algorithm:
Algorithm 5.1.11
RECTIFICATION
Input: Standard maxpolynomial \(p(z)={\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}}\).
Output: Standard maxpolynomial p′(z) with no inessential terms and p′(z)=p(z) for all z∈ℝ.
\(p^{\prime}(z):=c_{p-1}\otimes z^{j_{p-1}}\oplus c_{p}\otimes z^{j_{p}}\)
s:=p−1, t:=p
For r=p−2,p−3,…,0 do
begin
Until \(\frac{c_{s}-c_{t}}{j_{t}-j_{s}}>\frac{c_{r}-c_{s}}{j_{s}-j_{r}}\) do
begin
Remove \(c_{s}\otimes z^{j_{s}}\) from p′(z), let \(c_{s}\otimes z^{j_{s}}\) and \(c_{t}\otimes z^{j_{t}}\) be the lowest and second-lowest order term in p′(z), respectively.
end
\(p^{\prime}(z):=c_{r}\otimes z^{j_{r}}\oplus p^{\prime }(z)\), t:=s, s:=r
end
Clearly, RECTIFICATION runs in O(p) time since every term enters and leaves p′(z) at most once.
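RECTIFICATION can be transcribed as a stack-based pass over the terms, popping the lowest-order kept term while the concavity condition fails (a sketch with our own function names; terms are (c_r, j_r) pairs with increasing exponents):

```python
def rectification(terms):
    """Remove inessential terms from a standard maxpolynomial given as
    (c_r, j_r) pairs with j_0 < j_1 < ... (a stack version of Algorithm 5.1.11)."""
    def corner(low, high):
        # corner between a lower-order and a higher-order term
        (c0, j0), (c1, j1) = low, high
        return (c0 - c1) / (j1 - j0)

    kept = [terms[-1], terms[-2]]      # highest and second-highest order terms
    for t in terms[-3::-1]:            # r = p-2, p-3, ..., 0
        # pop the lowest kept term while the concavity condition is violated
        while len(kept) >= 2 and corner(kept[-1], kept[-2]) <= corner(t, kept[-1]):
            kept.pop()
        kept.append(t)
    return kept[::-1]                  # back to increasing exponents

# 2 ⊕ 1⊗z ⊕ z^2 loses its inessential middle term:
assert rectification([(2, 0), (1, 1), (0, 2)]) == [(2, 0), (0, 2)]
# 7 ⊕ 6⊗z ⊕ z^3 already satisfies the concavity condition:
assert rectification([(7, 0), (6, 1), (0, 3)]) == [(7, 0), (6, 1), (0, 3)]
```

Equivalently, the kept terms are exactly the points \((j_{r},c_{r})\) lying on their upper concave hull.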
We summarize the results of this section:
Theorem 5.1.12
[71] (Max-algebraic Fundamental Theorem of Algebra) For every maxpolynomial p(z) of length p it is possible to find, using O(p) operations, a product form q(z) such that p(z)=q(z) for all z∈ℝ. This product form is unique up to the order of its factors.
Proof
Let p(z) be the maxpolynomial \({\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}}\). By taking out \(c_{p}\otimes z^{j_{0}}\) it is transformed to a standard maxpolynomial, say p′(z), which in turn is transformed using RECTIFICATION into a standard maxpolynomial p′′(z) with no inessential terms. The algorithm RESOLUTION then finds a standard product form q(z) such that q(z)=p′′(z) for all z∈ℝ. By Theorems 5.1.8 and 5.1.10 we have p′′(z)=p′(z)=p(z) for all z∈ℝ and the statement follows. □
We may now extend the term “corner” to any maxpolynomial: Corners of a maxpolynomial p(z) are corners of the product form that is equal to p(z) as a function.
It will be important in the next section that it is possible to explicitly describe the greatest corner of a maxpolynomial:
Theorem 5.1.13
The greatest corner of \(p(z)=\sum_{r=0,\ldots,p}^{\oplus}c_{r}\otimes z^{j_{r}}\), p>0, is
$$\max_{r=0,\ldots,p-1}\frac{c_{r}-c_{p}}{j_{p}-j_{r}}.$$
Proof
A corner exists since p>0. Let γ be the greatest corner of p(z). Then
for all z≥γ and for all r=0,1,…,p. At the same time there is an r<p such that
for all z<γ. Hence \(\gamma=\max_{r=0,1,\ldots,p-1}\gamma_{r}\), where \(\gamma_{r}\) is the intersection point of \(c_{p}\otimes z^{j_{p}}\) and \(c_{r}\otimes z^{j_{r}}\), that is,
$$\gamma_{r}=\frac{c_{r}-c_{p}}{j_{p}-j_{r}},$$
and the statement follows. □
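A one-line transcription of this intersection formula (function name ours); for 7⊕6⊗z⊕z³ = (1⊕z)⊗(3⊕z)² it returns the greatest corner 3:

```python
def greatest_corner(terms):
    """Greatest corner per Theorem 5.1.13: the largest intersection point of
    the leading term c_p ⊗ z^{j_p} with a lower-order term c_r ⊗ z^{j_r}."""
    cp, jp = terms[-1]
    return max((c - cp) / (jp - j) for c, j in terms[:-1])

# 7 ⊕ 6⊗z ⊕ z^3 = (1 ⊕ z) ⊗ (3 ⊕ z)^2 has greatest corner 3
assert greatest_corner([(7, 0), (6, 1), (0, 3)]) == 3.0
```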
Note that an alternative treatment of maxpolynomials can be found in [8] and in [2] in terms of convex analysis and (in particular) Legendre–Fenchel transform.
5.2 Maxpolynomial Equations
Maxpolynomial equations are of the form
where p(z) and q(z) are maxpolynomials. Since both p(z) and q(z) are piecewise linear convex functions, it is clear geometrically that the solution set S to (5.9) is the union of a finite number of closed intervals in ℝ, possibly including one-element sets and unbounded intervals (see Fig. 5.1, where S consists of one closed interval and two isolated points). Let us denote the set of boundary points of S (that is, the set of extreme points of the intervals) by \(S^{*}\). The set \(S^{*}\) can easily be characterized:
Theorem 5.2.1
[64] Every boundary point of S is a corner of p(z)⊕q(z).
Proof
Let \(z\in S^{*}\). If z is not a corner of p(z)⊕q(z) then p(z)⊕q(z) does not change slope in a neighborhood of z. By the convexity of p(z) and q(z), then neither p(z) nor q(z) can change slope in a neighborhood of z. But then z is an interior point of S, a contradiction. □
Theorem 5.2.1 provides a simple solution method for maxpolynomial equations (5.9). After finding all corners of p(z)⊕q(z), say \(\beta_{1}<\cdots<\beta_{r}\), it remains
-
(1)
to check which of them are in S, and
-
(2)
if \(\gamma_{1}<\cdots<\gamma_{t}\) are the corners in S, then by selecting arbitrary interleaving points \(\alpha_{0},\ldots,\alpha_{t}\) so that
$$\alpha_{0}<\gamma_{1}<\alpha_{1}<\cdots<\gamma_{t}<\alpha_{t}$$
and checking whether \(\alpha_{j}\in S\) for j=0,…,t, it is decided for each of the intervals \([\gamma_{j-1},\gamma_{j}]\) (j=1,…,t+1) whether it is a subset of S. Here \(\gamma_{0}=-\infty\) and \(\gamma_{t+1}=+\infty\).
Example 5.2.2
[64] Find all solutions to the equation
If \(p(z)=9\oplus8\otimes z\oplus4\otimes z^{2}\oplus z^{3}\) and \(q(z)=10\oplus8\otimes z\oplus5\otimes z^{2}\) then
All corners are solutions and by checking the interleaving points (say) 1,2.5,4,6 one can find S=[2,3]∪{5}.
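The computation can be replayed numerically (helper name ours; the corners 2, 3, 5 of p(z)⊕q(z) are taken from the example):

```python
def ev(terms, z):
    """Maxpolynomial as a function: max over (c, j) of c + j*z."""
    return max(c + j * z for c, j in terms)

p = [(9, 0), (8, 1), (4, 2), (0, 3)]   # 9 ⊕ 8⊗z ⊕ 4⊗z^2 ⊕ z^3
q = [(10, 0), (8, 1), (5, 2)]          # 10 ⊕ 8⊗z ⊕ 5⊗z^2

corners = [2, 3, 5]                    # corners of p(z) ⊕ q(z), from the text
assert all(ev(p, b) == ev(q, b) for b in corners)   # step (1): all corners lie in S
checks = [ev(p, a) == ev(q, a) for a in [1, 2.5, 4, 6]]  # step (2): interleaving points
assert checks == [False, True, False, False]        # hence S = [2, 3] ∪ {5}
```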
5.3 Characteristic Maxpolynomial
5.3.1 Definition and Basic Properties
There are various ways of defining a characteristic polynomial in max-algebra, briefly characteristic maxpolynomial [62, 99]. We will study the concept defined in [62].
Let \(A=(a_{ij})\in\overline{\mathbb{R}}^{n\times n}\). Then the characteristic maxpolynomial of A is
It immediately follows from this definition that χ A (x) is of the form
or briefly \({\sum_{k=0,\ldots,n}^{\oplus}}\delta_{n-k}\otimes x^{k}\), where \(\delta_{0}=0\). Hence the characteristic maxpolynomial of an n×n matrix is a standard maxpolynomial with exponents 0,1,…,n, degree n and length n+1 or less.
Example 5.3.1
If
then
Theorem 5.3.2
[62] If \(A=(a_{ij})\in\overline{\mathbb{R}}^{n\times n}\) then
for k=1,…,n, where P k (A) is the set of all principal submatrices of A of order k.
Proof
The coefficient \(\delta_{k}\) is associated with \(x^{n-k}\) in \(\chi_{A}(x)\) and therefore is the maximum of the weights of all permutations that select n−k symbols x and k constants from different rows and columns of the submatrix of A obtained by removing the rows and columns of the selected x’s. Since x appears only on the diagonal, the corresponding submatrices are principal. □
Hence we can readily find \(\delta_{n}=\mathrm{maper}(A)\) and \(\delta_{1}=\max(a_{11},a_{22},\ldots,a_{nn})\), but other coefficients cannot be found easily from (5.10) as the number of matrices in \(P_{k}(A)\) is \(\binom{n}{k}\).
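For small matrices, formula (5.10) can be checked by exhaustive enumeration of principal submatrices and permutations; this is exponential in n and serves only as an illustration (function names and the 2×2 test matrix are ours):

```python
from itertools import combinations, permutations

def maper(M):
    """Max-algebraic permanent: optimal value of the assignment problem for M."""
    n = len(M)
    return max(sum(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def delta(A, k):
    """delta_k as in (5.10): max of maper(B) over all k x k principal
    submatrices B of A. Enumerates all C(n, k) submatrices."""
    n = len(A)
    return max(maper([[A[i][j] for j in rows] for i in rows])
               for rows in combinations(range(n), k))

A = [[3, 1], [2, 2]]          # a small test matrix of our own
assert delta(A, 1) == 3       # = max(a_11, a_22)
assert delta(A, 2) == 5       # = maper(A)
```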
If considered as a function, the characteristic maxpolynomial is a piecewise linear convex function in which the slopes of the linear pieces are n and some (possibly none) of the numbers 0,1,…,n−1. Note that it may happen that \(\delta_{k}=\varepsilon\) for all k=1,…,n, and then \(\chi_{A}(x)\) is just \(x^{n}\). We can easily characterize such cases:
Proposition 5.3.3
If \(A=(a_{ij})\in\overline{\mathbb{R}}^{n\times n}\) then χ A (x)=x n if and only if D A is acyclic.
Proof
If \(D_{A}\) is acyclic then the weights of all permutations with respect to any principal submatrix of A are ε and thus all \(\delta_{k}=\varepsilon\). If \(D_{A}\) contains a cycle, say \((i_{1},\ldots,i_{k},i_{1})\) for some k∈N, then
thus \(\delta_{k}>\varepsilon\) by Theorem 5.3.2. □
Note that the coefficients δ k are closely related to the best submatrix problem and to the job rotation problem, see Sect. 2.2.3.
5.3.2 The Greatest Corner Is the Principal Eigenvalue
By Theorem 5.1.13 we know that the greatest corner of a maxpolynomial \(p(z)={\sum_{r=0,\ldots,p}^{\oplus}}c_{r}\otimes z^{j_{r}}\), p>0, is
If \(p(x)=\chi_{A}(x)\) where \(A=(a_{ij})\in\overline{\mathbb{R}}^{n\times n}\) then p=n, \(j_{r}=r\) and \(c_{r}=\delta_{n-r}\) for r=0,1,…,n with \(c_{n}=\delta_{0}=0\). Hence the greatest corner of \(\chi_{A}(x)\) is
or, equivalently
We are ready to prove a remarkable property of characteristic maxpolynomials resembling the one in conventional linear algebra. As a convention, the greatest corner of a maxpolynomial with no corners (that is, λ(A)=ε, see Proposition 5.3.3 ) is by definition ε.
Theorem 5.3.4
[62] If \(A=(a_{ij})\in\overline {\mathbb{R}}^{n\times n}\) then the greatest corner of χ A (x) is λ(A).
Proof
The statement is evidently true if λ(A)=ε. Thus assume now that λ(A)>ε, hence at least one corner exists. Let β be the greatest corner of \(\chi_{A}(x)\) and k∈{1,…,n}; then \(\delta_{k}=\mathrm{maper}(B)\), where \(B\in P_{k}(A)\). We have
for some π∈ap(B) and its constituent cycles π 1,…,π s . We also have
for all j=1,…,s. Hence
and so
yielding using (5.11):
Suppose now \(\lambda(A)=\frac{w(\sigma,A)}{l(\sigma)}\), \(\sigma=(i_{1},\ldots,i_{r})\), r∈{1,…,n}. Let \(\overline{B}=A(i_{1},\ldots,i_{r})\). Then
Therefore
yielding by (5.11):
which completes the proof. □
Example 5.3.5
The principal eigenvalue of
is λ(A)=3. The characteristic maxpolynomial is
and the greatest corner is 3.
5.3.3 Finding All Essential Terms of a Characteristic Maxpolynomial
As already mentioned in Sect. 2.2.3, no polynomial method is known for finding all coefficients of a characteristic maxpolynomial or, equivalently, for solving the job rotation problem. Recall (see Sect. 2.2.3) that this question is equivalent to the best principal submatrix problem (BPSM), the task of finding the greatest optimal value \(\delta_{k}\) of the assignment problem over all k×k principal submatrices of A, for k=1,…,n. It will be convenient now to denote by BPSM(k) the task of finding this value for a particular integer k.
We will use the functional interpretation of a characteristic maxpolynomial to derive a method for finding coefficients of this maxpolynomial corresponding to all essential terms. Recall that as every maxpolynomial, the characteristic maxpolynomial is a piecewise linear and convex function which can be written using conventional notation as
If for some k∈{0,…,n} the term \(\delta_{n-k}\otimes x^{k}\) is inessential, then
holds for all x∈ℝ, and therefore all inessential terms may be ignored if \(\chi_{A}(x)\) is considered as a function. We now present an \(O(n^{2}(m+n\log n))\) method for finding all essential terms of the characteristic maxpolynomial of a matrix with m finite entries. It then follows that this method solves BPSM(k) for those k∈{1,…,n} for which \(\delta_{n-k}\otimes x^{k}\) is essential; in particular, when all terms are essential this method solves BPSM(k) for all k=1,…,n.
We will first discuss the case of finite matrices. Let \(A=(a_{ij})\in\mathbb{R}^{n\times n}\) be given. For convenience we will denote \(\chi_{A}(x)\) by z(x) and A⊕x⊗I by \(A(x)=(a(x)_{ij})\). Hence
and
Since z(x) is piecewise linear and convex and all its linear pieces are of the form \(z_{k}(x):=kx+\delta_{n-k}\) for k=0,1,…,n and constants \(\delta_{n-k}\), the maxpolynomial z(x) has at most n corners. Recall that \(z_{n}(x):=nx\), that is, \(\delta_{0}=0\). The main idea of the method for finding all linear pieces of z(x) is based on the fact that it is easy to evaluate z(x) for any real value of x, as this is simply maper(A⊕x⊗I), that is, the optimal value of the assignment problem for A⊕x⊗I. By a suitable choice of O(n) values of x we will be able to identify all linear pieces of z(x).
Let x be fixed and \(\pi\in\mathrm{ap}(A(x))\) (recall that ap(A) denotes the set of optimal permutations of the assignment problem for a square matrix A, see Sect. 1.6.4). We call a diagonal entry \(a(x)_{ii}\) of the matrix A(x) active if \(x\geq a_{ii}\) and this diagonal position is selected by π, that is, π(i)=i. All other entries will be called inactive. If there are exactly k active values for a certain x and permutation π, then \(z(x)=kx+\delta_{n-k}=x^{k}\otimes\delta_{n-k}\), that is, the value of z(x) is determined by the linear piece with slope k. Here \(\delta_{n-k}\) is the sum of the n−k inactive entries of A(x) selected by π. No two of these inactive entries can be from the same row or column, and they all lie in the submatrix, say B, obtained by removing the rows and columns of all active elements. Since all active elements are on the diagonal, B is principal and the n−k inactive elements form a feasible solution to the assignment problem for B. This solution is also optimal by the optimality of π. This yields the following:
Proposition 5.3.6
[20] Let x∈ℝ and \(\pi\in P_{n}\). If \(z(x)=\mathrm{maper}(A(x))=\sum_{i=1}^{n}a(x)_{i,\pi(i)}\), \(i_{1},\ldots,i_{k}\) are the indices of all active entries and \(\{j_{1},\ldots,j_{n-k}\}=N-\{i_{1},\ldots,i_{k}\}\), then \(A(j_{1},\ldots,j_{n-k})\) is a solution to BPSM(n−k) for A and \(\delta_{n-k}=\mathrm{maper}(A(j_{1},\ldots,j_{n-k}))\).
There may, of course, be several optimal permutations for the same value of x selecting different numbers of active elements which means that the value of z(x) may be equal to the function value of several linear pieces with different slopes at x. We will pay special attention to this question in Proposition 5.3.14 below.
Proposition 5.3.7
[20] If \(z(\overline{x})=z_{r}(\overline {x})=z_{s}(\overline{x})\) for some \(\overline{x}\in\mathbb{R}\) and integers r<s, then there are no essential terms with the slope k∈(r,s) and \(\overline{x}\) is a corner of z(x).
Proof
Since \(z_{r}(\overline{x})=\delta_{n-r}+r\bar{x}=z(\overline{x})\geq\delta_{n-k}+k\bar{x}\) for every k, we have z r (x)=δ n−r +rx≥δ n−k +kx=z k (x) for every \(x<\overline{x}\) and k>r, thus z(x)≥z r (x)≥z k (x) for every \(x<\overline{x}\) and for every k>r.
Similarly, z(x)≥z s (x)≥z k (x) for every \(x>\overline{x}\) and for every k<s. Hence, z(x)≥z k (x) for every x and for every integer slope k with r+1≤k≤s−1. □
For \(x\leq\widetilde{a}=\min(a_{11},a_{22},\ldots,a_{nn})\), z(x) is given by \(\max_{\pi}\sum_{i=1}^{n}a_{i,\pi(i)}=\mathrm{maper}(A)=\delta_{n}\). Then obviously \(z(x)=z_{0}(x)=\delta_{n}\) for \(x\leq\widetilde{a}\).
Now let \(\alpha^{*}:=\max_{ij}a_{ij}\) and let E be the matrix whose entries are all equal to 1. For \(x\geq\alpha^{*}\) the matrix \(A(x)-\alpha^{*}\cdot E\) (in conventional notation) has only nonnegative elements on its main diagonal. All off-diagonal elements are negative. Therefore we get \(z(x)=nx=z_{n}(x)\) for \(x\geq\alpha^{*}\). Note that for finding z(x) there is no need to compute \(\alpha^{*}\).
The intersection point of z 0(x) with z n (x) is \(x_{1}=\frac {\delta_{n}}{n}\). We find z(x 1) by solving the assignment problem \(\max_{\pi}\sum_{i=1}^{n}a(x_{1})_{i,\pi(i)}\).
Corollary 5.3.8
If \(z(x_{1})=z_{0}(x_{1})\) then \(z(x)=\max(z_{0}(x),z_{n}(x))\).
Thus, if \(z(x_{1})=z_{0}(x_{1})\), we are done and the function z(x) has the form
Otherwise we have found a new linear piece of z(x). Let us call it \(z_{k}(x):=kx+\delta_{n-k}\), where k is the number of active elements in the corresponding optimal solution and \(\delta_{n-k}\) is given by \(\delta_{n-k}:=z(x_{1})-kx_{1}\). We remove \(x_{1}\) from the list.
Next we intersect \(z_{k}(x)\) with \(z_{0}(x)\) and with \(z_{n}(x)\). Let \(x_{2}\) and \(x_{3}\), respectively, be the corresponding intersection points. We generate a list \(L:=(x_{2},x_{3})\). Let us choose an element from the list, say \(x_{2}\), and determine \(z(x_{2})\). If \(z(x_{2})=z_{0}(x_{2})\), then \(x_{2}\) is a corner of z(x); by Proposition 5.3.7 this means that there are no essential terms of the characteristic maxpolynomial with slopes between 0 and k. We delete \(x_{2}\) from L and process the next point from L. Otherwise we have found a new linear piece of z(x) and can proceed as above. Thus every point of the list either leads to a new slope (and therefore to two new points in L) or it is a corner of z(x), in which case the point is deleted from L and no new points are generated. If the list L is empty, we are done and we have found the function z(x). Since z(x) has at most n corners and at most n+1 distinct slopes, only O(n) entries enter and leave the list, and the procedure stops after solving O(n) linear assignment problems. Thus we have shown:
Theorem 5.3.9
[20] All essential terms of the characteristic maxpolynomial of \(A\in\mathbb{R}^{n\times n}\) can be found in \(O(n^{4})\) steps.
The proof of the following statement is straightforward.
Proposition 5.3.10
Let \(A=(a_{ij}),B=(b_{ij})\in\mathbb{R}^{n\times n}\), r,s∈N, \(a_{rs}\leq b_{rs}\), \(a_{ij}=b_{ij}\) for all i,j∈N, i≠r, j≠s. If π∈ap(A) satisfies π(r)=s then π∈ap(B).
Corollary 5.3.11
If \(\mathrm{id}\in\mathrm{ap}(A(\overline{x}))\) then \(\mathrm{id}\in\mathrm{ap}(A(x))\) for all \(x\geq\overline{x}\).
Remarks
-
1.
A diagonal element of A(y) may not be active for some y with y>x even if it is active in A(x). For instance, consider the following 4×4 matrix A:
$$\left(\begin{array}{r@{\quad }r@{\quad }r@{\quad }r}0&0&0&29\\0&8&20&0\\0&0&12&28\\29&28&0&16\end{array}\right).$$For x=4 the unique optimal permutation is π=(1)(2,3,4) of value 80, for which the first diagonal element is active. For y=20 the unique optimal permutation is π=(1,4)(2)(3) of value 98, in which the second and third, but not the first, diagonal elements of the matrix are active.
-
2.
If an intersection point x is found by intersecting two linear functions with the slopes k and k+1 respectively, this point is immediately deleted from the list L since it cannot lead to a new essential term (as there is no slope strictly between k and k+1).
-
3.
If at an intersection point y the slope of z(x) changes from k to l with l−k≥2, then an upper bound for \(\delta_{n-r}\) related to an inessential term \(rx+\delta_{n-r}\), k<r<l, is given by z(y)−ry. Due to the convexity of the function z(x) this is the least upper bound on \(\delta_{n-r}\) that can be obtained using the values of z(x).
Taking into account our previous discussion, we arrive at the following algorithm. The values x which have to be investigated are stored as triples (x,k(l),k(r)) in a list L. The interpretation of such a triple is that x has been found as the intersection point of two linear functions with the slopes k(l) and k(r), k(l)<k(r).
Algorithm 5.3.12
ESSENTIAL TERMS
Input: A=(a ij ) ∈ℝn×n.
Output: All essential terms of the characteristic maxpolynomial of A, in the form kx+δ n−k .
-
1.
Solve the assignment problem with the cost matrix A and set \(\delta_{n}:=\mathrm{maper}(A)\) and \(z_{0}(x):=\delta_{n}\).
-
2.
Determine \(x_{1}\) as the intersection point of \(z_{0}(x)\) and \(z_{n}(x):=nx\).
-
3.
Let \(L:=\{(x_{1},0,n)\}\).
-
4.
If L=∅, stop. The function z(x) has been found. Otherwise choose an arbitrary element \((x_{i},k_{i}(l),k_{i}(r))\) from L and remove it from L.
-
5.
If \(k_{i}(r)=k_{i}(l)+1\), then (see Remark 2 above) go to step 4. (\(x_{i}\) is a corner of z(x); for x close to \(x_{i}\) the function z(x) has slope \(k_{i}(l)\) for \(x<x_{i}\), and \(k_{i}(r)\) for \(x>x_{i}\).)
-
6.
Find \(z(x_{i})=\mathrm{maper}(A(x_{i}))\). Take an arbitrary optimal permutation of the assignment problem for the matrix \(A(x_{i})\) and let \(k_{i}\) be the number of active elements in this solution. Set \(\delta_{n-k_{i}}:=z(x_{i})-k_{i}x_{i}\).
-
7.
Set \(z_{i}(x):=k_{i}x+\delta_{n-k_{i}}\).
-
8.
Intersect \(z_{i}(x)\) with the lines having slopes \(k_{i}(l)\) and \(k_{i}(r)\). Let \(y_{1}\) and \(y_{2}\) be the respective intersection points. Add the triples \((y_{1},k_{i}(l),k_{i})\) and \((y_{2},k_{i},k_{i}(r))\) to the list L and go to step 4. [See a refinement of this step after Proposition 5.3.14.]
Example 5.3.13
Let
We solve the assignment problem for A by the Hungarian method and transform A to a normal form. The asterisks indicate entries selected by an optimal permutation:
Thus \(z_{0}(x)=14\).
Now we solve 14=4x and we get \(x_{1}=3.5\). By solving the assignment problem for \(x_{1}=3.5\) we get:
Thus \(z_{2}(3.5)=17\) and we get \(z_{2}(x):=2x+10\). Intersecting this function with \(z_{0}(x)\) and \(z_{4}(x)\) yields the two new points \(x_{2}:=2\) (solving 14=2x+10) and \(x_{3}:=5\) (solving 2x+10=4x). Investigating x=2 shows that the slope changes at this point from 0 to 2; thus we have here a corner of z(x). Finding the value z(5) amounts to solving the assignment problem with the cost matrix
This assignment problem yields the solution \(z(5)=20=z_{4}(5)\). Thus no new essential term has been found and we have z(x) completely determined as
$$z(x)=\max(14,\,2x+10,\,4x).$$
In max-algebraic terms \(z(x)=14\oplus10\otimes x^{2}\oplus x^{4}\).
The following proposition enables us to make a computational refinement of the algorithm ESSENTIAL TERMS. We refer to the assignment problem terminology introduced in Sect. 1.6.4.
Proposition 5.3.14
Let \(\overline{x}\in\mathbb{R}\) and let B=(b ij ) be a normal form of \(A(\overline{x})\). Let C=(c ij ) be the matrix obtained from B as follows:
Then every π∈ap(C) [π∈ap(−C)] is an optimal solution to the assignment problem for \(A(\overline{x})\) with maximal [minimal] number of active elements.
Proof
The statement immediately follows from the definitions of C and of a normal form of a matrix. □
If for some value of \(\overline{x}\) there are two or more optimal solutions to the assignment problem for \(A(\overline{x})\) with different numbers of active elements then using Proposition 5.3.14 we can find an optimal solution with the smallest number and another one with the greatest number of active elements. This enables us to find two new lines (rather than one) in step 6 of Algorithm 5.3.12:
(a) z k (x):=kx+δ n−k , where k is the minimal number of active elements of an optimal solution to the assignment problem for \(A(\overline {x})\) and δ n−k is given by \(\delta_{n-k}:=z(\overline{x})-k\overline{x}\);
(b) z k′(x):=k′x+δ n−k′, where k′ is the maximal number of active elements of an optimal solution to the assignment problem for \(A(\overline{x})\) and δ n−k′ is given by \(\delta_{n-k^{\prime}}:=z(\overline{x})-k^{\prime}\overline{x}\).
In step 8 of Algorithm 5.3.12 we then intersect z k (x) with the line having slope k i (l) and z k′(x) with the line having slope k i (r).
So far we have assumed in this subsection that all entries of the matrix are finite. If some (but not all) entries of A are ε, the same algorithm as in the finite case can be used, except that the lowest-order finite term has to be found first, since a number of the coefficients of the characteristic maxpolynomial may be ε. The following theorem is useful here. In this theorem we denote
\(\underline{\delta}:=\min(nA_{\min},0)\qquad[\overline{\delta}:=\max(nA_{\max},0)],\)
where A min [A max ] is the least [greatest] finite entry of A. We will also denote in this and the next subsection
\(k_{0}:=\max\{k\in\{0,1,\ldots,n\};\delta_{k}\neq\varepsilon\},\qquad(5.13)\)
the largest k for which the coefficient δ k is finite.
Clearly, the lowest-order finite term of the characteristic maxpolynomial is \(z_{k_{0}}(x)=\delta_{k_{0}}\otimes x^{n-k_{0}}\).
Theorem 5.3.15
[38] If \(A\in\overline{\mathbb{R}}^{n\times n}\) then n−k 0 is the number of active elements in \(A(\overline {x})\), where \(\overline{x}\) is any real number satisfying
\(\overline{x}<\underline{\delta}-\overline{\delta},\)
and \(\delta_{k_{0}}=z(\overline{x})-(n-k_{0})\overline{x}\).
Proof
It is sufficient to prove that if x 0 is a point of intersection of two different linear pieces of χ A (x) then \(x_{0}\geq\underline{\delta}-\overline{\delta}\).
Suppose that
\(\delta_{r}\otimes x_{0}^{n-r}=\delta_{s}\otimes x_{0}^{n-s}\)
for some r, s∈{0,1,…,n}, r>s. Then
\(x_{0}=\frac{\delta_{r}-\delta_{s}}{r-s}.\)
If A min ≤0 then \(\delta_{r}\geq sA_{\min}\geq nA_{\min}=\underline {\delta}\). If A min ≥0 then \(\delta_{r}\geq sA_{\min}\geq 0=\underline{\delta}\). Hence \(\delta_{r}\geq\underline{\delta}\).
If A max ≤0 then \(\delta_{s}\leq rA_{\max}\leq0=\overline{\delta}\). If A max ≥0 then \(\delta_{s}\leq rA_{\max}\leq nA_{\max}=\overline{\delta}\). Hence \(\delta_{s}\leq\overline{\delta}\).
We deduce that \(\delta_{r}-\delta_{s}\geq\underline{\delta}-\overline{\delta}\) and the rest follows from the fact that r−s≥1 and \(\underline{\delta}-\overline{\delta}\leq0\). □
It follows from this result that for a general matrix, k 0 can be found using O(n 3) operations. Note that for symmetric matrices this problem can be converted to the maximum cardinality bipartite matching problem and thus solved in \(O(n^{2.5}/\sqrt{\log n})\) time [37].
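For small matrices, Theorem 5.3.15 can be checked by exhaustive search rather than an O(n 3) assignment algorithm. The sketch below (our own illustration, with a made-up matrix) evaluates maper(A(x̄)) at a very negative x̄ and counts the active elements, i.e. the diagonal positions where x̄ itself is selected:

```python
from itertools import permutations

EPS = float("-inf")

def maper_with_active(A, x):
    """maper(A(x)), where A(x) = A ⊕ x⊗I, together with the number of active
    elements (diagonal positions where x itself is selected) in an optimum."""
    n = len(A)
    best, active = EPS, 0
    for pi in permutations(range(n)):
        w, act = 0.0, 0
        for i, j in enumerate(pi):
            if i == j and x > A[i][j]:   # x wins on the diagonal: active element
                w, act = w + x, act + 1
            else:
                w += A[i][j]
        if w > best:
            best, active = w, act
    return best, active

# Finite entries are 3, 0, 1, so A_min = 0, A_max = 3 and any
# xbar < min(3*A_min, 0) - max(3*A_max, 0) = -9 will do.
A = [[EPS, 3, EPS],
     [0, EPS, EPS],
     [EPS, 1, EPS]]
xbar = -100.0
z, n_active = maper_with_active(A, xbar)
k0 = len(A) - n_active
print(k0, z - n_active * xbar)  # -> 2 3.0  (lowest-order finite term: 3 ⊗ x)
```

Here the optimal permutation is forced to use x̄ once (the all-ε last column), so n−k 0=1, giving k 0=2 and \(\delta_{k_{0}}=3\).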
Theorem 5.3.15 enables us to modify the beginning of the algorithm ESSENTIAL TERMS for \(A\in\overline{\mathbb{R}}^{n\times n}\) by finding the intersection of the lowest-order finite term \(z_{k_{0}}(x)\) (rather than z 0(x)) with x n. Moreover, instead of the classical assignment problem we formulate the problem in step 6 of the algorithm as the maximum weight perfect matching problem in a bipartite graph (N,N;E). This graph has an arc (i,j)∈E if and only if a ij is finite. It is known [1] that the maximum weight perfect matching problem in a graph with m arcs can be solved by a shortest augmenting path method using Fibonacci heaps in O(n(m+nlog n)) time. Since in the worst case O(n) such matching problems must be solved, we obtain the following result.
Theorem 5.3.16
[20] If \(A\in\overline{\mathbb{R}}^{n\times n}\) has m finite entries, then all essential terms of χ A (x) can be found in O(n 2(m+nlog n)) time.
5.3.4 Special Matrices
Although no polynomial method seems to exist for finding all coefficients of a characteristic maxpolynomial for general matrices or even for matrices over {0,−∞}, there are a number of special cases for which this problem can be solved efficiently. These include permutation, pyramidal, Hankel and Monge matrices and special matrices over {0,−∞} [28, 37, 116].
We briefly discuss two special types: diagonally dominant matrices and matrices over {0,−∞}.
Proposition 5.3.17
If \(A=(a_{ij})\in\overline{\mathbb{R}}^{n\times n}\) is diagonally dominant then so are all principal submatrices of A and all coefficients of the characteristic maxpolynomial can be found by the formula
\(\delta_{k}=a_{i_{1}i_{1}}\otimes a_{i_{2}i_{2}}\otimes\cdots\otimes a_{i_{k}i_{k}}\)
for k=1,…,n, where \(a_{i_{1}i_{1}}\geq a_{i_{2}i_{2}}\geq\cdots \geq a_{i_{n}i_{n}}\).
Proof
Let A be a diagonally dominant matrix, B=A(i 1,i 2,…,i k ) for some indices i 1,i 2,…,i k and suppose that id∉ap(B). Take any π∈ap(B) and extend π to a permutation σ of the set N by setting σ(i)=i for every i∉{i 1,i 2,…,i k }. Then σ has weight greater than that of id∈P n , contradicting the diagonal dominance of A. The formula follows. □
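By Proposition 5.3.17 the computation reduces to sorting the diagonal: in conventional terms δ k is the sum of the k greatest diagonal entries. A small sketch (the matrix is our own illustrative example, chosen to be diagonally dominant):

```python
def char_maxpoly_coeffs(A):
    """δ_1, ..., δ_n for a diagonally dominant matrix A: δ_k is the
    conventional sum of the k greatest diagonal entries of A."""
    diag = sorted((A[i][i] for i in range(len(A))), reverse=True)
    coeffs, s = [], 0.0
    for d in diag:
        s += d
        coeffs.append(s)
    return coeffs

# Diagonally dominant: id is a heaviest permutation (weight 12).
A = [[5, 1, 0],
     [2, 4, 1],
     [0, 1, 3]]
print(char_maxpoly_coeffs(A))  # -> [5.0, 9.0, 12.0]
```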
Matrices over T={0,−∞} have implications for problems outside max-algebra and in particular for the conventional permanent, which for a real matrix A=(a ij ) we denote as usual by per(A), that is,
\(\operatorname{per}(A)=\sum_{\pi\in P_{n}}\prod_{i\in N}a_{i\pi(i)}.\)
If A=(a ij )∈T n×n then δ k =0 or δ k =−∞ for every k=1,…,n. Clearly, δ k =0 if and only if there is a k×k principal submatrix of A with k independent zeros, that is, with k zeros selected by a permutation or, equivalently, k zeros no two of which are either from the same row or from the same column.
It is easy to see that if A=(a ij )∈T n×n then \(B=2^{A}=(2^{a_{ij}})=(b_{ij})\) is a zero-one matrix. If π∈P n then
\(\prod_{i\in N}b_{i\pi(i)}=2^{\sum_{i\in N}a_{i\pi(i)}}.\)
Hence per(B)>0 is equivalent to
\(\sum_{i\in N}a_{i\pi(i)}=0\quad\mbox{for some }\pi\in P_{n}.\)
But this is equivalent to
maper(A)=0.
Thus, the task of finding the coefficient δ k of the characteristic maxpolynomial of a square matrix over T is equivalent to the following problem expressed in terms of the classical permanents:
PRINCIPAL SUBMATRIX WITH POSITIVE PERMANENT : Given an n×n zero-one matrix A and a positive integer k (k≤n), is there a k×k principal submatrix B of A with positive (conventional) permanent?
Another equivalent version for matrices over T is graph-theoretical: Since every permutation is a product of cycles, δ k =0 means that in D A (and F A ) there is a set of pairwise node-disjoint cycles covering exactly k nodes. Hence deciding whether δ k =0 is equivalent to the following:
EXACT CYCLE COVER : Given a digraph D with n nodes and a positive integer k ( k≤n), is there a set of pairwise node-disjoint cycles covering exactly k nodes of D ?
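Both formulations can be decided by brute force on small instances. The sketch below (our own, exponential and purely illustrative) tests whether δ k =0, i.e. whether some k-element index set carries a permutation selecting only zeros — equivalently, node-disjoint cycles of D A covering exactly k nodes:

```python
from itertools import combinations, permutations

EPS = float("-inf")

def delta_k_is_zero(A, k):
    """A over T = {0, -inf}: decide whether δ_k = 0, i.e. whether some k×k
    principal submatrix of A has k independent zeros (equivalently, whether
    the digraph D_A has node-disjoint cycles covering exactly k nodes)."""
    n = len(A)
    for S in combinations(range(n), k):   # choose the k nodes / indices
        for pi in permutations(S):        # a permutation of that index set
            if all(A[i][j] == 0 for i, j in zip(S, pi)):
                return True
    return False

# D_A has arcs 0<->1 and 1<->2 only: its two 2-cycles share node 1, so
# exactly 2 nodes can be covered, but not 1 (no loops) and not 3.
A = [[EPS, 0, EPS],
     [0, EPS, 0],
     [EPS, 0, EPS]]
print([delta_k_is_zero(A, k) for k in (1, 2, 3)])  # -> [False, True, False]
```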
Finally, it may be useful to see that the value of k 0 defined by (5.13) can explicitly be described for matrices over {0,−∞}:
Theorem 5.3.18
[28] If A∈T n×n then k 0=n+maper(A⊕(−1)⊗I).
Proof
Since all finite δ k are 0, in conventional notation we have:
\(\chi_{A}(x)=\max_{k:\delta_{k}\neq\varepsilon}(n-k)x.\)
Therefore, for x<0:
\(\chi_{A}(x)=(n-k_{0})x,\)
from which the result follows by setting x=−1, since \(\chi_{A}(-1)=\operatorname{maper}(A\oplus(-1)\otimes I)\). □
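Theorem 5.3.18 is easy to verify on small matrices over T by exhaustive search (our own sketch, with an illustrative matrix): compare n+maper(A⊕(−1)⊗I) with the largest k for which some k×k principal submatrix has max-algebraic permanent 0:

```python
from itertools import combinations, permutations

EPS = float("-inf")

def maper(A):
    # max-algebraic permanent by exhaustive search over permutations
    n = len(A)
    return max(sum(A[i][pi[i]] for i in range(n))
               for pi in permutations(range(n)))

def with_minus_one_on_diagonal(A):
    # A ⊕ (−1) ⊗ I: entrywise maximum with −1 on the diagonal
    return [[max(a, -1.0) if i == j else a for j, a in enumerate(row)]
            for i, row in enumerate(A)]

def k0_direct(A):
    # largest k with δ_k ≠ ε: some k×k principal submatrix of A has
    # max-algebraic permanent 0 (δ_0 = 0 holds trivially)
    n, best = len(A), 0
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            if maper([[A[i][j] for j in S] for i in S]) == 0:
                best = k
    return best

A = [[EPS, 0, EPS],
     [0, EPS, 0],
     [EPS, 0, EPS]]
print(k0_direct(A), len(A) + maper(with_minus_one_on_diagonal(A)))  # -> 2 2.0
```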
5.3.5 Cayley–Hamilton in Max-algebra
A max-algebraic analogue of the Cayley–Hamilton Theorem was proved in [119] and [140], see also [8]. Some notation used here has been introduced in Sect. 1.6.4.
Let A=(a ij )∈ℝn×n and v∈ℝ. Let us denote
and
The following equation is called the (max-algebraic) characteristic equation for A (recall that max ∅=ε):
where
and
Theorem 5.3.19
(Cayley–Hamilton in max-algebra) Every real square matrix A satisfies its max-algebraic characteristic equation.
An application of this result in the theory of discrete-event dynamic systems can be found in Sect. 6.4.
In general it is not easy to find a max-algebraic characteristic equation for a matrix. However, as the next theorem shows, unlike for characteristic maxpolynomials it is relatively easy to do so for matrices over T={0,−∞}. Given a matrix A=(a ij ), the symbol 2A will stand for the matrix \((2^{a_{ij}})\).
Theorem 5.3.20
[28] If A∈T n×n then the coefficients d k in the max-algebraic characteristic equation for A are the coefficients at λ n−k of the conventional characteristic polynomial for the matrix 2A.
Proof
If A∈T n×n then all finite c k are 0. Note that if k∈N and maper(B)=ε for all B∈P k (A) then the term c k ⊗λ n−k does not appear on either side of the equation. If B=(b ij )∈T k×k then p +(B,0) is the number of even permutations that select only zeros from B. The matrix 2B is zero-one, zeros corresponding to −∞ in B and ones corresponding to zeros in B. Thus p +(B,0) is the number of even permutations that select only ones from 2B. Similarly for p −(B,0). Since 2B is zero-one, all terms in the standard determinant expansion of 2B are either 1 (if the corresponding permutation is even and selects only ones), or −1 (if the corresponding permutation is odd and selects only ones), or 0 (otherwise). Hence det 2B=p +(B,0)−p −(B,0). Since
it follows that
which is the coefficient at λ n−k of the conventional characteristic polynomial of the matrix 2A. □
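Theorem 5.3.20 reduces the computation of the d k to a conventional characteristic polynomial. A self-contained sketch (the matrix is our own example, no external libraries): build the zero-one matrix 2 A and expand det(λI−2 A ) via principal minors, the coefficient at λ n−k being (−1) k times the sum of the principal k×k minors:

```python
from itertools import combinations, permutations

def det(M):
    """Determinant by signed permutation expansion (fine for tiny matrices)."""
    n = len(M)
    total = 0
    for pi in permutations(range(n)):
        # parity of pi via its inversion count
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])
        prod = 1
        for i in range(n):
            prod *= M[i][pi[i]]
        total += (-1) ** inv * prod
    return total

def charpoly_coeffs(B):
    """Coefficients [1, c_1, ..., c_n] of det(λI − B) in decreasing powers:
    c_k = (−1)^k times the sum of all principal k×k minors of B."""
    n = len(B)
    coeffs = [1]
    for k in range(1, n + 1):
        s = sum(det([[B[i][j] for j in S] for i in S])
                for S in combinations(range(n), k))
        coeffs.append((-1) ** k * s)
    return coeffs

EPS = float("-inf")
A = [[0, EPS, 0],
     [0, 0, EPS],
     [EPS, 0, 0]]
B = [[1 if a == 0 else 0 for a in row] for row in A]  # B = 2^A, a 0-1 matrix
print(charpoly_coeffs(B))  # -> [1, -3, 3, -2], i.e. λ^3 − 3λ^2 + 3λ − 2
```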
5.4 Exercises
Exercise 5.4.1
Find the standard form of
and then factorize it using RECTIFICATION and RESOLUTION. [1⊗z 2.5⊗(2⊕1⊗z 2.2⊕3⊗z 3.7⊕z 5.8); \(1\otimes z^{2.5}\otimes(-\frac{1}{3.7}\oplus z)^{3.7}\otimes(\frac{3}{2.1}\oplus z)^{2.1}\)]
Exercise 5.4.2
Find the characteristic maxpolynomial and characteristic equation for the following matrices; factorize the maxpolynomial and check whether χ A (x)=LHS⊕RHS of the maxpolynomial equation:
(a) \(A=\left(\begin{array}{r@{\quad }r@{\quad }r}3&-2&1\\4&0&5\\3&1&2\end{array}\right)\). [χ A (x)=9⊕6⊗x⊕3⊗x 2⊕x 3=(3⊕x)3; λ 3⊕9=3⊗λ 2⊕6⊗λ]
(b) \(A=\left(\begin{array}{r@{\quad }r@{\quad }r}1&0&-3\\2&3&1\\4&-2&0\end{array}\right)\). [χ A (x)=5⊕4⊗x⊕3⊗x 2⊕x 3=(1⊕x)2⊗(3⊕x); λ 3⊕4⊗λ=3⊗λ 2⊕5]
(c) \(A=\left(\begin{array}{r@{\quad }r@{\quad }r}1&2&5\\-1&0&3\\1&1&1\end{array}\right)\). [χ A (x)=6⊕6⊗x⊕1⊗x 2⊕x 3=(3⊕x)2⊗(0⊕x); λ 3=1⊗λ 2⊕6⊗λ]
Exercise 5.4.3
A square matrix A is called strictly diagonally dominant if ap(A)={id}. Find a formula for the characteristic equation of strictly diagonally dominant matrices. [λ n⊕δ 2⊗λ n−2⊕δ 4⊗λ n−4⊕⋅⋅⋅=δ 1⊗λ n−1⊕δ 3⊗λ n−3⊕δ 5⊗λ n−5⊕⋅⋅⋅ where δ k = the sum of k greatest diagonal values]
References
Ahuja, R. K., Magnanti, T., & Orlin, J. B. (1993). Network flows: Theory, algorithms and applications. Englewood Cliffs: Prentice Hall.
Akian, M., Bapat, R., & Gaubert, S. (2004). Perturbation of eigenvalues of matrix pencils and optimal assignment problem. Comptes Rendus de L’Academie Des Sciences Paris, Série I, 339, 103–108.
Baccelli, F. L., Cohen, G., Olsder, G.-J., & Quadrat, J.-P. (1992). Synchronization and linearity. Chichester: Wiley.
Burkard, R. E., & Butkovič, P. (2003). Max algebra and the linear assignment problem. Mathematical Programming Series B, 98, 415–429.
Butkovič, P. (2003). On the complexity of computing the coefficients of max-algebraic characteristic polynomial and characteristic equation. Kybernetika, 39, 129–136.
Butkovič, P., & Lewis, S. (2007). On the job rotation problem. Discrete Optimization, 4, 163–174.
Butkovič, P., & Murfitt, L. (2000). Calculating essential terms of a characteristic maxpolynomial. Central European Journal of Operations Research, 8, 237–246.
Cuninghame-Green, R. A. (1983). The characteristic maxpolynomial of a matrix. Journal of Mathematical Analysis and Applications, 95, 110–116.
Cuninghame-Green, R. A. (1995). Maxpolynomial equations. Fuzzy Sets and Systems, 75(2), 179–187.
Cuninghame-Green, R. A. (1995). Minimax algebra and applications. In Advances in imaging and electron physics (Vol. 90, pp. 1–121). New York: Academic Press.
Cuninghame-Green, R. A., & Meijer, P. F. (1980). An algebra for piecewise-linear minimax problems. Discrete Applied Mathematics, 2, 267–294.
Gondran, M., & Minoux, M. (1978). L’indépendance linéaire dans les dioïdes. Bulletin de la Direction Etudes et Recherches. EDF, Série C, 1, 67–90.
Murfitt, L. (2000). Discrete-event dynamic systems in max-algebra. Thesis, University of Birmingham.
Olsder, G. J., & Roos, C. (1988). Cramér and Cayley-Hamilton in the max algebra. Linear Algebra and Its Applications, 101, 87–108.
Straubing, H. (1983). A combinatorial proof of the Cayley-Hamilton theorem. Discrete Mathematics, 43, 273–279.
© 2010 Springer-Verlag London Limited
Butkovič, P. (2010). Maxpolynomials. The Characteristic Maxpolynomial. In: Max-linear Systems: Theory and Algorithms. Springer Monographs in Mathematics. Springer, London. https://doi.org/10.1007/978-1-84996-299-5_5