1 Introduction

The minimum spanning tree problem (MSTP) is well studied in the combinatorial optimization literature. A generalization of this problem, called the quadratic minimum spanning tree problem (QMSTP), has recently received considerable attention from the research community. Some of these papers focus on exact algorithms (Assad and Xu 1992; Pereira et al. 2015a, b) and lower bounds (Pereira et al. 2013, 2015b; Rostami and Malucelli 2015), while the majority of the published works deal with heuristic algorithms (Cordone and Passeri 2012; Fu and Hao 2015; Lozano et al. 2014; Öncan and Punnen 2010; Palubeckis et al. 2010; Sundar and Singh 2010; Zhou and Gen 1998). Isolated results on some theoretical properties of the problem are also available. Special cases of QMSTP studied in the literature include multiplicative objective functions (Goyal et al. 2011; Kern and Woeginger 2007; Mittal and Schulz 2013), spanning trees with conflict pair constraints (Darmann et al. 2011; Zhang et al. 2011), and spanning tree problems with one quadratic term (Buchheim and Klein 2014; Fischer and Fischer 2013). A multiobjective version of QMSTP (Maia et al. 2013, 2014) and a version with fuzzy costs (Gao and Lu 2005) have also been investigated in the literature. Some polynomially solvable special cases of QMSTP are discussed in Ćustić et al. (2016), along with various complexity results.

Let \(G=(V,E)\) be a simple graph such that \(|V|=n\) and \(E=\{1,2,\ldots ,m\}\). For each \((e,f)\in E\times E\) a cost q(e,f) is given. Let \(\mathcal {F}\) be the family of all spanning trees of G and Q be an \(m\times m\) matrix, with its (i,j)-th entry denoted by q(i,j). The cost Q(T) of each \(T\in \mathcal {F}\) is given by

$$\begin{aligned} Q(T)= \sum _{e\in T}\sum _{f\in T} q(e,f). \end{aligned}$$

The notation \(e\in T\) is used to indicate that e belongs to the edge set of T. Then the QMSTP is to find a spanning tree T in \(\mathcal {F}\) such that Q(T) is as small as possible. The QMSTP is well known to be strongly NP-hard. In fact, it is NP-hard even if the square matrix Q is of rank one (Punnen 2001) or the underlying graph is a wheel (Ćustić et al. 2016).

Similarly, for each \(T\in \mathcal {F}\), let \(C(T)=\sum _{e\in T}c(e)\), where c(e) is a given cost of edge \(e\in E\). Given a cost matrix Q, the quadratic spanning tree linearization problem (QST-LP) is to determine if there exists a linear cost vector \(C=(c(1),c(2),\ldots ,c(m))\) such that \(Q(T)=C(T)\) for all \(T\in \mathcal {F}\). If the answer to this decision problem is ‘yes’, the quadratic cost matrix Q is said to be linearizable and C is called a linearization of Q. In the literature, QMSTP is considered predominantly in the context of a complete graph G. In that case, \(|\mathcal {F}|\) is equal to \(n^{n-2}\), and hence QST-LP is a non-trivial problem. In fact, there is no immediately obvious way to test whether QST-LP belongs to NP, whereas testing membership in NP is straightforward for most NP-complete versions of combinatorial optimization problems. This makes the investigation of QST-LP even more interesting.
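To make the decision problem concrete, the following brute-force sketch (ours, in Python; the function names qst_lp and spanning_trees are hypothetical) enumerates all spanning trees of a small graph, solves the linear system \(C(T)=Q(T)\) in the least-squares sense, and reports Q linearizable exactly when the residual vanishes. It is exponential in general and is meant only to make the definition tangible.

```python
import numpy as np
from itertools import combinations

def spanning_trees(n, edges):
    """Yield every (n-1)-subset of edge indices that forms a spanning tree."""
    for T in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def root(v):
            while parent[v] != v:
                v = parent[v]
            return v
        ok = True
        for i in T:
            ru, rv = root(edges[i][0]), root(edges[i][1])
            if ru == rv:          # adding this edge would close a cycle
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield T

def qst_lp(n, edges, Q, tol=1e-9):
    """Decide QST-LP by brute force: is there C with C(T) = Q(T) for all trees T?"""
    rows, rhs = [], []
    for T in spanning_trees(n, edges):
        x = np.zeros(len(edges))
        x[list(T)] = 1.0                       # incidence vector of the tree
        rows.append(x)
        rhs.append(sum(Q[e, f] for e in T for f in T))
    X, y = np.array(rows), np.array(rhs)
    C, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares solution of X C = y
    return np.allclose(X @ C, y, atol=tol), C

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K_4
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Q = a[:, None] + a[None, :]                    # weak sum matrix (see Theorem 5 below)
print(qst_lp(4, edges, Q)[0])                  # True
Q2 = Q.copy(); Q2[0, 1] += 1.0; Q2[1, 0] += 1.0
print(qst_lp(4, edges, Q2)[0])                 # False: no longer a weak sum matrix
```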

The linearization problem for the quadratic assignment problem (QAP) was considered by Kabadi and Punnen (2011), Adams and Waddell (2014) and Çela et al. (2016). The special case of the Koopmans–Beckmann QAP linearization problem was studied by Bookhold (1990), Punnen and Kabadi (2013) and Çela et al. (2016). Although fast algorithms exist that recognize whether a QAP instance is linearizable, a simple closed form expression characterizing linearizable instances is known only for the symmetric Koopmans–Beckmann QAP (see Kabadi and Punnen 2011; Punnen and Kabadi 2013). The linearization problem was studied for the bilinear assignment problem by Ćustić et al. (2017) and for the quadratic travelling salesman problem by Punnen and Woods (2017).

In this paper, we provide a characterization of linearizable instances of QMSTP in the standard context of complete graphs (Sect. 3). Our characterization can be tested in \(O(m^2)\) time, and unlike the QAP linearization results, it is given as a simple closed form condition on the matrix Q. Also, an O(m) algorithm for recognizing an \(m\times m\) sum matrix represented in factored form is given. This leads to an O(m) algorithm to test if a symmetric matrix Q is linearizable when represented in factored form (Sect. 5). As a byproduct of these results, we obtain new polynomially solvable special cases of the QMSTP. Further, we extend the characterization beyond complete graphs, e.g., to complete bipartite graphs, cactuses, etc. (Sect. 4). Concluding remarks are presented in Sect. 6 along with some open problems.

2 Preliminaries

In this section we present some definitions and basic facts about the QMSTP that are used later in the paper.

Let \(M^{m\times m}\) be the vector space of all real valued \(m\times m\) matrices. The set of linearizable quadratic cost matrices for QMSTP on a given graph with m edges forms a subspace of \(M^{m\times m}\). As a consequence we have the following.

Observation 1

Let \(Q_1\) and \(Q_2\) be two cost matrices for the QMSTP on a graph G. If \(Q_1\) and \(Q_2\) are linearizable, then \(\alpha Q_1+\beta Q_2\) is also linearizable for any scalars \(\alpha \) and \(\beta \). Furthermore, if \(C_1\) is a linearization of \(Q_1\) and \(C_2\) is a linearization of \(Q_2\), then \(\alpha C_1+\beta C_2\) is a linearization of \(\alpha Q_1+\beta Q_2\).

A square matrix A is said to be a skew-symmetric matrix if \(A^T=-A\).

Observation 2

If Q is a cost matrix for the QMSTP on a graph G, A is a skew-symmetric matrix and D is a diagonal matrix, all of the same size, then Q is linearizable if and only if \(Q+A+D\) is linearizable.

It may be noted that, if Q is skew-symmetric, then \(Q(T)=0\) for any spanning tree T, since \(q(e,f)+q(f,e)=0\) for \(e\ne f\) and the diagonal entries of a skew-symmetric matrix are zero. Thus a skew-symmetric matrix is linearizable for any graph G, with the null-vector as a linearization.
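A quick numerical sanity check of Observations 1 and 2 (our own sketch; the random setup on \(K_4\) is an assumption for illustration): a random skew-symmetric matrix has \(Q(T)=0\) on every spanning tree, and a linear combination of a skew-symmetric matrix (linearized by the null-vector) and a diagonal matrix (linearized by its own diagonal) is linearized by the corresponding combination.

```python
import numpy as np
from itertools import combinations

def spanning_trees(n, edges):
    """Yield every (n-1)-subset of edge indices that forms a spanning tree."""
    for T in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def root(v):
            while parent[v] != v:
                v = parent[v]
            return v
        ok = True
        for i in T:
            ru, rv = root(edges[i][0]), root(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield T

def tree_cost(Q, T):
    """Q(T) = sum of q(e,f) over ordered pairs of tree edges."""
    return sum(Q[e, f] for e in T for f in T)

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K_4
rng = np.random.default_rng(0)
B = rng.normal(size=(6, 6))
A = B - B.T                       # skew-symmetric: linearized by the null-vector
D = np.diag(rng.normal(size=6))   # diagonal: linearized by its own diagonal
alpha, beta = 2.0, -3.0
Q = alpha * A + beta * D          # linearizable by Observation 1
C = beta * np.diag(D)             # alpha * (null-vector) + beta * diag(D)
for T in spanning_trees(4, edges):
    assert abs(tree_cost(A, T)) < 1e-9                     # Q(T) = 0 for skew-symmetric A
    assert abs(tree_cost(Q, T) - C[list(T)].sum()) < 1e-9  # C linearizes Q
```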

Observation 3

Let Q be a cost matrix for the QMSTP on a graph G. Then Q is linearizable if and only if \(\frac{1}{2}(Q+Q^T)\) is linearizable. Furthermore, C is a linearization of Q if and only if C is a linearization of \(\frac{1}{2}(Q+Q^T).\)

Proof

Note that \(Q= \frac{1}{2}(Q-Q^T)+\frac{1}{2}(Q+Q^T)\). As \( \frac{1}{2}(Q-Q^T)\) is skew-symmetric, the result follows from Observations 1 and 2, together with the fact that the null-vector is a linearization of a skew-symmetric matrix. \(\square \)

Note that \(\frac{1}{2}(Q+Q^T)\) is a symmetric matrix. Thus, in view of Observation 3, we hereafter assume without loss of generality that the cost matrix Q is symmetric.

Definition 4

An \(n_1\times n_2\) matrix \(H=(h(i,j))\) is called a sum matrix if there exist vectors \(a=(a(1),a(2),\ldots ,a(n_1))\) and \(b=(b(1),b(2),\ldots ,b(n_2))\) such that \(h(i,j)=a(i)+b(j)\) for all \(i=1,\ldots ,n_1\) and \(j=1,\ldots ,n_2\). A square matrix is called a weak sum matrix if the relation above is not necessarily satisfied for the diagonal elements.

Note that if an \(n\times n\) square sum matrix \(H=(h(i,j))\) is symmetric, then \(h(i,j)=a(i)+a(j)\) for all \(i,j=1,2,\ldots ,n\), for some vector \(a=(a(1),\ldots ,a(n))\). Similarly, if an \(n\times n\) square weak sum matrix \(H=(h(i,j))\) is symmetric, then \(h(i,j)=a(i)+a(j)\) for all \(i,j=1,2,\ldots ,n,\ i\ne j\), for some vector \(a=(a(1),\ldots ,a(n))\).

Furthermore, given an \(n_1\times n_2\) (weak) sum matrix \(H=(h(i,j))\), it takes \(O(n_1+n_2)\) time to find two vectors \(a=(a(1),a(2),\ldots ,a(n_1))\), \(b=(b(1),b(2),\ldots ,b(n_2))\) such that \(h(i,j)=a(i)+b(j)\) \(\forall i,j\) (\(i\ne j\) for the weak sum case). For example, we can set a(1) to an arbitrary value, after which each of the remaining a(i)’s and b(i)’s can be calculated in constant time. In the case when H is a symmetric square matrix, a vector \({\bar{a}}=({\bar{a}}(1),{\bar{a}}(2),\ldots ,{\bar{a}}(n))\) such that \(h(i,j)={\bar{a}}(i)+{\bar{a}}(j)\) is produced by setting \({\bar{a}}(i)=(a(i)+b(i))/2\) \(\forall i\).
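The recovery just described can be sketched as follows (our code; the anchoring choice \(a(1)=0\) matches the discussion above, and the three-entry formula in the symmetric weak sum case is one convenient way to avoid touching the diagonal).

```python
def recover_sum_vectors(H):
    """H is an n1 x n2 sum matrix h(i,j) = a(i) + b(j); recover one such pair
    in O(n1 + n2) time by fixing a(0) = 0 (0-based indices)."""
    b = list(H[0])                                 # a(0) = 0 forces b(j) = h(0, j)
    a = [0.0] + [H[i][0] - b[0] for i in range(1, len(H))]   # h(i,0) = a(i) + b(0)
    return a, b

def recover_weak_sum_vector(H):
    """H is an n x n symmetric weak sum matrix (n >= 3): h(i,j) = a(i) + a(j)
    for i != j.  Recover a in O(n) time using only off-diagonal entries."""
    a0 = (H[0][1] + H[0][2] - H[1][2]) / 2.0       # a0 + a1 + a0 + a2 - a1 - a2 = 2 a0
    return [a0] + [H[0][j] - a0 for j in range(1, len(H))]
```

Verifying that a recovered vector actually reproduces every entry takes \(O(n_1n_2)\) further time; this is the basis of the \(O(m^2)\) recognition discussed in Sect. 5.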

3 Characterization of linearizable QMSTP on \(K_n\)

In the QMSTP literature, the default underlying graph structure is a complete graph \(K_n\). In such a setting, we show that a symmetric cost matrix is linearizable if and only if it is a symmetric weak sum matrix.

Theorem 5

A symmetric cost matrix Q of the QMSTP on a complete graph \(K_n\) is linearizable if and only if it is a symmetric weak sum matrix. Further, a linearization of a linearizable symmetric matrix Q can be identified by a closed form expression.

Proof

First assume that \(Q=(q(e,f))\) is a weak sum matrix, i.e., there exist a(e), \(e=1,\ldots ,m\), such that \(q(e,f)=a(e)+a(f)\) for all \(e\ne f\). Then we have

$$\begin{aligned} Q(T)&=\sum _{e\in T}\sum _{f\in T}q(e,f)=\sum _{e\in T}\sum _{\begin{array}{c} f\in T\\ f\ne e \end{array}} (a(e)+a(f))+ \sum _{e\in T}q(e,e)\\&=\sum _{e\in T}(2n-4)a(e)+\sum _{e\in T}q(e,e)\\&=\sum _{e\in T}c(e), \end{aligned}$$

where a linearization c(e) is given by

$$\begin{aligned} c(e):=(2n-4)a(e)+q(e,e). \end{aligned}$$

Next, we assume that Q is linearizable. It is not hard to see that for \(n\le 3\) every corresponding symmetric square cost matrix is a symmetric weak sum matrix. Namely, the conditions \(q(i,j)=a(i)+a(j)\) for all non-diagonal elements of a \(3\times 3\) symmetric matrix lead to a system of three independent linear equations with three variables. Hence we can assume that \(n\ge 4\).

Consider an \({n\atopwithdelims ()2}\times {n \atopwithdelims ()2}\) sum matrix \(H=(h(i,j))\) of the form \(h(i,j)=a(i)+a(j)\), where \(a(1)=0\) and \(a(i)=q(i,1)\) for \(i=2,\ldots , {n \atopwithdelims ()2}\). By subtracting H and an appropriate diagonal matrix from Q, we obtain zeros on the first row, the first column and the diagonal. Since H is a sum matrix, it is linearizable, and furthermore, any diagonal matrix is linearizable. Hence, from Observation 1 it follows that without loss of generality we can assume that the elements of the first row, the first column and the diagonal of Q are equal to zero. In that case, showing that Q is a weak sum matrix is equivalent to showing that all elements of Q that are not in the first row, the first column or on the diagonal have the same value. Namely, \(q(1,j)=0=a(1)+a(j)\) for all \(j\ge 2\) implies that a(j) is the same for all \(j\ge 2\). Conversely, if all elements not in the first row/column or on the diagonal have the same value K, then Q is obviously a weak sum matrix (\(q(i,j)=a(i)+a(j)\) where \(a(1)=0\) and \(a(i)=K/2\) for \(i\ge 2\)).

Now, we assume the contrary, i.e. Q is linearizable but there are two elements of Q (not in the first row/column or on the diagonal) that have different values. Moreover, due to the symmetry of Q, there is a row b that contains two such elements q(b,x) and q(b,y) with distinct values. As any element of row b (except q(b,1) and q(b,b)) can be a member of such a pair, without loss of generality we can assume that edges 1 and x are nonincident.

Next, we show that there exists a cycle \(C_1\) that contains edges 1 and b, and a cycle \(C_2\) that contains edges x and y, with the following property: \(C_1\cup C_2\setminus \{e,f\}\) does not contain a cycle for all \(e\in \{1,b\}\), \(f\in \{x,y\}\). In the case when there are no two pairs of edges from \(\{1,b\}\times \{x,y\}\) that are incident, it is straightforward to construct \(C_1\) and \(C_2\) that are edge disjoint and satisfy the above property, see Fig. 1a. In the case when there are at least two pairs of edges from \(\{1,b\}\times \{x,y\}\) that are incident, all possible configurations of 1, b, x, y can be reduced to only two cases, presented in Fig. 1b and c. These reductions consist of symmetries (defined by exchanging the sets \(\{1,b\}\) and \(\{x,y\}\), and by exchanging elements inside of those two sets), and edge contractions (that can induce more incidences and only make the case more complicated). The configurations in Fig. 1b and c are extended with a (dashed) edge that constitutes feasible \(C_1\) and \(C_2\). In particular, \(C_1=\{1,b,e_{dash}\}\), \(C_2=\{x,y,b\}\) in Fig. 1b and \(C_1=\{1,b,e_{dash}\}\), \(C_2=\{x,y,e_{dash}\}\) in Fig. 1c, where \(e_{dash}\) is the dashed auxiliary edge in the figures. Note that we used the fact that 1 and x are nonincident; otherwise there are instances for which \(C_1\) and \(C_2\) with the property above do not exist, see Fig. 1d.

Fig. 1 Configurations of \(\{1,b,x,y\}\) and corresponding \(C_1\cup C_2\)

Let T be a minimum cardinality set of edges such that \(T\cup C_1\cup C_2\) is connected and spans the remaining vertices. Then we define B to be \(T\cup C_1\cup C_2 \setminus \{1,b,x,y\}\). It is easy to see that B extended by any two edges \(e\in \{1,b\}\) and \(f\in \{x,y\}\) forms a spanning tree.

Let \(C=(c(i))\) be a cost vector that linearizes Q. Since both \(B\cup \{b,x\}\) and \(B\cup \{1,x\}\) form a spanning tree, we have that

$$\begin{aligned} C(B\cup \{b,x\})-C(B\cup \{1,x\})=\sum _{e\in B\cup \{b,x\}}c(e)-\sum _{e\in B\cup \{1,x\}}c(e)=c(b)-c(1). \end{aligned}$$

Analogously, \(C(B\cup \{b,y\})-C(B\cup \{1,y\})=c(b)-c(1)\), hence

$$\begin{aligned} Q(B\cup \{b,x\})-Q(B\cup \{1,x\})=Q(B\cup \{b,y\})-Q(B\cup \{1,y\}). \end{aligned}$$
(1)

Now let us express the cost of the spanning tree \(B\cup \{b,x\}\) in terms of the quadratic cost matrix Q. Since \(q(e,e)=0\) for all e,  we have

$$\begin{aligned} Q(B\cup \{b,x\})&=\sum _{e\in B\cup \{b,x\}}\sum _{f\in B\cup \{b,x\}}q(e,f)\\&=\sum _{e\in B}\sum _{f\in B}q(e,f)+\sum _{e\in B}2q(b,e)+\sum _{e\in B}2q(x,e)+2q(b,x). \end{aligned}$$

Since \(q(1,e)=0\) for all e we analogously have

$$\begin{aligned} Q(B\cup \{1,x\})=\sum _{e\in B}\sum _{f\in B}q(e,f)+\sum _{e\in B}2q(x,e). \end{aligned}$$

Therefore

$$\begin{aligned} Q(B\cup \{b,x\})-Q(B\cup \{1,x\})=\sum _{e\in B}2q(b,e)+2q(b,x). \end{aligned}$$
(2)

Analogously

$$\begin{aligned} Q(B\cup \{b,y\})-Q(B\cup \{1,y\})=\sum _{e\in B}2q(b,e)+2q(b,y). \end{aligned}$$
(3)

Then from (1), (2) and (3) it follows that \(q(b,x)=q(b,y)\), which contradicts our choice of b, x and y. \(\square \)
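The forward direction of Theorem 5 is easy to check numerically. The sketch below (ours; the random instance on \(K_5\) is an assumption for illustration) builds a symmetric weak sum matrix on the \(m=10\) edges of \(K_5\), applies the closed form linearization \(c(e)=(2n-4)a(e)+q(e,e)\), and confirms \(Q(T)=C(T)\) on all 125 spanning trees.

```python
import numpy as np
from itertools import combinations

def spanning_trees(n, edges):
    """Yield every (n-1)-subset of edge indices that forms a spanning tree."""
    for T in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def root(v):
            while parent[v] != v:
                v = parent[v]
            return v
        ok = True
        for i in T:
            ru, rv = root(edges[i][0]), root(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield T

n = 5
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]   # K_5, m = 10
rng = np.random.default_rng(1)
a = rng.integers(-5, 6, size=len(edges)).astype(float)
Q = a[:, None] + a[None, :]                       # q(e,f) = a(e) + a(f) for e != f
np.fill_diagonal(Q, rng.normal(size=len(edges)))  # arbitrary diagonal entries
c = (2 * n - 4) * a + np.diag(Q)                  # closed-form linearization
for T in spanning_trees(n, edges):
    assert abs(sum(Q[e, f] for e in T for f in T) - c[list(T)].sum()) < 1e-9
```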

4 Extension to other classes of graphs

In this section we generalize the approach from the proof of Theorem 5 to obtain a characterization of linearizable cost matrices of the QMSTP for a larger class of graphs. Note that in the study of linearizable instances of the QAP, the case when the underlying graph structure is not complete has not yet been considered. The results of this section are summarized in Theorem 16, after which an illustrative example is given.

As noted earlier, any skew-symmetric matrix is linearizable regardless of the structure of the underlying graph. We now observe that if the underlying graph is a cycle, then the resulting QMSTP is linearizable regardless of the structure of the cost matrix Q.

Lemma 6

The QMSTP on a cycle is linearizable for any cost matrix Q. Further, the linearization \(C = (c(1),c(2),\ldots ,c(m)) \) is given by

$$\begin{aligned} c(e)=q(e,e)+\sum _{i\in E\setminus \{e\}}(q(i,e)+q(e,i))-\frac{\sum _{i\in E}\sum _{j\in E, j\ne i}q(i,j)}{m-1}, \ \ e \in E. \end{aligned}$$
(4)

Proof

Let G be a cycle with edges \(e_1,e_2,\ldots ,e_m\). Then the spanning trees of G are precisely \(T_1,T_2,\ldots ,T_m\) where \(T_i=G\setminus \{e_i\}\). We need to find a vector \(C=(c(1),c(2),\ldots ,c(m))\) such that

$$\begin{aligned} \sum _{e\in T_i}c(e)=Q(T_i) \end{aligned}$$

for all \(i=1,\ldots ,m\). Equivalently, we want to find a solution to the linear system above, with variables \(c(1),c(2),\ldots ,c(m)\). The coefficient matrix of this system is \(J-I\), where J is the all-ones matrix and I the identity matrix; it is invertible (its eigenvalues are \(m-1\) and \(-1\)), and hence the system has a unique solution. The formula for the linearization can be verified by simple algebra. \(\square \)
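Formula (4) can likewise be verified numerically; in the sketch below (ours), a random cost matrix on a cycle with \(m=5\) edges is linearized by (4) and compared against the m spanning tree costs.

```python
import numpy as np

m = 5                          # cycle with m edges; spanning trees are T_k = E \ {e_k}
rng = np.random.default_rng(2)
Q = rng.normal(size=(m, m))
S = Q.sum() - np.trace(Q)      # total off-diagonal cost
c = np.array([Q[e, e] + (Q[e, :].sum() + Q[:, e].sum() - 2 * Q[e, e])
              - S / (m - 1) for e in range(m)])        # formula (4)
for k in range(m):
    T = [e for e in range(m) if e != k]
    qt = sum(Q[e, f] for e in T for f in T)
    assert abs(qt - c[T].sum()) < 1e-9
```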

Note that the result of Lemma 6 can be extended to any real valued objective function for a spanning tree, not simply the quadratic objective function. The following is an immediate corollary of Lemma 6.

Corollary 7

The QMSTP is linearizable for any cost matrix Q on the graph \(T\cup \{e\}\) where T is a tree and e is an edge (not necessarily in T) joining two vertices of T.

In order to present a sufficient condition for the linearization (Lemma 10), we need a notion of biconnected components.

Definition 8

A subgraph \(G'\) of a simple graph G is called a biconnected component of G if it is a maximal subgraph of G with the property that if any single vertex of \(G'\) were to be removed, \(G'\) would remain connected.

Throughout this text, biconnected components are represented by the set of their edges. The following fact is straightforward to prove; see, for example, Ćustić (2014, Ch. 5, p. 101).

Proposition 9

An instance of the MSTP on a graph G has the property that every spanning tree has the same cost if and only if all edges from the same biconnected component of G have the same cost.

Lemma 10

Let Q be a symmetric cost matrix of the QMSTP on a graph \(G=(V,E)\) such that for every pair I, J of biconnected components of G, the submatrix of Q defined by rows I and columns J is a sum matrix if \(I\ne J\), or a symmetric weak sum matrix if \(I=J\). Then Q is linearizable and a linearization of Q can be computed in \(O(|E|^2)\) time.

Proof

Let Q be a symmetric matrix that satisfies the hypothesis of the lemma. Note that for a (sub)matrix M, being a sum matrix is equivalent to being a sum of two matrices \(M=R+C\), where every row of matrix R and every column of matrix C is a constant vector. (If \(m(i,j)=a(i)+b(j)\) then set \(r(i,j)=a(i)\) \(\forall j\) and \(c(i,j)=b(j)\) \(\forall i\).) Therefore Q can be expressed as

$$\begin{aligned} Q=A+A^T+D, \end{aligned}$$
(5)

where \(D=(d(i,j))\) is a diagonal matrix, and matrix \(A=(a(i,j))\) has the property that \(a(i,j)=a(i,k)\) if j and k are edges from the same biconnected component. Note that matrices A and D can be found in \(O(|E|^2)\) time (see the discussion below Definition 4). From Proposition 9 it follows that an MSTP instance defined by any row of A [i.e. for some fixed row i we define the length of an edge j to be a(i,j)] has the property that every spanning tree has the same cost. Let r(i) denote the constant objective function value of the MSTP corresponding to the i-th row of A. Then the objective value of the QMSTP for some spanning tree T is

$$\begin{aligned} Q(T)&=\sum _{e\in T}\sum _{f\in T}(a(e,f)+a(f,e)+d(e,f))\\&=\sum _{e\in T}\sum _{f\in T}(a(e,f)+a(f,e))+\sum _{e\in T}d(e,e)\\&=\sum _{e\in T}(r(e)+r(e))+\sum _{e\in T}d(e,e)\\&=\sum _{e\in T}(2r(e)+d(e,e)). \end{aligned}$$

Hence, by setting

$$\begin{aligned} c(i):=2r(i)+d(i,i) \end{aligned}$$
(6)

we obtain a linearization of Q. Note that the choice of matrix A is not always uniquely determined. Hence a linearization is not necessarily unique.\(\square \)
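To illustrate the construction (a controlled sketch of ours, not code from the paper): take two triangles sharing a vertex, so the two biconnected components are known up front, build \(Q=A+A^T+D\) with a(i,j) constant across each component, and verify the linearization (6) on all nine spanning trees. Here every spanning tree contains exactly \(|V_k|-1=2\) edges of each triangle, which gives r(i) in closed form.

```python
import numpy as np
from itertools import combinations

def spanning_trees(n, edges):
    """Yield every (n-1)-subset of edge indices that forms a spanning tree."""
    for T in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def root(v):
            while parent[v] != v:
                v = parent[v]
            return v
        ok = True
        for i in T:
            ru, rv = root(edges[i][0]), root(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield T

# Two triangles sharing vertex 0: biconnected components {e0,e1,e2} and {e3,e4,e5}
edges = [(0, 1), (1, 2), (0, 2), (0, 3), (3, 4), (0, 4)]
comp = [0, 0, 0, 1, 1, 1]             # biconnected component of each edge
tree_edges = np.array([2.0, 2.0])     # every spanning tree has 2 edges per triangle
rng = np.random.default_rng(3)
s = rng.integers(-4, 5, size=(6, 2)).astype(float)
A = np.array([[s[i, comp[j]] for j in range(6)] for i in range(6)])  # a(i,j) = s[i, comp(j)]
D = np.diag(rng.normal(size=6))
Q = A + A.T + D                       # satisfies the hypothesis of Lemma 10
r = s @ tree_edges                    # r(i) = 2 s[i,0] + 2 s[i,1], same for every tree
c = 2 * r + np.diag(D)                # linearization (6)
for T in spanning_trees(5, edges):
    assert abs(sum(Q[e, f] for e in T for f in T) - c[list(T)].sum()) < 1e-9
```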

Next we define the concept of a backbone, which will play a crucial role in proving the necessary conditions.

Definition 11

Let a, b, x and y be distinct edges of a simple graph G with n vertices. We say that a set B of \(n-3\) edges is an \(\{a,b\}\)-\(\{x,y\}\)-backbone of G if adding any two edges \(e\in \{a,b\}\) and \(f\in \{x,y\}\) to B generates a spanning tree of G.

Lemma 12

Let Q be a linearizable symmetric cost matrix of the QMSTP on a simple graph G, and let a and x be two fixed distinct edges of G. If for all additional edges b and y there exists a sequence of \(k\ge 2\) edges \(z_1,z_2,\ldots ,z_k\) such that \(x=z_1\), \(y=z_k\) and there exists an \(\{a,b\}\)-\(\{z_i,z_{i+1}\}\)-backbone \(B_i\) for every \(i=1,\ldots , k-1\), then Q is a symmetric weak sum matrix.

Proof

Let Q be linearizable and let a, b, x, y be four distinct edges such that there exists an \(\{a,b\}\)-\(\{x,y\}\)-backbone B. Since Q is linearizable it follows that

$$\begin{aligned} Q(B\cup \{b,x\})-Q(B\cup \{a,x\})=Q(B\cup \{b,y\})-Q(B\cup \{a,y\}). \end{aligned}$$
(7)

Indeed, by expressing spanning tree objective values from (7) with linearization costs \(C=(c(i))\), one gets \(c(b)-c(a)=c(b)-c(a)\). However, by expressing spanning tree objective values from (7) with quadratic costs \(Q=(q(i,j))\), one gets

$$\begin{aligned} q(b,x)-q(a,x)=q(b,y)-q(a,y). \end{aligned}$$

Now assume that a and x are fixed and there exist edges b, y and \(z_1,\ldots ,z_k\) with \(z_1=x,z_k=y\), such that there exists an \(\{a,b\}\)-\(\{z_i,z_{i+1}\}\)-backbone \(B_i\) for every \(i=1,\ldots , k-1\). Then, by the same reasoning as above applied for all \(i=1,\ldots ,k-1\), we obtain the following system of equations:

$$\begin{aligned} q(b,x)-q(a,x)&=q(b,z_2)-q(a,z_2),\\ q(b,z_2)-q(a,z_2)&=q(b,z_3)-q(a,z_3),\\&\vdots \\ q(b,z_{k-1})-q(a,z_{k-1})&=q(b,y)-q(a,y). \end{aligned}$$

As the right-hand side of every i-th equation is identical to the left-hand side of the \((i+1)\)-th equation, it follows that \(q(b,x)-q(a,x)=q(b,y)-q(a,y)\), which can be rearranged to

$$\begin{aligned} q(b,y)=q(b,x)+q(a,y)-q(a,x). \end{aligned}$$
(8)

Note that (8) is satisfied also for \(b=a\) or \(y=x\).

By the assumption of the lemma, we can obtain (8) for all b and y. Therefore it follows that q(b,y) is a sum of a function of b and a function of y (as a and x are fixed), i.e.

$$\begin{aligned} q(i,j)=s(i)+t(j)\qquad \forall i\ne j, \end{aligned}$$

for some vectors \(s=(s(i))\) and \(t=(t(i))\). As Q is symmetric it follows that

$$\begin{aligned} q(i,j)=w(i)+w(j)\qquad \forall i\ne j, \end{aligned}$$

for some vector \(w=(w(i))\), which proves the lemma. \(\square \)

Note that Theorem 5 can be proved using Lemma 12 and the fact that if a and x are two nonincident edges of a complete graph, then for any other pair of edges b and y there exists an \(\{a,b\}\)-\(\{x,y\}\)-backbone. However, the independent proof given earlier is more intuitive for this important special case.

Corollary 13

Let Q be a linearizable symmetric cost matrix of the QMSTP on a simple graph G. Let I and J be two disjoint sets of edges of G, and let \(a\in I\) and \(x\in J\) be two fixed edges. Let \(Q_{IJ}\) be the submatrix of Q defined by rows I and columns J. If for all additional edges \(b\in I\) and \(y\in J\) there exists a sequence of \(k\ge 2\) edges \(z_1,z_2,\ldots ,z_k\) such that \(x=z_1\), \(y=z_k\) and there exists an \(\{a,b\}\)-\(\{z_i,z_{i+1}\}\)-backbone \(B_i\) for every \(i=1,\ldots , k-1\), then \(Q_{IJ}\) is a sum matrix.

Proof

The proof is similar to that of Lemma 12. \(\square \)

In most of the cases when we make use of Lemma 12 and Corollary 13, k will be equal to 2, i.e. we will not need additional edges \(z_i\).

Given the edges a, b, x and y, we usually try to build an \(\{a,b\}\)-\(\{x,y\}\)-backbone in the following way. We aim to find a cycle \(C_1\) that contains a and b and a cycle \(C_2\) that contains x and y, such that if the intersection of \(C_1\) and \(C_2\) is nonempty, then it is connected and does not contain a pair of edges from \(\{a,b\}\times \{x,y\}\). We call such \(C_1\) and \(C_2\) feasible backbone cycles for a, b, x and y. For feasible backbone cycles \(C_1\) and \(C_2\), \((C_1\setminus \{a,b\}) \cup (C_2\setminus \{x,y\})\) extended by a tree which is connected to \(C_1\) and \(C_2\) and spans the remaining set of vertices forms an \(\{a,b\}\)-\(\{x,y\}\)-backbone.

Lemma 14

A symmetric cost matrix Q of the QMSTP on a complete bipartite graph \(K_{n_1,n_2}\) with \(\min \{n_1,n_2\}\ge 3\) is linearizable if and only if Q is a symmetric weak sum matrix. A linearization of a linearizable symmetric matrix Q is given by (6).

Proof

If Q is a weak sum matrix, then from Lemma 10 it follows that Q is linearizable and a linearization is given by (6). Note that in the case of the complete bipartite graph \(K_{n_1,n_2}\), within each fixed row of the matrix A in the expression (5) all entries are the same. Hence, \(r(i)=(n_1+n_2-1)a(i,j)\) for any column j.

Let \(\min \{n_1,n_2\}\ge 3\) and assume that Q is linearizable. We fix two arbitrary nonincident edges a and x. We will show that for any two additional edges b and y (\(b\ne y\)) the conditions of Lemma 12 are satisfied, which completes the proof.

Fig. 2 The a, b, x, y configurations

In Fig. 2 all possible essentially different configurations of incidences between a, b, x and y, up to symmetries, are presented. (The symmetries are defined by exchanging the sets \(\{a,b\}\) and \(\{x,y\}\), and by exchanging elements inside of those two sets.) Configurations in Fig. 2 are of two types. In the cases where we apply Lemma 12 with \(k=2\), the configurations are extended by (dashed) edge(s) that form feasible backbone cycles. In the other cases we use one auxiliary edge of Lemma 12 (\(k=3\)); therefore the configurations are extended by the edge z, which plays the role of \(z_2\) in Lemma 12. \(\square \)

The previous lemma gives a characterization of linearizability only for \(\min \{n_1,n_2\}\ge 3\). Note that for the configurations in Fig. 2e, h, i, j, k, and m, we actually use the fact that \(\min \{n_1,n_2\}\ge 3\). If \(\min \{n_1,n_2\}< 3\), a linearizable cost matrix Q is not necessarily a weak sum matrix. For example, if \(n_1\) or \(n_2\) equals 1, then \(K_{n_1,n_2}\) is a tree, and if \(n_1=n_2=2\), then \(K_{n_1,n_2}\) is a cycle. In both cases an arbitrary Q is linearizable. For the remaining case of \(n_1=2\) and \(n_2\ge 3\), we present the following counterexample of a symmetric matrix Q that is linearizable but not a weak sum matrix. For \(i\ne j\), the cost element q(i,j) is equal to 1 if edges i and j are incident through a vertex of the part of size \(n_2\), and 0 otherwise. Then the linearization costs are given by \(c(i)=q(i,i)+2/(n_2+1)\).
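This counterexample is easy to verify computationally; the sketch below (ours) does so for \(K_{2,3}\) with a zero diagonal: every spanning tree has exactly one vertex of the \(n_2\)-side of degree two, so \(Q(T)\) is constant, the stated C matches it, and a single quadruple of edges rules out the weak sum structure.

```python
from itertools import combinations
from fractions import Fraction

def spanning_trees(n, edges):
    """Yield every (n-1)-subset of edge indices that forms a spanning tree."""
    for T in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def root(v):
            while parent[v] != v:
                v = parent[v]
            return v
        ok = True
        for i in T:
            ru, rv = root(edges[i][0]), root(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield T

n1, n2 = 2, 3
edges = [(u, n1 + w) for w in range(n2) for u in range(n1)]  # K_{2,3}, m = 6
def q(i, j):
    # 1 if distinct edges meet at a vertex of the n2-side, else 0 (zero diagonal)
    return Fraction(int(i != j and edges[i][1] == edges[j][1]))
c = [q(i, i) + Fraction(2, n2 + 1) for i in range(len(edges))]
for T in spanning_trees(n1 + n2, edges):
    assert sum(q(e, f) for e in T for f in T) == sum(c[e] for e in T)
# Not a weak sum matrix: for four distinct indices a weak sum structure forces
# q(0,1) - q(0,3) == q(2,1) - q(2,3), but here 1 - 0 != 0 - 1.
assert q(0, 1) - q(0, 3) != q(2, 1) - q(2, 3)
```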

Lemma 15

Let Q be a linearizable symmetric cost matrix of the QMSTP on a graph G. Then for every two distinct biconnected components I, J of G, the submatrix of costs q(ij), \(i\in I\), \(j\in J,\) is a sum matrix.

Proof

If I or J is just one edge, i.e. a bridge, then there is nothing to prove, as every \(1\times n\) matrix is a sum matrix. Since a biconnected component cannot have exactly two edges, in the rest of the proof we assume that \(\min \{|I|,|J|\}\ge 3\).

We will again make use of backbones. First we fix two edges \(a\in I\) and \(x\in J\). It is easy to see that for every pair of additional edges \(b\in I\) and \(y\in J\) there exists an \(\{a,b\}\)-\(\{x,y\}\)-backbone. Namely, in every biconnected component there exists a cycle that contains any given pair of its edges. Hence, there exists a cycle in I that contains a and b, and a cycle in J that contains x and y. As their intersection contains at most one vertex, they are feasible backbone cycles. Hence, by Corollary 13, the lemma follows. \(\square \)

Lemmas 6, 10, 14, 15 and Theorem 5 can be combined to produce the linearization characterization for a larger class of graphs, e.g. cactuses.

Theorem 16

Let G be a graph such that every biconnected component is either a clique, a cycle or a biclique (with vertex partition sets of sizes at least three). Then a symmetric cost matrix Q of the QMSTP on G is linearizable if and only if the submatrices of Q that correspond to different biconnected components are sum matrices, and submatrices that correspond to single biconnected components that are either a clique or a biclique are symmetric weak sum matrices. Furthermore, if Q is linearizable, a linearization can be computed in \(O(|E|^2)\) time.

Proof

Let Q be of the form described in the theorem. We denote by k the number of biconnected components of G that are cycles. Then Q can be expressed as \(Q=H+B_1+\cdots +B_k\), where H satisfies the hypothesis of Lemma 10, and \(B_i\) is a matrix in which all entries that are not in the submatrix defined by the i-th cycle are equal to 0. Note that matrices H and \(B_i\), \(i=1,\ldots ,k\), can be found in \(O(|E|^2)\) time. From Lemma 10 it follows that H is linearizable and its linearization vector \(C_H\) can be computed by (6). From Lemma 6, it follows that for every \(i=1,\ldots ,k\), \(B_i\) is linearizable and its linearization \(C_{B_i}\) is given by (4). Therefore by Observation 1, Q is also linearizable and its linearization vector is given by \(C=C_H+C_{B_1}+\cdots +C_{B_k}\).

Conversely, if Q is linearizable then it has to be of the form described in the theorem. This follows directly from Lemma 15 and the proofs of Theorem 5 and Lemma 14. Namely, backbones of biconnected components can be extended into backbones of G by adding edges that span the remaining vertices. \(\square \)

We present an example that illustrates Theorem 16. Let \(G=(V,E)\) be the graph presented in Fig. 3a. Graph G has four biconnected components with corresponding edge sets \(E_1=\{e_1,e_2,e_3\}\), \(E_2=\{e_4\}\), \(E_3=\{e_5,e_6,e_7,e_8,e_9,e_{10}\}\) and \(E_4=\{e_{11}\}\). Let the symmetric matrix \(Q=(q(i,j))\), presented in Fig. 3b, be a QMSTP cost matrix associated with G, such that q(i,j) is the QMSTP cost associated with the edge pair \((e_i,e_j)\). We denote by \(Q_{E_iE_j}\) the submatrix of Q consisting of elements \(q(k,\ell )\) for \(e_k\in E_i\) and \(e_{\ell }\in E_j\). In Fig. 3b, Q is divided into submatrices \(Q_{E_iE_j}\), \(i,j\in \{1,\ldots ,4\},\) using dashed lines.

Fig. 3 A linearizable QMSTP instance

Biconnected components \(E_1,\ldots ,E_4\) are cycles and cliques, so according to Theorem 16, matrix Q is linearizable if and only if the submatrices \(Q_{E_iE_j}\) have certain specific properties. In particular, submatrices that correspond to a pair of different biconnected components, i.e. \(Q_{E_iE_j}\) with \(i\ne j\), have to be sum matrices. There are 12 such submatrices, and 10 of them have one row and/or one column, in which case the sum matrix property is trivially satisfied. The remaining 2 submatrices are \(Q_{E_1E_3}\) and \(Q_{E_3E_1}\). Since Q is symmetric, they are transposes of each other, hence it is enough to check the sum matrix property for only one of them. Indeed they are sum matrices, since they are generated by the vectors (3, 2, 4) and (1, 3, 0, 5, 2, 4). It remains to check the remaining 4 submatrices \(Q_{E_iE_i},\) \(i\in \{1,\ldots ,4\}\). If \(E_i\) is a clique or a (big enough) biclique, then we need to check whether \(Q_{E_iE_i}\) is a weak sum matrix. If \(E_i\) is a cycle, then there are no necessary conditions on \(Q_{E_iE_i}\). Edges \(E_1\) form a complete graph on three vertices, but at the same time \(E_1\) forms a cycle. This is not a contradiction, because every symmetric \(3\times 3\) matrix is a weak sum matrix. Biconnected components \(E_2\) and \(E_4\) are trivial cliques, hence the weak sum property of \(Q_{E_2E_2}\) and \(Q_{E_4E_4}\) is trivially satisfied. \(E_3\) forms a complete graph, hence it remains to check whether \(Q_{E_3E_3}\) is a weak sum matrix. It is easy to see that \(Q_{E_3E_3}\) is a symmetric weak sum matrix generated by the vector \(a=(3,1,2,5,0,4)\), i.e. for \(i\ne j\), the (i,j)-th entry of the submatrix \(Q_{E_3E_3}\) is \(a(i)+a(j)\). All submatrices of Q thus have the required structure, therefore Q is linearizable.

At this point, it is straightforward to obtain a linearization of Q. We can express Q as \(Q=H+B_1\), where H is the matrix obtained from Q by replacing the elements in the submatrix \(Q_{E_1E_1}\) by 0. Matrix H satisfies the hypothesis of Lemma 10, and its linearization vector \(C_H\) can be calculated as described in the proof of Lemma 10 using the vectors obtained in the analysis above. Furthermore, \(B_1\) is linearizable and its linearization vector \(C_B\) can be calculated as described in Lemma 6. Then the vector \(C=C_H+C_B\) is a linearization of Q; for the matrix Q considered here, \(C=(54,41,48,12,27,42,23,67,40,45,2)\) is one such vector.

5 Recognition of linearizable QMSTP

Theorem 16 gives us a solution for the quadratic spanning tree linearization problem (QST-LP) for the class of graphs in which every biconnected component is either a clique, a biclique or a cycle. Given such a graph \(G=(V,E)\), one can find its biconnected components in linear time (see Hopcroft and Tarjan 1973), and determine which type each of them is. Now for a given (not necessarily symmetric) cost matrix Q, from Observation 3 it follows that Q is linearizable if and only if the symmetric matrix \(\frac{1}{2}(Q+Q^T)\) is linearizable. According to Theorem 16, to determine whether Q is linearizable we need to check whether the appropriate submatrices of \(\frac{1}{2}(Q+Q^T)\) are sum matrices or symmetric weak sum matrices. In the worst case this takes \(\varTheta (|E|^2)\) time, since potentially every element of Q that is not on the main diagonal has to be examined. Next we examine whether the recognition can be done faster if the cost matrix is given in factored form.
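For the complete graph case the whole \(\varTheta (|E|^2)\) test amounts to a few lines; the following is a sketch of ours (function name hypothetical): symmetrize Q, recover a candidate vector from three off-diagonal entries, and check every off-diagonal entry against it.

```python
import numpy as np

def is_linearizable_on_Kn(Q, tol=1e-9):
    """Theorem 5 test, O(m^2): Q (m x m, m >= 3) is linearizable on K_n iff
    (Q + Q^T)/2 is a symmetric weak sum matrix."""
    S = 0.5 * (Q + Q.T)
    m = S.shape[0]
    a0 = (S[0, 1] + S[0, 2] - S[1, 2]) / 2.0     # candidate value for a(1)
    a = np.concatenate(([a0], S[0, 1:] - a0))    # candidate a(j) = s(1,j) - a(1)
    off = ~np.eye(m, dtype=bool)                 # off-diagonal mask
    R = a[:, None] + a[None, :]
    return bool(np.allclose(S[off], R[off], atol=tol))
```

If the test succeeds, \(c(e)=(2n-4)a(e)+q(e,e)\) from Theorem 5 yields a linearization.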

Let \(H=(h(i,j))\) be an \(m\times m\) matrix of rank p. Then the elements of H are of the form

$$\begin{aligned} h(i,j)=\sum _{k=1}^p a_i^{(k)}b_j^{(k)}, \end{aligned}$$
(9)

for some vectors \(a^{(k)}\) and \(b^{(k)}\), \(k=1,\ldots ,p\). Hence, an \(m\times m\) matrix of rank p can be represented with 2pm values. We say that (9) is a factored form representation of the matrix H. (To save space, in this section we denote the i-th element of a vector by \(a_i\), instead of the previous notation a(i).)

Note that every sum matrix can be written as the sum of a constant row matrix and a constant column matrix. Since \(\text {rank}(M_1+M_2)\le \text {rank}(M_1)+\text {rank}(M_2)\) for every pair of matrices \(M_1\) and \(M_2\), it follows that every sum matrix has rank at most 2. Therefore, the problem of recognizing sum matrices represented in the factored form (9) reduces to the following question. Given \(a_i,b_i,c_i,d_i\), \(i=1,\ldots ,m\), is it possible to decide in O(m) time whether a matrix \(H=(h(i,j))\) with \(h(i,j)=a_ib_j+c_id_j\) is a sum matrix? An affirmative answer to this question follows from the following theorem.

Theorem 17

Let an \(m\times m\) matrix \(H=(h(i,j))\) be of the form \(h(i,j)=a_ib_j+c_id_j\), \(i,j=1,\ldots ,m\).

  • If at least one of the vectors a, b, c, d is a constant vector, then H is a sum matrix if and only if a or b is a constant vector, and c or d is a constant vector.

  • If none of the vectors a, b, c, d is a constant vector, then H is a sum matrix if and only if there exist three constants \(K\ne 0\), \(K_1\) and \(K_2\) such that \(a_i=Kc_i+K_1\) and \(d_i=-Kb_i+K_2\), \(i=1,\ldots ,m\).

Proof

Let a matrix \(H=(h(i,j))\) be of the form \(h(i,j)=a_ib_j+c_id_j\). Let us assume H is a sum matrix, i.e. there exist two vectors e and f such that \(h(i,j)=e_i+f_j\), \(i,j=1,\ldots ,m\). Then for arbitrary \(i,j,k,\ell \in \{1,\ldots ,m\}\)

$$\begin{aligned} h(i,k)-h(i,\ell )=f_k-f_\ell \quad \text {and} \quad h(j,k)-h(j,\ell )=f_k-f_\ell . \end{aligned}$$

Hence \(h(i,k)-h(i,\ell )=h(j,k)-h(j,\ell )\). Now from \(h(i,j)=a_ib_j+c_id_j\) it follows that

$$\begin{aligned} a_ib_k+c_id_k-a_ib_\ell -c_id_\ell =a_jb_k+c_jd_k-a_jb_\ell -c_jd_\ell , \end{aligned}$$

which can be rearranged to

$$\begin{aligned} a_i(b_k-b_\ell )+c_i(d_k-d_\ell )=a_j(b_k-b_\ell )+c_j(d_k-d_\ell ). \end{aligned}$$

Finally, we get a necessary condition that if H is a sum matrix then

$$\begin{aligned} (a_i-a_j)(b_k-b_\ell )=-(c_i-c_j)(d_k-d_\ell ), \end{aligned}$$
(10)

for every \(i,j,k,\ell \in \{1,\ldots ,m\}\). Now we divide our investigation into two cases.

Case 1: At least one of the vectors a, b, c, d is a constant vector. Without loss of generality we can assume that a is a constant vector. From (10) it follows that

$$\begin{aligned} (c_i-c_j)(d_k-d_\ell )=0\quad \forall i,j,k,\ell \in \{1,\ldots ,m\}. \end{aligned}$$
(11)

Hence either c or d is a constant vector. Otherwise there would exist i, j for which \(c_i-c_j\ne 0\), and \(k,\ell \) for which \(d_k-d_\ell \ne 0\), which would contradict (11).

Note that this is also a sufficient condition. Let us assume \(a_i=\alpha \) and \(d_i=\delta \), \(i=1,\ldots ,m\). Then,

$$\begin{aligned} h(i,j)=\alpha b_j+c_i\delta =e_i+f_j \quad \forall i,j\in \{1,\ldots ,m\}, \end{aligned}$$

where \(e_i:=\delta c_i\) and \(f_i:=\alpha b_i\), \(i=1,\ldots ,m\). In the case when c (instead of d) is a constant vector, in a similar way one gets that H is a sum matrix.

Case 2: None of the vectors a, b, c, d is a constant vector. Assume that there are two elements of vector a that are the same, i.e. there exist i, j, \(i\ne j\), such that \(a_i=a_j\). Then \(c_i=c_j\) holds for the same i, j. Assume the contrary, i.e. \(a_i=a_j\) and \(c_i\ne c_j\). As d is not a constant vector, there exist \(k,\ell \) such that \(d_k\ne d_\ell \). Now for such \(i,j,k,\ell \), Eq. (10) does not hold, which is a contradiction. Hence, \(a_i=a_j\) if and only if \(c_i=c_j\). By the same logic, for all \(i,j\in \{1,\ldots ,m\}\), \(b_i=b_j\) if and only if \(d_i=d_j\).

Let \(N_1\subseteq \{1,\ldots ,m\}\) be a maximal set of indices i for which \(a_i\)’s (and \(c_i\)’s) are pairwise distinct. That is, for every \(i,j\in N_1\), \(i\ne j\), it follows that \(a_i\ne a_j\) (and hence \(c_i\ne c_j\) also). Let \(N_2\) be a set of indices with the same property for vectors b and d. Now from (10) it follows that

$$\begin{aligned} \frac{a_i-a_j}{c_i-c_j}=-\frac{d_k-d_\ell }{b_k-b_\ell }, \end{aligned}$$

for every distinct \(i,j\in N_1\) and \(k,\ell \in N_2\). By fixing some distinct \(k,\ell \in N_2\), it follows that \((a_i-a_j)/(c_i-c_j)\) is a nonzero constant (which we denote by K) for every distinct \(i,j\in N_1\). Analogously, it follows that \((d_k-d_\ell )/(b_k-b_\ell )=-K\) for every distinct \(k,\ell \in N_2\). Hence

$$\begin{aligned} a_i-a_j=K(c_i-c_j)=Kc_i-Kc_j\quad \forall i,j\in N_1, \end{aligned}$$

and therefore

$$\begin{aligned} a_i=Kc_i+(a_j-Kc_j)\quad \forall i,j\in N_1. \end{aligned}$$
(12)

By fixing \(j\in N_1\) in (12) we get that \(a_i=Kc_i+K_1\) for some constant \(K_1\) and for all \(i\in N_1\). Note that from the way we defined \(N_1\), this relation extends to the entire set \(\{1,\ldots , m\}\), i.e. we have that

$$\begin{aligned} a_i=Kc_i+K_1 \quad i=1,\ldots , m, \end{aligned}$$
(13)

for some constants \(K\ne 0\) and \(K_1\). Analogously we obtain that

$$\begin{aligned} d_i=-Kb_i+K_2 \quad i=1,\ldots , m, \end{aligned}$$
(14)

for some additional constant \(K_2\).

Note that (13) and (14) are also sufficient conditions. Namely

$$\begin{aligned} h(i,j)&=a_ib_j+c_id_j\\&=(Kc_i+K_1)b_j+c_i(-Kb_j+K_2)\\&=K_2c_i+K_1b_j \end{aligned}$$

is a sum matrix relation. \(\square \)

Corollary 18

Given \(a_i,b_i,c_i,d_i\), \(i=1,\ldots ,m\), it is possible to decide in O(m) time whether the square matrix \(H=(h(i,j))\) with \(h(i,j)=a_ib_j+c_id_j\) is a sum matrix.

Proof

It follows directly from the statement and the proof of Theorem 17. Namely, the following is an O(m) time algorithm.

First we check whether any of the vectors a, b, c, d is a constant vector. If so, then H is a sum matrix if and only if a or b is a constant vector, and c or d is a constant vector. Otherwise, find i and j such that \(a_i-a_j\ne 0\); if \(c_i=c_j\) for such i and j, then H is not a sum matrix (by the equivalence \(a_i=a_j\Leftrightarrow c_i=c_j\) established in the proof of Theorem 17), and otherwise define K to be \(K=(a_i-a_j)/(c_i-c_j)\). Furthermore, define \(K_1,K_2\) to be \(K_1=a_1-Kc_1\) and \(K_2=d_1+Kb_1\). Then H is a sum matrix if and only if (13) and (14) are satisfied. \(\square \)
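A sketch of this O(m) procedure (our code; names hypothetical). The one subtlety is the guard for \(c_i=c_j\): if \(a_i\ne a_j\) but \(c_i=c_j\), then H cannot be a sum matrix, since the proof of Theorem 17 shows that \(a_i=a_j\Leftrightarrow c_i=c_j\) must hold.

```python
import numpy as np

def is_sum_matrix_factored(a, b, c, d, tol=1e-9):
    """O(m) test of Theorem 17: is H with h(i,j) = a_i b_j + c_i d_j a sum matrix?"""
    a, b, c, d = (np.asarray(v, dtype=float) for v in (a, b, c, d))
    const = lambda v: np.allclose(v, v[0], atol=tol)
    if const(a) or const(b) or const(c) or const(d):
        return (const(a) or const(b)) and (const(c) or const(d))
    i = int(np.argmax(np.abs(a - a[0])))   # some a_i != a_1 exists: a is not constant
    if abs(c[i] - c[0]) < tol:
        return False                       # would violate a_i = a_j <=> c_i = c_j
    K = (a[i] - a[0]) / (c[i] - c[0])
    K1, K2 = a[0] - K * c[0], d[0] + K * b[0]
    return bool(np.allclose(a, K * c + K1, atol=tol) and
                np.allclose(d, -K * b + K2, atol=tol))

# Example: a = 2c + 1 and d = -2b + 1, so h(i,j) = b_j + c_i is a sum matrix.
print(is_sum_matrix_factored([3, 5, 7], [1, 0, 2], [1, 2, 3], [-1, 1, -3]))  # True
```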

For complete and complete bipartite graphs, the following result on the recognition of linearizable cost matrices represented in factored form follows straightforwardly.

Corollary 19

Let \(G=(V,E)\) be a complete graph or a complete bipartite graph. Let \(Q=(q(i,j))\) be a symmetric cost matrix of a QMSTP on the graph G such that \(q(i,j)=a_ib_j+c_id_j\), \(i,j=1,\ldots ,|E|,\) \(i\ne j,\) for some given vectors abcd. Then in O(|E|) time it can be decided whether Q is linearizable, and if so, a linearization can be calculated in O(|E|) time.

Proof

It follows directly from Corollary 18 and the fact that in the case of complete (bipartite) graphs, r(i) from (6) can be calculated in O(1) time for every \(i=1,\ldots ,|E|\). \(\square \)

6 Conclusions and future work

We investigated the problem of characterizing linearizable QMSTP cost matrices, and we resolved the problem for the default underlying graph structure, i.e. complete graphs. We then extended the characterization to a broader class of graphs. The main result is presented as Theorem 16. In particular, given a graph G, Lemma 10 gives a sufficient condition for a cost matrix to be linearizable, and in the case of complete and complete bipartite graphs, the condition is also necessary. A natural question arises: for which classes of graphs is the condition of Lemma 10 also necessary? In view of Lemma 15, this question can be rephrased as the following open problem: for which biconnected graphs is a symmetric QMSTP cost matrix linearizable only if it is a weak sum matrix?

In this paper we have so far encountered two types of biconnected graphs for which a linearizable QMSTP cost matrix does not need to be a weak sum matrix. These graphs were cycles and complete bipartite graphs \(K_{2,n}\). Note that both of these graph classes contain a vertex of degree 2. As a matter of fact, for every biconnected graph that contains a vertex of degree 2, the weak sum condition is not necessary. For example, let \(G=(V,E)\), with \(|V|=n\ge 4\), be a biconnected graph such that \(p\in V\) is of degree 2 and \(E_p\) is the set of the two edges incident to p. Then the symmetric matrix \(Q=(q(i,j))\) given by

$$\begin{aligned}q(e,f) = {\left\{ \begin{array}{ll} 1/2 &{} \text{ if } e,f\in E_p,\ e\ne f,\\ 1/(2(n-3)) &{} \text{ if } e,f\in E\setminus E_p,\ e\ne f,\\ 0 &{} \text{ otherwise, } \end{array}\right. }\end{aligned}$$

is linearizable, but it is not a weak sum matrix. Indeed, every spanning tree contains either both edges of \(E_p\) together with \(n-3\) edges of \(E\setminus E_p\), or one edge of \(E_p\) together with \(n-2\) edges of \(E\setminus E_p\); in both cases its cost equals \((n-2)/2\), so the constant vector \(c(e)=\frac{n-2}{2(n-1)}\), \(e\in E\), is a linearization. (Note that such cost matrices have the even stronger property that the cost of every spanning tree is the same.) Therefore, an interesting question would be to identify how dense a graph needs to be in order that all linearizable cost matrices are weak sum matrices. Is it enough that the minimum vertex degree is at least 3?
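As a closing sanity check (our sketch, using exact rational arithmetic), the construction can be verified on a small biconnected graph with a degree-2 vertex, here a 5-cycle with one chord: every spanning tree indeed has the same cost \((n-2)/2\), matched by the constant linearization above.

```python
from itertools import combinations
from fractions import Fraction

def spanning_trees(n, edges):
    """Yield every (n-1)-subset of edge indices that forms a spanning tree."""
    for T in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def root(v):
            while parent[v] != v:
                v = parent[v]
            return v
        ok = True
        for i in T:
            ru, rv = root(edges[i][0]), root(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield T

n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (0, 2)]  # 5-cycle plus chord (0,2)
Ep = {0, 1}                  # the two edges incident to the degree-2 vertex p = 1
def q(e, f):
    if e == f or (e in Ep) != (f in Ep):
        return Fraction(0)
    return Fraction(1, 2) if e in Ep else Fraction(1, 2 * (n - 3))
c = Fraction(n - 2, 2 * (n - 1))       # constant linearization value
for T in spanning_trees(n, edges):
    assert sum(q(e, f) for e in T for f in T) == (n - 1) * c   # common cost (n-2)/2
```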