1 Introduction

In this paper, we study standard quadratic optimization problems given by

$$\begin{aligned} \hbox {(StQP)} \quad \nu (Q) := \min _{x \in {\varDelta }_n} x^T Q x, \end{aligned}$$
(1)

where \(Q \in \mathcal {S}\) and \(\mathcal {S}\) denotes the set of \(n \times n\) real symmetric matrices, \(x \in \mathbb {R}^n\), and \({\varDelta }_n\) denotes the unit simplex in \(\mathbb {R}^n\) given by

$$\begin{aligned} {\varDelta }_n := \{x \in \mathbb {R}^n_+{:}\, e^T x= 1\}, \end{aligned}$$

where \(e \in \mathbb {R}^n\) is the vector of all ones and \(\mathbb {R}^n_+\) denotes the nonnegative orthant in \(\mathbb {R}^n\).

Standard quadratic optimization problems arise in a variety of applications (see, e.g. [2]). Despite its seemingly simple structure, the problem (StQP) encompasses several interesting problem classes. For instance, it is well-known that the problem of minimizing a nonhomogeneous quadratic function over the unit simplex can be reformulated in the form of (StQP) using the following identity:

$$\begin{aligned} x^T Q x + 2 c^T x = x^T (Q + e c^T + c e^T) x, \quad \hbox {for each}~ x \in {\varDelta }_n. \end{aligned}$$

Similarly, any quadratic optimization problem over a polytope can, in theory, be formulated as an instance of (StQP) by representing each feasible point as a convex combination of the vertices of the polytope (see, e.g., [4]). In the resulting alternative formulation, each component of \(x\) corresponds to the weight of a different vertex. Therefore, the dimension \(n\) is equal to the number of vertices of the polytope. Clearly, such a reformulation is not viable if there is an exponential number of vertices as in box-constrained quadratic optimization problems.

Since the maximum stable set problem admits a formulation in the form of (StQP) [14], it follows that (StQP) is in general an NP-hard problem.

An instance of problem (1) can be equivalently reformulated as the following instance of a linear optimization problem over the cone of completely positive matrices [4]:

$$\begin{aligned} \nu (Q) = \min \{{\langle }Q, X {\rangle }{:}\, {\langle }E, X {\rangle }= 1, \quad X \in \mathcal {C}\}, \end{aligned}$$
(2)

where \({\langle }A , B {\rangle }= \hbox {trace}(AB) = \sum _{i=1}^n \sum _{j=1}^n A_{ij} B_{ij}\) for \(A, B \in \mathcal {S}\), \(E = ee^T \in \mathcal {S}\) is the matrix of all ones, and \(\mathcal {C}\subseteq \mathcal {S}\) is the cone of the completely positive matrices given by

$$\begin{aligned} \mathcal {C}:= \hbox {conv}\{u u^T{:}\,u \in \mathbb {R}^n_+\}, \end{aligned}$$

where \(\hbox {conv}\{\cdot \}\) denotes the convex hull. The dual cone of \(\mathcal {C}\) with respect to the trace inner product \({\langle }\cdot , \cdot {\rangle }\) is the cone of copositive matrices given by

$$\begin{aligned} \mathcal {C}^* := \left\{ X \in \mathcal {S}{:}\, u^T X u \ge 0\quad \hbox {for all}~u \in \mathbb {R}^n_+\right\} . \end{aligned}$$

Since (StQP) is in general an NP-hard problem, the convex optimization problem (2) is in general intractable. Indeed, the computational complexity is now hidden in the conic constraint \(X \in \mathcal {C}\), for which the membership problem is NP-hard [10]. Similarly, the membership problem for the dual cone \(\mathcal {C}^*\) is also NP-hard [10, 15].

Despite the fact that the convex reformulation (2) of the problem (StQP) does not seem to be helpful from the computational complexity point of view, it offers a new perspective as the difficulty is now shifted to the convex cones \(\mathcal {C}\) and \(\mathcal {C}^*\). In fact, Burer [7] established more generally that, under mild assumptions, every optimization problem with a quadratic objective function, linear equality constraints, and a mix of nonnegative and binary variables admits an exact reformulation as a linear optimization problem over the cone \(\mathcal {C}\). Therefore, there has recently been a considerable research activity towards a better understanding of the cones \(\mathcal {C}\) and \(\mathcal {C}^*\). Many of these studies propose different ways of approximating each of these two cones by a sequence of tractable convex cones of increasing accuracy (see, e.g., [3, 5, 6, 8, 12, 13, 1618, 20]). Such sequences therefore yield approximation hierarchies for the cones \(\mathcal {C}\) and \(\mathcal {C}^*\). By replacing the difficult conic constraints with the sequences of increasingly more accurate tractable approximations, one can obtain a sequence of increasingly tighter bounds on the optimal value of a linear optimization problem over the cones of completely positive or copositive matrices.

In this paper, we focus on two approximation hierarchies, each of which is comprised of a sequence of polyhedral cones of increasing accuracy. The first approximation hierarchy, defined originally in the context of standard quadratic optimization by Bomze and de Klerk [3] and later extended to a hierarchy that consists of nested cones by Yıldırım [20], is motivated by exploiting necessary conditions for copositivity of a matrix. By duality, these conditions translate into a hierarchy of inner polyhedral approximations \(\mathcal {I}_r,~r = 0,1,\ldots \) of the cone \(\mathcal {C}\) of completely positive matrices, with the property that \(\mathcal {I}_0 \subseteq \mathcal {I}_1 \subseteq \cdots \subseteq \mathcal {C}\) and \(\hbox {cl}\left( \cup _{r \in \mathbb {N}} \mathcal {I}_r\right) = \mathcal {C}\), where \(\hbox {cl}(\cdot )\) denotes the closure. Exploiting a sequence of sufficient conditions for copositivity of a matrix, de Klerk and Pasechnik [8] proposed another hierarchy, which, by duality, yields a sequence of outer polyhedral approximations \(\mathcal {O}_r,~r = 0, 1, \ldots \) of the cone \(\mathcal {C}\), with the property that \(\mathcal {O}_0 \supseteq \mathcal {O}_1 \supseteq \cdots \supseteq \mathcal {C}\) and \(\cap _{r \in \mathbb {N}} \mathcal {O}_r = \mathcal {C}\).

We study these two particular approximation hierarchies for the following reasons. First, each of the two approximation hierarchies is composed of polyhedral cones. Therefore, replacing the difficult conic constraint \(X \in \mathcal {C}\) in (2) with \(X \in \mathcal {I}_r\) (resp., with \(X \in \mathcal {O}_r\)), where \(r \in \mathbb {N}\), gives rise to a linear programming problem with \(O(n^{r+2})\) variables [3, 20], whose optimal value yields an upper (resp., lower) bound on the optimal value \(\nu (Q)\). In the particular case of standard quadratic optimization, both upper and lower bounds at any level of the hierarchy reduce to an optimization problem over an increasingly larger finite set (see Sects. 2.1 and 2.2). These observations make these two particular approximation hierarchies more amenable to analysis in comparison with the other approximation hierarchies, most of which give rise to semidefinite programming problems (e.g., [12, 13, 16, 17]) that are considerably more difficult to analyze theoretically. Second, these two approximation hierarchies lead to a polynomial-time approximation scheme for standard quadratic optimization [3, 20] (see Sect. 2.3). As such, they constitute a natural choice for studying approximations of standard quadratic optimization problems.

In this paper, we investigate the upper and lower bounds arising from these inner and outer polyhedral approximations in the context of standard quadratic optimization. We give complete algebraic descriptions of the sets of the instances of (StQP) for which the upper bound or the lower bound is already exact at any given finite level \(r \in \mathbb {N}\) of the corresponding hierarchies (see Theorems 2 and 4). We identify the structural properties of the sets of instances on which upper and lower bounds converge to the optimal value only in the limit (see Proposition 2 and Theorem 3). Furthermore, we present several geometric and topological properties of these sets (see Proposition 1, Corollary 1, Propositions 8 and 9).

We remark that our perspective in this paper, in general, is not algorithmic in the sense that some of our algebraic descriptions may not necessarily translate into an efficient method for deciding, for a given instance of (StQP), whether the upper and/or the lower bound is exact at a given level of the corresponding hierarchies. Rather, our main objective is to further our understanding of the strengths and limitations of these particular inner and outer polyhedral approximations of the cone \(\mathcal {C}\) in the context of standard quadratic optimization. We believe that the insights obtained from our results may be useful for the construction of improved polyhedral approximation hierarchies.

The paper is organized as follows. We define our notation in Sect. 1.1. In Sect. 2, we describe the inner and outer polyhedral approximations of the cone of completely positive matrices. Section 3 is devoted to some preliminary results. In Sect. 4, we present algebraic descriptions of the sets of instances on which the upper bound is exact at any given finite level of the hierarchy. We identify the structural properties of the instances on which the upper bound converges to the optimal value in the limit. We also discuss several geometric and topological properties of these sets. The corresponding results for lower bounds are given in Sect. 5. We conclude the paper in Sect. 6.

1.1 Notation

We denote by \(\mathbb {R}^n\) and \(\mathbb {R}^n_+\) the \(n\)-dimensional Euclidean space and the nonnegative orthant, respectively. We use \(\mathbb {N}^n\) and \(\mathbb {Q}^n\) for the set of \(n\)-dimensional nonnegative integer vectors and the set of rational vectors, respectively. The unit simplex is represented by \({\varDelta }_n \subset \mathbb {R}^n\). The space of \(n \times n\) real symmetric matrices is denoted by \(\mathcal {S}\). We reserve calligraphic uppercase letters to denote subsets of \(\mathcal {S}\). The cones of completely positive and copositive matrices in \(\mathcal {S}\) are denoted by \(\mathcal {C}\) and \(\mathcal {C}^*\), respectively. We use \(\mathcal {N}\) to denote the cone of componentwise nonnegative \(n \times n\) real symmetric matrices. The set of \(n \times n\) symmetric matrices with rational entries is represented by \(\mathcal {Q}\). The \(i\)th unit vector in \(\mathbb {R}^n\) is denoted by \(e_i,~i = 1,\ldots ,n\). We use \(e = \sum _{i=1}^n e_i\) and \(E = e e^T\) to represent the \(n\)-dimensional vector of all ones and the \(n \times n\) matrix of all ones, respectively. We reserve \(I \in \mathcal {S}\) for the identity matrix. Given \(x \in \mathbb {R}^n\), \(\hbox {Diag}(x) \in \mathcal {S}\) denotes the diagonal matrix whose diagonal entries are given by the components of \(x\). Similarly, for \(M \in \mathcal {S}\), \(\hbox {diag}(M) \in \mathbb {R}^n\) represents the vector composed of the diagonal entries of \(M\). The closure and the convex hull of a set are denoted by \(\hbox {cl}(\cdot )\) and \(\hbox {conv}(\cdot )\), respectively. For a given instance of (StQP), we denote the optimal value by \(\nu (Q)\) and the set of optimal solutions by \(\varOmega (Q) \subseteq {\varDelta }_n\). Let \(U \subseteq \{1,\ldots ,n\}\) and \(V \subseteq \{1,\ldots ,n\}\) be two nonempty index sets with \(|U| = u\) and \(|V| = v\). Given \(x \in \mathbb {R}^n\), we denote by \(x_U \in \mathbb {R}^{u}\) the restriction of \(x\) to its components indexed by \(U\). Similarly, for a given \(M \in \mathcal {S}\), \(M_{UV}\) denotes the \(u \times v\) submatrix of \(M\) whose rows and columns are indexed by the sets \(U\) and \(V\), respectively.

2 Polyhedral approximation hierarchies

In this section, we give a detailed description of the two hierarchies of inner and outer polyhedral approximations of the cone of completely positive matrices.

2.1 Inner polyhedral approximations

Let us define the following sequence of increasingly larger finite subsets of the unit simplex (see [3, 20]):

$$\begin{aligned} {\varDelta }_n^r := \bigcup _{k = 0}^r {\varLambda }_n^k, \quad r = 0, 1, \ldots , \end{aligned}$$
(3)

where

$$\begin{aligned} {\varLambda }_n^k := \{x \in {\varDelta }_n{:}\, (k + 2) x \in \mathbb {N}^n\}, \quad k = 0, 1, \ldots \end{aligned}$$
(4)

Consider the following sequence of sets:

$$\begin{aligned} \mathcal {I}_r := \left\{ \sum _{d \in {\varDelta }_n^r} \lambda _d \, d d^T{:}\, \lambda _d \ge 0 \hbox { for each }d \in {\varDelta }_n^r \right\} , \quad r = 0, 1, \ldots \end{aligned}$$

Since \({\varDelta }_n^r\) is a finite set for each \(r \in \mathbb {N}\), each set \(\mathcal {I}_r\) is a polyhedral cone. Furthermore, this sequence of cones satisfies [20, Theorem2.2]

$$\begin{aligned} \mathcal {I}_0 \subseteq \mathcal {I}_1 \subseteq \cdots \subseteq \mathcal {C}, \quad \hbox {and} \quad \hbox {cl}\left( \bigcup _{r \in \mathbb {N}} \mathcal {I}_r\right) = \mathcal {C}. \end{aligned}$$

Therefore, replacing the difficult conic constraint \(X \in \mathcal {C}\) in (2) with \(X \in \mathcal {I}_r\), we obtain the following linear programming problem whose optimal value yields an upper bound on \(\nu (Q)\):

$$\begin{aligned} u_r(Q) := \min \left\{ {\langle }Q, X {\rangle }{:}\, {\langle }E, X {\rangle }= 1, \quad X \in \mathcal {I}_r\right\} , \quad r = 0,1,\ldots \end{aligned}$$

As already observed in [20] (see also [3]), it follows directly from the definition of \(\mathcal {I}_r\) that \(u_r(Q)\) can be equivalently expressed as a minimization problem over a finite set:

$$\begin{aligned} u_r(Q) = \min _{d \in {\varDelta }_n^r} d^T Q d, \quad r = 0,1,\ldots \end{aligned}$$
(5)

Since \(|{\varDelta }_n^r| = O(n^{r+2})\) (see [3, 20]), \(u_r(Q)\) can be computed in polynomial time for each fixed \(r \in \mathbb {N}\).

2.2 Outer polyhedral approximations

Consider the following sequence of sets (see [3, 8]):

$$\begin{aligned} \mathcal {O}_r := \left\{ \sum _{d \in {\varLambda }_n^r} \beta _d \left( (r+2) d d^T - \hbox {Diag}(d) \right) {:}\, \beta _d \ge 0 \hbox { for all }d \in {\varLambda }_n^r\right\} , \quad r = 0, 1, \ldots \end{aligned}$$

Similarly, since \({\varLambda }_n^r\) is a finite set, \(\mathcal {O}_r\) is a polyhedral cone for each \(r \in \mathbb {N}\). By [8, Theorem 3.3],

$$\begin{aligned} \mathcal {C}\subseteq \cdots \subseteq \mathcal {O}_1 \subseteq \mathcal {O}_0 = \mathcal {N}, \quad \hbox {and} \quad \mathcal {C}= \bigcap _{r \in \mathbb {N}} \mathcal {O}_r. \end{aligned}$$

In a similar manner, replacing the difficult conic constraint \(X \in \mathcal {C}\) in (2) with \(X \in \mathcal {O}_r\) gives rise to the following linear programming problem, whose optimal value yields a lower bound on \(\nu (Q)\):

$$\begin{aligned} l_r(Q) := \min \{{\langle }Q, X {\rangle }{:}\, {\langle }E, X {\rangle }= 1, \quad X \in \mathcal {O}_r\}, \quad r = 0,1,\ldots \end{aligned}$$

Using the definition of \(\mathcal {O}_r\), the lower bound \(l_r(Q)\) can similarly be stated as the following minimization problem over a finite set (see also [3, 20]):

$$\begin{aligned} l_r(Q)&= \left( \frac{r+2}{r+1}\right) \min _{d \in {\varLambda }_n^r} \left( d^T Q d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q) \right) , \nonumber \\&= \frac{1}{(r+1)(r + 2)}\min _{z \in (r+2) {\varLambda }_n^r}(z^T Q z - z^T \hbox {diag}(Q)), \quad r = 0,1,\ldots \end{aligned}$$
(6)

Similarly, for each fixed \(r \in \mathbb {N}\), \(l_r(Q)\) can be computed in polynomial time since \(|{\varLambda }_n^r| = O(n^{r+2})\) (see [3]).

2.3 Error bounds and the maximum stable set problem

Given an instance of (StQP), the two sequences \(\{l_r(Q)\}\) and \(\{u_r(Q)\}, ~r \in \mathbb {N}\), give rise to increasingly tighter lower and upper bounds, respectively, on the optimal value \(\nu (Q)\). Yıldırım [20] established that

$$\begin{aligned} u_r(Q) - l_r(Q) \le \frac{1}{r + 1} \left( \max _{i = 1,\ldots ,n} Q_{ii} - \nu (Q) \right) , \quad r = 0, 1, \ldots , \end{aligned}$$

which implies that these bounds lead to a polynomial-time approximation scheme for standard quadratic optimization (see also [3] for a slightly different result).

The bounds \(l_r(Q)\) and \(u_r(Q)\) have closed form expressions for the maximum stable set problem [3, 17, 20], which we review next. Let \(G\) be a simple, undirected graph with \(n\) vertices and \(m\) edges. A subset \(S\) of vertices of \(G\) is called a stable set if no two vertices in \(S\) are connected by an edge. The maximum stable set problem is that of finding the stable set with the largest cardinality in \(G\). The size of the largest stable set, denoted by \(\alpha (G)\), is called the stability number of \(G\).

Motzkin and Straus [14] established that the maximum stable set problem can be formulated as an instance of (StQP) as

$$\begin{aligned} \frac{1}{\alpha (G)} = \nu (I + A_G) = \min _{x \in {\varDelta }_n} x^T (I + A_G) x, \end{aligned}$$
(7)

where \(A_G \in \mathcal {S}\) denotes the vertex adjacency matrix of \(G\).

For the maximum stable set problem, the upper bounds have the following simple closed form expressions [20]:

$$\begin{aligned} u_r(I + A_G) = \left\{ \begin{array}{l@{\quad }l} \frac{1}{r+2}, &{}\quad \hbox {if}~r < \alpha (G) - 2, \\ \frac{1}{\alpha (G)}, &{}\quad \hbox {otherwise}. \end{array} \right. \end{aligned}$$
(8)

Similarly, the lower bounds have the following more complicated closed form expressions [3, 17, 20]:

$$\begin{aligned} l_r(I + A_G) = \left\{ \begin{array}{l@{\quad }l} 0, &{} \hbox {if}~r \le \alpha (G) - 2, \\ \frac{{s \atopwithdelims ()2}\alpha (G) + st}{{r + 2 \atopwithdelims ()2}}, &{} \hbox {otherwise}, \end{array} \right. \end{aligned}$$
(9)

where \(s\) and \(t\) are nonnegative integers that satisfy \(r + 2 = s \alpha (G) + t\) with \(0 \le t < \alpha (G)\) and the convention that \({s \atopwithdelims ()2} = 0\) if \(s = 0\) or \(s = 1\).

In the context of the maximum stable set problem, it follows from (8) that the upper bound matches the optimal value at level \(r = \max \{0,\alpha (G) - 2\}\) of the inner approximation hierarchy. Similarly, if \(G\) is a complete graph, then \(l_0(Q) = \nu (Q) = 1/\alpha (G) = 1\), where \(Q = I + A_G\) by (9). It follows that both inner and outer approximations are already exact at level 0 in this case. On the other hand, if \(\alpha (G) \ge 2\), then \(l_r(Q) < \nu (Q) = 1/\alpha (G)\) for each \(r \in \mathbb {N}\) (see [17]), which implies that the lower bound matches the optimal value only in the limit. Therefore, the upper and lower bounds exhibit different behaviors in terms of finite convergence in the context of the maximum stable set problem.

In this paper, we are interested in understanding the behavior of upper and lower bounds that arise from the completely positive reformulation of general standard quadratic optimization problems. In particular, we give complete algebraic descriptions of the instances of (StQP) for which \(u_r(Q) = \nu (Q)\) (see Theorem 2) or \(l_r(Q) = \nu (Q)\) (see Theorem 4) for any given \(r \in \mathbb {N}\). We aim to identify structural properties of the instances of (StQP) for which \(u_r(Q) > \nu (Q)\) (see Proposition 2) or \(l_r(Q) < \nu (Q)\) (see Theorem 3) for all \(r \in \mathbb {N}\). Furthermore, we present several geometric and topological properties of these sets (see Proposition 1, Corollary 1, Propositions 8 and 9).

3 Preliminaries

In this section, we collect some preliminary results about (StQP). We then establish some basic properties of upper and lower bounds.

We first review the optimality conditions of (StQP).

Theorem 1

Given an instance of (StQP), let \(x^* \in \varOmega (Q)\) with the optimal value \(\nu (Q) = (x^*)^T Q x^*\). Define

$$\begin{aligned} P := \{j \in \{1,\ldots ,n\}{:}\, x^*_j > 0\}, \quad \text{ and } \quad Z := \{j \in \{1,\ldots ,n\}{:}\, x^*_j = 0\}. \end{aligned}$$
(10)

Then, \(x^*\) satisfies

$$\begin{aligned} Q_{PP} \, x^*_P = \nu (Q) e_P, \quad Q_{ZP} x^*_P \ge \nu (Q) e_Z. \end{aligned}$$

Proof

The assertion directly follows from the KKT conditions. \(\square \)

Next, given an instance of (StQP), we present several basic results about the optimal value \(\nu (Q)\) and the upper and lower bounds.

Lemma 1

Let \(Q, Q_1, Q_2 \in \mathcal {S}\).

  1. (i)

    \(l_0(Q) = \min _{1 \le i \le j \le n} Q_{ij}\).

  2. (ii)

    If \(Q \in \mathcal {N}\), then \(u_r(Q) \ge \nu (Q) \ge l_r(Q) \ge 0\) for each \(r = 0,1,\ldots \)

  3. (iii)

    If \(Q_1 - Q_2 \in \mathcal {N}\), then \(\nu (Q_1) \ge \nu (Q_2)\), \(u_r(Q_1) \ge u_r(Q_2)\), and \(l_r(Q_1) \ge l_r(Q_2)\) for each \(r = 0,1,\ldots \)

  4. (iv)

    For any \(\mu \in \mathbb {R}_+\) and any \(r \in \mathbb {N}\), \(\nu (\mu Q) = \mu \nu (Q)\), \(l_r(\mu Q) = \mu l_r(Q)\), and \(u_r(\mu Q) = \mu u_r(Q)\).

  5. (v)

    For any \(\lambda \in \mathbb {R}\) and any \(r \in \mathbb {N}\), \(\nu (Q + \lambda E) = \nu (Q) + \lambda \), \(l_r(Q + \lambda E) = l_r(Q) + \lambda \), and \(u_r(Q + \lambda E) = u_r(Q) + \lambda \).

Proof

  1. (i)

    By (4), \({\varLambda }_n^0 = \{e_i: i = 1,\ldots ,n\} \cup \{(1/2)(e_i + e_j): 1 \le i < j \le n\}\). The assertion follows from this observation and (6).

  2. (ii)

    This is a direct consequence of part (i) and the monotonicity of the lower bounds.

  3. (iii)

    Let \(Q_1 - Q_2 \in \mathcal {N}\). For any \(x \in {\varDelta }_n\), we have \(x^T(Q_1 - Q_2) x \ge 0\), which implies that \(x^T Q_1 x \ge x^T Q_2 x\). Therefore,

    $$\begin{aligned} \nu (Q_1) = \min _{x \in {\varDelta }_n} x^T Q_1 x \ge \min _{x \in {\varDelta }_n} x^T Q_2 x = \nu (Q_2). \end{aligned}$$

    Since \({\varDelta }_n^r \subset {\varDelta }_n\), we can argue similarly for upper bounds by simply replacing \(x \in {\varDelta }_n\) above with \(d \in {\varDelta }_n^r\) and using (5). Considering lower bounds, note that \(l_r(Q_1 - Q_2) \ge 0\) by part (ii) for each \(r = 0,1,\ldots \) By (6),

    $$\begin{aligned} d^T (Q_1 - Q_2) d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q_1 - Q_2) \ge 0, \quad \hbox {for all}~ d \in {\varLambda }_n^r, ~r = 0,1,\ldots , \end{aligned}$$

    which, after rearranging, yields

    $$\begin{aligned} d^T Q_1 d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q_1) \ge d^T Q_2 d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q_2), \end{aligned}$$

    for each \(d \in {\varLambda }_n^r, ~r = 0,1,\ldots \) Minimizing both sides of the inequality over \(d \in {\varLambda }_n^r\), we obtain \(l_r(Q_1) \ge l_r(Q_2)\) for each \(r = 0,1,\ldots \)

  4. (iv)

    This is a trivial consequence of (6), (5), and the hypothesis that \(\mu \in \mathbb {R}_+\).

  5. (v)

    Let \(\lambda \in \mathbb {R}\). For any \(x \in {\varDelta }_n\), \(x^T(Q + \lambda E) x \!=\! x^T Q x + \lambda \), which implies that \(\nu (Q + \lambda E) = \nu (Q) + \lambda \). Similarly, by (5), we obtain \(u_r(Q + \lambda E) = u_r(Q) + \lambda \) since \({\varDelta }_n^r \subset {\varDelta }_n\). For any \(d \in {\varLambda }_n^r,~r = 0,1,\ldots \), we have

    $$\begin{aligned} d^T (Q + \lambda E) d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q + \lambda E)&= \left( d^T Q d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q) \right) \\&\quad +\, \lambda \left( \frac{r+1}{r+2}\right) , \end{aligned}$$

    where we used \(d^T E d = d^T \hbox {diag}(E) = e^T d = 1\) by (4). It follows from (6) that \(l_r(Q + \lambda E) = l_r(Q) + \lambda \).

\(\square \)

4 Upper bounds

In this section, we focus on the upper bounds \(u_r(Q)\). We first give a complete algebraic description of the set of instances of (StQP) for which the upper bound matches the optimal value at any given finite level of the inner approximation hierarchy. We next discuss some interesting properties of this set. Finally, we identify the structural properties of the instances for which the upper bound is exact at a finite level of the hierarchy and for which the upper bound converges to the optimal value only in the limit.

Let us define the following sets:

$$\begin{aligned} \mathcal {U}_r := \{Q \in \mathcal {S}{:}\, u_r(Q) = \nu (Q)\}, \quad r = 0,1,\ldots \end{aligned}$$
(11)

Due to the monotonicity of the upper bounds, we readily obtain

$$\begin{aligned} \mathcal {U}_0 \subseteq \mathcal {U}_1 \subseteq \cdots \subseteq \mathcal {S}. \end{aligned}$$

We first give complete algebraic descriptions of the sets \(\mathcal {U}_r\).

Theorem 2

For each \(r \in \mathbb {N}\), we have

$$\begin{aligned} \mathcal {U}_r = \bigcup _{d \in {\varDelta }_n^r} \mathcal {V}_d, \end{aligned}$$
(12)

where

$$\begin{aligned} \mathcal {V}_d := \{Q \in \mathcal {S}{:}\, d^T Q d \le x^T Q x, \quad \hbox {for each}~x \in {\varDelta }_n\}, \quad d \in {\varDelta }_n^r. \end{aligned}$$
(13)

Proof

Let us fix \(r \in \mathbb {N}\). By (11) and (5), \(Q \in \mathcal {U}_r\) if and only if

$$\begin{aligned} \nu (Q) = u_r(Q) = \min _{d \in {\varDelta }_n^r} d^T Q d, \end{aligned}$$

which holds if and only if there exists some \(d \in {\varDelta }_n^r \cap \varOmega (Q)\), or equivalently \(Q \in \mathcal {V}_d\) for some \(d \in {\varDelta }_n^r\). The relation (12) follows. \(\square \)

The next proposition presents several geometric properties of the sets \(\mathcal {U}_r\).

Proposition 1

For each \(r \in \mathbb {N}\), \(\mathcal {U}_r\) is a closed cone given by the union of a finite number of nonempty, closed, and convex cones. Furthermore, \(\mathcal {U}_r\) contains the line \(\{\lambda E: \lambda \in \mathbb {R}\}\) for each \(r \in \mathbb {N}\).

Proof

For any \(x \in \mathbb {R}^n\), note that \(x^T Q x = \langle Q, xx^T \rangle \). Therefore, for each \(r \in \mathbb {N}\) and each \(d \in {\varDelta }_n^r\), \(\mathcal {V}_d\) is closed and convex since it is given by the intersection of an infinite number of closed half spaces in \(\mathcal {S}\). Clearly, \(\mathcal {V}_d\) is a cone.

For each \(r \in \mathbb {N}\), the set \(\mathcal {U}_r\) is a closed cone since it is given by the union of a finite number of closed cones. Finally, for any \(\lambda \in \mathbb {R}\) and any \(d \in {\varDelta }_n^r\), we have \(\lambda E \in \mathcal {V}_d\). Therefore, \(\mathcal {U}_r\) contains the line \(\{\lambda E: \lambda \in \mathbb {R}\}\) for each \(r \in \mathbb {N}\). \(\square \)

For any \(r \in \mathbb {N}\), it follows from Proposition 1 that \(Q + \lambda E \in \mathcal {U}_r\) for any \(\lambda \in \mathbb {R}\) whenever \(Q \in \mathcal {U}_r\). Note, however, that the cones \(\mathcal {U}_r\) are in general nonconvex. It is easy to verify that \(e_i (e_i)^T \in \mathcal {U}_0\) for each \(i = 1,\ldots ,n\). However, \(I = \sum _{i=1}^n e_i (e_i)^T \not \in \mathcal {U}_0\) for any \(n \ge 3\).

Remark 1

For any \(r \in \mathbb {N}\) and any \(d \in {\varDelta }_n^r\), the unique global minimizer of the quadratic function \((x-d)^T (x-d)\) is given by \(d\). If we define \(Q_d := I - e d^T - d e^T \in \mathcal {S}\), it follows that \(x^T Q_d x = x^T x - 2 d^T x = (x-d)^T (x-d) - d^T d \ge -d^T d = d^T Q_d \, d\) for any \(x \in {\varDelta }_n\), with equality if and only if \(x = d\). Therefore, \(\varOmega (Q_d) = \{d\}\) and \(Q_d \in \mathcal {V}_d\) by (13). Since \({\varDelta }_n^r \subset {\varDelta }_n^{r+1}\) by (3), it follows from Proposition 1 that \(\mathcal {U}_r \subset \mathcal {U}_{r+1}\) for each \(r \in \mathbb {N}\), i.e., the set \(\mathcal {U}_{r+1}\) is strictly larger than the set \(\mathcal {U}_r\) for each \(r \in \mathbb {N}\).

Remark 2

Let \(G = (V,E)\) be a simple, undirected graph, where \(|V| = n\). Consider the formulation (7) of the maximum stable set problem on \(G\) as an instance of (StQP), where \(Q = I + A_G\). By (8), \(u_r(I + A_G) = \nu (I + A_G)\) for \(r \ge \alpha (G) - 2\), which implies that \(I + A_G \in \mathcal {U}_r\) for each \(r \ge \alpha (G) - 2\). It follows that, for any simple, undirected graph \(G = (V,E)\), we have \(I + A_G \in \mathcal {U}_{n-2}\).

Let us next define

$$\begin{aligned} \mathcal {U}:= \bigcup _{r \in \mathbb {N}} \mathcal {U}_r, \quad \hbox {and} \quad \mathcal {U}_\infty := \mathcal {S}\backslash \mathcal {U}. \end{aligned}$$
(14)

Note that the set \(\mathcal {U}\) consists of all instances of (StQP) for which the upper bound matches the optimal value at some finite level of the inner approximation hierarchy. For instance, it follows from Remark 2 that, for any simple, undirected graph \(G = (V,E)\), we have \(I + A_G \in \mathcal {U}\). We next present several structural properties of the instances in the sets \(\mathcal {U}\) and \(\mathcal {U}_\infty \).

Proposition 2

The following relations are satisfied:

  1. (i)

    \(\mathcal {U}= \left\{ Q \in \mathcal {S}{:}\, \varOmega (Q) \cap \mathbb {Q}^n \ne \emptyset \right\} \) and \(\mathcal {U}_\infty = \left\{ Q \in \mathcal {S}{:}\, \varOmega (Q) \cap \mathbb {Q}^n = \emptyset \right\} \).

  2. (ii)

    \(\mathcal {Q}\subseteq \mathcal {U}\).

Proof

  1. (i)

    The first relation follows from the fact that \(Q \in \mathcal {U}_r\) if and only if \({\varDelta }_n^r \cap \varOmega (Q) \ne \emptyset \) and the relation \(\cup _{r \in \mathbb {N}} {\varDelta }_n^{r} = \mathbb {Q}^n \cap {\varDelta }_n\). The second relation is an immediate consequence of the first one and (14).

  2. (ii)

    Vavasis [19] proved that any quadratic optimization problem with rational data, which is bounded below, has a rational optimal solution. The assertion directly follows from this result and part (i).

\(\square \)

For a given \(Q \in \mathcal {S}\), Proposition 2(i) identifies an important structural property of the corresponding instance of (StQP) in order for the upper bounds to be exact at some finite level of the inner approximation hierarchy. An easy sufficient condition for membership in \(\mathcal {U}\) is presented in Proposition 2(ii), which implies that, for a given \(Q \in \mathcal {S}\), a necessary condition for \(Q \in \mathcal {U}_\infty \) is that \(Q\) has irrational entries. Note, however, that this condition is not sufficient since, for any \(Q \in \mathcal {Q}\), we have \(\mu Q \in \mathcal {U}\) for any \(\mu \in \mathbb {R}_+\) by Proposition 1.

The next corollary presents an important topological property of the set \(\mathcal {U}\) and immediately follows from Proposition 2(ii) and the fact that \(\hbox {cl}(\mathcal {Q}) = \mathcal {S}\).

Corollary 1

We have \(\hbox {cl}(\mathcal {U}) = \mathcal {S}\).

It follows from Corollary 1 that, for any \(Q \in \mathcal {S}\), either the upper bound is exact at some finite level of the hierarchy or there exists an arbitrarily small perturbation of \(Q\) for which the upper bound is exact at a finite level. Therefore, the set \(\mathcal {U}\) is a dense subset of the set \(\mathcal {S}\). This result reveals the strength of the inner approximation hierarchy of the completely positive cone in the context of standard quadratic optimization.

Finally, the next example illustrates that \(\mathcal {U}_\infty \ne \emptyset \) for any \(n \ge 2\).

Example 1

Similar to Remark 1, let \(d \in {\varDelta }_n \backslash \mathbb {Q}^n\) and define \(Q_d := I - e d^T - d e^T \in \mathcal {S}\). Then, \(\varOmega (Q_d) = \{d\}\) and \(\varOmega (Q_d) \cap \mathbb {Q}^n = \emptyset \). By Proposition 2(i), \(Q_d \in \mathcal {U}_\infty \).

5 Lower bounds

In this section, we focus on the behavior of lower bounds \(l_r(Q)\). Our analysis of lower bounds is considerably more involved since the expression (6) is more complicated in comparison with (5).

Similar to \(\mathcal {U}_r\), let us define the following sets:

$$\begin{aligned} \mathcal {L}_r := \{Q \in \mathcal {S}{:}\, l_r(Q) = \nu (Q)\}, \quad r = 0,1,\ldots \end{aligned}$$
(15)

Since the lower bounds are monotonically nondecreasing, we obtain

$$\begin{aligned} \mathcal {L}_0 \subseteq \mathcal {L}_1 \subseteq \cdots \subseteq \mathcal {S}. \end{aligned}$$

Similarly, let

$$\begin{aligned} \mathcal {L}:= \bigcup _{r \in \mathbb {N}} \mathcal {L}_r, \quad \hbox {and} \quad \mathcal {L}_\infty := \mathcal {S}\backslash \mathcal {L}. \end{aligned}$$
(16)

This section is organized as follows. In Sect. 5.1, we give a simple algebraic description of the set \(\mathcal {L}_0\) (see Proposition 3) and present its geometric properties (see Corollary 2). Two auxiliary sets \(\mathcal {S}_1\) and \(\mathcal {S}_2\) are introduced in Sect. 5.2, which are later used to identify structural properties of the instances in the sets \(\mathcal {L}\) and \(\mathcal {L}_\infty \) (see Theorem 3 and Proposition 8) in Sect. 5.3. We give complete algebraic descriptions of the sets \(\mathcal {L}_r\) (see Theorem 4) and present several geometric properties of these sets (see Proposition 9) in Sect. 5.4. Finally, we close this section by discussing the relations among the sets \(\mathcal {L}_r\), \(\mathcal {L}\), \(\mathcal {U}_r\), and \(\mathcal {U}\) (see Proposition 11).

5.1 Characterization of \(\mathcal {L}_0\)

First, we focus on the set \(\mathcal {L}_0\). To that end, given an instance of (StQP) and any \(\gamma \in \mathbb {R}\), replacing \(Q \in \mathcal {S}\) with \(Q + \gamma E\) in (StQP) shifts the optimal value by \(\gamma \) by Lemma 1(v) , i.e., \(\nu (Q + \gamma E) = \nu (Q) + \gamma \), while \(\varOmega (Q + \gamma E) = \varOmega (Q)\). In particular, the shifted matrix obtained with \(\gamma = -\min _{1 \le i \le j \le n} Q_{ij} = -l_0(Q)\) will play an important role and we define it below for future reference:

$$\begin{aligned} Q^s := Q - l_0(Q) E. \end{aligned}$$
(17)

Note that \(Q^s \in \mathcal {N}\) and \(l_0(Q^s) = 0\) by Lemma 1(i).

Next, we give a simple algebraic description of the set \(\mathcal {L}_0\).

Proposition 3

\(Q \in \mathcal {L}_0\) if and only if there exists an index \(k \in \{1,\ldots ,n\}\) such that \(Q_{kk} = \min _{1 \le i \le j \le n} Q_{ij}\). Therefore,

$$\begin{aligned} \mathcal {L}_0 = \bigcup _{k=1}^n \left\{ Q \in \mathcal {S}{:}\, Q_{kk} \le Q_{ij},~1 \le i \le j \le n\right\} . \end{aligned}$$
(18)

Proof

Suppose that \(Q \in \mathcal {L}_0\). Then, \(l_0(Q) = \nu (Q) = \min _{1 \le i \le j \le n} Q_{ij}\) by Lemma 1(i). Suppose, for a contradiction, that \(Q_{kk} > l_0(Q)\) for each \(k = 1,\ldots ,n\). By (17), \(\min _{k = 1,\ldots ,n} Q^s_{kk} > 0\). Since \(Q^s \in \mathcal {N}\), for any \(x \in {\varDelta }_n\), we have

$$\begin{aligned} x^T Q^s x \ge \sum _{j=1}^n Q^s_{jj} \, x_j^2 \ge \left( \min _{k = 1,\ldots ,n} Q^s_{kk} \right) \left( \sum _{j=1}^n x_j^2 \right) \ge \frac{ \min _{k = 1,\ldots ,n} Q^s_{kk}}{n} > 0, \end{aligned}$$

where we used \(\min _{x \in {\varDelta }_n} \sum _{j=1}^n x_j^2 = 1/n\) to derive the third inequality. Together with Lemma 1(v), we obtain that \(\nu (Q^s) = \nu (Q) - l_0(Q) > 0\), contradicting our hypothesis.

Conversely, given \(Q \in \mathcal {S}\), suppose that there exists an index \(k \in \{1,\ldots ,n\}\) such that \(Q_{kk} = \min _{1 \le i \le j \le n} Q_{ij}\). It is easy to verify that \(e_k \in \varOmega (Q)\). Therefore, \(\nu (Q) = Q_{kk} = \min _{1 \le i \le j \le n} Q_{ij} = l_0(Q)\) by Lemma 1(i), which implies that \(Q \in \mathcal {L}_0\). \(\square \)

The following corollary presents some geometric properties of the set \(\mathcal {L}_0\) and immediately follows from (18).

Corollary 2

The set \(\mathcal {L}_0\) is a closed cone given by the union of \(n\) polyhedral cones and contains the line \(\{\lambda E{:}\, \lambda \in \mathbb {R}\}\).

We remark that the set \(\mathcal {L}_0\) is in general nonconvex since \(e_i (e_i)^T \in \mathcal {L}_0\) for each \(i = 1,\ldots ,n\), while \(I = \sum _{i=1}^n e_i (e_i)^T \not \in \mathcal {L}_0\) for any \(n \ge 2\).

5.2 Two auxiliary sets

In this section, we will define two auxiliary sets that will subsequently be helpful in the description of the sets \(\mathcal {L}\) and \(\mathcal {L}_\infty \). To that end, we first define two index sets.

For a given \(Q \in \mathcal {S}\), we have \(e_k^T Q e_k = Q_{kk} \ge \nu (Q)\) for each \(k = 1,\ldots ,n\). Therefore, each \(Q \in \mathcal {S}\) induces the following partition of the indices:

$$\begin{aligned} U := \{ k \in \{1,\ldots ,n\} {:}\, Q_{kk} = \nu (Q) \}, \quad V := \{ k \in \{1,\ldots ,n\} {:}\, Q_{kk} > \nu (Q) \}. \end{aligned}$$
(19)

We next define the following two auxiliary sets, which partition the set \(\mathcal {S}\):

$$\begin{aligned} \mathcal {S}_1&:= \left\{ Q \in \mathcal {S}{:}\, \min _{k = 1,\ldots ,n} Q_{kk} = \nu (Q)\right\} = \{Q \in \mathcal {S}{:}\, U \ne \emptyset \}, \end{aligned}$$
(20)
$$\begin{aligned} \mathcal {S}_2&:= \left\{ Q \in \mathcal {S}{:}\, \min _{k = 1,\ldots ,n} Q_{kk} > \nu (Q)\right\} = \{Q \in \mathcal {S}{:}\, U = \emptyset \}. \end{aligned}$$
(21)

Note that \(Q \in \mathcal {S}_1\) if and only if \(\varOmega (Q) \cap \{e_1,\ldots ,e_n\} \ne \emptyset \). Therefore,

$$\begin{aligned} \mathcal {S}_1 = \bigcup _{k = 1}^n \mathcal {V}_{e_k}, \end{aligned}$$
(22)

where \(\mathcal {V}_{e_k}\) is defined as in (13).

The next proposition presents some geometric properties of the set \(\mathcal {S}_1\) and its relation with the set \(\mathcal {L}_0\).

Proposition 4

\(\mathcal {S}_1\) is a closed cone given by the union of a finite number of nonempty, closed, and convex cones. Furthermore,

$$\begin{aligned} \mathcal {L}_0 \subseteq \mathcal {S}_1. \end{aligned}$$

Proof

The assertions directly follow from the definition (13), (22), and Proposition 3. \(\square \)

We now turn our attention to the set \(\mathcal {S}_2\). The following proposition is one of the main results of this section.

Proposition 5

We have \(\mathcal {S}_2 \subseteq \mathcal {L}_\infty \), i.e., for a given \(Q \in \mathcal {S}\), if \(\min _{k = 1,\ldots ,n} Q_{kk} > \nu (Q)\), then \(Q \in \mathcal {L}_\infty \).

Proof

Let \(Q \in \mathcal {S}_2\) and let \(x^* \in \varOmega (Q)\). Let the index sets \(P\) and \(Z\) be defined as in (10). Note that \(|P| \ge 1\) since \(x^* \in {\varDelta }_n\). First, we claim that \(|P| \ge 2\). Clearly, \(|P| = 1\) if and only if \(x^* = e_k\) for some \(k \in \{1,\ldots ,n\}\), in which case \(Q \in \mathcal {S}_1\) by (22), contradicting our hypothesis.

By Theorem 1,

$$\begin{aligned} Q_{PP} \, x^*_P = \nu (Q) e_P. \end{aligned}$$
(23)

Fixing \(k \in P\) and considering the corresponding equation above, we obtain

$$\begin{aligned} \sum _{j \in P} Q_{kj} \, x^*_j = \nu (Q). \end{aligned}$$

Since \(Q_{kk} > \nu (Q)\) and \(x^* \in {\varDelta }_n\), it follows that there exists some \(l \in P\) such that \(l \ne k\) and \(Q_{kl} < \nu (Q)\).

Let us denote the smallest face of \({\varDelta }_n\) that contains \(x^*\) by \(F\), i.e., \(F = \hbox {conv}\{e_i{:}\, i \in P\}\). We make the following claim. For each \(r \in \mathbb {N}\), there exists \(q^r \in {\varLambda }_n^r \cap F\) such that

$$\begin{aligned} f_r(q^r) := \left( \frac{r+2}{r+1} \right) (q^r)^T Q (q^r) - \left( \frac{1}{r + 1}\right) (q^r)^T \hbox {diag}(Q) < \nu (Q), \quad r = 0,1,\ldots \nonumber \\ \end{aligned}$$
(24)

We will prove our claim by induction on \(r\). For \(r = 0\), we define \(q^0 = (1/2)(e_k + e_l)\), where \(k \in P\) and \(l \in P\) are as defined above. Clearly, \(q^0 \in {\varLambda }_n^0 \cap F\) and

$$\begin{aligned} f_0(q^0) = 2 (q^0)^T Q (q^0) - (q^0)^T \hbox {diag}(Q) = Q_{kl} < \nu (Q), \end{aligned}$$

by the choice of \(k \in P\) and \(l \in P\). This establishes (24) for \(r = 0\).

Suppose now that there exists some \(q^r \in {\varLambda }_n^{r} \cap F\) that satisfies (24) for some \(r \in \mathbb {N}\). We will show that we can construct \(q^{r+1} \in {\varLambda }_n^{r+1} \cap F\) that satisfies (24) for \(r + 1\).

Let us define \(z^r = (r + 2) q^r\). By the induction hypothesis,

$$\begin{aligned} f_r(q^r) = \frac{1}{(r+1)(r+2)} \left( (z^r)^T Q (z^r) - (z^r)^T \hbox {diag}(Q) \right) < \nu (Q). \end{aligned}$$
(25)

For each \(j \in P\), let us define \(w^{j} := z^r + e_j\). We have

$$\begin{aligned} \left( \frac{1}{r+3}\right) w^{j} = \left( \frac{r+2}{r+3}\right) q^r + \left( \frac{1}{r+3}\right) e_j \in {\varLambda }_n^{r+1} \cap F, \quad \hbox {for each}~j \in P. \end{aligned}$$

We will show that there exists some \(j^\prime \in P\) such that \(q^{r+1} = (1/(r+3)) w^{j^\prime }\) satisfies (24) for \(r + 1\). To that end,

$$\begin{aligned} (w^j)^T Q (w^j) - (w^j)^T \hbox {diag}(Q)&= (z^r)^T Q (z^r) + 2 (z^r)^T Q e_j \\&+\, Q_{jj} - (z^r)^T \hbox {diag}(Q) - Q_{jj}, \\&= (z^r)^T Q (z^r) - (z^r)^T \hbox {diag}(Q) + 2 (z^r)^T Q e_j, \quad j \in P. \end{aligned}$$

Let us now focus on the term \((z^r)^T Q e_j\). Observe that \(q^r = (1/(r+2)) z^r \in F\) by the induction hypothesis and \(e_j \in F\) since \(j \in P\). Therefore, \((z^r)^T Q e_j = (z^r_P)^T Q_{PP} (e_j)_P\) for each \(j \in P\). Multiplying both sides by \(x^*_j\) and summing over \(j \in P\), we obtain

$$\begin{aligned} \sum _{j \in P} x^*_j \left( (z^r_P)^T Q_{PP} (e_j)_P \right)&= (z^r_P)^T Q_{PP} \left( \sum _{j \in P} x^*_j (e_j)_P \right) , \\&= (z^r_P)^T Q_{PP} \, x^*_P, \\&= \nu (Q) ((z^r_P)^T e_P), \\&= (r + 2) \nu (Q), \end{aligned}$$

where we used (23) in the third line and the definition of \(z^r\) in the last line. Since \(\sum _{j \in P} x^*_j = 1\) and \(x^* \ge 0\), there exists some \(j^\prime \in P\) such that \((z^r_P)^T Q_{PP} (e_{j^\prime })_P = (z^r)^T Q e_{j^\prime } \le (r + 2) \nu (Q)\). By defining \(q^{r+1} = (1/(r+3)) w^{j^\prime }\), we obtain

$$\begin{aligned} f_{r+1}(q^{r+1})&= \left( \frac{r+3}{r+2} \right) (q^{r+1})^T Q (q^{r+1}) - \left( \frac{1}{r + 2}\right) (q^{r+1})^T \hbox {diag}(Q), \\&= \frac{1}{(r+2)(r+3)} \left( (w^{j^\prime })^T Q (w^{j^\prime }) - (w^{j^\prime })^T \hbox {diag}(Q) \right) , \\&= \frac{1}{(r+2)(r+3)}\left( (z^r)^T Q (z^r) - (z^r)^T \hbox {diag}(Q) + 2(z^r)^T Q e_{j^\prime } \right) , \\&\le \frac{1}{(r+2)(r+3)}\left( (z^r)^T Q (z^r) - (z^r)^T \hbox {diag}(Q) + 2 (r + 2) \nu (Q) \right) , \\&< \frac{1}{(r+2)(r+3)} \left( (r+1)(r+2) \nu (Q) + 2 (r + 2) \nu (Q) \right) , \\&= \left( \frac{r+1}{r+3} \right) \nu (Q) + \left( \frac{2}{r+3} \right) \nu (Q), \\&= \nu (Q), \end{aligned}$$

where we used the choice of \(j^\prime \) in the first inequality and the induction hypothesis (25) in the second one. Since \(l_r(Q) \le f_r(q^r)\) by (6), it follows from (24) that \(l_r(Q) < \nu (Q)\) for each \(r \in \mathbb {N}\). Therefore, \(Q \in \mathcal {L}_\infty \). \(\square \)

Example 2

Consider an instance of (StQP) in which \(Q^s\) is a diagonal matrix with strictly positive diagonal entries. Note that this class includes all instances of (StQP) in which \(Q\) itself is a strictly positive diagonal matrix. Let \(\beta = \sum _{i=1}^n (1/Q^s_{ii})\). It is easy to verify that the unique optimal solution \(x^* \in \varOmega (Q)\) is given by \(x^*_j = (1/Q^s_{jj})/\beta \), \(j = 1,\ldots ,n\). Therefore, by (22), \(Q \not \in \mathcal {S}_1\) for any \(n \ge 2\). It follows from Proposition 5 that \(Q \in \mathcal {L}_\infty \) for any \(n \ge 2\).

Remark 3

Let \(G = (V,E)\) be a simple, undirected graph, where \(|V| = n\). Consider the formulation (7) of the maximum stable set problem on \(G\) as an instance of (StQP), where \(Q = I + A_G\). If \(G\) is a complete graph, then \(Q = E\), which implies that \(Q \in \mathcal {L}_0\) by Corollary 2. Therefore, \(l_0(Q) = 1/\alpha (G) = 1\). On the other hand, if \(\alpha (G) \ge 2\), then \(\min _{k = 1,\ldots ,n} Q_{kk} = 1 > \nu (Q) = 1/\alpha (G)\), which implies that \(Q \in \mathcal {S}_2\) by (21). By Proposition 5, \(Q \in \mathcal {L}_\infty \). It follows that the relation \(l_r(Q) < \nu (Q)\) established in [17] is a special case of Proposition 5.

The following result, which is an immediate consequence of Proposition 5 and (22), presents an important relation between the sets \(\mathcal {L}\) and \(\mathcal {S}_1\).

Corollary 3

We have \(\mathcal {L}\subseteq \mathcal {S}_1\), where \(\mathcal {L}\) and \(\mathcal {S}_1\) are defined as in (16) and (20), respectively. Therefore, for a given \(Q \in \mathcal {S}\), if \(Q \in \mathcal {L}\), then \(\varOmega (Q) \cap \{e_1,\ldots ,e_n\} \ne \emptyset \).

Corollary 3 presents a necessary condition for \(Q \in \mathcal {L}\). An interesting question is whether this condition is sufficient, i.e., whether we have \(\mathcal {L}= \mathcal {S}_1\). The next two examples answer this question in the negative for different reasons.

Example 3

Consider an instance of (StQP), where

$$\begin{aligned} Q = \left[ \begin{array}{c@{\quad }c@{\quad }c} 1 &{} 2 &{} 2 \\ 2 &{} 2 &{} 0\\ 2 &{} 0 &{} 2 \end{array} \right] . \end{aligned}$$

One can verify that \(\nu (Q) = 1\) and \(\varOmega (Q) = \{e_1, (e_2+e_3)/2\}\), which implies that \(Q \in \mathcal {S}_1\) by (22). In this example, \(U = \{1\}\) and \(V = \{2,3\}\). Let us focus on \(Q_{VV}\). Note that \(\nu (Q_{VV}) = 1\), which implies that, for \(n = 2\), \(Q_{VV} \in \mathcal {L}_\infty \) by Proposition 5. Therefore, for any \(r \in \mathbb {N}\), \(l_r(Q_{VV}) < \nu (Q_{VV} )= 1\). It follows that

$$\begin{aligned} l_r(Q)&= \left( \frac{r+2}{r+1}\right) \min _{d \in {\varLambda }_n^r} \left( d^T Q d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q)\right) , \\&\le \left( \frac{r+2}{r+1}\right) \min _{d \in {\varLambda }_n^r{:}\, d_1 = 0} \left( d^T Q d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q)\right) , \\&= l_r(Q_{VV}) < 1, \end{aligned}$$

for any \(r \in \mathbb {N}\). Therefore, \(Q \in \mathcal {L}_\infty \), i.e., \(Q \in \mathcal {S}_1 \backslash \mathcal {L}\).

Example 4

Consider an instance of (StQP), where

$$\begin{aligned} Q = \left[ \begin{array}{c@{\quad }c@{\quad }c} 1 &{} 1 &{} 1 \\ 1 &{} 3 &{} 0\\ 1 &{} 0 &{} 3 \end{array} \right] . \end{aligned}$$

\(Q\) is positive definite. It is easy to verify that \(\nu (Q) = 1\) and \(\varOmega (Q) = \{e_1\}\), which implies that \(Q \in \mathcal {S}_1\) by (22). Similarly, \(U = \{1\}\) and \(V = \{2,3\}\). We claim that \(Q \not \in \mathcal {L}\). Let us fix \(r \in \mathbb {N}\) and define \(d_r := \left( \frac{1}{r+2}\right) [r, 1, 1]^T \in {\varLambda }_n^r\). By (6),

$$\begin{aligned} l_r(Q)&\le \left( \frac{r+2}{r+1}\right) \left( d_r^T Q d_r - \left( \frac{1}{r+2}\right) d_r^T \hbox {diag}(Q)\right) , \\&= \frac{r^2 + 4r + 6 - (r + 6)}{(r+1)(r+2)}, \\&= \frac{r^2 + 3r}{r^2 + 3r + 2} < 1, \end{aligned}$$

which implies that \(l_r(Q) < \nu (Q)\) for each \(r \in \mathbb {N}\). Therefore, \(Q \in \mathcal {L}_\infty \), i.e., \(Q \in \mathcal {S}_1 \backslash \mathcal {L}\). Note that \(\nu (Q_{VV}) = 3/2 > 1 = \nu (Q)\) on this instance.

5.3 Structural properties of \(\mathcal {L}\) and \(\mathcal {L}_\infty \)

In this section, our goal is to identify the structural properties of the instances in the sets \(\mathcal {L}\) and \(\mathcal {L}_\infty \). We first generalize Examples 3 and 4 in an attempt to further our understanding of the set \(\mathcal {S}_1 \backslash \mathcal {L}\) (see Propositions 6 and 7). We then use these observations to present our main result in Theorem 3, which establishes the structural properties of the instances in \(\mathcal {L}\) and \(\mathcal {L}_\infty \). Finally, Proposition 8 presents a relation between the sets \(\mathcal {L}\) and \(\mathcal {S}_1\).

We first present a useful property of the instances in the set \(\mathcal {S}_1 \backslash \mathcal {L}\).

Lemma 2

For any \(Q \in \mathcal {S}_1 \backslash \mathcal {L}\), there exists \(k \in \{1,\ldots ,n\}\) such that \(Q_{kk} > \nu (Q)\), i.e., we have \(V \ne \emptyset \), where \(V\) is defined as in (19).

Proof

Let \(Q \in \mathcal {S}_1 \backslash \mathcal {L}\). Suppose, for a contradiction, that \(V = \emptyset \). Then, \(\{e_1,\ldots ,e_n\} \subseteq \varOmega (Q)\) since \(U = \{1,\ldots ,n\}\). By Theorem 1, we have \(Q_{ij} \ge Q_{ii} = \nu (Q)\) for each \(1 \le i \le j \le n\). Therefore, \(Q \in \mathcal {L}_0\) by Proposition 3, which contradicts our hypothesis since \(\mathcal {L}_0 \subseteq \mathcal {L}\). \(\square \)

It follows from the proof of Lemma 2 that

$$\begin{aligned} Q_{ij} \ge \nu (Q) \quad \hbox {for each} \quad i \in U,~j = 1,\ldots ,n. \end{aligned}$$
(26)

The next proposition gives a sufficient condition for membership in \(\mathcal {S}_1 \backslash \mathcal {L}\) and generalizes Example 3.

Proposition 6

Let \(Q \in \mathcal {S}_1\) be such that \(V \ne \emptyset \). If \(\nu (Q_{VV}) = \nu (Q)\), where \(V\) is defined as in (19), then \(Q \in \mathcal {S}_1 \backslash \mathcal {L}\).

Proof

The assertion follows from the observation that \(Q_{VV} \in \mathcal {L}_\infty \) for \(n = |V|\) by Proposition 5 and a similar argument as in Example 3. \(\square \)

The next proposition presents another sufficient condition for membership in \(\mathcal {S}_1 \backslash \mathcal {L}\), thereby generalizing Example 4.

Proposition 7

Let \(Q \in \mathcal {S}_1\) be such that \(|V| \ge 2\), where \(V\) is defined as in (19). Suppose that there exist indices \(i \in U\), \(j \in V\), \(k \in V\), and \(j \ne k\) such that

$$\begin{aligned} Q_{ij} = Q_{ik} = \nu (Q), \quad Q_{jk} < \nu (Q). \end{aligned}$$
(27)

Then, \(Q \in \mathcal {S}_1 \backslash \mathcal {L}\).

Proof

Suppose that \(Q \in \mathcal {S}_1\) satisfies the hypothesis and let us fix \(r \in \mathbb {N}\). We define \(d \in {\varLambda }_n^r\) as follows:

$$\begin{aligned} d_i = \frac{r}{r+2}, \quad d_j = \frac{1}{r+2}, \quad d_k = \frac{1}{r+2}, \end{aligned}$$

and all remaining entries of \(d\) are set to 0. By (6),

$$\begin{aligned} l_r(Q)&\le \left( \frac{r+2}{r+1}\right) \left( d^T Q d - \left( \frac{1}{r+2}\right) d^T \hbox {diag}(Q)\right) , \\&= \frac{\nu (Q) (r^2 + 3r) + 2 Q_{jk}}{r^2 + 3r + 2} < \nu (Q), \end{aligned}$$

where we used (27) to derive the last inequality. It follows that \(Q \not \in \mathcal {L}\). \(\square \)

We remark that the sufficient conditions of Propositions 6 and 7, in general, are not implied by one another. Indeed, the instance in Example 3 satisfies only the sufficient condition of Proposition 6, whereas the instance in Example 4 satisfies only that of Propositon 7 with \(i = 1\), \(j = 2\), and \(k = 3\).

Recall that \(Q \in \mathcal {S}_1\) if and only if \(\varOmega (Q) \cap \{e_1,\ldots ,e_n\} \ne \emptyset \). Motivated by the sufficient conditions of Propositions 6 and 7, let us define the following subset of \(\mathcal {S}_1\).

$$\begin{aligned} \mathcal {S}_1^\prime&:= \{Q \in \mathcal {S}_1{:}\, V \ne \emptyset ~\hbox {and}~\nu (Q_{VV}) = \nu (Q)\} \cup \{Q \in \mathcal {S}_1{:}\, \exists ~i \in U,~\exists ~j \in V,\nonumber \\&\quad \qquad \quad \quad \,\, ~\exists ~k \in V ~\hbox {such that} ~j \ne k,~Q_{ij} = Q_{ik} = \nu (Q),~Q_{jk} < \nu (Q)\}.\qquad \end{aligned}$$
(28)

The next theorem, which is one of the main results of Sect. 5, establishes the relationships between the sets \(\mathcal {L}\) and \(\mathcal {L}_\infty \) and the sets \(\mathcal {S}_1\), \(\mathcal {S}_1^\prime \), and \(\mathcal {S}_2\).

Theorem 3

The following relations are satisfied:

  1. 1.

    \(\mathcal {L}= \mathcal {S}_1 \backslash \mathcal {S}_1^{\prime }= \{Q \in \mathcal {S}{:}\, \varOmega (Q) \cap \{e_1,\ldots ,e_n\} \ne \emptyset \} \backslash \mathcal {S}_1^{\prime }\),

  2. 2.

    \(\mathcal {L}_\infty = \mathcal {S}_2 \cup \mathcal {S}_1^\prime = \{Q \in \mathcal {S}{:}\, \varOmega (Q) \cap \{e_1,\ldots ,e_n\} = \emptyset \} \cup \mathcal {S}_1^{\prime }\),

where \(\mathcal {S}_1^\prime \) is defined as in (28).

Proof

Note that \(\mathcal {S}_1 \cup \mathcal {S}_2 = \mathcal {L}\cup \mathcal {L}_\infty = \mathcal {S}\) and \(\mathcal {S}_1 \cap \mathcal {S}_2 = \mathcal {L}\cap \mathcal {L}_\infty = \emptyset \) by (16), (20), and (21). By Propositions 56 and 7, we obtain \(\mathcal {S}_1^\prime \cup \mathcal {S}_2 \subseteq \mathcal {L}_\infty \). Taking the complements of both sides yields \(\mathcal {L}\subseteq \mathcal {S}_1 \backslash \mathcal {S}_1^{\prime }\). Therefore, both assertions will be proved by showing that any one of these two inclusions in fact holds with equality. We will show that the latter inclusion holds with equality by showing that the reverse inclusion \(\mathcal {S}_1 \backslash \mathcal {S}_1^{\prime } \subseteq \mathcal {L}\) is satisfied.

Let \(Q \in \mathcal {S}_1 \backslash \mathcal {S}_1^{\prime }\). Let us define the following index sets:

$$\begin{aligned} W_1&:= \left\{ (j,k){:}\, j \in V, \quad k \in V,\quad j \ne k, \quad Q_{jk} < \nu (Q)\right\} , \end{aligned}$$
(29)
$$\begin{aligned} W_2&:= \left\{ (j,k){:}\, j \in V, \quad k \in V,\quad j \ne k, \quad Q_{jk} \ge \nu (Q)\right\} . \end{aligned}$$
(30)

If \(W_1 = \emptyset \), then it follows from (26), (30), and (28) that \(Q_{ij} \ge \nu (Q)\) for \(1 \le i \le j \le n\) and \(Q_{kk} = \nu (Q)\) for some \(k = 1,\ldots ,n\). Therefore, \(Q \in \mathcal {L}_0\) by Proposition 3, which implies that \(Q \in \mathcal {L}\), establishing the reverse inclusion.

We will henceforth assume that \(W_1 \ne \emptyset \). In this case, it follows from (28) that

$$\begin{aligned} \beta _{ijk} := \max \{Q_{ij},Q_{ik}\} > \nu (Q), \quad \hbox {for each } i \in U,~(j,k) \in W_1. \end{aligned}$$

We also define

$$\begin{aligned} \beta := \min _{i \in U;~(j,k) \in W_1} \beta _{ijk} > \nu (Q), \quad \hbox {and} \quad \rho := \beta - \nu (Q) > 0. \end{aligned}$$
(31)

Next, we will establish a lower bound on \(l_r(Q)\) for each \(r \in \mathbb {N}\). Let us fix \(r \in \mathbb {N}\). Recall that \(l_r(Q)\) is given by

$$\begin{aligned} l_r(Q) = \frac{1}{(r+1)(r + 2)} \min _{z \in (r+2) {\varLambda }_n^r}(z^T Q z - z^T \hbox {diag}(Q)), \quad r = 0,1,\ldots . \end{aligned}$$

We can rewrite the expression in the parentheses on the right-hand side as

$$\begin{aligned} z^T Q z - z^T \hbox {diag}(Q)&= \sum _{i \in U} \sum _{j \in U} Q_{ij} z_i z_j - \sum _{i \in U} Q_{ii} z_i + 2 \sum _{i \in U} \sum _{k \in V} Q_{ik} z_i z_k \nonumber \\&+ \sum _{j \in V} \sum _{k \in V} Q_{jk} z_j z_k - \sum _{j \in V} Q_{jj} z_j. \end{aligned}$$
(32)

Next, we will derive lower bounds on the terms on the right-hand side of (32). For a given \(z \in (r+2) {\varLambda }_n^r\), let us define

$$\begin{aligned} \sum _{j \in V} z_j = \eta , \end{aligned}$$

so that \(\sum _{i \in U} z_i = r + 2 - \eta \).

By (26),

$$\begin{aligned} \sum _{i \in U} \sum _{j \in U} Q_{ij} z_i z_j - \sum _{i \in U} Q_{ii} z_i&= \sum _{i \in U} Q_{ii} z_i (z_i - 1) + \sum _{i \in U} \sum _{j \in U \backslash \{i\}} Q_{ij} z_i z_j, \nonumber \\&\ge \nu (Q) \left( \left( \sum _{i \in U} z_i \right) ^2 - \sum _{i \in U} z_i \right) , \nonumber \\&= (r + 2 - \eta ) (r + 1 - \eta ) \nu (Q), \end{aligned}$$
(33)

where we used \(z \in \mathbb {N}^n\) to derive the inequality.

Similarly,

$$\begin{aligned} \sum _{j \in V} \sum _{k \in V} Q_{jk} z_j z_k - \sum _{j \in V} Q_{jj} z_j \!=\! \sum _{j \in V} Q_{jj} z_j (z_j - 1) + \sum _{(j,k) \in W_1} Q_{jk} z_j z_k + \sum _{(j,k) \in W_2} Q_{jk} z_j z_k. \end{aligned}$$

By (29) and (30), we obtain the following lower bounds.

$$\begin{aligned} \sum _{j \in V} \sum _{k \in V} Q_{jk} z_j z_k - \sum _{j \in V} Q_{jj} z_j \ge \left\{ \begin{array}{l@{\quad }l} \nu (Q) \eta (\eta - 1), &{} \hbox {if}~z_j z_k = 0 \\ &{} \hbox {for all}~(j,k) \in W_1, \\ \eta (\eta - 1) l_{\eta - 2} (Q_{VV}), &{} \hbox {otherwise}, \end{array} \right. \end{aligned}$$
(34)

where the second part follows from the definition of \(l_r(Q_{VV})\) and the fact that \(z_V \in (r+2) {\varLambda }_{|V|}^{\eta - 2}\). Note that, in the second case, \(z_V\) should have at least two positive components, which implies that \(\eta \ge 2\).

Finally,

$$\begin{aligned} \sum _{i \in U} \sum _{k \in V} Q_{ik} z_i z_k = \sum _{i \in U} z_i \left( \sum _{k \in V} Q_{ik} z_k \right) . \end{aligned}$$

Note that \(Q_{ik} \ge \nu (Q)\) for each \(i \in U\) and \(k \in V\) by (26). Furthermore, if there exists \((j^\prime ,k^\prime ) \in W_1\) such that \(z_{j^\prime } z_{k^\prime } > 0\), then

$$\begin{aligned} \sum _{k \in V} Q_{ik} z_k = \sum _{k \in V \backslash \{ j^\prime ,k^\prime \}} Q_{ik} z_k + Q_{i,j^\prime } z_{j^\prime } + Q_{i,k^\prime } z_{k^\prime } \ge \nu (Q) \eta + \rho , \quad \hbox {for each}~i \in U, \end{aligned}$$

since \(\max \{Q_{i,j^\prime },Q_{i,k^\prime }\} \ge \nu (Q) + \rho \) by (31) for each \(i \in U\). Therefore,

$$\begin{aligned} \sum _{i \in U} \sum _{k \in V} Q_{ik} z_i z_k \ge \left\{ \begin{array}{l@{\quad }l} (r + 2 - \eta ) \left( \nu (Q) \eta + \rho \right) , &{} \hbox {if}~z_j z_k > 0 \\ &{} \hbox {for some}~(j,k) \in W_1, \\ \nu (Q) (r + 2 - \eta ) \eta , &{} \hbox {otherwise}, \end{array} \right. \end{aligned}$$
(35)

Using these lower bounds, we consider the following five cases:

Case 1: If \(\sum _{j \in V} z_j = \eta = 0\), then \(z_j = 0\) for each \(j \in V\). By (32) and (33),

$$\begin{aligned} z^T Q z - z^T \hbox {diag}(Q) \ge (r + 2)(r + 1) \nu (Q). \end{aligned}$$

Case 2: If \(\sum _{j \in V} z_j = \eta = 1\), then it follows from (32), (33), (34), and (35) that

$$\begin{aligned} z^T Q z - z^T \hbox {diag}(Q) \ge (r + 2)(r + 1) \nu (Q). \end{aligned}$$

Case 3: If \(\sum _{j \in V} z_j = \eta = r + 2\), then \(z_i = 0\) for each \(i \in U\). By (32) and (34),

$$\begin{aligned} z^T Q z - z^T \hbox {diag}(Q) \ge (r + 2)(r + 1) \min \{\nu (Q),l_r(Q_{VV})\}. \end{aligned}$$

Case 4a: If \(2 \le \sum _{j \in V} z_j = \eta \le r + 1\) and \(z_j z_k = 0\) for each \((j,k) \in W_1\), then it follows from (32), (33), (34), and (35) that

$$\begin{aligned} z^T Q z - z^T \hbox {diag}(Q) \ge (r+2)(r+1)\nu (Q). \end{aligned}$$

Case 4b: If \(2 \le \sum _{j \in V} z_j = \eta \le r + 1\) and there exists \((j,k) \in W_1\) such that \(z_j z_k > 0\), then, by (32), (33), (34), and (35),

$$\begin{aligned}&\displaystyle z^T Q z \!-\! z^T \hbox {diag}(Q) \ge (r\!+\!2\!-\!\eta )(r+1+\eta )\nu (Q) \!+\! \eta (\eta \!-\! 1) l_{\eta - 2}(Q_{VV}) + 2 \rho (r + 2 - \eta ). \end{aligned}$$

It follows from the five cases above that

$$\begin{aligned} l_r(Q) \ge \min \left\{ \nu (Q),~l_r(Q_{VV}), \min _{\eta \in \{2,\ldots ,r+1\}} h(Q,\eta ,r)\right\} , \end{aligned}$$
(36)

where

$$\begin{aligned} h(Q,\eta ,r) := (1 - \lambda _{\eta ,r}) \nu (Q) + \lambda _{\eta ,r} l_{\eta - 2}(Q_{VV}) + \frac{2 \rho (r + 2 - \eta )}{(r+1)(r+2)}, \end{aligned}$$
(37)

and

$$\begin{aligned} \lambda _{\eta ,r} := \frac{\eta (\eta - 1)}{(r+1)(r+2)}. \end{aligned}$$

We next establish that the second and the third terms in (36) are at least as large as \(\nu (Q)\) for all sufficiently large values of \(r\).

For any \({\hat{Q}} \in \mathcal {S}_1\) with \(V \ne \emptyset \), we have

$$\begin{aligned} \nu ({\hat{Q}}_{VV}) = \min \{x^T {\hat{Q}} x{:}\, e^T x = 1, ~x \ge 0, ~x_i = 0, ~i \in U\} \ge \nu ({\hat{Q}}). \end{aligned}$$

By (28), we therefore have \(\nu (Q_{VV}) > \nu (Q)\) for any \(Q \in \mathcal {S}_1 \backslash \mathcal {S}_1^{\prime }\). Since \(\lim _{r \rightarrow \infty } l_r(Q_{VV}) = \nu (Q_{VV})\), there exists \({\hat{r}} \in \mathbb {N}\) such that

$$\begin{aligned} l_r(Q_{VV}) \ge \nu (Q), \quad \hbox {for all}~r > {\hat{r}}. \end{aligned}$$
(38)

Therefore, the second term in (36) is at least as large as \(\nu (Q)\) for all \(r > {\hat{r}}\). Let us next focus on the third term in (36) for \(r > {\hat{r}}\). Note that it suffices to consider only \(\eta \in \{2,\ldots ,{\hat{r}}+2\}\) for the range of the minimum since \(h(Q,\eta ,r) > \nu (Q)\) for all \(\eta > {\hat{r}} + 2\) by (38).

Let us now fix \(\eta \in \{2,\ldots ,{\hat{r}}+2\}\) and consider the last term in (36) as a function of \(r\). We claim that there exists \(r_\eta \in \mathbb {N}\) such that

$$\begin{aligned} h(Q,\eta ,r) \ge \nu (Q), \quad \hbox {for all}~r \ge r_\eta , \end{aligned}$$
(39)

where \(h\) is defined as in (37). Indeed, \(h(Q,\eta ,r)\) can be rewritten as

$$\begin{aligned} \nu (Q) + \frac{2 \rho (r + 2 - \eta ) - \eta (\eta - 1) \left( \nu (Q) - l_{\eta - 2}(Q_{VV})\right) }{(r+1)(r+2)}. \end{aligned}$$

Therefore, there exists \(r_\eta \in \mathbb {N}\) such that the second term is nonnegative for all \(r \ge r_\eta \), which establishes (39).

It follows from (38), (39), and (36) that

$$\begin{aligned} l_r(Q) = \nu (Q), \quad \hbox {for all}~r \ge {r}^*, \end{aligned}$$

where

$$\begin{aligned} {r}^* := \max \left\{ {\hat{r}} + 1,~\max _{\eta \in \{2,\ldots ,{\hat{r}}+2\}} r_\eta \right\} < \infty . \end{aligned}$$

Therefore, \(Q \in \mathcal {L}\), which implies that \(\mathcal {S}_1 \backslash \mathcal {S}_1^\prime \subseteq \mathcal {L}\). The proof is complete. \(\square \)

For illustrative purposes, we present the following example.

Example 5

Let

$$\begin{aligned} Q(\rho ) := \left[ \begin{array}{c@{\quad }c@{\quad }c} 1 &{} 1 + \rho &{} 1 + \rho \\ 1 + \rho &{} 3 &{} 0 \\ 1 + \rho &{} 0 &{} 3 \end{array} \right] . \end{aligned}$$

For any \(\rho \ge 0\), \(\nu (Q(\rho )) = 1\) and \(\varOmega (Q(\rho )) = e_1\), which implies that \(Q(\rho ) \in \mathcal {S}_1\). We have \(U = \{1\}\) and \(V = \{2,3\}\). Note that \(Q(0) \in \mathcal {S}_1^\prime \) by Example 4. On the other hand, for any \(\rho > 0\), \(Q(\rho ) \in \mathcal {S}_1 \backslash \mathcal {S}_1^{\prime }\), which implies that \(Q \in \mathcal {L}\) by Theorem 3. Our computational experiments reveal that \(Q(1) \in \mathcal {L}_5\) and \(Q(0.1) \in \mathcal {L}_{48}\).

Example 5 illustrates that \(\mathcal {L}\), in general, is not a closed set. Our next result gives a description of the closure of \(\mathcal {L}\).

Proposition 8

We have

$$\begin{aligned} \hbox {cl}(\mathcal {L}) = \mathcal {S}_1 = \{Q \in \mathcal {S}{:}\, \varOmega (Q) \cap \{e_1,\ldots ,e_n\} \ne \emptyset \}. \end{aligned}$$

Proof

By Corollary 3, \(\mathcal {L}\subseteq \mathcal {S}_1\). Note that \(\mathcal {S}_1\) is a closed set by Proposition 4. Therefore, \(\hbox {cl}(\mathcal {L}) \subseteq \mathcal {S}_1\).

Conversely, let \(Q \in \mathcal {S}_1\). Therefore, there exists some \(i \in \{1,\ldots ,n\}\) such that \(e_i \in \varOmega (Q)\). Let us define the following sequence:

$$\begin{aligned} Q_k := Q + \frac{1}{k}\left( e e^T - e_i (e_i)^T \right) , \quad k = 1,2,\ldots \end{aligned}$$

Clearly, \(\lim _{k \rightarrow \infty } Q_k = Q\). We will show that \(Q_k \in \mathcal {L}\) for each \(k = 1,2,\ldots \) Let us fix \(k\). Since \(Q_k - Q \in \mathcal {N}\), \(\nu (Q_k) \ge \nu (Q)\) by Lemma 1(iii). Furthermore, \((e_i)^T Q_k e_i = \nu (Q) \ge \nu (Q_k)\), which implies that \(\nu (Q_k) = \nu (Q)\) and \(e_i \in \varOmega (Q_k)\). Therefore, \(Q_k \in \mathcal {S}_1\) for each \(k = 1, 2, \ldots \) Since \(e_i \in \varOmega (Q)\), we have \(Q_{ij} \ge \nu (Q)\) and \(Q_{jj} \ge \nu (Q)\) for each \(j = 1,\ldots ,n\). By the definition of \(Q_k\), \(U = \{i\}\), \(V = \{1,\ldots ,n\} \backslash \{i\}\). We have

$$\begin{aligned} \nu ((Q_k)_{VV}) = \min _{x \in {\varDelta }_n{:}\, x_i = 0} \left\{ x^T Q x + \frac{1}{k} \right\} \ge \nu (Q) + \frac{1}{k} > \nu (Q) = \nu (Q_k), \end{aligned}$$

which implies that the condition of Proposition 6 cannot be satisfied. Since \((Q_k)_{ij} > \nu (Q)\) for each \(j \in V\), the condition of Proposition 7 cannot be satisfied either. Therefore, \(Q_k \not \in \mathcal {S}_1^\prime \), i.e., \(Q_k \in \mathcal {S}_1 \backslash \mathcal {S}_1^{\prime }\). By Theorem 3, \(Q_k \in \mathcal {L}\) for each \(k = 1,2,\ldots \), which implies that \(Q \in \hbox {cl}(\mathcal {L})\). \(\square \)

By Proposition 8, \(Q(0) \in \hbox {cl}(\mathcal {L}) \backslash \mathcal {L}\) in Example 5. Note that \(\hbox {cl}(\mathcal {L}) \subset \mathcal {S}\) in general. For instance, \(I \not \in \hbox {cl}(\mathcal {L})\) since \(I \not \in \mathcal {S}_1\) for any \(n \ge 2\).

5.4 Description of \(\mathcal {L}_r\)

In this section, we give complete algebraic descriptions of the sets \(\mathcal {L}_r\). In particular, our next result generalizes Proposition 3.

Theorem 4

For any \(r = 0,1,\ldots \), we have

$$\begin{aligned} \mathcal {L}_r := \bigcup _{k = 1}^n \mathcal {L}^k_r, \end{aligned}$$
(40)

where

$$\begin{aligned} \mathcal {L}^k_r := \left\{ Q \in \mathcal {S}{:}\, Q_{kk} \le \frac{1}{(r+1)(r+2)} \left( z^T Q z - z^T \hbox {diag}(Q)\right) , \quad \hbox {for all}~z \in (r+2){\varLambda }_n^r \right\} , \end{aligned}$$

for \(k = 1,\ldots ,n\).

Proof

Let us fix \(r \in \mathbb {N}\) and let \(Q \in \mathcal {L}_r\). By Corollary 3 and (16), \(\mathcal {L}_r \subseteq \mathcal {S}_1\), which implies that

$$\begin{aligned} \nu (Q) = \min _{k = 1,\ldots ,n} Q_{kk} = l_r(Q) = \frac{1}{(r+1)(r + 2)}\min _{z \in (r+2){\varLambda }_n^r}(z^T Q z - z^T \hbox {diag}(Q)). \end{aligned}$$

Therefore, \(Q \in \mathcal {L}_r\) if and only if \(Q \in \mathcal {L}^k_r\) for some \(k \in \{1,\ldots ,n\}\). \(\square \)

Recall that \(\mathcal {L}^k_r\) is defined by \(O(n^{r+2})\) inequalities for each \(k = 1,\ldots ,n\). Therefore, it follows from Theorem 4 that, for any fixed \(r \in \mathbb {N}\), one can check in polynomial time if \(Q \in \mathcal {L}_r\). On the other hand, Theorem 3 does not yield an algorithmic procedure for checking if \(Q \in \mathcal {L}\).

Our next result establishes some geometric properties of the sets \(\mathcal {L}_r\).

Proposition 9

For any \(r = 0,1,\ldots \), the set \(\mathcal {L}_r\) is given by the union of \(n\) polyhedral cones.

Proof

Since \({\varLambda }_n^r\) is a finite set, \(\mathcal {L}^k_r\) is a polyhedral cone for each \(r \in \mathbb {N}\) and \(k = 1,\ldots ,n\). The assertion directly follows from (40). \(\square \)

We next establish an interesting connection between the behavior of lower bounds and the stability number of a certain associated graph. Given \(M \in \mathcal {N}\), we define the sparsity graph \(G_M\) associated with \(M\) as follows. There are \(n\) vertices labeled \(1,\ldots ,n\) and vertex \(i\) and vertex \(j\) are connected by an edge if \(M_{ij} > 0, ~1 \le i < j \le n\). The next result establishes a connection between the stability number of the sparsity graph of the matrix \(Q^s\) and the behavior of lower bounds \(l_r(Q)\).

Proposition 10

Let \(Q \in \mathcal {S}\backslash \mathcal {L}_0\) and let \(G = G_{Q^s}\) denote the sparsity graph of \(Q^s\) with stability number \(\alpha (G)\). Then, \(l_r(Q) = \min _{1 \le i \le j \le n} Q_{ij} < \nu (Q)\) for each \(r = 0,1,\ldots ,\alpha (G)-2\) and \(l_r(Q) > \min _{1 \le i \le j \le n} Q_{ij}\) for each \(r \ge \alpha (G) - 1\). Therefore,

$$\begin{aligned} Q \not \in \mathcal {L}_r, \quad \hbox {for each}~r = 0,1,\ldots ,\alpha (G)-2. \end{aligned}$$
(41)

Proof

Given \(Q \in \mathcal {S}\backslash \mathcal {L}_0\), let \(G = G_{Q^s}\) denote the sparsity graph of \(Q^s\) and let \(A_G \in \mathcal {S}\) denote the adjacency matrix of \(G\). First, we claim that there exists some \(1 \le i < j \le n\) such that \(Q^s_{ij} = 0\). By Proposition 3, \(Q \in \mathcal {L}_0\) if and only if \(Q^s \in \mathcal {N}\) given by (17) has a diagonal entry equal to zero. Therefore, for each \(Q \not \in \mathcal {L}_0\), \(\min _{k=1,\ldots ,n}Q^s_{kk} > 0\) and \(Q^s_{ij} = 0\) for some \(1 \le i < j \le n\). It follows that \(\{i,j\}\) is a stable set in \(G\) by definition of the sparsity graph. Therefore, \(\alpha (G) \ge 2\).

We now define the following two matrices.

$$\begin{aligned} Q_* := \left( \min _{1 \le i \le j \le n{:}\, Q^s_{ij} > 0} Q^s_{ij}\right) \left( I + A_G \right) , \quad Q^* := \left( \max _{1 \le i \le j \le n} Q^s_{ij} \right) \left( I + A_G \right) . \end{aligned}$$

Clearly, we have \(Q_* \in \mathcal {N}\), \(Q^* \in \mathcal {N}\), \(Q^s - Q_* \in \mathcal {N}\), and \(Q^* - Q^s \in \mathcal {N}\). By Lemma 1(iii),

$$\begin{aligned} l_r(Q_*) \le l_r(Q^s) \le l_r(Q^*), \quad \hbox {for each}~r = 0,1,\ldots . \end{aligned}$$

Note that each of \(Q^*\) and \(Q_*\) is a positive multiple of \(I + A_G\). By Lemma 1(iv), we have

$$\begin{aligned} \left( \min _{1 \le i \le j \le n{:}\, Q^s_{ij} > 0} Q_{ij}\right) l_r(I + A_G) \le l_r(Q^s) \le \left( \max _{1 \le i \le j \le n} Q^s_{ij}\right) l_r(I + A_G). \end{aligned}$$

The assertion now follows directly from (17), Lemma 1(v), and (9). The relation (41) follows immediately. \(\square \)

Example 6

Consider an instance of (StQP) in which

$$\begin{aligned} Q = Q^s = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 2 &{} 2 &{} \ldots &{} 2 \\ 2 &{} n &{} 0 &{} \ldots &{} 0 \\ 2 &{} 0 &{} n &{} \ldots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 2 &{} 0 &{} 0 &{} \ldots &{} n \end{array} \right] \end{aligned}$$

is an \(n \times n\) arrowhead matrix, where \(n \ge 3\). It is easy to verify that \(\nu (Q) = 1\) and \(\varOmega (Q) = \{e_1\}\). Therefore, \(U = \{1\}\) and \(V = \{2,\ldots ,n\}\). By (28) and Theorem 3, \(Q \in \mathcal {L}\). We have \(\alpha (G) = n - 1\), where \(G= G_{Q}\) denotes the sparsity graph of \(Q\). By Proposition 10, we have \(l_r(Q) = \min _{1 \le i \le j \le n} Q_{ij} = 0\) for each \(r = 0,1,\ldots ,n-3\) and \(l_r(Q) > 0\) for each \(r \ge n - 2\). It follows that \(Q \not \in \mathcal {L}_r\) for each \(r = 0,1,\ldots ,n-3\).

5.5 Relations among different sets

In this section, we summarize the relations among all the important sets defined in the previous sections.

Proposition 11

The following relations are satisfied:

$$\begin{aligned} \mathcal {L}_0 \subseteq \mathcal {L}_1 \subseteq \cdots \subseteq \mathcal {L}\subseteq \hbox {cl}(\mathcal {L}) = \mathcal {S}_1 \subseteq \mathcal {U}_0 \subseteq \mathcal {U}_1 \subseteq \cdots \subseteq \mathcal {U}\subseteq \hbox {cl}(\mathcal {U}) = \mathcal {S}, \end{aligned}$$

where \(\mathcal {L}_r\), \(\mathcal {L}\), \(\mathcal {S}_1\), \(\mathcal {U}_r\), and \(\mathcal {U}\) are defined as in (15), (16), (20), (11), and (14), respectively.

Proof

By Corollary 3, Proposition 4, and Proposition 1, we obtain \(\mathcal {L}\subseteq \mathcal {S}_1 \subseteq \mathcal {U}_0\) since \(\{e_1,\ldots ,e_n\} \subseteq {\varDelta }_n^0\). The last equality follows from Corollary 1. The remaining inclusions follow from the definitions (15), (16), (11), and (14). \(\square \)

It is worth noticing the significant difference between the sets \(\mathcal {L}\) and \(\mathcal {U}\). For any \(n \ge 2\), \(\hbox {cl}(\mathcal {L})\) is strictly contained in the set \(\mathcal {U}_0\), which is the smallest set in the sequence of the sets \(\mathcal {U}_r\). On the other hand, recall that \(\hbox {cl}(\mathcal {U}) = \mathcal {S}\) by Corollary 1. Therefore, for the particular polyhedral approximation hierarchies considered in this paper, it follows that upper and lower bounds exhibit quite different behaviors in terms of finite convergence in the context of standard quadratic optimization. In particular, our results reveal that inner polyhedral approximations, which give rise to upper bounds, are considerably stronger than outer polyhedral approximations in terms of finite convergence in the context of standard quadratic optimization.

We close this section by briefly commenting on the instances of (StQP) for which the upper and lower bounds coincide at a finite level of the hierarchy. From a computational point of view, this class of instances is especially important since equality of upper and lower bounds yields a certificate of optimality. We therefore define the following sets:

$$\begin{aligned} \mathcal {E}_r := \left\{ Q \in \mathcal {S}{:}\, l_r(Q) = u_r(Q) \right\} = \mathcal {L}_r \cap \mathcal {U}_r = \mathcal {L}_r, \quad r = 0, 1, \ldots , \end{aligned}$$

where the last equality follows from Proposition 11. Therefore, the algebraic description of such instances are precisely given by Theorem 4.

6 Concluding remarks

In this paper, we investigated the sequences of copositive optimization based upper and lower bounds on the optimal value of a standard quadratic optimization problem. We gave complete algebraic descriptions of the sets of instances for which the upper and/or the lower bound is exact at a finite level of the hierarchy. We identified the structural properties of the sets of instances for which the upper and/or the lower bound converges to the optimal value only in the limit. We discussed several geometric and topological properties of these sets.

For the particular polyhedral approximation hierarchies considered in this paper, an important consequence of our analysis is that the upper bounds seem to be more well-behaved in comparison with the lower bounds. Note that the extreme rays of inner polyhedral approximations \(\mathcal {I}_r\), which give rise to upper bounds, are given by \(d d^T\), where \(d \in {\varDelta }_n^r\) and the extreme rays of the cone \(\mathcal {C}\) of completely positive matrices are given by rank one matrices \(x x^T\), where \(x \in \mathbb {R}^n_+ \backslash \{0\}\) (see, e.g., [1]). It follows that the set of extreme rays of \(\mathcal {I}_r\) is a subset of the set of extreme rays of \(\mathcal {C}\). On the other hand, the outer polyhedral approximations \(\mathcal {O}_r\) are generated by the matrices \((r+2) d d^T - \hbox {Diag}(d)\), where \(d \in {\varLambda }_n^r\). For \(d = e_i \in {\varLambda }_n^r\), the corresponding matrix is a multiple of \(e_i (e_i)^T\), \(i = 1,\ldots ,n\), which is also an extreme ray of \(\mathcal {C}\). However, for each \(d \in {\varLambda }_n^r \backslash \{e_i: i = 1,\ldots ,n\}\), it is easy to construct a \(w \in \mathbb {R}^n \backslash \{0\}\) such that \(w^T d = 0\) and \({\langle }(r+2) d d^T - \hbox {Diag}(d), w w^T {\rangle }< 0\), which implies that \((r+2) d d^T - \hbox {Diag}(d) \not \in \mathcal {C}\). Together with our analysis, this observation suggests that only a few faces of \(\mathcal {O}_r\) in fact coincide with those of \(\mathcal {C}\) whereas most of the faces of \(\mathcal {O}_r\) do not support the cone \(\mathcal {C}\). Our results, combined with the recent progress on the facial structure of the cone \(\mathcal {C}\) (see, e.g., [9, 11]), may serve as a basis for the construction of tighter polyhedral outer approximations.