1 Introduction

Quadratic complementarity problems (QCPs for short) are natural extensions of linear complementarity problems (LCPs) [1] and important examples of nonlinear complementarity problems (NCPs) [2], serving as an important bridge between the two. More importantly, QCPs are useful mathematical models for a class of three-person non-cooperative games [3, 4], the generalized Markowitz portfolio problems [5], etc.

QCPs form a special class of NCPs in which the nonlinear functions are quadratic polynomial mappings. A quadratic polynomial mapping can be concisely represented by a third-order tensor together with a matrix and a vector, just as a linear mapping is completely determined by a matrix and a vector. Higher-order tensors have been attracting increasing attention. Powerful tools parallel to those for matrices, such as spectral theory and tensor decompositions, have been developed for treating tensors [6], avoiding the traditional analysis of polynomials via a vast number of monomials. The tensor formulation and these fruitful analysis tools motivate the recent trend of studying tensor complementarity problems (TCPs) (see [7,8,9,10,11,12,13] and references therein) and polynomial complementarity problems (PCPs) [5, 14, 15]. Both the difficulties and the variety of new discoveries in TCPs and PCPs lie in the investigation of tensors. Generally speaking, the difficulties are substantial when moving from matrices to third-order tensors, and become much gentler when moving further to higher-order tensors. Thus, the study of QCPs is necessary and meaningful, and can even broaden the known knowledge boundary of NCPs [5]. This is one reason for focusing on QCPs in this paper.

This study is a follow-up of the authors' work [5], in which basic properties of the solution set of a QCP, such as existence, compactness and uniqueness, are studied. As is well known [2], another fundamental building block for the theory of the solution set of a mathematical optimization problem is error bound analysis. It is the foundation both for perturbation analysis of the solution set and for algorithmic design together with convergence analysis [2].

In the literature, several kinds of error bounds were established for LCPs and NCPs [1, 2]. It is well known that the solution set of an LCP is a polyhedral set, and by the seminal work of Hoffman [18], a polyhedral set admits a (global) Lipschitz error bound [2, 16, 17]. The case becomes subtle for NCPs. Usually, structures of the underlying problems must be taken into consideration to achieve certain types of error bounds, e.g., Lipschitzian or Hölderian; see [2, 19] and references therein. In particular, when the defining functions of an NCP are polynomials, error bounds can be derived from error bound analysis for polynomial systems, which has been an active research frontier [19,20,21,22,23,24,25,26,27,28].

In general, a global or even a local Lipschitz error bound can hardly be established for a polynomial system. Alternatively, people have been trying to establish various types of local or global Hölderian-type error bounds. The most important ingredient in a Hölderian-type error bound is the exponent over the residual measure. The existence of the exponents in these Hölderian-type error bounds can usually be guaranteed by the classical Łojasiewicz inequality for semialgebraic functions [24], while the explicit values of these exponents are unknown [2, 25, 29]. In the literature, quadratic systemsFootnote 1 were studied extensively by Luo and Sturm [28], in which a \(\frac{1}{2}\) Hölderian-type error bound was given under suitable conditions.

Very recently, by a careful study of the curve expansion for polynomials, an explicit bound for Łojasiewicz’s exponent was given by D’Acunto and Kurdyka [21]; in turn, Hölderian-type error bounds with explicit exponents have been derived for a given system of polynomial inequalities and equalities in terms of the degrees of these polynomials, the number of variables, and the numbers of equalities and inequalities (together, the parameters of the system). These exponents are usually fractional numbers whose denominators are exponential functions of the parameters of the given systems; thus, they are called fractional error bounds. Along this line, one advance is the work by Li, Mordukhovich and Pham [26] (Lemmas 2.1 and 2.2). If we apply their results to QCPs, then we get a fractional error bound with exponents of order \(\frac{1}{9^{3n}}\) (Proposition 4.1).

The rest of this article is organized as follows: Preliminaries on error bounds are given in Sect. 2, including the main error bounds in [26]. An error bound via the natural residual function with an explicit fractional exponent is established in Sect. 3. To the best of the authors' knowledge, this is the first error bound for QCPs (and even TCPs) via the natural residual function with an explicit fractional exponent. Error bounds via polynomial reduction are established in Sect. 4. Section 4.1 is a direct application of the known results to QCPs, setting the baseline error bounds of QCPs given by the state-of-the-art error bound theory of polynomial systems. A general result for polynomial systems is given first in Sect. 4.2; it is of independent interest and has potential applications elsewhere. With the general result, a standard polynomial reduction and a reduction based on the sparsity of the underlying third-order tensor in the QCP are given in Sect. 4.3. Section 4.4 uses tensor decomposition, exploring sparsity from a geometric perspective. We will see that the error bounds established by polynomial reduction (with exponents of order \(\frac{1}{6^{3n}}\)) significantly improve the baseline results in Sect. 4.1. Some final remarks are given in the last section.

2 Preliminaries

Consider a QCP [5], in which \({\mathcal {A}}\in {\mathbb {R}}^{n\times n\times n}\) is a given third-order tensor, \(B\in {\mathbb {R}}^{n\times n}\) a given matrix, and \({\mathbf {c}}\in {\mathbb {R}}^n\) a given vector. The solution set of this QCP, also denoted by \(\textsc {sol}({\mathcal {A}},B,{\mathbf {c}})\), is

$$\begin{aligned} S:=\{{{\mathbf {x}}} : {{\mathbf {x}}}\,\ge \,{{\mathbf {0}}},\ {{\mathcal {A}}}{{\mathbf {x}}}^2+B{{\mathbf {x}}}+{\mathbf {c}}\ge {{\mathbf {0}}}, \ {{\mathbf {x}}}^{\mathsf {T}}({{\mathcal {A}}}{{\mathbf {x}}}^2+B{{\mathbf {x}}}+{\mathbf {c}})=0\}, \end{aligned}$$
(1)

where \({\mathcal {A}}{\mathbf {x}}^2\) is an n-vector with its i-th entry being \(\sum _{j,k=1}^na_{ijk}x_jx_k\) for all \(i\in \{1,\dots ,n\}\). Let \(\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}})\) denote the projection of \({\mathbf {x}}\) onto \({{\mathbb {R}}}_+^n\), the nonnegative orthant in \({{\mathbb {R}}}^n\). Then, with \(F({\mathbf {x}}):={\mathcal {A}}{\mathbf {x}}^2+B{\mathbf {x}}+{\mathbf {c}}\), the condition \({\mathbf {x}}\in S\) is equivalent to (cf. [2])

$$\begin{aligned} {\mathbf {x}}\in \big \{{\mathbf {x}}\in {{\mathbb {R}}}^n: {\mathbf {x}}=\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\big \}, \end{aligned}$$

the set of fixed points of the mapping \({\mathbf {x}}\mapsto \varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\). The natural residual function of the QCP (1) is then defined as

$$\begin{aligned} r({\mathbf {x}}):=\Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert \ \text {for all }{\mathbf {x}}\in {{\mathbb {R}}}^n. \end{aligned}$$

It is well known and easily checked that \({\mathbf {x}}\in {\mathbb {R}}^n\) lies in S if and only if \(r({\mathbf {x}})=0\). Assuming S is nonempty, we are interested in whether there exist scalars \(\kappa >0\) and \(\beta >0\) such that the bound \({\text {dist}}({\mathbf {x}},S)\le \kappa r({\mathbf {x}})^{\beta }\) holds for all \({\mathbf {x}}\in {\mathbb {R}}^n\), or for all \({\mathbf {x}}\in K\) with some subset K of \({{\mathbb {R}}}^n\). Actually, the existence of such a bound follows easily from Łojasiewicz’s inequality for semialgebraic functions [20], since both the distance function \({\text {dist}}({\mathbf {x}},S)\) and the natural residual function are semialgebraic. However, powerful as it is, the classical result only guarantees the existence of the exponent \(\beta \), not an explicit value for it. It is often difficult to determine the corresponding exponent \(\beta \) in Łojasiewicz’s gradient inequality, and it is typically unknown [25]. Our first concern in this article is to give an explicit exponent \(\beta \) for the error bound via the natural residual function.
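To make the definitions concrete, the residual \(r\) takes only a few lines of NumPy; the tensor, matrix, and vector below are illustrative data of our own choosing (not from the text), picked so that the origin solves the QCP because \({\mathbf {c}}\ge {\mathbf {0}}\):

```python
import numpy as np

def F(A, B, c, x):
    # F(x) = A x^2 + B x + c, with (A x^2)_i = sum_{j,k} a_{ijk} x_j x_k
    return np.einsum('ijk,j,k->i', A, x, x) + B @ x + c

def natural_residual(A, B, c, x):
    # r(x) = || x - Pi_{R^n_+}(x - F(x)) ||; projection onto the nonnegative
    # orthant is a componentwise maximum with zero
    return np.linalg.norm(x - np.maximum(x - F(A, B, c, x), 0.0))

# illustrative data with c >= 0, so x* = 0 solves this QCP
A = np.zeros((2, 2, 2)); A[0, 0, 0] = 1.0
B = np.eye(2)
c = np.array([1.0, 2.0])
assert natural_residual(A, B, c, np.zeros(2)) == 0.0  # r vanishes exactly at a solution
assert natural_residual(A, B, c, np.ones(2)) > 0.0    # positive at a non-solution
```

The residual thus serves as a computable certificate: it is zero precisely on S and positive elsewhere.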

From another perspective, the solution sets for QCPs can be interpreted as the solution sets for systems of polynomial inequalities and equalities

$$\begin{aligned} g_{i}({\mathbf {x}})\le 0, i\in \{1,\dots ,r\},\ \text {and }h_{j}({\mathbf {x}})=0, j\in \{1,\dots ,s\}, \end{aligned}$$

where \(g_{i},h_{j}:{\mathbb {R}}^n\rightarrow {{\mathbb {R}}}\) for \(i\in \{1,\dots ,r\}\) and \(j\in \{1,\dots ,s\}\) are real-valued polynomial functions on \({\mathbb {R}}^n\). For \(\gamma \in {{\mathbb {R}}}\), define \([\gamma ]_+:=\max \{0,\gamma \}\), which extends component-wise to vectors in \({{\mathbb {R}}}^n\), and \([{\mathbf {x}}]_-:={\mathbf {x}}-[{\mathbf {x}}]_+\). It is easy to see that \([{\mathbf {x}}]_+=\varPi _{{{\mathbb {R}}}^n_+}({\mathbf {x}})\). One of the advances in the study of error bounds for polynomial systems is the following result by Li, Mordukhovich and Pham [26, Theorem 3.5].Footnote 2

Lemma 2.1

Let r and s be nonnegative integers, and

$$\begin{aligned} T:=\{{\mathbf {x}}\in {{\mathbb {R}}}^n: g_{i}({\mathbf {x}})\le 0, i\in \{1,\dots , r\},\ \text {and }h_j({\mathbf {x}})=0, j\in \{1,\dots ,s\}\}, \end{aligned}$$

where \(g_{i}\) and \(h_{j}\) are real polynomials on \({\mathbb {R}}^n\) with degree at most d. Let \({\bar{{\mathbf {x}}}}\in T\). Then, there are numbers \(c, \epsilon >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}},T)\le c\left( \sum _{i=1}^r[g_{i} ({\mathbf {x}})]_{+}+\sum _{j=1}^s|h_{j}({\mathbf {x}})|\right) ^{\frac{1}{R(n+r+s,d+1)}} \end{aligned}$$
(2)

for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-{\bar{{\mathbf {x}}}}\Vert \le \epsilon \), where the quantity R is defined as

$$\begin{aligned} R(n,d):=\left\{ \begin{array}{ll} 1 ,&{} {\mathrm{if}}\,d=1, \\ d(3d-3)^{n-1} ,&{} {\mathrm{if}}\,d\ge 2. \end{array} \right. \end{aligned}$$
(3)

If \(K\subset {{\mathbb {R}}}^n\) is compact, then there exists \({\bar{c}}>0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}},T)\le {\bar{c}}\left( \sum _{i=1}^r[g_{i}({\mathbf {x}})]_{+}+\sum _{j=1}^s|h_{j} ({\mathbf {x}})|\right) ^{\frac{1}{R(n+r+s,d+1)}}\ \text {for all }{\mathbf {x}}\in K. \end{aligned}$$

The exponent in Lemma 2.1 involves the numbers of equality and inequality constraints. Actually, we can combine all the equalities into one at the cost of doubling the degree [26, Theorem 3.6].

Lemma 2.2

Let T be as that in Lemma 2.1 and \({\bar{{\mathbf {x}}}}\in T\). Then, there are numbers \(c, \epsilon >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}},T)\le c\left( \sum _{i=1}^r[g_{i}({\mathbf {x}})]_{+}+\sum _{j=1}^s|h_{j}({\mathbf {x}})|\right) ^{\frac{2}{R(n+r,2d)}} \end{aligned}$$
(4)

for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-{\bar{{\mathbf {x}}}}\Vert \le \epsilon \), where the quantity R is defined as (3). If \(K\subset {{\mathbb {R}}}^n\) is compact, then there exists \({\bar{c}}>0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}},T)\le {\bar{c}}\left( \sum _{i=1}^r[g_{i}({\mathbf {x}})]_{+}+\sum _{j=1}^s|h_{j}({\mathbf {x}})|\right) ^{\frac{2}{R(n+r,2d)}}\ \text {for all }{\mathbf {x}}\in K. \end{aligned}$$
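For later reference, the quantity (3) and the two exponents in Lemmas 2.1 and 2.2 are easy to evaluate exactly; the helper names below are ours, not from [26]:

```python
from fractions import Fraction

def R(n, d):
    # the quantity (3): R(n, d) = 1 if d = 1, and d * (3d - 3)^(n - 1) if d >= 2
    return 1 if d == 1 else d * (3 * d - 3) ** (n - 1)

def exp_lemma21(n, r, s, d):
    # exponent 1 / R(n + r + s, d + 1) from (2)
    return Fraction(1, R(n + r + s, d + 1))

def exp_lemma22(n, r, s, d):
    # exponent 2 / R(n + r, 2d) from (4)
    return Fraction(2, R(n + r, 2 * d))
```

For instance, for a system in \(n=2\) variables with \(r=4\) inequalities and \(s=1\) equality, all of degree at most \(d=3\) (the shape of the defining system of the QCP (1) for \(n=2\)), `exp_lemma21` returns \(\frac{1}{4\cdot 9^{6}}\) and `exp_lemma22` returns \(\frac{1}{3\cdot 15^{5}}\), matching the two terms of the exponent (16) in Proposition 4.1 below.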

3 Error Bound via Natural Residual Function

In this section, we study an error bound via the natural residual function with an explicit exponent. Recall that \( F({\mathbf {x}}):={\mathcal {A}}{\mathbf {x}}^2+B{\mathbf {x}}+{\mathbf {c}}=:(f_1({\mathbf {x}}),\dots ,f_n({\mathbf {x}}))^{\mathsf {T}}\), and \( S=\big \{{\mathbf {x}}\in {{\mathbb {R}}}^n: {\mathbf {x}}=\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\big \}\). The projection \(\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\) can be equivalently characterized as the unique \({\mathbf {z}}\in {{\mathbb {R}}}^n\) such that

$$\begin{aligned}&\big ({\mathbf {z}}-{\mathbf {x}}+F({\mathbf {x}})\big )_i\ge 0\ \text {and }z_i=0\ \text {for all }i\in I({\mathbf {x}}),\end{aligned}$$
(5)
$$\begin{aligned}&z_i\ge 0 \ \text {and }\big ({\mathbf {z}}-{\mathbf {x}}+F({\mathbf {x}})\big )_i=0\ \text {for all }i\notin I({\mathbf {x}}), \end{aligned}$$
(6)

where

$$\begin{aligned} I({\mathbf {x}}):=\big \{i\in \{1,\dots ,n\}: x_i-f_i({\mathbf {x}})\le 0\big \}=\big \{i\in \{1,\dots ,n\}: \big [\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\big ]_i=0\big \}. \end{aligned}$$

An index subset \(I\subset \{1,\dots ,n\}\) is said to be active at a vector \({\mathbf {x}}\in {{\mathbb {R}}}^n\), if \(I\subseteq I({\mathbf {x}})\). If I is active at \({\mathbf {x}}\), then \(({\mathbf {x}},{\mathbf {z}})\) with \({\mathbf {z}}=\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\) satisfies the systems (5) and (6) with \(I({\mathbf {x}})\) replaced by I.

In the following, for some \(\epsilon >0\), we first discuss the boundedness of the set

$$\begin{aligned} V(\epsilon ):=\{{\mathbf {x}}\in {{\mathbb {R}}}^n: \Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert \le \epsilon \}. \end{aligned}$$
(7)

For a tensor \({\mathcal {A}}\in {{\mathbb {R}}}^{n\times n\times n}\), its i-th slice \([a_{i\cdot \cdot }]\) is an \(n\times n\) matrix, denoted as \(A_i\), for all \(i\in \{1,\dots ,n\}\). The next definition can be found in [11].

Definition 3.1

A tensor \({\mathcal {A}}\in {{\mathbb {R}}}^{n\times n\times n}\) is called an \(R_0\) tensor, if the system

$$\begin{aligned} \begin{array}{ll} {\mathbf {x}}^{\mathsf {T}}A_i{\mathbf {x}}\ge 0,&{}\text {if }x_i=0,\\ {\mathbf {x}}^{\mathsf {T}}A_i{\mathbf {x}}=0,&{}\text {if }x_i>0, \end{array} \end{aligned}$$

does not have a solution in \({{\mathbb {R}}}^n_+\setminus \{{\mathbf {0}}\}\).
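As a toy example of our own (not from the text), consider the diagonal tensor with \(a_{iii}=1\) and all other entries zero: its slices are \(A_i={\mathbf {e}}_i{\mathbf {e}}_i^{\mathsf {T}}\), so \({\mathbf {x}}^{\mathsf {T}}A_i{\mathbf {x}}=x_i^2\), and any \({\mathbf {x}}\ge {\mathbf {0}}\) with \(x_i>0\) would need \(x_i^2=0\); hence this tensor is \(R_0\). The sketch below verifies the slice identity \(({\mathcal {A}}{\mathbf {x}}^2)_i={\mathbf {x}}^{\mathsf {T}}A_i{\mathbf {x}}\) numerically and samples the \(R_0\) system (sampling can only falsify, never prove, the \(R_0\) property):

```python
import numpy as np

n = 3
# Diagonal tensor with a_iii = 1: its i-th slice A[i] equals e_i e_i^T.
A = np.zeros((n, n, n))
for i in range(n):
    A[i, i, i] = 1.0

rng = np.random.default_rng(1)
x = np.abs(rng.standard_normal(n))                 # a nonnegative test point
# (A x^2)_i computed two ways: via the slices and via the entrywise formula
q = np.array([x @ A[i] @ x for i in range(n)])
assert np.allclose(q, np.einsum('ijk,j,k->i', A, x, x))
assert np.allclose(q, x ** 2)                      # here x^T A_i x = x_i^2

# Sampling check of the system in Definition 3.1 (falsification only):
for _ in range(1000):
    y = np.abs(rng.standard_normal(n))
    satisfies = all((y[i] ** 2 == 0 if y[i] > 0 else True) for i in range(n))
    assert not satisfies                           # y_i > 0 forces y_i^2 > 0
```

No sampled nonnegative nonzero point satisfies the system, consistent with the analytic argument above.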

Proposition 3.1

If \({\mathcal {A}}\) is an \(R_0\) tensor, then the set \(V(\epsilon _0)\) is bounded for some \(\epsilon _0>0\).

Proof

Suppose, on the contrary, that \(V(\epsilon )\) is unbounded for every \(\epsilon >0\). Then, there is a sequence \(\{{\mathbf {x}}_k\}\) with \({\mathbf {x}}_k\in V(\frac{1}{k})\) for \(k=1,2,\dots \) such that \(\Vert {\mathbf {x}}_k\Vert \rightarrow \infty \) as \(k\rightarrow \infty \). Without loss of generality, we assume that \({\tilde{{\mathbf {x}}}}_k:=\frac{{\mathbf {x}}_k}{\Vert {\mathbf {x}}_k\Vert }\rightarrow {\tilde{{\mathbf {x}}}}\). As \(\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}_k-F({\mathbf {x}}_k))\ge {\mathbf {0}}\), it follows that

$$\begin{aligned} {\mathbf {0}}\ne {{\tilde{{\mathbf {x}}}}}\ge {\mathbf {0}}. \end{aligned}$$
(8)

Actually, if \({{\tilde{{\mathbf {x}}}}}_i<0\), then \(({\mathbf {x}}_k)_i\rightarrow -\infty \) as \(k\rightarrow \infty \). Thus, when k is sufficiently large,

$$\begin{aligned} \frac{1}{k}\ge \Vert {\mathbf {x}}_k-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}_k-F({\mathbf {x}}_k))\Vert \ge \big |({\mathbf {x}}_k)_i-[({\mathbf {x}}_k)_i-f_i({\mathbf {x}}_k)]_+\big |\ge \big |({\mathbf {x}}_k)_i\big |, \end{aligned}$$

which is a contradiction. We also have that for each \(i\in \{1,\dots ,n\}\), the sequence \(\{({\mathbf {x}}_k)_i\}_{k=1}^\infty \) is bounded from below, i.e., there exists a constant M such that

$$\begin{aligned} ({\mathbf {x}}_k)_i\ge M \end{aligned}$$
(9)

for all \(k=1,2,\dots \).

Next, we show that \(f_i({\mathbf {x}}_k)\) is bounded from below for each \(i\in \{1,\dots ,n\}\). Actually, if \(f_i({\mathbf {x}}_k)\rightarrow -\infty \) (taking a subsequence if necessary), then by (9) we have \(({\mathbf {x}}_k)_i-f_i({\mathbf {x}}_k)>0\) for all sufficiently large k, and hence

$$\begin{aligned} \big |({\mathbf {x}}_k)_i-[({\mathbf {x}}_k)_i-f_i({\mathbf {x}}_k)]_+\big |=\big |f_i({\mathbf {x}}_k)\big |\rightarrow \infty \ \text {as }k\rightarrow \infty , \end{aligned}$$

which is a contradiction. Therefore, there exists a constant K such that

$$\begin{aligned} F({\mathbf {x}}_k)\ge K{\mathbf {e}} \end{aligned}$$
(10)

for all \(k=1,2,\dots \), where \({\mathbf {e}}\) is the vector of all ones. Thus, we must have that

$$\begin{aligned} \lim _{k\rightarrow \infty }\frac{{\mathcal {A}}{\mathbf {x}}_k^2}{\Vert {\mathbf {x}}_k\Vert ^2}={\mathcal {A}}{\tilde{{\mathbf {x}}}}^2\ge {\mathbf {0}}. \end{aligned}$$
(11)

If \({{\tilde{{\mathbf {x}}}}}_i>0\), then we must have that \(({\mathbf {x}}_k)_i\rightarrow \infty \) as \(k\rightarrow \infty \). It follows from

$$\begin{aligned} \frac{\big |({\mathbf {x}}_k)_i-[({\mathbf {x}}_k)_i-f_i({\mathbf {x}}_k)]_+\big |}{\Vert {\mathbf {x}}_k\Vert }\le \frac{1}{k\Vert {\mathbf {x}}_k\Vert } \end{aligned}$$

and (10) that \(\frac{f_i({\mathbf {x}}_k)}{\Vert {\mathbf {x}}_k\Vert }\rightarrow 0\). Consequently, \({{\tilde{{\mathbf {x}}}}}^{\mathsf {T}}A_i{{\tilde{{\mathbf {x}}}}}=\lim _{k\rightarrow \infty }\frac{f_i({\mathbf {x}}_k)}{\Vert {\mathbf {x}}_k\Vert ^2}=0\). This, together with (8) and (11), implies that the system

$$\begin{aligned} \begin{array}{ll} {\mathbf {x}}^{\mathsf {T}}A_i{\mathbf {x}}\ge 0,&{}\text {if }x_i=0,\\ {\mathbf {x}}^{\mathsf {T}}A_i{\mathbf {x}}=0,&{}\text {if }x_i>0, \end{array} \end{aligned}$$

has a solution \({{\tilde{{\mathbf {x}}}}}\). This contradicts the hypothesis that \({\mathcal {A}}\) is an \(R_0\) tensor. Therefore, the set \(V(\epsilon _0)\) must be bounded for some \(\epsilon _0>0\). \(\square \)

Lemma 3.1

Let \({\mathcal {A}}\) be an \(R_0\) tensor and \(\textsc {sol}({\mathcal {A}},B,{\mathbf {c}})\ne \emptyset \). Then, there exists a scalar \(\epsilon >0\) such that for any \({\mathbf {x}}\in \{{\mathbf {x}}\in {{\mathbb {R}}}^n: \Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert \le \epsilon \}\), the index set \(I({\mathbf {x}})\) is active at some \({\mathbf {x}}^*\in \textsc {sol}({\mathcal {A}},B,{\mathbf {c}})\).

Proof

Let \(\epsilon _0>0\) be a scalar such that \(V(\epsilon _0)\) is bounded by Proposition 3.1. We show that some \(\epsilon \le \epsilon _0\) satisfies the conclusion. Suppose, on the contrary, that there exist an \(I\subseteq \{1,\dots ,n\}\) and a sequence \(\{{\mathbf {x}}^{(1)},{\mathbf {x}}^{(2)},\dots \}\) satisfying \(I({\mathbf {x}}^{(i)})=I\) for all \(i=1,2,\dots \) and \({\mathbf {x}}^{(i)}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}^{(i)}-F({\mathbf {x}}^{(i)}))\rightarrow {\mathbf {0}}\), such that I is active at no \({\mathbf {x}}^*\in \textsc {sol}({\mathcal {A}},B,{\mathbf {c}})\).

For each \(i=1,2,\dots \), denote \({\mathbf {z}}^{(i)}:=\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}^{(i)}-F({\mathbf {x}}^{(i)}))\). Since \({\mathbf {x}}^{(i)}\in V(\epsilon _0)\) for all sufficiently large i and \(V(\epsilon _0)\) is bounded, the sequence \(\{({\mathbf {x}}^{(i)},{\mathbf {z}}^{(i)})\}\) is bounded.

By continuity, every limit point \(({\mathbf {x}}^\circ ,{\mathbf {z}}^\circ )\) of this sequence satisfies

$$\begin{aligned}&\big ({\mathbf {z}}^\circ -{\mathbf {x}}^\circ +F({\mathbf {x}}^\circ )\big )_i\ge 0\ \text {and }z^\circ _i=0\ \text {for all }i\in I,\\&\quad \quad z^\circ _i\ge 0 \ \text {and }\big ({\mathbf {z}}^\circ -{\mathbf {x}}^\circ +F({\mathbf {x}}^\circ )\big )_i=0\ \text {for all }i\notin I,\ \text {and }{\mathbf {x}}^\circ -{\mathbf {z}}^\circ ={\mathbf {0}}. \end{aligned}$$

Consequently, \({\mathbf {x}}^\circ =\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}^\circ -F({\mathbf {x}}^\circ ))\), or equivalently \({\mathbf {x}}^\circ \in S\), and I is active at \({\mathbf {x}}^\circ \), which contradicts our hypothesis. \(\square \)

Theorem 3.1

Suppose that the solution set S is nonempty and \({\mathcal {A}}\) is an \(R_0\) tensor. Then, for any nonempty compact set \(K\subset {{\mathbb {R}}}^n\), there exists a scalar \(\kappa >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}}, S)\le \kappa \Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert ^{\max \{\frac{1}{3\cdot 6^{5n-1}},\frac{1}{2\cdot 9^{3n-1}}\}} \end{aligned}$$
(12)

for all \({\mathbf {x}}\in K\).Footnote 3

Proof

Let \(\epsilon \in ]0,1[\) be smaller than the scalar provided by Lemma 3.1. If we can show that there exists a scalar \(\kappa _1>0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}}, S)\le \kappa _1\Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert ^{\max \{\frac{1}{3\cdot 6^{5n-1}},\frac{1}{2\cdot 9^{3n-1}}\}} \end{aligned}$$
(13)

for all \({\mathbf {x}}\in V(\epsilon )=\{{\mathbf {x}}\in {{\mathbb {R}}}^n: \Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert \le \epsilon \}\), then (12) will follow. Actually, given (13) for \({\mathbf {x}}\in V(\epsilon )\), it remains to show that (12) holds for all \({\mathbf {x}}\in K\setminus V(\epsilon )\). If \(K\subseteq V(\epsilon )\), we are done. In the following, we assume that \(K\not \subseteq V(\epsilon )\). Define

$$\begin{aligned} M:=\sup _{{\mathbf {x}}\in K}{\text {dist}}({\mathbf {x}},S). \end{aligned}$$

As the function \({\text {dist}}(\cdot ,S)\) is continuous and K is compact, we have \(M<\infty \). Let \(\alpha :=\max \{\frac{1}{3\cdot 6^{5n-1}},\frac{1}{2\cdot 9^{3n-1}}\}\). Then, we have

$$\begin{aligned} {\text {dist}}({\mathbf {x}},S)\le M\le \frac{M}{\epsilon ^\alpha }\epsilon ^\alpha \le \frac{M}{\epsilon ^\alpha }\Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert ^\alpha \end{aligned}$$

for all \({\mathbf {x}}\in K\setminus V(\epsilon )\). Taking \(\kappa \) to be the maximum of \(\kappa _1\) and \(\frac{M}{\epsilon ^\alpha }\), the conclusion then follows. Therefore, in the following, we establish (13) for all \({\mathbf {x}}\in V(\epsilon )\).

For any given \({\mathbf {x}}\in \{{\mathbf {x}}\in {{\mathbb {R}}}^n: \Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert \le \epsilon \}\), we define \({\mathbf {z}}:=\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\). It follows from (5) and (6) that

$$\begin{aligned}&\big ({\mathbf {z}}-{\mathbf {x}}+F({\mathbf {x}})\big )_i\ge 0\ \text {and }z_i=0\ \text {for all }i\in I({\mathbf {x}}),\nonumber \\&\quad \quad \text {and }z_i\ge 0\ \text {and } \big ({\mathbf {z}}-{\mathbf {x}}+F({\mathbf {x}})\big )_i=0\ \text {for all }i\notin I({\mathbf {x}}). \end{aligned}$$
(14)

It follows from Lemma 3.1 that the following system in \(({\mathbf {x}}^\circ ,{\mathbf {z}}^\circ )\) is consistent:

$$\begin{aligned}&\big ({\mathbf {z}}^\circ -{\mathbf {x}}^\circ +F({\mathbf {x}}^\circ )\big )_i\ge 0\ \text {and }z^\circ _i=0\ \text {for all }i\in I({\mathbf {x}}),\ \text {and }\nonumber \\&\quad \quad z^\circ _i\ge 0\ \text {and }\big ({\mathbf {z}}^\circ -{\mathbf {x}}^\circ +F({\mathbf {x}}^\circ )\big )_i=0\ \text {for all }i\notin I({\mathbf {x}}),\ \text {and }{\mathbf {x}}^\circ -{\mathbf {z}}^\circ ={\mathbf {0}}, \end{aligned}$$
(15)

since there exists a solution of QCP (i.e., a point in S) at which \(I({\mathbf {x}})\) is active. Thus, the set of solutions to the system (15), denoted by T, is nonempty. It is also easy to see that every solution \(({\mathbf {x}}^\circ ,{\mathbf {z}}^\circ )\) of this system satisfies \({\mathbf {x}}^\circ \in S\). Note that system (15) is a polynomial system with 2n equalities and n inequalities with the maximum degree being 2. It then follows from Lemmas 2.1 and 2.2 that

$$\begin{aligned} {\text {dist}}(({\mathbf {x}},{\mathbf {z}}),T)\le \kappa _2\Vert {\mathbf {x}}-{\mathbf {z}}\Vert ^{\max \{\frac{1}{3\cdot 6^{5n-1}},\frac{1}{2\cdot 9^{3n-1}}\}} \end{aligned}$$

for some \(\kappa _2>0\). Since T is a closed set, there exists \(({\bar{{\mathbf {x}}}},{\bar{{\mathbf {z}}}})\in T\) such that

$$\begin{aligned} {\text {dist}}(({\mathbf {x}},{\mathbf {z}}),T)=\Vert ({\mathbf {x}},{\mathbf {z}})-({\bar{{\mathbf {x}}}},{\bar{{\mathbf {z}}}})\Vert . \end{aligned}$$

As \({\bar{{\mathbf {x}}}}\in S\), we thus have

$$\begin{aligned} {\text {dist}}({\mathbf {x}},S)\le {\text {dist}}(({\mathbf {x}},{\mathbf {z}}),T)\le \kappa _2\Vert {\mathbf {x}}-\varPi _{{{\mathbb {R}}}_+^n}({\mathbf {x}}-F({\mathbf {x}}))\Vert ^{\max \{\frac{1}{3\cdot 6^{5n-1}},\frac{1}{2\cdot 9^{3n-1}}\}}. \end{aligned}$$

Note that \(\kappa _2\) in the above inequality depends on \({\mathbf {x}}\) (actually on \(I({\mathbf {x}})\)), since it depends on the system (15). The set \(V(\epsilon )\) is bounded and thus compact for each given \(\epsilon >0\), and there are only finitely many index sets \(I({\mathbf {x}})\). Thus, there are only finitely many systems of the form (15) determined by the index sets. For each such system, an application of Lemmas 2.1 and 2.2 to the compact set \(V(\epsilon )\) yields a \(\kappa _2\) valid for all points in \(V(\epsilon )\) with the given index set; taking the maximum of these finitely many \(\kappa _2\)'s then gives a universal \(\kappa _1\) valid for every \({\mathbf {x}}\in V(\epsilon )\). The proof is complete. \(\square \)
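As a side remark, one can check in exact arithmetic which branch attains the maximum in the exponent of (12); for the values of n tested below it is always \(\frac{1}{2\cdot 9^{3n-1}}\) (a numerical check for these n, not a proof):

```python
from fractions import Fraction

for n in range(1, 9):
    e1 = Fraction(1, 3 * 6 ** (5 * n - 1))   # first branch of the max in (12)
    e2 = Fraction(1, 2 * 9 ** (3 * n - 1))   # second branch of the max in (12)
    assert max(e1, e2) == e2                 # 2 * 9^(3n-1) < 3 * 6^(5n-1) for these n
```

Exact rational arithmetic via `fractions.Fraction` avoids any floating-point underflow for these tiny exponents.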

4 Error Bounds via Polynomial Reduction

In this section, we first give the baseline error bound for QCPs from [26], then study a general result on polynomial systems, and finally apply it to the solution sets of QCPs.

4.1 First Result of Error Bound for QCP

Since the solution set of a QCP can be written as the solution set of a system of one polynomial equality and several polynomial inequalities, Lemmas 2.1 and 2.2 apply to the solution set of a QCP directly and immediately give a baseline error bound.

Proposition 4.1

Let S given in (1) be nonempty and let \({\bar{{\mathbf {x}}}}\in S\). Then, there are scalars \(\kappa >0\) and \(\epsilon >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}}, S)\le \kappa \bigg ([-{\mathbf {x}}]_{+}+[-{\mathcal {A}}{\mathbf {x}}^2-B{\mathbf {x}}-\mathbf {c}]_{+}+\big |{\mathbf {x}}^{\mathsf {T}}({\mathcal {A}}{\mathbf {x}}^2+B{\mathbf {x}}+\mathbf {c})\big |\bigg )^{\beta } \end{aligned}$$

for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-{\bar{{\mathbf {x}}}}\Vert \le \epsilon \), where

$$\begin{aligned} \beta :=\max \bigg \{\frac{1}{4\cdot 9^{3n}},\frac{1}{3\cdot 15^{3n-1}}\bigg \}. \end{aligned}$$
(16)

It is not hard to see that for \(n\ge 2\) the maximum in the exponent (16) is always attained by \(\frac{1}{4\cdot 9^{3n}}\), since \(\frac{1}{4\cdot 9^{3n}}>\frac{1}{3\cdot 15^{3n-1}}\) holds for \(n\ge 2\). From this perspective, the error bound given in [26, Theorem 3.5] (cf. Lemma 2.1) is more applicable to the solution sets of QCPs than that in [26, Theorem 3.6] (cf. Lemma 2.2).
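This comparison can be verified in exact arithmetic; the check below also exposes the boundary case \(n=1\), where the second term is the larger one:

```python
from fractions import Fraction

for n in range(1, 8):
    b1 = Fraction(1, 4 * 9 ** (3 * n))        # first term in (16)
    b2 = Fraction(1, 3 * 15 ** (3 * n - 1))   # second term in (16)
    assert (max(b1, b2) == b1) == (n >= 2)    # b1 attains the maximum iff n >= 2 here
```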

Despite the dominance of \(\frac{1}{4\cdot 9^{3n}}\) over \(\frac{1}{3\cdot 15^{3n-1}}\), its denominator is of order \(9^{3n}\), which grows rapidly. On the one hand, we cannot expect to derive an error bound for a polynomial system that avoids a denominator of order \(d^n\) in general (cf. [26, Example 3.2]); on the other hand, we hope to reduce the denominator as much as possible. Note that the base 9 in \(9^{3n}\) comes from \(3\times (d+1)-3\) (cf. (3)) with \(d=3\) for the defining polynomial system of the QCP (1). Taking a closer look at this system (1), we see that there is only one polynomial equality, which has degree 3, while the other polynomial inequalities have degree one or two. Thus, one expects that a careful study of the system will result in a denominator of order \(6^n\). This will be discussed in the sequel, and some results will be established.

4.2 Polynomial Reduction

Let

$$\begin{aligned} T:=\bigg \{{\mathbf {x}}\in {{\mathbb {R}}}^n:\ \begin{array}{cc} p_t({\mathbf {x}})\le 0 \ \ \text {for }t\in \{1,\dots ,r\},\\ f_i({\mathbf {x}})={\sum }_{j=1}^{n_i}g_{i,j}({\mathbf {x}})h_{i,j}({\mathbf {x}})=0 \ \ \text {for }i\in \{1,\dots ,s\} \end{array} \bigg \} \end{aligned}$$
(17)

be the solution set of a system of polynomial inequalities and equalities, where the \(n_i\)'s, r, s are positive integers, \(\max \{{\text {deg}}(p_t): t\in \{1,\dots ,r\}\}=w\) for some nonnegative integer w, and \(\max \{{\text {deg}}(g_{i,j}): j\in \{1,\dots ,n_i\},\ i\in \{1,\dots ,s\}\}= p\) and \(\max \{{\text {deg}}(h_{i,j}): j\in \{1,\dots ,n_i\},\ i\in \{1,\dots ,s\}\}=q\) for some nonnegative integers p, q. We assume that \(q+1\le p\) and \(\max \{{\text {deg}}(f_i): i\in \{1,\dots ,s\}\}=pq\). Then, a direct application of Lemmas 2.1 and 2.2 to T yields a local error bound with the explicit exponent

$$\begin{aligned} \max \bigg \{\frac{1}{R(n+r+s,\max \{pq,w\}+1)},\ \frac{2}{R(n+r,2\max \{pq,w\})}\bigg \}. \end{aligned}$$
(18)

We can establish another local error bound for T via polynomial reduction, as in the next result.

Theorem 4.1

Let \(T\ne \emptyset \) be given in (17) and \({\bar{{\mathbf {x}}}}\in T\). Then, there are scalars \(\kappa >0\) and \(\epsilon >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}}, T)\le \kappa \bigg (\sum _{t=1}^r[p_t({\mathbf {x}})]_++\sum _{i}|f_i({\mathbf {x}})|\bigg )^{\gamma }, \end{aligned}$$

for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-{\bar{{\mathbf {x}}}}\Vert \le \epsilon \), where \(N_s=n_1+\dots +n_s\) and

$$\begin{aligned} \gamma :=\max \bigg \{\frac{1}{R(n+2N_s+r+s,\max \{p,w\}+1)}, \frac{2}{R(n+N_s+r,2\max \{p,w\})}\bigg \}. \end{aligned}$$
(19)

Proof

Let

$$\begin{aligned} V:=\left\{ ({\mathbf {x}},{\mathbf {u}})\in {{\mathbb {R}}}^n\times {{\mathbb {R}}}^{N_s} : \ \begin{array}{c} p_t({\mathbf {x}})\le 0\qquad \qquad \text {for all }t\in \{1,\dots ,r\}, \\ \begin{array}{cc} \sum \limits _{j=1}^{n_i}u_{i,j}h_{i,j}({\mathbf {x}})=0\\ g_{i,j}({\mathbf {x}})-u_{i,j}=0 \end{array} \text { for all }j\in \{1,\dots ,n_i\}, i\in \{1,\dots ,s\} \end{array} \right\} . \end{aligned}$$

Note that the polynomials defining the equalities and inequalities in V all have degree at most \(\max \{q+1,p,w\}\le \max \{p,w\}\).

Obviously, \(V\ne \emptyset \) whenever \(T\ne \emptyset \). Indeed, let \(\overline{{\mathbf {x}}}\in T\). Then, \((\overline{{\mathbf {x}}},\overline{{\mathbf {u}}})\in V\) with

$$\begin{aligned} \overline{u}_{i,j}:=g_{i,j}(\overline{{\mathbf {x}}})\ \text {for all }j\in \{1,\dots ,n_i\},\ i\in \{1,\dots ,s\}. \end{aligned}$$

Note that the defining system of V has r inequalities and \(s+N_s\) equalities in \(n+N_s\) variables. Consequently, by Lemmas 2.1 and 2.2, there exist \(\kappa >0\) and \(\epsilon _1>0\) such that

$$\begin{aligned} {\text {dist}}(({\mathbf {x}},{\mathbf {u}}), V)\le \kappa \left( \sum _{t=1}^r[p_t({\mathbf {x}})]_++\sum _{i}\bigg |\sum _ju_{i,j}h_{i,j}({\mathbf {x}})\bigg |+\sum _{i,j}|g_{i,j}({\mathbf {x}})-u_{i,j}|\right) ^{\gamma } \end{aligned}$$

for all \(({\mathbf {x}},{\mathbf {u}})\) such that \(\Vert ({\mathbf {x}},{\mathbf {u}})-(\overline{{\mathbf {x}}},\overline{{\mathbf {u}}})\Vert \le \epsilon _1\) with \(\gamma \) given by (19).

Since \(g_{i,j}\)’s are polynomials, there exists \(\epsilon >0\) such that for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-\overline{{\mathbf {x}}}\Vert \le \epsilon \), we have

$$\begin{aligned} \Vert {\mathbf {x}}-\overline{{\mathbf {x}}}\Vert ^2+\sum _{i,j}(g_{i,j}({\mathbf {x}})-\overline{u}_{i,j})^2= \Vert {\mathbf {x}}-\overline{{\mathbf {x}}}\Vert ^2+\sum _{i,j}(g_{i,j}({\mathbf {x}})-g_{i,j}(\overline{{\mathbf {x}}}))^2 \le \epsilon _1^2. \end{aligned}$$

Therefore, for any \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-\overline{{\mathbf {x}}}\Vert \le \epsilon \), we have that \(\Vert ({\mathbf {x}},{\mathbf {u}})-(\overline{{\mathbf {x}}},\overline{{\mathbf {u}}})\Vert \le \epsilon _1\) with

$$\begin{aligned} u_{i,j}=g_{i,j}({\mathbf {x}})\ \text {for all }j\in \{1,\dots ,n_i\},\ i\in \{1,\dots ,s\}. \end{aligned}$$
(20)

Since V is a closed set, the distance function to V is always attainable. It follows that for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-\overline{{\mathbf {x}}}\Vert \le \epsilon \) with \({\mathbf {u}}\) as in (20), there exists \((\hat{{\mathbf {x}}},\hat{{\mathbf {u}}})\in V\) such that

$$\begin{aligned} \Vert ({\mathbf {x}},{\mathbf {u}})-(\hat{{\mathbf {x}}},\hat{{\mathbf {u}}})\Vert&\le \kappa \left( \sum _{t=1}^r[p_t({\mathbf {x}})]_++\sum _{i}\big |\sum _ju_{i,j}h_{i,j}({\mathbf {x}})\big |+\sum _{i,j}|g_{i,j}({\mathbf {x}})-u_{i,j}|\right) ^{\gamma }\\&=\kappa \left( \sum _{t=1}^r[p_t({\mathbf {x}})]_++\sum _{i}\big |\sum _jg_{i,j}({\mathbf {x}})h_{i,j}({\mathbf {x}})\big |\right) ^{\gamma }\\&=\kappa \left( \sum _{t=1}^r[p_t({\mathbf {x}})]_++\sum _{i}|f_i({\mathbf {x}})|\right) ^{\gamma }. \end{aligned}$$

On the other hand, \((\hat{{\mathbf {x}}},\hat{{\mathbf {u}}})\in V\) implies that

$$\begin{aligned} \hat{u}_{i,j}=g_{i,j}(\hat{{\mathbf {x}}})\ \text {for all }j\in \{1,\dots ,n_i\},\ i\in \{1,\dots ,s\}. \end{aligned}$$

Thus, \(\hat{{\mathbf {x}}}\in T\). Hence,

$$\begin{aligned} {\text {dist}}({\mathbf {x}},T)\le \Vert {\mathbf {x}}-\hat{{\mathbf {x}}}\Vert \le \Vert ({\mathbf {x}},{\mathbf {u}})-(\hat{{\mathbf {x}}},\hat{{\mathbf {u}}})\Vert \le \kappa \bigg (\sum _{t=1}^r[p_t({\mathbf {x}})]_++\sum _{i}|f_i({\mathbf {x}})|\bigg )^{\gamma }, \end{aligned}$$

which is the claimed local error bound for the set T. \(\square \)

The technique in the proof of Theorem 4.1 is polynomial reduction, i.e., replacing the equality constraints in (17) by equalities of lower degrees, at the cost of an increased number of equalities and variables.

By polynomial reduction, the maximum degree of the polynomial equalities is reduced from pq to p, at the cost of increasing the total number of variables and equalities by \(2N_s\). Let both s and \(N_s\) be fixed, and suppose that \(pq>w=p\). Then, it follows that

$$\begin{aligned} R(n+r+s,\max \{pq,w\}+1)=(pq+1)q^{n+r+s-1}(3p)^{n+r+s-1} \end{aligned}$$

and

$$\begin{aligned} R(n+2N_s+r+s,\max \{p,w\}+1)=(p+1)(3p)^{2N_s}(3p)^{n+r+s-1}. \end{aligned}$$

When n is sufficiently large, we immediately see that the error bound in Theorem 4.1 is significantly better than that in Lemma 2.1. For instance, consider the case \(n=15\), \(r=7\), \(s=1\), \(n_1=2\), \(w=10\), \(p=10\) and \(q=9\); then

$$\begin{aligned} \frac{1}{R(n+r+s,\max \{pq,w\}+1)}=\frac{1}{R(23,91)}=\frac{1}{91\times 270^{22}} \end{aligned}$$

and

$$\begin{aligned} \frac{1}{R(n+2N_s+r+s,\max \{p,w\}+1)}=\frac{1}{R(27,11)}=\frac{1}{11\times 30^{26}}. \end{aligned}$$

We see that the latter is approximately \(9^{17}\) times the former. A similar conclusion holds for Lemma 2.2.
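These magnitudes can be verified directly. The following Python sketch assumes \(R(m,d)=d(3d-3)^{m-1}\), the form reconstructed from the two displayed formulas above; the variable names are illustrative.

```python
import math

# R(m, d) = d * (3d - 3)**(m - 1): assumed form of the denominator in
# Lemmas 2.1/2.2, reconstructed from the two displayed formulas above.
def R(m, d):
    return d * (3 * d - 3) ** (m - 1)

n, r, s, p, q, w, N_s = 15, 7, 1, 10, 9, 10, 2
before = R(n + r + s, max(p * q, w) + 1)        # R(23, 91) = 91 * 270**22
after = R(n + 2 * N_s + r + s, max(p, w) + 1)   # R(27, 11) = 11 * 30**26

assert before == 91 * 270 ** 22 and after == 11 * 30 ** 26
# the two denominators differ by roughly a factor of 9**17
print(round(math.log(before / after, 9), 2))
```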

4.3 Standard Reduction and Sparsity

In this section, we apply the general result in Sect. 4.2 to QCPs. For a tensor \({\mathcal {A}}=(a_{i_{1}i_2 i_3})\in {{\mathbb {R}}}^{n\times n\times n}\), define \( {\mathcal {A}}{\mathbf {x}}^3:=\sum \limits _{i,j,k=1}^{n}a_{ijk}x_{i}x_{j}x_{k}\). The reduction focuses on the equality \({\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{\mathsf {T}}B{\mathbf {x}}+{\mathbf {c}}^{\mathsf {T}}{\mathbf {x}}=0\) (cf. (1)). A standard reformulation is

$$\begin{aligned} \sum _{i=1}^nx_i({\mathcal {A}}{\mathbf {x}}^2+B{\mathbf {x}}+{\mathbf {c}})_i={\mathbf {x}}^{\mathsf {T}}({\mathcal {A}}{\mathbf {x}}^2+B{\mathbf {x}}+{\mathbf {c}})=0, \end{aligned}$$

i.e., representing it as a sum of products of linear and quadratic polynomials. With this, a direct application of Theorem 4.1 gives the following local error bound, which improves Proposition 4.1 in certain cases.
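As a sanity check (not part of the original derivation), the identity \({\mathbf {x}}^{\mathsf {T}}({\mathcal {A}}{\mathbf {x}}^2+B{\mathbf {x}}+{\mathbf {c}})={\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{\mathsf {T}}B{\mathbf {x}}+{\mathbf {c}}^{\mathsf {T}}{\mathbf {x}}\) can be verified numerically on random data; the sketch below stores the tensor as plain nested lists.

```python
import random

random.seed(0)
n = 3
A = [[[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)] for _ in range(n)]
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
c = [random.uniform(-1, 1) for _ in range(n)]
x = [random.uniform(0, 1) for _ in range(n)]

# A x^3 = sum_{i,j,k} a_ijk x_i x_j x_k
Ax3 = sum(A[i][j][k] * x[i] * x[j] * x[k]
          for i in range(n) for j in range(n) for k in range(n))
# (A x^2)_i = sum_{j,k} a_ijk x_j x_k
Ax2 = [sum(A[i][j][k] * x[j] * x[k] for j in range(n) for k in range(n))
       for i in range(n)]
F = [Ax2[i] + sum(B[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]

lhs = sum(x[i] * F[i] for i in range(n))            # x^T (A x^2 + B x + c)
rhs = Ax3 + sum(x[i] * B[i][j] * x[j] for i in range(n) for j in range(n)) \
          + sum(c[i] * x[i] for i in range(n))      # A x^3 + x^T B x + c^T x
assert abs(lhs - rhs) < 1e-12
```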

Proposition 4.2

Let S be the solution set of the QCP given by (1) and \(\overline{{\mathbf {x}}}\in S\). Then, there exist \(\kappa >0\) and \(\epsilon >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}}, S)\le \kappa \big ([-{\mathbf {x}}]_{+}+[-{\mathcal {A}}{\mathbf {x}}^2-B{\mathbf {x}}-\mathbf {c}]_{+}+|{\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}|\big )^{\gamma } \end{aligned}$$

for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-{\bar{{\mathbf {x}}}}\Vert \le \epsilon \) with

$$\begin{aligned} \gamma :=\max \bigg \{\frac{1}{3\times 6^{5n}}, \frac{1}{2\times 9^{4n-1}}\bigg \}. \end{aligned}$$
(21)

Note that the ingredient of the polynomial reduction technique in Sect. 4.2 is reducing a polynomial equality of higher degree into a few polynomial equalities of lower degrees. As for the QCP (1), the emphasis should be put on the term \({\mathcal {A}}{\mathbf {x}}^3\), which is the only cubic part. The discussions in the rest and the next section are in the same category from this perspective.

We write \(y_{s}:=x_{j}x_{k}\) for \(s=(j,k)\) with \(j,k\in \{1,\ldots ,n\}\). Assume that the nonzero entries of \({\mathcal {A}}\) are indexed by the set

$$\begin{aligned} {\mathcal {S}}:=\{(i,j,k): a_{ijk}\ne 0\}, \end{aligned}$$

and that the cardinality of \({\mathcal {S}}\) is \(K\ge 0\). Then, the polynomial equation \({\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}=0\) can be equivalently written as the following system

$$\begin{aligned} \begin{array}{ll} \sum \limits _{(i,s)=(i,j,k)\in {\mathcal {S}}}a_{ijk}x_{i}y_{s}+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}=0,\\ y_{s}=x_{j}x_{k}\ \text {for all }(i,s)=(i,j,k)\in {\mathcal {S}}.\\ \end{array} \end{aligned}$$
(22)

Using the polynomial reduction technique, and replacing the polynomial equality \({\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}=0\) with the above system of equalities, the solution set can be equivalently written as

$$\begin{aligned} U:=\left\{ ({\mathbf {x}},{\mathbf {y}})\in {{\mathbb {R}}}^n\times {{\mathbb {R}}}^{K}\ : \ \begin{array}{cc} {\mathbf {x}}\ge {\mathbf {0}},\ \ {\mathcal {A}}{\mathbf {x}}^2+B{\mathbf {x}}+\mathbf {c}\ge {\mathbf {0}}, \\ \sum \limits _{(i,s)=(i,j,k)\in {\mathcal {S}}}a_{ijk}x_{i}y_{s}+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}=0,\\ y_{s}=x_{j}x_{k}\ \text {for all }(i,s)=(i,j,k)\in {\mathcal {S}} \end{array} \right\} . \end{aligned}$$
(23)
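The reduction in (22) can be checked on a toy sparse instance; storing the tensor as a dictionary over \({\mathcal {S}}\) is an illustrative choice, not prescribed by the text.

```python
# sparse tensor stored as {(i, j, k): a_ijk} over the index set S
A = {(0, 0, 1): 2.0, (1, 2, 2): -1.5, (2, 0, 2): 0.7}
x = [0.3, 0.5, 0.2]

# auxiliary variables y_s = x_j x_k, one per index s = (j, k) appearing in S
y = {(j, k): x[j] * x[k] for (_, j, k) in A}

cubic_direct = sum(a * x[i] * x[j] * x[k] for (i, j, k), a in A.items())
cubic_reduced = sum(a * x[i] * y[(j, k)] for (i, j, k), a in A.items())
assert abs(cubic_direct - cubic_reduced) < 1e-12   # same cubic term
```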

Therefore, with \(r=2n\), \(s=1\) and \(n_1=K\), a direct application of Theorem 4.1 gives the following local error bound.

Proposition 4.3

Let \(S\ne \emptyset \) be the solution set of the QCP given by (1) and \(\overline{{\mathbf {x}}}\in S\). Then, there exist \(\kappa >0\) and \(\epsilon >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}}, S)\le \kappa \big ([-{\mathbf {x}}]_{+}+[-{\mathcal {A}}{\mathbf {x}}^2-B{\mathbf {x}}-\mathbf {c}]_{+}+|{\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}|\big )^{\gamma } \end{aligned}$$

for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-{\bar{{\mathbf {x}}}}\Vert \le \epsilon \) with

$$\begin{aligned} \gamma :=\max \bigg \{\frac{1}{3\times 6^{3n+2K}}, \frac{1}{2\times 9^{3n+K-1}}\bigg \}. \end{aligned}$$
(24)

The error bound in Proposition 4.3 is better than that in Proposition 4.1 when the tensor \({\mathcal {A}}\) is sparse, i.e., \(K\ll n\). In particular, when K is bounded by a constant and n grows large, the exponent in (24) (whose denominator is dominated by \(6^{3n}\)) is much larger than the exponent in (16) (whose denominator is dominated by \(9^{3n}\)).

The next example is in the spirit of [26, Example 3.2], showing that the sparsity of the tensor indeed plays a role.

Example 4.1

Let \({\mathcal {A}}\in {{\mathbb {R}}}^{n\times n\times n}\) be diagonal with \(a_{iii}=1\) for all \(i\in \{1,\dots ,K\}\) with \(K<n\) and all the other entries be zero. Let \(B\in {{\mathbb {R}}}^{n\times n}\) be the matrix with the nonzero elements being \(b_{i,i+1}=-1\) for all \(i\in \{1,\dots ,K\}\) and \(b_{i,i}=-1\) for all \(i\in \{K+1,\dots ,n\}\), and \({\mathbf {c}}={\mathbf {0}}\). It can be verified that the solution set S of the QCP is the singleton \(\{{\mathbf {0}}\}\). Let \(\epsilon >0\). We take \({\mathbf {x}}\in {{\mathbb {R}}}^n\) with

$$\begin{aligned} x_i:=\epsilon ^{2^{i-1}}\ \text {for all }i\in \{1,\dots ,K+1\}\ \text {and }x_i=0\ \text {for the others}. \end{aligned}$$

Then, we have \( {\text {dist}}({\mathbf {x}},S)=\sqrt{\sum _{i=1}^{K+1}\epsilon ^{2^i}}=O(\epsilon )\), and

$$\begin{aligned}{}[-{\mathbf {x}}]_{+}+[-{\mathcal {A}}{\mathbf {x}}^2-B{\mathbf {x}}-\mathbf {c}]_{+}+|{\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+ {\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}|=\epsilon ^{2^K}+\epsilon ^{2^{K+1}}. \end{aligned}$$

Thus,

$$\begin{aligned} {\text {dist}}({\mathbf {x}},S)=O\bigg ([-{\mathbf {x}}]_{+}+[-{\mathcal {A}}{\mathbf {x}}^2-B{\mathbf {x}}-\mathbf {c}]_{+}+|{\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+ {\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}|\bigg )^{\frac{1}{2^K}}. \end{aligned}$$
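The computations in Example 4.1 can be replayed numerically. The sketch below uses 0-based indices and aggregates the vector parts of the residual by the componentwise 1-norm (an assumption on how \([-{\mathbf {x}}]_+\) and \([-F({\mathbf {x}})]_+\) are measured; with a single nonzero component it agrees with any norm).

```python
eps, n, K = 0.5, 4, 2
# x_i = eps^(2^(i-1)) for i = 1, ..., K+1 (1-based), zero otherwise
x = [eps ** 2 ** i for i in range(K + 1)] + [0.0] * (n - K - 1)

# F(x) = A x^2 + B x with A, B as in Example 4.1 and c = 0:
# F_i = x_i^2 - x_{i+1} for the first K components, F_i = -x_i otherwise
F = [x[i] ** 2 - x[i + 1] if i < K else -x[i] for i in range(n)]

res = sum(max(-xi, 0.0) for xi in x) \
    + sum(max(-Fi, 0.0) for Fi in F) \
    + abs(sum(xi * Fi for xi, Fi in zip(x, F)))
# matches the displayed residual eps^(2^K) + eps^(2^(K+1))
assert abs(res - (eps ** 2 ** K + eps ** 2 ** (K + 1))) < 1e-15
```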

4.4 Rank Decomposition

In the following, we present another method, based on tensor decomposition, to reduce the degree of the polynomial equation \({\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}=0\). In fact, this is a more intrinsic approach to exploiting the sparsity of \({\mathcal {A}}\). A result similar to Proposition 4.3 will be established.

Given a third-order tensor \({\mathcal {A}}\in {{\mathbb {R}}}^{n\times n\times n}\), we can flatten it as an \(n\times n^2\) matrix M by combining its second and third indices as

$$\begin{aligned} m_{is}=a_{ijk}\ \text {with }s=(j-1)n+k\ \text {for all }j,k\in \{1,\dots ,n\}. \end{aligned}$$
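In code, the mode-1 flattening reads as follows (a sketch with 0-based indices, so the column index becomes \(s=jn+k\)):

```python
# Mode-1 flattening of a 3rd-order tensor stored as nested lists:
# row i of M lists the slice A[i] in row-major order over (j, k).
def flatten_mode1(A):
    n = len(A)
    return [[A[i][j][k] for j in range(n) for k in range(n)] for i in range(n)]

A = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # n = 2
M = flatten_mode1(A)
assert M == [[1, 2, 3, 4], [5, 6, 7, 8]]   # an n x n^2 matrix
```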

As a matrix in \({{\mathbb {R}}}^{n\times n^2}\), it can be decomposed into a sum of \(t={\text {rank}}(M)\) rank-one matrices as

$$\begin{aligned} M=\sum _{i=1}^t{\mathbf {u}}_i{\mathbf {v}}_i^{\mathsf {T}} \end{aligned}$$

with \({\mathbf {u}}_i\in {{\mathbb {R}}}^n\) and \({\mathbf {v}}_i\in {{\mathbb {R}}}^{n^2}\). Obviously, \(t={\text {rank}}(M)\le n\). We can then fold a long vector \({\mathbf {v}}\in {{\mathbb {R}}}^{n^2}\) into an \(n\times n\) matrix V by chopping the \(n^2\)-vector \({\mathbf {v}}\) into n consecutive n-vectors and taking them as the column vectors of V. The reverse operation is clear: stack the column vectors of V one by one into an \(n^2\)-vector. Thus, in a more compact form, we can decompose a tensor \({\mathcal {A}}\in {{\mathbb {R}}}^{n\times n\times n}\) as

$$\begin{aligned} {\mathcal {A}}=\sum _{s=1}^{t}{\mathbf {u}}_{s}\otimes {\mathbf {T}}_{s},\ \ {\mathbf {u}}_{s}\in {\mathbb {R}}^{n},\ {\mathbf {T}}_{s}\in {\mathbb {R}}^{n\times n},\ \ \text {where }t\le n. \end{aligned}$$

A direct calculation also shows that

$$\begin{aligned} {\mathcal {A}}{\mathbf {x}}^3=\sum _{s=1}^{t}(\mathbf {u}_{s}^{\mathsf {T}}{\mathbf {x}})({\mathbf {x}}^{{\mathsf {T}}}{\mathbf {T}}_{s}{\mathbf {x}}). \end{aligned}$$
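This identity can be checked by constructing \({\mathcal {A}}\) directly from a rank-t decomposition \(\sum _s {\mathbf {u}}_s\otimes {\mathbf {T}}_s\) with random data (a pure-Python sketch; the data are illustrative):

```python
import random

random.seed(1)
n, t = 3, 2
U = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(t)]   # u_s
T = [[[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
     for _ in range(t)]                                             # T_s

# A = sum_s u_s (x) T_s, i.e. a_{ijk} = sum_s (u_s)_i (T_s)_{jk}
A = [[[sum(U[s][i] * T[s][j][k] for s in range(t)) for k in range(n)]
      for j in range(n)] for i in range(n)]

x = [0.4, 0.2, 0.9]
Ax3 = sum(A[i][j][k] * x[i] * x[j] * x[k]
          for i in range(n) for j in range(n) for k in range(n))
# sum_s (u_s^T x)(x^T T_s x)
decomposed = sum(
    sum(U[s][i] * x[i] for i in range(n))
    * sum(x[j] * T[s][j][k] * x[k] for j in range(n) for k in range(n))
    for s in range(t))
assert abs(Ax3 - decomposed) < 1e-12
```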

Denote \({\mathbf {y}}_{s}:={\mathbf {x}}^{{\mathsf {T}}}{\mathbf {T}}_{s}{\mathbf {x}}\). Then, the polynomial equality \({\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}=0\) can be represented as

$$\begin{aligned} \begin{array}{ll} \sum \limits _{s=1}^{t}({\mathbf {u}}_{s}^{\mathsf {T}}{{\mathbf {x}}}){\mathbf {y}}_{s}+{{\mathbf {x}}}^{{\mathsf {T}}}{\mathbf {B}}{{\mathbf {x}}}+{\mathbf {c}}^{{\mathsf {T}}}{{\mathbf {x}}}=0,\\ {\mathbf {y}}_{s}:={{\mathbf {x}}}^{{\mathsf {T}}}{\mathbf {T}}_{s}{{\mathbf {x}}}\quad \hbox {for all }s\in \{1,\ldots ,t\}. \end{array} \end{aligned}$$
(25)

The corresponding solution set can then be written as

$$\begin{aligned} W:=\left\{ ({\mathbf {x}},{\mathbf {y}})\in {{\mathbb {R}}}^n\times {{\mathbb {R}}}^{t}\ : \ \begin{array}{cc} {{\mathbf {x}}}\ge {{\mathbf {0}}},\ \ {{\mathcal {A}}}{{\mathbf {x}}}^2+{\mathbf {B}}{{\mathbf {x}}}+{\mathbf {c}}\ge {{\mathbf {0}}},\\ \sum \limits _{s=1}^{t}({\mathbf {u}}_{s}^{\mathsf {T}}{{\mathbf {x}}}){\mathbf {y}}_{s}+ {{\mathbf {x}}}^{{\mathsf {T}}}{\mathbf {B}}{{\mathbf {x}}}+{\mathbf {c}}^{{\mathsf {T}}}{{\mathbf {x}}}=0,\\ {\mathbf {y}}_{s}={{\mathbf {x}}}^{{\mathsf {T}}}{\mathbf {T}}_{s}{{\mathbf {x}}},\ \ s\in \{1,\ldots ,t\}. \end{array} \right\} \end{aligned}$$
(26)

Through the above transformation, the maximum degree of the polynomials in the system defining W is 2, and the number of equality constraints is reduced to \(t+1\le n+1\).

Summarizing the above analysis, the key ingredient is flattening the tensor \({\mathcal {A}}\) into an \(n\times n^2\) matrix and reformulating the cubic polynomial \({\mathcal {A}}{\mathbf {x}}^3\) as a sum of products of linear and quadratic polynomials. Note that the three sub-indices of the tensor entries \(a_{ijk}\) play symmetric roles in \({\mathcal {A}}{\mathbf {x}}^3\). The above flattening groups the second and the third indices; it is called the mode-1 flattening. Likewise, we have the mode-2 (i.e., grouping the first and the third indices) and mode-3 (i.e., grouping the first and the second indices) matrix flattenings. Unlike the matrix case, surprisingly, the three \(n\times n^2\) matrices derived from the three flattenings can have different ranks [6]. These ranks are called multilinear ranks in the literature, denoted by \({\text {rank}}_1({\mathcal {A}}),{\text {rank}}_2({\mathcal {A}})\) and \({\text {rank}}_3({\mathcal {A}})\), respectively, for the mode-1, mode-2 and mode-3 matrix flattenings of \({\mathcal {A}}\). For each mode of flattening, a reformulation similar to (26) can be derived.

Let

$$\begin{aligned} {\text {mrank}}({\mathcal {A}}):=\min \{{\text {rank}}_1({\mathcal {A}}), {\text {rank}}_2({\mathcal {A}}),{\text {rank}}_3({\mathcal {A}})\}. \end{aligned}$$

With the parameters in Theorem 4.1 being \(r=2n\), \(s=1\), \(N_s={\text {mrank}}({\mathcal {A}})\), \(p=w=2\) and \(q=1\), we have the next result.

Proposition 4.4

Let \(S\ne \emptyset \) be the solution set of the QCP given by (1) and \({\bar{{\mathbf {x}}}}\in S\). Then, there exist \(\kappa >0\) and \(\epsilon >0\) such that

$$\begin{aligned} {\text {dist}}({\mathbf {x}}, S)\le \kappa \bigg ([-{\mathbf {x}}]_{+}+[-{\mathcal {A}}{\mathbf {x}}^2-B{\mathbf {x}}-{\mathbf {c}}]_{+}+|{\mathcal {A}}{\mathbf {x}}^3+{\mathbf {x}}^{{\mathsf {T}}}B{\mathbf {x}}+{\mathbf {c}}^{{\mathsf {T}}}{\mathbf {x}}|\bigg )^\gamma \end{aligned}$$

for all \({\mathbf {x}}\) such that \(\Vert {\mathbf {x}}-{\bar{{\mathbf {x}}}}\Vert \le \epsilon \), where

$$\begin{aligned} \gamma :=\max \bigg \{\frac{1}{3\times 6^{3n+2{\text {mrank}}({\mathcal {A}})}},\frac{1}{2\times 9^{3n+{\text {mrank}}({\mathcal {A}})-1}}\bigg \}. \end{aligned}$$
(27)

It is easy to see that Proposition 4.4 generalizes Proposition 4.3.

In the generic case \({\text {mrank}}({\mathcal {A}})=n\), we have \( \gamma =\max \bigg \{\frac{1}{3\times 6^{5n}},\frac{1}{2\times 9^{4n-1}}\bigg \}\), which recovers the conclusion of Proposition 4.2.

Likewise, the merit of Proposition 4.4 shows for tensors \({\mathcal {A}}\) with low multilinear ranks: when \({\text {mrank}}({\mathcal {A}})\) is small, or even a constant as n varies, the exponent in (27) is much better than that in (16).

5 Conclusions

In this article, error bound analysis for the solution set of a QCP is investigated. A local error bound via the natural residual function with an explicit fractional exponent is first established. Then, local error bounds exploiting the sparsity of the third-order tensor in the underlying QCP, from both the algebraic and geometric perspectives, are derived; they improve the standard result for QCPs obtained from the state-of-the-art error bound theory for general polynomial systems. The latter conclusions are established with the aid of a general polynomial reduction technique for polynomial systems, which should be of independent interest and has great potential for applications in a variety of situations, including the study of TCPs.

Several topics for further study arise. Firstly, algorithms with explicit convergence rates can be established based on the error bound results given in this article. Secondly, it is of great interest as well as importance to study at which solutions of a QCP a local Lipschitzian error bound holds. Once this is established, a locally linearly convergent algorithm for solving QCPs can be designed. Lastly, extensions to TCPs and general polynomial complementarity problems, as well as error bounds with constant fractional exponents for structured QCPs, are also of interest. Note that all the results can be extended to the general polynomial case analogously. However, the reduction technique of Sect. 4.2 then gives only a slight reduction of the base in the exponent while increasing the number of equalities. Further studies are therefore necessary for the general case.