1 Introduction

Given two finite dimensional Euclidean spaces \(\mathbf {E}\) and \({\mathbf {F}}\), each equipped with an inner product denoted as \(\langle \cdot , \cdot \rangle\), we consider a conic program in standard form with decision variable \(x\in \mathbf {E}\):

$$\begin{aligned} \begin{aligned} {\text {minimize}} \quad&\langle c, x \rangle \\ \text {subject to} \quad&\mathcal {A}x = b,\\&x\in \mathcal {K}. \end{aligned}\qquad \qquad \qquad (\mathcal {P}) \end{aligned}$$

Here the problem data comprises a linear map \(\mathcal {A}:\mathbf {E}\rightarrow \mathbf {F}\), a right-hand side vector \(b\in \mathbf {F}\), and a cost vector \(c\in \mathbf {E}\). The cone \(\mathcal {K}\subset \mathbf {E}\) is proper [1, Section 2.4.1]. The solution set and optimal value of (\(\mathcal {P}\)) are denoted as \(\mathcal {X}_\star\) and \(p_\star\) respectively. When \(\mathcal {K}\) is the nonnegative orthant \({ \mathbf{R}}_+^n\), the second order cone \(\text{ SOC}_{n}\), or the set of positive semidefinite matrices \({ \mathbf{S}}_+^n\), we call the corresponding problem (\(\mathcal {P}\)) a linear program (LP), a second order cone program (SOCP), or a semidefinite program (SDP), respectively.
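To make the setup concrete, here is a minimal sketch of (\(\mathcal {P}\)) over \(\mathcal {K}={ \mathbf{R}}_+^n\) in CVXPY. The data \((\mathcal {A},b,c)\) are random placeholders, not from the paper; the SOCP and SDP variants only swap the cone constraint.

```python
# A minimal sketch of (P) over K = R^n_+ using CVXPY. The data (A, b, c)
# are random placeholders (not from the paper); b is built from a strictly
# positive point so the problem is feasible, and c > 0 keeps it bounded.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.standard_normal((m, n))
b = A @ rng.uniform(1.0, 2.0, n)      # feasible by construction
c = rng.uniform(0.5, 1.5, n)          # positive cost => bounded over R^n_+

x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(c @ x), [A @ x == b, x >= 0])   # x in K = R^n_+
prob.solve()
print("p_star =", prob.value)
# For SOCP or SDP only the cone constraint changes, e.g.
# cp.SOC(x[-1], x[:-1]) or X = cp.Variable((k, k), PSD=True).
```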

In this paper, we provide an elementary framework based on strict complementarity (see Sect. 2.1) to establish error bounds and quantify the sensitivity of the solution of Problem (\(\mathcal {P}\)). In the following, \(\left\| \cdot \right\| _2\) denotes the Euclidean norm induced by the inner product, while \(\left\| \cdot \right\|\) is a generic norm that will be specified when we instantiate these bounds later in the paper.

  • Error bound: Given \(x\in \mathbf {E}\), define three error metrics: suboptimality \(\epsilon _{{\tiny opt}}(x):\,=\langle c, x \rangle -p_\star\), linear infeasibility \(\mathcal {A}x -b\), and conic infeasibility \(x_{-}:\,=x -x_{+}\), where the conic part \(x_{+}:\,= \mathcal {P}_{\mathcal {K}}(x)\) is the projection of x onto \(\mathcal {K}\). These errors are easy to measure (see the sketch after this list), while the distance of a given point x to the solution set is not. This paper shows how to establish an error bound for (\(\mathcal {P}\)) that bounds the distance to the solution set in terms of these measurable error metrics, for some constants \(c_i,i=1,2,3\) and exponent \(p>0\) independent of x:

    $$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x,\mathcal {X}_\star )^p\le c_1 |\epsilon _{{\tiny opt}}(x)| + c_2 \left\| \mathcal {A}x-b\right\| +c_3 \left\| x_{-}\right\| , \end{aligned} \qquad \qquad \qquad (\hbox {ERB}) \end{aligned}$$

    where \(\mathbf{dist{}}(x,\mathcal {X}_\star ):= \inf _{x_\star \in \mathcal {X}_\star }\left\| x-x_\star \right\| _2\) is the distance to the solution set \(\mathcal {X}_\star\).

  • Sensitivity of the solution: We often wish to understand how the solution of the problem changes with perturbations of the problem data. Given new problem data \((c',\mathcal {A}',b')\in \mathbf {E}\times \mathbf {L}(\mathbf {E},\mathbf {F})\times \mathbf {F}\) for Problem (\(\mathcal {P}\)), where \(\mathbf {L}(\mathbf {E},\mathbf {F})\) is the set of linear maps from \(\mathbf {E}\) to \(\mathbf {F}\), the perturbed problem admits a new optimal solution set \(\mathcal {X}_\star '\). This paper shows how to quantify the sensitivity of the solution, for some constants \(c'_i,i=1,2,3\) and exponent \(p'>0\), via the inequality

    $$\begin{aligned} \mathbf{dist{}}(\mathcal {X}_\star ,\mathcal {X}_\star ')^{p'}\le c_1'\left\| c-c'\right\| +c_2'\left\| \mathcal {A}- \mathcal {A}'\right\| +c_3'\left\| b-b'\right\| ,\qquad \qquad \qquad (\hbox {SSB}) \end{aligned}$$

    where \(\mathbf{dist{}}(\mathcal {X}_\star ,\mathcal {X}_\star '):\,= \inf _{x_\star \in \mathcal {X}_\star ,x_\star '\in \mathcal {X}_\star '}\left\| x_\star -x_\star '\right\| _2\) is the distance between solution sets. In fact, once an error bound of the form (ERB) is available, we can prove an inequality of this form by bounding the error metrics of the new solution \(x_\star '\in \mathcal {X}_\star '\) with respect to the original problem data \((c, \mathcal {A}, b)\) in terms of the perturbation \((c-c',\mathcal {A}- \mathcal {A}',b-b')\).
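The following sketch computes the three error metrics in (ERB) for \(\mathcal {K}={ \mathbf{R}}_+^n\), where the projection \(\mathcal {P}_{\mathcal {K}}\) is coordinatewise clipping. The data and the function name `error_metrics` are illustrative placeholders, not from the paper.

```python
# A sketch of the three error metrics in (ERB) for K = R^n_+, where the
# projection P_K(x) clips negative coordinates to zero. All data below are
# toy placeholders; p_star is the optimal value of the toy LP.
import numpy as np

def error_metrics(x, A, b, c, p_star):
    x_plus = np.maximum(x, 0.0)           # conic part x_+ = P_K(x)
    x_minus = x - x_plus                  # conic infeasibility x_-
    eps_opt = c @ x - p_star              # suboptimality
    return abs(eps_opt), np.linalg.norm(A @ x - b), np.linalg.norm(x_minus)

# Toy instance: minimize x1 + 2*x2 s.t. x1 + x2 = 1, x >= 0, so p_star = 1.
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 2.0])
print(error_metrics(np.array([0.9, -0.2]), A, b, c, p_star=1.0))
```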

Importance of the error bound and sensitivity of solution. The error bound and sensitivity of the solution can be regarded as condition numbers for Problem (\(\mathcal {P}\)). They guarantee that the output of iterative algorithms to solve (\(\mathcal {P}\)) is still useful despite optimization error (of the algorithm) and measurement error (of the problem data) [2, 3]. The error bound is also vital in proving faster convergence for first order algorithms [4,5,6,7]. Hence a huge body of work has been devoted to establishing error bounds and sensitivity of solutions [2, 4, 5, 8,9,10].

Our Contribution. In this paper, we use the notion of strict complementarity (defined in Sect. 2.1) to provide an elementary, geometric, and unified framework, described in detail in Sect. 2, to establish bounds of the form (ERB) and (SSB) for the conic program (\(\mathcal {P}\)). Specifically, in Sects. 3 and 4, we show how to construct a bound with exponents \(p=p'=1\) for LP and \(p=p'=2\) for SOCP and SDP, under strict complementarity, and provide a way to obtain explicit estimates of \(c_i,i=1,2,3\) in terms of the primal and dual solutions and problem data when the primal solution is unique. Table 1 summarizes our results.

The main contribution of this paper is a new and simple framework for proving bounds of this form. As discussed in Sect. 5, many particular bounds that we present here have been discovered before. On the other hand, we believe that some of the bounds are new: in particular, the bounds on the sensitivity of the solution that pertain when the primal or dual solution is not unique.

Paper organization. The rest of the paper is organized as follows. In Sect. 2.1, we discuss two important analytical conditions assumed throughout this paper: strong duality and dual strict complementarity. In Sect. 2.2, we describe the basic framework of the strict complementarity approach: linear regularity of convex sets, facial reduction, and extension via orthogonal decomposition. In Sect. 3, we apply the framework to specific examples, LP, SOCP, and SDP, to establish error bounds. We next demonstrate how to use the error bound established to characterize the sensitivity of solutions in Sect. 4 by bounding the error measures of the new solution \(x_\star '\in \mathcal {X}_\star '\) in terms of the perturbation \((c-c',\mathcal {A}- \mathcal {A}',b-b')\). Finally, in Sect. 5, we discuss previous results regarding (ERB) and (SSB), how this work relates to them, and potential new directions.

Notation. We use \(\mathbf {E},\mathbf {F},\mathbf {E}',\mathbf {F}'\) to represent generic finite dimensional Euclidean spaces. For a set \(\mathcal {C}\) in \(\mathbf {E}\), we denote its interior, boundary, affine hull, and relative interior as \(\mathbf{int}(\mathcal {C})\), \(\partial \mathcal {C}\), \(\mathrm{affine}(\mathcal {C})\), and \(\mathbf{rel int}(\mathcal {C})\) respectively. We equip \({ \mathbf{R}}^n\) with the dot inner product, and \({ \mathbf{S}}^n\) and \({ \mathbf{R}}^{n\times n}\) with the trace inner product. The distance to a set \(\mathcal {C}\) is defined as \(\mathbf{dist{}}(x,\mathcal {C})=\inf _{z\in \mathcal {C}}\left\| x-z\right\| _2\). We write \(\left\| \cdot \right\|\) for an arbitrary norm, and \(\left\| \cdot \right\| _2\) for the \(\ell _2\) norm induced by the underlying inner product. For matrices, the operator norm (maximum singular value), Frobenius norm, and nuclear norm (sum of singular values) are denoted as \(\left\| \cdot \right\| _{{\tiny {op }}}\), \(\left\| \cdot \right\| _{{\tiny \mathrm{F}}}\), and \(\left\| \cdot \right\| _*\) respectively. For a linear map \(\mathcal {B}:\mathbf {E}\rightarrow \mathbf {F}\) and a linear space \(\mathcal {V}\subset \mathbf {E}\), we write the restriction of \(\mathcal {B}\) to \(\mathcal {V}\) as \(\mathcal {B}_{\mathcal {V}}\). We define the largest and smallest singular value of \(\mathcal {B}\) as \(\sigma _{\max } (\mathcal {B}):\,=\max _{\left\| x\right\| _2=1}\left\| \mathcal {B}(x)\right\| _2\) and \(\sigma _{\min } (\mathcal {B}):\,=\min _{\left\| x\right\| _2=1}\left\| \mathcal {B}(x)\right\| _2\) respectively.

2 The strict complementary slackness approach

In Sect. 2.1, we introduce two important structural conditions, strong duality and dual strict complementarity, that are essential to our approach. Next in Sect. 2.2, we describe the main ingredients of the strict complementary slackness approach: linear regularity (Sect. 2.2.1), facial reduction (Sect. 2.2.2), and orthogonal decomposition (Sect. 2.2.3). Our main result, Theorem 1, is in Sect. 2.2.3.

2.1 Analytical conditions

Here we define two conditions that are essential to our framework: strong duality and dual strict complementarity. To start, let us recall that the dual problem of (\(\mathcal {P}\)) is

$$\begin{aligned} \begin{array}{ll} \text{ maximize } &{} \langle b, y \rangle \\ \text{ subject } \text{ to } &{} c- \mathcal {A}^* y \in \mathcal {K}^*. \\ \end{array}\qquad \qquad \qquad\qquad\qquad\quad (\mathcal {D}) \end{aligned}$$

The vector \(y\in \mathbf {F}\) is the decision variable, the linear map \(\mathcal {A}^*\) is the adjoint of the linear map \(\mathcal {A}\), and the cone \(\mathcal {K}^*\) is the dual cone of \(\mathcal {K}\), i.e., \(\mathcal {K}^*:\,=\{s\in \mathbf {E}\mid \langle s, x \rangle \ge 0,\,\forall x\in \mathcal {K}\}\). Let us introduce strong duality first.

Definition 1

(Strong duality) The primal and dual problems (\(\mathcal {P}\)) and \((\mathcal {D})\) satisfy strong duality (SD) if the primal and dual solution sets \(\mathcal {X}_\star , \mathcal {Y}_\star\) are nonempty, \(\mathcal {X}_\star\) is compact, and there exists a primal and dual solution pair \((x_\star ,y_\star )\in \mathcal {X}_\star \times \mathcal {Y}_\star\) such that

$$\begin{aligned} \begin{aligned} \langle c, x_\star \rangle =\langle b, y_\star \rangle =\langle \mathcal {A}x_\star , y_\star \rangle . \end{aligned}\qquad \qquad \qquad \qquad\qquad\qquad\quad (\hbox {SD}) \end{aligned}$$

Equivalently, define the slack vector \(s_\star = c-\mathcal {A}^* y_\star\) to rewrite the equality (SD) as

$$\begin{aligned} 0 =\langle c-\mathcal {A}^*(y_\star ), x_\star \rangle = \langle s_\star , x_\star \rangle . \end{aligned}$$

Note that we require the existence of primal and dual optimal solutions instead of just equality of optimal values. Strong duality in the stated form is ensured by the primal and dual Slater's condition: there is a primal and dual feasible pair \((x,y)\) with \((x,c-\mathcal {A}^*y)\in \mathbf{int}(\mathcal {K})\times \mathbf{int}(\mathcal {K}^*)\).
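As a concrete sanity check, the snippet below verifies (SD) and the identity \(\langle s_\star , x_\star \rangle =0\) on a two-variable LP whose optimal pair can be worked out by hand; the instance is a toy example, not from the paper.

```python
# A hand-worked check of (SD) on a tiny LP: minimize x1 + 2*x2 subject to
# x1 + x2 = 1, x >= 0. The optimal pair is x_star = (1, 0), y_star = 1,
# so the slack is s_star = c - A^T y_star = (0, 1). (Toy data.)
import numpy as np

A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 2.0])
x_star = np.array([1.0, 0.0]); y_star = np.array([1.0])
s_star = c - A.T @ y_star

assert np.isclose(c @ x_star, b @ y_star)   # equal primal/dual objectives
assert np.isclose(s_star @ x_star, 0.0)     # <s_star, x_star> = 0
print("s_star =", s_star)
```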

Next we state the second condition: dual strict complementarity. This condition is the key to establishing error bounds for a variety of optimization problems [4, 5].

Definition 2

(Dual strict complementarity (DSC)) Given a solution pair \((x_\star ,y_\star )\in \mathcal {X}_\star \times \mathcal {Y}_\star\), define the complementary face \(\mathcal {F}_{s_\star } :\,=\{x\mid \langle x, s_\star \rangle =0\}\cap \mathcal {K}\), where \(s_\star = c-\mathcal {A}^*y_\star\). The solution pair \((x_\star ,y_\star )\) satisfies dual strict complementarity if

$$\begin{aligned} \begin{aligned} x_\star \in \mathbf{rel int}(\mathcal {F}_{s_\star }). \end{aligned} \qquad \qquad \qquad \qquad\qquad\qquad\quad (\hbox {DSC}) \end{aligned}$$

If (\(\mathcal {P}\)) and \((\mathcal {D})\) have one such pair, we say (\(\mathcal {P}\)) and \((\mathcal {D})\) (or simply (\(\mathcal {P}\))) admit dual strict complementarity.

Let us now unpack the definition of \(\mathcal {F}_{s_\star }\) and dual strict complementarity. Also see Fig. 1 for a graphical illustration of the condition.

Fig. 1 Strict Complementarity: For both plots, the indigo cone is the cone \(\mathcal {K}\); the slack vector \(s_\star\) is the blue ray; the complementary face \(\mathcal {F}_{s_\star }\) is the dashed black ray; and the complementary space \(\mathcal {V}_{s_\star }\) is the black line (both solid and dashed parts). In the 2D case, the complementary hyperplane \(\mathcal {H}_{s_\star }\) and \(\mathcal {V}_{s_\star }\) coincide. In the 3D case, the complementary hyperplane \(\mathcal {H}_{s_\star }\) is the yellow plane, which is tangent to the cone \(\mathcal K\) at the point \(x_\star\) and is orthogonal to \(s_\star\).

Understanding the complementary face \(\mathcal {F}_{s_\star }\). To understand the name complementary face, let us introduce the complementary hyperplane \(\mathcal {H}_{s_\star } = \{x\mid \langle x, s_\star \rangle =0\}\), the hyperplane orthogonal to the slack vector \(s_\star =c-\mathcal {A}^*y_\star\). The complementary face \(\mathcal {F}_{s_\star }\) is simply the intersection of \(\mathcal {H}_{s_\star }\) and the cone \(\mathcal {K}\); the intersection is nonempty due to strong duality. We can see that \(\mathcal {F}_{s_\star }\) is indeed a face of \(\mathcal {K}\): the hyperplane \(\mathcal {H}_{s_\star }\) touches \(\mathcal {K}\) (it contains \(x_\star\)), and every \(x \in \mathcal {K}\) lies on the same side of \(\mathcal {H}_{s_\star }\), since \(\langle x, s_\star \rangle \ge 0\) for \(s_\star \in \mathcal {K}^*\). In particular, we see the face \(\mathcal {F}_{s_\star }\) is exposed.

Dual strict complementarity (DSC) and Slater’s condition. To better understand dual strict complementarity, define the complementary space as the affine hull of \(\mathcal {F}_{s_\star }\), \(\mathcal {V}_{s_\star }:\,=\mathrm{affine}(\mathcal {F}_{s_\star })\). The complementary face \(\mathcal {F}_{s_\star } =\mathcal {H}_{s_\star }\cap \mathcal {K}\) is a cone, so the complementary space \(\mathcal {V}_{s_\star }\) is a linear subspace. Imagine modifying problem (\(\mathcal {P}\)) by replacing the cone \(\mathcal {K}\) with \(\mathcal {F}_{s_\star }\) and restricting the decision variable x to the subspace \(\mathcal {V}_{s_\star }\). Note that \(x_\star\) is still a solution to this problem, so this procedure is related to facial reduction: the modified problem restricts x to a face of the original cone \(\mathcal {K}\). DSC means that there is a primal solution \(x_\star\) in the interior of the cone \(\mathcal {F}_{s_\star }\subset \mathcal {V}_{s_\star }\), where the interior is taken w.r.t. the subspace \(\mathcal {V}_{s_\star }\). Hence DSC is equivalent to the usual Slater's condition for the modified problem.

Primal strict complementarity and strict complementarity. Given dual strict complementarity (DSC), a natural way to define primal strict complementarity (PSC) is to reverse the roles of \(s_\star\) and \(x_\star\) in the definition of DSC. Precisely, PSC means that there exists \((x_\star ,y_\star )\) with \(c-\mathcal {A}^*y_\star =s_\star \in \mathbf{rel int}(\{s\mid \langle x_\star , s \rangle =0 \,\text {and}\, s\in \mathcal {K}^* \})\). Primal and dual strict complementarity are not equivalent in general, but they are equivalent whenever the cone \(\mathcal {K}\) is exposed. Happily, all the symmetric cones are exposed, including \(\mathcal {K}= { \mathbf{R}}_+^n\), \(\text{ SOC}_{n}\), and \({ \mathbf{S}}_+^n\). [11] shows that DSC and PSC actually hold “generically” for general conic programs. It is worth noting that the standard notions of strict complementarity (SC) for SDP [12, Definition 4] and LP, both defined algebraically, are equivalent to the geometric notion of DSC here. SC always holds for LP [13], holds “generically” for SDP as shown in [12], and even holds for some structured instances of SDP [3]. Due to the equivalence, DSC also holds under the same conditions for LP and SDP.
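For SDP, the algebraic form of strict complementarity is the rank condition \({\mathbf{rank}}(X_\star )+{\mathbf{rank}}(S_\star )=n\) [12]. Below is a minimal numerical check of this condition on the \(\mathbb {Z}_2\)-synchronization-style instance of Example 1 later in the paper; the data are constructed by hand.

```python
# Strict complementarity check for SDP: rank(X_star) + rank(S_star) = n.
# Instance from Example 1: X_star = 11^T and S_star = n*I - 11^T.
import numpy as np

n = 8
ones = np.ones((n, 1))
X_star = ones @ ones.T                  # rank 1
S_star = n * np.eye(n) - X_star         # rank n - 1

rank = lambda M: np.linalg.matrix_rank(M, tol=1e-8)
assert np.allclose(X_star @ S_star, 0)      # complementarity: X S = 0
assert rank(X_star) + rank(S_star) == n     # strict complementarity
print(rank(X_star), rank(S_star))
```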

2.2 Defining the strict complementary slackness approach

In this section, we explain how to use the two assumptions in Sect. 2.1 to establish a framework to prove error bounds of the form (ERB) and sensitivity bounds of the form (SSB). As we explained in the introduction, an error bound (ERB) can be used to derive a sensitivity bound (SSB). Hence, we focus on proving an error bound first. Our main theorem, Theorem 1 in Sect. 2.2.3, reduces the task to bounding the quantity \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x_{+})\right\| _2\), where \(\mathcal {P}_{\mathcal {V}_{s_\star }^\perp }\) denotes the orthogonal projection onto \(\mathcal {V}_{s_\star }^\perp\). We explain how to further bound this quantity in Sect. 3.

2.2.1 From optimality to feasibility: linear regularity of convex sets

Our first step is to identify problem (\(\mathcal {P}\)) with the feasibility problem of finding x such that

$$\begin{aligned} \langle c, x \rangle =p_\star ,\quad \mathcal {A}(x)=b, \quad \text {and}\quad x \in \mathcal {K}. \end{aligned}$$
(1)

This transformation activates the following geometric result, called linear regularity of convex sets [14, Theorem 4.6], [15, Theorem 2.1], a classical result on error bounds for feasibility systems. This result states that the distance to the intersection of two sets is bounded by the sum of the distances to each set.

Lemma 1

(Linear regularity) Suppose \(C\subset \mathbf {E}\) is an affine space with \(C=\{x\mid \mathcal {B}x = d\}\), where \(\mathcal {B}:\mathbf {E}\rightarrow \mathbf {F}\) is a linear map, and \(D\subset \mathbf {E}\) is a closed convex cone. If \(C\cap \mathbf{int}(D)\not =\emptyset\) and \(C\cap D\) is compact, then there are some \(\gamma ,\gamma '>0\) such that for all \(x\in \mathbf {E}\),

$$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x, C\cap D)\le \gamma \left\| \mathcal {B}x-d\right\| _2+\gamma '\mathbf{dist{}}(x,D). \end{aligned} \end{aligned}$$
(2)

Now it is tempting to set \(C=\{x\mid \langle c, x \rangle =p_\star ,\,\mathcal {A}(x)=b\}\) and \(D=\mathcal {K}\), and conclude that (ERB) holds with exponent \(p=1\), since \(C\cap D = \mathcal {X}_\star\). The catch is that for most problems of interest, the optimal solution \(x_\star \in \partial \mathcal {K}\) lies on the boundary of the cone \(\mathcal {K}\), so the condition \(C\cap \mathbf{int}(D)\not =\emptyset\) fails! Indeed, unless \(c=0\), which makes (\(\mathcal {P}\)) a feasibility problem, we always have \(x_\star \in \partial \mathcal {K}\).

2.2.2 Facial reduction

We may still use Lemma 1 to establish an error bound. The key is to use the facial reduction idea mentioned earlier. Recall the condition required is \(C\cap \mathbf{int}(D)= \{x\mid \langle c, x \rangle =p_\star ,\, \mathcal {A}(x)=b\}\cap \mathbf{int}(\mathcal {K})\not =\emptyset\), which does not hold for (\(\mathcal {P}\)) with nonzero \(c\). The problem is that the cone \(\mathcal {K}\) lies in the large space \(\mathbf {E}\), so its interior (with respect to \(\mathbf {E}\)) does not contain \(x_\star\). Instead, consider restricting the variable x to the complementary space \(\mathcal {V}_{s_\star }=\mathrm{affine}(\mathcal {F}_{s_\star })\) and replacing the cone \(\mathcal {K}\) by the complementary face \(\mathcal {F}_{s_\star }\). The interior of the cone \(\mathcal {F}_{s_\star }\) with respect to the space \(\mathcal {V}_{s_\star }\) does contain \(x_\star\) under strict complementarity, so we may activate Lemma 1.

This modification enables an error bound for \(x\in \mathcal {V}_{s_\star }\) as stated in Lemma 2 below. We also provide a more concrete estimate of the constants \(\gamma ,\gamma '\) when \(\mathcal {X}_\star\) is a singleton.

Lemma 2

Suppose strong duality and dual strict complementarity hold. Then there are constants \(\gamma ,\gamma '\) such that for any \(x\in \mathcal {V}_{s_\star }\):

$$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x,\mathcal {X}_\star )&\le \gamma \left\| \mathcal {A}(x)-b\right\| _2 +\gamma ' \mathbf{dist{}}(x,\mathcal {F}_{s_\star }). \end{aligned} \end{aligned}$$
(3)

Moreover, if \(\mathcal {X}_\star\) is a singleton, then we may take \(\gamma '=0\) and \(\gamma =\frac{1}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\). Here, the linear map \(\mathcal {A}_{\mathcal {V}_{s_\star }}\) is \(\mathcal {A}\) restricted to \(\mathcal {V}_{s_\star }\) and \(\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})\) is the smallest singular value of \(\mathcal {A}_{\mathcal {V}_{s_\star }}\).

Proof

The inequality (3) is immediate from Lemma 1 with \(C=\{x\in \mathcal {V}_{s_\star }\mid \mathcal {A}(x)=b\}\subset \mathcal {V}_{s_\star }\), \(D=\mathcal {F}_{s_\star }\), and \(\mathbf {E}= \mathcal {V}_{s_\star }\).

Indeed, this choice gives \(C\cap D = \mathcal {X}_\star\). The condition \(C\cap \mathbf{int}(D) \not =\emptyset\), where the interior is taken with respect to the space \(\mathcal {V}_{s_\star }\), is exactly (DSC): \(\exists x_\star \in \mathcal {X}_\star\) such that \(x_\star \in \mathbf{rel int}(\mathcal {F}_{s_\star })\).

Now we show that we may take \(\gamma '=0\) and \(\gamma =\frac{1}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\) when \(\mathcal {X}_\star\) is a singleton. First assume \(\mathcal {A}_{\mathcal {V}_{s_\star }}\) has a trivial nullspace, so \(x_\star\) is the only solution in \(\mathcal {V}_{s_\star }\) to \(\mathcal {A}_{\mathcal {V}_{s_\star }}(x)=b\). Hence \(\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})>0\), and so for any \(x\in \mathcal {V}_{s_\star }\), \(\left\| x-x_\star \right\| _2\le \frac{1}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\left\| \mathcal {A}_{\mathcal {V}_{s_\star }}(x-x_\star )\right\| _2 = \frac{1}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\left\| \mathcal {A}(x)-b\right\| _2\). Finally, we show by contradiction that the nullspace of \(\mathcal {A}_{\mathcal {V}_{s_\star }}\) is trivial whenever \(\mathcal {X}_\star\) is a singleton. If the nullspace is not trivial, then there is some nonzero \(x'\in \mathcal {V}_{s_\star }\) such that \(\mathcal {A}_{\mathcal {V}_{s_\star }}(x')=0\). Hence \(x_\star +\alpha x'\) is still optimal for small enough \(\alpha >0\), as \(x_\star \in \mathbf{rel int}(\mathcal {F}_{s_\star })\), which contradicts our hypothesis that the solution set \(\mathcal {X}_\star\) is a singleton. \(\square\)

This choice of the face \(\mathcal {F}_{s_\star }\) and the corresponding linear space \(\mathcal {V}_{s_\star }\) corresponds to the idea of facial reduction [16, 17]. Facial reduction is a conceptual and numerical technique designed to handle conic feasibility problems for which constraint qualifications (such as Slater's condition) fail. Such failure is exactly the interesting case for the feasibility system (1), since the optimal solution \(x_\star \in \partial \mathcal {K}\). Indeed, our choice of face can be considered as one step of the facial reduction procedure.
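When \(\mathcal {X}_\star\) is a singleton, the constant \(\gamma =\frac{1}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\) of Lemma 2 is directly computable. Below is a sketch for the LP case (anticipating Sect. 3.1, where \(\mathcal {V}_{s_\star }\) consists of the coordinates on which \(s_\star\) vanishes); the data are toy placeholders.

```python
# A sketch of gamma = 1 / sigma_min(A_V) from Lemma 2 in the LP case,
# where V_{s_star} is spanned by the coordinates with (s_star)_i = 0.
# Toy data; assumes the primal solution is unique.
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
s_star = np.array([0.0, 0.0, 3.0])             # slack, zero exactly on the face

V_idx = np.where(np.isclose(s_star, 0.0))[0]   # coordinates spanning V_{s_star}
A_V = A[:, V_idx]                              # A restricted to V_{s_star}
sigma_min = np.linalg.svd(A_V, compute_uv=False).min()
gamma = 1.0 / sigma_min
print("gamma =", gamma)
```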

2.2.3 Extension to the whole space: orthogonal decomposition

In this section, we derive our main result, Theorem 1, by extending the previous result to the whole space using the orthogonal decomposition \(\mathbf {E}= \mathcal {V}_{s_\star }\oplus \mathcal {V}_{s_\star }^\perp\).

Theorem 1

Suppose strong duality and dual strict complementarity hold. Then for some constants \(\gamma ,\gamma '\) described in Lemma 2 and for all \(x\in \mathbf {E}\), we have

$$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x,\mathcal {X}_\star ) \le&(1+\gamma \sigma _{\max }(\mathcal {A}))\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x_{+})\right\| _2+\gamma \left\| \mathcal {A}(x)-b\right\| _2\\&+\,(1+\gamma \sigma _{\max }(\mathcal {A}))\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_{-})\right\| _2+\gamma '\left\| \mathcal {P}_{\mathcal {V}_{s_\star }}(x_{-})\right\| _2\\&+\,\gamma '\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_{+}),\mathcal {F}_{s_\star }), \end{aligned} \end{aligned}$$
(4)

where \(\mathcal {P}_{\mathcal {V}_{s_\star }}\) and \(\mathcal {P}_{\mathcal {V}_{s_\star }^\perp }\) are orthogonal projections to \(\mathcal {V}_{s_\star }\) and \(\mathcal {V}_{s_\star }^\perp\) respectively. The terms \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }}(x_{-})\right\| _2\) and \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_{-})\right\| _2\) can themselves be bounded by \(\left\| x_{-}\right\| _2\).

Proof

Recall Lemma 2 establishes an error bound for only those \(x\in \mathcal {V}_{s_\star }\). Using the orthogonal decomposition proposed, for any \(x\in \mathbf {E}\),

$$\begin{aligned} x = \mathcal {P}_{\mathcal {V}_{s_\star }}(x) + \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x). \end{aligned}$$

This decomposition immediately gives

$$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x,\mathcal {X}_\star )&\le \left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x)\right\| _2 + \mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x), \mathcal {X}_\star ). \end{aligned} \end{aligned}$$
(5)

The second term \(\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x), \mathcal {X}_\star )\) can be bounded using Lemma 2:

$$\begin{aligned} \mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x), \mathcal {X}_\star )\le \gamma \left\| \mathcal {A}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x))-b\right\| _2+ \gamma '\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x),\mathcal {F}_{s_\star }). \end{aligned}$$
(6)

To translate the above bound into the linear infeasibility \(\mathcal {A}(x)-b\) and the conic infeasibility \(x_{-}\), we note that \(x= \mathcal {P}_{\mathcal {V}_{s_\star }}(x)+\mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x)\) and \(x=x_{+}+x_{-}\). With these two decompositions, and a tad more algebra, we arrive at the theorem. \(\square\)

To go further, we need to bound the following two terms in terms of \(\epsilon _{{\tiny opt}}(x)\), \(\mathcal {A}(x)-b\), and \(x_{-}\):

  1. The distance of \(x_{+}\) to the space \(\mathcal {V}_{s_\star }\), i.e., \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x_{+})\right\| _2\).

  2. The term \(\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_{+}),\mathcal {F}_{s_\star })\).

In Sect. 3, we show how to bound both terms for the special cases of LP, SOCP, SDP, and more general conic programs (\(\mathcal {P}\)) where \(\mathcal {K}\) is a finite product of LP, SOCP, or SDP cones. A quick summary of results can be found in Table 1.

As we shall see in Sect. 3, the term \(\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_{+}),\mathcal {F}_{s_\star })\) is usually zero. Thus the major challenge is bounding \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x_{+})\right\| _2\). Note that when this term vanishes and x is feasible for (\(\mathcal {P}\)), the bound (4) reduces to

$$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x,\mathcal {X}_\star )&\le (1+\gamma \sigma _{\max }(\mathcal {A}))\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x)\right\| _2. \end{aligned} \end{aligned}$$
(7)

If the solution set \(\mathcal {X}_\star\) is a singleton, then from Lemma 2, we know \(\gamma =\frac{1}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\), and we encounter a condition-number-like quantity \(\frac{ \sigma _{\max }(\mathcal {A})}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\) in (7). Depending on the application, this condition number may scale with the problem dimension, but the bound is still tight, as the following example shows.

Example 1

Consider an SDP with \(C= -\mathbf{1}\mathbf{1}^\top\), where \(\mathbf{1}\in { \mathbf{R}}^n\) is the all-ones vector, \(\mathcal {A}= {\mathbf{diag}}(\cdot )\), and \(b = \mathbf{1}\). This is a simplification of the SDP for \(\mathbb {Z}_2\) synchronization [3, 18]. For this SDP, it is easily verified that the unique optimal solution is \(\mathbf{1}\mathbf{1}^\top\) and dual strict complementarity holds with dual optimal slack \(S_\star =-\mathbf{1}\mathbf{1}^\top +nI\). The condition-number-like quantity \(\frac{ \sigma _{\max }(\mathcal {A})}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{S_\star }})}\) in this case is \(\sqrt{n}\), which does scale with the dimension \(n\). However, if in (7) we take \(X = I_n\), the identity matrix, which is feasible, then the LHS and RHS of (7) are \(\sqrt{n^2-n}\) and \(\sqrt{n-1}+\sqrt{n^2-n}\) respectively. Thus the bound is actually tight for large \(n\).
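A quick numerical confirmation of Example 1, using \(\kappa = 1+\sqrt{n}\) as computed in the text:

```python
# A numerical check of Example 1: with X = I_n (feasible, since diag(I) = 1),
# the LHS of (7) is sqrt(n^2 - n) and the RHS is sqrt(n-1) + sqrt(n^2 - n).
# Here V_{S_star} = span{vv^T} with v = 1/sqrt(n), and kappa = 1 + sqrt(n).
import numpy as np

n = 50
X_star = np.ones((n, n))                       # unique solution 11^T
X = np.eye(n)
v = np.ones((n, 1)) / np.sqrt(n)

proj = (v.T @ X @ v).item() * (v @ v.T)        # P_{V_{S_star}}(X)
lhs = np.linalg.norm(X - X_star, "fro")        # dist(X, X_star)
rhs = (1 + np.sqrt(n)) * np.linalg.norm(X - proj, "fro")
print(lhs, np.sqrt(n**2 - n))                  # the two values agree
print(rhs, np.sqrt(n - 1) + np.sqrt(n**2 - n)) # the two values agree
```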

3 Application: error bounds

In this section, we show how to use the framework established in Sect. 2 to analyze conic programs (\(\mathcal {P}\)) over the nonnegative orthant, the second order cone, the set of positive semidefinite matrices, or a finite product of these cones. Our analysis has two main steps:

  1. Identify and write out the complementary face \(\mathcal {F}_{s_\star }\) and the complementary space \(\mathcal {V}_{s_\star }\).

  2. Bound the term \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2\) via a function \(f(\langle s_\star , x_{+} \rangle ,\left\| x\right\| )\), called the violation of complementarity, using the explicit structure of \(\mathcal {V}_{s_\star }\).

We summarize the findings of this section as the following lemma and corollary. Refer to Table 1 for a quick summary of the results.

Lemma 3

Define the complementarity error \(\epsilon (x)=\langle s_\star , x \rangle\). Suppose strong duality holds. The quantity \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2\) can be bounded by several different functions \(f(\langle s_\star , x_{+} \rangle ,\left\| x\right\| )\), which we call the violation of complementarity, depending on the slack vector \(s_\star\) and the cone \(\mathcal {K}\). The first two trivial cases are the following.

  1. If \(s_\star =0\), then \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2=0=:f_{0}(\epsilon (x_{+}))\).

  2. If \(s_\star \in \mathbf{int}(\mathcal {K}^*)\), then \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2=\left\| x_+\right\| _2\le c_\star \epsilon (x_{+}) =:f_{\mathbf{int}}(\epsilon (x_{+}))\), where \(c_\star = \sup _{x\in \mathcal {K}}\frac{1}{\langle s_\star , \frac{x}{\left\| x\right\| } \rangle }<\infty\).

Moreover, for the nontrivial case \(s_\star \in \partial \mathcal {K}^*\setminus \{0\}\), we have the following bounds:

  3. \(\mathcal {K}= { \mathbf{R}}_+^n\): \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2\le \frac{1}{s_{\min >0}}\epsilon (x_{+})=:f_{\tiny { \mathbf{R}}_+^n}(\epsilon (x_{+}))\), where \(s_{\min >0}\) is the smallest nonzero element of \(s_\star\).

  4. \(\mathcal {K}= \text{ SOC}_{n}\): \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x_+)\right\| _2 \le \sqrt{2\sqrt{2}\frac{\left\| x\right\| _2\epsilon (x_{+})}{\left\| s_\star \right\| _2}} =:f_{\tiny \text{ SOC}_{n}}(\epsilon (x_{+}),\left\| x\right\| _2)\).

  5. \(\mathcal {K}= { \mathbf{S}}_+^n\): \(\left\| \mathcal {P}_{\mathcal {V}_{S_\star }^\perp }(X_{+}) \right\| _{{\tiny \mathrm{F}}} \le \frac{\epsilon (X_{+})}{T} + \sqrt{2\frac{\epsilon (X_{+})}{T} \left\| X\right\| _{{\tiny {op }}}}=:f_{\tiny { \mathbf{S}}_+^{n}}(\epsilon (X_{+}),\left\| X\right\| _{{\tiny {op }}})\), where T is the smallest nonzero eigenvalue of \(S_\star\).

Proof

Let us first consider the two trivial cases (1) \(s_\star =0\) and (2) \(s_\star \in \mathbf{int}(\mathcal {K}^*)\). These cases are excluded whenever \(c\) and \(b\) are both nonzero. In the first case, \(\mathcal {V}^\perp _{s_\star }=\{0\}\), and we simply have \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2=0\). In the second case, we have \(\mathcal {V}^\perp _{s_\star }=\mathbf {E}\), and \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2=\left\| x_+\right\| _2\le c_\star \langle s_\star , x_+ \rangle\), where \(c_\star = \sup _{x\in \mathcal {K}}\frac{1}{\langle s_\star , \frac{x}{\left\| x\right\| } \rangle }<\infty\). We defer the proofs for the remaining cases to Sects. 3.1, 3.2, and 3.3 for LP, SOCP, and SDP respectively. \(\square\)
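For reference, here is a sketch of the bounds \(f\) of Lemma 3 as plain functions; the argument names (`eps`, `s_min`, `T`, etc.) are our own labels for the quantities named in the lemma.

```python
# A sketch of the violation-of-complementarity bounds f from Lemma 3 for
# the three nontrivial cones. eps denotes <s_star, x_+>.
import numpy as np

def f_orthant(eps, s_min):
    """Case K = R^n_+: s_min is the smallest nonzero entry of s_star."""
    return eps / s_min

def f_soc(eps, x_norm2, s_norm2):
    """Case K = SOC_n: s_norm2 = ||s_star||_2, x_norm2 = ||x||_2."""
    return np.sqrt(2 * np.sqrt(2) * x_norm2 * eps / s_norm2)

def f_sdp(eps, X_op, T):
    """Case K = S^n_+: T is the smallest nonzero eigenvalue of S_star,
    X_op = ||X||_op."""
    return eps / T + np.sqrt(2 * (eps / T) * X_op)
```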

Combining Lemma 3 and Theorem 1, we reach the following corollary. The quantity \(\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_{+}),\mathcal {F}_{s_\star })\) can be verified to be zero for the two trivial cases by noting that (i) the complementary space \(\mathcal {V}_{s_\star }=\mathbf {E}\) and \(\mathcal {F}_{s_\star }=\mathcal {K}\) for the case \(s_\star =0\), and (ii) the complementary space \(\mathcal {V}_{s_\star }=\{0\}\) and \(\mathcal {F}_{s_\star }\) is a closed cone for the case \(s_\star \in \mathbf{int}(\mathcal {K}^*)\). It is also zero for the other three cases, as shown in Sects. 3.1–3.3.

Corollary 1

Suppose strong duality and dual strict complementarity hold, and one of the five cases in Lemma 3 pertains. Then there exist constants \(\gamma ,\gamma '\) so that for all \(x\in \mathbf {E}\),

$$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x,\mathcal {X}_\star )&\le \kappa f(\epsilon (x_{+}),\left\| x\right\| )+\gamma \left\| \mathcal {A}(x)-b\right\| _2\\&\quad +\,\kappa \left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_{-})\right\| _2+\gamma '\left\| \mathcal {P}_{\mathcal {V}_{s_\star }}(x_{-})\right\| _2,\\ \end{aligned} \end{aligned}$$
(8)

where the condition number \(\kappa = 1+\gamma \sigma _{\max }(\mathcal {A})\). In particular, when \(\mathcal {X}_\star\) is a singleton, then \(\gamma '=0\) and \(\gamma = \frac{1}{\sigma _{\min }(\mathcal {A}_{\mathcal {V}_{s_\star }})}\). Here the formula for \(f(\epsilon (x),\left\| x\right\| )\) can be found in Lemma 3 for each of the different cases, and we can further decompose the complementarity error \(\epsilon (x_{+})=\epsilon _{{\tiny opt}}(x) +\langle y_\star , b-\mathcal {A}(x) \rangle - \langle s_\star , x_{-} \rangle\) using \(x=x_{+}+x_{-}\).

A few remarks regarding the lemma and the corollary are in order.

Remark 1

(Global and local error bound) Note the formula for the violation of complementarity \(f(\epsilon (x),\left\| x\right\| )\) uses \(\left\| x\right\| _2\) for SOCP and \(\left\| X\right\| _{{\tiny {op }}}\) for SDP. Hence the bound (8) for these two cases does not quite align with the form of the error bound (ERB) we seek. To eliminate the dependence on this norm (by bounding the norm), either of the following conditions suffices:

  • \(\left\| x\right\| \le B\) for some constant B,

  • \(\max \{\epsilon _{{\tiny opt}}(x)\,,\left\| \mathcal {A}x-b\right\| \,,\left\| x_{-}\right\| \}\le \bar{c}\) for some constant \(\bar{c}\).

The second requirement combined with (8) for SOCP (SDP) implies \(\left\| x\right\| _2\) (\(\left\| X\right\| _{{\tiny {op }}}\)) is bounded by some \(\bar{B}\) depending on \(\bar{c}\) but independent of x (X). We may then replace the term \(\left\| x\right\| _2\) (\(\left\| X\right\| _{{\tiny {op }}}\)) by B or \(\bar{B}\) in f. Requiring either of these two conditions produces a local error bound. Interestingly, no such condition on the norm is necessary for the LP case or the two trivial cases; hence the bounds (8) in these cases are global error bounds.

Remark 2

(Value of p and estimate of \(c_i\) in (ERB)) Ignoring the term f, the bound (8) in Corollary 1 is linear in \(\epsilon _{{\tiny opt}}(x),\left\| \mathcal {A}x-b\right\| ,\left\| x_{-}\right\|\). For LP and the two trivial cases, f is linear in \(\epsilon (x_{+})\), hence the error bound (ERB) holds with exponent \(p=1\). For SOCP and SDP, the square root of \(\epsilon (x_{+})\) appears in f, hence (8) gives an error bound of the form (ERB) with exponent \(p=2\), under the assumption \(\left\| x\right\| \le B\).

Now let us consider the constants \(c_1\), \(c_2\), and \(c_3\) in the error bound (ERB). It is cumbersome to estimate these for general x; here, suppose x is feasible. For the SOCP and SDP cases, also suppose \(\left\| x\right\| \le B\). Then the bound (8) reduces to

$$\begin{aligned} \begin{aligned} \mathbf{dist{}}(x,\mathcal {X}_\star )&\le (1+\gamma \sigma _{\max }(\mathcal {A}))f(\epsilon _{{\tiny opt}}(x),B). \end{aligned} \end{aligned}$$
(9)

The resulting constant \(c_1\) for \(\epsilon _{{\tiny opt}}(x)\) in the error bound (ERB) for each of the five cases appears in Table 1.

Table 1 This table presents the bound \(f(\epsilon (x_{+}),\left\| x\right\| )\) on \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x_{+})\right\| _2\), the exponent p in (ERB) and (SSB), and an estimate of \(c_1\) for feasible x, for the different cases determined by \(s_\star\) and \(\mathcal {K}\)

Remark 3

(Conditions for LP) Recall that for linear programming, dual strict complementarity is the same as strict complementarity, which always holds under strong duality [13]. Hence we need not explicitly require the dual strict complementarity condition. Also the compactness condition for strong duality in Sect. 2.1 can be dropped if we establish Theorem 1 using Hoffman’s lemma [8] instead of Lemma 1.

Remark 4

(Finite product of cones) Error bounds for a conic program whose cone is a finite product of \({ \mathbf{R}}_+^n, \text{ SOC}_{n}\), and \({ \mathbf{S}}_+^n\) can be established by bounding the term \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x_{+})\right\| _2\) by the sum of the corresponding f's in Lemma 3. We omit the details.

Next, we prove the bound f, the violation of complementarity, in Lemma 3 for the LP, SOCP, and SDP cones by following the aforementioned procedure: (i) identify and write out \(\mathcal {F}_{s_\star }\) and \(\mathcal {V}_{s_\star }\), and (ii) bound the term \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2\).

3.1 Linear programming (LP)

In linear programming, the cone \(\mathcal {K}={ \mathbf{R}}_+^n= \{x\in { \mathbf{R}}^n\mid x_i\ge 0,\,\forall \,i =1,\dots , n\}\).

Identify \(\mathcal {F}_{s_\star }\) and \(\mathcal {V}_{s_\star }\). For a particular dual optimal solution \((y_\star ,s_\star )\), satisfying dual strict complementarity, the complementary face \(\mathcal {F}_{s_\star }=\{x \in { \mathbf{R}}_+^n\mid x_i = 0,\text { for all } (s_\star )_i>0\}\), and the complementary space \(\mathcal {V}_{s_\star }=\{x\in { \mathbf{R}}^n\mid x_i = 0 \text { for all } (s_\star )_i>0\}\). Hence, the term \(\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_+), \mathcal {F}_{s_\star })\) is simply zero as \(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_+)\in \mathcal {F}_{s_\star }\).

Bound the term \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2\). Denote \(I_{s_\star } =\{i\mid (s_\star )_i>0\}\), \(I^c_{s_\star }=\{1,\dots ,n\}\setminus I_{s_\star }\), and \(s_{\min >0} = \min _{i\in I_{s_\star }} (s_\star )_i\). Then we have

$$\begin{aligned} \left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2=\left\| (x_+)_{I_{s_\star }}\right\| _2\le \Vert (x_+)_{I_{s_\star }}\Vert _1\le \frac{1}{s_{\min >0}}\langle s_\star , x_+ \rangle . \end{aligned}$$
(10)

Hence, Lemma 3 for the LP case is established.
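A randomized sanity check of (10) on toy data: \(s_\star\) lies on the boundary of \({ \mathbf{R}}_+^n\) (some zero entries) and \(x_+\) is an arbitrary point of \({ \mathbf{R}}_+^n\).

```python
# A randomized sanity check of (10): ||(x_+)_{I_{s_star}}||_2, which equals
# ||P_{V_perp}(x_+)||_2, is at most <s_star, x_+> / s_{min>0}. Toy data.
import numpy as np

rng = np.random.default_rng(1)
n = 10
s_star = rng.uniform(0.5, 2.0, n); s_star[:4] = 0.0   # slack with zeros
x_plus = rng.uniform(0.0, 3.0, n)                     # a point of R^n_+

I = s_star > 0                               # the support I_{s_star}
s_min = s_star[I].min()                      # s_{min>0}
lhs = np.linalg.norm(x_plus[I])
rhs = s_star @ x_plus / s_min
assert lhs <= rhs + 1e-12
```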

3.2 Second order cone programming (SOCP)

In second order cone programming, the cone \(\mathcal {K}\) is \(\text{ SOC}_{n} = \{x=(x_{1:n},x_{n+1})\mid \left\| x_{1:n}\right\| _2\le x_{n+1} ,\, x_{1:n}\in { \mathbf{R}}^n, \, x_{n+1} \in { \mathbf{R}}\}\).

Identify \(\mathcal {F}_{s_\star }\) and \(\mathcal {V}_{s_\star }\). Given a dual solution \(s_\star = (s_{\star ,1:n},s_{\star ,n+1})\in (\text{ SOC}_{n})^* = \text{ SOC}_{n}\) satisfying dual strict complementarity, the complementary face is \(\mathcal {F}_{s_\star } = \{x\in \text{ SOC}_{n}\mid \langle s_{\star ,1:n}, x_{1:n} \rangle +x_{n+1}s_{\star ,n+1}=0\}\). We can further simplify this expression as

$$\begin{aligned} \mathcal {F}_{s_\star } = \{\lambda (-s_{\star ,1:n}, s_{\star ,n+1})\mid \lambda \ge 0\}. \end{aligned}$$

The complementary space \(\mathcal {V}_{s_\star }\) for the nontrivial case \(s_\star \in \partial \text{ SOC}_{n} \setminus (0,0)\) is simply

$$\begin{aligned} \mathcal {V}_{s_\star }={\mathbf{span}}\{ (-s_{\star ,1:n},s_{\star ,n+1})\}. \end{aligned}$$

For the dual optimal solution \(s_\star\), denote \(\check{s}_{\star }=\frac{1}{\left\| s_\star \right\| _2}(-s_{\star ,1:n},s_{\star ,n+1})\). Note that the term \(\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_+), \mathcal {F}_{s_\star } )\) is again simply zero as \(\mathcal {P}_{\mathcal {V}_{s_\star }}(x_+) \in \mathcal {F}_{s_\star }\).

Bound the term \(\left\| \mathcal {P}_{\mathcal {V}^\perp _{s_\star }}(x_+)\right\| _2\). We now turn to analyze \(\mathcal {P}_{\mathcal {V}^\perp }(x_+)\), which can be written explicitly as

$$\begin{aligned} \begin{aligned} \mathcal {P}_{\mathcal {V}^\perp }(x_+)&= x_+- \langle x_+, \check{s}_{\star } \rangle \check{s}_{\star }. \end{aligned} \end{aligned}$$
(11)

Now introduce the shorthand \(\epsilon (x) = \langle x_+, \frac{s_\star }{\left\| s_\star \right\| _2} \rangle = \langle x_{+,1:n}, \frac{s_{\star ,1:n}}{\left\| s_\star \right\| _2} \rangle +x_{+,n+1}\frac{s_{\star ,n+1}}{\left\| s_\star \right\| _2}\); note this differs from the complementarity error of Lemma 3 by the normalization \(\left\| s_\star \right\| _2\). The squared norm of \(\mathcal {P}_{\mathcal {V}^\perp }(x_+)\) can be written as

$$\begin{aligned} \begin{aligned} \left\| \mathcal {P}_{\mathcal {V}^\perp }(x_+)\right\| _2^2&= \left\| x_+\right\| _2^2 - \langle x_+, \check{s}_{\star } \rangle ^2 \\&= \left\| x_{+,1:n}\right\| _2^2 +x_{+,n+1}^2 -\left( -\langle x_{+,1:n}, \tfrac{s_{\star ,1:n}}{\left\| s_\star \right\| _2} \rangle +x_{+,n+1}\tfrac{s_{\star ,n+1}}{\left\| s_\star \right\| _2}\right) ^2 \\&\overset{(a)}{\le } 2x_{+,n+1}^2 -\left(2x_{+,n+1}\tfrac{s_{\star ,n+1}}{\left\| s_\star \right\| _2}-\epsilon (x)\right)^2 \overset{(b)}{\le } -\epsilon ^2(x) +2\sqrt{2}x_{+,n+1}\epsilon (x), \end{aligned} \end{aligned}$$
(12)

where in step (a), we use the fact that \(x_+\in \text{ SOC}_{n}\) (so \(\left\| x_{+,1:n}\right\| _2\le x_{+,n+1}\)) and the definition of \(\epsilon\). In step (b), we use the fact that \(s_\star \in \partial {\text{ SOC}_{n}}\), so \(\frac{s_{\star ,n+1}}{\left\| s_\star \right\| _2}=\frac{1}{\sqrt{2}}\). Lemma 3 for the SOCP case is established by noting \(x_{+,n+1}\le \left\| x_{+}\right\| _2\le \left\| x\right\| _2\).
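The bound can also be checked numerically: the sketch below draws a random \(s_\star \in \partial \text{ SOC}_{n}\) and a random x, computes \(x_+\) via the standard closed-form projection onto the second order cone, and verifies \(f_{\tiny \text{ SOC}_{n}}\). The helper `proj_soc` is our own; the data are toy.

```python
# A randomized check of the SOCP bound in Lemma 3, with the standard
# closed-form projection onto SOC_n.
import numpy as np

def proj_soc(x):
    z, t = x[:-1], x[-1]
    nz = np.linalg.norm(z)
    if nz <= t:
        return x.copy()
    if nz <= -t:
        return np.zeros_like(x)
    coef = (nz + t) / 2
    return np.concatenate([coef * z / nz, [coef]])

rng = np.random.default_rng(2)
n = 5
z = rng.standard_normal(n)
s_star = np.concatenate([z, [np.linalg.norm(z)]])      # boundary of SOC_n
s_check = np.concatenate([-z, [np.linalg.norm(z)]])    # spans V_{s_star}
s_check /= np.linalg.norm(s_check)

x = rng.standard_normal(n + 1)
x_plus = proj_soc(x)
resid = x_plus - (x_plus @ s_check) * s_check          # P_{V_perp}(x_+)
eps = s_star @ x_plus                                  # <s_star, x_+>
f = np.sqrt(2 * np.sqrt(2) * np.linalg.norm(x) * eps / np.linalg.norm(s_star))
assert np.linalg.norm(resid) <= f + 1e-10
```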

3.3 Semidefinite programming (SDP)

In semidefinite programming, the cone \(\mathcal {K}\) is \({ \mathbf{S}}_+^n=\{X\in { \mathbf{S}}^n\mid X\succeq 0 \}\). Note that we use capital letter X and S for matrices.

Identify \(\mathcal {F}_{S_\star }\) and \(\mathcal {V}_{S_\star }\). For a dual optimal solution \((y_\star ,S_\star )\) with \(S_\star = C-\mathcal {A}^*y_\star \succeq 0\) satisfying dual strict complementarity, the complementary face is \(\mathcal {F}_{S_\star } := \{X\mid \langle X, S_\star \rangle =0, X\succeq 0\} = \{X\mid X = VRV^\top , R\in { \mathbf{S}}^r,R\succeq 0\}\), where \({\mathbf{rank}}(S_\star )=n-r\) and \(V\in { \mathbf{R}}^{n\times r}\) is a matrix with orthonormal columns that span the nullspace of \(S_\star\). In this case, the complementary space is \(\mathcal {V}_{S_\star } = \{VRV^\top \mid R \in { \mathbf{S}}^r\}\). We note that the term \(\mathbf{dist{}}(\mathcal {P}_{\mathcal {V}_{S_\star }}(X_+), \mathcal {F}_{S_\star })\) is again zero as \(\mathcal {P}_{\mathcal {V}_{S_\star }}(X_+) \in \mathcal {F}_{S_\star }\).

Bound the term \(\left\| \mathcal {P}_{\mathcal {V}_{S_\star }^\perp }(X_+)\right\| _{{\tiny \mathrm{F}}}\). We utilize Lemma 4 below [19, Lemma 4.3] with \(S = S_\star\) to bound the term \(\left\| \mathcal {P}_{\mathcal {V}_{S_\star }^\perp }(X_+)\right\| _{{\tiny \mathrm{F}}}\), together with the observation that \(\left\| X_+\right\| _{{\tiny {op }}}\le \left\| X\right\| _{{\tiny {op }}}\).

Lemma 4

Suppose \(X,S\in { \mathbf{S}}^{n}\) are both positive semidefinite. Let \(V\in { \mathbf{R}}^{n\times r}\) be the matrix formed by the eigenvectors with the r smallest eigenvalues of S and define \(\mathcal {V}= {\mathbf{range}}(V)\). Let \(\epsilon ={\mathbf{tr}}(XS)\). If \(T = \lambda _{n-r}(S)>0\), then

$$\begin{aligned} \left\| \mathcal {P}_{\mathcal {V}^\perp }(X) \right\| _{{\tiny \mathrm{F}}} \le \frac{\epsilon }{T} + \sqrt{2\frac{\epsilon }{T} \left\| X\right\| _{{\tiny {op }}}} \quad \text {and}\quad \left\| \mathcal {P}_{\mathcal {V}^\perp }(X) \right\| _* \le \frac{\epsilon }{T} + 2\sqrt{r\frac{\epsilon }{T} \left\| X\right\| _{{\tiny {op }}}}. \end{aligned}$$
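A randomized check of the Frobenius bound in Lemma 4, with S constructed to have exactly r zero eigenvalues; all names below are our own placeholders.

```python
# A randomized check of the Frobenius bound in Lemma 4. V collects the
# eigenvectors for the r smallest eigenvalues of S; T = lambda_{n-r}(S)
# is the smallest nonzero eigenvalue; eps = tr(XS).
import numpy as np

rng = np.random.default_rng(3)
n, r = 8, 3
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate([np.zeros(r), rng.uniform(1.0, 5.0, n - r)])
S = (Q * eigs) @ Q.T                          # PSD with rank n - r

G = rng.standard_normal((n, n))
X = G @ G.T                                   # random PSD matrix

w, U = np.linalg.eigh(S)                      # ascending eigenvalues
V = U[:, :r]                                  # eigenvectors of r smallest
T = w[r]                                      # smallest nonzero eigenvalue
eps = np.trace(X @ S)

P_V_X = V @ (V.T @ X @ V) @ V.T               # projection onto {V R V^T}
lhs = np.linalg.norm(X - P_V_X, "fro")
rhs = eps / T + np.sqrt(2 * (eps / T) * np.linalg.norm(X, 2))
assert lhs <= rhs + 1e-8
```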

4 Application: sensitivity of solution

As discussed in the introduction, to study the sensitivity of the solution, we consider a solution \(x_\star '\) of the perturbed problem

$$\begin{aligned} \begin{aligned} \underset{x}{\text {minimize}}\quad&\langle c', x \rangle \\ \text {subject to} \quad&\mathcal {A}' x = b' ,\qquad \qquad \qquad \qquad (\mathcal {P}')\\&x\in \mathcal {K}. \end{aligned} \end{aligned}$$

where the problem data \((\mathcal {A}',b',c')= (\mathcal {A},b,c) + (\Delta \mathcal {A}, \Delta b,\Delta c)\) for some small perturbation \(\Delta = (\Delta \mathcal {A}, \Delta b,\Delta c)\), and ask how the distance \(\left\| x_\star '-x_\star \right\| _2\) changes with \(\Delta\).

Note that once the error bound (ERB) is established, we can understand the sensitivity of the solution by estimating the suboptimality, linear infeasibility, and conic infeasibility of the new solution \(x_\star '\) with respect to the original problem (\(\mathcal {P}\)) via the perturbation \(\Delta\). Following this strategy, we prove the following theorem:

Theorem 2

Suppose the primal and dual Slater’s condition holds for some \((x_0,y_0)\), and the map \(\mathcal {A}\) is surjective. For any small enough \(\varepsilon >0\), there is some constant \(\bar{c}\), such that for any optimal \(x_\star '\) of (\(\mathcal {P}'\)), and all \(\left\| \Delta \right\| := \left\| \Delta \mathcal {A}\right\| _{{\tiny {op }}} +\left\| \Delta b\right\| _2+\left\| \Delta c\right\| _2\le \varepsilon\), we have

$$\begin{aligned} \max \{\epsilon _{{\tiny opt}}(x_\star '),\left\| \mathcal {A}(x_\star ')-b\right\| _2\}\le \bar{c}\left\| \Delta \right\| . \end{aligned}$$

Hence if (ERB) holds, then (SSB) holds with \(p=p'\).

To facilitate the proof, we define the smallest nonzero singular value of \(\mathcal {A}\) as \(\sigma _{\min >0}(\mathcal {A})= \min _{\left\| x\right\| _2=1, x\perp { \mathbf{nullspace}}(\mathcal {A})}\left\| \mathcal {A}(x)\right\| _2\), and the pseudoinverse of \(\mathcal {A}'\) as \((\mathcal {A}')^\dagger (y) = \mathrm{argmin}_{\mathcal {A}'(x)=y}\left\| x\right\| _2\).

Proof

Consider any solution \(x_\star '\) to the problem (\(\mathcal {P}'\)). We will use the following two facts, proved in Appendix A:

  1. Primal and dual Slater's condition for (\(\mathcal {P}'\)): there exist a primal point \(x_0'\) and a dual point \(y_0'\), feasible for problem (\(\mathcal {P}'\)) and its dual respectively, that satisfy \(\min \{\mathbf{dist{}}(x_0',\partial \mathcal {K}),\mathbf{dist{}}(c'-(\mathcal {A}')^*y_0',\partial (\mathcal {K}^*))\}>\rho\) for some \(\rho\) independent of \(\Delta\), and \(\max \{\left\| x_0-x_0'\right\| _2,\left\| y_0-y_0'\right\| _2\}<\xi\) for some \(\xi\) independent of \(\Delta\).

  2. Boundedness of primal solutions of (\(\mathcal {P}'\)): there is some \(B>0\) independent of \(\Delta\) such that any solution \(x_\star '\) to (\(\mathcal {P}'\)) satisfies \(\left\| x_\star '\right\| _2\le B\).

Let us start with the linear infeasibility \(\mathcal {A}(x_\star ')-b\). Using the linear feasibility of \(x_\star '\), \(\mathcal {A}'x_\star '=b'\), w.r.t. (\(\mathcal {P}'\)), we have

$$\begin{aligned} \begin{aligned} \mathcal {A}'(x_\star ') = b'&\implies (\mathcal {A}+\Delta \mathcal {A})x_\star ' = b+\Delta b\implies \mathcal {A}x_\star '-b =\Delta b-\Delta \mathcal {A}x_\star ' \\&\implies \left\| \mathcal {A}x_\star '-b\right\| _2\le \left\| \Delta b\right\| _2+\left\| \Delta \mathcal {A}\right\| _{{\tiny {op }}}B. \end{aligned} \end{aligned}$$
(13)

This shows \(\left\| \mathcal {A}x_\star '-b\right\| _2\le \bar{c}\left\| \Delta \right\|\) for any \(\bar{c}\ge B+1\).

Next, consider \(\epsilon _{{\tiny opt}}(x_\star ')\). We would like to use the optimality of \(x_\star '\) for (\(\mathcal {P}'\)) and compare it against \(x_\star\). However, since \(x_\star\) is not necessarily feasible for (\(\mathcal {P}'\)), we need more subtle reasoning. Consider \(\hat{x} :\,=(1-\alpha )\left( x_\star + (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )\right) +\alpha x_0'\) with \(\alpha = \frac{\left\| (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )\right\| _2}{\rho +\left\| (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )\right\| _2}\). Here \((\mathcal {A}')^{\dagger }\) exists and has largest singular value at most \(\frac{2}{\sigma _{\min >0}(\mathcal {A})}\) as long as \(\sigma _{\max }({\Delta \mathcal {A}})\le \frac{\sigma _{\min >0}(\mathcal {A})}{2}\). We have \(\left\| \hat{x}\right\| _2\le B_1\) for some \(B_1\) independent of \(\Delta\), as \(\left\| x_0-x_0'\right\| _2<\xi\). Note that \(\hat{x}\in \mathcal {K}\) since

$$\begin{aligned} \hat{x} = (1-\alpha ) \underbrace{x_\star }_{\in \mathcal {K}} + \alpha \underbrace{\left( x_0' + \frac{1-\alpha }{\alpha } (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )\right) }_{\in \mathcal {K}\;\text {since}\; \left\| \frac{1-\alpha }{\alpha } (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )\right\| _2\le \rho }, \end{aligned}$$

if \(\alpha >0\). The case of \(\alpha =0\) is trivial. We also know \(\hat{x}\) is feasible with respect to the linear constraints of (\(\mathcal {P}'\)):

$$\begin{aligned} \begin{aligned} \mathcal {A}'(\hat{x})&=(1-\alpha )\mathcal {A}'(x_\star + (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )) + \alpha \mathcal {A}'x_0' \\&= (1-\alpha )(b+\Delta \mathcal {A}(x_\star )+\Delta b-\Delta \mathcal {A}x_\star ) +\alpha b' = b'. \end{aligned} \end{aligned}$$
(14)

Note by construction \(\hat{x}- x_\star = \alpha (x_0' - x_\star ) + (1-\alpha )(\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )=:\Delta x\) with \(\alpha = \frac{\left\| (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )\right\| _2}{\rho +\left\| (\mathcal {A}')^\dagger (\Delta b-\Delta \mathcal {A}x_\star )\right\| _2}\). Hence \(\left\| \Delta x \right\| _2\le c'\left( \left\| \Delta b\right\| _2+\left\| \Delta \mathcal {A}\right\| _{{\tiny {op }}}\right)\) for some constant \(c'\). Now using the optimality of \(x_\star '\) for (\(\mathcal {P}'\)), we have

$$\begin{aligned} \begin{aligned} \langle c', x_\star ' \rangle \le \langle c', \hat{x} \rangle \implies \epsilon _{{\tiny opt}}(x_\star ') = \langle c, x_\star ' \rangle - \langle c, x_\star \rangle \le \langle \Delta c, \hat{x}-x_\star ' \rangle +\langle c, \Delta x \rangle . \end{aligned} \end{aligned}$$
(15)

Hence \(\epsilon _{{\tiny opt}}(x_\star ') \le \bar{c}\left\| \Delta \right\|\) for large enough \(\bar{c}\), using \(\left\| x_\star '\right\| _2\le B\). \(\square\)
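To illustrate Theorem 2, the sketch below perturbs b and c of a toy LP and observes that the solution moves linearly in \(\left\| \Delta \right\|\), consistent with \(p'=1\) for LP. It uses scipy.optimize.linprog; the instance is a placeholder, not from the paper.

```python
# A sketch of the sensitivity bound (SSB) on a tiny LP with a unique
# solution: perturb b and c and watch the solution move at most linearly
# in ||Delta|| (p' = 1 for LP). Toy data.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 2.0])

def solve(c, A, b):
    res = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs")
    return res.x

x_star = solve(c, A, b)                       # (1, 0)
for delta in [1e-1, 1e-2, 1e-3]:
    x_new = solve(c + delta, A, b + delta)    # perturb c and b by delta
    print(delta, np.linalg.norm(x_new - x_star))   # scales like delta
```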

5 Discussion

Using the framework established in Sect. 2, we have shown an error bound of the form (ERB) and a bound on the sensitivity of the solution with respect to problem data (SSB) for a broad class of problems: LP, SOCP, and SDP, and conic programs for which the cone is a finite product of the LP, SOCP, and SDP cones. Let us now compare the results we have obtained with the literature on error bounds and sensitivity of solution.

Error bound The celebrated results of [8] show that the error bound is linear for linear programming: in (ERB), the exponent \(p=1\). The work of Sturm [9, Section 4] shows that under strict complementarity and compactness of the solution set, SDPs satisfy a quadratic error bound: \(p=2\) in (ERB). Sturm also discusses the exponent of \(\mathbf{dist{}}(x,\mathcal {X}_\star )\) without strict complementarity: it can be bounded by \(2^d\), where d is the singularity degree, which is at most \(n-1\) [9, Lemma 3.6]. A recent result based on the framework of amenable cones shows that \(p=2\) under dual strict complementarity type conditions when the cone \(\mathcal {K}\) is symmetric [20]. When the cone \(\mathcal {K}\) is defined as a semialgebraic set (LP, SOCP, and SDP are special cases), Drusvyatskiy, Ioffe, and Lewis [21, Corollary 4.8] showed that for generic C, the exponent p is always 2 when the inequality is restricted to feasible x. We note that the proofs for these bounds do not provide estimates for \(c_i\). In our framework, estimates of \(c_i\) (expressed in terms of the primal and dual solution) can be obtained supposing the primal solution is unique.

Sensitivity of solution When \(p'=1\), the bound describing the sensitivity of the solution (SSB) is also called stability or metric regularity [22, Definition A.6], discussed in detail in [22, Appendix A] and [23, 24]. In the context of semidefinite programming, when the primal and dual solutions are unique and strict complementarity holds, Nayakkankuppam and Overton [10] show that (SSB) holds with \(p'=1\). When the cone \(\mathcal {K}\) is a semialgebraic set, Drusvyatskiy, Ioffe, and Lewis [21, Section 5] showed that for generic perturbations \(C-C',b-b'\), the sensitivity bound (SSB) holds with \(p'=1\).

In Sect. 4, we have seen that (SSB) holds with \(p'=p\) whenever (ERB) holds. Hence, using our results from Sect. 3, we see \(p'=1\) for LP, and \(p'=2\) for SOCP and SDP. This result improves on the previous bounds for SOCP and SDP in that it assumes only strict complementarity.

Extension to quadratic programming (QP)? A potential future direction is to use the approach of the paper to establish error bounds for QP. The strategy consists of three steps: (1) reducing the QP to an SOCP, (2) utilizing the error bound for the SOCP, and (3) translating the error bound to the QP setting. We leave the details to future work.

Other cones? An interesting future direction is the extension to other cones, e.g., the copositive cone, the completely positive cone, and the doubly nonnegative cone (the intersection of the nonnegative matrices and the positive semidefinite matrices). Can we still bound the term \(\left\| \mathcal {P}_{\mathcal {V}_{s_\star }^\perp }(x)\right\| _2\)? For the cones \({ \mathbf{R}}_+^n\), \(\text{ SOC}_{n}\), and \({ \mathbf{S}}_+^n\), our technique relies on their explicit structure to bound this term. Characterizing the facial structure seems challenging for other cones.