1 Introduction

The deployment of Artificial Neural Networks (ANNs) in safety-critical applications such as medical image processing or semi-autonomous vehicles poses a number of new assurance, verification, and certification challenges [2, 5]. For ANN-based end-to-end steering control of self-driving cars, for example, it is important to know how much noisy or even maliciously manipulated sensory input is tolerated [12]. Here we address these challenges by establishing maximum and verified bounds for the resilience of given ANNs to these kinds of input disturbances.

More precisely, we define and compute safe perturbation bounds for multi-class ANN classifiers. This measure is based on the relative ratio-ordering of the so-called softmax output neurons, which captures scenarios where one only wants to consider inputs that classify to a certain class with high probability. The problem of finding minimal perturbation bounds is reduced to solving a corresponding mixed-integer programming (MIP) problem. In particular, the encoding of non-linear functions such as ReLU and max-pooling nodes requires the introduction of integer variables. These integer constraints are commonly handled by off-the-shelf MIP-solvers such as CPLEX, which are based on branch-and-bound algorithms. In the MIP reduction, a number of nonlinear expressions are linearized using a variant of the well-known big-M [8] encoding strategy. We also define a dataflow analysis [6] for generating relatively small big-M constants as the basis for speeding up MIP solving. Other important heuristics in encoding the MIP problem include solving several substantially simpler MIP problems first, in order to speed up the overall generation of satisfying instances by the solver. Lastly, branch-and-bound is run in parallel on a number of computing cores.

We demonstrate the effectiveness and scalability of our approach and encoding heuristics by computing maximum perturbation bounds for benchmark sets such as MNIST [13] and agent games [14]. These case studies include ANNs for image recognition and for high-level maneuver decisions for autonomous control of a robot. Using the heuristic encodings outlined above we observed a speed-up of about two orders of magnitude compared with vanilla MIP encodings. Moreover, parallelization of branch-and-bound [23] on different computing cores can yield, up to a certain threshold, linear speed-ups using a high-performance parallelization framework.

The practical advantages of our approach for validating and qualifying ANNs for safety-relevant applications are manifold. First, perturbation bounds provide a formal interface between sensor sets and ANNs in that they provide a maximum tolerable bound on possible sensor errors. These assume-guarantee interfaces therefore form the basis for decoupling the design of sensor sets from the design of the classifier itself. Second, our method also computes minimally perturbed inputs of different classification, which might be included in ANN training sets for potentially improving classification results. Third, maximum perturbation bounds are a useful measure of the resilience of an ANN to (adversarial) perturbations, and also for objectively comparing different ANNs. Last, large perturbation bounds are intuitively inversely related to the problem of overfitting, that is, poor generalization to new inputs, which is a common issue with ANNs.

An overview of concrete problems and various approaches to the safety of machine learning is provided in [2]. We compare our results only with work that is most closely related to ours. Techniques including the generation of test cases [7, 15, 16] or strengthening the resistance of a network with respect to adversarial perturbation [17] are used for validating and improving ANNs. In contrast to our work, these methods do not actually establish verified properties on the input-output behavior of ANNs. Formal methods-based approaches for verifying ANNs include abstraction-refinement based approaches [18], bounded model checking of neural networks for control problems [21], and neural network verification using SMT solvers or other specialized solvers [9, 11, 19]. Instead, we rely on solving MIP problems and on parallelization of branch-and-bound algorithms. In contrast to previous approaches we also go beyond verification and solve optimization problems for ANNs for establishing maximum perturbation bounds. These kinds of problems might also be addressed in SMT-based approaches, either by using binary search over SMT or by using SMT solvers that support optimization such as \(\nu Z\) [4], but it is not clear how well these approaches scale to complex ANNs. Recent work also targets ReLU networks [11] or the analysis of a single input image [3, 9] (point-wise robustness, or measures computed by sampling). Our proposed resilience measure for ANNs goes beyond [3, 9, 11] in that it applies to multi-classification networks using softmax outputs. Moreover, our proposed measure is a property of the classification network itself rather than just a property of a single image (as in [9]) or a property estimated by only taking samples from the classifier without guarantees (as in [3]).

The paper is structured as follows. Section 2 reviews the foundations of feed-forward ANNs. Section 3 presents an encoding of various neurons in terms of linear constraints. Section 4 defines our measure for quantifying the resilience of an ANN, that is, its capability to tolerate random or even adversarial input perturbations. Section 5 summarizes our MIP encoding heuristics for substantially increasing the performance of the MIP-solver in establishing minimal perturbation bounds of ANNs. Finally, we present the results of some of our experiments in Sect. 6, and we describe possible improvements and extensions in Sect. 7.

2 Preliminaries

We introduce some basic concepts of feed-forward artificial neural networks (ANN) [1]. These networks consist of a sequence of layers labeled from \(l=0,1, \ldots , L\), where 0 is the index of the input layer, L is the index of the output layer, and all other layers are so-called hidden layers. For the purpose of this paper we assume that each input is of bounded domain. Superscripts \(^{(l)}\) are used to index layer l-specific variables, but these superscripts may be omitted for the input layer. Each layer l is comprised of nodes \(n^{(l)}_i\) (so-called neurons), for \(i=0,1, \ldots , d^{(l)}\), where \(d^{(l)}\) is the dimension of layer l. By convention, nodes of index 0 have a constant output 1; these bias nodes are commonly used for encoding activation thresholds of neurons. In a feed-forward net, nodes \(n^{(l-1)}_j\) of layer \(l-1\) are connected with nodes \(n^{(l)}_i\) in layer l by means of directed edges of weight \(w^{(l)}_{ji}\). For the purpose of this paper we are assuming that all weights in a network have fixed values, since we do not consider re-learning. Figure 1 illustrates a small feed-forward network structure with four layers, where each layer comes with a different type of node function; these node functions are also the main ingredients of convolutional neural networks. These node functions are specified in Fig. 2. The first hidden layer of the network in Fig. 1 is a fully-connected ReLU layer. Node \(n^{(1)}_2\), for example, computes the weighted linear sum of all inputs from the previous layer as \(\textsf {im}^{(1)}_2\), and outputs the maximum of 0 and this weighted sum. The second hidden layer uses max-pooling for down-sampling an input representation by reducing its dimensionality; node \(n^{(2)}_1\), for example, just outputs the maximum of its inputs. Node \(n^{(3)}_1\) in the output layer applies the sigmoid-shaped \(\textsf {tan}^{-1}\) to the weighted linear input sum.

Fig. 1. An illustration of how a neural network is defined.

Fig. 2. Input-output functions of neurons.

Fig. 3. Topological structure for an output layer with 3 neurons using softmax.

Given an input to the network, these node functions are applied successively from layer 1 to L for computing the corresponding network output at layer L. For \(l = 1\) to L we use \(x^{(l)}_i\) to denote the output value of node \(n^{(l)}_i\) and \(x^{(l)}_i(a_1, \ldots , a_d)\) denotes the output value \(x^{(l)}_i\) for the input \(a_1, \ldots , a_d\), sometimes abbreviated by \(x^{(l)}_i(a)\).
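As a concrete illustration of this forward computation, the sketch below evaluates a tiny network in the style of Fig. 1 (a fully-connected ReLU layer, a max-pooling layer, and a \(\textsf {tan}^{-1}\) output layer). NumPy and all numerical values are assumptions made purely for illustration, not the weights of Fig. 1.

```python
import numpy as np

def relu_layer(x, W, b):
    """Fully-connected ReLU layer: im = W x + b, output max(0, im)."""
    return np.maximum(0.0, W @ x + b)

def max_pool(x, groups):
    """Max-pooling: each output is the maximum over a group of inputs."""
    return np.array([x[g].max() for g in groups])

def atan_layer(x, W, b):
    """Sigmoid-shaped tan^{-1} applied to the weighted linear sum."""
    return np.arctan(W @ x + b)

# Illustrative input, weights and biases (assumed); W[i][j] plays the role of w^(l)_{ji}.
x0 = np.array([0.5, -1.0, 2.0])
W1 = np.array([[ 1.0, -2.0,  0.5],
               [ 0.3,  0.8, -1.0],
               [-0.7,  0.2,  0.4],
               [ 0.1, -0.5,  0.9]])
b1 = np.array([0.1, -0.2, 0.0, 0.3])          # contributions of the bias node
x1 = relu_layer(x0, W1, b1)                   # first hidden layer (ReLU)
x2 = max_pool(x1, groups=[[0, 1], [2, 3]])    # second hidden layer (max-pooling)
x3 = atan_layer(x2, np.array([[0.6, -0.4]]), np.array([0.05]))   # output layer
print(x1, x2, x3)
```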

For the purpose of multi-class classification, the values computed at layer \(L-1\) are often transformed into a probability distribution at the output layer L by means of the \(\textsf {softmax}\) function

$$\frac{\textsf {e}^{x^{(L-1)}_i}}{\sum _{j= 1, \ldots , d^{L}} \textsf {e}^{x^{(L-1)}_j}}\text{. }$$

In this way, the output \(x^{(L)}_i\) is interpreted as the probability of the input to be in class i. For the inputs \(x^{(L-1)}_1 = -1\), \(x^{(L-1)}_2 = 2\), \(x^{(L-1)}_3 = 3\) of the nodes in Fig. 3, for example, the corresponding outputs (0.0132, 0.2654, 0.7214) for \((x^{(L)}_1, x^{(L)}_2, x^{(L)}_3)\) sum up to 1.
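The numbers in this example can be checked directly; a minimal sketch (Python with NumPy, an assumed tooling choice) is:

```python
import numpy as np

x = np.array([-1.0, 2.0, 3.0])     # x^(L-1) values of the nodes in Fig. 3
p = np.exp(x) / np.exp(x).sum()    # softmax outputs x^(L)
print(p.round(4))                  # [0.0132 0.2654 0.7214], summing to 1
```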

3 Arithmetic Encoding of Artificial Neural Networks

In a first step, we are encoding the behavior of ANNs in terms of linear arithmetic constraints. In addition to [11] we are also considering \(\textsf {tan}^{-1}\), max-pooling and softmax nodes as commonly found in many ANNs in practice. These encodings are based on the input-output behavior of every node in the network, and the main challenge is to handle the non-linearities, which arise from non-linear activation functions (e.g., ReLU and \(\textsf {tan}^{-1}\)), max-pooling and softmax nodes.

Constraints for ReLU and \(\textsf {tan}^{-1}\) nodes as defined in Fig. 2 are separated into, first, an equality constraint (1) for the intermediate value \(\textsf {im}^{(l)}_i\) and, second, several linear constraints for encoding the non-linear behavior of these nodes.

$$\begin{aligned} \textsf {im}^{(l)}_i= \sum _{j=0, \dots , d^{(l-1)}} w^{(l)}_{ji} x^{(l-1)}_j \end{aligned}$$
(1)

We now describe the encoding of the non-linear functions (\(x^{(l)}_i = \textsf {max} (0, \textsf {im}^{(l)}_i)\) or \(x^{(l)}_i = \textsf {tan}^{-1} (\textsf {im}^{(l)}_i)\)).

Encoding ReLU activation function. The non-linearity in ReLU constraints \(x^{(l)}_i = \textsf {max} (0, \textsf {im}^{(l)}_i)\) is handled using the well-known big-M method [8], which introduces a binary integer variable \(b^{(l)}_i\) together with a positive constant \(M^{(l)}_i\) such that \(-M^{(l)}_i\le \textsf {im}^{(l)}_i\) and \(x^{(l)}_i\le M^{(l)}_i\) for all possible values of \(\textsf {im}^{(l)}_i\) and \(x^{(l)}_i\). A derivation of the following reduction is listed in the appendix.

Proposition 1

\(x^{(l)}_i = \textsf {max} (0, \textsf {im}^{(l)}_i)\) iff the constraints (2a) to (4b) hold.

$$\begin{aligned} x^{(l)}_i&\ge 0 \end{aligned}$$
(2a)
$$\begin{aligned} x^{(l)}_i&\ge {\textsf {im}}^{(l)}_i \end{aligned}$$
(2b)
$$\begin{aligned} {\textsf {im}}^{(l)}_i - b^{(l)}_i M^{(l)}_i&\le 0 \end{aligned}$$
(3a)
$$\begin{aligned} {\textsf {im}}^{(l)}_i + (1- b^{(l)}_i)M^{(l)}_i&\ge 0 \end{aligned}$$
(3b)
$$\begin{aligned} x^{(l)}_i&\le {\textsf {im}}^{(l)}_i + (1-b^{(l)}_i)M^{(l)}_i \end{aligned}$$
(4a)
$$\begin{aligned} x^{(l)}_i&\le b^{(l)}_i M^{(l)}_i \end{aligned}$$
(4b)
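The following sketch instantiates constraints (2a)–(4b) for a single ReLU node in the PuLP modeling library (an assumption for illustration; the paper itself uses CPLEX). The value of \(M^{(l)}_i\) is likewise an assumed constant. Fixing the intermediate value to a negative test value and maximizing \(x^{(l)}_i\) shows that the constraints pin the output to \(\textsf {max}(0, \textsf {im}^{(l)}_i)\).

```python
from pulp import LpProblem, LpVariable, LpMaximize, value

M = 10.0                                        # big-M constant (assumed)
prob = LpProblem("relu_node", LpMaximize)
im = LpVariable("im", lowBound=-M, upBound=M)   # weighted input sum im^(l)_i
x  = LpVariable("x",  lowBound=0,  upBound=M)   # ReLU output x^(l)_i
b  = LpVariable("b",  cat="Binary")             # phase indicator b^(l)_i

prob += 1.0 * x                 # objective: maximize x; (4a)/(4b) cap it at max(0, im)
prob += x >= 0                  # (2a)
prob += x >= im                 # (2b)
prob += im - b * M <= 0         # (3a): b = 0 forces im <= 0
prob += im + (1 - b) * M >= 0   # (3b): b = 1 forces im >= 0
prob += x <= im + (1 - b) * M   # (4a): b = 1 forces x <= im
prob += x <= b * M              # (4b): b = 0 forces x <= 0

prob += im == -3.0              # test input; expected output x = max(0, -3) = 0
prob.solve()
print(value(x), value(b))       # x = 0.0 (with b = 0)
```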

The efficiency of a MIP-solver via big-M encoding heavily depends on the size of \(M^{(l)}_i\), because MIP-solvers typically relax binary integer variables to real-valued variables, resulting in a weak LP-relaxation for large big-Ms. It is therefore essential to choose relatively small values for \(M^{(l)}_i\). We apply static analysis [6] based on interval arithmetic for propagating the bounded input values through the network, as the basis for generating “good” values for \(M^{(l)}_i\).

Max-Pooling. The output \(x^{(l)}_i\) of a max-pooling node is rewritten as \(x^{(l)}_i = \textsf {max} (\textsf {im}_1, \textsf {im}_2)\), where \(\textsf {im}_1 = \textsf {max} (x^{(l-1)}_{j_1}, x^{(l-1)}_{j_2})\) and \(\textsf {im}_2 = \textsf {max} (x^{(l-1)}_{j_3}, x^{(l-1)}_{j_4})\). Encoding these \(\textsf {max}\) operations into MIP constraints is accomplished by introducing three binary integer variables, one for each pairwise maximum \(y=\textsf {max}(x_1, x_2)\), which is encoded using the big-M method.

Property-directed encoding of softmax. The exponential function in the definition of softmax, of course, cannot be encoded into a linear MIP constraint. However, using the proposition below, one confirms that if the property to be analyzed does not depend on the concrete output values of the neurons but only on their ratio ordering, then (1) it suffices to omit the construction of the output layer, and (2) one may rewrite the property by replacing each \(x^{(L)}_{i}\) by \(x^{(L-1)}_{i}\).

Proposition 2

Given a feed-forward ANN with softmax output layer and a constant \(\alpha > 0\), then for all \(i_1, i_2\in \{1, \ldots , d^{(L)}\}\):

$$ x^{(L)}_{i_1} \ge \alpha \, x^{(L)}_{i_2} \Leftrightarrow x^{(L-1)}_{i_1} \ge \ln (\alpha ) + x^{(L-1)}_{i_2}.$$

This equivalence is simply derived by using the definition of softmax, multiplying by the positive denominator, and applying the logarithm to the resulting inequality. The derivation is listed in the appendix.
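In short, the chain of equivalences (a sketch of the derivation, using \(\alpha > 0\), the positivity of the denominator, and the monotonicity of \(\ln \)) reads:

$$\begin{aligned} x^{(L)}_{i_1} \ge \alpha \, x^{(L)}_{i_2} \;\Leftrightarrow \; \frac{\textsf {e}^{x^{(L-1)}_{i_1}}}{\sum _{j} \textsf {e}^{x^{(L-1)}_j}} \ge \alpha \, \frac{\textsf {e}^{x^{(L-1)}_{i_2}}}{\sum _{j} \textsf {e}^{x^{(L-1)}_j}} \;\Leftrightarrow \; \textsf {e}^{x^{(L-1)}_{i_1}} \ge \alpha \, \textsf {e}^{x^{(L-1)}_{i_2}} \;\Leftrightarrow \; x^{(L-1)}_{i_1} \ge \ln (\alpha ) + x^{(L-1)}_{i_2}. \end{aligned}$$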

Encoding \({{\textsf {\textit{tan}}}}^{-1}\) with error bounds. The handling of the non-linearity in \(\textsf {tan}^{-1}(\textsf {im})\) is based on results in digital signal processing for piece-wise approximating \(\textsf {tan}^{-1}(\textsf {im})\) with quadratic constraints and error bounds. In case \(-1 \le \textsf {im} \le 1\), the quadratic approximation method (Eq. (7) of [20]) is used, and \(\textsf {tan}^{-1}(\textsf {im})\) is approximated by \(\frac{\pi }{4}\textsf {im}+0.273\,\textsf {im}(1- |\textsf {im}|)\) with a maximum error smaller than 0.0038. The absolute value \(|\textsf {im}|\) in the formula is removed by encoding case splits between \(\textsf {im} \ge 0\) and \(\textsf {im} < 0\) using the big-M method. Otherwise, i.e., in the case \(\textsf {im} > 1\) or \(\textsf {im} < -1\), the symmetry condition of \(\textsf {tan}^{-1}\) [22] states that (1) if \(\textsf {im} > 0\) then \(\textsf {tan}^{-1}(\textsf {im}) + \textsf {tan}^{-1}(\frac{1}{\textsf {im}}) = \frac{\pi }{2}\), and (2) if \(\textsf {im} < 0\) then \(\textsf {tan}^{-1}(\textsf {im}) + \textsf {tan}^{-1}(\frac{1}{\textsf {im}}) = -\frac{\pi }{2}\). This implies that we can create a variable \(\textsf {im}_{inv}\) with the constraint \(\textsf {im}_{inv}\,\textsf {im} = 1\), i.e., variable \(\textsf {im}_{inv}\) is the inverse of \(\textsf {im}\). By utilizing the fact that \(-1 \le \textsf {im}_{inv} \le 1\), the value of \(\textsf {tan}^{-1}(\textsf {im}_{inv})\) can be computed using the quadratic approximation above.

Moreover, case splits are encoded using the big-M method as outlined above. Since quadratic terms are used, our approach for handling \(\textsf {tan}^{-1}\) nodes requires solving mixed integer quadratically constrained problems (MIQCP).

Using these approximations for \(\textsf {tan}^{-1}(\textsf {im}_i)\), we obtain lower and upper bounds for the value of the node variable \(x_i\), where the interval between lower and upper bound is determined by the approximation error of \(\textsf {tan}^{-1}\). Since the approximation error propagates through the network and using lower and upper bounds instead of an equality constraint relaxes the problem, our method computes approximations for the measure when it is used for ANNs with \(\textsf {tan}^{-1}\) as activation function.
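A quick numerical check of the quadratic approximation and of the symmetry-based reduction (a sketch; NumPy is an assumed tooling choice) looks as follows.

```python
import numpy as np

# Quadratic approximation of tan^{-1} on [-1, 1] (Eq. (7) of [20]).
im = np.linspace(-1.0, 1.0, 200001)
approx = np.pi / 4 * im + 0.273 * im * (1.0 - np.abs(im))
print(np.max(np.abs(np.arctan(im) - approx)))    # ~0.0037, below the stated 0.0038

# Outside [-1, 1], reduce via the symmetry condition of tan^{-1}.
im2 = 3.7
inv = 1.0 / im2                                  # the variable im_inv with im_inv * im = 1
approx_inv = np.pi / 4 * inv + 0.273 * inv * (1.0 - abs(inv))
print(np.pi / 2 - approx_inv, np.arctan(im2))    # both approximately 1.305 ... 1.307
```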

Pre-processing based on dataflow analysis. We use interval arithmetic to obtain relatively small values for big-M, in order to avoid a weak LP-relaxation of the MIP. Interval bounds for the values of \(x^{(l)}_i\) are denoted by \([\textsf {Lo}(x^{(l)}_i), \textsf {Up}(x^{(l)}_i)]\). We are assuming that all input values (at layer \(l = 0\)) are bounded, and the output of bias nodes is restricted by the singleton [1, 1] (the value of the bias is given by the weight of a bias node). Interval bounds for the values of node outputs \(x^{(l)}_i\) are obtained from the interval bounds of connected nodes from the previous layers by means of interval arithmetic.

The output \(x^{(l)}_i\) of ReLU nodes is defined by \(\textsf {im}^{(l)}_i = \sum _{j=0, \dots , d^{(l-1)}} w^{(l)}_{ji} x^{(l-1)}_j\) and the ReLU function \(\textsf {max} (0, \textsf {im}^{(l)}_i)\). Therefore, interval bounds for \(x^{(l)}_i\) are computed by first considering the interval bounds \(\textsf {Lo}(\textsf {im}^{(l)}_i)\) and \(\textsf {Up}(\textsf {im}^{(l)}_i)\), which are determined by weights of the linear sum and the bounds on \(x^{(l-1)}_j\). The bounds \(\textsf {Lo}(\textsf {im}^{(l)}_i)\) and \(\textsf {Up}(\textsf {im}^{(l)}_i)\) are obtained from interval arithmetic as follows:

$$\begin{aligned} \textsf {Lo}(\textsf {im}^{(l)}_i) = \sum _{j=0,\ldots ,d^{(l-1)}} \textsf {min}\left( w^{(l)}_{ji}\cdot \textsf {Lo}(x^{(l-1)}_j), w^{(l)}_{ji}\cdot \textsf {Up}(x^{(l-1)}_j)\right) \end{aligned}$$
$$\begin{aligned} \textsf {Up}(\textsf {im}^{(l)}_i) = \sum _{j=0,\ldots ,d^{(l-1)}} \textsf {max}\left( w^{(l)}_{ji}\cdot \textsf {Lo}(x^{(l-1)}_j), w^{(l)}_{ji}\cdot \textsf {Up}(x^{(l-1)}_j)\right) . \end{aligned}$$

Given \(\textsf {Lo}(\textsf {im}^{(l)}_i)\) and \(\textsf {Up}(\textsf {im}^{(l)}_i)\), the bounds on \(x^{(l)}_i\) are derived using the definition of ReLU, i.e.,

$$\begin{aligned}{}[\textsf {Lo}(x^{(l)}_i), \textsf {Up}(x^{(l)}_i)] = [\textsf {max}(0, \textsf {Lo}(\textsf {im}^{(l)}_i)), \textsf {max}(0, \textsf {Up}(\textsf {im}^{(l)}_i))]. \end{aligned}$$
Fig. 4. Dataflow analysis for bounding computed values in a neural network.

Note that if \(\textsf {Lo}(\textsf {im}^{(l)}_i) \ge 0\) or \(\textsf {Up}(\textsf {im}^{(l)}_i) \le 0\), these bounds suffice to determine which case of the piece-wise linear ReLU function applies. In this way, the constraints (2)–(4) may be dropped and the value of \(x_i^{(l)}\) is directly encoded using linear constraints, which reduces the number of binary variables. See Fig. 4 for an example of dataflow analysis.
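A compact sketch of this interval propagation for one fully-connected ReLU layer (NumPy is an assumed tooling choice; bias nodes are omitted and the weights and input bounds are toy values):

```python
import numpy as np

def relu_interval(W, lo_prev, up_prev):
    """Propagate [Lo, Up] bounds through one fully-connected ReLU layer."""
    lo_im = np.minimum(W * lo_prev, W * up_prev).sum(axis=1)   # Lo(im^(l)_i)
    up_im = np.maximum(W * lo_prev, W * up_prev).sum(axis=1)   # Up(im^(l)_i)
    return lo_im, up_im, np.maximum(0.0, lo_im), np.maximum(0.0, up_im)

W = np.array([[ 1.0, -2.0],
              [-0.5,  0.3]])                      # assumed weights, W[i][j] = w^(l)_{ji}
lo_im, up_im, lo_x, up_x = relu_interval(W, np.array([0.0, 0.0]), np.array([1.0, 1.0]))
print(lo_im, up_im, lo_x, up_x)                   # [-2.  -0.5] [1.  0.3] [0. 0.] [1.  0.3]
# Whenever Lo(im) >= 0 or Up(im) <= 0, the ReLU phase is fixed and constraints
# (2)-(4) together with the binary variable can be dropped for that node.
```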

In the case of max-pooling nodes, the output \(x^{(l)}_i\) is simply the maximum \(\textsf {max} (x^{(l-1)}_{j_1}, x^{(l-1)}_{j_2}, x^{(l-1)}_{j_3}, x^{(l-1)}_{j_4})\) of its four inputs. Therefore, the bounds \(\textsf {Lo}(x^{(l)}_i)\) and \(\textsf {Up}(x^{(l)}_i)\) on the output are given by the maximum of the lower and upper bounds of the four inputs, respectively. Interval bounds of the outputs for \(\textsf {tan}^{-1}\) are obtained using a polynomial approximation for \(\textsf {tan}^{-1}\). Finally, the output of \(\textsf {softmax}\) nodes is a probability in [0, 1], which might also be further refined using interval arithmetic. These bounds on \(\textsf {softmax}\) nodes, however, are not used in our encodings, because of the property-driven encoding of \(\textsf {softmax}\) output layers as described previously.

4 Perturbation Bounds

We define a concrete measure for quantifying the resilience of multi-classification neural networks with \(\textsf {softmax}\) output neurons. This measure of resilience is defined over all possible inputs of the network. In particular, our developments do not depend on probability distributions of training and test data as in previous work [3]. Maximum resilience of these ANNs is obtained by means of solving corresponding MIP problems (or MIQCPs in the case of \(\textsf {tan}^{-1}\) activation functions).

Fig. 5. Finding the smallest possible perturbation for a multi-class classifier to lose confidence.

Fig. 6. Two images A, B that both classify to number 5.

We illustrate the underlying principles of maximum resilience using examples from the MNIST database [13] for digit recognition of input images (see Fig. 5). Input images in MNIST are of dimension \(24\times 24\) and are represented as a vector \(a_1, \ldots , a_{576}\). Input layers of ANN-based multi-digit classifiers for MNIST therefore consist of 576 input neurons, and the output layer is comprised of 10 \(\textsf {softmax}\) neurons. Let the output \(x^{(L)}_0, \ldots , x^{(L)}_9\) at the last layer be the computed probabilities for an input image to be classified to characters ‘0’ to ‘9’.

To formally define a perturbation, we allow each input \(a_{i}\) (\(i = 1, \ldots , d\)) to have a small disturbance \(\epsilon _i\), so the input after perturbation is \((a_1 + \epsilon _1, \ldots , a_{d} + \epsilon _{d})\). We sometimes use the concise notation of \(a+\epsilon := (a_1 + \epsilon _1, \ldots , a_{d} + \epsilon _{d})\) for the perturbed input. The global value of the perturbation is obtained by taking the sum of the absolute values of each disturbance \(\epsilon _i\), i.e., \(|\epsilon _1|+ |\epsilon _2| + \ldots + |\epsilon _{d}|\).

Definition 1

(Maximum Perturbation Bound for the m-th classifier). For a given ANN with \(d^{(L)}\) neurons in a \(\textsf {softmax}\) output layer and given constants \(\alpha \ge 1\) and \(k\in \{1,\ldots ,d^{(L)}-1\}\), we define the maximum perturbation bound for the m-th classifier, denoted by \(\varPhi _m\), to be the maximum value such that:

figure a

Intuitively, the bound \(\varPhi _m\) guarantees that for all inputs that strongly (defined by \(\alpha \)) classify to class m, if the total amount of perturbation is limited to a value strictly below \(\varPhi _m\), then either (1) the perturbed input can still be classified as m, or (2) the probability of classifying to m is among the k highest probabilities. Dually, \(\varPhi _m\) is the smallest value such that there exists an input that originally classifies to m, for which the computed probability for class m may not be among the k highest after being perturbed by a value greater than or equal to \(\varPhi _m\). Figure 5 illustrates an example of an MNIST image being perturbed, where the neural network considers the perturbed image to be ‘0’ or ‘3’ with at least the probability of being a ‘5’. The “not among the k highest” property is an indicator that the confidence of classifying to class m has decreased under perturbation, as the perturbed input can be interpreted as at least k other classes. In our experimental evaluation below we use the fixed value \(k=2\).

Constant \(\alpha \ge 1\) may be interpreted as indicating the level of confidence of being classified to a class m. When setting \(\alpha \) to 1, the analysis takes all inputs for which the probability of class m is greater than or equal to the probabilities of the other classes. Since there might exist an image that has the same probability for all classes, setting \(\alpha = 1\) may result in a maximum perturbation of zero. Increasing k helps to avoid this effect, because it requires that at most \(k-1\) other classes have probabilities greater than or equal to the probability of m. By picking an \(\alpha >1\), low-confidence inputs are removed and part (II) of Definition 1 forces the perturbation to be greater than zero. E.g., if point B in Fig. 6 is classified to ‘5’ with probability 0.35 and to ‘0’ with probability 0.34, then even by setting \(\alpha = 1.1\), point B will not be considered in the analysis. By setting \(\alpha \) to 25 one already only considers inputs that classify to m with probability higher than 0.95.

Provided that \(\varPhi _m\) can be computed for each class m (as shown below), one defines a measure for safe perturbation by taking the minimum of all \(\varPhi _m\), and this measure is obtained by computing each \(\varPhi _m\) independently.

Definition 2

(Perturbation Bound for ANN). For an ANN with L layers and \(d^{(L)}\) \(\textsf {softmax}\) neurons in the output layer, a given \(\alpha \ge 1\), \(k\in \{1,\ldots ,d^{(L)}-1\}\), and \(\varPhi _m\) the perturbation bound for the m-th classifier of this ANN from Definition 1, the perturbation bound for ANN is defined as \(\varXi := \textsf {min}(\varPhi _1, \ldots , \varPhi _{d^{L}})\).

Based on the dual interpretation above of Definition 1 we are now ready to encode the problem of finding \(\varPhi _m\) in terms of the following optimization problem, where \(a=(a_1,\ldots ,a_d)\) and \(a+\epsilon = (a_1 + \epsilon _1,\ldots ,a_d+\epsilon _d)\).

(5)

Proposition 3

For a given \(\alpha \ge 1\) and \(k\in \{1,\ldots ,d^{(L)}-1\}\), the optimal value of the optimization problem (5) as stated above equals \(\varPhi _m\). For ANNs using \(\textsf {tan}^{-1}\) problem (5) yields an under-approximation \(\varPhi _m'\le \varPhi _m\), because the feasible region is relaxed due to the approximation of \(\textsf {tan}^{-1}\).

The first set of conjunctive constraints specifies that the input \(a = (a_1, \ldots , a_d)\) strongly classifies to m (i.e., satisfies condition I in Definition 1), while the second set of disjunctive constraints specifies that by feeding the image after perturbation, the neural network outputs that at least k classes in I are more likely (or equally likely) than class m (i.e., the second condition in Definition 1 is violated). Therefore, for input \(a = (a_1, \ldots , a_d)\) and its associated perturbation \(\epsilon = (\epsilon _1,\ldots ,\epsilon _d)\), we have that \(\sum _{i=1, \ldots , d}|\epsilon _i| \ge \varPhi _m\). By computing the minimum objective of \(\sum _{i=1, \ldots , d}|\epsilon _i|\) satisfying the constraints we obtain \(\sum _{i=1, \ldots , d}|\epsilon _i| = \varPhi _m\).

We now address the following issues in order to transform optimization problem (5) into a MIP: (1) the objective is not linear due to the introduction of the absolute value function, (2) the non-linearity of \(\textsf {softmax}\) due to the function \(x^{(L)}_i = \textsf {e}^{x^{(L-1)}_i}/ \sum _{j= 1, \ldots , d^{L}} \textsf {e}^{x^{(L-1)}_j}\), and (3) the disjunction in the second set of constraints.

(i) Transforming objectives. Since the objective \(|\epsilon _1|+ |\epsilon _2| + \ldots + |\epsilon _{d}|\) in problem (5) is not linear, we create new variables \(\epsilon ^{\textsf {abs}}_i\) in optimization problem (6), where \(i\in \{1, \ldots , d\}\), such that every \(\epsilon ^{\textsf {abs}}_i\) is greater than or equal to \(\epsilon _i\) and \(-\epsilon _i\). Whenever the objective is minimized, we have that \(\epsilon ^{\textsf {abs}}_i = |\epsilon _i|\).

(ii) Removing softmax output layer. Optimization problem (5) contains the inequality \(x^{(L)}_{m}(a_1, \ldots , a_{d}) \ge \alpha x^{(L)}_{i}(a_1, \ldots , a_{d})\). It follows from Proposition 2 that replacing this inequality with \(x^{(L-1)}_{m}(a_1, \ldots , a_{d}) \ge \ln (\alpha ) + x^{(L-1)}_{i}(a_1, \ldots , a_{d})\) is sufficient, thereby omitting the exponential function.

(iii) Transforming disjunctive constraints. The disjunctive constraint in problem (5) guarantees at least k classifications with probability equal to or higher than that of m. We rewrite it by introducing a binary variable \(c_i\) for each class \(i\ne m\). Then we use (1) an integer constraint \(\sum _{i=1, \ldots , d^{(L)}, i\ne m} c_i \ge k\) to select at least k classifications and (2) the big-M method to enforce that if classification i is selected (i.e., \(c_i = 1\)), the probability of classifying to i is higher than or equal to the probability of classifying to m.
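The sketch below illustrates transformations (i) and (iii) on a toy instance using the PuLP modeling library (an assumption for illustration; the paper uses CPLEX). The linear expressions standing in for the last-layer values \(x^{(L-1)}_j(a+\epsilon )\) are made up; in the actual encoding they are produced by the network constraints of Sect. 3.

```python
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, value

d, n_cls, m, k, M = 2, 3, 0, 2, 100.0
prob = LpProblem("min_perturbation", LpMinimize)
eps     = [LpVariable(f"eps_{i}", lowBound=-M, upBound=M) for i in range(d)]
eps_abs = [LpVariable(f"eps_abs_{i}", lowBound=0) for i in range(d)]
c = {j: LpVariable(f"c_{j}", cat="Binary") for j in range(n_cls) if j != m}

prob += lpSum(eps_abs)                    # (i) linearized objective sum |eps_i|
for i in range(d):
    prob += eps_abs[i] >= eps[i]
    prob += eps_abs[i] >= -eps[i]

# Toy stand-ins for x^(L-1)_j(a + eps) (assumed linear expressions).
out = [1.0 + eps[0], 0.2 + eps[1], 0.1 + 0.5 * eps[0]]

prob += lpSum(c.values()) >= k            # (iii) select at least k other classes
for j, cj in c.items():
    prob += out[j] >= out[m] - (1 - cj) * M   # big-M: c_j = 1 forces out_j >= out_m

prob.solve()
print(value(prob.objective), [value(e) for e in eps])   # expected: 1.8, [-1.8, 0.0]
```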

By applying the transformations (i)–(iii) to the optimization problem (5) we obtain problem (6), which is a MIP, and it follows from Proposition 3 that maximum perturbation bounds can be obtained by solving the MIP in (6).

Theorem 1

For a given \(\alpha \ge 1\) and \(k\in \{1,\ldots ,d^{(L)}-1\}\), the optimum of the MIP in (6) equals \(\varPhi _m\) for ANNs with ReLU nodes and softmax output layer. For ANNs using \(\textsf {tan}^{-1}\) it yields an under-approximation.

(6)

5 Heuristic Problem Encodings

We list some simple but essential heuristics for efficiently solving MIP problems for the verification of ANNs. Notice that these heuristics are not restricted to computing the resilience of ANNs, and may well be applicable for other verification tasks involving ANNs.

1. Smaller big-Ms by looking back at multiple layers. The dataflow analysis in Sect. 3 essentially views neurons of the same layer as independent. Here we propose a more fine-grained analysis that considers a fixed number of predecessor layers at once. Finding the bound for the output of a neuron \(x^{(l)}_i\), for example, can be understood as solving a substantially smaller MIP problem over the neurons of layers \(l-1\) and \(l-2\) when two preceding layers are considered (see the sketch after this list). These MIP problems are independent for each node and can therefore be solved in parallel. For each node, we set its value as the objective to be maximized and trigger the MIP-solver to find an upper bound; lower bounds are obtained analogously. Relations over binary integer variables can be derived by applying similar techniques. Notice that these MIPs only generate correct lower and upper bounds if they can be solved to optimality.

2. Branching priorities. This encoding heuristic uses the given structure of feed-forward ANNs in that binary integer variables originating from lower layers are prioritized for branching. Intuitively, variables from the first hidden layer only depend on the input and influence all other binary integer variables corresponding to neurons in deeper layers.

3. Constraint generation from samples and solver initialization. For computing \(\varPhi _m\) on complex systems via MIP, we use the following three-step process. First, find an input assignment \((a^{\textsf {ini}}_1, \ldots , a^{\textsf {ini}}_d)\) such that the probability of classifying to m is at least \(\alpha \) times larger than that of any other class, i.e., \(x^{(L)}_{m}(a^{\textsf {ini}}_1, \ldots , a^{\textsf {ini}}_d) \ge \alpha x^{(L)}_{j}(a^{\textsf {ini}}_1, \ldots , a^{\textsf {ini}}_d)\) for all \(j = 1, \ldots , d^{(L)}, j\ne m\). Finding \((a^{\textsf {ini}}_1, \ldots , a^{\textsf {ini}}_d)\) is equivalent to solving a substantially simpler MIP problem without introducing variables \(\epsilon _1, \ldots , \epsilon _d\) and \( \epsilon ^{\textsf {abs}}_1, \ldots , \epsilon ^{\textsf {abs}}_d\). Second, use Eq. (6) to compute the minimum perturbation by considering the domain to be of size 1, i.e., \(\{(a^{\textsf {ini}}_1, \ldots , a^{\textsf {ini}}_d)\}\). As the domain is restricted to a single input, all variables \(a_1, \ldots , a_d\) in Eq. (6) are replaced by the constants \(a^{\textsf {ini}}_1, \ldots , a^{\textsf {ini}}_d\). This also yields substantially simpler MIP problems, and the computed bound is denoted by \(\varPhi ^{\textsf {ini}}_m\). Third, and finally, initialize the MIP-solver with the computed values from steps 1 and 2, such that the search directly starts with a feasible solution with objective \(\varPhi ^{\textsf {ini}}_m\). Also, the constraint \(- \varPhi ^{\textsf {ini}}_m \le \sum _{i=1, \ldots , d} \epsilon _i \le \varPhi ^{\textsf {ini}}_m\) can further be added to restrict the search space, since \(\sum _{i=1, \ldots , d} \epsilon _i \le \sum _{i=1, \ldots , d} |\epsilon _i| = \varPhi _m \le \varPhi ^{\textsf {ini}}_m\) holds for an optimal solution.
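As an illustration of heuristic 1., the following sketch tightens the upper bound of a single layer-2 neuron by solving a small MIP over the two preceding layers. The weights, dimensions and the coarse bound M are toy assumptions, and PuLP is an assumed modeling library; one such MIP is solved per neuron, and these can run in parallel.

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, value

M = 10.0                                    # coarse bound from interval analysis (assumed)
prob = LpProblem("tighten_upper_bound", LpMaximize)
x0 = [LpVariable(f"x0_{j}", lowBound=0, upBound=1) for j in range(2)]   # bounded inputs
W1 = [[1.0, -2.0], [-0.5, 0.3]]             # assumed layer-1 weights
x1 = []
for i, row in enumerate(W1):                # layer-1 ReLU nodes, big-M encoded as in Sect. 3
    im = lpSum(w * xj for w, xj in zip(row, x0))
    x = LpVariable(f"x1_{i}", lowBound=0, upBound=M)
    b = LpVariable(f"b1_{i}", cat="Binary")
    prob += x >= im                         # (2b); (2a) is the lower bound of x
    prob += im - b * M <= 0                 # (3a)
    prob += im + (1 - b) * M >= 0           # (3b)
    prob += x <= im + (1 - b) * M           # (4a)
    prob += x <= b * M                      # (4b)
    x1.append(x)

W2 = [0.7, -1.2]                            # assumed incoming weights of the target neuron
prob += lpSum(w * xj for w, xj in zip(W2, x1))   # objective: maximize im^(2)
prob.solve()
print(value(prob.objective))                # a valid upper bound Up(im^(2)), here 0.7
```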

6 Implementation and Evaluation

We implemented an experimental platform in C++ for verifying and computing perturbation bounds for neural networks, which is based on IBM CPLEX Optimization Studio 12.7 (academic version) for MIP solving. We used three different benchmark sets as the basis for our evaluations: (1) MNIST for digit recognition, (2) agent games, and (3) deeptraffic for simulating highway overtaking scenarios. These benchmarks are denoted by \(\text {I}_{\text {MNIST}}\), \(\text {I}_{\text {Agent}}\), and \(\text {I}_{\text {deeptraffic}}\), respectively, in the following. For each of the benchmarks we created neural networks with different numbers of hidden layers and numbers of neurons, which are shown in Tables 1 and 2. All the networks were trained using ConvNetJS [10].

Table 1. Execution time for verifying the perturbation problem over a single input instance. Time out (t.o.) is set to 1 h. Agent games turn out to be quite simple to solve, therefore no heuristics are applied (n.a.).
  • Agents in agent games have 9 sensors, each pointing in a different direction and returning the distances to an apple, poison or a wall, which amounts to 27 inputs. Neural networks of various sizes were trained for an agent that gets rewarded for eating red things (apples) and gets a negative reward when it eats green things (poison).

  • deeptraffic is used as a gamified simulation environment for highway traffic. The controller is trained based on a grid sensor map, and it outputs high-level driving decisions such as switching lanes, accelerating, or decelerating.

  • For MNIST digit recognition [13], the network has 576 input nodes for the pixels of a gray-scale image; we trained three networks with different numbers of neurons in the hidden layers.

In our experimental validation we focus on the efficiency gains of our MIP encodings and parallelization for verifying neural networks, and on the computation of perturbation bounds by means of the optimization problem stated in Eq. (6).

Evaluation of MIP Encodings. To understand how dataflow analysis and our heuristic encodings reduce the overall execution time, we have created synthetic benchmarks where, for each example, we only ask for a given input instance (e.g., an image) that classifies to m whether the perturbation bound is below \(\delta \). By restricting ourselves to verifying only a single input instance and by not minimizing \(\delta \), the problem under verification (local robustness related to an input) is substantially simpler and is similar to those stated in [3, 11]. Table 1 summarizes results obtained on Google Compute Engine (16 CPUs and 60 GB RAM), allowing only 12 threads to be used. Compared to a naïve approach that sets \(M^{(l)}_i\) uniformly to a large constant, applying dataflow analysis brings benefits for instances that take a longer time to solve. The first two heuristics we have implemented are useful for solving some very difficult problems. Admittedly, they can also result in longer solution times for simpler instances, but as our ultimate goal is scalability, this issue is currently minor. More difficult instances (see \(\text {I}_{\text {MNIST}}^{4\times 50}\) in Table 1) could only be solved using heuristic 1 for preprocessing.

Effects of Parallelization. For \(\text {I}_{\text {MNIST}}\) we further measured the solution time for local robustness with \(\epsilon =0.01\) for 10 test inputs using 8, 16, 24, 32 and 64 threads on machines that have at least as many CPUs as we allow CPLEX to have threads. The results are shown in Fig. 7. It is clearly visible that using more threads brings a significant speed-up up to 32 threads, especially for instances that cannot be solved fast with few threads. Interestingly, one can also observe that for this particular problem (200 neurons in hidden layers), increasing the number of threads from 32 to 64 does not improve performance (many lines just flatten from 32 threads onwards). However, for some other problems (e.g., 400 neurons in hidden layers, or computing resilience), the parallelization effect can persist up to a larger number of threads. We suspect that for problems that have reached a certain level of simplicity, adding additional parallelization may not help further.

Fig. 7. Execution time vs. the number of threads of five test inputs for \(\text {I}_{\text {MNIST}}\) with \(\epsilon =0.01\).

Table 2. Computation time and results for computing the maximum resilience \(\varPhi _m\).

Computing \(\varPhi _m\) by solving problem (6). Table 2 shows the results of computing precise \(\varPhi _m\). For simpler problems, we can observe from the first 4 rows of Table 2 that the computed \(\varPhi _m\) increases when the value of the parameter \(\alpha \) increases. This is a natural consequence: inputs that are classified with higher confidence should tolerate more perturbation before becoming ambiguous. Notably, using a value of \(\alpha \) above its maximum makes the problem infeasible, because there does not exist an input for which the neural network has such high confidence. For complex problems, by setting \(\alpha \) closer to its maximum (which can be computed by solving another, substantially simpler MIP that maximizes \(\alpha \) over all inputs that classify to class m), one shrinks the complete input space to inputs with high confidence. Currently, the scalability of our approach relies on sometimes setting a high value of \(\alpha \), as can be observed in the lower part of Table 2.

7 Concluding Remarks

Our definition and computation of maximum perturbation bounds for ANNs using MIP-based optimization is novel. By developing specialized encoding heuristics and using parallelization we demonstrate the scalability and possible applicability of our verification approach for neural networks in real-world applications. Our verification techniques also allow us to formally and quantitatively compare the resilience of different neural networks. Also, perturbation bounds provide a formal assume-guarantee interface for decoupling the design of sensor sets from the design of the neural network itself. In our case, the network assumes a maximum sensor input error for resilience, and the input sensor sets need to be designed to guarantee the given error bound. These kinds of contract-based interfaces may form the basis for constructing more modularized safety cases for autonomous systems.

Nevertheless, we consider the developments in this paper as only a first tiny step towards realizing the full potential of formal verification techniques for artificial neural networks and their deployment for realizing new safety-critical functionalities such as self-driving cars. For simplicity we have restricted ourselves to 1-norms for measuring perturbations but other vector norms may, of course, also be used depending on the specific needs of the application context. Also, the development of specialized MIP solving strategies for verifying ANNs, which go beyond the encoding heuristics provided in this paper, may result in considerable efficiency gains. Notice also that the offline verification approach as presented here is applied a posteriori to fixed and “fully trained” networks, whereas real-world networks are usually trained and improved in the field and during operation. Furthermore, the exact relationship of our perturbation bounds with the common phenomenon of over-fitting in a neural network classifier deserves a closer examination, since perturbation may also be viewed as generalization from samples. And, finally, investigation of further measures of the resilience of ANNs is needed, as perturbation bounds do not generally cover the resilience of ANNs to input transformations such as scaling or rotation.