
1 Introduction

Out-of-distribution (OOD) detection has become an indispensable part of building reliable open-world machine learning models [2]. An OOD detector determines whether an input comes from the same distribution as the training data or from a different one. Recently, a plethora of literature has emerged to tackle the problem of OOD detection [16, 20, 21, 24, 26, 27, 28, 29, 33].

Despite this promise, previous methods primarily focused on clean OOD data, while largely overlooking the robustness aspect of OOD detection. Concerningly, recent works have shown the brittleness of OOD detection methods under adversarial perturbations [5, 16, 37]. As illustrated in Fig. 1, an OOD image (e.g., a mailbox) can be perturbed to be misclassified by the OOD detector as in-distribution (traffic sign data). Failing to detect such an adversarial OOD example can be consequential in safety-critical applications such as autonomous driving [12]. Empirically, our analysis on CIFAR-10 reveals that the false positive rate (FPR) of a competitive method, Outlier Exposure [19], can increase from 3.66% to 99.94% under adversarial attack.

Motivated by this, we take an important step towards robust OOD detection and propose a novel training framework, Adversarial Training with informative Outlier Mining (ATOM). Our key idea is to selectively utilize auxiliary outlier data to estimate a tight decision boundary between ID and OOD data, which leads to robust OOD detection performance. While recent methods [16, 19, 32, 33] have leveraged auxiliary OOD data, we show that randomly selecting outlier samples for training yields a large portion of uninformative samples that do not meaningfully improve the decision boundary between ID and OOD data (see Fig. 2). Our work demonstrates that by mining low-OOD-score data for training, one can significantly improve the robustness of an OOD detector and, somewhat surprisingly, generalize to unseen adversarial attacks.

Fig. 1.

Robust out-of-distribution detection. When deploying an image classification system (OOD detector \(G(\mathbf {x})\) + image classifier \(f(\mathbf {x})\)) in an open world, there can be multiple types of OOD examples. We consider a broad family of OOD inputs, including (a) natural OOD, (b) \(L_\infty \) OOD, (c) corruption OOD, and (d) compositional OOD. A detailed description of these OOD inputs can be found in Sect. 4.1. In (b)-(d), a perturbed OOD input (e.g., a perturbed mailbox image) can mislead the OOD detector into classifying it as an in-distribution sample. This can trigger the downstream image classifier \(f(\mathbf {x})\) to predict it as one of the in-distribution classes (e.g., speed limit 70). Through adversarial training with informative outlier mining (ATOM), our method robustifies the decision boundary of the OOD detector \(G(\mathbf {x})\), which leads to improved performance across all types of OOD inputs. Solid lines denote the actual computation flow.

We extensively evaluate ATOM on common OOD detection benchmarks, as well as a suite of adversarial OOD tasks, as illustrated in Fig. 1. ATOM achieves state-of-the-art performance, significantly outperforming competitive methods that use standard training on random outliers [19, 32, 33] or adversarial training on random outlier data [16]. On the classic OOD evaluation task (clean OOD data), ATOM achieves performance comparable to, and often better than, current state-of-the-art methods. On the \(L_\infty \) OOD evaluation task, ATOM outperforms the best baseline ACET [16] by a large margin (e.g., a 53.9% reduction in false positive rate on CIFAR-10). Moreover, our ablation study underlines the importance of having both adversarial training and outlier mining (ATOM) for achieving robust OOD detection.

Lastly, we provide theoretical analysis for ATOM, characterizing how outlier mining can better shape the decision boundary of the OOD detector. While hard negative mining has been explored in other learning domains, e.g., object detection and deep metric learning [11, 13, 38], the vast literature on OOD detection has not explored this idea. Moreover, most uses of hard negative mining are heuristic, whereas in this paper we derive precise formal guarantees and the accompanying insights. Our key contributions are summarized as follows:

  • We propose a novel training framework, Adversarial Training with informative Outlier Mining (ATOM), which facilitates efficient use of auxiliary outlier data to regularize the model for robust OOD detection.

  • We perform extensive analysis and comparison with a diverse collection of OOD detection methods using: (1) pre-trained models, (2) models trained on randomly sampled outliers, and (3) adversarial training. ATOM establishes state-of-the-art performance under a broad family of clean and adversarial OOD evaluation tasks.

  • We contribute theoretical analysis formalizing the intuition of mining informative outliers for improving the robustness of OOD detection.

  • Lastly, we provide a unified evaluation framework that allows future research to examine the robustness of OOD detection algorithms under a broad family of OOD inputs. Our code and data are released to facilitate future research on robust OOD detection: https://github.com/jfc43/informative-outlier-mining.

2 Preliminaries

We consider the setting of multi-class classification. We consider a training dataset \(\mathcal {D}_{\text {in}}^{\text {train}}\) drawn i.i.d. from a data distribution \(P_{\boldsymbol{X},Y}\), where \(\boldsymbol{X}\) is the sample space and \({Y} = \{1,2,\cdots ,K \}\) is the set of labels. In addition, we have an auxiliary outlier dataset \(\mathcal {D}_{\text {out}}^{\text {auxiliary}}\) drawn from a distribution \(U_\mathbf {X}\). The use of auxiliary outliers helps regularize the model for OOD detection, as shown in several recent works [16, 25, 29, 32, 33].

Robust Out-of-Distribution Detection. The goal is to learn a detector \(G:\mathbf {x} \rightarrow \{-1, 1\}\), which outputs 1 for an in-distribution example \(\mathbf {x}\) and outputs \(-1\) for a clean or perturbed OOD example \(\mathbf {x}\). Formally, let \(\Omega (\mathbf {x})\) be a set of small perturbations of an OOD example \(\mathbf {x}\). The detector is evaluated on \(\mathbf {x}\) from \(P_{\mathbf {X}}\) and on the worst-case input inside \(\Omega (\mathbf {x})\) for an OOD example \(\mathbf {x}\) from \(Q_{\mathbf {X}}\). The false negative rate (FNR) and false positive rate (FPR) are defined as:

$$\begin{aligned} \text {FNR}(G) = \mathbb {E}_{\mathbf {x}\sim P_{\mathbf {X}}} \mathbb {I}[G(\mathbf {x}) = -1], \quad \text {FPR}(G; Q_{\mathbf {X}}, \Omega ) = \mathbb {E}_{\mathbf {x}\sim Q_{\mathbf {X}}} \max _{\delta \in \Omega (\mathbf {x})} \mathbb {I}[G(\mathbf {x}+\delta )=1]. \end{aligned}$$

Remark. Note that the test-time OOD distribution \(Q_{\mathbf {X}}\) is unknown and can be different from \(U_\mathbf {X}\). The difference between the auxiliary data \(U_{\mathbf {X}}\) and the test OOD data \(Q_{\mathbf {X}}\) raises the fundamental question of how to effectively leverage \(\mathcal {D}_{\text {out}}^{\text {auxiliary}}\) for learning the decision boundary between ID and OOD data. For terminology clarity, we refer to training OOD examples as outliers, and exclusively use OOD data to refer to test-time anomalous inputs.

3 Method

In this section, we introduce Adversarial Training with informative Outlier Mining (ATOM). We first present our method overview, and then describe details of the training objective with informative outlier mining.

Method Overview: A Conceptual Example. We use the term outlier mining to denote the process of selecting informative outlier training samples from the pool of auxiliary outlier data. We illustrate our idea with a toy example in Fig. 2, where the in-distribution data consists of class-conditional Gaussians. Outlier training data is sampled from a uniform distribution outside the support of the in-distribution data. Without outlier mining (left), we mostly sample "easy" outliers, and the learned decision boundary of the OOD detector can be loose. In contrast, with outlier mining (right), we selectively train on outliers close to the decision boundary between ID and OOD data, which improves OOD detection. This is particularly important for robust OOD detection, where the boundary needs to have a margin from the OOD data so that even adversarial perturbations (red) cannot move OOD data points across the boundary. We proceed by describing the training mechanism that realizes this idea, and provide formal theoretical guarantees in Sect. 5.

Fig. 2.

A toy example in 2D space for illustration of informative outlier mining. With informative outlier mining, we can tighten the decision boundary and build a robust OOD detector.
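To make the toy example concrete, the following is a minimal 2D sketch (with hypothetical parameters, not the paper's exact configuration): in-distribution data from two class-conditional Gaussians, auxiliary outliers from a uniform box outside the ID support, and a mining step that keeps only the outliers whose OOD scores are lowest, i.e., those closest to the ID/OOD boundary.

```python
import numpy as np

rng = np.random.default_rng(0)

# In-distribution data: two class-conditional Gaussians in 2D.
id_data = np.concatenate([
    rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(500, 2)),
    rng.normal(loc=[+2.0, 0.0], scale=0.5, size=(500, 2)),
])

# Auxiliary outliers: uniform over a large box outside the ID support.
outliers = rng.uniform(low=-8.0, high=8.0, size=(5000, 2))

# A crude stand-in for the OOD score F(x)_{K+1}: distance to the nearest ID point.
def ood_score(x, id_points):
    return np.min(np.linalg.norm(id_points[None, :, :] - x[:, None, :], axis=-1), axis=1)

scores = ood_score(outliers, id_data)

# Outlier mining: sort by OOD score (ascending) and keep n points starting at the qN-th,
# i.e., hard outliers near the boundary rather than easy ones far away.
q, n = 0.125, 1000
order = np.argsort(scores)
start = int(q * len(outliers))
mined = outliers[order[start:start + n]]
```

Training on `mined` rather than on the random `outliers` pool is what tightens the boundary in the right panel of Fig. 2.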

3.1 ATOM: Adversarial Training with Informative Outlier Mining

Training Objective. Training uses a mixture of ID data and outlier samples. Specifically, we consider a \((K+1)\)-way classifier network f, where the \((K+1)\)-th class label indicates the out-of-distribution class. Denote by \(F_\theta (\mathbf {x})\) the softmax output of f on \(\mathbf {x}\). The robust training objective is given by

$$\begin{aligned} \mathop {\mathrm {minimize\quad }}_\theta \mathbb {E}_{(\mathbf {x},y)\sim \mathcal {D}_{\text {in}}^{\text {train}}} [\ell (\mathbf {x}, y; F_\theta )] + \lambda \cdot \mathbb {E}_{\mathbf {x} \sim \mathcal {D}_{\text {out}}^{\text {train}}} \max _{\mathbf {x}' \in \Omega _{\infty , \epsilon }(\mathbf {x})} [\ell (\mathbf {x}', K+1; F_\theta )] \end{aligned}$$
(1)

where \(\ell \) is the cross-entropy loss, and \(\mathcal {D}_{\text {out}}^{\text {train}}\) is the OOD training dataset. We use Projected Gradient Descent (PGD) [30] to solve the inner max of the objective, and apply it to half of each minibatch while keeping the other half clean to ensure performance on both clean and perturbed data.
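For concreteness, below is a minimal PyTorch sketch of solving the inner maximization of Eq. (1) with PGD: the outlier batch is perturbed within an \(L_\infty \) ball to maximize the cross-entropy loss of the \((K+1)\)-th class. The model handle, step size, and number of steps are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def pgd_outlier_attack(model, x_out, num_classes_K, eps=8/255, alpha=2/255, steps=5):
    """Approximate max_{x' in Omega_{inf,eps}(x)} ell(x', K+1; F_theta) for a batch of outliers."""
    # With 0-based indexing, the (K+1)-th class has index K.
    target = torch.full((x_out.size(0),), num_classes_K, dtype=torch.long, device=x_out.device)
    x_adv = (x_out + torch.empty_like(x_out).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                            # ascend the loss
            x_adv = torch.min(torch.max(x_adv, x_out - eps), x_out + eps)  # project onto the L_inf ball
            x_adv = x_adv.clamp(0.0, 1.0)                                  # keep valid pixel range
    return x_adv.detach()
```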

Once trained, the OOD detector \(G(\mathbf {x})\) can be constructed by:

$$\begin{aligned} G(\mathbf {x}) = {\left\{ \begin{array}{ll} -1 &{} \quad \text {if } F(\mathbf {x})_{K+1} \ge \gamma , \\ 1 &{} \quad \text {if } F(\mathbf {x})_{K+1} < \gamma , \end{array}\right. } \end{aligned}$$
(2)

where \(\gamma \) is a threshold, which in practice can be chosen on in-distribution data so that a high fraction of test examples are correctly classified by G. We call \(F(\mathbf {x})_{K+1}\) the OOD score of \(\mathbf {x}\). For an input labeled as in-distribution by G, one can obtain its semantic label using \(\hat{F}(\mathbf {x})\):

$$\begin{aligned} \hat{F}(\mathbf {x}) = \mathop {\mathrm {arg \, max}}_{y \in \{1,2,\cdots , K\}} F(\mathbf {x})_y \end{aligned}$$
(3)
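A minimal sketch of Eqs. (2)-(3), assuming `softmax_out` holds the \((K+1)\)-way softmax outputs \(F(\mathbf {x})\) for a batch (the helper name and shapes are illustrative):

```python
import numpy as np

def detect_and_classify(softmax_out, gamma):
    """softmax_out: array of shape (batch, K+1). Returns (G(x), semantic label or -1)."""
    ood_score = softmax_out[:, -1]                  # F(x)_{K+1}, the OOD score
    g = np.where(ood_score >= gamma, -1, 1)         # Eq. (2): -1 = OOD, 1 = in-distribution
    y_hat = np.argmax(softmax_out[:, :-1], axis=1)  # Eq. (3): argmax over the K ID classes
    return g, np.where(g == 1, y_hat, -1)           # no semantic label for rejected inputs
```

In practice, \(\gamma \) can be set as, e.g., the 95th percentile of the OOD scores on in-distribution validation data, so that 95% of in-distribution examples are accepted.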

Informative Outlier Mining. We propose to adaptively choose OOD training examples that the detector is uncertain about. Specifically, during each training epoch, we randomly sample N data points from the auxiliary OOD dataset \(\mathcal {D}_{\text {out}}^{\text {auxiliary}}\) and use the current model to infer their OOD scores. Next, we sort the data points according to their OOD scores and select a subset of \(n<N\) data points, starting from the \(qN\)-th data point in the sorted list. We then use the selected samples as the OOD training data \(\mathcal {D}_{\text {out}}^{\text {train}}\) for the next epoch of training. Intuitively, q determines the informativeness of the sampled points w.r.t. the OOD detector: the larger q is, the less informative the sampled examples become. Note that informative outlier mining is performed on (non-adversarial) auxiliary OOD data. Selected examples are then used in the robust training objective (1).
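The mining step itself can be sketched as follows, assuming a helper `ood_scores(model, x)` that returns \(F(\mathbf {x})_{K+1}\) for a batch; this illustrates the selection rule rather than reproducing the paper's exact code.

```python
import numpy as np

def mine_outliers(model, aux_pool, N, n, q, ood_scores, rng=None):
    rng = rng or np.random.default_rng(0)
    # 1) Randomly sample N candidate outliers from the auxiliary pool.
    candidates = aux_pool[rng.choice(len(aux_pool), size=N, replace=False)]
    # 2) Score them with the current model (lower OOD score = harder, more ID-like).
    scores = ood_scores(model, candidates)
    # 3) Sort ascending and take n points starting at position qN, skipping the very
    #    hardest qN candidates (often near-duplicates of in-distribution data).
    order = np.argsort(scores)
    start = int(q * N)
    return candidates[order[start:start + n]]  # used as D_out^train for the next epoch
```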

Algorithm 1. ATOM: Adversarial Training with informative Outlier Mining.

We provide the complete training algorithm using informative outlier mining in Algorithm 1. Importantly, the use of informative outlier mining highlights the key difference between ATOM and previous work using randomly sampled outliers [16, 19, 32, 33].

4 Experiments

In this section, we describe our experimental setup and show that ATOM can substantially improve OOD detection performance on both clean OOD data and adversarially perturbed OOD inputs. We also conduct extensive ablation analyses to explore different aspects of our algorithm.

4.1 Setup

In-Distribution Datasets. We use the CIFAR-10 and CIFAR-100 [22] datasets as in-distribution datasets. We also show results on SVHN in Appendix B.8.

Auxiliary OOD Datasets. By default, we use 80 Million Tiny Images (TinyImages) [45] as \(\mathcal {D}_{\text {out}}^{\text {auxiliary}}\), which is a common setting in prior works. We also use ImageNet-RC, a variant of ImageNet [7] as an alternative auxiliary OOD dataset.

Out-of-Distribution Datasets. For OOD test datasets, we follow the common setup in the literature and use six diverse datasets: SVHN, Textures [8], Places365 [53], LSUN (crop), LSUN (resize) [50], and iSUN [49].

Hyperparameters. The hyperparameter q is chosen on a separate validation set from TinyImages, which is different from the test-time OOD data (see Appendix B.9). Based on the validation, we set \(q=0.125\) for CIFAR-10 and \(q=0.5\) for CIFAR-100. For all experiments, we set \(\lambda =1\). For CIFAR-10 and CIFAR-100, we set \(N=400,000\) and \(n=100,000\). More details about the experimental setup are in Appendix B.1.

Robust OOD Evaluation Tasks. We consider the following family of OOD inputs, for which we provide details and visualizations in Appendix B.5:

  • Natural OOD: This is equivalent to the classic OOD evaluation with clean OOD inputs \(\mathbf {x}\) and no perturbation.

  • \(L_\infty \) attacked OOD (white-box): We consider small \(L_\infty \)-norm bounded perturbations on an OOD input \(\mathbf {x}\) [1, 30], which induce the model to produce a high confidence score (or a low OOD score) for \(\mathbf {x}\). We denote the adversarial perturbations by \(\Omega _{\infty , \epsilon }(\mathbf {x})\), where \(\epsilon \) is the adversarial budget. We provide attack algorithms for all eight OOD detection methods in Appendix B.4.

  • Corruption attacked OOD (black-box): We consider a more realistic type of attack based on common corruptions [17], which could appear naturally in the physical world. For each OOD image, we generate 75 corrupted variants (15 corruption types \(\times \) 5 severity levels), and then select the one with the lowest OOD score (a minimal sketch of this selection follows the list).

  • Compositionally attacked OOD (white-box): Lastly, we consider applying \(L_\infty \)-norm bounded attack and corruption attack jointly to an OOD input \(\mathbf {x}\), as considered in [23].
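Below is a minimal sketch of the corruption attack selection described above. `corrupt(image, name, severity)` and `ood_score(model, image)` are hypothetical helpers standing in for a common-corruptions library and the detector's OOD score \(F(\mathbf {x})_{K+1}\); they are not APIs defined in this paper.

```python
def corruption_attack(model, image, corruption_names, corrupt, ood_score):
    """Black-box attack: among 15 corruption types x 5 severities, return the corrupted
    version of `image` with the lowest OOD score (i.e., the most ID-looking one)."""
    best_img, best_score = image, ood_score(model, image)
    for name in corruption_names:          # 15 corruption types
        for severity in range(1, 6):       # 5 severity levels
            candidate = corrupt(image, name, severity)
            score = ood_score(model, candidate)
            if score < best_score:
                best_img, best_score = candidate, score
    return best_img
```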

Evaluation Metrics. We measure the following metrics: the false positive rate (FPR) at 5% false negative rate (FNR), and the area under the receiver operating characteristic curve (AUROC).
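A minimal sketch of these two metrics, assuming `id_scores` and `ood_scores` are arrays of OOD scores \(F(\mathbf {x})_{K+1}\) on in-distribution and (possibly attacked) OOD test data, with in-distribution treated as the positive class:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def fpr_at_5_fnr(id_scores, ood_scores):
    # Choose gamma so that 95% of in-distribution samples are accepted (FNR = 5%),
    # then report the fraction of OOD samples that are (wrongly) accepted.
    gamma = np.quantile(id_scores, 0.95)
    return np.mean(ood_scores < gamma)

def auroc(id_scores, ood_scores):
    labels = np.concatenate([np.ones_like(id_scores), np.zeros_like(ood_scores)])  # 1 = ID
    # Negate the OOD scores so that larger values correspond to the positive (ID) class.
    return roc_auc_score(labels, -np.concatenate([id_scores, ood_scores]))
```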

Table 1. Comparison with competitive OOD detection methods. We use DenseNet as network architecture for all methods. We evaluate on four types of OOD inputs: (1) natural OOD, (2) corruption attacked OOD, (3) \(L_\infty \) attacked OOD, and (4) compositionally attacked OOD inputs. The description of these OOD inputs can be found in Sect. 4.1. \(\uparrow \) indicates larger value is better, and \(\downarrow \) indicates lower value is better. All values are percentages and are averaged over six different OOD test datasets described in Sect. 4.1. Bold numbers are superior results. Results on additional in-distribution dataset SVHN are provided in Appendix B.8. Results on a different architecture, WideResNet, are provided in Appendix B.12.

4.2 Results

ATOM vs. Existing Methods. We show in Table 1 that ATOM outperforms competitive OOD detection methods on both classic and adversarial OOD evaluation tasks. There are several salient observations. First, on the classic OOD evaluation task (clean OOD data), ATOM achieves performance comparable to, and often better than, the current state-of-the-art methods. Second, on the existing adversarial OOD evaluation task, \(L_\infty \) OOD, ATOM outperforms the current state-of-the-art method ACET [16] by a large margin (e.g., on CIFAR-10, our method outperforms ACET by 53.9% measured by FPR). Third, while ACET is somewhat brittle under the new corruption OOD evaluation task, our method generalizes surprisingly well to the unknown corruption attacked OOD inputs, outperforming the best baseline by a large margin (e.g., on CIFAR-10, by up to 30.99% measured by FPR). Finally, while almost every method fails under the hardest compositional OOD evaluation task, our method still achieves impressive results (e.g., on CIFAR-10, ATOM reduces the FPR by 57.99%). This performance is noteworthy since our method is not trained explicitly on corrupted OOD inputs. Our training method leads to improved OOD detection while preserving classification performance on in-distribution data (see Appendix B.14). Consistent performance improvements are observed on alternative in-distribution datasets (SVHN and CIFAR-100), an alternative network architecture (WideResNet, Appendix B.12), and an alternative auxiliary dataset (ImageNet-RC, see Appendix B.11).

Adversarial Training Alone is not Able to Achieve Strong OOD Robustness. We perform an ablation study that isolates the effect of outlier mining. In particular, we use the same training objective as in Eq. (1), but with randomly sampled outliers. The results in Table 2 show that AT (no outlier mining) is in general less robust. For example, under \(L_\infty \) OOD, outlier mining reduces the FPR by 23.76% and 31.61% on CIFAR-10 and CIFAR-100, respectively. This validates the importance of outlier mining for robust OOD detection, which provably improves the decision boundary as we will show in Sect. 5.

Effect of Adversarial Training. We perform an ablation study that isolates the effect of adversarial training. In particular, we consider the following objective without adversarial training:

$$\begin{aligned} \mathop {\mathrm {minimize\quad }}_\theta \mathbb {E}_{(\mathbf {x},y)\sim \mathcal {D}_{\text {in}}^{\text {train}}} [\ell (\mathbf {x}, y; \hat{F}_\theta )] + \lambda \cdot \mathbb {E}_{\mathbf {x} \sim \mathcal {D}_{\text {out}}^{\text {train}}} [\ell (\mathbf {x}, K+1; \hat{F}_\theta )], \end{aligned}$$
(4)

which we name Natural Training with informative Outlier Mining (NTOM). In Table 2, we show that NTOM achieves performance comparable to ATOM on natural OOD and corruption OOD inputs. However, NTOM is substantially less robust under \(L_\infty \) OOD (its FPR is 79.35% higher than ATOM's on CIFAR-10) and under compositional OOD inputs. This underlines the importance of having both adversarial training and outlier mining (ATOM) for overall good performance, particularly on the robust OOD evaluation tasks.

Table 2. Ablation on ATOM training objective. We use DenseNet as network architecture. \(\uparrow \) indicates larger value is better, and \(\downarrow \) indicates lower value is better. All values are percentages and are averaged over six different OOD test datasets described in Sect. 4.1.

Effect of Sampling Parameter q. Table 3 shows the performance for different values of the sampling parameter q. For all three datasets, training on auxiliary outliers with large OOD scores (i.e., overly easy examples with \(q=0.75\)) worsens the performance, which suggests the necessity of including examples on which the OOD detector is uncertain. Interestingly, in the setting where the in-distribution data and the auxiliary OOD data are disjoint (e.g., SVHN/TinyImages), \(q=0\) is optimal, which suggests that the hardest outliers are the most useful for training. However, in a more realistic setting, the auxiliary OOD data almost always contains data similar to the in-distribution data (e.g., CIFAR/TinyImages). Even without exhaustively removing near-duplicates, ATOM can adaptively avoid training on those near-duplicates of in-distribution data (e.g., by using \(q=0.125\) for CIFAR-10 and \(q=0.5\) for CIFAR-100).

Ablation on a Different Auxiliary Dataset. To see the effect of the auxiliary dataset, we additionally experiment with ImageNet-RC as an alternative. We observe consistent improvements with ATOM, in many cases with performance better than using TinyImages. For example, on CIFAR-100, the FPR under natural OOD inputs is reduced from 32.30% (w/ TinyImages) to 15.49% (w/ ImageNet-RC). Interestingly, for all three in-distribution datasets, using \(q=0\) (hardest outliers) yields the optimal performance, since there are substantially fewer near-duplicates between ImageNet-RC and the in-distribution data. This ablation suggests that ATOM's success does not depend on a particular auxiliary dataset. Full results are provided in Appendix B.11.

Table 3. Ablation study on q. We use DenseNet as network architecture. \(\uparrow \) indicates larger value is better, and \(\downarrow \) indicates lower value is better. All values are percentages and are averaged over six natural OOD test datasets mentioned in Sect. 4.1. Note: the hyperparameter q is chosen on a separate validation set, which is different from test-time OOD data. See Appendix B.9 for details.

5 Theoretical Analysis

In this section, we provide theoretical insight on mining informative outliers for robust OOD detection. We proceed with a brief summary of our key results.

Results Overview. At a high level, our analysis provides two important insights. First, we show that with informative auxiliary OOD data, less in-distribution data is needed to build a robust OOD detector. Second, we show that outlier mining achieves a robust OOD detector in the more realistic case where the auxiliary OOD data contains many outliers that are far from the decision boundary (and thus uninformative) and may also contain some in-distribution data. These two insights are important for building a robust OOD detector in practice, particularly because labeled in-distribution data is expensive to obtain while auxiliary outlier data is relatively cheap to collect. By performing outlier mining, one can effectively reduce the sample complexity while achieving strong robustness. We provide the main results and intuition here and refer readers to Appendix A for the details and the proofs.

5.1 Setup

Data Model. To establish formal guarantees, we use a Gaussian \(\mathcal {N}(\mu , \sigma ^2 I)\) to model the in-distribution \(P_{\mathbf {X}}\), while the test OOD distribution can be any distribution largely supported outside a ball around \(\mu \). We consider robust OOD detection under adversarial perturbations with bounded \(\ell _\infty \) norm, i.e., perturbations \(\Vert \delta \Vert _\infty \le \epsilon \). Given \(\mu \in \mathbb {R}^d\), \(\sigma > 0\), \(\gamma \in (0, \sqrt{d})\), and \(\epsilon _\tau > 0\), we consider the following data model:

  • \(P_{\mathbf {X}}\) (in-distribution data) is \(\mathcal {N}(\mu , \sigma ^2 I)\). The in-distribution data \(\{\mathbf {x}_i\}_{i=1}^n\) is drawn from \(P_{\mathbf {X}}\).

  • \(Q_{\mathbf {X}}\) (out-of-distribution data) can be any distribution from the family \(\mathcal {Q}= \{Q_{\mathbf {X}}: \Pr _{\mathbf {x} \sim Q_{\mathbf {X}}}[ \Vert \mathbf {x} - \mu \Vert _2 \le \tau ] \le \epsilon _\tau \}\), where \(\tau = \sigma \sqrt{d} + \sigma \gamma + \epsilon \sqrt{d}\).

  • Hypothesis class of OOD detector: \(\mathcal {G} = \{G_{u,r}(\mathbf {x}): G_{u,r}(\mathbf {x}) = 2\cdot \mathbb {I}[\Vert \mathbf {x} - u\Vert _2 \le r]-1, u\in \mathbb {R}^d, r \in \mathbb {R}_+ \}\).

Here, \(\gamma \) is a parameter indicating the margin between the in-distribution and OOD data, and \(\epsilon _\tau \) is a small number bounding the probability mass the OOD distribution can have close to the in-distribution.

Metrics. For a detector G, we are interested in the False Negative Rate \(\mathrm {FNR}(G)\) and the worst False Positive Rate \(\sup _{Q_{\mathbf {X}}\in \mathcal {Q}}\mathrm {FPR}(G;Q_{\mathbf {X}}, \Omega _{\infty ,\epsilon }(\mathbf {x}))\) over all the test OOD distributions \(\mathcal {Q}\) under \(\ell _{\infty }\) perturbations of magnitude \(\epsilon \). For simplicity, we denote them as \(\mathrm {FNR}(G)\) and \(\mathrm {FPR}(G; \mathcal {Q})\).

While the Gaussian data model is simpler than practical data distributions, its simplicity is desirable for the purpose of demonstrating our insights. Finally, the analysis can be generalized to mixtures of Gaussians, which better model real-world data.

5.2 Learning with Informative Auxiliary Data

We show that informative auxiliary outliers can reduce the sample complexity for in-distribution data. Note that learning a robust detector requires estimating \(\mu \) to within distance \(\gamma \sigma \), which needs \({\tilde{\Theta }}(d/\gamma ^2)\) in-distribution samples; for example, one can compute a robust detector by:

$$\begin{aligned} u = \bar{\mathbf {x}} = \frac{1}{n} \sum _{i=1}^n \mathbf {x}_i, \quad r = (1 + \gamma /4\sqrt{d}) \hat{\sigma }, \end{aligned}$$
(5)

where \(\hat{\sigma }^2 = \frac{1}{n} \sum _{i=1}^n \Vert {\mathbf {x}}_i - \bar{\mathbf {x}} \Vert _2^2\). We then show that with informative auxiliary data, much less in-distribution data is needed for learning. We model the auxiliary data \(U_{\mathbf {X}}\) as a distribution over the sphere \(\{\mathbf {x}: \Vert \mathbf {x} - \mu \Vert _2^2 = \sigma ^2_o d\}\) for \(\sigma _o > \sigma \), and assume its density is at least \(\eta \) times that of the uniform distribution on the sphere for some constant \(\eta > 0\), i.e., it surrounds the boundary of \(P_{\mathbf {X}}\). Given \(\{\mathbf {x}_i\}_{i=1}^n\) from \(P_{\mathbf {X}}\) and \(\{\tilde{\mathbf {x}}_i\}_{i=1}^{n'}\) from \(U_{\mathbf {X}}\), a natural idea is to compute \(\bar{\mathbf {x}}\) and r as above as an intermediate solution, and then refine it to have small errors on the auxiliary data under perturbation, i.e., find u by minimizing a natural "margin loss":

$$\begin{aligned} u = \mathop {\mathrm {arg \, min}}_{p: \Vert p - \bar{\mathbf {x}} \Vert _2 \le s}&\frac{1}{n'} \sum _{i=1}^{n'} \max _{\Vert \delta \Vert _\infty \le \epsilon } \mathbb {I}\left[ \Vert \tilde{\mathbf {x}}_i + \delta - p\Vert _2 < t \right] \end{aligned}$$
(6)

where \(s, t\) are hyper-parameters to be chosen. We show that \({\tilde{O}}(d/\gamma ^4)\) in-distribution data together with sufficient auxiliary data suffice to give a robust detector. See the proof in Appendix A.2.
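To make the construction concrete, here is a minimal numpy sketch of the intermediate detector of Eq. (5) and of evaluating the empirical margin loss of Eq. (6) for a candidate center p. Dimensions and parameter values are illustrative assumptions; the worst-case indicator is computed in closed form for the ball detector.

```python
import numpy as np

def intermediate_detector(x_id, gamma):
    """Eq. (5): center = sample mean, radius = (1 + gamma / (4 sqrt(d))) * sigma_hat."""
    n, d = x_id.shape
    x_bar = x_id.mean(axis=0)
    sigma_hat = np.sqrt(np.mean(np.sum((x_id - x_bar) ** 2, axis=1)))
    r = (1.0 + gamma / (4.0 * np.sqrt(d))) * sigma_hat
    return x_bar, r

def margin_loss(p, x_aux, t, eps):
    """Objective of Eq. (6): fraction of auxiliary points that some L_inf perturbation of
    size eps can push inside the ball of radius t around p. For a ball detector the worst
    case is explicit: shrink each coordinate of (x - p) toward zero by eps."""
    diff = x_aux - p
    closest = np.sign(diff) * np.maximum(np.abs(diff) - eps, 0.0)
    return np.mean(np.linalg.norm(closest, axis=1) < t)
```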

5.3 Learning with Informative Outlier Mining

In this subsection, we consider a more realistic data distribution where the auxiliary data can contain non-informative outliers (far away from the boundary) and, in some cases, is mixed with in-distribution data. The non-informative outliers may not provide useful information for statistically distinguishing a good OOD detector, which motivates the need for outlier mining.

Uninformative Outliers can Lead to Bad Detectors. To formalize, we model the non-informative ("easy" outlier) data as \(Q_q = \mathcal {N}(0, \sigma _q^2 I )\), where \(\sigma _q\) is large enough to ensure they are obvious outliers. The auxiliary data distribution \(U_{\mathrm {mix}}\) is then a mixture of \(U_{\mathbf {X}}\), \(Q_q\), and \(P_{\mathbf {X}}\), where \(Q_q\) has a large weight. Formally, \(U_{\mathrm {mix}}= \nu U_{\mathbf {X}}+ (1-2\nu ) Q_q + \nu P_{\mathbf {X}}\) for a small \(\nu \in (0,1)\). Then the previous learning rule fails: robust detectors (with u within distance \(O(\sigma \gamma )\) of \(\mu \)) and bad ones (with u far away from \(\mu \)) cannot be distinguished. Only a small fraction of the auxiliary data, those from \(U_{\mathbf {X}}\), can distinguish the good and bad detectors, while the majority (those from \(Q_q\)) do not differentiate them, and some (those from \(P_{\mathbf {X}}\)) can even penalize the good ones and favor the bad ones.

Informative Outlier Mining Improves the Detector with Reduced Sample Complexity. The above failure case suggests that a more sophisticated method is needed. Below we show that outlier mining can identify informative data and improve the learning performance: it removes most of the data outside \(U_{\mathbf {X}}\) while keeping the data from \(U_{\mathbf {X}}\), so the previous method works after outlier mining. We first use the in-distribution data to get an intermediate solution \(\bar{\mathbf {x}}\) and r by Eq. (5). Then, we use a simple thresholding mechanism to pick only points close to the decision boundary of the intermediate solution, which removes non-informative outliers. Specifically, we select only outliers with mild "confidence scores" w.r.t. the intermediate solution, i.e., those whose distances to \(\bar{\mathbf {x}}\) fall in some interval [a, b]:

$$\begin{aligned} S := \{ i : \Vert \tilde{\mathbf {x}}_i - \bar{\mathbf {x}} \Vert _2 \in [a, b] , 1 \le i \le n'\} \end{aligned}$$
(7)

The final solution \(u_{\mathrm {om}}\) is obtained by solving Eq. (6) on only S instead of all auxiliary data. We can prove:

Proposition 1

(Error bound with outlier mining). Suppose \( \sigma ^2\gamma ^2 \ge C \epsilon \sigma _o d\) and \(\sigma \sqrt{d} + C \sigma \gamma ^2< \sigma _o \sqrt{d} < C \sigma \sqrt{d}\) for a sufficiently large constant C, and \(\sigma _q \sqrt{d} > 2(\sigma _o\sqrt{d} + \Vert \mu \Vert _2)\). For some absolute constant c and any \(\alpha \in (0,1)\), if the number of in-distribution data points satisfies \(n \ge \frac{C d}{\gamma ^4} \log \frac{1}{\alpha } \) and the number of auxiliary data points satisfies \(n' \ge \frac{\exp (C \gamma ^4)}{\nu ^2\eta ^2}\log \frac{d\sigma }{\alpha }\), then there exist parameter values \(s, t, a, b\) such that with probability \(\ge 1 - \alpha \), the detector \(G_{u_{\mathrm {om}}, r}\) computed above satisfies:

$$\begin{aligned} \mathrm {FNR}(G_{u_{\mathrm {om}},r}) \le \exp (-c \gamma ^2), \quad \mathrm {FPR}(G_{u_{\mathrm {om}},r}; \mathcal {Q}) \le \epsilon _\tau . \end{aligned}$$

This means that even in the presence of a large amount of uninformative or even harmful auxiliary data, we can successfully learn a good detector. Furthermore, this can reduce the sample size n by a factor of \(\gamma ^2\). For example, when \(\gamma = \Theta (d^{1/8})\), we only need \(n = {\tilde{\Theta }}(\sqrt{d})\), while in the case without auxiliary data, we need \(n = {\tilde{\Theta }}(d^{3/4})\).

Remark. We note that when \(U_{\mathbf {X}}\) is as ideal as the uniform distribution over the sphere (i.e., \(\eta =1\)), we can let u be the average of the points in S after mining, which requires only \(n' = {\tilde{\Theta }}(d/(\nu ^2\gamma ^2))\) auxiliary data, much less than for more general \(\eta \). We also note that our analysis and results hold for many other auxiliary data distributions \(U_{\mathrm {mix}}\); the particular \(U_{\mathrm {mix}}\) used here is chosen for ease of explanation. See Appendix A for more discussion.
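To make the mining step of Eq. (7) concrete, the following is a minimal numpy sketch that selects the set S of auxiliary points whose distance to the intermediate center falls in the interval [a, b]; the interval endpoints are hyper-parameters, and any values used with this sketch would be illustrative assumptions.

```python
import numpy as np

def mine_interval(x_aux, x_bar, a, b):
    """Eq. (7): keep auxiliary points with mild 'confidence scores', i.e., whose distance
    to the intermediate center x_bar lies in [a, b]. Far-away easy outliers (from Q_q)
    and ID-like points (from P_X) are discarded."""
    dist = np.linalg.norm(x_aux - x_bar, axis=1)
    keep = (dist >= a) & (dist <= b)
    return x_aux[keep]
```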

6 Related Work

OOD Detection.  [18] introduced a baseline for OOD detection using the maximum softmax probability from a pre-trained network. Subsequent works improve the OOD uncertainty estimation by using deep ensembles [24], the calibrated softmax score [27], the Mahalanobis distance-based confidence score [26], as well as the energy score [29]. Some methods regularize the model with auxiliary anomalous data that were either realistic [19, 33, 35] or artificially generated by GANs [25]. Several other works [3, 31, 41] also explored regularizing the model to produce lower confidence for anomalous examples. Recent works have also studied the computational efficiency aspect of OOD detection [28] and large-scale OOD detection on ImageNet [21].

Robustness of OOD Detection. Worst-case aspects of OOD detection have been studied in [16, 37]. However, these papers are primarily concerned with \(L_\infty \) norm bounded adversarial attacks, while our evaluation also includes common image corruption attacks. Moreover, [16, 32] only evaluate the adversarial robustness of OOD detection on random noise images, while we also evaluate it on natural OOD images. [32] showed the first provable guarantees for worst-case OOD detection on some balls around uniform noise, and [5] studied provable guarantees for worst-case OOD detection not only for noise but also for images from related but different image classification tasks. Our paper proposes ATOM, which achieves state-of-the-art performance on a broader family of clean and perturbed OOD inputs. The key difference compared to prior work is the informative outlier mining technique, which can significantly improve the generalization and robustness of OOD detection.

Adversarial Robustness. Adversarial examples [4, 14, 36, 44] have received considerable attention in recent years. Many defense methods have been proposed to mitigate this problem. One of the most effective methods is adversarial training [30], which uses robust optimization techniques to render deep learning models resistant to adversarial attacks. [6, 34, 46, 52] showed that unlabeled data could improve adversarial robustness for classification.

Hard Example Mining. Hard example mining was introduced in [43] for training face detection models, where the set of background examples is gradually grown by selecting those examples for which the detector triggers a false alarm. The idea has been used extensively in the object detection literature [11, 13, 38], as well as in deep metric learning [9, 15, 39, 42, 47] and deep embedding learning [10, 40, 48, 51]. Although hard example mining has been used in various learning domains, to the best of our knowledge, we are the first to explore it for improving the robustness of out-of-distribution detection.

7 Conclusion

In this paper, we propose Adversarial Training with informative Outlier Mining (ATOM), a method that enhances the robustness of OOD detectors. We show the merit of adaptively selecting OOD training examples that the OOD detector is uncertain about. Extensive experiments show that ATOM can significantly improve the decision boundary of the OOD detector, achieving state-of-the-art performance under a broad family of clean and perturbed OOD evaluation tasks. We also provide a theoretical analysis that justifies the benefits of outlier mining. Further, our unified evaluation framework allows future research to examine the robustness of OOD detectors. We hope our research can raise more attention to a broader view of robustness in out-of-distribution detection.