
1 Introduction

The predictive performance of learning algorithms, especially neural networks, depends heavily on the amount of data they are trained with. At the same time, privacy regulations aimed at hiding individuals' sensitive information hinder the application of machine learning tools to heterogeneous multi-center data. Since it is not our objective to argue about the benefits of these privacy regulations, we strive for methods that allow publishing sensitive data while maintaining individual privacy. While such methods are trivial to implement for categorical data (e.g. a database with entries for sex, age, gender, etc.), complex data such as images pose a difficult challenge. In contrast to categorical data, images obtain their meaning from the spatial relationship of individual pixels. Perturbing pixels by adding random noise would not prevent a human or a machine observer from re-identifying the image's content; recognizing people by their face is the most obvious example. Older techniques rely on blurring or pixelating people's faces, e.g. in Google Street View [11].

Training machine learning models on such samples would tremendously decrease their predictive performance, because a great deal of features is lost in the process and never seen by the model (see Fig. 1). This is of utmost importance in the medical domain, where we must ensure the model learns from valid features for detecting pathologies.

Fig. 1.

Example of face anonymization with Differential Privacy [17]. Compared to conventional approaches based on noise (a), blur (b), and mosaic (d), our content-aware approach (e)–(g) changes the identity of the image. For \(\epsilon =10\) (e) one can still see strong similarities between reconstruction and ground truth, e.g. the lock of hair on the forehead. For small \(\epsilon \) the similarity decreases, as desired, to prevent re-identification. However, if the subsequent task were to classify the eye color, this would still be possible with the CADP results in (e)–(g), since we can condition the transformation and therefore leave important aspects unaltered.

Fig. 2.

Content-aware differential privacy (CADP) pipeline. After training the INN to convergence, we feed each sample \(\textbf{x}\) with the corresponding condition \(\textbf{c}(\textbf{y})\) to obtain its latent representation \(\textbf{z}\). After clipping its \(L_1\)-norm to the desired sensitivity s, Laplacian-distributed noise \(\textrm{Lap}(0,s/\epsilon )\) is added to obtain \(\epsilon \)-DP. The perturbed \(\tilde{\textbf{z}}\) is fed through the network in reverse to obtain the differentially private image \(\tilde{\textbf{x}}\).

We hypothesize that the tools of machine learning itself, namely neural networks based on Normalizing Flows (NFs), known as Invertible Neural Networks (INNs), can be used to address the privacy issue when dealing with images, and with medical images in particular [2]. Our contribution is three-fold:

  • First, we provide mathematically grounded evidence that INNs are a valuable tool for obtaining \(\epsilon \)-differentially private images that exhibit all features of natural images (e.g. sharpness or authenticity). Here, \(\epsilon \) quantifies the probability of data leakage: the lower \(\epsilon \), the more privacy is guaranteed.

  • Second, by conditioning our network on meta-data provided in conjunction with the dataset (e.g. pathologies), the INN is able to automatically extract the dimensions most likely responsible for classifying those meta variables. We assume these features merit attention for downstream tasks and should therefore be modified as little as possible, naturally within the bounds of the desired privacy. We term this method Content-Aware DP (CADP).

  • Third, we show the generalizability of our method not just to images but also to categorical data, making it a universal tool for obtaining differentially private data.

We focus on the task of protecting images in particular, and data in general, in any context, detached from their intended usage.

2 Related Work

Differentially Private Invertible Neural Networks. In general, any learning-based algorithm can be trained in a privacy-preserving fashion by using differentially private stochastic gradient descent (DP-SGD) [1]. DP-SGD achieves differentially private model training by clipping the per-sample gradients and adding calibrated Gaussian noise proportional to the desired level of privacy. DP-SGD thus perturbs the model parameters rather than the input, e.g. to ensure that no inputs can be reconstructed from the model parameters [23].
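
To make the mechanism concrete, the following is a minimal, illustrative sketch of one DP-SGD step in PyTorch (per-sample clipping via microbatches of size one; the function and parameter names are ours, and the sketch omits the privacy accounting that libraries such as Opacus provide):

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y, optimizer,
                clip_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD step: clip every per-sample gradient to clip_norm in L2,
    sum them, add Gaussian noise with std noise_multiplier * clip_norm,
    and apply the averaged noisy gradient."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(batch_x, batch_y):                 # microbatches of size one
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
        for acc, g in zip(summed, grads):
            acc += g * scale                           # clipped per-sample gradient

    n = len(batch_x)
    for p, acc in zip(params, summed):
        noise = torch.randn_like(acc) * noise_multiplier * clip_norm
        p.grad = (acc + noise) / n                     # noisy average gradient
    optimizer.step()
    optimizer.zero_grad()
```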

One can distinguish between input, output, and algorithm perturbation to achieve DP. When the output of the algorithm or the algorithm itself is perturbed, as e.g. in DP-SGD, the analysis is performed on the non-private data, and one has to be concerned about the composition property (\(\epsilon \) degrades over multiple analyses of the dataset). Further, since one cannot release the data, the possibilities for analysis are limited. We circumvent the above-mentioned limitations by performing input perturbation and exploiting the robustness of DP against post-processing (any further processing of differentially private data retains the privacy guarantees).

Obviously, INNs can be trained with DP-SGD as well [24]. However, after such training one can only use the INN in a generative manner, sampling the latent space \(\textbf{z} \sim \mathcal {N}(\textbf{0};\textbf{I})\) and obtaining data samples that bear no relation to real-world samples and are therefore artificial. Thus, this does not allow for perturbation of the real data samples intended to be published or used for model training. Even worse, using artificial data is not completely secure against attacks either [4] and may even lead to wrong pathologies in generated images [5, 15].

Differential Privacy for Images. The most prominent application in the literature on differentially private images deals with faces, as this is the most vivid example. Older approaches rely on pixelation, blurring, obfuscation, or inpainting [10], but these have proven ineffective against deep-learning-based recognizers [18, 19]. Another promising path is the generation of fully artificial data, e.g. with Generative Adversarial Networks (GANs), with the known drawbacks mentioned above [6, 21, 24, 25]. Ziller et al. claimed to have applied DP to medical images [27]. However, their approach also only involves training a conventional CNN on medical images with DP-SGD. We take a different path and alter the content of the input image in a private manner, as we want to preserve as much information as possible and only alter dimensions that are not identification-related. To the best of our knowledge, DP has never been applied directly to the content of medical images before.

3 Methods

3.1 (Conditional) Invertible Neural Networks

INNs deal with the approximation of a complex, unobservable distribution \(p(\textbf{x})\) by a simpler, tractable prior \(q(\textbf{z})\), usually a spherical multivariate Gaussian. Let \(\mathcal {X} = \left\{ \textbf{x}^{(1)}, ..., \textbf{x}^{(n)} \right\} \) be n observed i.i.d. samples from \(p(\textbf{x})\). The objective is to approximate \(p(\textbf{x})\) via a model \(f_{\boldsymbol{\theta }}\) consisting of a series of K bijective functions \(f_{\boldsymbol{\theta }} = f_1 \circ ... \circ f_K\), fully parameterized by \(\boldsymbol{\theta }\), transforming \(q(\textbf{z})=\mathcal {N}(\textbf{0};\textbf{I})\) into \(p(\textbf{x})\) and vice versa (\(f_{\boldsymbol{\theta }}(\textbf{x})=\textbf{z}\longleftrightarrow f_{\boldsymbol{\theta }}^{-1}(\textbf{z})=\textbf{x}\)).

Such a model can efficiently be used in a generative manner to sample \(\textbf{x} \sim p\) by first sampling \(\textbf{z} \sim \mathcal {N}(\textbf{0};\textbf{I})\) and subsequently transforming the sample as \(\textbf{x} = f_{\boldsymbol{\theta }}^{-1}(\textbf{z})\).

Since \(f_{\boldsymbol{\theta }}\) is invertible, exact likelihood evaluation becomes tractable via the change-of-variables formula [7, 8]:

$$\begin{aligned} \log p(\textbf{x}) = \log q \left( f_{\boldsymbol{\theta }} (\textbf{x}) \right) + \log \left| \det \left( \frac{\partial f_{\boldsymbol{\theta }}(\textbf{x})}{\partial \textbf{x}} \right) \right| \end{aligned}$$
(1)

An isotropic Gaussian is usually chosen as the prior. Since its covariance matrix is diagonal, its components are independent. With INNs, sharp image details can be obtained while independent components of the image can simultaneously be modified in latent space [14].

We build on the foundations laid by Ardizzone et al., who incorporate conditions, e.g. by concatenating class labels to the input [3]. This enables the INN to implicitly learn the meta-data-dependent distribution in latent space. In the reverse pass we provide the label we would like to obtain, e.g. a pathology, and the INN generates an altered version of the original image that still exhibits the desired pathology (\(f_{\boldsymbol{\theta }}(\textbf{x},\textbf{c})=\textbf{z}\longleftrightarrow f_{\boldsymbol{\theta }}^{-1}(\textbf{z},\textbf{c})=\textbf{x}\)).
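
As an illustration of the building block behind such a conditional INN, the following is a minimal sketch of a single conditional affine coupling layer in PyTorch (the layer sizes, names, and tanh-bounded scales are our simplifications, not the exact architecture used in our experiments):

```python
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """One conditional affine coupling layer: the input x is split into
    (x1, x2); x2 is scaled and shifted by a subnetwork that sees x1 and
    the condition c. The layer is invertible by construction, and its
    log |det J| is the sum of the predicted log-scales (cf. Eq. (1))."""

    def __init__(self, dim, cond_dim, hidden=128):
        super().__init__()
        self.d = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.d + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.d)),
        )

    def forward(self, x, c):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        log_s, t = self.net(torch.cat([x1, c], dim=1)).chunk(2, dim=1)
        log_s = torch.tanh(log_s)                 # keep the scales well-behaved
        z2 = x2 * torch.exp(log_s) + t
        return torch.cat([x1, z2], dim=1), log_s.sum(dim=1)

    def inverse(self, z, c):
        z1, z2 = z[:, :self.d], z[:, self.d:]
        log_s, t = self.net(torch.cat([z1, c], dim=1)).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        x2 = (z2 - t) * torch.exp(-log_s)
        return torch.cat([z1, x2], dim=1)
```

Stacking several such layers, interleaved with fixed permutations, and maximizing Eq. (1) over the training data yields the (conditional) INN; the volume-preserving GIN coupling used later constrains the log-scales to sum to zero so that the Jacobian determinant equals one.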

3.2 Content-Aware Differential Privacy

Often termed the gold standard for obscuring sensitive information in data samples, DP provides a mathematically grounded, quantifiable measure of leaked information while being simple to apply [26]. From a high-level perspective it guarantees that changing one entry in the database (yielding neighboring databases \(\mathcal {X}\) and \(\mathcal {X}^{\prime }\)) has only a small effect on the outcome of any analysis [9]:

$$\begin{aligned} Pr \left[ \mathcal {M}(\mathcal {X}) \in \mathcal {S} \right] \le \exp (\epsilon ) Pr \left[ \mathcal {M}(\mathcal {X}^{\prime }) \in \mathcal {S} \right] , \end{aligned}$$
(2)

where \(\mathcal {M}\) denotes a randomized mechanism and \(\mathcal {S}\) any set of outputs. The closer the two probabilities are, the less information is leaked (small \(\epsilon \)). DP is usually obtained by perturbing the data with calibrated noise proportional to the (\(L_1\)-norm) sensitivity of the function f on the dataset \(\mathcal {X}\), i.e. the maximum change in the function's value caused by changing one data point. To achieve pure \(\epsilon \)-DP, the Laplace mechanism is commonly used.
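
For reference, the Laplace mechanism amounts to a few lines of Python (NumPy; the function name is ours):

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release `value` (scalar or array) with pure epsilon-DP by adding
    i.i.d. Laplace(0, sensitivity / epsilon) noise to every component."""
    rng = np.random.default_rng() if rng is None else rng
    return value + rng.laplace(loc=0.0, scale=sensitivity / epsilon,
                               size=np.shape(value))

# Example: a counting query with L1-sensitivity 1, released at epsilon = 0.5.
private_count = laplace_mechanism(42, sensitivity=1.0, epsilon=0.5)
```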

Algorithm 1

After training an INN to convergence, i.e. \(f_{\boldsymbol{\theta }}(\mathcal {X},\mathcal {C}) \sim \mathcal {N}(\textbf{0},\textbf{I})\), each image and label \((\textbf{x}_i,\textbf{y}_i)\in \mathcal {X}\) with corresponding condition \(\textbf{c}_i(\textbf{y}_i)\) is forwarded through the network (see Fig. 2). The resulting latent representation \(f_{\boldsymbol{\theta }}(\textbf{x}_i,\textbf{c}_i(\textbf{y}_i)) = \textbf{z}_i\) is modified in a differentially private manner by sampling from a Laplace distribution whose scale is determined by the sensitivity s and the desired \(\epsilon \). We clip the sensitivity by dividing each \(\textbf{z}_i\) by its \(L_1\)-norm (Algorithm 1) [1]. Since \(\mathcal {Z}\) is learned to be an isotropic Gaussian, each component is independent and can thus be modified individually. INNs can trivially be trained on categorical data as well, making our method a general technique for applying DP to data.
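
The following is a condensed sketch of this procedure (cf. Algorithm 1 and Fig. 2), assuming a trained conditional INN with forward/inverse methods analogous to the coupling layer sketched in Sect. 3.1; the names and broadcasting details are ours, and this variant rescales the latent vector only when its \(L_1\)-norm exceeds s:

```python
import torch

def cadp_perturb(inn, x, c, epsilon, sensitivity):
    """Map x to its epsilon-DP counterpart: encode to z, clip ||z||_1 to
    `sensitivity`, add Laplace(0, sensitivity / epsilon) noise, decode."""
    with torch.no_grad():
        z, _ = inn.forward(x, c)                            # latent representation
        l1 = z.abs().flatten(start_dim=1).sum(dim=1)        # per-sample L1-norm
        l1 = l1.view(-1, *([1] * (z.dim() - 1)))            # broadcastable shape
        z = z * torch.clamp(sensitivity / (l1 + 1e-9), max=1.0)
        noise = torch.distributions.Laplace(
            0.0, sensitivity / epsilon).sample(z.shape).to(z.device)
        x_tilde = inn.inverse(z + noise, c)                 # private reconstruction
    return x_tilde
```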

Theorem 1

(\(\epsilon \)-Content-Aware-DP Mechanism). For an image \(\textbf{x} \in \mathcal {X}\) there exists a mechanism \(\mathcal {M}_{\textrm{CA}}\) that maps \(\textbf{x}\) to its differentially private counterpart \(\tilde{\textbf{x}} \in \mathcal {X}\). \(\mathcal {M}_{\textrm{CA}}\) satisfies \(\epsilon \)-DP for all \(\textbf{x},\textbf{x}^{\prime } \in \mathcal {X}\) and is given by

$$\begin{aligned} \mathcal {M}_{\textrm{CA}}(\textbf{x}) = f_{\boldsymbol{\theta }}^{-1} \left[ f_{\boldsymbol{\theta }}(\textbf{x}) + (l_1, ... , l_k) \right] = f_{\boldsymbol{\theta }}^{-1} \left[ \textbf{z} + (l_1, ... , l_k) \right] = f_{\boldsymbol{\theta }}^{-1} \left[ \tilde{\textbf{z}} \right] \;, \end{aligned}$$
(5)

where \(f_{\boldsymbol{\theta }}\) denotes a function that maps \(\textbf{x}\) to a latent vector \(\textbf{z} \in \mathcal {Z}\) and by reverse pass \(f_{\boldsymbol{\theta }}^{-1}\) maps \(\textbf{z}\) to \(\textbf{x}\). \(\tilde{\textbf{z}} = \textbf{z} + (l_1,...,l_k)\) denotes the \(\epsilon \)-DP perturbed version of \(\textbf{z}\) with \(l_i\) i.i.d. random variables drawn from \(\textrm{Lap}\left( s / \epsilon \right) \).

Proof

Let \(\textbf{x} \in \mathcal {R}^{|\mathcal {X}|}\) and \(\textbf{x}^{\prime } \in \mathcal {R}^{|\mathcal {X}|}\) be such that \(|| \textbf{x} - \textbf{x}^{\prime } ||_1 \le 1\), and let \(g(\textbf{x}) = f_{\boldsymbol{\theta }}^{-1}\left( f_{\boldsymbol{\theta }} (\textbf{x}) \right) \) be some function \(g: \mathcal {R}^{|\mathcal {X}|} \rightarrow \mathcal {R}^{|\mathcal {Z}|} \rightarrow \mathcal {R}^{|\mathcal {X}|}\). We only consider functions that are volume preserving, meaning their Jacobian determinant is equal to one \(\left( \left| \det \left( \partial f_{\boldsymbol{\theta }}(\textbf{x}) /\partial \textbf{x} \right) \right| = 1 \right) \). Let \(p_{\textbf{x}}\) denote the probability density function of \(\mathcal {M}_{\textrm{CA}}(\textbf{x},g,\epsilon )\), and \(p_{\textbf{x}^{\prime }}\) that of \(\mathcal {M}_{\textrm{CA}}(\textbf{x}^{\prime },g,\epsilon )\). We assume that distances between points are similar in \(\mathcal {X}\) and \(\mathcal {Z}\), as shown by [14]. We compare the two densities at some arbitrary point \(\textbf{t} \in \mathcal {R}^{|\mathcal {Z}|}\):

$$\begin{aligned} \begin{aligned} \frac{p_{\textbf{x}}(\textbf{t})}{p_{\textbf{x}^{\prime }}(\textbf{t})}&= \prod _{i=1}^{k} \left( \frac{ \exp \left( - \frac{\epsilon }{s} | g(\textbf{x}) - f_{\boldsymbol{\theta }}^{-1} (\textbf{t}) | \right) }{ \exp \left( - \frac{\epsilon }{s} | g(\textbf{x}^{\prime }) - f_{\boldsymbol{\theta }}^{-1} (\textbf{t}) | \right) } \right) = \prod _{i=1}^{k} \left( \frac{ \exp \left( - \frac{\epsilon }{s} | f_{\boldsymbol{\theta }}^{-1} \left( f_{\boldsymbol{\theta }} (\textbf{x}) - \textbf{t} \right) | \right) }{ \exp \left( - \frac{\epsilon }{s} | f_{\boldsymbol{\theta }}^{-1} \left( f_{\boldsymbol{\theta }} (\textbf{x}^{\prime }) - \textbf{t} \right) | \right) } \right) \\&= \prod _{i=1}^{k} \left( \exp - \frac{\epsilon }{s} |f_{\boldsymbol{\theta }}^{-1}\left( \textbf{z}_{\textbf{x}} - \textbf{t} \right) - f_{\boldsymbol{\theta }}^{-1}\left( \textbf{z}_{\textbf{x}^{\prime }} - \textbf{t} \right) | \right) \\&= \prod _{i=1}^{k} \left( \exp - \frac{\epsilon }{s} |f_{\boldsymbol{\theta }}^{-1}\left( \textbf{z}_{\textbf{x}} - \textbf{z}_{\textbf{x}^{\prime }} \right) | \right) \\&\le \prod _{i=1}^{k} \exp \left( - \frac{\epsilon | \textbf{z}_{\textbf{x}} - \textbf{z}_{\textbf{x}^{\prime }} |}{s} \right) = \exp \left( \frac{\epsilon || \textbf{z}_{\textbf{x}} - \textbf{z}_{\textbf{x}^{\prime }} ||_{1}}{s} \right) \\&\le \exp (\epsilon ), \end{aligned} \end{aligned}$$
(6)

where the first inequality follows from the triangle inequality, and the last follows from the definition of sensitivity and \(||\textbf{x}-\textbf{x}^{\prime }||_1 \le 1\). \(\frac{p_{\textbf{x}}(\textbf{t})}{p_{\textbf{x}^{\prime }}(\textbf{t})} \ge \exp (- \epsilon )\) follows by symmetry.

4 Experiments

We apply our approach for content-aware differential privacy to several publicly available datasets to showcase its generalizability. In each case we first train the INN on the training partition and subsequently train a classifier on the differentially private data. Note that our goal is not to reach the highest possible predictive performance but to close the gap between original and differentially private training. To exemplify the principle of content-aware DP we use the MNIST dataset, since the effect of transformations in latent space is readily visible [16]. Next, we use two dedicated medical datasets: a collection of retinal optical coherence tomography (OCT) scans with four classes (choroidal neovascularization (CNV), diabetic macular edema (DME), drusen, and healthy) [12], and a series of chest X-ray scans of healthy and pneumonic patients [12], which involve more complicated and less distinct transformations.

Since most work on adding privacy to images deals with the prototypical example of identifiable faces, we also apply our approach to the CelebA faces dataset (see Fig. 1) [17]. After investigating our method on image data, we extend it to categorical data, i.e. the diabetes dataset from scikit-learn [20].

For each dataset we train a separate INN with convolutional subnetworks, with depth (number of downsampling operations) dependent on the image resolution. We chose \(d=2\) for MNIST (\(28 \times 28\)), \(d=4\) for OCT and chest X-ray (\(128 \times 128\)), and \(d=6\) for CelebA (\(3 \times 128 \times 128\)). As coupling block we use the volume-preserving GIN (general incompressible-flow) [22] for the MNIST and diabetes data, and Glow (generative flow) [14] for the other, more complicated datasets. After training an INN to convergence, we train a classifier with convolutional blocks and two linear layers on the differentially private data. Testing is performed on original data to investigate how many true features the model learns. We consider the performance of this classifier an implicit benchmark ensuring that the INN does not merely reconstruct conditional noise. It is common practice for works dealing with DP algorithms to compare against the non-private benchmark; the goal must be to close the still existing gap and thereby incentivize differentially private training. For comparison we also train the same classifier with DP-SGD, the current gold standard [1]. All experiments were performed on an NVIDIA Titan RTX.
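
To tie the pieces together, the following is a sketch of this training and evaluation protocol, assuming the cadp_perturb sketch from Sect. 3.2, a trained conditional INN, one-hot class conditions, and standard PyTorch data loaders (the classifier and hyperparameters here are placeholders, not the exact ones we used):

```python
import torch
import torch.nn.functional as F

def train_and_eval_on_cadp(classifier, inn, train_loader, test_loader,
                           num_classes, epsilon, sensitivity,
                           epochs=10, lr=1e-3, device="cpu"):
    """Train the classifier on CADP-altered data, then evaluate on the
    original (unperturbed) test set."""
    classifier.to(device)
    opt = torch.optim.Adam(classifier.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            c = F.one_hot(y, num_classes).float()          # condition from labels
            x_priv = cadp_perturb(inn, x, c, epsilon, sensitivity)
            opt.zero_grad()
            loss = F.cross_entropy(classifier(x_priv), y)
            loss.backward()
            opt.step()

    correct, total = 0, 0
    with torch.no_grad():
        for x, y in test_loader:                           # original test data
            pred = classifier(x.to(device)).argmax(dim=1).cpu()
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total
```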

 

Fig. 3.

Differentially private reconstruction of MNIST with different \(\epsilon \) and \(s=\epsilon /2\).

Fig. 4.

Accuracy of classifier on different datasets with different \(\epsilon \) and \(s=\textrm{min}(\epsilon /2,4)\). Further, we trained the same model with DP-SGD [1]. Training/testing is performed on either original (o) or CADP altered (p) data.

5 Results

The results are presented in a two-fold manner. We first show the differentially private (i.e. CADP-altered) images per class for each dataset at different levels of \(\epsilon \). Second, we report the accuracy of the classifier on the original, non-CADP-altered test split when trained on the original data, on the CADP-altered dataset, or with DP-SGD.

MNIST. Even for small \(\epsilon \) our approach generates visually appealing results that are indistinguishable from real digits but differ substantially from the original (see Fig. 3). Altered attributes include line thickness (e.g. 6), slant (e.g. 1), and even style (e.g. 2). For \(\epsilon =0.2\) a classifier trained on CADP-altered data outperforms the commonly accepted DP-SGD: CADP reaches 92.94% accuracy while DP-SGD only achieves 89.24% (cf. Fig. 4). The gap closes for larger \(\epsilon \).

Retinal OCT and Chest X-ray. In retinal OCTs the perturbations are rather subtle and difficult to interpret for a human observer or a non-expert. Identification-related attributes such as retinal detachments in specific places are (re-)moved, impeding re-identification (see Fig. 5). The CADP-altered images exhibit transformations resulting in large dissimilarities to their original counterparts. Nevertheless, CADP induces a smaller privacy-utility tradeoff, since the performance of the classifier trained on CADP-altered data is close to that of the one trained on original data (Fig. 4). The classifier trained on data altered by our method outperforms the one trained with DP-SGD by 23.63% on average across all \(\epsilon \) on the OCT test dataset and by 16.52% on the chest X-ray test dataset. We attribute this to the content-awareness of our method, which leaves the dimensions corresponding to the conditions, i.e. pathologies, unaltered. This is desirable in settings where one trains a model on private data from another site, e.g. a hospital, and applies it to one's own in-house samples.

Categorical Data. INNs can also generate differentially private categorical data, as shown in Fig. 6 for the diabetes dataset from scikit-learn [20]. The data distributions remain similar but are still altered, equipping each data sample with plausible deniability. To recover the binary feature sex, we condition the INN on this feature; the other features are learned in an unsupervised fashion.
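
For reproducibility, the tabular data for this experiment can be prepared with a few lines of scikit-learn (thresholding the already-centered sex column into a binary condition is our simplification):

```python
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler
import numpy as np

data = load_diabetes()                                  # 442 samples, 10 features
X = StandardScaler().fit_transform(data.data)           # inputs for the INN
sex = (data.data[:, 1] > 0).astype(np.int64)            # binary condition (column 'sex')
```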

Fig. 5.

Content-aware differentially private images from the OCT dataset with different \(\epsilon \) for the classes CNV and DME [12]. The sensitivity is set to \(\mathrm {min}(\epsilon /2, 4)\). For high \(\epsilon \) (e.g. 10) the reconstructed retinal OCTs still share similarities with their originals, as in Fig. 1. For smaller \(\epsilon \) the images qualitatively look different from their original counterparts. However, the classifier (Fig. 4) still performs well, acting as an implicit control of the preserved features.

Fig. 6.

Content-aware differentially private data from the scikit-learn diabetes dataset with \(\epsilon =1\) and sensitivity \(s=1\) [20]. With conditions, the INN is able to reconstruct the approximate distributions, even binary ones.

6 Discussion and Conclusion

We introduced a new method, termed CADP (content-aware differential privacy), to obtain differentially private images using invertible neural networks. We applied the method to medical images and ensured that the identity, i.e. the pathology, of the patient is not changed by conditioning the INN on the class labels. In three experiments on diverse data (images of digits, OCT, and X-ray scans), classifiers trained on CADP-generated data outperformed conventional approaches by a margin. This reduces the risk of wrong diagnoses and increases patient safety while providing provable, mathematically grounded privacy guarantees. Hence, CADP pre-processed datasets may be used to increase the anonymity of medical image data in the future. However, the required level of anonymity should be decided depending on the individual use case.

Even for small \(\epsilon < 1.0\), our method generates visually appealing results that can be used to train a classifier outperforming DP-SGD under the same privacy guarantees. However, clipping the latent space discards information needed for reconstruction. Future work could investigate how much information must be sacrificed to assure privacy and conduct an in-depth exploration of the latent space.