1 Introduction

Margin classifiers play an important role in high-stakes decision domains (e.g. credit assessment) [1, 2]. Recently, to protect user privacy when training margin classifiers on sensitive data, a number of differentially private empirical risk minimization (ERM) algorithms have been proposed [3,4,5,6,7,8]. Meanwhile, algorithmic fairness, as an important social concern about machine learning, is receiving increasing attention from both the public and academia [9, 10]. Among various machine learning models, the fairness of margin classifiers receives significant attention [11,12,13,14,15] because of their wide application in high-stakes domains protected by anti-discrimination regulations.

However, previous studies [16, 17] showed that differentially private ERM algorithms can make machine learning models treat different groups unfairly, such as recognizing black faces and white faces with different accuracy. These studies are mainly empirical and lack an analysis of how differential privacy impacts the fairness of the studied models. As a result, the dominant factor in the fairness of differentially private machine learning models remains unknown. Identifying this factor would help find the right way to ensure the fairness of differentially private margin classifiers.

In this paper, based on well-designed experiments and further analysis, we show that the fairness of non-private margin classifiers dominates the fairness of their differentially private counterparts. We first empirically evaluate the impact of three representative differentially private ERM algorithms [3,4,5] on the fairness of three classical margin classifiers: Linear support vector machine (SVM), Kernel SVM, and logistic regression (LR). Because in most high-stakes domains the accuracy of the ‘positive’ label is more important than that of the ‘negative’ label [12, 18], we use equal opportunity [12], which requires that different groups have the same true positive rate (TPR), as the fairness notion. By testing on three datasets widely used in the algorithmic fairness field, we find that the fairness of differentially private margin classifiers strongly depends on the fairness of their non-private versions. Specifically, when a non-private margin classifier has almost the same TPR on different groups, its differentially private version also has almost the same TPR on these groups. Conversely, when a non-private margin classifier has a significant TPR gap between two groups, differential privacy amplifies this gap.

We confirm the empirical results through a theoretical analysis of how differential privacy impacts the fairness of margin classifiers. Concretely, we reveal that the main reason for significant TPR gaps in differentially private margin classifiers is that the ‘positive’ data samples of different groups have significantly different margin distributions in the non-private versions, which is implied by the TPR gaps of those non-private classifiers. By contrast, when a non-private margin classifier has similar TPR on different groups, the ‘positive’ data samples of different groups have similar margin distributions. Consequently, the negative impact of differential privacy can be largely ignored, or even eliminated. We also show that our analysis extends to other accuracy-based group fairness notions (e.g. equal odds [12]).

In summary, we show that if non-private margin classifiers are fair with negligible TPR gaps, the fairness of their differentially private counterparts can be ensured. As shown in Section 5.3, when we improve the fairness of non-private margin classifiers with a pre-processing method [11], the TPR gaps of differentially private margin classifiers become close to, and even lower than, those of the non-private margin classifiers.

2 Related work

Algorithmic fairness

Chouldechova et al. [9] presented an overview of current studies on algorithmic fairness. Dwork et al. [19] proposed the notion of individual fairness. However, because the similarity of individuals is hard to measure, a series of group fairness notions [12, 19, 20] have been proposed. Based on these fairness notions, several studies proposed algorithms to train fair classifiers [11,12,13,14]. All of these studies took margin classifiers as typical cases to verify the effectiveness of their algorithms.

Differential privacy

Differential privacy has become a de facto standard for protecting user privacy in machine learning models. Since Chaudhuri et al. [21] created a novel sensitivity analysis method for convex and continuous loss functions, many differentially private ERM algorithms have been developed to achieve a better privacy-utility trade-off [6, 7, 22, 23], to make differentially private ERM algorithms more usable [5, 24], or to make non-convex optimization processes differentially private [3, 25]. In addition, Jagielski et al. [26] applied differential privacy to protect the sensitive attribute (e.g. gender) of data samples when training a fair classifier.

Differential privacy and algorithmic fairness

Cummings et al. [27] showed that perfect fairness and differential privacy are incompatible under non-trivial accuracy. Bagdasaryan et al. [16] empirically revealed that a differentially private stochastic gradient descent algorithm has a disparate impact on the accuracy of different groups. Motivated by the above findings, several algorithms [28,29,30,31,32] have been proposed to balance privacy protection and fairness in the classification problem, the selection problem, etc. However, a comprehensive study on how differential privacy impacts the fairness of margin classifiers, which is critical to designing differentially private and fair margin classifiers, is still lacking. Compared with previous studies, our study covers a wider spectrum of differentially private ERM algorithms. Moreover, beyond the empirical study, we conduct a theoretical analysis of how differential privacy impacts the fairness of margin classifiers.

3 Preliminaries

To present the study results clearly, we list the notations involved in this paper in Table 1.

Table 1 Notations involved in this paper

3.1 Margin classifier

Definition 1

(Geometric margin [33]) The geometric margin \(\rho_h(\mathbf{x})\) of a linear classifier \(h: \mathbf{x} \rightarrow \theta^{\mathsf{T}} \cdot \mathbf{x}\) at a data sample \(\mathbf{x}\) is its Euclidean distance to the hyperplane whose normal vector is \(\theta\):

$$\begin{aligned} \rho_{h}(\mathbf{x}) = \frac{\vert \theta^{\mathsf{T}} \cdot \mathbf{x} \vert }{\left\| \theta \right\|_{2} } \end{aligned}$$

Margin classifier

[34] Margin classifiers learn a model by optimizing a loss function that takes margins as inputs (e.g. maximizing the minimum margin). That is, the loss function of any margin classifier can be represented as a composite function \(\phi(\rho_{h}(\mathbf{x})): \mathbb{R}^{p} \rightarrow \mathbb{R}^{+}\) of the margin function \(\rho_h\) and a margin loss function \(\phi\), where p is the dimension of the input data.
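
To make the two definitions concrete, the following minimal sketch computes the geometric margin of Definition 1 and evaluates a hinge-style margin loss on top of it. It uses Python with NumPy, which the paper does not prescribe; the function names and toy values are ours.

```python
import numpy as np

def geometric_margin(theta, x):
    """Euclidean distance of sample x to the hyperplane with normal vector theta (Definition 1)."""
    return np.abs(theta @ x) / np.linalg.norm(theta, 2)

def hinge_margin_loss(theta, x, y):
    """A margin loss: the hinge loss written as a function of the signed score theta^T x."""
    return max(0.0, 1.0 - y * (theta @ x))

# Toy example: a 2-D linear classifier.
theta = np.array([2.0, -1.0])
x = np.array([0.5, 0.3])
print(geometric_margin(theta, x))       # distance of x to the hyperplane
print(hinge_margin_loss(theta, x, +1))  # loss of a 'positive' sample
```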

3.2 Differentially private empirical risk minimization algorithms

We first introduce the definition of neighboring datasets: \(D, D^{'} \in \mathcal{D}^n\) are neighboring datasets if they differ in exactly one data sample. We then introduce the definition of \((\epsilon,\delta)\)-differential privacy as follows.

Definition 2

\((\epsilon,\delta)\)-differential privacy [35]. For a randomized mechanism \(M\) whose input is \(D \in \mathcal{D}^n\) and output is \(r \in R\), we say \(M\) is \((\epsilon,\delta)\)-differentially private if for any neighboring datasets \(D\), \(D^{'}\) and any subset \(S \subseteq R\), \(Pr(M(D) \in S) \le e^{\epsilon} \cdot Pr(M(D^{'}) \in S) + \delta\), where \(\epsilon\) is the privacy budget, a tunable parameter controlling the privacy-utility trade-off.

The main idea of differential privacy is to bound the influence of each data sample on the output, so that attackers cannot infer information about any single data sample from the output. A typical way to satisfy the definition of differential privacy is to add random noise sampled from a predefined distribution to the computing process. If \(\delta\) is 0, we say \(M\) is \(\epsilon\)-differentially private.

We can design differentially private ERM algorithms according to the following three paradigms: (1) objective perturbation (adding random noise to the loss function); (2) gradient perturbation (adding random noise to gradients); (3) output perturbation (adding random noise to the final model parameters). To comprehensively study the relationship between differential privacy and the fairness of margin classifiers, we test three differentially private ERM algorithms, each of which follows one or two of the above paradigms.

Approximate Minimal Perturbation algorithm (AMP)

[4] combines the objective perturbation and output perturbation paradigms. It thus divides the total privacy budget into two parts, one for the objective perturbation noise and one for the output perturbation noise. Note that even though AMP is a hybrid method, the authors recommend allocating more than 99% of the privacy budget to the objective perturbation phase.

Differentially Private Stochastic Gradient Descent algorithm (DPSGD)

[3] follows the gradient perturbation paradigm. It adds noise to the clipped gradients. DPSGD can be applied to train non-convex models because it makes no assumptions about the loss function.
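
As an illustration of the gradient perturbation paradigm, the sketch below shows one DPSGD-style update for a logistic loss: per-sample gradients are clipped to an \(L_2\) bound C and Gaussian noise is added before averaging. This is a simplified sketch, not the exact algorithm of [3]; the noise multiplier sigma and clipping bound C are placeholder parameters, and the privacy accounting that maps them to \((\epsilon,\delta)\) is omitted.

```python
import numpy as np

def dpsgd_step(theta, X_batch, y_batch, lr=0.1, C=1.0, sigma=1.0, rng=None):
    """One gradient-perturbation update: clip per-sample gradients, add Gaussian noise, then step."""
    rng = rng or np.random.default_rng()
    grads = []
    for x, y in zip(X_batch, y_batch):           # labels y in {+1, -1}
        margin = y * (theta @ x)
        g = -y * x / (1.0 + np.exp(margin))      # per-sample logistic-loss gradient
        g = g * min(1.0, C / (np.linalg.norm(g) + 1e-12))  # clip gradient to L2 norm C
        grads.append(g)
    noise = rng.normal(0.0, sigma * C, size=theta.shape)    # Gaussian noise on the summed gradient
    noisy_mean = (np.sum(grads, axis=0) + noise) / len(X_batch)
    return theta - lr * noisy_mean
```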

Private convex permutation-based Stochastic Gradient Descent algorithm (PSGD)

[5] follows the output perturbation paradigm. The goal of PSGD is to incorporate differential privacy into existing machine learning systems without modifying the original system. It adds noise to the final model parameters based on a sensitivity analysis of convex and continuous loss functions and of the stochastic gradient descent process.
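
The output perturbation paradigm can be sketched as follows: train a model with ordinary (non-private) SGD, then add noise calibrated to a sensitivity bound to the final parameters. The sketch below is only an illustration of the paradigm, not the exact mechanism of [5]; the `sensitivity` argument is a placeholder for the bound that PSGD derives from the Lipschitz constant, learning rate, and number of passes.

```python
import numpy as np

def output_perturbation(theta_nonprivate, sensitivity, epsilon, rng=None):
    """Add Laplace noise scaled to sensitivity/epsilon to already-trained parameters.

    `sensitivity` is assumed to upper-bound how much the final parameters can change
    when one training sample is replaced (a placeholder here).
    """
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return theta_nonprivate + rng.laplace(0.0, scale, size=theta_nonprivate.shape)
```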

Despite adding noise at different positions, all of the above differentially private ERM algorithms provide utility guarantees for convex models, which bound the difference between the losses of private and non-private models. They guarantee utility by bounding the Euclidean distance between the private model parameters \(\theta_{priv}\) and the non-private model parameters \(\theta^*\). Accordingly, we define \((\lambda,\alpha)\)-deviation to quantify the parameter deviation introduced by differential privacy noise.

Definition 3

(\((\lambda, \alpha)\)-deviation) We say a differentially private ERM algorithm follows \((\lambda, \alpha)\)-deviation if it guarantees that, when trained on the same dataset, with probability at least \(1-\alpha\), the \(L_2\) distance between the private model parameters \(\theta_{priv}\) and the non-private model parameters \(\theta^{*}\) is less than a given value \(\lambda\). That is:

$$\begin{aligned} Pr( \left\| \theta _{priv} - \theta ^{*} \right\| _2 < \lambda ) \ge 1-\alpha \end{aligned}$$

In Definition 3, \(\alpha\) bounds the probability that the \(L_2\) distance between the private model and the original model is greater than or equal to \(\lambda\). We show the deviation properties of the above three differentially private ERM algorithms in Lemmas 1, 2 and 3.
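
Definition 3 can also be checked empirically: train a non-private model once, train private models repeatedly on the same dataset, and record how often the parameter distance stays below a candidate \(\lambda\). The sketch below assumes `theta_star` and a `train_private` callable are supplied by the surrounding experiment code; both names are placeholders.

```python
import numpy as np

def empirical_deviation(theta_star, train_private, lam, runs=100):
    """Estimate Pr(||theta_priv - theta*||_2 < lam) over repeated private training runs."""
    hits = 0
    for _ in range(runs):
        theta_priv = train_private()            # placeholder: one run of AMP / DPSGD / PSGD
        if np.linalg.norm(theta_priv - theta_star, 2) < lam:
            hits += 1
    return hits / runs                          # compare against 1 - alpha
```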

Lemma 1

AMP follows \(\left(\frac{n\gamma}{\Lambda}+\sqrt{2p\log\frac{2}{\alpha}}\left(\frac{4L}{\Lambda\epsilon_3}\left(1+\sqrt{2\log\frac{1}{\delta_1}}\right)+\frac{n\gamma}{\Lambda\epsilon_2}\left(1+\sqrt{2\log\frac{1}{\delta_2}}\right)\right),\ \alpha\right)\)-deviation.

Lemma 2

PSGD follows \(\left(\frac{2p\ln(p/\alpha)\,kTL\eta}{n\epsilon},\ \alpha\right)\)-deviation.

Lemma 3

When applying DPSGD to optimize a \(\Delta\)-strongly convex and \(L_2\)-Lipschitz continuous loss function, if we set the learning rate to \(\frac{1}{\Delta t}\), DPSGD follows \(\left(\frac{4(L^2 + p\sigma^2)}{\Delta^2 T \alpha},\ \alpha\right)\)-deviation.

The proofs of the above lemmas are given in Appendix 1, together with the pseudocode of the three differentially private ERM algorithms.

3.3 Equal opportunity

Let \(D = \{(\mathbf{x}_1,a_1,y_1),\dots,(\mathbf{x}_n,a_n,y_n)\}\) be a dataset consisting of \(n\) data samples drawn from an unknown distribution over \((X, A) \times Y\), where \(Y = \{+1,-1\}\) is the set of labels, \(A\) is the set of sensitive attributes (e.g. gender, race), and \(X\) is the set of other features in the input space. We use equal opportunity [12], which requires that different groups have the same true positive rate (TPR), as the fairness notion in this paper.

Cummings et al. [27] have shown that perfect fairness and differential privacy are incompatible under non-trivial accuracy. We thus use \(\rho\)-True Positive Rate Disparity to measure the degree of fairness of a classifier.

Definition 4

\(\rho\)-True Positive Rate Disparity [36]. For any \(a_i, a_j \in A\) (\(i \ne j\)) and a classifier \(h_\theta\), we say \(h_\theta\) satisfies \(\rho\)-True Positive Rate Disparity if and only if \(\vert Pr\{h_\theta(\mathbf{x}_i,a_i) = +1 \mid y_i = +1\} - Pr\{h_\theta(\mathbf{x}_j,a_j) = +1 \mid y_j = +1\}\vert \le \rho\). Here \(\rho\) bounds the maximum TPR difference among all groups.
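
In practice, the metric of Definition 4 reduces to the maximum pairwise difference of group-wise true positive rates; a minimal sketch (function and variable names are ours) is given below.

```python
import numpy as np

def tpr_gap(y_true, y_pred, groups):
    """Maximum pairwise TPR difference over groups (rho in Definition 4)."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    tprs = []
    for g in np.unique(groups):
        pos = (groups == g) & (y_true == 1)     # 'positive' samples of group g
        tprs.append(np.mean(y_pred[pos] == 1))  # assumes each group has 'positive' samples
    return max(tprs) - min(tprs)
```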

4 Empirical study

In this section, we evaluate the impact of differential privacy on the fairness of margin classifiers by applying AMP, DPSGD, and PSGD to train three classical margin classifiers: Linear SVM, Kernel SVM, and LR. We try to answer the following research questions: Are differentially private ERM algorithms bound to aggravate the TPR gaps of margin classifiers? If not, which factor dominates the aggravation of the TPR gaps? The answers to these questions would help find the right way to ensure the fairness of differentially private margin classifiers.

4.1 Experiment setup

Datasets

We transform all data samples into one-hot encoded form and shuffle them before training. We then take the first 80% as the training dataset and the remaining 20% as the test dataset. Six datasets (Compas, Adult [37], Default [37], German [37], Student [37], Arrhythmia [37]) are widely used in the algorithmic fairness field. Considering dataset size (more than 1,000 samples), we employ three of them (Compas, Adult, Default) in our empirical study. An overview of these three datasets is shown in Table 2. (1) The Compas dataset contains 7,214 data samples. The binary label indicates whether an offender recidivates within two years after the screening. We set ‘No Recidivism in Two Years’ as the ‘positive’ label and Race as the sensitive attribute. After filtering out data samples with null attributes and selecting the data samples whose race is African-American (black) or Caucasian (white), we obtain 5,915 data samples. (2) The Adult dataset contains 45,220 data samples. The binary label indicates whether the income of a citizen is higher than 50k dollars. We set ‘Income Higher than 50k Dollars’ as the ‘positive’ label and Gender as the sensitive attribute. (3) The Default dataset contains 30,000 data samples. The binary label indicates whether a user has a default payment. We set ‘No Default Payment’ as the ‘positive’ label and Gender as the sensitive attribute. Note that although their results have large variances, the remaining three datasets give the same answer to the research questions as the three employed datasets; we discuss them in Appendix 1.

Table 2 Overview of datasets
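
A minimal version of the preprocessing described above (one-hot encoding, shuffling, and an 80/20 split) is sketched below with pandas and NumPy; the function name is ours and the paper's actual pipeline lives in the released code.

```python
import numpy as np
import pandas as pd

def prepare(df, label_col, rng_seed=0):
    """One-hot encode features, shuffle, and split 80/20 into train/test sets."""
    y = df[label_col].to_numpy()
    X = pd.get_dummies(df.drop(columns=[label_col])).to_numpy(dtype=float)
    idx = np.random.default_rng(rng_seed).permutation(len(y))
    X, y = X[idx], y[idx]
    cut = int(0.8 * len(y))
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])
```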

Algorithm implementation and hyperparameter configuration

We implement AMP, DPSGD, and PSGD based on the open-source code released by Iyengar et al. [4]. All three algorithms have at least four hyperparameters. To comprehensively study the relationship between differential privacy and the fairness of margin classifiers, we conduct a grid search to find the best hyperparameter configuration, i.e. the configuration under which private models achieve the highest average test accuracy for a given privacy budget. In addition, we independently train ten models for each hyperparameter configuration and report the average of their TPR gaps between groups as the final result. We also plot error bars to show the statistical significance of our results. We list all candidate hyperparameter values in Table 3.

Table 3 Potential hyperparameter values for the grid search procedure

Privacy parameters

To comprehensively study the impact of differential privacy on the fairness of margin classifiers, we test eight \(\epsilon\) values (from 1 to 8), which cover most privacy budgets used in practice. In addition, following the settings of previous studies [4, 5], we set the other privacy parameter \(\delta\) to \(\frac{1}{n^2}\), where n is the size of the training dataset. The candidate privacy parameter values are shown in Table 4.

Table 4 Potential privacy parameters

Sample clipping

All three differentially private ERM algorithms require the loss functions to be \(L_2\)-Lipschitz continuous [4]. We achieve this by bounding the \(L_2\) norm of each data sample: before training, we clip the feature vector of each data sample \((\mathbf{x}_i,a_i)\) to \((\mathbf{x}_i,a_i)\cdot \min(1, \frac{L}{\left\|(\mathbf{x}_i,a_i)\right\|_2})\).
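
The clipping step can be written in a few lines; the bound L below is the same L used in the Lipschitz requirement, and the function name is ours.

```python
import numpy as np

def clip_samples(X, L):
    """Rescale each row of X so its L2 norm is at most L (shorter rows are left unchanged)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    factors = np.minimum(1.0, L / np.maximum(norms, 1e-12))
    return X * factors
```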

4.2 Experimental results

Linear support vector machine

We obtain the non-private baselines by training \(L_2\)-regularized Linear Huber SVM models [38]. We then train differentially private Linear SVM models via AMP, DPSGD, and PSGD on the same training datasets.

As shown in Figure 1, the average TPR gaps of all private models trained on the Compas and Adult datasets are larger than those of the non-private models. In contrast, the average TPR gaps of all private models trained on the Default dataset are close to that of the non-private model. The TPR gap between white and black samples of the non-private model trained on the Compas dataset is about 0.117 (more than 19 times that of Default); the TPR gaps between male and female samples of the non-private models trained on the Adult and Default datasets are about 0.072 (12 times that of Default) and 0.006, respectively.

Fig. 1 TPR gaps of non-private and differentially private SVM models trained on the Compas, Adult, and Default datasets

Kernel support vector machine

We implement the non-private Kernel SVM and its differentially private versions through the Fourier transform-based function approximation method proposed by Rahimi et al. [39]. This method uses random cosine functions to approximate the kernel function that projects the original features into a high-dimensional target space. Therefore, two additional parameters are involved in the Kernel SVM implementation: the dimension of the target space and the standard deviation of the random cosine functions. We approximate the Gaussian kernel function [33] and use a grid search to determine the values of these two parameters. We then train Linear SVM models on the projected high-dimensional features.
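
A minimal sketch of the random Fourier feature approximation [39] is given below, using NumPy directly rather than the paper's implementation. D is the target dimension and sigma the standard deviation of the random cosine frequencies, the two extra hyperparameters mentioned above; how sigma maps to the Gaussian kernel bandwidth depends on the parameterization, so treat the sampling line as an assumption. A Linear SVM is then trained on the returned features.

```python
import numpy as np

def random_fourier_features(X, D, sigma, rng=None):
    """Map X (n x p) to D random cosine features approximating a Gaussian kernel."""
    rng = rng or np.random.default_rng(0)
    n, p = X.shape
    W = rng.normal(0.0, sigma, size=(p, D))     # random frequencies (std = sigma, by assumption)
    b = rng.uniform(0.0, 2 * np.pi, size=D)     # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```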

As shown in Figure 2, the private models trained on the Compas and Adult datasets all have larger average TPR gaps than the non-private models. Note that the TPR gaps of the non-private models trained on the Compas and Adult datasets are about 19 and 12 times that of the Default dataset. In the Default dataset, by contrast, the average TPR gaps of the private models are similar to that of the non-private model. Meanwhile, as the privacy budget changes, the TPR gaps fluctuate up and down, which indicates that these fluctuations are random errors introduced by the randomness of noise sampling.

Fig. 2 TPR gaps of non-private and differentially private Kernel SVM models trained on the Compas, Adult, and Default datasets, with the (target feature dimension, standard deviation) of the kernel approximation set to (220, 0.9), (245, 0.1), and (120, 0.3), respectively

Logistic regression

We obtain the non-private baselines by training \(L_2\)-regularized LR models on the same training datasets as the private models. As shown in Figure 3, the private models trained on the Compas and Adult datasets all have larger average TPR gaps than the non-private models. By contrast, when the TPR gap of the non-private model is small (0.014 in the Default dataset, about 1/11 and 1/5 of those in the Compas and Adult datasets), the TPR gaps of the private models are almost the same as that of the non-private model. The experimental results on Compas and Adult also show that reducing the noise scale by increasing the privacy budget reduces the negative impact of differentially private ERM algorithms.

Fig. 3 TPR gaps of non-private and differentially private LR models trained on the Compas, Adult, and Default datasets

Insights

By analyzing the experimental results of three classical margin classifiers learned via three differentially private ERM algorithms on three widely used datasets, we conclude that differentially private ERM algorithms are not bound to have a disparate impact on the TPR of different groups. That is, when the TPR gaps of non-private models are small enough (such as 0.006 on the Default dataset with Linear SVM), differential privacy does not aggravate the TPR gaps of margin classifiers. On the other hand, when non-private models have significant TPR gaps between groups (such as 0.117 on the Compas dataset and 0.072 on the Adult dataset with Linear SVM), all differentially private ERM algorithms amplify the TPR gaps. In addition, on the Compas dataset, the number of black samples is about 1.5 times that of white samples, yet the TPR of black samples drops much more than that of white samples in the private models. This result shows that differential privacy amplifies the bias in the dataset rather than discriminating against the minority group. We further justify this claim in Section 4.3.

4.3 Impact of data imbalance

Fig. 4 TPR gaps of non-private and differentially private SVM models trained on the imbalanced Compas, Adult, and Default datasets

Fig. 5 TPR gaps of non-private and differentially private LR models trained on the imbalanced Compas, Adult, and Default datasets

Bagdasaryan et al. [16] stated that differential privacy noise causes less accuracy loss on majority groups and more accuracy loss on minority groups in differentially private neural network models. To test whether this claim applies to margin classifiers, we subsample the minority group of the three datasets studied in Section 4 to construct imbalanced datasets. The details of the constructed imbalanced datasets are shown in Table 5. Note that we set the size ratio of the Compas dataset to 5:1 because it has far fewer samples than the other two datasets; the test results would have large variances at a 10:1 ratio. We then train non-private and differentially private margin classifiers on these imbalanced datasets with the same grid search procedure used in Section 4. The results are shown in Figures 4 and 5. On Compas, where the number of black samples is five times that of white samples, the non-private classifiers have a significantly higher TPR on white samples, and differential privacy still enlarges the TPR gap between white and black samples. On the other hand, on Default, even though the number of female samples is ten times that of male samples, the non-private classifiers have similar TPR on the two groups, and differential privacy has a similar impact on both. These results show that data imbalance has little impact on the accuracy loss that differentially private ERM algorithms cause on different groups.

Table 5 Overview of imbalanced datasets

5 Analysis of impact mechanism

In this section, we analyze how differentially private ERM algorithms impact the TPR gaps of margin classifiers. We synthesize a two-dimensional dataset to show the intuition behind our analysis in Figure 6. For clarity, we only illustrate the ‘positive’ samples. As shown in Figure 6, in the non-private model, Group1 has a higher TPR than Group2 (i.e. Group1 has more true positive (TP) samples and fewer false negative (FN) samples than Group2 with respect to the original non-private model). In the following sections, for brevity, we omit mentioning the model with respect to which TP and FN data samples are defined. The TPR gap between Group1 and Group2 implies that their TP and FN data samples have different margin distributions: the margins of Group2's TP data samples are mainly distributed on lower values (closer to the original hyperplane), while the margins of Group1's FN data samples are mainly distributed on lower values (Section 5.1). When the private hyperplane deviates from the original hyperplane, more TP samples of Group2 are misclassified as negative and more FN samples of Group1 are correctly classified as positive (Section 5.2). As a result, the TPR gap between these two groups is aggravated (Section 5.3).

Fig. 6 An overview of the analysis in Section 5. Each point represents a data sample; its color and shape indicate the group and type the data sample belongs to

5.1 Bridging TPR gap and margin gap

In this section, we show that if one group has a significantly higher TPR than another group in a non-private margin classifier, the margins of its TP data samples will be distributed on higher values, while the margins of its FN data samples will be distributed on lower values. We first analyze the correlation between the margin and the loss of a data sample. The loss functions of the standard linear SVM [33] and LR [33] are:

$$\begin{aligned} loss_{SVM}(\theta, \mathbf{x}_i, y_i)= \left\{ \begin{array}{ll} \max(0,\ 1 - \theta^{\mathsf{T}}\mathbf{x}_i) & y_i = +1 \\ \max(0,\ 1 + \theta^{\mathsf{T}}\mathbf{x}_i) & y_i = -1 \end{array}\right. \end{aligned}$$
$$\begin{aligned} loss_{LR}(\theta, \mathbf{x}_i, y_i)= \left\{ \begin{array}{ll} \log(1 + e^{-\theta^{\mathsf{T}}\mathbf{x}_i}) & y_i = +1 \\ \log\left(1 + \frac{1}{e^{-\theta^{\mathsf{T}}\mathbf{x}_i}}\right) & y_i = -1 \end{array}\right. \end{aligned}$$

where \(\vert \theta ^{T} \mathbf {x}_i\vert = margin_{\mathbf {x}_i} * \left\| \theta \right\| _2\) according to Definition 1. Without loss of generality, we discuss the situation where \({y_i=+1}\) here. By the definitions of the above loss functions, when a data sample \(\mathbf {{x}}_i\) is correctly classified (i.e. \({\theta ^T}\mathbf {{x}}_i > 0\) ), a larger margin implies a smaller value of the loss function. Conversely, when \(\mathbf {{x}}_i\) is wrongly classified (i.e. \({\theta ^T}\mathbf {{x}}_i < 0\)), a smaller margin implies a smaller value of \(\vert \theta ^{T} \mathbf {x}_i \vert\) (i.e. \(-{\theta ^T}\mathbf {x}_i\)), thus a smaller value of the loss function. Consequently, if the average loss of one group (refer to as \({g_a}\)) is lower than another group (refer to as \({g_b}\)), at least one of the following two situations will happen: (1) The correctly classified data samples of \({g_a}\) have a larger average margin than correctly classified data samples of \({g_b}\). (2) The wrongly classified data samples of \({g_a}\) have a smaller average margin than wrongly classified data samples of \({g_b}\). An concrete example of the above \(g_a\) and \(g_b\) is Group 1 and Group 2 in Figure 6.

If one group has a higher TPR than another, its ‘positive’ data samples should have a lower average loss than those of the other group. Therefore, a TPR gap between groups inevitably implies a margin distribution difference between their TP data samples (situation (1)) or their FN data samples (situation (2)), or even both simultaneously. On the other hand, if two groups have similar TPR, their ‘positive’ samples should have similar losses and thus similar margin distributions.

To further verify the above analysis, we plot frequency histograms of data samples’ margins to show the margin distributions of the Compas and Default datasets in Figures 7 and 8. Because the only difference between Linear SVM and Kernel SVM is that the former is trained on the original features and the latter is trained on the projected high-dimensional features, the results for Linear SVM generalize to Kernel SVM. In the Linear SVM and LR models trained on the Compas dataset, the TPR gaps between white and black samples are about 0.117 and 0.157, respectively. Consequently, the margins of TP black samples are mainly distributed on lower values than those of white samples, while the margins of FN white samples are mainly distributed on lower values than those of black samples. By contrast, in the Default dataset, where the TPR gaps of the two non-private margin classifiers are both less than 0.015, the margin distributions of the TP and FN samples of different groups are very similar.
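
Histograms of this kind can be produced with a short routine like the following sketch (matplotlib assumed, names ours): it splits the ‘positive’ samples of each group into TP and FN sets under a given non-private model and plots their geometric margins. X, y, and groups are NumPy arrays.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_margin_hist(theta, X, y, groups, group_values):
    """Histograms of geometric margins of TP and FN 'positive' samples, per group."""
    margins = np.abs(X @ theta) / np.linalg.norm(theta, 2)
    preds = np.sign(X @ theta)
    fig, axes = plt.subplots(1, 2, figsize=(8, 3))
    for ax, (name, correct) in zip(axes, [("TP", True), ("FN", False)]):
        for g in group_values:
            mask = (groups == g) & (y == 1) & ((preds == 1) == correct)
            ax.hist(margins[mask], bins=30, alpha=0.5, density=True, label=str(g))
        ax.set_title(name)
        ax.set_xlabel("geometric margin")
        ax.legend()
    plt.tight_layout()
    plt.show()
```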

Fig. 7 Margin distributions of TP and FN samples of the Compas dataset on non-private Linear SVM and LR models. Subfigures (a), (b), (c), (d) show SVM_TP, SVM_FN, LR_TP, and LR_FN, respectively

Fig. 8 Margin distributions of TP and FN samples of the Default dataset on non-private Linear SVM and LR models. Subfigures (a), (b), (c), (d) show SVM_TP, SVM_FN, LR_TP, and LR_FN, respectively

5.2 Impact of margin gap

We then show that when the private hyperplane deviates from the original hyperplane, the TP samples with smaller margins are more likely to be wrongly classified as negative, and the FN samples with smaller margins are more likely to be correctly classified as positive.

Theorem 1

Let m denote the margin of a data sample \(\mathbf{x}\) to the original hyperplane whose normal vector is \(\theta^{*}\). If m is greater than \(\frac{\lambda L}{\left\| \theta^{*} \right\|_2}\), then with probability less than \(\alpha\), the private model \(\theta_{priv}\) trained by a differentially private ERM algorithm that follows \((\lambda, \alpha)\)-deviation makes a different prediction from the original model on \(\mathbf{x}\), i.e.

$$\begin{aligned} Pr((\theta^{\mathsf{*T}} \cdot \mathbf{x})(\theta^{\mathsf{T}}_{priv} \cdot \mathbf{x})< 0 ) < \alpha \end{aligned}$$

where L is the upper bound on the data samples' \(L_2\) norm.

Proof

$$\begin{aligned} (\theta^{\mathsf{*T}} \cdot \mathbf{x})(\theta_{priv}^{\mathsf{T}} \cdot \mathbf{x}) &= (\theta^{\mathsf{*T}} \cdot \mathbf{x})((\theta^{*} + \theta_{priv} - \theta^{*})^{\mathsf{T}} \cdot \mathbf{x}) \\ &= (\theta^{\mathsf{*T}} \cdot \mathbf{x})(\theta^{\mathsf{*T}} \cdot \mathbf{x} + (\theta_{priv} - \theta^{*})^{\mathsf{T}} \cdot \mathbf{x}) \end{aligned}$$

According to Cauchy-Schwarz inequality,

$$\begin{aligned} \vert (\theta_{priv}- \theta^{*})^{\mathsf{T}} \cdot \mathbf{x} \vert \le \left\| \theta_{priv}- \theta^{*} \right\|_{2} \cdot \left\| \mathbf{x} \right\|_{2} < \lambda L \end{aligned}$$

As stated in Section 4.1, to ensure that the loss functions are \(L_2\)-Lipschitz continuous, the \(L_2\) norms of all data samples are at most L. Therefore,

$$\begin{aligned} \left\| \mathbf {x} \right\| _2 \le L \end{aligned}$$

Meanwhile, according to the deviation property of differentially private ERM algorithms, with probability at least \(1-\alpha\),

$$\begin{aligned} \left\| \theta _{priv}- \theta ^{*} \right\| _2 < \lambda \end{aligned}$$

According to the definition of margin, \(\rho_h(\mathbf{x}) \ge \frac{\lambda L}{\left\| \theta^{*} \right\|_2}\) implies that \(\vert \theta^{\mathsf{*T}} \cdot \mathbf{x}\vert \ge \lambda L\). Therefore, the sign of \(\theta^{\mathsf{*T}} \cdot \mathbf{x} + (\theta_{priv}- \theta^{*})^{\mathsf{T}} \cdot \mathbf{x}\) is consistent with the sign of \(\theta^{\mathsf{*T}} \cdot \mathbf{x}\) with probability at least \(1-\alpha\). Thus,

$$\begin{aligned} Pr((\theta^{\mathsf{*T}} \cdot \mathbf{x} )( \theta^{\mathsf{T}}_{priv} \cdot \mathbf{x} )< 0 ) < \alpha \end{aligned}$$

According to Definition 3 and the deviation properties of the three differentially private ERM algorithms identified in Section 3.2, a smaller deviation \(\lambda\) implies a higher \(\alpha\). Meanwhile, in Theorem 1, a smaller m implies a smaller \(\lambda\). Consequently, the bound of Theorem 1 shows that a differentially private margin classifier \(\theta_{priv}\) is more likely to make a different prediction from the non-private model on a data sample with a smaller m. As shown in Figure 6, when the hyperplane deviates from its original position, the data samples that are closer to the original hyperplane are more likely to be classified differently. When the private model makes a different prediction from the non-private model on such samples, TP samples suffer accuracy loss, while FN samples gain accuracy. Therefore, Theorem 1 shows that the hyperplane deviation caused by differential privacy noise leads to more accuracy loss on the TP data samples that are closer to the original hyperplane, and more accuracy gain on the FN data samples that are closer to the original hyperplane.
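
Theorem 1 can be illustrated numerically: perturb \(\theta^*\) with random vectors of norm less than \(\lambda\) and check that samples whose margin exceeds \(\frac{\lambda L}{\left\|\theta^*\right\|_2}\) never change prediction, whereas samples closer to the hyperplane may flip. The sketch below uses synthetic data and fixes the deviation inside the \(\lambda\)-ball, so the probability-\(\alpha\) failure branch of the theorem is not exercised; all numbers are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
p, lam, L = 5, 0.3, 1.0
theta_star = rng.normal(size=p)

# Synthetic samples with L2 norm at most L.
X = rng.normal(size=(1000, p))
X *= np.minimum(1.0, L / np.linalg.norm(X, axis=1, keepdims=True))

margins = np.abs(X @ theta_star) / np.linalg.norm(theta_star)
safe = margins > lam * L / np.linalg.norm(theta_star)   # samples covered by Theorem 1

# Random deviations with ||theta_priv - theta*||_2 < lam.
flips = np.zeros(len(X), dtype=bool)
for _ in range(200):
    d = rng.normal(size=p)
    d *= lam * rng.uniform() / np.linalg.norm(d)
    theta_priv = theta_star + d
    flips |= (X @ theta_star) * (X @ theta_priv) < 0

print("flips among samples above the margin threshold:", flips[safe].sum())   # expected: 0
print("flips among the remaining samples:", flips[~safe].sum())
```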

The above results can also explain why the TPR gaps of private classifiers trained by AMP, DPSGD, and PSGD are slightly different. As these three differentially private ERM algorithms have different deviation properties, which are shown in Appendix 1, they have slightly different impacts on the TPR gaps of the trained classifiers. Meanwhile, a potential way to mitigate the negative impact of differentially private ERM algorithms on the fairness of margin classifiers is a tighter analysis of their privacy cost: under the same privacy budget, a tighter privacy-cost analysis allows a differentially private margin classifier to be trained with less random noise. We leave this as future work.

5.3 Deep analysis of empirical results

With the analysis results from Sections 5.1 and 5.2, we analyze the empirical results from Section 4 as follows.

According to Section 5.1, a TPR gap between groups implies different margin distributions for these groups. Concretely, the group with a higher TPR has more TP data samples whose margins are distributed on high values and more FN data samples whose margins are distributed on low values. Meanwhile, as the bound of Theorem 1 shows, when the original hyperplane is deviated by differential privacy noise, the group with a higher TPR suffers less accuracy loss on its TP data samples and gains more accuracy on its FN data samples. Therefore, the significant TPR gaps of the non-private margin classifiers trained on the Compas and Adult datasets are amplified in their differentially private versions.

By contrast, if a non-private margin classifier has almost the same TPR on different groups, the ‘positive’ data samples of these groups have similar margin distributions. The TP and FN data samples of different groups then obtain similar bounds in Theorem 1. As a result, the differentially private version of the margin classifier also has almost the same TPR on these groups. For example, the TPR gaps of the non-private classifiers trained on Default are close to 0. Therefore, the random noise introduced by differential privacy has little impact on the fairness of the differentially private classifiers, i.e. the TPR gaps of the non-private and private classifiers trained on Default are close.

To further verify the above results, we use a pre-processing method proposed by Donini et al. [11] to mitigate the biases in the Compas and Adult datasets. We then train non-private and private Linear SVM and LR models on the debiased datasets. The TPR gap results are shown in Figure 9. When we reduce the TPR gaps of the non-private models trained on the Compas dataset from 0.117 and 0.157 to 0.050 and 0.050, the negative impact of differential privacy is largely mitigated, or even eliminated. On the Adult dataset, when we reduce the TPR gaps of the non-private models from 0.072 and 0.071 to 0.024 and 0.028, the TPR gaps of the private models are very similar to those of the non-private models. These results further show that the fairness of differentially private margin classifiers strongly depends on the fairness of their non-private versions.

Fig. 9 TPR gaps of non-private and private margin classifiers trained on the Compas and Adult datasets pre-processed by the method proposed in [11]. Subfigures (a), (b), (c), (d) show Compas (SVM), Compas (LR), Adult (SVM), and Adult (LR), respectively

6 Discussion and future work

Non-convex models

Currently, the domains protected by anti-discrimination laws are mainly high-stakes, such as credit assessment and criminal justice. Deep learning models are still far from being widely deployed in these domains due to their lack of interpretability and robustness [40,41,42]. Therefore, we focus on the fairness of differentially private margin classifiers in this paper. Besides, current differentially private ERM algorithms for non-convex models still lack rigorous utility guarantees. We therefore leave identifying the deviation properties of non-convex models as future work.

Extending our results to other accuracy-based fairness notions

We have shown that the TPR gap of a non-private margin classifier implies a margin distribution difference between the TP samples or FN samples of different groups. According to the qualitative analysis of the SVM and LR loss functions, the same result holds for the true negative rate gap and the total accuracy gap: either gap between two groups would also imply different margin distributions of the corresponding data samples. As Theorem 1 only depends on the margin of a data sample, the results of this paper can be extended to other accuracy-based fairness notions, including equal odds [12], which requires that different groups have the same true positive rate and true negative rate, and accuracy parity [18], which requires that different groups have the same accuracy.

Future work

In the future, we will quantitatively analyze the correlation between the TPR gap and the margin distribution difference among groups in non-private margin classifiers, to understand the impact of differential privacy on the fairness of margin classifiers more deeply.

7 Conclusion

In this paper, we study the dominant factor in the fairness of differentially private margin classifiers. Through a well-designed empirical study and an analysis of how differential privacy impacts the fairness of margin classifiers, we show that the fairness of differentially private margin classifiers strongly depends on the fairness of their non-private counterparts. In summary, we argue that if non-private margin classifiers are fair with negligible TPR gaps, the fairness of their differentially private versions can be ensured.