1 Introduction

Machine learning (ML) has achieved great success in numerous fields, such as decision-making, risk identification, and disease diagnosis. Besides advances in ML algorithms, the widespread adoption and effectiveness of machine learning are mainly attributed to the increasing affordability of collecting, storing, and processing large quantities of data, and especially to the feasibility of the joint and integrated use of data from multiple sources (e.g., data collected by different institutes or stored in different data centers). However, sharing data may raise security and privacy issues. With increasing awareness of data privacy protection, sharing data containing citizens’ sensitive information is legislatively restricted by regulations such as the General Data Protection Regulation (GDPR) [1, 2] in the European Union, the Personal Data Protection Act (PDPA) [3] in Singapore, and the California Consumer Privacy Act (CCPA) [4] in the US.

Under such circumstances, federated learning (FL) has been proposed and has gained great popularity. It enables multiple data owners (e.g., clients or data centers) to collaboratively train ML models without revealing their private data. In the basic workflow, participants in FL iteratively (i) conduct local computations on their data to derive certain intermediate results, (ii) conceal the intermediate results with certain cryptographic tools, and (iii) share the protected results with other participants, until the final training result is achieved.

According to how the data is partitioned, FL can be categorized into horizontal federated learning (HFL) and vertical federated learning (VFL). HFL deals with the case where the data is horizontally partitioned, i.e., the datasets share the same feature space but differ in the sample space. For example, two hospitals hold medical records of different groups of patients (i.e., different sample spaces) that describe whether they have a certain disease (i.e., the same feature space). Comparatively, VFL deals with the case where the data is vertically partitioned, i.e., the datasets share the same sample space but differ in the feature space. A typical example of VFL is that two hospitals hold medical records of the same group of patients (i.e., the same sample space), but each dataset describes different aspects of a patient’s medical status, e.g., the dataset from hospital A records the COVID swab tests of the patients, whereas the dataset from hospital B records their chest CT scans (i.e., different feature spaces).

Due to its relatively symmetric structure, the construction of HFL is simple. It is normally built via model averaging: in each iteration, every client trains a model on its local data for several epochs and sends the model to the server, which then aggregates the submitted local models to obtain the updated global model. In contrast, the training process of VFL, especially VFL constructed with model splitting, is relatively complex. In VFL with model splitting, the whole model is divided into a top model (held at the server) and several bottom models (held at the clients as intermediate components). Training proceeds through iterative bilateral communication: the clients run their bottom models on their local data and upload the results to the server; the server runs the top model to aggregate the participants’ outputs, computes the gradients of the loss, and sends the gradients back to every participant.
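For concreteness, the following is a minimal PyTorch-style sketch of one iteration of VFL with model splitting; the layer sizes and tensors are illustrative placeholders rather than the concrete architecture used later in this paper.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (not the paper's exact architecture).
x_a, x_b = torch.randn(8, 16), torch.randn(8, 16)   # features held by clients A and B
y = torch.randint(0, 10, (8,))                       # labels held by the server

bottom_a = nn.Sequential(nn.Linear(16, 32), nn.ReLU())   # client A's bottom model
bottom_b = nn.Sequential(nn.Linear(16, 32), nn.ReLU())   # client B's bottom model
top = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))  # server's top model

opts = [torch.optim.SGD(m.parameters(), lr=0.1) for m in (bottom_a, bottom_b, top)]

# (i) clients run their bottom models on local data and upload the outputs
h_a, h_b = bottom_a(x_a), bottom_b(x_b)
o_a, o_b = h_a.detach().requires_grad_(True), h_b.detach().requires_grad_(True)

# (ii) the server aggregates the uploads, trains the top model, and computes the loss
loss = nn.CrossEntropyLoss()(top(torch.cat([o_a, o_b], dim=1)), y)
loss.backward()            # gradients w.r.t. o_a and o_b are the "messages" sent back

# (iii) each client continues backpropagation through its own bottom model
h_a.backward(o_a.grad)
h_b.backward(o_b.grad)
for opt in opts:
    opt.step(); opt.zero_grad()
```

In a real deployment, the uploads and downloads in steps (i) and (iii) are network messages between separate parties; here they are emulated by detaching the intermediate tensors.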

Accordingly, potential attackers in HFL have the chance to observe the gradients of all the parameters of the model, which can be used to infer private information. Comparatively, attackers in VFL only control part of the federated model and can therefore only obtain the gradients of an incomplete model. Thus, it is generally believed that VFL provides a higher security guarantee, especially VFL constructed with model splitting (in which the participants have no access to the last layer of the DNN, so the labels at the server are more secure) [5]. In spite of this, recent research reveals that it is still feasible to conduct inference attacks on VFL [6]. Specifically, Fu et al. [6] find that the gradients sent from the server essentially help a client learn a good feature representation with respect to the labels. This leaked information serves as a pre-trained model for label inference attacks.

In this paper, we propose the dispersed training framework to combat such attacks and strengthen the security of VFL. The basic idea of dispersed training is to utilize secret sharing to break the correlation between the gradients and the training data. As shown in Figure 1, participant B holds his own labeled data and hopes to train a model; he wants to utilize participant A’s data to improve the quality of his model. Participant A is the potential attacker, who intends to infer participant B’s labels. In the dispersed training framework, a shadow model (i.e., participant C) is created for participant A, and part of participant A’s data is shared with participant C. In the training phase, the clients (i.e., participants A and C) update their bottom models with their shared data and upload their partial outputs to the server. The server aggregates the clients’ partial outputs and trains its top model; the outputs can be efficiently aggregated due to the linearity of the secret sharing scheme. The server’s training output is also segmented into two parts, and each segment is delivered to A and C, respectively, who then iteratively train and update their bottom models. With such a method, even if the attacker receives the gradients in the training phase, he is incapable of deducing the feature representation of the labels from the bottom model.
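To illustrate why linearity matters here, the following is a minimal numpy sketch of additive secret sharing, where locally processed shares can be merged into exactly the result that would have been obtained on the unshared value; the concrete Mergshare() and SS() functions used in §3 may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def share(secret, rng):
    """Split a secret into two additive shares: secret = s1 + s2."""
    s1 = rng.standard_normal(secret.shape)
    return s1, secret - s1

# Two parties hold additive shares of the same intermediate output o_a.
o_a = rng.standard_normal((4, 8))          # "true" bottom-model output (illustrative)
o_a1, o_a2 = share(o_a, rng)               # shares held by participants A and C

# Any linear post-processing commutes with reconstruction:
W = rng.standard_normal((8, 3))            # a linear map applied during merging
merged = o_a1 @ W + o_a2 @ W               # each share processed separately, then merged
assert np.allclose(merged, o_a @ W)        # identical to processing the full output
```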

Figure 1: Architecture and workflow of the dispersed training framework

The rest of the paper is organized as follows. §2 presents the basic background of federated learning and describes how inference attacks are conducted on vertical federated learning. We present our dispersed training framework and its construction in §3. The performance evaluation is presented in §4. Related work on this topic is reviewed in §5, and we conclude the paper in §6.

2 Inference attacks in VFL: an introduction and analysis

2.1 Federated learning

Compared with common centralized learning, federated learning provides a collaborative method to train models on distributed data while protecting data privacy [7]. According to the characteristics of the data, federated learning is mainly divided into three categories: horizontal federated learning, vertical federated learning, and federated transfer learning [8].

Horizontal federated learning is suitable for the situation where the data features of the participants overlap a lot but the sample IDs overlap little. Its purpose is to train a more accurate prediction model by combining more samples with the same features. For example, two e-commerce platforms in different regions have purchase records of users at the same consumption level in their respective regions. The two platforms can jointly train a model through horizontal federated learning and push products to users at this consumption level. Vertical federated learning is suitable for the case where the sample IDs overlap a lot but the data features overlap little. Its purpose is to train a model by combining more features of the same samples. For example, consider a bank and a lending company in the same area: the bank holds data on a group of people’s economic status and whether they owe money, while the lending company holds other asset information about the same group of people. By cooperating, the bank and the lending company can carry out vertical federated learning on their data and train an accurate model to predict risk and decide whether to grant a loan to someone [8]. Federated transfer learning is more suitable when both the data features and the samples of the participants overlap little, or when the data distributions of the participants differ greatly. For example, hospitals and lending companies in different regions may want to train a prediction model, but the two parties have little overlap in data features and samples. In this case, federated transfer learning can be used to train the model [9].

2.2 Label inference attacks in VFL

The label inference attack against vertical federated learning was proposed by Fu et al. [6], revealing the privacy risk that VFL poses to labels. Fu et al. [6] designed three attack methods to infer highly private labels, which enable malicious participants to perform inference attacks on the labels of any participant’s data, resulting in serious privacy leakage.

The passive label inference attack targets vertical federated learning with model splitting. Although no participant can access the top model, the trained bottom model can still be used for inference attacks [10]. The principle is that, during the training process, the adversary’s bottom model converts its features into information indicative of the labels, which can then be used for prediction. The adversary trains its bottom model with such indicative features through vertical federated learning, then adds a randomly initialized inference layer on top of the trained bottom model to form a complete model, and performs semi-supervised learning on this attack model with a small amount of labeled data. With this model, the labels can be directly inferred.
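A rough sketch of this model-completion idea is given below (illustrative shapes; it only fine-tunes with a few labeled samples and omits the semi-supervised component that Fu et al. additionally use).

```python
import torch
import torch.nn as nn

num_classes = 10
# Adversary's bottom model after VFL training (illustrative structure).
trained_bottom = nn.Sequential(nn.Linear(16, 32), nn.ReLU())

# "Complete" the model with a randomly initialized inference head.
attack_model = nn.Sequential(trained_bottom, nn.Linear(32, num_classes))

# Fine-tune with the few labeled samples the adversary owns.
x_labeled = torch.randn(40, 16)
y_labeled = torch.randint(0, num_classes, (40,))
opt = torch.optim.Adam(attack_model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.CrossEntropyLoss()(attack_model(x_labeled), y_labeled)
    loss.backward()
    opt.step()

# Inferred labels for arbitrary local samples:
pred = attack_model(torch.randn(5, 16)).argmax(dim=1)
```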

The active label inference attack lets the adversary accelerate the gradient descent of its bottom model, so that the bottom model is better trained in each iteration and provides better features to the server. This makes the server’s top model more dependent on the adversary’s bottom model; the adversary then fine-tunes this bottom model into a complete attack model that can perform label inference attacks. Fu et al. [6] designed a malicious local optimization algorithm that augments gradient descent with a mechanism limiting the learning rate, which keeps the learning rate appropriate, accelerates gradient descent, and makes the malicious bottom model better trained.
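The following is only a hypothetical illustration of this acceleration idea, amplifying the adversary’s local gradients while bounding the step size; it is not Fu et al.’s exact malicious optimizer.

```python
import torch

class MaliciousSGD(torch.optim.SGD):
    """Hypothetical 'accelerated' local optimizer: amplify local gradients by
    gamma, then clip them so the effective step size stays bounded."""
    def __init__(self, params, lr=0.1, gamma=2.0, max_norm=1.0):
        super().__init__(params, lr=lr)
        self.gamma, self.max_norm = gamma, max_norm

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                p.grad.mul_(self.gamma)                  # amplify the local gradient
                norm = p.grad.norm()
                if norm > self.max_norm:                 # keep the update bounded
                    p.grad.mul_(self.max_norm / norm)
        return super().step(closure)

# The adversary would simply replace its honest optimizer, e.g.:
# opt = MaliciousSGD(bottom_model_a.parameters(), lr=0.1)
```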

The direct label inference attack targets vertical federated learning without model splitting. The adversary can observe the gradients returned by the server. Fu et al. [6] proved by mathematical analysis that the adversary can infer labels directly from the signs of these gradients: when the gradient sign is negative, the guessed label is exactly the actual label; when the gradient sign is positive, the guessed label is incorrect. This can cause serious label leakage.
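As a generic worked example (not Fu et al.’s full analysis), for a softmax output with cross-entropy loss the gradient of the loss with respect to the logit of the true class is \(p_{y}-1<0\), while for every other class it is \(p_{k}>0\), so the sign alone reveals the label:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 10, requires_grad=True)   # logits observed by the adversary
y_true = torch.tensor([3])                        # label held by the other party

F.cross_entropy(logits, y_true).backward()

# The gradient w.r.t. the true-class logit is negative, all others are positive,
# so the sign of the received gradient directly reveals the label.
print(logits.grad)
print((logits.grad < 0).nonzero())                # -> column index 3
```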

In summary, all three attacks result in serious privacy breaches. The passive label inference attack exploits the strong inference ability of the bottom model, which is extended via semi-supervised learning into an attack model that infers labels. The active label inference attack builds on the passive one, using a malicious SGD variant to accelerate gradient descent so that the malicious bottom model is trained better and has a stronger inference capability. The direct label inference attack, on the other hand, is not closely related to the first two: it leaks label information directly through the gradients returned by the server. Since previous work [6] shows that common privacy protection methods cannot defend against the passive and active label inference attacks, the model proposed in this paper mainly addresses these first two attack methods.

3 Dispersed training

3.1 The training model

Before describing the training model in more detail, we present it in brief. First, a shadow model c is generated, which has the same structure as model a but is of a benign type. The difference between the malicious and benign types is that, during training, the benign model uses SGD for gradient descent, while the malicious model uses its own local optimizer to accelerate gradient descent. Then the malicious model shares its training dataset with the shadow model through secret sharing, and local training is performed on the bottom models. The outputs of the bottom models are merged and aggregated before being uploaded to the top model. Finally, the top model distributes the gradients to the bottom models; note that the shadow model shares its gradient with the malicious model. The above process is repeated until the model converges.

Algorithm 1 gives the pseudo-code of the training model; a simplified code sketch is also provided after the algorithm. More details are as follows.

  1. Initialization. Initialize a shadow model c, whose structure is the same as that of model a but whose type is benign. Note that although model c has the same structure, its training process differs from that of model a. Then, initialize the parameters of the bottom models and the top model, respectively.

  2. Dataset Splitting. Following the assumptions in the literature [6], there are two bottom models a and b in the training model, with training sets \(x_{a}\) and \(x_{b}\), respectively, where participant A, holding model a, is an attacker, and participant B, holding model b, is an honest participant. Here we only need to split the dataset \(x_{a}\) of model a; that is, the bottom model a and the shadow model c share the training set \(x_{a}\). Specifically, \(x_{a}\) is split into \(\alpha x_{a}\) and \((1-\alpha )x_{a}\), where \(\alpha\) is a random number, \(\alpha x_{a}\) is participant A’s training set, and \((1-\alpha )x_{a}\) is participant C’s training set.

  3. Training for Bottom Model. In each round of local training, models a, b, and c use their respective training datasets. Since the training data is large, it is divided into batches (denoted as \(bat_{i}\), where \(i\in \{a,b,c\}\)) in the actual training process. It should be noted that the bottom model is a linear model and the activation function ReLU() is also linear; therefore, during the entire training process, the parameters of each layer of bottom models a and c always maintain a linear relationship.

  4. Merging and Aggregation of Outputs. This stage is divided into two steps:

    (a) The merging of the outputs of models a and c. Although both participants A and C participate in the local training, the top model still has an interface only with model a. Therefore, in the dispersed training model, we design a merge function Mergshare(). After models a and c have trained their local models and generated the outputs \(o_{a}\) and \(o_{c}\), the function treats \(o_{a}\) and \(o_{c}\) as two shares and merges them into a new output \(o_{a}\) through Mergshare(). The new output \(o_{a}\) is uploaded to the top model through the interface between the top model and model a. Note that models a and c maintain a linear relationship during training, so the merged result is consistent with the result that model a would upload if trained alone. That is, the new output \(o_{a}\) uploaded to the top model under the name of A is the same as in previous work [6], where A directly uploads its output to the top model. Here we assume that there is an output merging layer between models a, c and the top model. Its main function is to combine the outputs of models a and c for the top model. At the same time, the gradients received from the top model are first secret-shared and then downloaded to models a and c, respectively.

    (b) The aggregation of outputs \(o_{a}\) and \(o_{b}\). After the merging in step 4(a), the top model obtains \(o_{a}\) and \(o_{b}\) through its interfaces with models a and b, respectively. Then, following the method of previous work [6], the function Concat() is used to aggregate \(o_{a}\) and \(o_{b}\) into \(o_{all}\), which is further used to train the top model.

  5. Training for Top Model. The top model is trained with the aggregated output \(o_{all}\) to obtain the final output \(o_{final}\).

  6. Gradient Download. After the top model is trained locally, the loss is computed and the gradients are passed down to the bottom models. In previous work [6], the top model passes the gradients down to models a and b, respectively. In this paper, model b receives its gradient in the same way as in previous work [6]. However, downloading the gradient to model a is different because model c is involved. In particular, the gradient \(g_{a}\) of model a is generated first; then a new \(g_{a}\) and \(g_{c}\) are obtained by the secret sharing function SS() and passed to models a and c, respectively. This function is implemented by the merging layer.

  7. Repeat steps (1)-(6) for the bottom models and the top model until convergence.

Algorithm 1: The framework for the training process
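Below is a minimal, self-contained sketch of one dispersed-training iteration under our assumptions; the layer sizes are illustrative, and Mergshare() and SS() are realized here as a simple sum and an additive re-sharing, which may differ from the exact implementation.

```python
import torch
import torch.nn as nn

# Illustrative shapes; the real models follow Table 2.
x_a = torch.randn(8, 16)                 # attacker A's features, shared with shadow model C
x_b = torch.randn(8, 16)                 # honest participant B's features
y = torch.randint(0, 10, (8,))           # labels used by the top model

alpha = torch.rand(())                   # random splitting coefficient (step 2)
x_a_share, x_c_share = alpha * x_a, (1 - alpha) * x_a

bottom = {k: nn.Linear(16, 32, bias=False) for k in "abc"}   # linear bottom models (step 3)
top = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
opts = [torch.optim.SGD(m.parameters(), lr=0.1) for m in list(bottom.values()) + [top]]

# Bottom-model training and upload.
h_a, h_c, h_b = bottom["a"](x_a_share), bottom["c"](x_c_share), bottom["b"](x_b)

# Step 4(a): Mergshare() treats o_a and o_c as two shares and merges them under A's name.
o_a = (h_a + h_c).detach().requires_grad_(True)
o_b = h_b.detach().requires_grad_(True)

# Steps 4(b)-5: the server aggregates with Concat() and trains the top model.
loss = nn.CrossEntropyLoss()(top(torch.cat([o_a, o_b], dim=1)), y)
loss.backward()

# Step 6: gradient download; SS() re-shares g_a between A and C.
r = torch.randn_like(o_a.grad)
g_a, g_c = o_a.grad - r, r               # additive shares of the gradient for A and C
h_a.backward(g_a); h_c.backward(g_c); h_b.backward(o_b.grad)
for opt in opts:
    opt.step(); opt.zero_grad()
```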

3.2 The attack model

In the attack model, we still use the model completion approach proposed by Fu et al. [6]. After the federated training process is completed, the attacker obtains its trained bottom model, which still has a strong capability for label inference attacks. As in Fu et al. [6], we retrain the bottom model with a small amount of labeled data. More specifically, the malicious attacker adds an extra layer and continues with semi-supervised learning, finally obtaining a trained model. This newly trained model is used as the attack model to execute the label inference attack.

4 Performance evaluation

4.1 Experiments settings

All experiments are performed on an Intel(R) Core i5-12500H @ 2.50 GHz with 16 GB RAM and an NVIDIA GeForce RTX 2050 card (Table 1). The three datasets used in the experiments are CIFAR-10, CINIC-10, and BCW, which are also the datasets used in previous work [6]. Top-1 accuracy is selected as the performance indicator for both the original federated tasks and the label inference attacks. Top-1 accuracy means that the class with the largest value in the final probability vector is taken as the prediction; if this prediction equals the actual label, the prediction is correct, otherwise it is wrong. When training the attack model, we select an additional small amount of labeled data for semi-supervised training of the bottom model: 40 labeled samples for CIFAR-10 and CINIC-10, and 20 labeled samples for BCW. As mentioned in previous work [6], the number of labeled samples affects the effectiveness of the label inference attack, but once the number of samples reaches a certain level, the attack accuracy grows only slowly. Therefore, as in previous work [6], 40 and 20 labeled samples are also used in our experiments.
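For reference, Top-1 accuracy is simply argmax accuracy, which can be computed as in the following sketch:

```python
import torch

def top1_accuracy(probs: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of samples whose highest-scoring class equals the true label."""
    return (probs.argmax(dim=1) == labels).float().mean().item()

# Example: top1_accuracy(model(x_test), y_test)
```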

Table 1 Experimental equipment, experimental data on the indicators
CIFAR-10:

CIFAR-10 is a typical classification dataset containing 60,000 images, of which 50,000 are used as the training set and 10,000 as the test set. Each image is a 32×32 color image, and the dataset contains 10 different categories.

CINIC-10:

CINIC-10 is an extension of CIFAR-10 constructed with downsampled ImageNet images and, like CIFAR-10, it is divided into 10 categories. It was created to address the limited size of CIFAR-10 and contains 270,000 images, 4.5 times as many as CIFAR-10. The images are evenly divided into three subsets: a training set, a validation set, and a test set, each with 90,000 images; its training set is 1.8 times the size of CIFAR-10’s, and its test set is 9 times the size of CIFAR-10’s. This dataset allows us to test the effect of our scheme on a large dataset.

BCW:

The Breast Cancer Wisconsin (BCW) dataset is a breast cancer dataset with 569 samples and 32 feature columns, mainly describing cell nucleus features. The label of each sample indicates whether the diagnosis is benign or malignant. As in Fu et al. [6], 426 samples are randomly selected as the training set and the remaining 143 samples as the test set.

In order to compare with Fu et al. [6], we use the same top and bottom model structures as Fu et al. [6]. For the large datasets CIFAR-10 and CINIC-10, the bottom model is a residual network and the top model is a fully connected neural network; for the BCW dataset, both the bottom and top models are fully connected neural networks. The specific structures are shown in Table 2.

Table 2 Neural network structure of the bottom model and top model
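Purely for illustration (the layer widths below are hypothetical; the exact structure is given in Table 2), the two model types could be instantiated as follows.

```python
import torch.nn as nn
from torchvision.models import resnet18

# BCW: fully connected bottom and top models (hypothetical widths).
bcw_bottom = nn.Sequential(nn.Linear(16, 20), nn.ReLU(), nn.Linear(20, 10))
bcw_top = nn.Sequential(nn.Linear(20, 10), nn.ReLU(), nn.Linear(10, 2))

# CIFAR-10 / CINIC-10: a residual-network bottom model and a fully connected top model.
cifar_bottom = resnet18(num_classes=10)
cifar_top = nn.Sequential(nn.Linear(20, 20), nn.ReLU(), nn.Linear(20, 10))
```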

4.2 Comparison with original attack

The performance of the model trained with dispersed training against the passive label inference attack is shown in Figure 2. From the attack Top-1 accuracy on the three datasets, it can be seen that, compared with the label inference attack proposed by Fu et al. [6], the attack effectiveness drops significantly, by about 70\(\%\) on the CIFAR-10 dataset. After dispersed training, the attack Top-1 accuracies on the CIFAR-10, CINIC-10, and BCW datasets are 9.99\(\%\), 10.02\(\%\), and 36.36\(\%\), respectively. For CIFAR-10 and CINIC-10, the attack Top-1 accuracy is reduced to about 10\(\%\); since these two datasets have only ten categories, an attack Top-1 accuracy of 10\(\%\) is equivalent to randomly guessing among the ten categories. The results on both CIFAR-10 and CINIC-10 are around 10\(\%\), indicating that our scheme also defends well on large datasets. Overall, the dispersed training proposed in this paper can effectively prevent passive label inference attacks.

Figure 2: Performance comparison between the original passive attack and our attack

In addition, we also evaluate active label inference attacks. As can be seen from Figure 3, the model trained with dispersed training is also effective against active label inference attacks. On the CIFAR-10 dataset, the attack Top-1 accuracy drops from 84.84\(\%\) to about 10\(\%\), and the attack Top-1 accuracies on the other two datasets also drop significantly.

Figure 3: Performance comparison between the original active attack and our attack

Overall, our scheme reduces the accuracy of label inference attacks to the level of random guessing, which shows that dispersed training can effectively mitigate label inference attacks.

4.3 Comparison with original federated tasks

Table 3 shows the accuracy of the original federated tasks on each dataset after dispersed training. Compared with the original federated learning, the federated accuracy decreases after dispersed training: it drops by 4\(\%\), 15\(\%\), and 18\(\%\) on the BCW, CIFAR-10, and CINIC-10 datasets, respectively. The accuracy on the CIFAR-10 and CINIC-10 datasets drops more noticeably, which we attribute to the larger size of these datasets. Although dispersed training effectively prevents label inference attacks, the federated accuracy also decreases to a certain extent, especially on large datasets. Therefore, the implementation of dispersed training requires a trade-off between defending against label inference attacks and federated training accuracy.

Table 3 The performance of the original federated tasks and our federated tasks after dispersed training

4.4 Comparison with gradient compression

Compared with common machine learning privacy-preserving methods, our scheme reduces the performance of both active and passive label inference attacks to that of random guessing while not degrading the original federated task too much. For example, with the method of adding noise to the gradients, the performance of the original federated learning task decreases from 0.8 to about 0.1, which effectively breaks the federated task. Take gradient compression, another privacy-preserving method, as an example. As shown in Table 4, the accuracy of the active label inference attack decreases from 0.8484 to 0.64 after gradient compression with a compression rate of 0.9, whereas our scheme reduces it to about 0.10, i.e., random guessing. Both methods reduce the original federated performance to roughly 0.7.

Table 4 Defense performance of gradient compression and our scheme against the active label inference attack
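For reference, gradient compression as a defense typically zeroes out most gradient entries and keeps only the largest ones; a generic top-k sketch is shown below (here a compression rate of 0.9 means 90\(\%\) of the entries are dropped), though the exact variant evaluated above may differ.

```python
import torch

def compress_gradient(grad: torch.Tensor, compression_rate: float = 0.9) -> torch.Tensor:
    """Zero out the smallest-magnitude entries, keeping only (1 - rate) of them."""
    k = max(1, int(grad.numel() * (1.0 - compression_rate)))
    flat = grad.flatten()
    idx = flat.abs().topk(k).indices
    out = torch.zeros_like(flat)
    out[idx] = flat[idx]
    return out.view_as(grad)

g = torch.randn(4, 8)
g_compressed = compress_gradient(g, 0.9)   # sent to the client instead of the full gradient
```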

4.5 Output distribution of malicious model

We use t-SNE [11] to map the outputs of bottom model A into a 2D space. As shown in Figure 4, the clusters of different colors are not clearly separated. Because of dispersed training, bottom model A cannot learn the relationship between labels and features and has a poor ability to perform label inference attacks; the attack model formed after model completion therefore cannot perform effective label inference attacks.

Figure 4: The outputs of attack model a
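Such a visualization can be produced with a standard t-SNE call, for example as in the following sketch (assuming scikit-learn and matplotlib; the outputs and labels below are random placeholders):

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

outputs = np.random.randn(1000, 32)            # placeholder for bottom model A's outputs
labels = np.random.randint(0, 10, 1000)        # placeholder for the (unknown to A) labels

embedded = TSNE(n_components=2, init="pca", random_state=0).fit_transform(outputs)
plt.scatter(embedded[:, 0], embedded[:, 1], c=labels, cmap="tab10", s=5)
plt.savefig("bottom_model_a_tsne.png")
```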

5 Related work

At present, data security is a thorny issue, including the malicious recovery of image data, the anonymity of network data transmission, the privacy protection of distributed systems, and data privacy protection in big data environments [12]. Various solutions have been proposed to solve these kinds of problems [13,14,15,16,17,18,19]. Zhang et al. [20] proposed an interesting approach for optimizing multicast traffic based on the advantages of software-defined networking. Among them, machine learning, especially federated learning, is a hot topic. Federated learning (FL) was first proposed by Google [21,22,23], aiming at building machine learning models with distributed entities (e.g., devices or datasets), where the private information of the entities should be protected [13]. In general, FL has three categories: horizontal federated learning [23,24,25], vertical federated learning [26,27,28,29,30], and federated transfer learning [31]. Federated reinforcement learning [32] has also recently emerged. Security issues [33,34,35,36,37,38,39], especially in vertical federated learning, have aroused wide concern [40, 41]. Wei et al. [42] investigate the issues of security and privacy in VFL. Fu et al. [6] discuss the problem of label leakage rather than membership inference or sample property inference [43,44,45,46]. Rassouli et al. [47] prove that it is possible for the adversary to reconstruct the passive party’s features in a black-box setting. To protect privacy in VFL, Zhu et al. [48] propose a secure framework, PIVODL, and Han et al. [49] propose FedValue, which uses Shapley-CMI and guarantees data privacy from a game-theoretic perspective.

In general, differential privacy (DP) [50,51,52,53] and homomorphic encryption (HE) [54,55,56,57] are used to protect data privacy in VFL. Geyer et al. [58] guarantee the privacy of users in the training process. Yuan et al. [59, 60] utilize HE to train on data in the cloud. However, DP and HE do not work well here. First, if we only add some random noise to the training data, the training and attack processes remain almost identical to those in the previous work [6]; therefore, we instead add a merging level, where the outputs are merged and the gradients are shared. As for HE, it also fails in our setting, because the bottom model can still be trained well even if the messages between the bottom and top models are encrypted with HE; consequently, the attack can still be mounted by adding one extra layer. Recently, secure multi-party computation (SMC) [61,62,63,64] has been applied to solve privacy issues in VFL. For example, SecureML, an SMC framework, is used for scalable privacy-preserving machine learning. SMC can preserve the privacy of sensitive data [65]. Mohassel et al. [66] propose a 3PC model utilizing secret sharing with non-colluding servers. Personalized federated learning [67,68,69] is used to address data heterogeneity in federated learning. Secret sharing [70] can also be used to preserve information about intersection elements.

6 Conclusion

Previous work revealed that VFL also faces serious privacy risks: malicious participants in VFL can launch inference attacks on the labels of other participants, resulting in serious privacy leakage. To solve this problem, we propose a dispersed training framework that introduces a new bottom model which, through secret sharing, takes over part of the gradient during the training of the malicious bottom model, so that the malicious bottom model cannot learn the relationship between labels and features, thereby preventing label inference attacks. Experiments show that dispersed training can effectively prevent label inference attacks. However, the accuracy of the original federated task is also affected to a certain extent, and a trade-off must currently be made between the original federated task accuracy and the attack accuracy, which points to a direction for future research. In the future, we plan to investigate how to preserve the original federated performance while further reducing the attack accuracy.