1 Introduction

The growing prevalence of smart devices and the Internet of Things (IoT) is leading to unprecedented growth in the volume of data generated every day. The International Data Corporation (IDC) anticipates that billions of IoT devices will generate 79 ZB of data by 2025 [1]. Data collectors now have more ways to collect users’ data than ever before. For example, application providers can require that users share their data before they can enjoy the convenience of internet applications. This leads to a situation in which “the data follows the application, and individual data ownership is not in their own hands” [2]. As a result, enterprises manage, control, and monopolize the application data, which poses a great challenge to the protection of users’ privacy. Recently, more and more users have realized the value of their data and are worried that third parties may share their information. To avoid the disclosure of sensitive information, countries around the world have enacted laws to preserve the data privacy of their citizens. Europe’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and China’s Cybersecurity Law and Data Security Law prohibit the centralized remote processing of sensitive data collected in a distributed manner [3, 4]. However, this also makes it difficult for legitimate users in lawful application fields, such as medical treatment and education, to access the private information they need.

As a new distributed machine learning paradigm, federated learning (FL) can be used to solve the problems mentioned above. To prevent servers from accessing clients’ sensitive data directly, FL keeps the data on the client devices and trains the global model by iteratively aggregating local models that are trained on those devices. During the training procedure, the client devices only need to upload the gradients and weight parameters to the central server [5].

However, although FL has become an effective scheme for resolving user privacy problems in machine learning, recent studies demonstrate that there are still loopholes in FL protocols, and attackers can exploit them to launch many kinds of attacks, such as poisoning attacks, inference attacks, and model attacks, to damage the trained models. For example, combined with the advanced attack techniques of generative adversarial networks (GANs), an attacker can construct a class representation of the global data distribution of all clients and even distinguish specific clients (i.e., a user-level privacy breach), so this stronger privacy threat can precisely recover private data from specific clients [6]. In addition, participants’ sensitive data may be leaked to untrusted servers through the uploaded gradient vectors [7], and an adversary can also manipulate the shared model with a model poisoning attack. Moreover, an attacker may masquerade as an honest data provider and infer the attributes of sensitive training data on a target client by observing the updates of the shared model [8].

To help readers interested in FL security follow its recent research developments, in this survey we collect, classify, introduce, and discuss more than one hundred FL-security-related papers published in recent years, and conduct a comprehensive and systematic study of them. In summary, the contributions mainly include the following points:

  • In our work, both the survey and the non-survey papers related to the privacy and the security of FL are studied, and the similarities and the differences between our work and the related surveys are discussed.

  • We systematically analyze the threats to the security and the privacy of FL and the corresponding defense methods proposed by researchers, and make a comprehensive comparison among them.

  • The aggressive behaviors in FL-related applications are classified and discussed, and suggestions for dealing with such behaviors as well as future research directions are provided. Meanwhile, we identify a set of criteria for future solutions that will serve as a reference for scholars and developers studying ways to improve security and privacy in future FL systems.

The rest of this paper is arranged as follows: Sect. 2 reviews the existing surveys on the security and the privacy of FL in detail, compares our survey with them, and highlights the unique contributions of this survey; Sect. 3 elaborates on the relevant background of FL and comprehensively analyzes the three FL scenarios; Sect. 4 briefly introduces the threats to the security and privacy protection of FL, illustrates these threats with examples, and provides corresponding solutions; Sect. 5 first presents a classification of the attack methods among the security challenges with diagrams, and then explains each attack method with figures and text; Sect. 6 presents the corresponding security defense methods and privacy protection technologies for the four kinds of attack methods (poisoning attack, inference attack, model attack, and adversarial attack); Sect. 7 predicts possible future attack patterns and defense strategies for FL and provides a set of criteria for solutions; Sect. 8 concludes with a summary and a future outlook.

2 Comparison Between Our Work and the Existing Federated Learning Surveys

Recently, researchers have published several investigation articles related to FL privacy and security. In [9], the authors classify the possible attacks and threats during FL training, list the attack methods of each category, and introduce the principles of the corresponding attacks. They also summarize specific defense measures against these attacks and threats and analyze their principles. In [10], the authors describe the development of machine learning and the inevitability of the emergence of FL, and give the definition and classification of FL. Regarding the privacy protection problems of FL, common privacy protection technologies are summarized. In addition, the existing mainstream open-source FL frameworks are introduced and compared, and the application scenarios of FL are given. In [11], the authors introduce the training processes of Horizontal Federated Learning (HFL) and Vertical Federated Learning (VFL), and explore the threats to these processes and why they are prone to attack, so as to classify and summarize the existing attack methods, such as the poisoning attack, the adversarial attack, and the model inversion attack. For the attack methods in both scenarios, several corresponding defense measures are introduced, such as gradient sparsification, malicious-client detection, secret sample alignment, label protection, Verifiable Secret Sharing (VSS), and perturbation sharing. They highlight the training processes of, and the defenses against threats in, both HFL and VFL. In [12], the authors discuss the classification of FL and analyze its advantages and disadvantages. The hidden dangers of FL are pointed out and the current main defense measures are introduced. In [13], the authors introduce the basic concepts and threat models of FL. Three types of attacks launched by internal malicious entities are summarized, and the security and privacy vulnerabilities of the FL architecture are investigated. Then, the most advanced defense schemes are studied from the aspects of Differential Privacy (DP), Homomorphic Encryption (HE), and Secure Multi-party Computation (SMC). In [14], the authors analyze the possible security problems of FL, focus in detail on the threats of poisoning attacks, adversarial attacks, and privacy disclosure, summarize targeted defense measures, and put forward corresponding solutions.

Most existing investigations of the privacy and security of FL only combine the basic knowledge of FL with attack methods and solutions, without considering that the solutions should also follow certain application criteria. In this paper, the theoretical knowledge and related applications of FL are presented in prose combined with tables, and the problems and solutions arising in applications are also explained. More importantly, a series of criteria that should be followed when developing defense strategies is proposed. Combining these three parts forms a systematic, sequential architecture for studying FL privacy and security, which is not present in existing articles. If practitioners consider and meet as many of these criteria as possible when formulating a scheme to protect the privacy and security of a system, then the resulting system will undoubtedly be more robust.

Table 1 summarizes the main similarities and differences between our survey and existing relevant FL surveys. Table 2 shows the differences between our survey and the existing relevant FL surveys (where “\(\surd \)” means “include” and “\(\times \)” means “not include”).

Table 1 A comparison between our survey and the existing federated learning surveys

Compared with the existing investigation articles on FL, this paper focuses on the security and privacy issues of FL and analyzes them comprehensively from several aspects, such as attack methods and defense schemes. For the privacy and security challenges in FL, suggestions for solving these issues and future research directions are provided, offering researchers new approaches to the privacy and security of FL. This survey collects most of the relevant literature on privacy and security in FL; our investigation of the security and privacy challenges facing FL is detailed, and the classification scheme presented is comprehensive.

Table 2 Comparison of content differences of our survey and existing FL surveys

3 Concept and Classification of Federated Learning

In this section, we first explain the concept and the algorithmic principle of FL, then introduce the classification of FL scenarios, and finally describe the principles and implementation processes of FL in the different scenarios, namely HFL, VFL, and Federated Transfer Learning (FTL).

3.1 Basic Concept and Algorithm Principle

FL can be regarded as a decentralized and collaborative machine learning method for privacy protection. The model training is completed over multiple iterations by multiple collaborating clients [16]. The concept of FL was first proposed by H. Brendan McMahan et al. in 2016. It is mainly used to solve the privacy problems caused by centrally training models on data stored on many terminals (such as mobile phones) [17]. Google was the first company to deploy an FL system, mainly applied to input-method improvement and similar scenarios. For example, after users have typed certain words several times, Google’s Gboard system can suggest words and emoticons to them as they type [18,19,20]. Unlike a traditional recommendation system, this system relies largely on the mobile device itself and does not collect users’ private data. The framework of FL is shown in Fig. 1.

Fig. 1 The framework of FL

FL is a distributed training paradigm performed by a group of devices that share local model updates with a central server, whose job is to aggregate these updates to build a global machine learning model. A common aggregation method, known as Federated Averaging (FedAvg) [21], allows the server to aggregate local stochastic gradients from different devices using iterative model averaging. Equation 1 [21] shows the federated averaging rule.

$$\begin{aligned} w_{t+1}=\sum _{k=1}^{K}\frac{n_{k}}{n}w_{t+1}^{k} \end{aligned}$$
(1)

In Equation (1), \(w_{t+1}\) represents the updated global model weights (i.e., the aggregated model weights), \(n_{k}\) is the amount of data held by the k th client, n represents the total amount of data of the K clients, where \(\sum _{k=1}^{K}n_{k}=n\), and \(w_{t+1}^{k}\) represents the model parameters uploaded by the k th client.
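The weighted average in Eq. (1) can be written as a short routine. The sketch below is a minimal illustration in Python (not the implementation of any particular FL framework); client weights are NumPy arrays, and the local data counts \(n_{k}\) are assumed to be known to the server.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model weights, as in Eq. (1).

    client_weights: list of per-client parameter vectors w_{t+1}^k
    client_sizes:   list of local dataset sizes n_k
    """
    n = sum(client_sizes)  # total number of samples across the K clients
    return sum((n_k / n) * w_k for w_k, n_k in zip(client_weights, client_sizes))

# Example: three clients holding different amounts of local data.
w_next = fedavg(
    [np.array([0.2, 1.0]), np.array([0.4, 0.8]), np.array([0.1, 1.2])],
    [100, 300, 600],
)
```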

Viewed from the perspective of its participants, an FL scenario consists of a central server (also known as a parameter server) and K clients, each with its own local dataset \(D_{k}\). During the learning process, the clients agree on a common goal and model structure, and jointly train a model \(M_{Global}\) on the total dataset \(D=D_{1}\cup D_{2}\cup \cdots \cup D_{K}\). At the beginning of each FL training iteration, a subset of clients \(C\subseteq K \) is selected to receive the current global state of the shared model (i.e., the model weights). After receiving the global state, each client performs local training on its own dataset starting from the shared model parameters, and sends the resulting model update (i.e., the weights learned locally by the client on its local dataset) to the central server. The server applies the updates to the current global model to generate a new model. Equation 2 shows the global model update mechanism.

$$\begin{aligned} G_{t+1} =G_{t}+\frac{1}{k_{t}}\sum _{i\in \left[ k_{t} \right] }\Delta L_{t+1}^{i} \end{aligned}$$
(2)

In Equation (2), \(G_{t}\) represents the global model parameters of the server side in the t th iteration, \(k_{t}\) represents the number of clients selected in this round, and \(\Delta L_{t+1}^{i}\) represents the local model update received by the central server from the i th selected client.

After several iterations of the above process, the global model reaches a level of accuracy determined by the central server, and the FL process is complete. Equation 3 represents the objective function of the central server.

$$\begin{aligned} \min _{w}F(w),\quad F(w)=\sum _{k=1}^{K}\frac{n_{k}}{n}F_{k}(w) \end{aligned}$$
(3)

In Equation (3), K represents the total number of client devices participating in training, \(n_{k}\) is the data volume of the k th client, and \(F_{k}(w)\) is the local objective function of the k th device. Equation 4 shows the local objective function of the k th device.

$$\begin{aligned} F_{k} (w)=\frac{1}{n_{k}}\sum _{i\in D_{k}}f_{i}(w) \end{aligned}$$
(4)

In Equation (4), \(D_{k}\) is the local dataset of the k th client, and \(f_{i} (w)=\alpha (x_{i},y_{i},w)\) is the loss of the model with parameters w on the instance \((x_{i},y_{i})\) in dataset \(D_{k}\).

The average loss function of the local client is obtained by dividing the sum of the loss functions generated by all instances in \(D_{k}\) by the total data volume of the client.

In summary, the FL scenario mainly consists of two phases, namely local update and global aggregation. In the local update phase, each client computes gradients by minimizing the loss function over all the training data on its device [22]. Global aggregation involves the following steps: the server collects the updated model parameters from the different client devices, aggregates them, and then sends the aggregated parameters back to the clients for use in the next training iteration.
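To make the two phases concrete, the toy sketch below runs FL rounds for a linear model: each client minimizes its local squared loss with a few gradient steps (local update), and the server averages the returned parameters weighted by data volume (global aggregation), as in Eq. (1). All names and the synthetic data are illustrative assumptions, not taken from a particular FL framework.

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    """Local update phase: gradient descent on the client's squared loss F_k(w)."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the local loss
        w = w - lr * grad
    return w

def one_round(w_global, client_data):
    """Global aggregation phase: average locally trained weights (weighted by n_k)."""
    n_total = sum(len(y) for _, y in client_data)
    w_next = np.zeros_like(w_global)
    for X, y in client_data:
        w_k = local_update(w_global.copy(), X, y)   # each client trains locally
        w_next += (len(y) / n_total) * w_k          # server-side weighted average
    return w_next

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(4)]
w = np.zeros(3)
for _ in range(10):                                 # several communication rounds
    w = one_round(w, clients)
```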

3.2 Federated Learning Classification

The feature and sample ID spaces of the data parties may not be identical, and we classify FL into HFL, VFL, and FTL based on how the data are distributed among the various parties in the feature and sample ID spaces [9, 23]. Let the feature space of the i th party’s data \(D_{i}\) be \(x_{i}\), the label space \(y_{i}\), and the sample ID space \(I_{i}\). The expressions for the three FL scenarios are shown in Table 3 [9].

Table 3 Classification of federated learning scenarios

3.2.1 Horizontal Federated Learning

In HFL, the datasets of different participants have the same feature space, but they rarely intersect in the sample ID space. HFL is distributed machine learning that partitions the dataset horizontally (i.e., along the user dimension) when the user features of the two datasets overlap heavily while the users themselves overlap little; HFL takes out the parts in which the features are the same but the users are not exactly the same for training [24, 25]. For example, “Hey Siri” and “OK Google” in wake-word recognition [26] are typical applications of horizontal partitioning, because each user speaks the same sentence in a different voice. The schematic diagram of HFL is shown in Fig. 2.

Fig. 2 Schematic diagram of horizontal federated learning

The HFL training process consists of the following steps:

\(\textcircled {1}\) Initialization: The parameter server initializes the federated model parameters w and distributes them to the clients (\(w_{1}=w_{2}=\cdots =w_{K}=w\)).

\(\textcircled {2}\) Local training: The k th client calculates the corresponding output value \(y_{pre}^{k} =X_{k}w\) and error value \(L_{k}\) for its data records. Equation 5 shows the local gradient of the client.

$$\begin{aligned} \Delta w_{k} =\frac{\delta L_{k}}{\delta X_{k}} \end{aligned}$$
(5)

In Equation (5), k represents the k th client.

\(\textcircled {3}\) Gradient aggregation: The parameter server uses the FedAvg [27] algorithm to aggregate the shared gradients of the clients, and the aggregation gradient can be represented as Eq. 6.

$$\begin{aligned} \Delta w =\frac{1}{K}\sum _{k=1}^{K}\Delta w_{k} \end{aligned}$$
(6)

\(\textcircled {4}\) Global parameter update: Parameter server updates global parameters, and Eq. 7 shows the global parameter update.

$$\begin{aligned} w_{n+1}=w_{n}+\eta \Delta w \end{aligned}$$
(7)

In Equation (7), \(w_{n}\) represents the global parameter at the n th iteration, and \(\eta \) represents the learning rate.
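A minimal sketch of steps \(\textcircled {2}\) to \(\textcircled {4}\) for a linear model is given below, under the simplifying assumptions that each client shares its raw local gradient on a squared-error loss and that the server averages the gradients uniformly as in Eq. (6) before the learning-rate update of Eq. (7); the variable names mirror the notation above, and the update is written as a descent step on the loss.

```python
import numpy as np

def client_gradient(w, X_k, y_k):
    """Step 2: local training. Gradient of the client's squared-error loss L_k (Eq. 5)."""
    y_pre = X_k @ w                      # local prediction y_pre^k = X_k w
    return X_k.T @ (y_pre - y_k) / len(y_k)

def hfl_round(w, clients, lr=0.05):
    """Steps 3-4: uniform gradient aggregation (Eq. 6) and global update (Eq. 7)."""
    grads = [client_gradient(w, X_k, y_k) for X_k, y_k in clients]
    delta_w = sum(grads) / len(grads)    # \Delta w = (1/K) sum_k \Delta w_k
    return w - lr * delta_w              # descent step with learning rate \eta

rng = np.random.default_rng(1)
clients = [(rng.normal(size=(40, 2)), rng.normal(size=40)) for _ in range(3)]
w = np.zeros(2)
for _ in range(20):
    w = hfl_round(w, clients)
```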

3.2.2 Vertical Federated Learning

In VFL, the datasets of different participants have feature spaces with different attributes, but have the same or similar sample ID spaces. VFL is distributed machine learning that partitions the datasets vertically (i.e., along the feature dimension) when the users of the two datasets overlap heavily while their features overlap little; VFL takes out the parts in which the users are the same but the user features are not exactly the same for training [28]. For example, collaboration between different companies can often be viewed as a vertical partitioning situation. VFL usually uses entity alignment techniques [29, 30] to collect the overlapping samples of all parties. The schematic diagram of VFL is shown in Fig. 3.

Fig. 3 Schematic diagram of vertical federated learning

The VFL training process consists of the following steps:

\(\textcircled {1}\) Initialization: Samples with the same identifiers are aligned between the clients, and the parameter server initializes the federated model parameters and distributes them to the clients (\(w_{1}=w_{2}=\cdots =w_{K}=w\)).

\(\textcircled {2}\) Local training: The active party uses Eq. 8 to aggregate the output values and error values of the data records with the same identifiers.

$$\begin{aligned} y_{pre}^{k} =\sum _{k=1}^{K}X_{k}w_{k} \end{aligned}$$
(8)

The intermediate result \(\Delta H_{k} \) is transmitted to the passive party so that both sides can obtain the gradient according to Eq. 9.

$$\begin{aligned} \Delta w_{k}=\Delta H_{k}\bullet \frac{\delta H}{\delta X_{k}} \end{aligned}$$
(9)

In Equation (9), k represents the k th client and H represents the activation function [31].

\(\textcircled {3}\) Gradient aggregation: The parameter server receives the shared gradient information from the clients and aggregates it.

\(\textcircled {4}\) Global parameter updating: The parameter server uses the shared gradients of the participants to update the corresponding global parameters, as shown in Eq. 10.

$$\begin{aligned} w_{n+1}^{k}=w_{n}^{k}+\eta \Delta w_{k} \end{aligned}$$
(10)
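The sketch below illustrates only the vertically partitioned forward pass of Eq. (8): two parties hold disjoint feature columns of the same aligned samples, each computes its partial output \(X_{k}w_{k}\) locally, and the active party sums the partial outputs to obtain the joint prediction. The secure exchange of the intermediate result \(\Delta H_{k}\) in Eq. (9) is omitted for brevity; in a real deployment it would be protected, e.g., by homomorphic encryption. The dimensions and data are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples = 100  # samples already aligned by a common identifier

# Two participants hold different feature columns of the *same* aligned samples.
X_a = rng.normal(size=(n_samples, 3))   # party A's features
X_b = rng.normal(size=(n_samples, 2))   # party B's features
w_a = rng.normal(size=3)                # party A's local parameters w_1
w_b = rng.normal(size=2)                # party B's local parameters w_2

# Eq. (8): each party computes its partial output locally ...
partial_a = X_a @ w_a
partial_b = X_b @ w_b

# ... and the active party sums the partial outputs to get the joint prediction.
y_pre = partial_a + partial_b
```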

3.2.3 Federated Transfer Learning

In Federated Transfer Learning (FTL), the datasets of different participants have feature spaces with different attributes, and there is little intersection in the sample ID space [32]. FTL is the combination of FL and transfer learning; it does not partition the data but uses transfer learning to overcome data or label shortages when both the users and the user features of the two datasets overlap little [33]. Take a cancer diagnosis system as an example: a group of hospitals wants to establish an FL system for cancer diagnosis, but each hospital has different patients and different physical examination results. In this case, federated transfer learning is usually adopted. The schematic diagram of FTL is shown in Fig. 4.

Fig. 4 Schematic diagram of federated transfer learning

The FTL training process [34] consists of the following steps:

\(\textcircled {1}\) Initialization: Build the server model \(f_{s}\) using Eq. 11 and the server’s dataset.

$$\begin{aligned} arg\ \underset{\Theta }{min}\ L=\sum _{i=1}^{n}\varrho (y_{i},fs(X_{i})) \end{aligned}$$
(11)

In Equation (11), \(f_{s}\) represents the server model to be learned, \(\varrho (*,*) \) represents the loss function of the model (such as the cross-entropy loss of a classification task), \(\Theta \) represents all parameters to be learned (namely weights and biases), and \(\left\{ X_{i},y_{i} \right\} _{i=1}^{n}\) are the samples from the server data, of size n.

\(\textcircled {2}\) Local training: \(f_{s}\) is distributed to all clients, and the model of user u is trained by optimizing the objective function in Eq. 12.

$$\begin{aligned} arg \ \underset{\Theta ^{u}}{min}\ L_{u}=\sum _{i=1}^{n^{u}}\varrho (y_{i}^{u},f_{u}(X_{i}^{u})) \end{aligned}$$
(12)

\(\textcircled {3}\) Gradient aggregation: After all user models \(f_{u}\) have been trained based on the shared cloud model, Homomorphic Encryption is used to upload them to the server, and Eq. 13 is used for model aggregation.

$$\begin{aligned} f_{s}^{'}(w)=\frac{1}{K}\sum _{k=1}^{K}f_{u_{k}}(w) \end{aligned}$$
(13)

\(\textcircled {4}\) Global parameter update: The server distributes the aggregated model \(f_{s}^{'}\) as the updated cloud model to all clients, and each client then performs transfer learning to obtain its personalized model \(f_{u}\).

The above four steps are repeated as new users continuously join.

3.3 The Advantages of Federated Learning

As a product of the development of machine learning technology, FL has some advantages.

\(\textcircled {1}\) User privacy protection: The data of the clients participating in FL is not shared. The data is stored in the local environment to ensure user data security.

\(\textcircled {2}\) Data flexibility: During the FL process, a client can decide to quit at any time without affecting the normal operation of FL.

\(\textcircled {3}\) Model training on large-scale data: FL learns a global model from data stored across tens of millions of remote client devices.

4 Security and Privacy Threats

In an FL scenario, attacks can be initiated not only by untrusted servers [7, 35,36,37,38,39], but also by malicious clients [4, 8, 37, 39,40,41]. In general, parameter servers are regarded as honest-but-curious, and their attacks are considered passive attacks. This means that these servers serve strictly according to the established learning protocol, but they also try to extract sensitive user information from the model update process. Attacks from malicious clients are called active attacks, in which the clients attempt to recover sensitive information about other users from the aggregated global model parameters. Both kinds of attack can destroy data privacy. Tables 4 and 5 respectively list some of the security and privacy threats encountered by FL and the corresponding solutions.

Table 4 The security threats to federated learning and corresponding solutions
Table 5 The privacy threats to federated learning and corresponding solutions

5 Security Challenges

This section divides the security challenges in FL into four categories: poisoning attack, inference attack, model attack, and adversarial attack. First, each attack method is classified at a fine granularity and its schematic diagram is drawn. Next, typical attack methods are selected and elaborated, and their attack principles are explained. Then, each method is illustrated by example. Finally, all attack methods are compared, and a summary table of attack methods is given.

The classification of security challenges is shown in Fig. 5. Some attack sub-methods can be attributed to more than one higher-level attack method, so some attacks appear more than once in the classification diagram.

Fig. 5 Classification of security challenges

5.1 Poisoning Attack

A poisoning attack refers to attackers manipulating model predictions through the training sets during training or retraining, so that the trained models satisfy the attackers’ expectations, thereby corrupting the models [62]. The methods for manipulating training datasets mainly include contaminating the source data, adding malicious samples to the training datasets, tampering with some labels in the training datasets, and deleting some samples from the training datasets [63]. Depending on the attacker’s target, poisoning attacks can be divided into data poisoning attacks and model poisoning attacks. The schematic diagram of the poisoning attack is shown in Fig. 6.

Fig. 6 Schematic diagram of poisoning attack

5.1.1 Data Poisoning Attack

A data poisoning attack refers to attackers contaminating the samples in the training sets, resulting in low-quality training data, which reduces the quality of the models and damages the availability of both data and models. According to whether the data labels are tampered with or not, it can be classified into clean-label poisoning attacks and dirty-label poisoning attacks. A clean-label poisoning attack adds malicious data to a training dataset without changing the labels. A typical example of a dirty-label poisoning attack is the label flipping attack [64], in which the labels of one class of clean training samples are flipped to another class while the features of the data remain unchanged. Traditional dirty-label poisoning attacks simply flip the training sample labels of the target class [65]. Some recent studies have proposed optimized data poisoning attacks [66,67,68]. For example, sophisticated attackers can inject carefully crafted malicious data samples (such as mislabeled samples), distort the probability distribution of the original training data, and reduce the classification or clustering precision of the learning model. This kind of attack has been demonstrated in many applications, including handwritten digit recognition [64] and PDF malware detection [69]. Another common attack is the backdoor poisoning attack [63,64,65,66,67,68,69,70]. By modifying individual features or small regions of the original training dataset as a backdoor, the attacker can embed the backdoor into the model. If an input contains the backdoor features (e.g., a stamp on an image), the model behaves according to the attacker’s goal, while the performance of the poisoned model on clean input data is not affected. The Trojan neural network attack also belongs to the data poisoning attacks [11]: the Trojan neural network and the target model are packaged together, data are fed into both at the same time, and their outputs are combined, which enables the distribution of the Trojan network. It is worth noting that any malicious client can carry out a data poisoning attack, and the attack strength depends on the degree of the attacker’s participation and the amount of contaminated training data. That is, data poisoning attacks are less effective in environments with fewer attackers [71].
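As a concrete illustration of the label-flipping variant of the dirty-label poisoning attack, the toy sketch below flips a fraction of the labels of one source class to a chosen target class while leaving the features untouched; the class indices and the poisoning rate are arbitrary choices for the example.

```python
import numpy as np

def label_flip(y, source_class, target_class, rate, seed=0):
    """Dirty-label poisoning: relabel a fraction of `source_class` samples
    as `target_class`; the features themselves are left unchanged."""
    rng = np.random.default_rng(seed)
    y = y.copy()
    idx = np.flatnonzero(y == source_class)
    flipped = rng.choice(idx, size=int(rate * len(idx)), replace=False)
    y[flipped] = target_class
    return y

# Example: a malicious client flips 40% of its class-1 labels to class 7
# before training its local model and uploading the update.
y_clean = np.random.default_rng(3).integers(0, 10, size=1000)
y_poisoned = label_flip(y_clean, source_class=1, target_class=7, rate=0.4)
```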

5.1.2 Model Poisoning Attack

A model poisoning attack means that attackers disrupt FL by sending incorrect parameters or corrupting models during global aggregation. Based on whether the attacker focuses on a specific target, model poisoning attacks can be divided into two categories: targeted attacks and non-targeted (Byzantine) attacks. A targeted attack is an attack against a specific type of object, while a non-targeted attack does not distinguish between samples and is a kind of generalized attack. The authors of [72] study local model poisoning attacks against Byzantine-robust FL, whose goal is to destroy the integrity and confidentiality of the model by corrupting the integrity of the learning process at the training stage. The authors of [62] propose an optimization-based FL poisoning attack model that is sufficiently covert and persistent to bypass specific defense methods and avoid catastrophic forgetting. Unlike data poisoning attacks, model poisoning attacks require more sophisticated techniques and more computing resources to send data to the server, and their combined effect is stronger than that of data poisoning attacks [73].

5.2 Inference Attack

An inference attack refers to an attacker obtaining inferable information through various attack means and then deducing the desired information from it, such as members’ input features and attribute labels. According to the information inferred, inference attacks can be divided into membership inference attacks, attribute inference attacks, feature inference attacks, and label inference attacks. Inference attacks can also be divided into white-box attacks [74] and black-box attacks [75, 76] according to whether the attacked model is known. A white-box attack is carried out when the attacker knows the model; that is, the attacker can obtain the prediction output for any input as well as the intermediate results of the hidden layers [74]. A black-box attack is carried out when the attacker only knows the inputs and outputs of the model while the parameters of the model are unknown; it is more difficult and less effective than the white-box attack. In addition, GAN-based attacks [77, 78] also belong to inference attacks, including client-side GAN attacks and server-side GAN attacks. A server-side GAN attack recovers the private information of users’ training samples by exploiting the periodically exchanged model parameters [79]. Unlike the server-side GAN attack, the client-side GAN attack only has access to the aggregated global model parameters, and the key to reconstructing data samples lies in how to obtain the model updates of other users in each round of communication [74]. A schematic diagram of the inference attack is shown in Fig. 7.

Fig. 7 Schematic diagram of inference attack

5.3 Model Attack

A model attack refers to an attack that changes the global model by tampering with the local models of the attacked clients. Typical model attack methods include the model extraction attack and the model inversion attack. A schematic diagram of the model attack is shown in Fig. 8.

Fig. 8 Schematic diagram of model attack

5.3.1 Model Extraction Attack

A model extraction attack refers to an attacker continuously sending queries to the target model in the hope of reproducing it locally: from the responses obtained, the attacker predicts the parameters and functionality of the model and thereby builds an accurate or similar model, thus realizing model extraction [12]. The attacker’s goal is to steal the model and damage its confidentiality. Here an accurate model means a substitute model constructed by the attacker with similar predictive performance. If such a model is stolen, it can be used to generate adversarial samples, so model extraction attacks pose a great threat to the target model. The authors of [80] attack the online services of BigML and Amazon, extract an almost identical model, and prove that the same attack is equally applicable in multiple scenarios.
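The core idea of model extraction can be sketched in a few lines: the attacker treats the victim model as a black-box labelling oracle, queries it on chosen inputs, and fits a local surrogate on the query-label pairs. The sketch below uses scikit-learn purely for illustration and is not the attack of [80], which additionally exploits confidence scores and equation solving; the victim, the query distribution, and the surrogate are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# The victim model: the attacker can only call .predict() on it (black-box access).
rng = np.random.default_rng(4)
X_priv = rng.normal(size=(500, 5))
y_priv = (X_priv[:, 0] + X_priv[:, 1] > 0).astype(int)
victim = LogisticRegression().fit(X_priv, y_priv)

# Extraction: query the victim on attacker-chosen inputs and train a surrogate.
X_query = rng.normal(size=(2000, 5))       # synthetic query points
y_query = victim.predict(X_query)          # labels returned by the oracle
surrogate = DecisionTreeClassifier(max_depth=5).fit(X_query, y_query)

# The surrogate now approximates the victim's decision boundary on the queried region.
agreement = (surrogate.predict(X_query) == y_query).mean()
```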

5.3.2 Model Inversion Attack

A model inversion attack refers to an attacker who, without knowing the training data, obtains information about the training data of the target model from the prediction results of the trained model, thereby obtaining users’ private data. The information inferred about the training set by a model inversion attack may be whether a member is included in the training set or some statistical features of the training set. Accordingly, model inversion attacks can be divided into membership inference attacks and attribute inference attacks. Under a model inversion attack, a generator that cannot directly access a training set P, but can access a machine learning model trained on P and an auxiliary training set Q, can recover some variables of the training set P [81]. For example, the authors of [82] propose a new class of model inversion attacks on face recognition systems, which utilize the confidence values returned with the predictions to recover recognizable images of people’s faces, given only their names and access to the machine learning model.

5.4 Adversarial Attack

An adversarial attack refers to maliciously constructed adversarial samples being submitted to a trained model so that it produces incorrect predictions with high confidence; it is also known as an evasion attack [63]. Adversarial samples are samples that the classification model misclassifies after slight perturbations are added to the original samples. One characteristic of adversarial samples is that they only cause model classification errors while remaining recognizable as the original, correct samples. In terms of the attack environment, adversarial attacks can be divided into black-box attacks and white-box attacks, and according to the attack purpose they can also be divided into targeted attacks and non-targeted attacks. Adversarial attacks can cause severe damage to systems in the domains of speech and text recognition. Similarly, in the domain of malware detection, malware developers can use adversarial attacks to add special statements to their software to evade detection by anti-virus software.

In conclusion, the attack methods of the main security challenges confronting FL have been described in detail above. Table 6 comprehensively summarizes the attack methods encountered by FL.

Table 6 Comparison of federated learning attack methods

6 Threat Countermeasures

This section puts forward corresponding solutions to the security and privacy threats facing FL, divided into two categories: security defense methods and privacy protection technologies. First, the security defense methods are decomposed into four sub-methods, namely poisoning attack defense, inference attack defense, model attack defense, and adversarial attack defense. Then the privacy protection technologies are decomposed into four sub-technologies, namely DP, SMC, HE, and VSS. The concept behind each method and technology is explained below, together with how it addresses the security and privacy challenges of FL. Figure 9 shows the classification of methods for the security and privacy challenges in FL.

Fig. 9 Classification of methods for the security and privacy challenges in federated learning

6.1 Security Defense Method

In view of the multiple security threats to FL, and combining the solutions proposed in the existing literature, the security defense methods are divided into four categories: poisoning attack defense, inference attack defense, model attack defense, and adversarial attack defense. The following subsections analyze the four types of security defense methods in detail and discuss their applications.

6.1.1 Poisoning Attack Defense

Poisoning attack defense in FL can be considered from the following aspects: \(\textcircled {1}\) from the data itself, ensuring the authenticity and reliability of data sources; \(\textcircled {2}\) from the point of view of the attacker, conducting sufficient security detection to ensure that data and model parameters are not tampered with. In [4], the authors propose DFedForest, an FL system based on a local forest algorithm that shares decision trees through a blockchain. The system utilizes blockchain technology to ensure mutual trust among participants, registers references to local model addresses in a distributed manner, and prevents malicious participants from compromising the accuracy of the model. In [8], the authors adopt a fully decentralized peer-to-peer (P2P) multi-party FL approach (Biscotti), which uses blockchain and cryptographic primitives to guarantee privacy between peer clients and protect the FL process. They propose proof-of-federation (PoF), a layer-1 blockchain consensus protocol that combines state-of-the-art FL defense techniques to prevent clients from overstepping the system and compromising data integrity or model parameters without sufficient permission. The results show that Biscotti is able to resist the poisoning attacks considered in previous work; when 30% or fewer of the clients in the system are attackers, the method can protect individual client updates and maintain the performance of the global model. In [37], to avoid model poisoning caused by malicious nodes and privacy disclosure caused by malicious servers, the authors propose a decentralized, blockchain-based FL framework with committee consensus (BFLC). In the absence of a centralized server, the framework uses a blockchain for global model storage and local model update exchange. To implement the proposed BFLC, an innovative committee consensus mechanism is designed, which can effectively reduce the amount of consensus computation and the impact of malicious attacks. In [43], for anomaly detection in IoT systems, the authors introduce a blockchain-authorized, decentralized, and asynchronous FL framework that ensures data integrity and prevents a single point of failure. A GAN-driven DP algorithm is designed to protect the privacy of the local model parameters, prevent poisoning attacks to some extent, and improve the model accuracy. In [56], to prevent raw data leakage, DP is applied at each federated edge node, and blockchain technology is used to aggregate the updated model parameters, adding carefully selected noise to protect privacy and striking a balance between privacy protection and model accuracy. In [83], the authors propose a secure FL framework (SFAC) for UAV-assisted mobile crowdsensing (MCS) to deal with the security and privacy threats to UAV-assisted crowdsensing with FL. First, a blockchain-based collaborative learning architecture is introduced for UAVs to promote efficient data transmission and model training in MCS. Next, blockchain technology replaces the central server: a decentralized FL mechanism is designed to securely exchange local model updates and to record and validate UAV contributions to collaborative training without a central server. Then, a privacy protection algorithm is designed to protect the privacy of the updated local models by applying local differential privacy, while achieving good learning accuracy. In the absence of actual knowledge of the network parameters, the interactions between UAVs (i.e., data owners) and task publishers are formulated as finite Markov decision processes (MDPs), and a two-tier reinforcement learning (RL) based incentive mechanism is put forward to promote high-quality model sharing by UAVs. The results show that on-device perturbation achieves both the aggregation precision and the strict privacy protection required by UAVs. In addition, compared with existing schemes, SFAC can effectively incentivize high-quality local model sharing, enabling optimal strategies and better practicality for participants.

6.1.2 Inference Attack Defense

An inference attack requires the attacker to obtain part of the FL users’ information and to perform effective inference on it before the attack can succeed. Preventing effective inference is therefore a natural line of defense, which requires strengthening the privacy protection mechanisms; HE, DP, and other privacy protection technologies have been applied to good effect here. For example, DP adopts a specific random algorithm to add appropriate noise to the data so as to blur it, so that even if the attacker obtains the exchanged data, it cannot effectively deduce the original data, which reduces the risk of information disclosure. In [7], the authors propose a privacy-enhanced FL scheme to protect gradients from untrusted servers. The local gradients of participants are encrypted using the Paillier HE system. The encrypted gradients can be further used for secure aggregation on the server side, so that untrusted servers only learn the updated, aggregated statistics of all participants, while the private information of each user is well protected. In [36], the authors combine HE with DP and propose an efficient FL protocol based on stochastic gradient descent. The user adds noise to each local gradient and then encrypts it, balancing performance and security and preventing attackers from inferring the user’s privacy from the local outputs (such as gradients). In [50], to prevent attackers from identifying the data used to compute gradients, the authors integrate a DNN and a control algorithm into FL, forming a new DNN (DgstNN). The goal of DgstNN is to minimize the classification error and maximize the normalized distance between the gradient of the original data and the gradient of the digested image. The loss function related to classification error is the classification loss, and the loss function related to the normalized distance is the distance loss. Increasing the distance loss changes the gradient of the digested image so that it differs from the gradient of the original image, thus preventing an adversary from obtaining the gradient information of the original data. Minimizing the distance loss can transform digested images into images that neither humans nor other learning models can recognize. Even if an attacker successfully reconstructs an image from a gradient, the result is a digested image that has lost the visual features of the original image. In [51], the authors hold that although DP can theoretically guarantee privacy by adding noise to the exchanged update vectors and thus prevent inference attacks, the added noise is proportional to the size of the model, and the model quality degrades as noise is added. Therefore, the authors extend FL with compressed sensing and propose two schemes: FL-CS, which uses compressed sensing to reduce the communication bandwidth, and FL-CS-DP, which combines compressed sensing and DP to protect user information. The results show that this approach not only prevents inference attacks from revealing users’ privacy, but also prevents the model accuracy from decreasing. In [60], the authors design a secure matrix factorization framework for the FL environment, called dFedMF. First, they design a user-level distributed matrix factorization framework in which the model can be learned while each user uploads only gradient information, rather than the original preference data, to the server. Then they use HE to strengthen the distributed matrix factorization framework: as long as the HE system guarantees that the ciphertext is indistinguishable under chosen-plaintext attack, no information is leaked to the server. The results verify the feasibility of dFedMF; the system is secure against honest-but-curious servers, and there is no loss of accuracy. In [84], the authors propose secure learning, a general design for a private FL system, which is an efficient and secure aggregation system that prevents powerful inference attacks by denying access to individual model updates and hiding local models from the aggregators. In [85], a new partition defense model (PAMPAS) based on user devices and trusted edge servers is designed to resist attacks from GANs.

6.1.3 Model Attack Defense

Since the object of a model attack is the model itself, it is important to prevent the model parameters and hyperparameters from being stolen and other model information from being leaked. Secure aggregation algorithms and DP technology are effective defense methods, which can defend not only against inference attacks but also against model attacks. In [42], to resist model extraction attacks and model inversion attacks, the authors design a blockchain-enabled FL application model, based on which a data protection aggregation scheme is formulated. Distributed K-means clustering based on DP and HE, a distributed random forest algorithm based on DP, and distributed AdaBoost based on HE are presented to realize multiple layers of protection in data sharing and model sharing. In [45], the authors propose an FL system called BlockFlow, which introduces DP technology and a new model contribution auditing mechanism to protect the data of individual agents, and uses Ethereum smart contracts to encourage good behavior. The results show that the system can effectively prevent attackers from obtaining information about the training dataset from the model. In [47], the authors propose a privacy-protecting data publishing framework, FedGP, for federated generative privacy in an FL environment. The main idea is to train GANs on the clients to generate artificial data that can replace the clients’ real data. These generated samples can be used to evaluate and train machine learning models. Since some clients may not have enough data to train a GAN locally, a federated GAN model is trained. In this way, user data always remain on the device. In addition, a federated GAN generates samples from a common cross-user distribution rather than from a single user, increasing overall privacy. The generator components of the GAN are trained by the FedAvg algorithm to produce artificial data samples and assess the risk of information disclosure. By running a model inversion attack to assess the protection provided, training with a federated GAN was demonstrated to reduce information leakage (for example, face detection in recovered images was reduced from 25.5% to 1.2%). FedGP can generate high-quality labeled data and significantly reduce the vulnerability of learning models to model inversion attacks. In [86], to deal with model inversion attacks, the authors propose a PSI protocol based on VFL, which adopts a hybrid encryption algorithm (a method combining symmetric-key encryption with public-key encryption). This protocol achieves a well-defined security goal: as long as the number of colluding malicious clients is below a threshold, malicious clients and servers cannot obtain the private information of any honest client, thus protecting client privacy. In [87], the authors put forward two methods to test whether the model parameters have been tampered with. One is to detect numerical differences among the submitted parameters: comparing the i th parameter provided by each participant, if there is a large gap between the value provided by one participant and the values provided by the others, this parameter is judged to be anomalous. The other method is that the server computes \(W_{G_{1}}=W_{G}+f(\delta _{i})\) from the parameter \(\delta _{i}\) uploaded by the client, and then computes \(W_{G_{2}}=W_{G}+f(\Delta )\) using the parameters uploaded by the other clients, where \( \Delta =\left\{ \delta _{j} \vert j=1,2,\cdots ,n,j\ne i \right\} \) and f is a specifically designed function. If the difference between \(W_{G_{1}}\) and \(W_{G_{2}}\) exceeds a set threshold, the model update parameter is inferred to be abnormal.
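A minimal sketch of the first detection idea from [87] is given below: for each parameter index i, every participant’s submitted value is compared with the values submitted by the other participants, and a client whose values deviate strongly from the rest is flagged. The median/absolute-deviation rule is just one simple way to instantiate “a large gap”; [87] does not prescribe a specific statistic, so the threshold and rule here are assumptions.

```python
import numpy as np

def flag_anomalous_updates(updates, threshold=3.0):
    """updates: array of shape (num_clients, num_params), one row per client.
    Flags clients whose parameters deviate strongly from the per-parameter
    median of all submitted updates."""
    median = np.median(updates, axis=0)
    mad = np.median(np.abs(updates - median), axis=0) + 1e-12  # robust spread
    scores = np.abs(updates - median) / mad                    # per-parameter deviation
    return (scores > threshold).mean(axis=1) > 0.5             # mostly-outlying clients

rng = np.random.default_rng(5)
honest = rng.normal(0.0, 0.01, size=(9, 100))
malicious = rng.normal(1.0, 0.01, size=(1, 100))   # a heavily shifted (poisoned) update
suspects = flag_anomalous_updates(np.vstack([honest, malicious]))  # flags the last client
```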

6.1.4 Adversarial Attack Defense

According to the attack mode of the adversarial attack, maliciously constructed adversarial samples submitted to the trained model cause model classification errors. From its attack principle, it can be inferred that adversarial training can enhance the robustness of the model. So-called adversarial training uses a training set containing both adversarial samples and real samples to train the FL model; during training, the model learns the features of adversarial samples, thus achieving a defensive effect. Another method is to detect maliciously constructed adversarial samples: as long as the difference between malicious adversarial samples and normal samples can be found, adversarial samples can be detected and adversarial attacks prevented. Preventing overfitting of the model is also a way to resist adversarial attacks; if the degree of overfitting is too high, the generalization ability of the model is weakened and the probability of a successful adversarial attack increases. In [88], the authors find that small batches of training data can effectively estimate a characteristic of test samples: the estimated local intrinsic dimensionality (LID) of adversarial examples is significantly higher than that of normal data examples, and this difference becomes more pronounced in the deeper layers of DNNs. In their experiments, five state-of-the-art attack methods are used to generate adversarial examples, whose LID features can be easily distinguished from those of normal examples, and the provided LID-based baseline classifier outperforms several state-of-the-art detection measures by large margins on five attacks across three benchmark datasets. The experiments also show that the simple LID-based detector is robust to low-confidence optimization-based attacks. In [89], the authors find that neural networks trained with regularized input gradients are robust to adversarial examples, which improves robustness to adversarial perturbations and prevents model overfitting. In [90], the authors propose a new defense approach based on practical observations that is easily integrated into the model and reinforces a common weakness of deep networks by smoothing the decision function, without knowing the type of attack used to craft the adversarial examples. When the model uses the proposed defense, the perturbation required for misclassification is much greater, making the attack detectable, and the detection more stable and less likely to be fooled by adversarial samples. Experiments show that this method is effective against multiple attacks, adds almost no cost to the training process, maintains the predictive performance of the original model on clean samples, and performs better than the most advanced defense methods. In [91], the authors introduce a defense mechanism called defensive distillation to reduce the effectiveness of adversarial samples. They investigate the generalizability and robustness conferred by defensive distillation when training DNNs, and show that defensive distillation can reduce the success rate of adversarial sample generation from 95% to less than 0.5% on the DNN studied. This tremendous improvement can be explained by the fact that distillation reduces the gradients used for the creation of adversarial samples by a factor of about \(10^{30}\).
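Adversarial training as described above can be sketched with the fast gradient sign method (FGSM): each batch is augmented with perturbed copies before the usual loss is minimized. The PyTorch code below is an illustrative sketch rather than the specific defenses of [88,89,90,91]; `model`, `loader`, `optimizer`, and the perturbation budget `eps` are placeholders to be supplied by the caller.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=0.03):
    """Craft adversarial examples by taking one signed-gradient step on the input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

def adversarial_training_epoch(model, loader, optimizer, eps=0.03):
    """Train on a mixture of clean and adversarial samples (adversarial training)."""
    model.train()
    for x, y in loader:
        x_adv = fgsm_example(model, x, y, eps)       # generate adversarial copies
        optimizer.zero_grad()                        # discard gradients from crafting
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```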

To sum up, the security challenge defense of FL is summarized, as shown in Table 7.

Table 7 The federated learning of security defense methods and techniques

6.2 Privacy Protection Technology

Numerous technologies have been proposed to address the privacy-related issues in FL. Commonly used privacy protection technologies can be divided into four categories, namely DP, SMC, HE, and VSS. Each technique is explained in detail below, and the approaches proposed under each technique are discussed.

6.2.1 Differential Privacy

DP technology is mainly used to add random noise to datasets so that attackers cannot infer users’ sensitive information even if they know the results published by the users. Correspondingly, the added noise also causes a loss in the quality of the statistics, resulting in a decline in the accuracy of the learning model. However, compared with the privacy protection DP provides, this loss is insignificant. Even so, when dealing with the privacy threats to FL, DP is generally combined with other technologies to ensure user privacy while avoiding a decrease in model accuracy. DP can be used when an attacker steals a party’s private data during training or attempts to reconstruct the training set from the generated gradients. A model computation is considered differentially private if its output is insensitive to any particular data point of the input data.

DP technology [92] can be formalized as follows: a random algorithm \(M:D\rightarrow R\) satisfies \((\varepsilon ,\delta )\)-differential privacy if and only if, for any pair of adjacent datasets \(d,d^{'}\in D\) differing in only one record and any output set \(S\subseteq R\), Eq. 14 holds.

$$\begin{aligned} Pr[M(d)\in S]\le e^{\varepsilon }Pr[M(d^{'})\in S]+\delta \end{aligned}$$
(14)

In Equation (14), M(d) and \(M(d^{'})\) respectively represent the outputs of algorithm M on datasets d and \(d^{'}\). Pr denotes the output probability of the algorithm. \(\varepsilon \) is the privacy budget, which controls the privacy protection level: the smaller \(\varepsilon \) is, the stronger the privacy protection. \(\delta \) is a second privacy parameter, representing the tolerable probability that the privacy loss exceeds \(\varepsilon \). If \(\delta \) equals 0, M is said to satisfy \(\varepsilon \)-differential privacy.
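As a concrete illustration of how Eq. (14) is realized in practice, the sketch below implements the Gaussian mechanism, which satisfies \((\varepsilon ,\delta )\)-DP for \(\varepsilon <1\) when the noise scale is \(\sigma =\Delta _{2}\sqrt{2\ln (1.25/\delta )}/\varepsilon \), where \(\Delta _{2}\) is the L2 sensitivity of the released value. In FL, the same idea is typically applied to norm-clipped model updates rather than to a raw statistic; the parameter values in the example are arbitrary.

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, epsilon, delta, rng=None):
    """Release `value` with (epsilon, delta)-DP (epsilon < 1) by adding Gaussian
    noise of scale sigma = l2_sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon."""
    rng = rng or np.random.default_rng()
    sigma = l2_sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))

# Example: privatize a model update whose L2 norm is clipped to 1.0 before upload,
# so that its contribution is bounded by the clipping norm (used here as the sensitivity).
update = np.random.default_rng(6).normal(size=10)
update = update / max(1.0, np.linalg.norm(update))
noisy_update = gaussian_mechanism(update, l2_sensitivity=1.0, epsilon=0.5, delta=1e-5)
```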

In [42], the authors design distributed K-means clustering based on DP and HE, a distributed random forest based on DP, and distributed AdaBoost based on HE, realizing multiple layers of protection in data sharing and model sharing. In [43], the authors design an improved GAN model named DP-GAN which, compared with a traditional GAN, has one additional component: a DP identifier. Differential noise is generated by two games running at the same time: the game between the classic generator and the discriminator, and the game between the discriminator and the DP identifier. The data generated by the improved GAN model can meet the requirements of data protection while approximating the original data as closely as possible. In [46], in order to make DP play a better role in FL so that the FL model has better utility and privacy protection, the authors propose a method combining local gradient perturbation, secure aggregation, and zero-concentrated DP (zCDP). First, in order to protect the shared model updates, each client is required to perturb its gradient in each local iteration to ensure that the shared model updates are differentially private before aggregation. Because of the combination of periodic averaging and client sampling, gradient perturbation adds considerable noise to the model updates and results in low model utility. Therefore, a secure aggregation protocol with low communication overhead is integrated to reduce the added noise, while zCDP is used to tightly track the end-to-end privacy loss, which allows less noise to be added under the same DP guarantee. In [59], the authors theoretically analyze the performance of DP-based FL algorithms and study the convergence performance of FL with noise perturbation at an inherent privacy level. They propose a new DP-based framework that adds artificial noise to the client parameters before aggregation, i.e., noising before model aggregation FL (NbAFL). By adapting the variance of the artificial noise, NbAFL can satisfy DP under different protection levels. Then, a theoretical convergence bound on the loss function of the FL model trained with NbAFL is established, which shows that there is a tradeoff between convergence performance and privacy protection level: the better the convergence performance, the lower the protection level. In [93], the authors propose a differentially private asynchronous FL scheme (DP-AFL) to solve the privacy problem of mobile edge computing (MEC) in urban informatics. To protect the privacy of the updated local models, this scheme incorporates local DP into the local gradient-descent training process before integrating it into FL.

6.2.2 Homomorphic Encryption

HE is generally used to encrypt the gradients uploaded by users during FL [94]. A homomorphically encrypted gradient is just a collection of random-looking numbers, and an attacker cannot deduce any valuable information from these numbers without the key.

HE allows users to perform operations directly on ciphertext, and the results of these operations are still ciphertext. The result obtained after decryption is consistent with the result of performing the corresponding computation directly on the original data (plaintext) [95]. An HE scheme satisfies Eq. 15.

$$\begin{aligned} Dec(k_{S},Enc(k_{p},m_{1})\lozenge Enc(k_{p},m_{2}))=m_{1}\circ m_{2} \end{aligned}$$
(15)

In Equation (15), \(m_{1}\) and \(m_{2}\) are plaintexts, \(k_{S}\) is a private key and \(k_{p}\) is a public key. \(Enc(*,*)\) is the encryption operation, \(Dec(*,*)\) is the decryption operation, \(\circ \) is an operation in the plaintext domain, and \(\lozenge \) is the corresponding operation in the ciphertext domain.

According to the types and the number of ciphertext operations supported, HE can be divided into Partially HE (PHE), Somewhat HE (SHE) and Fully HE (FHE) [96].

PHE supports only a single type of operation, either addition or multiplication, but allows an unlimited number of such operations, so it can be divided into Additive HE (AHE) and Multiplicative HE (MHE). For example, the Paillier scheme belongs to AHE, and the ElGamal scheme belongs to MHE. SHE supports only a limited number of addition and multiplication operations. FHE supports arbitrary operations on ciphertext, and the number of operations is unlimited.
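As a small sketch of the additive homomorphic property in Eq. (15), with both \(\lozenge \) and \(\circ \) being addition, the snippet below uses the third-party python-paillier package (imported as phe); the choice of library and key length is an assumption made purely for illustration.

```python
# pip install phe   (python-paillier; assumed here purely for illustration)
from phe import paillier

# A key this small is for demonstration only, not for real deployments.
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

m1, m2 = 0.125, -0.75                  # two plaintext gradient components
c1 = public_key.encrypt(m1)            # Enc(k_p, m1)
c2 = public_key.encrypt(m2)            # Enc(k_p, m2)

c_sum = c1 + c2                        # ciphertext-domain operation (the "lozenge")
result = private_key.decrypt(c_sum)    # Dec(k_S, ...)
assert abs(result - (m1 + m2)) < 1e-9  # equals m1 + m2, the plaintext-domain operation
```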

In [35], in order to solve the "isolated data island" problem, in which data and data features cannot be correlated or shared across parties, the authors construct a FL system based on distributed encrypted matrix decomposition. Firstly, a framework based on user-distributed matrix decomposition is established. To strengthen data privacy protection, HE is added to perform FL on top of the distributed matrix decomposition. The scheme allows each user to encrypt its local gradients before transmitting them, preventing the gradients from being acquired or maliciously tampered with during transmission. Because the process does not need a third-party encryption service provider, it also avoids data leakage caused by third parties. In conventional HE schemes, the server holds the key, which can lead to a serious problem: if the server decrypts before aggregating, it gains access to the individual users' updates. To solve this problem, in [38], the authors propose a privacy-preserving federated extreme gradient boosting scheme (FEDXGB), a federated extreme gradient boosting (XGBoost) scheme supporting forced aggregation for mobile crowdsensing. A new secure gradient aggregation algorithm for FL is designed, which combines the advantages of HE and VSS. Specifically, through this combination, FEDXGB ensures that the central server cannot obtain the correct decryption result before performing aggregation, while remaining robust to user dropout. The results show that FEDXGB keeps the high performance of XGBoost with less than 1% accuracy loss, makes the performance loss of the trained XGBoost negligible, reduces the running time by about 23.9% and the communication cost by about 33.3% in gradient aggregation, and lowers the computation and communication cost of secure aggregation.

6.2.3 Secure Multi-party Computation

SMC technology can reduce the possibility of information leakage when aggregating model gradient updates. When SMC is used, each encryption is freshly randomized and encrypted data are not reused; operations are performed directly on the encrypted data without recovering the original data, and the participants are determined before each computation. Without sharing their inputs, multiple participants aggregate the data by using cryptographic techniques such as HE, secret sharing protocols, and oblivious transfer protocols. These methods only protect the privacy of the training data during the learning process; they cannot prevent inference attacks on the resulting model [97].

The formal description of SMC is as follows: assume that there are m participants \(P_{1},P_{2},\cdots ,P_{m}\) with their own datasets \(d_{1},d_{2},\cdots ,d_{m}\); the goal is to securely compute an agreed function \(y=f(d_{1},d_{2},\cdots ,d_{m})\) without relying on a trusted third party, while ensuring that no participant learns anything about the other participants' inputs beyond the computation result [98]. SMC has the characteristics of input independence, computational correctness, and decentralization. The basic cryptographic protocols of SMC include the Oblivious Transfer (OT) protocol, Garbled Circuits (GC), the Secret Sharing (SS) protocol, the Goldreich-Micali-Wigderson (GMW) protocol, etc.
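To make this formal description concrete, the toy Python sketch below computes the agreed function f as a sum via additive secret sharing over a prime field: each participant splits its private input into random shares so that the other parties only ever see shares, yet the sum is recovered exactly. The modulus and function names are illustrative assumptions, not a protocol from the surveyed papers.

```python
import secrets

P = 2**61 - 1  # prime modulus of the finite field (illustrative choice)

def make_shares(value, n_parties):
    """Split an integer into n additive shares that sum to value mod P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def secure_sum(private_inputs):
    """Each party sees one share from every other party; only the total is revealed."""
    n = len(private_inputs)
    all_shares = [make_shares(x, n) for x in private_inputs]
    # Party i locally adds the i-th share it received from every participant.
    partial_sums = [sum(all_shares[j][i] for j in range(n)) % P for i in range(n)]
    return sum(partial_sums) % P       # publishing the partial sums reveals only f = sum

assert secure_sum([11, 25, 6]) == 42
```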

In [99], the authors argue that using DP with a large number of clients leads to a decrease in model accuracy. To solve this problem, they propose a method that integrates SMC into DP. The results show that this method reduces the impact of noise injection as the number of clients increases, while maintaining a certain level of robustness. In [100], the authors propose an SMC protocol for the FL framework called secure aggregation. Secure aggregation utilizes a variety of cryptographic techniques to prevent the parameter server from acquiring any individual client's local update, protecting the FL framework from honest-but-curious attackers and disclosing only the sum of the model parameter updates to the server, and only after a sufficient number of updates have been received. The protocol consists of four rounds; in each round the server collects messages from all clients and computes a separate response to send to each client. In the first two rounds (the preparation stage), secret sharing is initialized. In the third round (the submit stage), each client submits its encrypted, masked model update to the server, which accumulates them. In the final round (the finalize stage), the clients reveal the encryption secrets, enabling the server to recover the aggregated model update.
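The simplified sketch below captures the masking idea behind secure aggregation [100]: every pair of clients agrees on a random mask that one adds and the other subtracts, so each masked update looks random to the server while the masks cancel in the aggregate. This toy version omits the secret sharing the real protocol uses to tolerate client dropout, and the shared random seed stands in for pairwise-agreed secrets; all names are illustrative.

```python
import numpy as np

def add_pairwise_masks(updates, seed=0):
    """Add pairwise-cancelling random masks to each client's model update."""
    n = len(updates)
    rng = np.random.default_rng(seed)   # stands in for pairwise-agreed PRG seeds
    # For every pair i < j, client i adds the mask and client j subtracts it.
    masks = {(i, j): rng.normal(size=updates[0].shape)
             for i in range(n) for j in range(i + 1, n)}
    masked = []
    for i, u in enumerate(updates):
        m = u.astype(float).copy()
        for (a, b), mask in masks.items():
            if a == i:
                m += mask
            elif b == i:
                m -= mask
        masked.append(m)
    return masked

updates = [np.array([0.2, -0.5]), np.array([0.4, 0.0]), np.array([-0.1, 0.3])]
masked = add_pairwise_masks(updates)
server_sum = np.sum(masked, axis=0)                 # masks cancel in the aggregate
assert np.allclose(server_sum, np.sum(updates, axis=0))
```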

6.2.4 Verifiable Secret Sharing

VSS is used to protect important information on clients and to prevent information loss, damage, and tampering. In FL, an attacker may monitor the communication between users and task publishers to intercept gradient information, or an honest-but-curious task publisher may obtain a user's local gradients. VSS processes the gradient information uploaded by users through encrypted shares, ensuring that a malicious server cannot obtain the gradients and thereby serving as a defense.

VSS involves three components: the clients, the distributor, and the secret. The idea is to split the secret information into n fragments in an appropriate way, with each fragment managed by a different client. A threshold t is set: the secret cannot be recovered by an attacker holding fewer than t fragments, and it can be recovered only when the number of fragments is greater than or equal to t [101]. A typical VSS scheme is constructed using a polynomial method, which can be divided into two steps: generating and distributing the key, and recovering the key. Equation 16 shows the key generation method.

$$\begin{aligned} y_{i}=\left( K+\sum _{j=1}^{t-1}a_{j}x_{i}^{j}\right) \bmod p \end{aligned}$$
(16)

In Equation (16), K is the secret, t is the threshold of the SS scheme, \(x_{i}\) is the public evaluation point of participant i, \(y_{i}\) is the share given to that participant, \(a_{j}\) are random polynomial coefficients, and the modulus p (typically a prime) keeps the computation in a finite field so that fewer than t shares reveal nothing useful about K.

The secret is then recovered by solving the system of linear equations formed by the shares provided by any t participants, which yields the polynomial coefficients and, in particular, the secret K.
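A minimal Python sketch of this (t, n) sharing and recovery process, following Eq. (16), is given below: shares are points on a random degree-(t-1) polynomial whose constant term is K, and any t shares recover K by Lagrange interpolation at x = 0. The prime modulus and function names are assumptions for illustration, and the verification steps that distinguish a full VSS scheme from plain secret sharing are omitted.

```python
import secrets

P = 2**61 - 1  # prime modulus p (illustrative)

def shamir_shares(secret, t, n):
    """Evaluate y_i = (K + a_1*x_i + ... + a_{t-1}*x_i^(t-1)) mod p at x_i = 1..n."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    def poly(x):
        return sum(a * pow(x, j, P) for j, a in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Recover K by Lagrange interpolation at x = 0 over GF(p) (needs Python 3.8+)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = shamir_shares(secret=123456789, t=3, n=5)
assert reconstruct(shares[:3]) == 123456789   # any 3 of the 5 shares recover K
```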

In [100], the authors design a secure aggregation scheme based on Shamir secret sharing to ensure that model updates are aggregated securely in the presence of an honest-but-curious server, while controlling the complexity of the secret sharing protocol and keeping computation and communication costs low on large datasets. In [102], the authors propose a VFL algorithm based on logistic regression. After the server performs private sample alignment, intermediate results are computed on the aligned samples. The server then generates a public/private key pair and encrypts the approximate losses and intermediate results, and the encrypted gradients are obtained through local training. Since the server is honest but curious, random masks need to be generated to prevent the server from inferring the users' private information from the original gradients. With the help of a third party, this method ensures privacy through encryption and reduces the cost of the encrypted computation by approximating the loss function.

In summary, the technologies that address FL privacy challenges are listed in Table 8.

Table 8 Summary of technologies that address federated learning privacy challenges

7 Future Research Direction

In this section, we discuss some future research directions, organized according to the corresponding high-level challenges, which we hope will be useful for future work. Based on the classifications and solutions discussed above, we identify a set of criteria for future solutions that can serve as a reference for scholars and developers studying ways to improve security and privacy in future FL systems.

7.1 Suggestions for Security Challenges

In the FL environment, most security solutions only consider attacks executed in a single direction, ignoring more complex attack scenarios. From this perspective, an attacker can formulate a joint attack plan and exploit more complex attack scenarios to defeat existing security defense mechanisms. For example, an attack can be executed by multiple client devices, multiple attack methods can target the same victim simultaneously, or malicious clients can collude with servers (e.g., by sharing private keys) to attack other honest users. Security solutions tailored to a single attack cannot easily adapt to such collusive attacks. The security analysis of the secure matrix decomposition scheme proposed in [60] shows that the typical HFL security definition, which assumes honest clients and an honest-but-curious server, is weak: malicious clients may collude with the server to attack other users, revealing the privacy of honest users, and may mount backdoor attacks, causing further security problems.

In order to design efficient and safe security defense schemes, several defense criteria are drawn up below. When dealing with the security challenges of FL, defense schemes can be designed based on these criteria, as shown in Table 9.

Table 9 Summary of defense criteria for federated learning security challenges

7.2 Suggestions for Privacy Challenges

Privacy protection schemes in FL are usually designed to be universal across client devices and data samples in all scenarios. In practice, however, data samples in different situations, and even on a single device, often differ, so a universal privacy protection scheme may not achieve the expected effect. Therefore, special privacy protection schemes can be designed to protect client privacy in specific situations and combined with the universal schemes. Privacy protection for FL should also consider the utility loss imposed on the FL system by the protection methods employed, especially DP. From the perspective of security challenges, an attacker's attack on the FL system causes not only security risks but also risks of privacy disclosure. In addition, software and hardware factors outside the FL system itself should also be taken into account.

In order to design efficient and safe privacy defense schemes, several defense criteria are drawn up below. When dealing with the privacy challenges of FL, defense schemes can be designed based on these criteria, as shown in Table 10.

Table 10 Summary of defense criteria for federated learning privacy challenges

In addressing the security and privacy challenges of FL, in addition to implementing traditional defense approaches and the above defense criteria, integrating other technologies with FL to build new FL architectures is an attractive defense solution. For example, blockchain can provide high security for FL training through immutable block ledgers. By utilizing blockchain, FL can maintain a decentralized data ledger in which each device acts as a client with equal rights, eliminating the need for a central server [103] and reducing the risk of a single point of failure. In particular, the integration of FL and blockchain creates a new paradigm called FLchain, which guarantees the safety of learned update information in the form of immutable blocks [104]. In FLchain, an adversary may attempt to manipulate the training output by training the local model on forged data and replacing the global model before the update is transmitted. By adjusting the difficulty level of blockchain mining, the likelihood of poisoning attacks on training data can be reduced without degrading training performance [105]. Driven by the unique advantages of blockchain, another blockchain-based FL architecture called PriModChain is introduced in [106]. DP is applied to locally generated models with artificial noise to reduce the possibility of identifying personal records. Smart contracts secure the communication between the central authority and the distributed users exchanging global ML models, which facilitates update validation and provides transparency for FL updates. This mechanism enforces unbiased data operations and makes erroneous operations costly, enhancing the security and reliability of FL processes under external data threats. However, the use of blockchain introduces additional delays associated with block mining, which creates new challenges for FL systems, since FL clients must wait for the mining process to complete before receiving model updates and executing the next round of training [107].

8 Conclusion

The distributed learning mode of FL makes it unnecessary for users to upload their original data to the server. This learning mode alleviates the otherwise unavoidable privacy problems of the big data era and has become an indispensable privacy-preserving technology. However, since FL is built on machine learning, its systems still suffer from inherent security problems and derived privacy problems. This paper expounds the security and privacy threats of FL from the perspectives of attack and defense. First, a detailed investigation of the existing surveys on the security and privacy protection of FL is carried out, and our survey is compared with them to highlight its unique contributions. Secondly, it introduces the background knowledge of FL and comprehensively analyzes the three FL scenarios. It then illustrates the various specific threats to the security and privacy of FL in the form of tables and gives the corresponding solutions. Next, it classifies the security challenges according to the collected FL-related threats, illustrates the classification with figures and examples, and presents security defense methods and privacy protection technologies to address the challenges of FL. Finally, considering the drawbacks of existing attack and defense methods, we make suggestions on how to design better privacy protection and security schemes in FL, and develop a set of criteria against malicious attacks and privacy leakage, hoping this will be useful for researchers and developers when planning their own defense schemes.