
1 Introduction

This chapter briefly reviews the background of machine learning and privacy leakage, explains the research significance of this paper, and outlines the difficulties and challenges in this research direction.

1.1 Related Work

In the era of big data, massive amounts of information have driven the development of machine learning, which is now widely used in malware detection (see [33, 37]), computer vision (see [32, 49]), voice command recognition (see [46, 48]), driving systems (see [10, 47]), recommendation systems (see [36, 56]), medical diagnosis (see [3, 13]) and many other fields. Machine learning can discover patterns and regularities in massive data and apply this knowledge to different tasks, bringing great convenience and benefit to humans. In particular, breakthroughs in technologies such as deep learning [41] and reinforcement learning [31] have provided strong support for the application of machine learning in the above fields, in some cases even surpassing human performance. There has already been a great deal of research on privacy protection in machine learning, and many researchers have evaluated and summarized existing attack and defense work. Reference [5] studies attack models for machine learning and carries out an in-depth analysis using statistical spam classification as an example. Reference [2] takes cleaning robots as an example to summarize and analyze problems that may arise in people's real work and daily life. Reference [4] uses black-box and white-box models to study adversarial attacks and poisoning attacks on machine learning. Although Ref. [34] is a comprehensive overview of computer security threats, it also summarizes some content related to machine learning. Reference [35] focuses on the training and prediction phases of the machine learning life cycle. Reference [1] focuses on security issues in the field of computer vision. Reference [18] surveys and summarizes the area based on the machine learning CIA (confidentiality, integrity, availability) model.

This article first explains the development of machine learning and privacy leakage, and then, from the three angles of privacy leakage, attack methods, and model security, systematically summarizes existing machine learning attack and defense methods and discusses the limitations of related research. Finally, it discusses the challenges faced by research on machine learning model security and privacy, as well as feasible future research directions. The main contributions are as follows:

1. We conduct a comprehensive and systematic analysis and summary of attack methods and defense technologies from recent years.

2. We present possible machine learning attacks and defense measures with the aid of figures and tables, and introduce typical attack and defense methods.

3. Based on the characteristics and current state of machine learning, we offer a multi-faceted summary and outlook.

2 Machine Learning Model

As described in Ref. [30], a machine learning model is a data-driven predictive model: by training on large amounts of data, it discovers relationships and regularities between variables and uses them to predict or classify future data.

2.1 Model Introduction

This paper uses the Amazon Machine Learning (Amazon ML) model described in Ref. [45]. Amazon Machine Learning is a machine learning service provided by Amazon that aims to help users quickly build and deploy high-quality machine learning models. It provides a series of easy-to-use APIs and tools that enable users to build, train and test models without extensive machine learning expertise, and it supports a variety of machine learning models, including linear regression, logistic regression, decision trees, support vector machines, and random forests.
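The snippet below is not the Amazon ML API itself; it is a minimal local sketch, using scikit-learn and synthetic data, of what "build, train and test" means for the kinds of supervised models the service is said to support (here logistic regression and a random forest).

```python
# A hypothetical local illustration only -- not Amazon ML's actual API.
# It builds, trains and tests two of the model families mentioned above
# on synthetic data to make the workflow concrete.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LogisticRegression(max_iter=1000), RandomForestClassifier()):
    model.fit(X_train, y_train)                               # build and train
    print(type(model).__name__, model.score(X_test, y_test))  # test
```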

Fig. 1. Machine learning model

3 Privacy Leakage

This chapter briefly explains the background knowledge of privacy leakage, which has long been an important topic in machine learning research.

3.1 Background Knowledge on Privacy Leakage

In machine learning, models are typically designed for efficient and accurate prediction to meet the requirements of specific scenarios, rather than for resistance to attacks. As a result, when models are deployed in practice, malicious users may attack various stages of the machine learning life cycle, and among the resulting risks, privacy leakage is particularly prominent.

This paper focuses mainly on attack vectors against data privacy. Most attackers aim to obtain the private data of unspecified individuals and have developed numerous attack methods for this purpose. We therefore introduce three data-targeted attack methods, covering training data, model inputs, and predictions. Attacks on model privacy, in turn, are mainly reflected in model extraction. Whatever the attacker's purpose, the leakage of such data can cause considerable damage to the data owner, so both providers and users of machine learning models should pay more attention to privacy protection and continuously improve their ability to resist attacks.

4 Attack Methods That Cause Privacy Leakage

Vulnerabilities in machine learning algorithms and their implementations lead to security risks such as data leakage and the loss of model structure and parameters. This chapter introduces three attack methods targeting model data and one attack method targeting the model itself.

4.1 Attack Methods Targeting Model Data

Model data is the basis of machine learning models and includes training data, model inputs and outputs, and so on. Some training data sets, such as shopping records and hospital records, involve personal privacy and therefore raise privacy-protection concerns. The following describes three attack methods against machine learning model data.

4.1.1 Embedding Inversion Attack

The embedding inversion attack [14] targets deep learning semantic embedding models and is often used to infer the input text from a pre-trained neural language model. In natural language processing, an embedding maps words or phrases into a real vector space using a small, fixed-length representation. In text classification tasks, the input text sequence must be converted into a sequence of numbers; to do this, word embedding methods map each word to a low-dimensional vector, so that a sentence or paragraph can be represented as a matrix of word vectors. This matrix is fed into a neural network for text classification, and the embedding vectors can also serve as context passed to the neural network for classification, regression, and other tasks.
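As a rough illustration, the sketch below shows the nearest-neighbour idea that underlies many embedding inversion attacks: given an observed embedding vector, the attacker searches a vocabulary's embedding table for the closest token. The vocabulary, dimensionality, and random embedding table are illustrative assumptions, not a real pre-trained language model.

```python
# Illustrative sketch only: a stand-in embedding table, not a trained model.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["diagnosis", "cancer", "benign", "invoice", "password"]
dim = 16
embedding_table = rng.normal(size=(len(vocab), dim))  # stand-in for trained embeddings

def embed(word: str) -> np.ndarray:
    return embedding_table[vocab.index(word)]

def invert(observed: np.ndarray) -> str:
    """Recover the most likely token by cosine similarity against the table."""
    norms = np.linalg.norm(embedding_table, axis=1) * np.linalg.norm(observed)
    scores = embedding_table @ observed / norms
    return vocab[int(np.argmax(scores))]

# The attacker observes a (slightly perturbed) embedding leaked by the model ...
leaked = embed("cancer") + rng.normal(scale=0.05, size=dim)
print(invert(leaked))  # -> "cancer": the private input token is recovered
```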

4.1.2 Attribute Inference Attack

Attribute Inference Attack (see [19, 55]) aims to infer private attributes in training data from machine learning models. Attackers do not need to directly access protected personal data, but instead gain private information about personal data by analyzing deployed machine learning models.
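A minimal sketch of this idea, under simplifying assumptions, is given below: the attacker holds auxiliary records whose sensitive attribute is known, queries the deployed target model for its confidence outputs, and trains a small attack model that maps those outputs to the hidden attribute. The synthetic data and logistic-regression models are illustrative only.

```python
# Illustrative sketch only: synthetic data and simple models.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Target model: trained by the victim on features that happen to correlate
# with a sensitive attribute the attacker wants to infer.
X = rng.normal(size=(2000, 5))
sensitive = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)
task_label = (X[:, 0] + X[:, 1] > 0).astype(int)          # the model's actual task
target_model = LogisticRegression(max_iter=1000).fit(X, task_label)

# Attacker: uses auxiliary data (known attribute) and only the target model's
# predicted probabilities to train the attack model.
aux_probs = target_model.predict_proba(X[:1000])
attack_model = LogisticRegression().fit(aux_probs, sensitive[:1000])

# Inference on victims whose attribute is unknown to the attacker.
victim_probs = target_model.predict_proba(X[1000:])
acc = (attack_model.predict(victim_probs) == sensitive[1000:]).mean()
print(f"attribute inference accuracy: {acc:.2f}")  # typically well above chance (0.5)
```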

4.1.3 Membership Inference Attack

The membership inference attack is a privacy attack in machine learning [15, 28, 38, 44, 51] that aims to determine whether a given input belongs to the protected training data set, that is, whether it has membership in that data set. Membership inference attacks rest on two assumptions: first, that the model can be queried, and second, that the attacker has access to some labeled samples (that is, samples whose membership in the data set and corresponding labels are known). The attacker then submits chosen inputs and, by observing the model's outputs, guesses whether each input is a member of the data set.
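The following sketch illustrates a simple confidence-based membership test, under the assumption that the attacker can query predicted probabilities and that the target model overfits its training members; the data, model, and threshold are illustrative.

```python
# Illustrative sketch only: synthetic data and an overfitting target model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

members, non_members = X[:200], X[200:]
target_model = RandomForestClassifier(n_estimators=50).fit(members, y[:200])

def confidence(model, x):
    return model.predict_proba(x).max(axis=1)  # top-class probability

# Training members tend to receive higher confidence than outsiders, so a
# simple threshold already separates the two groups to some extent.
threshold = 0.9
flagged_members = (confidence(target_model, members) >= threshold).mean()
flagged_outsiders = (confidence(target_model, non_members) >= threshold).mean()
print("flagged as member:", flagged_members, "vs", flagged_outsiders)
```

Stronger variants replace the fixed threshold with shadow models trained by the attacker to mimic the target's behaviour, but the underlying signal is the same gap between member and non-member confidence.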

4.2 Attack Methods Targeting the Model Itself

There are also attack methods that target the machine learning model itself. Attackers can obtain model-related information by calling the model's APIs, and can even impersonate or misappropriate the model in order to steal private data.

4.2.1 Attack Overview

The model extraction attack [12, 39, 52] has been studied for simple classification tasks, vision tasks, NLP tasks, and so on. Typically, model extraction attacks aim to reconstruct a local copy of a black-box API or steal its functionality. If the extraction succeeds, the attacker has effectively stolen the intellectual property, i.e., the full details of the model. Work on this attack method mainly focuses on how to imitate a model whose performance is close to that of the victim API in the source domain, and a more powerful attacker may even extract a model better than the target victim API [16].

Attackers use this technique to steal model knowledge: by accessing the model and its outputs, they deduce sensitive information about the target model. The technique is called a model extraction attack because the attacker can replace the attacked model with one constructed by himself, which outputs labels consistent with the original model [50].
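The sketch below illustrates the basic query-and-imitate loop of model extraction under simplifying assumptions: the attacker only sees the victim's label output for chosen queries and trains a local surrogate on the query/response pairs. The victim model, surrogate, and data are synthetic stand-ins, not any specific commercial API.

```python
# Illustrative sketch only: synthetic victim and surrogate models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Victim API: a model the attacker can query but cannot inspect.
X_train = rng.normal(size=(1000, 4))
y_train = (X_train[:, 0] - X_train[:, 2] > 0).astype(int)
victim_api = LogisticRegression().fit(X_train, y_train)

# Attacker: sends synthetic queries, records the returned labels, and fits a
# local copy on the query/response pairs.
queries = rng.normal(size=(2000, 4))
responses = victim_api.predict(queries)
surrogate = DecisionTreeClassifier(max_depth=5).fit(queries, responses)

# Agreement between surrogate and victim on fresh inputs measures how much of
# the victim's functionality was stolen.
test = rng.normal(size=(1000, 4))
agreement = (surrogate.predict(test) == victim_api.predict(test)).mean()
print(f"surrogate agrees with victim on {agreement:.0%} of inputs")
```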

5 Defense Measures Against Privacy Leakage

This chapter introduces four commonly used schemes for privacy protection, namely homomorphic encryption, secure multi-party computation, adversarial training, and differential privacy.

5.1 Homomorphic Encryption

Homomorphic encryption (HE) refers to encryption schemes in which ciphertexts produced by a specific homomorphic encryption algorithm support homomorphic operations: the result of operating on the ciphertexts and then decrypting equals the result of performing the same operation directly on the plaintexts. This makes data "computable yet invisible". In cryptographic systems, homomorphic encryption is usually based on hard computational problems in mathematics, including but not limited to integer factorization, the discrete logarithm problem, the composite residuosity problem, and the approximate greatest common divisor problem [7, 17, 40].
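As a toy illustration of the homomorphic property only, the snippet below uses unpadded "textbook" RSA with tiny parameters, which is multiplicatively homomorphic: multiplying two ciphertexts and decrypting yields the product of the plaintexts. Real homomorphic encryption schemes used in practice (e.g. Paillier, BGV, CKKS) are far more involved; this is not one of them.

```python
# Toy example of the homomorphic property with unpadded RSA and tiny primes.
p, q = 61, 53
n = p * q                      # modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent

def encrypt(m): return pow(m, e, n)
def decrypt(c): return pow(c, d, n)

m1, m2 = 7, 11
c1, c2 = encrypt(m1), encrypt(m2)

# Multiplying ciphertexts and then decrypting yields the product of the
# plaintexts: the computation happened "blind", without ever seeing m1 or m2.
assert decrypt((c1 * c2) % n) == (m1 * m2) % n
print(decrypt((c1 * c2) % n))  # -> 77
```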

5.2 Secure Multi-party Computation

Secure multi-party computation [6, 11] was proposed in 1982 by Andrew Chi-Chih Yao (Yao Qizhi), an academician of the Chinese Academy of Sciences. For model training, secure multi-party computation relies on cryptographic tools such as secret sharing [20, 21], zero-knowledge proofs [26, 27], oblivious transfer [8, 54], and garbled circuits [42, 53]. In centralized machine learning, secure multi-party computation can be carried out on two non-colluding servers via secret sharing, and such schemes can be extended to scenarios with hundreds of users, although at the cost of heavy communication overhead. By contrast, secure multi-party computation schemes based on garbled circuits can generally only be applied to two or three parties for model training. In joint (federated) machine learning, homomorphic encryption or zero-knowledge proofs are more commonly used.
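The sketch below shows additive secret sharing, one of the cryptographic building blocks mentioned above, in its simplest form: each party holds a random-looking share, and only the recombination of all shares reveals a value, which lets several parties compute a sum without exposing their individual inputs. The field modulus and the three-hospital scenario are illustrative assumptions.

```python
# Illustrative sketch of additive secret sharing over a prime field.
import secrets

PRIME = 2**61 - 1  # field modulus (illustrative choice)

def share(secret: int, n_parties: int) -> list[int]:
    """Split a secret into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % PRIME

# Three hospitals jointly compute the sum of their private counts: each one
# distributes shares, every party adds up the shares it received, and only the
# final recombination reveals the aggregate, never the individual inputs.
inputs = [120, 85, 42]
all_shares = [share(x, 3) for x in inputs]
partial_sums = [sum(col) % PRIME for col in zip(*all_shares)]
print(reconstruct(partial_sums))  # -> 247
```

Each share on its own is uniformly random, so no single party learns anything about another party's input from the shares it holds.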

5.3 Adversarial Training

Adversarial training [24, 25, 29] is a defense method in machine learning that aims to improve the resistance of deep neural networks to adversarial attacks. By injecting perturbations into the original data, the model is made more robust, thereby reducing the impact of attacks on model data and structural parameters. The method mainly consists of the following steps. First, generate adversarial samples: an attack algorithm is used to craft adversarial samples, which are added to the normal training data to form a new training set. Second, train the model: the model is retrained on the new training set containing adversarial samples in order to increase its tolerance to adversarial noise and improve its robustness. Finally, test the model: after training, the model is evaluated on the test set. If it shows better robustness, it will handle both raw data and adversarial inputs better than a normally trained model; if expectations are not met, the first two steps are repeated until the result is satisfactory.
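A minimal sketch of this loop is given below, using a single-step FGSM-style perturbation in PyTorch on a small synthetic classifier; practical adversarial training typically uses stronger attacks (e.g. PGD) and more careful schedules, so this is only meant to make the three steps concrete.

```python
# Illustrative FGSM-style adversarial training on synthetic data.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

X = torch.randn(512, 20)
y = (X[:, 0] + X[:, 1] > 0).long()

def fgsm(x, y, eps=0.1):
    """Craft adversarial examples with one signed-gradient step on the input."""
    x_adv = x.clone().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

for epoch in range(20):
    # Step 1: generate adversarial samples and mix them with the clean data.
    x_mix = torch.cat([X, fgsm(X, y)])
    y_mix = torch.cat([y, y])
    # Step 2: retrain on the augmented set so the model tolerates the noise.
    opt.zero_grad()
    F.cross_entropy(model(x_mix), y_mix).backward()
    opt.step()

# Step 3: evaluate robustness on freshly crafted adversarial inputs.
x_adv_test = fgsm(X, y)
with torch.no_grad():
    robust_acc = (model(x_adv_test).argmax(1) == y).float().mean()
print(f"accuracy on adversarial inputs: {robust_acc:.2f}")
```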

5.4 Differential Privacy

Differential privacy is a data protection technique with a strict mathematical definition and quantifiable privacy guarantees. By perturbing the data, for example by adding noise, it prevents an attacker from inferring the original data, thereby protecting privacy without completely destroying the data, so that the perturbed data remains usable. Depending on who performs the processing, differential privacy can be divided into centralized differential privacy [9, 23] and local differential privacy [22, 43]. In centralized differential privacy, the data is processed by a trusted third party, whereas in local differential privacy the data is privatized locally by the user; the more mainstream approach is local differential privacy.
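As a minimal sketch, the snippet below applies the Laplace mechanism to a counting query: since the sensitivity of a count is 1, adding Laplace noise with scale 1/ε yields ε-differential privacy for that single query. The ε value and the query are illustrative.

```python
# Illustrative Laplace mechanism for a single counting query.
import numpy as np

rng = np.random.default_rng(0)

def private_count(data, predicate, epsilon=0.5):
    """Return a noisy count that protects any single individual's record."""
    true_count = sum(predicate(x) for x in data)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)  # sensitivity / epsilon
    return true_count + noise

ages = [23, 35, 41, 29, 52, 38, 61, 45]
print(private_count(ages, lambda a: a > 40))  # close to the true count of 4, but perturbed
```

A smaller ε adds more noise and gives stronger privacy; a larger ε preserves accuracy at the cost of weaker protection.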

6 Summary and Outlook

This article introduces the leakage risks of machine learning models and the corresponding defenses, describing four attack methods and four defense measures. Whether an attack targets the model's training data or the model itself, these methods apply to most of today's machine learning models, which shows that such models carry a real risk of leakage. As research on machine learning and artificial intelligence deepens, machine learning models are being applied ever more widely and are becoming deeply involved in all aspects of people's lives. Large amounts of personal data are used to train models in order to make them more human-centered and intelligent, but this trend increases the danger posed by privacy leakage. After obtaining the relevant data, attackers can use the private data to profile individuals, which may facilitate targeted advertising, data harvesting, and even targeted telecom and online fraud and the theft of private property.