that are not specific to the training data. The last layers do extract complex and abstract features, but the closer they are to the final layer, the less extra information they contain about the training set, compared with the final (output) layer.
We formalize the threat model in all these settings, and exploit the privacy vulnerabilities of the stochastic gradient descent (SGD) algorithm to design our white-box inference attack. Each data point in the training set influences many of the model parameters, through the SGD algorithm, to minimize its contribution to the training loss. The local gradient of the loss on a target data record, with respect to a given parameter, indicates how much and in which direction the parameter needs to be changed to fit the model to the data record. To minimize the expected loss of the model, the SGD algorithm repeatedly updates the model parameters in a direction that drives the gradient of the loss over the whole training dataset towards zero. Therefore, each training data sample leaves a distinguishable footprint on the local gradients of the loss function over the model’s parameters.
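To make this signal concrete, the following minimal PyTorch sketch (target_model, x, and y are hypothetical placeholders for the target model and a candidate record) computes the gradient of the loss on a single record with respect to all parameters; this per-record gradient vector is the raw signal the attack builds on.

```python
import torch
import torch.nn.functional as F

def per_record_gradient(target_model, x, y):
    """Gradient of the loss on a single (x, y) record w.r.t. all parameters.
    Members of the training set tend to leave smaller, more 'settled'
    gradients than non-members."""
    target_model.zero_grad()
    logits = target_model(x.unsqueeze(0))           # forward pass on one record
    loss = F.cross_entropy(logits, y.unsqueeze(0))  # per-record loss
    loss.backward()                                 # backpropagate to obtain dL/dW
    grads = [p.grad.detach().flatten()
             for p in target_model.parameters() if p.grad is not None]
    return torch.cat(grads)                         # flattened gradient feature vector
```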
We design our inference attack by using the gradient vector of the loss on the target data point, over all parameters, as the main feature for the attack model. We design an architecture for our attack model that processes the features extracted from different layers separately, and then aggregates them to extract the membership information. In the cases where the adversary does not have samples from the target training set to train its inference model, we train the attack model in an unsupervised manner. We train auto-encoders to compute a membership information embedding for any data point. We then use a clustering algorithm, on the target dataset, to separate members from non-members based on their membership information embeddings.
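A minimal sketch of this per-layer design is given below (PyTorch; the layer feature dimensions and hidden size are hypothetical): each observed layer gets its own small encoder, and an aggregator maps the concatenated encodings to a membership score.

```python
import torch
import torch.nn as nn

class WhiteBoxAttackModel(nn.Module):
    """Sketch of an attack model that processes features from each target-model
    layer separately and then aggregates them into a membership score."""
    def __init__(self, layer_feature_dims, hidden_dim=128):
        super().__init__()
        # One small encoder per observed layer (gradients, activations, outputs, ...).
        self.encoders = nn.ModuleList([
            nn.Sequential(nn.Linear(d, hidden_dim), nn.ReLU())
            for d in layer_feature_dims
        ])
        # Aggregator turns the concatenated encodings into a membership probability.
        self.aggregator = nn.Sequential(
            nn.Linear(hidden_dim * len(layer_feature_dims), hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, per_layer_features):
        # per_layer_features: list of [batch, d_i] tensors, one per observed layer
        encoded = [enc(f) for enc, f in zip(self.encoders, per_layer_features)]
        return self.aggregator(torch.cat(encoded, dim=1))
```

Keeping the per-layer encoders separate lets the attack weight each layer’s signal independently before combining them, which matches the intuition that layers closer to the output carry more membership information.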
To show the effectiveness of our white-box inference attack, we evaluate the privacy of pre-trained and publicly available state-of-the-art models on the CIFAR100 dataset. We had no influence on training these models. Our results show that the DenseNet model (the best model on CIFAR100, with 82% test accuracy) is not very vulnerable to black-box attacks (with a 54.5% inference attack accuracy, where 50% is the baseline of random guessing). However, the white-box membership inference attack obtains a considerably higher accuracy of 74.3%. This shows that even well-generalized deep models leak a significant amount of information about their training data, and are vulnerable to white-box membership inference attacks.
In federated learning, we show that a curious parameter server or even a participant can perform alarmingly accurate membership inference attacks against other participants. For the DenseNet model on CIFAR100, a local participant can achieve a membership inference accuracy of 72.2%, even though it only observes aggregate updates through the parameter server. Also, the curious central parameter server can achieve a 79.2% inference accuracy, as it receives the individual parameter updates from all participants. In federated learning, the repeated parameter updates of the model over different epochs on the same underlying training set are a key factor in boosting the inference attack accuracy.
As the adversary’s contributions (i.e., parameter updates) can influence the victim’s parameters in the federated learning setting, the adversary can actively exploit SGD to leak even more information about the participants’ training data. We design an active attack that performs gradient ascent on a set of target data points before uploading and updating the global parameters. This magnifies the presence of the target data points in other participants’ training sets, because SGD reacts by abruptly reducing the gradient on these points if they are members. This leads to a 76.7% inference accuracy for an adversarial participant, and a significant 82.1% accuracy for an active inference attack by the central server. By isolating a participant during the parameter updates, the central attacker can boost its accuracy to 87.3% on the DenseNet model.
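The core step of this active attack can be sketched as follows (PyTorch; local_model, the target records, and the ascent rate gamma are hypothetical placeholders): the adversarial participant ascends, rather than descends, the gradient on the target points before uploading its parameters.

```python
import torch
import torch.nn.functional as F

def active_gradient_ascent_update(local_model, target_batch, target_labels, gamma=1.0):
    """Adversarial participant's local step: increase the loss on the target
    records (gradient ascent) before uploading the parameter update.
    If a target record is in another participant's training set, their SGD
    steps will sharply push its gradient back down, which the attacker observes."""
    local_model.zero_grad()
    loss = F.cross_entropy(local_model(target_batch), target_labels)
    loss.backward()
    with torch.no_grad():
        for p in local_model.parameters():
            if p.grad is not None:
                p += gamma * p.grad      # ascend instead of descend
    return local_model.state_dict()      # sketch: uploaded as this round's "update"
```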
II. INFERENCE ATTACKS
We use membership inference attacks to measure the information leakage of deep learning algorithms and models about their training data. There are many different scenarios in which data is used for training models, and there are many different ways the attacker can observe the deep learning process. In Table I, we cover the major criteria for categorizing the attacks. These include the attack observations, the assumptions about the adversary’s knowledge, the target training algorithm, and the mode of the attack based on the adversary’s actions. In this section, we discuss the different attack scenarios as well as the techniques we use to exploit deep learning algorithms. We also describe the architecture of our attack model, and how the adversary computes the membership probability.
A. Attack Observations: Black-box vs. White-box Inference
The adversary’s observations of the deep learning algorithm constitute the inputs of the inference attack.
Black-box. In this setting, the adversary’s observation is limited to the output of the model on arbitrary inputs. For any data point x, the attacker can only obtain f(x; W). The parameters of the model W and the intermediate steps of the computation are not accessible to the attacker. This is the setting of machine-learning-as-a-service platforms. Membership inference attacks against black-box models have already been designed; they exploit the statistical differences between a model’s predictions on its training set versus unseen data [6].
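In this setting, the only per-record signal is the prediction vector itself; a minimal sketch of typical black-box attack features (probs is a hypothetical softmax output for one record, label its true class) is:

```python
import numpy as np

def blackbox_features(probs, label):
    """Black-box attack features for one record: the prediction vector itself,
    plus simple statistics that tend to differ between members and non-members."""
    confidence = probs[label]                         # confidence on the true label
    entropy = -np.sum(probs * np.log(probs + 1e-12))  # prediction uncertainty
    return np.concatenate([probs, [confidence, entropy]])
```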
White-box. In this setting, the attacker obtains the model f(x; W), including its parameters, which are needed for prediction. Thus, for any input x, in addition to its output, the attacker can compute all the intermediate computations of the model. That is, the adversary can compute any function over W and x given the model. The most straightforward such functions are the outputs of the hidden layers, h_i(x), on the input x. As a simple extension, the attacker can extend black-box membership inference attacks (which are limited to the model’s output) to the outputs of all activation functions of the model.
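As a sketch of these white-box observations (assuming a PyTorch target model), the attacker can record every layer’s output h_i(x) with forward hooks:

```python
import torch

def hidden_layer_outputs(target_model, x):
    """White-box observation: collect the output h_i(x) of every layer
    of the model on input x using forward hooks."""
    outputs, hooks = [], []
    for module in target_model.children():
        hooks.append(module.register_forward_hook(
            lambda _m, _inp, out: outputs.append(out.detach())))
    with torch.no_grad():
        target_model(x.unsqueeze(0))   # one forward pass records all h_i(x)
    for h in hooks:
        h.remove()                     # clean up the hooks afterwards
    return outputs
```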
However, this does not necessarily contain all the useful information for membership inference. Notably, the model output and activation functions could generalize if the model is well regularized. Thus, there might not be much