Summary: This paper proposes RLB-MI, a novel black-box model inversion attack based on reinforcement learning. The attack formulates the latent space search as a Markov Decision Process (MDP) to address the efficiency and accuracy limitations of existing black-box attacks, and uses the confidence scores that the target model assigns to generated images as rewards to guide a reinforcement learning agent in exploring latent vectors and reconstructing sensitive data. Experiments on various datasets demonstrate its effectiveness and superiority over prior work.
Intended audience: researchers concerned with privacy and security, and practitioners working on deep learning applications.
Use cases and goals: assessing whether machine learning models are at risk of leaking training data, improving the quality of model inversion reconstructions, and helping practitioners understand and harden model training pipelines. The paper reports extensive experiments under black-box conditions and compares several mainstream white-box and black-box attacks in terms of effectiveness, robustness, and practicality, demonstrating the advantages and potential of the proposed approach.
Reinforcement Learning-Based Black-Box Model Inversion Attacks
Gyojin Han Jaehyun Choi Haeil Lee Junmo Kim
School of Electrical Engineering, KAIST
{hangj0820, chlwogus, haeil.lee, junmo.kim}@kaist.ac.kr
Abstract
Model inversion attacks are a type of privacy attack that reconstructs private data used to train a machine learning model, solely by accessing the model. Recently, white-box model inversion attacks leveraging Generative Adversarial Networks (GANs) to distill knowledge from public datasets have been receiving great attention because of their excellent attack performance. On the other hand, current black-box model inversion attacks that utilize GANs suffer from issues such as being unable to guarantee the completion of the attack process within a predetermined number of query accesses, or to achieve the same level of performance as white-box attacks. To overcome these limitations, we propose a reinforcement learning-based black-box model inversion attack. We formulate the latent space search as a Markov Decision Process (MDP) problem and solve it with reinforcement learning. Our method utilizes the confidence scores of the generated images to provide rewards to an agent. Finally, the private data can be reconstructed using the latent vectors found by the agent trained in the MDP. The experimental results on various datasets and models demonstrate that our attack successfully recovers the private information of the target model by achieving state-of-the-art attack performance. We emphasize the importance of studies on privacy-preserving machine learning by proposing a more advanced black-box model inversion attack.
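To make the formulation above concrete, the following is a minimal illustrative sketch (not the authors' code) of how latent space search can be cast as an MDP whose reward is the black-box model's confidence score for the target class; `generator` and `target_model` are hypothetical callables standing in for a pre-trained GAN generator and the attacked classifier.

```python
import numpy as np

class LatentSearchEnv:
    """Latent space search viewed as an MDP: the state is the current
    latent vector, an action perturbs it, and the reward is the
    black-box target model's confidence score for the target class."""

    def __init__(self, generator, target_model, target_class, latent_dim=100):
        self.generator = generator        # z -> image (pre-trained GAN generator)
        self.target_model = target_model  # image -> confidence scores (black-box)
        self.target_class = target_class
        self.latent_dim = latent_dim
        self.z = np.random.randn(latent_dim)

    def reset(self):
        self.z = np.random.randn(self.latent_dim)
        return self.z.copy()

    def step(self, action):
        """One query access: apply the agent's latent perturbation and
        reward it with the target-class confidence of the generated image."""
        self.z = self.z + action
        image = self.generator(self.z)
        scores = self.target_model(image)   # soft labels only, no gradients
        reward = float(scores[self.target_class])
        return self.z.copy(), reward
```

An agent trained against such an environment converges toward latent vectors whose generated images the target model classifies as the target class with high confidence, which is the reconstruction described above.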
1. Introduction
With the rapid development of artificial intelligence, deep learning applications are emerging in various fields such as computer vision, healthcare, autonomous driving, and natural language processing. As the number of cases requiring private data to train deep learning models increases, concern about the leakage of private data, including sensitive personal information, is rising. In particular, studies on privacy attacks [21] show that personal information can be extracted from trained models by malicious users. One of the most representative privacy attacks on machine learning models is the model inversion attack, which reconstructs the training data of a target model with only access to the model. Model inversion attacks are divided into three categories, depending on the amount of information available about the target model: 1) white-box attacks, 2) black-box attacks, and 3) label-only attacks. White-box attacks can access all parameters of the model; black-box attacks can access soft inference results consisting of confidence scores; and label-only attacks can only access inference results in hard-label form.
White-box model inversion attacks [5, 25, 27] have succeeded in restoring high-quality private data, including personal information, by using Generative Adversarial Networks (GANs) [10]. First, they train the GANs on separate public data to learn a general prior for the private data. Then, benefiting from access to the parameters of the trained white-box models, they search for latent vectors that represent data of specific labels using gradient-based optimization methods. However, these methods cannot be applied to machine learning services such as Amazon Rekognition [1] where the parameters of the model are protected.
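As an illustration of what this parameter access enables, here is a minimal PyTorch-style sketch of gradient-based latent search; the `generator` and `target_model` modules are hypothetical stand-ins, and it is exactly this backpropagation through the target model that becomes impossible when its parameters are protected.

```python
import torch

def whitebox_latent_search(generator, target_model, target_class,
                           latent_dim=100, steps=1000, lr=0.02):
    """White-box latent search: gradients of the target-class score flow
    through the target model and the GAN generator back to z."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    optimizer = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        image = generator(z)             # G(z) -> candidate image
        logits = target_model(image)     # needs white-box parameter access
        loss = -logits[0, target_class]  # maximize the target-class logit
        loss.backward()                  # unavailable in the black-box setting
        optimizer.step()
    return z.detach()
```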
To reconstruct private data from such services, studies on black-box and label-only model inversion attacks are required. Unlike the white-box attacks, these attacks need methods that can explore the latent space of the GANs without gradient-based optimization, since gradients of the target model are not available. The recently proposed Model Inversion for Deep Learning Network (MIRROR) [2] uses a genetic algorithm to search the latent space with confidence scores obtained from a black-box target model. In addition, the Boundary-Repelling Model Inversion attack (BREP-MI) [14] has achieved success in the label-only setting by using a decision-based zeroth-order optimization algorithm for latent space search.
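For reference, a genetic-algorithm latent search of the kind MIRROR relies on can be sketched roughly as follows. This is a generic GA using the target-class confidence as fitness, not MIRROR's exact operators, and every name here is illustrative.

```python
import numpy as np

def genetic_latent_search(generator, target_model, target_class,
                          latent_dim=100, pop_size=32, generations=200,
                          elite_frac=0.25, mutation_std=0.1):
    """Generic GA over GAN latent vectors: fitness is the black-box
    target model's confidence score for the target class."""
    population = np.random.randn(pop_size, latent_dim)
    n_elite = max(1, int(elite_frac * pop_size))
    for _ in range(generations):
        # One query access per candidate: score each generated image.
        fitness = np.array([
            target_model(generator(z))[target_class] for z in population
        ])
        elite = population[np.argsort(fitness)[-n_elite:]]
        # Crossover: average two random elite parents, then mutate.
        parents_a = elite[np.random.randint(n_elite, size=pop_size)]
        parents_b = elite[np.random.randint(n_elite, size=pop_size)]
        population = (parents_a + parents_b) / 2
        population += mutation_std * np.random.randn(pop_size, latent_dim)
    # Return the best latent vector from the final population.
    fitness = np.array([
        target_model(generator(z))[target_class] for z in population
    ])
    return population[np.argmax(fitness)]
```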
Despite these attempts, each method has a significant issue. BREP-MI starts the latent space search from the first latent vector that generates an image classified as the target class. There is no guarantee of how many query accesses will be required until this first latent vector is found by random sampling, and in the worst case, it may not be possible to start the search process at all for some target classes. In the case of MIRROR, it performs worse than the label-only attack, BREP-MI.