Figure 2. Although state-of-the-art deep neural networks can increasingly recognize natural images (left panel), they are also
easily fooled into declaring with near-certainty that unrecognizable images are familiar objects (center). Images that fool DNNs are produced by
evolutionary algorithms (right panel) that optimize images to generate high-confidence DNN predictions for each class in the dataset the
DNN is trained on (here, ImageNet).
is a motorcycle). Specifically, we use evolutionary algo-
rithms or gradient ascent to generate images that are given
high prediction scores by convolutional neural networks
(convnets) [16, 18]. These DNN models have been shown
to perform well on both the ImageNet [10] and MNIST [19]
datasets. We also find that, for MNIST DNNs, it is not easy
to prevent the DNNs from being fooled by retraining them
with fooling images labeled as such. While retrained DNNs
learn to classify the negative examples as fooling images, a
new batch of fooling images can be produced that fool these
new networks, even after many retraining iterations.
Our findings shed light on current differences between
human vision and DNN-based computer vision. They also
raise questions about how DNNs perform in general on types
of images other than those they have been trained on and
traditionally tested on.
2. Methods
2.1. Deep neural network models
To test whether DNNs might give false positives for
unrecognizable images, we need a DNN trained to near
state-of-the-art performance. We choose the well-known
“AlexNet” architecture from [16], which is a convnet
trained on the 1.3-million-image ILSVRC 2012 ImageNet
dataset [10, 24]. Specifically, we use the already-trained
AlexNet DNN provided by the Caffe software package [15].
It obtains a 42.6% top-1 error rate, similar to the 40.7% re-
ported by Krizhevsky 2012 [16]. While the Caffe-provided
DNN has some small differences from Krizhevsky 2012
[16], we do not believe our results would be qualitatively
changed by small architectural and optimization differences
or their resulting small performance improvements. Simi-
larly, while recent papers have improved upon Krizhevsky
2012, those differences are unlikely to change our results.
We chose AlexNet because it is widely known and a trained
DNN similar to it is publicly available. In this paper, we
refer to this model as “ImageNet DNN”.
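As a rough illustration of how such a pretrained model is queried, the Python sketch below loads the Caffe-provided AlexNet and reads off the softmax confidence it assigns to each of the 1000 ImageNet classes, which is the quantity we later use as evolutionary fitness. It is a minimal sketch rather than the code used in our experiments: the file paths, the blob names "data" and "prob", and the preprocessing follow the standard BVLC model-zoo files, and mean subtraction is omitted for brevity.

    import numpy as np
    import caffe

    caffe.set_mode_cpu()
    # Assumed model-zoo paths; adjust for a local Caffe installation.
    net = caffe.Net('models/bvlc_alexnet/deploy.prototxt',
                    'models/bvlc_alexnet/bvlc_alexnet.caffemodel',
                    caffe.TEST)

    # Preprocess an image into the 3x227x227 BGR layout the network expects.
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))     # HxWxC -> CxHxW
    transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR
    transformer.set_raw_scale('data', 255)           # [0,1] floats -> [0,255]

    image = caffe.io.load_image('example.jpg')       # placeholder input image
    net.blobs['data'].data[0] = transformer.preprocess('data', image)
    probs = net.forward()['prob'][0]                 # softmax over 1000 classes

    top1 = int(np.argmax(probs))
    print('top-1 class %d, confidence %.4f' % (top1, probs[top1]))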
To test that our results hold for other DNN architectures
and datasets, we also conduct experiments with the Caffe-
provided LeNet model [18] trained on the MNIST dataset
[19]. The Caffe version has a minor difference from the
original architecture in [18] in that its neural activation func-
tions are rectified linear units (ReLUs) [22] instead of sig-
moids. This model obtains a 0.94% error rate, similar to the
0.8% of LeNet-5 [18]. We refer to this model as “MNIST
DNN”.
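For reference, the activation-function difference just mentioned is the following; this is a generic NumPy illustration rather than code taken from either network.

    import numpy as np

    def sigmoid(x):
        # Logistic sigmoid: the saturating activation referred to above.
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        # Rectified linear unit: the activation used in the Caffe LeNet variant.
        return np.maximum(0.0, x)

    x = np.linspace(-3.0, 3.0, 7)
    print(sigmoid(x))   # outputs squashed into (0, 1)
    print(relu(x))      # negatives clipped to 0, positives passed through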
2.2. Generating images with evolution
The novel images we test DNNs on are produced by evo-
lutionary algorithms (EAs) [12]. EAs are optimization al-
gorithms inspired by Darwinian evolution. They contain
a population of “organisms” (here, images) that alternately
face selection (keeping the best) and then random pertur-
bation (mutation and/or crossover). Which organisms are
selected depends on the fitness function, which in these ex-
periments is the highest prediction value a DNN makes for
that image belonging to a class (Fig. 2).
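To make this concrete, the sketch below shows a deliberately simplified, single-class version of such an evolutionary loop in Python, with the DNN's softmax confidence for one target class as the fitness function. It is an illustration only: the image size, the mutation operator, and the hill-climbing population of one are placeholder choices, and dnn_confidences stands in for a forward pass such as the one sketched in Sec. 2.1 (the full experiments instead use the MAP-Elites algorithm described next).

    import numpy as np

    def dnn_confidences(image):
        """Placeholder: run the DNN forward and return its softmax outputs."""
        raise NotImplementedError

    def mutate(image, rate=0.1, scale=0.1, rng=np.random):
        # Perturb a random subset of pixels, then clip back to the valid range.
        child = image.copy()
        mask = rng.rand(*image.shape) < rate
        child[mask] += scale * rng.randn(int(mask.sum()))
        return np.clip(child, 0.0, 1.0)

    def evolve(target_class, shape=(227, 227, 3), generations=1000, rng=np.random):
        best = rng.rand(*shape)                          # random starting image
        best_fit = dnn_confidences(best)[target_class]
        for _ in range(generations):
            child = mutate(best)                         # random perturbation
            fit = dnn_confidences(child)[target_class]
            if fit > best_fit:                           # selection: keep the best
                best, best_fit = child, fit
        return best, best_fit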
Traditional EAs optimize solutions to perform well on
one objective, or on all of a small set of objectives [12] (e.g.
evolving images to match a single ImageNet class). We
instead use a new algorithm called the multi-dimensional
archive of phenotypic elites (MAP-Elites) [6], which enables
us to simultaneously evolve a population that contains in-
dividuals that score well on many classes (e.g. all 1000
ImageNet classes). Our results are unaffected by using
the more computationally efficient MAP-Elites over single-
target evolution (data not shown). MAP-Elites works by
keeping the best individual found so far for each objective.
Each iteration, it chooses a random organism from the pop-
ulation, mutates it randomly, and replaces the current cham-
pion for any objective if the new individual has higher fit-