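Summary: In "Flexible, High Performance Convolutional Neural Networks for Image Classification", Dan C. Cireşan and colleagues present a flexible, fast GPU implementation of Convolutional Neural Networks (CNNs) for image classification. The network structure is fully parameterizable, and the feature extractors are learned by supervised training rather than designed or wired in advance. Their deep hierarchical architectures achieve the best published results on several benchmarks.

A CNN alternates convolutional layers with subsampling layers: the convolutional layers extract local image features, and the subsampling layers reduce the resolution of the resulting feature maps. The work is motivated by two problems that remain hard for artificial vision systems: viewpoint-dependent object variability and the high in-class variability of many object types. The speed of the GPU implementation is what makes it practical to train large networks and to compare many architectures.

The models are evaluated on NORB and CIFAR10 (object classification; CIFAR10 consists of color images in 10 classes) and on MNIST (handwritten digit recognition). The reported error rates are 2.53% on NORB, 19.51% on CIFAR10 and 0.35% on MNIST. Deep nets trained by simple back-propagation perform better than shallower ones. Learning is also fast: on MNIST, the test error drops to 2.42%, 0.97% and 0.48% after 1, 3 and 17 epochs, respectively.

The paper cites the Neocognitron [Fukushima, 1980] as one of the first hierarchical neural systems and as the inspiration for many more recent variants. It also notes that unsupervised learning applied to patches of natural images tends to produce localized filters that resemble off-center-on-surround filters, orientation-sensitive bar detectors and Gabor filters. Such filters are used, with fixed weights, in the so-called standard model for object recognition. CNN filters, in contrast, are randomly initialized and then adjusted by supervised back-propagation.

Overall, the paper shows that deep CNNs trained on GPUs can reach strong image recognition performance within a small number of training epochs, and that deeper networks outperform shallower ones on these benchmarks.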

Flexible, High Performance Convolutional Neural Networks for Image Classification

Dan C. Cireşan, Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber

IDSIA, USI and SUPSI

Galleria 2, 6928 Manno-Lugano, Switzerland

{dan,ueli,jonathan,luca,juergen}@idsia.ch

Abstract

We present a fast, fully parameterizable GPU implementation of Convolutional Neural Network variants. Our feature extractors are neither carefully designed nor pre-wired, but rather learned in a supervised way. Our deep hierarchical architectures achieve the best published results on benchmarks for object classification (NORB, CIFAR10) and handwritten digit recognition (MNIST), with error rates of 2.53%, 19.51%, 0.35%, respectively. Deep nets trained by simple back-propagation perform better than more shallow ones. Learning is surprisingly rapid. NORB is completely trained within five epochs. Test error rates on MNIST drop to 2.42%, 0.97% and 0.48% after 1, 3 and 17 epochs, respectively.

1 Introduction

The human visual system efficiently recognizes and localizes objects within cluttered scenes. For artificial systems, however, this is still difficult due to viewpoint-dependent object variability, and the high in-class variability of many object types. Deep hierarchical neural models roughly mimic the nature of mammalian visual cortex, and by community consensus are among the most promising architectures for such tasks. The most successful hierarchical object recognition systems all extract localized features from input images, convolving image patches with filters. Filter responses are then repeatedly sub-sampled and re-filtered, resulting in a deep feed-forward network architecture whose output feature vectors are eventually classified. One of the first hierarchical neural systems was the Neocognitron [Fukushima, 1980], which inspired many of the more recent variants.

Unsupervised learning methods applied to patches of natural images tend to produce localized filters that resemble off-center-on-surround filters, orientation-sensitive bar detectors, and Gabor filters [Schmidhuber et al., 1996; Olshausen and Field, 1997; Hoyer and Hyvärinen, 2000]. These findings in conjunction with experimental studies of the visual cortex justify the use of such filters in the so-called standard model for object recognition [Riesenhuber and Poggio, 1999; Serre et al., 2007; Mutch and Lowe, 2008], whose filters are fixed, in contrast to those of Convolutional Neural Networks (CNNs) [LeCun et al., 1998; Behnke, 2003; Simard et al., 2003], whose weights (filters) are randomly initialized and changed in a supervised way using back-propagation (BP).

Despite the hardware progress of the past decades, computational speed is still a limiting factor for CNN architectures characterized by many building blocks typically set by trial and error. To systematically test the impact of various architectures on classification performance, we present a fast CNN implementation on Graphics Processing Units (GPUs). Previous GPU implementations of CNNs [Chellapilla et al., 2006; Uetz and Behnke, 2009; Strigl et al., 2010] were hard-coded to satisfy GPU hardware constraints or use general purpose libraries, whereas our implementation is flexible and fully online (i.e., weight updates after each image). A notable exception is [Jarrett et al., 2009] who performed a thorough analysis of the influence of all building blocks of a multistage architecture on recognition performance. Our implementation allows for training large CNNs within days instead of months, such that we can investigate the influence of various structural parameters by exploring large parameter spaces [Pinto et al., 2009] and performing error analysis on repeated experiments.
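To make "fully online (i.e., weight updates after each image)" concrete, the following minimal sketch applies one gradient step per training sample instead of accumulating gradients over a batch. It is an illustration under stated assumptions, not the GPU implementation described here: the single linear unit, the squared-error loss, the toy data, the learning rate, and the name `train_online` are all hypothetical.

```python
# Illustrative sketch only: online (per-sample) gradient descent on a single
# linear unit, to show what "weight updates after each image" means.
# The model, loss, data, and learning rate are hypothetical.

def train_online(samples, weights, learning_rate, epochs):
    """Update the weights after every sample rather than after a full batch."""
    weights = list(weights)  # work on a copy of the initial weights
    updates = 0
    for _ in range(epochs):
        for inputs, target in samples:
            prediction = sum(w * x for w, x in zip(weights, inputs))
            error = prediction - target
            # Gradient of 0.5 * error**2 with respect to weight k is error * x_k.
            for k, x in enumerate(inputs):
                weights[k] -= learning_rate * error * x
            updates += 1  # one weight update per training sample
    return weights, updates


# Toy data generated by target = 2 * x0 - 1 * x1.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
weights, updates = train_online(samples, [0.0, 0.0], learning_rate=0.1, epochs=200)
```

With three samples and 200 epochs this performs 600 weight updates, one per presented sample, and the weights approach the generating values (2, -1).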

We evaluate various networks on the handwritten digit benchmark MNIST [LeCun et al., 1998] and two image classification benchmarks: NORB [LeCun et al., 2004] and CIFAR10 [Krizhevsky, 2009].

2 Convolutional neural networks

CNNs are hierarchical neural networks whose convolutional layers alternate with subsampling layers, reminiscent of simple and complex cells in the primary visual cortex [Wiesel and Hubel, 1959]. CNNs vary in how convolutional and subsampling layers are realized and how the nets are trained.
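The alternation of a convolutional stage and a subsampling stage can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation presented in this paper: the text above does not fix how subsampling is realized, so max-pooling over non-overlapping blocks is used here as one common choice, and the toy image, the kernel, and the function names `convolve_valid` and `max_pool` are hypothetical.

```python
# Illustrative sketch only: one convolution stage followed by one subsampling
# stage, in plain Python. Not the GPU implementation described in the paper.

def convolve_valid(image, kernel):
    """Slide the kernel over the image without padding ("valid" mode).

    As in most CNN code, the kernel is not flipped, so strictly speaking
    this computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def max_pool(feature_map, size):
    """Subsample by taking the maximum over non-overlapping size x size blocks."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[y * size + i][x * size + j]
                for i in range(size) for j in range(size))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


# A 6x6 toy image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical 3x3 kernel that responds to left-to-right intensity increases.
kernel = [[-1, 0, 1] for _ in range(3)]

features = convolve_valid(image, kernel)  # 4x4 feature map
pooled = max_pool(features, 2)            # 2x2 map after subsampling
```

Here the 6x6 input shrinks to a 4x4 feature map after the 3x3 convolution and to 2x2 after pooling. In a trained CNN the kernel values would be learned by back-propagation rather than set by hand.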

2.1 Image processing layer

The image processing layer is an optional pre-processing layer of predefined filters that are kept fixed during training. Thus additional information besides the raw input image can be provided to the network, such as edges and gradients. In particular, we find that a contrast-extracting layer [Fukushima, 2003] helps to improve the recognition rate for NORB.
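A fixed pre-processing layer of this kind might look like the sketch below, which passes the raw image through unchanged and adds two gradient maps as extra input channels. The Sobel kernels are an assumption chosen only for illustration; they are not the contrast-extracting filters of [Fukushima, 2003], and the names `filter_same` and `image_processing_layer` are hypothetical.

```python
# Illustrative sketch only: a fixed (non-trainable) pre-processing layer.
# The Sobel kernels are an assumption chosen for illustration; they are not
# the contrast-extracting filters of [Fukushima, 2003].

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient


def filter_same(image, kernel):
    """Apply a 3x3 kernel with zero padding so the output keeps the input size."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for i in range(3):
                for j in range(3):
                    yy, xx = y + i - 1, x + j - 1
                    if 0 <= yy < h and 0 <= xx < w:  # outside pixels count as 0
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def image_processing_layer(image):
    """Return the raw image plus fixed-filter maps as separate input channels."""
    return [image, filter_same(image, SOBEL_X), filter_same(image, SOBEL_Y)]


# A 4x4 toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
channels = image_processing_layer(image)
```

Because these kernels are constants rather than parameters, training never changes them. Only the layers that consume `channels` would be updated by back-propagation.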

Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
