Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Google Inc.
szegedy@google.com
Vincent Vanhoucke
vanhoucke@google.com
Sergey Ioffe
sioffe@google.com
Jonathon Shlens
shlens@google.com
Zbigniew Wojna
University College London
zbigniewwojna@gmail.com
Abstract
Convolutional networks are at the core of most state-
of-the-art computer vision solutions for a wide variety of
tasks. Since 2014 very deep convolutional networks started
to become mainstream, yielding substantial gains in vari-
ous benchmarks. Although increased model size and com-
putational cost tend to translate to immediate quality gains
for most tasks (as long as enough labeled data is provided
for training), computational efficiency and low parameter
count are still enabling factors for various use cases such as
mobile vision and big-data scenarios. Here we explore
ways to scale up networks that aim to utilize the added
computation as efficiently as possible through suitably
factorized convolutions and aggressive regularization. We
benchmark our methods on the ILSVRC 2012 classification
challenge validation set and demonstrate substantial gains
over the state of the art: 21.2% top-1 and 5.6% top-5 error
for single-frame evaluation using a network with a computational
cost of 5 billion multiply-adds per inference and
fewer than 25 million parameters. With an ensemble of
4 models and multi-crop evaluation, we report 3.5% top-5
error and 17.3% top-1 error.
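The factorization idea named above can be illustrated with a back-of-the-envelope cost comparison (a sketch, not from the paper; the feature-map size and channel count below are hypothetical): two stacked 3×3 convolutions cover the same receptive field as one 5×5 convolution at 18/25 of the multiply-add cost.

```python
def conv_multiply_adds(h, w, c_in, c_out, k):
    """Multiply-adds for a k x k convolution on an h x w feature map,
    assuming 'same' padding so the output spatial size equals the input."""
    return h * w * c_in * c_out * k * k

# Hypothetical feature-map size and channel width, for illustration only.
h, w, c = 17, 17, 320

single_5x5 = conv_multiply_adds(h, w, c, c, 5)       # one 5x5 layer
stacked_3x3 = 2 * conv_multiply_adds(h, w, c, c, 3)  # two 3x3 layers

print(stacked_3x3 / single_5x5)  # 0.72 (= 18/25), a 28% saving
```

The spatial terms cancel, so the 18/25 ratio is independent of the feature-map size and channel counts; it depends only on the kernel areas (2 · 3² versus 5²).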
1. Introduction
Since the winning entry by Krizhevsky et al. [9] in the
2012 ImageNet competition [16], their network "AlexNet"
has been successfully applied to a wide variety of computer
vision tasks, for example object detection [5], segmentation
[12], human pose estimation [22], video classification [8],
object tracking [23], and super-resolution [3].
These successes spurred a new line of research focused
on finding higher-performing convolutional neural networks.
Starting in 2014, the quality of network architectures
improved significantly through the use of deeper and wider
networks. VGGNet [18] and GoogLeNet [20] yielded similarly
high performance in the 2014 ILSVRC [16] classification
challenge. One interesting observation was that gains
in classification performance tend to transfer to significant
quality gains in a wide variety of application domains.
This means that architectural improvements in deep convolutional
networks can be utilized to improve performance
on most other computer vision tasks that are increasingly
reliant on high-quality, learned visual features. Also,
improvements in network quality opened up new application
domains for convolutional networks in cases where
AlexNet features could not compete with hand-engineered
solutions, e.g. proposal generation in detection [4].
Although VGGNet [18] has the compelling feature of
architectural simplicity, this comes at a high cost: evaluating
the network requires a lot of computation. On the
other hand, the Inception architecture of GoogLeNet [20]
was designed to perform well even under strict constraints
on memory and computational budget. For example,
GoogLeNet employed only 5 million parameters, a 12×
reduction with respect to its predecessor AlexNet, which
used 60 million parameters. Furthermore, VGGNet employed
about 3× more parameters than AlexNet.
The computational cost of Inception is also much lower
than that of VGGNet or its higher-performing successors [6].
This has made it feasible to utilize Inception networks in
big-data scenarios [17], [13], where huge amounts of data
need to be processed at reasonable cost, or in scenarios
where memory or computational capacity is inherently limited,
for example in mobile vision settings. It is certainly
possible to mitigate parts of these issues by applying specialized
solutions to target memory use [2], [15] or by optimizing
the execution of certain operations via computational
tricks [10]. However, these methods add extra complexity.
Furthermore, these methods could be applied to optimize
the Inception architecture as well, widening the efficiency
gap again.
Still, the complexity of the Inception architecture makes
arXiv:1512.00567v3 [cs.CV] 11 Dec 2015