Algorithm 1 Train a neural network with mini-batch stochastic gradient descent.
initialize(net)
for epoch = 1, . . . , K do
    for batch = 1, . . . , #images/b do
        images ← uniformly random sample b images
        X, y ← preprocess(images)
        z ← forward(net, X)
        ℓ ← loss(z, y)
        grad ← backward(ℓ)
        update(net, grad)
    end for
end for
useful for efficient training on new hardware in Section 3. In
Section 4 we review three minor model architecture tweaks
for ResNet and propose a new one. Four additional train-
ing procedure refinements are then discussed in Section 5.
Finally, we study whether these more accurate models can help transfer learning in Section 6.
Our model implementations and training scripts are publicly available in GluonCV (https://github.com/dmlc/gluon-cv).
2. Training Procedures
The template of training a neural network with mini-
batch stochastic gradient descent is shown in Algorithm 1.
In each iteration, we randomly sample b images, compute the gradients, and then update the network parameters. Training stops after K passes through the dataset. All functions
and hyper-parameters in Algorithm 1 can be implemented
in many different ways. In this section, we first specify a
baseline implementation of Algorithm 1.
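To make the template concrete, the following is a minimal sketch of Algorithm 1 in Python with MXNet Gluon (the framework behind GluonCV). The network, the data iterator, and the momentum value are placeholders and assumptions for illustration, not the exact training script.

```python
import mxnet as mx
from mxnet import autograd, gluon

# Placeholder network; the real scripts use GluonCV model definitions.
net = gluon.nn.Dense(1000)
net.initialize(mx.init.Xavier())

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
# NAG optimizer as in Section 2.1; the 0.9 momentum is an assumption.
trainer = gluon.Trainer(net.collect_params(), 'nag',
                        {'learning_rate': 0.1, 'momentum': 0.9})

def train(train_data, num_epochs, batch_size):
    for epoch in range(num_epochs):          # K passes over the dataset
        for X, y in train_data:              # b preprocessed images per batch
            with autograd.record():
                z = net(X)                   # forward(net, X)
                l = loss_fn(z, y)            # loss(z, y)
            l.backward()                     # backward(l)
            trainer.step(batch_size)         # update(net, grad)
```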
2.1. Baseline Training Procedure
We follow a widely used implementation [8] of ResNet
as our baseline. The preprocessing pipelines for training and validation differ. During training, we perform the following steps one by one:
1. Randomly sample an image and decode it into 32-bit
floating point raw pixel values in [0, 255].
2. Randomly crop a rectangular region whose aspect ratio
is randomly sampled in [3/4, 4/3] and area randomly
sampled in [8%, 100%], then resize the cropped region
into a 224-by-224 square image.
3. Flip horizontally with 0.5 probability.
4. Scale hue, saturation, and brightness with coefficients
uniformly drawn from [0.6, 1.4].
5. Add PCA noise with a coefficient sampled from a normal distribution N(0, 0.1).
6. Normalize RGB channels by subtracting 123.68, 116.779, 103.939 and dividing by 58.393, 57.12, 57.375, respectively.

Model             | Baseline Top-1 | Baseline Top-5 | Reference Top-1 | Reference Top-5
ResNet-50 [9]     | 75.87          | 92.70          | 75.3            | 92.2
Inception-V3 [26] | 77.32          | 93.43          | 78.8            | 94.4
MobileNet [11]    | 69.03          | 88.71          | 70.6            | -

Table 2: Validation accuracy of reference implementations and our baseline. Note that the numbers for Inception-V3 are obtained with 299-by-299 input images.
During validation, we resize each image’s shorter edge to 256 pixels while keeping its aspect ratio. Next, we crop out the 224-by-224 region in the center and normalize the RGB channels as in training. We do not perform any random augmentations during validation.
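For illustration, the training and validation pipelines above roughly correspond to the following MXNet Gluon transforms. This is a sketch rather than the exact script: the color-jitter arguments are an approximation of step 4, and the normalization constants are the step-6 values rescaled to [0, 1] because ToTensor divides pixel values by 255.

```python
from mxnet.gluon.data.vision import transforms

# Step-6 constants rescaled to [0, 1], since ToTensor maps pixels to [0, 1].
mean = (123.68 / 255, 116.779 / 255, 103.939 / 255)
std = (58.393 / 255, 57.12 / 255, 57.375 / 255)

# Training pipeline: steps 2-6 (step 1, decoding, happens in the dataset loader).
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.08, 1.0), ratio=(3/4, 4/3)),  # step 2
    transforms.RandomFlipLeftRight(),                                        # step 3
    transforms.RandomColorJitter(brightness=0.4, saturation=0.4, hue=0.4),   # step 4 (approx.)
    transforms.RandomLighting(0.1),                                          # step 5, PCA noise
    transforms.ToTensor(),
    transforms.Normalize(mean, std),                                         # step 6
])

# Validation pipeline: resize shorter edge to 256, center-crop 224, normalize.
val_transform = transforms.Compose([
    transforms.Resize(256, keep_ratio=True),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean, std),
])
```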
The weights of both convolutional and fully-connected
layers are initialized with the Xavier algorithm [6]. In particular, we set the parameters to random values uniformly drawn from [−a, a], where a = √(6/(d_in + d_out)). Here d_in and d_out are the input and output channel sizes, respectively. All biases are initialized to 0. For batch normalization layers, γ vectors are initialized to 1 and β vectors to 0.
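A minimal NumPy sketch of this initialization scheme (not the GluonCV implementation) follows; fan_in and fan_out stand for the d_in and d_out channel sizes above, and the layer shapes are hypothetical.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, shape, rng=None):
    """Draw weights uniformly from [-a, a] with a = sqrt(6 / (fan_in + fan_out))."""
    rng = rng or np.random.default_rng()
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=shape).astype(np.float32)

# Example: a 3x3 convolution with 64 input channels and 128 output channels.
weight = xavier_uniform(fan_in=64, fan_out=128, shape=(128, 64, 3, 3))
bias = np.zeros(128, dtype=np.float32)    # biases initialized to 0
gamma = np.ones(128, dtype=np.float32)    # batch-norm scale initialized to 1
beta = np.zeros(128, dtype=np.float32)    # batch-norm shift initialized to 0
```

In MXNet, the same bound is obtained with mx.init.Xavier(rnd_type='uniform', factor_type='avg', magnitude=3).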
Nesterov Accelerated Gradient (NAG) descent [20] is
used for training. Each model is trained for 120 epochs on
8 Nvidia V100 GPUs with a total batch size of 256. The
learning rate is initialized to 0.1 and divided by 10 at the
30th, 60th, and 90th epochs.
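The step decay can be written as a small helper, sketched below; the commented wiring into the Gluon trainer from the earlier sketch uses momentum and weight decay values that are assumptions, not taken from the text.

```python
def step_lr(epoch, base_lr=0.1, milestones=(30, 60, 90), factor=0.1):
    """Start at base_lr and divide by 10 at the 30th, 60th, and 90th epochs."""
    return base_lr * factor ** sum(epoch >= m for m in milestones)

# Hypothetical wiring into the training loop from Section 2:
# trainer = gluon.Trainer(net.collect_params(), 'nag',
#                         {'learning_rate': step_lr(0), 'momentum': 0.9, 'wd': 1e-4})
# for epoch in range(120):
#     trainer.set_learning_rate(step_lr(epoch))
#     ...  # one pass over the training data as in Algorithm 1
```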
2.2. Experiment Results
We evaluate three CNNs: ResNet-50 [9], Inception-V3 [26], and MobileNet [11]. For Inception-V3 we resize the input images to 299-by-299. We use the ILSVRC2012 [23] dataset, which has 1.3 million training images and 1000 classes. The validation accuracies are shown in Table 2. As can be seen, our ResNet-50 results are slightly better than the reference results, while our baseline Inception-V3 and MobileNet are slightly lower in accuracy due to differences in the training procedure.
3. Efficient Training
Hardware, especially GPUs, has been rapidly evolving
in recent years. As a result, the optimal choices for many performance-related trade-offs have changed. For example,
it is now more efficient to use lower numerical precision and
larger batch sizes during training. In this section, we review
various techniques that enable low precision and large batch