
CS 230 – Deep Learning Shervine Amidi & Afshine Amidi

Super VIP Cheatsheet: Deep Learning

Afshine Amidi and Shervine Amidi

November 25, 2018

Contents

1 Convolutional Neural Networks
  1.1 Overview
  1.2 Types of layer
  1.3 Filter hyperparameters
  1.4 Tuning hyperparameters
  1.5 Commonly used activation functions
  1.6 Object detection
    1.6.1 Face verification and recognition
    1.6.2 Neural style transfer
    1.6.3 Architectures using computational tricks

2 Recurrent Neural Networks
  2.1 Overview
  2.2 Handling long term dependencies
  2.3 Learning word representation
    2.3.1 Motivation and notations
    2.3.2 Word embeddings
  2.4 Comparing words
  2.5 Language model
  2.6 Machine translation
  2.7 Attention

3 Deep Learning Tips and Tricks
  3.1 Data processing
  3.2 Training a neural network
    3.2.1 Definitions
    3.2.2 Finding optimal weights
  3.3 Parameter tuning
    3.3.1 Weights initialization
    3.3.2 Optimizing convergence
  3.4 Regularization
  3.5 Good practices

1 Convolutional Neural Networks

1.1 Overview

Architecture of a traditional CNN – Convolutional neural networks, also known as CNNs, are a specific type of neural network generally composed of convolution, pooling, and fully connected layers.

The convolution layer and the pooling layer can be fine-tuned with respect to hyperparameters that are described in the next sections.

1.2 Types of layer

Convolutional layer (CONV) – The convolution layer (CONV) uses filters that perform convolution operations as it scans the input $I$ with respect to its dimensions. Its hyperparameters include the filter size $F$ and stride $S$. The resulting output $O$ is called a feature map or activation map.

Remark: the convolution step can be generalized to the 1D and 3D cases as well.
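The scanning operation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `conv2d_valid` and the toy input are assumptions made for the example, it handles a single channel with no padding, and it computes the sliding dot product without flipping the filter, as deep learning frameworks usually do.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` (lists of lists) without padding.

    Returns the feature map as a list of lists. The kernel is not flipped,
    so this is the cross-correlation commonly called "convolution" in CNNs.
    """
    in_h, in_w = len(image), len(image[0])
    f_h, f_w = len(kernel), len(kernel[0])
    feature_map = []
    for top in range(0, in_h - f_h + 1, stride):
        row = []
        for left in range(0, in_w - f_w + 1, stride):
            # Dot product between the filter and the current window
            acc = 0
            for a in range(f_h):
                for b in range(f_w):
                    acc += image[top + a][left + b] * kernel[a][b]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 input and 2x2 filter
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
feature_map = conv2d_valid(image, kernel, stride=1)
# feature_map is [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```

With a stride of 2 the same call visits only every other window, which gives a 2 x 2 feature map.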

Pooling (POOL) – The pooling layer (POOL) is a downsampling operation, typically applied after a convolution layer, which provides some spatial invariance. In particular, max and average pooling are special kinds of pooling where the maximum and average value is taken, respectively.

Stanford University 1 Winter 2019


| | Max pooling | Average pooling |
|---|---|---|
| Purpose | Each pooling operation selects the maximum value of the current view | Each pooling operation averages the values of the current view |
| Comments | Preserves detected features; most commonly used | Downsamples feature map; used in LeNet |
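The two pooling variants differ only in how each window is reduced. The sketch below assumes a single-channel feature map stored as a list of lists; the function name `pool2d` and its default window size and stride are illustrative choices, not part of the cheatsheet.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample `feature_map` by reducing each size x size window."""
    if mode == "max":
        reduce_window = max
    else:
        # Average pooling: mean of the values in the window
        reduce_window = lambda window: sum(window) / len(window)

    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - size + 1, stride):
        row = []
        for left in range(0, width - size + 1, stride):
            window = [
                feature_map[top + a][left + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(reduce_window(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map
fmap = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 0, 1],
    [3, 1, 2, 5],
]
max_pooled = pool2d(fmap, mode="max")  # [[7, 8], [4, 5]]
avg_pooled = pool2d(fmap, mode="avg")  # [[4.0, 5.0], [2.5, 2.0]]
```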

Fully Connected (FC) – The fully connected layer (FC) operates on a flattened input where each input is connected to all neurons. If present, FC layers are usually found towards the end of CNN architectures and can be used to optimize objectives such as class scores.
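As a rough illustration of the flattening and the all-to-all connection, the sketch below flattens a small volume and applies one fully connected layer. The helper names `flatten` and `fully_connected`, and the weights and biases, are hypothetical values chosen for the example.

```python
def flatten(volume):
    """Turn an H x W x C nested list into a flat list of values."""
    return [value for row in volume for pixel in row for value in pixel]


def fully_connected(x, weights, biases):
    """Each output neuron is a weighted sum of every input plus a bias."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(neuron_weights, x)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hypothetical 1 x 2 x 2 volume, flattened to 4 inputs
volume = [[[1, 2], [3, 4]]]
x = flatten(volume)  # [1, 2, 3, 4]

# Two output neurons, each with 4 weights and 1 bias
weights = [
    [1, 0, 0, 1],
    [0.5, 0.5, 0.5, 0.5],
]
biases = [0, 1]
scores = fully_connected(x, weights, biases)  # [5, 6.0]
```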

1.3 Filter hyperparameters

The convolution layer contains filters, and it is important to know the meaning behind their hyperparameters.

Dimensions of a filter – A filter of size $F \times F$ applied to an input containing $C$ channels is an $F \times F \times C$ volume that performs convolutions on an input of size $I \times I \times C$ and produces an output feature map (also called activation map) of size $O \times O \times 1$.

Remark: the application of $K$ filters of size $F \times F$ results in an output feature map of size $O \times O \times K$.

Stride – For a convolutional or a pooling operation, the stride $S$ denotes the number of pixels by which the window moves after each operation.

Zero-padding – Zero-padding denotes the process of adding $P$ zeroes to each side of the boundaries of the input. This value can either be manually specified or automatically set through one of the three modes detailed below:

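| | Valid | Same | Full |
|---|---|---|---|
| Value | $P = 0$ | $P_{\text{start}} = \left\lfloor \frac{S \lceil I/S \rceil - I + F - S}{2} \right\rfloor$, $P_{\text{end}} = \left\lceil \frac{S \lceil I/S \rceil - I + F - S}{2} \right\rceil$ | $P_{\text{start}} \in [\![0, F - 1]\!]$, $P_{\text{end}} = F - 1$ |
| Purpose | No padding; drops last convolution if dimensions do not match | Padding such that the feature map has size $\lceil I/S \rceil$; output size is mathematically convenient; also called 'half' padding | Maximum padding such that end convolutions are applied on the limits of the input; filter 'sees' the input end-to-end |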
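The three padding modes can be turned into a small helper that returns the pair (P_start, P_end). This is a sketch under stated assumptions: the name `padding_amounts` is hypothetical, the Same mode uses the rule P_start = floor(T / 2) and P_end = ceil(T / 2) with T = S * ceil(I / S) - I + F - S, the Full mode picks P_start = F - 1 as one admissible value in the range 0 to F - 1, and the sketch rejects the case where T would be negative.

```python
import math


def padding_amounts(mode, I, F, S=1):
    """Return (P_start, P_end) for an input of length I, filter F, stride S."""
    if mode == "valid":
        return 0, 0
    if mode == "same":
        # Total padding needed so that the output length is ceil(I / S)
        total = S * math.ceil(I / S) - I + F - S
        if total < 0:
            # Assumption of this sketch: the stride does not exceed the filter
            raise ValueError("same padding is undefined here when S > F")
        return total // 2, total - total // 2
    if mode == "full":
        # P_start may be any integer in [0, F - 1]; F - 1 is chosen here
        return F - 1, F - 1
    raise ValueError(f"unknown padding mode: {mode}")


same_odd = padding_amounts("same", I=5, F=3, S=2)   # (1, 1)
same_even = padding_amounts("same", I=6, F=3, S=2)  # (0, 1)
full = padding_amounts("full", I=5, F=3)            # (2, 2)
```

In the second call the total padding is odd, so the extra zero goes to the end of the input.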

1.4 Tuning hyperparameters

Parameter compatibility in convolution layer – By noting $I$ the length of the input volume size, $F$ the length of the filter, $P$ the amount of zero padding, and $S$ the stride, the output size $O$ of the feature map along that dimension is given by:

$$O = \frac{I - F + P_{\text{start}} + P_{\text{end}}}{S} + 1$$

Remark: oftentimes, $P_{\text{start}} = P_{\text{end}} \triangleq P$, in which case we can replace $P_{\text{start}} + P_{\text{end}}$ by $2P$ in the formula above.
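The output-size formula O = (I - F + P_start + P_end) / S + 1 is easy to check numerically. The helper below is a minimal sketch: the name `conv_output_size` is hypothetical, and it uses integer division, which assumes that a last window that does not fit is dropped.

```python
def conv_output_size(I, F, S=1, P_start=0, P_end=0):
    """Output length along one dimension of a convolution or pooling layer."""
    span = I - F + P_start + P_end
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    # Floor division: a final window that does not fit is dropped
    return span // S + 1


no_padding = conv_output_size(I=32, F=5)                      # 28
same_size = conv_output_size(I=32, F=3, P_start=1, P_end=1)   # 32
strided = conv_output_size(I=7, F=3, S=2)                     # 3
```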


Understanding the complexity of the model – In order to assess the complexity of a model, it is often useful to determine the number of parameters that its architecture will have. In a given layer of a convolutional neural network, it is done as follows:

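| | CONV | POOL | FC |
|---|---|---|---|
| Input size | $I \times I \times C$ | $I \times I \times C$ | $N_{\text{in}}$ |
| Output size | $O \times O \times K$ | $O \times O \times C$ | $N_{\text{out}}$ |
| Number of parameters | $(F \times F \times C + 1) \cdot K$ | $0$ | $(N_{\text{in}} + 1) \times N_{\text{out}}$ |
| Remarks | One bias parameter per filter; in most cases, $S < F$; a common choice for $K$ is $2C$ | Pooling operation done channel-wise; in most cases, $S = F$ | Input is flattened; one bias parameter per neuron; the number of FC neurons is free of structural constraints |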
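The parameter counts for the three layer types, (F x F x C + 1) * K for CONV, 0 for POOL and (N_in + 1) * N_out for FC, can be written as one-line helpers. The function names and the layer sizes in the example are hypothetical.

```python
def conv_params(F, C, K):
    """K filters of size F x F x C, plus one bias per filter."""
    return (F * F * C + 1) * K


def pool_params():
    """Pooling has no learnable parameters."""
    return 0


def fc_params(N_in, N_out):
    """One weight per input plus one bias, for each output neuron."""
    return (N_in + 1) * N_out


conv_count = conv_params(F=3, C=3, K=6)       # (27 + 1) * 6 = 168
fc_count = fc_params(N_in=120, N_out=10)      # 121 * 10 = 1210
total = conv_count + pool_params() + fc_count  # 1378
```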

Receptive field – The receptive field at layer $k$ is the area, denoted $R_k \times R_k$, of the input that each pixel of the $k$-th activation map can 'see'. By calling $F_j$ the filter size of layer $j$ and $S_i$ the stride value of layer $i$, and with the convention $S_0 = 1$, the receptive field at layer $k$ can be computed with the formula:

$$R_k = 1 + \sum_{j=1}^{k} (F_j - 1) \prod_{i=0}^{j-1} S_i$$

For example, with $F_1 = F_2 = 3$ and $S_1 = S_2 = 1$, this gives $R_2 = 1 + 2 \cdot 1 + 2 \cdot 1 = 5$.
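The receptive field formula R_k = 1 + sum over j from 1 to k of (F_j - 1) times the product of S_0 ... S_{j-1}, with S_0 = 1, can be evaluated with a running product of strides. The function name `receptive_field` is an assumption made for this sketch.

```python
def receptive_field(filter_sizes, strides):
    """Receptive field after stacking layers 1..k.

    filter_sizes[j-1] is F_j and strides[j-1] is S_j; S_0 = 1 by convention.
    """
    R = 1
    stride_product = 1  # S_0 * S_1 * ... * S_{j-1}, starting with S_0 = 1
    for F_j, S_j in zip(filter_sizes, strides):
        R += (F_j - 1) * stride_product
        stride_product *= S_j
    return R


two_layers = receptive_field([3, 3], [1, 1])        # 1 + 2*1 + 2*1 = 5
three_layers = receptive_field([3, 3, 3], [2, 2, 1])  # 1 + 2*1 + 2*2 + 2*4 = 15
```

The second call shows how strides larger than 1 in early layers make the receptive field of later layers grow faster.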

1.5 Commonly used activation functions

Rectified Linear Unit – The rectified linear unit layer (ReLU) is an activation function $g$ that is used on all elements of the volume. It aims at introducing non-linearities to the network. Its variants are summarized in the table below:

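| | ReLU | Leaky ReLU | ELU |
|---|---|---|---|
| Definition | $g(z) = \max(0, z)$ | $g(z) = \max(\epsilon z, z)$ with $\epsilon \ll 1$ | $g(z) = \max(\alpha(e^z - 1), z)$ with $\alpha \ll 1$ |
| Comments | Non-linearity complexities biologically interpretable | Addresses dying ReLU issue for negative values | Differentiable everywhere |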
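The three activations can be sketched as scalar functions. Two points here are assumptions of the example rather than statements from the cheatsheet: the default values of `eps` and `alpha` are illustrative, and the ELU uses the standard piecewise definition (z for z > 0, alpha * (e^z - 1) otherwise). That piecewise form agrees with the max form for negative inputs when alpha is at most 1, and it keeps the exponential branch from taking over for large positive z.

```python
import math


def relu(z):
    """g(z) = max(0, z)"""
    return max(0.0, z)


def leaky_relu(z, eps=0.01):
    """g(z) = max(eps * z, z), with a small slope eps for negative inputs."""
    return max(eps * z, z)


def elu(z, alpha=1.0):
    """Standard piecewise ELU: z if z > 0, alpha * (exp(z) - 1) otherwise."""
    return z if z > 0 else alpha * (math.exp(z) - 1.0)


inputs = [-2.0, 0.0, 3.0]
relu_out = [relu(z) for z in inputs]         # [0.0, 0.0, 3.0]
leaky_out = [leaky_relu(z) for z in inputs]  # negative input scaled by eps
elu_out = [elu(z) for z in inputs]           # negative input saturates toward -alpha
```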

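Softmax – The softmax step can be seen as a generalized logistic function that takes as input a vector of scores $x \in \mathbb{R}^n$ and outputs a vector of probabilities $p \in \mathbb{R}^n$ through a softmax function at the end of the architecture. It is defined as follows:

$$p = \begin{pmatrix} p_1 \\ \vdots \\ p_n \end{pmatrix} \quad \text{where} \quad p_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$$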
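A minimal softmax over a list of scores might look as follows. Subtracting the maximum score before exponentiating is a common numerical-stability trick that is not part of the definition; it leaves the result unchanged because the common factor cancels in the ratio. The function name and the example scores are hypothetical.

```python
import math


def softmax(x):
    """Map a list of scores to probabilities p_i = exp(x_i) / sum_j exp(x_j)."""
    shift = max(x)  # stability trick: exp never sees a large positive value
    exps = [math.exp(value - shift) for value in x]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
# probs sums to 1 and preserves the ordering of the scores
uniform = softmax([0.0, 0.0])  # [0.5, 0.5]
```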

1.6 Object detection

Types of models – There are 3 main types of object recognition algorithms, for which the nature of what is predicted is different. They are described in the table below:

| | Image classification | Classification w. localization | Detection |
|---|---|---|---|
| Description | Classifies a picture; predicts probability of object | Detects object in a picture; predicts probability of object and where it is located | Detects up to several objects in a picture; predicts probabilities of objects and where they are located |
| Typical models | Traditional CNN | Simplified YOLO, R-CNN | YOLO, R-CNN |

Detection – In the context of object detection, different methods are used depending on whether we just want to locate the object or detect a more complex shape in the image. The two main ones are summed up in the table below:
