UFLDL Tutorial
Description: This tutorial will teach you the main ideas of Unsupervised
Feature Learning and Deep Learning. By working through it, you will
also get to implement several feature learning/deep learning algorithms,
get to see them work for yourself, and learn how to apply/adapt these
ideas to new problems.
This tutorial assumes a basic knowledge of machine learning (specifically,
familiarity with the ideas of supervised learning, logistic regression,
gradient descent). If you are not familiar with these ideas, we suggest you
go to this Machine Learning course and complete sections II, III, IV (up
to Logistic Regression) first.
Contents
Supervised Learning and Optimization ......................................................................... 5
Linear Regression ................................................................................................... 5
Problem Formulation ....................................................................................... 5
Function Minimization..................................................................................... 6
Exercise 1A: Linear Regression ....................................................................... 7
Logistic Regression ................................................................................................. 9
Exercise 1B .................................................................................................... 10
Vectorization ......................................................................................................... 12
Example: Many matrix-vector products ........................................................ 12
Example: normalizing many vectors ............................................................. 12
Example: matrix multiplication in gradient computations............................. 13
Exercise 1A and 1B Redux ............................................................................ 14
Debugging: Gradient Checking ............................................................................ 15
Gradient checker code.................................................................................... 17
Softmax Regression .............................................................................................. 18
Introduction .................................................................................................... 18
Cost Function ................................................................................................. 19
Properties of softmax regression parameterization ........................................ 20
Relationship to Logistic Regression .............................................................. 22
Exercise 1C .................................................................................................... 22
Debugging: Bias and Variance .............................................................................. 24
Debugging: Optimizers and Objectives ................................................................. 24
Supervised Neural Networks ....................................................................................... 25
Multi-Layer Neural Networks ............................................................................... 25
Neural Network model ................................................................................... 27
Backpropagation Algorithm .................................................................................. 30
Exercise: Supervised Neural Network .................................................................. 34
Supervised Convolutional Neural Network ................................................................. 36
Feature Extraction Using Convolution ................................................................. 36
Overview ........................................................................................................ 36
Fully Connected Networks ............................................................................ 36
Locally Connected Networks ......................................................................... 36
Convolutions .................................................................................................. 37
Pooling .................................................................................................................. 38
Pooling: Overview ......................................................................................... 38
Pooling for Invariance.................................................................................... 39
Formal description ......................................................................................... 39
Exercise: Convolution and Pooling ...................................................................... 40
Convolution and Pooling ............................................................................... 40
Dependencies ................................................................................................. 40
Optimization: Stochastic Gradient Descent .......................................................... 43
Overview ........................................................................................................ 43
Stochastic Gradient Descent .......................................................................... 43
Momentum ..................................................................................................... 44
Convolutional Neural Network ............................................................................. 46
Overview ........................................................................................................ 46
Architecture.................................................................................................... 46
Back Propagation ........................................................................................... 47
Exercise: Convolutional Neural Network ............................................................. 48
Overview ........................................................................................................ 48
Dependencies ................................................................................................. 48
Implement Convolutional Neural Network ................................................... 49
Unsupervised Learning ................................................................................................ 53
Autoencoders ........................................................................................................ 53
Visualizing a Trained Autoencoder ................................................................ 57
PCA Whitening ..................................................................................................... 59
Introduction .................................................................................................... 59
Example and Mathematical Background ....................................................... 59
Rotating the Data ........................................................................................... 61
Reducing the Data Dimension ....................................................................... 62
Recovering an Approximation of the Data .................................................... 64
Number of components to retain.................................................................... 65
PCA on Images .............................................................................................. 66
References ...................................................................................................... 68
Whitening .............................................................................................................. 68
Introduction .................................................................................................... 68
2D example .................................................................................................... 68
ZCA Whitening ..................................................................................................... 70
Regularization ........................................................................................................ 71
Implementing PCA Whitening .............................................................................. 72
Exercise: PCA Whitening ..................................................................................... 74
PCA and Whitening on natural images .......................................................... 74
ICA ........................................................................................................................ 78
Introduction .................................................................................................... 78
Orthonormal ICA ........................................................................................... 79
Topographic ICA ........................................................................................... 79
RICA ..................................................................................................................... 80
ICA Summary ................................................................................................ 80
RICA .............................................................................................................. 80
Exercise: RICA ..................................................................................................... 81
Step 0: Prerequisites ....................................................................................... 81
Step 1: RICA cost and gradient ...................................................................... 82
Self-Taught Learning ................................................................................................... 84
Self-Taught Learning ............................................................................................ 84
Overview ........................................................................................................ 84
Learning features ........................................................................................... 85
On pre-processing the data............................................................................. 86
On the terminology of unsupervised feature learning.................................... 87
Exercise: Self-Taught Learning ............................................................................ 88
Overview ........................................................................................................ 88
Dependencies ................................................................................................. 88
Step 1: Generate the input and test data sets .................................................. 89
Step 2: Train RICA ........................................................................................ 89
Step 3: Extracting features ............................................................................. 89
Step 4: Training and testing the softmax regression model ........................... 90
Step 5: Classifying on the test set .................................................................. 90
Building Deep Networks for Classification ................................................................. 91
From Self-Taught Learning to Deep Networks .................................................... 91
Deep Networks: Overview .................................................................................... 93
Overview ........................................................................................................ 93
Advantages of deep networks ........................................................................ 93
Difficulty of training deep architectures ........................................................ 94
Availability of data ......................................................................................... 94
Local optima .................................................................................................. 95
Diffusion of gradients .................................................................................... 95
Greedy layer-wise training ............................................................................. 95
Availability of data ......................................................................................... 96
Better local optima ......................................................................................... 96
Stacked Autoencoders ........................................................................................... 97
Overview ........................................................................................................ 97
Training .......................................................................................................... 97
Concrete example........................................................................................... 98
Discussion ...................................................................................................... 98
Fine-tuning Stacked AEs ....................................................................................... 99
Introduction .................................................................................................... 99
General Strategy ............................................................................................. 99
Finetuning with Backpropagation .................................................................. 99
Exercise: Implement Deep Network for Digit Classification
Supervised Learning and Optimization
Linear Regression
Problem Formulation
As a refresher, we will start by learning how to implement linear regression. The main
idea is to get familiar with objective functions, computing their gradients, and
optimizing the objectives over a set of parameters. These basic tools will form the
basis for more sophisticated algorithms later. Readers who want additional details may
refer to the CS229 Lecture Notes on Supervised Learning.
Our goal in linear regression is to predict a target value $y$ starting from a vector of
input values $x \in \Re^n$. For example, we might want to make predictions about the
price of a house, so that $y$ represents the price of the house in dollars and the
elements $x_j$ of $x$ represent "features" that describe the house (such as its size and
the number of bedrooms). Suppose that we are given many examples of houses where the
features for the $i$'th house are denoted $x^{(i)}$ and the price is $y^{(i)}$.
Our goal is to find a function $y = h(x)$ so that we have $y^{(i)} \approx h(x^{(i)})$ for each
training example. If we succeed in finding a function $h(x)$ like this, and we have seen
enough examples of houses and their prices, we hope that the function $h(x)$ will also
be a good predictor of the house price even when we are given the features for a new
house where the price is not known.

To find a function $h(x)$ with $y^{(i)} \approx h(x^{(i)})$, we must first decide how to
represent the function $h(x)$. To start out, we will use linear functions:

$$h_\theta(x) = \sum_j \theta_j x_j = \theta^\top x .$$
Here, $h_\theta(x)$ represents a large family of functions parametrized by the choice of
$\theta$. (We call this space of functions a "hypothesis class".) With this representation
for $h$, our task is to find a choice of $\theta$ so that $h_\theta(x^{(i)})$ is as close as
possible to $y^{(i)}$. In particular, we will search for a choice of $\theta$ that minimizes
the squared-error cost:

$$J(\theta) = \frac{1}{2} \sum_i \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
            = \frac{1}{2} \sum_i \left( \theta^\top x^{(i)} - y^{(i)} \right)^2 .$$
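To make the objective concrete, here is a minimal NumPy sketch of how $J(\theta)$ and its
gradient can be computed. This is an illustrative example rather than the tutorial's own
starter code; the function name and the convention of storing the training examples as the
columns of a matrix X are assumptions made for this sketch.

import numpy as np

def linear_regression_cost(theta, X, y):
    """Squared-error cost J(theta) and its gradient for linear regression.

    theta : (n,)   parameter vector
    X     : (n, m) matrix whose columns are the training inputs x^(i)  (assumed layout)
    y     : (m,)   vector of target values y^(i)
    """
    # Predictions h_theta(x^(i)) = theta^T x^(i), computed for all examples at once.
    predictions = theta @ X              # shape (m,)
    errors = predictions - y             # h_theta(x^(i)) - y^(i)

    # J(theta) = 1/2 * sum_i (theta^T x^(i) - y^(i))^2
    cost = 0.5 * np.sum(errors ** 2)

    # dJ/dtheta_j = sum_i x_j^(i) * (theta^T x^(i) - y^(i)), computed for all j at once.
    grad = X @ errors                    # shape (n,)
    return cost, grad

# Tiny usage example on made-up data: 2 features, 4 training examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 4))
y = np.array([1.5, -0.5]) @ X            # targets generated by a known linear function
cost, grad = linear_regression_cost(np.zeros(2), X, y)
print(cost, grad)

The gradient returned here is $\nabla_\theta J(\theta) = \sum_i x^{(i)} \left( \theta^\top x^{(i)} - y^{(i)} \right)$,
which is exactly the quantity an iterative method such as gradient descent needs; minimizing
$J(\theta)$ with such a method is the subject of the "Function Minimization" section of the tutorial.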