Deep Learning
Yoshua Bengio
Ian Goodfellow
Aaron Courville
Contents
Acknowledgments vii
Notation ix
1 Introduction 1
1.1 Who Should Read This Book? . . . . . . . . . . . . . . . . . . . . 8
1.2 Historical Trends in Deep Learning . . . . . . . . . . . . . . . . . 11
I Applied Math and Machine Learning Basics 25
2 Linear Algebra 27
2.1 Scalars, Vectors, Matrices and Tensors . . . . . . . . . . . . . . . 27
2.2 Multiplying Matrices and Vectors . . . . . . . . . . . . . . . . . . 30
2.3 Identity and Inverse Matrices . . . . . . . . . . . . . . . . . . . . 31
2.4 Linear Dependence, Span, and Rank . . . . . . . . . . . . . . . . 32
2.5 Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.6 Special Kinds of Matrices and Vectors . . . . . . . . . . . . . . . 35
2.7 Eigendecomposition . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.8 Singular Value Decomposition . . . . . . . . . . . . . . . . . . . . 39
2.9 The Moore-Penrose Pseudoinverse . . . . . . . . . . . . . . . . . 40
2.10 The Trace Operator . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.11 Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.12 Example: Principal Components Analysis . . . . . . . . . . . . . 42
3 Probability and Information Theory 46
3.1 Why Probability? . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Probability Distributions . . . . . . . . . . . . . . . . . . . . . . . 49
3.4 Marginal Probability . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 Conditional Probability . . . . . . . . . . . . . . . . . . . . . . . 51
3.6 The Chain Rule of Conditional Probabilities . . . . . . . . . . . . 52
3.7 Independence and Conditional Independence . . . . . . . . . . . 52
3.8 Expectation, Variance, and Covariance . . . . . . . . . . . . . . . 53
3.9 Information Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.10 Common Probability Distributions . . . . . . . . . . . . . . . . . 57
3.11 Useful Properties of Common Functions . . . . . . . . . . . . . . 62
3.12 Bayes’ Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.13 Technical Details of Continuous Variables . . . . . . . . . . . . . 64
3.14 Structured Probabilistic Models . . . . . . . . . . . . . . . . . . . 65
3.15 Example: Naive Bayes . . . . . . . . . . . . . . . . . . . . . . . . 68
4 Numerical Computation 74
4.1 Overflow and Underflow . . . . . . . . . . . . . . . . . . . . . . . 74
4.2 Poor Conditioning . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.3 Gradient-Based Optimization . . . . . . . . . . . . . . . . . . . . 76
4.4 Constrained Optimization . . . . . . . . . . . . . . . . . . . . . . 85
4.5 Example: Linear Least Squares . . . . . . . . . . . . . . . . . . . 87
5 Machine Learning Basics 89
5.1 Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.2 Example: Linear Regression . . . . . . . . . . . . . . . . . . . . . 97
5.3 Generalization, Capacity, Overfitting and Underfitting . . . . . . 99
5.4 The No Free Lunch Theorem . . . . . . . . . . . . . . . . . . . . 104
5.5 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.6 Hyperparameters, Validation Sets and Cross-Validation . . . . . 108
5.7 Estimators, Bias, and Variance . . . . . . . . . . . . . . . . . . . 110
5.8 Maximum Likelihood Estimation . . . . . . . . . . . . . . . . . . 118
5.9 Bayesian Statistics and Prior Probability Distributions . . . . . . 121
5.10 Supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.11 Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . . . 131
5.12 Weakly Supervised Learning . . . . . . . . . . . . . . . . . . . . . 134
5.13 The Curse of Dimensionality and Statistical Limitations of Local Generalization . . . 135
II Modern Practical Deep Networks 147
6 Feedforward Deep Networks 149
6.1 From Fixed Features to Learned Features . . . . . . . . . . . . . 149
6.2 Formalizing and Generalizing Neural Networks . . . . . . . . . . 152
6.3 Parametrizing a Learned Predictor . . . . . . . . . . . . . . . . . 154
6.4 Flow Graphs and Back-Propagation . . . . . . . . . . . . . . . . 167
6.5 Universal Approximation Properties and Depth . . . . . . . . . . 180
6.6 Feature / Representation Learning . . . . . . . . . . . . . . . . . 184
6.7 Piecewise Linear Hidden Units . . . . . . . . . . . . . . . . . . . 186
6.8 Historical Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
7 Regularization 190
7.1 Regularization from a Bayesian Perspective . . . . . . . . . . . . 191
7.2 Classical Regularization: Parameter Norm Penalty . . . . . . . . 193
7.3 Classical Regularization as Constrained Optimization . . . . . . . 200
7.4 Regularization and Under-Constrained Problems . . . . . . . . . 201
7.5 Dataset Augmentation . . . . . . . . . . . . . . . . . . . . . . . . 203
7.6 Classical Regularization as Noise Robustness . . . . . . . . . . . 204
7.7 Early Stopping as a Form of Regularization . . . . . . . . . . . . 208
7.8 Parameter Tying and Parameter Sharing . . . . . . . . . . . . . . 215
7.9 Sparse Representations . . . . . . . . . . . . . . . . . . . . . . . . 215
7.10 Bagging and Other Ensemble Methods . . . . . . . . . . . . . . . 215
7.11 Dropout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
7.12 Multi-Task Learning . . . . . . . . . . . . . . . . . . . . . . . . . 222
7.13 Adversarial Training . . . . . . . . . . . . . . . . . . . . . . . . . 223
8 Optimization for Training Deep Models 226
8.1 Optimization for Model Training . . . . . . . . . . . . . . . . . . 226
8.2 Challenges in Optimization . . . . . . . . . . . . . . . . . . . . . 229
8.3 Optimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . 236
8.4 Approximate Natural Gradient and Second-Order Methods . . . 241
8.5 Conjugate Gradients . . . . . . . . . . . . . . . . . . . . . . . . . 241
8.6 BFGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
8.7 Hints, Global Optimization and Curriculum Learning . . . . . . . 243
9 Convolutional Networks 248
9.1 The Convolution Operation . . . . . . . . . . . . . . . . . . . . . 248
9.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
9.3 Pooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
9.4 Convolution and Pooling as an Infinitely Strong Prior . . . . . . 261
9.5 Variants of the Basic Convolution Function . . . . . . . . . . . . 262
9.6 Structured Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . 269
9.7 Convolutional Modules . . . . . . . . . . . . . . . . . . . . . . . . 269
9.8 Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
9.9 Efficient Convolution Algorithms . . . . . . . . . . . . . . . . . . 271
9.10 Random or Unsupervised Features . . . . . . . . . . . . . . . . . 271
9.11 The Neuroscientific Basis for Convolutional Networks . . . . . . . 273
9.12 Convolutional Networks and the History of Deep Learning . . . . 280
10 Sequence Modeling: Recurrent and Recursive Nets 281
10.1 Unfolding Flow Graphs and Sharing Parameters . . . . . . . . . . 282
10.2 Recurrent Neural Networks . . . . . . . . . . . . . . . . . . . . . 284
10.3 Bidirectional RNNs . . . . . . . . . . . . . . . . . . . . . . . . . . 295
10.4 Deep Recurrent Networks . . . . . . . . . . . . . . . . . . . . . . 296
10.5 Recursive Neural Networks . . . . . . . . . . . . . . . . . . . . . 299
10.6 Auto-Regressive Networks . . . . . . . . . . . . . . . . . . . . . . 299
10.7 Facing the Challenge of Long-Term Dependencies . . . . . . . . . 305
10.8 Handling Temporal Dependencies with N-Grams, HMMs, CRFs and Other Graphical Models . . . 317
10.9 Combining Neural Networks and Search . . . . . . . . . . . . . . 328
11 Practical Methodology 333
11.1 Basic Machine Learning Methodology . . . . . . . . . . . . . . . 333
11.2 Manual Hyperparameter Tuning . . . . . . . . . . . . . . . . . . 334
11.3 Hyperparameter Optimization Algorithms . . . . . . . . . . . . . 334
11.4 Debugging Strategies . . . . . . . . . . . . . . . . . . . . . . . . . 336
12 Applications 339
12.1 Large Scale Deep Learning . . . . . . . . . . . . . . . . . . . . . . 339
12.2 Computer Vision . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
12.3 Speech Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . 352
12.4 Natural Language Processing and Neural Language Models . . . 353
12.5 Structured Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . 369
12.6 Other Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 369
III Deep Learning Research 370
13 Structured Probabilistic Models for Deep Learning 372
13.1 The Challenge of Unstructured Modeling . . . . . . . . . . . . . . 373
13.2 Using Graphs to Describe Model Structure . . . . . . . . . . . . . 377
13.3 Advantages of Structured Modeling . . . . . . . . . . . . . . . . . 391
13.4 Learning About Dependencies . . . . . . . . . . . . . . . . . . . . 392
13.5 Inference and Approximate Inference Over Latent Variables . . . 394
13.6 The Deep Learning Approach to Structured Probabilistic Models 395
14 Monte Carlo Methods 400
14.1 Markov Chain Monte Carlo Methods . . . . . . . . . . . . . . . . 400