Deep Learning
Ian Goodfellow
Yoshua Bengio
Aaron Courville
Contents
Website vii
Acknowledgments viii
Notation xi
1 Introduction 1
1.1 Who Should Read This Book? ...... ......... ..... 8
1.2 Historical Trends in Deep Learning ......... ........ 11
I Applied Math and Machine Learning Basics 29
2 Linear Algebra 31
2.1 Scalars, Vectors, Matrices and Tensors ......... ...... 31
2.2 Multiplying Matrices and Vectors ......... ........ . 34
2.3 Identity and Inverse Matrices ......... ........ ... 36
2.4 Linear Dependence and Span ......... ........ ... 37
2.5 Norms ......... ........ ........ ........ 39
2.6 Special Kinds of Matrices and Vectors ............... 40
2.7 Eigendecomposition .......... ........ ........ 42
2.8 Singular Value Decomposition ........ ........ .... 44
2.9 The Moore-Penrose Pseudoinverse ......... ........ . 45
2.10 The Trace Operator ......... ........ ........ 46
2.11 The Determinant .. ........ ........ ......... 47
2.12 Example: Principal Components Analysis ......... .... 48
3 Probability and Information Theory 53
3.1 Why Probability? ..... ......... ........ ..... 54
3.2 Random Variables ..... ........ ......... .... 56
3.3 Probability Distributions ......... ........ ...... 56
3.4 Marginal Probability ......... ......... ....... 58
3.5 Conditional Probability .. ........ ........ ..... 59
3.6 The Chain Rule of Conditional Probabilities ......... ... 59
3.7 Independence and Conditional Independence ......... ... 60
3.8 Expectation, Variance and Covariance .......... ..... 60
3.9 Common Probability Distributions ............... .. 62
3.10 Useful Properties of Common Functions ... ......... .. 67
3.11 Bayes’ Rule .......... ........ ........ .... 70
3.12 Technical Details of Continuous Variables ...... ....... 71
3.13 Information Theory .......... ........ ........ 72
3.14 Structured Probabilistic Models .... ........ ....... 75
4 Numerical Computation 80
4.1 Overflow and Underflow ......... ........ ...... 80
4.2 Poor Conditioning ......... ........ ......... 82
4.3 Gradient-Based Optimization ....... ........ ..... 82
4.4 Constrained Optimization ............. ........ . 93
4.5 Example: Linear Least Squares ....... ......... ... 96
5 Machine Learning Basics 98
5.1 Learning Algorithms ........... ........ ...... 99
5.2 Capacity, Overfitting and Underfitting .. ........ ..... 110
5.3 Hyperparameters and Validation Sets . ........ ....... 120
5.4 Estimators, Bias and Variance ...... ........ ...... 122
5.5 Maximum Likelihood Estimation ...... ......... ... 131
5.6 Bayesian Statistics ........... ........ ....... 135
5.7 Supervised Learning Algorithms ... ........ ........ 139
5.8 Unsupervised Learning Algorithms ............... .. 145
5.9 Stochastic Gradient Descent .... ......... ........ 150
5.10 Building a Machine Learning Algorithm ............. . 152
5.11 Challenges Motivating Deep Learning ..... ......... .. 154
II Deep Networks: Modern Practices 165
6 Deep Feedforward Networks 167
6.1 Example: Learning XOR . ......... ........ ..... 170
6.2 Gradient-Based Learning . ........ ........ ...... 176
6.3 Hidden Units ...... ........ ......... ...... 190
6.4 Architecture Design ......... ........ ........ . 196
6.5 Back-Propagation and Other Differentiation Algorithms ..... 203
6.6 Historical Notes ....... ........ ......... .... 224
7 Regularization for Deep Learning 228
7.1 Parameter Norm Penalties ..... ......... ........ 230
7.2 Norm Penalties as Constrained Optimization ........ .... 237
7.3 Regularization and Under-Constrained Problems .. ....... 239
7.4 Dataset Augmentation .......... ......... ..... 240
7.5 Noise Robustness ......... ........ ........ .. 242
7.6 Semi-Supervised Learning ................ ...... 244
7.7 Multi-Task Learning .............. ......... .. 245
7.8 Early Stopping ......... ........ ........ ... 246
7.9 Parameter Tying and Parameter Sharing .............. 251
7.10 Sparse Representations ......... ........ ....... 253
7.11 Bagging and Other Ensemble Methods . ......... ..... 255
7.12 Dropout ........ ......... ........ ....... 257
7.13 Adversarial Training ........ ......... ........ 267
7.14 Tangent Distance, Tangent Prop, and Manifold Tangent Classifier 268
8 Optimization for Training Deep Models 274
8.1 How Learning Differs from Pure Optimization ........... 275
8.2 Challenges in Neural Network Optimization ..... ....... 282
8.3 Basic Algorithms ............. ........ ...... 294
8.4 Parameter Initialization Strategies . ......... ....... 301
8.5 Algorithms with Adaptive Learning Rates ....... ...... 306
8.6 Approximate Second-Order Methods .... ......... ... 310
8.7 Optimization Strategies and Meta-Algorithms ..... ...... 318
9 Convolutional Networks 331
9.1 The Convolution Operation ................ ..... 332
9.2 Motivation .. ........ ......... ........ .... 336
9.3 Pooling ............. ........ ......... ... 340
9.4 Convolution and Pooling as an Infinitely Strong Prior .. ..... 346
9.5 Variants of the Basic Convolution Function ............ 348
9.6 Structured Outputs . ........ ......... ........ 359
9.7 Data Types ...... ........ ........ ........ 361
9.8 Efficient Convolution Algorithms ........ ........ .. 363
9.9 Random or Unsupervised Features ........ ........ . 364
9.10 The Neuroscientific Basis for Convolutional Networks ...... . 365
9.11 Convolutional Networks and the History of Deep Learning .... 372
10 Sequence Modeling: Recurrent and Recursive Nets 374
10.1 Unfolding Computational Graphs ............. ..... 376
10.2 Recurrent Neural Networks ... ......... ........ . 379
10.3 Bidirectional RNNs .............. ......... ... 396
10.4 Encoder-Decoder Sequence-to-Sequence Architectures ...... . 397
10.5 Deep Recurrent Networks ........ ......... ..... 399
10.6 Recursive Neural Networks ..... ......... ........ 401
10.7 The Challenge of Long-Term Dependencies .......... ... 403
10.8 Echo State Networks .......... ......... ...... 406
10.9 Leaky Units and Other Strategies for Multiple Time Scales ... . 409
10.10 The Long Short-Term Memory and Other Gated RNNs .. .... 411
10.11 Optimization for Long-Term Dependencies ........ ..... 415
10.12 Explicit Memory .......... ......... ........ 419
11 Practical Methodology 424
11.1 Performance Metrics .......... ........ ....... 425
11.2 Default Baseline Models ........ ........ ....... 428
11.3 Determining Whether to Gather More Data ............ 429
11.4 Selecting Hyperparameters ......... ........ ..... 430
11.5 Debugging Strategies ..... ........ ......... ... 439
11.6 Example: Multi-Digit Number Recognition ..... ........ 443
12 Applications 446
12.1 Large Scale Deep Learning . ........ ......... .... 446
12.2 Computer Vision ......... ........ ........ .. 455
12.3 Speech Recognition ...... ........ ......... ... 461
12.4 Natural Language Processing ... ........ ........ . 464
12.5 Other Applications ......... ........ ........ . 480
III Deep Learning Research 489
13 Linear Factor Models 492
13.1 Probabilistic PCA and Factor Analysis ....... ........ 493
13.2 Independent Component Analysis (ICA) ............ .. 494
13.3 Slow Feature Analysis ...... ......... ........ . 496
13.4 Sparse Coding ...... ........ ......... ...... 499