Deep Learning (Goodfellow), English edition with table of contents (CSDN Library resource)

Uploaded 2017-11-16 14:02:21, PDF, 21.89MB

Deep Learning

Ian Goodfellow

Yoshua Bengio

Aaron Courville

CONTENTS

3.2 Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 56

3.3 Probability Distributions . . . . . . . . . . . . . . . . . . . . . . . 56

3.4 Marginal Probability . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.5 Conditional Probability . . . . . . . . . . . . . . . . . . . . . . . 59

3.6 The Chain Rule of Conditional Probabilities . . . . . . . . . . . . 59

3.7 Independence and Conditional Independence . . . . . . . . . . . . 60

3.8 Expectation, Variance and Covariance . . . . . . . . . . . . . . . 60

3.9 Common Probability Distributions . . . . . . . . . . . . . . . . . 62

3.10 Useful Properties of Common Functions . . . . . . . . . . . . . . 67

3.11 Bayes’ Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

3.12 Technical Details of Continuous Variables . . . . . . . . . . . . . 71

3.13 Information Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 73

3.14 Structured Probabilistic Models . . . . . . . . . . . . . . . . . . . 75

4 Numerical Computation 80

4.1 Overflow and Underflow . . . . . . . . . . . . . . . . . . . . . . . 80

4.2 Poor Conditioning . . . . . . . . . . . . . . . . . . . . . . . . . . 82

4.3 Gradient-Based Optimization . . . . . . . . . . . . . . . . . . . . 82

4.4 Constrained Optimization . . . . . . . . . . . . . . . . . . . . . . 93

4.5 Example: Linear Least Squares . . . . . . . . . . . . . . . . . . . 96

5 Machine Learning Basics 98

5.1 Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 99

5.2 Capacity, Overfitting and Underfitting . . . . . . . . . . . . . . . 110

5.3 Hyperparameters and Validation Sets . . . . . . . . . . . . . . . . 120

5.4 Estimators, Bias and Variance . . . . . . . . . . . . . . . . . . . . 122

5.5 Maximum Likelihood Estimation . . . . . . . . . . . . . . . . . . 131

5.6 Bayesian Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 135

5.7 Supervised Learning Algorithms . . . . . . . . . . . . . . . . . . . 140

5.8 Unsupervised Learning Algorithms . . . . . . . . . . . . . . . . . 146

5.9 Stochastic Gradient Descent . . . . . . . . . . . . . . . . . . . . . 151

5.10 Building a Machine Learning Algorithm . . . . . . . . . . . . . . 153

5.11 Challenges Motivating Deep Learning . . . . . . . . . . . . . . . . 155

II Deep Networks: Modern Practices 166

6 Deep Feedforward Networks 168

6.1 Example: Learning XOR . . . . . . . . . . . . . . . . . . . . . . . 171

6.2 Gradient-Based Learning . . . . . . . . . . . . . . . . . . . . . . . 177

6.3 Hidden Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

6.4 Architecture Design . . . . . . . . . . . . . . . . . . . . . . . . . . 197

6.5 Back-Propagation and Other Differentiation Algorithms . . . . . 204

6.6 Historical Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

7 Regularization for Deep Learning 228

7.1 Parameter Norm Penalties . . . . . . . . . . . . . . . . . . . . . . 230

7.2 Norm Penalties as Constrained Optimization . . . . . . . . . . . . 237

7.3 Regularization and Under-Constrained Problems . . . . . . . . . 239

7.4 Dataset Augmentation . . . . . . . . . . . . . . . . . . . . . . . . 240

7.5 Noise Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . 242

7.6 Semi-Supervised Learning . . . . . . . . . . . . . . . . . . . . . . 243

7.7 Multi-Task Learning . . . . . . . . . . . . . . . . . . . . . . . . . 244

7.8 Early Stopping . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

7.9 Parameter Tying and Parameter Sharing . . . . . . . . . . . . . . 253

7.10 Sparse Representations . . . . . . . . . . . . . . . . . . . . . . . . 254

7.11 Bagging and Other Ensemble Methods . . . . . . . . . . . . . . . 256

7.12 Dropout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

7.13 Adversarial Training . . . . . . . . . . . . . . . . . . . . . . . . . 268

7.14 Tangent Distance, Tangent Prop, and Manifold Tangent Classifier 270

8 Optimization for Training Deep Models 274

8.1 How Learning Differs from Pure Optimization . . . . . . . . . . . 275

8.2 Challenges in Neural Network Optimization . . . . . . . . . . . . 282

8.3 Basic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 294

8.4 Parameter Initialization Strategies . . . . . . . . . . . . . . . . . 301

8.5 Algorithms with Adaptive Learning Rates . . . . . . . . . . . . . 306

8.6 Approximate Second-Order Methods . . . . . . . . . . . . . . . . 310

8.7 Optimization Strategies and Meta-Algorithms . . . . . . . . . . . 317

9 Convolutional Networks 330

9.1 The Convolution Operation . . . . . . . . . . . . . . . . . . . . . 331

9.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335

9.3 Pooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339

9.4 Convolution and Pooling as an Infinitely Strong Prior . . . . . . . 345

9.5 Variants of the Basic Convolution Function . . . . . . . . . . . . 347

9.6 Structured Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . 358

9.7 Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360

9.8 Efficient Convolution Algorithms . . . . . . . . . . . . . . . . . . 362

9.9 Random or Unsupervised Features . . . . . . . . . . . . . . . . . 363

9.10 The Neuroscientific Basis for Convolutional Networks . . . . . . . 364

9.11 Convolutional Networks and the History of Deep Learning . . . . 371

10 Sequence Modeling: Recurrent and Recursive Nets 373

10.1 Unfolding Computational Graphs . . . . . . . . . . . . . . . . . . 375

10.2 Recurrent Neural Networks . . . . . . . . . . . . . . . . . . . . . 378

10.3 Bidirectional RNNs . . . . . . . . . . . . . . . . . . . . . . . . . . 394

10.4 Encoder-Decoder Sequence-to-Sequence Architectures . . . . . . . 396

10.5 Deep Recurrent Networks . . . . . . . . . . . . . . . . . . . . . . 398

10.6 Recursive Neural Networks . . . . . . . . . . . . . . . . . . . . . . 400

10.7 The Challenge of Long-Term Dependencies . . . . . . . . . . . . . 401

10.8 Echo State Networks . . . . . . . . . . . . . . . . . . . . . . . . . 404

10.9 Leaky Units and Other Strategies for Multiple Time Scales . . . . 406

10.10 The Long Short-Term Memory and Other Gated RNNs . . . . . . 408

10.11 Optimization for Long-Term Dependencies . . . . . . . . . . . . . 413

10.12 Explicit Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . 416

11 Practical Methodology 421

11.1 Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . . . 422

11.2 Default Baseline Models . . . . . . . . . . . . . . . . . . . . . . . 425

11.3 Determining Whether to Gather More Data . . . . . . . . . . . . 426

11.4 Selecting Hyperparameters . . . . . . . . . . . . . . . . . . . . . . 427

11.5 Debugging Strategies . . . . . . . . . . . . . . . . . . . . . . . . . 436

11.6 Example: Multi-Digit Number Recognition . . . . . . . . . . . . . 440

12 Applications 443

12.1 Large-Scale Deep Learning . . . . . . . . . . . . . . . . . . . . . . 443

12.2 Computer Vision . . . . . . . . . . . . . . . . . . . . . . . . . . . 452

12.3 Speech Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . 458

12.4 Natural Language Processing . . . . . . . . . . . . . . . . . . . . 461

12.5 Other Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 478

III Deep Learning Research 486

13 Linear Factor Models 489

13.1 Probabilistic PCA and Factor Analysis . . . . . . . . . . . . . . . 490

13.2 Independent Component Analysis (ICA) . . . . . . . . . . . . . . 491

13.3 Slow Feature Analysis . . . . . . . . . . . . . . . . . . . . . . . . 493

13.4 Sparse Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
