Deep Learning (original edition)

2017-05-11 11:55:57 · 22.29 MB · PDF

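Deep Learning was written jointly by three leading academics: Ian Goodfellow, Yoshua Bengio and Aaron Courville. Tesla CEO Elon Musk said of it: "Written by three experts in the field, Deep Learning is the only comprehensive book on the subject."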
CONTENTS

3.2 Random Variables
3.3 Probability Distributions
3.4 Marginal Probability
3.5 Conditional Probability
3.6 The Chain Rule of Conditional Probabilities
3.7 Independence and Conditional Independence
3.8 Expectation, Variance and Covariance
3.9 Common Probability Distributions
3.10 Useful Properties of Common Functions
3.11 Bayes' Rule
3.12 Technical Details of Continuous Variables
3.13 Information Theory
3.14 Structured Probabilistic Models

4 Numerical Computation
4.1 Overflow and Underflow
4.2 Poor Conditioning
4.3 Gradient-Based Optimization
4.4 Constrained Optimization
4.5 Example: Linear Least Squares

5 Machine Learning Basics
5.1 Learning Algorithms
5.2 Capacity, Overfitting and Underfitting
5.3 Hyperparameters and Validation Sets
5.4 Estimators, Bias and Variance
5.5 Maximum Likelihood Estimation
5.6 Bayesian Statistics
5.7 Supervised Learning Algorithms
5.8 Unsupervised Learning Algorithms
5.9 Stochastic Gradient Descent
5.10 Building a Machine Learning Algorithm
5.11 Challenges Motivating Deep Learning

II Deep Networks: Modern Practices

6 Deep Feedforward Networks
6.1 Example: Learning XOR
6.2 Gradient-Based Learning
6.3 Hidden Units
6.4 Architecture Design
6.5 Back-Propagation and Other Differentiation Algorithms
6.6 Historical Notes

7 Regularization for Deep Learning
7.1 Parameter Norm Penalties
7.2 Norm Penalties as Constrained Optimization
7.3 Regularization and Under-Constrained Problems
7.4 Dataset Augmentation
7.5 Noise Robustness
7.6 Semi-Supervised Learning
7.7 Multi-Task Learning
7.8 Early Stopping
7.9 Parameter Tying and Parameter Sharing
7.10 Sparse Representations
7.11 Bagging and Other Ensemble Methods
7.12 Dropout
7.13 Adversarial Training
7.14 Tangent Distance, Tangent Prop, and Manifold Tangent Classifier

8 Optimization for Training Deep Models
8.1 How Learning Differs from Pure Optimization
8.2 Challenges in Neural Network Optimization
8.3 Basic Algorithms
8.4 Parameter Initialization Strategies
8.5 Algorithms with Adaptive Learning Rates
8.6 Approximate Second-Order Methods
8.7 Optimization Strategies and Meta-Algorithms

9 Convolutional Networks
9.1 The Convolution Operation
9.2 Motivation
9.3 Pooling
9.4 Convolution and Pooling as an Infinitely Strong Prior
9.5 Variants of the Basic Convolution Function
9.6 Structured Outputs
9.7 Data Types
9.8 Efficient Convolution Algorithms
9.9 Random or Unsupervised Features
9.10 The Neuroscientific Basis for Convolutional Networks
9.11 Convolutional Networks and the History of Deep Learning

10 Sequence Modeling: Recurrent and Recursive Nets
10.1 Unfolding Computational Graphs
10.2 Recurrent Neural Networks
10.3 Bidirectional RNNs
10.4 Encoder-Decoder Sequence-to-Sequence Architectures
10.5 Deep Recurrent Networks
10.6 Recursive Neural Networks
10.7 The Challenge of Long-Term Dependencies
10.8 Echo State Networks
10.9 Leaky Units and Other Strategies for Multiple Time Scales
10.10 The Long Short-Term Memory and Other Gated RNNs
10.11 Optimization for Long-Term Dependencies
10.12 Explicit Memory

11 Practical Methodology
11.1 Performance Metrics
11.2 Default Baseline Models
11.3 Determining Whether to Gather More Data
11.4 Selecting Hyperparameters
11.5 Debugging Strategies
11.6 Example: Multi-Digit Number Recognition

12 Applications
12.1 Large Scale Deep Learning
12.2 Computer Vision
12.3 Speech Recognition
12.4 Natural Language Processing
12.5 Other Applications

III Deep Learning Research

13 Linear Factor Models
13.1 Probabilistic PCA and Factor Analysis
13.2 Independent Component Analysis (ICA)
13.3 Slow Feature Analysis
13.4 Sparse Coding
13.5 Manifold Interpretation of PCA

14 Autoencoders
14.1 Undercomplete Autoencoders
14.2 Regularized Autoencoders
14.3 Representational Power, Layer Size and Depth
14.4 Stochastic Encoders and Decoders
14.5 Denoising Autoencoders
14.6 Learning Manifolds with Autoencoders
14.7 Contractive Autoencoders
14.8 Predictive Sparse Decomposition
14.9 Applications of Autoencoders

15 Representation Learning
15.1 Greedy Layer-Wise Unsupervised Pretraining
15.2 Transfer Learning and Domain Adaptation
15.3 Semi-Supervised Disentangling of Causal Factors
15.4 Distributed Representation
15.5 Exponential Gains from Depth
15.6 Providing Clues to Discover Underlying Causes

16 Structured Probabilistic Models for Deep Learning
16.1 The Challenge of Unstructured Modeling
16.2 Using Graphs to Describe Model Structure
16.3 Sampling from Graphical Models
16.4 Advantages of Structured Modeling
16.5 Learning about Dependencies
16.6 Inference and Approximate Inference
16.7 The Deep Learning Approach to Structured Probabilistic Models

17 Monte Carlo Methods
17.1 Sampling and Monte Carlo Methods
17.2 Importance Sampling
17.3 Markov Chain Monte Carlo Methods
17.4 Gibbs Sampling
17.5 The Challenge of Mixing between Separated Modes

18 Confronting the Partition Function
18.1 The Log-Likelihood Gradient
18.2 Stochastic Maximum Likelihood and Contrastive Divergence
18.3 Pseudolikelihood
18.4 Score Matching and Ratio Matching
18.5 Denoising Score Matching
18.6 Noise-Contrastive Estimation
18.7 Estimating the Partition Function

19 Approximate Inference
19.1 Inference as Optimization
19.2 Expectation Maximization
19.3 MAP Inference and Sparse Coding
19.4 Variational Inference and Learning
19.5 Learned Approximate Inference

20 Deep Generative Models
20.1 Boltzmann Machines
20.2 Restricted Boltzmann Machines
20.3 Deep Belief Networks
20.4 Deep Boltzmann Machines
20.5 Boltzmann Machines for Real-Valued Data
20.6 Convolutional Boltzmann Machines
20.7 Boltzmann Machines for Structured or Sequential Outputs
20.8 Other Boltzmann Machines
20.9 Back-Propagation through Random Operations
20.10 Directed Generative Nets
20.11 Drawing Samples from Autoencoders
20.12 Generative Stochastic Networks
20.13 Other Generation Schemes
20.14 Evaluating Generative Models
20.15 Conclusion

Bibliography
Index

Website

www.deeplearningbook.org

This book is accompanied by the above website. The website provides a variety of supplementary material, including exercises, lecture slides, corrections of mistakes and other resources that should be useful to both readers and instructors.

Acknowledgments

This book would not have been possible without the contributions of many people.

We would like to thank those who commented on our proposal for the book and helped plan its contents and organization: Guillaume Alain, Kyunghyun Cho, Çağlar Gülçehre, David Krueger, Hugo Larochelle, Razvan Pascanu and Thomas Rohée.

We would like to thank the people who offered feedback on the content of the book itself. Some offered feedback on many chapters: Martin Abadi, Guillaume Alain, Ion Androutsopoulos, Fred Bertsch, Olexa Bilaniuk, Ufuk Can Bicici, Matko Bošnjak, John Boersma, Greg Brockman, Alexandre de Brebisson, Pierre Luc Carrier, Sarath Chandar, Pawel Chilinski, Mark Daoust, Oleg Dashevskii, Laurent Dinh, Stephan Dreseitl, Jim Fan, Miao Fan, Meire Fortunato, Frederic Francis, Nando de Freitas, Çağlar Gülçehre, Jurgen Van Gael, Javier Alonso Garcia, Jonathan Hunt, Gopi Jeyaram, Chingiz Kabytayev, Lukasz Kaiser, Varun Kanade, Akiel Khan, John King, Diederik P. Kingma, Yann LeCun, Rudolf Mathey, Matias Mattamala, Abhinav Maurya, Kevin Murphy, Oleg Mürk, Roman Novak, Augustus Q. Odena, Simon Pavlik, Karl Pichotta, Kari Pulli, Roussel Rahman, Tapani Raiko, Anurag Ranjan, Johannes Roith, Mihaela Rosca, Halis Sak, Cesar Salgado, Grigory Sapunov, Yoshinori Sasaki, Mike Schuster, Julian Serban, Nir Shabat, Ken Shirriff, Andre Simpelo, Scott Stanley, David Sussillo, Ilya Sutskever, Carles Gelada Saez, Graham Taylor, Valentin Tolmer, An Tran, Shubhendu Trivedi, Alexey Umnov, Vincent Vanhoucke, Marco Visentini-Scarzanella, David Warde-Farley, Dustin Webb, Kelvin Xu, Wei Xue, Ke Yang, Li Yao, Zygmunt Zajac and Ozan Caglayan.

We would also like to thank those who provided us with useful feedback on individual chapters:

Notation: Zhang Yuanhang.
Chapter 1, Introduction: Yusuf Akgul, Sebastien Bratieres, Samira Ebrahimi, Charlie Gorichanaz, Brendan Loudermilk, Eric Morris, Cosmin Parvulescu and Alfredo Solano.
Chapter 2, Linear Algebra: Amjad Almahairi, Nikola Banic, Kevin Bennett, Philippe Castonguay, Oscar Chang, Eric Fosler-Lussier, Andrey Khalyavin, Sergey Oreshkov, Istvan Petras, Dennis Prangle, Thomas Rohée, Colby Toland, Massimiliano Tomassoli, Alessandro Vitale and Bob Welland.
Chapter 3, Probability and Information Theory: John Philip Anderson, Kai Arulkumaran, Vincent Dumoulin, Rui Fa, Stephan Gouws, Artem Oboturov, Antti Rasmus, Alexey Surkov and Volker Tresp.
Chapter 4, Numerical Computation: Tran Lam An, Ian Fischer and Hu Yuhuang.
Chapter 5, Machine Learning Basics: Dzmitry Bahdanau, Nikhil Garg, Makoto Otsuka, Bob Pepin, Philip Popien, Emmanuel Rayner, Kee-Bong Song, Zheng Sun and Andy Wu.
Chapter 6, Deep Feedforward Networks: Uriel Berdugo, Fabrizio Bottarel, Elizabeth Burl, Ishan Durugkar, Jeff Hlywa, Jong Wook Kim, David Krueger and Aditya Kumar Praharaj.
Chapter 7, Regularization for Deep Learning: Kshitij Lauria, Inkyu Lee, Sunil Mohan and Joshua Salisbury.
Chapter 8, Optimization for Training Deep Models: Marcel Ackermann, Rowel Atienza, Andrew Brock, Tegan Maharaj, James Martens, Klaus Strobl and Martin Vita.
Chapter 9, Convolutional Networks: Martin Arjovsky, Eugene Brevdo, Konstantin Divilov, Eric Jensen, Asifullah Khan, Mehdi Mirza, Alex Paino, Eddie Pierce, Marjorie Sayer, Ryan Stout and Wentao Wu.
Chapter 10, Sequence Modeling: Recurrent and Recursive Nets: Gokcen Eraslan, Steven Hickson, Razvan Pascanu, Lorenzo von Ritter, Rui Rodrigues, Dmitriy Serdyuk, Dongyu Shi and Kaiyu Yang.
Chapter 11, Practical Methodology: Daniel Beckstein.
Chapter 12, Applications: George Dahl and Ribana Roscher.
Chapter 15, Representation Learning: Kunal Ghosh.
