Neural Networks: Tricks of the Trade (2nd Edition)

Volume Editors

Grégoire Montavon
Technische Universität Berlin, Department of Computer Science
Franklinstr. 28/29, 10587 Berlin, Germany
E-mail: gregoire.montavon@tu-berlin.de

Geneviève B. Orr
Willamette University, Department of Computer Science
900 State Street, Salem, OR 97301, USA
E-mail: gorr@willamette.edu

Klaus-Robert Müller
Technische Universität Berlin, Department of Computer Science
Franklinstr. 28/29, 10587 Berlin, Germany
and
Korea University, Department of Brain and Cognitive Engineering
Anam-dong, Seongbuk-gu, Seoul 136-713, Korea
E-mail: klaus-robert.mueller@tu-berlin.de

ISSN 0302-9743, e-ISSN 1611-3349
ISBN 978-3-642-35288-1, e-ISBN 978-3-642-35289-8
DOI 10.1007/978-3-642-35289-8
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2012952591
CR Subject Classification (1998): F.1, I.2.6, I.5.1, C.1.3, F.2, J.3
LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues

© Springer-Verlag Berlin Heidelberg 1998, 2012

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks.
Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface to the Second Edition

There have been substantial changes in the field of neural networks since the first edition of this book in 1998. Some of them have been driven by external factors such as the increase of available data and computing power. The Internet made public massive amounts of labeled and unlabeled data. The ever-increasing raw mass of user-generated and sensed data is made easily accessible by databases and Web crawlers. Nowadays, anyone with an Internet connection can parse the 4,000,000+ articles available on Wikipedia and construct a dataset out of them. Anyone can capture a Web TV stream and obtain days of video content to test their learning algorithms.

Another development is the amount of available computing power, which has continued to rise at a steady rate owing to progress in hardware design and engineering. While the number of cycles per second of processors has plateaued due to physical limitations, the slowdown has been offset by the emergence of processing parallelism, best exemplified by massively parallel graphics processing units (GPUs).
Nowadays, everybody can buy a GPU board (often already available in consumer-grade laptops), install free GPU software, and run computation-intensive simulations at low cost.

These developments have raised the following question: Can we make use of this large computing power to make sense of these increasingly complex datasets? Neural networks are a promising approach, as they have the intrinsic modeling capacity and flexibility to represent the solution. Their intrinsically distributed nature allows one to leverage massively parallel computing resources.

During the last two decades, the focus of neural network research and the practice of training neural networks underwent important changes. Learning in deep architectures (or "deep learning") has to a certain degree displaced the once more prevalent regularization issues, or more precisely, changed the practice of regularizing neural networks. Use of unlabeled data via unsupervised layer-wise pretraining or deep unsupervised embeddings is now often preferred over traditional regularization schemes such as weight decay or restricted connectivity. This new paradigm has started to spread over a large number of applications such as image recognition, speech recognition, natural language processing, complex systems, neuroscience, and computational physics.

The second edition of the book reloads the first edition with more tricks. These tricks arose from 14 years of theory and experimentation (from 1998 to 2012) by some of the world's most prominent neural network researchers. These tricks can make a substantial difference (in terms of speed, ease of implementation, and accuracy) when it comes to putting algorithms to work on real problems. Tricks may not necessarily have solid theoretical foundations or formal validation. As Yoshua Bengio states in Chap. 19, the wisdom distilled here should be taken as a guideline, to be tried and challenged, not as a practice set in stone.

The second part of the new edition starts with tricks to optimize neural networks faster and make more efficient use of the potentially infinite stream of data presented to them. Chapter 18 [2] shows that a simple stochastic gradient descent (learning one example at a time) is suited for training most neural networks. Chapter 19 [1] introduces a large number of tricks and recommendations for training feed-forward neural networks and choosing the multiple hyperparameters. When the representation built by the neural network is highly sensitive to small parameter changes, for example, in recurrent neural networks, second-order methods based on mini-batches such as those presented in Chap. 20 [9] can be a better choice. The seemingly simple optimization procedures presented in these chapters require their fair share of tricks in order to work optimally. The software Torch7 presented in Chap. 21 [5] provides a fast and modular implementation of these neural networks.

The novel second part of this volume continues with tricks to incorporate invariance into the model. In the context of image recognition, Chap. 22 [4] shows that translation invariance can be achieved by learning a k-means representation of image patches and spatially pooling the k-means activations. Chapter 23 [3] shows that invariance can be injected directly in the input space in the form of elastic distortions. Unlabeled data are ubiquitous, and using them to capture regularities in data is an important component of many learning algorithms. For example, we can learn an unsupervised model of data as a first step, as discussed in Chaps. 24 [7] and 25 [10], and feed the unsupervised representation to a supervised classifier.
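The per-example update scheme mentioned above (a plain stochastic gradient step after each training example, as opposed to a full-batch gradient) can be sketched as follows. This is an illustrative sketch on a linear least-squares model, not code from the book; all function and variable names are ours:

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=50, seed=0):
    """Fit w, b minimizing squared error, updating on one example at a time."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):       # visit examples in random order
            err = X[i] @ w + b - y[i]      # prediction error on one example
            w -= lr * err * X[i]           # gradient step from this example only
            b -= lr * err
    return w, b

# Usage: recover a known linear relation y = 2*x0 - 3*x1 + 1
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 2 * X[:, 0] - 3 * X[:, 1] + 1
w, b = sgd_linear_regression(X, y)
```

Each update is cheap and uses a single example, which is what makes the method suited to the "potentially infinite stream of data" setting: examples can be consumed as they arrive rather than stored for batch processing.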
Chapter 26 [12] shows that similar improvements can be obtained by learning an unsupervised embedding in the deep layers of a neural network, with added flexibility.

The book concludes with the application of neural networks to modeling time series and optimal control systems. Modeling time series can be done using a very simple technique discussed in Chap. 27 [8] that consists of fitting a linear model on top of a "reservoir" that implements a rich set of time series primitives. Chapter 28 [13] offers an alternative to the previous method by directly identifying the underlying dynamical system that generates the time series data. Chapter 29 [6] presents how these system identification techniques can be used to identify a Markov decision process from the observation of a control system (a sequence of states and actions in the reinforcement learning terminology). Chapter 30 [11] concludes by showing how the control system can be dynamically improved by fitting a neural network as the control system explores the space of states and actions.

The book intends to provide a timely snapshot of tricks, theory, and algorithms that are of use. Our hope is that some of the chapters of the new second edition will become our companions when doing experimental work, eventually becoming classics, as some of the papers of the first edition have become. Eventually, in some years, there may be an urge to reload again.

September 2012
Grégoire Montavon
Klaus-Robert Müller

Acknowledgments. This work was supported by the World Class University Program through the National Research Foundation of Korea funded by the Ministry of Education, Science, and Technology, under grant R31-10008. The editors also acknowledge partial support by DFG (MU 987/17-1).

References

[1] Bengio, Y.: Practical Recommendations for Gradient-Based Training of Deep Architectures. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 437-478. Springer, Heidelberg (2012)
[2] Bottou, L.: Stochastic Gradient Descent Tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 421-436. Springer, Heidelberg (2012)
[3] Cireșan, D.C., Meier, U., Gambardella, L.M., Schmidhuber, J.: Deep Big Multilayer Perceptrons for Digit Recognition. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 581-598. Springer, Heidelberg (2012)
[4] Coates, A., Ng, A.Y.: Learning Feature Representations with K-Means. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 561-580. Springer, Heidelberg (2012)
[5] Collobert, R., Kavukcuoglu, K., Farabet, C.: Implementing Neural Networks Efficiently. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 537-557. Springer, Heidelberg (2012)
[6] Duell, S., Udluft, S., Sterzing, V.: Solving Partially Observable Reinforcement Learning Problems with Recurrent Neural Networks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 709-733. Springer, Heidelberg (2012)
[7] Hinton, G.E.: A Practical Guide to Training Restricted Boltzmann Machines. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 599-619. Springer, Heidelberg (2012)
[8] Lukoševičius, M.: A Practical Guide to Applying Echo State Networks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 659-686. Springer, Heidelberg (2012)
[9] Martens, J., Sutskever, I.: Training Deep and Recurrent Networks with Hessian-Free Optimization. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 479-535. Springer, Heidelberg (2012)
[10] Montavon, G., Müller, K.-R.: Deep Boltzmann Machines and the Centering Trick. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 621-637. Springer, Heidelberg (2012)
[11] Riedmiller, M.: 10 Steps and Some Tricks to Set Up Neural Reinforcement Controllers. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 735-757. Springer, Heidelberg (2012)
[12] Weston, J., Ratle, F., Collobert, R.: Deep Learning via Semisupervised Embedding. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 639-655. Springer, Heidelberg (2012)
[13] Zimmermann, H.-G., Tietz, C., Grothmann, R.: Forecasting with Recurrent Neural Networks: 12 Tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) NN: Tricks of the Trade, 2nd edn. LNCS, vol. 7700, pp. 687-707. Springer, Heidelberg (2012)

Table of Contents

Introduction

Speeding Learning
Preface
1. Efficient BackProp
   Yann LeCun, Léon Bottou, Geneviève B. Orr, and Klaus-Robert Müller

Regularization Techniques to Improve Generalization
Preface
2. Early Stopping – But When?
   Lutz Prechelt
3. A Simple Trick for Estimating the Weight Decay Parameter
   Thorsteinn S. Rögnvaldsson
4. Controlling the Hyperparameter Search in MacKay's Bayesian Neural Network Framework
   Tony Plate
5. Adaptive Regularization in Neural Network Modeling
   Jan Larsen, Claus Svarer, Lars Nonboe Andersen, and Lars Kai Hansen
6. Large Ensemble Averaging
   David Horn, Ury Naftaly, and Nathan Intrator

Improving Network Models and Algorithmic Tricks
Preface
7. Square Unit Augmented, Radially Extended, Multilayer Perceptrons
   Gary William Flake
8. A Dozen Tricks with Multitask Learning
   Rich Caruana
9. Solving the Ill-Conditioning in Neural Network Learning
   Patrick van der Smagt and Gerd Hirzinger
10. Centering Neural Network Gradient Factors
   Nicol N. Schraudolph
11. Avoiding Roundoff Error in Backpropagating Derivatives
   Tony Plate

Representing and Incorporating Prior Knowledge in Neural Network Training
Preface
12. Transformation Invariance in Pattern Recognition – Tangent Distance and Tangent Propagation
   Patrice Y. Simard, Yann A. LeCun, John S. Denker, and Bernard Victorri
13. Combining Neural Networks and Context-Driven Search for Online, Printed Handwriting Recognition in the Newton
   Larry S. Yaeger, Brandyn Webb, and Richard F. Lyon
14. Neural Network Classification and Prior Class Probabilities
   Steve Lawrence, Ian Burns, Andrew Back, Ah Chung Tsoi, and C. Lee Giles
15. Applying Divide and Conquer to Large Scale Pattern Recognition
   Jürgen Fritsch and Michael Finke

Tricks for Time Series
Preface
16. Forecasting the Economy with Neural Nets: A Survey of Challenges and Solutions
   John Moody
17. How to Train Neural Networks
   Ralph Neuneier and Hans-Georg Zimmermann

Big Learning in Deep Neural Networks
Preface
18. Stochastic Gradient Descent Tricks
   Léon Bottou
19. Practical Recommendations for Gradient-Based Training of Deep Architectures
   Yoshua Bengio
20. Training Deep and Recurrent Networks with Hessian-Free Optimization
   James Martens and Ilya Sutskever
21. Implementing Neural Networks Efficiently
   Ronan Collobert, Koray Kavukcuoglu, and Clément Farabet

Better Representations: Invariant, Disentangled and Reusable
Preface
22. Learning Feature Representations with K-Means
   Adam Coates and Andrew Y. Ng
23. Deep Big Multilayer Perceptrons for Digit Recognition
   Dan Claudiu Cireșan, Ueli Meier, Luca Maria Gambardella, and Jürgen Schmidhuber
24. A Practical Guide to Training Restricted Boltzmann Machines
   Geoffrey E. Hinton
25. Deep Boltzmann Machines and the Centering Trick
   Grégoire Montavon and Klaus-Robert Müller
26. Deep Learning via Semisupervised Embedding
   Jason Weston, Frédéric Ratle, and Ronan Collobert

Identifying Dynamical Systems for Forecasting and Control
Preface
27. A Practical Guide to Applying Echo State Networks
   Mantas Lukoševičius
28. Forecasting with Recurrent Neural Networks: 12 Tricks
   Hans-Georg Zimmermann, Christoph Tietz, and Ralph Grothmann
29. Solving Partially Observable Reinforcement Learning Problems with Recurrent Neural Networks
   Siegmund Duell, Steffen Udluft, and Volkmar Sterzing
30. 10 Steps and Some Tricks to Set Up Neural Reinforcement Controllers
   Martin Riedmiller

Author Index
Subject Index

Introduction

It is our belief that researchers and practitioners acquire, through experience and word-of-mouth, techniques and heuristics that help them successfully apply neural networks to difficult real-world problems. Often these "tricks" are theoretically well motivated. Sometimes they are the result of trial and error. However, their most common link is that they are usually hidden in people's heads or in the back pages of space-constrained conference papers. As a result, newcomers to the field waste much time wondering why their networks train so slowly and perform so poorly.

This book is an outgrowth of a 1996 NIPS workshop called Tricks of the Trade whose goal was to begin the process of gathering and documenting these tricks. The interest that the workshop generated motivated us to expand our collection and compile it into this book. Although we have no doubt that there are many tricks we have missed, we hope that what we have included will prove to be useful, particularly to those who are relatively new to the field. Each chapter contains one or more tricks presented by a given author (or authors). We have attempted to group related chapters into sections, though we recognize that the different sections are far from disjoint. Some of the chapters (e.g., 1, 13, 17) contain entire systems of tricks that are far more general than the category they have been placed in.

Before each section we provide the reader with a summary of the tricks contained within, to serve as a quick overview and reference. However, we do not recommend applying tricks before having read the accompanying chapter. Each trick may only work in a particular context that is not fully explained in the summary.
This is particularly true for the chapters that present systems where combinations of tricks must be applied together for them to be effective. Below we give a coarse roadmap of the contents of the individual chapters.

Speeding Learning

The book opens with a chapter based on Léon Bottou and Yann LeCun's popular workshop on efficient backpropagation, where they present a system of tricks for speeding the minimization process. Included are tricks that are very simple to implement as well as more complex ones, e.g., based on second-order methods. Though many of the readers may recognize some of these tricks, we believe that this chapter provides both a thorough explanation of their theoretical basis as well as an understanding of the subtle interactions among them.

This chapter provides an ideal introduction for the reader. It starts with discussing fundamental tricks addressing input representation, initialization, and targets.

Previously published in: Orr, G.B., Müller, K.-R. (eds.): LNCS 1524, ISBN 978-3-540-65311-0 (1998)
G. Montavon et al. (Eds.): NN: Tricks of the Trade, 2nd edn., LNCS 7700, pp. 1-5, 2012.
© Springer-Verlag Berlin Heidelberg 2012
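One of the fundamental input-representation tricks in that opening chapter is to shift each input variable to zero mean and rescale it to unit variance, which conditions the error surface and speeds up gradient descent. A minimal sketch (the function and variable names are ours, not the chapter's):

```python
import numpy as np

def standardize(X, eps=1e-8):
    """Return zero-mean, unit-variance inputs plus the statistics
    needed to apply the same transform at test time."""
    mean = X.mean(axis=0)
    std = X.std(axis=0) + eps      # eps guards against constant columns
    return (X - mean) / std, mean, std

# Usage: badly scaled inputs become well-conditioned features
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))
Xn, mean, std = standardize(X)
# Xn now has (approximately) zero mean and unit variance per column
```

Note that the same `mean` and `std` computed on the training set must be reused on validation and test inputs, otherwise the network sees inputs from a different distribution than it was trained on.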