Hsuan-Tien Lin (林轩田): Learning from Data

2015-03-29 22:18:47, 26.43 MB, PDF

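A very good machine learning textbook. Paired with Hsuan-Tien Lin's Coursera courses, Machine Learning Foundations and Machine Learning Techniques, it is even more rewarding.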
LEARNING FROM DATA
A SHORT COURSE

Yaser S. Abu-Mostafa, California Institute of Technology
Malik Magdon-Ismail, Rensselaer Polytechnic Institute
Hsuan-Tien Lin, National Taiwan University

AMLbook.com

Yaser S. Abu-Mostafa, Departments of Electrical Engineering and Computer Science, California Institute of Technology, Pasadena, CA 91125, USA, yaser@caltech.edu
Malik Magdon-Ismail, Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA, magdon@cs.rpi.edu
Hsuan-Tien Lin, Department of Computer Science and Information Engineering, Taipei, 106, Taiwan, htlin@csie.ntu.edu.tw

ISBN 10: 1-60049-006-9
ISBN 13: 978-1-60049-006-4

©2012 Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin
1.10

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the authors. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means--electronic, mechanical, photocopying, scanning, or otherwise--without prior written permission of the authors, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act.

Limit of Liability/Disclaimer of Warranty: While the authors have used their best efforts in preparing this book, they make no representation or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate.
The author shall not be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

The use in this publication of tradenames, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

This book was typeset by the authors and was printed and bound in the United States of America.

To our teachers, and to our students.

Preface

This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title 'learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover.

Learning from data has distinct theoretical and practical tracks. If you read two books that focus on one track or the other, you may feel that you are reading about two different subjects altogether. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. Strengths and weaknesses of the different parts are spelled out. Our philosophy is to say it like it is: what we know, what we don't know, and what we partially know.

The book can be taught in exactly the order it is presented.
The notable exception may be Chapter 2, which is the most theoretical chapter of the book. The theory of generalization that this chapter covers is central to learning from data, and we made an effort to make it accessible to a wide readership. However, instructors who are more interested in the practical side may skim over it, or delay it until after the practical methods of Chapter 3 are taught.

You will notice that we included exercises (in gray boxes) throughout the text. The main purpose of these exercises is to engage the reader and enhance understanding of a particular topic being covered. Our reason for separating the exercises out is that they are not crucial to the logical flow. Nevertheless, they contain useful information, and we strongly encourage you to read them, even if you don't do them to completion. Instructors may find some of the exercises appropriate as 'easy' homework problems, and we also provide additional problems of varying difficulty in the Problems section at the end of each chapter.

To help instructors with preparing their lectures based on the book, we provide supporting material on the book's website (AMLbook.com). There is also a forum that covers additional topics in learning from data. We will discuss these further in the epilogue of this book.

Acknowledgment (in alphabetical order for each group): We would like to express our gratitude to the alumni of our Learning Systems Group at Caltech who gave us detailed expert feedback: Zehra Cataltepe, Ling Li, Amrit Pratap, and Joseph Sill. We thank the many students and colleagues who gave us useful feedback during the development of this book, especially Chun-Wei Liu. The Caltech Library staff, especially Kristin Buxton and David McCaslin, have given us excellent advice and help in our self-publishing effort.
We also thank Lucinda Acosta for her help throughout the writing of this book.

Last, but not least, we would like to thank our families for their encouragement, their support, and most of all their patience as they endured the time demands that writing a book has imposed on us.

Yaser S. Abu-Mostafa, Pasadena, California
Malik Magdon-Ismail, Troy, New York
Hsuan-Tien Lin, Taipei, Taiwan
March, 2012

Contents

Preface

1 The Learning Problem
  1.1 Problem Setup
    1.1.1 Components of Learning
    1.1.2 A Simple Learning Model
    1.1.3 Learning versus Design
  1.2 Types of Learning
    1.2.1 Supervised Learning
    1.2.2 Reinforcement Learning
    1.2.3 Unsupervised Learning
    1.2.4 Other Views of Learning
  1.3 Is Learning Feasible?
    1.3.1 Outside the Data Set
    1.3.2 Probability to the Rescue
    1.3.3 Feasibility of Learning
  1.4 Error and Noise
    1.4.1 Error Measures
    1.4.2 Noisy Targets
  1.5 Problems

2 Training versus Testing
  2.1 Theory of Generalization
    2.1.1 Effective Number of Hypotheses
    2.1.2 Bounding the Growth Function
    2.1.3 The VC Dimension
    2.1.4 The VC Generalization Bound
  2.2 Interpreting the Generalization Bound
    2.2.1 Sample Complexity
    2.2.2 Penalty for Model Complexity
    2.2.3 The Test Set
    2.2.4 Other Target Types
  2.3 Approximation-Generalization Tradeoff
    2.3.1 Bias and Variance
    2.3.2 The Learning Curve
  2.4 Problems

3 The Linear Model
  3.1 Linear Classification
    3.1.1 Non-Separable Data
  3.2 Linear Regression
    3.2.1 The Algorithm
    3.2.2 Generalization Issues
  3.3 Logistic Regression
    3.3.1 Predicting a Probability
    3.3.2 Gradient Descent
  3.4 Nonlinear Transformation
    3.4.1 The Z Space
    3.4.2 Computation and Generalization

4 Overfitting
  4.1 When Does Overfitting Occur?
    4.1.1 A Case Study: Overfitting with Polynomials
    4.1.2 Catalysts for Overfitting
  4.2 Regularization
    4.2.1 A Soft Order Constraint
    4.2.2 Weight Decay and Augmented Error
    4.2.3 Choosing a Regularizer: Pill or Poison?
  4.3 Validation
    4.3.1 The Validation Set
    4.3.2 Model Selection
    4.3.3 Cross Validation
    4.3.4 Theory Versus Practice
  4.4 Problems

5 Three Learning Principles
  5.1 Occam's Razor
  5.2 Sampling Bias
  5.3 Data Snooping
  5.4 Problems

Epilogue

Further Reading

xiatian6032: Professor Hsuan-Tien Lin's machine learning videos are well worth watching.
2018-10-22
wuxinyu981: A very good resource, well suited to beginners.
2018-09-29
redpig_weng: A reliable resource, suitable for beginners.
2018-05-06
Mary_ML1: A very good resource, well suited to beginners.
2018-04-24
田小二: Very good. Read it alongside the open course.
2017-09-28
yshlnhn: Quite a good book. I need to read it carefully.
2017-09-27
georgejin2010: Very good, suitable for getting started.
2017-08-22
sue1385: Very good, upvoted. But isn't there a second half as well?
2017-03-07
njcitchenrh: Great material! Studying it together with the MOOC course works really well!
2017-02-15
jxwb088047: Thanks for sharing, great.
2016-11-01