An Introduction to Statistical Learning
To our parents:
Alison and Michael James
Chiara Nappi and Edward Witten
Valerie and Patrick Hastie
Vera and Sami Tibshirani
John and Brenda Taylor
and to our families:
Michael, Daniel, and Catherine
Tessa, Theo, Otto, and Ari
Samantha, Timothy, and Lynda
Charlie, Ryan, Julie, and Cheryl
Lee-Ann and Isobel
Preface
Statistical learning refers to a set of tools for making sense of complex
datasets. In recent years, we have seen a staggering increase in the scale and
scope of data collection across virtually all areas of science and industry.
As a result, statistical learning has become a critical toolkit for anyone who
wishes to understand data — and as more and more of today’s jobs involve
data, this means that statistical learning is fast becoming a critical toolkit
for everyone.
One of the first books on statistical learning, The Elements of Statistical
Learning (ESL, by Hastie, Tibshirani, and Friedman), was published
in 2001, with a second edition in 2009. ESL has become a popular text not
only in statistics but also in related fields. One of the reasons for ESL’s
popularity is its relatively accessible style. But ESL is best suited for
individuals with advanced training in the mathematical sciences.
An Introduction to Statistical Learning, With Applications in R (ISLR),
first published in 2013 with a second edition in 2021, arose from
the clear need for a broader and less technical treatment of the key topics
in statistical learning. In addition to a review of linear regression, ISLR
covers many of today’s most important statistical and machine learning
approaches, including resampling, sparse methods for classification and
regression, generalized additive models, tree-based methods, support vector
machines, deep learning, survival analysis, clustering, and multiple testing.
Since it was published in 2013, ISLR has become a mainstay of undergraduate
and graduate classrooms worldwide, as well as an important
reference book for data scientists. One of the keys to its success has been
that, beginning with Chapter 2, each chapter contains an R lab illustrating
how to implement the statistical learning methods seen in that chapter,
providing the reader with valuable hands-on experience.
However, in recent years Python has become an increasingly popular language
for data science, and there has been increasing demand for a Python-based
alternative to ISLR. Hence, this book, An Introduction to Statistical
Learning, With Applications in Python (ISLP), covers the same materials
as ISLR but with labs implemented in Python, a feat accomplished by the
addition of a new co-author, Jonathan Taylor. Several of the labs make use
of the ISLP Python package, which we have written to facilitate carrying out
the statistical learning methods covered in each chapter in Python. These
labs will be useful both for Python novices and for experienced users.
The intention behind ISLP (and ISLR) is to concentrate more on the
applications of the methods and less on the mathematical details, so it is
appropriate for advanced undergraduates or master’s students in statistics
or related quantitative fields, or for individuals in other disciplines who
wish to use statistical learning tools to analyze their data. It can be used
as a textbook for a course spanning two semesters.
We are grateful to these readers for providing valuable comments on the
first edition of ISLR: Pallavi Basu, Alexandra Chouldechova, Patrick Danaher,
Will Fithian, Luella Fu, Sam Gross, Max Grazier G’Sell, Courtney
Paulson, Xinghao Qiao, Elisa Sheng, Noah Simon, Kean Ming Tan, Xin Lu
Tan. We thank these readers for helpful input on the second edition of ISLR:
Alan Agresti, Iain Carmichael, Yiqun Chen, Erin Craig, Daisy Ding, Lucy
Gao, Ismael Lemhadri, Bryan Martin, Anna Neufeld, Geoff Tims, Carsten
Voelkmann, Steve Yadlowsky, and James Zou. We are immensely grateful
to Balasubramanian “Naras” Narasimhan for his assistance on both ISLR
and ISLP.
It has been an honor and a privilege for us to see the considerable impact
that ISLR has had on the way in which statistical learning is practiced, both
in and out of the academic setting. We hope that this new Python edition
will continue to give today’s and tomorrow’s applied statisticians and data
scientists the tools they need for success in a data-driven world.
It’s tough to make predictions, especially about the future.
-Yogi Berra
Contents
Preface
1 Introduction
2 Statistical Learning
2.1 What Is Statistical Learning?
2.1.1 Why Estimate f?
2.1.2 How Do We Estimate f?
2.1.3 The Trade-Off Between Prediction Accuracy and Model Interpretability
2.1.4 Supervised Versus Unsupervised Learning
2.1.5 Regression Versus Classification Problems
2.2 Assessing Model Accuracy
2.2.1 Measuring the Quality of Fit
2.2.2 The Bias-Variance Trade-Off
2.2.3 The Classification Setting
2.3 Lab: Introduction to Python
2.3.1 Getting Started
2.3.2 Basic Commands
2.3.3 Introduction to Numerical Python
2.3.4 Graphics
2.3.5 Sequences and Slice Notation
2.3.6 Indexing Data
2.3.7 Loading Data
2.3.8 For Loops
2.3.9 Additional Graphical and Numerical Summaries
2.4 Exercises
3 Linear Regression
3.1 Simple Linear Regression
3.1.1 Estimating the Coefficients
3.1.2 Assessing the Accuracy of the Coefficient Estimates
3.1.3 Assessing the Accuracy of the Model
3.2 Multiple Linear Regression
3.2.1 Estimating the Regression Coefficients