Some Notes from the Book:
Statistical Learning from a Regression Perspective
by Richard A. Berk
John L. Weatherwax∗
September 12, 200 1
Introduction
Here you’ll find some notes that I wrote up as I worked through this excellent book. I’ve
worked hard to make these notes as good as I can, but I have no illusions that they are perfect.
If you feel that there is a better way to accomplish or explain an exercise or derivation
presented in these notes, or that one or more of the explanations is unclear, incomplete,
or misleading, please tell me. If you find an error of any kind – technical, grammatical,
typographical, whatever – please tell me that, too. I’ll gladly add to the acknowledgments
in later printings the name of the first person to bring each problem to my attention.
All comments (no matter how small) are much appreciated. In fact, if you find these notes
useful I would appreciate a contribution in the form of a solution to a problem that is not yet
worked in these notes. Sort of a “take a penny, leave a penny” type of approach. Remember:
pay it forward.
∗wax@alum.mit.edu
Statistical Learning as a Regression Problem
Problem Solutions
Problem 1 (the airquality dataset)
See the R script chap_1_prob_1.R.
Part (1): When we use the pairs command we get the plot shown in Figure 1. In reading
a plot like this it is helpful to note that the y axis scale in each plot is determined by the
variable named in the same horizontal row, while the x axis variable is the variable named
in the same vertical column. Thus the scatter plot in the (1, 2) location of the grid is a plot of
Ozone considered as a function of Solar.R, and the scatter plot in the (3, 4) location
is a plot of Wind as a function of Temp. Plots like this enable one to quickly view
how two variables change in relation to each other. The red curve is a non-parametric
“smoothing” of the data that can give a quick understanding of how the two variables
depend on each other. For example, from the (1, 3) plot we can see that Ozone decreases
as Wind increases, and from the (1, 4) plot we see that Ozone increases as Temp increases.
Comparing the “transpose” plots, i.e. (1, 3) and (3, 1), can suggest which variable should
be the response and which should be the explanatory variable. For example, in the (3, 1)
plot Wind looks like almost a linear function of Ozone, while in the (1, 3) plot Ozone does
not look like a linear function of Wind.
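A plot of this kind can be reproduced with a short sketch like the one below. The script chap_1_prob_1.R is not shown in these notes, so the exact call is an assumption: here panel.smooth supplies the lowess smoothing curve in each panel, and the two correlations are an added numerical check (not part of the original notes) on the directions read off the (1, 3) and (1, 4) panels.

```r
# Minimal sketch, assuming the four continuous columns of the built-in
# airquality data frame are the ones plotted in Figure 1.
aq <- datasets::airquality
vars <- c("Ozone", "Solar.R", "Wind", "Temp")

# Row variable gives the y axis, column variable gives the x axis.
# panel.smooth overlays a lowess curve on each scatter plot.
pairs(aq[, vars], panel = panel.smooth, main = "airquality data")

# Added check (an assumption of this sketch, not in the original script):
# the sign of each correlation should match the trend seen in the plot.
cor_ozone_wind <- cor(aq$Ozone, aq$Wind, use = "complete.obs")
cor_ozone_temp <- cor(aq$Ozone, aq$Temp, use = "complete.obs")
print(c(Wind = cor_ozone_wind, Temp = cor_ozone_temp))
```

A correlation only summarizes the linear part of each relationship, so it complements the smoothing curves; it does not replace them.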
Part (3): Using boxplot to plot Ozone as a function of the categorical variable Month we
get the plot shown in Figure 2 (left). Plotting Ozone as a function of Day we get the plot
shown in Figure 2 (right). There is a clear pattern in that Ozone concentration seems to peak
during the months of July and August. There is also a much larger range of possible values
during these two months. There does not seem to be much of a pattern in the behaviour of
Ozone as a function of Day. To use these variables in the scatterplots from Part (1) earlier
we would have to specify the set of months or days to study in the scatterplots.
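The two boxplots can be sketched as follows. The axis labels are taken from Figure 2; the per-month medians and interquartile ranges are an added summary (an assumption of this sketch, not something the original script is known to compute) that puts numbers on the July and August peak.

```r
# Minimal sketch of the Month and Day boxplots for the built-in airquality data.
aq <- datasets::airquality

old_par <- par(mfrow = c(1, 2))   # two plots side by side, as in Figure 2
boxplot(Ozone ~ Month, data = aq,
        xlab = "month index; 5-9 = May-Sept", ylab = "ozone (ppb)")
boxplot(Ozone ~ Day, data = aq,
        xlab = "day index", ylab = "ozone (ppb)")
par(old_par)                      # restore the previous layout

# Added numerical summary: centre and spread of Ozone within each month.
monthly_median <- tapply(aq$Ozone, aq$Month, median, na.rm = TRUE)
monthly_iqr <- tapply(aq$Ozone, aq$Month, IQR, na.rm = TRUE)
print(rbind(median = monthly_median, IQR = monthly_iqr))
```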
Part (5): When we use the cloud command we get the plot shown in Figure 3. We can see
that Ozone increases as Temp increases and Wind decreases.
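A three-dimensional scatter plot like this can be sketched with cloud from the lattice package. The formula below is an assumption, since the original script is not shown. The linear fit that follows is also an addition of this sketch: it is only a rough numerical check on the directions read off the cloud plot, not a model proposed in the notes.

```r
# Minimal sketch, assuming lattice::cloud is the "cloud command" referred to.
library(lattice)
aq <- datasets::airquality

# Ozone on the vertical axis, Wind and Temp on the two horizontal axes.
p <- cloud(Ozone ~ Wind * Temp, data = aq,
           main = "Ozone as a function of Wind and Temp")
print(p)   # lattice plots must be printed explicitly when run from a script

# Added check: the signs of the coefficients in an additive linear fit.
fit <- lm(Ozone ~ Temp + Wind, data = aq)
print(coef(fit))
```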
Part (6): When we use the coplot command we get the plot shown in Figure 4. In that
plot it looks like the way that Wind is kept constant is to break it up into ordered bins and
consider the samples that fall in each bin. From the given plot it looks like, when Wind
is held constant, the general trend is for Ozone to increase with Temp.
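The binning described above can be made explicit with co.intervals, which is the function coplot uses by default to build its conditioning intervals. The sketch below is an assumption about how such a plot might be produced, not the original script; the values number = 6 and overlap = 0.5 are the documented defaults, so the bins overlap with their neighbours.

```r
# Minimal sketch of a conditioning plot of Ozone against Temp given Wind.
# Rows with missing values are dropped first to keep the example simple.
aq <- na.omit(datasets::airquality[, c("Ozone", "Wind", "Temp")])

# Each row of wind_bins is one interval (lower, upper) of Wind values.
wind_bins <- co.intervals(aq$Wind, number = 6, overlap = 0.5)
print(wind_bins)

# One panel per Wind interval; panel.smooth adds a lowess curve to each.
coplot(Ozone ~ Temp | Wind, data = aq,
       given.values = wind_bins, panel = panel.smooth)
```

Passing the intervals through given.values keeps the printed bins and the plotted panels in agreement.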
Problem 2 (complexity of the fitting function)
See the R script chap_1_prob_2.R. When that script is run we get the result shown in Figure 5.
Figure 1: A “pairs” plot of the data in the airquality dataset (panels: Ozone, Solar.R, Wind, Temp).
Figure 2: Left: Using the boxplot command to plot Ozone (ppb) as a function of the month (month index 5−9 = May−Sept).
Right: Using the boxplot command to plot Ozone (ppb) as a function of the day in the month (day index).
Figure 3: A “cloud” plot of Ozone as a function of Wind and Temp.