The Jackknife Estimation Method
Avery I. McIntosh
1 Introduction
Statistical resampling methods have become feasible for parametric estimation, hypothesis testing,
and model validation now that the computer is a ubiquitous tool for statisticians. This essay focuses
on the resampling technique for parametric estimation known as the Jackknife procedure. To outline
the usefulness of the method and its place in the general class of statistical resampling techniques,
I will quickly delineate two similar resampling methods: the bootstrap and the permutation test.
1.1 Other Sampling Methods: The Bootstrap
The bootstrap is a broad class of usually non-parametric resampling methods for estimating the
sampling distribution of an estimator. The method was described in 1979 by Bradley Efron, and
was inspired by the previous success of the Jackknife procedure.
Imagine that a sample of n independent, identically distributed observations from an unknown distribution has been gathered, and a mean of the sample, $\bar{Y}$, has been calculated. To make inferences about the population mean we need to know the variability of the sample mean, which we know from basic statistical theory is $V[\bar{Y}] = V[Y]/n$. Here, since the distribution is unknown, we do not know the value of $V[Y] = \sigma^2$. The central limit theorem (CLT) states that the standardized sample mean converges in distribution to a standard normal $Z$ as the sample size grows large, and we can invoke Slutsky's theorem to demonstrate that the sample standard deviation is an adequate estimator for the standard deviation $\sigma$ when the distribution is unknown. However, for other statistics of interest that do not admit the CLT, and for small sample sizes, the bootstrap is a viable alternative.
Briefly, the bootstrap method specifies that B samples be generated from the data by sampling
with replacement from the original sample, with each sample set identical in size to the original sample (here, n). The larger B is, the closer the set of samples will be to the ideal exact bootstrap sample, whose size is of the order of the number of lattice points in an n-dimensional simplex: $|\mathcal{C}_n| = \binom{2n-1}{n}$. The computation of this number, never mind the actual sample set, is generally infeasible for all but the smallest sample sizes (for example, a sample size of 12 has about 1.3 million with-replacement subsamples). Furthermore, the counts of the original observations in a bootstrap sample follow a multinomial distribution, and the most likely sample is in fact the original sample, hence it is almost certain that there will be random bootstrap samples that are replicates of the original sample. This means that the computation of the exact bootstrap is all but impossible in practice. However, Efron and Tibshirani have argued that in
some instances, as few as 25 bootstrap samples can be large enough to form a reliable estimate.
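As a quick numerical check of the count cited above, the following short Python snippet (my own, not part of the original essay) evaluates the multiset coefficient for n = 12:

```python
from math import comb

# Number of distinct with-replacement samples of size n drawn from n
# observations: the multiset coefficient C(2n - 1, n) discussed above.
n = 12
print(comb(2 * n - 1, n))  # 1352078, i.e. about 1.3 million
```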
The next step in the process is to perform the action that derived the initial statistic (here the mean): we sum each bootstrap sample and divide the total by n, and use those quantities to generate an estimate of the variance of $\bar{Y}$ as follows:

$$\mathrm{SE}(\bar{Y})_B = \left\{ \frac{1}{B} \sum_{b=1}^{B} \left( \bar{Y}_b - \bar{Y} \right)^2 \right\}^{1/2}$$
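To make the resampling procedure concrete, here is a minimal Python sketch of the bootstrap standard error computation above; the function name, the use of NumPy, and the simulated data are illustrative choices of mine rather than part of the original essay.

```python
import numpy as np

def bootstrap_se_of_mean(y, B=1000, rng=None):
    """Bootstrap estimate of the standard error of the sample mean.

    Draws B with-replacement samples of size n from y, computes the mean of
    each, and returns the root mean squared deviation of those bootstrap
    means from the original sample mean, as in the formula above.
    """
    rng = np.random.default_rng(rng)
    y = np.asarray(y, dtype=float)
    n = len(y)
    y_bar = y.mean()
    boot_means = np.array([rng.choice(y, size=n, replace=True).mean()
                           for _ in range(B)])
    return np.sqrt(np.mean((boot_means - y_bar) ** 2))

# Example usage on simulated data; for the mean, the result should be close
# to the textbook estimate y.std(ddof=1) / np.sqrt(len(y)).
rng = np.random.default_rng(42)
y = rng.exponential(scale=2.0, size=30)
print(bootstrap_se_of_mean(y, B=2000, rng=rng))
```

Note that the formula above centers the bootstrap means at the original sample mean $\bar{Y}$; a common variant centers them at the average of the bootstrap replicates instead.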
The empirical distribution function (EDF) used to generate the bootstrap samples can be shown
to be a consistent, unbiased estimator for the actual cumulative distribution function (CDF) from
which the samples were drawn, F. In fact, the bootstrap performs well because it has a faster rate
of convergence than the CLT: $O(1/n)$ vs. $O(1/\sqrt{n})$, as the bootstrap relies on the strong law of
large numbers (SLLN), a more robust condition than the CLT.
1.2 Other Sampling Methods: Permutation
Permutation testing is done in many arenas, and a classical example is that of permuted y’s in
a pair of random vectors (X, Y) to get a correlation coefficient p-value. For an observed sample
$z = \{(X_1, \ldots, X_n), (Y_1, \ldots, Y_n)\}$, the elements of (only) the Y vector are permuted B times. Then for permutation function $\pi(\cdot)$, we have that an individual permutation sample $z_b$ is:

$$z_b = \{(X_1, \ldots, X_n), (Y_{\pi(1)}, \ldots, Y_{\pi(n)})\}$$
The next step is to compute the proportion of permuted samples whose correlation statistic exceeds the original statistic in absolute value, that is, the number of such permutations divided by B. This value is the empirical p-value; it is compared to the chosen significance level (say, α = 0.05, corresponding to the 0.025 and 0.975 percentiles of the permutation distribution for a two-sided test). If B = n! then the test is called exact; if not all of the permutations are performed, then there is an inflated Type I error rate, as we are less
likely to sample those values in the tails of the null distribution, and hence we are less likely to say
that there are values greater in absolute value than our original statistic. This method is entirely
non-parametric, and is usually approximated by Monte Carlo methods for large sample sizes where
the exact permutation generation is computationally impractical.
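As an illustration, the sketch below (my own, assuming NumPy; the function name and the simulated data are not from the original essay) computes a Monte Carlo permutation p-value for the Pearson correlation by permuting only the Y vector:

```python
import numpy as np

def permutation_pvalue(x, y, B=10000, rng=None):
    """Monte Carlo permutation p-value for the Pearson correlation.

    Permutes only the y vector B times and returns the proportion of permuted
    samples whose |correlation| is at least as large as the observed value.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = abs(np.corrcoef(x, y)[0, 1])
    exceed = sum(abs(np.corrcoef(x, rng.permutation(y))[0, 1]) >= observed
                 for _ in range(B))
    return exceed / B

# Example usage on weakly correlated simulated data.
rng = np.random.default_rng(1)
x = rng.normal(size=25)
y = 0.5 * x + rng.normal(size=25)
print(permutation_pvalue(x, y, B=5000, rng=rng))
```

A common refinement adds one to both the numerator and the denominator of the proportion so that the Monte Carlo p-value can never be exactly zero, which helps guard against the inflated Type I error rate discussed above.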
2 The Jackknife: Introduction and Basic Properties
The Jackknife was proposed by M.H. Quenouille in 1949 and later refined and given its current
name by John Tukey in 1956. Quenouille originally developed the method as a procedure for
correcting bias. Later, Tukey described its use in constructing confidence limits for a large class
of estimators. It is similar to the bootstrap in that it involves resampling, but instead of sampling
with replacement, the method samples without replacement.
Many situations arise where it is impractical or even impossible to calculate good estimators or
find those estimators’ standard errors. The situation may be one where there is no theoretical basis
to fall back on, or it may be that in estimating the variance of a difficult function of a statistic,
say $g(\bar{X})$ for some function with no closed-form integral, making use of the usual route of estimation (the delta method theorem) is impossible. In these situations the Jackknife method can
be used to derive an estimate of bias and standard error. Keith Knight has noted, in his book
Mathematical Statistics, that the Jackknife estimate of the standard error is roughly equivalent
to the delta method for large samples.
Definition: The delete-1 Jackknife Samples are selected by taking the original data vector and
deleting one observation from the set. Thus, there are n unique Jackknife samples, and the ith
Jackknife sample vector is defined as:
$$X_{[i]} = \{X_1, X_2, \ldots, X_{i-1}, X_{i+1}, \ldots, X_{n-1}, X_n\}$$
This procedure is generalizable to k deletions, which is discussed further below.
The ith Jackknife Replicate is defined as the value of the estimator s(·) evaluated at the ith
Jackknife sample.
$$\hat{\theta}_{(i)} := s(X_{[i]})$$
The Jackknife Standard Error is defined
$$\mathrm{SE}(\hat{\theta})_{\mathrm{jack}} = \left\{ \frac{n-1}{n} \sum_{i=1}^{n} \left( \hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)} \right)^2 \right\}^{1/2},$$
where $\hat{\theta}_{(\cdot)}$ is the empirical average of the Jackknife replicates:

$$\hat{\theta}_{(\cdot)} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)}$$
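These definitions translate directly into code. The following Python sketch (my own illustration, assuming NumPy) computes the delete-1 Jackknife standard error for an arbitrary estimator s(·):

```python
import numpy as np

def jackknife_se(x, estimator):
    """Delete-1 Jackknife standard error of estimator(x).

    Forms the n Jackknife replicates theta_(i) = estimator(x with the ith
    observation deleted), their average theta_(.), and returns
    sqrt(((n - 1) / n) * sum_i (theta_(i) - theta_(.))**2).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    replicates = np.array([estimator(np.delete(x, i)) for i in range(n)])
    theta_dot = replicates.mean()
    return np.sqrt((n - 1) / n * np.sum((replicates - theta_dot) ** 2))

# Example usage: a smooth function of the sample mean, g(x_bar) = exp(x_bar),
# of the sort mentioned in the motivation above, where the delta method would
# otherwise be the usual route.
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=0.5, size=30)
print(jackknife_se(x, lambda v: np.exp(v.mean())))
```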
The (n − 1)/n factor in the formula above makes it look similar to the formula for the standard error of the sample mean, except that the quantity (n − 1) appears in the numerator rather than the denominator. As motivation
for this estimator, I consider the case that does not actually need any resampling methods: that of
the sample mean. Here, the Jackknife estimator above is an unbiased estimator of the variance of
the sample mean.
To demonstrate this claim, I need to show that
$$\frac{n-1}{n} \sum_{i=1}^{n} \left( \hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)} \right)^2 = \frac{1}{n(n-1)} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
I note that here the Jackknife replicates in the inner squared term on the left simplify as follows:
$$\left( \hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)} \right) = \frac{n\bar{x} - x_i}{n-1} - \frac{1}{n}\sum_{i=1}^{n} \bar{x}_{(i)} = \frac{1}{n-1}\left( n\bar{x} - x_i - \frac{1}{n}\sum_{i=1}^{n} \left( n\bar{x} - x_i \right) \right) = \frac{1}{n-1}\left( \bar{x} - x_i \right)$$
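Squaring this simplified term and substituting it into the left-hand side of the claim makes the equality explicit:

$$\frac{n-1}{n}\sum_{i=1}^{n}\left( \hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)} \right)^2 = \frac{n-1}{n}\sum_{i=1}^{n}\frac{(\bar{x} - x_i)^2}{(n-1)^2} = \frac{1}{n(n-1)}\sum_{i=1}^{n}(x_i - \bar{x})^2$$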
With the term squared, the equation is complete and identically equal to the right-hand term above. Thus, in the case of the sample mean, the Jackknife estimate of the standard error reduces to the usual unbiased estimate of the standard error of the sample mean.
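The reduction can also be checked numerically. The short sketch below (my own, assuming NumPy and arbitrary simulated data) confirms that the Jackknife standard error of the sample mean coincides with the usual estimate $s/\sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=15)
n = len(x)

# Delete-1 Jackknife replicates of the sample mean and their average.
replicates = np.array([np.delete(x, i).mean() for i in range(n)])
theta_dot = replicates.mean()

jack_se = np.sqrt((n - 1) / n * np.sum((replicates - theta_dot) ** 2))
usual_se = x.std(ddof=1) / np.sqrt(n)

print(jack_se, usual_se)  # identical up to floating-point rounding
```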