Generalized Boosted Models:
A guide to the gbm package
Greg Ridgeway
August 3, 2007
Boosting takes on various forms, with different programs using different loss functions, different base models, and different optimization schemes. The gbm package takes the approach described in [2] and [3]. Some of the terminology differs, mostly due to an effort to cast boosting terms into more standard statistical terminology (e.g. deviance). In addition, the gbm package implements boosting for models commonly used in statistics but not commonly associated with boosting. The Cox proportional hazards model, for example, is an incredibly useful model and the boosting framework applies quite readily with only slight modification [5]. Also, some algorithms implemented in the gbm package differ from the standard implementation. The AdaBoost algorithm [1] has a particular loss function and a particular optimization algorithm associated with it. The gbm implementation of AdaBoost adopts AdaBoost's exponential loss function (its bound on the misclassification rate) but uses Friedman's gradient descent algorithm rather than the one originally proposed. So the main purpose of this document is to spell out in detail what the gbm package implements.
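For concreteness, the exponential loss referred to above can be written, in its textbook form with the outcome coded y ∈ {−1, +1}, as

$$\Psi(y, f(x)) = \exp\{-y f(x)\} \;\ge\; \mathbf{1}\{y f(x) \le 0\},$$

so minimizing it drives down an upper bound on the misclassification rate. (The gbm package works with 0/1 outcomes, so the expression it uses internally differs from this one only by a relabeling of the outcome.)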
1 Gradient boosting
This section essentially presents the derivation of boosting described in [2]. The
gbm package also adopts the stochastic gradient boosting strategy, a small but
important tweak on the basic algorithm, described in [3].
1.1 Friedman’s gradient boosting machine
Friedman (2001) and the companion paper Friedman (2002) extended the work
of Friedman, Hastie, and Tibshirani (2000) and laid the groundwork for a new
generation of boosting algorithms. Using the connection between boosting and
optimization, this new work proposes the Gradient Boosting Machine.
In any function estimation problem we wish to find a regression function, f̂(x), that minimizes the expectation of some loss function, Ψ(y, f), as shown in (4).

$$\hat{f}(x) = \arg\min_{f(x)} E_{y,x}\,\Psi(y, f(x)) = \arg\min_{f(x)} E_x\!\left[\,E_{y|x}\,\Psi(y, f(x)) \;\big|\; x\,\right] \qquad (4)$$

Figure 1: Friedman's Gradient Boost algorithm

Initialize f̂(x) to be a constant, $\hat{f}(x) = \arg\min_{\rho} \sum_{i=1}^{N} \Psi(y_i, \rho)$.
For t in 1, . . . , T do

1. Compute the negative gradient as the working response
$$z_i = -\frac{\partial}{\partial f(x_i)} \Psi(y_i, f(x_i)) \,\bigg|_{f(x_i) = \hat{f}(x_i)} \qquad (1)$$

2. Fit a regression model, g(x), predicting $z_i$ from the covariates $x_i$.

3. Choose a gradient descent step size as
$$\rho = \arg\min_{\rho} \sum_{i=1}^{N} \Psi\!\left(y_i, \hat{f}(x_i) + \rho\, g(x_i)\right) \qquad (2)$$

4. Update the estimate of f(x) as
$$\hat{f}(x) \leftarrow \hat{f}(x) + \rho\, g(x) \qquad (3)$$
We will focus on finding estimates of f(x) such that
$$\hat{f}(x) = \arg\min_{f(x)} E_{y|x}\left[\Psi(y, f(x)) \mid x\right] \qquad (5)$$
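As a concrete instance of (5), taking Ψ to be squared-error loss gives the familiar regression target, since the pointwise minimizer is the conditional mean:
$$\arg\min_{f(x)} E_{y|x}\left[(y - f(x))^2 \mid x\right] = E(y \mid x).$$
Other losses estimate other functionals of the conditional distribution; absolute loss, for example, leads to the conditional median.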
Parametric regression models assume that f(x) is a function with a finite number of parameters, β, and estimate them by selecting those values that minimize a loss function (e.g. squared-error loss) over a training sample of N observations on (y, x) pairs as in (6).
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{N} \Psi(y_i, f(x_i; \beta)) \qquad (6)$$
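As an example of (6), squared-error loss with a linear predictor f(x; β) = x′β reduces to ordinary least squares, with the closed-form solution
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{N} \left(y_i - x_i^{\top}\beta\right)^2 = (X^{\top}X)^{-1} X^{\top} y,$$
where X is the N × p matrix whose rows are the $x_i^{\top}$ (assuming X has full column rank).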
When we wish to estimate f(x) non-parametrically the task becomes more difficult. Again we can proceed similarly to [4] and modify our current estimate of f(x) by adding a new function f(x) in a greedy fashion. Letting $f_i = f(x_i)$, we see that we want to decrease the N-dimensional function
$$J(f) = \sum_{i=1}^{N} \Psi(y_i, f(x_i)) = \sum_{i=1}^{N} \Psi(y_i, f_i). \qquad (7)$$
The negative gradient of J(f) indicates the direction of the locally greatest decrease in J(f). Gradient descent would then have us modify f as
$$\hat{f} \leftarrow \hat{f} - \rho \nabla J(f) \qquad (8)$$
where ρ is the size of the step along the direction of greatest descent. Clearly, this step alone is far from our desired goal. First, it only fits f at values of x for which we have observations. Second, it does not take into account that observations with similar x are likely to have similar values of f(x). Both of these problems would have disastrous effects on generalization error. However, Friedman suggests selecting a class of functions that use the covariate information to approximate the gradient, usually a regression tree. This line of reasoning produces his Gradient Boosting algorithm shown in Figure 1. At each iteration the algorithm determines the direction, the gradient, in which it needs to improve the fit to the data, and selects a particular model from the allowable class of functions that is most in agreement with that direction. In the case of squared-error loss, $\Psi(y_i, f(x_i)) = (y_i - f(x_i))^2$, the working response in (1) is proportional to the current residual $y_i - \hat{f}(x_i)$, so this algorithm corresponds exactly to residual fitting.
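To make the residual-fitting interpretation concrete, the following is a minimal sketch of the algorithm in Figure 1 for squared-error loss, using a small rpart regression tree as the base learner g(x). It is an illustration of the generic procedure only, not the gbm package's implementation; the function and argument names (gradient_boost_sketch, n.trees, shrinkage) are chosen here purely for exposition.

```r
# Minimal sketch of Figure 1 for squared-error loss Psi(y, f) = (y - f)^2 / 2,
# so that the working response z_i is the current residual y_i - f(x_i).
# Illustrative only; not the gbm package's implementation.
library(rpart)

gradient_boost_sketch <- function(x, y, n.trees = 100, shrinkage = 0.1, depth = 2) {
  train <- data.frame(x)
  f <- rep(mean(y), length(y))   # initial constant: arg min_rho sum_i Psi(y_i, rho)
  trees <- vector("list", n.trees)
  for (t in seq_len(n.trees)) {
    train$z <- y - f             # step 1: negative gradient = residuals
    g <- rpart(z ~ ., data = train,              # step 2: fit g(x) to z_i
               control = rpart.control(maxdepth = depth, cp = 0))
    # steps 3-4: for squared error and a tree that predicts node means of z,
    # the line search in (2) gives rho = 1; shrinkage scales the update
    # (a standard regularization, not part of Figure 1 itself)
    f <- f + shrinkage * predict(g, train)
    trees[[t]] <- g
  }
  list(init = mean(y), trees = trees, shrinkage = shrinkage)
}

# Example use on simulated data
set.seed(1)
x <- data.frame(x1 = runif(200), x2 = runif(200))
y <- sin(2 * pi * x$x1) + x$x2 + rnorm(200, sd = 0.1)
fit <- gradient_boost_sketch(x, y, n.trees = 200)
```

Predictions for new data would accumulate the initial constant plus shrinkage times the sum of the tree predictions, mirroring the update in (3).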
There are various ways to extend and improve upon the basic framework
suggested in Figure 1. For example, Friedman (2001) substituted several choices