# BartPy
[![Build Status](https://travis-ci.org/JakeColtman/bartpy.svg?branch=master)](https://travis-ci.org/JakeColtman/bartpy)
### Introduction
BartPy is a pure Python implementation of the Bayesian additive regression trees (BART) model of Chipman et al. [1].
### Reasons to use BART
* Much less parameter optimization required than for gradient boosted trees (GBT)
* Provides confidence intervals in addition to point estimates
* Extremely flexible through use of priors and embedding in bigger models
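The interval point above follows from BART being fitted by MCMC: each prediction comes with a set of posterior draws rather than a single number. As a minimal sketch, assuming you already hold such draws for one prediction (the `draws` list below is simulated and hypothetical, not produced by BartPy), a point estimate and a 95% interval can be read off like this:

```python
import random
import statistics

# Hypothetical stand-in for posterior draws of a single prediction.
# In practice these would come from the fitted model's samples.
rng = random.Random(0)
draws = [rng.gauss(10.0, 2.0) for _ in range(4000)]

# Point estimate: the mean of the draws.
point_estimate = statistics.fmean(draws)

# quantiles(n=40) returns 39 cut points in 2.5% steps,
# so the first is the 2.5% quantile and the last is the 97.5% quantile.
cuts = statistics.quantiles(draws, n=40)
lower, upper = cuts[0], cuts[-1]

print(f"estimate={point_estimate:.2f}, 95% interval=({lower:.2f}, {upper:.2f})")
```

The same calculation applies per row when you have one column of draws for each prediction.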
### Reasons to use the library
* Can be plugged into existing sklearn workflows
* Everything is done in pure Python, allowing for easy inspection of model runs
* Designed to be extremely easy to modify and extend
### Trade-offs
* Speed - BartPy is significantly slower than other BART libraries
* Memory - BartPy uses a lot of caching compared to other approaches
* Instability - the library is still under construction
### How to use
There are two main APIs for BartPy:
1. High level sklearn API
2. Low level access for implementing custom conditions
If possible, it is recommended to use the sklearn API until you reach something that can't be implemented that way. The API is easier, shared with other models in the ecosystem, and allows simpler porting to other models.
#### Sklearn API
The high level API works as you would expect:
```python
from bartpy.sklearnmodel import SklearnModel
# X, y and X_test are your own feature and target arrays
model = SklearnModel() # Use default parameters
model.fit(X, y) # Fit the model
predictions = model.predict() # Make predictions on the train set
out_of_sample_predictions = model.predict(X_test) # Make predictions on new data
```
The model object can be used with all of the standard sklearn tools, e.g. cross validation and grid search:
```python
from sklearn.model_selection import cross_validate
from bartpy.sklearnmodel import SklearnModel
model = SklearnModel() # Use default parameters
cv_results = cross_validate(model, X, y) # X, y are your features and targets
```
##### Extensions
BartPy offers a number of convenience extensions to base BART. The most prominent of these is using BART to predict the residuals of a base model.
It is most natural to use a linear model as the base, but any sklearn compatible model can be used:
```python
from sklearn.linear_model import LinearRegression
from bartpy.extensions.baseestimator import ResidualBART
model = ResidualBART(base_estimator=LinearRegression())
model.fit(X, y)
```
A nice feature of this is that we can combine the interpretability of a linear model with the power of a tree model.
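To make the residual idea concrete, here is a dependency-free sketch of the two-stage fit. It is not BartPy code: a least-squares line stands in for the base estimator and a one-split regression stump stands in for BART, and all function names are hypothetical. The point is only the structure: fit the base model, fit the second model to what is left over, and add the two predictions.

```python
# Stand-ins only: a least-squares line as the base model and a one-split
# regression stump in place of BART. Neither is BartPy code.

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_stump(xs, targets):
    """Pick the single split on x that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [t for x, t in zip(xs, targets) if x < threshold]
        right = [t for x, t in zip(xs, targets) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def predict(x, line, stump):
    """Combined prediction: base model plus the residual model."""
    intercept, slope = line
    threshold, left_mean, right_mean = stump
    return intercept + slope * x + (left_mean if x < threshold else right_mean)

# Toy data: a linear trend plus a step of height 3 for x >= 5.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + (3.0 if x >= 5 else 0.0) for x in xs]

line = fit_line(xs, ys)  # step 1: interpretable base model
residuals = [y - (line[0] + line[1] * x) for x, y in zip(xs, ys)]
stump = fit_stump(xs, residuals)  # step 2: tree model on the residuals

def squared_error(preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds))

base_error = squared_error([line[0] + line[1] * x for x in xs])
combined_error = squared_error([predict(x, line, stump) for x in xs])
print(base_error, combined_error)
```

The linear coefficients stay directly readable, while the second stage picks up structure the line cannot express (here, the step at `x = 5`).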
#### Lower level API
BartPy is designed to expose all of its internals, so that it can be extended and modified. In particular, using the lower level API it is possible to:
* Customize the set of possible tree operations (prune and grow by default)
* Control the order of sampling steps within a single Gibbs update
* Extend the model to include additional sampling steps
Some care is recommended when working with these types of changes. Over time the process of changing them will become easier, but today they are somewhat complex.
If all you want to customize are things like priors and number of trees, it is much easier to use the sklearn API.
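As an illustration of what "controlling the order of sampling steps" and "adding a sampling step" mean in a Gibbs sampler, here is a toy sketch on a much simpler model (a normal distribution with unknown mean and variance). Every name in it is hypothetical and none of it is BartPy's API; it only shows the pattern of treating one Gibbs update as an ordered list of steps.

```python
import random

# Hypothetical names throughout; this is not BartPy's sampler API.
data = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 4.7, 5.4]
n = len(data)

def sample_mu(state, rng):
    """Draw mu | sigma2, data under a flat prior on mu."""
    mean = sum(data) / n
    state["mu"] = rng.gauss(mean, (state["sigma2"] / n) ** 0.5)

def sample_sigma2(state, rng):
    """Draw sigma2 | mu, data under an inverse-gamma(a, b) prior."""
    a, b = 2.0, 1.0
    sse = sum((y - state["mu"]) ** 2 for y in data)
    shape = a + n / 2
    rate = b + sse / 2
    # If G ~ Gamma(shape, scale=1/rate), then 1/G ~ InverseGamma(shape, rate).
    state["sigma2"] = 1.0 / rng.gammavariate(shape, 1.0 / rate)

def record_trace(state, rng):
    """An extra, user-added step: keep a trace of mu."""
    state["trace"].append(state["mu"])

def run(schedule, n_updates, seed=0):
    rng = random.Random(seed)
    state = {"mu": 0.0, "sigma2": 1.0, "trace": []}
    for _ in range(n_updates):
        for step in schedule:  # one Gibbs update = every step, in order
            step(state, rng)
    return state

# Reordering or extending this list changes the update without touching run().
schedule = [sample_mu, sample_sigma2, record_trace]
state = run(schedule, n_updates=2000)

kept = state["trace"][500:]  # discard the first 500 updates as burn-in
posterior_mean_mu = sum(kept) / len(kept)
print(posterior_mean_mu)
```

In BART the steps would be tree mutations, leaf value draws and the noise variance draw rather than `mu` and `sigma2`, but the schedule idea is the same.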
### Alternative libraries
* R - https://cran.r-project.org/web/packages/bartMachine/bartMachine.pdf
* R - https://cran.r-project.org/web/packages/BayesTree/index.html
### References
* [1] https://arxiv.org/abs/0806.3286
* [2] http://www.gatsby.ucl.ac.uk/~balaji/pgbart_aistats15.pdf
* [3] https://arxiv.org/ftp/arxiv/papers/1309/1309.1906.pdf
* [4] https://cran.r-project.org/web/packages/BART/vignettes/computing.pdf