gcForest v1.1.1 Is Here!
========
This is the official clone of the gcForest implementation. (The university's web server is sometimes unstable, so we host the official clone here on GitHub.)
Package Official Website: http://lamda.nju.edu.cn/code_gcForest.ashx
This package is provided "AS IS" and free for academic usage. You can run it at your own risk. For other purposes, please contact Prof. Zhi-Hua Zhou ([email protected]).
Description: A Python 2.7 implementation of gcForest proposed in [1].
A demo implementation of the gcForest library, along with some demo client scripts that demonstrate how to use the code.
The implementation is flexible enough to let you modify the model or fit your own datasets.
Reference: [1] Z.-H. Zhou and J. Feng. Deep Forest: Towards an Alternative to Deep Neural Networks.
In IJCAI-2017. (https://arxiv.org/abs/1702.08835v2 )
Requirements: This package is developed with Python 2.7. Please make sure all the dependencies
specified in requirements.txt are installed.
ATTN: This package was developed and is maintained by Mr. Ji Feng (http://lamda.nju.edu.cn/fengj/). For any problem concerning the code, please feel free to contact Mr. Feng ([email protected]) or open an issue here.
What's NEW:
========
* Scikit-Learn style API
* Some more detailed examples
* GPU support if you use xgboost as the base estimator
* Support for Python 3.5 (v1.1.1)
v1.1.1 Python 3.5 Compatibility: The package should work with Python 3.5. Not everything has been checked yet, but it seems OK.
v1.1.1 Bug Fixed: When doing multiple predictions with the same model, the results are now consistent if you are using a pooling layer. The bug only occurred with the scikit-learn style API, and it is now fixed for the new API as well.
Quick start
=====================
### The simplest way of using the library is as follows:
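```
from gcforest.gcforest import GCForest

gc = GCForest(config)  # config should be a dict describing the model
X_train_enc = gc.fit_transform(X_train, y_train)  # fit the model on the training data
y_pred = gc.predict(X_test)  # predict labels for the test data
```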
And that's it. Please see ```/examples/demo_mnist.py``` for detailed usage.
For older versions and more of the model configs reported in the original paper, please refer to:
* [v1.0](https://github.com/kingfengji/gcforest/tree/v1.0)
Supported Base Classifiers
=====================
The base classifiers inside gcForest can be any classifiers. This library supports the following ones:
* RandomForestClassifier
* XGBClassifier
* ExtraTreesClassifier
* LogisticRegression
* SGDClassifier
To add other classifiers, you can add them manually in ```lib/gcforest/estimators/__init__.py```.
Define your own structure
=====================
### Define your model with a single JSON file.
* If you only need the cascading forest structure, you only need to write one JSON file. See /examples/demo_mnist-ca.json for reference (here -ca stands for cascading).
* If you need both fine-grained and cascading forests, you also need to specify the fine-grained structure of your model. See /examples/demo_mnist-gc.json for reference.
* Then use gcforest.utils.config_utils.load_json to load your JSON file.
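```
from gcforest.gcforest import GCForest
from gcforest.utils.config_utils import load_json

config = load_json(your_json_file)  # path to your JSON model file
gc = GCForest(config)  # that's it
```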
Then run ```python examples/demo_mnist.py --model examples/yourmodel.json```.
### Define your model inside your Python script.
- You can also define the model structure inside your Python script. The model config should be a Python dictionary; see ```get_toy_config``` in ```/examples/demo_mnist.py``` for reference.
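As a rough illustration of a dictionary-based config, the sketch below builds a cascade-only model description and serializes it to JSON. The helper `make_cascade_config` is hypothetical, and the key names (`cascade`, `estimators`, `n_folds`, `type`, `max_layers`, `early_stopping_rounds`, `n_classes`) are assumptions about what the example configs use; please check them against ```/examples/demo_mnist.py``` and ```/examples/demo_mnist-ca.json``` before relying on them.

```python
import json


def make_cascade_config(n_classes, n_folds=5):
    """Build a cascade-only model config as a plain dict (hypothetical helper).

    Assumption: the key names below mirror the example configs shipped with
    the package; verify them against examples/demo_mnist.py.
    """
    estimators = [
        {"n_folds": n_folds, "type": "RandomForestClassifier",
         "n_estimators": 10, "max_depth": None, "n_jobs": -1},
        {"n_folds": n_folds, "type": "ExtraTreesClassifier",
         "n_estimators": 10, "max_depth": None, "n_jobs": -1},
        {"n_folds": n_folds, "type": "LogisticRegression"},
    ]
    cascade = {
        "random_state": 0,
        "max_layers": 100,
        "early_stopping_rounds": 3,
        "n_classes": n_classes,
        "estimators": estimators,
    }
    return {"cascade": cascade}


config = make_cascade_config(n_classes=10)
# The same structure can be stored as a JSON model file.
config_json = json.dumps(config, indent=2)
```

Under these assumptions, `config` could be passed to `GCForest(config)` directly, while `config_json` shows what the equivalent JSON file would contain.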
Supported APIs
=====================
* ```fit_transform(X_train,y_train)```
* ```fit_transform(X_train,y_train, X_test=X_test, y_test=y_test)```. This allows you to evaluate your model during training.
* ```set_keep_model_in_mem(False)```. If you do not have enough RAM, set this to False (the default is True). If you set this to False, you have to use ```fit_transform(X_train,y_train, X_test=X_test, y_test=y_test)``` to evaluate your model.
* ```predict(X_test)```
* ```transform(X_test)```
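The calls listed above might be combined as in the following minimal sketch. The function `train_and_evaluate` is a hypothetical wrapper, not part of the package; it only uses the method names listed above and assumes `gc` is an already constructed `GCForest` object.

```python
def train_and_evaluate(gc, X_train, y_train, X_test, y_test, keep_model_in_mem=True):
    """Train a model and return test accuracy (hypothetical wrapper).

    When keep_model_in_mem is False, evaluation has to happen during
    fit_transform, so no separate predict() call is made and None is returned.
    """
    gc.set_keep_model_in_mem(keep_model_in_mem)
    # Passing the test set lets the model be evaluated during training.
    gc.fit_transform(X_train, y_train, X_test=X_test, y_test=y_test)
    if not keep_model_in_mem:
        return None
    y_pred = gc.predict(X_test)
    correct = sum(1 for p, t in zip(y_pred, y_test) if p == t)
    return correct / float(len(y_test))
```

Keeping the `predict` call behind the `keep_model_in_mem` check reflects the note above: with the model not kept in memory, the test data must be supplied to `fit_transform` instead.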
Supported Data Types
=====================
### If you wish to use the cascade layer only, the legal data types for X_train and X_test are:
* A 2-D numpy array of shape (n_samples, n_features).
* 3-D or 4-D numpy arrays are also acceptable. For example, an X_train of shape (60000, 28, 28) or (60000, 3, 28, 28) will automatically be reshaped into (60000, 784) or (60000, 2352), respectively.
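The shape arithmetic behind this automatic reshaping can be checked with a small sketch. `flattened_shape` is a hypothetical helper written only for illustration; it keeps the sample axis and multiplies all remaining axes together, which is what `X.reshape(X.shape[0], -1)` does for a numpy array.

```python
from functools import reduce
from operator import mul


def flattened_shape(shape):
    """Return the 2-D shape obtained by flattening every axis after the first."""
    n_samples = shape[0]
    n_features = reduce(mul, shape[1:], 1)
    return (n_samples, n_features)


print(flattened_shape((60000, 28, 28)))     # (60000, 784)
print(flattened_shape((60000, 3, 28, 28)))  # (60000, 2352)
```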
### If you need to use the fine-grained layer, X_train and X_test MUST be 4-D numpy arrays:
* For image-like data, the dimensions should be (n_samples, n_channels, n_height, n_width).
* For sequence-like data, the dimensions should be (n_samples, n_features, seq_len, 1). For example, n_features is 1 for IMDB data and 13 for music MFCC data.
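To make the sequence layout concrete, here is a minimal sketch using nested lists. It assumes the raw data arrives as (n_samples, seq_len, n_features), which is an assumption about your input and not something the package prescribes; `sequence_to_fg_layout` is a hypothetical helper. With numpy, the same rearrangement would be `X.transpose(0, 2, 1)[..., None]`.

```python
def sequence_to_fg_layout(batch):
    """Rearrange (n_samples, seq_len, n_features) nested lists
    into (n_samples, n_features, seq_len, 1) nested lists."""
    out = []
    for sample in batch:
        n_features = len(sample[0])
        # For each feature, collect its value at every time step,
        # wrapping each value in a list to form the trailing axis of size 1.
        out.append([[[frame[f]] for frame in sample] for f in range(n_features)])
    return out


# One sample, 3 time steps, 2 features per step.
batch = [[[1, 2], [3, 4], [5, 6]]]
fg_batch = sequence_to_fg_layout(batch)
```

In this toy case `fg_batch` has the layout (1, 2, 3, 1): one sample, two features, three time steps, and a trailing axis of size 1.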
Others
=====================
Please read ```examples/demo_mnist.py``` for a detailed walk-through.
Package dependencies
========
The package is developed in Python 2.7; higher versions of Python are not recommended for the current version.
Run the following command to install the dependencies before running the code:
```pip install -r requirements.txt```
Older Versions
=====================
For older versions, please refer to:
* [v1.0](https://github.com/kingfengji/gcforest/tree/v1.0)
Happy Hacking.
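As is well known, artificial intelligence is one of the hottest topics today, and the rapid development of computer and internet technology has pushed AI research to a new peak. Artificial intelligence is an emerging technical science that studies the theories, methods, and applications for simulating and extending human intelligence. Machine learning is one of the core research areas of artificial intelligence; its motivation is to give computer systems the human ability to learn in order to achieve artificial intelligence. So, what is machine learning? Machine learning is the discipline that makes model assumptions about a research problem, uses computers to learn the model parameters from training data, and ultimately predicts and analyzes data.

Uses of machine learning: Machine learning is a general-purpose data processing technique that includes a large number of learning algorithms. Different learning algorithms show different performance and strengths in different industries and applications. Machine learning has been successfully applied in the following fields:
* Internet: speech recognition, search engines, language translation, spam filtering, natural language processing, etc.
* Biology: gene sequence analysis, DNA sequence prediction, protein structure prediction, etc.
* Automation: face recognition, autonomous driving, image processing, signal processing, etc.
* Finance: securities market analysis, credit card fraud detection, etc.
* Medicine: disease identification and diagnosis, epidemic outbreak prediction, etc.
* Criminal investigation: identification and prediction of potential crimes, simulated AI detectives, etc.
* News: news recommendation systems, etc.
* Games: game strategy planning, etc.

As these applications show, machine learning is becoming an analysis tool used regularly across industries. Especially now that data volumes are exploding in every field, industries hope to extract valuable information from their data through processing and analysis, in order to clarify customer needs and guide business development.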