# GENDIS [![Build Status](https://travis-ci.org/IBCNServices/GENDIS.svg?branch=master)](https://travis-ci.org/IBCNServices/GENDIS) [![PyPI version](https://badge.fury.io/py/GENDIS.svg)](https://badge.fury.io/py/GENDIS) [![Read The Docs](https://readthedocs.org/projects/gendis/badge/?version=latest)](https://gendis.readthedocs.io/en/latest/?badge=latest) [![Downloads](https://pepy.tech/badge/gendis)](https://pepy.tech/project/gendis)
## GENetic DIscovery of Shapelets
<p align="center">
<img src="GENDIS.png">
</p>
<p align="center">
<img src="evolving_shaps.gif">
</p>
In the time series classification domain, shapelets are small subseries that are discriminative for a certain class. It has been shown that by projecting the original dataset to a distance space, where each axis corresponds to the distance to a certain shapelet, classifiers are able to achieve state-of-the-art results on a plethora of datasets.
This repository contains an implementation of `GENDIS`, an algorithm that searches for a set of shapelets in a genetic fashion. The algorithm is insensitive to its parameters (such as population size, crossover and mutation probability, ...) and can quickly extract a small set of shapelets that achieves predictive performance similar to (or better than) that of other shapelet techniques.
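To make the distance-space projection concrete, here is a minimal, dependency-free sketch. It assumes the distance between a series and a shapelet is the smallest Euclidean distance over all equal-length windows, which is a common definition in the shapelet literature; the function names `shapelet_distance` and `to_distance_space` are illustrative only and are not part of the GENDIS API, whose distance computation may differ (for example in normalization).

```python
import math

def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any window of `series`."""
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet is longer than the series")
    best = math.inf
    # Slide the shapelet over the series and keep the best match
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best

def to_distance_space(X, shapelets):
    """Map each series to a row of distances, one column per shapelet."""
    return [[shapelet_distance(series, s) for s in shapelets] for series in X]

# Toy data (made up for illustration): one series contains a bump, one is flat
X = [
    [0, 0, 1, 2, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
shapelets = [[1, 2, 1]]
distances = to_distance_space(X, shapelets)
print(distances)
```

The series containing the bump ends up at distance zero from the shapelet, while the flat series does not, so a simple classifier can separate the two along that single axis.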
## Installation
We currently support Python 3.5 & Python 3.6. For installation, there are two alternatives:
1. Clone the repository `https://github.com/IBCNServices/GENDIS.git` and run `(python3 -m) pip install -r requirements.txt`
2. GENDIS is hosted on PyPI. You can just run `(python3 -m) pip install gendis` to add gendis to your dist-packages (so that you can use it from anywhere).
**Make sure NumPy and Cython are already installed (`pip install numpy` and `pip install Cython`), since they are required by the setup script.**
## Tutorial & Example
### 1. Loading & preprocessing the datasets
As a first step, we need to construct at least a matrix of time series (`X_train`) and a vector of labels (`y_train`). Additionally, test data can be loaded in order to evaluate the pipeline at the end.
```python
import pandas as pd
# Read in the data files (replace the placeholders with your own paths)
train_df = pd.read_csv('<TRAIN_DATA_FILE>')
test_df = pd.read_csv('<TEST_DATA_FILE>')
# Split into feature matrices and label vectors
X_train = train_df.drop('target', axis=1)
y_train = train_df['target']
X_test = test_df.drop('target', axis=1)
y_test = test_df['target']
```
### 2. Creating a `GeneticExtractor` object
Construct the object. For a list of all possible parameters and their descriptions, please refer to the documentation in the [code](gendis/genetic.py).
```python
from gendis.genetic import GeneticExtractor
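# max_len caps the shapelet length, so it is derived here from the number of
# time steps per series (columns of X_train) rather than the number of series
genetic_extractor = GeneticExtractor(population_size=50, iterations=25, verbose=True,
                                     mutation_prob=0.3, crossover_prob=0.3,
                                     wait=10, max_len=X_train.shape[1] // 2)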
```
### 3. Fit the `GeneticExtractor` and construct distance matrix
```python
shapelets = genetic_extractor.fit(X_train, y_train)
distances_train = genetic_extractor.transform(X_train)
distances_test = genetic_extractor.transform(X_test)
```
### 4. Fit ML classifier on constructed distance matrix
```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
lr = LogisticRegression()
lr.fit(distances_train, y_train)
print('Accuracy = {}'.format(accuracy_score(y_test, lr.predict(distances_test))))
```
### Example notebook
A simple example is provided in [this notebook](gendis/example.ipynb).
## Data
All datasets in this repository are downloaded from [timeseriesclassification](http://timeseriesclassification.com). Please cite them appropriately when using any dataset.
## Paper experiments
In order to reproduce the results from the corresponding paper, please check out [this directory](gendis/experiments).
## Tests
We provide a few doctests and unit tests. To run the doctests: `python3 -m doctest -v <FILE>`, where `<FILE>` is the Python file you want to run the doctests from. To run unit tests: `nose2 -v`
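For readers unfamiliar with doctests, the sketch below shows what one looks like. The function `sliding_windows` is a hypothetical helper written for this illustration and is not part of the GENDIS code base; the examples in its docstring are what the `doctest` module executes and checks.

```python
def sliding_windows(series, length):
    """Return every contiguous window of `length` values from `series`.

    >>> sliding_windows([1, 2, 3, 4], 2)
    [[1, 2], [2, 3], [3, 4]]
    >>> sliding_windows([1, 2], 3)
    []
    """
    # One window per valid start index; no windows if the series is too short
    return [list(series[i:i + length]) for i in range(len(series) - length + 1)]
```

Saving this in a file and passing that file as `<FILE>` to the doctest command above would run both docstring examples.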
## Contributing, Citing and Contact
If you have any questions, are experiencing bugs in the GENDIS implementation, or would like to contribute, please feel free to create an issue/pull request in this repository or contact me at gilles(dot)vandewiele(at)ugent(dot)be.
If you use GENDIS in your work, please use the following citation:
```bibtex
@article{vandewiele2021gendis,
title={GENDIS: Genetic Discovery of Shapelets},
author={Vandewiele, Gilles and Ongenae, Femke and Turck, Filip De},
journal={Sensors},
volume={21},
number={4},
pages={1059},
year={2021},
publisher={Multidisciplinary Digital Publishing Institute}
}
```