Implementation (sklearn API) of the algorithm proposed in "GENDIS: Genetic Discovery of Shapelets", together with the code to reproduce all experiments

102 files in total, including 23 `.py` files, 11 `.png` files, and 10 `.js` files.

ZIP archive, 1.59 MB, uploaded 2022-06-20.

Package contents (102 files):

make.bat 810B

.buildinfo 230B

pairwise_dist.c 356KB

alabaster.css 11KB

basic.css 10KB

pygments.css 4KB

classic.css 4KB

custom.css 42B

default.css 28B

lts_hyperparams.csv 2KB

gendis_hyperparams.csv 2KB

gendis.doctree 28KB

index.doctree 5KB

evolving_shaps.gif 731KB

ajax-loader.gif 673B

.gitignore 1KB

genetic.html 84KB

gendis.html 12KB

index.html 6KB

py-modindex.html 4KB

search.html 3KB

genindex.html 3KB

index.html 3KB

MANIFEST.in 20B

objects.inv 348B

example.ipynb 376KB

example.ipynb 119KB

jquery-3.2.1.js 262KB

jquery.js 85KB

underscore-1.3.1.js 34KB

searchtools.js 25KB

websupport.js 25KB

underscore.js 12KB

doctools.js 9KB

sidebar.js 5KB

searchindex.js 3KB

documentation_options.js 270B

.keep 0B

LICENSE 2KB

Makefile 603B

README.md 5KB

README.md 1KB

.nojekyll 0B

environment.pickle 29KB

environment.pickle 4KB

GENDIS.png 461KB

comment-close.png 829B

comment-bright.png 756B

comment.png 641B

file.png 286B

down-pressed.png 222B

up-pressed.png 214B

up.png 203B

down.png 202B

minus.png 90B

plus.png 90B

genetic_single.py 24KB

gendis_all_datasets.py 19KB

genetic.py 16KB

dependent_vs_independent_benchmarks.py 15KB

process_results_gendis_voting.py 14KB

sax.py 10KB

dependent_vs_independent_artificial.py 10KB

other_util.py 10KB

operators.py 10KB

process_results.py 6KB

conf.py 5KB

pso.py 5KB

single_shapelet.py 5KB

_exceptions.py 4KB

fast.py 3KB

fitness.py 2KB

test_inputs.py 2KB

test_custom_fitness.py 2KB

brute_force.py 1KB

setup.py 970B

test_serialization.py 608B

test_sklearn_pipeline.py 565B

__init__.py 0B

pairwise_dist.pyx 2KB

README.rst 5KB

start.rst 2KB

index.rst 1KB

ccc.rst 427B

install.rst 408B

gendis.rst 209B

scatter_plot_lts_vs_gendis.svg 73KB

scatter_plot_dep_vs_indep.svg 72KB

distances_scatter.svg 49KB

data.svg 40KB

shapelets.svg 13KB

extracted_shapelets.svg 12KB

shap_artificial.svg 10KB

index.rst.txt 1KB

requirements.txt 277B

gendis.rst.txt 209B

requirements.txt 164B

# GENDIS [![Build Status](https://travis-ci.org/IBCNServices/GENDIS.svg?branch=master)](https://travis-ci.org/IBCNServices/GENDIS) [![PyPI version](https://badge.fury.io/py/GENDIS.svg)](https://badge.fury.io/py/GENDIS) [![Read The Docs](https://readthedocs.org/projects/gendis/badge/?version=latest)](https://gendis.readthedocs.io/en/latest/?badge=latest) [![Downloads](https://pepy.tech/badge/gendis)](https://pepy.tech/project/gendis)

## GENetic DIscovery of Shapelets

<p align="center">
  <img src="GENDIS.png">
</p>
<p align="center">
  <img src="evolving_shaps.gif">
</p>

In the time series classification domain, shapelets are small subseries that are discriminative for a certain class. It has been shown that by projecting the original dataset to a distance space, where each axis corresponds to the distance to a certain shapelet, classifiers are able to achieve state-of-the-art results on a plethora of datasets.

This repository contains an implementation of `GENDIS`, an algorithm that searches for a set of shapelets in a genetic fashion. The algorithm is insensitive to its parameters (such as population size, crossover and mutation probability, ...) and can quickly extract a small set of shapelets that achieves predictive performance similar to (or better than) that of other shapelet techniques.

## Installation

We currently support Python 3.5 and Python 3.6. For installation, there are two alternatives:

1. Clone the repository `https://github.com/IBCNServices/GENDIS.git` and run `(python3 -m) pip install -r requirements.txt`
2. GENDIS is hosted on PyPI. You can just run `(python3 -m) pip install gendis` to add gendis to your dist-packages (you can then use it from everywhere).

**Make sure NumPy and Cython are already installed (`pip install numpy` and `pip install Cython`), since they are required by the setup script.**

## Tutorial & Example

### 1. Loading & preprocessing the datasets

In a first step, we need to construct at least a matrix with time series (`X_train`) and a vector with labels (`y_train`). Additionally, test data can be loaded as well in order to evaluate the pipeline in the end.

```python
import pandas as pd

# Read in the datafiles (replace each <DATA_FILE> with the path to your CSV file)
train_df = pd.read_csv(<DATA_FILE>)
test_df = pd.read_csv(<DATA_FILE>)

# Split into feature matrices and label vectors
X_train = train_df.drop('target', axis=1)
y_train = train_df['target']
X_test = test_df.drop('target', axis=1)
y_test = test_df['target']
```

### 2. Creating a `GeneticExtractor` object

Construct the object. For a list of all possible parameters, and a description of each, please refer to the documentation in the [code](gendis/genetic.py).

```python
from gendis.genetic import GeneticExtractor

genetic_extractor = GeneticExtractor(population_size=50, iterations=25, verbose=True,
                                     mutation_prob=0.3, crossover_prob=0.3,
                                     wait=10, max_len=len(X_train) // 2)
```

### 3. Fit the `GeneticExtractor` and construct distance matrix

```python
shapelets = genetic_extractor.fit(X_train, y_train)
distances_train = genetic_extractor.transform(X_train)
distances_test = genetic_extractor.transform(X_test)
```

### 4. Fit ML classifier on constructed distance matrix

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

lr = LogisticRegression()
lr.fit(distances_train, y_train)

print('Accuracy = {}'.format(accuracy_score(y_test, lr.predict(distances_test))))
```

### Example notebook

A simple example is provided in [this notebook](gendis/example.ipynb).

## Data

All datasets in this repository are downloaded from [timeseriesclassification](http://timeseriesclassification.com). Please refer to them appropriately when using any dataset.

## Paper experiments

In order to reproduce the results from the corresponding paper, please check out [this directory](gendis/experiments).

## Tests

We provide a few doctests and unit tests. To run the doctests: `python3 -m doctest -v <FILE>`, where `<FILE>` is the Python file you want to run the doctests from. To run the unit tests: `nose2 -v`

## Contributing, Citing and Contact

If you have any questions, are experiencing bugs in the GENDIS implementation, or would like to contribute, please feel free to create an issue/pull request in this repository or contact me at gilles(dot)vandewiele(at)ugent(dot)be

If you use GENDIS in your work, please use the following citation:

```bibtex
@article{vandewiele2021gendis,
  title={GENDIS: Genetic Discovery of Shapelets},
  author={Vandewiele, Gilles and Ongenae, Femke and Turck, Filip De},
  journal={Sensors},
  volume={21},
  number={4},
  pages={1059},
  year={2021},
  publisher={Multidisciplinary Digital Publishing Institute}
}
```
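
## Appendix: sketch of the shapelet distance transform

The introduction describes projecting a dataset into a distance space in which each axis is the distance to one shapelet. The snippet below is a minimal, dependency-free sketch of that idea, not the GENDIS implementation: it assumes the distance between a series and a shapelet is the smallest Euclidean distance over all equally long windows of the series, which is a common definition in the shapelet literature. The function names `shapelet_distance` and `distance_transform` and the toy data are hypothetical, and the library itself may normalize or compute distances differently.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    equally long contiguous window of `series` (assumed definition)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide the shapelet over every window of length m.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def distance_transform(dataset, shapelets):
    """Map each series to a row of distances, one column per shapelet."""
    return [[shapelet_distance(series, shp) for shp in shapelets]
            for series in dataset]


# Toy data: two series and two candidate shapelets (all values made up).
X = [
    [0.0, 0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 3.0, 3.0, 0.0, 0.0, 0.0],
]
shapelets = [[1.0, 2.0, 1.0], [3.0, 3.0]]
distances = distance_transform(X, shapelets)
```

Each row of `distances` plays the role of a row of `distances_train` in the tutorial: a fixed-length feature vector that any standard classifier can consume. In this toy example the first series contains the first shapelet exactly and the second series contains the second one, so those two entries are zero while the others are positive.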
