HyperparameterHunter
====================
![HyperparameterHunter Overview](docs/media/overview.gif)
[![Build Status](https://travis-ci.org/HunterMcGushion/hyperparameter_hunter.svg?branch=master)](https://travis-ci.org/HunterMcGushion/hyperparameter_hunter)
[![Documentation Status](https://readthedocs.org/projects/hyperparameter-hunter/badge/?version=latest)](https://hyperparameter-hunter.readthedocs.io/en/latest/?badge=latest)
[![Coverage Status](https://coveralls.io/repos/github/HunterMcGushion/hyperparameter_hunter/badge.svg?branch=test-update)](https://coveralls.io/github/HunterMcGushion/hyperparameter_hunter?branch=test-update)
[![codecov](https://codecov.io/gh/HunterMcGushion/hyperparameter_hunter/branch/master/graph/badge.svg)](https://codecov.io/gh/HunterMcGushion/hyperparameter_hunter)
[![Maintainability](https://api.codeclimate.com/v1/badges/ef0d004a10ede0b228cc/maintainability)](https://codeclimate.com/github/HunterMcGushion/hyperparameter_hunter/maintainability)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/1413b76fabe2400fab1958e70be593a2)](https://www.codacy.com/app/HunterMcGushion/hyperparameter_hunter?utm_source=github.com&utm_medium=referral&utm_content=HunterMcGushion/hyperparameter_hunter&utm_campaign=Badge_Grade)
[![PyPI version](https://badge.fury.io/py/hyperparameter-hunter.svg)](https://badge.fury.io/py/hyperparameter-hunter)
[![Downloads](https://pepy.tech/badge/hyperparameter-hunter/month)](https://pepy.tech/project/hyperparameter-hunter)
[![Donate](https://img.shields.io/badge/Donate-PayPal-green.svg)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=Q3EX3PQUV256G)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
Automatically save and learn from Experiment results, leading to long-term, persistent optimization that remembers all your tests.
HyperparameterHunter provides a wrapper for machine learning algorithms that saves all the important data. Simplify the experimentation and hyperparameter tuning process by letting HyperparameterHunter do the hard work
of recording, organizing, and learning from your tests — all while using the same libraries you already do. Don't let any of your experiments go to waste, and start doing hyperparameter optimization the way it was meant to be.
* **Installation:** `pip install hyperparameter-hunter`
* **Source:** https://github.com/HunterMcGushion/hyperparameter_hunter
* **Documentation:** [https://hyperparameter-hunter.readthedocs.io](https://hyperparameter-hunter.readthedocs.io/en/latest/index.html)
Features
--------
* Automatically record Experiment results
* Truly informed hyperparameter optimization that automatically uses past Experiments
* Eliminate boilerplate code for cross-validation loops, predicting, and scoring
* Stop worrying about keeping track of hyperparameters, scores, or re-running the same Experiments
* Use the libraries and utilities you already love
How to Use HyperparameterHunter
-------------------------------
Don’t think of HyperparameterHunter as another optimization library that you bring out only when it’s time to do hyperparameter optimization. Of course, it does optimization, but it’s better to view HyperparameterHunter as your own personal machine learning toolbox/assistant.
The idea is to start using HyperparameterHunter immediately. Run all of your benchmark/one-off experiments through it.
The more you use HyperparameterHunter, the better your results will be. If you just use it for optimization, sure, it’ll do what you want, but that’s missing the point of HyperparameterHunter.
If you’ve been using it for experimentation and optimization along the entire course of your project, then when you decide to do hyperparameter optimization, HyperparameterHunter is already aware of all that you’ve done, and that’s when HyperparameterHunter does something remarkable. It doesn’t start optimization from scratch like other libraries. It starts from all of the Experiments and previous optimization rounds you’ve already run through it.
Getting Started
---------------
### 1) Environment:
Set up an Environment to organize Experiments and Optimization results.
<br>
Any Experiments or Optimization rounds we perform will use our active Environment.
```python
from hyperparameter_hunter import Environment, CVExperiment
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold

data = load_breast_cancer()
df = pd.DataFrame(data=data.data, columns=data.feature_names)
df['target'] = data.target

env = Environment(
    train_dataset=df,  # Add holdout/test dataframes, too
    results_path='path/to/results/directory',  # Where your result files will go
    metrics=['roc_auc_score'],  # Callables, or strings referring to `sklearn.metrics`
    cv_type=StratifiedKFold,  # Class, or string in `sklearn.model_selection`
    cv_params=dict(n_splits=5, shuffle=True, random_state=32)
)
```
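The comment on `train_dataset` above hints that holdout and test DataFrames can be registered the same way. A minimal sketch of that variant, assuming the `holdout_dataset`/`test_dataset` parameter names (verify against the `Environment` docs for your installed version):

```python
# Hedged variant: `holdout_dataset` is an assumed parameter name -- check the Environment docs
train_df = df.sample(frac=0.8, random_state=32)  # 80% of rows for training
holdout_df = df.drop(train_df.index)  # Remaining rows held out for scoring

env = Environment(
    train_dataset=train_df,
    holdout_dataset=holdout_df,  # Scored with the same metrics as out-of-fold predictions
    results_path='path/to/results/directory',
    metrics=['roc_auc_score'],
    cv_type=StratifiedKFold,
    cv_params=dict(n_splits=5, shuffle=True, random_state=32)
)
```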
### 2) Individual Experimentation:
Perform Experiments with your favorite libraries simply by providing model initializers and hyperparameters
<!-- Keras -->
<details>
<summary>Keras</summary>
```python
# Same format used by `keras.wrappers.scikit_learn`. Nothing new to learn
def build_fn(input_shape):  # `input_shape` calculated for you
    model = Sequential([
        Dense(100, kernel_initializer='uniform', input_shape=input_shape, activation='relu'),
        Dropout(0.5),
        Dense(1, kernel_initializer='uniform', activation='sigmoid')
    ])  # All layer arguments saved (whether explicit or Keras default) for future use
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

experiment = CVExperiment(
    model_initializer=KerasClassifier,
    model_init_params=build_fn,  # We interpret your build_fn to save hyperparameters in a useful, readable format
    model_extra_params=dict(
        callbacks=[ReduceLROnPlateau(patience=5)],  # Use Keras callbacks
        batch_size=32, epochs=10, verbose=0  # Fit/predict arguments
    )
)
```
</details>
<!-- SKLearn -->
<details>
<summary>SKLearn</summary>
```python
experiment = CVExperiment(
    model_initializer=LinearSVC,  # (Or any of the dozens of other SK-Learn algorithms)
    model_init_params=dict(penalty='l1', C=0.9)  # Default values used and recorded for kwargs not given
)
```
</details>
<!-- XGBoost -->
<details open>
<summary>XGBoost</summary>
```python
experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(objective='reg:linear', max_depth=3, n_estimators=100, subsample=0.5)
)
```
</details>
<!-- LightGBM -->
<details>
<summary>LightGBM</summary>
```python
experiment = CVExperiment(
    model_initializer=LGBMClassifier,
    model_init_params=dict(boosting_type='gbdt', num_leaves=31, max_depth=-1, min_child_samples=5, subsample=0.5)
)
```
</details>
<!-- CatBoost -->
<details>
<summary>CatBoost</summary>
```python
experiment = CVExperiment(
    model_initializer=CatBoostClassifier,
    model_init_params=dict(iterations=500, learning_rate=0.01, depth=7, allow_writing_files=False),
    model_extra_params=dict(fit=dict(verbose=True))  # Send kwargs to `fit` and other extra methods
)
```
</details>
<!-- RGF -->
<details>
<summary>RGF</summary>
```python
experiment = CVExperiment(
    model_initializer=RGFClassifier,
    model_init_params=dict(max_leaf=1000, algorithm='RGF', min_samples_leaf=10)
)
```
</details>
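The snippets above leave out the model imports for brevity; the initializers come straight from their usual packages. A hedged reference list, assuming the standard package layouts and the Keras 2.x-era scikit-learn wrapper:

```python
# Model initializers used in the examples above (install the packages you actually need)
from sklearn.svm import LinearSVC
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier
from rgf.sklearn import RGFClassifier  # `rgf_python` package

# Keras pieces used in the Keras example (Keras 2.x-era imports)
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.callbacks import ReduceLROnPlateau
from keras.wrappers.scikit_learn import KerasClassifier
```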
### 3) Hyperparameter Optimization:
Just like Experiments, but if you want to optimize a hyperparameter, use the classes imported below
```python
from hyperparameter_hunter import Real, Integer, Categorical
from hyperparameter_hunter import optimization as opt
```
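To give a sense of the full loop before the library-specific examples: an optimization protocol is pointed at a model the same way an Experiment is, except that any hyperparameter you want searched is wrapped in `Real`, `Integer`, or `Categorical`. A minimal sketch with XGBoost, assuming the `BayesianOptPro` class and its `forge_experiment`/`go` methods from recent releases (older releases use different names, such as `BayesianOptimization`; check the docs for your version):

```python
# Hedged sketch: class/method names assumed from recent releases -- verify locally
optimizer = opt.BayesianOptPro(iterations=10)
optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',  # Fixed values stay plain...
        max_depth=Integer(2, 10),  # ...searched values get wrapped
        n_estimators=Integer(50, 300),
        subsample=Real(0.3, 1.0),
        booster=Categorical(['gbtree', 'dart'])
    )
)
optimizer.go()  # Saved Experiments with matching guidelines give the search a head start
```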
<!-- Keras -->
<details>
<summary>Keras</summary>
```python
def build_fn(input_shape):
    model = Sequential([
        Dense(Integer(50, 150), input_shape=input_shape, activation='