<p align="center">
<img src="https://github.com/jina-ai/finetuner/blob/main/docs/_static/finetuner-logo-ani.svg?raw=true" alt="Finetuner logo: Finetuner allows one to finetune any deep Neural Network for better embedding on search tasks. It accompanies Jina to deliver the last mile of performance-tuning for neural search applications." width="150px">
</p>
<p align="center">
<b>Finetuning any deep neural network for better embedding on neural search tasks</b>
</p>
<p align="center">
<a href="https://pypi.org/project/finetuner/"><img src="https://github.com/jina-ai/jina/blob/master/.github/badges/python-badge.svg?raw=true" alt="Python 3.7 3.8 3.9" title="Finetuner supports Python 3.7 and above"></a>
<a href="https://pypi.org/project/finetuner/"><img src="https://img.shields.io/pypi/v/finetuner?color=%23099cec&label=PyPI&logo=pypi&logoColor=white" alt="PyPI"></a>
<a href="https://codecov.io/gh/jina-ai/finetuner"><img src="https://codecov.io/gh/jina-ai/finetuner/branch/main/graph/badge.svg?token=xSs4acAEaJ"/></a>
<a href="https://slack.jina.ai"><img src="https://img.shields.io/badge/Slack-2.2k%2B-blueviolet?logo=slack&logoColor=white"></a>
</p>
<!-- start elevator-pitch -->
Finetuner allows one to tune the weights of any deep neural network for better embeddings on search tasks. It
accompanies [Jina](https://github.com/jina-ai/jina) to deliver the last mile of performance for domain-specific neural search
applications.
🎯 **Designed for finetuning**: a human-in-the-loop deep learning tool for leveling up your pretrained models in domain-specific neural search applications.
🔱 **Powerful yet intuitive**: all you need is `finetuner.fit()`, a one-liner that unlocks rich features such as siamese/triplet networks, interactive labeling, layer pruning, weight freezing, and dimensionality reduction.
⚛️ **Framework-agnostic**: promises an identical API & user experience across PyTorch, TensorFlow/Keras and PaddlePaddle deep learning backends.
🧈 **Jina integration**: buttery-smooth integration with Jina, reducing the cost of context-switching between experiment and production.
<!-- end elevator-pitch -->
## How does it work
<img src="https://github.com/jina-ai/finetuner/blob/main/docs/img/finetuner-journey.svg?raw=true" alt="The Finetuner journey: from a pretrained model to a tuned embedding model" title="How Finetuner works">
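Finetuner's siamese/triplet training pulls matching pairs closer together and pushes non-matches apart in embedding space. As a rough, framework-free sketch of the triplet objective (an illustration only, not Finetuner's actual implementation):

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    # squared Euclidean distances between embedding vectors
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    # hinge: the positive must be closer than the negative by at least `margin`
    return max(0.0, d_pos - d_neg + margin)

# a well-separated triplet incurs no loss ...
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [2.0, 0.0]))  # 0.0
# ... while a negative that sits too close to the anchor is penalized
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [0.2, 0.0]))  # ≈ 0.97
```

Minimizing this loss over many labeled triplets is what reshapes the embedding space for your search task.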
## Install
Requires Python 3.7+ and *one of* [PyTorch](https://pytorch.org/) (>=1.9), [TensorFlow](https://tensorflow.org/) (>=2.5) or [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) installed on Linux/macOS.
```bash
pip install finetuner
```
## [Documentation](https://finetuner.jina.ai)
## Usage
<table>
<thead>
<tr>
<th colspan="2" rowspan="2">Usage</th>
<th colspan="2">Do you have an <a href="https://finetuner.jina.ai/basics/glossary/#term-Embedding-model">embedding model</a>?</th>
</tr>
<tr>
<th>Yes</th>
<th>No</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2"><b>Do you have <a href="https://finetuner.jina.ai/basics/glossary/#term-Labeled-dataset">labeled data</a>?</b></td>
<td><b>Yes</b></td>
<td align="center">🟠</td>
<td align="center">🟡</td>
</tr>
<tr>
<td><b>No</b></td>
<td align="center">🟢</td>
<td align="center">🔵</td>
</tr>
</tbody>
</table>
### 🟠 Have embedding model and labeled data
Perfect! Since you already have an `embed_model` and `labeled_data`, simply do:
```python
import finetuner
tuned_model = finetuner.fit(
    embed_model,
    train_data=labeled_data,
)
```
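The returned `tuned_model` maps inputs to vectors; at search time, candidates are ranked by vector similarity. A minimal, framework-free sketch of that ranking step (all vectors and document names are made up):

```python
import math

def cosine(u, v):
    # cosine similarity between two equal-length vectors
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

query = [1.0, 0.0, 1.0]                 # embedding of the query (hypothetical)
corpus = {                              # embeddings of indexed docs (hypothetical)
    'relevant_doc': [0.9, 0.1, 0.8],
    'unrelated_doc': [-1.0, 0.5, 0.0],
}
ranked = sorted(corpus, key=lambda doc: cosine(query, corpus[doc]), reverse=True)
print(ranked)  # ['relevant_doc', 'unrelated_doc']
```

Better embeddings mean this similarity ordering agrees more often with human judgments of relevance.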
### 🟢 Have embedding model and unlabeled data
You have an `embed_model` to use, but no labeled data for finetuning it. No worries, that's already good enough!
You can use Finetuner to interactively label data and train `embed_model` as below:
```python
import finetuner
tuned_model = finetuner.fit(
    embed_model,
    train_data=unlabeled_data,
    interactive=True,
)
```
### 🟡 Have general model and labeled data
You have a `general_model` which does not output embeddings. Luckily, you have some `labeled_data` for training. No
worries, Finetuner can convert your model into an embedding model and train it via:
```python
import finetuner
tuned_model = finetuner.fit(
    general_model,
    train_data=labeled_data,
    to_embedding_model=True,
    layer_name='my_embedding_layer',  # the layer whose output becomes the embedding
    freeze=['layer_1', 'layer_2'],  # keep these layers' pretrained weights fixed
)
```
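Conceptually, `to_embedding_model=True` cuts the network at `layer_name` and uses that layer's output as the embedding, while `freeze` exempts the listed layers from training. A toy illustration of that idea (the layer names are the same placeholders as above, not a real API):

```python
# a toy "network" as an ordered list of layer names (placeholders)
layers = ['layer_1', 'layer_2', 'my_embedding_layer', 'classifier_head']

def cut_at(layers, layer_name):
    # keep layers up to and including `layer_name`; the task-specific head
    # is dropped and that layer's output becomes the embedding
    return layers[: layers.index(layer_name) + 1]

def trainable(layers, freeze):
    # layers listed in `freeze` keep their pretrained weights fixed
    return [name for name in layers if name not in freeze]

embed_layers = cut_at(layers, 'my_embedding_layer')
print(embed_layers)                                      # head removed
print(trainable(embed_layers, ['layer_1', 'layer_2']))   # only the top layer trains
```

Freezing the early layers preserves the general features learned during pretraining while the embedding layer adapts to your domain.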
### 🔵 Have general model and unlabeled data
You have a `general_model` which does not output embeddings, and you don't have labeled data for training either. No
worries, Finetuner can help you train an embedding model with interactive labeling on the fly:
```python
import finetuner
tuned_model = finetuner.fit(
    general_model,
    train_data=unlabeled_data,
    interactive=True,
    to_embedding_model=True,
    layer_name='my_embedding_layer',
    freeze=['layer_1', 'layer_2'],
)
```
## Finetuning ResNet50 on CelebA
> ⚡ To get the best experience, you will need a GPU machine for this example. For CPU users, we provide [finetuning an MLP on FashionMNIST](https://finetuner.jina.ai/get-started/fashion-mnist/) and [finetuning a Bi-LSTM on CovidQA](https://finetuner.jina.ai/get-started/covid-qa/), which run out of the box on low-profile machines. Check out more examples in [our docs](https://finetuner.jina.ai)!
1. Download the [CelebA-small dataset (7.7MB)](https://static.jina.ai/celeba/celeba-img.zip) and decompress it to `./img_align_celeba`. [The full dataset can be found here.](https://drive.google.com/drive/folders/0B7EVK8r0v71pWEZsZE9oNnFzTm8?resourcekey=0-5BR16BdXnb8hVj6CNHKzLg)
2. Finetuner accepts Jina `DocumentArray`/`DocumentArrayMemmap`, so we load the CelebA images into this format using a generator:
```python
from jina.types.document.generators import from_files
# please change the file path to your data path
data = list(from_files('img_align_celeba/*.jpg', size=100, to_dataturi=True))

for doc in data:
    doc.load_uri_to_image_blob(
        height=224, width=224
    ).set_image_blob_normalization().set_image_blob_channel_axis(
        -1, 0
    )  # skip the channel-axis change if you use TF/Keras (channels-last)
```
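PyTorch and Paddle expect channels-first image tensors, hence the `set_image_blob_channel_axis(-1, 0)` call above, which moves the tensor from (H, W, C) to (C, H, W). The same axis move, sketched in plain Python on a tiny hypothetical image:

```python
def channels_last_to_first(img):
    # img is a nested list shaped (H, W, C); return the same pixels as (C, H, W)
    H, W, C = len(img), len(img[0]), len(img[0][0])
    return [[[img[h][w][c] for w in range(W)] for h in range(H)] for c in range(C)]

# a 2x2 "image" with 3 channels (values are arbitrary)
img = [[[1, 2, 3], [4, 5, 6]],
       [[7, 8, 9], [10, 11, 12]]]
moved = channels_last_to_first(img)
print(len(moved), len(moved[0]), len(moved[0][0]))  # 3 2 2
print(moved[0])  # the full first channel: [[1, 4], [7, 10]]
```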
3. Load pretrained ResNet50 using PyTorch/Keras/Paddle:
- PyTorch
```python
import torchvision
model = torchvision.models.resnet50(pretrained=True)
```
- Keras
```python
import tensorflow as tf
model = tf.keras.applications.resnet50.ResNet50(weights='imagenet')
```
- Paddle
```python
import paddle
model = paddle.vision.models.resnet50(pretrained=True)
```
4. Start the Finetuner:
```python
import finetuner

finetuner.fit(
    model=model,
    interactive=True,
    train_data=data,
    freeze=True,
    to_embedding_model=True,
    input_size=(3, 224, 224),
    layer_name='my_embedding_layer',  # replace with a real layer name from your model
)
```
5. After downloading the model and loading the data (~20s depending on your network/CPU/GPU), your browser will open the Labeler UI shown below. You can now label the relevance of celebrity faces via mouse/keyboard. The ResNet50 model is finetuned and improves as you label. On a CPU machine, each labeling round may take up to 20 seconds.
![Finetuning ResNet50 on CelebA with interactive labeling](docs/get-started/celeba-labeler.gif)
<!-- start support-pitch -->
## Support
- Use [Discussions](https://github.com/jina-ai/finetuner/discussions) to talk about your use cases, questions, and
support queries.
- Join our [Slack community](https://slack.jina.ai) and chat with other Jina community members about ideas.
- Join our [Engineering All Hands](https://youtube.com/playlist?list=PL3UBBWOUVhFYRUa_gpYYKBqEAkO4sxmne) meet-up to discuss your use case and learn Jina's new features.
- **When?** The second Tuesday of every month
- **Where?** Zoom
<!-- end support-pitch -->