# TensorFlow Similarity: Metric Learning for Humans
TensorFlow Similarity is a [TensorFlow](https://tensorflow.org) library for [similarity learning](https://en.wikipedia.org/wiki/Similarity_learning), which includes techniques such as self-supervised learning, metric learning, and contrastive learning. TensorFlow Similarity is still in beta, and we may push breaking changes.
## Introduction
TensorFlow Similarity offers state-of-the-art algorithms for metric learning along with all the necessary components to research, train, evaluate, and serve similarity and contrastive-based models. These components include models, losses, metrics, samplers, visualizers, and indexing subsystems to make this quick and easy.
![Example of nearest neighbors search performed on the embedding generated by a similarity model trained on the Oxford IIIT Pet Dataset.](https://raw.githubusercontent.com/tensorflow/similarity/master/assets/images/similar-cats-and-dogs.jpg)
With TensorFlow Similarity you can train two main types of models:
1. **Self-supervised models**: Used to learn general data representations on unlabeled data to boost the accuracy of downstream tasks where you have few labels. For example, you can pre-train a model on a large number of unlabeled images using one of the contrastive methods supported by TensorFlow Similarity, and then fine-tune it on a small labeled dataset to achieve higher accuracy. To get started training your own self-supervised model, see this [notebook](examples/unsupervised_hello_world.ipynb).
2. **Similarity models**: Output embeddings that allow you to find and cluster similar examples, such as images representing the same object within a large corpus of examples. For instance, as shown above, you can train a similarity model to find and cluster similar-looking, unseen cat and dog images from the [Oxford IIIT Pet Dataset](https://www.tensorflow.org/datasets/catalog/oxford_iiit_pet) while only training on a few of the dataset classes. To get started training your own similarity model, see this [notebook](examples/supervised/visualization.ipynb).
## What's new
- [May 2022]: 0.16 major optimization release
  * Cross-batch memory (XBM) loss added thanks to @chjort.
  * Many self-supervised related improvements thanks to @dewball345.
  * Major layers and callback refactoring to make them faster and more flexible, e.g., `EvalCallback()` now supports split validation.

  For the full list of changes, see [the changelog](./releases.md).
- [Jan 2022]: 0.15 self-supervised release
  * Added support for self-supervised contrastive learning, including SimCLR, SimSiam, and Barlow Twins. Check out the in-depth [hello world notebook](examples/unsupervised_hello_world.ipynb) to get started.
  * Soft Nearest Neighbor Loss added thanks to [Abhishar Sinha](https://github.com/abhisharsinha).
  * Added `GeneralizedMeanPooling2D` support, which improves similarity matching accuracy over global average pooling (`GlobalAveragePooling2D`).
  * Numerous speed optimizations and general bug fixes.

  For previous changes and more details, see [the changelog](./releases.md).
## Getting Started
### Installation
Use pip to install the library.
**NOTE**: The `tensorflow` extras key (`[tensorflow]`) can be omitted if you already have `tensorflow>=2.4` installed.
```shell
pip install --upgrade-strategy=only-if-needed tensorflow_similarity[tensorflow]
```
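To confirm what ended up in your environment after the install, a small standard-library check like the one below may help. This is only a sketch: the helper name `installed_version` is made up for this example, and it assumes the packages are registered under the distribution names `tensorflow` and `tensorflow_similarity` (some TensorFlow builds are published under other names, in which case the first line will report "not installed").

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions pip sees for the two packages this README relies on.
for name in ("tensorflow", "tensorflow_similarity"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```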
### Documentation
The detailed and narrated [notebooks](examples/) are a good way to get started with TensorFlow Similarity. There is likely to be one that is similar to your data or your problem (if not, let us know). You can start working with the examples immediately in Google Colab by clicking the Google Colab icon.
For more information about specific functions, you can [check the API documentation](api/).
To contribute to the project, please check out the [contribution guidelines](CONTRIBUTING.md).
### Minimal Example: MNIST similarity
<details>
<summary> Click to expand and see how to train a supervised similarity model on MNIST using TensorFlow Similarity</summary>
Here is a bare-bones example demonstrating how to train a TensorFlow Similarity model on the MNIST data. This example illustrates some of the main components provided by TensorFlow Similarity and how they fit together. Please refer to the [hello_world notebook](examples/supervised_hello_world.ipynb) for a more detailed introduction.
### Preparing data
TensorFlow Similarity provides [data samplers](api/TFSimilarity/samplers/) for various dataset types that balance the batches to ensure smoother training.
In this example, we use the multi-shot sampler, which integrates directly with the TensorFlow Datasets catalog.
```python
from tensorflow_similarity.samplers import TFDatasetMultiShotMemorySampler
# Data sampler that generates balanced batches from MNIST dataset
sampler = TFDatasetMultiShotMemorySampler(dataset_name='mnist', classes_per_batch=10)
```
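To give an intuition for what "balanced batches" means, here is a rough pure-Python sketch of the idea. It is not the library's sampler: `balanced_batch` and its `examples_per_class` parameter are hypothetical names introduced for illustration, and the labels are a toy stand-in for MNIST.

```python
import random
from collections import defaultdict


def balanced_batch(labels, classes_per_batch, examples_per_class, rng=random):
    """Return indices for one batch with the same number of examples per class."""
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Pick distinct classes, then draw the same number of examples from each.
    chosen_classes = rng.sample(sorted(by_class), classes_per_batch)
    batch = []
    for cls in chosen_classes:
        # Sample with replacement so small classes can still fill their quota.
        batch.extend(rng.choices(by_class[cls], k=examples_per_class))
    return batch


labels = [i % 10 for i in range(1000)]  # toy stand-in for MNIST labels
batch = balanced_batch(labels, classes_per_batch=10, examples_per_class=2)
print(sorted(labels[i] for i in batch))
```

Every class in the batch contributes the same number of examples, so each example is guaranteed to have both positives (same class) and negatives (other classes) to be compared against.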
### Building a Similarity model
Building a TensorFlow Similarity model is similar to building a standard Keras model, with two differences: the output layer is usually a [`MetricEmbedding()`](api/TFSimilarity/layers/) layer that enforces L2 normalization, and the model is instantiated as a specialized subclass, [`SimilarityModel()`](api/TFSimilarity/models/SimilarityModel.md), that supports additional functionality.
```python
from tensorflow.keras import layers
from tensorflow_similarity.layers import MetricEmbedding
from tensorflow_similarity.models import SimilarityModel
# Build a Similarity model using standard Keras layers
inputs = layers.Input(shape=(28, 28, 1))
x = layers.experimental.preprocessing.Rescaling(1/255)(inputs)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
outputs = MetricEmbedding(64)(x)
# Build a specialized Similarity model
model = SimilarityModel(inputs, outputs)
```
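As a side note on why L2 normalization matters for the output embedding, the following standalone sketch (plain Python, not the `MetricEmbedding()` implementation; the helper names are illustrative) shows that once vectors have unit length, cosine similarity and Euclidean distance are tied together by a simple identity.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale `vector` to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# For unit vectors the dot product is the cosine similarity.
cosine_similarity = dot(a, b)
squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))

# For unit vectors, squared Euclidean distance equals 2 - 2 * cosine similarity,
# so ranking neighbors by either quantity gives the same order.
print(cosine_similarity, squared_distance, 2 - 2 * cosine_similarity)
```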
### Training model via contrastive learning
To output metric embeddings that are searchable via approximate nearest neighbor search, the model needs to be trained using a similarity loss. Here we use `MultiSimilarityLoss()`, which is one of the most efficient loss functions.
```python
from tensorflow_similarity.losses import MultiSimilarityLoss
# Train Similarity model using contrastive loss
model.compile('adam', loss=MultiSimilarityLoss())
model.fit(sampler, epochs=5)
```
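For readers curious about what a similarity loss computes, below is a simplified pure-Python sketch of the multi-similarity loss formulation from Wang et al. (2019). It deliberately omits the pair-mining step and works on a precomputed similarity matrix, so it should not be read as the library's `MultiSimilarityLoss()` implementation. The function name and the values of `alpha`, `beta`, and `lmda` are illustrative assumptions.

```python
import math


def multi_similarity_loss(similarity, labels, alpha=2.0, beta=40.0, lmda=0.5):
    """Mean multi-similarity loss over a batch, without the pair-mining step.

    `similarity[i][j]` is the cosine similarity between examples i and j.
    """
    n = len(labels)
    total = 0.0
    for i in range(n):
        pos = [similarity[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        neg = [similarity[i][j] for j in range(n) if labels[j] != labels[i]]
        # Positive pairs are penalized when their similarity falls below lmda.
        pos_term = math.log1p(sum(math.exp(-alpha * (s - lmda)) for s in pos)) / alpha
        # Negative pairs are penalized when their similarity rises above lmda.
        neg_term = math.log1p(sum(math.exp(beta * (s - lmda)) for s in neg)) / beta
        total += pos_term + neg_term
    return total / n


labels = [0, 0, 1, 1]
# Same-class pairs are similar (0.9), cross-class pairs are not (0.1).
well_separated = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
# The opposite situation: cross-class pairs look more alike than same-class pairs.
confused = [
    [1.0, 0.1, 0.9, 0.9],
    [0.1, 1.0, 0.9, 0.9],
    [0.9, 0.9, 1.0, 0.1],
    [0.9, 0.9, 0.1, 1.0],
]

print(multi_similarity_loss(well_separated, labels))
print(multi_similarity_loss(confused, labels))
```

The loss is lower for the well-separated batch than for the confused one, which is the signal that pushes same-class embeddings together and different-class embeddings apart during training.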
### Building images index and querying it
Once the model is trained, reference examples must be indexed via the model index API to be searchable. After indexing, you can use the model lookup API to search the index for the K most similar items.
```python
from tensorflow_similarity.visualization import viz_neigbors_imgs
# Index 100 embedded MNIST examples to make them searchable
sx, sy = sampler.get_slice(0,100)
model.index(x=sx, y=sy, data=sx)
# Find the top 5 most similar indexed MNIST examples for a given example
qx, qy = sampler.get_slice(3713, 1)
nns = model.single_lookup(qx[0])
# Visualize the query example and its top 5 neighbors
viz_neigbors_imgs(qx[0], qy[0], nns)
```
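Conceptually, indexing stores one embedding per reference example and a lookup returns the stored items closest to the query embedding. The toy class below sketches that idea with an exact brute-force scan in plain Python. It is not the library's indexer, which relies on approximate nearest neighbor search to scale; `BruteForceIndex` and its methods are hypothetical names used only for illustration.

```python
import math


class BruteForceIndex:
    """Toy in-memory index: stores embeddings and scans all of them on lookup."""

    def __init__(self):
        self.embeddings = []
        self.labels = []

    def add(self, embedding, label):
        """Store one embedding together with its label."""
        self.embeddings.append(embedding)
        self.labels.append(label)

    def lookup(self, query, k=5):
        """Return the k nearest (distance, label) pairs, closest first."""
        scored = [
            (math.dist(query, emb), label)
            for emb, label in zip(self.embeddings, self.labels)
        ]
        return sorted(scored, key=lambda pair: pair[0])[:k]


index = BruteForceIndex()
index.add([0.0, 1.0], "cat")
index.add([0.1, 0.9], "cat")
index.add([1.0, 0.0], "dog")

# The query embedding sits near the two "cat" embeddings.
for distance, label in index.lookup([0.2, 0.8], k=2):
    print(f"{label}: {distance:.3f}")
```

A brute-force scan costs time proportional to the number of indexed items per query, which is why approximate search is preferred for large indexes.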
</details>
## Supported Algorithms
### Self-Supervised Models
- SimCLR
- SimSiam
- Barlow Twins
### Supervised Losses
- Triplet Loss
- PN Loss
- Multi Sim Loss
- Circle Loss
- Soft Nearest Neighbor Loss
### Metrics
TensorFlow Similarity offers many of the most common metrics used for [classification](api/TFSimilarity/classification_metrics/) and [retrieval](api/TFSimilarity/retrieval_metrics/) evaluation, including:
| Name | Type | Description |
| ---- | ---- | ----------- |
| Precision | Classification | |
| Recall | Classification | |
| F1 Score | Classification | |
| Recall@K | Retrieval | |
| Binary NDCG | Retrieval | |
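As a quick reminder of what some of these metrics measure, here is a small pure-Python sketch using textbook definitions. It is not the library's implementation, and the function names are illustrative. For Recall@K it assumes the convention, common in metric learning, that a query counts as a hit if at least one of its top K neighbors shares its label.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the `positive` class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def recall_at_k(query_labels, neighbor_labels, k):
    """Fraction of queries with at least one same-label neighbor in their top k."""
    hits = sum(
        label in neighbors[:k]
        for label, neighbors in zip(query_labels, neighbor_labels)
    )
    return hits / len(query_labels)


print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 1, 0, 1]))

# Each inner list holds the labels of a query's neighbors, nearest first.
neighbors = [["dog", "cat", "cat"], ["cat", "cat", "dog"]]
for k in (1, 2, 3):
    print(k, recall_at_k(["cat", "dog"], neighbors, k))
```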
## Citing
Please cite this reference if you use any part of TensorFlow Similarity in your research:
```bibtex
@article{EBSIM21,
title={TensorFlow Similarity: A Usable, High-Performance Metric Learning Library},
author={Elie Bursztein, James Long, Shun Lin, Owen Vallis, F