# TensorFlow Similarity: Metric Learning for Humans
TensorFlow Similarity is a [TensorFlow](https://tensorflow.org) library for [similarity learning](https://en.wikipedia.org/wiki/Similarity_learning), also known as metric learning and contrastive learning.
TensorFlow Similarity is still in beta.
## Introduction
TensorFlow Similarity offers state-of-the-art algorithms for metric learning and all the necessary components to research, train, evaluate, and serve similarity-based models.
![Example of nearest neighbors search performed on the embedding generated by a similarity model trained on the Oxford IIIT Pet Dataset.](assets/images/similar-cats-and-dogs.jpg)
With TensorFlow Similarity you can train and serve models that find similar items (such as images) in a large corpus of examples. For example, as shown above, you can train a similarity model to find and cluster similar-looking images of cats and dogs from the [Oxford IIIT Pet Dataset](https://www.tensorflow.org/datasets/catalog/oxford_iiit_pet) by training on only a few classes. To train your own similarity model, see this [notebook](examples/supervised_visualization.ipynb).
Metric learning differs from traditional classification in its objective. The model learns to minimize the distance between similar examples and maximize the distance between dissimilar examples, in a supervised or self-supervised fashion. Either way, TensorFlow Similarity provides the necessary losses, metrics, samplers, visualizers, and indexing sub-system to make this quick and easy.
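One common way to express this objective is a triplet loss, which compares an anchor example with a similar (positive) and a dissimilar (negative) example. The following is a minimal plain-Python sketch of that idea, not the library's implementation; the function names and the `margin` value are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    The loss reaches zero once the negative is at least `margin`
    farther from the anchor than the positive is.
    """
    pos_dist = euclidean(anchor, positive)
    neg_dist = euclidean(anchor, negative)
    return max(0.0, pos_dist - neg_dist + margin)


anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class as the anchor
near_negative = [1.0, 0.0]   # different class, as close as the positive
far_negative = [3.0, 4.0]    # different class, already far away

loss_hard = triplet_loss(anchor, positive, near_negative)
loss_easy = triplet_loss(anchor, positive, far_negative)
print(loss_hard, loss_easy)
```

The near negative produces a positive loss, so training would push it away from the anchor, while the far negative already satisfies the margin and contributes nothing.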
**Currently, TensorFlow Similarity supports supervised training.** In future releases, it will support semi-supervised and self-supervised training.
To learn more about the benefits of using similarity training, you can check out the blog post.
## What's new
- [Aug 21]: Interactive embedding `projector()` added. See this [notebook](examples/supervised_visualization.ipynb).
- [Aug 21]: [`CircleLoss()`](api/TFSimilarity/losses/CircleLoss.md) added
- [Aug 21]: [`PNLoss()`](api/TFSimilarity/losses/PNLoss.md) added.
- [Aug 21]: [`MultiSimilarityLoss()`](api/TFSimilarity/losses/MultiSimilarityLoss.md) added.
For previous changes, see [the release changelog](./releases.md).
## Getting Started
### Installation
Use pip to install the library:
```shell
pip install tensorflow_similarity
```
### Documentation
The detailed and narrated [notebooks](examples/) are a good way to get started with TensorFlow Similarity. There is likely to be one that is similar to your data or your problem (if not, let us know). You can start working with the examples immediately in Google Colab by clicking the Google Colab icon.
For more information about specific functions, you can [check the API documentation](api/).
### Minimal Example: MNIST similarity
Here is a bare-bones example demonstrating how to train a TensorFlow Similarity model on the MNIST data. This example illustrates some of the main components provided by TensorFlow Similarity and how they fit together. Please refer to the [hello_world notebook](examples/supervised_hello_world.ipynb) for a more detailed introduction.
### Preparing data
TensorFlow Similarity provides [data samplers](api/TFSimilarity/samplers/) for various dataset types that balance the batches to ensure smoother training.
In this example, we are using the multi-shot sampler that integrates directly with the TensorFlow dataset catalog.
```python
from tensorflow_similarity.samplers import TFDatasetMultiShotMemorySampler
# Data sampler that generates balanced batches from MNIST dataset
sampler = TFDatasetMultiShotMemorySampler(dataset_name='mnist', class_per_batch=10)
```
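To illustrate what a "balanced batch" means, here is a small standard-library sketch that draws the same number of examples from each sampled class. It is an assumption-laden illustration rather than the sampler's actual code; `balanced_batch` and its parameters are hypothetical names.

```python
import random
from collections import Counter, defaultdict


def balanced_batch(labels, classes_per_batch, examples_per_class, seed=None):
    """Return example indices for one batch with equal counts per sampled class."""
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Pick the classes for this batch, then the examples within each class.
    chosen_classes = rng.sample(sorted(by_class), classes_per_batch)
    batch = []
    for label in chosen_classes:
        batch.extend(rng.sample(by_class[label], examples_per_class))
    return batch


# Toy dataset: 5 classes with 10 examples each.
labels = [i % 5 for i in range(50)]
batch = balanced_batch(labels, classes_per_batch=3, examples_per_class=4, seed=0)
per_class_counts = Counter(labels[i] for i in batch)
print(per_class_counts)
```

Because every class in the batch appears several times, each example has both positives and negatives available within the same batch, which similarity losses typically rely on.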
### Building a Similarity model
Building a TensorFlow Similarity model is similar to building a standard Keras model, except the output layer is usually a [`MetricEmbedding()`](api/TFSimilarity/layers/) layer that enforces L2 normalization and the model is instantiated as a specialized subclass [`SimilarityModel()`](api/TFSimilarity/models/SimilarityModel.md) that supports additional functionality.
```python
from tensorflow.keras import layers
from tensorflow_similarity.layers import MetricEmbedding
from tensorflow_similarity.models import SimilarityModel
# Build a Similarity model using standard Keras layers
inputs = layers.Input(shape=(28, 28, 1))
x = layers.experimental.preprocessing.Rescaling(1/255)(inputs)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
outputs = MetricEmbedding(64)(x)
# Build a specialized Similarity model
model = SimilarityModel(inputs, outputs)
```
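To make the effect of L2 normalization concrete, the sketch below rescales a vector to unit length in plain Python. It is not the `MetricEmbedding()` implementation; `l2_normalize` and `cosine_distance` are hypothetical helpers introduced only for illustration.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


def cosine_distance(a, b):
    """Cosine distance for vectors that are already unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


embedding = l2_normalize([3.0, 4.0])
squared_norm = sum(v * v for v in embedding)

# Orthogonal directions end up at distance 1 regardless of original magnitude.
orthogonal_distance = cosine_distance(l2_normalize([1.0, 0.0]), l2_normalize([0.0, 2.0]))
print(embedding, squared_norm, orthogonal_distance)
```

Once embeddings are unit length, cosine distance reduces to one minus a dot product, so only the direction of an embedding matters when comparing examples.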
### Training model via contrastive learning
To output metric embeddings that are searchable via approximate nearest neighbor search, the model needs to be trained using a similarity loss. Here we are using `MultiSimilarityLoss()`, which is one of the most efficient loss functions.
```python
from tensorflow_similarity.losses import MultiSimilarityLoss
# Train Similarity model using contrastive loss
model.compile('adam', loss=MultiSimilarityLoss())
model.fit(sampler, epochs=5)
```
### Building images index and querying it
Once the model is trained, reference examples must be indexed via the model index API to be searchable. After indexing, you can use the model lookup API to search the index for the K most similar items.
```python
from tensorflow_similarity.visualization import viz_neigbors_imgs
# Index 100 embedded MNIST examples to make them searchable
model.index(x=sampler.x[:100], y=sampler.y[:100], data=sampler.x[:100])
# Find the top 5 most similar indexed MNIST examples for a given example
nns = model.single_lookup(sampler.x[3713])
# Visualize the query example and its top 5 neighbors
viz_neigbors_imgs(sampler.x[3713], sampler.y[3713], nns)
```
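Conceptually, indexing stores embeddings alongside their labels, and a lookup returns the K stored items closest to a query embedding. The sketch below shows that idea with an exact brute-force search in plain Python; it deliberately substitutes exhaustive search for the approximate nearest neighbor search mentioned above, and `BruteForceIndex` is a hypothetical class, not part of the library.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


class BruteForceIndex:
    """Toy index: stores embeddings with labels and scans all of them on lookup."""

    def __init__(self):
        self._items = []  # list of (embedding, label) pairs

    def add(self, embedding, label):
        self._items.append((embedding, label))

    def lookup(self, query, k=5):
        """Return the k closest items as (distance, label), nearest first."""
        scored = [(cosine_distance(query, emb), label) for emb, label in self._items]
        scored.sort(key=lambda pair: pair[0])
        return scored[:k]


index = BruteForceIndex()
index.add([1.0, 0.0], "a")
index.add([0.9, 0.1], "a")
index.add([0.0, 1.0], "b")
index.add([-1.0, 0.0], "c")

neighbors = index.lookup([1.0, 0.05], k=2)
neighbor_labels = [label for _, label in neighbors]
print(neighbor_labels)
```

Exhaustive search is only practical for small indexes; approximate methods trade a little accuracy for much faster lookups on large corpora.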
## Supported Algorithms
### Supervised Losses
- Triplet Loss
- PN Loss
- Multi-Similarity Loss
- Circle Loss
### Metrics
TensorFlow Similarity offers many of the most common metrics used for [classification](api/TFSimilarity/classification_metrics/) and [retrieval](api/TFSimilarity/retrieval_metrics/) evaluation, including:
| Name | Type | Description |
| ---- | ---- | ----------- |
| Precision | Classification | |
| Recall | Classification | |
| F1 Score | Classification | |
| Recall@K | Retrieval | |
| Binary NDCG | Retrieval | |
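
As a rough guide to what some of these metrics measure, here is a plain-Python sketch using their textbook definitions. The function names are hypothetical, and the per-query Recall@K shown here follows a common metric-learning convention (a hit if any of the top K neighbors shares the query's label), which is an assumption rather than a statement about the library's implementation.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 from raw match counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def recall_at_k(query_label, neighbor_labels, k):
    """Return 1.0 if any of the top-k neighbors shares the query label, else 0.0."""
    return 1.0 if query_label in neighbor_labels[:k] else 0.0


precision, recall, f1 = precision_recall_f1(
    true_positives=8, false_positives=2, false_negatives=4
)

# Neighbors are ordered from nearest to farthest.
hit_at_1 = recall_at_k("cat", ["dog", "cat", "dog"], k=1)
hit_at_2 = recall_at_k("cat", ["dog", "cat", "dog"], k=2)
print(precision, recall, f1, hit_at_1, hit_at_2)
```

Averaging the per-query Recall@K value over all queries gives a dataset-level score.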
## Citing
Please cite this reference if you use any part of TensorFlow Similarity in your research:
```bibtex
@article{EBSIM21,
title={TensorFlow Similarity: A Usable, High-Performance Metric Learning Library},
author={Elie Bursztein and James Long and Shun Lim and Owen Vallis and Francois Chollet},
journal={Fixme},
year={2021}
}
```
## Disclaimer
This is not an official Google product.