tensorflow_similarity-0.13.6.tar.gz resource - CSDN Library

Points required: 1 · 114 views · Uploaded 2024-03-24 23:50:59 · 61KB GZ

67 files in total

py: 58

txt: 4

pkg-info: 2

tensorflow_similarity-0.13.6.tar.gz (67 files)

tensorflow_similarity-0.13.6

setup.py 2KB

LICENSE 11KB

PKG-INFO 650B

tests

__init__.py 107B

samplers

__init__.py 0B

test_memory_samplers.py 3KB

test_tfdataset_samplers.py 860B

test_distance_metrics.py 3KB

evaluators

__init__.py 0B

test_memory_evaluator.py 5KB

test_distances.py 3KB

test_model.py 1KB

test_losses.py 4KB

test_algebra.py 3KB

matchers

__init__.py 0B

test_nmslib_matcher.py 2KB

test_indexer.py 5KB

test_callbacks.py 2KB

test_metrics.py 4KB

tables

__init__.py 0B

test_memory_table.py 2KB

conftest.py 135B

tensorflow_similarity.egg-info

SOURCES.txt 2KB

top_level.txt 28B

PKG-INFO 650B

requires.txt 234B

dependency_links.txt 1B

tensorflow_similarity

utils.py 565B

__init__.py 23B

architectures

__init__.py 49B

efficientnet.py 4KB

samplers

utils.py 2KB

__init__.py 299B

samplers.py 5KB

tfrecords_samplers.py 5KB

memory_samplers.py 8KB

tfdataset_samplers.py 5KB

metrics.py 15KB

evaluators

evaluator.py 3KB

__init__.py 102B

memory_evaluator.py 10KB

layers.py 771B

algebra.py 3KB

api

__init__.py 526B

visualization.py 3KB

matchers

__init__.py 86B

matcher.py 3KB

nmslib_matcher.py 5KB

indexer.py 23KB

types.py 4KB

losses

utils.py 6KB

__init__.py 225B

metric_loss.py 3KB

circle_loss.py 7KB

triplet_loss.py 7KB

pn_loss.py 8KB

multisim_loss.py 8KB

models

__init__.py 54B

similarity_model.py 21KB

callbacks.py 6KB

tables

__init__.py 79B

table.py 3KB

memory_table.py 6KB

distances.py 8KB

distance_metrics.py 6KB

setup.cfg 38B

README.md 8KB

# TensorFlow Similarity: Metric Learning for Humans

TensorFlow Similarity is a [TensorFlow](https://tensorflow.org) library focused on making metric learning easy. TensorFlow Similarity is still in beta, and some features, such as semi-supervised learning, are not yet implemented.

## Introduction

TensorFlow Similarity offers state-of-the-art algorithms for metric learning and all the components needed to research, train, evaluate, and serve models that learn from similar-looking examples. With it you can quickly and easily:

- Train and serve models that find similar items, such as images, in large indexes.
- Perform semi-supervised or self-supervised training to train or boost classification models when you have a large corpus with few labeled examples. **Not yet available**.

### Supervised models

The objective function in metric learning differs from the one used in traditional classification:

- *Supervised models* learn to output a metric embedding (a 1D float tensor) with the property that if two examples are close in the real world, their embeddings are close in the projected [metric space](https://en.wikipedia.org/wiki/Metric_space).

Representing items by their metric embeddings makes it possible to build indexes that contain "classes" not seen during training, to add classes to the index without retraining, and to work with only a few examples per class for both training and retrieval. This ability to operate on a few examples per class is sometimes referred to as few-shot learning in the literature.

Retrieving similar items from the index is efficient because metric learning makes it possible to run an [Approximate Nearest Neighbor Search](https://en.wikipedia.org/wiki/Nearest_neighbor_search) over the embeddings in sublinear time, instead of the standard [Nearest Neighbor Search](https://en.wikipedia.org/wiki/Nearest_neighbor_search), which must compare each query against every indexed item.
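
To make the contrast concrete, here is a minimal sketch of the exact (brute-force) search that approximate methods avoid. It is plain Python, not TensorFlow Similarity code: the helper name `exact_nearest` and the toy 2D embeddings are made up for illustration. Every query scans all indexed embeddings, so the cost of a single lookup grows with the size of the index.

```python
import math


def exact_nearest(query, index, k=1):
    """Return the k (distance, item_id) pairs closest to `query`.

    Hypothetical helper for illustration only: it compares the query with
    every indexed embedding, which is the work approximate search avoids.
    """
    scored = []
    for item_id, embedding in index.items():
        # Euclidean distance between the query and one indexed embedding.
        scored.append((math.dist(query, embedding), item_id))
    scored.sort()  # smallest distance first
    return scored[:k]


# Toy 2D embeddings; real ones would be produced by a trained model.
toy_index = {
    "cat_1": (0.9, 0.1),
    "cat_2": (0.8, 0.2),
    "dog_1": (0.1, 0.9),
}

neighbors = exact_nearest((0.9, 0.12), toy_index, k=2)
```

The two "cat" embeddings come back first because they lie closest to the query. An approximate index aims to return the same neighbors most of the time without visiting every entry.
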
In practice, TensorFlow Similarity's built-in `Index()`, which leverages [NMSLIB](https://github.com/nmslib/nmslib), can find the closest items in a fraction of a second even when the index contains over 1M elements.

- **Self-supervised contrastive models** help train more accurate models by performing large-scale pretraining that aims to learn a consistent representation of the data. They do so by "contrasting" different representations of the same example generated via data augmentation and/or by contrasting the representations of different examples to separate them. The model is then fine-tuned on the few labeled examples like any classification model. **This part is still a work in progress.**

Overall, TensorFlow Similarity's well-tested, composable components follow Keras best practices so that they integrate seamlessly into your TensorFlow workflows and get you results faster, whether you are doing research or building innovative products.

## What's new

- August 2021 (v0.13.x): Added many new contrastive losses, including Circle Loss, PNLoss, LiftedStructure Loss, and Multisimilarity Loss.

For previous changes, see the [changelog -- Fixme](FIXME)

## Getting Started

### Installation

Use pip to install the library:

```shell
pip install tensorflow_similarity
```

### Documentation

The detailed, narrated notebooks are a good way to get started with TensorFlow Similarity. There is likely to be one that is similar to your data or your problem (if not, let us know). You can start working with the examples immediately in Google Colab by clicking the Google Colab icon.
For more information about specific functions, you can [check the API documentation -- FIXME]()

## Example: MNIST similarity

### Preparing data

```python
from tensorflow_similarity.samplers import TFDatasetMultiShotMemorySampler

# Load MNIST and build batches that contain examples from 10 classes each
sampler = TFDatasetMultiShotMemorySampler(dataset_name='mnist', class_per_batch=10)
```

### Building a Similarity model

```python
from tensorflow.keras import layers
from tensorflow_similarity.layers import MetricEmbedding
from tensorflow_similarity.models import SimilarityModel

# The input shape is taken from the first example held by the sampler
inputs = layers.Input(shape=sampler.x[0].shape)
x = layers.experimental.preprocessing.Rescaling(1/255)(inputs)
x = layers.Conv2D(32, 7, activation='relu')(x)
x = layers.MaxPool2D()(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.Flatten()(x)
# Project the features into a 64-dimensional metric embedding
x = MetricEmbedding(64)(x)
model = SimilarityModel(inputs, x)
```

### Training model via contrastive learning

```python
from tensorflow_similarity.losses import TripletLoss

# Use TripletLoss to project the examples into the metric space
tloss = TripletLoss()
model.compile('adam', loss=tloss)
model.fit(sampler, epochs=5)
```

### Building images index and querying it

```python
from tensorflow_similarity.visualization import viz_neigbors_imgs

# Index embeddings for fast retrieval via ANN
model.index(x=sampler.x[:100], y=sampler.y[:100], data=sampler.x[:100])

# Look up the indexed images nearest to the query example
nns = model.single_lookup(sampler.x[4242])

# Visualize the results
viz_neigbors_imgs(sampler.x[4242], sampler.y[4242], nns)
```

## Supported Algorithms

### Supervised learning

| Name | Description |
| ----------- | ----------- |
| Triplet Loss | |
| PN Loss | |
| Multi Loss | |
| Circle Loss | |

## Package components

![TensorFlow Similarity Overview](api/images/tfsim_overview.png)

TensorFlow Similarity, as shown in the diagram above, offers the following components to help research, train, evaluate, and serve metric models:

- **`SimilarityModel()`**: This class subclasses the `tf.keras.Model` class and extends it with additional properties that are useful for metric learning. For example, it adds the methods:
  1. `index()`: Enables indexing of the embeddings.
  2. `lookup()`: Takes samples, calls `predict()`, and searches for neighbors within the index.

- **`MetricLoss()`**: This virtual class, which extends the `tf.keras.losses.Loss` class, is the base class from which metric losses are derived. This subclassing ensures proper error checking (that is, it ensures the user is training the model with a metric loss), enables better static analysis, and enforces additional constraints such as using a distance function that is supported by the index. Additionally, metric losses use the fully tested and highly optimized pairwise distance functions provided by TF Similarity, which are available under the `Distances.*` classes.

- **`Samplers()`**: Samplers ensure that each batch has at least n (with n >= 2) examples of each class, because losses such as TripletLoss cannot work properly if this condition is not met. TF Similarity offers an in-memory sampler for small datasets and a TFRecordDataset sampler for large-scale ones.

- **`Indexer()`**: The Indexer and its sub-components index known embeddings alongside their metadata. The embedding metadata is stored in the `Table()`, while the `Matcher()` performs [fast approximate neighbor searches](https://en.wikipedia.org/wiki/Nearest_neighbor_search) to quickly retrieve the indexed elements that are closest to the embeddings supplied to the `lookup()` and `single_lookup()` functions. The `Evaluator()` component computes `EvalMetrics()` on a specific index for evaluation and calibration purposes.

  The default `Index()` sub-components run in memory and are optimized for interactive settings such as Jupyter notebooks, Colab, and metric computation during training (e.g., using the provided `EvalCallback()`).
Indexes are serialized as part of `model.save()`, so you can reload them via `model.index_load()` for serving or for further training and evaluation. The default implementation scales easily to medium-sized deployments (1M-10M+ points), provided the machines used have enough memory. For very large-scale deployments you will need to subclass the components to match your own architecture. See FIXME colab to see how to deploy TF Similarity in production.

For more information about a giv
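
The batch constraint described for `Samplers()` above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the library's sampler: `balanced_batch` and its parameters are hypothetical names, and the sketch assumes every class has at least `examples_per_class` items. It draws exactly `examples_per_class` examples for each selected class, which satisfies the "at least n, with n >= 2" requirement for the classes present in the batch.

```python
import random
from collections import defaultdict


def balanced_batch(labels, classes_per_batch, examples_per_class, rng=random):
    """Return dataset indices that form one class-balanced batch.

    Hypothetical helper: picks `classes_per_batch` classes at random, then
    `examples_per_class` distinct examples of each picked class.
    """
    if examples_per_class < 2:
        raise ValueError("metric losses need at least 2 examples per class")

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen_classes = rng.sample(sorted(by_class), classes_per_batch)
    batch = []
    for label in chosen_classes:
        batch.extend(rng.sample(by_class[label], examples_per_class))
    return batch


# Toy labels: 3 classes with 4 examples each.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
batch = balanced_batch(
    toy_labels, classes_per_batch=2, examples_per_class=2, rng=random.Random(0)
)
```

With a batch built this way, every anchor example has at least one other example of the same class in the batch, which is what a loss such as TripletLoss needs in order to form positive pairs.
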
