Python library | taming-transformers-0.0.1.tar.gz


Uploaded 2022-04-15 22:11:01 · 41 KB · GZ

37 files in total: 29 `py`, 4 `txt`, 2 `pkg-info`


taming-transformers-0.0.1.tar.gz (37 files)

    taming-transformers-0.0.1/
        PKG-INFO  342B
        taming/
            models/
                cond_transformer.py  14KB
                vqgan.py  10KB
                __init__.py  0B
            main.py  21KB
            data/
                ade20k.py  5KB
                coco.py  8KB
                utils.py  3KB
                sflckr.py  4KB
                __init__.py  0B
                imagenet.py  20KB
                faceshq.py  5KB
                base.py  3KB
            util.py  5KB
            __init__.py  0B
            modules/
                transformer/
                    permuter.py  7KB
                    __init__.py  0B
                    mingpt.py  13KB
                losses/
                    __init__.py  58B
                    vqperceptual.py  6KB
                    lpips.py  5KB
                discriminator/
                    __init__.py  0B
                    model.py  2KB
                diffusionmodules/
                    __init__.py  0B
                    model.py  30KB
                util.py  3KB
                __init__.py  0B
                vqvae/
                    __init__.py  0B
                    quantize.py  4KB
        setup.cfg  38B
        setup.py  548B
        README.md  15KB
        taming_transformers.egg-info/
            PKG-INFO  342B
            requires.txt  71B
            SOURCES.txt  1KB
            top_level.txt  7B
            dependency_links.txt  1B

# Taming Transformers for High-Resolution Image Synthesis, CVPR 2021 (Oral)

![teaser](assets/mountain.jpeg)

[**Taming Transformers for High-Resolution Image Synthesis**](https://compvis.github.io/taming-transformers/)<br/>
[Patrick Esser](https://github.com/pesser)\*,
[Robin Rombach](https://github.com/rromb)\*,
[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)<br/>
\* equal contribution

**tl;dr** We combine the efficiency of convolutional approaches with the expressivity of transformers by introducing a convolutional VQGAN, which learns a codebook of context-rich visual parts, whose composition is modeled with an autoregressive transformer.

![teaser](assets/teaser.png)

[arXiv](https://arxiv.org/abs/2012.09841) | [BibTeX](#bibtex) | [Project Page](https://compvis.github.io/taming-transformers/)

### News
- We added a [colab notebook](https://colab.research.google.com/github/CompVis/taming-transformers/blob/master/scripts/reconstruction_usage.ipynb) which compares two VQGANs and OpenAI's [DALL-E](https://github.com/openai/DALL-E). See also [this section](#more-resources).
- We now include an overview of pretrained models in [Tab.1](#overview-of-pretrained-models). We added models for [COCO](#coco) and [ADE20k](#ade20k).
- The streamlit demo now supports image completions.
- We now include a couple of examples from the D-RIN dataset so you can run the [D-RIN demo](#d-rin) without preparing the dataset first.
- You can now jump right into sampling with our [Colab quickstart notebook](https://colab.research.google.com/github/CompVis/taming-transformers/blob/master/scripts/taming-transformers.ipynb).

## Requirements
A suitable [conda](https://conda.io/) environment named `taming` can be created and activated with:

```
conda env create -f environment.yaml
conda activate taming
```

## Overview of pretrained models
The following table provides an overview of all models that are currently available.
FID scores were evaluated using [torch-fidelity](https://github.com/toshas/torch-fidelity) and without rejection sampling.
For reference, we also include a link to the recently released autoencoder of the [DALL-E](https://github.com/openai/DALL-E) model.
See the corresponding [colab notebook](https://colab.research.google.com/github/CompVis/taming-transformers/blob/master/scripts/reconstruction_usage.ipynb) for a comparison and discussion of reconstruction capabilities.

| Dataset | FID | Link | Samples (256x256) | Comments |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| FFHQ (f=16) | 11.4 | coming soon... | | |
| CelebA-HQ (f=16) | 10.7 | coming soon... | | |
| ADE20K (f=16) | 35.5 | [ade20k_transformer](https://k00.fr/ot46cksa) | [ade20k_samples.zip](https://heibox.uni-heidelberg.de/f/70bb78cbaf844501b8fb/) [2k] | evaluated on val split (2k images) |
| COCO-Stuff (f=16) | 20.4 | [coco_transformer](https://k00.fr/2zz6i2ce) | [coco_samples.zip](https://heibox.uni-heidelberg.de/f/a395a9be612f4a7a8054/) [5k] | evaluated on val split (5k images) |
| ImageNet (cIN) (f=16) | | coming soon... | | |
| | | | | |
| FacesHQ (f=16) | -- | [faceshq_transformer](https://k00.fr/qqfl2do8) | | |
| S-FLCKR (f=16) | -- | [sflckr](https://heibox.uni-heidelberg.de/d/73487ab6e5314cb5adba/) | | |
| D-RIN (f=16) | -- | [drin_transformer](https://k00.fr/39jcugc5) | | |
| | | | | |
| VQGAN ImageNet (f=16), 1024 | 8.0 | [vqgan_imagenet_f16_1024](https://heibox.uni-heidelberg.de/d/8088892a516d4e3baf92/) | [reconstructions](https://k00.fr/j626x093) | Reconstruction-FIDs evaluated against the validation split of ImageNet on 256x256 images. |
| VQGAN ImageNet (f=16), 16384 | 4.9 | [vqgan_imagenet_f16_16384](https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/) | [reconstructions](https://k00.fr/j626x093) | Reconstruction-FIDs evaluated against the validation split of ImageNet on 256x256 images. |
| | | | | |
| DALL-E VQVAE (f=8), 8192, GumbelQuantization | 34.3 | https://github.com/openai/DALL-E | [reconstructions](https://k00.fr/j626x093) | Reconstruction-FIDs evaluated against the validation split of ImageNet on 256x256 images. |

## Running pretrained models

The commands below will start a streamlit demo which supports sampling at different resolutions and image completions. To run a non-interactive version of the sampling process, replace `streamlit run scripts/sample_conditional.py --` by `python scripts/make_samples.py --outdir <path_to_write_samples_to>` and keep the remaining command line arguments.

### S-FLCKR
![teaser](assets/sunset_and_ocean.jpg)

You can also [run this model in a Colab notebook](https://colab.research.google.com/github/CompVis/taming-transformers/blob/master/scripts/taming-transformers.ipynb), which includes all necessary steps to start sampling.

Download the [2020-11-09T13-31-51_sflckr](https://heibox.uni-heidelberg.de/d/73487ab6e5314cb5adba/) folder and place it into `logs`. Then, run

```
streamlit run scripts/sample_conditional.py -- -r logs/2020-11-09T13-31-51_sflckr/
```

### FacesHQ
![teaser](assets/faceshq.jpg)

Download [2020-11-13T21-41-45_faceshq_transformer](https://k00.fr/qqfl2do8) and place it into `logs`. Follow the data preparation steps for [CelebA-HQ](#celeba-hq) and [FFHQ](#ffhq). Run

```
streamlit run scripts/sample_conditional.py -- -r logs/2020-11-13T21-41-45_faceshq_transformer/
```

### D-RIN
![teaser](assets/drin.jpg)

Download [2020-11-20T12-54-32_drin_transformer](https://k00.fr/39jcugc5) and place it into `logs`.
To run the demo on a couple of example depth maps included in the repository, run

```
streamlit run scripts/sample_conditional.py -- -r logs/2020-11-20T12-54-32_drin_transformer/ --ignore_base_data data="{target: main.DataModuleFromConfig, params: {batch_size: 1, validation: {target: taming.data.imagenet.DRINExamples}}}"
```

To run the demo on the complete validation set, first follow the data preparation steps for [ImageNet](#imagenet) and then run

```
streamlit run scripts/sample_conditional.py -- -r logs/2020-11-20T12-54-32_drin_transformer/
```

### COCO
Download [2021-01-20T16-04-20_coco_transformer](https://k00.fr/2zz6i2ce) and place it into `logs`. To run the demo on a couple of example segmentation maps included in the repository, run

```
streamlit run scripts/sample_conditional.py -- -r logs/2021-01-20T16-04-20_coco_transformer/ --ignore_base_data data="{target: main.DataModuleFromConfig, params: {batch_size: 1, validation: {target: taming.data.coco.Examples}}}"
```

### ADE20k
Download [2020-11-20T21-45-44_ade20k_transformer](https://k00.fr/ot46cksa) and place it into `logs`. To run the demo on a couple of example segmentation maps included in the repository, run

```
streamlit run scripts/sample_conditional.py -- -r logs/2020-11-20T21-45-44_ade20k_transformer/ --ignore_base_data data="{target: main.DataModuleFromConfig, params: {batch_size: 1, validation: {target: taming.data.ade20k.Examples}}}"
```

## Data Preparation

### ImageNet
The code will try to download (through [Academic Torrents](http://academictorrents.com/)) and prepare ImageNet the first time it is used. However, since ImageNet is quite large, this requires a lot of disk space and time. If you already have ImageNet on your disk, you can speed things up by putting the data into `${XDG_CACHE}/autoencoders/data/ILSVRC2012_{split}/data/` (which defaults to `~/.cache/autoencoders/data/ILSVRC2012_{split}/data/`), where `{split}` is one of `train`/`validation`.
It should have the following structure:

```
${XDG_CACHE}/autoencoders/data/ILSVRC2012_{split}/data/
├── n01440764
│   ├── n01440764_10026.JPEG
│   ├── n01440764_10027.JPEG
│   ├── ...
├── n01443537
│   ├── n01443537_10007.JPEG
│   ├── n01443537_10014.JPEG
│   ├── ...
├── ...
```

If you haven't extracted the data, you can also place `ILSVRC2012_img_train.tar`/`ILSVRC2012_img_val.tar` (or symlinks to them) into `${XDG_CACHE}/autoencoders/data/ILSVRC2012_train/` / `${XDG_CACHE}/autoencoders/data/ILSV
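
For already-extracted ImageNet data, the directory layout described above can be checked before the first run with a short script. This is only a sketch and not part of the package: `default_cache_root`, `imagenet_data_dir` and `check_layout` are hypothetical helper names, and reading the cache root from the `XDG_CACHE_HOME` environment variable is an assumption (the text only writes `${XDG_CACHE}` and states the `~/.cache` default).

```python
import os
from pathlib import Path


def default_cache_root() -> Path:
    # Assumption: the cache root is taken from XDG_CACHE_HOME. The README only
    # writes ${XDG_CACHE} and documents ~/.cache as the default location.
    return Path(os.environ.get("XDG_CACHE_HOME", Path.home() / ".cache"))


def imagenet_data_dir(split: str, cache_root: Path) -> Path:
    """Build the data directory named in the README for one split."""
    if split not in ("train", "validation"):
        raise ValueError(f"split must be 'train' or 'validation', got {split!r}")
    return cache_root / "autoencoders" / "data" / f"ILSVRC2012_{split}" / "data"


def check_layout(data_dir: Path) -> dict:
    """Return {synset folder name: number of .JPEG files directly inside it}."""
    counts = {}
    if not data_dir.is_dir():
        return counts  # nothing prepared yet
    for synset in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        counts[synset.name] = sum(
            1 for f in synset.iterdir() if f.suffix == ".JPEG"
        )
    return counts


if __name__ == "__main__":
    for split in ("train", "validation"):
        data_dir = imagenet_data_dir(split, default_cache_root())
        counts = check_layout(data_dir)
        print(f"{split}: {len(counts)} synset folders, "
              f"{sum(counts.values())} images under {data_dir}")
```

A split reporting zero synset folders means nothing was found at the expected location. Synset folders with zero images usually mean the files sit one level too deep or use a different extension.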
