<img src='imgs/teaser_720.gif' align="right" width=360>
<br><br><br><br>
# pix2pixHD
### [Project](https://tcwang0509.github.io/pix2pixHD/) | [YouTube](https://youtu.be/3AIpPlzM_qs) | [Paper](https://arxiv.org/pdf/1711.11585.pdf) <br>
PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translation. It can be used to turn semantic label maps into photo-realistic images or to synthesize portraits from face label maps. <br><br>
[High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs](https://tcwang0509.github.io/pix2pixHD/)
[Ting-Chun Wang](https://tcwang0509.github.io/)<sup>1</sup>, [Ming-Yu Liu](http://mingyuliu.net/)<sup>1</sup>, [Jun-Yan Zhu](http://people.eecs.berkeley.edu/~junyanz/)<sup>2</sup>, Andrew Tao<sup>1</sup>, [Jan Kautz](http://jankautz.com/)<sup>1</sup>, [Bryan Catanzaro](http://catanzaro.name/)<sup>1</sup>
<sup>1</sup>NVIDIA Corporation, <sup>2</sup>UC Berkeley
In CVPR 2018.
## Image-to-image translation at 2k/1k resolution
- Our label-to-streetview results
<p align='center'>
<img src='imgs/teaser_label.png' width='400'/>
<img src='imgs/teaser_ours.jpg' width='400'/>
</p>
- Interactive editing results
<p align='center'>
<img src='imgs/teaser_style.gif' width='400'/>
<img src='imgs/teaser_label.gif' width='400'/>
</p>
- Additional streetview results
<p align='center'>
<img src='imgs/cityscapes_1.jpg' width='400'/>
<img src='imgs/cityscapes_2.jpg' width='400'/>
</p>
<p align='center'>
<img src='imgs/cityscapes_3.jpg' width='400'/>
<img src='imgs/cityscapes_4.jpg' width='400'/>
</p>
- Label-to-face and interactive editing results
<p align='center'>
<img src='imgs/face1_1.jpg' width='250'/>
<img src='imgs/face1_2.jpg' width='250'/>
<img src='imgs/face1_3.jpg' width='250'/>
</p>
<p align='center'>
<img src='imgs/face2_1.jpg' width='250'/>
<img src='imgs/face2_2.jpg' width='250'/>
<img src='imgs/face2_3.jpg' width='250'/>
</p>
- Our editing interface
<p align='center'>
<img src='imgs/city_short.gif' width='330'/>
<img src='imgs/face_short.gif' width='450'/>
</p>
## Prerequisites
- Linux or macOS
- Python 2 or 3
- NVIDIA GPU (11G memory or larger) + CUDA + cuDNN
## Getting Started
### Installation
- Install PyTorch and dependencies from http://pytorch.org
- Install the Python library [dominate](https://github.com/Knio/dominate):
```bash
pip install dominate
```
- Clone this repo:
```bash
git clone https://github.com/NVIDIA/pix2pixHD
cd pix2pixHD
```
### Testing
- A few example Cityscapes test images are included in the `datasets` folder.
- Please download the pre-trained Cityscapes model from [here](https://drive.google.com/file/d/1h9SykUnuZul7J3Nbms2QGH1wa85nbN2-/view?usp=sharing) (Google Drive link) and put it under `./checkpoints/label2city_1024p/`.
- Test the model (`bash ./scripts/test_1024p.sh`):
```bash
#!./scripts/test_1024p.sh
python test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none
```
The test results will be saved to an HTML file: `./results/label2city_1024p/test_latest/index.html`.
More example scripts can be found in the `scripts` directory.
### Dataset
- We use the Cityscapes dataset. To train a model on the full dataset, please download it from the [official website](https://www.cityscapes-dataset.com/) (registration required).
After downloading, please put it under the `datasets` folder in the same way the example images are provided.
### Training
- Train a model at 1024 x 512 resolution (`bash ./scripts/train_512p.sh`):
```bash
#!./scripts/train_512p.sh
python train.py --name label2city_512p
```
- To view training results, please check out the intermediate results in `./checkpoints/label2city_512p/web/index.html`.
If you have TensorFlow installed, you can see TensorBoard logs in `./checkpoints/label2city_512p/logs` by adding `--tf_log` to the training scripts.
### Multi-GPU training
- Train a model using multiple GPUs (`bash ./scripts/train_512p_multigpu.sh`):
```bash
#!./scripts/train_512p_multigpu.sh
python train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7
```
Note: this is not tested, and we trained our model using a single GPU only. Please use at your own discretion.
### Training with Automatic Mixed Precision (AMP) for faster speed
- To train with mixed-precision support, please first install Apex from https://github.com/NVIDIA/apex.
- You can then train the model by adding `--fp16`. For example,
```bash
#!./scripts/train_512p_fp16.sh
python -m torch.distributed.launch train.py --name label2city_512p --fp16
```
In our test case, it trains about 80% faster with AMP on a Volta machine.
### Training at full resolution
- Training at full resolution (2048 x 1024) requires a GPU with 24G of memory (`bash ./scripts/train_1024p_24G.sh`), or 16G when using mixed precision (AMP).
- If only GPUs with 12G memory are available, please use the 12G script (`bash ./scripts/train_1024p_12G.sh`), which will crop the images during training. Performance is not guaranteed with this script.
### Training with your own dataset
- If you want to train with your own dataset, please generate one-channel label maps whose pixel values correspond to the object labels (i.e. 0, 1, ..., N-1, where N is the number of labels), since we need to generate one-hot vectors from the label maps. Please also specify `--label_nc N` during both training and testing.
- If your input is not a label map, please just specify `--label_nc 0` which will directly use the RGB colors as input. The folders should then be named `train_A`, `train_B` instead of `train_label`, `train_img`, where the goal is to translate images from A to B.
- If you don't have instance maps or don't want to use them, please specify `--no_instance`.
- The default setting for preprocessing is `scale_width`, which will scale the width of all training images to `opt.loadSize` (1024) while keeping the aspect ratio. If you want a different setting, please change it by using the `--resize_or_crop` option. For example, `scale_width_and_crop` first resizes the image to have width `opt.loadSize` and then does random cropping of size `(opt.fineSize, opt.fineSize)`. `crop` skips the resizing step and only performs random cropping. If you don't want any preprocessing, please specify `none`, which will do nothing other than making sure the image is divisible by 32.
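The size arithmetic behind these preprocessing modes can be sketched as follows. This is an illustrative sketch only, not the repo's code: the actual transforms live in `data/base_dataset.py`, the function names here are made up, and the exact rounding rule there may differ.

```python
def scale_width_size(w, h, target_w=1024):
    """Target size under `scale_width`: set width to target_w, keep the aspect ratio."""
    if w == target_w:
        return w, h
    return target_w, int(round(h * target_w / w))

def make_divisible(w, h, base=32):
    """Round each dimension to a multiple of `base`, as the `none` mode must ensure."""
    return max(base, int(round(w / base)) * base), max(base, int(round(h / base)) * base)

# A 2048x1024 Cityscapes frame under scale_width:
print(scale_width_size(2048, 1024))  # (1024, 512)
# An odd-sized image adjusted for divisibility by 32:
print(make_divisible(1000, 500))     # (992, 512)
```

The divisibility requirement comes from the generator's repeated downsampling: each stride-2 layer halves the spatial size, so the input must divide evenly through all of them.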
## More Training/Test Details
- Flags: see `options/train_options.py` and `options/base_options.py` for all the training flags; see `options/test_options.py` and `options/base_options.py` for all the test flags.
- Instance map: we take in both label maps and instance maps as input. If you don't want to use instance maps, please specify the flag `--no_instance`.
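A minimal NumPy sketch of these two input encodings, assuming integer-valued label and instance maps; the repo does the equivalent on the GPU (e.g. with `scatter_` for the one-hot step), and the function names here are illustrative:

```python
import numpy as np

def one_hot(label, num_classes):
    """(H, W) integer label map -> (H, W, num_classes) one-hot encoding.
    Requires labels in [0, num_classes), which is why --label_nc must match."""
    return (np.arange(num_classes) == label[..., None]).astype(np.float32)

def instance_edges(inst):
    """(H, W) instance map -> binary boundary map: 1 where a pixel differs
    from a horizontal or vertical neighbour (separates adjacent objects
    of the same class)."""
    edge = np.zeros(inst.shape, dtype=np.uint8)
    edge[:, 1:]  |= inst[:, 1:] != inst[:, :-1]
    edge[:, :-1] |= inst[:, 1:] != inst[:, :-1]
    edge[1:, :]  |= inst[1:, :] != inst[:-1, :]
    edge[:-1, :] |= inst[1:, :] != inst[:-1, :]
    return edge

label = np.array([[0, 1], [2, 0]])
inst = np.array([[1, 1, 2], [1, 1, 2]])
print(one_hot(label, 3).shape)  # (2, 2, 3)
print(instance_edges(inst))     # boundary lit between instances 1 and 2
```

The boundary map is what lets the network tell apart neighbouring objects that share a semantic label (e.g. a row of parked cars).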
## Citation
If you find this useful for your research, please cite our paper:
```
@inproceedings{wang2018pix2pixHD,
  title={High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs},
  author={Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Andrew Tao and Jan Kautz and Bryan Catanzaro},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}
```
## Acknowledgments
This code borrows heavily from [pytorch-CycleGAN-and-pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix).