<img src='imgs/teaser_720.gif' align="right" width=360>
<br><br><br><br>
# pix2pixHD
### [Project](https://tcwang0509.github.io/pix2pixHD/) | [YouTube](https://youtu.be/3AIpPlzM_qs) | [Paper](https://arxiv.org/pdf/1711.11585.pdf) <br>
PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translation. It can be used to turn semantic label maps into photo-realistic images or to synthesize portraits from face label maps. <br><br>
[High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs](https://tcwang0509.github.io/pix2pixHD/)
[Ting-Chun Wang](https://tcwang0509.github.io/)<sup>1</sup>, [Ming-Yu Liu](http://mingyuliu.net/)<sup>1</sup>, [Jun-Yan Zhu](http://people.eecs.berkeley.edu/~junyanz/)<sup>2</sup>, Andrew Tao<sup>1</sup>, [Jan Kautz](http://jankautz.com/)<sup>1</sup>, [Bryan Catanzaro](http://catanzaro.name/)<sup>1</sup>
<sup>1</sup>NVIDIA Corporation, <sup>2</sup>UC Berkeley
In CVPR 2018.
## Image-to-image translation at 2k/1k resolution
- Our label-to-streetview results
<p align='center'>
<img src='imgs/teaser_label.png' width='400'/>
<img src='imgs/teaser_ours.jpg' width='400'/>
</p>
- Interactive editing results
<p align='center'>
<img src='imgs/teaser_style.gif' width='400'/>
<img src='imgs/teaser_label.gif' width='400'/>
</p>
- Additional streetview results
<p align='center'>
<img src='imgs/cityscapes_1.jpg' width='400'/>
<img src='imgs/cityscapes_2.jpg' width='400'/>
</p>
<p align='center'>
<img src='imgs/cityscapes_3.jpg' width='400'/>
<img src='imgs/cityscapes_4.jpg' width='400'/>
</p>
- Label-to-face and interactive editing results
<p align='center'>
<img src='imgs/face1_1.jpg' width='250'/>
<img src='imgs/face1_2.jpg' width='250'/>
<img src='imgs/face1_3.jpg' width='250'/>
</p>
<p align='center'>
<img src='imgs/face2_1.jpg' width='250'/>
<img src='imgs/face2_2.jpg' width='250'/>
<img src='imgs/face2_3.jpg' width='250'/>
</p>
- Our editing interface
<p align='center'>
<img src='imgs/city_short.gif' width='330'/>
<img src='imgs/face_short.gif' width='450'/>
</p>
## Prerequisites
- Linux or macOS
- Python 2 or 3
- NVIDIA GPU (11G memory or larger) + CUDA + cuDNN
## Getting Started
### Installation
- Install PyTorch and dependencies from http://pytorch.org
- Install the Python library [dominate](https://github.com/Knio/dominate):
```bash
pip install dominate
```
- Clone this repo:
```bash
git clone https://github.com/NVIDIA/pix2pixHD
cd pix2pixHD
```
### Testing
- A few example Cityscapes test images are included in the `datasets` folder.
- Please download the pre-trained Cityscapes model from [here](https://drive.google.com/file/d/1h9SykUnuZul7J3Nbms2QGH1wa85nbN2-/view?usp=sharing) (Google Drive link) and put it under `./checkpoints/label2city_1024p/`.
- Test the model (`bash ./scripts/test_1024p.sh`):
```bash
#!./scripts/test_1024p.sh
python test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none
```
The test results will be saved to an HTML file: `./results/label2city_1024p/test_latest/index.html`.
More example scripts can be found in the `scripts` directory.
### Dataset
- We use the Cityscapes dataset. To train a model on the full dataset, please download it from the [official website](https://www.cityscapes-dataset.com/) (registration required).
After downloading, please put it under the `datasets` folder in the same way the example images are provided.
### Training
- Train a model at 1024 x 512 resolution (`bash ./scripts/train_512p.sh`):
```bash
#!./scripts/train_512p.sh
python train.py --name label2city_512p
```
- To view training results, please check out the intermediate results in `./checkpoints/label2city_512p/web/index.html`.
If you have TensorFlow installed, you can see TensorBoard logs in `./checkpoints/label2city_512p/logs` by adding `--tf_log` to the training scripts.
### Multi-GPU training
- Train a model using multiple GPUs (`bash ./scripts/train_512p_multigpu.sh`):
```bash
#!./scripts/train_512p_multigpu.sh
python train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7
```
Note: this is not tested, and we trained our model using a single GPU only. Please use at your own discretion.
### Training with Automatic Mixed Precision (AMP) for faster speed
- To train with mixed-precision support, please first install Apex from https://github.com/NVIDIA/apex.
- You can then train the model by adding `--fp16`. For example,
```bash
#!./scripts/train_512p_fp16.sh
python -m torch.distributed.launch train.py --name label2city_512p --fp16
```
In our test case, it trains about 80% faster with AMP on a Volta machine.
### Training at full resolution
- Training at full resolution (2048 x 1024) requires a GPU with 24G of memory (`bash ./scripts/train_1024p_24G.sh`), or 16G when using mixed precision (AMP).
- If only GPUs with 12G memory are available, please use the 12G script (`bash ./scripts/train_1024p_12G.sh`), which will crop the images during training. Performance is not guaranteed with this script.
### Training with your own dataset
- If you want to train with your own dataset, please generate one-channel label maps whose pixel values correspond to the object labels (i.e. 0, 1, ..., N-1, where N is the number of labels), since we need to generate one-hot vectors from the label maps. Please also specify `--label_nc N` during both training and testing.
- If your input is not a label map, please just specify `--label_nc 0` which will directly use the RGB colors as input. The folders should then be named `train_A`, `train_B` instead of `train_label`, `train_img`, where the goal is to translate images from A to B.
- If you don't have instance maps or don't want to use them, please specify `--no_instance`.
- The default setting for preprocessing is `scale_width`, which will scale the width of all training images to `opt.loadSize` (1024) while keeping the aspect ratio. If you want a different setting, please change it by using the `--resize_or_crop` option. For example, `scale_width_and_crop` first resizes the image to have width `opt.loadSize` and then does random cropping of size `(opt.fineSize, opt.fineSize)`. `crop` skips the resizing step and only performs random cropping. If you don't want any preprocessing, please specify `none`, which will do nothing other than making sure the image is divisible by 32.
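The size arithmetic behind these preprocessing modes can be sketched as follows. This is an illustrative sketch only, not the repo's code: the actual transforms live in `data/base_dataset.py`, the function names here are made up, and the exact rounding rule there may differ.

```python
def scale_width_size(w, h, target_w=1024):
    """Target size under `scale_width`: set width to target_w, keep the aspect ratio."""
    if w == target_w:
        return w, h
    return target_w, int(round(h * target_w / w))

def make_divisible(w, h, base=32):
    """Round each dimension to a multiple of `base`, as the `none` mode must ensure."""
    return max(base, int(round(w / base)) * base), max(base, int(round(h / base)) * base)

# A 2048x1024 Cityscapes frame under scale_width:
print(scale_width_size(2048, 1024))  # (1024, 512)
# An odd-sized image adjusted for divisibility by 32:
print(make_divisible(1000, 500))     # (992, 512)
```

The divisibility requirement comes from the generator's repeated downsampling: each stride-2 layer halves the spatial size, so the input must divide evenly through all of them.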
## More Training/Test Details
- Flags: see `options/train_options.py` and `options/base_options.py` for all the training flags; see `options/test_options.py` and `options/base_options.py` for all the test flags.
- Instance map: we take in both label maps and instance maps as input. If you don't want to use instance maps, please specify the flag `--no_instance`.
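A minimal NumPy sketch of these two input encodings, assuming integer-valued label and instance maps; the repo does the equivalent on the GPU (e.g. with `scatter_` for the one-hot step), and the function names here are illustrative:

```python
import numpy as np

def one_hot(label, num_classes):
    """(H, W) integer label map -> (H, W, num_classes) one-hot encoding.
    Requires labels in [0, num_classes), which is why --label_nc must match."""
    return (np.arange(num_classes) == label[..., None]).astype(np.float32)

def instance_edges(inst):
    """(H, W) instance map -> binary boundary map: 1 where a pixel differs
    from a horizontal or vertical neighbour (separates adjacent objects
    of the same class)."""
    edge = np.zeros(inst.shape, dtype=np.uint8)
    edge[:, 1:]  |= inst[:, 1:] != inst[:, :-1]
    edge[:, :-1] |= inst[:, 1:] != inst[:, :-1]
    edge[1:, :]  |= inst[1:, :] != inst[:-1, :]
    edge[:-1, :] |= inst[1:, :] != inst[:-1, :]
    return edge

label = np.array([[0, 1], [2, 0]])
inst = np.array([[1, 1, 2], [1, 1, 2]])
print(one_hot(label, 3).shape)  # (2, 2, 3)
print(instance_edges(inst))     # boundary lit between instances 1 and 2
```

The boundary map is what lets the network tell apart neighbouring objects that share a semantic label (e.g. a row of parked cars).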
## Citation
If you find this useful for your research, please cite our paper:
```
@inproceedings{wang2018pix2pixHD,
  title={High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs},
  author={Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Andrew Tao and Jan Kautz and Bryan Catanzaro},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}
```
## Acknowledgments
This code borrows heavily from [pytorch-CycleGAN-and-pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix).