# augmeNNt
This repository is intended first as a faster drop-in replacement for PyTorch's default Torchvision augmentations in the "transforms" [package](https://github.com/pytorch/vision/tree/master/torchvision/transforms), based on NumPy and OpenCV (PIL-free) for computer vision pipelines. Additionally, many useful functions and augmentations for image-to-image translation, super-resolution and restoration (deblurring, denoising, etc.) are available.
## Supported Augmentations
Most functions from the original Torchvision transforms are reimplemented, with some considerations:
1. ToPILImage is not implemented or needed; OpenCV is used instead (`ToCVImage`). However, the original ToPILImage from torchvision's transforms can still be used to save a tensor as a PIL image if required. Once converted to tensor format, images have RGB channel order in both cases.
2. OpenCV images are NumPy arrays. OpenCV supports uint8, int8, uint16, int16, int32, float32 and float64. Certain operations (like `cv.CvtColor()`) require converting the arrays to an OpenCV type (with `cv.fromarray()`).
3. The original affine transform has only 5 degrees of freedom. YU-Zhiyang implemented an affine transform with 6 degrees of freedom, called `RandomAffine6` (found in [transforms.py](augmennt/transforms.py)). The original method `RandomAffine` is also available, reimplemented with OpenCV.
4. The rotate function here is clockwise, whereas the original one is anticlockwise.
5. Some new augmentations have been added compared to Torchvision's; refer to the list below.
6. **The outputs of the OpenCV versions are almost identical to the original ones (this can be verified by running [test.py](/test.py) directly with test images).**
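The rotation-direction difference in point 4 can be illustrated with plain NumPy, independently of any augmeNNt or torchvision call (a sketch of the two conventions only, not of the library's internals):

```python
import numpy as np

# A tiny 2x2 "image" with a single bright pixel in the top-left corner.
img = np.array([[1, 0],
                [0, 0]])

# Anticlockwise 90 degrees (torchvision's convention for positive angles):
# the bright pixel ends up in the bottom-left corner.
ccw = np.rot90(img, k=1)

# Clockwise 90 degrees (the convention of the OpenCV-based rotate here):
# the bright pixel ends up in the top-right corner.
cw = np.rot90(img, k=-1)
```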
These are the basic transforms, equivalent to torchvision's:
- `Compose`, `ToTensor`, `ToCVImage`, `Normalize`,
- `Resize`, `CenterCrop`, `Pad`,
- `Lambda` (see [note](#attention)),
- `RandomApply`, `RandomOrder`, `RandomChoice`, `RandomCrop`,
- `RandomHorizontalFlip`, `RandomVerticalFlip`, `RandomResizedCrop`,
- `FiveCrop`, `TenCrop`, `LinearTransformation`, `ColorJitter`,
- `RandomRotation`, `RandomAffine`,
- `Grayscale`, `RandomGrayscale`, `RandomErasing`
The additional transforms can be used to train models such as [Noise2Noise](https://arxiv.org/pdf/1803.04189.pdf), [BSRGAN](https://arxiv.org/pdf/2103.14006v1.pdf), [Real-ESRGAN](https://arxiv.org/pdf/2107.10833.pdf), [White-box Cartoonization](https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_Learning_to_Cartoonize_Using_White-Box_Cartoon_Representations_CVPR_2020_paper.pdf) and [EdgeConnect](https://openaccess.thecvf.com/content_ICCVW_2019/papers/AIM/Nazeri_EdgeConnect_Structure_Guided_Image_Inpainting_using_Edge_Prediction_ICCVW_2019_paper.pdf), among others. There are some general augmentations:
- `RandomAffine6`, `Cutout`, `RandomPerspective`
Noise augmentations, with options for artificial noises and realistic noise generation:
- `RandomGaussianNoise`, `RandomPoissonNoise`, `RandomSPNoise`,
- `RandomSpeckleNoise`, `RandomCompression`,
- `BayerDitherNoise`, `FSDitherNoise`, `AverageBWDitherNoise`, `BayerBWDitherNoise`,
- `BinBWDitherNoise`, `FSBWDitherNoise`, `RandomBWDitherNoise`,
- `RandomCameraNoise`, `RandomChromaticAberration`
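The core idea behind the additive noise transforms such as `RandomGaussianNoise` can be sketched in a few lines of NumPy (the function name, parameters and clipping behavior below are assumptions for illustration, not the library's API):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, mean=0.0, std=10.0):
    """Additive Gaussian noise on a uint8 image: sample noise per pixel,
    add it in float precision, then clip back to the valid [0, 255] range."""
    noise = rng.normal(mean, std, img.shape)
    noisy = img.astype(np.float64) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

clean = np.full((8, 8, 3), 128, dtype=np.uint8)
noisy = add_gaussian_noise(clean)
```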
Blurs and generation/use of different kinds of kernels, including standard blurs, isotropic and anisotropic Gaussian filters, and simple and complex motion-blur kernels:
- `RandomAverageBlur`, `RandomBilateralBlur`, `RandomBoxBlur`,
- `RandomGaussianBlur`, `RandomMedianBlur`,
- `RandomMotionBlur`, `RandomComplexMotionBlur`,
- `RandomAnIsoBlur`, `AlignedDownsample`, `ApplyKernel`,
- `RandomSincBlur`
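As a rough picture of what the simplest of these do, an average (box) blur is just a local mean. A naive 2D NumPy version is sketched below; the real `RandomAverageBlur` presumably relies on OpenCV and randomizes the kernel size, which is an assumption here:

```python
import numpy as np

def box_blur(img, k=3):
    """Naive average (box) blur: each output pixel is the mean of the
    k x k neighborhood around it, with edge-replicated padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty(img.shape, dtype=np.float64)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + k, x:x + k].mean()
    return out
```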
Filters to modify the images, including color quantization, superpixel segmentation and CLAHE:
- `FilterMaxRGB`, `FilterColorBalance`, `FilterUnsharp`,
- `SimpleQuantize`, `RandomQuantize`, `RandomQuantizeSOM`,
- `CLAHE`, `RandomGamma`, `Superpixels`
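The idea behind uniform color quantization (as in `SimpleQuantize`; the library's exact algorithm and parameters may differ from this sketch) is mapping each channel value to the midpoint of its bucket:

```python
import numpy as np

def simple_quantize(img, levels=4):
    """Uniform quantization of a uint8 image to `levels` values per
    channel: bucket each value, then snap it to the bucket midpoint."""
    step = 256 // levels
    return ((img // step) * step + step // 2).astype(np.uint8)

# With levels=4 every channel value collapses to one of {32, 96, 160, 224}.
```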
Edge filters:
- `FilterCanny`
## Requirements
- python >= 3.5.2
- numpy >= 1.10 ('@' operator may not be overloaded before this version)
- pytorch >= 0.4.1
- A working installation of OpenCV. **Tested with OpenCV version 3.4.2, 4.1.0**
- Tested on Windows 10 and Ubuntu 18.04.
## Optional requirements
- torchvision >= 0.2.1
To use the additional Superpixels options (skimage SLIC and Felzenszwalb algorithms), the segment-reduction algorithms (selective search and RAG merging), the Menon demosaicing algorithm and the sinc filter, there are additional requirements:
- scikit-image >= 0.17.2
- scipy >= 1.6.2
## Usage
1. `git clone https://github.com/victorca25/augmennt.git`
2. Add `augmennt` to your python path.
3. Add `from augmennt import augmennt as transforms` in your python file.
4. From here, almost everything should work exactly as the original `transforms`.
### Example: Image resizing
```python
import numpy as np
from augmennt import augmennt as transforms
image = np.random.randint(low=0, high=256, size=(1024, 2048, 3), dtype=np.uint8)
resize = transforms.Resize(size=(256, 256))
image = resize(image)
```
This should be 1.5 to 10 times faster than PIL; see the [benchmarks](#performance) below.
### Example: Composing transformations
```py
transform = transforms.Compose([
transforms.RandomAffine(degrees=10, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=(-10, 0)),
transforms.Resize(size=(350, 350), interpolation="BILINEAR"),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```
More examples can be found in the official Pytorch [tutorials](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html).
## Attention
The multiprocessing used in PyTorch's dataloader may have issues with lambda functions (used via `Lambda` in [transforms.py](augmennt/transforms.py)) on Windows, as lambda functions can't be pickled (<https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled>). The same issue occurs with Torchvision's `Lambda`.
These issues occur when using `num_workers > 0` in a PyTorch `DataLoader` if the transformations are initialized in the dataset class's `__init__`. They can be prevented either by using proper named functions (not lambdas) when composing the transformations, or by initializing the transformations in the `DataLoader` call instead.
## Performance
The following performance tests were executed by jbohnslav.
- Most transformations are between 1.5X and ~4X faster in OpenCV. Large image resizes are up to 10 times faster in OpenCV.
- To reproduce the following benchmarks, download the [Cityscapes dataset](https://www.cityscapes-dataset.com/).
An example benchmarking file used by jbohnslav can be found in the notebook **benchmarking_v2.ipynb**, where the Cityscapes default directories are wrapped with an HDF5 file for even faster reading. (Note: this file has not been updated or tested for a long time, but can serve as a reference.)
![resize](benchmarks/benchmarking_Resize.png)
![random crop](benchmarks/benchmarking_Random_crop_quarter_size.png)
![change brightness](benchmarks/benchmarking_Color_brightness_only.png)
![change brightness and contrast](benchmarks/benchmarking_Color_constrast_and_brightness.png)
![change contrast only](benchmarks/benchmarking_Color_contrast_only.png)
![random horizontal flips](benchmarks/benchmarking_Random_horizontal_flip.png)
The changes start to add up when you compose multiple transformations together.
![composed transformations](benchmarks/benchmarking_Resize_flip_brightness_contrast_rotate.png)
cv2 is around three times faster than regular Pillow, as shown in this [article](https://www.kaggle.com/vfdev5/pil-vs-opencv).
Additionally, the [Albumentations project](https://github.com/albumentations-team/albumentations), mostly based on NumPy and OpenCV, has also shown better performance than other options, including torchvision with a fast Pillow-SIMD backend.
But it can also be the case that Pillow-SIMD comes out ahead for specific operations, so it is worth benchmarking your own pipeline.