# augmeNNt
This repository is intended first as a faster drop-in replacement for PyTorch's default Torchvision augmentations in the "transforms" [package](https://github.com/pytorch/vision/tree/master/torchvision/transforms), based on NumPy and OpenCV (PIL-free) for computer vision pipelines. Additionally, many useful functions and augmentations for image-to-image translation, super-resolution and restoration (deblurring, denoising, etc.) are available.
## Supported Augmentations
Most functions from the original Torchvision transforms are reimplemented, with some considerations:
1. ToPILImage is not implemented or needed; OpenCV is used instead (`ToCVImage`). However, the original ToPILImage from torchvision's transforms can still be used to save a tensor as a PIL image if required. Once converted to tensor format, images have RGB channel order in both cases.
2. OpenCV images are NumPy arrays. OpenCV supports uint8, int8, uint16, int16, int32, float32 and float64. Certain operations (like `cv.CvtColor()`) require converting the arrays to an OpenCV type (with `cv.fromarray()`).
3. The original affine transform has only 5 degrees of freedom. YU-Zhiyang implemented an affine transform with 6 degrees of freedom, called `RandomAffine6` (found in [transforms.py](augmennt/transforms.py)). The original method `RandomAffine` is also available, reimplemented with OpenCV.
4. The rotate function here is clockwise, whereas the original one is anticlockwise.
5. Some new augmentations have been added compared to Torchvision's; refer to the list below.
6. **The outputs of the OpenCV versions are almost identical to the original ones (this can be verified by running [test.py](/test.py) directly with test images).**
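The rotation-direction difference in point 4 can be illustrated with plain NumPy, independently of any augmeNNt or torchvision call (a sketch of the two conventions only, not of the library's internals):

```python
import numpy as np

# A tiny 2x2 "image" with a single bright pixel in the top-left corner.
img = np.array([[1, 0],
                [0, 0]])

# Anticlockwise 90 degrees (torchvision's convention for positive angles):
# the bright pixel ends up in the bottom-left corner.
ccw = np.rot90(img, k=1)

# Clockwise 90 degrees (the convention of the OpenCV-based rotate here):
# the bright pixel ends up in the top-right corner.
cw = np.rot90(img, k=-1)
```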
These are the basic transforms, equivalent to torchvision's:
- `Compose`, `ToTensor`, `ToCVImage`, `Normalize`,
- `Resize`, `CenterCrop`, `Pad`,
- `Lambda` (see [note](#attention)),
- `RandomApply`, `RandomOrder`, `RandomChoice`, `RandomCrop`,
- `RandomHorizontalFlip`, `RandomVerticalFlip`, `RandomResizedCrop`,
- `FiveCrop`, `TenCrop`, `LinearTransformation`, `ColorJitter`,
- `RandomRotation`, `RandomAffine`,
- `Grayscale`, `RandomGrayscale`, `RandomErasing`
The additional transforms can be used to train models such as [Noise2Noise](https://arxiv.org/pdf/1803.04189.pdf), [BSRGAN](https://arxiv.org/pdf/2103.14006v1.pdf), [Real-ESRGAN](https://arxiv.org/pdf/2107.10833.pdf), [White-box Cartoonization](https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_Learning_to_Cartoonize_Using_White-Box_Cartoon_Representations_CVPR_2020_paper.pdf) and [EdgeConnect](https://openaccess.thecvf.com/content_ICCVW_2019/papers/AIM/Nazeri_EdgeConnect_Structure_Guided_Image_Inpainting_using_Edge_Prediction_ICCVW_2019_paper.pdf), among others. There are some general augmentations:
- `RandomAffine6`, `Cutout`, `RandomPerspective`
Noise augmentations, with options for artificial noises and realistic noise generation:
- `RandomGaussianNoise`, `RandomPoissonNoise`, `RandomSPNoise`,
- `RandomSpeckleNoise`, `RandomCompression`,
- `BayerDitherNoise`, `FSDitherNoise`, `AverageBWDitherNoise`, `BayerBWDitherNoise`,
- `BinBWDitherNoise`, `FSBWDitherNoise`, `RandomBWDitherNoise`,
- `RandomCameraNoise`, `RandomChromaticAberration`
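The core idea behind the additive noise transforms such as `RandomGaussianNoise` can be sketched in a few lines of NumPy (the function name, parameters and clipping behavior below are assumptions for illustration, not the library's API):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, mean=0.0, std=10.0):
    """Additive Gaussian noise on a uint8 image: sample noise per pixel,
    add it in float precision, then clip back to the valid [0, 255] range."""
    noise = rng.normal(mean, std, img.shape)
    noisy = img.astype(np.float64) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

clean = np.full((8, 8, 3), 128, dtype=np.uint8)
noisy = add_gaussian_noise(clean)
```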
Blurs and generation/use of different kinds of kernels, including standard blurs, isotropic and anisotropic Gaussian filters, and simple and complex motion-blur kernels:
- `RandomAverageBlur`, `RandomBilateralBlur`, `RandomBoxBlur`,
- `RandomGaussianBlur`, `RandomMedianBlur`,
- `RandomMotionBlur`, `RandomComplexMotionBlur`,
- `RandomAnIsoBlur`, `AlignedDownsample`, `ApplyKernel`,
- `RandomSincBlur`
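As a rough picture of what the simplest of these do, an average (box) blur is just a local mean. A naive 2D NumPy version is sketched below; the real `RandomAverageBlur` presumably relies on OpenCV and randomizes the kernel size, which is an assumption here:

```python
import numpy as np

def box_blur(img, k=3):
    """Naive average (box) blur: each output pixel is the mean of the
    k x k neighborhood around it, with edge-replicated padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty(img.shape, dtype=np.float64)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + k, x:x + k].mean()
    return out
```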
Filters to modify the images, including color quantization, superpixel segmentation and CLAHE:
- `FilterMaxRGB`, `FilterColorBalance`, `FilterUnsharp`,
- `SimpleQuantize`, `RandomQuantize`, `RandomQuantizeSOM`,
- `CLAHE`, `RandomGamma`, `Superpixels`
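The idea behind uniform color quantization (as in `SimpleQuantize`; the library's exact algorithm and parameters may differ from this sketch) is mapping each channel value to the midpoint of its bucket:

```python
import numpy as np

def simple_quantize(img, levels=4):
    """Uniform quantization of a uint8 image to `levels` values per
    channel: bucket each value, then snap it to the bucket midpoint."""
    step = 256 // levels
    return ((img // step) * step + step // 2).astype(np.uint8)

# With levels=4 every channel value collapses to one of {32, 96, 160, 224}.
```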
Edge filters:
- `FilterCanny`
## Requirements
- python >= 3.5.2
- numpy >= 1.10 ('@' operator may not be overloaded before this version)
- pytorch >= 0.4.1
- A working installation of OpenCV. **Tested with OpenCV version 3.4.2, 4.1.0**
- Tested on Windows 10 and Ubuntu 18.04.
## Optional requirements
- torchvision >= 0.2.1
To use the additional Superpixels options (skimage SLIC and Felzenszwalb algorithms), the segment-reduction algorithms (selective search and RAG merging), the Menon demosaicing algorithm and the sinc filter, there are additional requirements:
- scikit-image >= 0.17.2
- scipy >= 1.6.2
## Usage
1. `git clone https://github.com/victorca25/augmennt.git`
2. Add `augmennt` to your python path.
3. Add `from augmennt import augmennt as transforms` in your python file.
4. From here, almost everything should work exactly as the original `transforms`.
### Example: Image resizing
```python
import numpy as np
from augmennt import augmennt as transforms
image = np.random.randint(low=0, high=256, size=(1024, 2048, 3), dtype=np.uint8)
resize = transforms.Resize(size=(256, 256))
image = resize(image)
```
This should be 1.5 to 10 times faster than PIL; see the [benchmarks](#performance) below.
### Example: Composing transformations
```py
transform = transforms.Compose([
transforms.RandomAffine(degrees=10, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=(-10, 0)),
transforms.Resize(size=(350, 350), interpolation="BILINEAR"),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```
More examples can be found in the official Pytorch [tutorials](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html).
## Attention
The multiprocessing used in PyTorch's dataloader may have issues with lambda functions (used via `Lambda` in [transforms.py](augmennt/transforms.py)) on Windows, as lambda functions can't be pickled (<https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled>). The same issue occurs with Torchvision's `Lambda`.
These issues occur when using `num_workers > 0` in a PyTorch `DataLoader` if the transformations are initialized in the dataset class's `__init__`. They can be prevented either by using proper named functions (not lambdas) when composing the transformations, or by initializing the transformations in the `DataLoader` call instead.
## Performance
The following performance tests were executed by jbohnslav.
- Most transformations are between 1.5X and ~4X faster in OpenCV. Large image resizes are up to 10 times faster in OpenCV.
- To reproduce the following benchmarks, download the [Cityscapes dataset](https://www.cityscapes-dataset.com/).
An example benchmarking file used by jbohnslav can be found in the notebook **benchmarking_v2.ipynb**, where the Cityscapes default directories are wrapped with an HDF5 file for even faster reading. (Note: this file has not been updated or tested for a long time, but can serve as a reference.)
![resize](benchmarks/benchmarking_Resize.png)
![random crop](benchmarks/benchmarking_Random_crop_quarter_size.png)
![change brightness](benchmarks/benchmarking_Color_brightness_only.png)
![change brightness and contrast](benchmarks/benchmarking_Color_constrast_and_brightness.png)
![change contrast only](benchmarks/benchmarking_Color_contrast_only.png)
![random horizontal flips](benchmarks/benchmarking_Random_horizontal_flip.png)
The changes start to add up when you compose multiple transformations together.
![composed transformations](benchmarks/benchmarking_Resize_flip_brightness_contrast_rotate.png)
cv2 is around three times faster than regular Pillow, as shown in this [article](https://www.kaggle.com/vfdev5/pil-vs-opencv).
Additionally, the [Albumentations project](https://github.com/albumentations-team/albumentations), mostly based on NumPy and OpenCV, has also shown better performance than other options, including torchvision with a fast Pillow-SIMD backend.
But it can also be the case that Pillow-SIMD comes out ahead for specific operations, so it is worth benchmarking your own pipeline.