# Image classification reference training scripts
This folder contains reference training scripts for image classification.
They serve as a log of how to train specific models, as well as providing
baseline training and evaluation scripts to quickly bootstrap research.
Unless otherwise noted, all models have been trained on 8x V100 GPUs with
the following parameters:
| Parameter | value |
| ------------------------ | ------ |
| `--batch_size` | `32` |
| `--epochs` | `90` |
| `--lr` | `0.1` |
| `--momentum` | `0.9` |
| `--wd`, `--weight-decay` | `1e-4` |
| `--lr-step-size` | `30` |
| `--lr-gamma` | `0.1` |
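Under the default step schedule (`--lr-step-size 30`, `--lr-gamma 0.1`), the learning rate is multiplied by `0.1` every 30 epochs. This can be sanity-checked in plain Python, mirroring what `torch.optim.lr_scheduler.StepLR` computes:

```python
# Default schedule: lr(epoch) = base_lr * gamma ** (epoch // step_size)
base_lr, gamma, step_size = 0.1, 0.1, 30

def lr_at(epoch):
    """Learning rate at a given epoch under the step schedule."""
    return base_lr * gamma ** (epoch // step_size)

for epoch in (0, 29, 30, 60, 89):
    print(epoch, lr_at(epoch))
```

So over the default 90 epochs the model trains at roughly `0.1`, `0.01`, and `0.001` for 30 epochs each.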
### AlexNet and VGG
Since `AlexNet` and the original `VGG` architectures do not include batch
normalization, the default initial learning rate `--lr 0.1` is too high, so
we lower it:
```
python main.py --model $MODEL --lr 1e-2
```
Here `$MODEL` is one of `alexnet`, `vgg11`, `vgg13`, `vgg16` or `vgg19`. Note
that `vgg11_bn`, `vgg13_bn`, `vgg16_bn`, and `vgg19_bn` include batch
normalization and thus are trained with the default parameters.
### ResNext-50 32x4d
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py \
    --model resnext50_32x4d --epochs 100
```
### ResNext-101 32x8d
On 8 nodes, each with 8 GPUs (for a total of 64 GPUs):
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py \
    --model resnext101_32x8d --epochs 100
```
### MobileNetV2
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py \
    --model mobilenet_v2 --epochs 300 --lr 0.045 --wd 0.00004 \
    --lr-step-size 1 --lr-gamma 0.98
```
### MobileNetV3 Large & Small
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py \
    --model $MODEL --epochs 600 --opt rmsprop --batch-size 128 --lr 0.064 \
    --wd 0.00001 --lr-step-size 2 --lr-gamma 0.973 --auto-augment imagenet --random-erase 0.2
```
Here `$MODEL` is one of `mobilenet_v3_large` or `mobilenet_v3_small`.
We then averaged the parameters of the last 3 checkpoints that improved the Acc@1. See [#3182](https://github.com/pytorch/vision/pull/3182)
and [#3354](https://github.com/pytorch/vision/pull/3354) for details.
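The averaging step can be sketched as follows; this is a minimal illustration using plain dicts of floats in place of real `state_dict()` tensors (the helper name is hypothetical, not part of the training scripts):

```python
def average_checkpoints(state_dicts):
    """Average parameter values across checkpoints, key by key.

    In practice the values are torch tensors loaded from the last few
    saved checkpoints; plain floats are used here for illustration.
    """
    keys = state_dicts[0].keys()
    return {k: sum(sd[k] for sd in state_dicts) / len(state_dicts) for k in keys}

# Toy example: three "checkpoints" with a single scalar parameter each.
ckpts = [{"w": 1.0}, {"w": 2.0}, {"w": 3.0}]
avg = average_checkpoints(ckpts)
print(avg)
```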
## Mixed precision training
Automatic Mixed Precision (AMP) training on GPU for PyTorch can be enabled with the [NVIDIA Apex extension](https://github.com/NVIDIA/apex).
Mixed precision training makes use of both FP32 and FP16 precision where appropriate. FP16 operations can leverage the Tensor Cores on NVIDIA GPUs (Volta, Turing or newer architectures) for improved throughput, generally without loss in model accuracy. Mixed precision training also often allows larger batch sizes. GPU automatic mixed precision training for torchvision can be enabled via the `--apex` flag.
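Apex aside, the same technique is available natively in recent PyTorch through `torch.cuda.amp`. The following is a minimal sketch of one AMP training step on a toy model, not the code path that `--apex` enables; it falls back to FP32 when no GPU is present:

```python
import torch

use_amp = torch.cuda.is_available()  # AMP needs a CUDA device; fall back to FP32 on CPU
device = "cuda" if use_amp else "cpu"

model = torch.nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)  # scales the loss to avoid FP16 gradient underflow

x = torch.randn(4, 10, device=device)
with torch.cuda.amp.autocast(enabled=use_amp):  # forward pass runs in mixed precision
    loss = model(x).sum()

optimizer.zero_grad()
scaler.scale(loss).backward()  # backward on the scaled loss
scaler.step(optimizer)         # unscales gradients, then takes the optimizer step
scaler.update()
```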
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py \
    --model resnext50_32x4d --epochs 100 --apex
```
## Quantized
### Parameters used for generating quantized models:
For all post training quantized models (i.e. all quantized models except MobileNet V2 and MobileNet V3 Large, which use quantization aware training), the settings are:
1. num_calibration_batches: 32
2. num_workers: 16
3. batch_size: 32
4. eval_batch_size: 128
5. backend: 'fbgemm'
```
python train_quantization.py --device='cpu' --post-training-quantize --backend='fbgemm' --model='<model_name>'
```
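Under the hood, post training static quantization follows PyTorch's prepare/calibrate/convert flow. A minimal self-contained sketch on a toy model (an illustration, not the actual `train_quantization.py` code):

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Toy model with quant/dequant stubs marking the quantized region."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.fc = nn.Linear(8, 4)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyModel().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
prepared = torch.quantization.prepare(model)

# Calibration: run a few batches so the observers record activation ranges
# (the reference script uses num_calibration_batches=32 of real data).
for _ in range(8):
    prepared(torch.randn(4, 8))

quantized = torch.quantization.convert(prepared)
out = quantized(torch.randn(2, 8))  # runs with int8 weights on CPU
```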
For MobileNet V2, the model was trained with quantization aware training. The settings used are:
1. num_workers: 16
2. batch_size: 32
3. eval_batch_size: 128
4. backend: 'qnnpack'
5. learning-rate: 0.0001
6. num_epochs: 90
7. num_observer_update_epochs: 4
8. num_batch_norm_update_epochs: 3
9. momentum: 0.9
10. lr_step_size: 30
11. lr_gamma: 0.1
12. weight-decay: 0.0001
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train_quantization.py --model='mobilenet_v2'
```
Training converges at about 10 epochs.
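The quantization aware training loop itself follows PyTorch's standard `prepare_qat` → fine-tune → `convert` flow. A hedged toy sketch (using `fbgemm` here for portability, whereas the runs above use `qnnpack`):

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Toy model with quant/dequant stubs marking the quantized region."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.fc = nn.Linear(8, 4)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyModel()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
model.train()
prepared = torch.quantization.prepare_qat(model)

# Fine-tune with fake-quantization in the graph (a few toy steps here;
# the reference run fine-tunes on ImageNet and converges in ~10 epochs).
optimizer = torch.optim.SGD(prepared.parameters(), lr=1e-4, momentum=0.9)
for _ in range(3):
    loss = prepared(torch.randn(4, 8)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

prepared.eval()
quantized = torch.quantization.convert(prepared)
out = quantized(torch.randn(2, 8))  # int8 inference on CPU
```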
For MobileNet V3 Large, the model was trained with quantization aware training. The settings used are:
1. num_workers: 16
2. batch_size: 32
3. eval_batch_size: 128
4. backend: 'qnnpack'
5. learning-rate: 0.001
6. num_epochs: 90
7. num_observer_update_epochs: 4
8. num_batch_norm_update_epochs: 3
9. momentum: 0.9
10. lr_step_size: 30
11. lr_gamma: 0.1
12. weight-decay: 0.00001
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train_quantization.py --model='mobilenet_v3_large' \
--wd 0.00001 --lr 0.001
```
For post training quantization, the device is set to CPU. For quantization aware training, the device is set to CUDA.
### Command to evaluate quantized models using the pre-trained weights:
```
python train_quantization.py --device='cpu' --test-only --backend='<backend>' --model='<model_name>'
```