# DeepLabV3, DeepLabV3+ Based on MindCV Backbones
> DeepLabV3: [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587)
>
> DeepLabV3+: [Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1802.02611)
## Introduction
**DeepLabV3** is a semantic segmentation architecture that improves on its predecessors. It makes two main contributions. 1) It designs modules that employ atrous convolution in cascade or in parallel, adopting multiple atrous rates to capture multi-scale context and handle the problem of segmenting objects at multiple scales. 2) It augments the Atrous Spatial Pyramid Pooling (ASPP) module with image-level features that encode global context and further boost performance. The improved ASPP applies global average pooling to the last feature map of the model, feeds the resulting image-level features to a 1 × 1 convolution with 256 filters (and batch normalization), and then bilinearly upsamples the features to the desired spatial dimension. The DenseCRF post-processing from DeepLabV2 is deprecated.
<p align="center">
<img src="https://github.com/mindspore-lab/mindcv/assets/33061146/db2076ed-bccd-455f-badb-e03deb131dc5" width=700/>
</p>
<p align="center">
<em>Figure 1. Architecture of DeepLabV3 with output_stride=16 [<a href="#references">1</a>] </em>
</p>
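For intuition, here is a minimal, illustrative MindSpore sketch of the improved ASPP head described above (parallel atrous branches plus image-level pooling). The class name, channel sizes, and atrous rates are assumptions for illustration, not the exact implementation in this example; it assumes MindSpore 2.x APIs such as `ops.interpolate`.
```python
from mindspore import nn, ops

class ASPP(nn.Cell):
    """Illustrative ASPP head: parallel atrous branches + image-level features."""
    def __init__(self, in_ch=2048, out_ch=256, rates=(1, 6, 12, 18)):
        super().__init__()
        # parallel branches: 1x1 conv for rate 1, 3x3 atrous convs otherwise
        self.branches = nn.CellList([
            nn.SequentialCell(
                nn.Conv2d(in_ch, out_ch, kernel_size=1 if r == 1 else 3, dilation=r),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(),
            )
            for r in rates
        ])
        # image-level features: global average pooling -> 1x1 conv (+ BN) -> upsample
        self.image_pool = nn.SequentialCell(
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )
        self.project = nn.SequentialCell(
            nn.Conv2d(out_ch * (len(rates) + 1), out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )

    def construct(self, x):
        h, w = x.shape[2], x.shape[3]
        feats = []
        for branch in self.branches:
            feats.append(branch(x))
        img = ops.mean(x, axis=(2, 3), keep_dims=True)            # global average pooling
        img = self.image_pool(img)
        img = ops.interpolate(img, size=(h, w), mode="bilinear")  # bilinear upsampling
        feats.append(img)
        return self.project(ops.cat(feats, axis=1))
```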
**DeepLabV3+** extends DeepLabV3 by adding a simple yet effective decoder module that refines the segmentation results, especially along object boundaries. It combines the advantages of the spatial pyramid pooling module and the encoder-decoder structure. The last feature map before the logits in the original DeepLabV3 becomes the encoder output. A 1 × 1 convolution is first applied to the corresponding low-level features from the network backbone to reduce the number of channels. The encoder features are then bilinearly upsampled by a factor of 4 and concatenated with these low-level features, which have the same spatial resolution. After the concatenation, a few 3 × 3 convolutions are applied to refine the features, followed by another simple bilinear upsampling by a factor of 4.
<p align="center">
<img src="https://github.com/mindspore-lab/mindcv/assets/33061146/e1a17518-b19a-46f1-b28a-ec67cafa81be" width=700/>
</p>
<p align="center">
<em>Figure 2. DeepLabV3+ extends DeepLabV3 by employing an encoder-decoder structure [<a href="#references">2</a>] </em>
</p>
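A minimal MindSpore sketch of this decoder path follows. The 48-channel low-level projection and 256 refine channels follow the paper's defaults, but the class name, argument names, and layer layout are illustrative assumptions, not this example's actual implementation.
```python
from mindspore import nn, ops

class DeepLabV3PlusDecoder(nn.Cell):
    """Illustrative decoder: upsample encoder output, fuse with low-level features."""
    def __init__(self, low_ch=256, enc_ch=256, num_classes=21):
        super().__init__()
        # 1x1 conv reduces low-level channels (to 48 in the paper) so they
        # do not outweigh the rich encoder features after concatenation
        self.low_proj = nn.SequentialCell(
            nn.Conv2d(low_ch, 48, kernel_size=1),
            nn.BatchNorm2d(48),
            nn.ReLU(),
        )
        # a few 3x3 convs refine the concatenated features
        self.refine = nn.SequentialCell(
            nn.Conv2d(enc_ch + 48, 256, kernel_size=3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
        )
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=1)

    def construct(self, enc_out, low_level):
        h, w = low_level.shape[2], low_level.shape[3]
        # upsample encoder output x4 to match the low-level feature resolution
        x = ops.interpolate(enc_out, size=(h, w), mode="bilinear")
        x = self.refine(ops.cat((self.low_proj(low_level), x), axis=1))
        logits = self.classifier(x)
        # final bilinear upsampling x4 back to the input resolution
        return ops.interpolate(logits, size=(h * 4, w * 4), mode="bilinear")
```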
This example provides implementations of DeepLabV3 and DeepLabV3+ using backbones from MindCV. More details about feature extraction with MindCV can be found in [this tutorial](https://github.com/mindspore-lab/mindcv/blob/main/docs/en/how_to_guides/feature_extraction.md). Note that the ResNet used in DeepLab contains atrous convolutions with different rates; `dilated_resnet.py` is therefore provided as a modification of the MindCV ResNet, with atrous convolutions in blocks 3-4, as illustrated by the shape check below.
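To see why dilation preserves spatial resolution where striding does not, here is a small shape check (assuming MindSpore 2.x; MindSpore's `nn.Conv2d` defaults to `pad_mode="same"`):
```python
import mindspore as ms
from mindspore import nn, ops

x = ops.ones((1, 64, 32, 32), ms.float32)

strided = nn.Conv2d(64, 64, kernel_size=3, stride=2)   # downsamples 32x32 -> 16x16
atrous = nn.Conv2d(64, 64, kernel_size=3, dilation=2)  # keeps 32x32, enlarges receptive field

print(strided(x).shape)  # (1, 64, 16, 16)
print(atrous(x).shape)   # (1, 64, 32, 32)
```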
## Quick Start
### Preparation
1. Clone the MindCV repository and enter `mindcv`; all following commands assume this project root as the working directory.
```shell
git clone https://github.com/mindspore-lab/mindcv.git
cd mindcv
```
2. Install the dependencies as shown [here](https://mindspore-lab.github.io/mindcv/installation/), and additionally install `opencv-python` (for `cv2`) and `addict`.
```shell
pip install opencv-python
pip install addict
```
3. Prepare dataset
* Download Pascal VOC 2012 dataset, [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/) and Semantic Boundaries Dataset, [SBD](https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz).
* Prepare the training and test data list files, each containing paths to image and annotation pairs. You can simply run `python examples/seg/deeplabv3/preprocess/get_data_list.py --data_root=/path/to/data` to generate them; this command produces 5 data list files. Each line in a list file should look as follows:
```
/path/to/data/JPEGImages/2007_000032.jpg /path/to/data/SegmentationClassGray/2007_000032.png
/path/to/data/JPEGImages/2007_000039.jpg /path/to/data/SegmentationClassGray/2007_000039.png
/path/to/data/JPEGImages/2007_000063.jpg /path/to/data/SegmentationClassGray/2007_000063.png
......
```
* Convert the training dataset to mindrecords by running the `build_seg_data.py` script. In accordance with the paper, we train on the *trainaug* dataset (*voc train* + *SBD*). You can train on another dataset by pointing the keyword `data_list` at your target training set.
```shell
python examples/seg/deeplabv3/preprocess/build_seg_data.py \
--data_root=[root path of training data] \
--data_list=[path of data list file prepared above] \
--dst_path=[path to save mindrecords] \
--num_shards=8
```
* Note: the training steps use datasets in mindrecord format, while the evaluation steps directly use the data list files.
4. Backbone: download a pre-trained backbone from MindCV; here we use [ResNet101](https://download.mindspore.cn/toolkits/mindcv/resnet/resnet101-689c5e77.ckpt).
### Train
Specify `deeplabv3` or `deeplabv3plus` at the keyword `model` in the config file.
It is highly recommended to use **distributed training** for this DeepLabV3 and DeepLabV3+ implementation.
For distributed training using **OpenMPI's `mpirun`**, simply run
```shell
mpirun -n [# of devices] python examples/seg/deeplabv3/train.py --config [the path to the config file]
```
For distributed training with [Ascend rank table](https://github.com/mindspore-lab/mindocr/blob/main/docs/en/tutorials/distribute_train.md#12-configure-rank_table_file-for-training), configure `ascend8p.sh` as follows
```shell
#!/bin/bash
export DEVICE_NUM=8
export RANK_SIZE=8
export RANK_TABLE_FILE="./hccl_8p_01234567_127.0.0.1.json"
for ((i = 0; i < ${DEVICE_NUM}; i++)); do
    export DEVICE_ID=$i
    export RANK_ID=$i
    python -u examples/seg/deeplabv3/train.py --config [the path to the config file] &> ./train_$i.log &
done
```
and start training by running:
```shell
bash ascend8p.sh
```
For single-device training, simply set the keyword `distributed` to `False` in the config file and run:
```shell
python examples/seg/deeplabv3/train.py --config [the path to the config file]
```
**Taking the `mpirun` command as an example, the training steps are as follows**:
- Step 1: Employ `output_stride=16` and fine-tune the pretrained ResNet101 on the *trainaug* dataset. In the config file, specify the path of the pretrained backbone checkpoint at the keyword `backbone_ckpt_path` and set `output_stride` to `16`.
```shell
# for deeplabv3
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3_s16_dilated_resnet101.yaml
# for deeplabv3+
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3plus_s16_dilated_resnet101.yaml
```
- Step 2: Employ `output_stride=8` and fine-tune the model from step 1 on the *trainaug* dataset with a smaller base learning rate. In the config file, specify the path of the checkpoint from the previous step at `ckpt_path`, set `ckpt_pre_trained` to `True`, and set `output_stride` to `8`.
```shell
# for deeplabv3
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3_s8_dilated_resnet101.yaml
# for deeplabv3+
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3plus_s8_dilated_resnet101.yaml
```
### Test
For testing the trained model, first specify the path to the model checkpoint at the keyword `ckpt_path` in the config file. You can also modify `output_stride`, `flip`, and `scales` in the config file for inference.
For example, after replacing `ckpt_path` in the config file with this [checkpoint](https://download.mindspore.cn/toolkits/mindcv/deeplabv3/deeplabv3_s8_resnet101-a297e7af.ckpt) from the 2-step training of DeepLabV3, the command below runs evaluation with `output_stride=8` and without left-right flipping or multi-scale inputs.
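Assuming the example provides an `eval.py` entry point alongside `train.py` (the script name is an assumption; adjust it to the actual evaluation script in this example), the command would be:
```shell
mpirun -n 8 python examples/seg/deeplabv3/eval.py --config examples/seg/deeplabv3/config/deeplabv3_s8_dilated_resnet101.yaml
```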