# DeepLabV3, DeepLabV3+ Based on MindCV Backbones
> DeepLabV3: [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587)
>
> DeepLabV3+: [Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1802.02611)
## Introduction
**DeepLabV3** is a semantic segmentation architecture that improves on its predecessors. It makes two main contributions. 1) It designs modules that employ atrous convolution in cascade or in parallel, adopting multiple atrous rates to capture multi-scale context and handle the problem of segmenting objects at multiple scales. 2) It augments the Atrous Spatial Pyramid Pooling (ASPP) module with image-level features that encode global context and further boost performance. The improved ASPP applies global average pooling to the last feature map of the model, feeds the resulting image-level features to a 1 × 1 convolution with 256 filters (and batch normalization), and then bilinearly upsamples the features to the desired spatial dimension. The DenseCRF post-processing from DeepLabV2 is deprecated.
<p align="center">
<img src="https://github.com/mindspore-lab/mindcv/assets/33061146/db2076ed-bccd-455f-badb-e03deb131dc5" width=700/>
</p>
<p align="center">
<em>Figure 1. Architecture of DeepLabV3 with output_stride=16 [<a href="#references">1</a>] </em>
</p>
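For intuition, here is a minimal, illustrative MindSpore sketch of the improved ASPP head described above (parallel atrous branches plus image-level pooling). The class name, channel sizes, and atrous rates are assumptions for illustration, not the exact implementation in this example; it assumes MindSpore 2.x APIs such as `ops.interpolate`.
```python
from mindspore import nn, ops

class ASPP(nn.Cell):
    """Illustrative ASPP head: parallel atrous branches + image-level features."""
    def __init__(self, in_ch=2048, out_ch=256, rates=(1, 6, 12, 18)):
        super().__init__()
        # parallel branches: 1x1 conv for rate 1, 3x3 atrous convs otherwise
        self.branches = nn.CellList([
            nn.SequentialCell(
                nn.Conv2d(in_ch, out_ch, kernel_size=1 if r == 1 else 3, dilation=r),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(),
            )
            for r in rates
        ])
        # image-level features: global average pooling -> 1x1 conv (+ BN) -> upsample
        self.image_pool = nn.SequentialCell(
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )
        self.project = nn.SequentialCell(
            nn.Conv2d(out_ch * (len(rates) + 1), out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )

    def construct(self, x):
        h, w = x.shape[2], x.shape[3]
        feats = []
        for branch in self.branches:
            feats.append(branch(x))
        img = ops.mean(x, axis=(2, 3), keep_dims=True)            # global average pooling
        img = self.image_pool(img)
        img = ops.interpolate(img, size=(h, w), mode="bilinear")  # bilinear upsampling
        feats.append(img)
        return self.project(ops.cat(feats, axis=1))
```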
**DeepLabV3+** extends DeepLabV3 by adding a simple yet effective decoder module that refines the segmentation results, especially along object boundaries. It combines the advantages of the spatial pyramid pooling module and the encoder-decoder structure. The last feature map before the logits in the original DeepLabV3 becomes the encoder output. A 1 × 1 convolution is first applied to the corresponding low-level features from the network backbone to reduce the number of channels. The encoder features are then bilinearly upsampled by a factor of 4 and concatenated with these low-level features, which have the same spatial resolution. After the concatenation, a few 3 × 3 convolutions are applied to refine the features, followed by another simple bilinear upsampling by a factor of 4.
<p align="center">
<img src="https://github.com/mindspore-lab/mindcv/assets/33061146/e1a17518-b19a-46f1-b28a-ec67cafa81be" width=700/>
</p>
<p align="center">
<em>Figure 2. DeepLabV3+ extends DeepLabV3 by employing an encoder-decoder structure [<a href="#references">2</a>] </em>
</p>
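A minimal MindSpore sketch of this decoder path follows. The 48-channel low-level projection and 256 refine channels follow the paper's defaults, but the class name, argument names, and layer layout are illustrative assumptions, not this example's actual implementation.
```python
from mindspore import nn, ops

class DeepLabV3PlusDecoder(nn.Cell):
    """Illustrative decoder: upsample encoder output, fuse with low-level features."""
    def __init__(self, low_ch=256, enc_ch=256, num_classes=21):
        super().__init__()
        # 1x1 conv reduces low-level channels (to 48 in the paper) so they
        # do not outweigh the rich encoder features after concatenation
        self.low_proj = nn.SequentialCell(
            nn.Conv2d(low_ch, 48, kernel_size=1),
            nn.BatchNorm2d(48),
            nn.ReLU(),
        )
        # a few 3x3 convs refine the concatenated features
        self.refine = nn.SequentialCell(
            nn.Conv2d(enc_ch + 48, 256, kernel_size=3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
        )
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=1)

    def construct(self, enc_out, low_level):
        h, w = low_level.shape[2], low_level.shape[3]
        # upsample encoder output x4 to match the low-level feature resolution
        x = ops.interpolate(enc_out, size=(h, w), mode="bilinear")
        x = self.refine(ops.cat((self.low_proj(low_level), x), axis=1))
        logits = self.classifier(x)
        # final bilinear upsampling x4 back to the input resolution
        return ops.interpolate(logits, size=(h * 4, w * 4), mode="bilinear")
```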
This example provides implementations of DeepLabV3 and DeepLabV3+ using backbones from MindCV. More details about feature extraction with MindCV can be found in [this tutorial](https://github.com/mindspore-lab/mindcv/blob/main/docs/en/how_to_guides/feature_extraction.md). Note that the ResNet used in DeepLab contains atrous convolutions with different rates; `dilated_resnet.py` is therefore provided as a modification of the MindCV ResNet, with atrous convolutions in blocks 3-4, as illustrated by the shape check below.
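To see why dilation preserves spatial resolution where striding does not, here is a small shape check (assuming MindSpore 2.x; MindSpore's `nn.Conv2d` defaults to `pad_mode="same"`):
```python
import mindspore as ms
from mindspore import nn, ops

x = ops.ones((1, 64, 32, 32), ms.float32)

strided = nn.Conv2d(64, 64, kernel_size=3, stride=2)   # downsamples 32x32 -> 16x16
atrous = nn.Conv2d(64, 64, kernel_size=3, dilation=2)  # keeps 32x32, enlarges receptive field

print(strided(x).shape)  # (1, 64, 16, 16)
print(atrous(x).shape)   # (1, 64, 32, 32)
```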
## Quick Start
### Preparation
1. Clone the MindCV repository and enter `mindcv`; all following commands assume this project root as the working directory.
```shell
git clone https://github.com/mindspore-lab/mindcv.git
cd mindcv
```
2. Install the dependencies as shown [here](https://mindspore-lab.github.io/mindcv/installation/), and additionally install `opencv-python` (for `cv2`) and `addict`.
```shell
pip install opencv-python
pip install addict
```
3. Prepare dataset
* Download Pascal VOC 2012 dataset, [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/) and Semantic Boundaries Dataset, [SBD](https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz).
* Prepare the training and test data list files, each containing paths to image and annotation pairs. You can simply run `python examples/seg/deeplabv3/preprocess/get_data_list.py --data_root=/path/to/data` to generate them; this command produces 5 data list files. Each line in a list file should look as follows:
```
/path/to/data/JPEGImages/2007_000032.jpg /path/to/data/SegmentationClassGray/2007_000032.png
/path/to/data/JPEGImages/2007_000039.jpg /path/to/data/SegmentationClassGray/2007_000039.png
/path/to/data/JPEGImages/2007_000063.jpg /path/to/data/SegmentationClassGray/2007_000063.png
......
```
* Convert the training dataset to mindrecords by running the `build_seg_data.py` script. In accordance with the paper, we train on the *trainaug* dataset (*voc train* + *SBD*). You can train on another dataset by pointing the keyword `data_list` at your target training set.
```shell
python examples/seg/deeplabv3/preprocess/build_seg_data.py \
--data_root=[root path of training data] \
--data_list=[path of data list file prepared above] \
--dst_path=[path to save mindrecords] \
--num_shards=8
```
* Note: the training steps use datasets in mindrecord format, while the evaluation steps directly use the data list files.
4. Backbone: download a pre-trained backbone from MindCV; here we use [ResNet101](https://download.mindspore.cn/toolkits/mindcv/resnet/resnet101-689c5e77.ckpt).
### Train
Specify `deeplabv3` or `deeplabv3plus` at the keyword `model` in the config file.
It is highly recommended to use **distributed training** for this DeepLabV3 and DeepLabV3+ implementation.
For distributed training using **OpenMPI's `mpirun`**, simply run
```shell
mpirun -n [# of devices] python examples/seg/deeplabv3/train.py --config [the path to the config file]
```
For distributed training with [Ascend rank table](https://github.com/mindspore-lab/mindocr/blob/main/docs/en/tutorials/distribute_train.md#12-configure-rank_table_file-for-training), configure `ascend8p.sh` as follows
```shell
#!/bin/bash
export DEVICE_NUM=8
export RANK_SIZE=8
export RANK_TABLE_FILE="./hccl_8p_01234567_127.0.0.1.json"
for ((i = 0; i < ${DEVICE_NUM}; i++)); do
    export DEVICE_ID=$i
    export RANK_ID=$i
    python -u examples/seg/deeplabv3/train.py --config [the path to the config file] &> ./train_$i.log &
done
```
and start training by running:
```shell
bash ascend8p.sh
```
For single-device training, simply set the keyword `distributed` to `False` in the config file and run:
```shell
python examples/seg/deeplabv3/train.py --config [the path to the config file]
```
**Taking the `mpirun` command as an example, the training steps are as follows**:
- Step 1: Employ `output_stride=16` and fine-tune the pretrained ResNet101 on the *trainaug* dataset. In the config file, specify the path of the pretrained backbone checkpoint at the keyword `backbone_ckpt_path` and set `output_stride` to `16`.
```shell
# for deeplabv3
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3_s16_dilated_resnet101.yaml
# for deeplabv3+
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3plus_s16_dilated_resnet101.yaml
```
- Step 2: Employ `output_stride=8` and fine-tune the model from step 1 on the *trainaug* dataset with a smaller base learning rate. In the config file, specify the path of the checkpoint from the previous step at `ckpt_path`, set `ckpt_pre_trained` to `True`, and set `output_stride` to `8`.
```shell
# for deeplabv3
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3_s8_dilated_resnet101.yaml
# for deeplabv3+
mpirun -n 8 python examples/seg/deeplabv3/train.py --config examples/seg/deeplabv3/config/deeplabv3plus_s8_dilated_resnet101.yaml
```
### Test
For testing the trained model, first specify the path to the model checkpoint at the keyword `ckpt_path` in the config file. You can also modify `output_stride`, `flip`, and `scales` in the config file for inference.
For example, after replacing `ckpt_path` in the config file with this [checkpoint](https://download.mindspore.cn/toolkits/mindcv/deeplabv3/deeplabv3_s8_resnet101-a297e7af.ckpt) from the 2-step training of DeepLabV3, the command below runs evaluation with `output_stride=8` and without left-right flipping or multi-scale inputs.
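Assuming the example provides an `eval.py` entry point alongside `train.py` (the script name is an assumption; adjust it to the actual evaluation script in this example), the command would be:
```shell
mpirun -n 8 python examples/seg/deeplabv3/eval.py --config examples/seg/deeplabv3/config/deeplabv3_s8_dilated_resnet101.yaml
```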