# Weakly- and Semi-Supervised Panoptic Segmentation
by [Qizhu Li](http://www.robots.ox.ac.uk/~liqizhu/)\*, [Anurag Arnab](http://www.robots.ox.ac.uk/~aarnab/)\*, [Philip H.S. Torr](https://scholar.google.com/citations?user=kPxa2w0AAAAJ&hl=en)
This repository demonstrates the weakly supervised ground truth generation scheme presented in our paper *Weakly- and Semi-Supervised Panoptic Segmentation* published at ECCV 2018. The code has been cleaned up and refactored, and should reproduce the results presented in the paper.
For details, please refer to our [paper](https://arxiv.org/abs/1808.03575), and [project page](https://qizhuli.github.io/publication/weakly-supervised-panoptic-segmentation/). Please check the [Downloads](#downloads) section for all the additional data we release.
![Summary](data/readme/summary.png)
<sup><sub> \* Equal first authorship </sub></sup>
## Introduction
In our weakly-supervised *panoptic* segmentation experiments, our models are supervised by 1) image-level tags and 2) bounding boxes, as shown in the figure above.
We use image-level tags as supervision for "stuff" classes, which do not have a well-defined extent and cannot be described well by tight bounding boxes. For "thing" classes, we use bounding boxes as our weak supervision. This code release clarifies the implementation details of the method presented in the paper.
## Iterative ground truth generation
For readers' convenience, we will give an outline of the proposed iterative ground truth generation pipeline, and provide demos for some of the key steps.
1. We train a multi-class classifier for all classes to obtain rough localisation cues. As an entire Cityscapes image (1024x2048) cannot fit into a network due to GPU memory constraints, we take 15 fixed 400x500 crops per training image and derive their classification ground truth accordingly, which we use to train the multi-class classifier. From the trained classifier, we extract Class Activation Maps (CAMs) using Grad-CAM, which, unlike the original CAM method, has the advantage of being agnostic to the network architecture.
- Download the fixed image crops with image-level tags [here](#downloads-crops) to train your own classifier. For convenience, the pixel-level semantic labels of the crops are also included, though they should not be used in training.
- The CAMs we produced are available for download [here](#downloads-cam).
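The classification ground truth for each crop follows from which classes appear in its pixel-level semantic label. A minimal sketch of that derivation, assuming Cityscapes trainIds (0-18, with 255 marking ignored pixels); the function name and `min_pixels` threshold are our own, not from the released code:

```python
import numpy as np

NUM_CLASSES = 19  # Cityscapes trainIds 0-18; 255 marks ignored pixels

def crop_to_tags(label_crop, min_pixels=1):
    """Convert a pixel-level semantic label crop (H x W, trainIds)
    into a multi-hot image-level tag vector."""
    tags = np.zeros(NUM_CLASSES, dtype=np.uint8)
    ids, counts = np.unique(label_crop, return_counts=True)
    for cid, cnt in zip(ids, counts):
        if cid < NUM_CLASSES and cnt >= min_pixels:
            tags[cid] = 1
    return tags

# toy 2x3 "crop": road (0), car (13), and an ignored pixel (255)
crop = np.array([[0, 0, 13],
                 [0, 255, 13]], dtype=np.uint8)
print(crop_to_tags(crop))
```

A `min_pixels` threshold above 1 would discard classes that occupy only a sliver of a crop, which can reduce label noise for the classifier.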
2. In parallel, we extract bounding box annotations from Cityscapes ground truth files, and then run MCG (a segment-proposal algorithm) and Grabcut (a classic foreground segmentation technique given a bounding-box prior) on the training images to generate foreground masks inside each annotated bounding box. MCG and Grabcut masks are merged following the rule that only regions where both have consensus are given the predicted label; otherwise an "ignore" label is assigned.
- The extracted bounding boxes (saved in .mat format) can be downloaded [here](#downloads-bboxes). Alternatively, we also provide a demo script `demo_instanceTrainId_to_dets.m` and a batch script `batch_instanceTrainId_to_dets.m` for you to make them yourself. The demo is self-contained; however, before running the batch script, make sure to
1. Download the [official Cityscapes scripts repository](https://github.com/mcordts/cityscapesScripts);
2. Inside the above repository, navigate to `cityscapesscripts/preparation` and run
```sh
python createTrainIdInstanceImgs.py
```
This command requires an environment variable `CITYSCAPES_DATASET=path/to/your/cityscapes/data/folder` to be set. These two steps produce the `*_instanceTrainIds.png` files required by our batch script;
3. Navigate back to this repository, and place/symlink your `gtFine` and `gtCoarse` folders inside `data/Cityscapes/` folder so that they are visible to our batch script.
- Please see [here](https://github.com/jponttuset/mcg) for details on MCG.
- We use the [OpenCV implementation](https://docs.opencv.org/3.2.0/d8/d83/tutorial_py_grabcut.html) of Grabcut in our experiments.
- The merged M&G masks we produced are available for download [here](#downloads-mandg).
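The consensus rule for merging the MCG and Grabcut foreground masks inside a single annotated box can be sketched as follows. This is a simplified illustration, not the released implementation; the function name and the use of 255 as the ignore value are our assumptions:

```python
import numpy as np

IGNORE_LABEL = 255  # assumed ignore value, as in Cityscapes conventions

def merge_box_masks(mcg_fg, grabcut_fg, class_id):
    """Merge MCG and Grabcut foreground masks (boolean, same shape)
    for one annotated box: agreement on foreground -> class label,
    agreement on background -> 0, disagreement -> ignore."""
    out = np.full(mcg_fg.shape, IGNORE_LABEL, dtype=np.uint8)
    out[mcg_fg & grabcut_fg] = class_id    # both say foreground
    out[~mcg_fg & ~grabcut_fg] = 0         # both say background
    return out                             # everything else stays ignore

# toy 2x3 box region: the two masks agree on some pixels only
mcg = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gc  = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
print(merge_box_masks(mcg, gc, class_id=13))
```

Assigning "ignore" wherever the two methods disagree means the later segmentation network receives no loss on uncertain pixels, rather than a possibly wrong label.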
3. The CAMs (step 1) and M&G masks (step 2) are merged to produce the ground truth needed to kick off iterative training. To see a demo of merging, navigate to the root folder of this repo in MATLAB and run:
```matlab
demo_merge_cam_mandg;
```
When post-processing network predictions of images from the Cityscapes `train_extra` split, make sure to use the following settings:
```matlab
opts.run_apply_bbox_prior = false;
opts.run_check_image_level_tags = false;
opts.save_ins = false;
```
because the coarse annotation provided on the `train_extra` split trades off recall for precision, leading to inaccurate bounding box coordinates and frequent false negatives. This also applies to step 5.
- The results from merging CAMs with M&G masks can be downloaded [here](#downloads-cam-mandg-merged).
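One plausible simplified view of the merge performed by `demo_merge_cam_mandg` is that box-derived M&G "thing" labels, being a stronger cue, override the CAM-derived "stuff" labelling wherever they are present. The sketch below reflects that assumption only; the exact precedence rules live in the MATLAB script:

```python
import numpy as np

IGNORE = 255  # assumed ignore value

def merge_cam_with_mandg(cam_label, mandg_label):
    """Start from the CAM-derived labelling, then overwrite every
    pixel where the M&G masks carry a non-ignore 'thing' label."""
    out = cam_label.copy()
    thing = mandg_label != IGNORE
    out[thing] = mandg_label[thing]
    return out

# toy example: CAM says road (0) / vegetation (8); M&G adds a car (13)
cam   = np.array([[0, 0, 8], [8, 8, 8]], dtype=np.uint8)
mandg = np.array([[255, 13, 13], [255, 255, 13]], dtype=np.uint8)
print(merge_cam_with_mandg(cam, mandg))
```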
4. Using the generated ground truth, weakly-supervised models can be trained in the same way as a fully-supervised model. When the training loss converges, we make dense predictions using the model and also save the prediction scores.
- An example of dense prediction made by a weakly-supervised model is included at `results/pred_sem_raw/`, and an example of the corresponding prediction scores is provided at `results/pred_flat_feat/`.
5. The prediction and prediction scores (and optionally, the M&G masks) are used to generate the ground truth labels for the next stage of iterative training. To see a demo of iterative ground truth generation, navigate to the root folder of this repo in MATLAB and run:
```matlab
demo_make_iterative_gt;
```
The generated semantic and instance ground truth labels are saved at `results/pred_sem_clean` and `results/pred_ins_clean` respectively.
Please refer to `scripts/get_opts.m` for the options available. To reproduce the results presented in the paper, use the default settings, and set `opts.run_merge_with_mcg_and_grabcut` to `false` after five iterations of training, as by then the weakly supervised model produces better-quality segmentations of "thing" classes than the original M&G masks.
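Two of the cleaning options controlled through `scripts/get_opts.m` can be illustrated with simplified Python equivalents. These are sketches under our own assumptions (function names, box format, and ignore value are ours), not translations of the MATLAB implementation:

```python
import numpy as np

IGNORE = 255  # assumed ignore value

def check_image_level_tags(pred, tags_present):
    """opts.run_check_image_level_tags: suppress any predicted class
    that is absent from the image-level tags."""
    out = pred.copy()
    out[~np.isin(pred, list(tags_present) + [IGNORE])] = IGNORE
    return out

def apply_bbox_prior(pred, thing_ids, boxes):
    """opts.run_apply_bbox_prior: a 'thing' label may only survive
    inside at least one annotated box of that class.
    boxes: list of (class_id, x0, y0, x1, y1), inclusive coords."""
    out = pred.copy()
    allowed = np.zeros_like(pred, dtype=bool)
    for cid, x0, y0, x1, y1 in boxes:
        region = np.zeros_like(pred, dtype=bool)
        region[y0:y1 + 1, x0:x1 + 1] = True
        allowed |= region & (pred == cid)
    thing = np.isin(pred, thing_ids)
    out[thing & ~allowed] = IGNORE
    return out

# toy prediction: car (13), road (0), pole (5)
pred = np.array([[13, 13, 0],
                 [13, 0, 5]], dtype=np.uint8)
boxes = [(13, 0, 0, 0, 1)]               # one car box covering column 0
print(apply_bbox_prior(pred, [13], boxes))
print(check_image_level_tags(pred, {0, 13}))
```

As noted above, both checks should be disabled on the `train_extra` split, where the coarse annotations make the boxes and tags unreliable.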
6. Repeat steps 4 and 5 until the training loss no longer decreases.
## Downloads
1. <a id="downloads-crops"></a>Image crops and tags for training multi-class classifier:
- Images
- train (9.3GB): [Dropbox](https://www.dropbox.com/s/xvumnk14qmctb41/leftImg8bit_400x500crops_train.zip?dl=0) or [BaiduYun](https://pan.baidu.com/s/1T0xTuq88RITHqZHW1Tdo-g)
- train_extra (63.3GB): [Dropbox](https://www.dropbox.com/s/rana9b0e0k1d467/leftImg8bit_400x500crops_train_extra.zip?dl=0) or [BaiduYun](https://pan.baidu.com/s/1yy0I-0R5IBI98QLGOdjkiQ)
- val (1.6GB): [Dropbox](https://www.dropbox.com/s/hudd1k4i4zr53qj/leftImg8bit_400x500crops_val.zip?dl=0) or [BaiduYun](https://pan.baidu.com/s/1jSCps4wNg45mbgM0ggM7AQ)
- Ground truth tags
- train+train_extra+val (90.9MB): [Dropbox](https://www.dropbox.com/s/z9ak8rtwjldyerv/gtWeak_tags_400x500crops.zip?dl=0) or [BaiduYun](https://pan.baidu.com/s/19VcJrQU2GvwX6NZu8jLWfg)
- Lists
- train+train_extra+val (827kB): [Dropbox](https://www.dropbox.com/s/8itgdm0nau0rixz/lists.zip?dl=0) or [BaiduYun](https://pan.baidu.com/s/14j9rV3S8599YwYILzEfrCw)
- Semantic labels (provided for convenience; **not** to be used in training)
- train (87.8MB): [Dropbox](https://www.dropbox.com/s/v9nsuazh60mwm4g/gtFine_semantic_400x500crops_train.zip?dl=0) or [BaiduYun](https://pan.baidu.com/s/1dOX7CO9J0ep94TJjUsSYzg)
- train_extra (608MB): [Dropbox](https://www.dropbox.com/s/u45mtdvb3xqt2di/gtCoarse_semantic_400x500crops_train_extra.zip?dl=0) or [BaiduYun](https://pan.baidu.com/s/12Jf0XwvValq2MtFKDRMTmg)
- val (16.2MB): [Dropbox](https://www.dropbox.com/s/9o9unhq