# OW-DETR: Open-world Detection Transformer
# Introduction
Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges: generating quality candidate proposals for potentially unknown objects, separating unknown objects from the background, and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components, namely attention-driven pseudo-labeling, novelty classification, and objectness scoring, to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class, and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from $1.8\%$ to $3.3\%$ in terms of unknown recall on MS-COCO. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC.
<br>
<p align="center" ><img width='350' src = "https://imgur.com/KXDXiAB.png"></p>
<br>
<p align="center" ><img width='500' src = "https://imgur.com/cyeMXuh.png"></p>
# Installation
### Requirements
We have trained and tested our models on `Ubuntu 16.04`, `CUDA 10.2`, `GCC 5.4`, and `Python 3.7`.
```bash
conda create -n owdetr python=3.7 pip
conda activate owdetr
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
```
### Backbone features
Download the self-supervised backbone weights from [here](https://dl.fbaipublicfiles.com/dino/dino_resnet50_pretrain/dino_resnet50_pretrain.pth) and place the file in the `models` folder.
### Compiling CUDA operators
```bash
cd ./models/ops
sh ./make.sh
# unit test (all checks should print True)
python test.py
```
# Dataset & Results
### OWOD proposed splits
<br>
<p align="center" ><img width='500' src = "https://imgur.com/9bzf3DV.png"></p>
<br>
The splits are present inside the `data/VOC2007/OWOD/ImageSets/` folder. The remaining dataset can be downloaded using this [link](https://drive.google.com/drive/folders/1S5L-YmIiFMAKTs6nHMorB0Osz5iWI31k?usp=sharing).
The files should be organized in the following structure:
```
OW-DETR/
└── data/
└── VOC2007/
└── OWOD/
├── JPEGImages
├── ImageSets
└── Annotations
```
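As a quick sanity check before training, a short Python snippet (a sketch, not part of the repository) can verify that the expected sub-folders are in place:

```python
import os

def check_owod_layout(root="data/VOC2007/OWOD"):
    """Return the list of expected sub-folders missing under the dataset root."""
    expected = ["JPEGImages", "ImageSets", "Annotations"]
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]

missing = check_owod_layout()
if missing:
    print("Missing folders:", ", ".join(missing))
else:
    print("OWOD dataset layout looks good.")
```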
### Results
<table align="center">
<tr>
<th> </th>
<th align="center" colspan=2>Task1</th>
<th align="center" colspan=2>Task2</th>
<th align="center" colspan=2>Task3</th>
<th align="center" colspan=1>Task4</th>
</tr>
<tr>
<td align="left">Method</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">mAP</td>
</tr>
<tr>
<td align="left">ORE-EBUI</td>
<td align="center">4.9</td>
<td align="center">56.0</td>
<td align="center">2.9</td>
<td align="center">39.4</td>
<td align="center">3.9</td>
<td align="center">29.7</td>
<td align="center">25.3</td>
</tr>
<tr>
<td align="left">OW-DETR</td>
<td align="center">7.5</td>
<td align="center">59.2</td>
<td align="center">6.2</td>
<td align="center">42.9</td>
<td align="center">5.7</td>
<td align="center">30.8</td>
<td align="center">27.8</td>
</tr>
</table>
### Our proposed splits
<br>
<p align="center" ><img width='500' src = "https://imgur.com/RlqbheH.png"></p>
<br>
#### Dataset Preparation
The splits are present inside `data/VOC2007/OWDETR/ImageSets/` folder.
1. Create empty `JPEGImages` and `Annotations` directories.
```
mkdir -p data/VOC2007/OWDETR/JPEGImages/
mkdir -p data/VOC2007/OWDETR/Annotations/
```
2. Download the COCO images and annotations from the [COCO dataset](https://cocodataset.org/#download) page.
3. Unzip the `train2017` and `val2017` archives. The current directory structure should look like:
```
OW-DETR/
└── data/
└── coco/
├── annotations/
├── train2017/
└── val2017/
```
4. Move all images from `train2017/` and `val2017/` to the `JPEGImages` folder.
```
cd OW-DETR/data
mv coco/train2017/*.jpg VOC2007/OWDETR/JPEGImages/
mv coco/val2017/*.jpg VOC2007/OWDETR/JPEGImages/
```
5. Use `coco2voc.py` to convert the JSON annotations to XML files.
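`coco2voc.py` is the conversion script shipped with this repository; purely as an illustration of what such a COCO-to-VOC conversion involves, here is a minimal, self-contained sketch using only the standard library (the input field names follow the standard COCO annotation format):

```python
import xml.etree.ElementTree as ET

def coco_to_voc_xml(image, annotations, categories):
    """Build a VOC-style XML annotation string from COCO-style records.

    image:       dict with 'file_name', 'width', 'height'
    annotations: list of dicts with 'category_id' and 'bbox' ([x, y, w, h])
    categories:  dict mapping category_id -> class name
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image["file_name"]
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(image["width"])
    ET.SubElement(size, "height").text = str(image["height"])
    ET.SubElement(size, "depth").text = "3"
    for ann in annotations:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x_min, y_min, width, height]
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = categories[ann["category_id"]]
        box = ET.SubElement(obj, "bndbox")
        # VOC boxes are corner coordinates (xmin, ymin, xmax, ymax)
        ET.SubElement(box, "xmin").text = str(int(x))
        ET.SubElement(box, "ymin").text = str(int(y))
        ET.SubElement(box, "xmax").text = str(int(x + w))
        ET.SubElement(box, "ymax").text = str(int(y + h))
    return ET.tostring(root, encoding="unicode")
```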
The files should be organized in the following structure:
```
OW-DETR/
└── data/
└── VOC2007/
└── OWDETR/
├── JPEGImages
├── ImageSets
└── Annotations
```
Currently, the dataloader and evaluator used by OW-DETR follow the VOC format.
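Since the dataloader expects VOC-style XML, a minimal parsing sketch (standard library only; tag names follow the VOC annotation schema, not the repository's actual dataloader code) looks like:

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_string):
    """Parse a VOC XML annotation into a list of (class_name, box) tuples."""
    root = ET.fromstring(xml_string)
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        coords = tuple(int(box.find(k).text)
                       for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects
```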
### Results
<table align="center">
<tr>
<th> </th>
<th align="center" colspan=2>Task1</th>
<th align="center" colspan=2>Task2</th>
<th align="center" colspan=2>Task3</th>
<th align="center" colspan=1>Task4</th>
</tr>
<tr>
<td align="left">Method</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">mAP</td>
</tr>
<tr>
<td align="left">ORE-EBUI</td>
<td align="center">1.5</td>
<td align="center">61.4</td>
<td align="center">3.9</td>
<td align="center">40.6</td>
<td align="center">3.6</td>
<td align="center">33.7</td>
<td align="center">31.8</td>
</tr>
<tr>
<td align="left">OW-DETR</td>
<td align="center">5.7</td>
<td align="center">71.5</td>
<td align="center">6.2</td>
<td align="center">43.8</td>
<td align="center">6.9</td>
<td align="center">38.5</td>
<td align="center">33.1</td>
</tr>
</table>
# Training
#### Training on single node
To train OW-DETR on a single node with 8 GPUs, run
```bash
./run.sh
```
#### Training on slurm cluster
To train OW-DETR on a slurm cluster with 2 nodes of 8 GPUs each, run
```bash
sbatch run_slurm.sh
```
# Evaluation
To reproduce any of the results above, run `run_eval.sh` after pointing it to the corresponding pretrained weights.